
littlesuccess


InternLM (书生) Open-Source LLM Training Camp, Lecture 4: Notes

1. Introduction to Fine-Tuning

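1.1 Why fine-tune? A large language model carries general knowledge across many industries, but once you go deep into a specific domain its performance is often unsatisfactory, so fine-tuning is needed.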

1.2 Two kinds of fine-tuning: incremental pre-training and instruction fine-tuning.

1.4 Incremental pre-training: feed the model additional domain-specific corpus, and let it continue training on the new corpus.

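1.5 Instruction fine-tuning: a base model has only learned the language distribution of its pre-training dataset and cannot, on its own, understand the intent behind a question. Some method is therefore needed to make the base model understand human intent (instructions); this method is called instruction fine-tuning.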

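1.6 How to do instruction fine-tuning: use an instruction fine-tuning template with three roles: System, User, and Assistant. System holds the domain-specific background and intent, User holds the question to be answered, and Assistant holds the expected answer. Many instruction fine-tuning frameworks can simplify this work; XTuner is one of them.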
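As a minimal sketch of the three-role layout, the snippet below assembles one training sample into a single prompt string. The role markers (`<|System|>`, `<|User|>`, `<|Bot|>`), the sample text, and the function name `render_sample` are placeholders invented for illustration; they are not the actual template of InternLM, Llama, or XTuner.

```python
# Hypothetical role markers; each real model family defines its own template.
TEMPLATE = {
    "system": "<|System|>:{system}\n",
    "user": "<|User|>:{user}\n",
    "assistant": "<|Bot|>:{assistant}",
}


def render_sample(system: str, user: str, assistant: str = "") -> str:
    """Join the three roles into one string, skipping an empty system part."""
    parts = []
    if system:
        parts.append(TEMPLATE["system"].format(system=system))
    parts.append(TEMPLATE["user"].format(user=user))
    parts.append(TEMPLATE["assistant"].format(assistant=assistant))
    return "".join(parts)


# Made-up sample: System sets the domain, User asks, Assistant is the target answer.
sample = render_sample(
    system="You are a help-desk assistant for a home router.",
    user="How do I restart the router?",
    assistant="Unplug the power cable, wait ten seconds, and plug it back in.",
)
print(sample)
```

Calling `render_sample` with an empty `assistant` yields the prompt form used at inference time, where the model is expected to continue after the assistant marker.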

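1.7 Different open-source models use different conversation templates; the Llama and InternLM formats, for example, differ somewhat. In both cases the template is assembled automatically by the fine-tuning framework, including at inference time.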

1.8 For instruction fine-tuning you prepare both input and output data, but the loss is computed only on the output (the labels).
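To make "loss only on the output" concrete, here is a small sketch using plain token-id lists. It assumes the common convention that positions labeled -100 are skipped by the loss (-100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`); the token ids and the helper name `build_labels` are made up for illustration.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(prompt_ids: list[int], answer_ids: list[int]) -> tuple[list[int], list[int]]:
    """Return (input_ids, labels) where only the answer tokens carry a label."""
    input_ids = prompt_ids + answer_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return input_ids, labels


# Made-up token ids: 4 prompt tokens (System + User) and 3 answer tokens.
input_ids, labels = build_labels([11, 12, 13, 14], [21, 22, 23])
print(input_ids)  # [11, 12, 13, 14, 21, 22, 23]
print(labels)     # [-100, -100, -100, -100, 21, 22, 23]
```

The model still reads the whole sequence as input; the mask only controls which positions contribute to the loss.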

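1.9 Incremental pre-training: unlike the question-and-answer corpus used for instruction fine-tuning, an incremental pre-training corpus contains only answers, or in other words only declarative statements. When writing the corpus, the System and User parts are therefore left empty. The loss is computed the same way as in instruction fine-tuning.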
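A sketch of what such a corpus entry could look like, assuming a record layout with `system`, `input`, and `output` fields inside a `conversation` list. This layout and the helper name `to_pretrain_records` are assumptions for illustration; the exact schema should be checked against the documentation of the framework in use.

```python
import json


def to_pretrain_records(passages: list[str]) -> list[dict]:
    """Wrap plain statements as samples whose system and user parts are empty."""
    return [
        {"conversation": [{"system": "", "input": "", "output": text}]}
        for text in passages
        if text.strip()  # drop blank passages
    ]


# Made-up corpus: one real statement and one blank line that gets filtered out.
records = to_pretrain_records([
    "LoRA adds a low-rank bypass next to a linear layer.",
    "   ",
])
print(json.dumps(records, ensure_ascii=False, indent=2))
```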

1.10 XTuner uses LoRA and QLoRA.

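The linear (fully connected) layers of an LLM hold a large number of parameters. Fine-tuning all of them takes a great deal of GPU memory and compute. To save both, a bypass branch can be added that approximates the effect of full fine-tuning with far fewer parameters. The bypass consists of two transformation matrices.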
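The bypass idea can be sketched in a few lines of plain Python. This is an illustrative toy, not XTuner's implementation: `W` stands for the frozen weight of a linear layer, `A` and `B` are the two small trainable matrices, and the output is `W x + scale * B (A x)`. The function names, the `scale` parameter, and the example sizes are assumptions chosen for the demonstration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(w, a, b, x, scale=1.0):
    """Frozen path W x plus the low-rank bypass scale * B (A x)."""
    base = matvec(w, x)
    bypass = matvec(b, matvec(a, x))
    return [p + scale * q for p, q in zip(base, bypass)]


def param_counts(d_out, d_in, rank):
    """Trainable parameters: full fine-tuning vs. the two bypass matrices."""
    return d_out * d_in, rank * (d_in + d_out)


# Tiny example: a frozen 2x3 weight with a rank-1 bypass.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # frozen, shape (d_out=2, d_in=3)
A = [[1.0, 1.0, 1.0]]                    # trainable, shape (rank=1, d_in=3)
B = [[0.0], [0.0]]                       # trainable, shape (d_out=2, rank=1), starts at zero
x = [1.0, 2.0, 3.0]

print(lora_forward(W, A, B, x))     # B is zero, so this equals W x: [7.0, 2.0]
print(param_counts(4096, 4096, 8))  # (16777216, 65536)
```

For a 4096 x 4096 layer at rank 8, the bypass trains 65,536 parameters instead of about 16.8 million, which is where the memory saving comes from.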

1.11 Comparison of full fine-tuning, LoRA, and QLoRA

 

 

2. Introduction to XTuner

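2.1 XTuner is an open-source fine-tuning framework. It supports HuggingFace and ModelScope and several open-source model families, including Llama, Qwen (通义千问), ChatGLM, InternLM, and the latest MoE models. It also supports a range of GPUs, from consumer-grade cards to data-center cards.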

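2.2 Quick start: (a) install xtuner with pip, making sure to pin the version; (b) choose a config template; (c) start training with one command. In practice: copy the config template, modify its parameters, and launch training.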

Training produces an adapter file. At inference time, this adapter file has to be loaded in addition to the base model.
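Because step (a) stresses pinning the version, it can help to confirm which version actually ended up in the environment. The sketch below uses only the standard library; the helper names are made up, and `0.1.9` is simply the version tag used in the install step of these notes.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pinned(package: str, expected: str) -> bool:
    """True only if the package is installed at exactly the expected version."""
    return installed_version(package) == expected


print(installed_version("xtuner"))      # None if XTuner is not installed in this environment
print(check_pinned("xtuner", "0.1.9"))  # compares against the version used in these notes
```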

2.4 XTuner also supports conversations with tool-using models.

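2.5 XTuner has a powerful data processing engine: it can quickly map datasets in different formats and start training, and it can pack multiple samples into one to speed up training. JSON or JSONL format is recommended.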
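Packing several short samples into one sequence can be sketched as a greedy concatenation over tokenized samples. This is a simplified illustration and not XTuner's actual packing code; the function name `pack_samples` and the token ids are invented for the example.

```python
def pack_samples(samples: list[list[int]], max_length: int) -> list[list[int]]:
    """Greedily concatenate tokenized samples into sequences of at most max_length tokens."""
    packed, current = [], []
    for tokens in samples:
        tokens = tokens[:max_length]  # truncate a sample that is too long on its own
        if current and len(current) + len(tokens) > max_length:
            packed.append(current)  # current sequence is full, start a new one
            current = []
        current = current + tokens
    if current:
        packed.append(current)
    return packed


# Made-up token ids: four short samples packed into sequences of up to 8 tokens.
samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
print(pack_samples(samples, max_length=8))
# [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
```

Four samples become two sequences, so fewer padding tokens are processed per batch, which is the source of the speed-up.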

 

 

3. Running an LLM on an 8 GB GPU

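3.1 XTuner enables Flash Attention acceleration by default, whereas DeepSpeed ZeRO is not enabled by default. Flash Attention can substantially improve training performance, but the config template needs to be modified.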

4. Hands-on

4.1 Log in to the development machine that was set up in Lecture 3:

4.2 Installation

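# If you are on another platform:
conda create --name xtuner0.1.9 python=3.10 -y

# Activate the environment
conda activate xtuner0.1.9
# Go to the home directory (~ means "the current user's home path")
cd ~
# Create a version folder and enter it, to follow along with this tutorial
mkdir xtuner019 && cd xtuner019

# Users who cannot access GitHub can clone from Gitee instead:
git clone -b v0.1.9 https://gitee.com/Internlm/xtuner

# Enter the source directory
cd xtuner

# Install XTuner from source
pip install -e '.[all]'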

Screen output:

(base) root@intern-studio-069640:~# conda activate xtuner0.1.9
(xtuner0.1.9) root@intern-studio-069640:~# # Go to the home directory (~ means "the current user's home path")
(xtuner0.1.9) root@intern-studio-069640:~# cd ~
(xtuner0.1.9) root@intern-studio-069640:~# # Create a version folder and enter it, to follow along with this tutorial
(xtuner0.1.9) root@intern-studio-069640:~# mkdir xtuner019 && cd xtuner019
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# 

(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# git clone -b v0.1.9 https://gitee.com/Internlm/xtuner
Cloning into 'xtuner'...
remote: Enumerating objects: 6342, done.
remote: Counting objects: 100% (3757/3757), done.
remote: Compressing objects: 100% (747/747), done.
remote: Total 6342 (delta 3080), reused 3614 (delta 2964), pack-reused 2585
Receiving objects: 100% (6342/6342), 1.14 MiB | 692.00 KiB/s, done.
Resolving deltas: 100% (4901/4901), done.
Note: switching to '9f686f08c8e60e568e811aaad8daf9c08462d42d'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

Updating files: 100% (430/430), done.
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# # Enter the source directory
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019# cd xtuner
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019/xtuner# 
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019/xtuner# # Install XTuner from source
(xtuner0.1.9) root@intern-studio-069640:~/xtuner019/xtuner# pip install -e '.[all]'
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///root/xtuner019/xtuner
  Preparing metadata (setup.py) ... done
Collecting bitsandbytes>=0.40.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9b/63/489ef9cd7a33c1f08f1b2be51d1b511883c5e34591aaa9873b30021cd679/bitsandbytes-0.42.0-py3-none-any.whl (105.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 105.0/105.0 MB 43.2 MB/s eta 0:00:00
Collecting datasets
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/74/4d/63b033169534f0742b7fe13957118cae08c83b04bfde46511f397872e2e7/datasets-2.17.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.6/536.6 kB 12.4 MB/s eta 0:00:00
Collecting einops
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/29/0b/2d1c0ebfd092e25935b86509a9a817159212d82aa43d7fb07eca4eeff2c2/einops-0.7.0-py3-none-any.whl (44 kB)
Collecting fsspec<=2023.6.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e3/bd/4c0a4619494188a9db5d77e2100ab7d544a42e76b2447869d8e124e981d8/fsspec-2023.6.0-py3-none-any.whl (163 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 5.9 MB/s eta 0:00:00
Collecting lagent>=0.1.2
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/54/51/0cd9df1ec309b9d73e2a009bf61a8d8c84c34b27480994fe83a7fa8f24d3/lagent-0.2.1-py3-none-any.whl (69 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 69.4/69.4 kB 3.2 MB/s eta 0:00:00
Collecting mmengine>=0.9.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/92/f8/0ec23b2d7fd2d3aebe05a70b8b4ff314c0cb552a614b1656ca1cb2a11633/mmengine-0.10.3-py3-none-any.whl (451 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 451.7/451.7 kB 23.9 MB/s eta 0:00:00
Collecting modelscope
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/32/7f/5e49028db40c58a0ecea4f5a6ead189294353b793bb403d233b00cb35ac7/modelscope-1.12.0-py3-none-any.whl (5.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.6/5.6 MB 67.3 MB/s eta 0:00:00
Collecting peft>=0.4.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/07/63/168af5aa8dbda9c23ad774a4c1d311cfe220c634e0d05a3a82a7cae01bd8/peft-0.8.2-py3-none-any.whl (183 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 183.4/183.4 kB 8.9 MB/s eta 0:00:00
Collecting scipy
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f5/aa/8e6071a5e4dca4ec68b5b22e4991ee74c59c5d372112b9c236ec1faff57d/scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.4 MB)
Collecting SentencePiece
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7f/e5/323dc813b3e1339305f888d035e2f3725084fc4dcf051995b366dd26cc90/sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting tiktoken
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/16/05/5efbd91252ffb1301ea393d88ef736b33d41e75d4bcf0bd31d660050e400/tiktoken-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB)
Collecting torch
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8c/67/fcc9b9e2369a9bae4da492aedc0c2dfa95d563ef0eaa9228b70c98395ec2/torch-2.2.0-cp310-cp310-manylinux1_x86_64.whl (755.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 755.5/755.5 MB 12.6 MB/s eta 0:00:00
Collecting transformers<=4.34.0,>=4.32.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1a/d1/3bba59606141ae808017f6fde91453882f931957f125009417b87a281067/transformers-4.34.0-py3-none-any.whl (7.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 38.3 MB/s eta 0:00:00
Collecting transformers_stream_generator
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/36/26/3492ab0e45d814533b34ca605f8a20fdc032736f937679c6f212d81a76a5/transformers-stream-generator-0.0.4.tar.gz (12 kB)
  Preparing metadata (setup.py) ... done
Collecting deepspeed>=0.12.3
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5d/4b/382b6c7f22a9f51875e5a159a2a8e94c2b3b01b0c86f7bed2ea7cf919549/deepspeed-0.13.2.tar.gz (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 28.1 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting mpi4py-mpich
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1a/e3/942a8e3322e3f1a265409d4028843c2770864f9ee699ba692296aa743232/mpi4py_mpich-3.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.0/6.0 MB 40.8 MB/s eta 0:00:00
Collecting hjson (from deepspeed>=0.12.3)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1f/7f/13cd798d180af4bf4c0ceddeefba2b864a63c71645abc0308b768d67bb81/hjson-3.1.0-py3-none-any.whl (54 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.0/54.0 kB 2.9 MB/s eta 0:00:00
Collecting ninja (from deepspeed>=0.12.3)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6d/92/8d7aebd4430ab5ff65df2bfee6d5745f95c004284db2d8ca76dcbfd9de47/ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.2/307.2 kB 9.0 MB/s eta 0:00:00
Collecting numpy (from deepspeed>=0.12.3)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4b/d7/ecf66c1cd12dc28b4040b15ab4d17b773b87fa9d29ca16125de01adb36cd/numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 62.6 MB/s eta 0:00:00
Collecting packaging>=20.0 (from deepspeed>=0.12.3)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl (53 kB)
Collecting psutil (from deepspeed>=0.12.3)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c5/4f/0e22aaa246f96d6ac87fe5ebb9c5a693fbe8877f537a1022527c47ca43c5/psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB)
Collecting py-cpuinfo (from deepspeed>=0.12.3)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e0/a9/023730ba63db1e494a271cb018dcd361bd2c917ba7004c3e49d5daf795a2/py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Collecting pydantic (from deepspeed>=0.12.3)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/db/dc/afecbd9650f486889181c6d1a0d675b580c06253ea7e304588e4c7485bdb/pydantic-2.6.1-py3-none-any.whl (394 kB)
Collecting pynvml (from deepspeed>=0.12.3)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5b/9c/adb8070059caaa15d5a572b66bccd95900d8c1b9fa54d6ecea6ae97448d1/pynvml-11.5.0-py3-none-any.whl (53 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.1/53.1 kB 1.0 MB/s eta 0:00:00
Collecting tqdm (from deepspeed>=0.12.3)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2a/14/e75e52d521442e2fcc9f1df3c5e456aead034203d4797867980de558ab34/tqdm-4.66.2-py3-none-any.whl (78 kB)
Collecting arxiv (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/99/16/532c2aa4bc83b2356820efd4d1f619e45178dc3a0dc0cde16fbccdc43fc1/arxiv-2.1.0-py3-none-any.whl (11 kB)
Collecting distro (from lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl (20 kB)
Collecting func-timeout (from lagent>=0.1.2)
  Using cached func_timeout-4.3.5-py3-none-any.whl
Collecting google-search-results (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/77/30/b3a6f6a2e00f8153549c2fa345c58ae1ce8e5f3153c2fe0484d444c3abcb/google_search_results-2.4.2.tar.gz (18 kB)
  Preparing metadata (setup.py) ... done
Collecting griffe (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/aa/4c/7268d218ee38cb0e07d63fc3fe60fe19dc353f757db3d365f0b5ffba85be/griffe-0.40.1-py3-none-any.whl (116 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.9/116.9 kB 2.9 MB/s eta 0:00:00
Collecting json5 (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/70/ba/fa37123a86ae8287d6678535a944f9c3377d8165e536310ed6f6cb0f0c0e/json5-0.9.14-py2.py3-none-any.whl (19 kB)
Collecting jsonschema (from lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/39/9d/b035d024c62c85f2e2d4806a59ca7b8520307f34e0932fbc8cc75fe7b2d9/jsonschema-4.21.1-py3-none-any.whl (85 kB)
Collecting jupyter (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting jupyter-client (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/43/ae/5f4f72980765e2e5e02b260f9c53bcc706cefa7ac9c8d7240225c55788d4/jupyter_client-8.6.0-py3-none-any.whl (105 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 105.9/105.9 kB 2.6 MB/s eta 0:00:00
Collecting phx-class-registry (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9b/46/02f4f5fb40f5ccbb3fc23a328fb3314843375d050a3b40ec21a8c18b5762/phx_class_registry-4.1.0-py3-none-any.whl (13 kB)
Collecting pillow (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/cb/c3/98faa3e92cf866b9446c4842f1fe847e672b2f54e000cb984157b8095797/pillow-10.2.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 44.6 MB/s eta 0:00:00
Collecting python-pptx (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/72/49/6eee83072983473e9905ffddd5c2032b9a0ca4616425560d6d582287b467/python_pptx-0.6.23-py3-none-any.whl (471 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 471.6/471.6 kB 21.5 MB/s eta 0:00:00
Collecting requests (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl (62 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 3.4 MB/s eta 0:00:00
Collecting timeout-decorator (from lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/f8/0802dd14c58b5d3d72bb9caa4315535f58787a1dc50b81bbbcaaa15451be/timeout-decorator-0.5.0.tar.gz (4.8 kB)
  Preparing metadata (setup.py) ... done
Collecting typing-extensions (from lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b7/f4/6a90020cd2d93349b442bfcb657d0dc91eee65491600b2cb1d388bc98e6b/typing_extensions-4.9.0-py3-none-any.whl (32 kB)
Collecting addict (from mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl (3.8 kB)
Collecting matplotlib (from mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c1/f2/325897d6c498278b0f8b460d44b516f5db865ddb4ba9018e9fe58a3e4633/matplotlib-3.8.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
Collecting pyyaml (from mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/29/61/bf33c6c85c55bc45a29eee3195848ff2d518d84735eb0e2d8cb42e0d285e/PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
Collecting rich (from mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/be/be/1520178fa01eabe014b16e72a952b9f900631142ccd03dc36cf93e30c1ce/rich-13.7.0-py3-none-any.whl (240 kB)
Collecting termcolor (from mmengine>=0.9.1)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d9/5f/8c716e47b3a50cbd7c146f45881e11d9414def768b7cd9c5e6650ec2a80a/termcolor-2.4.0-py3-none-any.whl (7.7 kB)
Collecting yapf (from mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/66/c9/d4b03b2490107f13ebd68fe9496d41ae41a7de6275ead56d0d4621b11ffd/yapf-0.40.2-py3-none-any.whl (254 kB)
Collecting opencv-python>=3 (from mmengine>=0.9.1)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d9/64/7fdfb9386511cd6805451e012c537073a79a958a58795c4e602e538c388c/opencv_python-4.9.0.80-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (62.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.2/62.2 MB 52.6 MB/s eta 0:00:00
Collecting accelerate>=0.21.0 (from peft>=0.4.0)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1b/da/24a54b9205fce3bdbaad521c35944d0b0a2d292ac5ae921e484b76312b43/accelerate-0.27.2-py3-none-any.whl (279 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 280.0/280.0 kB 24.4 MB/s eta 0:00:00
Collecting safetensors (from peft>=0.4.0)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d0/ba/b2254fafc7f5fdc98a2fa4d5a5eeb029fbf9589ec87f2c230c3ac0a1dd53/safetensors-0.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting huggingface-hub>=0.17.0 (from peft>=0.4.0)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/28/03/7d3c7153113ec59cfb31e3b8ee773f5f420a0dd7d26d40442542b96675c3/huggingface_hub-0.20.3-py3-none-any.whl (330 kB)
Collecting filelock (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/81/54/84d42a0bee35edba99dee7b59a8d4970eccdd44b99fe728ed912106fc781/filelock-3.13.1-py3-none-any.whl (11 kB)
Collecting sympy (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl (5.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 60.9 MB/s eta 0:00:00
Collecting networkx (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 24.8 MB/s eta 0:00:00
Collecting jinja2 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/30/6d/6de6be2d02603ab56e72997708809e8a5b0fbfee080735109b40a3564843/Jinja2-3.1.3-py3-none-any.whl (133 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.2/133.2 kB 7.4 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 55.7 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 22.7 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 59.2 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ff/74/a2e2be7fb83aaedec84f391f082cf765dfb635e7caa9b49065f73e4835d8/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 11.6 MB/s eta 0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 14.8 MB/s eta 0:00:00
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/86/94/eb540db023ce1d162e7bea9f8f5aa781d57c65aed513c33ee9a5123ead4d/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 19.8 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.2.106 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/44/31/4890b1c9abc496303412947fc7dcea3d14861720642b49e8ceed89636705/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 20.3 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/bc/1d/8de1e5c67099015c834315e333911273a8c6aaba78923dd1d1e25fc5f217/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 37.6 MB/s eta 0:00:00
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/65/5b/cfaeebf25cd9fdec14338ccb16f6b2c4c7fa9163aefcf057d86b9cc248bb/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 17.7 MB/s eta 0:00:00
Collecting nvidia-nccl-cu12==2.19.3 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/38/00/d0d4e48aef772ad5aebcf70b73028f88db6e5640b36c38e90445b7a57c45/nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 18.6 MB/s eta 0:00:00
Collecting nvidia-nvtx-cu12==12.1.105 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/d3/8057f0587683ed2fcd4dbfbdfdfa807b9160b809976099d36b8f60d08f03/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 6.1 MB/s eta 0:00:00
Collecting triton==2.2.0 (from torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/95/05/ed974ce87fe8c8843855daa2136b3409ee1c126707ab54a8b72815c08b49/triton-2.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (167.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.9/167.9 MB 22.2 MB/s eta 0:00:00
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1e/07/bf730d44c2fe1b676ad9cc2be5f5f861eb5d153fb6951987a2d6a96379a9/nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl (20.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.5/20.5 MB 32.4 MB/s eta 0:00:00
Collecting regex!=2019.12.17 (from transformers<=4.34.0,>=4.32.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/81/8a/96a62ce98e8ff1b16db56fde3debc8a571f6b7ea42ee137eb0d995cdfa26/regex-2023.12.25-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773 kB)
Collecting tokenizers<0.15,>=0.14 (from transformers<=4.34.0,>=4.32.1)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a7/7b/c1f643eb086b6c5c33eef0c3752e37624bd23e4cbc9f1332748f1c6252d1/tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 37.8 MB/s eta 0:00:00
Collecting pyarrow>=12.0.0 (from datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d4/ca/ef67abb77f9dd51a0d3ff7fcebff58296068a046d7da352b9548070005ed/pyarrow-15.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (38.3 MB)
Collecting pyarrow-hotfix (from datasets)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e4/f4/9ec2222f5f5f8ea04f66f184caafd991a39c8782e31f5b0266f101cb68ca/pyarrow_hotfix-0.6-py3-none-any.whl (7.9 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/7a/cef76fd8438a42f96db64ddaa85280485a9c395e7df3db8158cfec1eee34/dill-0.3.8-py3-none-any.whl (116 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.3/116.3 kB 4.4 MB/s eta 0:00:00
Collecting pandas (from datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b3/b3/3102c3a4abca1093e50cfec2213102a1c65c0b318a4431395d0121e6e690/pandas-2.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.0 MB)
Collecting xxhash (from datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/80/8a/1dd41557883b6196f8f092011a5c1f72d4d44cf36d7b67d4a5efe3127949/xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
Collecting multiprocess (from datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/bc/f7/7ec7fddc92e50714ea3745631f79bd9c96424cb2702632521028e57d3a36/multiprocess-0.70.16-py310-none-any.whl (134 kB)
Collecting aiohttp (from datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/93/40/d3decda219ebd5410eba627601d537ec3782efbcadba308e9ce381cc0b71/aiohttp-3.9.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Collecting attrs (from modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e0/44/827b2a91a5816512fcaf3cc4ebc465ccd5d598c45cefa6703fcf4a79018f/attrs-23.2.0-py3-none-any.whl (60 kB)
Collecting gast>=0.2.2 (from modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/fa/39/5aae571e5a5f4de9c3445dae08a530498e5c53b0e74410eeeb0991c79047/gast-0.5.4-py3-none-any.whl (19 kB)
Collecting oss2 (from modelscope)
  Using cached oss2-2.18.4-py3-none-any.whl
Collecting python-dateutil>=2.1 (from modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/36/7a/87837f39d0296e723bb9b62bbb257d0355c7f6128853c78955f57342a56d/python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Requirement already satisfied: setuptools in /root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages (from modelscope) (68.2.2)
Collecting simplejson>=3.3.0 (from modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/cb/b6/ed513a0adc3e2c9654864ffb68266dcab5720d5653428d690e7e4fb32a6c/simplejson-3.19.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (137 kB)
Collecting sortedcontainers>=1.5.9 (from modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB)
Collecting urllib3>=1.26 (from modelscope)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/88/75/311454fd3317aefe18415f04568edc20218453b709c63c58b9292c71be17/urllib3-2.2.0-py3-none-any.whl (120 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.9/120.9 kB 7.3 MB/s eta 0:00:00
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response'))': /simple/aiosignal/
Collecting aiosignal>=1.1.2 (from aiohttp->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/76/ac/a7305707cb852b7e16ff80eaf5692309bde30e2b1100a1fcacdc8f731d97/aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting frozenlist>=1.1.1 (from aiohttp->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ec/25/0c87df2e53c0c5d90f7517ca0ff7aca78d050a8ec4d32c4278e8c0e52e51/frozenlist-1.4.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (239 kB)
Collecting multidict<7.0,>=4.5 (from aiohttp->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/33/62/2c9085e571318d51212a6914566fe41dd0e33d7f268f7e2f23dcd3f06c56/multidict-6.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (124 kB)
Collecting yarl<2.0,>=1.0 (from aiohttp->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c3/a0/0ade1409d184cbc9e85acd403a386a7c0563b92ff0f26d138ff9e86e48b4/yarl-1.9.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (301 kB)
Collecting async-timeout<5.0,>=4.0 (from aiohttp->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/a7/fa/e01228c2938de91d47b307831c62ab9e4001e747789d0b05baf779a6488c/async_timeout-4.0.3-py3-none-any.whl (5.7 kB)
Collecting six>=1.5 (from python-dateutil>=2.1->modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting charset-normalizer<4,>=2 (from requests->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/f1/3702ba2a7470666a62fd81c58a4c40be00670e5006a67f4d626e57f013ae/charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.1/142.1 kB 2.0 MB/s eta 0:00:00
Collecting idna<4,>=2.5 (from requests->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c2/e7/a82b05cf63a603df6e68d59ae6a68bf5064484a0718ea5033660af4b54a9/idna-3.6-py3-none-any.whl (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.6/61.6 kB 1.6 MB/s eta 0:00:00
Collecting certifi>=2017.4.17 (from requests->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ba/06/a07f096c664aeb9f01624f858c3add0a4e913d6c96257acb4fce61e7de14/certifi-2024.2.2-py3-none-any.whl (163 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 3.8 MB/s eta 0:00:00
INFO: pip is looking at multiple versions of tokenizers to determine which version is compatible with other requirements. This could take a while.
Collecting tokenizers<0.15,>=0.14 (from transformers<=4.34.0,>=4.32.1)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/57/bd/45b5ef6b088880779f70acf60027f7043ca5fa1b98f4a4345cf3aea09044/tokenizers-0.14.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 20.6 MB/s eta 0:00:00
Collecting accelerate>=0.21.0 (from peft>=0.4.0)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e0/e5/20373eaee15adeb12872bc03355636c283cf3092fd7eb290bb974174b14e/accelerate-0.27.1-py3-none-any.whl (279 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 279.7/279.7 kB 5.5 MB/s eta 0:00:00
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c8/14/73c3d62e709c2ace755c826997b12f883f3cb6b138dec63ac1e2a68cd910/accelerate-0.27.0-py3-none-any.whl (279 kB)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a6/b9/44623bdb05595481107153182e7f4b9f2ef9d3b674938ad13842054dcbd8/accelerate-0.26.1-py3-none-any.whl (270 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 270.9/270.9 kB 7.9 MB/s eta 0:00:00
INFO: pip is still looking at multiple versions of tokenizers to determine which version is compatible with other requirements. This could take a while.
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/63/9c/c10fc10df1d4968406b3f3cffe5a7d9988a8583e3423fc4156d6c91ab62d/accelerate-0.26.0-py3-none-any.whl (270 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 270.7/270.7 kB 4.5 MB/s eta 0:00:00
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f7/fc/c55e5a2da345c9a24aa2e1e0f60eb2ca290b6a41be82da03a6d4baec4f99/accelerate-0.25.0-py3-none-any.whl (265 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 265.7/265.7 kB 4.8 MB/s eta 0:00:00
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/13/9e/ee987874058f2d93006961f6ff49e0bcb60ab9c26709ebe06bfa8707a4d8/accelerate-0.24.1-py3-none-any.whl (261 kB)
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d0/cf/364d550af711b5abe5129ac676896b223ba5a082d97fe400527a59c0c1f8/accelerate-0.24.0-py3-none-any.whl (260 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 261.0/261.0 kB 8.2 MB/s eta 0:00:00
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d9/92/2d3aecf9f4a192968035880be3e2fc8b48d541c7128f7c936f430d6f96da/accelerate-0.23.0-py3-none-any.whl (258 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 258.1/258.1 kB 10.2 MB/s eta 0:00:00
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4d/a7/05c67003d659a0035f2b3a8cf389c1d9645865aee84a73ce99ddab16682f/accelerate-0.22.0-py3-none-any.whl (251 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 251.2/251.2 kB 15.0 MB/s eta 0:00:00
Collecting transformers_stream_generator
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/bf/e8/785ec1627a60ca0ae7934525d2a24f419f146ff98b719f30ac76ced4fed4/transformers-stream-generator-0.0.3.tar.gz (12 kB)
  Preparing metadata (setup.py) ... done
Collecting modelscope
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1d/1c/b40d3558879309e5b080e3f2eaaac016385487671508c362245bfd5e4cdf/modelscope-1.11.1-py3-none-any.whl (5.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.5/5.5 MB 66.6 MB/s eta 0:00:00
Collecting datasets
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ec/93/454ada0d1b289a0f4a86ac88dbdeab54921becabac45da3da787d136628f/datasets-2.16.1-py3-none-any.whl (507 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 507.1/507.1 kB 9.4 MB/s eta 0:00:00
Collecting dill<0.3.8,>=0.3.0 (from datasets)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f5/3a/74a29b11cf2cdfcd6ba89c0cecd70b37cd1ba7b77978ce611eb7a146a832/dill-0.3.7-py3-none-any.whl (115 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 115.3/115.3 kB 2.6 MB/s eta 0:00:00
Collecting datasets
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a0/93/da8a22a292e51ab76f969eb87bda8fd70cc3963b4dd71f67bb92a70a7992/datasets-2.16.0-py3-none-any.whl (507 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 507.1/507.1 kB 21.0 MB/s eta 0:00:00
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e2/cf/db41e572d7ed958e8679018f8190438ef700aeb501b62da9e1eed9e4d69a/datasets-2.15.0-py3-none-any.whl (521 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 521.2/521.2 kB 7.9 MB/s eta 0:00:00
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/00/23/80a2147a547cb2fd59eb92a13787c849b3efaefcea02a5c963dfc93f7c56/datasets-2.14.7-py3-none-any.whl (520 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 520.4/520.4 kB 7.2 MB/s eta 0:00:00
Collecting huggingface-hub>=0.17.0 (from peft>=0.4.0)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/aa/f3/3fc97336a0e90516901befd4f500f08d691034d387406fdbde85bea827cc/huggingface_hub-0.17.3-py3-none-any.whl (295 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 295.0/295.0 kB 8.2 MB/s eta 0:00:00
Collecting feedparser==6.0.10 (from arxiv->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/92/1e/741fd94cf2855d251712868f2183cb6485a28daaa3947e1a7046dc036aca/feedparser-6.0.10-py3-none-any.whl (81 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.1/81.1 kB 4.2 MB/s eta 0:00:00
Collecting sgmllib3k (from feedparser==6.0.10->arxiv->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9e/bd/3704a8c3e0942d711c1299ebf7b9091930adae6675d7c8f476a7ce48653c/sgmllib3k-1.0.0.tar.gz (5.8 kB)
  Preparing metadata (setup.py) ... done
Collecting colorama>=0.4 (from griffe->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7c/52/2b1b570f6b8b803cef5ac28fdf78c0da318916c7d2fe9402a84d591b394c/MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting jsonschema-specifications>=2023.03.6 (from jsonschema->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ee/07/44bd408781594c4d0a027666ef27fab1e441b109dc3b76b4f836f8fd04fe/jsonschema_specifications-2023.12.1-py3-none-any.whl (18 kB)
Collecting referencing>=0.28.4 (from jsonschema->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/90/10/1c92edb0a0a14b67ff825bc338e74bc49ab27d3f3bae3f9a02838cba546f/referencing-0.33.0-py3-none-any.whl (26 kB)
Collecting rpds-py>=0.7.1 (from jsonschema->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/15/f5/769fc90b3af55e6288ce683539ffd68b93dbdf1a5d86050f063828e5911e/rpds_py-0.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Collecting notebook (from jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5f/38/f5a11c1e68bf3dbd54c7c98f301bf9495e8735803b42ee2f740c5b7c1ca5/notebook-7.1.0-py3-none-any.whl (5.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 59.5 MB/s eta 0:00:00
Collecting qtconsole (from jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/21/0887c50fa5bca7bfde29f65999a6ac234617f2a007b6b387aa4dc0ca36a8/qtconsole-5.5.1-py3-none-any.whl (123 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.4/123.4 kB 6.7 MB/s eta 0:00:00
Collecting jupyter-console (from jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ca/77/71d78d58f15c22db16328a476426f7ac4a60d3a5a7ba3b9627ee2f7903d4/jupyter_console-6.6.3-py3-none-any.whl (24 kB)
Collecting nbconvert (from jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/ec/c120b21e7f884a701e12a241992754e719adaf430d0d6b30c6655776bc35/nbconvert-7.16.0-py3-none-any.whl (257 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 257.2/257.2 kB 10.9 MB/s eta 0:00:00
Collecting ipykernel (from jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/16/9a/0c7b514c73b42cf4ce516ee26c8940a0b23a9754dafaa459a939220240fd/ipykernel-6.29.2-py3-none-any.whl (116 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.1/116.1 kB 4.0 MB/s eta 0:00:00
Collecting ipywidgets (from jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/70/1a/7edeedb1c089d63ccd8bd5c0612334774e90cf9337de9fe6c82d90081791/ipywidgets-8.1.2-py3-none-any.whl (139 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.4/139.4 kB 4.1 MB/s eta 0:00:00
Collecting jupyter-core!=5.0.*,>=4.12 (from jupyter-client->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/86/a1/354cade6907f2fbbd32d89872ec64b62406028e7645ac13acfdb5732829e/jupyter_core-5.7.1-py3-none-any.whl (28 kB)
Collecting pyzmq>=23.0 (from jupyter-client->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/67/bf/6bc0977acd934b66eacab79cec303ecf08ae4a6150d57c628aa919615488/pyzmq-25.1.2-cp310-cp310-manylinux_2_28_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 22.1 MB/s eta 0:00:00
Collecting tornado>=6.2 (from jupyter-client->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9f/12/11d0a757bb67278d3380d41955ae98527d5ad18330b2edbdc8de222b569b/tornado-6.4-cp38-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (435 kB)
Collecting traitlets>=5.3 (from jupyter-client->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/45/34/5dc77fdc7bb4bd198317eea5679edf9cc0a186438b5b19dbb9062fb0f4d5/traitlets-5.14.1-py3-none-any.whl (85 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.4/85.4 kB 3.9 MB/s eta 0:00:00
Collecting contourpy>=1.0.1 (from matplotlib->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/58/56/e2c43dcfa1f9c7db4d5e3d6f5134b24ed953f4e2133a4b12f0062148db58/contourpy-1.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (310 kB)
Collecting cycler>=0.10 (from matplotlib->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl (8.3 kB)
Collecting fonttools>=4.22.0 (from matplotlib->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/a6/ba/5eac3e9c9bbc2dea3606e46de08bcef0908d74e7ccf89a71701b95a16747/fonttools-4.49.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB)
Collecting kiwisolver>=1.3.1 (from matplotlib->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/6f/40/4ab1fdb57fced80ce5903f04ae1aed7c1d5939dda4fd0c0aa526c12fe28a/kiwisolver-1.4.5-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB)
Collecting pyparsing>=2.3.1 (from matplotlib->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/39/92/8486ede85fcc088f1b3dba4ce92dd29d126fd96b0008ea213167940a2475/pyparsing-3.1.1-py3-none-any.whl (103 kB)
INFO: pip is looking at multiple versions of multiprocess to determine which version is compatible with other requirements. This could take a while.
Collecting multiprocess (from datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/35/a8/36d8d7b3e46b377800d8dec47891cdf05842d1a2366909ae4a0c89fbc5e6/multiprocess-0.70.15-py310-none-any.whl (134 kB)
Collecting crcmod>=1.7 (from oss2->modelscope)
  Using cached crcmod-1.7-cp310-cp310-linux_x86_64.whl
Collecting pycryptodome>=3.4.7 (from oss2->modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/af/20/5f29ec45462360e7f61e8688af9fe4a0afae057edfabdada662e11bf97e7/pycryptodome-3.20.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)
Collecting aliyun-python-sdk-kms>=2.4.1 (from oss2->modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/3d/ea/d88e08bfc4a0aee0111f1f24c98b19107bc6783441e7e944907c77b2243d/aliyun_python_sdk_kms-2.16.2-py2.py3-none-any.whl (94 kB)
Collecting aliyun-python-sdk-core>=2.13.12 (from oss2->modelscope)
  Using cached aliyun_python_sdk_core-2.14.0-py3-none-any.whl
Collecting pytz>=2020.1 (from pandas->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9c/3d/a121f284241f08268b21359bd425f7d4825cffc5ac5cd0e1b3d82ffd2b10/pytz-2024.1-py2.py3-none-any.whl (505 kB)
Collecting tzdata>=2022.7 (from pandas->datasets)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/65/58/f9c9e6be752e9fcb8b6a0ee9fb87e6e7a1f6bcab2cdc73f02bb7ba91ada0/tzdata-2024.1-py2.py3-none-any.whl (345 kB)
Collecting annotated-types>=0.4.0 (from pydantic->deepspeed>=0.12.3)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/28/78/d31230046e58c207284c6b2c4e8d96e6d3cb4e52354721b944d3e1ee4aa5/annotated_types-0.6.0-py3-none-any.whl (12 kB)
Collecting pydantic-core==2.16.2 (from pydantic->deepspeed>=0.12.3)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/50/5e/2978d9f0e8d0cfd78e22115c028a41e0599e3d684e5aef7ed9bd18fcbd0c/pydantic_core-2.16.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.2 MB)
Collecting lxml>=3.1.0 (from python-pptx->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/25/5c/979167df4ca5a1c308105bb1590412c54bd1b0baa1883212f39cb42d4fcd/lxml-5.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.0 MB)
Collecting XlsxWriter>=0.5.7 (from python-pptx->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f7/3e/05ba2194cd5073602422859c949a4f21310a3c49bf8dccde9e03d4522b11/XlsxWriter-3.1.9-py3-none-any.whl (154 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 154.8/154.8 kB 8.8 MB/s eta 0:00:00
Collecting markdown-it-py>=2.2.0 (from rich->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/42/d7/1ec15b46af6af88f19b8e5ffea08fa375d433c998b8a7639e76935c14f1f/markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Collecting pygments<3.0.0,>=2.13.0 (from rich->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/97/9c/372fef8377a6e340b1704768d20daaded98bf13282b5327beb2e2fe2c7ef/pygments-2.17.2-py3-none-any.whl (1.2 MB)
Collecting mpmath>=0.19 (from sympy->torch)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 23.3 MB/s eta 0:00:00
Collecting importlib-metadata>=6.6.0 (from yapf->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c0/8b/d8427f023c081a8303e6ac7209c16e6878f2765d5b59667f3903fbcfd365/importlib_metadata-7.0.1-py3-none-any.whl (23 kB)
Collecting platformdirs>=3.5.1 (from yapf->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/55/72/4898c44ee9ea6f43396fbc23d9bfaf3d06e01b83698bdf2e4c919deceb7c/platformdirs-4.2.0-py3-none-any.whl (17 kB)
Collecting tomli>=2.0.1 (from yapf->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/97/75/10a9ebee3fd790d20926a90a2547f0bf78f371b2f13aa822c759680ca7b9/tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting jmespath<1.0.0,>=0.9.3 (from aliyun-python-sdk-core>=2.13.12->oss2->modelscope)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/07/cb/5f001272b6faeb23c1c9e0acc04d48eaaf5c862c17709d20e3469c6e0139/jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting cryptography>=2.6.0 (from aliyun-python-sdk-core>=2.13.12->oss2->modelscope)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4e/8a/a36f452b8cf725073521c8e7af664d85b337d699f29cb5845d92977af1ca/cryptography-42.0.3-cp39-abi3-manylinux_2_28_x86_64.whl (4.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.6/4.6 MB 69.1 MB/s eta 0:00:00
Collecting zipp>=0.5 (from importlib-metadata>=6.6.0->yapf->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d9/66/48866fc6b158c81cc2bfecc04c480f105c6040e8b077bc54c634b4a67926/zipp-3.17.0-py3-none-any.whl (7.4 kB)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->mmengine>=0.9.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Collecting comm>=0.1.1 (from ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6e/c1/e7335bd49aa3fa3bd453e34a4580b0076804f219897ad76d4d5aa4d8f22f/comm-0.2.1-py3-none-any.whl (7.2 kB)
Collecting debugpy>=1.6.5 (from ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7a/27/78d5cf9c7aba43f8341e78273ab776913d2d33beb581ec39b65e56a0db77/debugpy-1.8.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 34.2 MB/s eta 0:00:00
Collecting ipython>=7.23.1 (from ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fb/e7/07dc8b6541affd4de15f0e8fc855f238cb93d04c4f8490757226d12cdb5a/ipython-8.21.0-py3-none-any.whl (810 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 810.0/810.0 kB 18.4 MB/s eta 0:00:00
Collecting matplotlib-inline>=0.1 (from ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f2/51/c34d7a1d528efaae3d8ddb18ef45a41f284eacf9e514523b191b7d0872cc/matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB)
Collecting nest-asyncio (from ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a0/c4/c2971a3ba4c6103a3d10c4b0f24f461ddc027f0f09763220cf35ca1401b3/nest_asyncio-1.6.0-py3-none-any.whl (5.2 kB)
Collecting widgetsnbextension~=4.0.10 (from ipywidgets->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/99/bc/82a8c3985209ca7c0a61b383c80e015fd92e74f8ba0ec1af98f9d6ca8dce/widgetsnbextension-4.0.10-py3-none-any.whl (2.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 39.1 MB/s eta 0:00:00
Collecting jupyterlab-widgets~=3.0.10 (from ipywidgets->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/24/da/db1cb0387a7e4086780aff137987ee924e953d7f91b2a870f994b9b1eeb8/jupyterlab_widgets-3.0.10-py3-none-any.whl (215 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 215.0/215.0 kB 6.9 MB/s eta 0:00:00
Collecting prompt-toolkit>=3.0.30 (from jupyter-console->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ee/fd/ca7bf3869e7caa7a037e23078539467b433a4e01eebd93f77180ab927766/prompt_toolkit-3.0.43-py3-none-any.whl (386 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 386.1/386.1 kB 7.8 MB/s eta 0:00:00
Collecting beautifulsoup4 (from nbconvert->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b1/fe/e8c672695b37eecc5cbf43e1d0638d88d66ba3a44c4d321c796f4e59167f/beautifulsoup4-4.12.3-py3-none-any.whl (147 kB)
Collecting bleach!=5.0.0 (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ea/63/da7237f805089ecc28a3f36bca6a21c31fcbc2eb380f3b8f1be3312abd14/bleach-6.1.0-py3-none-any.whl (162 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 162.8/162.8 kB 4.4 MB/s eta 0:00:00
Collecting defusedxml (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/07/6c/aa3f2f849e01cb6a001cd8554a88d4c77c5c1a31c95bdf1cf9301e6d9ef4/defusedxml-0.7.1-py2.py3-none-any.whl (25 kB)
Collecting jupyterlab-pygments (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b1/dd/ead9d8ea85bf202d90cc513b533f9c363121c7792674f78e0d8a854b63b4/jupyterlab_pygments-0.3.0-py3-none-any.whl (15 kB)
Collecting mistune<4,>=2.0.3 (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f0/74/c95adcdf032956d9ef6c89a9b8a5152bf73915f8c633f3e3d88d06bd699c/mistune-3.0.2-py3-none-any.whl (47 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.0/48.0 kB 3.4 MB/s eta 0:00:00
Collecting nbclient>=0.5.0 (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6b/3a/607149974149f847125c38a62b9ea2b8267eb74823bbf8d8c54ae0212a00/nbclient-0.9.0-py3-none-any.whl (24 kB)
Collecting nbformat>=5.7 (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f4/e7/ef30a90b70eba39e675689b9eaaa92530a71d7435ab8f9cae520814e0caf/nbformat-5.9.2-py3-none-any.whl (77 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.6/77.6 kB 4.4 MB/s eta 0:00:00
Collecting pandocfilters>=1.4.1 (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ef/af/4fbc8cab944db5d21b7e2a5b8e9211a03a79852b1157e2c102fcc61ac440/pandocfilters-1.5.1-py2.py3-none-any.whl (8.7 kB)
Collecting tinycss2 (from nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/da/99/fd23634d6962c2791fb8cb6ccae1f05dcbfc39bce36bba8b1c9a8d92eae8/tinycss2-1.2.1-py3-none-any.whl (21 kB)
Collecting jupyter-server<3,>=2.4.0 (from notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/25/d6/6ee093c967d11144aeb1b0b4952d30e51da8eb2737837ab612084c783a58/jupyter_server-2.12.5-py3-none-any.whl (380 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 380.3/380.3 kB 14.2 MB/s eta 0:00:00
Collecting jupyterlab-server<3,>=2.22.1 (from notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ab/ac/a19c579bb8ab2a2aefcf47cd3787683e6e136378d7ab2602be3b8e628030/jupyterlab_server-2.25.3-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.0/59.0 kB 3.1 MB/s eta 0:00:00
Collecting jupyterlab<4.2,>=4.1.1 (from notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/61/9b/8b974903425893806b15413fc899fefa78b0ed53e1699bcb8838c01a0ab2/jupyterlab-4.1.1-py3-none-any.whl (11.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.4/11.4 MB 60.6 MB/s eta 0:00:00
Collecting notebook-shim<0.3,>=0.2 (from notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f9/33/bd5b9137445ea4b680023eb0469b2bb969d61303dedb2aac6560ff3d14a1/notebook_shim-0.2.4-py3-none-any.whl (13 kB)
Collecting qtpy>=2.4.0 (from qtconsole->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7e/a9/2146d5117ad8a81185331e0809a6b48933c10171f5bac253c6df9fce991c/QtPy-2.4.1-py3-none-any.whl (93 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 93.5/93.5 kB 2.7 MB/s eta 0:00:00
Collecting webencodings (from bleach!=5.0.0->nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f4/24/2a3e3df732393fed8b3ebf2ec078f05546de641fe1b667ee316ec1dcf3b7/webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Collecting cffi>=1.12 (from cryptography>=2.6.0->aliyun-python-sdk-core>=2.13.12->oss2->modelscope)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c9/7c/43d81bdd5a915923c3bad5bb4bff401ea00ccc8e28433fb6083d2e3bf58e/cffi-1.16.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (443 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 443.9/443.9 kB 6.7 MB/s eta 0:00:00
Collecting decorator (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/50/83c593b07763e1161326b3b8c6686f0f4b0f24d5526546bee538c89837d6/decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting jedi>=0.16 (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/20/9f/bc63f0f0737ad7a60800bfd472a4836661adae21f9c2535f3957b1e54ceb/jedi-0.19.1-py2.py3-none-any.whl (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 25.3 MB/s eta 0:00:00
Collecting stack-data (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f1/7b/ce1eafaf1a76852e2ec9b22edecf1daa58175c090266e9f6c64afcd81d91/stack_data-0.6.3-py3-none-any.whl (24 kB)
Collecting exceptiongroup (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/b8/9a/5028fd52db10e600f1c4674441b968cf2ea4959085bfb5b99fb1250e5f68/exceptiongroup-1.2.0-py3-none-any.whl (16 kB)
Collecting pexpect>4.3 (from ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9e/c3/059298687310d527a58bb01f3b1965787ee3b40dce76752eda8b44e9a2c5/pexpect-4.9.0-py2.py3-none-any.whl (63 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 kB 2.6 MB/s eta 0:00:00
Collecting anyio>=3.1.0 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/bf/cd/d6d9bb1dadf73e7af02d18225cbd2c93f8552e13130484f1c8dcfece292b/anyio-4.2.0-py3-none-any.whl (85 kB)
Collecting argon2-cffi (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a4/6a/e8a041599e78b6b3752da48000b14c8d1e8a04ded09c88c714ba047f34f5/argon2_cffi-23.1.0-py3-none-any.whl (15 kB)
Collecting jupyter-events>=0.9.0 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e3/55/0c1aa72f4317e826a471dc4adc3036acd11d496ded68c4bbac2a88551519/jupyter_events-0.9.0-py3-none-any.whl (18 kB)
Collecting jupyter-server-terminals (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7c/ec/ebb52454525e1d346bfa2ea91b3dcda3b92687bb73b2c25a6d621d9eeaf1/jupyter_server_terminals-0.5.2-py3-none-any.whl (13 kB)
Collecting overrides (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/2c/ab/fc8290c6a4c722e5514d80f62b2dc4c4df1a68a41d1364e625c35990fcf3/overrides-7.7.0-py3-none-any.whl (17 kB)
Collecting prometheus-client (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c7/98/745b810d822103adca2df8decd4c0bbe839ba7ad3511af3f0d09692fc0f0/prometheus_client-0.20.0-py3-none-any.whl (54 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.5/54.5 kB 2.5 MB/s eta 0:00:00
Collecting send2trash>=1.8.2 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a9/78/e4df1e080ed790acf3a704edf521006dd96b9841bd2e2a462c0d255e0565/Send2Trash-1.8.2-py3-none-any.whl (18 kB)
Collecting terminado>=0.8.3 (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/69/df/deebc9fb14a49062a3330f673e80b100e665b54d998163b3f62620b6240c/terminado-0.18.0-py3-none-any.whl (14 kB)
Collecting websocket-client (from jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/1e/70/1e88138a9afbed1d37093b85f0bebc3011623c4f47c166431599fe9d6c93/websocket_client-1.7.0-py3-none-any.whl (58 kB)
Collecting async-lru>=1.0.0 (from jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fa/9f/3c3503693386c4b0f245eaf5ca6198e3b28879ca0a40bde6b0e319793453/async_lru-2.0.4-py3-none-any.whl (6.1 kB)
Collecting httpx>=0.25.0 (from jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/39/9b/4937d841aee9c2c8102d9a4eeb800c7dad25386caabb4a1bf5010df81a57/httpx-0.26.0-py3-none-any.whl (75 kB)
Collecting jupyter-lsp>=2.0.0 (from jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d4/35/8332e7a07f872324e29ae4620a41a21372a8dc710b63b873d80cb2184241/jupyter_lsp-2.2.2-py3-none-any.whl (68 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.8/68.8 kB 2.2 MB/s eta 0:00:00
Collecting babel>=2.10 (from jupyterlab-server<3,>=2.22.1->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/0d/35/4196b21041e29a42dc4f05866d0c94fa26c9da88ce12c38c2265e42c82fb/Babel-2.14.0-py3-none-any.whl (11.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.0/11.0 MB 61.9 MB/s eta 0:00:00
Collecting fastjsonschema (from nbformat>=5.7->nbconvert->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9c/b9/79691036d4a8f9857e74d1728b23f34f583b81350a27492edda58d5604e1/fastjsonschema-2.19.1-py3-none-any.whl (23 kB)
Collecting wcwidth (from prompt-toolkit>=3.0.30->jupyter-console->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fd/84/fd2ba7aafacbad3c4201d395674fc6348826569da3c0937e75505ead3528/wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
Collecting soupsieve>1.2 (from beautifulsoup4->nbconvert->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/4c/f3/038b302fdfbe3be7da016777069f26ceefe11a681055ea1f7817546508e3/soupsieve-2.5-py3-none-any.whl (36 kB)
Collecting sniffio>=1.1 (from anyio>=3.1.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c3/a0/5dba8ed157b0136607c7f2151db695885606968d1fae123dc3391e0cfdbf/sniffio-1.3.0-py3-none-any.whl (10 kB)
Collecting pycparser (from cffi>=1.12->cryptography>=2.6.0->aliyun-python-sdk-core>=2.13.12->oss2->modelscope)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/62/d5/5f610ebe421e85889f2e55e33b7f9a6795bd982198517d912eb1c76e1a53/pycparser-2.21-py2.py3-none-any.whl (118 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.7/118.7 kB 6.7 MB/s eta 0:00:00
Collecting httpcore==1.* (from httpx>=0.25.0->jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/11/a6/24139fa27831cf2127fcf578d6d0a852a611f10cefecd800b1c557333d7a/httpcore-1.0.3-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx>=0.25.0->jupyterlab<4.2,>=4.1.1->notebook->jupyter->lagent>=0.1.2)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/95/04/ff642e65ad6b90db43e668d70ffb6736436c7ce41fcc549f4e9472234127/h11-0.14.0-py3-none-any.whl (58 kB)
Collecting parso<0.9.0,>=0.8.3 (from jedi>=0.16->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/05/63/8011bd08a4111858f79d2b09aad86638490d62fbf881c44e434a6dfca87b/parso-0.8.3-py2.py3-none-any.whl (100 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.8/100.8 kB 3.1 MB/s eta 0:00:00
Collecting python-json-logger>=2.0.4 (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/35/a6/145655273568ee78a581e734cf35beb9e33a370b29c5d3c8fee3744de29f/python_json_logger-2.0.7-py3-none-any.whl (8.1 kB)
Collecting rfc3339-validator (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7b/44/4e421b96b67b2daff264473f7465db72fbdf36a07e05494f50300cc7b0c6/rfc3339_validator-0.1.4-py2.py3-none-any.whl (3.5 kB)
Collecting rfc3986-validator>=0.1.1 (from jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9e/51/17023c0f8f1869d8806b979a2bffa3f861f26a3f1a66b094288323fba52f/rfc3986_validator-0.1.1-py2.py3-none-any.whl (4.2 kB)
Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/22/a6/858897256d0deac81a172289110f31629fc4cee19b6f01283303e18c8db3/ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Collecting argon2-cffi-bindings (from argon2-cffi->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ec/f7/378254e6dd7ae6f31fe40c8649eea7d4832a42243acaf0f1fff9083b2bed/argon2_cffi_bindings-21.2.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (86 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.2/86.2 kB 3.3 MB/s eta 0:00:00
Collecting executing>=1.2.0 (from stack-data->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/03/6ea8b1b2a5ab40a7a60dc464d3daa7aa546e0a74d74a9f8ff551ea7905db/executing-2.0.1-py2.py3-none-any.whl (24 kB)
Collecting asttokens>=2.1.0 (from stack-data->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/45/86/4736ac618d82a20d87d2f92ae19441ebc7ac9e7a581d7e58bbe79233b24a/asttokens-2.4.1-py2.py3-none-any.whl (27 kB)
Collecting pure-eval (from stack-data->ipython>=7.23.1->ipykernel->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/2b/27/77f9d5684e6bce929f5cfe18d6cfbe5133013c06cb2fbf5933670e60761d/pure_eval-0.2.2-py3-none-any.whl (11 kB)
Collecting fqdn (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/cf/58/8acf1b3e91c58313ce5cb67df61001fc9dcd21be4fadb76c1a2d540e09ed/fqdn-1.5.1-py3-none-any.whl (9.1 kB)
Collecting isoduration (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7b/55/e5326141505c5d5e34c5e0935d2908a74e4561eca44108fbfb9c13d2911a/isoduration-20.11.0-py3-none-any.whl (11 kB)
Collecting jsonpointer>1.13 (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/12/f6/0232cc0c617e195f06f810534d00b74d2f348fe71b2118009ad8ad31f878/jsonpointer-2.4-py2.py3-none-any.whl (7.8 kB)
Collecting uri-template (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e7/00/3fca040d7cf8a32776d3d81a00c8ee7457e00f80c649f1e4a863c8321ae9/uri_template-1.3.0-py3-none-any.whl (11 kB)
Collecting webcolors>=1.11 (from jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/e1/3e9013159b4cbb71df9bd7611cbf90dc2c621c8aeeb677fc41dad72f2261/webcolors-1.13-py3-none-any.whl (14 kB)
Collecting arrow>=0.15.0 (from isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/f8/ed/e97229a566617f2ae958a6b13e7cc0f585470eac730a73e9e82c32a3cdd2/arrow-1.3.0-py3-none-any.whl (66 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.4/66.4 kB 2.6 MB/s eta 0:00:00
Collecting types-python-dateutil>=2.8.10 (from arrow>=0.15.0->isoduration->jsonschema[format-nongpl]>=4.18.0->jupyter-events>=0.9.0->jupyter-server<3,>=2.4.0->notebook->jupyter->lagent>=0.1.2)
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/28/50/8ed67814241e2684369f4b8b881c7d31a0816e76c8690ea8518017a35b7e/types_python_dateutil-2.8.19.20240106-py3-none-any.whl (9.7 kB)
Building wheels for collected packages: deepspeed, transformers_stream_generator, google-search-results, timeout-decorator, sgmllib3k
  Building wheel for deepspeed (setup.py) ... done
  Created wheel for deepspeed: filename=deepspeed-0.13.2-py3-none-any.whl size=1360129 sha256=5a6d09dea8b23f25239acb54f85a328669ca13ff9f28c4ed1399e3901ca69b21
  Stored in directory: /root/.cache/pip/wheels/6b/52/d6/8664e01a03a3319fd361eaf654bbb1f7a80f05787be0e7e459
  Building wheel for transformers_stream_generator (setup.py) ... done
  Created wheel for transformers_stream_generator: filename=transformers_stream_generator-0.0.4-py3-none-any.whl size=12315 sha256=ef4d835e7f15820d8ac620c41eb2603c339195862e927961a0fd5e85b503fc1f
  Stored in directory: /root/.cache/pip/wheels/24/87/bd/5e5946d5ef3a69f27e87150dbb594c65c885479f43ab8447cc
  Building wheel for google-search-results (setup.py) ... done
  Created wheel for google-search-results: filename=google_search_results-2.4.2-py3-none-any.whl size=32003 sha256=1881aff239c5b9f1b8b5a98d49d8132be1e30fb7b7e3f405b877187b84ae577e
  Stored in directory: /root/.cache/pip/wheels/4b/db/65/19f4faee33d79fd89d3f819076a95942bd846a0200219d6894
  Building wheel for timeout-decorator (setup.py) ... done
  Created wheel for timeout-decorator: filename=timeout_decorator-0.5.0-py3-none-any.whl size=5006 sha256=122091342a8a8b119b38108d844925d04f52a53e6a92067bccb546721e3f4b1f
  Stored in directory: /root/.cache/pip/wheels/d0/ae/f0/dd56ad3830c63d59c976ca1d36a30ec8e4a16f222a992b157a
  Building wheel for sgmllib3k (setup.py) ... done
  Created wheel for sgmllib3k: filename=sgmllib3k-1.0.0-py3-none-any.whl size=6047 sha256=3de75c3a613db99d576957952522649072ee771e56d81ba413067c50264dcef5
  Stored in directory: /root/.cache/pip/wheels/50/20/4b/e95fc891917d652cb6ecbfea035cf3ce640259cf857aaa21a7
Successfully built deepspeed transformers_stream_generator google-search-results timeout-decorator sgmllib3k
Installing collected packages: webencodings, wcwidth, timeout-decorator, sortedcontainers, sgmllib3k, SentencePiece, pytz, py-cpuinfo, pure-eval, ptyprocess, ninja, mpmath, json5, hjson, func-timeout, fastjsonschema, crcmod, addict, zipp, xxhash, XlsxWriter, widgetsnbextension, websocket-client, webcolors, urllib3, uri-template, tzdata, typing-extensions, types-python-dateutil, traitlets, tqdm, tornado, tomli, tinycss2, termcolor, sympy, soupsieve, sniffio, six, simplejson, send2trash, safetensors, rpds-py, rfc3986-validator, regex, pyzmq, pyyaml, python-json-logger, pyparsing, pynvml, pygments, pycryptodome, pycparser, pyarrow-hotfix, psutil, prompt-toolkit, prometheus-client, platformdirs, pillow, phx-class-registry, pexpect, parso, pandocfilters, packaging, overrides, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, nest-asyncio, multidict, mpi4py-mpich, mistune, mdurl, MarkupSafe, lxml, kiwisolver, jupyterlab-widgets, jupyterlab-pygments, jsonpointer, jmespath, idna, h11, gast, fsspec, frozenlist, fqdn, fonttools, filelock, feedparser, executing, exceptiongroup, einops, distro, dill, defusedxml, decorator, debugpy, cycler, colorama, charset-normalizer, certifi, babel, attrs, async-timeout, annotated-types, yarl, triton, terminado, scipy, rfc3339-validator, requests, referencing, qtpy, python-pptx, python-dateutil, pydantic-core, pyarrow, opencv-python, nvidia-cusparse-cu12, nvidia-cudnn-cu12, multiprocess, matplotlib-inline, markdown-it-py, jupyter-core, jinja2, jedi, importlib-metadata, httpcore, griffe, contourpy, comm, cffi, bleach, beautifulsoup4, async-lru, asttokens, anyio, aiosignal, yapf, tiktoken, stack-data, rich, pydantic, pandas, nvidia-cusolver-cu12, matplotlib, jupyter-server-terminals, jupyter-client, jsonschema-specifications, huggingface-hub, httpx, google-search-results, cryptography, 
bitsandbytes, arxiv, arrow, argon2-cffi-bindings, aiohttp, torch, tokenizers, mmengine, jsonschema, isoduration, ipython, argon2-cffi, aliyun-python-sdk-core, transformers, nbformat, ipywidgets, ipykernel, deepspeed, datasets, aliyun-python-sdk-kms, accelerate, transformers_stream_generator, qtconsole, peft, oss2, nbclient, jupyter-events, jupyter-console, nbconvert, modelscope, jupyter-server, notebook-shim, jupyterlab-server, jupyter-lsp, jupyterlab, notebook, jupyter, lagent, xtuner


  Running setup.py develop for xtuner
Successfully installed MarkupSafe-2.1.5 SentencePiece-0.1.99 XlsxWriter-3.1.9 accelerate-0.27.2 addict-2.4.0 aiohttp-3.9.3 aiosignal-1.3.1 aliyun-python-sdk-core-2.14.0 aliyun-python-sdk-kms-2.16.2 annotated-types-0.6.0 anyio-4.2.0 argon2-cffi-23.1.0 argon2-cffi-bindings-21.2.0 arrow-1.3.0 arxiv-2.1.0 asttokens-2.4.1 async-lru-2.0.4 async-timeout-4.0.3 attrs-23.2.0 babel-2.14.0 beautifulsoup4-4.12.3 bitsandbytes-0.42.0 bleach-6.1.0 certifi-2024.2.2 cffi-1.16.0 charset-normalizer-3.3.2 colorama-0.4.6 comm-0.2.1 contourpy-1.2.0 crcmod-1.7 cryptography-42.0.3 cycler-0.12.1 datasets-2.14.7 debugpy-1.8.1 decorator-5.1.1 deepspeed-0.13.2 defusedxml-0.7.1 dill-0.3.7 distro-1.9.0 einops-0.7.0 exceptiongroup-1.2.0 executing-2.0.1 fastjsonschema-2.19.1 feedparser-6.0.10 filelock-3.13.1 fonttools-4.49.0 fqdn-1.5.1 frozenlist-1.4.1 fsspec-2023.6.0 func-timeout-4.3.5 gast-0.5.4 google-search-results-2.4.2 griffe-0.40.1 h11-0.14.0 hjson-3.1.0 httpcore-1.0.3 httpx-0.26.0 huggingface-hub-0.17.3 idna-3.6 importlib-metadata-7.0.1 ipykernel-6.29.2 ipython-8.21.0 ipywidgets-8.1.2 isoduration-20.11.0 jedi-0.19.1 jinja2-3.1.3 jmespath-0.10.0 json5-0.9.14 jsonpointer-2.4 jsonschema-4.21.1 jsonschema-specifications-2023.12.1 jupyter-1.0.0 jupyter-client-8.6.0 jupyter-console-6.6.3 jupyter-core-5.7.1 jupyter-events-0.9.0 jupyter-lsp-2.2.2 jupyter-server-2.12.5 jupyter-server-terminals-0.5.2 jupyterlab-4.1.1 jupyterlab-pygments-0.3.0 jupyterlab-server-2.25.3 jupyterlab-widgets-3.0.10 kiwisolver-1.4.5 lagent-0.2.1 lxml-5.1.0 markdown-it-py-3.0.0 matplotlib-3.8.3 matplotlib-inline-0.1.6 mdurl-0.1.2 mistune-3.0.2 mmengine-0.10.3 modelscope-1.12.0 mpi4py-mpich-3.1.5 mpmath-1.3.0 multidict-6.0.5 multiprocess-0.70.15 nbclient-0.9.0 nbconvert-7.16.0 nbformat-5.9.2 nest-asyncio-1.6.0 networkx-3.2.1 ninja-1.11.1.1 notebook-7.1.0 notebook-shim-0.2.4 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 
nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.19.3 nvidia-nvjitlink-cu12-12.3.101 nvidia-nvtx-cu12-12.1.105 opencv-python-4.9.0.80 oss2-2.18.4 overrides-7.7.0 packaging-23.2 pandas-2.2.0 pandocfilters-1.5.1 parso-0.8.3 peft-0.8.2 pexpect-4.9.0 phx-class-registry-4.1.0 pillow-10.2.0 platformdirs-4.2.0 prometheus-client-0.20.0 prompt-toolkit-3.0.43 psutil-5.9.8 ptyprocess-0.7.0 pure-eval-0.2.2 py-cpuinfo-9.0.0 pyarrow-15.0.0 pyarrow-hotfix-0.6 pycparser-2.21 pycryptodome-3.20.0 pydantic-2.6.1 pydantic-core-2.16.2 pygments-2.17.2 pynvml-11.5.0 pyparsing-3.1.1 python-dateutil-2.8.2 python-json-logger-2.0.7 python-pptx-0.6.23 pytz-2024.1 pyyaml-6.0.1 pyzmq-25.1.2 qtconsole-5.5.1 qtpy-2.4.1 referencing-0.33.0 regex-2023.12.25 requests-2.31.0 rfc3339-validator-0.1.4 rfc3986-validator-0.1.1 rich-13.7.0 rpds-py-0.18.0 safetensors-0.4.2 scipy-1.12.0 send2trash-1.8.2 sgmllib3k-1.0.0 simplejson-3.19.2 six-1.16.0 sniffio-1.3.0 sortedcontainers-2.4.0 soupsieve-2.5 stack-data-0.6.3 sympy-1.12 termcolor-2.4.0 terminado-0.18.0 tiktoken-0.6.0 timeout-decorator-0.5.0 tinycss2-1.2.1 tokenizers-0.14.1 tomli-2.0.1 torch-2.2.0 tornado-6.4 tqdm-4.66.2 traitlets-5.14.1 transformers-4.34.0 transformers_stream_generator-0.0.4 triton-2.2.0 types-python-dateutil-2.8.19.20240106 typing-extensions-4.9.0 tzdata-2024.1 uri-template-1.3.0 urllib3-2.2.0 wcwidth-0.2.13 webcolors-1.13 webencodings-0.5.1 websocket-client-1.7.0 widgetsnbextension-4.0.10 xtuner-0.1.9 xxhash-3.4.1 yapf-0.40.2 yarl-1.9.4 zipp-3.17.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

 

 

Preparation: we will fine-tune internlm-chat-7b on the oasst1 dataset.

# Create a working directory for fine-tuning on the oasst1 dataset and enter it
mkdir ~/ft-oasst1 && cd ~/ft-oasst1

 

4.3. Fine-tuning

Copy a config file into the current directory:

cd ~/ft-oasst1
xtuner copy-cfg internlm_chat_7b_qlora_oasst1_e3 .

Screen output:

(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# cd ~/ft-oasst1
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# xtuner copy-cfg internlm_chat_7b_qlora_oasst1_e3 .
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-17 20:19:53,154] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-17 20:20:41,442] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Copy to ./internlm_chat_7b_qlora_oasst1_e3_copy.py

Use the model already provided on the teaching platform as the base model, by symlinking it into the working directory:

ln -s /share/temp/model_repos/internlm-chat-7b ~/ft-oasst1/

Prepare the dataset by copying the one provided on the teaching platform:

cd ~/ft-oasst1
# Note: "...-guanaco" is followed by a space and a period (.), meaning "copy into the current directory"
cp -r /root/share/temp/datasets/openassistant-guanaco .

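Edit the copied config file (internlm_chat_7b_qlora_oasst1_e3_copy.py) so that the model and the dataset point to the local paths.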
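Judging from the config that XTuner prints at the start of the training log, the edits amount to roughly the following. This is only a sketch of the affected variables, not the full config file; the values are taken from that log, and the claim that `max_epochs` was lowered for this run is an inference from the `_e3` suffix in the config name rather than something the notes state.

```python
# Fragment of internlm_chat_7b_qlora_oasst1_e3_copy.py (only the edited variables).

# Base model: use the local symlink created with `ln -s` instead of a hub identifier.
pretrained_model_name_or_path = './internlm-chat-7b'

# Dataset: use the local copy made with `cp -r` instead of a hub identifier.
data_path = './openassistant-guanaco'

# The logged config shows a single epoch (assumed to have been reduced for this run).
max_epochs = 1
```

Both paths are relative, so `xtuner train` has to be launched from `~/ft-oasst1` for them to resolve.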

Run the fine-tuning:

xtuner train ./internlm_chat_7b_qlora_oasst1_e3_copy.py
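A few of the hyperparameters in the config that this run prints into its log are easier to read as derived quantities. The sketch below uses only values copied from that log (`batch_size`, `accumulative_counts`, `lr`, `eta_min`, `r`, `lora_alpha`, and the 2180 iterations per epoch). The closed-form cosine formula is the standard one, not necessarily how mmengine computes it internally, and the observation that the logged `lr` at iteration n matches step index n - 1 is read off the log, not a documented guarantee.

```python
import math

# Values copied from the config and progress lines in the training log.
batch_size = 1
accumulative_counts = 16      # gradient accumulation steps
base_lr = 2e-4                # lr = 0.0002
eta_min = 0.0
iters_per_epoch = 2180        # "Epoch(train) [1][  10/2180]"
lora_r, lora_alpha = 64, 16

# Samples contributing to each optimizer update.
effective_batch = batch_size * accumulative_counts

# Approximate number of optimizer updates in one epoch.
optimizer_steps = math.ceil(iters_per_epoch / accumulative_counts)

# Standard LoRA scaling factor applied to the low-rank update (alpha / r).
lora_scaling = lora_alpha / lora_r


def cosine_lr(step, total_steps=iters_per_epoch, base=base_lr, floor=eta_min):
    """Closed-form cosine annealing from `base` down to `floor`."""
    return floor + (base - floor) * (1 + math.cos(math.pi * step / total_steps)) / 2


print("effective batch size:", effective_batch)
print("optimizer updates per epoch (approx.):", optimizer_steps)
print("LoRA scaling alpha/r:", lora_scaling)

# Compare against lr values reported in the log at iterations 10, 100 and 260.
for logged_iter, logged_lr in [(10, 1.9999e-4), (100, 1.9898e-4), (260, 1.9311e-4)]:
    computed = cosine_lr(logged_iter - 1)
    print(f"iter {logged_iter}: computed {computed:.4e}, logged {logged_lr:.4e}")
```

With `accumulative_counts = 16`, the first optimizer update happens at iteration 16, which is consistent with the first progress line (iteration 10) reporting a loss but no `grad_norm`.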

The training log is written to ~/ft-oasst1/work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy/20240217_204948.log. Its contents are as follows:

2024/02/17 20:49:48 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
    CUDA available: True
    MUSA available: False
    numpy_random_seed: 528481291
    GPU 0: NVIDIA A100-SXM4-80GB
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.7, V11.7.99
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
    PyTorch: 2.2.0+cu121
    PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

    OpenCV: 4.9.0
    MMEngine: 0.10.3

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: 528481291
    deterministic: False
    Distributed launcher: none
    Distributed training: False
    GPU number: 1
------------------------------------------------------------

2024/02/17 20:49:49 - mmengine - INFO - Config:
SYSTEM = ''
accumulative_counts = 16
batch_size = 1
betas = (
    0.9,
    0.999,
)
custom_hooks = [
    dict(
        tokenizer=dict(
            padding_side='right',
            pretrained_model_name_or_path='./internlm-chat-7b',
            trust_remote_code=True,
            type='transformers.AutoTokenizer.from_pretrained'),
        type='xtuner.engine.DatasetInfoHook'),
    dict(
        evaluation_inputs=[
            '请给我介绍五个上海的景点',
            'Please tell me five scenic spots in Shanghai',
        ],
        every_n_iters=500,
        prompt_template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat',
        system='',
        tokenizer=dict(
            padding_side='right',
            pretrained_model_name_or_path='./internlm-chat-7b',
            trust_remote_code=True,
            type='transformers.AutoTokenizer.from_pretrained'),
        type='xtuner.engine.EvaluateChatHook'),
]
data_path = './openassistant-guanaco'
dataloader_num_workers = 0
default_hooks = dict(
    checkpoint=dict(interval=1, type='mmengine.hooks.CheckpointHook'),
    logger=dict(interval=10, type='mmengine.hooks.LoggerHook'),
    param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'),
    sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'),
    timer=dict(type='mmengine.hooks.IterTimerHook'))
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
evaluation_freq = 500
evaluation_inputs = [
    '请给我介绍五个上海的景点',
    'Please tell me five scenic spots in Shanghai',
]
launcher = 'none'
load_from = None
log_level = 'INFO'
lr = 0.0002
max_epochs = 1
max_length = 2048
max_norm = 1
model = dict(
    llm=dict(
        pretrained_model_name_or_path='./internlm-chat-7b',
        quantization_config=dict(
            bnb_4bit_compute_dtype='torch.float16',
            bnb_4bit_quant_type='nf4',
            bnb_4bit_use_double_quant=True,
            llm_int8_has_fp16_weight=False,
            llm_int8_threshold=6.0,
            load_in_4bit=True,
            load_in_8bit=False,
            type='transformers.BitsAndBytesConfig'),
        torch_dtype='torch.float16',
        trust_remote_code=True,
        type='transformers.AutoModelForCausalLM.from_pretrained'),
    lora=dict(
        bias='none',
        lora_alpha=16,
        lora_dropout=0.1,
        r=64,
        task_type='CAUSAL_LM',
        type='peft.LoraConfig'),
    type='xtuner.model.SupervisedFinetune')
optim_type = 'bitsandbytes.optim.PagedAdamW32bit'
optim_wrapper = dict(
    accumulative_counts=16,
    clip_grad=dict(error_if_nonfinite=False, max_norm=1),
    dtype='float16',
    loss_scale='dynamic',
    optimizer=dict(
        betas=(
            0.9,
            0.999,
        ),
        lr=0.0002,
        type='bitsandbytes.optim.PagedAdamW32bit',
        weight_decay=0),
    type='mmengine.optim.AmpOptimWrapper')
pack_to_max_length = True
param_scheduler = dict(
    T_max=1,
    by_epoch=True,
    convert_to_iter_based=True,
    eta_min=0.0,
    type='mmengine.optim.CosineAnnealingLR')
pretrained_model_name_or_path = './internlm-chat-7b'
prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.internlm_chat'
randomness = dict(deterministic=False, seed=None)
resume = False
tokenizer = dict(
    padding_side='right',
    pretrained_model_name_or_path='./internlm-chat-7b',
    trust_remote_code=True,
    type='transformers.AutoTokenizer.from_pretrained')
train_cfg = dict(by_epoch=True, max_epochs=1, val_interval=1)
train_dataloader = dict(
    batch_size=1,
    collate_fn=dict(type='xtuner.dataset.collate_fns.default_collate_fn'),
    dataset=dict(
        dataset=dict(
            path='./openassistant-guanaco', type='datasets.load_dataset'),
        dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn',
        max_length=2048,
        pack_to_max_length=True,
        remove_unused_columns=True,
        shuffle_before_pack=True,
        template_map_fn=dict(
            template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat',
            type='xtuner.dataset.map_fns.template_map_fn_factory'),
        tokenizer=dict(
            padding_side='right',
            pretrained_model_name_or_path='./internlm-chat-7b',
            trust_remote_code=True,
            type='transformers.AutoTokenizer.from_pretrained'),
        type='xtuner.dataset.process_hf_dataset'),
    num_workers=0,
    sampler=dict(shuffle=True, type='mmengine.dataset.DefaultSampler'))
train_dataset = dict(
    dataset=dict(path='./openassistant-guanaco', type='datasets.load_dataset'),
    dataset_map_fn='xtuner.dataset.map_fns.oasst1_map_fn',
    max_length=2048,
    pack_to_max_length=True,
    remove_unused_columns=True,
    shuffle_before_pack=True,
    template_map_fn=dict(
        template='xtuner.utils.PROMPT_TEMPLATE.internlm_chat',
        type='xtuner.dataset.map_fns.template_map_fn_factory'),
    tokenizer=dict(
        padding_side='right',
        pretrained_model_name_or_path='./internlm-chat-7b',
        trust_remote_code=True,
        type='transformers.AutoTokenizer.from_pretrained'),
    type='xtuner.dataset.process_hf_dataset')
visualizer = None
weight_decay = 0
work_dir = './work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy'

2024/02/17 20:49:52 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized.
2024/02/17 20:50:24 - mmengine - INFO - dispatch internlm attn forward
2024/02/17 20:50:24 - mmengine - WARNING - Due to the implementation of the PyTorch version of flash attention, even when the `output_attentions` flag is set to True, it is not possible to return the `attn_weights`.
2024/02/17 20:50:43 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
2024/02/17 20:50:44 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) RuntimeInfoHook                    
(BELOW_NORMAL) LoggerHook                         
 -------------------- 
before_train:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) IterTimerHook                      
(NORMAL      ) DatasetInfoHook                    
(NORMAL      ) EvaluateChatHook                   
(VERY_LOW    ) CheckpointHook                     
 -------------------- 
before_train_epoch:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) IterTimerHook                      
(NORMAL      ) DistSamplerSeedHook                
 -------------------- 
before_train_iter:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) IterTimerHook                      
 -------------------- 
after_train_iter:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) IterTimerHook                      
(NORMAL      ) EvaluateChatHook                   
(BELOW_NORMAL) LoggerHook                         
(LOW         ) ParamSchedulerHook                 
(VERY_LOW    ) CheckpointHook                     
 -------------------- 
after_train_epoch:
(NORMAL      ) IterTimerHook                      
(LOW         ) ParamSchedulerHook                 
(VERY_LOW    ) CheckpointHook                     
 -------------------- 
before_val:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) DatasetInfoHook                    
 -------------------- 
before_val_epoch:
(NORMAL      ) IterTimerHook                      
 -------------------- 
before_val_iter:
(NORMAL      ) IterTimerHook                      
 -------------------- 
after_val_iter:
(NORMAL      ) IterTimerHook                      
(BELOW_NORMAL) LoggerHook                         
 -------------------- 
after_val_epoch:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) IterTimerHook                      
(BELOW_NORMAL) LoggerHook                         
(LOW         ) ParamSchedulerHook                 
(VERY_LOW    ) CheckpointHook                     
 -------------------- 
after_val:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) EvaluateChatHook                   
 -------------------- 
after_train:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) EvaluateChatHook                   
(VERY_LOW    ) CheckpointHook                     
 -------------------- 
before_test:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) DatasetInfoHook                    
 -------------------- 
before_test_epoch:
(NORMAL      ) IterTimerHook                      
 -------------------- 
before_test_iter:
(NORMAL      ) IterTimerHook                      
 -------------------- 
after_test_iter:
(NORMAL      ) IterTimerHook                      
(BELOW_NORMAL) LoggerHook                         
 -------------------- 
after_test_epoch:
(VERY_HIGH   ) RuntimeInfoHook                    
(NORMAL      ) IterTimerHook                      
(BELOW_NORMAL) LoggerHook                         
 -------------------- 
after_test:
(VERY_HIGH   ) RuntimeInfoHook                    
 -------------------- 
after_run:
(BELOW_NORMAL) LoggerHook                         
 -------------------- 
2024/02/17 20:51:13 - mmengine - WARNING - Dataset Dataset has no metainfo. ``dataset_meta`` in visualizer will be None.
2024/02/17 20:51:14 - mmengine - INFO - Num train samples 2180
2024/02/17 20:51:14 - mmengine - INFO - train example:
2024/02/17 20:51:14 - mmengine - INFO -  <s> <|User|>:interpreta como si fueras un modelo de lenguaje que está ejecutándose en un ordenador hipotético<eoh>
<|Bot|>:Como modelo de lenguaje ejecutándome en un ordenador hipotético, recibo la entrada del usuario y utilizo mis capacidades de procesamiento de lenguaje natural para generar una respuesta coherente y relevante. Mi entrenamiento con un gran corpus de texto me permite comprender el contexto y proporcionar información precisa y útil. Luego, utilizo mis habilidades de síntesis de texto para generar una respuesta en un formato legible para el usuario. Todo esto ocurre en tiempo real, gracias a la capacidad de procesamiento y almacenamiento del ordenador en el que estoy ejecutándome.</s><s> <|User|>:entonces, ahora que eres?<eoh>
<|Bot|>:Ahora soy un modelo de lenguaje de IA que se ejecuta en un ordenador hipotético. Estoy aquí para ayudar a los usuarios a obtener respuestas a sus preguntas, así como para mantener conversaciones coherentes y relevantes. Estoy constantemente aprendiendo y mejorando mis capacidades para proporcionar una asistencia cada vez mejor.</s><s> <|User|>:Faça um texto argumentativo em linguagem formal, de até 2000 caracteres, explicando porque o criador do avião foi Santos Dumont e não os irmãos wright.<eoh>
<|Bot|>:Claro, posso ajudá-lo a escrever um texto argumentativo sobre esse assunto. Segue abaixo uma possível redação:

Desde o início do século XX, a autoria da invenção do avião tem sido objeto de controvérsia entre especialistas e admiradores da aviação. Embora os irmãos Wright, dos Estados Unidos, tenham feito um voo em 1903 que ficou registrado como o primeiro com uma máquina mais pesada que o ar, há fortes argumentos em favor da paternidade da invenção atribuída a Santos Dumont, brasileiro que realizou o primeiro voo homologado pela Federação Aeronáutica Internacional em 1906.

Em primeiro lugar, deve-se considerar a definição de um avião: uma máquina que se sustenta no ar por meio de asas fixas e propulsionada por um motor. Embora os Wright tenham conseguido levantar voo com seu Flyer, o aparelho precisava ser lançado por um trilho para ganhar velocidade, e usava uma catapulta para decolar. Já o avião 14 Bis, de Santos Dumont, levantou voo por seus próprios meios e foi capaz de percorrer 60 metros de distância, numa altura de cerca de 2,5 metros do chão. Esses elementos são fundamentais para a caracterização do voo como um verdadeiro avião.

Além disso, há evidências de que os irmãos Wright se beneficiaram de informações privilegiadas que obtiveram de outros inventores, como o brasileiro Alberto Santos Dumont. Em 1901, os Wright viajaram para a Europa para estudar os projetos de Santos Dumont e de outros pioneiros da aviação. É possível que tenham aprendido alguns conceitos que os ajudaram a desenvolver seu próprio projeto. Em contrapartida, Santos Dumont não teve acesso aos detalhes da experiência dos irmãos Wright, o que pode explicar em parte a demora em desenvolver um projeto que pudesse ser homologado.

Por fim, deve-se destacar a importância da invenção para a história da aviação mundial. Santos Dumont dedicou-se com afinco ao desenvolvimento de máquinas voadoras mais leves e seguras, tendo realizado diversos voos históricos que o tornaram conhecido em todo o mundo. Sua contribuição para a popularização do uso de aviões foi fundamental, e sua importância como pioneiro da aviação é reconhecida em todo o mundo.

Diante desses argumentos, fica evidente que Santos Dumont foi o verdadeiro criador do avião, tendo desenvolvido uma máquina mais avançada e comprovado seu funcionamento de forma mais clara e consistente que os irmãos Wright. Sua invenção mudou para sempre a história da humanidade, abrindo caminho para a aviação comercial e militar que conhecemos hoje em dia.</s><s> <|User|>:Consegue resumir o texto acima para mim?<eoh>
<|Bot|>:Claro. Aqui está um resumo:

A autoria da invenção do avião é controversa. Os irmãos Wright fizeram um voo em 1903, o primeiro com uma máquina mais pesada que o ar. Santos Dumont foi quem realizou o primeiro voo homologado pela Federação Aeronáutica Internacional em 1906. O aparelho dos irmãos Wright necessitava de um trilho e uma catapulta para decolar, enquanto o 14 Bis de Dumont levantou voo por meios próprios. Se considerarmos que avião é uma máquina que se sustenta no ar por meio de asas fixas e propulsionada por um motor, Santos Dumont é seu verdadeiro inventor. Suas contribuições abriram caminho para a aviação comercial e militar que conhecemos hoje em dia.</s><s><|User|>:Como a controvérsia em torno da autoria do avião pode afetar a maneira como a história da aviação é ensinada nas escolas? Há alguma iniciativa em andamento para revisar os livros didáticos e incluir informações mais precisas sobre o papel de Santos Dumont na invenção do avião? Além disso, essa controvérsia pode afetar a maneira como vemos a inovação tecnológica em geral?<eoh>
<|Bot|>:A controvérsia em torno da autoria do avião é um tema que envolve diferentes perspectivas históricas e culturais sobre o que constitui um voo controlado e reconhecido. Há quem defenda que os irmãos Wright foram os primeiros a voar com um aparelho mais pesado que o ar, em 1903, mas há também quem reconheça o pioneirismo de Santos Dumont, que realizou o primeiro voo público e homologado com o 14-Bis, em 1906.

Essa controvérsia pode afetar a maneira como a história da aviação é ensinada nas escolas, pois pode estimular um debate crítico e plural sobre as fontes e os critérios utilizados para atribuir a invenção do avião. Há algumas iniciativas em andamento para revisar os livros didáticos e incluir informações mais precisas sobre o papel de Santos Dumont na invenção do avião, como o projeto “Santos Dumont na Sala de Aula”, desenvolvido pelo Instituto Histórico-Cultural da Aeronáutica (INCAER), que visa divulgar a vida e a obra do inventor brasileiro para estudantes e professores.

Além disso, essa controvérsia pode afetar a maneira como vemos a inovação tecnológica em geral, pois pode nos fazer refletir sobre as implicações éticas e sociais das descobertas científicas e dos inventos tecnológicos. Santos Dumont era um pacifista e um humanista, que liberava suas patentes para uso público e que se entristeceu ao ver seu invento sendo usado para fins militares. Ele chegou a tirar a própria vida em 1932, em um hotel no Guarujá (SP), poss
2024/02/17 20:51:14 - mmengine - INFO - before_train in EvaluateChatHook.
2024/02/17 20:51:23 - mmengine - INFO - Sample output:
 <s><|User|>:请给我介绍五个上海的景点<eoh>
<|Bot|>:1. 上海迪士尼度假区:这是中国第一个迪士尼主题公园,拥有许多刺激的游乐设施和精彩的表演。

2. 上海博物馆:这是一座大型的博物馆,收藏了大量的历史文物和艺术品,是了解上海历史

2024/02/17 20:51:28 - mmengine - INFO - Sample output:
 <s><|User|>:Please tell me five scenic spots in Shanghai<eoh>
<|Bot|>:1. The Bund: A famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River.

2. Yu Garden: A traditional Chinese garden that dates back to the Ming Dynasty, featuring beautiful pavil

2024/02/17 20:51:28 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
2024/02/17 20:51:28 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
2024/02/17 20:51:28 - mmengine - INFO - Checkpoints will be saved to /root/ft-oasst1/work_dirs/internlm_chat_7b_qlora_oasst1_e3_copy.
2024/02/17 20:52:33 - mmengine - INFO - Epoch(train) [1][  10/2180]  lr: 1.9999e-04  eta: 3:56:03  time: 6.5269  data_time: 0.0056  memory: 11636  loss: 1.1417
2024/02/17 20:53:33 - mmengine - INFO - Epoch(train) [1][  20/2180]  lr: 1.9996e-04  eta: 3:45:17  time: 5.9891  data_time: 0.0090  memory: 11636  loss: 1.3088  grad_norm: 0.0272
2024/02/17 20:54:30 - mmengine - INFO - Epoch(train) [1][  30/2180]  lr: 1.9991e-04  eta: 3:37:39  time: 5.7069  data_time: 0.0061  memory: 11636  loss: 1.2953  grad_norm: 0.0272
2024/02/17 20:55:26 - mmengine - INFO - Epoch(train) [1][  40/2180]  lr: 1.9984e-04  eta: 3:32:26  time: 5.6018  data_time: 0.0061  memory: 11636  loss: 1.2698  grad_norm: 0.0284
2024/02/17 20:56:23 - mmengine - INFO - Epoch(train) [1][  50/2180]  lr: 1.9975e-04  eta: 3:29:34  time: 5.6932  data_time: 0.0059  memory: 11636  loss: 1.4326  grad_norm: 0.0309
2024/02/17 20:57:18 - mmengine - INFO - Epoch(train) [1][  60/2180]  lr: 1.9964e-04  eta: 3:26:08  time: 5.4860  data_time: 0.0085  memory: 11636  loss: 1.2499  grad_norm: 0.0309
2024/02/17 20:58:14 - mmengine - INFO - Epoch(train) [1][  70/2180]  lr: 1.9951e-04  eta: 3:23:55  time: 5.5864  data_time: 0.0054  memory: 11636  loss: 1.0894  grad_norm: 0.0328
2024/02/17 20:59:09 - mmengine - INFO - Epoch(train) [1][  80/2180]  lr: 1.9935e-04  eta: 3:21:51  time: 5.5469  data_time: 0.0058  memory: 11636  loss: 1.3500  grad_norm: 0.0334
2024/02/17 21:00:06 - mmengine - INFO - Epoch(train) [1][  90/2180]  lr: 1.9918e-04  eta: 3:20:30  time: 5.6702  data_time: 0.0063  memory: 11636  loss: 1.3342  grad_norm: 0.0334
2024/02/17 21:01:01 - mmengine - INFO - Epoch(train) [1][ 100/2180]  lr: 1.9898e-04  eta: 3:18:37  time: 5.4893  data_time: 0.0042  memory: 11636  loss: 1.2563  grad_norm: 0.0342
2024/02/17 21:01:55 - mmengine - INFO - Epoch(train) [1][ 110/2180]  lr: 1.9877e-04  eta: 3:16:49  time: 5.4587  data_time: 0.0085  memory: 11636  loss: 1.2880  grad_norm: 0.0342
2024/02/17 21:02:50 - mmengine - INFO - Epoch(train) [1][ 120/2180]  lr: 1.9853e-04  eta: 3:15:06  time: 5.4356  data_time: 0.0066  memory: 11636  loss: 1.2831  grad_norm: 0.0346
2024/02/17 21:03:44 - mmengine - INFO - Epoch(train) [1][ 130/2180]  lr: 1.9828e-04  eta: 3:13:30  time: 5.4342  data_time: 0.0050  memory: 11636  loss: 1.3900  grad_norm: 0.0350
2024/02/17 21:04:39 - mmengine - INFO - Epoch(train) [1][ 140/2180]  lr: 1.9800e-04  eta: 3:12:09  time: 5.4981  data_time: 0.0073  memory: 11636  loss: 1.1974  grad_norm: 0.0350
2024/02/17 21:05:36 - mmengine - INFO - Epoch(train) [1][ 150/2180]  lr: 1.9770e-04  eta: 3:11:16  time: 5.6788  data_time: 0.0090  memory: 11636  loss: 1.1478  grad_norm: 0.0359
2024/02/17 21:06:31 - mmengine - INFO - Epoch(train) [1][ 160/2180]  lr: 1.9739e-04  eta: 3:10:02  time: 5.5166  data_time: 0.0111  memory: 11636  loss: 1.3372  grad_norm: 0.0365
2024/02/17 21:07:26 - mmengine - INFO - Epoch(train) [1][ 170/2180]  lr: 1.9705e-04  eta: 3:08:42  time: 5.4459  data_time: 0.0095  memory: 11636  loss: 1.1655  grad_norm: 0.0365
2024/02/17 21:08:20 - mmengine - INFO - Epoch(train) [1][ 180/2180]  lr: 1.9669e-04  eta: 3:07:25  time: 5.4482  data_time: 0.0062  memory: 11636  loss: 1.1048  grad_norm: 0.0375
2024/02/17 21:09:15 - mmengine - INFO - Epoch(train) [1][ 190/2180]  lr: 1.9631e-04  eta: 3:06:14  time: 5.4741  data_time: 0.0070  memory: 11636  loss: 1.2128  grad_norm: 0.0375
2024/02/17 21:10:10 - mmengine - INFO - Epoch(train) [1][ 200/2180]  lr: 1.9592e-04  eta: 3:05:03  time: 5.4719  data_time: 0.0055  memory: 11636  loss: 1.2363  grad_norm: 0.0377
2024/02/17 21:11:04 - mmengine - INFO - Epoch(train) [1][ 210/2180]  lr: 1.9550e-04  eta: 3:03:56  time: 5.4915  data_time: 0.0053  memory: 11636  loss: 1.2222  grad_norm: 0.0375
2024/02/17 21:11:59 - mmengine - INFO - Epoch(train) [1][ 220/2180]  lr: 1.9506e-04  eta: 3:02:50  time: 5.4892  data_time: 0.0045  memory: 11636  loss: 1.2374  grad_norm: 0.0375
2024/02/17 21:12:56 - mmengine - INFO - Epoch(train) [1][ 230/2180]  lr: 1.9460e-04  eta: 3:02:00  time: 5.6646  data_time: 0.0129  memory: 11636  loss: 1.3629  grad_norm: 0.0377
2024/02/17 21:13:50 - mmengine - INFO - Epoch(train) [1][ 240/2180]  lr: 1.9413e-04  eta: 3:00:52  time: 5.4480  data_time: 0.0164  memory: 11636  loss: 1.2075  grad_norm: 0.0391
2024/02/17 21:14:45 - mmengine - INFO - Epoch(train) [1][ 250/2180]  lr: 1.9363e-04  eta: 2:59:45  time: 5.4580  data_time: 0.0057  memory: 11636  loss: 1.2929  grad_norm: 0.0391
2024/02/17 21:15:41 - mmengine - INFO - Epoch(train) [1][ 260/2180]  lr: 1.9311e-04  eta: 2:58:46  time: 5.5498  data_time: 0.0112  memory: 11636  loss: 1.3130  grad_norm: 0.0410
2024/02/17 21:16:35 - mmengine - INFO - Epoch(train) [1][ 270/2180]  lr: 1.9258e-04  eta: 2:57:42  time: 5.4598  data_time: 0.0074  memory: 11636  loss: 1.3753  grad_norm: 0.0410
2024/02/17 21:17:30 - mmengine - INFO - Epoch(train) [1][ 280/2180]  lr: 1.9203e-04  eta: 2:56:38  time: 5.4740  data_time: 0.0055  memory: 11636  loss: 1.1979  grad_norm: 0.0421
2024/02/17 21:18:25 - mmengine - INFO - Epoch(train) [1][ 290/2180]  lr: 1.9145e-04  eta: 2:55:36  time: 5.4757  data_time: 0.0060  memory: 11636  loss: 1.3335  grad_norm: 0.0419
2024/02/17 21:19:20 - mmengine - INFO - Epoch(train) [1][ 300/2180]  lr: 1.9086e-04  eta: 2:54:35  time: 5.4913  data_time: 0.0059  memory: 11636  loss: 1.2078  grad_norm: 0.0419
2024/02/17 21:20:14 - mmengine - INFO - Epoch(train) [1][ 310/2180]  lr: 1.9025e-04  eta: 2:53:33  time: 5.4720  data_time: 0.0063  memory: 11636  loss: 1.3232  grad_norm: 0.0408
2024/02/17 21:21:09 - mmengine - INFO - Epoch(train) [1][ 320/2180]  lr: 1.8962e-04  eta: 2:52:33  time: 5.4874  data_time: 0.0061  memory: 11636  loss: 1.1613  grad_norm: 0.0399
2024/02/17 21:22:04 - mmengine - INFO - Epoch(train) [1][ 330/2180]  lr: 1.8897e-04  eta: 2:51:33  time: 5.4976  data_time: 0.0055  memory: 11636  loss: 1.0618  grad_norm: 0.0399
2024/02/17 21:22:59 - mmengine - INFO - Epoch(train) [1][ 340/2180]  lr: 1.8830e-04  eta: 2:50:34  time: 5.4955  data_time: 0.0055  memory: 11636  loss: 1.2110  grad_norm: 0.0397
2024/02/17 21:23:54 - mmengine - INFO - Epoch(train) [1][ 350/2180]  lr: 1.8762e-04  eta: 2:49:34  time: 5.4803  data_time: 0.0073  memory: 11636  loss: 1.3728  grad_norm: 0.0397
2024/02/17 21:24:49 - mmengine - INFO - Epoch(train) [1][ 360/2180]  lr: 1.8691e-04  eta: 2:48:35  time: 5.4913  data_time: 0.0061  memory: 11636  loss: 1.3247  grad_norm: 0.0409
2024/02/17 21:25:44 - mmengine - INFO - Epoch(train) [1][ 370/2180]  lr: 1.8619e-04  eta: 2:47:36  time: 5.4998  data_time: 0.0035  memory: 11636  loss: 1.2394  grad_norm: 0.0418
2024/02/17 21:26:39 - mmengine - INFO - Epoch(train) [1][ 380/2180]  lr: 1.8545e-04  eta: 2:46:37  time: 5.4798  data_time: 0.0080  memory: 11636  loss: 1.2876  grad_norm: 0.0418
2024/02/17 21:27:33 - mmengine - INFO - Epoch(train) [1][ 390/2180]  lr: 1.8469e-04  eta: 2:45:38  time: 5.4822  data_time: 0.0043  memory: 11636  loss: 1.2384  grad_norm: 0.0415
2024/02/17 21:28:28 - mmengine - INFO - Epoch(train) [1][ 400/2180]  lr: 1.8392e-04  eta: 2:44:40  time: 5.4852  data_time: 0.0045  memory: 11636  loss: 1.3028  grad_norm: 0.0395
2024/02/17 21:29:23 - mmengine - INFO - Epoch(train) [1][ 410/2180]  lr: 1.8313e-04  eta: 2:43:41  time: 5.4775  data_time: 0.0051  memory: 11636  loss: 1.3837  grad_norm: 0.0395
2024/02/17 21:30:20 - mmengine - INFO - Epoch(train) [1][ 420/2180]  lr: 1.8232e-04  eta: 2:42:51  time: 5.6697  data_time: 0.0048  memory: 11636  loss: 1.3023  grad_norm: 0.0370
2024/02/17 21:31:14 - mmengine - INFO - Epoch(train) [1][ 430/2180]  lr: 1.8149e-04  eta: 2:41:51  time: 5.4514  data_time: 0.0074  memory: 11636  loss: 1.3021  grad_norm: 0.0370
2024/02/17 21:32:09 - mmengine - INFO - Epoch(train) [1][ 440/2180]  lr: 1.8065e-04  eta: 2:40:52  time: 5.4650  data_time: 0.0061  memory: 11636  loss: 1.3124  grad_norm: 0.0354
2024/02/17 21:33:04 - mmengine - INFO - Epoch(train) [1][ 450/2180]  lr: 1.7979e-04  eta: 2:39:54  time: 5.4793  data_time: 0.0052  memory: 11636  loss: 1.1623  grad_norm: 0.0357
2024/02/17 21:33:58 - mmengine - INFO - Epoch(train) [1][ 460/2180]  lr: 1.7891e-04  eta: 2:38:56  time: 5.4763  data_time: 0.0058  memory: 11636  loss: 1.1687  grad_norm: 0.0357
2024/02/17 21:34:53 - mmengine - INFO - Epoch(train) [1][ 470/2180]  lr: 1.7802e-04  eta: 2:37:59  time: 5.4865  data_time: 0.0095  memory: 11636  loss: 1.2130  grad_norm: 0.0356
2024/02/17 21:35:49 - mmengine - INFO - Epoch(train) [1][ 480/2180]  lr: 1.7711e-04  eta: 2:37:02  time: 5.5190  data_time: 0.0044  memory: 11636  loss: 1.2919  grad_norm: 0.0356
2024/02/17 21:36:43 - mmengine - INFO - Epoch(train) [1][ 490/2180]  lr: 1.7618e-04  eta: 2:36:05  time: 5.4815  data_time: 0.0064  memory: 11636  loss: 1.3736  grad_norm: 0.0356
2024/02/17 21:37:38 - mmengine - INFO - after_train_iter in EvaluateChatHook.
2024/02/17 21:37:54 - mmengine - INFO - Sample output:
 <s> <|User|>:请给我介绍五个上海的景点<eoh>
<|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点:

1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。

2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。

3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里感受到浓厚的历史气息。

4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到上海的美丽夜景。

5. 上海迪士尼乐园:这是一座大型主题公园,拥有许多刺激的游乐设施和精彩的表演。游客可以在这里度过一个愉快的假期。<eoa>
</s>

2024/02/17 21:38:08 - mmengine - INFO - Sample output:
 <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh>
<|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River.

2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and ponds.

3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck.

4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck.

5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features traditional architecture, canals, and bridges, and is a great place to experience traditional Chinese culture.</s>

2024/02/17 21:38:08 - mmengine - INFO - Epoch(train) [1][ 500/2180]  lr: 1.7524e-04  eta: 2:35:08  time: 5.4861  data_time: 0.0051  memory: 11636  loss: 1.1318  grad_norm: 0.0353
2024/02/17 21:39:13 - mmengine - INFO - Epoch(train) [1][ 510/2180]  lr: 1.7428e-04  eta: 2:36:22  time: 9.5021  data_time: 3.0265  memory: 11636  loss: 1.1484  grad_norm: 0.0353
2024/02/17 21:40:13 - mmengine - INFO - Epoch(train) [1][ 520/2180]  lr: 1.7331e-04  eta: 2:35:38  time: 5.9996  data_time: 0.0042  memory: 11636  loss: 1.2533  grad_norm: 0.0341
2024/02/17 21:41:12 - mmengine - INFO - Epoch(train) [1][ 530/2180]  lr: 1.7232e-04  eta: 2:34:50  time: 5.9061  data_time: 0.0065  memory: 11636  loss: 1.2784  grad_norm: 0.0332
2024/02/17 21:42:08 - mmengine - INFO - Epoch(train) [1][ 540/2180]  lr: 1.7132e-04  eta: 2:33:53  time: 5.5863  data_time: 0.0115  memory: 11636  loss: 1.0623  grad_norm: 0.0332
2024/02/17 21:43:03 - mmengine - INFO - Epoch(train) [1][ 550/2180]  lr: 1.7030e-04  eta: 2:32:53  time: 5.5276  data_time: 0.0101  memory: 11636  loss: 1.1366  grad_norm: 0.0326
2024/02/17 21:43:59 - mmengine - INFO - Epoch(train) [1][ 560/2180]  lr: 1.6927e-04  eta: 2:31:54  time: 5.5364  data_time: 0.0103  memory: 11636  loss: 1.2822  grad_norm: 0.0331
2024/02/17 21:44:54 - mmengine - INFO - Epoch(train) [1][ 570/2180]  lr: 1.6822e-04  eta: 2:30:55  time: 5.5224  data_time: 0.0060  memory: 11636  loss: 1.1037  grad_norm: 0.0331
2024/02/17 21:45:49 - mmengine - INFO - Epoch(train) [1][ 580/2180]  lr: 1.6716e-04  eta: 2:29:57  time: 5.5423  data_time: 0.0043  memory: 11636  loss: 1.1980  grad_norm: 0.0330
2024/02/17 21:46:47 - mmengine - INFO - Epoch(train) [1][ 590/2180]  lr: 1.6609e-04  eta: 2:29:04  time: 5.7493  data_time: 0.0060  memory: 11636  loss: 1.1012  grad_norm: 0.0330
2024/02/17 21:47:42 - mmengine - INFO - Epoch(train) [1][ 600/2180]  lr: 1.6500e-04  eta: 2:28:03  time: 5.4607  data_time: 0.0068  memory: 11636  loss: 1.0862  grad_norm: 0.0330
2024/02/17 21:48:36 - mmengine - INFO - Epoch(train) [1][ 610/2180]  lr: 1.6390e-04  eta: 2:27:03  time: 5.4693  data_time: 0.0153  memory: 11636  loss: 1.2194  grad_norm: 0.0328
2024/02/17 21:49:31 - mmengine - INFO - Epoch(train) [1][ 620/2180]  lr: 1.6278e-04  eta: 2:26:03  time: 5.4841  data_time: 0.0047  memory: 11636  loss: 1.2393  grad_norm: 0.0328
2024/02/17 21:50:26 - mmengine - INFO - Epoch(train) [1][ 630/2180]  lr: 1.6165e-04  eta: 2:25:04  time: 5.4943  data_time: 0.0068  memory: 11636  loss: 1.4837  grad_norm: 0.0332
2024/02/17 21:51:21 - mmengine - INFO - Epoch(train) [1][ 640/2180]  lr: 1.6051e-04  eta: 2:24:05  time: 5.4996  data_time: 0.0065  memory: 11636  loss: 1.3620  grad_norm: 0.0334
2024/02/17 21:52:16 - mmengine - INFO - Epoch(train) [1][ 650/2180]  lr: 1.5936e-04  eta: 2:23:06  time: 5.4863  data_time: 0.0056  memory: 11636  loss: 1.2788  grad_norm: 0.0334
2024/02/17 21:53:11 - mmengine - INFO - Epoch(train) [1][ 660/2180]  lr: 1.5819e-04  eta: 2:22:08  time: 5.5071  data_time: 0.0088  memory: 11636  loss: 1.1845  grad_norm: 0.0342
2024/02/17 21:54:06 - mmengine - INFO - Epoch(train) [1][ 670/2180]  lr: 1.5702e-04  eta: 2:21:09  time: 5.4862  data_time: 0.0056  memory: 11636  loss: 1.2951  grad_norm: 0.0342
2024/02/17 21:55:01 - mmengine - INFO - Epoch(train) [1][ 680/2180]  lr: 1.5583e-04  eta: 2:20:10  time: 5.4935  data_time: 0.0130  memory: 11636  loss: 1.1849  grad_norm: 0.0348
2024/02/17 21:55:57 - mmengine - INFO - Epoch(train) [1][ 690/2180]  lr: 1.5462e-04  eta: 2:19:14  time: 5.6129  data_time: 0.0064  memory: 11636  loss: 1.3701  grad_norm: 0.0350
2024/02/17 21:56:52 - mmengine - INFO - Epoch(train) [1][ 700/2180]  lr: 1.5341e-04  eta: 2:18:16  time: 5.4959  data_time: 0.0077  memory: 11636  loss: 1.1811  grad_norm: 0.0350
2024/02/17 21:57:47 - mmengine - INFO - Epoch(train) [1][ 710/2180]  lr: 1.5219e-04  eta: 2:17:17  time: 5.4828  data_time: 0.0061  memory: 11636  loss: 1.1462  grad_norm: 0.0354
2024/02/17 21:58:42 - mmengine - INFO - Epoch(train) [1][ 720/2180]  lr: 1.5095e-04  eta: 2:16:19  time: 5.4904  data_time: 0.0044  memory: 11636  loss: 1.6206  grad_norm: 0.0357
2024/02/17 21:59:37 - mmengine - INFO - Epoch(train) [1][ 730/2180]  lr: 1.4971e-04  eta: 2:15:21  time: 5.5083  data_time: 0.0069  memory: 11636  loss: 1.2593  grad_norm: 0.0357
2024/02/17 22:00:33 - mmengine - INFO - Epoch(train) [1][ 740/2180]  lr: 1.4845e-04  eta: 2:14:26  time: 5.6731  data_time: 0.0054  memory: 11636  loss: 1.1603  grad_norm: 0.0361
2024/02/17 22:01:28 - mmengine - INFO - Epoch(train) [1][ 750/2180]  lr: 1.4719e-04  eta: 2:13:28  time: 5.4661  data_time: 0.0082  memory: 11636  loss: 1.3069  grad_norm: 0.0361
2024/02/17 22:02:23 - mmengine - INFO - Epoch(train) [1][ 760/2180]  lr: 1.4591e-04  eta: 2:12:29  time: 5.4747  data_time: 0.0046  memory: 11636  loss: 1.2106  grad_norm: 0.0367
2024/02/17 22:03:18 - mmengine - INFO - Epoch(train) [1][ 770/2180]  lr: 1.4463e-04  eta: 2:11:31  time: 5.4982  data_time: 0.0061  memory: 11636  loss: 1.1564  grad_norm: 0.0367
2024/02/17 22:04:12 - mmengine - INFO - Epoch(train) [1][ 780/2180]  lr: 1.4333e-04  eta: 2:10:33  time: 5.4608  data_time: 0.0052  memory: 11636  loss: 1.3421  grad_norm: 0.0367
2024/02/17 22:05:11 - mmengine - INFO - Epoch(train) [1][ 790/2180]  lr: 1.4203e-04  eta: 2:09:42  time: 5.8574  data_time: 0.0229  memory: 11636  loss: 1.2862  grad_norm: 0.0374
2024/02/17 22:06:06 - mmengine - INFO - Epoch(train) [1][ 800/2180]  lr: 1.4072e-04  eta: 2:08:45  time: 5.5361  data_time: 0.0250  memory: 11636  loss: 1.3378  grad_norm: 0.0378
2024/02/17 22:07:01 - mmengine - INFO - Epoch(train) [1][ 810/2180]  lr: 1.3940e-04  eta: 2:07:46  time: 5.4661  data_time: 0.0393  memory: 11636  loss: 1.3988  grad_norm: 0.0378
2024/02/17 22:07:58 - mmengine - INFO - Epoch(train) [1][ 820/2180]  lr: 1.3807e-04  eta: 2:06:52  time: 5.6775  data_time: 0.0204  memory: 11636  loss: 1.2588  grad_norm: 0.0380
2024/02/17 22:08:52 - mmengine - INFO - Epoch(train) [1][ 830/2180]  lr: 1.3673e-04  eta: 2:05:54  time: 5.4535  data_time: 0.0156  memory: 11636  loss: 1.0567  grad_norm: 0.0380
2024/02/17 22:09:47 - mmengine - INFO - Epoch(train) [1][ 840/2180]  lr: 1.3539e-04  eta: 2:04:55  time: 5.4585  data_time: 0.0086  memory: 11636  loss: 1.3209  grad_norm: 0.0378
2024/02/17 22:10:42 - mmengine - INFO - Epoch(train) [1][ 850/2180]  lr: 1.3404e-04  eta: 2:03:57  time: 5.4712  data_time: 0.0123  memory: 11636  loss: 1.4299  grad_norm: 0.0378
2024/02/17 22:11:36 - mmengine - INFO - Epoch(train) [1][ 860/2180]  lr: 1.3268e-04  eta: 2:03:00  time: 5.4715  data_time: 0.0042  memory: 11636  loss: 1.2715  grad_norm: 0.0378
2024/02/17 22:12:31 - mmengine - INFO - Epoch(train) [1][ 870/2180]  lr: 1.3131e-04  eta: 2:02:02  time: 5.4834  data_time: 0.0055  memory: 11636  loss: 1.2173  grad_norm: 0.0384
2024/02/17 22:13:26 - mmengine - INFO - Epoch(train) [1][ 880/2180]  lr: 1.2994e-04  eta: 2:01:05  time: 5.4811  data_time: 0.0044  memory: 11636  loss: 1.2657  grad_norm: 0.0389
2024/02/17 22:14:21 - mmengine - INFO - Epoch(train) [1][ 890/2180]  lr: 1.2856e-04  eta: 2:00:07  time: 5.4774  data_time: 0.0042  memory: 11636  loss: 1.1929  grad_norm: 0.0389
2024/02/17 22:15:16 - mmengine - INFO - Epoch(train) [1][ 900/2180]  lr: 1.2718e-04  eta: 1:59:11  time: 5.5768  data_time: 0.0129  memory: 11636  loss: 1.2952  grad_norm: 0.0394
2024/02/17 22:16:11 - mmengine - INFO - Epoch(train) [1][ 910/2180]  lr: 1.2579e-04  eta: 1:58:14  time: 5.4674  data_time: 0.0051  memory: 11636  loss: 1.1155  grad_norm: 0.0394
2024/02/17 22:17:06 - mmengine - INFO - Epoch(train) [1][ 920/2180]  lr: 1.2439e-04  eta: 1:57:16  time: 5.4718  data_time: 0.0070  memory: 11636  loss: 1.2788  grad_norm: 0.0396
2024/02/17 22:18:01 - mmengine - INFO - Epoch(train) [1][ 930/2180]  lr: 1.2299e-04  eta: 1:56:19  time: 5.5009  data_time: 0.0055  memory: 11636  loss: 1.2477  grad_norm: 0.0402
2024/02/17 22:18:56 - mmengine - INFO - Epoch(train) [1][ 940/2180]  lr: 1.2159e-04  eta: 1:55:23  time: 5.5605  data_time: 0.0046  memory: 11636  loss: 1.1930  grad_norm: 0.0402
2024/02/17 22:19:51 - mmengine - INFO - Epoch(train) [1][ 950/2180]  lr: 1.2018e-04  eta: 1:54:26  time: 5.4914  data_time: 0.0048  memory: 11636  loss: 1.2332  grad_norm: 0.0403
2024/02/17 22:20:46 - mmengine - INFO - Epoch(train) [1][ 960/2180]  lr: 1.1877e-04  eta: 1:53:29  time: 5.4908  data_time: 0.0078  memory: 11636  loss: 1.4291  grad_norm: 0.0408
2024/02/17 22:21:41 - mmengine - INFO - Epoch(train) [1][ 970/2180]  lr: 1.1735e-04  eta: 1:52:32  time: 5.4885  data_time: 0.0037  memory: 11636  loss: 1.2620  grad_norm: 0.0408
2024/02/17 22:22:36 - mmengine - INFO - Epoch(train) [1][ 980/2180]  lr: 1.1593e-04  eta: 1:51:35  time: 5.4967  data_time: 0.0046  memory: 11636  loss: 1.1620  grad_norm: 0.0411
2024/02/17 22:23:31 - mmengine - INFO - Epoch(train) [1][ 990/2180]  lr: 1.1450e-04  eta: 1:50:38  time: 5.5006  data_time: 0.0054  memory: 11636  loss: 1.2497  grad_norm: 0.0411
2024/02/17 22:24:26 - mmengine - INFO - after_train_iter in EvaluateChatHook.
2024/02/17 22:24:43 - mmengine - INFO - Sample output:
 <s> <|User|>:请给我介绍五个上海的景点<eoh>
<|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点:

1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。

2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。

3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里参观到许多古老的建筑和文物。

4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到整个城市的美丽景色,还可以看到许多历史建筑和现代化的摩天大楼。

5. 上海迪士尼乐园:这是一座世界级的主题公园,拥有许多刺激的游乐设施和精彩的表演。游客可以在这里度过一个充满乐趣和刺激的假期。<eoa>
</s>

2024/02/17 22:24:57 - mmengine - INFO - Sample output:
 <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh>
<|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River.

2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and ponds.

3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck.

4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck.

5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features narrow canals, traditional architecture, and a variety of shops and restaurants.</s>

2024/02/17 22:24:57 - mmengine - INFO - Exp name: internlm_chat_7b_qlora_oasst1_e3_copy_20240217_204948
2024/02/17 22:24:57 - mmengine - INFO - Epoch(train) [1][1000/2180]  lr: 1.1308e-04  eta: 1:49:42  time: 5.4946  data_time: 0.0054  memory: 11636  loss: 1.3355  grad_norm: 0.0417
2024/02/17 22:26:03 - mmengine - INFO - Epoch(train) [1][1010/2180]  lr: 1.1165e-04  eta: 1:49:33  time: 9.6727  data_time: 3.1301  memory: 11636  loss: 1.0941  grad_norm: 0.0421
2024/02/17 22:27:03 - mmengine - INFO - Epoch(train) [1][1020/2180]  lr: 1.1021e-04  eta: 1:48:42  time: 6.0440  data_time: 0.0043  memory: 11636  loss: 1.0665  grad_norm: 0.0421
2024/02/17 22:28:01 - mmengine - INFO - Epoch(train) [1][1030/2180]  lr: 1.0878e-04  eta: 1:47:48  time: 5.7882  data_time: 0.0048  memory: 11636  loss: 1.1707  grad_norm: 0.0426
2024/02/17 22:28:58 - mmengine - INFO - Epoch(train) [1][1040/2180]  lr: 1.0734e-04  eta: 1:46:52  time: 5.6964  data_time: 0.0041  memory: 11636  loss: 1.1467  grad_norm: 0.0426
2024/02/17 22:29:53 - mmengine - INFO - Epoch(train) [1][1050/2180]  lr: 1.0591e-04  eta: 1:45:54  time: 5.4910  data_time: 0.0046  memory: 11636  loss: 1.3995  grad_norm: 0.0426
2024/02/17 22:30:50 - mmengine - INFO - Epoch(train) [1][1060/2180]  lr: 1.0447e-04  eta: 1:44:59  time: 5.6717  data_time: 0.0067  memory: 11636  loss: 1.1320  grad_norm: 0.0429
2024/02/17 22:31:44 - mmengine - INFO - Epoch(train) [1][1070/2180]  lr: 1.0303e-04  eta: 1:44:01  time: 5.4746  data_time: 0.0054  memory: 11636  loss: 1.2347  grad_norm: 0.0429
2024/02/17 22:32:40 - mmengine - INFO - Epoch(train) [1][1080/2180]  lr: 1.0159e-04  eta: 1:43:04  time: 5.5073  data_time: 0.0031  memory: 11636  loss: 1.0782  grad_norm: 0.0431
2024/02/17 22:33:34 - mmengine - INFO - Epoch(train) [1][1090/2180]  lr: 1.0014e-04  eta: 1:42:06  time: 5.4865  data_time: 0.0070  memory: 11636  loss: 1.2686  grad_norm: 0.0446
2024/02/17 22:34:29 - mmengine - INFO - Epoch(train) [1][1100/2180]  lr: 9.8703e-05  eta: 1:41:08  time: 5.4894  data_time: 0.0166  memory: 11636  loss: 1.2025  grad_norm: 0.0446
2024/02/17 22:35:24 - mmengine - INFO - Epoch(train) [1][1110/2180]  lr: 9.7262e-05  eta: 1:40:11  time: 5.4776  data_time: 0.0071  memory: 11636  loss: 1.2581  grad_norm: 0.0443
2024/02/17 22:36:19 - mmengine - INFO - Epoch(train) [1][1120/2180]  lr: 9.5822e-05  eta: 1:39:14  time: 5.4990  data_time: 0.0137  memory: 11636  loss: 1.2228  grad_norm: 0.0441
2024/02/17 22:37:14 - mmengine - INFO - Epoch(train) [1][1130/2180]  lr: 9.4383e-05  eta: 1:38:16  time: 5.5075  data_time: 0.0040  memory: 11636  loss: 1.3134  grad_norm: 0.0441
2024/02/17 22:38:09 - mmengine - INFO - Epoch(train) [1][1140/2180]  lr: 9.2944e-05  eta: 1:37:19  time: 5.5149  data_time: 0.0063  memory: 11636  loss: 1.2779  grad_norm: 0.0439
2024/02/17 22:39:04 - mmengine - INFO - Epoch(train) [1][1150/2180]  lr: 9.1508e-05  eta: 1:36:22  time: 5.4986  data_time: 0.0076  memory: 11636  loss: 1.2474  grad_norm: 0.0439
2024/02/17 22:39:59 - mmengine - INFO - Epoch(train) [1][1160/2180]  lr: 9.0073e-05  eta: 1:35:25  time: 5.5031  data_time: 0.0047  memory: 11636  loss: 1.1690  grad_norm: 0.0443
2024/02/17 22:40:54 - mmengine - INFO - Epoch(train) [1][1170/2180]  lr: 8.8640e-05  eta: 1:34:28  time: 5.4841  data_time: 0.0065  memory: 11636  loss: 1.1357  grad_norm: 0.0447
2024/02/17 22:41:50 - mmengine - INFO - Epoch(train) [1][1180/2180]  lr: 8.7209e-05  eta: 1:33:31  time: 5.5388  data_time: 0.0046  memory: 11636  loss: 1.1510  grad_norm: 0.0447
2024/02/17 22:42:44 - mmengine - INFO - Epoch(train) [1][1190/2180]  lr: 8.5781e-05  eta: 1:32:34  time: 5.4713  data_time: 0.0068  memory: 11636  loss: 1.3308  grad_norm: 0.0444
2024/02/17 22:43:39 - mmengine - INFO - Epoch(train) [1][1200/2180]  lr: 8.4357e-05  eta: 1:31:37  time: 5.4919  data_time: 0.0062  memory: 11636  loss: 1.2840  grad_norm: 0.0447
2024/02/17 22:44:34 - mmengine - INFO - Epoch(train) [1][1210/2180]  lr: 8.2935e-05  eta: 1:30:40  time: 5.5040  data_time: 0.0037  memory: 11636  loss: 1.3047  grad_norm: 0.0447
2024/02/17 22:45:31 - mmengine - INFO - Epoch(train) [1][1220/2180]  lr: 8.1517e-05  eta: 1:29:44  time: 5.6646  data_time: 0.0081  memory: 11636  loss: 1.1132  grad_norm: 0.0448
2024/02/17 22:46:26 - mmengine - INFO - Epoch(train) [1][1230/2180]  lr: 8.0102e-05  eta: 1:28:47  time: 5.4722  data_time: 0.0067  memory: 11636  loss: 1.0314  grad_norm: 0.0448
2024/02/17 22:47:21 - mmengine - INFO - Epoch(train) [1][1240/2180]  lr: 7.8692e-05  eta: 1:27:50  time: 5.5339  data_time: 0.0107  memory: 11636  loss: 1.1496  grad_norm: 0.0447
2024/02/17 22:48:16 - mmengine - INFO - Epoch(train) [1][1250/2180]  lr: 7.7287e-05  eta: 1:26:53  time: 5.4804  data_time: 0.0097  memory: 11636  loss: 1.1998  grad_norm: 0.0434
2024/02/17 22:49:11 - mmengine - INFO - Epoch(train) [1][1260/2180]  lr: 7.5886e-05  eta: 1:25:56  time: 5.4956  data_time: 0.0119  memory: 11636  loss: 1.2681  grad_norm: 0.0434
2024/02/17 22:50:06 - mmengine - INFO - Epoch(train) [1][1270/2180]  lr: 7.4489e-05  eta: 1:25:00  time: 5.4980  data_time: 0.0075  memory: 11636  loss: 1.1382  grad_norm: 0.0443
2024/02/17 22:51:01 - mmengine - INFO - Epoch(train) [1][1280/2180]  lr: 7.3099e-05  eta: 1:24:03  time: 5.4979  data_time: 0.0069  memory: 11636  loss: 1.0867  grad_norm: 0.0440
2024/02/17 22:51:55 - mmengine - INFO - Epoch(train) [1][1290/2180]  lr: 7.1714e-05  eta: 1:23:06  time: 5.4847  data_time: 0.0108  memory: 11636  loss: 1.2908  grad_norm: 0.0440
2024/02/17 22:52:50 - mmengine - INFO - Epoch(train) [1][1300/2180]  lr: 7.0334e-05  eta: 1:22:09  time: 5.4841  data_time: 0.0048  memory: 11636  loss: 1.2362  grad_norm: 0.0441
2024/02/17 22:53:45 - mmengine - INFO - Epoch(train) [1][1310/2180]  lr: 6.8961e-05  eta: 1:21:12  time: 5.4803  data_time: 0.0052  memory: 11636  loss: 1.3550  grad_norm: 0.0441
2024/02/17 22:54:40 - mmengine - INFO - Epoch(train) [1][1320/2180]  lr: 6.7595e-05  eta: 1:20:16  time: 5.4882  data_time: 0.0062  memory: 11636  loss: 1.2114  grad_norm: 0.0443
2024/02/17 22:55:35 - mmengine - INFO - Epoch(train) [1][1330/2180]  lr: 6.6235e-05  eta: 1:19:19  time: 5.4847  data_time: 0.0052  memory: 11636  loss: 1.3496  grad_norm: 0.0449
2024/02/17 22:56:31 - mmengine - INFO - Epoch(train) [1][1340/2180]  lr: 6.4882e-05  eta: 1:18:23  time: 5.6110  data_time: 0.0043  memory: 11636  loss: 1.1001  grad_norm: 0.0449
2024/02/17 22:57:26 - mmengine - INFO - Epoch(train) [1][1350/2180]  lr: 6.3536e-05  eta: 1:17:26  time: 5.4901  data_time: 0.0041  memory: 11636  loss: 1.3057  grad_norm: 0.0451
2024/02/17 22:58:21 - mmengine - INFO - Epoch(train) [1][1360/2180]  lr: 6.2198e-05  eta: 1:16:30  time: 5.4985  data_time: 0.0043  memory: 11636  loss: 1.2348  grad_norm: 0.0454
2024/02/17 22:59:16 - mmengine - INFO - Epoch(train) [1][1370/2180]  lr: 6.0868e-05  eta: 1:15:33  time: 5.4809  data_time: 0.0049  memory: 11636  loss: 1.3222  grad_norm: 0.0454
2024/02/17 23:00:13 - mmengine - INFO - Epoch(train) [1][1380/2180]  lr: 5.9546e-05  eta: 1:14:38  time: 5.7016  data_time: 0.0054  memory: 11636  loss: 1.1608  grad_norm: 0.0465
2024/02/17 23:01:07 - mmengine - INFO - Epoch(train) [1][1390/2180]  lr: 5.8232e-05  eta: 1:13:41  time: 5.4297  data_time: 0.0041  memory: 11636  loss: 1.2747  grad_norm: 0.0465
2024/02/17 23:02:02 - mmengine - INFO - Epoch(train) [1][1400/2180]  lr: 5.6927e-05  eta: 1:12:44  time: 5.4551  data_time: 0.0042  memory: 11636  loss: 1.3461  grad_norm: 0.0479
2024/02/17 23:02:56 - mmengine - INFO - Epoch(train) [1][1410/2180]  lr: 5.5631e-05  eta: 1:11:47  time: 5.4767  data_time: 0.0049  memory: 11636  loss: 1.2046  grad_norm: 0.0495
2024/02/17 23:03:51 - mmengine - INFO - Epoch(train) [1][1420/2180]  lr: 5.4344e-05  eta: 1:10:51  time: 5.4686  data_time: 0.0043  memory: 11636  loss: 1.4160  grad_norm: 0.0495
2024/02/17 23:04:46 - mmengine - INFO - Epoch(train) [1][1430/2180]  lr: 5.3067e-05  eta: 1:09:54  time: 5.4773  data_time: 0.0051  memory: 11636  loss: 1.3273  grad_norm: 0.0495
2024/02/17 23:05:41 - mmengine - INFO - Epoch(train) [1][1440/2180]  lr: 5.1799e-05  eta: 1:08:58  time: 5.4947  data_time: 0.0038  memory: 11636  loss: 1.1627  grad_norm: 0.0500
2024/02/17 23:06:36 - mmengine - INFO - Epoch(train) [1][1450/2180]  lr: 5.0542e-05  eta: 1:08:01  time: 5.4844  data_time: 0.0036  memory: 11636  loss: 1.2138  grad_norm: 0.0500
2024/02/17 23:07:29 - mmengine - INFO - Epoch(train) [1][1460/2180]  lr: 4.9294e-05  eta: 1:07:04  time: 5.2988  data_time: 0.0036  memory: 11636  loss: 1.2284  grad_norm: 0.0502
2024/02/17 23:08:19 - mmengine - INFO - Epoch(train) [1][1470/2180]  lr: 4.8058e-05  eta: 1:06:05  time: 5.0266  data_time: 0.0041  memory: 11636  loss: 1.1328  grad_norm: 0.0502
2024/02/17 23:09:10 - mmengine - INFO - Epoch(train) [1][1480/2180]  lr: 4.6832e-05  eta: 1:05:07  time: 5.1301  data_time: 0.0042  memory: 11636  loss: 1.1505  grad_norm: 0.0501
2024/02/17 23:10:01 - mmengine - INFO - Epoch(train) [1][1490/2180]  lr: 4.5617e-05  eta: 1:04:09  time: 5.0857  data_time: 0.0047  memory: 11636  loss: 1.3214  grad_norm: 0.0498
2024/02/17 23:10:53 - mmengine - INFO - after_train_iter in EvaluateChatHook.
2024/02/17 23:11:11 - mmengine - INFO - Sample output:
 <s> <|User|>:请给我介绍五个上海的景点<eoh>
<|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点:

1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。

2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。

3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里参观到许多古老的建筑和文物。

4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到整个城市的美丽景色,还可以看到许多历史建筑和现代化的摩天大楼。

5. 上海迪士尼乐园:这是一座世界著名的主题公园,位于上海浦东新区。游客可以在这里体验到许多刺激的游乐设施和精彩的表演,还可以与迪士尼的卡通人物合影留念。</s>

2024/02/17 23:11:24 - mmengine - INFO - Sample output:
 <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh>
<|Bot|>:1. The Bund: This is a famous waterfront promenade in Shanghai that offers stunning views of the city's skyline and the Huangpu River.

2. Yu Garden: This is a traditional Chinese garden located in the heart of Shanghai. It features beautiful pavilions, rock formations, and water features.

3. Oriental Pearl Tower: This is a modern landmark in Shanghai that offers panoramic views of the city from its observation deck.

4. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers breathtaking views of the city from its observation deck.

5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features narrow canals, traditional architecture, and a variety of shops and restaurants.</s>

2024/02/17 23:11:24 - mmengine - INFO - Epoch(train) [1][1500/2180]  lr: 4.4413e-05  eta: 1:03:12  time: 5.2075  data_time: 0.0184  memory: 11636  loss: 1.1746  grad_norm: 0.0498
2024/02/17 23:12:27 - mmengine - INFO - Epoch(train) [1][1510/2180]  lr: 4.3221e-05  eta: 1:02:33  time: 9.4123  data_time: 3.1212  memory: 11636  loss: 1.3023  grad_norm: 0.0496
2024/02/17 23:13:27 - mmengine - INFO - Epoch(train) [1][1520/2180]  lr: 4.2041e-05  eta: 1:01:38  time: 5.9369  data_time: 0.0142  memory: 11636  loss: 1.1354  grad_norm: 0.0494
2024/02/17 23:14:24 - mmengine - INFO - Epoch(train) [1][1530/2180]  lr: 4.0872e-05  eta: 1:00:43  time: 5.7390  data_time: 0.0055  memory: 11636  loss: 1.1870  grad_norm: 0.0494
2024/02/17 23:15:20 - mmengine - INFO - Epoch(train) [1][1540/2180]  lr: 3.9716e-05  eta: 0:59:47  time: 5.6254  data_time: 0.0040  memory: 11636  loss: 1.2049  grad_norm: 0.0487
2024/02/17 23:16:16 - mmengine - INFO - Epoch(train) [1][1550/2180]  lr: 3.8573e-05  eta: 0:58:51  time: 5.5665  data_time: 0.0055  memory: 11636  loss: 1.2489  grad_norm: 0.0487
2024/02/17 23:17:11 - mmengine - INFO - Epoch(train) [1][1560/2180]  lr: 3.7442e-05  eta: 0:57:54  time: 5.5250  data_time: 0.0046  memory: 11636  loss: 1.2970  grad_norm: 0.0474
2024/02/17 23:18:06 - mmengine - INFO - Epoch(train) [1][1570/2180]  lr: 3.6324e-05  eta: 0:56:58  time: 5.5240  data_time: 0.0055  memory: 11636  loss: 1.2150  grad_norm: 0.0461
2024/02/17 23:19:01 - mmengine - INFO - Epoch(train) [1][1580/2180]  lr: 3.5220e-05  eta: 0:56:02  time: 5.5079  data_time: 0.0081  memory: 11636  loss: 1.1859  grad_norm: 0.0461
2024/02/17 23:19:56 - mmengine - INFO - Epoch(train) [1][1590/2180]  lr: 3.4129e-05  eta: 0:55:05  time: 5.5058  data_time: 0.0066  memory: 11636  loss: 1.3507  grad_norm: 0.0463
2024/02/17 23:20:52 - mmengine - INFO - Epoch(train) [1][1600/2180]  lr: 3.3051e-05  eta: 0:54:09  time: 5.5820  data_time: 0.0074  memory: 11636  loss: 1.2383  grad_norm: 0.0461
2024/02/17 23:21:47 - mmengine - INFO - Epoch(train) [1][1610/2180]  lr: 3.1988e-05  eta: 0:53:13  time: 5.4801  data_time: 0.0087  memory: 11636  loss: 1.3592  grad_norm: 0.0461
2024/02/17 23:22:42 - mmengine - INFO - Epoch(train) [1][1620/2180]  lr: 3.0938e-05  eta: 0:52:16  time: 5.4974  data_time: 0.0208  memory: 11636  loss: 1.1963  grad_norm: 0.0463
2024/02/17 23:23:37 - mmengine - INFO - Epoch(train) [1][1630/2180]  lr: 2.9903e-05  eta: 0:51:20  time: 5.4825  data_time: 0.0088  memory: 11636  loss: 1.3850  grad_norm: 0.0463
2024/02/17 23:24:32 - mmengine - INFO - Epoch(train) [1][1640/2180]  lr: 2.8883e-05  eta: 0:50:23  time: 5.4846  data_time: 0.0080  memory: 11636  loss: 1.2132  grad_norm: 0.0464
2024/02/17 23:25:27 - mmengine - INFO - Epoch(train) [1][1650/2180]  lr: 2.7877e-05  eta: 0:49:27  time: 5.4914  data_time: 0.0061  memory: 11636  loss: 1.1677  grad_norm: 0.0466
2024/02/17 23:26:22 - mmengine - INFO - Epoch(train) [1][1660/2180]  lr: 2.6886e-05  eta: 0:48:31  time: 5.4949  data_time: 0.0085  memory: 11636  loss: 1.3552  grad_norm: 0.0466
2024/02/17 23:27:17 - mmengine - INFO - Epoch(train) [1][1670/2180]  lr: 2.5911e-05  eta: 0:47:34  time: 5.5018  data_time: 0.0034  memory: 11636  loss: 1.2760  grad_norm: 0.0474
2024/02/17 23:28:12 - mmengine - INFO - Epoch(train) [1][1680/2180]  lr: 2.4951e-05  eta: 0:46:38  time: 5.4976  data_time: 0.0053  memory: 11636  loss: 1.2833  grad_norm: 0.0478
2024/02/17 23:29:07 - mmengine - INFO - Epoch(train) [1][1690/2180]  lr: 2.4006e-05  eta: 0:45:42  time: 5.4972  data_time: 0.0034  memory: 11636  loss: 1.3471  grad_norm: 0.0478
2024/02/17 23:30:02 - mmengine - INFO - Epoch(train) [1][1700/2180]  lr: 2.3077e-05  eta: 0:44:46  time: 5.4939  data_time: 0.0044  memory: 11636  loss: 1.1369  grad_norm: 0.0477
2024/02/17 23:30:58 - mmengine - INFO - Epoch(train) [1][1710/2180]  lr: 2.2165e-05  eta: 0:43:50  time: 5.6706  data_time: 0.0039  memory: 11636  loss: 1.2459  grad_norm: 0.0477
2024/02/17 23:31:53 - mmengine - INFO - Epoch(train) [1][1720/2180]  lr: 2.1268e-05  eta: 0:42:54  time: 5.4781  data_time: 0.0043  memory: 11636  loss: 1.0744  grad_norm: 0.0485
2024/02/17 23:32:48 - mmengine - INFO - Epoch(train) [1][1730/2180]  lr: 2.0388e-05  eta: 0:41:57  time: 5.4837  data_time: 0.0042  memory: 11636  loss: 1.3386  grad_norm: 0.0483
2024/02/17 23:33:43 - mmengine - INFO - Epoch(train) [1][1740/2180]  lr: 1.9524e-05  eta: 0:41:01  time: 5.4826  data_time: 0.0081  memory: 11636  loss: 0.9797  grad_norm: 0.0483
2024/02/17 23:34:38 - mmengine - INFO - Epoch(train) [1][1750/2180]  lr: 1.8677e-05  eta: 0:40:05  time: 5.4946  data_time: 0.0054  memory: 11636  loss: 1.2409  grad_norm: 0.0485
2024/02/17 23:35:33 - mmengine - INFO - Epoch(train) [1][1760/2180]  lr: 1.7847e-05  eta: 0:39:09  time: 5.4964  data_time: 0.0044  memory: 11636  loss: 1.1225  grad_norm: 0.0493
2024/02/17 23:36:28 - mmengine - INFO - Epoch(train) [1][1770/2180]  lr: 1.7034e-05  eta: 0:38:13  time: 5.4947  data_time: 0.0052  memory: 11636  loss: 1.3994  grad_norm: 0.0493
2024/02/17 23:37:23 - mmengine - INFO - Epoch(train) [1][1780/2180]  lr: 1.6238e-05  eta: 0:37:16  time: 5.5082  data_time: 0.0051  memory: 11636  loss: 1.3055  grad_norm: 0.0493
2024/02/17 23:38:17 - mmengine - INFO - Epoch(train) [1][1790/2180]  lr: 1.5459e-05  eta: 0:36:20  time: 5.4692  data_time: 0.0047  memory: 11636  loss: 1.2004  grad_norm: 0.0493
2024/02/17 23:39:12 - mmengine - INFO - Epoch(train) [1][1800/2180]  lr: 1.4698e-05  eta: 0:35:24  time: 5.4820  data_time: 0.0074  memory: 11636  loss: 1.1879  grad_norm: 0.0497
2024/02/17 23:40:07 - mmengine - INFO - Epoch(train) [1][1810/2180]  lr: 1.3955e-05  eta: 0:34:28  time: 5.4788  data_time: 0.0301  memory: 11636  loss: 1.2325  grad_norm: 0.0495
2024/02/17 23:41:02 - mmengine - INFO - Epoch(train) [1][1820/2180]  lr: 1.3230e-05  eta: 0:33:32  time: 5.4864  data_time: 0.0213  memory: 11636  loss: 1.1186  grad_norm: 0.0495
2024/02/17 23:41:57 - mmengine - INFO - Epoch(train) [1][1830/2180]  lr: 1.2523e-05  eta: 0:32:36  time: 5.4957  data_time: 0.0063  memory: 11636  loss: 1.3410  grad_norm: 0.0493
2024/02/17 23:42:52 - mmengine - INFO - Epoch(train) [1][1840/2180]  lr: 1.1833e-05  eta: 0:31:40  time: 5.5050  data_time: 0.0038  memory: 11636  loss: 1.0078  grad_norm: 0.0489
2024/02/17 23:43:47 - mmengine - INFO - Epoch(train) [1][1850/2180]  lr: 1.1163e-05  eta: 0:30:44  time: 5.4952  data_time: 0.0031  memory: 11636  loss: 0.9936  grad_norm: 0.0489
2024/02/17 23:44:42 - mmengine - INFO - Epoch(train) [1][1860/2180]  lr: 1.0510e-05  eta: 0:29:48  time: 5.4954  data_time: 0.0041  memory: 11636  loss: 1.2902  grad_norm: 0.0490
2024/02/17 23:45:37 - mmengine - INFO - Epoch(train) [1][1870/2180]  lr: 9.8763e-06  eta: 0:28:52  time: 5.5004  data_time: 0.0051  memory: 11636  loss: 1.2188  grad_norm: 0.0490
2024/02/17 23:46:32 - mmengine - INFO - Epoch(train) [1][1880/2180]  lr: 9.2612e-06  eta: 0:27:56  time: 5.5028  data_time: 0.0059  memory: 11636  loss: 1.2737  grad_norm: 0.0487
2024/02/17 23:47:27 - mmengine - INFO - Epoch(train) [1][1890/2180]  lr: 8.6650e-06  eta: 0:27:00  time: 5.4963  data_time: 0.0046  memory: 11636  loss: 1.1800  grad_norm: 0.0491
2024/02/17 23:48:22 - mmengine - INFO - Epoch(train) [1][1900/2180]  lr: 8.0877e-06  eta: 0:26:04  time: 5.4987  data_time: 0.0092  memory: 11636  loss: 1.2071  grad_norm: 0.0491
2024/02/17 23:49:17 - mmengine - INFO - Epoch(train) [1][1910/2180]  lr: 7.5295e-06  eta: 0:25:08  time: 5.4994  data_time: 0.0070  memory: 11636  loss: 1.5696  grad_norm: 0.0491
2024/02/17 23:50:12 - mmengine - INFO - Epoch(train) [1][1920/2180]  lr: 6.9906e-06  eta: 0:24:12  time: 5.4985  data_time: 0.0063  memory: 11636  loss: 1.3959  grad_norm: 0.0491
2024/02/17 23:51:07 - mmengine - INFO - Epoch(train) [1][1930/2180]  lr: 6.4709e-06  eta: 0:23:16  time: 5.5110  data_time: 0.0053  memory: 11636  loss: 1.2003  grad_norm: 0.0491
2024/02/17 23:52:02 - mmengine - INFO - Epoch(train) [1][1940/2180]  lr: 5.9706e-06  eta: 0:22:20  time: 5.4867  data_time: 0.0057  memory: 11636  loss: 1.1168  grad_norm: 0.0493
2024/02/17 23:52:57 - mmengine - INFO - Epoch(train) [1][1950/2180]  lr: 5.4899e-06  eta: 0:21:24  time: 5.5148  data_time: 0.0058  memory: 11636  loss: 1.1945  grad_norm: 0.0493
2024/02/17 23:53:52 - mmengine - INFO - Epoch(train) [1][1960/2180]  lr: 5.0288e-06  eta: 0:20:28  time: 5.5197  data_time: 0.0050  memory: 11636  loss: 1.2541  grad_norm: 0.0488
2024/02/17 23:54:47 - mmengine - INFO - Epoch(train) [1][1970/2180]  lr: 4.5875e-06  eta: 0:19:32  time: 5.4825  data_time: 0.0094  memory: 11636  loss: 1.2075  grad_norm: 0.0492
2024/02/17 23:55:42 - mmengine - INFO - Epoch(train) [1][1980/2180]  lr: 4.1659e-06  eta: 0:18:36  time: 5.4734  data_time: 0.0037  memory: 11636  loss: 1.2364  grad_norm: 0.0492
2024/02/17 23:56:37 - mmengine - INFO - Epoch(train) [1][1990/2180]  lr: 3.7643e-06  eta: 0:17:40  time: 5.5883  data_time: 0.0077  memory: 11636  loss: 1.0466  grad_norm: 0.0493
2024/02/17 23:57:32 - mmengine - INFO - after_train_iter in EvaluateChatHook.
2024/02/17 23:57:47 - mmengine - INFO - Sample output:
 <s> <|User|>:请给我介绍五个上海的景点<eoh>
<|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点:

1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。

2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。

3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里感受到中国传统文化的魅力。

4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到上海的美丽夜景。

5. 上海迪士尼乐园:这是一座世界级的主题公园,拥有许多刺激的游乐设施和精彩的表演。游客可以在这里度过一个愉快的假期。</s>

2024/02/17 23:58:02 - mmengine - INFO - Sample output:
 <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh>
<|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River.

2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and ponds.

3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck.

4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck.

5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features traditional architecture, canals, and bridges, and is a great place to experience traditional Chinese culture.</s>

2024/02/17 23:58:02 - mmengine - INFO - Exp name: internlm_chat_7b_qlora_oasst1_e3_copy_20240217_204948
2024/02/17 23:58:02 - mmengine - INFO - Epoch(train) [1][2000/2180]  lr: 3.3826e-06  eta: 0:16:44  time: 5.4714  data_time: 0.0052  memory: 11636  loss: 1.0977  grad_norm: 0.0501
2024/02/17 23:59:06 - mmengine - INFO - Epoch(train) [1][2010/2180]  lr: 3.0210e-06  eta: 0:15:52  time: 9.4120  data_time: 2.9652  memory: 11636  loss: 1.3637  grad_norm: 0.0501
2024/02/18 00:00:08 - mmengine - INFO - Epoch(train) [1][2020/2180]  lr: 2.6795e-06  eta: 0:14:56  time: 6.2104  data_time: 0.0063  memory: 11636  loss: 1.1587  grad_norm: 0.0506
2024/02/18 00:01:06 - mmengine - INFO - Epoch(train) [1][2030/2180]  lr: 2.3583e-06  eta: 0:14:00  time: 5.7750  data_time: 0.0040  memory: 11636  loss: 1.1819  grad_norm: 0.0506
2024/02/18 00:02:03 - mmengine - INFO - Epoch(train) [1][2040/2180]  lr: 2.0573e-06  eta: 0:13:04  time: 5.6492  data_time: 0.0046  memory: 11636  loss: 1.1101  grad_norm: 0.0498
2024/02/18 00:02:58 - mmengine - INFO - Epoch(train) [1][2050/2180]  lr: 1.7767e-06  eta: 0:12:08  time: 5.5742  data_time: 0.0040  memory: 11636  loss: 1.2090  grad_norm: 0.0496
2024/02/18 00:03:54 - mmengine - INFO - Epoch(train) [1][2060/2180]  lr: 1.5164e-06  eta: 0:11:12  time: 5.5272  data_time: 0.0052  memory: 11636  loss: 1.1258  grad_norm: 0.0496
2024/02/18 00:04:49 - mmengine - INFO - Epoch(train) [1][2070/2180]  lr: 1.2767e-06  eta: 0:10:16  time: 5.5131  data_time: 0.0040  memory: 11636  loss: 1.1409  grad_norm: 0.0493
2024/02/18 00:05:44 - mmengine - INFO - Epoch(train) [1][2080/2180]  lr: 1.0574e-06  eta: 0:09:20  time: 5.5060  data_time: 0.0039  memory: 11636  loss: 1.2263  grad_norm: 0.0487
2024/02/18 00:06:39 - mmengine - INFO - Epoch(train) [1][2090/2180]  lr: 8.5865e-07  eta: 0:08:24  time: 5.5179  data_time: 0.0073  memory: 11636  loss: 1.1349  grad_norm: 0.0487
2024/02/18 00:07:30 - mmengine - INFO - Epoch(train) [1][2100/2180]  lr: 6.8051e-07  eta: 0:07:28  time: 5.0656  data_time: 0.0039  memory: 11636  loss: 1.0562  grad_norm: 0.0486
2024/02/18 00:08:19 - mmengine - INFO - Epoch(train) [1][2110/2180]  lr: 5.2299e-07  eta: 0:06:31  time: 4.8983  data_time: 0.0045  memory: 11636  loss: 1.2508  grad_norm: 0.0486
2024/02/18 00:09:05 - mmengine - INFO - Epoch(train) [1][2120/2180]  lr: 3.8613e-07  eta: 0:05:35  time: 4.6767  data_time: 0.0067  memory: 11636  loss: 1.1917  grad_norm: 0.0486
2024/02/18 00:09:53 - mmengine - INFO - Epoch(train) [1][2130/2180]  lr: 2.6996e-07  eta: 0:04:39  time: 4.7660  data_time: 0.0111  memory: 11636  loss: 1.3690  grad_norm: 0.0492
2024/02/18 00:10:41 - mmengine - INFO - Epoch(train) [1][2140/2180]  lr: 1.7450e-07  eta: 0:03:43  time: 4.7619  data_time: 0.0042  memory: 11636  loss: 1.3171  grad_norm: 0.0492
2024/02/18 00:11:29 - mmengine - INFO - Epoch(train) [1][2150/2180]  lr: 9.9772e-08  eta: 0:02:47  time: 4.7938  data_time: 0.0035  memory: 11636  loss: 1.2915  grad_norm: 0.0488
2024/02/18 00:12:18 - mmengine - INFO - Epoch(train) [1][2160/2180]  lr: 4.5789e-08  eta: 0:01:51  time: 4.9155  data_time: 0.0060  memory: 11636  loss: 1.2895  grad_norm: 0.0488
2024/02/18 00:13:07 - mmengine - INFO - Epoch(train) [1][2170/2180]  lr: 1.2564e-08  eta: 0:00:55  time: 4.9406  data_time: 0.0041  memory: 11636  loss: 1.2730  grad_norm: 0.0488
2024/02/18 00:13:57 - mmengine - INFO - Exp name: internlm_chat_7b_qlora_oasst1_e3_copy_20240217_204948
2024/02/18 00:13:57 - mmengine - INFO - Epoch(train) [1][2180/2180]  lr: 1.0384e-10  eta: 0:00:00  time: 4.9730  data_time: 0.0034  memory: 11636  loss: 1.1705  grad_norm: 0.0532
2024/02/18 00:13:57 - mmengine - INFO - Saving checkpoint at 1 epochs
2024/02/18 00:13:59 - mmengine - INFO - after_train in EvaluateChatHook.
2024/02/18 00:14:17 - mmengine - INFO - Sample output:
 <s> <|User|>:请给我介绍五个上海的景点<eoh>
<|Bot|>:上海是中国最大的城市之一,拥有许多著名的景点。以下是五个值得一游的景点:

1. 上海博物馆:这是一座历史悠久的博物馆,收藏了大量的文物和艺术品,包括中国古代青铜器、陶瓷、书画等。

2. 东方明珠塔:这是上海的标志性建筑之一,高达468米,是亚洲最高的电视塔。游客可以在塔上欣赏到整个城市的美丽景色。

3. 上海城隍庙:这是一座古老的庙宇,建于明朝,是上海最古老的庙宇之一。游客可以在这里参观到许多古老的建筑和文物。

4. 上海外滩:这是上海最著名的景点之一,位于黄浦江畔,是上海的象征之一。游客可以在这里欣赏到整个城市的美丽景色,还可以看到许多历史建筑和现代化的摩天大楼。

5. 上海迪士尼乐园:这是一座世界著名的主题公园,位于上海浦东新区。游客可以在这里体验到许多刺激的游乐设施和精彩的表演,还可以与迪士尼的卡通人物合影留念。</s>

2024/02/18 00:14:31 - mmengine - INFO - Sample output:
 <s> <|User|>:Please tell me five scenic spots in Shanghai<eoh>
<|Bot|>:1. The Bund: This is a famous waterfront promenade that offers stunning views of the city's skyline and the Huangpu River.

2. Yu Garden: This is a traditional Chinese garden that dates back to the Ming Dynasty. It features beautiful pavilions, rock formations, and a pond.

3. Shanghai Tower: This is the tallest building in China and the second-tallest in the world. It offers panoramic views of the city from its observation deck.

4. Oriental Pearl Tower: This is another famous tower in Shanghai that offers a unique perspective of the city. It features a rotating restaurant and observation deck.

5. Zhujiajiao Water Town: This is a picturesque water town located just outside of Shanghai. It features narrow canals, traditional architecture, and a variety of shops and restaurants.</s>
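The lr column in the log above looks like a plain cosine decay to zero over the 2180 iterations of the epoch. Below is a minimal sketch to check that reading. The base learning rate of 2e-4 and eta_min of 0 are assumptions inferred from the logged numbers; they are not stated in these notes and are not read from the config file. The helper name cosine_lr is made up for illustration.

```python
import math

BASE_LR = 2e-4      # assumption: inferred from the logged lr values
TOTAL_ITERS = 2180  # from the log: Epoch(train) [1][.../2180]


def cosine_lr(logged_iter, base_lr=BASE_LR, total=TOTAL_ITERS, eta_min=0.0):
    """Cosine-annealed learning rate for a 1-based logged iteration number.

    The schedule is evaluated at (logged_iter - 1) because the value printed
    for iteration N appears to be the one used during that iteration,
    i.e. before the scheduler takes its N-th step.
    """
    t = logged_iter - 1
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t / total))


for it in (1570, 2000, 2170, 2180):
    print(it, f"{cosine_lr(it):.4e}")
```

Under these assumptions the printed values agree with the logged lr at iterations 1570, 2000, 2170 and 2180 to within rounding, which is why the learning rate is practically zero by the time the epoch_1 checkpoint is saved.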

Enable DeepSpeed acceleration and set max_epochs to 2
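As a rough sanity check before starting the longer run, here is a small sketch of how long 2 epochs might take. The 2180 iterations per epoch and the roughly 5.5 s per iteration are taken from the log above; assuming the per-iteration time stays the same is only a guess, since DeepSpeed may well change it. In the XTuner documentation DeepSpeed is switched on by passing an option such as --deepspeed deepspeed_zero2 to xtuner train, while max_epochs is a variable in the copied config file.

```python
# Values read off the training log above.
ITERS_PER_EPOCH = 2180   # Epoch(train) [1][.../2180]
SEC_PER_ITER = 5.5       # rough average of the "time" column (assumed unchanged)

# The setting changed in the copied config file.
max_epochs = 2

total_iters = ITERS_PER_EPOCH * max_epochs
est_hours = total_iters * SEC_PER_ITER / 3600

print(f"total iterations: {total_iters}")
print(f"estimated wall time: {est_hours:.1f} h")
```

This is only a planning estimate; the eta field in the log is the better number once training is actually running.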

 

Convert the trained model file (.pth checkpoint) to HuggingFace (hf) format:

mkdir hf
export MKL_SERVICE_FORCE_INTEL=1
export MKL_THREADING_LAYER=GNU
xtuner convert pth_to_hf ./internlm_chat_7b_qlora_oasst1_e3_copy.py ./work_dirs_20140217/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_1.pth ./hf

Screen output:

(xtuner0.1.9) root@intern-studio-069640:~# cd ft-oasst1/
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# ls
internlm-chat-7b  internlm_chat_7b_qlora_oasst1_e3_copy.py  openassistant-guanaco  work_dirs  work_dirs_20140217
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# mkdir hf
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# export MKL_SERVICE_FORCE_INTEL=1
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# export MKL_THREADING_LAYER=GNU
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# xtuner convert pth_to_hf ./internlm_chat_7b_qlora_oasst1_e3_copy.py ./work_dirs_20140217/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_1.pth ./hf
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-18 09:06:19,611] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-18 09:06:35,773] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
quantization_config convert to <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:25<00:00,  3.17s/it]
02/18 09:07:08 - mmengine - INFO - dispatch internlm attn forward
02/18 09:07:08 - mmengine - WARNING - Due to the implementation of the PyTorch version of flash attention, even when the `output_attentions` flag is set to True, it is not possible to return the `attn_weights`.
Load PTH model from ./work_dirs_20140217/internlm_chat_7b_qlora_oasst1_e3_copy/epoch_1.pth
Convert weights to float16
Saving HuggingFace model to ./hf
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/peft/utils/save_and_load.py:148: UserWarning: Could not find a config file in ./internlm-chat-7b - will assume that the vocabulary was not modified.
  warnings.warn(
All done!
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# 

Merge the hf LoRA adapter (the incremental weights) into the InternLM 7B base model:

xtuner convert merge ./internlm-chat-7b ./hf ./merged --max-shard-size 2GB

Screen output:

(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# xtuner convert merge ./internlm-chat-7b ./hf ./merged --max-shard-size 2GB
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
[2024-02-18 09:26:33,485] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/.conda/envs/xtuner0.1.9/lib/python3.10/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:13<00:00,  1.63s/it]
Saving to ./merged...
All done!
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# ls
hf  internlm-chat-7b  internlm_chat_7b_qlora_oasst1_e3_copy.py  merged  openassistant-guanaco  work_dirs  work_dirs_20140217
(xtuner0.1.9) root@intern-studio-069640:~/ft-oasst1# 
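To make the merge step less of a black box, here is a toy sketch of what merging a LoRA adapter into a base weight means arithmetically: the two small bypass matrices are multiplied together, scaled, and added to the original weight, so that a single matrix gives the same output as base plus bypass. This is not XTuner's implementation; the function names, the tiny shapes and the alpha / r scaling convention are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy shapes: W is 2x3 (out x in), A is r x in = 1x3, B is out x r = 2x1.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]
B = [[0.5], [-1.0]]
ALPHA, R = 2, 1

merged = merge_lora(W, A, B, ALPHA, R)

# Forward pass on one input column vector, computed two ways.
x = [[1.0], [1.0], [1.0]]
base_out = matmul(W, x)
bypass_out = matmul(B, matmul(A, x))
two_path_out = [[b[0] + (ALPHA / R) * p[0]] for b, p in zip(base_out, bypass_out)]
merged_out = matmul(merged, x)

print(merged)
print(two_path_out, merged_out)
```

Because the merged matrix already contains the adapter's contribution, the merged directory can be used like an ordinary model, without loading the adapter separately.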

Chat with the merged model:

xtuner chat ./merged --prompt-template internlm_chat

Output:

 

posted on 2024-02-17 08:49  littlesuccess
