Problems encountered while reproducing The Annotated Transformer code, with related links [Windows 11, VS Code environment]

The Annotated Transformer original page: The Annotated Transformer

The Annotated Transformer source code: harvardnlp/annotated-transformer

Environment setup for The Annotated Transformer - CSDN blog

Debugging The Annotated Transformer_annotatedtransformer.ipynb - CSDN blog

 

# Create the virtual environment
conda create -n Annotated_Transformer python=3.8.19
# Activate the virtual environment
conda activate Annotated_Transformer
# Install the dependencies listed in requirements.txt
pip install -r requirements.txt

Error while installing the dependencies: WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:1135)'))': /whl/torch_stable.html

Fixed by turning off the proxy software.
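
Before retrying, it can help to see which proxy settings Python can actually see. The following is a minimal sketch using only the standard library; the helper name `proxy_settings` is made up for illustration, and whether pip picks up exactly the same settings is an assumption.

```python
import os
import urllib.request

# Environment variables commonly used to configure proxies.
PROXY_KEYS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY")


def proxy_settings():
    """Return (env_proxies, detected_proxies) as two dicts.

    env_proxies: proxy variables set in the environment (upper or lower case).
    detected_proxies: what urllib detects, which on Windows can also
    include the system proxy settings.
    """
    env_proxies = {}
    for key in PROXY_KEYS:
        value = os.environ.get(key) or os.environ.get(key.lower())
        if value:
            env_proxies[key] = value
    return env_proxies, urllib.request.getproxies()


env_proxies, detected = proxy_settings()
print("environment:", env_proxies)
print("detected by urllib:", detected)
```

If either dict still lists a proxy after the proxy software is closed, the setting is probably left over in the shell or in the system configuration.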

 

Error while installing the dependencies, involving cymem and murmurhash (dependencies of spaCy):

error: Microsoft Visual C++ 14.0 or greater is required.  
Get it with "Microsoft C++ Build Tools":  
https://visualstudio.microsoft.com/visual-cpp-build-tools/

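Microsoft C++ Build Tools, and fixing the "Microsoft Visual C++ 14.0 is required." message shown when installing software_visual c++ build tools - CSDN blog

After installing Microsoft C++ Build Tools as described in the link, retrying the install succeeded.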

 

Configure the Jupyter kernel with ipykernel:

pip install ipykernel
python -m ipykernel install --name Annotated_Transformer

In VS Code, select the kernel at the top right of the .ipynb file page.

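
To confirm that the notebook is really running on the new kernel, a first cell along these lines can be used. It is only a sketch: `interpreter_in_env` is a hypothetical helper, and it assumes the environment folder is named exactly as in the `conda create` command above.

```python
import sys
from pathlib import PureWindowsPath

ENV_NAME = "Annotated_Transformer"  # name used in the conda commands above


def interpreter_in_env(executable, env_name=ENV_NAME):
    """Return True if the interpreter path contains a folder named env_name.

    PureWindowsPath accepts both backslash and forward-slash separators,
    so the check behaves the same for Windows and POSIX style paths.
    """
    return env_name in PureWindowsPath(executable).parts


print(sys.executable)        # interpreter used by the selected kernel
print(sys.version_info[:3])  # should report a 3.8.x version for this setup
print(interpreter_in_env(sys.executable))
```

If the last line prints False, the notebook is most likely still attached to a different interpreter, and the kernel should be selected again.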

 

Error: OSError: [E050] Can't find model 'de_core_news_sm'. It doesn't seem to be a Python package or a valid path to a data directory.

Download de_core_news_sm-3.2.0 and en_core_web_sm-3.2.0 from GitHub and put them in the project folder.

Release de_core_news_sm-3.2.0 · explosion/spacy-models

Release en_core_web_sm-3.2.0 · explosion/spacy-models

Then install them, which fixes the error.

pip install de_core_news_sm-3.2.0.tar.gz
pip install en_core_web_sm-3.2.0.tar.gz
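
Since the E050 message complains that the model is not a Python package, one quick check after installing is whether both model packages are importable from the active environment. This is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper, not part of spaCy.

```python
import importlib.util

# Model package names taken from the pip commands above.
MODELS = ("de_core_news_sm", "en_core_web_sm")


def missing_packages(names=MODELS):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both models are visible to this interpreter.
print(missing_packages())
```

If a model is still listed as missing, pip probably installed it into a different environment than the one the kernel uses.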

 

if is_interactive_notebook() raises: TypeError: issubclass() arg 1 must be a class

This is probably caused by incompatible versions of pydantic and typing_extensions.

pip install typing_extensions==4.5.0
pip install "spacy~=3.2.6"

After installing, close the project, reopen it, and run again; this fixed the error.
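
To verify which versions actually ended up in the environment after the two installs, something like the following can be run in a notebook cell. It is a sketch using `importlib.metadata` from the standard library (available from Python 3.8); `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Packages involved in the issubclass() error described above.
for name in ("spacy", "pydantic", "typing_extensions"):
    print(name, installed_version(name))
```

The kernel has to be restarted before the notebook sees newly installed versions, which is consistent with the close-and-reopen step above.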

FIXED: Pydantic issubclass error for python 3.8 and 3.9 · Issue #12659 · explosion/spaCy

[22ver Harvard Transformer source code] spaCy 3.2.0, Python 3.8: the pydantic "issubclass() arg 1 must be a class" error - CSDN blog

 

 

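Error: Could not get the file at http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz

After downloading the Multi30k files from GitHub, put them in the default cache path C:\Users\<your username>\.torchtext\cache (the path in the hash error further down shows torchtext reading them from the Multi30k subfolder of that directory).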
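
A quick way to see whether the downloaded archives are where torchtext will look is sketched below. The file names training.tar.gz and mmt16_task1_test.tar.gz come from the error messages in this post; validation.tar.gz and the helper names are assumptions made for illustration.

```python
import os

# training.tar.gz and mmt16_task1_test.tar.gz are named in this post's error
# messages; validation.tar.gz is an assumed name for the validation split.
ARCHIVES = ("training.tar.gz", "validation.tar.gz", "mmt16_task1_test.tar.gz")


def default_cache_dir():
    """Default torchtext cache plus the Multi30k subfolder seen in the hash error."""
    return os.path.join(os.path.expanduser("~"), ".torchtext", "cache", "Multi30k")


def missing_archives(cache_dir, names=ARCHIVES):
    """Return the archive names that are not present as files in cache_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(cache_dir, n))]


print(default_cache_dir())
print(missing_archives(default_cache_dir()))
```

An empty list on the second line means all three archives were found in that folder.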

small_DL_repo/datasets/Multi30k at master · neychev/small_DL_repo

Could not get the file at http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz. [RequestE_update the multi30k url - CSDN blog

 

Error: RuntimeError: The computed hash 6d1ca1dba99e2c5dd54cae1226ff11c2551e6ce63527ebb072a1f70f72a5cd36 of C:\Users\Administrator/.torchtext/cache\Multi30k\mmt16_task1_test.tar.gz does not match the expectedhash 0681be16a532912288a91ddd573594fbdd57c0fbb81486eff7c55247e35326c2. Delete the file manually and retry.

Edit the multi30k.py file under anaconda3\envs\Annotated_Transformer\Lib\site-packages\torchtext\datasets and change the value for "test" in the MD5 dict to the computed hash from the error message. (The dict is named MD5, but its values are 64-character hex digests, which is the length of a SHA-256 digest.)

MD5 = {
    "train": "20140d013d05dd9a72dfde46478663ba05737ce983f478f960c1123c6671be5e",
    "valid": "a7aa20e9ebd5ba5adce7909498b94410996040857154dab029851af3a866da8c",
    "test": "6d1ca1dba99e2c5dd54cae1226ff11c2551e6ce63527ebb072a1f70f72a5cd36",
}

If the same error still appears, use Restart Kernel and run again.
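
Instead of copying the hash out of the error message, the digest of a downloaded archive can also be computed directly. This is a minimal sketch that assumes the check uses SHA-256, which matches the 64-character hex values above; `sha256_of_file` and `hash_matches` are hypothetical helpers.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_matches(path, expected_hex):
    """Compare the file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `sha256_of_file` on the cached mmt16_task1_test.tar.gz gives the value to put under "test" in multi30k.py, so that `hash_matches` returns True for that file.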

Reproducing annotated_transformer_annotated-transformer offline multi30k download error - CSDN blog

Could not get the file at http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz. [RequestException] None - nlp - PyTorch Forums

 

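Other reference links:

Goodbye Jupyter Notebook, I can run .ipynb files in VSCode now!_vscode ipynb - CSDN blog

Do you still need to install CUDA and cuDNN manually when installing the GPU build of PyTorch? - Zhihu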

 

posted @ 2025-09-17 21:28  infocodez