Kevincoding - cnblogs (博客园)

June 22, 2024

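Summary: GPT: about 117 million parameters. Model architecture: a 12-layer Transformer decoder. GPT training has two stages, pretraining and fine-tuning (SFT). Training objective: predict the next token. Benefit of pretraining: corpus learning … Read more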

posted @ 2024-06-22 15:20 Kevincoding Views (215) Comments (0) Recommendations (0)
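The "predict the next token" objective mentioned in the summary above can be illustrated with a minimal sketch. It assumes a hypothetical `predict_probs` callable that maps a token prefix to a probability distribution over the vocabulary; the name, the toy uniform model, and the vocabulary size of 4 are illustrative assumptions, not anything taken from the post or from GPT's actual code.

```python
import math

def next_token_loss(token_ids, predict_probs):
    """Average negative log-likelihood of each token given the tokens before it."""
    total = 0.0
    for t in range(1, len(token_ids)):
        # Hypothetical model call: distribution over the vocabulary for the prefix.
        probs = predict_probs(token_ids[:t])
        total += -math.log(probs[token_ids[t]])
    return total / (len(token_ids) - 1)

def uniform_model(prefix, vocab_size=4):
    # Toy stand-in for a language model: every token is equally likely.
    return [1.0 / vocab_size] * vocab_size

loss = next_token_loss([0, 2, 1, 3], uniform_model)
print(loss)
```

With a uniform model over four tokens, every position contributes the same penalty, so the loss equals the natural log of the vocabulary size. A trained model lowers this value by putting more probability on the token that actually comes next.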

June 21, 2024

super().__init__(**kwargs)

Summary: I have been brushing up on Python lately. `super().__init__(**kwargs)` is a common way to call the parent class's method from inside a class: `class Child(father_class): name: str = "Stitch" profile: str = "Tutorial Assistant" goal:` … Read more

posted @ 2024-06-21 16:56 Kevincoding Views (163) Comments (0) Recommendations (0)
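A minimal sketch of the `super().__init__(**kwargs)` pattern from the summary above. The `Father` class, its `extra` attribute, and the example values for `goal` are hypothetical and chosen only for illustration; the post's own `father_class` and its full `Child` definition are truncated in this listing and are not reproduced here.

```python
class Father:
    def __init__(self, name="Base", **kwargs):
        self.name = name
        # Keyword arguments that no class consumed are kept here for inspection.
        self.extra = kwargs

class Child(Father):
    def __init__(self, goal="", **kwargs):
        # Forward every keyword argument Child does not handle to the parent.
        super().__init__(**kwargs)
        self.goal = goal

child = Child(name="Stitch", goal="write tutorials", profile="Tutorial Assistant")
print(child.name, child.goal, child.extra)
```

`Child` consumes only `goal`; `name` travels through `**kwargs` to `Father.__init__`, and the unclaimed `profile` ends up in `extra`. This is why the pattern is convenient in class hierarchies: a subclass does not need to repeat the parent's parameter list.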
