Hands-on large language model tutorial: building an LLM from scratch
Resource file list:

llms-from-scratch-cn-main/
__MACOSX/._llms-from-scratch-cn-main 212B
llms-from-scratch-cn-main/Translated_Book/
__MACOSX/llms-from-scratch-cn-main/._Translated_Book 212B
llms-from-scratch-cn-main/images/
__MACOSX/llms-from-scratch-cn-main/._images 212B
llms-from-scratch-cn-main/README.md 12.86KB
__MACOSX/llms-from-scratch-cn-main/._README.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/
__MACOSX/llms-from-scratch-cn-main/._Model_Architecture_Discussions 212B
llms-from-scratch-cn-main/.gitignore 3.22KB
__MACOSX/llms-from-scratch-cn-main/._.gitignore 212B
llms-from-scratch-cn-main/Book/
__MACOSX/llms-from-scratch-cn-main/._Book 212B
llms-from-scratch-cn-main/Codes/
__MACOSX/llms-from-scratch-cn-main/._Codes 212B
llms-from-scratch-cn-main/LICENSE.txt 1.02KB
__MACOSX/llms-from-scratch-cn-main/._LICENSE.txt 212B
llms-from-scratch-cn-main/Translated_Book/ch01/
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch01 212B
llms-from-scratch-cn-main/Translated_Book/ch04/
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch04 212B
llms-from-scratch-cn-main/Translated_Book/ch03/
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch03 212B
llms-from-scratch-cn-main/Translated_Book/img/
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._img 212B
llms-from-scratch-cn-main/Translated_Book/ch02/
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch02 212B
llms-from-scratch-cn-main/Translated_Book/ch05/
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch05 212B
llms-from-scratch-cn-main/images/mental-model.jpg 173.65KB
__MACOSX/llms-from-scratch-cn-main/images/._mental-model.jpg 212B
llms-from-scratch-cn-main/images/cover.jpg 47.2KB
__MACOSX/llms-from-scratch-cn-main/images/._cover.jpg 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._llama3 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._phi-3 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._olmo 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._MiniCPM 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v1 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v6 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._pangu 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._mamba 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-compare 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/.keep
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._.keep 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._ChatGLM4 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._ChatGLM3 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/img/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._img 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._openelm 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._gptj 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v3 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v4 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v5 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v2 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._phi 212B
llms-from-scratch-cn-main/Book/ch06/
__MACOSX/llms-from-scratch-cn-main/Book/._ch06 212B
llms-from-scratch-cn-main/Book/ch01/
__MACOSX/llms-from-scratch-cn-main/Book/._ch01 212B
llms-from-scratch-cn-main/Book/ch04/
__MACOSX/llms-from-scratch-cn-main/Book/._ch04 212B
llms-from-scratch-cn-main/Book/ch03/
__MACOSX/llms-from-scratch-cn-main/Book/._ch03 212B
llms-from-scratch-cn-main/Book/ch02/
__MACOSX/llms-from-scratch-cn-main/Book/._ch02 212B
llms-from-scratch-cn-main/Book/ch05/
__MACOSX/llms-from-scratch-cn-main/Book/._ch05 212B
llms-from-scratch-cn-main/Codes/ch07/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch07 212B
llms-from-scratch-cn-main/Codes/ch06/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch06 212B
llms-from-scratch-cn-main/Codes/ch01/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch01 212B
llms-from-scratch-cn-main/Codes/appendix-B/
__MACOSX/llms-from-scratch-cn-main/Codes/._appendix-B 212B
llms-from-scratch-cn-main/Codes/ch04/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch04 212B
llms-from-scratch-cn-main/Codes/ch03/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch03 212B
llms-from-scratch-cn-main/Codes/ch02/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch02 212B
llms-from-scratch-cn-main/Codes/ch05/
__MACOSX/llms-from-scratch-cn-main/Codes/._ch05 212B
llms-from-scratch-cn-main/Codes/appendix-A/
__MACOSX/llms-from-scratch-cn-main/Codes/._appendix-A 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.1什么是LLM.md 20.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.1什么是LLM.md 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.0理解大型语言模型.md 14.14KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.0理解大型语言模型.md 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.8总结.ipynb 2.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.8总结.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch01/.keep
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._.keep 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.6深入剖析GPT架构.ipynb 5.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.6深入剖析GPT架构.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.7构建大语言模型.ipynb 2.47KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.7构建大语言模型.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.2LLMs的应用.md 5.15KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.2LLMs的应用.md 212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.5利用大型数据集.ipynb 5.59KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.5利用大型数据集.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch01/welcome.ipynb 21.17KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._welcome.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.7 生成文本.ipynb 6.47KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.7 生成文本.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.5 在transfomer模块中连接注意力层和线性层.ipynb 12.37KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.5 在transfomer模块中连接注意力层和线性层.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/.keep
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._.keep 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.6 编码GPT模型.ipynb 15.42KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.6 编码GPT模型.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.1.ipynb 17.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.1.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.3 实现使用 GELU 激活函数的前馈网络.ipynb 56.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.3 实现使用 GELU 激活函数的前馈网络.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.2 使用层归一化对激活进行归一化.ipynb 15.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.2 使用层归一化对激活进行归一化.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.2.ipynb 15.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.2.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.4 增加快捷链接.ipynb 11.6KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.4 增加快捷链接.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.1 从头开始实现 GPT 模型以生成文本.ipynb 17.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.1 从头开始实现 GPT 模型以生成文本.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.6 编码GPT模型-Copy1.ipynb 15.42KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.6 编码GPT模型-Copy1.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.1.ipynb 9.1KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.1.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.3.ipynb 25.13KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.3.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.7.ipynb 2.4KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.7.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.5.ipynb 27.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.5.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.2.ipynb 3.82KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.2.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.4.ipynb 25.56KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.4.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch03/.keep
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._.keep 212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.6.ipynb 23.54KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.6.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-1.jpg 94.35KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-12.jpg 138.23KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-12.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-13.jpg 167.87KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-13.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-26.jpg 115.67KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-26.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-9.jpg 129.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-9.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-2.jpg 131.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-24.jpg 128.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-24.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-18.jpg 196.89KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-18.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-11.jpg 93.37KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-11.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-19.jpg 126.72KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-19.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-10.jpg 167.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-10.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-25.jpg 83.88KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-25.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-3.jpg 104.26KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-3.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-8.jpg 61.73KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-8.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-7.jpg 47.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-7.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-14.jpg 87.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-14.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-21.jpg 43.3KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-21.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-20.jpg 45.71KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-20.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-15.jpg 168.83KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-15.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-6.jpg 70.01KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-6.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-4.jpg 74.29KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-4.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-17.jpg 112.79KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-17.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-22.jpg 172.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-22.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-9.jpg 81.89KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-9.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-8.jpg 109.97KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-8.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-23.jpg 61.8KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-23.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-16.jpg 155.7KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-16.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-5.jpg 79.79KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-5.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-10.jpg 150.51KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-10.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-9.jpg 213.86KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-9.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-8.jpg 95.4KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-8.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-11.jpg 90.18KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-11.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-13.jpg 107.2KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-13.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-12.jpg 133.53KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-12.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-9.jpg 66.19KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-9.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-16.jpg 115.53KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-16.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-17.jpg 125.99KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-17.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-D-1.jpg 50.15KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-D-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-8.jpg 79.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-8.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/.keep 1B
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._.keep 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-15.jpg 86.82KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-15.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-8.jpg 104.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-8.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-9.jpg 97.65KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-9.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-14.jpg 135.22KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-14.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-D-2.jpg 59.13KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-D-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-6.jpg 101.75KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-6.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-19.jpg 147.47KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-19.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-10.jpg 93.3KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-10.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-4.jpg 126.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-4.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-6.png 87.65KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-6.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-1.jpg 102.06KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-5.jpg 105.8KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-5.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-18.jpg 69.81KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-18.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-11.jpg 202.93KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-11.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-11.png 208.53KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-11.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-7.jpg 112.03KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-7.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-5.jpg 83.27KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-5.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-13.png 182.67KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-13.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-13.jpg 91.33KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-13.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-7.jpg 56.11KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-7.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-5.png 152.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-5.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-3.png 117.99KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-3.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-2.jpg 102KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-4.png 134.22KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-4.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-12.jpg 70.08KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-12.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-6.png 132.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-6.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-12.png 69.83KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-12.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-4.jpg 70.08KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-4.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-2.jpg 101.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-16.jpg 94.51KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-16.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-6.jpg 93.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-6.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-7.jpg 95.7KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-7.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/cover-1.jpg 83.35KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._cover-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-1.png 225.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-1.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-17.jpg 137.73KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-17.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-3.jpg 104.25KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-3.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-1.jpg 76.02KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-3.jpg 113.68KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-3.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-20.png 86.1KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-20.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-1.jpg 68.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-15.jpg 108.58KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-15.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-3.png 202.77KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-3.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-5.png 157.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-5.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-4.jpg 111.83KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-4.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/cover-2.jpg 83.55KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._cover-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-2.png 188KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-2.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-14.jpg 72.42KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-14.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-21.png 138.58KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-21.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-2.jpg 81.32KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-3.jpg 114.82KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-3.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.2.png 67.07KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.2.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-10.jpg 80.32KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-10.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-3.png 209.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-3.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-8.jpg 118.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-8.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-12.jpg 85.3KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-12.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-5.jpg 42.94KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-5.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-4.jpg 87.74KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-4.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-13.jpg 109.86KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-13.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-9.jpg 197.05KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-9.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-2.png 121.98KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-2.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-11.jpg 96.81KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-11.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.3.png 87.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.3.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-2.jpg 123.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.1.png 54.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.1.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-13.jpg 65.57KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-13.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-18.jpg 101.23KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-18.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-11.jpg 112.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-11.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-6.jpg 134.34KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-6.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1.7-1.jpg 939.45KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1.7-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-7.jpg 120.84KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-7.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-10.jpg 87.2KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-10.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-1.png 188.84KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-1.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-12.jpg 71.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-12.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-1.jpg 89.94KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-5.jpg 153.75KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-5.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.4.png 129.01KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.4.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-5.png 274.39KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-5.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-3.jpg 78.54KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-3.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-14.jpg 82.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-14.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-15.jpg 72.77KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-15.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-2.jpg 85.79KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-2.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-4.png 218.43KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-4.png 212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.5.png 109.51KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.5.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-4.jpg 151KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-4.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-6.jpg 148.58KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-6.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-6.png 217.09KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-6.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-17.jpg 88.76KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-17.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-16.jpg 85.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-16.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-1.jpg 106.27KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-1.jpg 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-7.png 216.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-7.png 212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.6.png 76.27KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.6.png 212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-7.jpg 92.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-7.jpg 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.1理解词嵌入.ipynb 6.61KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.1理解词嵌入.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.5 字节对编码(BPE).ipynb 101.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.5 字节对编码(BPE).ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.8词位置编码.ipynb 8.11KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.8词位置编码.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.6使用滑动窗口进行数据采样.ipynb 20.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.6使用滑动窗口进行数据采样.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/.keep
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._.keep 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.7 构建词符嵌入.ipynb 6.24KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.7 构建词符嵌入.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.文本数据处理.ipynb 3.25KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.文本数据处理.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.2文本分词(序列化).ipynb 13.23KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.2文本分词(序列化).ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.3将令牌转换为令牌 ID.ipynb 16.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.3将令牌转换为令牌 ID.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.4添加特殊上下文tokens.ipynb 13.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.4添加特殊上下文tokens.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch05/5.1 在未标记的数据上进行预训练.ipynb 63.19KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._5.1 在未标记的数据上进行预训练.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch05/.keep
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._.keep 212B
llms-from-scratch-cn-main/Translated_Book/ch05/5.3.ipynb 23.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._5.3.ipynb 212B
llms-from-scratch-cn-main/Translated_Book/ch05/5.2.ipynb 14.26KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._5.2.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/llama3-from-scratch.ipynb 289.18KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._llama3-from-scratch.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/LICENSE 1.05KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._LICENSE 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/requirements.txt 48B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._requirements.txt 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._images 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/params.txt 182B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._params.txt 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/params.json 212B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._params.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/tokenizer.model 2.08MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._tokenizer.model 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/README.md 44.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._README.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/modeling_phi3.py 70.42KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/._modeling_phi3.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/phi-3.ipynb 8.7KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/._phi-3.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/configuration_phi3.py 9.25KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/._configuration_phi3.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/configuration_olmo.py 7.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/._configuration_olmo.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/olmo.ipynb 6.33KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/._olmo.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/modeling_olmo.py 57.71KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/._modeling_olmo.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/configuration_minicpm.py 2.4KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._configuration_minicpm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/tokenizer_config.json 1.11KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._tokenizer_config.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/special_tokens_map.json 414B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._special_tokens_map.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/MiniCPM.ipynb 56.76KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._MiniCPM.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/gitattributes 1.52KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._gitattributes 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/config.json 712B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._config.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/tokenizer.json 5.92MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._tokenizer.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/MiniCPM.py 31.54KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._MiniCPM.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/generation_config.json 113B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._generation_config.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/tokenizer.model 1.9MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._tokenizer.model 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/README.md 11.31KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._README.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/MiniCPMTest.ipynb 9.94KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._MiniCPMTest.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/model.py 21.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/._model.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/readme.md 8.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/._readme.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/RWKV_v6_demo.ipynb 15.14KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._RWKV_v6_demo.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/rwkv_vocab_v20230424.txt 1.04MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._rwkv_vocab_v20230424.txt 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/img/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._img 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/RWKV-v6-guide.ipynb 21.63KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._RWKV-v6-guide.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/tokenization_gptpangu_bak.py 4.58KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._tokenization_gptpangu_bak.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/modeling_gptpangu.py 21.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._modeling_gptpangu.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/pangu.ipynb 12.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._pangu.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/tokenization_gptpangu.py 4.15KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._tokenization_gptpangu.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/configuration_gptpangu.py 1.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._configuration_gptpangu.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/demo.ipynb 10.42KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/._demo.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/model.py 12.17KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/._model.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/README.md 1.32KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/._README.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v5.py 9.98KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v5.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v1.py 21.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v1.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v4.py 6.42KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v4.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/readme.md 11.02KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._readme.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v3.py 8.92KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v3.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v6.py 9.43KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v6.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v2.py 8.03KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v2.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/chatglm4.ipynb 11.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._chatglm4.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/configuration_chatglm.py 2.21KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._configuration_chatglm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/tokenization_chatglm.py 15.28KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._tokenization_chatglm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/chatglm4-guide.ipynb 188.91KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._chatglm4-guide.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/modeling_chatglm.py 51.74KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._modeling_chatglm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/glm.py 46.9KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._glm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/tokenizer_config.json 1.38KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._tokenizer_config.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/quantization.py 14.32KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._quantization.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/tokenization_chatglm.py 12.69KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._tokenization_chatglm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/configuration_chatglm_full.py 1.07KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._configuration_chatglm_full.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/tokenizer.model 994.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._tokenizer.model 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/README.md 1.43KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._README.md 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/img/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._img 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/加载模型权重.ipynb 79.31KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._加载模型权重.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/img/.keep 1B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/img/._.keep 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/openelm.ipynb 11.18KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/._openelm.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/configuration_openelm.py 13.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/._configuration_openelm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/modeling_openelm.py 38.32KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/._modeling_openelm.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/gptj.ipynb 8.33KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/._gptj.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/modeling_gptj.py 60.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/._modeling_gptj.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/configuration_gptj.py 7.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/._configuration_gptj.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/model_run.py 11.06KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._model_run.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/20B_tokenizer.json 2.35MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._20B_tokenizer.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/model.py 8.97KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._model.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/rwkv-v3-guide.ipynb 27.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._rwkv-v3-guide.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/rwkv-v3.ipynb 10.57KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._rwkv-v3.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/utils.py 3.98KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._utils.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/20B_tokenizer.json 2.35MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/._20B_tokenizer.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/rwkv-v4-guide.ipynb 21.21KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/._rwkv-v4-guide.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/rwkv_vocab_v20230424.txt 1.04MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._rwkv_vocab_v20230424.txt 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/img/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._img 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/RWKV_v5_demo.ipynb 21.4KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._RWKV_v5_demo.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/RWKV-v5-guide.ipynb 42.41KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._RWKV-v5-guide.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/20B_tokenizer.json 2.35MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._20B_tokenizer.json 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/rwkv-v2-guide.ipynb 33.25KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._rwkv-v2-guide.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/model.py 8.03KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._model.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/img/
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._img 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/rwkv-v2.ipynb 35.39KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._rwkv-v2.ipynb 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/modeling_phi.py 66.49KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/._modeling_phi.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/configuration_phi.py 8.26KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/._configuration_phi.py 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/phi.ipynb 13.82KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/._phi.ipynb 212B
llms-from-scratch-cn-main/Book/ch06/.keep
__MACOSX/llms-from-scratch-cn-main/Book/ch06/._.keep 212B
llms-from-scratch-cn-main/Book/ch01/.keep
__MACOSX/llms-from-scratch-cn-main/Book/ch01/._.keep 212B
llms-from-scratch-cn-main/Book/ch04/.keep
__MACOSX/llms-from-scratch-cn-main/Book/ch04/._.keep 212B
llms-from-scratch-cn-main/Book/ch03/.keep
__MACOSX/llms-from-scratch-cn-main/Book/ch03/._.keep 212B
llms-from-scratch-cn-main/Book/ch02/.keep
__MACOSX/llms-from-scratch-cn-main/Book/ch02/._.keep 212B
llms-from-scratch-cn-main/Book/ch05/.keep
__MACOSX/llms-from-scratch-cn-main/Book/ch05/._.keep 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._01_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._03_model-evaluation 212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._05_dataset-generation 212B
llms-from-scratch-cn-main/Codes/ch07/README.md 740B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._README.md 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._02_dataset-utilities 212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._04_preference-tuning-with-dpo 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/._01_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/._02_bonus_additional-experiments 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/._03_bonus_imdb-classification 212B
llms-from-scratch-cn-main/Codes/ch01/README.md 84B
__MACOSX/llms-from-scratch-cn-main/Codes/ch01/._README.md 212B
llms-from-scratch-cn-main/Codes/appendix-B/README.md 829B
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-B/._README.md 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/._01_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/ch04/README.md 147B
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/._README.md 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/._01_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/ch03/README.md 120B
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/._README.md 212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._02_bonus_bytepair-encoder 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._03_bonus_embedding-vs-matmul 212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._01_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/ch02/README.md 500B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._README.md 212B
llms-from-scratch-cn-main/Codes/ch02/09_summary/
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._09_summary 212B
llms-from-scratch-cn-main/Codes/ch05/04_learning_rate_schedulers/
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._04_learning_rate_schedulers 212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._03_bonus_pretraining_on_gutenberg 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._01_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/ch05/README.md 600B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._README.md 212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._05_bonus_hparam_tuning 212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._02_alternative_weight_loading 212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/._03_main-chapter-code 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/._01_optional-python-setup-preferences 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/._02_installing-python-libraries 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/archi.png 845.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._archi.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/ropesplit.png 401.41KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._ropesplit.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/tokens.png 488.49KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._tokens.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/keys.png 430.16KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._keys.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/embeddings.png 470.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._embeddings.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/attention.png 202.27KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._attention.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_39_0.png 26.96KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_39_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_41_0.png 25.82KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_41_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/heads.png 799.73KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._heads.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/last_norm.png 1003.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._last_norm.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/god.png 1.21MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._god.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/qkv.png 497.17KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._qkv.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/swiglu.png 604.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._swiglu.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/freq_cis.png 813.92KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._freq_cis.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/model.png 658.84KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._model.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_42_0.png 27.37KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_42_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/rms.png 340.74KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._rms.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/softmax.png 190.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._softmax.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/value.png 199.91KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._value.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/weightmatrix.png 379.86KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._weightmatrix.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/qsplit.png 551.01KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._qsplit.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/keys0.png 422.6KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._keys0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/q_per_token.png 483.94KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._q_per_token.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_30_0.png 48.6KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_30_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/finallayer.png 799.14KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._finallayer.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/norm.png 308.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._norm.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/stacked.png 383.59KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._stacked.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/v0.png 188.19KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._v0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/a10.png 633.97KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._a10.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/42.png 772.73KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._42.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_54_0.png 27.37KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_54_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/afterattention.png 289.26KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._afterattention.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/rope.png 516.22KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._rope.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/norm_after.png 297.39KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._norm_after.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/mask.png 471.46KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._mask.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_52_0.png 25.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_52_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_50_0.png 26.94KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_50_0.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/karpathyminbpe.png 787.45KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._karpathyminbpe.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/qkmatmul.png 189.33KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._qkmatmul.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/img/01.png 100.34KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/img/._01.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/img/img.png 111.38KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/img/._img.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/img/01.png 100.34KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/img/._01.png 212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/img/01.png 231.82KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/img/._01.png 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/exercise-solutions.ipynb 36.83KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._exercise-solutions.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/ch07.ipynb 125.91KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._ch07.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/previous_chapters.py 17.6KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/instruction-data.json 198.75KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._instruction-data.json 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/exercise_experiments.py 18.98KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._exercise_experiments.py 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/ollama_evaluate.py 3.88KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._ollama_evaluate.py 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/load-finetuned-model.ipynb 6.01KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._load-finetuned-model.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/README.md 3.36KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._README.md 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/gpt_download.py 5.61KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._gpt_download.py 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/instruction-data-with-response.json 28.59KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._instruction-data-with-response.json 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/tests.py 597B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._tests.py 212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/gpt_instruction_finetuning.py 11.14KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._gpt_instruction_finetuning.py 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/config.json 115B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._config.json 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/eval-example-data.json 36.01KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._eval-example-data.json 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._scores 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/README.md 1.02KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._README.md 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/llm-instruction-eval-openai.ipynb 20.12KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._llm-instruction-eval-openai.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/llm-instruction-eval-ollama.ipynb 23.12KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._llm-instruction-eval-ollama.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/requirements-extra.txt 28B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._requirements-extra.txt 212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/llama3-ollama.ipynb 29.48KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/._llama3-ollama.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/README.md 295B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/._README.md 212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/instruction-data-llama3-7b.json 10KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/._instruction-data-llama3-7b.json 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/config.json 115B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._config.json 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/instruction-examples-modified.json 53.55KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._instruction-examples-modified.json 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/README.md 2.16KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._README.md 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/create-passive-voice-entries.ipynb 11.94KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._create-passive-voice-entries.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/instruction-examples.json 38.43KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._instruction-examples.json 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/requirements-extra.txt 47B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._requirements-extra.txt 212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/find-near-duplicates.py 5.08KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._find-near-duplicates.py 212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb 179.94KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._dpo-from-scratch.ipynb 212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/instruction-data-with-preference.json 377.9KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._instruction-data-with-preference.json 212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/previous_chapters.py 17.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/README.md 366B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._README.md 212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb 21.23KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._create-preference-data-ollama.ipynb 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/exercise-solutions.ipynb 5.1KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._exercise-solutions.ipynb 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/previous_chapters.py 11.75KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/ch06.ipynb 137.77KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._ch06.ipynb 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/README.md 700B
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._README.md 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/gpt-class-finetune.py 15.34KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._gpt-class-finetune.py 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/gpt_download.py 3.76KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._gpt_download.py 212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/tests.py 597B
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._tests.py 212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/previous_chapters.py 13.21KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/additional-experiments.py 20.45KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._additional-experiments.py 212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/README.md 8.64KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._README.md 212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/gpt_download.py 3.76KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._gpt_download.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/train-sklearn-logreg.py 2.83KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._train-sklearn-logreg.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/previous_chapters.py 11.75KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/sklearn-baseline.ipynb 7.88KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._sklearn-baseline.ipynb 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/download-prepare-dataset.py 3.07KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._download-prepare-dataset.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/README.md 3.44KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._README.md 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/gpt_download.py 3.76KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._gpt_download.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/train-bert-hf.py 10.71KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._train-bert-hf.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/train-gpt.py 12.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._train-gpt.py 212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/requirements-extra.txt 40B
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._requirements-extra.txt 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/exercise-solutions.ipynb 11.57KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._exercise-solutions.ipynb 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/previous_chapters.py 3.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/ch04.ipynb 82.48KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._ch04.ipynb 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/README.md 502B
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._README.md 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._figures 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/gpt.py 9.39KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._gpt.py 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/ch03.ipynb 71.89KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._ch03.ipynb 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/exercise-solutions.ipynb 7.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._exercise-solutions.ipynb 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/small-text-sample.txt 1.92KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._small-text-sample.txt 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/README.md 264B
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._README.md 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._figures 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/multihead-attention.ipynb 15.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._multihead-attention.ipynb 212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/gpt2_model/
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._gpt2_model 212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/bpe_openai_gpt2.py 7.81KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._bpe_openai_gpt2.py 212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/README.md 233B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._README.md 212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/compare-bpe-tiktoken.ipynb 10.8KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._compare-bpe-tiktoken.ipynb 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/._images 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/README.md 218B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/._README.md 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/embeddings-and-linear-layers.ipynb 12.33KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/._embeddings-and-linear-layers.ipynb 212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/dataloader.ipynb 4.67KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._dataloader.ipynb 212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/exercise-solutions.ipynb 7.27KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._exercise-solutions.ipynb 212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/ch02.ipynb 45.42KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._ch02.ipynb 212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/README.md 221B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._README.md 212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/the-verdict.txt 20KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._the-verdict.txt 212B
llms-from-scratch-cn-main/Codes/ch02/09_summary/09_summary.ipynb 2.07KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/09_summary/._09_summary.ipynb 212B
llms-from-scratch-cn-main/Codes/ch05/04_learning_rate_schedulers/README.md 506B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/04_learning_rate_schedulers/._README.md 212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/prepare_dataset.py 2.82KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._prepare_dataset.py 212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/previous_chapters.py 11.02KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/README.md 6.14KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._README.md 212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/pretraining_simple.py 8.29KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._pretraining_simple.py 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/ch05.ipynb 143.95KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._ch05.ipynb 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._images 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/previous_chapters.py 9.35KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/README.md 578B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._README.md 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/gpt_train.py 7.91KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._gpt_train.py 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/gpt_download.py 3.49KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._gpt_download.py 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/gpt_generate.py 9.68KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._gpt_generate.py 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/tests.py 1.24KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._tests.py 212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/hparam_search.py 7.46KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._hparam_search.py 212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/previous_chapters.py 9.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/README.md 745B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._README.md 212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/the-verdict.txt 20KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._the-verdict.txt 212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/weight-loading-hf-transformers.ipynb 11.17KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/._weight-loading-hf-transformers.ipynb 212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/previous_chapters.py 9.88KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/._previous_chapters.py 212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/README.md 319B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/._README.md 212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/code-part2.ipynb 11.36KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._code-part2.ipynb 212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/exercise-solutions.ipynb 3.71KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._exercise-solutions.ipynb 212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/code-part1.ipynb 30.47KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._code-part1.ipynb 212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/DDP-script.py 5.09KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._DDP-script.py 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/README.md 3.48KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/._README.md 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/._figures 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/requirements.txt 137B
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._requirements.txt 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/README.md 2.11KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._README.md 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._figures 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/python_environment_check.ipynb 1.29KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._python_environment_check.ipynb 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/python_environment_check.py 2.22KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._python_environment_check.py 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/llama3-8b-model-2-response.json 393B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._llama3-8b-model-2-response.json 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/llama3-8b-model-1-response.json 402B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._llama3-8b-model-1-response.json 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/gpt4-model-1-response.json 445B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._gpt4-model-1-response.json 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/gpt4-model-2-response.json 408B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._gpt4-model-2-response.json 212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/correlation-analysis.ipynb 33.8KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._correlation-analysis.ipynb 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/overview-after-ln.webp 20.73KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._overview-after-ln.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/gpt.webp 29.85KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._gpt.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model-final.webp 20.98KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model-final.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/use-gpt.webp 14.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._use-gpt.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/shortcut-example.webp 32.1KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._shortcut-example.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model.webp 25.04KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/iterative-generate.webp 23.72KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._iterative-generate.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/chapter-steps.webp 29.38KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._chapter-steps.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/layernorm2.webp 13.96KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._layernorm2.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/generate-text.webp 36.26KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._generate-text.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/transformer-block.webp 25.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._transformer-block.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model-2.webp 14.87KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model-2.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model-3.webp 21.01KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model-3.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/iterative-gen.webp 17.92KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._iterative-gen.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/gpt-in-out.webp 20.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._gpt-in-out.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/layernorm.webp 26.98KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._layernorm.webp 212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/ffn.webp 24.34KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._ffn.webp 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-3.png 53.18KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-3.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-2.png 60.61KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-2.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/dot-product.png 93.4KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._dot-product.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-1.png 52.21KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-1.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-4.png 53.85KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-4.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/attention.png 66.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._attention.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/masked.png 59.09KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._masked.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/single-head.png 71.34KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._single-head.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/multi-head.png 59.77KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._multi-head.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/attention-matrix.png 136.29KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._attention-matrix.png 212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/dropout.png 62.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._dropout.png 212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/gpt2_model/encoder.json 1017.87KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/gpt2_model/._encoder.json 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/4.png 290.55KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._4.png 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/5.png 288.61KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._5.png 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/2.png 132.57KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._2.png 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/3.png 216.44KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._3.png 212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/1.png 133.33KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._1.png 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/img-1.webp 86.94KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/._img-1.webp 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/img-3.webp 58.68KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/._img-3.webp 212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/img-2.webp 72.46KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/._img-2.webp 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/download.png 174.07KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._download.png 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/pytorch-installer.jpg 94.51KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._pytorch-installer.jpg 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/new-env.png 185.38KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._new-env.png 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/miniforge-install.png 258.47KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._miniforge-install.png 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/check-pip.png 219.68KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._check-pip.png 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/conda-install.png 186.52KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._conda-install.png 212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/activate-env.png 180KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._activate-env.png 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/watermark.jpg 35.99KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._watermark.jpg 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/pytorch-installer.jpg 94.51KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._pytorch-installer.jpg 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/jupyter-issues.jpg 102.72KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._jupyter-issues.jpg 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/check_2.jpg 78.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._check_2.jpg 212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/check_1.jpg 107.24KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._check_1.jpg 212B
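Resource description:
If you want to write the code yourself and build a large language model from zero, this project is a good fit. "LLMs From Scratch" is a hands-on tutorial from Datawhale on building a ChatGPT-like large language model (LLM) from the ground up. Through detailed guidance, code examples, and deep learning resources, it aims to help developers and researchers master the core techniques behind building large language models and their architectures. The project includes tutorials that build GLM4, Llama3, and RWKV6 step by step from zero, so that readers come to understand in depth how large models work.

In addition, I will load tensors directly from the model file that Meta provides for llama3, so you need to download the weights before running this file. This is the official link for downloading the weights: [click here to download the weights](https://llama.meta.com/llama-downloads/)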

Link to his implementation (the karpathy/minbpe repository): [click here to view his implementation](https://github.com/karpathy/minbpe)

However, since we are implementing llama3 from scratch, we will read the file one tensor at a time.


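Either way, our [17x1] tokens are now [17x4096], that is, 17 embedding vectors of length 4096 (one per token).
Note: keep track of the shapes; it makes everything much easier to follow.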
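
The shape change described above can be sketched with a plain lookup table. This is a minimal, standard-library illustration and not the tutorial's own code: `VOCAB_SIZE`, the random table values, and the helper names `embed` and `shape` are assumptions made for the example; only the 17 tokens and the width of 4096 come from the text.

```python
import random

# Only 17 tokens and width 4096 come from the text.
# VOCAB_SIZE is a tiny hypothetical value so the sketch runs quickly.
VOCAB_SIZE = 64
DIM = 4096

random.seed(0)
# Stand-in embedding table with random values; in the tutorial the real
# table would be read from the downloaded weight file instead.
embedding_table = [
    [random.gauss(0.0, 0.02) for _ in range(DIM)] for _ in range(VOCAB_SIZE)
]


def shape(matrix):
    """Return (rows, cols) for a non-empty list of equal-length lists."""
    return (len(matrix), len(matrix[0]))


def embed(token_ids, table):
    """Replace every token id by its row in the embedding table."""
    return [table[token_id] for token_id in token_ids]


# 17 placeholder token ids, i.e. a [17x1] input.
token_ids = [random.randrange(VOCAB_SIZE) for _ in range(17)]
token_embeddings = embed(token_ids, embedding_table)

# Track the shapes: 17 ids in, 17 vectors of length 4096 out.
print(len(token_ids), "->", shape(token_embeddings))
```

Printing the shape after each step, as done on the last line, is the habit the note recommends.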

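One thing to remember: we need a norm_eps (from the config), because we do not want to accidentally end up with an RMS of 0 and divide by zero.
Here is the formula: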
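
In the commonly used notation, RMS normalization of a vector $x$ of length $d$ can be written as below. The symbols are chosen here for illustration: $g$ stands for the learned scale weights and $\epsilon$ for norm_eps.

$$
\mathrm{RMSNorm}(x)_i = \frac{x_i}{\sqrt{\frac{1}{d}\sum_{j=1}^{d} x_j^2 + \epsilon}} \cdot g_i
$$

With $\epsilon > 0$ the denominator stays strictly positive even when every $x_j$ is 0.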

Either way, after normalization the shape is still [17x4096], the same as the embeddings, but normalized.
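
The normalization step can be sketched in plain Python as follows. This is an illustrative version under stated assumptions, not the tutorial's implementation: the function names, the input values, the all-ones weight vector, and the value of `NORM_EPS` are hypothetical (the real norm_eps comes from the model config).

```python
import math

NORM_EPS = 1e-5  # assumed value; the text says norm_eps is read from the config
DIM = 4096


def rms_norm(x, weight, norm_eps):
    """RMS-normalize one vector and scale it element-wise by `weight`."""
    mean_square = sum(v * v for v in x) / len(x)
    # norm_eps keeps the square root strictly positive, so an all-zero
    # input does not cause a division by zero.
    inv_rms = 1.0 / math.sqrt(mean_square + norm_eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def rms_norm_rows(rows, weight, norm_eps):
    """Apply rms_norm independently to each row (one row per token)."""
    return [rms_norm(row, weight, norm_eps) for row in rows]


# Hypothetical [17x4096] input and an all-ones weight vector.
rows = [[float((i + j) % 7 - 3) for j in range(DIM)] for i in range(17)]
weight = [1.0] * DIM

normed = rms_norm_rows(rows, weight, NORM_EPS)
print(len(normed), len(normed[0]))  # the shape is unchanged by normalization

# An all-zero vector is handled without raising ZeroDivisionError.
zero_out = rms_norm([0.0] * DIM, weight, NORM_EPS)
```

Each row is normalized on its own, which is why the output keeps one vector of length 4096 per token.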
