Commit 47defb1
feat: add DeepSeek-V3 support for MaxText to Hugging Face conversion
- Update architecture validation in checkpoint conversion to include MLA and MoE parameters.
- Implement output projection initialization for MLA layers.
1 parent 3971206 commit 47defb1
3 files changed
Lines changed: 23 additions & 4 deletions
File 1 of 3 (file name and diff content not captured): `@@ -140,11 +140,22 @@`, 13 additions, 2 deletions
File 2 of 3 (file name and diff content not captured): `@@ -786,6 +786,10 @@`, 4 additions, 0 deletions
File 3 of 3 (file name and diff content not captured): `@@ -696,17 +696,21 @@`, 6 additions, 2 deletions