Commit 87e93bd (parent: 57f1ca8): jekyll page rebuild

1 file changed (50 additions, 8 deletions): _posts/2023/2023-04/2023-04-18-FauxPilot-开源插件-GitHub-Copilot .md
FauxPilot's address on GitHub: [https://github.com/moyix/fauxpilot](https://github.com/moyix/fauxpilot)

}
```
## Setup

Run the setup script to select the model you want to use. This downloads the model from Huggingface and then converts it for use with FasterTransformer.
```
$ ./setup.sh
Models available:
[1] codegen-350M-mono (2GB total VRAM required; Python-only)
[2] codegen-350M-multi (2GB total VRAM required; multi-language)
[3] codegen-2B-mono (7GB total VRAM required; Python-only)
[4] codegen-2B-multi (7GB total VRAM required; multi-language)
[5] codegen-6B-mono (13GB total VRAM required; Python-only)
[6] codegen-6B-multi (13GB total VRAM required; multi-language)
[7] codegen-16B-mono (32GB total VRAM required; Python-only)
[8] codegen-16B-multi (32GB total VRAM required; multi-language)
Enter your choice [6]: 2
Enter number of GPUs [1]: 1
Where do you want to save the model [/home/moyix/git/fauxpilot/models]? /fastdata/mymodels
Downloading and converting the model, this will take a while...
Converting model codegen-350M-multi with 1 GPUs
Loading CodeGen model
Downloading config.json: 100%|██████████| 996/996 [00:00<00:00, 1.25MB/s]
Downloading pytorch_model.bin: 100%|██████████| 760M/760M [00:11<00:00, 68.3MB/s]
Creating empty GPTJ model
Converting...
Conversion complete.
Saving model to codegen-350M-multi-hf...

=============== Argument ===============
saved_dir: /models/codegen-350M-multi-1gpu/fastertransformer/1
in_file: codegen-350M-multi-hf
trained_gpu_num: 1
infer_gpu_num: 1
processes: 4
weight_data_type: fp32
========================================
transformer.wte.weight
transformer.h.0.ln_1.weight
[... more conversion output trimmed ...]
transformer.ln_f.weight
transformer.ln_f.bias
lm_head.weight
lm_head.bias
Done! Now run ./launch.sh to start the FauxPilot server.
```
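The menu above ties each CodeGen variant to a total-VRAM budget. As a quick sanity check before running `setup.sh`, a small helper can pick the largest variant that fits your GPU. This is only a sketch: `largest_fitting` is not part of FauxPilot, and the numbers are simply copied from the menu output.

```python
# VRAM figures copied from the setup.sh menu above (GB of total VRAM).
MODELS = {
    "codegen-350M-mono": 2,  "codegen-350M-multi": 2,
    "codegen-2B-mono": 7,    "codegen-2B-multi": 7,
    "codegen-6B-mono": 13,   "codegen-6B-multi": 13,
    "codegen-16B-mono": 32,  "codegen-16B-multi": 32,
}

def largest_fitting(vram_gb: int, multi_language: bool = True):
    """Return the biggest CodeGen variant fitting in vram_gb, or None."""
    suffix = "multi" if multi_language else "mono"
    fitting = [(req, name) for name, req in MODELS.items()
               if name.endswith(suffix) and req <= vram_gb]
    return max(fitting)[1] if fitting else None
```

For example, with an 8 GB card `largest_fitting(8)` picks `codegen-2B-multi`, matching the choice made in the session above for a small GPU.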
## FauxPilot

This is an attempt to build a locally hosted version of [GitHub Copilot](https://github.com/features/copilot/). It uses the [SalesForce CodeGen](https://github.com/salesforce/CodeGen) models inside [NVIDIA's Triton Inference Server](https://developer.nvidia.com/nvidia-triton-inference-server) with the [FasterTransformer backend](https://github.com/triton-inference-server/fastertransformer_backend/).
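Because FauxPilot speaks an OpenAI-compatible completions API, a plain HTTP client is enough to query it. The sketch below assumes a default local deployment: the `localhost:5000` address and the `/v1/engines/codegen/completions` path are assumptions, not taken from this post.

```python
import json
from urllib import request

# Assumed default endpoint for a local FauxPilot server started
# with ./launch.sh (host, port, and path are assumptions).
API_URL = "http://localhost:5000/v1/engines/codegen/completions"

def build_payload(prompt: str, max_tokens: int = 16,
                  temperature: float = 0.1) -> dict:
    """OpenAI-style completion request body."""
    return {"prompt": prompt, "max_tokens": max_tokens,
            "temperature": temperature}

def complete(prompt: str) -> str:
    """POST a prompt to the local server, return the first completion."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(API_URL, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

With the server running, `complete("def fib(n):")` should return a code continuation as a string.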
## Prerequisites

```
$ ./launch.sh
```
