Done! Now run ./launch.sh to start the FauxPilot server.
Then you can just run `./launch.sh`:
```
$ ./launch.sh
[+] Running 2/0
...
fauxpilot-triton-1  | I0803 01:51:04.740423 93 grpc_server.cc:4587] Started GRPCInferenceService at 0.0.0.0:8001
fauxpilot-triton-1  | I0803 01:51:04.740608 93 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
fauxpilot-triton-1  | I0803 01:51:04.781561 93 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
```
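Once those log lines appear, Triton is serving gRPC on port 8001, HTTP on port 8000, and metrics on port 8002. As a quick sanity check, you can poll Triton's standard HTTP readiness probe (`/v2/health/ready` is part of Triton's KServe-style HTTP API); a minimal sketch, assuming the default host and ports from the compose setup above:

```python
import urllib.error
import urllib.request


def endpoint(host: str, port: int, path: str) -> str:
    """Build a URL for one of the Triton services."""
    return f"http://{host}:{port}{path}"


def triton_ready(host: str = "localhost", port: int = 8000) -> bool:
    """Return True if Triton's HTTP readiness probe answers 200."""
    url = endpoint(host, port, "/v2/health/ready")
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Server not up yet (connection refused, timeout, etc.)
        return False


if __name__ == "__main__":
    print("Triton ready!" if triton_ready() else "Triton not reachable yet.")
```

This only checks Triton itself; the OpenAI-compatible proxy described below listens separately on port 5000.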
## API
Once everything is up and running, you should have a server listening for requests on `http://localhost:5000`. You can now talk to it using the standard [OpenAI API](https://platform.openai.com/docs/api-reference/) (although the full API isn't implemented yet). For example, from Python, using the [OpenAI Python bindings](https://github.com/openai/openai-python):
```
$ ipython
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import openai

In [2]: openai.api_key = 'dummy'

In [3]: openai.api_base = 'http://127.0.0.1:5000/v1'

In [4]: result = openai.Completion.create(engine='codegen', prompt='def hello', max_tokens=16, temperature=0.1, stop=["\n\n"])

In [5]: result
Out[5]:
<OpenAIObject text_completion id=cmpl-6hqu8Rcaq25078IHNJNVooU4xLY6w at 0x7f602c3d2f40> JSON: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "() {\n return \"Hello, World!\";\n}"
    }
  ],
  "created": 1659492191,
  "id": "cmpl-6hqu8Rcaq25078IHNJNVooU4xLY6w",
  "model": "codegen",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 15,
    "prompt_tokens": 2,
    "total_tokens": 17
  }
}
```
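The result is an ordinary OpenAI-style completion object, so extracting the generated text is plain dictionary access. A minimal sketch using the response shape shown above (the payload literal is copied from the example output):

```python
import json

# Response body copied verbatim from the example above
# (raw string so the JSON escapes survive intact).
payload = r"""
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "() {\n return \"Hello, World!\";\n}"
    }
  ],
  "created": 1659492191,
  "id": "cmpl-6hqu8Rcaq25078IHNJNVooU4xLY6w",
  "model": "codegen",
  "object": "text_completion",
  "usage": {"completion_tokens": 15, "prompt_tokens": 2, "total_tokens": 17}
}
"""

result = json.loads(payload)
completion = result["choices"][0]["text"]
# Prepend the prompt to see the full completed function.
print("def hello" + completion)
```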
## Copilot Plugin
Perhaps more excitingly, you can configure the official [VSCode Copilot plugin](https://marketplace.visualstudio.com/items?itemName=GitHub.copilot) to use your local server. Just edit your `settings.json` to add:
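The settings snippet itself is not shown above, so as a sketch: Copilot plugin versions of this era exposed debug overrides for the completion engine and the proxy URL. The key names below (`github.copilot.advanced`, `debug.overrideEngine`, `debug.overrideProxyUrl`, `debug.testOverrideProxyUrl`) are assumptions based on those debug settings and may differ in your plugin version:

```json
{
    "github.copilot.advanced": {
        "debug.overrideEngine": "codegen",
        "debug.testOverrideProxyUrl": "http://localhost:5000",
        "debug.overrideProxyUrl": "http://localhost:5000"
    }
}
```

After saving `settings.json`, reload VSCode so the plugin picks up the new endpoint.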
And you should be able to use Copilot with your own locally hosted suggestions! Of course, a lot of things are probably still subtly broken. In particular, the probabilities returned by the server are partly fake: fixing this would require changing FasterTransformer so that it can return log-probabilities for the top k tokens rather than just the chosen token.