[](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/autocomplete.ipynb)
This repo trains deep learning models on source code.

Training model: [](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)

Evaluating trained model: [](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)

The trained model gives quite decent results, saving more than 30% of keystrokes in most files
and close to 50% in some. We calculated keystrokes saved by making a single (best)
prediction and selecting it with a single key press.

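The keystroke metric described above can be sketched roughly as follows. This is a minimal illustration, not the repo's evaluation code: `predict` and `toy_predict` are hypothetical stand-ins for the trained model, and the sketch assumes the user accepts every correct suggestion with exactly one key press.

```python
from typing import Callable


def keystrokes_saved(text: str, predict: Callable[[str], str]) -> float:
    """Fraction of key presses saved when typing `text` with autocomplete.

    `predict(prefix)` is a hypothetical stand-in for the model: it returns
    the single best completion for the text typed so far, or "" for none.
    """
    if not text:
        return 0.0
    pos = 0
    key_presses = 0
    while pos < len(text):
        suggestion = predict(text[:pos])
        if suggestion and text.startswith(suggestion, pos):
            # Correct prediction: one key press accepts the whole suggestion.
            pos += len(suggestion)
        else:
            # Wrong (or no) prediction: the user types one character.
            pos += 1
        key_presses += 1
    return 1.0 - key_presses / len(text)


def toy_predict(prefix: str) -> str:
    """Toy predictor (assumption): completes "imp" to "import "."""
    return "ort " if prefix.endswith("imp") else ""


saved = keystrokes_saved("import os", toy_predict)
print(f"{saved:.0%} of keystrokes saved")
```

Note that under this counting, accepting a one-character suggestion saves nothing, since the acceptance key costs as much as typing the character.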
The dataset we use is the Python code found in repos linked in
We download all the repositories as zip files, extract them, remove non-Python files, and split the remaining files
randomly to build training and validation datasets.

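The preparation steps above might look roughly like the sketch below, assuming the zip files are already downloaded into a local folder. The function names, the 10% validation fraction, and the fixed seed are illustrative assumptions, not the repo's actual code. As a shortcut, it extracts only the `.py` members instead of extracting everything and deleting the rest.

```python
import random
import zipfile
from pathlib import Path
from typing import List, Sequence, Tuple


def extract_python_files(zip_dir: Path, out_dir: Path) -> List[Path]:
    """Extract only the `.py` members of every zip archive in `zip_dir`."""
    extracted = []
    for archive in sorted(zip_dir.glob("*.zip")):
        target = out_dir / archive.stem
        with zipfile.ZipFile(archive) as zf:
            for member in zf.namelist():
                if member.endswith(".py"):
                    # ZipFile.extract returns the path of the written file.
                    extracted.append(Path(zf.extract(member, target)))
    return extracted


def split_files(
    files: Sequence, valid_fraction: float = 0.1, seed: int = 0
) -> Tuple[list, list]:
    """Randomly split `files` into (training, validation) lists."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]
```

Splitting by file rather than by line keeps each source file entirely in one of the two sets.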
We train a character-level model without any tokenization of the source code, since this is the simplest approach.

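To illustrate what character-level means here, the following minimal sketch maps each distinct character of the source text to an integer id, so no tokenizer is needed. The helper names are assumptions made for illustration and are not taken from the repo.

```python
from typing import Dict, List, Tuple


def build_vocab(text: str) -> Tuple[Dict[str, int], List[str]]:
    """Return (char -> id, id -> char) mappings for the characters in `text`."""
    itos = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(itos)}
    return stoi, itos


def encode(text: str, stoi: Dict[str, int]) -> List[int]:
    """Turn source code into a list of character ids."""
    return [stoi[ch] for ch in text]


def decode(ids: List[int], itos: List[str]) -> str:
    """Turn character ids back into source code."""
    return "".join(itos[i] for i in ids)


source = "def add(a, b):\n    return a + b\n"
stoi, itos = build_vocab(source)
ids = encode(source, stoi)
```

The vocabulary stays small (one entry per distinct character), at the cost of longer sequences than a tokenized representation would produce.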
### Try it yourself
*Try changing hyper-parameters like model dimensions and number of layers*.
5. Run `evaluate.py` to evaluate the model.

You can also run the training notebook on Google Colab.

[](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)

### Sample

Here's a sample evaluation of a trained transformer model.

Colors:
* <span style="color:yellow">yellow</span>: the predicted token is wrong, and the user needs to type that character.
* <span style="color:blue">blue</span>: the predicted token is correct, and the user selects it with a special key press, such as TAB or ENTER.
* <span style="color:green">green</span>: characters autocompleted based on the prediction.
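
The color scheme above can be reproduced with a small sketch like the one below. It assumes that blue marks the first character of an accepted prediction (the point where the user presses the key) and green marks the rest of that prediction; the README does not spell this out, and `predict` and `toy_predict` are hypothetical stand-ins for the trained model.

```python
from typing import Callable, List, Tuple


def label_characters(
    text: str, predict: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Label each character of `text` as yellow, blue, or green."""
    labelled: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(text):
        suggestion = predict(text[:pos])
        if suggestion and text.startswith(suggestion, pos):
            # Assumption: blue marks where the prediction is accepted,
            # green marks the remaining autocompleted characters.
            labelled.append((suggestion[0], "blue"))
            labelled.extend((ch, "green") for ch in suggestion[1:])
            pos += len(suggestion)
        else:
            # Wrong (or no) prediction: the user types this character.
            labelled.append((text[pos], "yellow"))
            pos += 1
    return labelled


def toy_predict(prefix: str) -> str:
    """Toy predictor (assumption): completes "imp" to "import "."""
    return "ort " if prefix.endswith("imp") else ""


labels = label_characters("import os", toy_predict)
colors = "".join(color[0] for _, color in labels)
```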