Commit c633219

committed
Merge branch 'master' of github.com:lab-ml/source_code_modelling
merge
2 parents 908dc8c + 0c982a5 commit c633219

1 file changed

Lines changed: 33 additions & 4 deletions

readme.md

@@ -1,8 +1,22 @@
1-
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/autocomplete.ipynb)
1+
# Python Autocomplete
22

3-
# Source Code Modeling
3+
[This](https://github.com/lab-ml/python_autocomplete) project tries to autocomplete Python
4+
source code using LSTM or Transformer models.
45

5-
This repo trains deep learning models on source code.
6+
Training model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)
7+
8+
Evaluating trained model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)
9+
10+
It saves more than 30% of keystrokes in most files,
11+
and close to 50% in some. We calculate keystrokes saved by making a single (best)
12+
prediction and selecting it with a single key press.
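
The keystroke measurement described above can be sketched roughly as follows. This is a minimal illustration, not the project's evaluation code: `keystrokes_saved`, the `predict` callback, and `toy_predict` are hypothetical names, and the sketch assumes the model returns one best completion string for each prefix.

```python
from typing import Callable


def keystrokes_saved(text: str, predict: Callable[[str], str]) -> float:
    """Fraction of key presses saved by accepting correct single-best predictions.

    `predict` is a hypothetical callback: given the text typed so far, it
    returns the single best completion (possibly an empty string).
    """
    if not text:
        return 0.0
    pos = 0
    presses = 0
    while pos < len(text):
        suggestion = predict(text[:pos])
        if suggestion and text.startswith(suggestion, pos):
            # Correct prediction: one special key press accepts all of it.
            pos += len(suggestion)
        else:
            # Wrong or missing prediction: the user types one character.
            pos += 1
        presses += 1
    return 1.0 - presses / len(text)


def toy_predict(prefix: str) -> str:
    # Stand-in for a trained model (assumption): completes "imp" to "import ".
    return "ort " if prefix.endswith("imp") else ""


print(f"{keystrokes_saved('import os', toy_predict):.0%} of key presses saved")
```

With the toy predictor, typing `import os` takes six key presses instead of nine. A real evaluation would replace `toy_predict` with the trained model's best prediction.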
13+
14+
The dataset is the Python code found in the repos linked in
15+
[Awesome-pytorch-list](https://github.com/bharathgs/Awesome-pytorch-list).
16+
We download all the repositories as zip files, extract them, remove non-Python files, and split the remaining files
17+
randomly to build training and validation datasets.
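
The extract, filter, and split steps above could look something like the sketch below. It uses only the standard library and skips the download step. The function names, the 10% validation fraction, and the fixed seed are assumptions for illustration, not values taken from the project.

```python
import random
import zipfile
from pathlib import Path
from typing import List, Tuple


def extract_python_files(zip_path: Path, out_dir: Path) -> List[Path]:
    """Extract only the .py members of one downloaded repository archive."""
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.endswith(".py"):  # skip every non-Python file
                extracted.append(Path(archive.extract(member, out_dir)))
    return extracted


def split_files(
    files: List[Path], valid_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[Path], List[Path]]:
    """Randomly split files into (training, validation) lists."""
    shuffled = sorted(files)  # sort first so the split is reproducible
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]
```

Splitting by file rather than by line keeps all of a file's code on one side of the split.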
18+
19+
We train a character-level model without any tokenization of the source code, since this is the simplest approach.
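
To illustrate what character-level input means here, the sketch below maps each distinct character to an integer id and back, with no tokenizer involved. The helper names (`build_vocab`, `encode`, `decode`) are hypothetical and not taken from the project's code.

```python
from typing import Dict, List


def build_vocab(text: str) -> Dict[str, int]:
    """Assign an integer id to every distinct character in the corpus."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def encode(text: str, stoi: Dict[str, int]) -> List[int]:
    """Turn source code into one id per character."""
    return [stoi[ch] for ch in text]


def decode(ids: List[int], stoi: Dict[str, int]) -> str:
    """Invert `encode` by looking characters up by id."""
    itos = {i: ch for ch, i in stoi.items()}
    return "".join(itos[i] for i in ids)


source = "def f():\n    return 1\n"
stoi = build_vocab(source)
ids = encode(source, stoi)
assert decode(ids, stoi) == source  # the round trip is lossless
```

The vocabulary stays small (one entry per distinct character), at the cost of longer sequences than a tokenized model would see.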
620

721
### Try it yourself
822

@@ -19,6 +33,21 @@ This repo trains deep learning models on source code.
1933
*Try changing hyper-parameters like model dimensions and number of layers*.
2034
5. Run `evaluate.py` to evaluate the model.
2135

36+
You can also run the training notebook on Google Colab.
37+
38+
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)
39+
40+
### Sample
41+
42+
Here's a sample evaluation of a trained transformer model.
43+
44+
Colors:
45+
* <span style="color:yellow">yellow</span>: the token predicted is wrong and the user needs to type that character.
46+
* <span style="color:blue">blue</span>: the token predicted is correct and the user selects it with a special key press, such as TAB or ENTER.
47+
* <span style="color:green">green</span>: characters autocompleted based on the prediction.
48+
2249
<p align="center">
23-
<img src="/python-autocomplete.png?raw=true" width="100%" title="Screenshot">
50+
<img src="/images/python-autocomplete.png?raw=true" width="100%" title="Screenshot">
2451
</p>
52+
53+
We are working on a simple VSCode extension to demonstrate it.
