Commit 51ee384

extract python
1 parent 4793e87 commit 51ee384

4 files changed

Lines changed: 26 additions & 10 deletions

extract_downloads.sh

Lines changed: 0 additions & 8 deletions
This file was deleted.
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+import zipfile
+from pathlib import Path
+
+from labml import lab, monit
+
+
+def main():
+    download = Path(lab.get_data_path() / 'download')
+    source = Path(lab.get_data_path() / 'source')
+
+    for repo in download.iterdir():
+        with monit.section(f"Extract {repo.stem}"):
+            repo_source = source / repo.stem
+            if repo_source.exists():
+                continue
+            with zipfile.ZipFile(repo, 'r') as repo_zip:
+                repo_zip.extractall(repo_source)
+
+    if not source.exists():
+        source.mkdir(parents=True)
+
+
+if __name__ == '__main__':
+    main()
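
For readers without `labml` installed, the extraction loop in the added script (the readme change in this commit refers to it as `python_autocomplete/extract_downloads.py`) can be approximated with the standard library alone. This is only a sketch under stated assumptions, not the committed code: the function name `extract_all` and its two path parameters are hypothetical, the `monit.section` progress reporting is dropped, the target directory is created before the loop rather than after it, and non-zip entries are skipped with `zipfile.is_zipfile`, which the committed script does not do.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_all(download: Path, source: Path) -> List[str]:
    """Extract each zip archive in `download` into `source/<archive stem>`.

    Returns the stems of the archives extracted by this call. Archives whose
    target directory already exists are skipped, mirroring the committed script.
    `extract_all` is a hypothetical name used only for this sketch.
    """
    # Assumption: create the target root up front (the commit checks it after the loop).
    source.mkdir(parents=True, exist_ok=True)

    extracted = []
    for repo in sorted(download.iterdir()):
        # Assumption: ignore directories and files that are not zip archives.
        if not zipfile.is_zipfile(repo):
            continue
        repo_source = source / repo.stem
        if repo_source.exists():
            # Already extracted on a previous run; leave it untouched.
            continue
        with zipfile.ZipFile(repo, 'r') as repo_zip:
            repo_zip.extractall(repo_source)
        extracted.append(repo.stem)
    return extracted
```

Calling `extract_all(Path('data/download'), Path('data/source'))` would correspond to the readme's step 4, assuming the data path resolves to `data/`; the committed script obtains that location from `lab.get_data_path()` instead.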

python_autocomplete/remove_non_source_files.py

Whitespace-only changes.

readme.md

Lines changed: 2 additions & 2 deletions
@@ -6,10 +6,10 @@ This repo trains deep learning models on source code.
 
 1. Clone this repo
 2. Install requirements from `requirements.txt`
-3. Download Github repos by running `download.py`.
+3. Download Github repos by running `python_autocomplete/download.py`.
 It downloads all the repos mentioned in
 [PyTorch awesome list](https://github.com/bharathgs/Awesome-pytorch-list).
-4. Run `extrat_downloads.sh` to extract the downloaded zip files to `data/source`.
+4. Run `python_autocomplete/extract_downloads.py` to extract the downloaded zip files to `data/source`.
 You can directly copy any python code to `data/source` to train on them.
 5. Run `create_dataset.py` to collect all python files.
 The collected code will be written to `data/train.py` and, `data/eval.py`.
