Commit f88927a

fix: move data files in data
1 parent a91e52f commit f88927a

2 files changed

Lines changed: 1 addition & 1 deletion

Makefile

@@ -38,7 +38,7 @@ download_collinfo:
 CC-MAIN-2024-22.warc.paths.gz:
 	@echo "downloading the list from s3, requires s3 auth even though it is free"
 	@echo "note that this file should be in the repo"
-	aws s3 ls s3://commoncrawl/cc-index/table/cc-main/warc/crawl=CC-MAIN-2024-22/subset=warc/ | awk '{print $$4}' | gzip -9 > CC-MAIN-2024-22.warc.paths.gz
+	aws s3 ls s3://commoncrawl/cc-index/table/cc-main/warc/crawl=CC-MAIN-2024-22/subset=warc/ | awk '{print $$4}' | gzip -9 > data/CC-MAIN-2024-22.warc.paths.gz
 
 duck_ccf_local_files: build
 	@echo "warning! only works on Common Crawl Foundadtion's development machine"
