Closed
11 changes: 3 additions & 8 deletions .github/workflows/build-image.yaml
@@ -3,14 +3,16 @@ name: Image Build
on:
push:
branches:
- - naf2008
+ - fasttext-meta
tags:
- "*"
paths:
- requirements.txt
- Dockerfile
# Allows you to run this workflow manually from the Actions tab
pull_request:
workflow_dispatch:


jobs:
image-build:
@@ -27,13 +29,6 @@ jobs:
with:
images: inseefrlab/codif-ape-train

- - name: Make free space
- # https://github.com/actions/virtual-environments/issues/2840
- run: |
- sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/share/boost "$AGENT_TOOLSDIRECTORY"
- docker rmi -f $(docker images -aq)
- shell: bash

- name: Set up QEMU
uses: docker/setup-qemu-action@v3

4 changes: 2 additions & 2 deletions argo-workflows/train-workflow.yaml
@@ -72,10 +72,10 @@ spec:
- name: MODEL_CLASS
- name: EXPERIMENT_NAME
container:
- image: inseefrlab/codif-ape-train:naf2008
+ image: inseefrlab/codif-ape-train:fasttext-meta
imagePullPolicy: Always
command: ["/bin/bash", -c]
- args: ["git clone -b naf2008 https://github.com/InseeFrLab/codif-ape-train.git &&\
+ args: ["git clone -b fasttext-meta https://github.com/InseeFrLab/codif-ape-train.git &&\
cd codif-ape-train/ &&\
export MLFLOW_EXPERIMENT_NAME={{inputs.parameters.EXPERIMENT_NAME}} &&\
mlflow run ~/work/codif-ape-train/ \
1 change: 0 additions & 1 deletion requirements.txt
@@ -16,4 +16,3 @@ sentencepiece
accelerate
datasets
evaluate
- torchFastText @ git+https://github.com/inseefrlab/torch-fastText@package
4 changes: 2 additions & 2 deletions src/utils/data.py
@@ -44,9 +44,9 @@ def get_sirene_4_data(
fs = get_file_system()

if revision == "NAF2008":
- path = "projet-ape/extractions/20241027_sirene4.parquet"
+ path = "projet-ape/extractions/domain_specific_cleaned/full_dataset_20241027_sirene4_nacerev2_fuzzy_regex_similarity.parquet"
elif revision == "NAF2025":
- path = "projet-ape/NAF-revision/relabeled-data/20241027_sirene4_nace2025.parquet"
+ path = "projet-ape/extractions/domain_specific_cleaned/full_dataset_20241027_sirene4_nace2025_fuzzy_regex_similarity.parquet"
else:
raise ValueError("Revision must be either 'NAF2008' or 'NAF2025'.")

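The `get_sirene_4_data` hunk in `src/utils/data.py` swaps two hard-coded paths inside an `if`/`elif` chain. One possible follow-up, not part of this PR, would be to make that selection table-driven so that a future path change touches a single mapping. A minimal sketch, assuming a hypothetical helper `resolve_sirene4_path` and a hypothetical constant `SIRENE4_PATHS` (neither exists in the repository); the two paths and the error message are copied from the diff:

```python
# Hypothetical refactor sketch, not code from the PR.
# Maps each supported NAF revision to the parquet path introduced by this diff.
SIRENE4_PATHS = {
    "NAF2008": (
        "projet-ape/extractions/domain_specific_cleaned/"
        "full_dataset_20241027_sirene4_nacerev2_fuzzy_regex_similarity.parquet"
    ),
    "NAF2025": (
        "projet-ape/extractions/domain_specific_cleaned/"
        "full_dataset_20241027_sirene4_nace2025_fuzzy_regex_similarity.parquet"
    ),
}


def resolve_sirene4_path(revision: str) -> str:
    """Return the dataset path for a revision, or raise ValueError if unknown."""
    try:
        return SIRENE4_PATHS[revision]
    except KeyError:
        # Same message as the original if/elif/else branch.
        raise ValueError("Revision must be either 'NAF2008' or 'NAF2025'.") from None
```

Inside `get_sirene_4_data`, the branch would then reduce to `path = resolve_sirene4_path(revision)`, with the error behaviour for unknown revisions unchanged.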