gitblankhub/2023DCC
2023 Data Creator Camp : Korean Food Image Classification

This project focuses on building and analyzing image classification models for Korean food datasets, primarily using ResNet architectures. The goal was to classify various Korean food images and then adapt the best-performing model for a health-focused Korean food classification task through transfer learning.

Project duration : Sep 2023 - Dec 2023

Dataset Sources

Note: The final competition submission files, including specific code for the best-performing models, are not uploaded in this repository.

0. Data Overview

  • kfood_train: 33,593 images, 42 classes of general Korean food (갈비구이 grilled ribs, 갈치구이 grilled hairtail, 훈제오리 smoked duck, etc.)
  • kfood_val: 4,198 images, evaluation set for kfood_train.
  • kfood_health_train: 14,115 images, 13 classes of health-focused Korean food (갈비찜 braised ribs, 된장찌개 soybean-paste stew, 부대찌개 sausage stew, etc.). Used for the transfer learning task.
  • kfood_health_val: 1,764 images, evaluation set for kfood_health_train.

Data Preparation : An 8:2 train/validation split was used for dataset preparation.
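
The 8:2 split can be sketched with PyTorch's `random_split`; the helper name and fixed seed below are illustrative, not the team's actual code.

```python
import torch
from torch.utils.data import random_split


def split_8_2(dataset, seed: int = 42):
    """Split any indexable dataset 8:2 into train/validation subsets."""
    n_train = int(0.8 * len(dataset))
    n_val = len(dataset) - n_train
    return random_split(
        dataset, [n_train, n_val],
        generator=torch.Generator().manual_seed(seed),
    )


# With the kfood data this would be used as, e.g.:
# from torchvision import datasets
# full = datasets.ImageFolder("kfood_train")
# train_set, val_set = split_8_2(full)
```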

Mission 1 : Initial Korean Food Classification

Build a baseline image classification model using ResNet18 on the kfood datasets and analyze its performance, particularly focusing on overfitting.

Model & Training Details
Following the competition instructions:

  • Architecture: ResNet18 (without pre-trained weights)
  • Epochs: 50
  • Loss Function: Cross Entropy
  • Optimizer: SGD (Stochastic Gradient Descent)
  • Learning Rate: 0.001
  • Batch Size: 32
  • Image Preprocessing: Resize to 224x224 pixels.

Results

  • Validation Accuracy: 0.91
  • Test Accuracy: 0.6065
  • Observation: Significant overfitting was observed, indicating the model performed very well on the training data but generalized poorly to unseen test data.

[Study] ResNet Concepts
Deep Residual Learning for Image Recognition Paper
ResNets address the degradation problem in deep neural networks (where increasing depth leads to higher training error, not just overfitting) by introducing residual learning. Instead of directly learning a mapping $H(x)$, ResNets learn a residual mapping $F(x) = H(x) - x$. This is facilitated by identity shortcut connections which allow the output of a block to be the sum of the original input and the output of the stacked layers: Output $= F(x) + x$. This structure makes training very deep networks much easier.
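
A minimal residual block illustrating Output $= F(x) + x$; the channel counts and layer choices here are illustrative, and the real ResNet basic block additionally handles stride and channel changes with a projection shortcut.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Simplified ResNet basic block: output = F(x) + x."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                       # identity shortcut connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))    # the residual mapping F(x)
        return self.relu(out + identity)   # F(x) + x
```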

Mission 2: Improving Korean Food Classification Accuracy

Enhance the classification accuracy on the kfood datasets by experimenting with deeper ResNet architectures, advanced optimizers, and image augmentation techniques.

Our team's approach: Extensive tuning was performed, including the ResNet34, ResNet50, ResNet101, and ResNet152 architectures, various augmentation combinations, and learning-rate and epoch tuning.

Model & Training Details

  • Architecture: ResNet50
  • Epochs: 50
  • Loss Function: Cross Entropy
  • Optimizer: Adam (Adaptive Moment Estimation)
  • Learning Rate: 0.005
  • Batch Size: 32
  • Image Preprocessing: Resize to 224x224 pixels
  • Improvements:
    (1) Normalization: RGB channel-wise normalization applied to images.
    (2) Image Augmentation (for training data): RandomRotation, CenterCrop.

Results

  • Validation Accuracy: 0.71
  • Test Accuracy: 0.745

Final Attempt

  • Architecture: ResNet101
  • Epochs: 70
  • Optimizer: Adam
  • Learning Rate: 0.001
  • Batch Size: 64 (to reduce time)
  • Image Preprocessing: Resize to 224x224 pixels

Results

  • Test Accuracy: 0.7918

Mission 3 : Transfer Learning for Health Food Classification

Utilize the best-performing model from Mission 2 as a pre-trained model for classifying a new dataset of health-focused Korean food images into 13 classes.

Our team's approach: Transfer learning was applied by loading the checkpoint from Mission 2's best model (mission2.pt) and adapting it for the new task.

Model & Training Details

  • Base Architecture: ResNet101 (pre-trained with mission2.pt)
  • Output Layer Modification: The final Fully Connected (FC) layer was modified to output 13 classes, matching the kfood_health dataset.
  • Batch Size: 64
  • Epochs: 50
  • Optimizer: Adam
  • Learning Rate: 0.001

Fine-tuning Strategies Explored:

  1. Full fine-tuning: all layers' weights were unfrozen and re-trained. (This gave slightly better performance.)
  2. Partial freezing: initial layers were frozen, and only later layers were re-trained.
  3. Linear probing: all layers of the pre-trained model were frozen, and only the final FC layer was trained.

Results

  • Test Accuracy: 0.95 (initial) -> 0.98 (final)

[Study] Transfer Learning & Fine-tuning
PyTorch Tutorials: Transfer Learning.
Transfer Learning: A machine learning technique where a model pre-trained on a large dataset for a general task is repurposed as the starting point for a new, related task. This avoids training a model from scratch, leveraging learned features.
Fine-tuning: A specific type of transfer learning where, after replacing the final layers to match the new task's number of classes, the entire model or a subset of its layers are further trained on the new dataset. Initial layers can optionally be "frozen" (their weights not updated) to preserve general feature extractors.
