
# On-Device AI: ON THE AIr


Welcome to the On-Device AI: ON THE AIr repository! We research On-Device AI with an emphasis on common model compression techniques, reviewing papers and benchmarking real-world performance on NVIDIA Jetson devices. Join us in advancing On-Device AI through open collaboration and innovation! 🚀

## 🌟 Project Vision

"Propose the optimal model compression techniques for NVIDIA Jetson devices by leveraging the knowledge gained from research paper reviews on model compression methods."

- Learn a range of pruning techniques during this (10th) season.
- Apply the learned model compression methods to existing models.
- Measure real performance on the NVIDIA Jetson platform (see the timing sketch after this list).
- Share the results for collaborative insights and community contribution.
- Foster synergy between individual growth and collective intelligence.
- Promote a knowledge-sharing culture rooted in the open-source spirit.
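
As a first taste of the benchmarking step, the sketch below measures model latency with CUDA events. It is a minimal sketch, assuming PyTorch with CUDA is available on the Jetson; the model and input shape are illustrative placeholders rather than the project's actual targets.

```python
# Minimal latency benchmark for a Jetson-class device (illustrative model/shape).
import torch
import torchvision

model = torchvision.models.resnet18().cuda().eval()
x = torch.randn(1, 3, 224, 224, device="cuda")

# Warm up so one-time CUDA initialization does not skew the measurement.
with torch.no_grad():
    for _ in range(10):
        model(x)

# CUDA events time the GPU work itself, not just the Python call.
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
with torch.no_grad():
    start.record()
    for _ in range(100):
        model(x)
    end.record()
torch.cuda.synchronize()

print(f"mean latency: {start.elapsed_time(end) / 100:.2f} ms")  # elapsed_time is in ms
```

The same measurement, run before and after compression, is what turns a pruning experiment into a Jetson benchmark.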

## 🧑 Dynamic Team

| Role | Name | Tech Stack | Main Interests |
| --- | --- | --- | --- |
| Project Manager | 정현우 | Python, PyTorch | On-Device AI, CV, Robotics |
| Member | 김민성 | Python | - |
| Member | 구승연 | Python | - |
| Member | 문규식 | Python | - |
| Member | 박선영 | Python | - |
| Member | 박예리 | Python | - |
| Member | 양문기 | Python | - |
| Member | 최예제 | Python | - |
| Member | 최유진 | Python | - |
| Member | 최해인 | Python | - |

## 🚀 Project Roadmap

```mermaid
gantt
    title 2025 On-Device AI Project Journey
    section Overall Curriculum
    Pruning      :a1, 2025-03-03, 119d
    Quantization :a2, after a1, 120d

    section Pruning Deep Dives
    SPECIFIC OR UNIVERSAL SPEEDUP   :b1, 2025-03-03, 35d
    WHEN TO PRUNE                   :b2, after b1, 84d

    section Hands-On with Jetson
    Object Detection with Pruning   :c1, 2025-04-01, 63d
    LLM with Pruning                :c2, after c1, 30d
    CV with Pruning                 :c3, after c1, 30d
```

## 💻 Activity History

### Paper Review

| Date | Topic | Presenter | Format | Reference | Notes |
| --- | --- | --- | --- | --- | --- |
| 2025/03/05 | OT | 정현우 | Online | - | |
| 2025/03/12 | Unstructured Pruning | 구승연 | Online | J. Frankle and M. Carbin, "The lottery ticket hypothesis: Finding sparse, trainable neural networks," in ICLR, 2019. | |
| 2025/03/19 | Structured Pruning | 김민성 | Offline | X. Ma, G. Fang, and X. Wang, "LLM-Pruner: On the structural pruning of large language models," in NeurIPS, vol. 36, 2023, pp. 21702–21720. | |
| 2025/03/26 | Magical Week holiday | TBD | - | - | |
| 2025/04/03 | Semi-structured Pruning | 최유진 | Online | F. Meng, H. Cheng, K. Li, H. Luo, X. Guo, G. Lu, and X. Sun, "Pruning filter in filter," in NeurIPSW, 2020. | |
| 2025/04/09 | Pruning Before Training | 문규식 | Online | S. Liu, T. Chen, X. Chen, L. Shen, D. C. Mocanu, Z. Wang, and M. Pechenizkiy, "The unreasonable effectiveness of random pruning: Return of the most naive baseline for sparse training," in ICLR, 2022. | |
| 2025/04/16 | Pruning During Training: Sparsity Regularization based Methods | 박예리 | Online | W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li, "Learning structured sparsity in deep neural networks," in NIPS, 2016. | |
| 2025/04/23 | Pruning During Training: Dynamic Sparse Training based Methods | 구승연 | Offline | U. Evci, T. Gale, J. Menick, P. S. Castro, and E. Elsen, "Rigging the lottery: Making all tickets winners," in ICML, 2020. | |
| 2025/04/30 | Zero-shot Pruning | 정현우 | Online | H. Wang, B. Dedhia, and N. K. Jha, "Zero-TPrune: Zero-shot token pruning through leveraging of the attention graph in pre-trained transformers," in CVPR, 2024. | Magical Week |
| 2025/05/07 | Pruning During Training: Score-based Methods | 최해인 | Online | Y. He, P. Liu, Z. Wang, Z. Hu, and Y. Yang, "Filter pruning via geometric median for deep convolutional neural networks acceleration," in CVPR, 2019, pp. 4340–4349. | |
| 2025/05/14 | Review wrap-up | 정현우 | Offline | - | Pseudo Con |
| 2025/05/21 | Pruning During Training: Differentiable Pruning based Methods | 정현우 | Online | X. Ning, T. Zhao, W. Li, P. Lei, Y. Wang, and H. Yang, "DSA: More efficient budgeted pruning via differentiable sparsity allocation," in ECCV, 2020, pp. 592–607. | Pseudo Con |
| 2025/05/28 | Pruning After Training: LTH and its Variants | 정현우 | Online | S. Zhang et al., "Why lottery ticket wins? A theoretical perspective of sample complexity on sparse neural networks," in NeurIPS, vol. 34, 2021, pp. 2707–2720. | |
| 2025/06/04 | Pruning After Training: Other Score-based Methods | 김민성 | Offline | X. Men et al., "ShortGPT: Layers in large language models are more redundant than you expect," arXiv preprint arXiv:2403.03853, 2024. | |
| 2025/06/11 | Pruning After Training: Sparsity Regularization based Methods | 최예제 | Online | M. Xia et al., "Sheared LLaMA: Accelerating language model pre-training via structured pruning," arXiv preprint arXiv:2310.06694, 2023. | |
| 2025/06/18 | Pruning After Training: Pruning in Early Training | 양문기 | Online | H. You et al., "Drawing early-bird tickets: Towards more efficient training of deep networks," arXiv preprint arXiv:1909.11957, 2019. | |
| 2025/06/25 | Pruning After Training: Post-Training Pruning | 박선영 | Online | E. Frantar and D. Alistarh, "SparseGPT: Massive language models can be accurately pruned in one-shot," in ICML, 2023. | |
| 2025/07/02 | Run-time Pruning | 정현우 | Offline | Y. Tang et al., "Manifold regularized dynamic network pruning," in CVPR, 2021. | |
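
Most of the pruning categories above can be tried in a few lines before scaling up to the papers' full methods. Below is a minimal, hedged sketch using PyTorch's built-in torch.nn.utils.prune; the layer and sparsity amounts are illustrative choices, not values from any of the papers.

```python
import torch
import torch.nn.utils.prune as prune

conv = torch.nn.Conv2d(16, 32, kernel_size=3)  # illustrative layer

# Unstructured: zero the 30% of individual weights with the smallest |w|
# (the granularity studied in the lottery-ticket line of work).
prune.l1_unstructured(conv, name="weight", amount=0.3)

# Structured: remove 50% of output filters by L2 norm along dim 0
# (the granularity that directly shrinks dense compute).
prune.ln_structured(conv, name="weight", amount=0.5, n=2, dim=0)

# Pruning is applied through a mask (weight_orig + weight_mask);
# prune.remove bakes the zeros permanently into conv.weight.
prune.remove(conv, "weight")
print(f"achieved sparsity: {(conv.weight == 0).float().mean():.2%}")
```

Note that unstructured zeros only pay off with sparse-aware kernels, while structured pruning removes whole filters and speeds up dense hardware directly; that distinction drives the Jetson hands-on track below.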

### Hands-On Pruning with Jetson

| Date | Topic | Format | Notes |
| --- | --- | --- | --- |
| 2025/04/01 | OT and planning | Online | |
| 2025/04/15 | Object detection model selection | Online | |
| 2025/04/29 | [PDT] Sparsity Regularization based Method: implementation and testing | Online | Magical Week |
| 2025/05/06 | ASP-based model training | Online | |
| 2025/05/13 | TensorRT conversion and on-hardware performance comparison | Offline | PseudoCon |
| 2025/05/20 | [PDT] Sparse Training based Methods: implementation and testing | Online | |
| 2025/05/27 | [PDT] Score-based Methods: implementation and testing | Online | |
| 2025/06/03 | [PDT] Differentiable Pruning based Methods: implementation and testing | Online | |
| 2025/06/10 | [PDT] TensorRT conversion of the implemented models and on-hardware performance comparison | Offline | |
| 2025/06/17 | [PAT] LTH and its Variants: implementation and testing | Online | |
| 2025/06/24 | [PAT] Pruning in Early Training: implementation and testing | Online | |
| 2025/07/01 | [PAT] Post-Training Pruning: implementation and testing | Online | |
| 2025/07/08 | Run-time Pruning: implementation and testing | Online | |
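
The ASP and TensorRT rows above boil down to roughly the pipeline sketched below. This is a hedged sketch, assuming NVIDIA's apex library is installed (its ASP module enforces 2:4 semi-structured sparsity) and the engine is built on the Jetson with trtexec; the model, paths, and flags are illustrative.

```python
import torch
import torchvision
from apex.contrib.sparsity import ASP  # assumes NVIDIA apex is installed

model = torchvision.models.resnet18().cuda()  # illustrative model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Apply 2:4 sparsity masks to eligible layers; accuracy is then
# recovered by fine-tuning for a few epochs (loop omitted here).
ASP.prune_trained_model(model, optimizer)
# ... fine-tuning loop ...

# Export to ONNX so TensorRT can consume the sparse model.
dummy = torch.randn(1, 3, 224, 224, device="cuda")
torch.onnx.export(model.eval(), dummy, "model.onnx", opset_version=17)

# On the Jetson, build the engine and let TensorRT exploit the 2:4 pattern:
#   trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 --sparsity=enable
```

Comparing trtexec throughput for the dense and 2:4-sparse engines is the "performance comparison" step in the table.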

### How Sessions Run

#### Paper Review

Each weekly study session proceeds as follows:

1. Catching up and casual conversation (est. 20–30 min)
2. Participants other than the presenter share On-Device AI related news and issues they have prepared (est. 20–40 min)
3. The presenter delivers the prepared paper review (est. 30 min–1 hour)

Accordingly, please prepare the following:

**Everyone**

- Read the paper assigned for that week.

**Presenter**

- Prepare a presentation on that week's paper.

**Participants**

- Prepare trends or issues around On-Device AI technologies (e.g., TensorRT, LiteRT, ONNX).

## 💡 Learning Resources

For the individual papers, see the Reference column in the Activity History tables above.


## 🌱 How to Engage

### Logistics

- Time: every Wednesday, 8 PM (KST)
- Location: online / offline (Gangnam Station)

### Requirements

- Interest in On-Device AI (model compression, optimization, etc.)
- Commitment to participate consistently for the full four months
- Basic knowledge of deep learning
- Ability to read and review research papers

To join as a team member, please apply during the Learner recruitment period.

- Link (coming soon)

Anyone can also sit in on our meetings as a listener:

1. Join the Discord #Room-GH channel at the regular meeting time; no sign-up required.
2. Attend events during Magical Week.
3. Meet us at Pseudo Lab events.

## About Pseudo Lab 👋🏼

Pseudo Lab is a non-profit organization focused on advancing machine learning and AI technologies. Our core values of Sharing, Motivation, and Collaborative Joy drive us to create impactful open-source projects. With a community of more than 5,000 researchers, we pursue that mission through open collaboration.

## Contributors 😃



## License 🗞

This project is licensed under the MIT License.