Commit 2b9bda4: [update] README
1 parent 3e7f8b1 commit 2b9bda4
3 files changed: 194 additions & 100 deletions

README.md: 137 additions & 88 deletions

Previous version:
# Module-LLM

<div class="product_pic"><img class="pic" src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/4.webp" width="25%">

## Description

**Module LLM** is an integrated offline Large Language Model (LLM) inference module designed for terminal devices that require efficient, intelligent interaction. Whether for smart homes, voice assistants, or industrial control, Module LLM provides a smooth and natural AI experience without relying on the cloud, ensuring privacy and stability. Integrated with the **StackFlow** framework and **Arduino/UiFlow** libraries, smart features can be implemented with just a few lines of code.<br>

Powered by the advanced **AX630C** SoC, it integrates a 3.2 TOPS high-efficiency NPU with native support for Transformer models, handling complex AI tasks with ease. Equipped with **4GB LPDDR4** memory (1GB available for user applications, 3GB dedicated to hardware acceleration) and **32GB eMMC** storage, it supports parallel loading and sequential inference of multiple models, ensuring smooth multitasking. The main chip's runtime power consumption is approximately 1.5W, making it highly efficient and suitable for long-term operation.<br>

It features a built-in microphone, speaker, TF card slot, **USB OTG**, and RGB status light, meeting diverse application needs with support for voice interaction and data transfer. The module offers flexible expansion: the onboard SD card slot supports cold/hot firmware upgrades, and the **UART** communication interface simplifies connection and debugging, ensuring continuous optimization and expansion of module functionality. The USB port supports master/slave auto-switching, serving both as a debugging port and as a connection for additional USB devices such as cameras. Users can purchase the LLM debugging kit to add a 100 Mbps Ethernet port and a kernel serial port, using the module as an SBC.<br>

The module is compatible with multiple models and comes pre-installed with the **Qwen2.5-0.5B** language model. It provides **KWS** (wake word), **ASR** (speech recognition), **LLM** (large language model), and **TTS** (text-to-speech) functionality, supporting standalone calls or automatic **pipeline** hand-off for convenient development. Future support includes the Qwen2.5-1.5B, Llama3.2-1B, and InternVL2-1B models, with hot model updates to keep up with community trends and accommodate various complex AI tasks. Vision recognition capabilities include CLIP and YoloWorld, with DepthAnything, SegmentAnything, and other advanced models planned to further enhance intelligent recognition and analysis.<br>

Plug and play with **M5 hosts**, Module LLM offers an easy-to-use AI interaction experience. Users can quickly integrate it into existing smart devices without complex settings, enabling smart functionality and improving device intelligence. This product is suitable for offline voice assistants, text-to-speech conversion, smart home control, interactive robots, and more.
## Product Features

- Offline inference, 3.2 TOPS @INT8 computing power
- Integrated KWS (wake word), ASR (speech recognition), LLM (large language model), and TTS (text-to-speech) units
- Multi-model parallel processing
- Onboard 32GB eMMC storage and 4GB LPDDR4 memory
- Onboard microphone and speaker
- Serial communication
- SD card firmware upgrade
- Supports ADB debugging
- RGB indicator light
- Built-in Ubuntu system
- Supports USB OTG functionality
- Compatible with Arduino/UIFlow
## Applications

- Offline voice assistants
- Text-to-speech conversion
- Smart home control
- Interactive robots
## Specifications

| Specification    | Parameter                                                                                   |
| ---------------- | ------------------------------------------------------------------------------------------- |
| Processor SoC    | AX630C @ dual-core Cortex-A53 1.2 GHz <br> Max. 12.8 TOPS @INT4 and 3.2 TOPS @INT8          |
| Memory           | 4GB LPDDR4 (1GB system memory + 3GB dedicated to hardware acceleration)                      |
| Storage          | 32GB eMMC5.1                                                                                 |
| Communication    | Serial communication, default baud rate 115200@8N1 (adjustable)                              |
| Microphone       | MSM421A                                                                                      |
| Audio Driver     | AW8737                                                                                       |
| Speaker          | 1W, Size: 2014 cavity speaker                                                                |
| Built-in Units   | KWS (wake word), ASR (speech recognition), LLM (large language model), TTS (text-to-speech)  |
| RGB Light        | 3x RGB LED@2020, driven by LP5562 (status indication)                                        |
| Power            | Idle: 5V@0.5W, Full load: 5V@1.5W                                                            |
| Button           | For entering download mode for firmware upgrade                                              |
| Upgrade Port     | SD card / Type-C port                                                                        |
| Working Temp     | 0-40°C                                                                                       |
| Product Size     | 54x54x13mm                                                                                   |
| Packaging Size   | 133x95x16mm                                                                                  |
| Product Weight   | 17.4g                                                                                        |
| Packaging Weight | 32.0g                                                                                        |
## Related Links

- [AX630C](https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/AX630C.pdf)
## PinMap

| Module LLM   | RXD | TXD |
| ------------ | --- | --- |
| Core (Basic) | G16 | G17 |
| Core2        | G13 | G14 |
| CoreS3       | G18 | G17 |

> LLM Module Pin Switching: the LLM Module has reserved soldering pads for pin switching. If a pin multiplexing conflict occurs, the PCB trace can be cut and reconnected to another set of pins.

<img alt="module size" src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/03.jpg" width="25%" />

> Taking `CoreS3` as an example, the first column (left green box) selects the TX pin for serial communication; users can choose one of four options (from top to bottom: G18, G7, G14, and G10). The default is G18. To switch to a different pin, cut the connection on the solder pad (at the red line), preferably with a blade, and then bridge one of the three remaining pads below. The second column (right green box) selects the RX pin and likewise offers a choice of one out of four options.
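For host-side scripts that drive the module over serial, the PinMap table above can be captured directly as a lookup. The `MODULE_LLM_PINS` and `pins_for` names below are hypothetical conveniences for illustration; the GPIO numbers come from the table, and you should verify them against your hardware revision.

```python
# Default serial pins (module RXD/TXD) per M5 host, from the PinMap table.
MODULE_LLM_PINS = {
    "Core (Basic)": {"RXD": "G16", "TXD": "G17"},
    "Core2":        {"RXD": "G13", "TXD": "G14"},
    "CoreS3":       {"RXD": "G18", "TXD": "G17"},
}

def pins_for(host):
    """Look up the default RXD/TXD pins for a given M5 host."""
    return MODULE_LLM_PINS[host]
```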
## Video

- Module LLM product introduction and example showcase: [Module_LLM_Video.mp4](https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/Module_LLM_Video.mp4)

## AI Benchmark Comparison

<img alt="compare" src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/Benchmark%E5%AF%B9%E6%AF%94.png" width="100%" />
New version:

# StackFlow

<p align="center"><img src="https://static-cdn.m5stack.com/resource/public/assets/m5logo2022.svg" alt="basic" width="300" height="300"></p>

<p align="center">
StackFlow is a simple, fast, and elegant one-stop AI service infrastructure project for embedded developers. Its goal is to let Makers and Hackers quickly add powerful AI acceleration to today's embedded devices, bringing intelligence to all kinds of human-machine interaction devices.
</p>

<p align="center">
<a href="https://opencollective.com/wukong-robot/contribute/tier/8131-sponsor" target="_blank"><img src="https://avatars.githubusercontent.com/u/17420673?s=48&v=4"></a>
</p>
## Table of Contents

* [Features](#features)
* [Demo](#demo)
* [System Requirements](#systemrequirements)
* [Compile](#compile)
* [Installation](#installation)
* [Upgrade](#upgrade)
* [Run](#run)
* [Configuration](#configuration)
* [Interface](#interface)
* [Contribution](#contribution)
## Features
<!-- ![](doc/assets/network.png) -->
* Distributed communication architecture. Each unit can run independently or collaborate with other units.
* Support for multiple models, including but not limited to speech recognition, speech synthesis, image recognition, natural language processing, and LLM assistant inference.
* Internal data flow. Different units can be configured to work together as needed, avoiding complex data-processing pipelines.
* Simple and easy to use. Data is exchanged as standard JSON, so AI services can be implemented quickly.
* Offline operation. Local AI services run without an internet connection.
* Multi-platform support, including but not limited to Module LLM, LLM630 Compute Kit, etc.
* Flexible configuration. Every unit's operational parameters are fully configurable, allowing model swapping and parameter changes within the same data-flow scenario.
* Developer-friendly. Developers only need to focus on the model and the hardware platform, not the underlying communication and data-processing details.
* Efficient and stable. Data is transmitted over ZMQ channels, giving high efficiency, low latency, and strong stability.
* Open source and free. StackFlow is licensed under the MIT License.
* Multilingual support. The core units are implemented in C++ with heavy performance optimization, and bindings for other programming languages can be added (ZMQ support is required).

StackFlow is continuously being optimized and iterated; more features will be added as the framework matures. Stay tuned.

Main working mode of the StackFlow voice assistant:

After startup, KWS, ASR, LLM, TTS, and AUDIO are configured to work together. When KWS detects a keyword in the audio stream coming from the AUDIO unit, it sends a wake-up signal. ASR then starts recognizing the audio data from AUDIO and publishes the results to its output channel. Once LLM receives the text produced by ASR, it begins inference and publishes the results to its output channel. TTS, upon receiving the LLM results, synthesizes speech and plays the audio according to its configuration.
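The working mode above can be sketched as a chain of units, each publishing its output to the next subscriber. This is a pure-Python illustration of the data flow only, not the real StackFlow implementation (which runs separate processes communicating over ZMQ); the unit handlers are stubs standing in for the real models.

```python
# Illustrative KWS -> ASR -> LLM -> TTS pipeline, with stubbed handlers.
class Unit:
    """A StackFlow-style unit: consumes an input payload and publishes
    its handler's result to every subscribed downstream unit."""
    def __init__(self, name, handler):
        self.name = name
        self.handler = handler
        self.subscribers = []

    def publish(self, payload):
        for sub in self.subscribers:
            sub.receive(payload)

    def receive(self, payload):
        result = self.handler(payload)
        if result is not None:          # None means "drop" (e.g. no keyword)
            self.publish(result)

def connect(upstream, downstream):
    upstream.subscribers.append(downstream)

# Stub handlers standing in for the real models.
kws = Unit("kws", lambda audio: audio if "hello" in audio else None)
asr = Unit("asr", lambda audio: audio.replace("audio:", "text:"))
llm = Unit("llm", lambda text: text + " -> reply")
tts = Unit("tts", lambda reply: f"speak({reply})")
results = []
sink = Unit("audio_out", lambda s: results.append(s))

for up, down in [(kws, asr), (asr, llm), (llm, tts), (tts, sink)]:
    connect(up, down)

kws.receive("audio: hello stackflow")   # wake word present -> full pipeline runs
kws.receive("audio: background noise")  # no wake word -> dropped at KWS
```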
## Demo

- [StackFlow continuous speech recognition](./projects/llm_framework/README.md)
- [StackFlow LLM large-model wake-up dialogue](./projects/llm_framework/README.md)
- [StackFlow TTS speech synthesis and playback](./projects/llm_framework/README.md)
- [StackFlow YOLO visual detection](https://github.com/Abandon-ht/ModuleLLM_Development_Guide/tree/dev/ESP32/cpp)
- [StackFlow VLM image description](https://github.com/Abandon-ht/ModuleLLM_Development_Guide/tree/dev/ESP32/cpp)
## SystemRequirements
The current StackFlow AI units are built on the AXERA acceleration platform; the main chip platforms are AX630C and AX650N. The system requirement is Ubuntu.
## Compile
StackFlow mainly runs on embedded Linux devices. In general, compile on a Linux host. The cross-compilation toolchain is aarch64-none-linux-gnu.
```bash
# Install the x86 cross-compilation toolchain
wget https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/linux/llm/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu.tar.gz
sudo tar zxvf gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu.tar.gz -C /opt

# Install dependencies
sudo apt install python3 python3-pip libffi-dev
pip3 install parse scons requests kconfiglib

# Download the StackFlow source code
git clone https://github.com/m5stack/StackFlow.git
cd StackFlow
git submodule update --init
cd projects/llm_framework
scons distclean

# Compile. Note: compilation downloads source code, binary libraries, and other
# files, so a stable internet connection is required.
scons -j22

# Package the deb files. Note: because the LLM model files are large, packaging
# requires considerable disk space (128GB or more is recommended). Many binary
# files are downloaded during packaging, so be aware of data usage.
cd tools
python3 llm_pack.py
```
## Installation
The StackFlow program and model data are packaged separately. After installing the program packages, download the model packages separately and configure them into the program: first install the program packages, then the model packages.

Bare-metal installation (run the following commands on the LLM device):
```bash
# First, install the dynamic library dependencies.
dpkg -i ./lib-llm_1.4-m5stack1_arm64.deb
# Then, install the llm-sys main unit.
dpkg -i ./llm-sys_1.4-m5stack1_arm64.deb
# Install the other llm units.
dpkg -i ./llm-xxx_1.4-m5stack1_arm64.deb
# Install the model packages.
dpkg -i ./llm-xxx_1.4-m5stack1_arm64.deb
# Note: lib-llm_1.4-m5stack1_arm64.deb must be installed before
# llm-sys_1.4-m5stack1_arm64.deb; the other llm units and model packages
# can be installed in any order.
```
## Upgrade
You can upgrade a single AI unit or the entire StackFlow framework.
A single unit can be upgraded via SD card or installed manually with `dpkg`. Note that minor-version packages can be installed individually, but a major-version upgrade requires reinstalling all llm units completely.
Command-line package upgrade:
```bash
# Install the llm units that need upgrading.
dpkg -i ./llm-xxx_1.4-m5stack1_arm64.deb
```
[Device automatic upgrade installation.](https://docs.m5stack.com/en/guide/llm/llm/image)
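The minor-versus-major rule above can be expressed as a small helper for upgrade scripts. This is an illustrative sketch: the `needs_full_upgrade` name is hypothetical, and the version parsing assumes the `1.4-m5stack1` format seen in the package names in this README.

```python
# Decide whether a candidate package version crosses a major version
# boundary, in which case ALL llm units must be reinstalled.
def needs_full_upgrade(installed, candidate):
    """True when the candidate crosses a major version (e.g. 1.x -> 2.x)."""
    def major(version):
        # "1.4-m5stack1" -> upstream part "1.4" -> major component 1
        return int(version.split("-")[0].split(".")[0])
    return major(candidate) != major(installed)
```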
## Run
The relevant AI services run automatically at startup and can also be started manually from the command line.
Check the running status of the sys unit:
```bash
systemctl status llm-sys
```
The standard systemd service commands apply.
## Configuration
StackFlow's configuration falls into two categories: unit operation parameters and model operation parameters.
Both types of configuration files use the JSON format and are located in several directories:
```
/opt/m5stack/data/models/
/opt/m5stack/share/
```
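A host- or device-side script can enumerate these JSON files directly. The sketch below is illustrative: `load_configs` is a hypothetical helper, and the exact file names in those directories are not specified here, so it simply collects every `*.json` file it finds.

```python
# Collect all JSON configuration files from the StackFlow config directories.
import json
import os

CONFIG_DIRS = ["/opt/m5stack/data/models/", "/opt/m5stack/share/"]

def load_configs(dirs=CONFIG_DIRS):
    """Return {filename: parsed JSON} for every *.json file found."""
    configs = {}
    for directory in dirs:
        if not os.path.isdir(directory):   # skip missing dirs (e.g. off-device)
            continue
        for name in sorted(os.listdir(directory)):
            if name.endswith(".json"):
                with open(os.path.join(directory, name)) as f:
                    configs[name] = json.load(f)
    return configs
```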
## Interface
StackFlow can be accessed via UART and TCP. The default UART baud rate is 115200 and the default TCP port is 10001; both can be changed in the configuration files.
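A minimal client for the TCP interface might look like the following. This is a sketch under stated assumptions: the JSON field names (`request_id`, `work_id`, `action`) follow common StackFlow examples but are not specified in this README, so consult the protocol documentation for the exact schema before relying on them.

```python
# Build and send one StackFlow-style JSON request over TCP port 10001.
import json
import socket

def build_request(work_id, action, data=None, request_id="1"):
    """Serialize one newline-terminated JSON request (assumed schema)."""
    msg = {"request_id": request_id, "work_id": work_id, "action": action}
    if data is not None:
        msg["data"] = data
    return (json.dumps(msg) + "\n").encode("utf-8")

def send_request(host, payload, port=10001, timeout=5.0):
    """Send one request to the device and return the raw reply line."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(payload)
        return s.makefile().readline()

# Example payload (constructed only; sending requires a real device):
ping = build_request("sys", "ping")
```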
## Contribution

* If you like this project, please give it a star first;
* To report a bug, please go to the [issue page](https://github.com/m5stack/StackFlow/issues);
* If you want to contribute code, feel free to fork the repository and submit a pull request;

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=m5stack/StackFlow&type=Date)](https://star-history.com/#m5stack/StackFlow&Date)
