Local Speech To Speech Translator is a translation and transcription service built with Spring Boot. It combines AI models for speech-to-text transcription, text-to-text translation, and text-to-speech synthesis. All processing runs on your local machine with locally hosted models; no external API calls or cloud services are used. Your data never leaves your device, which preserves privacy and allows offline operation.
Demo video: test_with_english_audio.mov
- Speech-to-Text: Transcribe audio using Whisper JNI, running locally.
- Text-to-Speech: Generate speech from text using tts-edge-java and JavaFX, all on-device.
- Language Translation: Translate text using Ollama LLM via LangChain4J, with models running locally.
- REST API: Expose translation and transcription endpoints for local use.
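The three features above chain into one speech-to-speech flow. The following is a minimal sketch of that chaining, not the project's actual code: `SpeechPipeline` and its three function-typed stages are hypothetical names introduced here for illustration, and the real services (for example `TranscriptionService` and `TranslationService`) may be shaped differently.

```java
import java.util.function.Function;

// Hypothetical sketch: each stage is a plain function so the flow is easy to follow.
// In the project these roles are filled by Whisper JNI, Ollama via LangChain4J,
// and tts-edge-java respectively.
final class SpeechPipeline {
    private final Function<byte[], String> transcriber;  // speech-to-text
    private final Function<String, String> translator;   // text-to-text
    private final Function<String, byte[]> synthesizer;  // text-to-speech

    SpeechPipeline(Function<byte[], String> transcriber,
                   Function<String, String> translator,
                   Function<String, byte[]> synthesizer) {
        this.transcriber = transcriber;
        this.translator = translator;
        this.synthesizer = synthesizer;
    }

    // Runs the stages in order: audio -> transcript -> translated text -> audio.
    byte[] run(byte[] sourceAudio) {
        String transcript = transcriber.apply(sourceAudio);
        String translated = translator.apply(transcript);
        return synthesizer.apply(translated);
    }
}
```

Keeping each stage behind a small function type makes it possible to swap in stubs when testing the flow without a model loaded.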
local-speech-to-speech-translator/
├── src/
│   ├── main/
│   │   ├── java/in/ravir/ai/
│   │   │   ├── controller/    # API endpoints (e.g., TranslationController)
│   │   │   ├── service/       # Business logic (TranslationService, TranscriptionService)
│   │   │   ├── processor/     # Audio/AI processing (AudioStreamProcessor, etc.)
│   │   │   ├── assistant/     # AI integration (OllamaTranslationAssistant)
│   │   │   ├── repository/    # In-memory storage (TranslationInMemory)
│   │   │   └── LocalS2STApplication.java
│   │   └── resources/
│   │       └── application.yml    # Configuration
│   └── test/
│       └── java/in/ravir/ai/service/TranslationServiceTest.java
├── pom.xml    # Maven build file
├── LICENSE
└── README.md
- Java 21+
- Maven 3.8+
- Ollama running locally (for LLM translation)
- Clone the repository:

      git clone <repo-url>
      cd local-speech-to-speech-translator

- Download a Whisper model, preferably ggml-large-v3.bin, from Hugging Face and place it in your home directory (the location given by the Java user.home system property).
- Start Ollama:
  - Download and run Ollama from https://ollama.com/
  - Pull the required model:

        ollama pull mrjacktung/mradermacher-llamax3-8b-alpaca-gguf
        ollama serve

- Build and run the application:

      mvn clean install
      mvn spring-boot:run
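Before starting the application it can help to confirm that the Whisper model is where the setup steps put it. The sketch below assumes the model file sits directly in the `user.home` directory under the name `ggml-large-v3.bin`; the class `WhisperModelLocator` is a hypothetical helper, and the application's own lookup logic may differ.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: resolves a model file name against the user.home directory.
final class WhisperModelLocator {
    // File name suggested by the setup steps; another ggml model name would work the same way.
    static final String DEFAULT_MODEL_FILE = "ggml-large-v3.bin";

    // Builds <user.home>/<fileName> without touching the file system.
    static Path modelPath(String fileName) {
        return Path.of(System.getProperty("user.home"), fileName);
    }

    // True only if the path exists and is a regular file (not a directory).
    static boolean isInstalled(String fileName) {
        return Files.isRegularFile(modelPath(fileName));
    }
}
```

Calling `WhisperModelLocator.isInstalled(WhisperModelLocator.DEFAULT_MODEL_FILE)` at startup gives an early, readable failure instead of an error from the native library later on.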
Edit src/main/resources/application.yml as needed:
    spring:
      application:
        name: Local Speech To Speech Translator
      servlet:
        multipart:
          max-file-size: 10MB
          max-request-size: 10MB
    langchain4j:
      ollama:
        chat-model:
          base-url: http://localhost:11434
          model-name: mrjacktung/mradermacher-llamax3-8b-alpaca-gguf

All dependencies are managed via Maven (pom.xml). Key dependencies:
- Spring Boot (web, test)
- LangChain4J (langchain4j-ollama-spring-boot-starter, langchain4j-spring-boot-starter)
- Whisper JNI (io.github.givimad:whisper-jni): Speech-to-text
- TTS Edge Java (io.github.whitemagic2014:tts-edge-java): Text-to-speech
- JavaFX Media (org.openjfx:javafx-media): Audio playback
- Lombok (org.projectlombok:lombok): Code annotations
See pom.xml for full details and versions.
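In the project, LangChain4J talks to Ollama using the `base-url` and `model-name` from application.yml. To make those two settings concrete, here is a sketch that builds (but does not send) a request for Ollama's `/api/generate` endpoint using only the JDK. The class `OllamaRequestSketch` and the hand-rolled JSON escaping are assumptions made for illustration; LangChain4J builds its own requests and may use a different Ollama endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch of a direct Ollama call; the project itself goes through LangChain4J.
final class OllamaRequestSketch {
    static final String BASE_URL = "http://localhost:11434";
    static final String MODEL = "mrjacktung/mradermacher-llamax3-8b-alpaca-gguf";

    // Minimal JSON string escaping for quotes, backslashes, and control characters.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
                }
            }
        }
        return sb.toString();
    }

    // Request body with streaming disabled so the reply arrives as one JSON object.
    static String body(String model, String prompt) {
        return "{\"model\":\"" + escapeJson(model)
                + "\",\"prompt\":\"" + escapeJson(prompt)
                + "\",\"stream\":false}";
    }

    // Builds the POST request; nothing is sent until an HttpClient executes it.
    static HttpRequest request(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(model, prompt)))
                .build();
    }
}
```

With Ollama running, the request can be executed with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the prompt wording used for translation is left to the caller.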
- Access the REST API endpoints (see TranslationController.java for details)
- Example test for TTS and audio playback: TranslationServiceTest.java
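When sending audio to the REST endpoints, keep the multipart limits from application.yml in mind: both `max-file-size` and `max-request-size` are set to 10MB. A client-side check along the following lines can reject oversized files before uploading. `UploadLimit` is a hypothetical helper, and the sketch assumes Spring's convention that 1MB means 1024 * 1024 bytes.

```java
// Hypothetical client-side guard mirroring the 10MB multipart limit in application.yml.
final class UploadLimit {
    // 10MB interpreted as 10 * 1024 * 1024 bytes.
    static final long MAX_BYTES = 10L * 1024 * 1024;

    // True if a file of the given size is within the configured file-size limit.
    static boolean fits(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= MAX_BYTES;
    }
}
```

Because the request-size limit is also 10MB and a multipart request carries some framing overhead on top of the file, a file very close to the limit may still be rejected by the server, so leaving some headroom is sensible.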