ravi2519/local-speech-to-speech-translator

Local Speech To Speech Translator

Local Speech To Speech Translator is a translation and transcription service built with Spring Boot. It integrates AI models for speech-to-text transcription, text-to-text translation, and text-to-speech synthesis. All processing is performed on your local machine using locally run models, with no external API calls or cloud services. Your data never leaves your device, which preserves privacy and allows offline operation.

Demo

Click to view embedded video (if supported by your viewer):
test_with_english_audio.mov

Features

  • Speech-to-Text: Transcribe audio using Whisper JNI, running locally.
  • Text-to-Speech: Generate speech from text using tts-edge-java and JavaFX, all on-device.
  • Language Translation: Translate text using Ollama LLM via LangChain4J, with models running locally.
  • REST API: Expose translation and transcription endpoints for local use.
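
Taken together, the features above suggest a three-stage flow: audio is transcribed, the text is translated, and the translation is spoken. The following is a minimal sketch of that flow, not the project's actual code. The interface names (Transcriber, Translator, Synthesizer) and the Pipeline record are hypothetical, and the stages are stubbed with lambdas so the example runs without Whisper, Ollama, or a TTS engine.

```java
import java.nio.charset.StandardCharsets;

public class PipelineSketch {
    // Hypothetical stage types; the project's real classes may be shaped differently.
    interface Transcriber { String transcribe(byte[] audio); }
    interface Translator { String translate(String text, String targetLanguage); }
    interface Synthesizer { byte[] synthesize(String text); }

    // Chains the three stages: speech-to-text, translation, text-to-speech.
    record Pipeline(Transcriber stt, Translator mt, Synthesizer tts) {
        byte[] run(byte[] audio, String targetLanguage) {
            String sourceText = stt.transcribe(audio);
            String translated = mt.translate(sourceText, targetLanguage);
            return tts.synthesize(translated);
        }
    }

    public static void main(String[] args) {
        // Stub stages stand in for the real engines.
        Pipeline pipeline = new Pipeline(
                audio -> "hello",
                (text, lang) -> "[" + lang + "] " + text,
                text -> text.getBytes(StandardCharsets.UTF_8));

        byte[] spoken = pipeline.run(new byte[0], "hi");
        System.out.println(new String(spoken, StandardCharsets.UTF_8));
    }
}
```

Keeping each stage behind a small interface like this makes it possible to test the flow with stubs, which is the only thing this sketch is meant to show.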

Project Structure

local-speech-to-speech-translator/
├── src/
│   ├── main/
│   │   ├── java/in/ravir/ai/
│   │   │   ├── controller/      # API endpoints (e.g., TranslationController)
│   │   │   ├── service/         # Business logic (TranslationService, TranscriptionService)
│   │   │   ├── processor/       # Audio/AI processing (AudioStreamProcessor, etc.)
│   │   │   ├── assistant/       # AI integration (OllamaTranslationAssistant)
│   │   │   ├── repository/      # In-memory storage (TranslationInMemory)
│   │   │   └── LocalS2STApplication.java
│   │   └── resources/
│   │       └── application.yml  # Configuration
│   └── test/
│       └── java/in/ravir/ai/service/TranslationServiceTest.java
├── pom.xml                      # Maven build file
├── LICENSE
└── README.md

Requirements

  • Java 21+
  • Maven 3.8+
  • Ollama running locally (for LLM translation)

Setup

  1. Clone the repository:

    git clone <repo-url>
    cd local-speech-to-speech-translator
  2. Download a Whisper model (preferably "ggml-large-v3.bin") from Hugging Face and place it in your home directory (the location Java reports as user.home).

  3. Start Ollama:

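    • Download and install Ollama from https://ollama.com/
    • Start the Ollama server if it is not already running (the command stays in the foreground, so use a separate terminal):
      ollama serve
    • Pull the required model:
      ollama pull mrjacktung/mradermacher-llamax3-8b-alpaca-gguf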
  4. Build and run the application:

    mvn clean install
    mvn spring-boot:run
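
If the application fails at startup, it can help to confirm the two external prerequisites from the steps above. The sketch below is a standalone check and not part of the project. It assumes the model file sits directly in user.home under the name ggml-large-v3.bin and that Ollama listens on its default address, http://localhost:11434. The class and method names are made up for illustration. It uses Ollama's /api/tags endpoint, which lists locally available models.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

public class SetupCheck {
    // Assumed location: the model file directly inside the given home directory.
    static Path whisperModelPath(String home, String fileName) {
        return Path.of(home, fileName);
    }

    // Returns true if the Ollama server answers GET /api/tags with HTTP 200.
    static boolean ollamaReachable(String baseUrl) {
        try {
            HttpClient client = HttpClient.newBuilder()
                    .connectTimeout(Duration.ofSeconds(2))
                    .build();
            HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/tags"))
                    .timeout(Duration.ofSeconds(5))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false; // connection refused, timeout, and similar failures
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Path model = whisperModelPath(System.getProperty("user.home"), "ggml-large-v3.bin");
        System.out.println("Whisper model present: " + Files.isRegularFile(model) + " (" + model + ")");
        System.out.println("Ollama reachable: " + ollamaReachable("http://localhost:11434"));
    }
}
```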

Configuration

Edit src/main/resources/application.yml as needed:

spring:
  application:
    name: Local Speech To Speech Translator
  servlet:
    multipart:
      max-file-size: 10MB
      max-request-size: 10MB

langchain4j:
  ollama:
    chat-model:
      base-url: http://localhost:11434
      model-name: mrjacktung/mradermacher-llamax3-8b-alpaca-gguf
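
To confirm that the configured base-url and model-name work before starting the application, you can send a one-off request to Ollama directly. The sketch below is not how the application talks to Ollama (it goes through LangChain4J). It is a plain-JDK call to Ollama's /api/generate endpoint with streaming disabled. The prompt text and the class and method names are illustrative assumptions, and the response is printed as raw JSON.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class OllamaSmokeTest {
    // Minimal JSON string escaping so the prompt can contain quotes and newlines.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
                }
            }
        }
        return sb.append('"').toString();
    }

    // Builds the request body for POST /api/generate with streaming turned off.
    static String requestBody(String model, String prompt) {
        return "{\"model\":" + jsonString(model)
                + ",\"prompt\":" + jsonString(prompt)
                + ",\"stream\":false}";
    }

    public static void main(String[] args) throws Exception {
        String baseUrl = "http://localhost:11434"; // langchain4j.ollama.chat-model.base-url
        String model = "mrjacktung/mradermacher-llamax3-8b-alpaca-gguf"; // ...chat-model.model-name
        // Illustrative prompt only; the application's real prompt may differ.
        String prompt = "Translate the following text from English to Hindi: Good morning";

        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .timeout(Duration.ofSeconds(120)) // the first call may need to load the model
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, prompt)))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body()); // raw JSON from Ollama
    }
}
```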

External Dependencies

All dependencies are managed via Maven (pom.xml). Key dependencies:

  • Spring Boot (web, test)
  • LangChain4J (langchain4j-ollama-spring-boot-starter, langchain4j-spring-boot-starter)
  • Whisper JNI (io.github.givimad:whisper-jni) — Speech-to-text
  • TTS Edge Java (io.github.whitemagic2014:tts-edge-java) — Text-to-speech
  • JavaFX Media (org.openjfx:javafx-media) — Audio playback
  • Lombok (org.projectlombok:lombok) — Code annotations

See pom.xml for full details and versions.

Usage

  • Call the REST API endpoints exposed by the running application (see TranslationController.java for the available routes and parameters).
  • For an example of TTS and audio playback, see TranslationServiceTest.java.
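
As a rough illustration of calling the service from Java, the sketch below builds a multipart upload for an audio file and checks it against the 10MB limit set in application.yml. The endpoint path /translate and the form field name file are placeholders, not the project's real route or parameter. Check TranslationController.java and adjust both. Port 8080 is the Spring Boot default and assumes server.port has not been changed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class UploadSketch {
    // Mirrors spring.servlet.multipart.max-file-size: 10MB (Spring counts 1MB as 1024 * 1024 bytes).
    static final long MAX_UPLOAD_BYTES = 10L * 1024 * 1024;

    static boolean withinUploadLimit(long sizeBytes) {
        return sizeBytes <= MAX_UPLOAD_BYTES;
    }

    // Encodes a single file part as a multipart/form-data body.
    static byte[] multipartBody(String boundary, String fieldName, String fileName, byte[] content) {
        String header = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes(header.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(content);
        out.writeBytes(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    // Builds (but does not send) the upload request; rejects files over the limit.
    static HttpRequest buildRequest(URI endpoint, Path audio) throws IOException {
        long size = Files.size(audio);
        if (!withinUploadLimit(size)) {
            throw new IllegalArgumentException("File is " + size + " bytes, over the 10MB limit");
        }
        String boundary = "s2st-" + System.nanoTime();
        // "file" is a placeholder form field name.
        byte[] body = multipartBody(boundary, "file",
                audio.getFileName().toString(), Files.readAllBytes(audio));
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL: replace the path with the real route from TranslationController.java.
        URI endpoint = URI.create("http://localhost:8080/translate");
        HttpRequest request = buildRequest(endpoint, Path.of(args[0]));
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because max-request-size is also 10MB and the multipart framing adds a few hundred bytes, a file very close to the limit may still be rejected by the server.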
