FUTO Voice Input

FUTO Voice Input is an application that lets you do speech-to-text on Android, integrating with third-party keyboards or apps that use the generic speech-to-text APIs.

To download the application, visit the FUTO Voice Input page. You can also find contact details there for reporting issues or making suggestions.

If you have any feedback, issues are welcomed on the public issue tracker. Private inquiries are welcomed at the support email listed on the website, or via the Send Feedback button in-app.

Status

Development has largely shifted focus to the FUTO Keyboard app, which has voice input built-in. However, FUTO Voice Input will remain available if you prefer to use it with another keyboard.

API support

The following APIs are supported:

android.speech.action.RECOGNIZE_SPEECH implicit intent, for apps and some keyboards - this opens the floating window in the center of the screen
IME with voice subtype mode, for keyboards - this opens on the bottom half of the screen in place of the keyboard

Currently this does not support the SpeechRecognizer API, which few apps seem to use. Support for this is planned in the future.
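The implicit intent listed above can be exercised from a development machine without writing an app. The sketch below only prints the adb command that would fire android.speech.action.RECOGNIZE_SPEECH; the helper name recognize_speech_cmd is made up for this example, and running the printed command assumes adb is installed and a device with USB debugging enabled is connected.

```shell
#!/bin/sh
# Action string copied from the API list above.
ACTION="android.speech.action.RECOGNIZE_SPEECH"

# Hypothetical helper: print (rather than run) the adb command that fires
# the implicit intent, so it can be inspected before touching a device.
recognize_speech_cmd() {
    printf 'adb shell am start -a %s\n' "$ACTION"
}

recognize_speech_cmd
```

To run the printed command against a connected device, pass it to a shell, for example `sh -c "$(recognize_speech_cmd)"`. If more than one installed app handles the action, Android may show a chooser before the floating window appears.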

Speech-to-Text providers

Local on-device Whisper (default)
- Uses whisper.cpp via JNI for fast on-device inference
- VAD (voice activity detection) handles auto-start/stop; configurable in settings
Soniox Cloud
- Async REST mode: records locally and uploads for transcription
- Realtime WebSocket mode: streams audio and shows partial tokens live; final text replaces partials
- Configure in-app under Settings → Speech: choose provider "Soniox Cloud", set mode (Async/Realtime), and supply your Soniox API key

Realtime behavior (Soniox): partial results are shown live and inserted into the current input as composing text; final results replace the partials. A fallback via an accessibility service ensures that results are also inserted into the focused field for apps that use the intent.
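The partial/final replacement described above can be modelled with a toy script. This is only an illustration of the behavior, not the app's implementation: the event format ("partial <text>" or "final <text>", one per line) and the function name render_transcript are assumptions invented for this sketch.

```shell
#!/bin/sh
# Toy model: read events on stdin and print what the text field would
# show after each one. A partial overwrites the previous partial; a final
# is appended to the committed text and clears the partial.
render_transcript() {
    committed=""
    composing=""
    while read -r kind text; do
        case "$kind" in
            partial) composing=$text ;;
            final)   committed="$committed${committed:+ }$text"
                     composing="" ;;
        esac
        # Committed text first, then the still-changing partial (if any).
        printf '%s\n' "$committed${composing:+${committed:+ }$composing}"
    done
}

printf '%s\n' 'partial hel' 'partial hello wor' 'final hello world' \
    | render_transcript
```

Each partial line replaces the previous one in the output, and the final line takes the place of the last partial instead of being added after it.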

Keyboard support

Keyboard support is touched on in the Help section of the app. In short, the following keyboards are supported:

FUTO Keyboard has FUTO Voice Input built-in; if you want to force it to use the external app you have to disable built-in voice input in its settings
HeliBoard
FlorisBoard supports it on newer releases
AnySoftKeyboard
Unexpected Keyboard (v1.23+)
AOSP Keyboard available in LineageOS and others

If you're okay with using proprietary keyboards, the following are supported:

Grammarly Keyboard, which uses the IME
Microsoft SwiftKey, which uses the implicit intent

Incompatible keyboards:

Gboard - hardcoded to use Google's voice input, does not support third-party options
Samsung Keyboard - hardcoded to allow only Samsung Voice Input or Google Voice Input
Simple Keyboard by Raimondas Rimkus - no voice button
Simple Keyboard by Simple Mobile Tools - no voice button
TypeWise - no voice button but suggestion filed in 2019

Language support

FUTO Voice Input is currently based on the OpenAI Whisper model, and could theoretically support all of the languages that OpenAI Whisper supports. In practice, however, the smaller models tend not to perform well on languages that had fewer training hours. To avoid presenting something worse than nothing, only languages with more than 1,000 training hours are included as options in the UI:

English
Chinese (currently has some weird behavior between traditional/simplified)
German
Spanish
Russian
French
Portuguese
Korean
Japanese
Turkish
Polish
Italian
Swedish
Dutch
Catalan
Finnish
Indonesian

Language support and accuracy may expand in the future with better optimization and fine-tuned models. Feedback is welcomed about language-related issues or general language accuracy.

Development

You can develop this app by opening it in Android Studio. Otherwise, you can use Gradle to build the app like so:

./gradlew assembleStandaloneRelease

There are five product flavors:

dev - for development, includes Play Store billing and all payment methods, auto-update, etc
devSameId - like dev but uses the same applicationId as release (use with care)
playStore - Play Store build, does not include auto-update and only includes Play Store billing
standalone - no Play Store billing, includes auto-update and PayPal (via FutoPay)
fDroid - no Play Store billing and no auto-update, PayPal only
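Gradle derives variant task names by joining a verb, the capitalized flavor, and the capitalized build type, which is how standalone plus release becomes assembleStandaloneRelease. The sketch below prints the release assemble task for each flavor listed above; the helper names capitalize and variant_task are made up for this example, and it assumes every flavor has a release build type, which this README does not confirm.

```shell
#!/bin/sh
# Upper-case the first character of a word, leaving the rest untouched.
capitalize() {
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s' "$first" "$rest"
}

# Hypothetical helper: variant_task <verb> <flavor> <buildType>
variant_task() {
    printf '%s%s%s\n' "$1" "$(capitalize "$2")" "$(capitalize "$3")"
}

# Print (not run) the release assemble task for every flavor.
for flavor in dev devSameId playStore standalone fDroid; do
    variant_task assemble "$flavor" release
done
```

A printed name can then be passed to the wrapper, for example `./gradlew "$(variant_task assemble fDroid release)"`.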

Helpful commands:

# Fast dev iteration
./gradlew assembleDevDebug
./gradlew installDevDebug

# Flavor-specific unit tests
./gradlew testDevDebugUnitTest

# Lint/static analysis
./gradlew lint

Submodules: initialize the FutoPay Android app before building flavors that include PayPal billing.

git submodule update --init --recursive
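To check whether that step is still needed, note that `git submodule status` prefixes each uninitialized submodule with a minus sign. The sketch below filters that output; the function name uninitialized_submodules is made up for this example, and the sample paths in the demo line are placeholders rather than this repository's actual submodule paths.

```shell
#!/bin/sh
# Hypothetical helper: read `git submodule status` output on stdin and
# print the path of every submodule that is not initialized yet
# (those lines start with "-").
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}

# Demo with placeholder input; prints only the uninitialized path.
printf '%s\n' '-abc123 dep/example' ' def456 libs/other (v1.0)' \
    | uninitialized_submodules
```

In a real checkout, run `git submodule status --recursive | uninitialized_submodules`; empty output suggests every submodule has been initialized.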

Settings

Key options available in Settings → Speech:

STT provider: "Whisper (on-device)" or "Soniox Cloud"
Soniox mode: "Async" or "Realtime"
Soniox API key: required for Soniox usage
Languages: toggle supported languages and optional personal dictionary
VAD strategy: choose Classic VAD, Smart Turn v3 (default) or Hybrid
VAD & UX: sound effects, animations, verbose progress, and auto-stop thresholds

Smart Turn v3

Smart Turn v3 is a BSD-2-Clause-licensed endpoint detection model provided by pipecat-ai. The ONNX weights (smart-turn-v3.0.onnx) are located under app/src/main/assets/. The integration enables more accurate end-of-speech detection for both the local Whisper recognizer and the Soniox providers. For more information, see the Smart Turn repository and the Daily.co blog post.

Some prebuilt binaries are included in the libs directory to make the build faster; there are also instructions to build them yourself.

License

This code is currently licensed under the FUTO Source First License 1.0.

Credits

The microphone icon was taken from Feather Icons, an open-source icon pack authored by Cole Bemis.

Thanks to the following projects for making this possible:

OpenAI - OpenAI Whisper
Georgi Gerganov - whisper.cpp
TensorFlow Authors - TensorFlow Lite (tflite was used in the past but is no longer used)
Max-Planck-Society - PocketFFT
The WebRTC project authors - WebRTC VAD
Georgiy Konovalov - android-vad
Other app dependencies, listed in app/build.gradle
