Fix README and CLI

mitchsayre · mitchsayre · commit c81dc6f7b097 · 2026-05-10T17:34:30.000-04:00
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -1,7 +1,17 @@
 # Contributing
 
-`wfloat` is a pure Python package. Native code comes from the
-`wfloat-sherpa-onnx` dependency, which provides `import sherpa_onnx`.
+`wfloat` is the Python client for `wfloat-tts`, Wfloat's on-device text-to-speech
+model.
+
+Product context:
+
+- Homepage: https://wfloat.com
+- Docs: https://docs.wfloat.com
+- Model card and samples: https://huggingface.co/Wfloat/wfloat-tts
+- Web package: https://github.com/wfloat/wfloat-web
+- React Native package: https://github.com/wfloat/react-native-wfloat
+
+This repo should stay focused on the Python experience.
 
 ## Prerequisites
 
@@ -12,23 +22,23 @@
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
-python -m pip install --upgrade pip
-python -m pip install setuptools wheel build twine
+python3 -m pip install --upgrade pip
+python3 -m pip install setuptools wheel build twine
 ```
 
 Install `wfloat`:
 
 ```bash
-pip install -e .
+python3 -m pip install -e .
 ```
 
-That will also install the matching `wfloat-sherpa-onnx` dependency.
+That also installs the matching `wfloat-sherpa-onnx` dependency.
 
 ## Build release artifacts
 
 ```bash
 rm -rf build dist
-python -m build
+python3 -m build
 ```
 
 That produces:
@@ -41,20 +51,26 @@ That produces:
 Unit tests do not require `sherpa_onnx`:
 
 ```bash
-python -m unittest discover -s tests -v
+PYTHONPATH=python python3 -m unittest discover -s tests -v
 ```
 
 You can also run a smoke check:
 
 ```bash
-python -c "import sherpa_onnx, wfloat; print(wfloat.__version__)"
+python3 -c "import sherpa_onnx, wfloat; print(wfloat.__version__)"
 ```
 
 ## CI
 
-CI now:
+CI:
 
 - builds pure Python artifacts once
 - installs those artifacts on each target platform
 - relies on normal dependency resolution for `wfloat-sherpa-onnx`
 - runs the unit test suite and an integration smoke test
+
+## Notes for changes
+
+- Keep docs short and user-facing.
+- Describe this package as the Python way to run `wfloat-tts` locally.
+- If voices, emotions, or examples change, check the model card and docs first.
diff --git a/README.md b/README.md
@@ -1,22 +1,28 @@
 # wfloat
 
-`wfloat` is a high-level Python wrapper around `sherpa-onnx` for loading
-Wfloat-compatible speech models and generating audio files.
+`wfloat` is the Python package for `wfloat-tts`, Wfloat's on-device English
+text-to-speech model.
 
-## Install
+It runs speech locally in Python instead of calling a hosted inference API.
+The model supports 20 voices with emotion and intensity control.
 
-Install `wfloat` normally:
+If you're building for the browser, use
+[`@wfloat/wfloat-web`](https://github.com/wfloat/wfloat-web). If you're
+building for React Native, use
+[`@wfloat/react-native-wfloat`](https://github.com/wfloat/react-native-wfloat).
 
-```bash
-pip install wfloat
-```
+Try it in the browser: https://wfloat.com/demo
 
-That will also install the matching `wfloat-sherpa-onnx` dependency from PyPI.
+<audio controls src="./sample.wav">
+  <a href="./sample.wav">Sample dialogue</a>
+</audio>
 
-When installing from this repo locally:
+[Sample dialogue](sample.wav)
+
+## Install
 
 ```bash
-pip install ./packages/wfloat-python
+pip install wfloat
 ```
 
 ## Usage
@@ -27,20 +33,115 @@ import wfloat
 model = wfloat.load("wfloat/wfloat-tts")
 
 result = model.generate(
-    text="The signal is clean. Start the recording.",
-    voice_id="narrator_woman",
-    emotion="neutral",
-    intensity=0.5,
-    speed=1.0,
+    text="No, no, that's not possible. The formula should have crystallized, but it adapted instead. Do you realize what that means for the rest of my work?",
+    voice_id="mad_scientist_woman",
+    emotion="surprise",
+    intensity=0.7,
 )
 
 result.audio.save("out.wav")
 ```
 
-## Notes
+For multi-speaker dialogue:
+
+```python
+import wfloat
+
+model = wfloat.load("wfloat/wfloat-tts")
+
+result = model.generate_dialogue(
+    segments=[
+        {
+            "voice_id": "wise_elder_man",
+            "text": "Rain taps against the tavern shutters as you step inside.",
+            "emotion": "neutral",
+            "intensity": 0.5,
+        },
+        {
+            "voice_id": "strong_hero_man",
+            "text": "You're late. Two bandits stole the king's map over three hours ago.",
+            "emotion": "fear",
+            "intensity": 0.6,
+        },
+        {
+            "voice_id": "strong_hero_man",
+            "text": "They fled north, up into the woods.",
+            "emotion": "neutral",
+            "intensity": 0.5,
+        },
+    ],
+    silence_between_segments_sec=0.35,
+)
+
+result.audio.save("dialogue.wav")
+```
+
+You can also generate a WAV from the command line:
+
+```bash
+wfloat generate \
+  --text "Hello world!" \
+  --out out.wav \
+  --voice-id mad_scientist_woman \
+  --emotion surprise \
+  --intensity 0.7 \
+  --silence-padding-sec 0
+```
+
+For the full CLI help:
+
+```bash
+wfloat generate --help
+```
+
+The first load downloads the model assets. After that, the package uses the
+cached local copy.
+
+## Speaker IDs
+
+Use `voice_id` string names or numeric `sid` values:
+
+| Speaker | SID |
+| --- | ---: |
+| `skilled_hero_man` | 0 |
+| `skilled_hero_woman` | 1 |
+| `fun_hero_man` | 2 |
+| `fun_hero_woman` | 3 |
+| `strong_hero_man` | 4 |
+| `strong_hero_woman` | 5 |
+| `mad_scientist_man` | 6 |
+| `mad_scientist_woman` | 7 |
+| `clever_villain_man` | 8 |
+| `clever_villain_woman` | 9 |
+| `narrator_man` | 10 |
+| `narrator_woman` | 11 |
+| `wise_elder_man` | 12 |
+| `wise_elder_woman` | 13 |
+| `outgoing_anime_man` | 14 |
+| `outgoing_anime_woman` | 15 |
+| `scary_villain_man` | 16 |
+| `scary_villain_woman` | 17 |
+| `news_reporter_man` | 18 |
+| `news_reporter_woman` | 19 |
+
+## Emotions
+
+Supported emotion labels:
+
+- `neutral`
+- `joy`
+- `sadness`
+- `anger`
+- `fear`
+- `surprise`
+- `dismissive`
+- `confusion`
+
+`intensity` must be between `0.0` and `1.0`.
+
+## More
 
-- `wfloat` does not build or bundle native libraries.
-- Low-level bindings come from the installed `wfloat-sherpa-onnx` dependency,
-  which provides `import sherpa_onnx`.
-- The public API is intentionally high-level; low-level native config objects
-  are re-exported only for advanced use.
+- Docs: https://docs.wfloat.com
+- Model card, voices, emotions, and samples: https://huggingface.co/Wfloat/wfloat-tts
+- Web package: https://github.com/wfloat/wfloat-web
+- React Native package: https://github.com/wfloat/react-native-wfloat
diff --git a/python/wfloat/_cli.py b/python/wfloat/_cli.py
@@ -7,26 +7,26 @@ def build_parser() -> argparse.ArgumentParser:
     parser = argparse.ArgumentParser(prog="wfloat")
     subparsers = parser.add_subparsers(dest="command")
 
-    synth = subparsers.add_parser("synth", help="Generate speech and write a WAV file.")
-    synth.add_argument("--model", default="wfloat/wfloat-tts", help="Model name to load.")
-    synth.add_argument("--text", required=True, help="Text to synthesize.")
-    synth.add_argument("--out", required=True, help="Output WAV path.")
-    synth.add_argument("--voice-id", default=None, help="Voice ID name or numeric SID.")
-    synth.add_argument("--emotion", default=None, help="Emotion name.")
-    synth.add_argument("--intensity", type=float, default=None, help="Emotion intensity.")
-    synth.add_argument("--speed", type=float, default=None, help="Speech speed.")
-    synth.add_argument(
+    generate = subparsers.add_parser("generate", help="Generate speech and write a WAV file.")
+    generate.add_argument("--model", default="wfloat/wfloat-tts", help="Model name to load.")
+    generate.add_argument("--text", required=True, help="Text to synthesize.")
+    generate.add_argument("--out", required=True, help="Output WAV path.")
+    generate.add_argument("--voice-id", default=None, help="Voice ID name or numeric SID.")
+    generate.add_argument("--emotion", default=None, help="Emotion name.")
+    generate.add_argument("--intensity", type=float, default=None, help="Emotion intensity.")
+    generate.add_argument("--speed", type=float, default=None, help="Speech speed.")
+    generate.add_argument(
         "--silence-padding-sec",
         type=float,
         default=None,
         help="Silence padding between generated sentence chunks.",
     )
-    synth.add_argument(
+    generate.add_argument(
         "--cache-dir",
         default=None,
         help="Optional override for the cache directory.",
     )
-    synth.add_argument(
+    generate.add_argument(
         "--force-download",
         action="store_true",
         help="Redownload model assets even if cached copies are present.",
@@ -49,7 +49,7 @@ def main(argv=None) -> int:
     parser = build_parser()
     args = parser.parse_args(argv)
 
-    if args.command != "synth":
+    if args.command != "generate":
         parser.print_help()
         return 1
 
diff --git a/python/wfloat/_version.py b/python/wfloat/_version.py
@@ -1 +1 @@
-__version__ = "1.0.0"
+__version__ = "1.0.1"
diff --git a/sample.wav b/sample.wav
diff --git a/setup.py b/setup.py
@@ -36,7 +36,7 @@ def read_package_version() -> str:
     package_dir={"": "python"},
     packages=setuptools.find_packages(where="python"),
     install_requires=[
-        "wfloat-sherpa-onnx==1.12.23",
+        "wfloat-sherpa-onnx==1.12.24",
     ],
     include_package_data=True,
     entry_points={
diff --git a/tests/test_basic.py b/tests/test_basic.py
@@ -109,7 +109,7 @@ def test_bindings_import_without_generated_audio(self):
             git_date="today",
             git_sha1="abc123",
             prepare_wfloat_text=lambda text, *args, **kwargs: text,
-            version="1.12.23",
+            version="1.12.24",
             write_wave=lambda *args, **kwargs: None,
         )
 
