Commit c81dc6f

Fix README and CLI
1 parent 1d6436a commit c81dc6f

7 files changed

Lines changed: 163 additions & 46 deletions

CONTRIBUTING.md

Lines changed: 26 additions & 10 deletions
@@ -1,7 +1,17 @@
 # Contributing
 
-`wfloat` is a pure Python package. Native code comes from the
-`wfloat-sherpa-onnx` dependency, which provides `import sherpa_onnx`.
+`wfloat` is the Python client for `wfloat-tts`, Wfloat's on-device text-to-speech
+model.
+
+Product context:
+
+- Homepage: https://wfloat.com
+- Docs: https://docs.wfloat.com
+- Model card and samples: https://huggingface.co/Wfloat/wfloat-tts
+- Web package: https://github.com/wfloat/wfloat-web
+- React Native package: https://github.com/wfloat/react-native-wfloat
+
+This repo should stay focused on the Python experience.
 
 ## Prerequisites
 
@@ -12,23 +22,23 @@
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
-python -m pip install --upgrade pip
-python -m pip install setuptools wheel build twine
+python3 -m pip install --upgrade pip
+python3 -m pip install setuptools wheel build twine
 ```
 
 Install `wfloat`:
 
 ```bash
-pip install -e .
+python3 -m pip install -e .
 ```
 
-That will also install the matching `wfloat-sherpa-onnx` dependency.
+That also installs the matching `wfloat-sherpa-onnx` dependency.
 
 ## Build release artifacts
 
 ```bash
 rm -rf build dist
-python -m build
+python3 -m build
 ```
 
 That produces:
@@ -41,20 +51,26 @@ That produces:
 Unit tests do not require `sherpa_onnx`:
 
 ```bash
-python -m unittest discover -s tests -v
+PYTHONPATH=python python3 -m unittest discover -s tests -v
 ```
 
 You can also run a smoke check:
 
 ```bash
-python -c "import sherpa_onnx, wfloat; print(wfloat.__version__)"
+python3 -c "import sherpa_onnx, wfloat; print(wfloat.__version__)"
 ```
 
 ## CI
 
-CI now:
+CI:
 
 - builds pure Python artifacts once
 - installs those artifacts on each target platform
 - relies on normal dependency resolution for `wfloat-sherpa-onnx`
 - runs the unit test suite and an integration smoke test
+
+## Notes for changes
+
+- Keep docs short and user-facing.
+- Describe this package as the Python way to run `wfloat-tts` locally.
+- If voices, emotions, or examples change, check the model card and docs first.

README.md

Lines changed: 122 additions & 21 deletions
@@ -1,22 +1,28 @@
 # wfloat
 
-`wfloat` is a high-level Python wrapper around `sherpa-onnx` for loading
-Wfloat-compatible speech models and generating audio files.
+`wfloat` is the Python package for `wfloat-tts`, Wfloat's on-device English
+text-to-speech model.
 
-## Install
+It runs speech locally in Python instead of calling a hosted inference API.
+The model supports 20 voices with emotion and intensity control.
 
-Install `wfloat` normally:
+If you're building for the browser, use
+[`@wfloat/wfloat-web`](https://github.com/wfloat/wfloat-web). If you're
+building for React Native, use
+[`@wfloat/react-native-wfloat`](https://github.com/wfloat/react-native-wfloat).
 
-```bash
-pip install wfloat
-```
+Try it in the browser: https://wfloat.com/demo
 
-That will also install the matching `wfloat-sherpa-onnx` dependency from PyPI.
+<audio controls src="./sample.wav">
+<a href="./sample.wav">Sample dialogue</a>
+</audio>
 
-When installing from this repo locally:
+[Sample dialogue](sample.wav)
+
+## Install
 
 ```bash
-pip install ./packages/wfloat-python
+pip install wfloat
 ```
 
 ## Usage
@@ -27,20 +33,115 @@ import wfloat
 model = wfloat.load("wfloat/wfloat-tts")
 
 result = model.generate(
-    text="The signal is clean. Start the recording.",
-    voice_id="narrator_woman",
-    emotion="neutral",
-    intensity=0.5,
-    speed=1.0,
+    text="No, no, that's not possible. The formula should have crystallized, but it adapted instead. Do you realize what that means for the rest of my work?",
+    voice_id="mad_scientist_woman",
+    emotion="surprise",
+    intensity=0.7,
 )
 
 result.audio.save("out.wav")
 ```
 
-## Notes
+For multi-speaker dialogue:
+
+```python
+import wfloat
+
+model = wfloat.load("wfloat/wfloat-tts")
+
+result = model.generate_dialogue(
+    segments=[
+        {
+            "voice_id": "wise_elder_man",
+            "text": "Rain taps against the tavern shutters as you step inside.",
+            "emotion": "neutral",
+            "intensity": 0.5,
+        },
+        {
+            "voice_id": "strong_hero_man",
+            "text": "You're late. Two bandits stole the king's map over three hours ago.",
+            "emotion": "fear",
+            "intensity": 0.6,
+        },
+        {
+            "voice_id": "strong_hero_man",
+            "text": "They fled north, up into the woods.",
+            "emotion": "neutral",
+            "intensity": 0.5,
+        },
+    ],
+    silence_between_segments_sec=0.35,
+)
+
+result.audio.save("dialogue.wav")
+```
+
+You can also generate a WAV from the command line:
+
+```bash
+wfloat generate \
+  --text "Hello world!" \
+  --out out.wav \
+  --voice-id mad_scientist_woman \
+  --emotion surprise \
+  --intensity 0.7 \
+  --silence-padding-sec 0
+```
+
+For the full CLI help:
+
+```bash
+wfloat generate --help
+```
+
+The first load downloads the model assets. After that, the package uses the
+cached local copy.
+
+## Speaker IDs
+
+Use `voice_id` string names or numeric `sid` values:
+
+| Speaker | SID |
+| --- | ---: |
+| `skilled_hero_man` | 0 |
+| `skilled_hero_woman` | 1 |
+| `fun_hero_man` | 2 |
+| `fun_hero_woman` | 3 |
+| `strong_hero_man` | 4 |
+| `strong_hero_woman` | 5 |
+| `mad_scientist_man` | 6 |
+| `mad_scientist_woman` | 7 |
+| `clever_villain_man` | 8 |
+| `clever_villain_woman` | 9 |
+| `narrator_man` | 10 |
+| `narrator_woman` | 11 |
+| `wise_elder_man` | 12 |
+| `wise_elder_woman` | 13 |
+| `outgoing_anime_man` | 14 |
+| `outgoing_anime_woman` | 15 |
+| `scary_villain_man` | 16 |
+| `scary_villain_woman` | 17 |
+| `news_reporter_man` | 18 |
+| `news_reporter_woman` | 19 |
+
+## Emotions
+
+Supported emotion labels:
+
+- `neutral`
+- `joy`
+- `sadness`
+- `anger`
+- `fear`
+- `surprise`
+- `dismissive`
+- `confusion`
+
+`intensity` must be between `0.0` and `1.0`.
+
+## More
 
-- `wfloat` does not build or bundle native libraries.
-- Low-level bindings come from the installed `wfloat-sherpa-onnx` dependency,
-  which provides `import sherpa_onnx`.
-- The public API is intentionally high-level; low-level native config objects
-  are re-exported only for advanced use.
+- Docs: https://docs.wfloat.com
+- Model card, voices, emotions, and samples: https://huggingface.co/Wfloat/wfloat-tts
+- Web package: https://github.com/wfloat/wfloat-web
+- React Native package: https://github.com/wfloat/react-native-wfloat
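
The new README documents three input constraints: a voice is either a name or a numeric SID from the Speaker IDs table, the emotion is one of eight labels, and `intensity` lies between `0.0` and `1.0`. The sketch below shows one way a caller might check those constraints before invoking the model. The names `VOICE_IDS`, `EMOTIONS`, `resolve_sid`, and `check_request` are hypothetical and are not part of the `wfloat` package; whether the library itself validates these values is not shown in this commit.

```python
# Hypothetical helper, not part of the wfloat package. It only mirrors the
# Speaker IDs and Emotions sections of the README in this commit.
VOICE_IDS = (
    "skilled_hero_man", "skilled_hero_woman",
    "fun_hero_man", "fun_hero_woman",
    "strong_hero_man", "strong_hero_woman",
    "mad_scientist_man", "mad_scientist_woman",
    "clever_villain_man", "clever_villain_woman",
    "narrator_man", "narrator_woman",
    "wise_elder_man", "wise_elder_woman",
    "outgoing_anime_man", "outgoing_anime_woman",
    "scary_villain_man", "scary_villain_woman",
    "news_reporter_man", "news_reporter_woman",
)
EMOTIONS = frozenset({
    "neutral", "joy", "sadness", "anger",
    "fear", "surprise", "dismissive", "confusion",
})


def resolve_sid(voice):
    """Map a voice name or a numeric SID to a SID in the range 0..19."""
    if isinstance(voice, int):
        if 0 <= voice < len(VOICE_IDS):
            return voice
        raise ValueError(f"SID out of range: {voice}")
    if voice in VOICE_IDS:
        # The tuple order matches the SID column of the README table.
        return VOICE_IDS.index(voice)
    raise ValueError(f"unknown voice_id: {voice!r}")


def check_request(voice, emotion, intensity):
    """Return the SID if all three values fall within the documented limits."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    if not 0.0 <= intensity <= 1.0:
        raise ValueError(f"intensity must be in [0.0, 1.0], got {intensity}")
    return resolve_sid(voice)
```

Keeping the voice names in a tuple ordered by SID means the table has a single source of truth, which fits the CONTRIBUTING note about checking the model card before changing voices or emotions.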

python/wfloat/_cli.py

Lines changed: 12 additions & 12 deletions
@@ -7,26 +7,26 @@ def build_parser() -> argparse.ArgumentParser:
     parser = argparse.ArgumentParser(prog="wfloat")
     subparsers = parser.add_subparsers(dest="command")
 
-    synth = subparsers.add_parser("synth", help="Generate speech and write a WAV file.")
-    synth.add_argument("--model", default="wfloat/wfloat-tts", help="Model name to load.")
-    synth.add_argument("--text", required=True, help="Text to synthesize.")
-    synth.add_argument("--out", required=True, help="Output WAV path.")
-    synth.add_argument("--voice-id", default=None, help="Voice ID name or numeric SID.")
-    synth.add_argument("--emotion", default=None, help="Emotion name.")
-    synth.add_argument("--intensity", type=float, default=None, help="Emotion intensity.")
-    synth.add_argument("--speed", type=float, default=None, help="Speech speed.")
-    synth.add_argument(
+    generate = subparsers.add_parser("generate", help="Generate speech and write a WAV file.")
+    generate.add_argument("--model", default="wfloat/wfloat-tts", help="Model name to load.")
+    generate.add_argument("--text", required=True, help="Text to synthesize.")
+    generate.add_argument("--out", required=True, help="Output WAV path.")
+    generate.add_argument("--voice-id", default=None, help="Voice ID name or numeric SID.")
+    generate.add_argument("--emotion", default=None, help="Emotion name.")
+    generate.add_argument("--intensity", type=float, default=None, help="Emotion intensity.")
+    generate.add_argument("--speed", type=float, default=None, help="Speech speed.")
+    generate.add_argument(
         "--silence-padding-sec",
         type=float,
         default=None,
         help="Silence padding between generated sentence chunks.",
     )
-    synth.add_argument(
+    generate.add_argument(
         "--cache-dir",
         default=None,
         help="Optional override for the cache directory.",
     )
-    synth.add_argument(
+    generate.add_argument(
         "--force-download",
         action="store_true",
         help="Redownload model assets even if cached copies are present.",
@@ -49,7 +49,7 @@ def main(argv=None) -> int:
     parser = build_parser()
     args = parser.parse_args(argv)
 
-    if args.command != "synth":
+    if args.command != "generate":
         parser.print_help()
         return 1

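The rename from `synth` to `generate` changes what existing command lines do. The sketch below is a trimmed stand-in for `build_parser` that keeps only three of the options from the diff, so it illustrates the parsing behaviour rather than reproducing the real module. The `parses` helper is hypothetical and exists only for the demonstration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Trimmed copy of the parser after this commit; the real one defines
    # more options (--model, --emotion, --intensity, and so on).
    parser = argparse.ArgumentParser(prog="wfloat")
    subparsers = parser.add_subparsers(dest="command")
    generate = subparsers.add_parser("generate", help="Generate speech and write a WAV file.")
    generate.add_argument("--text", required=True, help="Text to synthesize.")
    generate.add_argument("--out", required=True, help="Output WAV path.")
    generate.add_argument("--voice-id", default=None, help="Voice ID name or numeric SID.")
    return parser


def parses(argv) -> bool:
    """Hypothetical helper: report whether argparse accepts the arguments."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        # argparse prints a usage error to stderr and exits on bad input.
        return False


new_style = parses(["generate", "--text", "Hello world!", "--out", "out.wav"])
old_style = parses(["synth", "--text", "Hello world!", "--out", "out.wav"])
no_command = build_parser().parse_args([]).command
```

With this parser, `new_style` is `True` and `old_style` is `False`: an old `wfloat synth ...` invocation is rejected by argparse as an invalid choice before `main` runs its own check. `no_command` is `None`, which is the case the `args.command != "generate"` branch in `main` handles by printing help and returning 1.
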
python/wfloat/_version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "1.0.0"
+__version__ = "1.0.1"

sample.wav

1.93 MB
Binary file not shown.

setup.py

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ def read_package_version() -> str:
     package_dir={"": "python"},
     packages=setuptools.find_packages(where="python"),
     install_requires=[
-        "wfloat-sherpa-onnx==1.12.23",
+        "wfloat-sherpa-onnx==1.12.24",
     ],
     include_package_data=True,
     entry_points={

tests/test_basic.py

Lines changed: 1 addition & 1 deletion
@@ -109,7 +109,7 @@ def test_bindings_import_without_generated_audio(self):
             git_date="today",
             git_sha1="abc123",
             prepare_wfloat_text=lambda text, *args, **kwargs: text,
-            version="1.12.23",
+            version="1.12.24",
             write_wave=lambda *args, **kwargs: None,
         )
