Let's be honest, it is simply too boring to manually read long articles and blog posts. So, I built this extension.
This extension converts any web article, blog post, or long-form text into natural English speech instantly using Microsoft's Edge TTS.
Unlike traditional extensions, this one features a custom draggable floating widget that is directly injected into the page you are reading. This means your audio will not stop and your interface will not suddenly vanish when you click somewhere else on the page.
- Draggable Floating UI: Spawns a sleek, minimizable, dark-mode audio player right on the webpage itself.
- One-Click Reading: Automatically extracts and plays readable content while stripping out annoying ads and footers.
- Natural Voice Generation: Powered by edge-tts to give you high-quality, neural text-to-speech without the robot voice.
- Background Playback: Because the widget lives in your active tab, you can read along and click around the page without the audio abruptly cutting out.
- Keyboard Shortcuts: Control your reading experience completely hands-free:
  - Ctrl+Shift+P: Play / Pause
  - Ctrl+Shift+Up: Speed Up (+0.25x)
  - Ctrl+Shift+Down: Speed Down (-0.25x)
- Persisted Preferences: The extension automatically remembers your preferred reading speed so you don't have to keep setting it.
- Manual Area Memory: Draw a box to read a specific paragraph. The widget remembers your selection for multiple playbacks until you explicitly clear it.
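The speed shortcuts above boil down to a clamped step of 0.25x. Here is a minimal sketch of that logic, written in Python only to match the backend (the extension itself handles this in JavaScript). The 0.25 step comes from the shortcut list; the `MIN_SPEED` and `MAX_SPEED` bounds and the `step_speed` name are assumptions for illustration, not values taken from the extension.

```python
SPEED_STEP = 0.25  # step size from the shortcut list
MIN_SPEED = 0.5    # assumed lower bound (hypothetical)
MAX_SPEED = 3.0    # assumed upper bound (hypothetical)


def step_speed(current: float, direction: int) -> float:
    """Return the speed after one shortcut press (+1 = faster, -1 = slower)."""
    candidate = current + direction * SPEED_STEP
    # Clamp so repeated presses cannot push the speed out of range.
    return round(min(MAX_SPEED, max(MIN_SPEED, candidate)), 2)
```

Clamping instead of wrapping around means holding a shortcut down just parks the speed at the limit, which is probably the least surprising behavior.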
The system is split into three parts to get around typical Chrome extension limitations:
- Invisible Proxy (popup.js): An invisible trigger that acts purely to bypass Chrome caching bugs without interrupting your browsing flow.
- Content Script Engine (content.js): Manages the widget's DOM lifecycle. It houses all the logic for creating the HTML widget, handling the dragging mechanics, and playing the audio natively in your active tab.
- FastAPI Backend: Acts as an asynchronous translation layer. It receives text, passes it through edge-tts, and streams the audio back as a chunked response.
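To give a rough idea of the backend's streaming step, here is a minimal sketch. It is not the repository's actual code. It assumes `edge_tts.Communicate(...).stream()` yields dicts with a `"type"` key and, for audio chunks, a `"data"` key holding MP3 bytes. The `/tts` route, the `{"text": ...}` JSON body, the default voice, and the names `audio_only`, `synthesize`, and `create_app` are all hypothetical.

```python
from typing import AsyncIterator


async def audio_only(chunks: AsyncIterator[dict]) -> AsyncIterator[bytes]:
    """Yield only the raw audio bytes from an edge-tts style chunk stream."""
    async for chunk in chunks:
        # The stream mixes audio with metadata events (e.g. word boundaries);
        # only "audio" chunks carry playable bytes.
        if chunk.get("type") == "audio":
            yield chunk["data"]


def synthesize(text: str, voice: str = "en-US-AriaNeural") -> AsyncIterator[bytes]:
    """Stream speech for `text`. Needs `pip install edge-tts`."""
    import edge_tts  # imported lazily so audio_only stays dependency-free

    return audio_only(edge_tts.Communicate(text, voice).stream())


def create_app():
    """Build a FastAPI app exposing the stream. Needs `pip install fastapi`."""
    from fastapi import Body, FastAPI
    from fastapi.responses import StreamingResponse

    app = FastAPI()

    @app.post("/tts")  # hypothetical route name
    async def tts(text: str = Body(..., embed=True)):
        # Bytes are forwarded as they arrive, so playback can start before
        # the whole article has been synthesized.
        return StreamingResponse(synthesize(text), media_type="audio/mpeg")

    return app
```

With a layout like this, `app = create_app()` at module level in `app/main.py` would be what `uvicorn app.main:app` picks up.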
You will need Python installed to run the backend.
- Open your terminal and navigate to the backend folder:
    cd backend
- Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows, use venv\Scripts\activate
- Install the required dependencies:
    pip install -r requirements.txt
- Start the FastAPI development server:
    uvicorn app.main:app --reload
The backend should now be running at http://127.0.0.1:8000.
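If you want to confirm the server is reachable before loading the extension, a quick check like the one below can help. It is a sketch that assumes the backend leaves FastAPI's default `/openapi.json` schema route enabled; `backend_is_up` is a hypothetical helper name, not something shipped in this repository.

```python
import urllib.error
import urllib.request


def backend_is_up(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if the backend answers 200 on FastAPI's default schema route."""
    # Skip any system proxy settings: the backend is expected on localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/openapi.json", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response all count as "down".
        return False


if __name__ == "__main__":
    print("backend reachable:", backend_is_up())
```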
- Open Google Chrome.
- Navigate to chrome://extensions/
- Enable "Developer mode" using the toggle switch in the top right corner.
- Click the "Load unpacked" button.
- Select the extension folder from this repository.
- Start the API: Ensure your local FastAPI backend is running via Uvicorn.
- Find an Article: Navigate to a blog post, news article, or documentation page in Chrome. (If you just installed the extension, refresh the page first.)
- Spawn Widget: Click the extension icon in your Chrome toolbar. A draggable player will appear on the screen.
- Read: Click "Start Reading". The extension will parse the page, stream the text to your local backend, and generate the audio.
- Select Manual Area: Click the boundary button to draw a rectangle over a specific paragraph. The widget plays what you select and remembers it until cleared.
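The "parse the page" step in the list above happens inside content.js against the live DOM. As an offline illustration of the same idea, here is a small Python sketch that keeps readable text and drops non-content regions. The `SKIP_TAGS` and `BLOCK_TAGS` sets and the `extract_readable` name are assumptions for this example; the extension's real heuristics may differ.

```python
from html.parser import HTMLParser

# Assumed sets: regions to drop, and tags that should separate words.
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside"}
BLOCK_TAGS = {"p", "div", "li", "br", "section", "article",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class ReadableText(HTMLParser):
    """Collect text that sits outside any skipped region."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped region
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._parts.append(" ")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self._parts.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace left over from markup and indentation.
        return " ".join("".join(self._parts).split())


def extract_readable(html: str) -> str:
    """Return the page's readable text with skipped regions removed."""
    parser = ReadableText()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A tag-based filter like this is deliberately crude. It will not catch ads wrapped in plain `div` elements, which is one reason the manual area selection is handy as a fallback.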
- DOM Extraction Engine
- Python Edge-TTS Streaming API
- Phase 9 Integration: Full Webpage Floating Widget
- Global Keyboard Shortcuts
- Memory Preferences (Speed and Area Selection)
- Manual Area Bounding Box Selection
- Dockerized Backend Deployment
- Offline local-LLM TTS fallback