EchoMind is a Chrome extension that allows you to record audio from any webpage, transcribe it using Groq's blazing-fast implementation of the Whisper model, and then process the transcribed text using powerful LLMs hosted by Cerebras or Google. It's your personal AI assistant for the web!
- Tab Audio Recording: Capture audio directly from your active browser tab. No need for external recording software.
- Fast Transcription: Leverage Groq's incredibly fast implementation of OpenAI's Whisper-large-v3-turbo model for accurate and speedy speech-to-text conversion.
- Multilingual Support: Transcribe audio in multiple languages, including Chinese (zh), English (en), Spanish (es), French (fr), German (de), Japanese (ja), Korean (ko), and Russian (ru). Automatic language detection is also supported.
- AI-Powered Processing: Process the transcribed text using:
  - Meta's llama-3.3-70b (hosted by Cerebras)
  - Google's gemini-2.0-flash-exp
- AI-Processing Options:
  - Summarization: Get concise summaries of lengthy audio content.
  - Question Answering: Ask questions about the audio and receive AI-generated answers.
  - Custom Prompts: Write your own prompts to tailor the AI's processing to your needs.
- User-Friendly Interface: A clean and intuitive popup interface makes recording, transcribing, and processing simple.
- Minimize Mode: Minimize the popup to a discreet microphone icon while recording, keeping your workspace uncluttered.
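The language selection described above can be sketched as a small helper that prepares the form fields for a transcription request. This is a minimal sketch, not the extension's actual code: `buildTranscriptionFields` and `SUPPORTED_LANGUAGES` are hypothetical names, and the endpoint URL and field names assume Groq's OpenAI-compatible transcription API.

```typescript
// Language codes listed in this README, plus "auto" for automatic detection.
const SUPPORTED_LANGUAGES = ["auto", "zh", "en", "es", "fr", "de", "ja", "ko", "ru"] as const;
type LanguageChoice = (typeof SUPPORTED_LANGUAGES)[number];

// Assumption: Groq's OpenAI-compatible transcription endpoint.
const GROQ_TRANSCRIPTION_URL = "https://api.groq.com/openai/v1/audio/transcriptions";

// Hypothetical helper: returns the text fields of the multipart form.
// The caller would append these, together with the recorded audio file,
// to a FormData object and POST it with an "Authorization: Bearer <key>" header.
function buildTranscriptionFields(language: LanguageChoice): Record<string, string> {
  const fields: Record<string, string> = { model: "whisper-large-v3-turbo" };
  // Leaving out "language" lets the model detect the language itself.
  if (language !== "auto") {
    fields.language = language;
  }
  return fields;
}
```

Keeping "Auto" as the absence of a `language` field, rather than a special value, means the request carries only values the API is assumed to understand.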
There are two ways to install EchoMind:
1. From GitHub Releases (Recommended for most users):
   - Go to the Releases page for this repository.
   - Download the latest `EchoMind-vX.X.zip` file (where `X.X` is the version number).
   - Open Chrome and navigate to `chrome://extensions`.
   - Enable "Developer mode" in the top right corner (if it's not already enabled).
   - Drag and drop the downloaded `EchoMind-vX.X.zip` file onto the `chrome://extensions` page. Chrome will install the extension.
   - If the drag-and-drop install fails, unpack the zip file and use "Load unpacked" instead (this button is only visible while Developer mode is on).
2. From Source (For developers or advanced users):
   - Clone the Repository: `git clone https://github.com/Franklyc/EchoMind.git`
   - Load the Extension in Chrome:
     - Open Chrome and navigate to `chrome://extensions`.
     - Enable "Developer mode" in the top right corner.
     - Click "Load unpacked".
     - Select the `EchoMind` directory you just cloned.
After either installation method:
- Obtain API Keys:
  - Groq: Groq Cloud
  - Cerebras: Cerebras Cloud
  - Google: Google AI Studio
- Configure API Keys:
  - Click the EchoMind extension icon in your Chrome toolbar.
  - Click the "Settings" gear icon.
  - Enter your Groq, Cerebras, and Google API keys in the respective fields.
  - Click "Save Keys".
- Start Recording:
  - Navigate to the webpage with the audio you want to capture.
  - Click the EchoMind extension icon.
  - Select your desired transcription language (or leave it on "Auto" for automatic detection).
  - Click the "Start" button. The button will change to indicate recording is in progress.
- Stop Recording:
  - Click the "Stop" button. EchoMind will automatically send the recorded audio to Groq for transcription.
- View Transcript:
  - The transcribed text will appear in the "Transcript" section.
- Process with AI:
  - Choose an AI Model: Select from the dropdown.
  - Choose a "Processing Option":
    - Summarize: Generates a summary of the transcript.
    - Answer Questions: Prepares the AI to answer questions based on the transcript. You'll likely want to follow this up with a custom prompt.
    - Custom Prompt: Enter your own instructions for the AI.
  - (Optional) If you selected "Custom Prompt", enter your prompt in the "Custom Prompt" textarea.
  - Click the "Process" button. The AI's output will appear in the "AI Output" section.
- Minimize/Restore:
  - Click the minimize button in the top right corner of the popup.
  - Click the microphone icon to restore the extension popup.
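The "Process with AI" step can be sketched as two pieces: turning the chosen processing option into a prompt, and wrapping that prompt in a request for the selected provider. This is only an illustration under stated assumptions. The prompt wording, the `ApiRequest` shape, and all function names are hypothetical; the endpoint paths and body shapes assume Cerebras's OpenAI-compatible chat completions API and Google's `generateContent` REST API, and are not taken from the extension's source.

```typescript
type ProcessingOption = "summarize" | "answer" | "custom";

// Hypothetical prompt templates; the extension's real wording is not documented here.
function buildPrompt(option: ProcessingOption, transcript: string, customPrompt = ""): string {
  switch (option) {
    case "summarize":
      return `Summarize the following transcript:\n\n${transcript}`;
    case "answer":
      return `Use the following transcript to answer questions about it:\n\n${transcript}`;
    case "custom": {
      const instructions = customPrompt.trim();
      if (instructions === "") {
        throw new Error("The Custom Prompt option needs a non-empty prompt");
      }
      return `${instructions}\n\n${transcript}`;
    }
  }
}

// Plain description of an HTTP POST, usable as fetch(url, { method: "POST", headers, body }).
interface ApiRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Assumption: Cerebras exposes an OpenAI-style chat completions endpoint.
function buildCerebrasRequest(apiKey: string, prompt: string): ApiRequest {
  return {
    url: "https://api.cerebras.ai/v1/chat/completions",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({
      model: "llama-3.3-70b",
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Assumption: Google's Generative Language API, generateContent method.
function buildGeminiRequest(apiKey: string, prompt: string): ApiRequest {
  return {
    url: "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
  };
}
```

Separating prompt construction from request construction keeps the three processing options independent of which model is selected in the dropdown.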
- `activeTab`: Required to access the currently active tab for audio capture.
- `storage`: Used to store your API keys, selected AI model, and the transcribed text locally.
- `tabCapture`: Enables capturing audio from the current tab.
- `host_permissions`:
  - `*://api.groq.com/*`: Allows the extension to communicate with the Groq API for transcription.
  - `*://api.cerebras.ai/*`: Allows the extension to communicate with the Cerebras API for text processing.
  - `*://generativelanguage.googleapis.com/*`: Allows the extension to communicate with the Google Generative Language API for text processing.
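For reference, the permissions above correspond to a manifest fragment along the following lines, written here as a typed TypeScript object rather than raw JSON. The permission names and host patterns come from the list above; `manifest_version: 3` is an assumption (the `host_permissions` key belongs to Manifest V3), and `isHostAllowed` is a hypothetical, simplified checker that only understands patterns of the exact form `*://host/*`, not Chrome's full match-pattern syntax.

```typescript
// Only the permission-related keys of manifest.json are modeled here.
interface ManifestPermissions {
  manifest_version: 3;
  permissions: string[];
  host_permissions: string[];
}

const manifestPermissions: ManifestPermissions = {
  manifest_version: 3, // assumption, see the note above
  permissions: ["activeTab", "storage", "tabCapture"],
  host_permissions: [
    "*://api.groq.com/*",
    "*://api.cerebras.ai/*",
    "*://generativelanguage.googleapis.com/*",
  ],
};

// Hypothetical helper: true when an http(s) URL's host is covered by one of
// the "*://host/*" patterns. URLs with embedded credentials are not handled.
function isHostAllowed(url: string, patterns: string[]): boolean {
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(url);
  if (match === null) {
    return false; // not an http or https URL
  }
  const host = match[1].toLowerCase();
  return patterns.some((pattern) => pattern === `*://${host}/*`);
}
```

A check like this can be handy in unit tests to confirm that every API endpoint the extension calls is covered by a declared host pattern.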
This project is licensed under the MIT License - see the LICENSE file for details. Copyright (c) 2025 Franklyc.
Contributions are welcome! If you find a bug or have a feature request, please open an issue. If you'd like to contribute code, please fork the repository and submit a pull request.