Voice Synthesizer (SSML) v2.0G

English

A desktop application for Windows, macOS, and Linux that converts text and SSML (Speech Synthesis Markup Language) into MP3 audio files using the Google Cloud Text-to-Speech API.

Main Features

Intuitive GUI: Built with Tkinter, easy to use with a dark theme.
Text and SSML Support: Automatically detects whether the input is plain text or contains SSML tags.
Dynamic Voices: Loads the list of available es-ES voices directly from the Google API on startup.
SSML Assistant:
- Tag Panel: Displays a list of common SSML tags in a resizable side panel.
- Quick Insert: Inserts a tag into the text with just a double-click.
- Smart Wrapping: If text is selected in the editor, double-clicking an SSML tag (e.g., <emphasis>text</emphasis>) replaces the tag's placeholder text with your selection, wrapping it (<emphasis>your selected text</emphasis>).
- Customizable Tags: You can add your own tags by editing the labelSSML.txt file.
File Management:
- Automatic Naming: Generates sequential filenames (e.g., audio_01.mp3, audio_02.mp3) to avoid overwriting previous work.
- Text Saving: Alongside each audio file, a .txt file with the exact content used for synthesis is automatically saved.
- Persistent Output Folder: Remembers the last used output folder between sessions.
- Quick Access: Allows opening the output folder directly from the application menu.
Event Log: A log panel displays all session events (info, success, warnings, and errors) in real-time with color codes.
Status Bar: Always shows the current output folder at the bottom of the window.
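
The Automatic Naming and Text Saving features could be sketched roughly as below. This is only an illustration, not the application's actual code: the function names `next_audio_path` and `save_result` are hypothetical, and picking "highest existing number plus one" is an assumption about how the sequence is chosen.

```python
import re
from pathlib import Path


def next_audio_path(folder, prefix="audio_", ext=".mp3"):
    """Return the next unused sequential path, e.g. audio_03.mp3 (hypothetical helper)."""
    folder = Path(folder)
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+){re.escape(ext)}$")
    # Collect the numbers already used by files such as audio_01.mp3.
    used = [int(m.group(1)) for p in folder.iterdir() if (m := pattern.match(p.name))]
    # Assumption: continue after the highest number instead of filling gaps.
    number = max(used, default=0) + 1
    return folder / f"{prefix}{number:02d}{ext}"


def save_result(folder, audio_bytes, source_text):
    """Write the MP3 and a .txt file with the exact text used for synthesis."""
    mp3_path = next_audio_path(folder)
    mp3_path.write_bytes(audio_bytes)
    mp3_path.with_suffix(".txt").write_text(source_text, encoding="utf-8")
    return mp3_path
```

Continuing after the highest number means a deleted file never causes an older name to be reused, which keeps each audio/text pair unambiguous.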

What's New in Version 2.0G

This version is a major rewrite of the application, developed with the assistance of Gemini Code Assist.

Refactoring to Classes: The code has been completely restructured into a class (TextToSpeechApp), which improves readability and maintainability and eliminates the use of global variables.
Dynamic Voice Loading: The voice list is no longer hardcoded. The application now queries the Google API on startup to always get the most up-to-date voices.
Redesigned Interface with Panes: A PanedWindow has been implemented, allowing the user to resize the SSML tag panel and the main text area to their liking.
Smart SSML Tag Insertion: The double-click functionality now detects if text is selected to wrap it with the tag, greatly speeding up the editing process.
Event Log Panel: The simple info label has been replaced with a text box that acts as a log, keeping a history of all session messages.
Full Menu Bar: A standard menu bar has been added with the following options:
- File:
  - Change output folder...: Allows selecting a new folder to save audios.
  - Open output folder: Opens the current folder in the system's file explorer.
  - Exit: Closes the application.
- Help:
  - About...: Displays information about the application, version, and author.
Status Bar: A bar has been added at the bottom that persistently displays the output folder path.
Version Centralization: The version number (v2.0G) is now a single constant, which simplifies future updates. It is displayed in both the window title and the "About" dialog.
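
As an illustration of the Smart SSML Tag Insertion behavior, the wrapping step could look something like the sketch below. The function name `apply_tag` and the regular expression are assumptions made for this example, and so is the choice to leave self-closing tags such as `<break/>` unchanged; the application may handle these cases differently.

```python
import re

# Matches a paired tag template such as <emphasis>text</emphasis>.
_PAIRED = re.compile(r"^(<(\w+)[^>]*>)(.*)(</\2>)$", re.DOTALL)


def apply_tag(template, selection=""):
    """Return the text to insert when a tag from the panel is double-clicked.

    With a selection and a paired tag, the placeholder between the opening
    and closing tags is replaced by the selection. Otherwise the template
    is returned unchanged (hypothetical behavior for this sketch).
    """
    match = _PAIRED.match(template.strip())
    if not selection or match is None:
        return template
    opening, _name, _placeholder, closing = match.groups()
    return f"{opening}{selection}{closing}"
```

In a Tkinter `Text` widget, the selection would typically be read with `widget.get("sel.first", "sel.last")`, which raises `tkinter.TclError` when nothing is selected.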

Requirements

Python 3.x
A Google Cloud account with the Text-to-Speech API enabled.
Google Cloud credentials (a service account JSON file).
The libraries listed in requirements.txt (you can install them with pip install -r requirements.txt).
- google-cloud-texttospeech
- python-dotenv
- google-auth

Installation and Setup

Follow these steps to get the application running:

Clone this repository and navigate to the project directory.

Create and activate a virtual environment:

# On Windows
python -m venv .venv
.venv\Scripts\activate

# On macOS/Linux
python3 -m venv .venv
source .venv/bin/activate

Install the dependencies:
```
pip install -r requirements.txt
```
Configure your Google Cloud credentials:
a. Rename the .env.example file to .env.
b. Place your JSON credential file (downloaded from Google Cloud) inside the .credencial folder.
c. Open the .env file and make sure the filename matches your credential file. It should look like this:
```
GOOGLE_APPLICATION_CREDENTIALS=".credencial/your-credentials-file.json"
```
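
To check the configuration before launching the GUI, a small script along these lines may help. It is a sketch under assumptions: the helper names `credentials_path` and `list_voice_names` are hypothetical, and it assumes the script is run from the project root so that the relative path in .env resolves. It only uses `load_dotenv` from python-dotenv and `TextToSpeechClient.list_voices` from google-cloud-texttospeech.

```python
import os
from pathlib import Path


def credentials_path(env=os.environ, project_root="."):
    """Resolve GOOGLE_APPLICATION_CREDENTIALS and verify the file exists."""
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip().strip('"')
    if not raw:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set; check your .env file")
    path = Path(raw)
    if not path.is_absolute():
        # Relative paths are taken relative to the project root.
        path = Path(project_root) / path
    if not path.is_file():
        raise FileNotFoundError(f"Credential file not found: {path}")
    return path


def list_voice_names(language_code="es-ES"):
    """Load .env, validate the credentials path, and return the voice names."""
    # Imported lazily so the path check above works without these packages.
    from dotenv import load_dotenv
    from google.cloud import texttospeech

    load_dotenv()  # copies the variables from .env into os.environ
    credentials_path()
    client = texttospeech.TextToSpeechClient()
    response = client.list_voices(language_code=language_code)
    return sorted(voice.name for voice in response.voices)
```

If `list_voice_names()` returns a non-empty list, the credentials and the API are most likely configured correctly.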

Usage

Make sure you have completed the setup and activated the virtual environment.
Run the main script from the project root:
```
python MiText2Speech_GUI.py
```
The application will start. Type or paste your text, select a voice, and click "Generate Audio".
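
For readers who want to script the same flow without the GUI, the steps behind "Generate Audio" could be approximated as follows. This is a hedged sketch, not the application's code: `build_input_kwargs` and `synthesize_to_mp3` are hypothetical names, the tag-detection regular expression is an assumption, and wrapping tagged input in `<speak>` when the root element is missing is a convenience chosen for this example.

```python
import re

# Assumption: any XML-like tag in the input means it should be sent as SSML.
_TAG = re.compile(r"<\s*/?\s*[A-Za-z][\w:-]*[^<>]*>")


def build_input_kwargs(text):
    """Return {'text': ...} for plain text or {'ssml': ...} for SSML input."""
    stripped = text.strip()
    if not _TAG.search(stripped):
        return {"text": stripped}
    # SSML sent to the API needs a <speak> root element.
    if not stripped.startswith("<speak"):
        stripped = f"<speak>{stripped}</speak>"
    return {"ssml": stripped}


def synthesize_to_mp3(text, voice_name, out_path, language_code="es-ES"):
    """Send the text to Google Cloud Text-to-Speech and write the MP3 file."""
    # Imported lazily; requires google-cloud-texttospeech and valid credentials.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(**build_input_kwargs(text)),
        voice=texttospeech.VoiceSelectionParams(
            language_code=language_code, name=voice_name
        ),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response.audio_content)
    return out_path
```

The `voice_name` argument must be one of the names returned by the API for the chosen language.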

