A fully custom Domain-Specific Language and compiler that transforms algorithmic music syntax into playable MIDI audio files.
- Overview
- Architecture
- Execution Flow
- Language Syntax
- Tech Stack
- Quick Start
- Local Build
- Project Structure
- Compiler Internals
MaestroLang is a production-grade, custom-built Domain-Specific Language (DSL) and compiler, engineered entirely from scratch. It introduces a clean, C-style musical grammar and compiles it into fully playable binary MIDI (.mid) audio files, without writing a single note in a traditional DAW.
Rather than targeting machine code or assembly, MaestroLang employs Source-to-Source Compilation (Transpilation): a C-based compiler frontend (Flex + Bison) performs all lexical, syntactic, and semantic validation, then emits intermediate Python targeting the music21 audio engine, which is auto-executed to produce the final audio artifact.
Design Philosophy: Zero-friction compilation. One command in. One .mid file out.
| Capability | Implementation |
|---|---|
| Lexical Analysis | Flex (regex-based tokenizer) |
| Syntax Parsing | GNU Bison (LALR(1) CFG) |
| Semantic Validation | C-based Symbol Table (in-memory string array) |
| Code Generation | Syntax-Directed Translation (SDT), no AST required |
| Audio Output | Python music21 library |
| OS Independence | Docker containerization |
| Cross-Platform | macOS, Linux, Windows (WSL2 + PowerShell) |
The compiler follows a strict 4-phase pipeline. Each phase transforms the representation of the program before handing off to the next stage.
graph TD
A["Source File (.mstr)"] --> PH1
subgraph Phase1[" "]
PH1["Phase 1 - Lexical Analysis - lexer.l / Flex"]
PH1 --> B
B["Character Stream Reader"]
B --> C["Regex Pattern Matching"]
C --> D["Token Stream (Keywords, Pitches, Literals)"]
end
D --> PH2
subgraph Phase2[" "]
PH2["Phase 2 - Syntax Analysis - parser.y / Bison"]
PH2 --> E
E["LALR(1) Parser (Context-Free Grammar)"]
E --> F["Grammar Rule Validation"]
F --> G["Syntax Error Detection (yylineno)"]
end
F --> PH3
subgraph Phase3[" "]
PH3["Phase 3 - Semantic Analysis - Embedded in Bison"]
PH3 --> H
H["Symbol Table Lookup / Insert"]
H --> I{"Identifier Valid?"}
I -- "No" --> J["Semantic Error - Undeclared / Duplicate Macro"]
I -- "Yes" --> K["Physics Bounds Validation (BPM 1-300)"]
end
K --> PH4
subgraph Phase4[" "]
PH4["Phase 4 - Code Generation and Transpilation"]
PH4 --> L
L["Syntax-Directed Translation (SDT)"]
L --> M["Dynamic Indentation Tracking (indent_level)"]
M --> N["generated_audio.py (Python + music21)"]
N --> O["system() call - python3 runtime"]
end
O --> P["generated_audio.mid (Binary MIDI Output)"]
style Phase1 fill:#1a1a2e,stroke:#4f8ef7,color:#fff
style Phase2 fill:#16213e,stroke:#f7a800,color:#fff
style Phase3 fill:#0f3460,stroke:#e94560,color:#fff
style Phase4 fill:#1a1a2e,stroke:#52e0a1,color:#fff
style J fill:#e94560,stroke:#e94560,color:#fff
style P fill:#52e0a1,stroke:#52e0a1,color:#000
style PH1 fill:#1a3a6e,stroke:#4f8ef7,color:#fff
style PH2 fill:#3a2e00,stroke:#f7a800,color:#fff
style PH3 fill:#3a0a1a,stroke:#e94560,color:#fff
style PH4 fill:#0a3a2a,stroke:#52e0a1,color:#fff
End-to-end lifecycle from user keystroke to audio file, step by step.
sequenceDiagram
actor User
participant Shell as Shell / Terminal
participant C as C Binary (maestro)
participant Flex as Flex Lexer
participant Bison as Bison Parser
participant SymTable as Symbol Table
participant PyFile as generated_audio.py
participant Python as Python Runtime
participant MIDI as MIDI Output
User->>Shell: maestro song.mstr
Shell->>C: Execute binary with file arg
C->>C: Open song.mstr (read)
C->>C: Open generated_audio.py (write)
C->>C: Write Python boilerplate imports
C->>Flex: Feed character stream
loop For each token
Flex->>Bison: Return Token (keyword / pitch / literal)
Bison->>Bison: Match CFG grammar rule
alt Syntax Error
Bison-->>User: SyntaxError at line N
end
Bison->>SymTable: Lookup / Insert identifier
alt Semantic Error
SymTable-->>User: SemanticError (undeclared / duplicate)
end
Bison->>PyFile: fprintf to emit Python line (SDT)
Note over Bison,PyFile: indent_level tracks loop depth
end
Flex->>Bison: EOF
C->>C: Close all file handles
C->>Python: system("python3 generated_audio.py")
Python->>Python: music21 parses objects
Python->>Python: Map pitches to MIDI note numbers
Python->>MIDI: Write binary data
MIDI-->>User: generated_audio.mid
C-->>Shell: Exit Code 0
MaestroLang uses a clean, C-style syntax specifically designed for algorithmic music composition. Source files use the .mstr extension.
| Keyword | Description | Example |
|---|---|---|
| `Track` | Top-level music block. Wraps all composition logic. | `Track "BossFight" { ... }` |
| `Tempo` | Sets the BPM (1-300). Validated at compile time. | `Tempo 150;` |
| `Play` | Plays a single note with a specified duration. | `Play C4(quarter);` |
| `Chord` | Plays multiple notes simultaneously (polyphony). | `Chord [C4, E4, G4](half);` |
| `Repeat` | Bounded loop. Generates a Python `for` block. | `Repeat 4 { ... }` |
| `Define` | Declares a reusable macro (musical phrase). | `Define Bassline { ... }` |
| `PlayMacro` | Invokes a previously defined macro. | `PlayMacro Bassline;` |
| `//` `/* */` | Single-line and multi-line comments. Stripped at lex phase. | `// tempo comment` |
Durations: whole | half | quarter | eighth | sixteenth
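As a rough illustration of how the duration keywords relate to playback time, the sketch below maps each keyword to a length in quarter-note beats and converts it to seconds at a given tempo. The `DURATIONS` table and `duration_seconds` helper are hypothetical names, not part of the compiler. The music21 type strings are included because music21 spells the sixteenth-note type as `16th`, so an emitter would presumably need to translate that one keyword.

```python
# Hypothetical lookup: MaestroLang duration keyword -> (music21 type string, beats).
# Beats are measured in quarter notes; "16th" is music21's name for a sixteenth note.
DURATIONS = {
    "whole": ("whole", 4.0),
    "half": ("half", 2.0),
    "quarter": ("quarter", 1.0),
    "eighth": ("eighth", 0.5),
    "sixteenth": ("16th", 0.25),
}


def duration_seconds(keyword: str, bpm: int) -> float:
    """Length of one note in seconds, treating BPM as quarter notes per minute."""
    _, beats = DURATIONS[keyword]
    return beats * 60.0 / bpm


if __name__ == "__main__":
    # Print how long each duration lasts at the example tempo of 150 BPM.
    for name in DURATIONS:
        print(f"{name:9s} at 150 BPM lasts {duration_seconds(name, 150):.2f} s")
```

Under these assumptions a quarter note at `Tempo 150;` lasts 60 / 150 = 0.4 seconds, and every other duration scales from that value.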
Pitches follow standard scientific pitch notation, validated by Flex regex [A-G][b#]?[0-9]:
C4 D#3 Eb5 F#2 G6 A2 Bb4
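To get a feel for what the pitch pattern accepts, the following sketch applies the same character classes with Python's `re` module. It is only an approximation of the lexer: Flex matches tokens inside a character stream, whereas `fullmatch` here checks one isolated string. The helper names `is_pitch` and `split_pitch` are made up for this example.

```python
import re
from typing import Tuple

# Same character classes as the Flex pattern: a letter A-G, an optional
# accidental (b or #), and exactly one octave digit.
PITCH = re.compile(r"([A-G])([b#]?)([0-9])")


def is_pitch(text: str) -> bool:
    """True only when the whole string is a single pitch token."""
    return PITCH.fullmatch(text) is not None


def split_pitch(text: str) -> Tuple[str, str, int]:
    """Split a pitch into (letter, accidental, octave); accidental may be ''."""
    match = PITCH.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid pitch: {text!r}")
    letter, accidental, octave = match.groups()
    return letter, accidental, int(octave)


if __name__ == "__main__":
    for candidate in ["C4", "D#3", "Eb5", "H4", "c4", "C10"]:
        print(candidate, is_pitch(candidate))
```

Because the pattern allows a single digit, octaves are limited to 0 through 9, and lowercase note letters are rejected.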
Track "BossFight" {
    Tempo 150;  // Fast-paced combat tempo

    /* -- Reusable Phrases (Macros) -- */
    Define Bassline {
        Play A2(eighth);
        Play E3(eighth);
        Play A2(eighth);
        Play F3(eighth);
    }

    Define TensionRiff {
        Play D4(sixteenth);
        Play F4(sixteenth);
        Play A4(eighth);
    }

    /* -- Build the tension: 4x loop -- */
    Repeat 4 {
        PlayMacro Bassline;
        PlayMacro TensionRiff;
    }

    /* -- Dramatic polyphonic finale -- */
    Chord [A3, C4, E4](whole);
    Chord [F3, A3, C4](whole);
    Chord [G3, B3, D4](half);
    Chord [A3, E4, A4](whole);
}

MaestroLang Compiler Stack

| Phase | Technology |
|---|---|
| Lexical | Flex: regex tokenization of .mstr source |
| Syntactic | GNU Bison: LALR(1) context-free grammar enforcement |
| Semantic | C (embedded in parser.y): symbol table, BPM bounds |
| Code Generation | C fprintf + SDT: writes generated_audio.py on the fly |
| Audio Rendering | Python 3 + music21: MIDI binary file production |
| Build System | GNU Make: links Flex/Bison/GCC outputs |
| Distribution | Docker (python:3.9-slim base): cross-OS execution |
Recommended: Use Docker for a zero-dependency, cross-platform experience. No GCC, Flex, Bison, or Python required on your host machine.
git clone https://github.com/YOUR_USERNAME/MaestroLang.git
cd MaestroLang
docker build -t maestrolang .

This installs GCC, Flex, Bison, Python, and music21 inside a clean Linux container and compiles the maestro binary.
Create a file called song.mstr in your current directory using the language syntax above.
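macOS / Linux:

docker run --rm \
  -v $(pwd):/work \
  -w /work \
  maestrolang song.mstr

Windows PowerShell:

docker run --rm `
  -v ${PWD}:/work `
  -w /work `
  maestrolang song.mstr

The -w /work flag makes the mounted directory the container's working directory, so song.mstr is resolved there and the output is written back to it.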
Output: generated_audio.mid will appear in your current directory. Open it with any MIDI player, GarageBand, VLC, or import into a DAW.
Prerequisites:

# Install compiler tools (macOS with Homebrew)
brew install gcc flex bison
# Or on Ubuntu/Debian
sudo apt update && sudo apt install gcc make flex bison
# Install Python audio engine
pip3 install music21

Build:

make clean && make
# (Optional) Install globally
sudo cp maestro /usr/local/bin/

Run:

maestro my_song.mstr
# Compilation successful! Generating audio...
# generated_audio.mid created.

Windows does not ship with GCC, Flex, or Bison. The standard path is WSL2 (Windows Subsystem for Linux).
Step 1: Enable WSL2

# Run in PowerShell as Administrator
wsl --install
# Restart when prompted, then open the Ubuntu terminal app

Step 2: Install Dependencies

sudo apt update
sudo apt install gcc make flex bison python3 python3-pip
pip3 install music21

Step 3: Build & Run

git clone https://github.com/YOUR_USERNAME/MaestroLang.git
cd MaestroLang
make clean && make
./maestro my_song.mstr

Access your output file from Windows Explorer:
explorer.exe .

MaestroLang/
│
├── lexer.l              # Flex: Regex rules, token definitions, comment stripping
├── parser.y             # Bison: CFG grammar, SDT actions, Symbol Table, Semantic checks
├── Makefile             # Build automation: links Flex + Bison + GCC outputs into 'maestro'
├── Dockerfile           # Container: python:3.9-slim + GCC/Flex/Bison + music21 setup
│
├── examples/
│   ├── boss_fight.mstr  # Example: fast-paced combat theme with loops and macros
│   ├── pop.mstr         # Example: pop chord progressions
│   └── ambient.mstr     # Example: slow ambient textures
│
└── README.md            # This file
Symbol Table Design
The Symbol Table is implemented as a fixed-size C string array embedded directly in parser.y. It performs two operations:
- Insert (define_macro): When Define <Name> is parsed, the identifier string is appended to the table. If it already exists, a Semantic Error: Macro '<Name>' already defined is raised and compilation halts.
- Lookup (lookup_macro): When PlayMacro <Name> is parsed, a linear search is performed. If the identifier is not found, a Semantic Error: Undeclared Macro '<Name>' is raised.
This deliberately avoids heap allocation, which keeps the table simple and fast for the bounded macro scope of a single Track block. The trade-off is a fixed capacity, so inserts need a bounds check to stay memory-safe.
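The insert and lookup behavior described above can be modeled in a few lines. This is a Python sketch for illustration only: the real table is a C string array in parser.y, and the capacity of 64, the `SemanticError` class, and the "table full" message are assumptions made for this example. The function names and the two error messages follow the description above.

```python
class SemanticError(Exception):
    """Raised when a macro is redefined, undeclared, or the table is full."""


class SymbolTable:
    def __init__(self, capacity: int = 64):  # capacity is an assumed value
        # A preallocated list stands in for the fixed-size C string array.
        self._slots = [None] * capacity
        self._count = 0

    def _contains(self, name: str) -> bool:
        # Linear search over the occupied slots only.
        for index in range(self._count):
            if self._slots[index] == name:
                return True
        return False

    def define_macro(self, name: str) -> None:
        """Insert: reject duplicates and refuse to write past the last slot."""
        if self._contains(name):
            raise SemanticError(f"Macro '{name}' already defined")
        if self._count == len(self._slots):
            raise SemanticError("Symbol table full")  # hypothetical message
        self._slots[self._count] = name
        self._count += 1

    def lookup_macro(self, name: str) -> None:
        """Lookup: raise when the macro has not been defined yet."""
        if not self._contains(name):
            raise SemanticError(f"Undeclared Macro '{name}'")


if __name__ == "__main__":
    table = SymbolTable()
    table.define_macro("Bassline")
    table.lookup_macro("Bassline")  # passes silently
    try:
        table.lookup_macro("TensionRiff")
    except SemanticError as error:
        print("Semantic Error:", error)
```

A linear search is a reasonable fit here because a single Track block is unlikely to define more than a handful of macros.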
Syntax-Directed Translation (SDT): No AST
Traditional compilers build an Abstract Syntax Tree (AST) and then perform a separate tree-walk code generation pass. MaestroLang eliminates this overhead by using Syntax-Directed Translation: C fprintf commands are embedded directly inside Bison grammar rule actions and fire immediately upon a successful parse match.
Example SDT mapping:
| MaestroLang Input | Emitted Python (generated_audio.py) |
|---|---|
| `Tempo 120;` | `s.append(tempo.MetronomeMark(number=120))` |
| `Play C4(quarter);` | `p.append(note.Note('C4', type='quarter'))` |
| `Chord [C4,E4,G4](half);` | `p.append(chord.Chord(['C4','E4','G4'], type='half'))` |
| `Repeat 4 {` | `for _i in range(4):` + indent_level++ |
| `}` (close Repeat) | indent_level-- |
| `Define Bassline {` | Python function def + Symbol Table insert |
| `PlayMacro Bassline;` | Python function call + Symbol Table lookup |
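The first three rows of the mapping can be reproduced with plain string formatting. The sketch below is a Python stand-in for the C fprintf actions, not the compiler's code: the `emit_*` function names and the tempo error message are hypothetical, and `s` and `p` are taken as-is from the table (presumably music21 containers created by the generated boilerplate). The tempo check mirrors the documented 1-300 BPM bound.

```python
from typing import List


def emit_tempo(bpm: int) -> str:
    """Emit the tempo line, enforcing the documented 1-300 BPM bound."""
    if not 1 <= bpm <= 300:
        # Hypothetical message; the compiler's exact wording is not documented here.
        raise ValueError(f"Semantic Error: Tempo {bpm} is outside 1-300 BPM")
    return f"s.append(tempo.MetronomeMark(number={bpm}))"


def emit_play(pitch: str, duration: str) -> str:
    """Emit the line for a single note."""
    return f"p.append(note.Note('{pitch}', type='{duration}'))"


def emit_chord(pitches: List[str], duration: str) -> str:
    """Emit the line for several simultaneous notes."""
    names = ",".join(f"'{pitch}'" for pitch in pitches)
    return f"p.append(chord.Chord([{names}], type='{duration}'))"


if __name__ == "__main__":
    print(emit_tempo(120))
    print(emit_play("C4", "quarter"))
    print(emit_chord(["C4", "E4", "G4"], "half"))
```

Each function returns exactly one output line per statement, which is what lets translation happen during parsing without keeping a tree in memory.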
Dynamic Indentation Tracking
Python enforces syntactic whitespace. Because MaestroLang generates Python code via fprintf calls in C without building an AST, indentation must be tracked dynamically at parse-time.
A global integer indent_level is maintained in parser.y:
- Incremented when a Repeat or Define block opens ({)
- Decremented when the matching close brace (}) is reduced

Every fprintf call for a Python statement prepends indent_level × 4 spaces before writing the code line. This produces correctly indented, syntactically valid Python without a separate formatting pass.
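The counter-based scheme can be tried out with a small emitter. This is a Python model of the idea rather than the code in parser.y: the `Emitter` class and its method names are invented for this example, and a plain list called `played` stands in for the music21 objects so the generated text can be executed without any library installed.

```python
class Emitter:
    """Collects generated lines, indenting each by indent_level * 4 spaces."""

    def __init__(self):
        self.indent_level = 0
        self.lines = []

    def emit(self, code: str) -> None:
        # Prepend the current indentation, as each fprintf call would.
        self.lines.append(" " * (4 * self.indent_level) + code)

    def open_repeat(self, count: int) -> None:
        # Opening brace: write the loop header, then go one level deeper.
        self.emit(f"for _i in range({count}):")
        self.indent_level += 1

    def close_block(self) -> None:
        # Closing brace: return to the enclosing level.
        self.indent_level -= 1

    def source(self) -> str:
        return "\n".join(self.lines) + "\n"


# Emit a small program with two nested Repeat blocks.
out = Emitter()
out.emit("played = []")
out.open_repeat(2)
out.emit("played.append('A2')")
out.open_repeat(3)
out.emit("played.append('E3')")
out.close_block()
out.close_block()
out.emit("played.append('end')")

generated = out.source()
# compile() raises SyntaxError or IndentationError if the indentation is wrong.
compile(generated, "generated_audio.py", "exec")
print(generated)
```

One case this sketch does not cover is an empty Repeat body, which would leave a `for` header with no statements. Emitting `pass` after every header is one way to handle it, though whether the grammar allows empty blocks is not stated here.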
Docker Containerization Strategy
Challenge: C binaries are architecture-specific. A binary compiled on an Apple Silicon Mac will not run on a Windows x86_64 machine.
Solution: The Dockerfile packages the entire compiler toolchain:
FROM python:3.9-slim
# Install C toolchain + parser generators
RUN apt-get update && apt-get install -y \
gcc make flex bison
# Install Python audio engine
RUN pip install music21
# Copy source and compile inside the container
COPY . /app
WORKDIR /app
RUN make clean && make
# Set the compiled binary as the container entrypoint
ENTRYPOINT ["/app/maestro"]

Volume Mapping (-v $(pwd):/work) mounts the user's local directory as /work inside the container. When /work is also the container's working directory (for example via docker run -w /work), the compiler reads the .mstr file from there and writes generated_audio.mid back to it, which resolves directly to the user's host filesystem. The container exits immediately afterwards, and the --rm flag removes it, leaving no residual state.
Contributions, issues, and feature requests are welcome.
# Fork the repo, create a feature branch, and submit a PR
git checkout -b feature/multi-track-support
git commit -m "feat: add multi-track instrument layer"
git push origin feature/multi-track-support

Please open an issue first to discuss significant changes.
Distributed under the MIT License. See LICENSE for full details.
MaestroLang: where compiler theory meets music theory.
Built with passion for the intersection of Code and Sound.