simply-mihir/Maestrolang

MaestroLang

An Algorithmic Music Compiler

A fully custom Domain-Specific Language and compiler that transforms algorithmic music syntax into playable MIDI audio files.



Table of Contents

  1. Overview
  2. Architecture
  3. Execution Flow
  4. Language Syntax
  5. Tech Stack
  6. Quick Start
  7. Local Build
  8. Project Structure
  9. Compiler Internals

Overview

MaestroLang is a production-grade, custom-built Domain-Specific Language (DSL) and compiler, engineered entirely from scratch. It introduces a clean, C-style musical grammar and compiles it into fully playable binary MIDI (.mid) audio files, without writing a single note in a traditional DAW.

Rather than targeting machine code or assembly, MaestroLang employs Source-to-Source Compilation (Transpilation): a C-based compiler frontend (Flex + Bison) performs all lexical, syntactic, and semantic validation, then emits intermediate Python targeting the music21 audio engine, which is auto-executed to produce the final audio artifact.

Design Philosophy: Zero-friction compilation. One command in. One .mid file out.

Key Engineering Highlights

| Capability | Implementation |
|---|---|
| Lexical Analysis | Flex (regex-based tokenizer) |
| Syntax Parsing | GNU Bison (LALR(1) CFG) |
| Semantic Validation | C-based Symbol Table (in-memory string array) |
| Code Generation | Syntax-Directed Translation (SDT), no AST required |
| Audio Output | Python music21 library |
| OS Independence | Docker containerization |
| Cross-Platform | macOS, Linux, Windows (WSL2 + PowerShell) |

Architecture

The compiler follows a strict 4-phase pipeline. Each phase transforms the representation of the program before handing off to the next stage.

graph TD
    A["Source File (.mstr)"] --> PH1

    subgraph Phase1[" "]
        PH1["Phase 1 - Lexical Analysis - lexer.l / Flex"]
        PH1 --> B
        B["Character Stream Reader"]
        B --> C["Regex Pattern Matching"]
        C --> D["Token Stream (Keywords, Pitches, Literals)"]
    end

    D --> PH2

    subgraph Phase2[" "]
        PH2["Phase 2 - Syntax Analysis - parser.y / Bison"]
        PH2 --> E
        E["LALR(1) Parser (Context-Free Grammar)"]
        E --> F["Grammar Rule Validation"]
        F --> G["Syntax Error Detection (yylineno)"]
    end

    F --> PH3

    subgraph Phase3[" "]
        PH3["Phase 3 - Semantic Analysis - Embedded in Bison"]
        PH3 --> H
        H["Symbol Table Lookup / Insert"]
        H --> I{"Identifier Valid?"}
        I -- "No" --> J["Semantic Error - Undeclared / Duplicate Macro"]
        I -- "Yes" --> K["Physics Bounds Validation (BPM 1-300)"]
    end

    K --> PH4

    subgraph Phase4[" "]
        PH4["Phase 4 - Code Generation and Transpilation"]
        PH4 --> L
        L["Syntax-Directed Translation (SDT)"]
        L --> M["Dynamic Indentation Tracking (indent_level)"]
        M --> N["generated_audio.py (Python + music21)"]
        N --> O["system() call - python3 runtime"]
    end

    O --> P["generated_audio.mid (Binary MIDI Output)"]

    style Phase1 fill:#1a1a2e,stroke:#4f8ef7,color:#fff
    style Phase2 fill:#16213e,stroke:#f7a800,color:#fff
    style Phase3 fill:#0f3460,stroke:#e94560,color:#fff
    style Phase4 fill:#1a1a2e,stroke:#52e0a1,color:#fff
    style J fill:#e94560,stroke:#e94560,color:#fff
    style P fill:#52e0a1,stroke:#52e0a1,color:#000
    style PH1 fill:#1a3a6e,stroke:#4f8ef7,color:#fff
    style PH2 fill:#3a2e00,stroke:#f7a800,color:#fff
    style PH3 fill:#3a0a1a,stroke:#e94560,color:#fff
    style PH4 fill:#0a3a2a,stroke:#52e0a1,color:#fff

Execution Flow

End-to-end lifecycle from user keystroke to audio file, step by step.

sequenceDiagram
    actor User
    participant Shell as  Shell / Terminal
    participant C as  C Binary (maestro)
    participant Flex as  Flex Lexer
    participant Bison as  Bison Parser
    participant SymTable as  Symbol Table
    participant PyFile as  generated_audio.py
    participant Python as  Python Runtime
    participant MIDI as  MIDI Output

    User->>Shell: maestro song.mstr
    Shell->>C: Execute binary with file arg

    C->>C: Open song.mstr (read)
    C->>C: Open generated_audio.py (write)
    C->>C: Write Python boilerplate imports

    C->>Flex: Feed character stream
    loop For each token
        Flex->>Bison: Return Token (keyword / pitch / literal)
        Bison->>Bison: Match CFG grammar rule
        alt Syntax Error
            Bison-->>User:  SyntaxError at line N
        end
        Bison->>SymTable: Lookup / Insert identifier
        alt Semantic Error
            SymTable-->>User:  SemanticError (undeclared / duplicate)
        end
        Bison->>PyFile: fprintf → emit Python line (SDT)
        Note over Bison,PyFile: indent_level tracks loop depth
    end

    Flex->>Bison: EOF
    C->>C: Close all file handles
    C->>Python: system("python3 generated_audio.py")

    Python->>Python: music21 parses objects
    Python->>Python: Map pitches to MIDI note numbers
    Python->>MIDI: Write binary data

    MIDI-->>User:  generated_audio.mid
    C-->>Shell: Exit Code 0

Language Syntax

MaestroLang uses a clean, C-style syntax specifically designed for algorithmic music composition. Source files use the .mstr extension.

Full Language Reference

| Keyword | Description | Example |
|---|---|---|
| Track | Top-level music block. Wraps all composition logic. | Track "BossFight" { ... } |
| Tempo | Sets the BPM (1–300). Validated at compile time. | Tempo 150; |
| Play | Plays a single note with a specified duration. | Play C4(quarter); |
| Chord | Plays multiple notes simultaneously (polyphony). | Chord [C4, E4, G4](half); |
| Repeat | Bounded loop. Generates a Python for block. | Repeat 4 { ... } |
| Define | Declares a reusable macro (musical phrase). | Define Bassline { ... } |
| PlayMacro | Invokes a previously defined macro. | PlayMacro Bassline; |
| // and /* */ | Single-line and multi-line comments. Stripped at lex phase. | // tempo comment |
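
To make the lexing step concrete, here is a minimal Python sketch of a regex tokenizer for the keywords above. It is not the project's lexer.l (that file is written for Flex and is not reproduced here); the token class names and the tokenize helper are hypothetical and only illustrate the idea of matching patterns in priority order while dropping comments.

```python
import re

# Hypothetical token classes, tried in order; the real rules live in lexer.l.
TOKEN_SPEC = [
    ("COMMENT",  r"//[^\n]*|/\*.*?\*/"),   # dropped, never reaches the parser
    ("KEYWORD",  r"\b(?:Track|Tempo|PlayMacro|Play|Chord|Repeat|Define)\b"),
    ("DURATION", r"\b(?:whole|half|quarter|eighth|sixteenth)\b"),
    ("PITCH",    r"\b[A-G][b#]?[0-9]\b"),
    ("NUMBER",   r"\b[0-9]+\b"),
    ("STRING",   r'"[^"\n]*"'),
    ("IDENT",    r"\b[A-Za-z_][A-Za-z0-9_]*\b"),
    ("PUNCT",    r"[{}\[\]();,]"),
    ("SKIP",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC), re.DOTALL)


def tokenize(source):
    """Return (kind, text) pairs, skipping whitespace and comments."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup not in ("SKIP", "COMMENT"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('Tempo 150; // fast\nPlay D#3(eighth);'):
        print(token)
```

Listing PITCH before IDENT mirrors the usual lexer convention that earlier rules win ties, so A2 is read as a pitch rather than as a macro name.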

Supported Note Durations

whole  |  half  |  quarter  |  eighth  |  sixteenth
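
As a worked example of how these names relate to time, the sketch below converts a duration name into beats and seconds at a given tempo. It assumes the quarter note carries the beat (the usual reading of a BPM marking) and reuses the 1–300 BPM range from the Tempo rule. The DURATION_BEATS table and the seconds_for helper are illustrative names, not part of the compiler.

```python
# Beats per duration name, assuming a quarter note equals one beat.
DURATION_BEATS = {
    "whole": 4.0,
    "half": 2.0,
    "quarter": 1.0,
    "eighth": 0.5,
    "sixteenth": 0.25,
}


def seconds_for(duration, bpm):
    """Length of one note in seconds at the given tempo."""
    if not 1 <= bpm <= 300:
        raise ValueError(f"Tempo {bpm} is outside the allowed range 1-300")
    return DURATION_BEATS[duration] * 60 / bpm


if __name__ == "__main__":
    for name in DURATION_BEATS:
        print(name, seconds_for(name, 150))
```

At Tempo 150 a quarter note lasts 60 / 150 = 0.4 seconds, so an eighth lasts 0.2 seconds and a whole note 1.6 seconds.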

Pitch Format

Pitches follow standard scientific pitch notation, validated by Flex regex [A-G][b#]?[0-9]:

C4   D#3   Eb5   F#2   G6   A2   Bb4
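
The same pattern can be tried outside Flex. This short Python sketch applies the regex quoted above with a full-string match; the is_valid_pitch helper is a hypothetical name used only for illustration.

```python
import re

# Pattern copied from the pitch rule above: letter, optional accidental, one digit.
PITCH_RE = re.compile(r"[A-G][b#]?[0-9]")


def is_valid_pitch(text):
    """True when the whole string is a single pitch such as C4 or Eb5."""
    return PITCH_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["C4", "D#3", "Eb5", "H4", "c4", "C10"]:
        print(candidate, is_valid_pitch(candidate))
```

Because the pattern allows exactly one digit, octaves are limited to 0 through 9, and lowercase note letters are rejected.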

Complete Example: boss_fight.mstr

Track "BossFight" {
    Tempo 150; // Fast-paced combat tempo

    /* ── Reusable Phrases (Macros) ── */
    Define Bassline {
        Play A2(eighth);
        Play E3(eighth);
        Play A2(eighth);
        Play F3(eighth);
    }

    Define TensionRiff {
        Play D4(sixteenth);
        Play F4(sixteenth);
        Play A4(eighth);
    }

    /* ── Build the tension: 4x loop ── */
    Repeat 4 {
        PlayMacro Bassline;
        PlayMacro TensionRiff;
    }

    /* ── Dramatic polyphonic finale ── */
    Chord [A3, C4, E4](whole);
    Chord [F3, A3, C4](whole);
    Chord [G3, B3, D4](half);
    Chord [A3, E4, A4](whole);
}
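
As a sanity check on the example, the following Python sketch expands the macros and the Repeat block by hand and totals the length of the piece. It assumes a quarter note equals one beat and that a Chord lasts as long as its duration name. The lists are transcribed manually from the source above, not produced by the compiler, and every name in the sketch is hypothetical.

```python
BEATS = {"whole": 4.0, "half": 2.0, "quarter": 1.0, "eighth": 0.5, "sixteenth": 0.25}

# Durations transcribed by hand from boss_fight.mstr.
macros = {
    "Bassline": ["eighth", "eighth", "eighth", "eighth"],
    "TensionRiff": ["sixteenth", "sixteenth", "eighth"],
}

# Repeat 4 { PlayMacro Bassline; PlayMacro TensionRiff; }
body = (macros["Bassline"] + macros["TensionRiff"]) * 4

# The four Chord statements of the finale.
finale = ["whole", "whole", "half", "whole"]


def total_beats(durations):
    """Sum the beat lengths of a list of duration names."""
    return sum(BEATS[d] for d in durations)


beats = total_beats(body) + total_beats(finale)
seconds = beats * 60 / 150  # Tempo 150

if __name__ == "__main__":
    print(len(body), "notes in the loop,", beats, "beats,", seconds, "seconds")
```

The loop contributes 4 × (2 + 1) = 12 beats and the finale 14, so the piece runs 26 beats, or about 10.4 seconds at 150 BPM.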

Tech Stack

Core Compiler Infrastructure

| Technology | Role |
|---|---|
| C / GCC | Compiler core, Symbol Table, system() handoff |
| Python 3 | Generated target code runtime |
| Flex | Fast Lexical Analyzer, tokenization |
| GNU Bison | LALR(1) parser generator, grammar + SDT |
| Docker | OS-independent containerized build & run |
| GNU Make | Build automation via Makefile |
| music21 | Python audio engine, MIDI generation |
| WSL2 | Linux environment for building and running on Windows |

Stack Breakdown by Compiler Phase

┌─────────────────────────────────────────────────────────────────────────────┐
│                        MaestroLang Compiler Stack                           │
├─────────────────┬───────────────────────────────────────────────────────────┤
│ PHASE           │ TECHNOLOGY                                                │
├─────────────────┼───────────────────────────────────────────────────────────┤
│ Lexical         │ Flex  →  Regex tokenization of .mstr source               │
│ Syntactic       │ GNU Bison  →  LALR(1) Context-Free Grammar enforcement    │
│ Semantic        │ C (embedded in parser.y)  →  Symbol Table, BPM bounds     │
│ Code Generation │ C fprintf + SDT  →  writes generated_audio.py on the fly  │
│ Audio Rendering │ Python 3 + music21  →  MIDI binary file production        │
│ Build System    │ GNU Make  →  links Flex/Bison/GCC outputs                 │
│ Distribution    │ Docker (python:3.9-slim base)  →  cross-OS execution      │
└─────────────────┴───────────────────────────────────────────────────────────┘

Quick Start

Recommended: Use Docker for a zero-dependency, cross-platform experience. No GCC, Flex, Bison, or Python required on your host machine.

Step 1: Clone

git clone https://github.com/YOUR_USERNAME/MaestroLang.git
cd MaestroLang

Step 2: Build the Docker Image

docker build -t maestrolang .

This installs GCC, Flex, Bison, Python, and music21 inside a clean Linux container and compiles the maestro binary.

Step 3: Write Your Song

Create a file called song.mstr in your current directory using the language syntax above.

Step 4: Compile & Play

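macOS / Linux:

docker run --rm \
  -v "$(pwd)":/work \
  -w /work \
  maestrolang song.mstr

Windows PowerShell:

docker run --rm `
  -v ${PWD}:/work `
  -w /work `
  maestrolang song.mstr

The -w /work flag makes the mounted directory the working directory, so the relative path song.mstr is resolved there and the output is written next to it.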

Output: generated_audio.mid will appear in your current directory. Open it with any MIDI player, GarageBand, VLC, or import into a DAW.


Local Build

macOS / Linux

Prerequisites:

# Install compiler tools (macOS with Homebrew)
brew install gcc flex bison

# Or on Ubuntu/Debian
sudo apt update && sudo apt install gcc make flex bison

# Install Python audio engine
pip3 install music21

Build:

make clean && make

# (Optional) Install globally
sudo cp maestro /usr/local/bin/

Run:

maestro my_song.mstr
# → Compilation successful! Generating audio...
# → generated_audio.mid created.

Windows (via WSL2)

Windows does not ship with GCC, Flex, or Bison. The standard path is WSL2 (Windows Subsystem for Linux).

Step 1: Enable WSL2

# Run in PowerShell as Administrator
wsl --install
# Restart when prompted, then open the Ubuntu terminal app

Step 2: Install Dependencies

sudo apt update
sudo apt install gcc make flex bison python3 python3-pip
pip3 install music21

Step 3: Build & Run

git clone https://github.com/YOUR_USERNAME/MaestroLang.git
cd MaestroLang
make clean && make
./maestro my_song.mstr

Access your output file from Windows Explorer:

explorer.exe .

Project Structure

MaestroLang/
│
├── lexer.l               # Flex: Regex rules, token definitions, comment stripping
├── parser.y              # Bison: CFG grammar, SDT actions, Symbol Table, Semantic checks
├── Makefile              # Build automation: links Flex + Bison + GCC outputs → 'maestro'
├── Dockerfile            # Container: python:3.9-slim + GCC/Flex/Bison + music21 setup
│
├── examples/
│   ├── boss_fight.mstr   # Example: fast-paced combat theme with loops and macros
│   ├── pop.mstr          # Example: pop chord progressions
│   └── ambient.mstr      # Example: slow ambient textures
│
└── README.md             # This file

Compiler Internals Deep-Dive

Symbol Table Design

The Symbol Table is implemented as a fixed-size C string array embedded directly in parser.y. It performs two operations:

  • Insert (define_macro): When Define <Name> is parsed, the identifier string is appended to the table. If it already exists, a Semantic Error: Macro '<Name>' already defined is raised and compilation halts.
  • Lookup (lookup_macro): When PlayMacro <Name> is parsed, a linear search is performed. If the identifier is not found, a Semantic Error: Undeclared Macro '<Name>' is raised.

This deliberately avoids heap allocation, which keeps the compiler simple and fast for the bounded macro scope of a single Track block. Memory safety then rests on checking every insert against the table's fixed capacity.
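
The two operations can be modelled in a few lines. The sketch below is a Python stand-in for the C logic, not the code in parser.y: the capacity of 64, the SemanticError class, and the use_macro wrapper are assumptions made for illustration, while define_macro and lookup_macro follow the names used above.

```python
MAX_MACROS = 64  # hypothetical capacity; the real limit is whatever parser.y declares


class SemanticError(Exception):
    """Raised where the compiler would report a semantic error and halt."""


class SymbolTable:
    def __init__(self, capacity=MAX_MACROS):
        self.names = []          # models the fixed-size string array
        self.capacity = capacity

    def lookup_macro(self, name):
        # Linear search, as described for PlayMacro.
        for existing in self.names:
            if existing == name:
                return True
        return False

    def define_macro(self, name):
        if self.lookup_macro(name):
            raise SemanticError(f"Macro '{name}' already defined")
        if len(self.names) >= self.capacity:
            # Bounds check that a fixed-size array needs before appending.
            raise SemanticError("Symbol table is full")
        self.names.append(name)

    def use_macro(self, name):
        # Hypothetical helper: what the PlayMacro action would check.
        if not self.lookup_macro(name):
            raise SemanticError(f"Undeclared Macro '{name}'")


if __name__ == "__main__":
    table = SymbolTable()
    table.define_macro("Bassline")
    table.use_macro("Bassline")
    try:
        table.use_macro("Melody")
    except SemanticError as err:
        print("Semantic Error:", err)
```

A linear search is adequate here because a single Track is expected to define only a handful of macros.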

Syntax-Directed Translation (SDT): No AST

Traditional compilers build an Abstract Syntax Tree (AST) and then perform a separate tree-walk code generation pass. MaestroLang eliminates this overhead by using Syntax-Directed Translation: C fprintf commands are embedded directly inside Bison grammar rule actions and fire immediately upon a successful parse match.

Example SDT mapping:

| MaestroLang Input | Emitted Python (generated_audio.py) |
|---|---|
| Tempo 120; | s.append(tempo.MetronomeMark(number=120)) |
| Play C4(quarter); | p.append(note.Note('C4', type='quarter')) |
| Chord [C4,E4,G4](half); | p.append(chord.Chord(['C4','E4','G4'], type='half')) |
| Repeat 4 { | for _i in range(4): + indent_level++ |
| } (close Repeat) | indent_level-- |
| Define Bassline { | Python function def + Symbol Table insert |
| PlayMacro Bassline; | Python function call + Symbol Table lookup |
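
The first four rows of this mapping can be mimicked with plain string formatting. In the Python sketch below each function stands in for one fprintf call inside a Bison action; the emit_* names are hypothetical, and the output strings are copied from the mapping above rather than from the compiler's source.

```python
def emit_tempo(bpm):
    """Tempo 120; -> one line appending a metronome mark."""
    return f"s.append(tempo.MetronomeMark(number={bpm}))"


def emit_play(pitch, duration):
    """Play C4(quarter); -> one line appending a note."""
    return f"p.append(note.Note('{pitch}', type='{duration}'))"


def emit_chord(pitches, duration):
    """Chord [C4,E4,G4](half); -> one line appending a chord."""
    quoted = ",".join(f"'{p}'" for p in pitches)
    return f"p.append(chord.Chord([{quoted}], type='{duration}'))"


def emit_repeat(count):
    """Repeat 4 { -> a for header; the caller then raises indent_level."""
    return f"for _i in range({count}):"


if __name__ == "__main__":
    print(emit_tempo(120))
    print(emit_play("C4", "quarter"))
    print(emit_chord(["C4", "E4", "G4"], "half"))
    print(emit_repeat(4))
```

Each statement is turned into text the moment its rule is matched, which is why no tree of the program ever needs to be kept in memory.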

Dynamic Indentation Tracking

Python enforces syntactic whitespace. Because MaestroLang generates Python code via fprintf calls in C without building an AST, indentation must be tracked dynamically at parse-time.

A global integer indent_level is maintained in parser.y:

  • Incremented when a Repeat or Define block opens ({)
  • Decremented when the matching close brace (}) is reduced

Every fprintf call for a Python statement prepends indent_level × 4 spaces before writing the code line. This produces correctly indented, syntactically valid Python from first principles.
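
The counter-based scheme can be sketched as a small emitter. This is a Python model of the idea, not the C code in parser.y: the Emitter class and its method names are hypothetical, and play(...) is a placeholder for whatever statement the real compiler writes for a note.

```python
class Emitter:
    """Collects generated lines, indenting each by indent_level * 4 spaces."""

    def __init__(self):
        self.indent_level = 0
        self.lines = []

    def emit(self, code):
        self.lines.append(" " * (self.indent_level * 4) + code)

    def open_block(self, header):
        # Called when a Repeat or Define block opens ({).
        self.emit(header)
        self.indent_level += 1

    def close_block(self):
        # Called when the matching close brace (}) is reduced.
        self.indent_level -= 1


out = Emitter()
out.open_block("def Bassline():")       # Define Bassline {
out.emit("play('A2', 'eighth')")        # placeholder statement for one note
out.close_block()                       # }
out.open_block("for _i in range(4):")   # Repeat 4 {
out.emit("Bassline()")                  # PlayMacro Bassline;
out.close_block()                       # }
source = "\n".join(out.lines)

if __name__ == "__main__":
    print(source)
```

One caveat of this approach: Python rejects a block with no statements, so an empty Repeat or Define body would need a pass line emitted before the level is decremented.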

Docker Containerization Strategy

Challenge: C binaries are specific to an operating system and CPU architecture. A binary compiled on an Apple Silicon Mac will not run on a Windows x86_64 machine.

Solution: The Dockerfile packages the entire compiler toolchain:

FROM python:3.9-slim

# Install C toolchain + parser generators
RUN apt-get update && apt-get install -y \
    gcc make flex bison

# Install Python audio engine
RUN pip install music21

# Copy source and compile inside the container
COPY . /app
WORKDIR /app
RUN make clean && make

# Set the compiled binary as the container entrypoint
ENTRYPOINT ["/app/maestro"]

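Volume Mapping (-v $(pwd):/work) mounts the user's local directory as /work inside the container. The compiler reads the .mstr file and writes generated_audio.mid to /work, which resolves directly to the user's host filesystem. Because the Dockerfile above sets WORKDIR /app, relative paths such as song.mstr resolve inside /work only when the container's working directory is switched there, for example with docker run -w /work. The container exits immediately afterwards and --rm removes it, leaving zero residual state.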


Contributing

Contributions, issues, and feature requests are welcome.

# Fork the repo, create a feature branch, and submit a PR
git checkout -b feature/multi-track-support
git commit -m "feat: add multi-track instrument layer"
git push origin feature/multi-track-support

Please open an issue first to discuss significant changes.


License

Distributed under the MIT License. See LICENSE for full details.


MaestroLang: where compiler theory meets music theory.

Built with passion for the intersection of Code and Sound.
