Autoregressive Mosaics

Overview

The idea is simple: humans are naturally great at creating mosaic art. From the Roman Empire to French Neo-Impressionism, we can effortlessly place individual strokes to form a larger, coherent image, balancing local action with global structure.

Large Language Models, however, struggle with this because they fundamentally lack spatial grounding. Autoregressive Mosaics is an attempt to force an LLM trained only on text to paint a picture one discrete pixel at a time. The system gives the model a blank grid (M x N) and a text prompt; the model must infer where to place structure and color step-by-step using only its linguistic priors.

The results are often visually primitive, unstable, or unintentionally abstract, but that is exactly the point. They offer a raw look into how text-only models represent (and fracture) geometry, shape, and visual concepts.

As with any art, outputs are open to interpretation. Squint a little: what do you see? Does the result resemble what you asked for?

Model Used

This project currently uses:

Qwen/Qwen2.5-14B-Instruct

Qwen2.5-14B-Instruct is a text-first instruction-following language model. It is trained on large-scale mixed corpora (natural language + code) and tuned for instruction completion, reasoning, and structured generation. It is not a native image model in this setup, and it does not receive pixel tensors or vision encoder features here.

That makes the behavior in this project interesting: the model can still produce outputs that resemble visual structure, even though it is only generating text tokens.

Research Question

If a language model is trained primarily to model text and code, to what extent can it still recover coherent 2D visual concepts when forced to act as a pixel-level or programmatic painter?

Autoregressive Mosaics treats this as an empirical question by constraining generation and observing where geometry emerges, degrades, or collapses.

Two Generation Methods

To explore this phenomenon, the project includes two distinct generation pipelines.

1) ASCII Canvas (`ver2-asciicanvas`)

In this approach, the model behaves like a literal cell-by-cell painter.

In a single forward pass, the LLM generates:
- an ASCII topology grid inside <ascii>...</ascii>
- a symbol-to-color map inside <palette>...</palette>
Each grid cell is directly represented in text, so the model must make an explicit decision per position.
The backend parses, sanitizes, and force-fits the result to exact M x N shape, then maps characters to HEX colors.

Why this fails interestingly:

The model predicts tokens in a strict 1D sequence.
2D consistency (object boundaries, symmetry, position memory) is hard to sustain over long generations.
Shapes can drift, tear, collapse, or mutate across rows, producing fragmented but often compelling abstractions.

2) Code Canvas (`ver3-codecanvas`)

In this approach, the model behaves like a mosaic artist who writes code.

Instead of raw pixels, the LLM outputs Python rendering logic (render(canvas)).
The code uses a constrained drawing API (fill, set_pixel, rect, line, circle, triangle).
A deterministic renderer executes that code and rasterizes the final grid.

Why this performs better:

The model can express intent in compact symbolic form ("draw a circle at center") rather than committing to every cell token.
Deterministic geometry handles exact spatial bookkeeping.
This aligns with LLM strengths: symbolic decomposition, procedural logic, and code synthesis.
The result is a neuro-symbolic pipeline: language model for high-level plan, strict engine for spatial execution.

Repository Layout

ver2-asciicanvas/ - ASCII topology + palette generation backend and UI.
ver3-codecanvas/ - Code-generation neuro-symbolic backend and UI.
results/ - Sample outputs, visualization script, and project banner.
backend.py, index.html - earlier root-level prototype files.

Quick Start

Requirements

Python 3.10+
PyTorch + Transformers stack
GPU recommended for Qwen 14B

Install typical dependencies in your environment (example names may vary by setup):

pip install fastapi uvicorn torch transformers accelerate

Run ASCII Canvas (ver2)

cd ver2-asciicanvas
python backend.py

Then open: http://localhost:8123

Run Code Canvas (ver3)

cd ver3-codecanvas
python backend.py

Then open: http://localhost:8123

Note: both versions default to port 8123, so run one backend at a time.

What This Project Is (and Is Not)

This is not a production image generator.
This is an interpretability-flavored art experiment probing the boundary between text autoregression and spatial reasoning.
Failures are part of the signal, not just noise.

Copyright and License

This code is provided for viewing purposes only in conjunction with the CVPR art gallery. Copying, modification, distribution, and derivative works without citations are prohibited.

Citation

If you reference this work or repository, please cite it as follows:

Plain Text: A. Nedungadi, "Autoregressive Mosaics." GitHub, 2026. [Online]. Available: https://github.com/ashwin-ned/autoregressive-mosaics

BibTeX:

@misc{ned2026autoregressivemosaics,
  author = {Nedungadi, Ashwin},
  title = {Autoregressive Mosaics},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{[https://github.com/ashwin-ned/autoregressive-mosaics](https://github.com/ashwin-ned/autoregressive-mosaics)}}
}

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
gallery_images		gallery_images
results		results
ver2-asciicanvas		ver2-asciicanvas
ver3-codecanvas		ver3-codecanvas
.gitignore		.gitignore
README.md		README.md
banner_1920x1080.png		banner_1920x1080.png
index.html		index.html
main.js		main.js
style.css		style.css
symbolic-a-bird-1772110130942.png		symbolic-a-bird-1772110130942.png
symbolic-a-red-apple--1772103332350.png		symbolic-a-red-apple--1772103332350.png
symbolic-king-tut-s-tomb-1772661687365.png		symbolic-king-tut-s-tomb-1772661687365.png
symbolic-oranges-1772115891729.png		symbolic-oranges-1772115891729.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Autoregressive Mosaics

Overview

Model Used

Research Question

Two Generation Methods

1) ASCII Canvas (`ver2-asciicanvas`)

2) Code Canvas (`ver3-codecanvas`)

Repository Layout

Quick Start

Requirements

Run ASCII Canvas (ver2)

Run Code Canvas (ver3)

What This Project Is (and Is Not)

Copyright and License

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Autoregressive Mosaics

Overview

Model Used

Research Question

Two Generation Methods

1) ASCII Canvas (ver2-asciicanvas)

2) Code Canvas (ver3-codecanvas)

Repository Layout

Quick Start

Requirements

Run ASCII Canvas (ver2)

Run Code Canvas (ver3)

What This Project Is (and Is Not)

Copyright and License

Citation

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1) ASCII Canvas (`ver2-asciicanvas`)

2) Code Canvas (`ver3-codecanvas`)

Packages