Martin Hunt

Scientific and Embedded Software Developer

🔭 I'm currently writing software for signal processing and hearing research. I hope to get approval to open-source much of it in the near future.

🏫 What I'm Learning

🤖 AI Coding Tools

This is an exciting time for software development. Over the last few years I've shifted my workflow as the models and tools have improved. At this point I'm writing >90% of my code with AI, across unit tests, documentation, and full modules. I've read hundreds of articles, watched tons of videos, and put these tools through several real projects and dozens of experiments. Here's what I've learned.

My progression: GitHub Copilot as smarter autocomplete -> prompt engineering for tests and docs -> Claude Code. Switching to Claude Code gave me memory (CLAUDE.md / AGENTS.md), skills, hooks, and plugins: what the community now calls the "user harness." Tuning that environment is the difference between an efficient AI coding setup suitable for production and a mess that wastes time and money. The harness is the interface between you and the model, and it's where your time investment has the biggest leverage.

What works

Brownfield with a real test suite. An existing codebase is an executable specification. The model can read the structure, infer the style, follow established patterns, and — critically — run the tests to verify its work. The feedback loop is tight: change -> test -> diff. That's the loop that produces shippable code.

Tests are the most important part of the spec. Prose specs are ambiguous, decay over time, and can't be verified. Tests are unambiguous and runnable. When I add a feature, the model writes failing tests first, then implements until they pass. When I fix a bug, step one is a test that reproduces it. The model doesn't need to "understand the spec" — it needs a goal it can verify by running code.
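Here is a minimal sketch of that bug-fix flow. The function `rms_dbfs` and both test names are hypothetical, invented for illustration and not taken from any of my projects. The assumed bug is that digital silence crashed the level calculation; the first test was written before the fix and failed with a `ValueError` until the guard was added.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of `samples` in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # The fix: math.log10(0.0) raises ValueError, so report silence as -inf dBFS.
    if rms == 0.0:
        return -math.inf
    return 20.0 * math.log10(rms)


def test_silence_is_minus_infinity():
    # Step one of the bug fix: this test reproduced the crash before the guard existed.
    assert rms_dbfs([0.0, 0.0, 0.0]) == -math.inf


def test_full_scale_square_wave_is_zero_db():
    # Regression check: the fix must not change ordinary results.
    assert math.isclose(rms_dbfs([1.0, -1.0, 1.0, -1.0]), 0.0, abs_tol=1e-12)
```

The tests are plain functions with bare asserts, so a runner such as pytest can collect them and the model can rerun them after every change.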

Greenfield is brownfield with a bootstrap step. Use plan mode to sketch the project, generate an initial skeleton with tests, then operate as brownfield from there. The spec-writing phase is short and disposable — once code exists, the code is the spec.

Invest in the harness. Code quality tracks harness quality. Once the harness can run tests, format, and lint on its own, the model can verify its own work between turns, and the surface area you have to babysit collapses.
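One way to give the model a single "am I done?" command is a small script that runs every check and reports which ones failed. The sketch below is an assumption about how such a script could look, not my actual setup: `run_checks` and `PROJECT_CHECKS` are hypothetical names, and the ruff and pytest commands are examples to replace with whatever your repo uses.

```python
import subprocess
import sys

# Hypothetical project commands; substitute the ones your repo actually uses.
PROJECT_CHECKS = [
    ("format", ["ruff", "format", "--check", "."]),
    ("lint", ["ruff", "check", "."]),
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
]


def run_checks(checks):
    """Run each (name, command) pair in order and return the names that failed."""
    failed = []
    for name, cmd in checks:
        try:
            result = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            # A missing tool counts as a failure instead of crashing the script.
            print(f"FAIL {name}: command not found: {cmd[0]}")
            failed.append(name)
            continue
        if result.returncode == 0:
            print(f"ok   {name}")
        else:
            print(f"FAIL {name}")
            print(result.stdout + result.stderr)
            failed.append(name)
    return failed
```

Calling `run_checks(PROJECT_CHECKS)` and treating a non-empty result as "not done" gives the model one verifiable goal per turn. Every check runs even after a failure, so a single pass shows all the problems.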

What doesn't

The industry is rapidly reinventing software engineering, and much of it is hype. Solid engineering is getting drowned out by PR-driven stories about startups burning a million dollars in tokens to ship a side project.

Vibe Coding works for throwaway prototypes — note-taking apps with cute names and nice landing pages. Don't ship anything you'd need to maintain.

Spec-Driven Development is more thoughtful but structurally broken. You write a detailed spec; the model generates code from it. The spec is never perfect, so you iterate. But each regeneration throws away the understanding you'd built about the previous version, the code is something neither you nor the model owns, and there's no clean exit to hand-editing. You're locked in the loop, burning tokens.

The deeper problem is that prose is the wrong feedback medium. Tests run; specs don't. Without an executable target, every "iteration" is just rephrasing.

Harness engineering notes

It's a hierarchy. Global CLAUDE.md and skills apply to every project, so keep them minimal — preferences, not project specifics. Project-level files add the build/test commands, conventions, and domain knowledge that change per repo. Directory-level files handle subsystem quirks. Each layer extends the one above it.
What goes in CLAUDE.md. Build/test/lint commands the model should run before declaring done; conventions you keep restating ("we use X over Y, here's why"); a domain glossary for non-obvious terms; paths to design docs or schemas worth reading first; explicit "never do Z" rules from past mistakes. Keep it terse — every line costs context on every turn. You can break the file up into multiple files and import them using the "@" syntax to keep things organized, but the context cost remains.
Skills vs hooks. Skills are model-invoked capabilities the agent decides to use ("review this PR", "run security scan"). Hooks are deterministic shell scripts the harness fires on events (post-edit format, pre-commit lint). Different mechanisms: skills handle judgment calls, hooks handle invariants. Don't make the model "remember to format" — that's a hook.
Allowlist your read-only commands. Auto-permitting common reads — git status, ls, grep, your test runner — is one of the biggest friction reducers. The model stops asking permission for things you'd never deny, and the loop tightens dramatically.
The feedback loop is the point.
- Same mistake twice -> CLAUDE.md. If you correct "use X not Y" once, fine. If you correct it again next session, the lesson didn't persist — because you are the memory, not the harness. Write it down once and the model gets it free forever.
- Ten minutes of back-and-forth -> skill. Long thrash usually means you're hand-walking the model through a procedure: "first check the logs, then run this, then look for that pattern...". That procedure is a skill. Encode it once, invoke it with one phrase next time.
- Every correction is a signal. This is the general form of the two rules above. Before you type the correction, ask: why was this needed? Missing context? -> CLAUDE.md. Missing capability? -> skill or hook. Missing permission? -> allowlist. The correction itself is the symptom; the harness gap is the cause.
Plan mode for anything non-trivial. Let the model think before it writes. Plan mode separates "design the change" from "make the change" and catches misunderstandings before they hit code. For one-line fixes it's overhead; for anything multi-file it's the cheapest insurance you can buy.
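
The note on skills versus hooks says formatting belongs in a hook. Here is a rough sketch of a post-edit formatting hook. It assumes the harness pipes a JSON payload to the script on stdin with the edited path under `tool_input.file_path`, which is how I understand Claude Code's hook input; check the hook documentation for your version. `FORMATTERS`, `formatter_command`, and `run_hook` are hypothetical names, and the formatter choices are examples.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import List, Optional

# Hypothetical mapping from file suffix to formatter command; adjust per project.
FORMATTERS = {
    ".py": ["ruff", "format"],
    ".cpp": ["clang-format", "-i"],
    ".h": ["clang-format", "-i"],
}


def formatter_command(payload: dict) -> Optional[List[str]]:
    """Return the formatter command for the edited file, or None to skip it."""
    # Assumed payload shape: {"tool_input": {"file_path": "..."}}
    file_path = (payload.get("tool_input") or {}).get("file_path")
    if not file_path:
        return None
    base = FORMATTERS.get(Path(file_path).suffix)
    return [*base, file_path] if base else None


def run_hook(raw: str) -> int:
    """Parse the hook payload and run the matching formatter, if there is one."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return 0  # no usable payload: do nothing rather than interfere with the edit
    if not isinstance(payload, dict):
        return 0
    cmd = formatter_command(payload)
    if cmd is None:
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__" and not sys.stdin.isatty():
    run_hook(sys.stdin.read())
```

Keeping the decision in a pure function (`formatter_command`) makes the hook itself testable, and that matters because it enforces an invariant on every edit.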

📸 Project Screenshot Gallery

Unfortunately, many of the projects I have worked on are proprietary and cannot be shared publicly. Much of my code was written for embedded devices, and most of the rest for desktops. The applications with GUIs are designed so that the real-time code running on the embedded device (or desktop) is controlled through IPC and a small embedded web server. The screenshots below show some of the applications I have developed in the last few years. They illustrate the user interfaces and features but do not capture the full complexity and functionality of the applications. If you would like to know more about my work or have questions about specific projects, please reach out to me directly; I am happy to discuss my experience and the technologies I have used in more detail.

Featured Projects

🎯 SpeechFit - Audiology Research Platform

An application for conducting hearing research studies and managing participant sessions. Features include real-time data visualization, researcher dashboards, session management, and advanced acoustic analysis tools.

Tech Stack: C++, Python, FastAPI, Vue.js, PostgreSQL, NiceGUI

Screenshots: Session Management Interface, Researcher Dashboard, HO2 Test Results Visualization, Testing View

🧪 LTest - Hearing Testing Suite

A hearing assessment tool providing comprehensive audiological testing capabilities. Includes automated test protocols, real-time threshold tracking, and detailed result reporting for clinical and research applications.

Tech Stack: C++, Python, NumPy, SciPy, Nuitka

Screenshots: Audiologist View, Users View

🗣️ QuickSIN

Quick Speech-in-Noise assessment tool for evaluating hearing performance in noisy environments. Clinical-grade testing application.

Additional Tools & Applications

- 🔊 WDRC Tool: Wide Dynamic Range Compression configuration, test, and verification tool for hearing aid algorithms. Allows comparison of different implementations.
- 📊 PCDTool: CLI tool for connecting to our embedded devices over USB or Bluetooth.
- 🎚️ CFScope: Allows detailed measurement and comparison of processed audio signals. Useful for evaluating audio processing algorithms and hearing aid performance.
- 📡 LScope: Similar to CFScope, LScope provides detailed measurement and comparison of processed audio signals, focusing on different aspects of audio analysis.
- 🎵 MUSHRA: Implementation of the MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor) listening test, a standard tool for subjective audio quality assessment in hearing research.
- ⚡ BenchTest: Measures the SNR and THD of an audio signal when processed by different algorithms.
- 🎼 Freping: Frequency warping tool. Demonstrates the effects of frequency warping on audio signals.