Add 24-bit integer and 32-bit floating-point PCM .wav input/output support#56

Closed
mvdirty wants to merge 11 commits into ServeurpersoCom:master from mvdirty:feat/24-and-32-bit-wav-output

Conversation

@mvdirty
Contributor

@mvdirty mvdirty commented Apr 13, 2026

This PR adds 24-bit integer and 32-bit floating-point PCM .wav input/output support.

Input support is added to all .wav-file input execution paths.

Output support is added to ace-server, ace-synth, mp3-codec, and neural-codec.

When outputting 32-bit floating-point .wav audio, normalization and peak clipping are automatically disabled in output execution paths which normally include normalization and peak clipping. Currently, these:

  • include ace-server, ace-synth, and neural-codec
  • exclude mp3-codec, because it currently performs no normalization or peak clipping for any output format.
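
The rule described above can be sketched as follows. This is a minimal illustration, not the PR's code: `SampleFormat`, `wants_normalization` and `prepare_for_output` are hypothetical names, and the choice of plain peak normalization to 1.0 is an assumption about what "normalization" means here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the PR's format enum; the names are illustrative.
enum class SampleFormat { S16, S24, F32 };

// Integer PCM cannot store values outside [-1, 1], so it needs level control.
// Float output can carry the full range and is passed through untouched.
inline bool wants_normalization(SampleFormat fmt) {
    return fmt != SampleFormat::F32;
}

// Assumption: "normalization" is plain peak normalization to 1.0, followed by
// a hard clip as a safety net. The real code may use a different target.
inline void prepare_for_output(std::vector<float> & samples, SampleFormat fmt) {
    if (!wants_normalization(fmt)) {
        return;
    }
    float peak = 0.0f;
    for (float s : samples) {
        peak = std::max(peak, std::fabs(s));
    }
    const float gain = peak > 0.0f ? 1.0f / peak : 1.0f;
    for (float & s : samples) {
        s = std::max(-1.0f, std::min(1.0f, s * gain));
    }
}
```

With this sketch, a buffer holding `{0.5, -2.0}` is scaled to `{0.25, -1.0}` for the integer formats and left as `{0.5, -2.0}` for 32-bit float.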

Testing has been performed primarily via ace-server, with input/output tests using all supported input and output formats. Secondary testing was performed via ace-understand and ace-synth, using various scripts in the examples folder which were edited to apply --wav and --wav-format appropriately.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --output <format> option to ace-synth supporting mp3, wav, wav16, wav24, and wav32 output formats.
    • Added --wav-format option to neural-codec and mp3-codec for WAV output format selection.
    • 32-bit float WAV format now disables normalization and peak clipping.
  • Documentation

    • Updated CLI documentation for new output and WAV format options.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 13, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f2149e67-1622-4eed-93b0-9ce358a2172b

📥 Commits

Reviewing files that changed from the base of the PR and between 29c0eb2 and 5c234fe.

📒 Files selected for processing (8)
  • docs/ARCHITECTURE.md
  • examples/lego.sh
  • src/audio-io.h
  • src/wav.h
  • tools/ace-server.cpp
  • tools/ace-synth.cpp
  • tools/mp3-codec.cpp
  • tools/neural-codec.cpp
📝 Walkthrough

This pull request introduces a generalized WAV format selection mechanism across the codebase. New enums and APIs in audio-io.h support multiple WAV formats (S16, S24, IEEE F32), replacing single-format encoding. WAV reading in wav.h is enhanced to handle extensible formats. CLI tools expose --output and --wav-format options. Documentation and examples are updated accordingly.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Documentation | docs/ARCHITECTURE.md | Updated CLI documentation for ace-synth, ace-understand, neural-codec, and mp3-codec to reflect new --output and --wav-format options supporting multiple WAV formats; wav32 disables normalization and peak clipping. |
| Core Audio I/O Library | src/audio-io.h | Added enums WavFormat, AudioFileKind, and AudioFileFormat to represent file and sample formats. Replaced single-format audio_encode_wav() with format-dispatching implementation supporting S16, S24 (via extensible header), and IEEE F32 with appropriate sanitization and clamping. Updated audio_write_wav() and audio_write() signatures to accept format parameters; normalization now depends on format selection. |
| WAV Format Handling | src/wav.h | Added little-endian primitive readers for uint16, uint32, signed 24-bit, and IEEE-754 f32le. Enhanced read_wav_buf() to parse extensible WAV format headers, support 24-bit PCM decoding, and properly handle RIFF chunk padding; replaced pointer casts with typed LE readers for safer sample access. |
| Audio Tool Integration | tools/ace-synth.cpp, tools/mp3-codec.cpp, tools/neural-codec.cpp | Added --output (ace-synth) or --wav-format (mp3/neural-codec) CLI parsing with validation; tools now pass selected format to audio writing functions. ace-synth replaces boolean output_wav with typed AudioFileFormat; mp3/neural-codec constrain wav-format selection and validate output paths accordingly. |
| Server Audio Processing | tools/ace-server.cpp | Changed audio post-processing to perform normalization conditionally based on selected AudioFileFormat via should_normalize_audio() instead of always normalizing; WAV encoding now passes explicit format to audio_encode_wav(). |
| Examples | examples/lego.sh | Updated ace-synth invocations to use --output wav instead of --wav flag. |
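
To make the per-format encoding step concrete, here is a rough sketch of how one float sample might be turned into little-endian bytes for each of the three formats. The helper names (`clamp_unit`, `put_le`, `put_s16le`, `put_s24le`, `put_f32le`) and the scale factors 32767 and 8388607 are assumptions for illustration; the PR's actual quantization math is not shown in this thread. NaN/Inf sanitization is left out here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Limit a sample to the range that integer PCM can represent.
inline float clamp_unit(float x) {
    return std::max(-1.0f, std::min(1.0f, x));
}

// Append the low n_bytes bytes of u, least significant byte first.
inline void put_le(std::vector<uint8_t> & out, uint32_t u, int n_bytes) {
    for (int i = 0; i < n_bytes; i++) {
        out.push_back((uint8_t) ((u >> (8 * i)) & 0xffu));
    }
}

// 16-bit PCM: clamp, scale, round to nearest, emit 2 bytes.
inline void put_s16le(std::vector<uint8_t> & out, float x) {
    const long v = std::lrint(clamp_unit(x) * 32767.0f);
    put_le(out, (uint32_t) v, 2);
}

// 24-bit PCM: same idea with a 24-bit scale, emit 3 bytes.
inline void put_s24le(std::vector<uint8_t> & out, float x) {
    const long v = std::lrint(clamp_unit(x) * 8388607.0f);
    put_le(out, (uint32_t) v, 3);
}

// 32-bit float: no clamping, copy the IEEE-754 bit pattern byte by byte.
inline void put_f32le(std::vector<uint8_t> & out, float x) {
    static_assert(sizeof(float) == 4, "assumes 32-bit IEEE-754 float");
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    put_le(out, bits, 4);
}
```

Writing bytes through shifts rather than casting a sample pointer keeps the output identical on big-endian and little-endian hosts, which is the point of the endian-safe helpers mentioned in the summary.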

Sequence Diagram

sequenceDiagram
    participant CLI as CLI Parser
    participant Validator as Format Validator
    participant Writer as Audio Writer
    participant Encoder as WAV Encoder
    participant File as Output File

    CLI->>Validator: Parse --output or --wav-format
    Validator->>Validator: Validate format string
    alt Format Valid
        Validator-->>CLI: Return AudioFileFormat/WavFormat
        CLI->>Writer: Call audio_write() with format
        Writer->>Writer: Check should_normalize_audio()
        alt Normalization Required
            Writer->>Writer: Normalize audio
        end
        Writer->>Encoder: Call audio_encode_wav(format)
        Encoder->>Encoder: Dispatch on format (S16/S24/F32)
        alt Format == S16 or S24
            Encoder->>Encoder: Clamp [-1,1]
            Encoder->>Encoder: Quantize to PCM
        else Format == F32
            Encoder->>Encoder: Sanitize NaN/Inf
            Encoder->>Encoder: Emit IEEE-754
        end
        Encoder->>File: Write WAV with format header
    else Format Invalid
        Validator-->>CLI: Error
        CLI->>CLI: Exit with status 1
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A writer speaks:
WAVs now bloom in formats three—
Sixteen, twenty-four, and float!
No more single paths to glee,
Each format now can freely float.
Extensible headers sing,
As audio formats spread their wing! 🎵

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.16% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: adding 24-bit and 32-bit floating-point PCM WAV input/output support, which is the core feature implemented across multiple files and tools. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/audio-io.h`:
- Around line 591-623: The two header-defined functions
parse_optional_wav_format and should_normalize_wav_audio are non-inline
definitions included in multiple translation units causing ODR violations; mark
both functions as inline in the header (i.e., add the inline specifier to the
declarations/definitions of parse_optional_wav_format(const char*, WavFormat&)
and should_normalize_wav_audio(WavFormat)) so the linker will allow identical
definitions across TUs.
- Around line 382-400: The functions wav_clamp1, wav_quantize_s16,
wav_quantize_s24 and wav_write_f32le must first coerce NaN and infinities to
0.0f to match the documented contract; add a small sanitization helper (e.g.
float sanitize_f32(float x)) that returns 0.0f for !std::isfinite(x) and
otherwise returns x, then call it from wav_clamp1 (or make wav_clamp1 call
sanitize_f32 internally) so NaN/Inf are not left unchanged, call sanitize_f32
before quantization in wav_quantize_s16 and wav_quantize_s24 to avoid undefined
conversion, and call sanitize_f32 in wav_write_f32le before memcpy so no raw
NaN/Inf bits are written. Ensure you include <cmath> and use std::isfinite for
the check.
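
A minimal sketch of the two audio-io.h suggestions above, combined. `sanitize_f32` and `wav_clamp1` are the names used in the review text; their bodies here are assumptions, since the real definitions are not quoted in this thread.

```cpp
#include <algorithm>
#include <cmath>

// `inline` allows the same definition to appear in every translation unit
// that includes the header without violating the one-definition rule.
inline float sanitize_f32(float x) {
    // NaN and +/-Inf are replaced by silence instead of being passed on.
    return std::isfinite(x) ? x : 0.0f;
}

// Sanitize first, then clamp, so a NaN never reaches the integer conversion.
inline float wav_clamp1(float x) {
    x = sanitize_f32(x);
    return std::max(-1.0f, std::min(1.0f, x));
}
```

Sanitizing before the clamp matters because comparisons against NaN are always false, so a clamp built from `std::min`/`std::max` alone does not reliably remove it.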

In `@src/wav.h`:
- Around line 107-143: The code only handles 24-bit as extensible PCM and 32-bit
float only as legacy tag 3; update the conditional checks to also accept classic
PCM24 (handle when audio_format == 1 && bits_per_sample == 24 using the existing
PCM24 branch logic) and extensible float32 (handle when audio_format == 0xfffe
&& extensible_subformat == 3 && bits_per_sample == 32 using the existing float32
branch logic). Modify the if/else-if conditions around the PCM24 block
(currently testing audio_format == 0xfffe && bits_per_sample == 24 &&
extensible_subformat == 1) and the float32 block (currently audio_format == 3 &&
bits_per_sample == 32) to include the alternative combinations, and reuse
wav_read_s24le, wav_read_f32le, n_samples, n_channels, data, pos, data_bytes and
audio allocation/assignment code so pointer arithmetic and malloc size remain
identical.
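
One way to cover all of these combinations is to reduce the extensible case to its underlying format code before branching. The sketch below is an assumption about how that could look; `SampleKind` and `classify_wav` are hypothetical names, and it also accepts extensible 16-bit PCM, which the review comment does not explicitly ask for.

```cpp
#include <cstdint>

enum class SampleKind { Unsupported, PcmS16, PcmS24, FloatF32 };

// fmt_tag:   format tag from the fmt chunk
//            (1 = PCM, 3 = IEEE float, 0xFFFE = WAVE_FORMAT_EXTENSIBLE).
// bits:      bits per sample.
// subformat: format code taken from the SubFormat GUID; only meaningful
//            when fmt_tag is 0xFFFE.
inline SampleKind classify_wav(uint16_t fmt_tag, uint16_t bits, uint16_t subformat) {
    // For extensible files the real sample type lives in the subformat.
    const uint16_t effective = (fmt_tag == 0xFFFE) ? subformat : fmt_tag;

    if (effective == 1 && bits == 16) return SampleKind::PcmS16;
    if (effective == 1 && bits == 24) return SampleKind::PcmS24;
    if (effective == 3 && bits == 32) return SampleKind::FloatF32;
    return SampleKind::Unsupported;
}
```

Each decode branch can then switch on the returned kind, so classic PCM24 and extensible float32 share the existing PCM24 and float32 decode loops.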

In `@tools/neural-codec.cpp`:
- Around line 319-321: Update the help string that documents the "--wav-format"
option in tools/neural-codec.cpp to explicitly state that "--wav-format" is only
applicable when using "--decode" (i.e., it controls the WAV output format for
decode mode only), so the CLI help matches the runtime validation performed for
"--wav-format" (which rejects the option unless "--decode" is set). Reference
the "--wav-format" and "--decode" flags when editing the usage text so users see
the restriction up front.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8903395c-c1c7-4119-b91b-a9a8a01ec678

📥 Commits

Reviewing files that changed from the base of the PR and between 331d9b0 and d3db1c2.

📒 Files selected for processing (8)
  • README.md
  • docs/ARCHITECTURE.md
  • src/audio-io.h
  • src/wav.h
  • tools/ace-server.cpp
  • tools/ace-synth.cpp
  • tools/mp3-codec.cpp
  • tools/neural-codec.cpp

Comment thread src/audio-io.h Outdated
Comment thread src/audio-io.h Outdated
Comment thread src/wav.h
Comment thread tools/neural-codec.cpp Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/audio-io.h (1)

616-628: Consider adding an unreachable hint to silence compiler warnings.

Some compilers may warn that control reaches end of non-void function even though all enum cases are covered and std::terminate() is called. Adding a compiler-specific unreachable hint or restructuring with a default case could silence these warnings, though this is a minor style consideration.

Optional: Add unreachable annotation
 inline bool should_normalize_wav_audio(WavFormat wav_format) {
     switch (wav_format) {
         case WAV_FORMAT_PCM_S16:
             return true;
         case WAV_FORMAT_PCM_S24:
             return true;
         case WAV_FORMAT_IEEE_F32:
             return false;
+        default:
+            break;
     }
 
     assert(false && "unsupported wav_format");
     std::terminate();
+#if defined(__GNUC__) || defined(__clang__)
+    __builtin_unreachable();
+#elif defined(_MSC_VER)
+    __assume(0);
+#endif
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/audio-io.h` around lines 616 - 628, The switch in
should_normalize_wav_audio covers all WavFormat cases but some compilers still
warn about control reaching end of a non-void function; update the function to
provide an explicit unreachable hint after the assert/terminate (or use a
default case that asserts) to silence warnings. Locate
should_normalize_wav_audio and either add a default: path that calls
assert(false && "unsupported wav_format") and then a platform hint such as
__builtin_unreachable() (or ::_assume(false) on MSVC) or, after the existing
std::terminate(), append a compiler unreachable intrinsic so the compiler knows
the code path cannot return; keep the existing enum case logic for
WAV_FORMAT_PCM_S16, WAV_FORMAT_PCM_S24, and WAV_FORMAT_IEEE_F32.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2a51e17a-2cb3-48af-af1b-8e3b73f7a48b

📥 Commits

Reviewing files that changed from the base of the PR and between d3db1c2 and 89f36b6.

📒 Files selected for processing (5)
  • docs/ARCHITECTURE.md
  • src/audio-io.h
  • tools/ace-synth.cpp
  • tools/mp3-codec.cpp
  • tools/neural-codec.cpp
✅ Files skipped from review due to trivial changes (2)
  • docs/ARCHITECTURE.md
  • tools/mp3-codec.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • tools/ace-synth.cpp

@ServeurpersoCom
Owner

Hey, nice work on the audio encoding side. The endian-safe helpers, WAVE_FORMAT_EXTENSIBLE for 24-bit, the RIFF padding fix, wav_sanitize/clamp, and the quantization math are all solid. I want to merge the core audio code.

However I'd like to simplify the format selection architecture. Instead of --wav + --wav-format as two separate flags with cross-validation in every binary, I'm thinking of a single --output flag:

  --output <format>   Output format (default: mp3)
                      mp3, wav, pcm16, pcm24, fp32
                      wav is an alias for pcm16

This removes all the "ERROR: --wav-format requires usage of --wav" validation, and on the server side it becomes a per-request URL parameter ?output=fp32 instead of a global --wav-format flag, which is better because different clients may want different formats.

Could you rebase on current master and rework the CLI/server plumbing around this? The audio-io.h and wav.h core is good to go as is, it's just the argument parsing and server integration that needs the redesign.
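
As a sketch of the single-flag idea, the value names proposed in this comment could map to one enum in one place, so a CLI `--output` argument and a server `?output=` parameter would share the same parser. `OutputFormat` and `parse_output_format` are hypothetical names, not identifiers from the repository.

```cpp
#include <cstring>

// Hypothetical enum covering both the container and the WAV sample format.
enum class OutputFormat { Mp3, Pcm16, Pcm24, Fp32 };

// Returns true and sets `out` when `s` is a known format name.
// "wav" is treated as an alias for "pcm16", as proposed above.
inline bool parse_output_format(const char * s, OutputFormat & out) {
    if (!s) {
        return false;
    }
    if (std::strcmp(s, "mp3") == 0) {
        out = OutputFormat::Mp3;
        return true;
    }
    if (std::strcmp(s, "wav") == 0 || std::strcmp(s, "pcm16") == 0) {
        out = OutputFormat::Pcm16;
        return true;
    }
    if (std::strcmp(s, "pcm24") == 0) {
        out = OutputFormat::Pcm24;
        return true;
    }
    if (std::strcmp(s, "fp32") == 0) {
        out = OutputFormat::Fp32;
        return true;
    }
    return false;
}
```

Because a single value selects everything, there is no second flag to cross-validate, which is what removes the "requires usage of --wav" error paths.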

@mvdirty mvdirty force-pushed the feat/24-and-32-bit-wav-output branch from d3a213a to 0c2e497 Compare April 15, 2026 17:03
@mvdirty
Contributor Author

mvdirty commented Apr 15, 2026

> However I'd like to simplify the format selection architecture. Instead of --wav + --wav-format as two separate flags with cross-validation in every binary, I'm thinking of a single --output flag
>
> ...
>
> Could you rebase on current master and rework the CLI/server plumbing around this? The audio-io.h and wav.h core is good to go as is, it's just the argument parsing and server integration that needs the redesign.

I have:

  • Added new enum types. One describes all possible audio file formats. Another discriminates between mp3 and wav kinds. These are supported by various conversion functions.
  • Aligned ace-synth on --output, with --wav continuing to exist as an alias for --output wav
  • Renamed the format options so they read better, and sort better. We now have mp3, wav (alias for wav16), wav16, wav24, and wav32.
  • Removed the --wav-format CLI option from ace-server, leaving some clear TODOs in three places for you to address. (Search for TODO Serveurperso)
  • Left neural-codec and mp3-codec as you last saw them, because they would require deeper CLI changes I don't feel comfortable making arbitrarily. It would be best for either a) you to update them to your preferences or b) you to specify exactly what you want for those and I can attempt the changes myself.
  • Rebased on latest master (which only had changes in entirely-unrelated areas)

I have yet to:

  • Finish testing ace-synth's new --output CLI arg support, but I'm going to do that right now. I doubt I will encounter any issues, but I will push any needed fixes and let you know if there were any.

@mvdirty
Contributor Author

mvdirty commented Apr 15, 2026

Looks like I've made a couple of dumb errors (one in code, and another that is isolated to my local ace-synth test script setup). I have fixed the code and my scripts and am re-testing now.

@mvdirty mvdirty marked this pull request as draft April 15, 2026 17:32
@ServeurpersoCom
Owner

Since MP3 is the default and most widely used format, there's no need to ensure backward compatibility with the --wav option. Avoiding anything related to legacy/fallback or backward compatibility allows for clean code while the project is still in its early stages :)

@ServeurpersoCom
Owner

ServeurpersoCom commented Apr 15, 2026

You can also remove the long format descriptions from the binaries' help output to keep the original short style (no indentation in help). The type names you used are clean and self-explanatory. We'll add the long descriptions to the documentation :) Afterwards, I'll do a review with notes in the code, and it will be ready to merge. It's truly clean WAV format support :)

@mvdirty
Contributor Author

mvdirty commented Apr 15, 2026

Testing of the updates is now complete, via both ace-synth and ace-server, by exercising all supported output formats. The PR is again ready for review.

@mvdirty mvdirty marked this pull request as ready for review April 15, 2026 18:23
@mvdirty mvdirty force-pushed the feat/24-and-32-bit-wav-output branch from 29c0eb2 to 677dd00 Compare April 15, 2026 18:27
@mvdirty
Contributor Author

mvdirty commented Apr 15, 2026

Addressed latest conflict and force-pushed.

@mvdirty
Contributor Author

mvdirty commented Apr 15, 2026

I will now address the removal of --wav and the usage-level details (while leaving them in the relevant markdown files)

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tools/ace-synth.cpp (1)

106-110: Consider dropping the legacy --wav alias.

--output wav already covers this case, and keeping --wav leaves a second parser path plus an extra precedence rule to maintain. Since backward compatibility was explicitly called out as unnecessary here, removing the alias would simplify both the parser and the help text.

Also applies to: 137-141

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/ace-synth.cpp` around lines 106 - 110, The command-line parser in
tools/ace-synth.cpp contains a legacy alias branch that handles "--wav" (seen in
the argv parsing block comparing argv[i] to "--wav") which duplicates
functionality already covered by "--output wav"; remove the entire conditional
branch that checks for "--wav" (and the related duplicate handling around lines
handling the same flag at 137-141) so parsing always uses the "--output" path,
and update any help/usage text to drop the "--wav" alias reference; keep
handling of audio_file_format via AUDIO_FILE_FORMAT_WAV_S16 only when parsing
"--output" values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/wav.h`:
- Around line 89-110: The malloc() result for the audio buffer is not checked,
so if allocation fails subsequent writes to audio (in the PCM16, PCM24 and
float32 branches where audio is assigned) will dereference NULL; after each
allocation (the assignments to audio in the branches handling bits_per_sample ==
16, == 24 and float32), check that audio != NULL and on failure perform the
function's existing error handling/cleanup path (e.g., free any allocated state,
set n_samples to 0 or an error code, and return or propagate the error) to avoid
a crash; update the branches that set n_samples and audio (the blocks using
audio = (float *) malloc(...), p = data + pos, and the loops that write audio[])
to bail out if malloc returned NULL.

In `@tools/neural-codec.cpp`:
- Around line 397-405: The decode path currently allows MP3 output because
audio_write() selects MP3 when output_path ends with ".mp3", so --decode (-mode
== 1) with -o out.mp3 will produce MP3 and ignore --wav-format; to fix, enforce
WAV-only output in decode mode by validating/normalizing output_path (or
rejecting non-.wav values) when mode == 1: update the checks around mode,
wav_format_str, parse_wav_format and the code that sets/uses output_path before
calling audio_write() (referencing mode, wav_format_str, parse_wav_format,
output_path, and audio_write) so that decode mode either forces a .wav extension
or errors out for non-WAV outputs, and mirror the same change at the other
occurrence noted (lines ~497-498).
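
The "error out for non-WAV outputs" option could be sketched like this. `has_wav_extension` and `decode_output_ok` are hypothetical helpers, and treating `mode == 1` as decode mode follows the description in the comment above rather than the actual source.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// True when `path` ends in ".wav", compared case-insensitively.
inline bool has_wav_extension(const std::string & path) {
    static const char ext[] = ".wav";
    const std::size_t n = 4;
    if (path.size() < n) {
        return false;
    }
    for (std::size_t i = 0; i < n; i++) {
        const unsigned char c = (unsigned char) path[path.size() - n + i];
        if (std::tolower(c) != ext[i]) {
            return false;
        }
    }
    return true;
}

// Assumption: mode == 1 means --decode. In that mode only .wav is accepted,
// so the WAV format option can never be silently ignored.
inline bool decode_output_ok(int mode, const std::string & output_path) {
    return mode != 1 || has_wav_extension(output_path);
}
```

A caller would run this check before `audio_write()` and print an error and exit when it fails, instead of letting the file extension pick MP3.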

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 79f32bb5-b8ef-489a-936a-9e60f034d58e

📥 Commits

Reviewing files that changed from the base of the PR and between 89f36b6 and 29c0eb2.

📒 Files selected for processing (8)
  • docs/ARCHITECTURE.md
  • examples/lego.sh
  • src/audio-io.h
  • src/wav.h
  • tools/ace-server.cpp
  • tools/ace-synth.cpp
  • tools/mp3-codec.cpp
  • tools/neural-codec.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/ARCHITECTURE.md
  • tools/ace-server.cpp

Comment thread src/wav.h
Comment on lines 89 to +110
                 n_samples = (int) (data_bytes / ((size_t) n_channels * 2));
                 audio = (float *) malloc((size_t) n_samples * 2 * sizeof(float));
-                const short * pcm = (const short *) (data + pos);
+                const uint8_t * p = data + pos;

                 for (int t = 0; t < n_samples; t++) {
                     if (n_channels == 1) {
-                        float s = (float) pcm[t] / 32768.0f;
-                        audio[t * 2 + 0] = s;
-                        audio[t * 2 + 1] = s;
+                        int16_t s = (int16_t) wav_read_u16le(p + t * 2);
+                        float f = (float) s / 32768.0f;
+                        audio[t * 2 + 0] = f;
+                        audio[t * 2 + 1] = f;
                     } else {
-                        audio[t * 2 + 0] = (float) pcm[t * n_channels + 0] / 32768.0f;
-                        audio[t * 2 + 1] = (float) pcm[t * n_channels + 1] / 32768.0f;
+                        const uint8_t * frame = p + (size_t) t * n_channels * 2;
+                        int16_t l = (int16_t) wav_read_u16le(frame + 0);
+                        int16_t r = (int16_t) wav_read_u16le(frame + 2);
+                        audio[t * 2 + 0] = (float) l / 32768.0f;
+                        audio[t * 2 + 1] = (float) r / 32768.0f;
                     }
                 }
+            } else if (audio_format == 0xfffe && bits_per_sample == 24 && extensible_subformat == 1) {
+                n_samples = (int) (data_bytes / ((size_t) n_channels * 3));
+                audio = (float *) malloc((size_t) n_samples * 2 * sizeof(float));
+                const uint8_t * p = data + pos;

⚠️ Potential issue | 🟠 Major

Check malloc() before the first sample write.

Each decode branch stores into audio immediately after allocation. If malloc() fails, the first write at Line 97 / Line 116 / Line 134 dereferences NULL and turns a recoverable read failure into a crash.

Suggested guard
                 n_samples         = (int) (data_bytes / ((size_t) n_channels * 2));
                 audio             = (float *) malloc((size_t) n_samples * 2 * sizeof(float));
+                if (!audio) {
+                    fprintf(stderr, "[WAV] Out of memory while decoding\n");
+                    return NULL;
+                }
                 const uint8_t * p = data + pos;

Apply the same check after the PCM24 and float32 allocations as well.

Also applies to: 127-129
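
For the PCM24 branch, the same guard might look roughly like the sketch below. It is a standalone illustration, not the code in wav.h: `read_s24le` stands in for the `wav_read_s24le` helper named in the review (whose body is not shown here), `decode_s24_to_stereo` is a hypothetical function, and the 8388608 divisor is an assumption chosen to mirror the 32768 used for 16-bit samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Read a signed 24-bit little-endian integer from three bytes.
inline int32_t read_s24le(const uint8_t * p) {
    const uint32_t u = (uint32_t) p[0] | ((uint32_t) p[1] << 8) | ((uint32_t) p[2] << 16);
    // Values with bit 23 set are negative: subtract 2^24 to sign-extend.
    return (u & 0x800000u) ? (int32_t) u - 0x1000000 : (int32_t) u;
}

// Decode interleaved 24-bit PCM into a malloc'd stereo float buffer.
// Returns NULL on bad input or allocation failure; the caller frees the result.
inline float * decode_s24_to_stereo(const uint8_t * data, size_t data_bytes,
                                    int n_channels, int * n_frames_out) {
    if (!data || n_channels < 1) {
        return NULL;
    }
    const size_t n_frames = data_bytes / ((size_t) n_channels * 3);
    if (n_frames == 0) {
        return NULL;
    }
    float * audio = (float *) std::malloc(n_frames * 2 * sizeof(float));
    if (!audio) {
        return NULL;  // the guard: bail out before the first sample write
    }
    for (size_t t = 0; t < n_frames; t++) {
        const uint8_t * frame = data + t * (size_t) n_channels * 3;
        const float     l     = (float) read_s24le(frame) / 8388608.0f;
        // Mono input is duplicated to both output channels.
        const float     r     = n_channels == 1 ? l : (float) read_s24le(frame + 3) / 8388608.0f;
        audio[t * 2 + 0] = l;
        audio[t * 2 + 1] = r;
    }
    *n_frames_out = (int) n_frames;
    return audio;
}
```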

Comment thread tools/neural-codec.cpp
@mvdirty
Contributor Author

mvdirty commented Apr 15, 2026

I have addressed the feedback provided by @ServeurpersoCom

ServeurpersoCom added a commit that referenced this pull request Apr 15, 2026
Based on core audio work from PR #56.

wav.h: endian-safe reader with 24-bit WAVE_FORMAT_EXTENSIBLE support,
odd-chunk padding, byte-level reads instead of memcpy+cast.

audio-io.h: three WAV encoders (s16, s24, IEEE f32) with endian-safe
byte writes, NaN/Inf sanitization, and per-format clamping. Single
WavFormat enum, all public APIs default to WAV_S16 for backwards
compatibility. WAV_F32 skips normalization to preserve full range.

No CLI changes yet (--format plumbing is next).

Co-authored-by: mvdirty <[email protected]>
@ServeurpersoCom
Owner

I redid the piping and the nitpicks myself. It's now merged.

commit 3a2a7d3bd567d713f7e6c525ea328620d9c7da22 (HEAD -> master, origin/master)
Author: Pascal <[email protected]>
Date:   Wed Apr 15 21:40:16 2026 +0200

    cli: add --format option for WAV output format selection

    Replace ace-synth --wav with --format (mp3, wav16, wav24, wav32).
    Add --format to mp3-codec and neural-codec for WAV bit depth control.

    ace-synth: --format controls both container (.mp3/.wav) and WAV subformat.
      Default: mp3. wav32 (IEEE float) skips normalization to preserve full range.

    mp3-codec: --format selects WAV bit depth when output is .wav.
    neural-codec: --format selects WAV bit depth for --decode output.

commit 0ffdad16785e3b9070e98260dbad0e295f402ade
Author: Pascal <[email protected]>
Date:   Wed Apr 15 21:10:49 2026 +0200

    audio: add WAV 24-bit and 32-bit float encoding/decoding

    Based on core audio work from PR #56.

    wav.h: endian-safe reader with 24-bit WAVE_FORMAT_EXTENSIBLE support,
    odd-chunk padding, byte-level reads instead of memcpy+cast.

    audio-io.h: three WAV encoders (s16, s24, IEEE f32) with endian-safe
    byte writes, NaN/Inf sanitization, and per-format clamping. Single
    WavFormat enum, all public APIs default to WAV_S16 for backwards
    compatibility. WAV_F32 skips normalization to preserve full range.

    No CLI changes yet (--format plumbing is next).

    Co-authored-by: mvdirty <[email protected]>
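
The "odd-chunk padding" and "byte-level reads" mentioned in the commit message can be illustrated with a small chunk walker. This is a sketch under assumptions, not the code in wav.h: `read_u32le` and `find_riff_chunk` are hypothetical names. The padding rule itself comes from the RIFF format, where a chunk with an odd payload size is followed by one pad byte that is not counted in its size field.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assemble a 32-bit little-endian value from individual bytes, so the result
// does not depend on host endianness or pointer alignment.
inline uint32_t read_u32le(const uint8_t * p) {
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

// Locate chunk `id` (four characters) inside a RIFF/WAVE buffer.
// On success, sets the payload offset and size and returns true.
inline bool find_riff_chunk(const uint8_t * buf, size_t len, const char * id,
                            size_t & payload_pos, uint32_t & payload_size) {
    if (len < 12 || std::memcmp(buf, "RIFF", 4) != 0 || std::memcmp(buf + 8, "WAVE", 4) != 0) {
        return false;
    }
    size_t pos = 12;  // first chunk header follows "RIFF" <size> "WAVE"
    while (pos + 8 <= len) {
        const uint32_t size = read_u32le(buf + pos + 4);
        if (std::memcmp(buf + pos, id, 4) == 0) {
            if (size > len - (pos + 8)) {
                return false;  // payload runs past the end of the buffer
            }
            payload_pos  = pos + 8;
            payload_size = size;
            return true;
        }
        // Skip header + payload, plus one pad byte when the payload size is odd.
        const uint64_t advance = 8ull + size + (size & 1u);
        if (advance > (uint64_t) (len - pos)) {
            return false;
        }
        pos += (size_t) advance;
    }
    return false;
}
```

Without the `(size & 1u)` term, a file with an odd-sized chunk ahead of `data` would be parsed one byte off from that point on and the `data` chunk would not be found.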
