Audio Diff

An implementation of audio comparison using mel-spectrograms and the Longest Common Subsequence (LCS) algorithm, demonstrated through three modification cases.

Algorithm Overview

Core Components

Feature Extraction
- Mel-spectrogram conversion
- Parameters:
  - n_mels: 128 (frequency bins)
  - hop_length: 512 (frame shift)
  - win_length: 2048 (frame size)
  - Sample rate: 16kHz
Distance Calculation
- Frame-to-frame Euclidean distance
- Distance matrix computation
- Adaptive thresholding
Sequence Alignment
- LCS algorithm implementation
- Dynamic programming approach
- Matching path identification

Demo Cases

1. Word Replacement [checkout: demo_replacement.ipynb]

Purpose: Demonstrate word substitution while maintaining context

Implementation:

Original: "the big cat"
Modified: "the small cat"
Process:
1. Generate audio segments
2. Add 200ms silence between words
3. Compare sequences
4. Identify replaced segment

Expected Results:

Matching segments at start/end
Different segment in middle
Preserved timing structure

!python create_demo_replacement.py

2. Word Insertion [checkout: demo_insertion.ipynb]

Purpose: Demonstrate word addition between existing words

Implementation:

Original: "the cat"
Modified: "the big cat"
Process:
1. Generate base audio
2. Insert new word
3. Add silence buffers
4. Analyze timeline expansion

Expected Results:

Matching segments before/after insertion
New segment identified
Timeline shift detection

!python create_demo_insertion.py

3. Word Deletion [checkout: demo_deletion.ipynb]

Purpose: Demonstrate word removal from sequence

Implementation:

Original: "the big cat"
Modified: "the cat"
Process:
1. Generate full sequence
2. Remove middle word
3. Maintain silence buffers
4. Analyze timeline compression

Expected Results:

Matching segments preserved
Deleted segment identified
Timeline compression detected

!python create_demo_deletion.py

Visualization Outputs

1. Waveform Connections

Purpose: Show temporal alignment

Dual waveform display
Matching point connections
Color-coded segments
- Green: Matching
- Red: Modified
- Gray: Connections

2. Distance Matrix

Purpose: Display similarity patterns

Frame-level distances
Alignment path
Modification regions

3. Alignment Analysis

Purpose: Detailed matching visualization

Temporal correspondence
Segment boundaries
Modification highlights

4. Distance Histogram

Purpose: Threshold analysis

Distance distribution
Similarity patterns
Threshold selection

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
plots		plots
README.md		README.md
compare_audio_mel.py		compare_audio_mel.py
create_demo_deletion.py		create_demo_deletion.py
create_demo_insertion.py		create_demo_insertion.py
create_demo_replacement.py		create_demo_replacement.py
demo_deletion.ipynb		demo_deletion.ipynb
demo_insertion.ipynb		demo_insertion.ipynb
demo_replacement.ipynb		demo_replacement.ipynb
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Audio Diff

Algorithm Overview

Core Components

Demo Cases

1. Word Replacement [checkout: demo_replacement.ipynb]

2. Word Insertion [checkout: demo_insertion.ipynb]

3. Word Deletion [checkout: demo_deletion.ipynb]

Visualization Outputs

1. Waveform Connections

2. Distance Matrix

3. Alignment Analysis

4. Distance Histogram

Technical Details

Audio Processing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

cozec/audio_diff

Folders and files

Latest commit

History

Repository files navigation

Audio Diff

Algorithm Overview

Core Components

Demo Cases

1. Word Replacement [checkout: demo_replacement.ipynb]

2. Word Insertion [checkout: demo_insertion.ipynb]

3. Word Deletion [checkout: demo_deletion.ipynb]

Visualization Outputs

1. Waveform Connections

2. Distance Matrix

3. Alignment Analysis

4. Distance Histogram

Technical Details

Audio Processing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages