
feat: add Voice AI Integration Engineer to Engineering Division #415

Merged
msitarzewski merged 2 commits into msitarzewski:main from epowelljr:agent/voice-ai-integration-engineer
Apr 11, 2026

@epowelljr
Contributor

Summary

Adds a Voice AI Integration Engineer (engineering/engineering-voice-ai-integration-engineer.md) to the Engineering Division — a specialist for designing and building production-grade speech-to-text pipelines from raw audio ingestion through structured downstream delivery.

This agent goes far beyond "call the Whisper API." It covers the full pipeline: format validation, ffmpeg preprocessing, chunking strategy, speaker diarization, transcript normalization, and structured handoff to downstream systems (CMS, APIs, LLM pipelines, CI workflows). It also navigates the local vs. cloud vs. hybrid tradeoff space explicitly, with a named vendor comparison covering OpenAI, AssemblyAI, Deepgram, Rev AI, Google, and AWS.
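As a rough illustration of the ffmpeg preprocessing stage mentioned above, here is a minimal sketch that builds an ffmpeg command line for stripping the video track, normalizing loudness, and resampling to 16 kHz mono. The function name `build_ffmpeg_cmd` and the `loudnorm` target values are assumptions made for this example and are not taken from the agent file; silence trimming and the noise gate are left out for brevity.

```python
def build_ffmpeg_cmd(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg argv that converts `src` to mono PCM WAV at `sample_rate`.

    Hypothetical helper: the loudnorm targets below are illustrative
    values, not settings prescribed by the agent definition.
    """
    return [
        "ffmpeg",
        "-nostdin",                 # do not read from stdin (safer in batch jobs)
        "-y",                       # overwrite the output file if it exists
        "-i", src,
        "-vn",                      # explicitly drop any video stream
        "-af", "loudnorm=I=-16:TP=-1.5:LRA=11",  # EBU R128 loudness normalization
        "-ac", "1",                 # downmix to mono
        "-ar", str(sample_rate),    # resample (16 kHz by default)
        "-c:a", "pcm_s16le",        # 16-bit PCM WAV output
        dst,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(" ".join(build_ffmpeg_cmd("interview.mp4", "interview_16k.wav")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.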

Key capabilities:

  • Preprocessing pipeline: resample to 16 kHz mono, EBU R128 loudness normalization, silence trimming, noise gate, explicit video track stripping
  • Chunking strategy with overlap-aware logic for recordings longer than 30 minutes
  • Model selection guidance: Whisper tiny through large-v3, faster-whisper, whisper.cpp, and cloud ASR, with a WER/cost/latency tradeoff framework
  • Output formats: time-stamped JSON, SRT/VTT, Markdown, structured data schemas
  • PII detection and redaction as a named pipeline stage
  • HIPAA/GDPR/SOC 2 compliance framing built into the architecture section
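To make the overlap-aware chunking capability concrete, here is a minimal sketch of how chunk boundaries could be planned. The 30-minute threshold comes from this PR; the function name `plan_chunks`, the 10-minute chunk length, and the 5-second overlap are assumptions chosen for illustration, not values from the agent file.

```python
def plan_chunks(
    duration_s: float,
    chunk_s: float = 600.0,       # assumed chunk length (10 minutes)
    overlap_s: float = 5.0,       # assumed overlap between neighbouring chunks
    threshold_s: float = 1800.0,  # only chunk recordings longer than 30 minutes
) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the whole recording."""
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")
    if duration_s <= 0:
        return []
    if duration_s <= threshold_s:
        # Short recordings are transcribed in a single pass.
        return [(0.0, duration_s)]

    chunks = []
    start = 0.0
    step = chunk_s - overlap_s  # each chunk starts overlap_s before the previous one ends
    while True:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break
        start += step
    return chunks
```

Each chunk shares `overlap_s` seconds with its neighbour, so words cut at a boundary appear in both transcripts; deduplicating the overlapping text when stitching results together is a separate step not shown here.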

Why this fills a gap: The Engineering Division has no audio/voice specialist. As voice interfaces and transcription workflows become standard infrastructure, this is an increasingly common engineering need.

Test plan

  • Verify frontmatter fields match the existing engineering agent format
  • Confirm the agent specifies 16 kHz mono resampling for Whisper-style models unprompted
  • Verify chunking logic is recommended for recordings longer than 30 minutes
  • Check that the agent distinguishes local vs. cloud vs. hybrid with concrete tradeoffs
  • Confirm PII redaction is surfaced as a first-class pipeline concern
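For reviewers who want a picture of what a PII redaction stage might look like, here is a deliberately simplistic sketch. The patterns and the `redact_pii` name are assumptions for illustration only; they cover just email addresses and one phone number format, and a production pipeline would likely need a dedicated PII detection tool rather than two regular expressions.

```python
import re

# Illustrative patterns only: email addresses and 3-3-4 digit phone numbers.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


if __name__ == "__main__":
    print(redact_pii("Reach me at jane.doe@example.com or 555-123-4567."))
```

Running redaction as its own stage, after transcript normalization and before downstream handoff, keeps raw PII out of any system that only receives the final output.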

cc @msitarzewski — happy to adjust scope, naming, or formatting to match your conventions before merge.

@msitarzewski msitarzewski merged commit 9899955 into msitarzewski:main Apr 11, 2026
1 check passed
@epowelljr epowelljr deleted the agent/voice-ai-integration-engineer branch April 11, 2026 12:40