CleanSong AI — Whisper Transcriber

Two-pass transcription: a fast model transcribes the full track and detects explicit words, then a large model refines only the segments that contain them. Returns word-level timestamps.
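The two-pass flow described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the `Word` type, the `Transcriber` callable signature, the `flagged_spans` and `two_pass` helpers, and the 0.5 s padding are all hypothetical names and values chosen for the example. The fast and large models are passed in as plain callables so the sketch does not assume any particular Whisper library API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the start of the track
    end: float    # seconds from the start of the track


# Hypothetical interface: transcribe the audio between two times (seconds)
# and return words with absolute timestamps, sorted by start time.
Transcriber = Callable[[float, float], list[Word]]


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so 'Darn!' matches 'darn'."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def flagged_spans(words: list[Word], explicit: set[str],
                  pad: float = 0.5) -> list[tuple[float, float]]:
    """Return padded, merged (start, end) spans around explicit words."""
    spans: list[tuple[float, float]] = []
    for w in words:
        if normalize(w.text) not in explicit:
            continue
        start, end = max(0.0, w.start - pad), w.end + pad
        if spans and start <= spans[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans


def two_pass(duration: float, fast: Transcriber, large: Transcriber,
             explicit: set[str], pad: float = 0.5) -> list[Word]:
    """Run the fast model on the whole track, then the large model on flagged spans."""
    draft = fast(0.0, duration)
    spans = flagged_spans(draft, explicit, pad)
    # Keep draft words that start outside every flagged span.
    result = [w for w in draft if not any(s <= w.start < e for s, e in spans)]
    # Replace the flagged regions with the large model's output.
    for s, e in spans:
        result.extend(large(s, min(e, duration)))
    return sorted(result, key=lambda w: w.start)
```

Passing the models in as callables keeps the span logic testable with stubs; in practice each callable would wrap whichever Whisper backend the project uses, with word-level timestamps enabled.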