Detect panels, balloons, sound effects, and vertical text. Re-letter in the target language with the original artwork untouched.
A vision model identifies the reading order, speech balloons, narration boxes, and sound effects (SFX).
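For readers curious how reading order might be derived once regions are detected, here is a minimal sketch. It assumes the detector returns axis-aligned boxes and that the page reads right-to-left, top-to-bottom, as Japanese manga does. The `Box` type and `reading_order` function are hypothetical names for illustration, not the product's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical detected region (panel or balloon), in page pixels."""
    label: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def reading_order(boxes, rtl=True):
    """Group boxes into rows by vertical position, then order each row.

    A box joins an existing row when its top edge lies above the vertical
    midpoint of that row's first box. This is a simple heuristic and will
    not handle irregular layouts.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b.y):
        for row in rows:
            anchor = row[0]
            if box.y < anchor.y + anchor.h / 2:
                row.append(box)
                break
        else:
            rows.append([box])

    ordered = []
    for row in rows:
        # Sort by horizontal center; reversed for right-to-left pages.
        ordered.extend(sorted(row, key=lambda b: b.x + b.w / 2, reverse=rtl))
    return ordered


page = [
    Box("B", x=0, y=10, w=200, h=200),
    Box("C", x=0, y=300, w=500, h=200),
    Box("A", x=300, y=0, w=200, h=200),
]
order = [b.label for b in reading_order(page)]
```

Passing `rtl=False` would give a left-to-right order for Western comics. Real pages with overlapping or diagonal panels would need more than this row heuristic.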
New text wraps cleanly inside each balloon without overflow, in a font that matches the tone.
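One common way to avoid overflow is to shrink the font until the wrapped text fits the balloon. The sketch below illustrates that idea under loud assumptions: the balloon is treated as a rectangle, and glyph width and line height are estimated as fixed ratios of the font size (`char_ratio`, `line_ratio`). `fit_text` is a hypothetical helper, not the product's layout engine, and a real implementation would measure text with the actual font.

```python
import textwrap


def fit_text(text, box_w, box_h, max_size=24, min_size=8,
             char_ratio=0.6, line_ratio=1.2):
    """Return (font_size, lines) for the largest size that fits, else None.

    char_ratio and line_ratio are assumed estimates: average glyph width
    and line height as multiples of the font size.
    """
    for size in range(max_size, min_size - 1, -1):
        chars_per_line = int(box_w // (size * char_ratio))
        if chars_per_line < 1:
            continue
        # Never split a word; an over-long word simply fails this size.
        lines = textwrap.wrap(text, width=chars_per_line,
                              break_long_words=False)
        widest = max((len(line) for line in lines), default=0)
        total_height = len(lines) * size * line_ratio
        if widest <= chars_per_line and total_height <= box_h:
            return size, lines
    return None


result = fit_text("I will never give up!", box_w=120, box_h=60)
too_small = fit_text("Supercalifragilistic", box_w=20, box_h=10)
```

Returning `None` when nothing fits lets the caller decide what to do next, for example shortening the translation rather than letting text spill outside the balloon.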
Tategaki (vertical text), furigana (small reading aids beside kanji), and onomatopoeia are all handled.