Whisper vs AssemblyAI — which is more accurate for Portuguese?

On clean audio with a single speaker, accuracy is practically identical (94-97%) for PT-BR. Differences appear in specific cases:
1) Noisy audio: Whisper large-v3 has an edge (trained specifically for noise robustness).
2) Multiple speakers: AssemblyAI has an edge (native diarization).
3) Automatic punctuation: AssemblyAI inserts periods, commas, and question marks; Whisper produces flowing text without punctuation.
4) Regional Brazilian accents: similar, with a slight AssemblyAI advantage in internal tests.

Whisper is free — why use AssemblyAI?

Open-source Whisper is free but not simple to use: it requires a Python installation, a GPU for acceptable speed (CPU is 10-20x slower), and technical configuration. Via the OpenAI API, Whisper costs $0.006/min: cheap, but without diarization, automatic punctuation, or advanced features. AssemblyAI (and VoxScriber, which uses it) delivers everything ready to use: web interface, export formats, diarization, and punctuation, with no technical setup.

Which is faster: Whisper or AssemblyAI?

AssemblyAI is asynchronous: you send the file and receive the results when they are ready (no chunking needed for files up to 5GB). For a 1-hour file, AssemblyAI returns in ~3-4 minutes. Whisper via the OpenAI API is synchronous but has a 25MB limit, so larger files need manual chunking (split + merge), which adds complexity and time. For production with large files, AssemblyAI is significantly simpler.
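To give a feel for the manual chunking step, here is a minimal sketch that plans time windows for a file that exceeds the 25MB limit. It is only an illustration under stated assumptions: the limit is treated as 25 * 1024 * 1024 bytes, the bitrate is assumed roughly constant, and `plan_chunks` and its `safety` margin are hypothetical names, not part of any API. The actual splitting and merging of audio and text are left out.

```python
import math

# Assumption: the 25MB limit is interpreted as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def plan_chunks(file_size_bytes, duration_s, safety=0.9):
    """Return (start_s, end_s) windows whose estimated size stays under the limit.

    Assumes a roughly constant bitrate, so size scales linearly with duration.
    The safety margin keeps each chunk somewhat below the limit, which means a
    file just under the limit may still be split in two.
    """
    if file_size_bytes <= 0 or duration_s <= 0:
        raise ValueError("file size and duration must be positive")

    bytes_per_second = file_size_bytes / duration_s
    max_chunk_s = (MAX_UPLOAD_BYTES * safety) / bytes_per_second

    # Use equal-length chunks so the last one is not a tiny leftover.
    n_chunks = max(1, math.ceil(duration_s / max_chunk_s))
    chunk_s = duration_s / n_chunks

    return [
        (i * chunk_s, min((i + 1) * chunk_s, duration_s))
        for i in range(n_chunks)
    ]


if __name__ == "__main__":
    # A 1-hour file at 128 kbps is about 57.6 MB (16,000 bytes per second).
    for start, end in plan_chunks(57_600_000, 3600):
        print(f"{start:7.1f}s -> {end:7.1f}s")
```

In practice, cutting at silences rather than at fixed offsets tends to avoid splitting words, and the timestamps of each chunk need to be shifted by its start offset when the transcripts are merged.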

Which engine does VoxScriber use: Whisper or AssemblyAI?

VoxScriber uses AssemblyAI as the default engine for all users (free and paid). It is the engine with the best accuracy-to-features ratio for Portuguese: native diarization, automatic punctuation, and no chunking needed for large files. It costs 4 cycles/minute. Users on LITE and Advanced plans can switch to Whisper (cheaper, 1 cycle/minute) for specific cases such as very noisy audio.

Whisper large-v3 vs AssemblyAI Best — which to use for Portuguese interviews?

For interviews with 2 speakers and medium audio quality (phone recorder or Zoom), AssemblyAI Best + diarization is the best choice: it automatically identifies speakers, adds punctuation, and provides precise per-sentence timestamps. Whisper large-v3 produces more accurate text on bad audio but does not identify speakers; you would need to use pyannote.audio in post-processing for diarization.
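The post-processing step mentioned above can be sketched as follows: given timestamped transcript segments and speaker turns from a separate diarization pass, each segment is labeled with the speaker whose turn overlaps it the most. This is a simplified illustration, not the pyannote.audio API. The segment dictionaries (`start`, `end`, `text`) and the `(start, end, speaker)` tuples are assumed input shapes, and `assign_speakers` is a hypothetical helper.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the most-overlapping speaker.

    segments: list of dicts with 'start', 'end', 'text' (assumed shape).
    turns:    list of (start, end, speaker) tuples from a diarization step
              (assumed shape; speaker labels are arbitrary strings).
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for t_start, t_end, speaker in turns:
            ov = overlap(seg["start"], seg["end"], t_start, t_end)
            if ov > best_overlap:
                best_speaker, best_overlap = speaker, ov
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


if __name__ == "__main__":
    segments = [
        {"start": 0.0, "end": 4.0, "text": "Bom dia."},
        {"start": 4.0, "end": 9.0, "text": "Olá, tudo bem?"},
    ]
    turns = [(0.0, 4.5, "SPEAKER_00"), (4.5, 10.0, "SPEAKER_01")]
    for seg in assign_speakers(segments, turns):
        print(f'[{seg["speaker"]}] {seg["text"]}')
```

Segments that span a speaker change are assigned to a single speaker here; splitting them at word level would require word timestamps.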

Can I use both Whisper and AssemblyAI in the same project?

Yes. In VoxScriber (LITE+ plans), you can choose the engine per file. Common workflow: use AssemblyAI for standard files (meetings, interviews with good quality) and switch to Whisper when a file has a lot of background noise or low quality. Credits are debited per engine: AssemblyAI = 4 cycles/min; Whisper = 1 cycle/min (cheaper).
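As a rough illustration of the per-file workflow, the sketch below picks an engine for each file and estimates the cycles debited, using the rates quoted above (4 cycles/min for AssemblyAI, 1 cycle/min for Whisper). The function names, the `is_noisy` flag, and the rounding up to whole minutes are assumptions made for the example; VoxScriber's actual billing rules may differ.

```python
import math

# Rates quoted in the text above.
CYCLES_PER_MINUTE = {"assemblyai": 4, "whisper": 1}


def pick_engine(is_noisy):
    """Hypothetical rule: Whisper for noisy files, AssemblyAI otherwise."""
    return "whisper" if is_noisy else "assemblyai"


def cycles_for_file(duration_s, engine):
    """Estimate cycles for one file.

    Assumption: duration is rounded up to whole minutes before billing.
    """
    minutes = math.ceil(duration_s / 60)
    return minutes * CYCLES_PER_MINUTE[engine]


def total_cycles(files):
    """files: list of (name, duration_s, is_noisy) tuples (assumed shape)."""
    return sum(
        cycles_for_file(duration_s, pick_engine(is_noisy))
        for _name, duration_s, is_noisy in files
    )


if __name__ == "__main__":
    files = [
        ("meeting.mp3", 1800, False),  # 30 min, clean audio
        ("street.m4a", 600, True),     # 10 min, heavy background noise
    ]
    print(total_cycles(files))
```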

Whisper · AssemblyAI · Portuguese · Speaker Detection · Technical

Whisper vs AssemblyAI — Which Transcribes Portuguese Better?

Technical comparison between OpenAI Whisper and AssemblyAI for Portuguese (PT-BR) transcription. Accuracy, speed, cost, and use cases — with real test data.

🎙️ Transcribe for free

Upload your audio or video and receive the text in seconds.

Try free — no credit card →

30 minutes free per month. No credit card required.

Supported formats: MP3, MP4, WAV, OPUS, M4A, or any other format

Results in seconds

100% in Brazilian Portuguese

Privacy guaranteed

No installation

How it works

Define your priority: accuracy, speed, or cost

For maximum accuracy on clean Portuguese audio: AssemblyAI and Whisper large-v3 are equivalent (94-97%). For noisy audio: Whisper has the edge. For fast processing of long files: AssemblyAI (async, no chunking). For running locally at no cost: open-source Whisper.

Consider features beyond transcription

AssemblyAI includes: speaker diarization, sentiment analysis, automatic summaries, entity detection, and chapters. Whisper: text + timestamps only. If you need advanced features without manual post-processing, AssemblyAI is more complete.

Calculate real cost for your volume

AssemblyAI: $0.37/hour of audio (direct API) or 4 cycles/min on VoxScriber. Whisper via OpenAI API: $0.006/min and just 1 cycle/min on VoxScriber — cheaper, but without diarization or punctuation. Local Whisper: free, but requires GPU and infrastructure setup.
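The direct-API comparison can be worked through with a few lines of arithmetic. The sketch below uses only the two prices quoted above ($0.37/hour for AssemblyAI, $0.006/min for Whisper via the OpenAI API). The function name is hypothetical, and the result leaves out local GPU costs and any engineering time spent on chunking or diarization.

```python
# Prices quoted in the text above (direct API usage).
ASSEMBLYAI_USD_PER_HOUR = 0.37
WHISPER_API_USD_PER_MINUTE = 0.006


def monthly_api_cost(hours_of_audio):
    """Return the estimated USD cost per engine for a given audio volume."""
    return {
        "assemblyai": round(hours_of_audio * ASSEMBLYAI_USD_PER_HOUR, 2),
        "whisper_api": round(hours_of_audio * 60 * WHISPER_API_USD_PER_MINUTE, 2),
    }


if __name__ == "__main__":
    for hours in (10, 100, 1000):
        costs = monthly_api_cost(hours)
        print(hours, "h:", costs)
```

At these rates the Whisper API works out to $0.36/hour, so the raw price gap against AssemblyAI is about one cent per hour of audio; the practical difference lies mostly in the features included.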


