Why Compare STT exists
Every speech-to-text provider claims to be the best. They publish benchmarks on cherry-picked datasets, compare against last year's models, and sprinkle asterisks everywhere. We've all read those posts. We've all stared at WER tables. And we've all wondered: how do they actually perform on my audio?
Compare STT was built from that frustration.
Instead of trusting self-reported numbers, you test it yourself. Record your voice, upload a meeting snippet, drop in a podcast clip—whatever reflects your real use case. Two providers transcribe it, and you compare the results blind, side by side, deciding which one got it right. No names. No bias. Just output.
Built for exploration—not a substitute for proper evaluation. For rigorous analysis, see the benchmark methodology.
“The best benchmark is the one you can't game—because you don't know what's coming next, or who you're up against.”
How it works
- You submit audio (record live or upload a file, up to 2 minutes).
- Two providers transcribe it anonymously: “Model A” vs “Model B”.
- You pick the better transcription—or call it a tie.
- Identities are revealed after your vote.
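The flow above is conceptually simple. As a rough sketch (hypothetical provider names and function shapes, not Compare STT's actual code), a blind pairing might look like this:

```python
import random

# Hypothetical provider pool; the real one lives server-side.
PROVIDERS = ["Provider 1", "Provider 2", "Provider 3", "Provider 4"]

def draw_blind_pair(providers):
    """Pick two distinct providers at random and hide their
    identities behind the labels shown to the voter."""
    first, second = random.sample(providers, 2)
    return {"Model A": first, "Model B": second}

pair = draw_blind_pair(PROVIDERS)
# The voter sees only "Model A" / "Model B"; the mapping is
# revealed only after the vote is cast.
print(sorted(pair))  # ['Model A', 'Model B']
```

The key property is that the label-to-provider mapping exists before transcription but is withheld from the voter until after they commit to a choice.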
What we don't do
- We don't store your audio or transcriptions.
- We don't track you—no cookies, no accounts, no IP logging.
- We don't sell data (there's nothing to sell).
- We don't take money from providers to influence results.
The only thing we keep is the vote outcome: who won, who lost, or if it was a tie. That's it.
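Since the only stored record is the outcome, aggregation is just counting. A minimal sketch (hypothetical record shape, not the actual backend) of tallying such outcomes:

```python
from collections import Counter

def tally(votes):
    """Aggregate anonymized vote records into per-provider
    win/loss/tie counts. Each record is (winner, loser, is_tie);
    for a tie, the order of the two names is irrelevant."""
    counts = {}
    for winner, loser, is_tie in votes:
        for name in (winner, loser):
            counts.setdefault(name, Counter())
        if is_tie:
            counts[winner]["tie"] += 1
            counts[loser]["tie"] += 1
        else:
            counts[winner]["win"] += 1
            counts[loser]["loss"] += 1
    return counts

# Example: P1 beats P2, P2 beats P1, P1 ties P3.
votes = [("P1", "P2", False), ("P2", "P1", False), ("P1", "P3", True)]
result = tally(votes)
# result["P1"] -> Counter({'win': 1, 'loss': 1, 'tie': 1})
```

Note there is nothing personal in these records: no audio, no transcript text, no user identifier—just which provider won, lost, or tied.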
Sponsored by Gladia
Compare STT is a free, open, non-commercial project sponsored by Gladia. Yes, Gladia is also one of the providers. No, they don't get special treatment. Pairings are random, comparisons are blind, and results are driven entirely by user input.
Want to add a provider, report a bug, or just say hi? Reach out.