How to transcribe audio to text (free and fast)
The fastest way to transcribe audio to text, plus free options like local Whisper and phone built-ins. Step-by-step, with tips to improve accuracy.
The fastest way to transcribe audio to text is to upload your file to an AI transcription tool, wait a minute or two, and download the result as a text file. Free options exist too: most online tools include a free tier, your phone has a built-in transcriber, and OpenAI Whisper is free to run locally if you're comfortable with a technical setup.
This guide walks through each realistic option, step by step, so you can pick the one that fits your file and your time.
The fastest way: upload to an AI transcription tool
If you just want the text and don't want to install anything, an online AI tool is the answer. You drag in a file, the tool does the work on its servers, and you get a transcript back in minutes. Here's the flow using TranscribTxt as the worked example:
- Go to the site. Open the transcription tool in your browser. No download or account setup is required to start a free transcription.
- Upload your file or paste a link. Drag in an MP3, M4A, or WAV audio file, or a video file (MP4, MOV, WebM). You can also paste a YouTube or other URL and skip the download step entirely.
- Let it detect the language. TranscribTxt auto-detects across 99 languages, so you don't have to set anything. It runs on ElevenLabs Scribe under the hood.
- Wait for processing. A 10-minute clip is typically done in under a minute, and an hour-long file in a few minutes. Speed depends on file size as well as duration.
- Download the transcript. Export as TXT for plain text, SRT for subtitles with timestamps, or JSON if you need structured data. Files are deleted after transcription, so nothing lingers on the server.
If you need to know who said what, speaker labels (diarization) are available on the Pro and Business plans. For a deeper comparison of online converters, see the audio to text converter guide.
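To make the three export formats concrete, here is a minimal Python sketch that turns the same timed segments into TXT, SRT, and JSON. The segment structure (`start`, `end`, `text`) and the sample sentences are assumptions for illustration, not TranscribTxt's actual output schema. The SRT layout (a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text) follows the standard subtitle format.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_txt(segments) -> str:
    """Plain text: just the words, no timing."""
    return " ".join(seg["text"].strip() for seg in segments)


def to_srt(segments) -> str:
    """Subtitles: numbered blocks with start/end timestamps."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)  # blank line between blocks


def to_json(segments) -> str:
    """Structured data: keeps timing and text for other programs to read."""
    return json.dumps({"segments": segments}, indent=2)


# Hypothetical sample data, not real tool output.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Welcome to the show."},
    {"start": 2.5, "end": 5.04, "text": "Today we talk about transcription."},
]

print(to_txt(segments))
print(to_srt(segments))
print(to_json(segments))
```

In short: pick TXT when you only need to read or quote the words, SRT when the text has to line up with video, and JSON when another program will consume the result.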
Free options
You don't have to pay to transcribe audio. The right free choice depends on how much you transcribe and how technical you want to get.
TranscribTxt free tier. The free plan gives you 5 files per month with no credit card. Same engine, same export formats — just a monthly cap. Good for occasional one-off files.
OpenAI Whisper (local). Whisper is free and open-source, and it runs entirely on your own computer. There are no per-file limits and your audio never leaves your machine. The catch is setup: you install Python and the model, then run it from the command line, and a GPU helps a lot for speed. Great if you're technical and transcribe in bulk; overkill if you just have one file.
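For a sense of what the Whisper setup involves, here is a minimal sketch using the open-source `openai-whisper` Python package. It assumes you have run `pip install -U openai-whisper` and have `ffmpeg` installed. The helper names (`transcript_path`, `transcribe_files`) and the choice of the `base` model are illustrative, not a recommended configuration.

```python
from pathlib import Path


def transcript_path(audio_path) -> Path:
    """Put the transcript next to the audio: interview.mp3 -> interview.txt."""
    return Path(audio_path).with_suffix(".txt")


def transcribe_files(audio_paths, model_name: str = "base"):
    """Transcribe each file locally and write a .txt beside it.

    Assumes the openai-whisper package and ffmpeg are installed.
    """
    # Imported here so the helper above works even before Whisper is installed.
    import whisper

    # Load the model once and reuse it; weights download on first use.
    model = whisper.load_model(model_name)

    written = []
    for audio in audio_paths:
        result = model.transcribe(str(audio))  # dict with "text", "segments", "language"
        out = transcript_path(audio)
        out.write_text(result["text"].strip() + "\n", encoding="utf-8")
        written.append(out)
    return written
```

Calling `transcribe_files(["interview.mp3", "lecture.m4a"])` would write `interview.txt` and `lecture.txt`. Larger models such as `small` or `medium` are generally more accurate but slower, which is where a GPU starts to matter.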
Phone built-ins. Both major phone platforms can transcribe for free, on-device:
- iPhone Voice Memos transcribes recordings on recent iOS versions — open a memo and the text appears alongside it.
- Google Recorder (Pixel and many Android phones) transcribes live as you record and lets you search the text afterward.
These are convenient and private, but accuracy is strongest for clear English and falls off with accents, multiple speakers, or noise. For more on the no-cost landscape, see free transcription software, and for the phone-specific walkthrough, voice memo to text.
How the methods compare
| Method | Cost | Speed | Best for |
|---|---|---|---|
| Online AI tool (TranscribTxt) | 5 free files/mo, then from $12/mo | A few minutes per file | Most people: no setup, multiple formats |
| Whisper (local) | Free | Minutes to hours, depends on hardware | Technical users, bulk or private files |
| Phone built-in | Free | Real-time or instant | Quick voice memos, clear English |
| Manual typing | Free (your time) | ~4 hours per audio hour | Short clips needing perfect accuracy |
Paid online plans scale up from there. TranscribTxt's Pro plan is $12/month for 1,200 minutes, and Business is $29/month for 6,000 minutes. Both paid plans include speaker labels. Competing tools price differently, so check their current tiers directly rather than assuming.
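To compare the paid plans on equal terms, it helps to convert each one to a per-minute rate. The short sketch below does that arithmetic with the prices quoted above. It assumes you use the full monthly allowance, so your effective rate is higher if you transcribe less.

```python
# Plan figures as quoted in this guide; check current pricing before relying on them.
plans = {
    "Pro": {"price_usd": 12, "minutes": 1200},
    "Business": {"price_usd": 29, "minutes": 6000},
}


def cost_per_minute(price_usd: float, minutes: int) -> float:
    """Effective price per transcribed minute if the whole allowance is used."""
    return price_usd / minutes


def cost_per_audio_hour(price_usd: float, minutes: int) -> float:
    """Same rate, expressed per hour of audio."""
    return 60 * cost_per_minute(price_usd, minutes)


for name, plan in plans.items():
    per_min = cost_per_minute(**plan)
    per_hour = cost_per_audio_hour(**plan)
    print(f"{name}: ${per_min:.4f}/min, ${per_hour:.2f}/audio hour")
```

At full usage, Pro works out to about $0.01 per minute ($0.60 per audio hour) and Business to roughly half that ($0.29 per audio hour).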
Manual transcription: when and why
Typing a transcript by hand is slow — plan on roughly four hours of work per hour of audio, even with a foot pedal and playback software. So why would anyone do it?
- Very short clips. For a 20-second snippet, typing it yourself is faster than uploading and downloading.
- Maximum control. When every word must be exact — a legal record or a quote you'll publish verbatim — you may want to type as you listen so you catch ambiguity in real time.
- Terrible audio. If a recording is so noisy that AI output is mostly wrong, a human ear sometimes does better.
In practice, the efficient approach is hybrid: let an AI tool produce a first draft, then proofread it against the audio. You get most of the speed of automation with the accuracy of a human pass.
Tips to improve accuracy
The quality of your transcript depends heavily on the quality of your audio. A few things help no matter which method you use:
- Record close to the speaker. Distance and room echo are the biggest accuracy killers. A close mic beats a fancy mic across the room.
- Cut background noise. Turn off fans, close windows, and avoid cafes if you can. Noise can drag accuracy from 95% or better on clean audio down toward 80%.
- One speaker at a time. Crosstalk confuses every transcriber. If you can, ask people not to talk over each other.
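- Use a less-compressed format. WAV or a high-bitrate MP3 preserves more detail than a heavily compressed file. Record or export in that format from the start, because converting an already-compressed file to WAV does not restore the lost detail.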
- Always proofread. Names, jargon, and punctuation are where AI slips most. Budget a quick editing pass.
For a full breakdown of what affects results and how to measure them, read the AI transcription accuracy guide.
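If you want to put a number on accuracy yourself, one common measure is word error rate (WER): the word-level substitutions, insertions, and deletions needed to turn the AI transcript into a correct reference, divided by the number of reference words. This guide doesn't say how its percentages were computed, so treating accuracy as 1 minus WER is an assumption here. The sketch below is a simplified version that lowercases text but does not strip punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word and one dropped word out of ten.
reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown fox jumped over the lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.0%}, accuracy: {1 - wer:.0%}")
```

Here two errors in a ten-word reference give a WER of 20%, or 80% accuracy under the assumption above. To use it, hand-correct a short stretch of your transcript and compare it with the raw AI output.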
The short version
For one file with no fuss, upload it to an online AI tool and download the text — that's the fastest path. For ongoing free use, lean on your phone's built-in transcriber or set up Whisper locally. Save manual typing for tiny clips or audio that's too rough for automation.
Want to try it now? You can transcribe your first 5 files free with no card — drag in an audio or video file and have the text in a few minutes.
Frequently Asked Questions
How do I transcribe audio to text for free?
Use the free tier of an online tool, a free local program, or your phone's built-in transcriber. TranscribTxt gives 5 files per month with no card required: upload your file and download the text in a few minutes. For unlimited free use, OpenAI Whisper runs on your own computer but needs a technical setup. iPhone Voice Memos and Google Recorder also transcribe for free on-device.
What's the fastest way to transcribe audio?
Upload the file to an AI transcription tool. A 10-minute recording is usually done in under a minute, and an hour-long file in a few minutes. This is far faster than typing it out manually, which takes roughly four hours per audio hour. Drag in your MP3, WAV, or M4A, wait, then download a TXT or SRT.
Can I transcribe audio to text on my phone?
Yes. iPhone Voice Memos transcribes recordings on newer iOS versions, and Google Recorder on Pixel and many Android phones does live transcription as you record. Both are free and run on-device, so nothing leaves your phone. Accuracy is strongest for clear English speech and drops with heavy accents or background noise.
What audio formats can I transcribe?
Most AI tools accept MP3, M4A, and WAV, plus common video formats like MP4, MOV, and WebM, because the tools extract the audio track from the video. TranscribTxt also accepts a YouTube or other URL so you don't have to download the file first. Uncompressed formats like WAV give slightly better accuracy than heavily compressed MP3s.
Is automatic transcription accurate enough to use?
For clean, single-speaker audio, modern AI transcription is accurate enough to use with light editing. Expect to fix names, technical terms, and punctuation. Noisy or multi-speaker recordings need more cleanup. For anything published or legal, always proofread against the audio rather than trusting the raw output.