How to Transcribe a Meeting Recording in Your Browser
Transcribe a meeting recording in your browser. Covers recording files, speaker labels, cleanup, and next steps.
To transcribe a meeting recording in your browser, download the recording as an audio or video file, then upload it to a browser-based transcription tool. Processing takes roughly 1 minute per 10 minutes of audio. The resulting text can be copied into a document, summarized with an AI, or searched for specific topics discussed.
The Audio Transcriber runs the recognition model locally — your recording is processed in browser memory and never sent to a server. For business meetings containing strategy discussions, personnel matters, or financial information, this is a meaningful distinction.
Getting the Recording File from Your Platform
Every major video conferencing platform stores recordings differently. Here is where to find yours.
Zoom
Locally recorded meetings are saved to your Documents/Zoom folder automatically. Each meeting gets a folder containing an MP4 video file and an M4A audio file. Use the M4A — it is audio-only, which means smaller file size and faster processing.
Cloud-recorded meetings require the host to download them from the Zoom web portal. Log in, go to Recordings, find the meeting, and click Download. You will get the same choice: MP4 or M4A. Take the M4A unless you need the video.
Microsoft Teams
Recordings are saved to SharePoint or OneDrive (depending on your organization's settings). Open the Teams chat where the meeting occurred, find the recording card, click the three-dot menu, and select Download. The file is an MP4.
Google Meet
Recordings go to the meeting organizer's Google Drive, in a folder called "Meet Recordings." Open Drive, find the folder, right-click the file, and select Download. The file is an MP4.
Webex
Log into your Webex site, navigate to Recordings, find the session, and click the download icon. Webex offers MP4 for cloud recordings. If you recorded locally, the file is already on your machine in the Webex folder under your home directory.
Working with Video Files
All of these platforms give you MP4 files by default. You can upload MP4 directly to the transcription tool, and the audio track is extracted automatically. If you want to reduce file size before uploading, use VLC to extract just the audio: open VLC, go to Media, select Convert / Save, add your MP4, click Convert / Save, choose the Audio - MP3 profile, pick a destination file, and start the conversion. At 128kbps, a 60-minute video that is 400MB becomes an MP3 of roughly 60MB.
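If you prefer a script to the VLC menus, the same extraction can be sketched with ffmpeg, a different tool from the one described above. This is a minimal sketch that assumes ffmpeg is installed and on your PATH; the function names and the example filename are hypothetical.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path: str, bitrate_kbps: int = 128) -> list[str]:
    """Build an ffmpeg command that writes an audio-only MP3 next to the video."""
    source = Path(video_path)
    target = source.with_suffix(".mp3")  # e.g. standup.mp4 -> standup.mp3
    return [
        "ffmpeg",
        "-i", str(source),            # input video file
        "-vn",                        # drop the video stream
        "-c:a", "libmp3lame",         # encode the audio as MP3
        "-b:a", f"{bitrate_kbps}k",   # target audio bitrate
        str(target),
    ]


def extract_audio(video_path: str, bitrate_kbps: int = 128) -> Path:
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    command = build_extract_command(video_path, bitrate_kbps)
    subprocess.run(command, check=True)
    return Path(command[-1])
```

Calling extract_audio("standup.mp4") would write standup.mp3 in the same folder, ready to upload.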
Step-by-Step Transcription
Step 1: Download the recording using the platform instructions above. For video files, either upload directly or extract audio using VLC.
Step 2: For meetings over 2 hours, consider splitting the file at a natural break point — lunch break, topic shift, or when the meeting pauses. A single 3-hour recording processes fine, but reviewing one continuous text block is harder than two shorter ones. Use Audacity or VLC to cut the file.
Step 3: Upload the file to the Audio Transcriber. The model loads into browser memory on first use, which takes a few seconds. Subsequent uploads skip this step.
Step 4: Wait for processing. Expect roughly 1 minute for every 10 minutes of audio on a modern laptop CPU. A 45-minute meeting takes 4-5 minutes to transcribe.
Step 5: Copy or download the transcript. Paste it into a document, or feed it directly into Claude, ChatGPT, or another AI for summarization.
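The timing and splitting guidance in the steps above reduces to simple arithmetic. The sketch below assumes the article's rough figures (about 1 minute of processing per 10 minutes of audio, and a 2-hour threshold for considering a split); actual speed depends on your hardware, and the function names are illustrative.

```python
def estimate_processing_minutes(audio_minutes: float,
                                minutes_per_ten: float = 1.0) -> float:
    """Rough processing time, assuming ~1 minute per 10 minutes of audio."""
    return audio_minutes / 10 * minutes_per_ten


def should_split(audio_minutes: float, threshold_minutes: float = 120) -> bool:
    """True when the recording is longer than the suggested 2-hour threshold."""
    return audio_minutes > threshold_minutes


if __name__ == "__main__":
    for length in (45, 90, 180):
        estimate = estimate_processing_minutes(length)
        note = " (consider splitting)" if should_split(length) else ""
        print(f"{length} min of audio: about {estimate:.1f} min to process{note}")
```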
Speaker Identification
The Audio Transcriber produces a single text stream without speaker labels. Speaker diarization (identifying who said what) is a separate technical problem from transcription, and browser-based tools handle the two separately.
Your options depend on how you plan to use the transcript.
Manual labeling is the most straightforward approach. Play back the recording while reviewing the transcript, and add "Sarah:" or "Marcus:" at each speaker transition. For a 60-minute meeting, this takes 20-40 minutes depending on how many speakers there are and how frequently they switch. For meetings where you know the participants well, you will often recognize speakers from their phrasing and can label without replaying.
Cloud services with automatic diarization can label speakers for you. The tradeoff is that these services require uploading your audio to their servers. If the meeting contained anything confidential, that is the relevant constraint.
For action items and decisions, speaker labels are usually not necessary. An AI summary of a transcript does not need to know who said each line to extract "agreed to extend the contract by 30 days" or "John will follow up with the vendor by Friday."
For formal records — HR proceedings, board meetings, legal depositions — manual labeling is worth the time, and the accuracy of those labels matters. For anything that might be referenced in a dispute, have a second person verify the labels against the recording.
If you are transcribing a meeting that happens regularly with the same participants, consider writing a short prompt to guide AI summarization: "This transcript is from a product team meeting. Speakers are Alex (PM), Dana (Engineering), and Sam (Design). Extract all decisions and action items."
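For a recurring meeting, the prompt above can be generated from a small roster so you do not retype it each week. This is only a sketch: the function names are hypothetical, and the wording simply mirrors the example prompt.

```python
def join_names(items: list[str]) -> str:
    """Join items as 'A', 'A and B', or 'A, B, and C'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_summary_prompt(meeting_type: str,
                         speakers: dict[str, str],
                         task: str = "Extract all decisions and action items.") -> str:
    """Build a context prompt to paste ahead of the transcript."""
    roster = join_names([f"{name} ({role})" for name, role in speakers.items()])
    return f"This transcript is from a {meeting_type}. Speakers are {roster}. {task}"


if __name__ == "__main__":
    prompt = build_summary_prompt(
        "product team meeting",
        {"Alex": "PM", "Dana": "Engineering", "Sam": "Design"},
    )
    print(prompt)
```

Keeping the roster in one place makes it easy to update when participants change.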
What to Do with the Transcript
The transcript is raw text. Here is what you can do with it.
Extract action items and decisions. Paste the transcript into Claude or ChatGPT with a prompt like: "Extract all action items and decisions from this meeting transcript. List action items as 'Owner: task (due date if mentioned)' and decisions as bullet points." A 60-minute meeting typically produces 5-15 action items and 3-8 decisions.
Generate formal meeting minutes. Prompt: "Convert this transcript into formal meeting minutes. Include: date and attendees (listed at top of document), agenda items discussed, decisions made, and action items with owners and due dates."
Search for a specific discussion. Rather than rewinding a recording, search the transcript text for the keyword. Searching for "budget" in the transcript of a 90-minute strategy meeting takes you directly to the 4 minutes where the budget was discussed. Plain text search on a 10,000-word transcript is instant.
Build a project record. Save the transcript in a folder with the meeting date and title. Over time, you build a searchable archive of every discussion. When someone asks "didn't we discuss this three months ago?" you can check.
Feed it to a read-later tool. Transcripts from recorded lectures or conference talks can go into Readwise, Notion, or Obsidian as text notes — making audio content searchable and linkable alongside your other notes.
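The keyword search described above works in any text editor, but if you keep transcripts as files, a few lines of Python can do the same thing and show each hit with some surrounding text. A minimal sketch; the function name and the sample text are illustrative.

```python
import re


def find_keyword(transcript: str, keyword: str, context_chars: int = 60) -> list[str]:
    """Return a snippet of surrounding text for each case-insensitive match."""
    snippets = []
    for match in re.finditer(re.escape(keyword), transcript, flags=re.IGNORECASE):
        start = max(match.start() - context_chars, 0)
        end = min(match.end() + context_chars, len(transcript))
        snippets.append(transcript[start:end].strip())
    return snippets


if __name__ == "__main__":
    sample = (
        "We reviewed the roadmap. The Budget for Q3 is tight. "
        "Then we discussed hiring. The budget needs approval."
    )
    for hit in find_keyword(sample, "budget", context_chars=20):
        print("...", hit, "...")
```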
Why Privacy Matters for Meeting Recordings
Business meetings contain information that is rarely intended for outside parties. A strategy discussion about a potential acquisition target. A quarterly forecast before the earnings call. A performance conversation about an employee. A debrief after a client negotiation.
Meeting bots and cloud transcription services can auto-join calls, label speakers, store transcripts, and sync notes into CRM or project-management systems. Those are useful features when the meeting is designed for shared cloud records. They also mean the transcript lives on third-party infrastructure.
Browser-based transcription keeps the transcription step local. The recording file is loaded into browser memory on your device. The Audio Transcriber runs locally. When you close the tab, nothing persists. No audio is transmitted over the network.
For meetings covered by confidentiality agreements, attorney-client privilege, medical privacy obligations, or internal data-handling rules, local processing is the workflow to start with.
File Format and Size Reference
Understanding file sizes helps you anticipate upload times and storage needs.
MP3 at 128kbps is approximately 1MB per minute of audio. A 60-minute meeting is around 60MB. This is the format you want for transcription: it is small and widely supported, and for speech there is no meaningful quality loss compared to higher bitrates.
M4A (the format Zoom exports for audio-only recordings) is similar in size to MP3 at the same bitrate. You can upload M4A directly without converting.
WAV is uncompressed and is 8-10x larger than MP3 for the same recording. A 60-minute WAV file is 500-600MB. There is no transcription accuracy benefit to WAV over a high-quality MP3 for speech recognition. Convert WAV to MP3 before uploading.
MP4 video files are significantly larger — a 60-minute Teams recording is typically 300-500MB depending on video resolution settings. You can upload MP4 directly, but extracting the audio track first reduces file size by 80-90%.
FLAC is lossless compressed audio. It is roughly 4-5x larger than MP3 but significantly smaller than WAV. If you have FLAC files, you can upload them directly — they will produce the same accuracy as WAV.
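The MP3 figures above follow directly from the bitrate. As a quick check, the sketch below converts a constant bitrate and a duration into megabytes, and computes the percentage saved when a video is replaced by its audio track. The function names are illustrative, and real files vary slightly because of container overhead and variable bitrates.

```python
def compressed_size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Approximate file size in MB for audio at a constant bitrate."""
    bits = bitrate_kbps * 1000 * minutes * 60  # kilobits/s -> total bits
    return bits / 8 / 1_000_000                # bits -> bytes -> megabytes


def reduction_percent(original_mb: float, new_mb: float) -> float:
    """Percentage by which the file shrinks."""
    return (1 - new_mb / original_mb) * 100


if __name__ == "__main__":
    print(f"128kbps MP3, 60 min: {compressed_size_mb(128, 60):.1f} MB")
    print(f"400MB video -> 60MB audio: {reduction_percent(400, 60):.0f}% smaller")
```

At 128kbps this works out to 0.96MB per minute, which is where the "approximately 1MB per minute" rule of thumb comes from.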
Transcribing Lectures and Classes
The same workflow applies to lecture recordings. Most learning management systems (Canvas, Moodle, Blackboard) and video platforms (Panopto, Echo360) allow downloading recordings as MP4 or MP3 files if the instructor has enabled downloads.
Download the file, upload it to the lecture notes transcription tool, and you have the full text of the lecture within a few minutes. Use the text to create searchable notes, identify key terms for studying, or ask an AI to generate a study guide from the content.
For lectures with extensive technical vocabulary (organic chemistry, legal theory, clinical pharmacology), domain-specific terms are worth checking in the transcript. The transcript of a chemistry lecture may render a term like "phenylpropanolamine" phonetically, as several ordinary words. Keep a list of key terms you know will appear and search for them in the output.
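Checking a list of key terms can be automated. The sketch below counts how often each expected term appears in the transcript and reports the ones that never appear, which are the likeliest to have been rendered phonetically. The function names and the sample sentence are hypothetical.

```python
def count_terms(transcript: str, expected_terms: list[str]) -> dict[str, int]:
    """Count case-insensitive occurrences of each expected term."""
    lowered = transcript.lower()
    return {term: lowered.count(term.lower()) for term in expected_terms}


def missing_terms(transcript: str, expected_terms: list[str]) -> list[str]:
    """Terms that never appear, and so deserve a manual check."""
    counts = count_terms(transcript, expected_terms)
    return [term for term, count in counts.items() if count == 0]


if __name__ == "__main__":
    sample = "Today we cover fennel propanol amine and ephedrine."
    terms = ["phenylpropanolamine", "ephedrine"]
    print("Check these by ear:", missing_terms(sample, terms))
```

A missing term does not prove an error, since the lecturer may simply not have said it, but it tells you where to listen first.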