Wave Team

Introducing Voice ID: Wave Learns Who’s Talking

Voice ID creates a mathematical fingerprint of each speaker’s voice, then automatically identifies them in every future recording — making transcripts and summaries more useful without any manual labeling.

If you've ever read a transcript that says "Speaker A" and "Speaker B" and had to work out who said what from context, you know the problem. Transcripts are more useful when they include real names. Summaries are more actionable when they say "Sarah suggested moving the deadline" instead of "Speaker 2 suggested moving the deadline."

That's what Voice ID solves. Save a speaker's name once, and Wave recognizes them automatically in every future recording. No manual labeling. No dragging and dropping names onto speaker blocks. It just works — and it gets better the more you use it.

How It Works

The process is simple from your side. After a recording, you'll see speakers labeled as "Speaker 1," "Speaker 2," and so on. Tap a speaker and give them a name. That's it — you've created a Voice ID.

Behind the scenes, Wave does the heavy lifting. When you name a speaker, Wave takes the audio segments where that person spoke and creates a voice embedding — a compact mathematical representation of how their voice sounds. Think of it like a voice fingerprint: a string of numbers that captures the unique characteristics of someone's speech — their pitch, cadence, tone, and vocal texture.

From that point on, every time you make a new recording, Wave compares the speakers it hears against your saved Voice IDs. If there's a confident match, the speaker is automatically labeled with their name before you even open the transcript.

The Tech Behind It (Simplified)

Voice embeddings might sound complex, but the concept is intuitive. When Wave processes a speaker's audio, it converts the voice into a vector — a list of numbers that represents the unique qualities of that voice. This isn't a recording. You can't play it back. You can't reverse-engineer it into audio. It's purely mathematical.

To identify a speaker, Wave uses cosine similarity, a score that measures how closely two voice vectors point in the same direction (values near 1 mean the vectors are nearly identical). If the vector from a new recording scores close enough to a saved Voice ID, Wave treats it as the same person. If not, the speaker stays unlabeled, and you can name them yourself if you want.
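For the curious, here is a minimal sketch of that matching step in Python. It is an illustration only, not Wave's actual code: the function names, the three-number vectors, and the 0.75 cutoff are all assumptions made up for this example (real voice embeddings typically have hundreds of dimensions, and Wave's real threshold is not published).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)

# Hypothetical cutoff for illustration; not Wave's real value.
MATCH_THRESHOLD = 0.75

def identify(new_embedding, saved_voice_ids, threshold=MATCH_THRESHOLD):
    """Return the best-matching name, or None if no match is confident enough."""
    best_name, best_score = None, -1.0
    for name, embedding in saved_voice_ids.items():
        score = cosine_similarity(new_embedding, embedding)
        if score > best_score:
            best_name, best_score = name, score
    # Conservative by design: below the threshold, leave the speaker unlabeled.
    return best_name if best_score >= threshold else None

# Toy 3-dimensional "fingerprints" (made-up numbers).
saved = {"Sarah": [0.9, 0.1, 0.3], "Tom": [0.1, 0.8, 0.5]}

print(identify([0.88, 0.15, 0.28], saved))  # very close to Sarah's vector
print(identify([0.1, 0.1, 0.9], saved))     # not close enough to anyone
```

The first lookup returns "Sarah" because the new vector points in almost the same direction as her saved one. The second returns None, which corresponds to a speaker who stays unlabeled.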

The matching is conservative on purpose. Wave would rather leave a speaker unlabeled than mislabel them. You'll see confidence levels on each profile — Building, Good, and Strong — so you always know how reliable the recognition is.

It Gets Better Over Time

Voice IDs aren't static. Every time Wave confidently recognizes someone, it quietly refines the profile by incorporating the new voice data. More recordings of a person means a stronger, more accurate fingerprint.

Wave also uses a sliding window approach, giving more weight to recent recordings. This means Voice IDs stay accurate even when the way someone sounds shifts over time, whether from a cold, a new microphone, or a different recording environment. The system adapts.
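One way to picture a recency-weighted sliding window is the sketch below. It is an assumption-laden illustration rather than Wave's implementation: the `VoiceProfile` class, the window size, and the decay factor are hypothetical names and values chosen for this example.

```python
from collections import deque

class VoiceProfile:
    """Keeps only the most recent embeddings and averages them with recency weights."""

    def __init__(self, window_size=5, decay=0.8):
        # window_size and decay are hypothetical values for illustration.
        self.recent = deque(maxlen=window_size)  # oldest entries drop off automatically
        self.decay = decay

    def add(self, embedding):
        """Record the embedding from a newly recognized recording."""
        self.recent.append(list(embedding))

    def fingerprint(self):
        """Weighted mean of the window; the newest embedding counts the most."""
        if not self.recent:
            raise ValueError("profile has no embeddings yet")
        n = len(self.recent)
        # Oldest entry gets decay**(n-1); newest gets decay**0, which is 1.
        weights = [self.decay ** (n - 1 - i) for i in range(n)]
        total = sum(weights)
        dims = len(self.recent[0])
        return [
            sum(w * emb[d] for w, emb in zip(weights, self.recent)) / total
            for d in range(dims)
        ]

profile = VoiceProfile(window_size=2, decay=0.5)
for emb in ([0.0, 0.0], [1.0, 1.0], [4.0, 4.0]):
    profile.add(emb)

# The first embedding has already slid out of the two-item window,
# and the latest one pulls the average toward itself.
print(profile.fingerprint())
```

With a window of two, only the last two embeddings contribute, and the newer one carries twice the weight of the older one. That is why the profile follows gradual changes instead of staying anchored to old recordings.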

You can track this progress directly. Each Voice ID shows its confidence level:

  • Building — Wave has limited data. Recognition may not be consistent yet.
  • Good — Enough data for reliable recognition in most situations.
  • Strong — Highly accurate recognition across different recording conditions.

The more you record, the stronger your Voice IDs get — with zero extra effort.
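As a rough mental model, the three levels can be thought of as tiers based on how much voice data a profile has collected. The sketch below assumes the level depends only on the number of recordings that contributed to the profile, and the cutoffs (3 and 10) are invented for illustration. Wave's actual criteria are not described here and may well differ.

```python
from enum import Enum

class Confidence(Enum):
    BUILDING = "Building"
    GOOD = "Good"
    STRONG = "Strong"

# Hypothetical cutoffs, chosen only for illustration.
GOOD_MIN_SAMPLES = 3
STRONG_MIN_SAMPLES = 10

def confidence_level(sample_count):
    """Map the number of contributing recordings to a confidence tier."""
    if sample_count < 0:
        raise ValueError("sample_count cannot be negative")
    if sample_count >= STRONG_MIN_SAMPLES:
        return Confidence.STRONG
    if sample_count >= GOOD_MIN_SAMPLES:
        return Confidence.GOOD
    return Confidence.BUILDING

for count in (1, 4, 12):
    print(count, confidence_level(count).value)
```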

Privacy By Design

Voice data is sensitive. We built Voice ID with that front and center.

  • Your Voice IDs, your account only. Voice IDs are scoped entirely to your account. No one else can access them — not other users, not Wave employees, not anyone.
  • One-way embeddings. Voice fingerprints are mathematical vectors, not audio. They cannot be played back or reverse-engineered into a recording of someone's voice.
  • SOC 2-compliant infrastructure. Voice IDs are stored with the same SOC 2-grade security that protects all your Wave data, encrypted at rest and in transit.
  • Delete anytime. Remove any Voice ID and its embedding data is permanently deleted. No soft deletes, no hidden backups.
  • Disable with one toggle. Turn off Voice ID entirely in settings. Wave will stop creating and matching Voice IDs immediately.

Where It Works

Voice ID is available now on Wave for iOS and Android, with web and desktop support coming soon. Your Voice IDs sync across all platforms — name a speaker on your phone, and they'll be recognized on the web app too.

Voice ID works with every recording type: in-person recordings, phone calls, meeting bot recordings, desktop capture, and imported audio files.

Getting Started

It takes about 10 seconds to create your first Voice ID:

  1. Open any recording with speaker labels in Wave.
  2. Tap a speaker name (e.g., "Speaker 1") and enter the person's real name.
  3. That's it. Wave creates the Voice ID and starts recognizing them automatically.

The next time that person shows up in a recording, their name will already be there — in the transcript, in the summary, and in the speaker timeline.
