Masking: Why Vocals Drown Even When the Fader Goes Up | VGP Studio

The fader is all the way up and you still can't hear the words

You've been there. The vocal is sitting at -3 dB, then -1 dB, then clipping the master bus, and somehow the singer still sounds like she's behind a wall. You solo the vocal. It's fine. Clear, present, detailed. You unsolo it and it vanishes into the mix like it was never there.

The instinct is to keep pushing the fader. More level, more presence, right? But the vocal doesn't actually get clearer. It just gets louder. The words are still muddy. The consonants are still buried. You're fighting physics, and physics is winning.

This is masking. It is one of the most common reasons mixes sound cluttered, and it has far less to do with how loud the vocal is than with what else is occupying the same frequencies.

What masking actually is

Masking is simple: one sound makes another sound harder to hear. Not quieter. Harder to *hear*. Your ear can only resolve so much detail within a narrow frequency range at any given moment. When two sources occupy the same band at the same time, the louder one wins and the quieter one disappears. Not because it's gone from the signal, but because your auditory system can't separate them anymore.

There are two flavors worth knowing about. Simultaneous masking happens when two sounds overlap in frequency at the same time, which is the classic "vocal vs. guitar" problem. Temporal masking is weirder: a loud transient can actually make sounds *before* and *after* it harder to perceive. A snare hit can briefly mask the tail of a vocal phrase that came right before it. Your brain retroactively edits what you heard.

Simultaneous masking is the one that ruins vocal clarity in the large majority of cases. So that's where we'll spend our time.

The math (kept short)

There's a rough model for this:

\text{Audibility} = L_{\text{target}} - L_{\text{masker}}

Both levels are in dB and are measured within the same critical band, the slice of spectrum your ear treats as a single unit. A "critical band" is roughly a third-octave wide in the range we care about for vocals (1 kHz to 5 kHz). If your vocal is at -12 dB in the 2 to 4 kHz range and a distorted guitar is at -10 dB in that same range, the audibility of the vocal in that band is about -2 dB. Negative audibility means the vocal is being masked. It's still there in the waveform. Your ears just can't pull it out.

Pushing the vocal fader up by 6 dB raises the *entire* vocal spectrum, including the low-mids where the vocal is already fine. Now the vocal is louder overall, but the audibility in the problem band has only moved from -2 dB to about +4 dB, and you've made the low-mid buildup worse. You traded one problem for another.
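To put numbers on the rough model above, here is a minimal Python sketch. The function name `audibility_db` and the 3 dB guitar cut are illustrative assumptions, not measurements from a real session; the -12 dB and -10 dB levels come from the example in the text.

```python
def audibility_db(target_db, masker_db):
    """Rough audibility in one band: target level minus masker level, in dB."""
    return target_db - masker_db

# Levels in the 2 to 4 kHz band, taken from the example in the text.
vocal_db = -12.0
guitar_db = -10.0

# Starting point: the guitar is louder than the vocal in this band.
before = audibility_db(vocal_db, guitar_db)

# Option A: push the vocal fader up 6 dB (this raises every vocal band, not just this one).
after_fader_push = audibility_db(vocal_db + 6.0, guitar_db)

# Option B (hypothetical): leave the vocal alone and cut the guitar 3 dB in this band only.
after_guitar_cut = audibility_db(vocal_db, guitar_db - 3.0)

print(before, after_fader_push, after_guitar_cut)
```

Both options push the number above zero in the problem band. The difference is that the fader push also adds 6 dB to every other part of the vocal, while the band-limited cut changes nothing outside 2 to 4 kHz.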

Why low-mids are the usual suspect

The 200 to 500 Hz range is where masking does its worst damage, and it's because everything lives there. Acoustic guitars have body there. Electric guitars have chunk there. Synth pads fill it. Piano left-hand voicings sit right in it. Vocals have their fundamental and first harmonics there. When you stack four or five elements that all have energy in that band, the cumulative level in 200 to 500 Hz climbs way above everything else in the spectrum.

That buildup doesn't sound like "too much low-mid" when you're focused on individual tracks. It sounds like the vocal is unclear. Like the mix is "muddy." You reach for the vocal fader because you think the vocal is the problem. The vocal is the symptom. The buildup is the disease.

The clarity range for vocals, roughly 2 to 5 kHz, has its own version of this problem. A bright rhythm guitar, a lead synth, or even hi-hats with a lot of body can compete with vocal presence frequencies. When the vocal enters a section with all of those playing, it loses definition even though it was perfectly clear in the verse where fewer elements were competing.
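The low-mid buildup can be sketched numerically. Decibel levels don't add directly: for sources that are roughly uncorrelated (an assumption that is reasonable for different instruments), you convert each level to power, sum, and convert back. The track names and the -18 dB levels below are hypothetical, chosen only to show the effect.

```python
import math

def sum_levels_db(levels_db):
    """Combine uncorrelated sources in one band by summing power, then converting back to dB."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)

# Hypothetical levels in the 200 to 500 Hz band for four stacked tracks.
low_mid_levels_db = {
    "acoustic guitar": -18.0,
    "electric guitar": -18.0,
    "synth pad": -18.0,
    "piano left hand": -18.0,
}

buildup_db = sum_levels_db(low_mid_levels_db.values())
print(round(buildup_db, 2))
```

Four tracks that each look harmless at -18 dB add up to roughly 6 dB more than any one of them in that band. That combined level is the masker the vocal is actually up against, which is why no single track looks guilty when you solo it.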

Five-minute DAW experiment

This takes about five minutes and will change how you think about vocal levels.

1. Open a mix where the vocal feels buried. Something you've been fighting with.

2. Solo the vocal. Listen to the 2 to 5 kHz range. Is it present? Probably yes.

3. Now solo the instrument you suspect is competing. Guitar, synth, whatever. Listen to the same range. Notice how much energy it has up there.

4. Unsolo both and play them together. You'll hear the vocal lose clarity the instant the competing instrument enters.

5. Now, instead of pushing the vocal up, pull the competing instrument down by 2 to 3 dB. Or reach for an EQ and cut 2 to 3 dB in the 2 to 4 kHz range on that instrument.

6. Play the full mix. The vocal will likely be more present without touching the vocal fader at all.

The key insight is that you gave the vocal more room by removing competition, not by adding level. This is almost always more effective than fader rides.

The solution hierarchy

Not all fixes are equal. Some are elegant and some are duct tape.

Arrangement is the best masking solution. If the guitar doesn't play during the vocal phrase, there's no masking. Period. No processing needed. This is why great arrangers and great mix engineers often arrive at the same result from opposite directions. The arranger prevents the problem. The mix engineer treats it after the fact.

Static EQ is the next step. If the guitar needs to play during the vocal, cut the guitar in the range where it competes with the vocal. A 2 to 3 dB shelf or bell cut around 3 kHz on the guitar can free up that space permanently. This works when the masking is consistent.

Dynamic EQ or sidechain compression is for time-varying masking, where the guitar is only a problem when the vocal is active. A dynamic EQ on the guitar, sidechained to the vocal, will dip the guitar's presence range only when the singer is singing. When the vocal stops, the guitar gets its full brightness back. This is more transparent than a static cut because the guitar doesn't sound thin during instrumental sections.

Pushing the fader up is the worst option. It works a little, temporarily, and it makes everything else worse. More vocal level means more vocal energy piling into the low-mids, more competition with other elements that were fine before, and a louder mix that's closer to clipping. You solve one problem and create two.
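The sidechained dynamic EQ idea can be sketched as a gain computer: it reads the vocal's level and decides how many dB to cut from the guitar's presence band. This is a simplified illustration, not how any particular plugin works. The threshold, ramp width, 3 dB maximum cut, and smoothing coefficients are all assumed values, and the block-by-block vocal levels are made up.

```python
def sidechain_cut_db(vocal_level_db, threshold_db=-30.0, max_cut_db=3.0, ramp_db=10.0):
    """Cut (in dB) for the guitar's presence band, driven by the vocal level.

    No cut below the threshold; the cut ramps up linearly over ramp_db
    and is capped at max_cut_db. All defaults are illustrative assumptions.
    """
    over_db = vocal_level_db - threshold_db
    if over_db <= 0:
        return 0.0
    return min(max_cut_db, max_cut_db * over_db / ramp_db)

def smooth_cuts(cuts_db, attack=0.5, release=0.1):
    """One-pole smoothing so the cut engages quickly and lets go slowly."""
    state = 0.0
    smoothed = []
    for target in cuts_db:
        coeff = attack if target > state else release
        state += coeff * (target - state)
        smoothed.append(state)
    return smoothed

# Hypothetical vocal level per processing block: silence, a sung phrase, silence.
vocal_levels_db = [-60.0, -60.0, -20.0, -20.0, -20.0, -60.0, -60.0]

raw_cuts_db = [sidechain_cut_db(level) for level in vocal_levels_db]
guitar_cuts_db = smooth_cuts(raw_cuts_db)
print(guitar_cuts_db)
```

The guitar's presence band is untouched while the vocal is silent, dips toward the 3 dB maximum while the phrase is sung, and recovers gradually afterwards. A real dynamic EQ would apply this cut through a bell filter around the competing range rather than to the whole signal.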

The mistake almost everyone makes

The mistake is treating level as the fix for clarity. They are not the same thing. Level is how loud something is. Clarity is how *separable* something is from everything around it. A vocal at -18 dB in a sparse arrangement with nothing competing in its frequency range will sound more present than a vocal at -6 dB in a dense mix where synths, guitars, and pads are all stacked in the same bands. I've done this comparison in sessions. The quieter vocal wins every time when it comes to intelligibility. When you reach for the fader, ask yourself: is the vocal actually too quiet, or is something else too loud in the vocal's space? Nine times out of ten, it's the second thing.

Producer takeaway

Sometimes the vocal doesn't need to go up. The synth at 3 kHz needs to come down when the vocal enters. That reframe, from "add more vocal" to "subtract the competition," is probably the single most useful mixing concept I can think of. It applies to every element in every mix. If the kick is buried, check the bass in the 60 to 80 Hz range before you boost the kick. If the snare is lost, check the guitars in the 1 to 2 kHz range before you push the snare. Masking isn't a bug. It's just how hearing works. Once you stop fighting it and start working with it, mixes open up in ways that no amount of fader pushing can achieve.
