
Human Ear

  • Writer: Matheus Antunes
  • Nov 3
  • 4 min read

In the previous post, I talked about the Decibel and how that scale simplifies the mathematical relationship between two quantities. I often used the term "perception" to refer to how the human ear actually hears different intensities and frequencies. Now, let's dissect exactly what that "perception" means by exploring the biology behind how we hear.
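Just to make that recap concrete, here is a minimal Python sketch of the decibel relationship from that post (the function names are mine, invented for illustration, not from any library):

```python
import math

def power_db(p, p_ref):
    """Level difference in dB between two power-like quantities."""
    return 10 * math.log10(p / p_ref)

def amplitude_db(a, a_ref):
    """Level difference in dB between two amplitude-like quantities
    (voltage, sound pressure); the factor doubles to 20."""
    return 20 * math.log10(a / a_ref)

print(power_db(1000, 1))   # 30.0 -> a 1000x power ratio is just 30 dB
print(amplitude_db(2, 1))  # ~6.02 -> doubling sound pressure adds ~6 dB
```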


Before anything else, it's worth noting that I am by no means an expert in audiology. How the ear works and how the brain processes sound is a deep field of study. Practically everything I've learned and present in this text is derived from my study of Bob Katz's work, "Mastering Audio: The Art and the Science". What I share here is my interpretation and what I've absorbed from this incredible source, which I consider essential reading for any audio professional.


How We Hear


Understanding how sound is processed by our auditory system and interpreted by the brain is fundamental for those who work in audio, as it's not just about physics, but about the biology and psychology of perception. Hearing is a complex process that transforms mechanical pressure waves into electrical impulses, which the brain decodes into sonic experiences.


Mechanical Capture and Amplification


It all begins in the outer ear, composed of the pinna (auricle) and the ear canal. The pinna functions as a funnel and a reflector, optimized to capture sound waves and direct them into the canal, while its complex shape also helps us identify the direction of a sound. The ear canal, in turn, amplifies certain frequencies, especially between 2 kHz and 5 kHz, a range crucial for speech intelligibility. At the end of the canal, the sound waves make the eardrum vibrate. These vibrations are transmitted to the middle ear, where a chain of three small bones (hammer, anvil, and stirrup) acts as a lever system. It amplifies the force of the vibrations, a vital step, since they must move the dense fluid of the next stage: the inner ear.
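The post doesn't say why the boost lands in that range, but a common textbook simplification models the canal as a tube closed at one end by the eardrum, so its main resonance sits at a quarter wavelength. A rough sketch, with typical assumed dimensions (none of these numbers come from the post):

```python
# Toy estimate of the ear canal's main resonance, modeling it as a tube
# closed at one end (the eardrum): f = c / (4 * L).
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 °C
CANAL_LENGTH = 0.025    # m, typical adult ear canal (textbook figure)

resonance_hz = SPEED_OF_SOUND / (4 * CANAL_LENGTH)
print(f"{resonance_hz:.0f} Hz")  # ~3430 Hz, right inside the 2-5 kHz boost
```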


Frequency Analysis in the Inner Ear


The stirrup is connected to the cochlea, the fluid-filled, snail-shaped structure of the inner ear. The stirrup's vibrations create pressure waves in this fluid. Inside the cochlea lies the basilar membrane, the most important part for frequency analysis. Different frequencies cause maximum vibration at different points along this membrane: high frequencies resonate at the base, while low frequencies resonate at the apex. On this membrane sit the hair cells, the true electrical transducers. When the basilar membrane vibrates, the hair cells bend, generating electrical impulses that are sent down the auditory nerve.
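The post describes this place principle qualitatively; a standard quantitative model from the psychoacoustics literature is the Greenwood function, which maps a position on the membrane to the frequency it responds to. A sketch using the published human constants (these values are from the literature, not from the post):

```python
def greenwood_frequency(x):
    """Characteristic frequency (Hz) at position x on the basilar membrane,
    where x = 0 is the apex (lows) and x = 1 is the base (highs).
    Human constants from Greenwood (1990): F = A * (10**(a*x) - k)."""
    A, a, k = 165.4, 2.1, 0.88
    return A * (10 ** (a * x) - k)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f} -> {greenwood_frequency(x):8.0f} Hz")
# Runs from ~20 Hz at the apex up to ~20,000 Hz at the base,
# covering the classic range of human hearing.
```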


Brain Processing and the Concept of Recruitment


The auditory nerve carries these impulses to the brain, where sophisticated processing occurs. The signal passes through the brainstem, which analyzes location, and the thalamus, which directs it, until it reaches the auditory cortex, where sound is truly interpreted. It is here that we distinguish speech from music and identify timbres.

The brain does not process sound like a microphone; it actively interprets it. The way volume is encoded is a good example of this: perceived volume is not determined solely by the firing rate of individual neurons, but crucially by the number of neurons activated, a concept known as "recruitment."

The auditory nerve's neurons fire in an "all-or-nothing" fashion, a binary principle. For a very faint sound, only a very small group of hair cells at the exact frequency point vibrates, activating few neurons. The brain interprets this as "faint sound." However, for a very loud sound, the vibration on the basilar membrane is so intense that it "spreads" and "blurs" to the sides, activating neighboring cells that normally respond to other frequencies. The brain now receives an avalanche of signals from a large quantity of neurons and interprets this massive neural activity as "very loud sound."
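To make this tangible, here is a deliberately crude toy model. Every number in it is invented for illustration, none is a physiological measurement: each "channel" stands for a group of hair cells at one spot on the membrane, firing all-or-nothing once the excitation at its place crosses a threshold, with excitation spreading outward from the stimulus frequency as the level rises.

```python
def recruited_channels(level_db, half_width=100, falloff_db=1.0):
    """Count channels whose local excitation exceeds a 0 dB threshold.

    The channel at the stimulus frequency sees the full level; each
    neighbor sees it reduced by `falloff_db` per channel of distance.
    """
    count = 0
    for distance in range(-half_width, half_width + 1):
        excitation = level_db - abs(distance) * falloff_db
        if excitation > 0:  # all-or-nothing: the channel fires or it doesn't
            count += 1
    return count

for level in (5, 20, 40, 80):
    print(f"{level:>2} dB -> {recruited_channels(level):>3} channels firing")
# A faint sound activates a handful of channels; a loud one recruits
# neighbors tuned to other frequencies, so the count grows dramatically.
```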

It is this combined response of rate and recruitment, which is roughly logarithmic, that explains why the Decibel scale fits our perception so well. This phenomenon is also the physiological basis of "masking," where a loud sound recruits so many cells that it hides fainter sounds at nearby frequencies. Likewise, it explains the Fletcher-Munson Curves: at high volumes, the recruitment of low- and high-frequency cells makes the sound seem "fuller" than it does at low volumes.
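A classic rule of thumb that captures this logarithmic behavior is Stevens' phon-to-sone mapping, in which roughly every 10 dB of level doubles perceived loudness. This formula comes from the psychoacoustics literature, not from the post itself:

```python
def phons_to_sones(phons):
    """Perceived loudness (sones) for a loudness level (phons)."""
    return 2 ** ((phons - 40) / 10)  # 40 phon = 1 sone by definition

for level in (40, 50, 60, 70, 80):
    print(f"{level} phon -> {phons_to_sones(level):4.1f} sone")
# 50 phon sounds about twice as loud as 40; 80 phon about 16x as loud,
# even though the physical intensity ratio is 10,000 to 1.
```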

The brain is also a master at creating a 3D sound image by comparing the minuscule differences in time and intensity between the two ears, as well as filtering out irrelevant noise to focus on what is important. All this complexity shows us that our auditory perception is an incredibly sophisticated analyzer, and that is why our work in audio must always consider how this complex biological and neural system will interpret sound.
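For the time half of that two-ear comparison, the classic Woodworth spherical-head approximation gives a feel for just how minuscule these interaural delays are. Head radius and speed of sound below are typical assumed values, not figures from the post:

```python
import math

# Rough interaural time difference (ITD) for a source at a given azimuth,
# per the Woodworth spherical-head model: ITD = (r / c) * (theta + sin(theta)).
HEAD_RADIUS = 0.0875     # m, average adult head
SPEED_OF_SOUND = 343.0   # m/s

def itd_seconds(azimuth_deg):
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"{az:>2} deg -> {itd_seconds(az) * 1e6:3.0f} microseconds")
# Even at 90 degrees the delay is well under a millisecond, yet the
# brain resolves differences this small to place sounds in space.
```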


Conclusion


The human auditory system is much more than a simple passive transducer, like a microphone. It is an active, complex, and profoundly non-linear processor that interprets, filters, and, in essence, constructs the sonic reality we perceive. The sound's journey doesn't end at the eardrum; it only begins, passing through a series of mechanical, hydraulic, and neural transformations.

For the audio professional, this understanding is the difference between being a gear technician and being a sound engineer. Your real job is not just to manipulate volts (dBu) or bits (dBFS); it is to create a stimulus that effectively "communicates" with this biological system.

When you equalize a kick drum so it doesn't mask the bass, you are working directly against the phenomenon of neural "recruitment." When you check your mix at low volumes and notice the bass and treble have disappeared, you are witnessing the Fletcher-Munson Curves in action, dictated by the variable sensitivity of the hair cells.

Every tool we use, from compression to stereo panning, is ultimately a way to optimize the signal for how the brain decodes pitch, intensity, and space. Your console or your DAW is just the interface. Your true destination, the final instrument you are "playing," is the listener's perceptual system.


