PHYS341/2023/Project16

From UBC Wiki

The Human Singing Voice

Speech Production

The human voice is considered a musical instrument, but it has many more moving parts than other instruments, and it involves an added element: language. To understand how we sing, we must also understand how we speak!

The Basics of the Vocal Tract

The sub-systems of the vocal tract

There are three sub-systems in the vocal tract, categorized by their relationship to the larynx. The sub-laryngeal system (also known as the sub-glottal system) is made up of everything below the larynx;[1] this is the system that powers sound production.

The larynx itself is known as the laryngeal system, and it contains the bones, muscles, cartilage, and ligaments that actually create the sound with vibration.[1] Pitch is controlled here!

Above that is the supra-laryngeal system, which contains the articulators. Articulators allow the sound to be modified into the sounds of language by changing the shape of the resonance space. This system can also be separated into the nasal cavity and the oral cavity, areas through which the air flows that each contribute to the sound quality.[1]

Fun fact: Every part of the vocal tract performs a function unrelated to speech production! [1]

The Physics of Articulation: Every Single Thing That the Human Voice Can Do

All of the elements of human speech and singing involve a combination of features determined by the airstream mechanism, the state of the larynx, the state of the velar port, the place of articulation, and the manner of articulation.

The airstream mechanism

The airstream that we use to produce speech is controlled by a combination of an air mover and the direction of movement.[1] The mechanism used for the vast majority of speech and singing is pulmonic (moved by the lungs) and egressive (flowing from inside to outside). The other movement possibilities are glottalic (the vocal folds) and velaric (the tongue), and the other airflow direction is ingressive (flowing inward). Note: not every combination is possible.

The lungs actually don’t have any muscles of their own, so the action of taking a breath begins in the diaphragm (and the muscles of the ribcage).[1] When we inhale, the parachute-shaped diaphragm contracts, creating space in the chest cavity, and the lungs enlarge to fill the space. Because the space inside the lungs is now larger, the air pressure decreases; to equalize the differences in pressure inside and outside the body, air is pulled in from the outside of the body. This process is reversed when we exhale.
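The pressure change described above follows Boyle's law: for a fixed amount of gas at constant temperature, pressure times volume stays constant. The Python sketch below is only an illustration. The lung volumes are made-up values chosen for the example, and the function name is hypothetical.

```python
# Boyle's law sketch: P1 * V1 = P2 * V2 (fixed amount of gas, constant temperature).
# The volumes used below are illustrative assumptions, not measurements from the source.

ATMOSPHERIC_PRESSURE_KPA = 101.325  # standard atmosphere


def pressure_after_volume_change(p1_kpa, v1_litres, v2_litres):
    """Pressure once the volume changes from v1 to v2, before any air flows in or out."""
    if v1_litres <= 0 or v2_litres <= 0:
        raise ValueError("volumes must be positive")
    return p1_kpa * v1_litres / v2_litres


v_before = 2.50  # litres (assumed)
v_after = 2.51   # litres, after the diaphragm contracts (assumed)

p_lungs = pressure_after_volume_change(ATMOSPHERIC_PRESSURE_KPA, v_before, v_after)
print(f"Pressure inside the lungs: {p_lungs:.3f} kPa")
print(f"Difference from outside:   {p_lungs - ATMOSPHERIC_PRESSURE_KPA:.3f} kPa")
```

Because the pressure inside is now slightly lower than the pressure outside, air flows in until the two are equal again.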

Trained vocalists must be able to exert meticulous control over the diaphragm because this is the source of the airflow; the speed and pressure of the stream of air moving through each of the subsequent mechanisms have a major effect on the sound that is produced. Good control over the diaphragm also improves note duration and stability: when the diaphragm relaxes slowly and steadily, air is released slowly and steadily, so notes can be held much longer before the lungs run out of air.

The larynx

Spectrum of [a] in female chest register, 261 Hz (C4)

Speech sounds can be either voiced or voiceless.[1] An easy way to understand this is to put two fingers against your voice box and compare what happens when you say “sss” and “zzz”: “s” sounds are voiceless, and “z” sounds are voiced. The difference is simply whether or not the vocal folds are vibrating.

The vocal folds are elastic membranes stretched across the trachea.[1] Their tension can be adjusted by attached muscles in the larynx, and their position can be adjusted by connected cartilages. When they are pulled closed, airflow stops, and when they are pulled apart, air flows freely. Voicing occurs when the vocal folds are held in a position between open and closed: air from the lungs is allowed to pass through, but it exerts force on the folds, causing them to vibrate. Pressure from below pushes the folds apart; as the air speeds up through the narrow gap, its pressure drops (the Bernoulli principle), and this drop, together with the elasticity of the folds, pulls them back together.[2] The cycle then repeats, and the resulting oscillations are the source of the sound-producing vibration.

The frequency of this vibration, or how many times per second the vocal folds oscillate (1 Hz represents one oscillation per second), is perceived as pitch. Pitch can be changed by adjusting the tension of the vocal folds; increased tension leads to a higher vibration frequency.[1]
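To get a feel for how tension and size affect pitch, the sketch below borrows the formula for an ideal stretched string, f = sqrt(T / mu) / (2 L). This is an assumption made purely for illustration: real vocal folds are layered tissue, not strings, and all of the numbers are hypothetical. Only the trends (more tension or a shorter length gives a higher frequency) should be taken from it.

```python
import math


def string_frequency(length_m, tension_n, linear_density_kg_per_m):
    """Fundamental frequency of an ideal string fixed at both ends."""
    return math.sqrt(tension_n / linear_density_kg_per_m) / (2.0 * length_m)


# Hypothetical values, chosen only to show the trends.
length = 0.016   # metres
tension = 0.05   # newtons
density = 0.001  # kilograms per metre

base = string_frequency(length, tension, density)
tighter = string_frequency(length, 4 * tension, density)  # four times the tension
shorter = string_frequency(length / 2, tension, density)  # half the length

print(f"tension x4  -> frequency x{tighter / base:.2f}")
print(f"length  x1/2 -> frequency x{shorter / base:.2f}")
```

In this simplified model the frequency grows with the square root of the tension, so quadrupling the tension doubles the frequency, and halving the vibrating length also doubles it.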

Spectrum of [a] in female falsetto register, 523 Hz (C5)

Vocal range describes the span between the lowest and highest pitches that a vocalist can produce. Range can be extended somewhat through practice, but there are biological limits; the size of the vocal folds determines the range of the voice.[1] Male vocal folds are typically longer and thicker than female vocal folds, and a child’s vocal folds are typically shorter and thinner than an adult’s.

Spectrum of [a] in female whistle register, 1046 Hz (C6)

Vocal registers are differentiated by a change in the muscular mechanism used to produce vibration, and the frequencies at which they occur vary among individuals.[2] The modal register, commonly known as the chest register, is where speech occurs. To produce sound in this register, the muscles within the folds (the vocalis muscles) and around the folds (the cricothyroid muscles) are both tightened.[3] The falsetto register is produced by allowing the vocalis muscles to relax while tightening the cricothyroid muscles.[3] The "register break" is the place in the voice where the natural division between these two registers occurs; with training, it can be smoothed out so that the modal and falsetto registers blend together.[2] Whistle register lies above falsetto and encompasses the highest frequencies produced by the human voice. These frequencies can occur when vibration is restricted to the anterior part of the vocal folds and the posterior area is slightly pulled apart, allowing some air to escape freely.[4]

The velar port

The velar port is the opening at the back of the mouth that connects the mouth and nose. The soft palate can be raised against the walls of the pharynx to close this opening, controlling whether air flows through the oral or nasal cavity. The different shapes, structures, and soft or hard tissues of each cavity affect the resonance of the sound. If the velar port is closed, the air travels through the oral cavity. If the velar port is open, the air travels through the nasal cavity.[1] Humming, for example, is nasalized: the lips are closed, so the air must leave through the nose.

The place of articulation

The articulators mentioned above are grouped into two categories: active and passive.

Active articulators are the parts that actively move to narrow or widen the vocal tract.[1] These are the lower lip, the tongue front, body and root, and the velum (soft palate).

Passive articulators cannot move to make contact with another articulator. Active articulators move to make contact with these, which is what determines the “placement” of the sound.[1] These include the upper lip, upper teeth, hard palate, velum (soft palate), velar port, and pharyngeal wall (the back of the throat).

*Note that the soft palate is in both categories; it can act as a valve or remain passive.

Clearly, there are numerous possible shapes that can be made in the oral cavity. Each subtle change in the articulators changes the shape of the resonance space. Formant frequencies are the frequencies that naturally resonate in the vocal tract; these frequencies have a higher amplitude than other frequencies in the harmonic series because they are enhanced by the shape of the instrument.[5] Formant frequencies are a key factor in determining the timbre of a sound, as the spectra accompanying this article show.
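A common textbook simplification, used here as an assumption rather than something taken from the sources above, treats the vocal tract for a neutral vowel as a uniform tube that is closed at the vocal folds and open at the lips. Such a tube resonates at odd multiples of a quarter wavelength: F_n = (2n - 1) c / (4 L). The tube length and speed of sound below are assumed round numbers.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # dry air at about 20 degrees C (assumed)


def tube_formants(tract_length_m, count):
    """Resonances of a uniform tube closed at one end and open at the other."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_M_PER_S / (4.0 * tract_length_m)
        for n in range(1, count + 1)
    ]


# 0.17 m is an assumed, typical adult vocal tract length.
for number, frequency in enumerate(tube_formants(0.17, 3), start=1):
    print(f"F{number} is roughly {frequency:.0f} Hz")
```

With these assumptions the first three resonances fall near 500, 1500, and 2500 Hz. Moving the articulators changes the tube's shape, so real vowels shift these values up or down, which is why each vowel has its own timbre.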

The manner of articulation

The constriction made by the articulators determines the manner of articulation. There are three ways that air moving through the vocal tract can become sound:

  • A complete constriction will stop the airflow, causing pressure to build up and interfering with vocal fold vibration. The release of this pressure creates the sound. This manner of articulation is used in beatboxing!
  • A partial constriction forces the air stream through a narrow slit or groove. Hissing is an example of this.
  • An open vocal tract occurs in the absence of a constriction, creating a resonant sound. All vowels (and a few consonants) are in this category.[1] In singing, this is the most important manner of articulation.

Vocal Technique: The Voice as a Musical Instrument

Vocalists are trained to be able to adjust each component of the vocal tract. This learned “technique” allows relaxed control over each aspect of musical sound that a person can produce with their instrument.

The Characteristics of Musical Tone

Pitch

The frequency at which the vocal folds vibrate is called the fundamental frequency. Each musical tone, unless it is a pure tone, contains a number of component frequencies (partials), the lowest of which is the fundamental.[6] Harmonics are partials whose frequencies are integer multiples of the fundamental.
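As a small worked example, the sketch below computes the fundamentals of the notes shown in the figure captions (C4, C5, C6) and the first few harmonics of one of them. It assumes equal temperament with A4 tuned to 440 Hz, which the sources above do not specify; the function names are hypothetical.

```python
from typing import List

A4_HZ = 440.0  # assumed reference tuning


def note_frequency(semitones_from_a4: int) -> float:
    """Equal-tempered frequency of the note a given number of semitones from A4."""
    return A4_HZ * 2.0 ** (semitones_from_a4 / 12.0)


def harmonics(fundamental_hz: float, count: int) -> List[float]:
    """The first `count` harmonics: integer multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


# C4 is 9 semitones below A4; C5 and C6 are 3 and 15 semitones above it.
for name, offset in [("C4", -9), ("C5", 3), ("C6", 15)]:
    print(f"{name}: {note_frequency(offset):.2f} Hz")

print([round(f, 1) for f in harmonics(note_frequency(-9), 4)])
```

Each octave doubles the frequency, so the second harmonic of C4 lands on C5 and the fourth lands on C6.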

Intensity

Intensity refers to how much energy a sound wave carries, and it is perceived as loudness (volume); higher energy results in higher intensity, which results in a louder sound.[2][6] It is mainly controlled by the sub-laryngeal system, through the air pressure delivered to the vocal folds.

Waveform of [a] in chest register, 523 Hz (C5)
Waveform of [a] in falsetto register, 523 Hz (C5)
Waveform of [a] in falsetto register, with vibrato, 523 Hz (C5)

Quality (timbre)

Quality refers to any properties not belonging to the other characteristics of musical tone.[7] The particular timbre that a sound has is a result of the pattern of overlapping periodic sinusoidal waves within it.[3] Each of these individual waves represents one of the harmonics, with its own frequency and intensity.[8]
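The idea that timbre comes from the mix of sinusoidal waves in a sound can be sketched with additive synthesis: adding sine waves at harmonic frequencies, each with its own amplitude. The two amplitude lists below are invented for illustration and do not come from the recordings in this article; the sample rate is also an assumption.

```python
import math
from typing import List, Sequence

SAMPLE_RATE = 44100  # samples per second (assumed)


def synthesize(fundamental_hz: float,
               harmonic_amplitudes: Sequence[float],
               duration_s: float) -> List[float]:
    """Sum sine waves at integer multiples of the fundamental."""
    n_samples = round(SAMPLE_RATE * duration_s)
    samples = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        value = sum(
            amplitude * math.sin(2.0 * math.pi * (k + 1) * fundamental_hz * t)
            for k, amplitude in enumerate(harmonic_amplitudes)
        )
        samples.append(value)
    return samples


# Same pitch (C5), two hypothetical harmonic "recipes".
mellow = synthesize(523.25, [1.0, 0.3, 0.1], 0.01)
bright = synthesize(523.25, [1.0, 0.8, 0.6, 0.5, 0.4], 0.01)

print(f"peak of mellow tone: {max(abs(s) for s in mellow):.2f}")
print(f"peak of bright tone: {max(abs(s) for s in bright):.2f}")
```

Both tones repeat at the same rate, so they have the same pitch, but their waveforms have different shapes because the harmonics are mixed in different proportions.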

Other contributors to timbre include the sound envelope (attack, sustain, and decay) and vibrato.

Attack, sustain, and decay

The onset of a note and the changes that occur before it reaches peak loudness are known as the attack. Sustain is the steady portion of the note that follows. Decay refers to how long the sound energy continues after production ceases.[3] Vocal onset and offset depend on coordination between closing the vocal folds and starting and stopping the breath. For example, a glottal stop is the result of closing the vocal folds before the airstream begins.[1]
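The three parts of the envelope can be sketched as a simple amplitude curve over time. The straight-line segments and the durations below are assumptions chosen for illustration; a real sung note has a much less regular shape.

```python
def envelope(t_s, attack_s=0.05, sustain_s=0.50, decay_s=0.20):
    """Piecewise-linear loudness envelope, from 0 (silent) to 1 (peak)."""
    if t_s < 0.0:
        return 0.0
    if t_s < attack_s:
        return t_s / attack_s          # attack: rising toward peak loudness
    if t_s < attack_s + sustain_s:
        return 1.0                     # sustain: held at peak loudness
    end_s = attack_s + sustain_s + decay_s
    if t_s < end_s:
        return (end_s - t_s) / decay_s # decay: fading after production stops
    return 0.0


for t in [0.0, 0.025, 0.3, 0.65, 1.0]:
    print(f"t = {t:.3f} s -> level {envelope(t):.2f}")
```

Shortening `attack_s` models a harder onset, while lengthening `decay_s` models a sound that lingers after the singer stops.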

Vibrato

Vibrato is an ornamentation technique in which the pitch oscillates rapidly around the pitch of the sustained note (less than a semitone on either side) at a rate of 4-8 Hz.[6] The image below shows vibrato at a rate of 5 Hz. Vibrato occurs as a natural pulse in the tension of the vocal folds when the air stream is controlled and steady, but its mechanism is not well understood.

Spectrogram (1 second) of [a] in falsetto register, with vibrato, 523 Hz (C5)
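Vibrato can be modelled, as a rough sketch, by letting the pitch swing sinusoidally around the sustained note. The sinusoidal shape and the half-semitone depth below are assumptions; the 5 Hz rate and the C5 centre pitch match the figure caption.

```python
import math


def vibrato_frequency(t_s, center_hz=523.25, rate_hz=5.0, depth_semitones=0.5):
    """Instantaneous frequency of a note with sinusoidal vibrato."""
    offset_semitones = depth_semitones * math.sin(2.0 * math.pi * rate_hz * t_s)
    return center_hz * 2.0 ** (offset_semitones / 12.0)


# Sample one second of the pitch contour at 1000 points.
contour = [vibrato_frequency(i / 1000.0) for i in range(1000)]
print(f"lowest pitch:  {min(contour):.1f} Hz")
print(f"highest pitch: {max(contour):.1f} Hz")
```

The depth is expressed in semitones rather than hertz so that the swing stays the same musical size regardless of which note is being sung.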

Visualizing Timbre in the Voice

Bright timbre is rich in overtones.[6]

Spectrogram of [a] with bright tone, 1046 Hz (C6)
Spectrum of [a] with bright tone, 1046 Hz (C6)

Warm and light timbre:

Spectrogram of [a] with light tone, 523 Hz (C5)
Spectrum of [a] with light tone, 523 Hz (C5)

Darker and fuller timbre:

Spectrogram of [a] in dark tone, 523 Hz (C5)
Spectrum of [a] in dark tone, 523 Hz (C5)

Breathy (aspirated): the vocal folds do not close completely, allowing more breath to escape.[3]

Spectrogram of [a] in breathy tone, 261 Hz (C4)
Spectrum of [a] in breathy tone, 523 Hz (C5)

Creaky voice, also known as vocal fry, is produced by constricting the larynx.

Spectrogram of [a] with vocal fry, 261 Hz (C4)
Spectrum of [a] in creaky tone, 523 Hz (C5)

References

  1. Zsiga, E. C. (2013). The Sounds of Language: An Introduction to Phonetics and Phonology. Wiley-Blackwell.
  2. Frisell, A. (1966). The Soprano Voice. Bruce Humphries Publishers.
  3. Moore, G. D. (2006). Physics of Music Lecture Notes.
  4. Mathieson, L. (2001). Greene and Mathieson's The Voice & Its Disorders. Whurr.
  5. Sundberg, J. (1981). The voice as a sound generator. Royal Swedish Academy of Music, 33, 6-14.
  6. White, H. E., & White, D. H. (1980). Physics and Music: The Science of Musical Sound. Saunders College/Holt, Rinehart and Winston.
  7. Field-Hyde, F. C. (1950). The Art and Science of Voice Training. Oxford University Press.
  8. Britannica, The Editors of Encyclopaedia. "timbre". Encyclopedia Britannica, 1 Feb. 2018, https://www.britannica.com/science/timbre. Accessed 29 March 2023.