Medical Masks and the Human Voice

Kristen Eredics

Introduction

Transmission of COVID-19

In late 2019, a new respiratory disease was identified and subsequently named coronavirus disease 2019 (COVID-19). The fast-spreading disease prompted the World Health Organization to officially declare a pandemic in March 2020. To combat the spread of the virus, nations across the world implemented stay-at-home orders, mandatory mask use, and social distancing rules.^[1]

Masks have proven to be effective at mitigating transmission within communities.^[2] Even two years later, they remain a large part of everyday life. This brings me to my question: can wearing a mask affect our voice? After searching the UBC Library, I discovered that I am not the only one with this question, so I have compiled research that will hopefully answer it. I also took the opportunity to measure my own voice with and without a KN95 mask to determine whether my results align with the research. The following sections outline what researchers have found and how society has learned to adapt its voice to this new reality.

Background on the Human Voice

Diagram of the human vocal tract

Anatomy of the vocal tract

To understand how masks could affect the human voice, it is important to have a general understanding of the vocal tract. The vocal tract has two cavities, the oral cavity and the nasal cavity. The oral cavity contains the lips, teeth, tongue, alveolar ridge, hard palate, velum and uvula. Further down are the pharynx, epiglottis and larynx. The larynx contains the vocal folds and the glottis.

The vocal folds, located in the larynx, are essential to the creation of sound. They move back and forth, creating a vibration that modulates the air exiting the lungs through the trachea. You can shape your vocal folds in ways that create different combinations of frequencies, which generate the vowels we hear in everyday speech.^[3] The vocal folds are also important to understanding frequency and harmonics.

The vocal folds can be referred to as the source in sound production: the source is the sound produced by the vocal folds when voicing. Anything above the larynx, on the other hand, acts as a filter that modifies the waveforms produced by the source. This part of the vocal tract filters sound through its role as an articulator, shaping the oral consonants we hear in everyday speech and thus modifying the waveforms. The vocal tract is essentially an acoustic filter, a technical term for a filter that removes some of the frequencies and oscillations produced by a source or 'power system'. Acoustic filters may be low-pass filters, which remove unwanted high frequencies, or high-pass filters, which remove unwanted low frequencies.^[4]
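To make the idea of a low-pass filter concrete, the sketch below computes how much an idealized first-order low-pass filter reduces a tone at a few frequencies. It is only an illustration: the function name, the 1000 Hz cutoff and the test frequencies are assumptions chosen for this example, and the formula describes a textbook filter, not a real vocal tract or mask.

```python
import math

def lowpass_gain_db(frequency_hz: float, cutoff_hz: float) -> float:
    """Gain in dB of an idealized first-order low-pass filter at one frequency."""
    magnitude = 1.0 / math.sqrt(1.0 + (frequency_hz / cutoff_hz) ** 2)
    return 20.0 * math.log10(magnitude)

CUTOFF_HZ = 1000.0  # hypothetical cutoff, chosen only for illustration

# Frequencies well below the cutoff pass almost unchanged;
# frequencies above it are reduced more and more.
for f in (100.0, 1000.0, 4000.0):
    print(f"{f:6.0f} Hz: {lowpass_gain_db(f, CUTOFF_HZ):6.2f} dB")
```

In this sketch the gain is close to 0 dB at 100 Hz, about -3 dB at the cutoff, and clearly lower at 4000 Hz, which is the sense in which a low-pass filter "removes" high frequencies.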

An important term when reading or speaking about acoustic filters is acoustic attenuation, which is the loss of energy (in this case, sound energy) in a medium.^[5] Attenuation affects the frequencies that define the aspects of speech used to communicate, and can therefore impair speech intelligibility.^[6] It will come up later when discussing masks and their role in filtering or attenuating sound.
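Attenuation is commonly expressed in decibels, based on the ratio between the amplitude entering a medium and the amplitude leaving it. The sketch below shows that calculation; the function name and the example amplitudes are hypothetical and are not taken from any of the cited studies.

```python
import math

def attenuation_db(amplitude_in: float, amplitude_out: float) -> float:
    """Return the attenuation in dB between an input and an output amplitude."""
    if amplitude_in <= 0 or amplitude_out <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(amplitude_in / amplitude_out)

# Hypothetical example: a medium that halves the amplitude of a sound wave.
loss = attenuation_db(1.0, 0.5)
print(f"{loss:.2f} dB")  # about 6.02 dB
```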

Formants, Frequencies and Intensity

When the vocal folds vibrate, they produce complex periodic waves. A complex periodic wave is a combination of two or more sine waves, which generates a repeating wave pattern. The number of times this pattern repeats per second is what we know as the fundamental frequency (f0). This frequency is directly related to pitch.^[3] Pitch is "the quality of a sound governed by the rate of vibrations producing it; the degree of highness or lowness of a tone".^[7] It is associated with the length, tension and mass of the vocal folds.^[8]
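The relationship between a complex periodic wave and its fundamental frequency can be sketched in a few lines of Python. The example builds a wave from three sine waves (100, 200 and 300 Hz) and then estimates how often the combined pattern repeats using a simple autocorrelation search. The sample rate, the amplitudes, the 75 to 400 Hz search range and the function names are all assumptions made for this illustration; this is not the method used to produce the measurements reported later on this page.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def complex_wave(f0, amplitudes, n_samples):
    """Sum sine waves at f0, 2*f0, 3*f0, ... with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * (k + 1) * f0 * i / SAMPLE_RATE)
            for k, a in enumerate(amplitudes))
        for i in range(n_samples)
    ]

def estimate_f0(samples, fmin=75.0, fmax=400.0):
    """Estimate f0 as the lag at which the wave best matches a shifted copy of itself."""
    min_lag = int(SAMPLE_RATE / fmax)
    max_lag = int(SAMPLE_RATE / fmin)

    def mean_autocorr(lag):
        n = len(samples) - lag
        return sum(samples[i] * samples[i + lag] for i in range(n)) / n

    best_lag = max(range(min_lag, max_lag + 1), key=mean_autocorr)
    return SAMPLE_RATE / best_lag

# Three sine waves at 100, 200 and 300 Hz combine into one repeating pattern.
wave = complex_wave(100.0, [1.0, 0.5, 0.25], n_samples=400)
print(estimate_f0(wave))  # 100.0
```

Even though the wave contains energy at 200 Hz and 300 Hz, the combined pattern repeats 100 times per second, so the estimated fundamental frequency is 100 Hz.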

The fundamental frequency is also related to harmonics.^[3] The source of sound (the vibration of the vocal folds) contains several vibrations at once, of which the strongest and slowest is the fundamental frequency. The faster vibrations that occur simultaneously are known as the harmonics. This series of harmonic tones is then filtered by the vocal tract, which produces the formants. Depending on how the vocal tract filters sound, it will resonate at different frequencies.^[3] Formants can be changed by changing the size and shape of the vocal tract. When the frequencies are close to a resonance frequency of the vocal tract, they pass through freely, producing formants and generating peaks in spectral slices. There are three or four formants relevant to phonetics. Each formant (F1, F2, F3, F4) is measured in frequency, and the frequencies of these formants can help determine what is happening in the vocal tract (tongue body height, tongue body advancement, what is being said, identifying speaker voices, etc.).^[9]

Intensity reflects the amplitude of vocal fold vibration and is essentially what we perceive as loudness. It depends on the closure of the glottis and on vocal fold tension.^[8]

Tube model

The vocal tract can also be represented using the tube model! In the picture provided, we see one small tube that is closed and connected to a bigger tube that is open. The glottis is at the end of the small tube, where it is closed. The lips represent the opening of the mouth. This model helps conceptualize the idea that our vocal system is essentially an instrument as well.

tube model of vocal tract
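A common simplification of the tube model treats the whole vocal tract as a single uniform tube that is closed at the glottis and open at the lips. Such a tube resonates at odd multiples of c / (4L), where c is the speed of sound and L is the tube length. The sketch below computes the first three resonances under that simplification. The 17.5 cm length and the 35,000 cm/s speed of sound are assumed textbook-style values, not measurements of my own vocal tract, and the single-tube formula is simpler than the two-tube picture described above.

```python
def tube_resonances(length_cm, n_resonances=3, speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, n_resonances + 1)
    ]

# Assumed vocal tract length of 17.5 cm (illustrative value only).
for n, freq in enumerate(tube_resonances(17.5), start=1):
    print(f"Resonance {n}: {freq:.0f} Hz")
```

Under these assumptions the resonances fall near 500, 1500 and 2500 Hz. A shorter tube gives higher resonances, which is one way to see why changing the size and shape of the vocal tract changes the formants.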

Measurements wearing a KN95 vs without a KN95

Out of curiosity, I decided to test my own voice to see if a mask would affect its acoustics. I recorded a mono-stereo sample of my voice in a quiet setting while wearing a KN95 mask, along with a control sample in which I was not wearing a mask. The target note that I produced was a D♭4. The test was not performed in a social context where intelligibility would influence speech production. I went into this task without any expectations, except that the KN95 mask sample would sound more muffled than the control.

Recordings

Control scenario: D♭4 without a mask	D♭4 wearing a KN95 mask

Spectrum graphs with and without a KN95 mask

Spectrum of my voice when not wearing a KN95	Spectrum of my voice while wearing a KN95

Results

The results from the two recordings showed both similarities and differences. The fundamental frequency did not change meaningfully (273 Hz without the mask, 274 Hz with it), meaning the note I produced remained a D♭4. The intensity (amplitude or loudness) remained at 67 dB in both conditions. The F1 also did not change meaningfully (842 Hz without the mask, 843 Hz with it). The F2 and F3, however, differed considerably between the two recordings. Without a KN95 mask, the F2 was 1332 Hz, higher than the 1184 Hz measured while wearing a KN95. Likewise, the F3 was 1985 Hz without a KN95 mask and 1687 Hz with one. In short, pitch and F1 did not change when I put on the KN95 mask, whereas F2 and F3 decreased.

Measurement	Not wearing a mask	Wearing a KN95
Note	D♭4	D♭4
Pitch (F0)	273 Hz	274 Hz
F1	842 Hz	843 Hz
F2	1332 Hz	1184 Hz
F3	1985 Hz	1687 Hz
Intensity	67 dB	67 dB
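To put the size of these differences in perspective, the short Python sketch below computes the absolute and percentage change for each frequency measurement. The values are copied from the table above; the variable and function names are my own and only serve this illustration.

```python
# (not wearing a mask, wearing a KN95) in Hz, taken from the table above
measurements = {
    "F0": (273, 274),
    "F1": (842, 843),
    "F2": (1332, 1184),
    "F3": (1985, 1687),
}

def percent_change(no_mask, mask):
    """Change from the no-mask value to the mask value, in percent."""
    return 100.0 * (mask - no_mask) / no_mask

for name, (no_mask, mask) in measurements.items():
    diff = mask - no_mask
    print(f"{name}: {diff:+d} Hz ({percent_change(no_mask, mask):+.1f}%)")
```

The F0 and F1 values each differ by 1 Hz (well under 1%), while F2 drops by 148 Hz (about 11%) and F3 by 298 Hz (about 15%).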

Interpretations

Comparing the values between the two conditions (control and wearing a KN95), I notice two things. First, the pitch, F1 and intensity remained unchanged, meaning I maintained the same note at the same loudness with and without a KN95 mask. Second, the F2 and F3 were markedly higher when not wearing a KN95 mask than when wearing one.

Other researchers have found the same trend, with lower F2 and F3 values when wearing a mask and unchanged pitch and F1 values. The researchers stated two possibilities for the lower F2 and F3 values when wearing a mask^[10]:

"Formant frequencies [F2 and F3] represent characteristics of articulation and resonance. They are produced by the vocal tract, which extends from the lips to the vocal folds, and can be influenced by several factors, such as vocal tract length, lips closure pattern, tongue volume and position and lowering of the mandible. These changes in formant frequencies seem to result from the differential filtering effects of masks; they may be influenced by the potential involuntary adjustments of vocal tract properties by mask wearers, to be heard".^[10]
"it is known that attenuation effects of face masks seem to be higher (with more energy transmission loss) in the higher frequencies, generally above 1000 Hz, as seen in previous studies. F2 and F3 are usually detected above 1000–1500 Hz, so masks may cause a transmission loss at these frequencies and explain the encountered changes."^[10]

The results of my recordings can therefore be interpreted in two ways: the mask may have filtered my voice, since my F2 and F3 values lie above 1000 Hz, where acoustic transmission loss is reported to be greater, and/or I may have actively changed my vocal tract to be better understood (or heard) by listeners.

Current Research

A study conducted by Gama et al. shared very similar results to the mini-study that I conducted. It also found altered formant frequencies for F2 and F3 and no change in the fundamental frequency. The one difference between my measurements and the study's was the amplitude, which increased in Gama et al.'s paper. In my study, the amplitude decreased only in the decimals, so I did not consider the change large enough to state that my amplitude changed. They also found a transmission loss at around 1000 Hz.^[10]

Other studies had conflicting results, with increases in fundamental frequency, F1 and F2 and a decrease in F3 when wearing a medical mask. However, these studies were performed in social contexts, which may have altered the findings due to speech intelligibility factors. They also found an increase in intensity and a noticeable transmission loss above 2000 Hz. It is important to note that this study was conducted using a cloth face mask, whereas the prior study that I discussed used a KN95, as I did.^[8] ^[1]

Based on these results, researchers believe that speech becomes less intelligible when wearing medical masks, causing participants to adjust their vocal tract and increase their pitch and loudness in order to sound clear. Background noise also increased when intelligibility decreased, causing the participants to respond with a stronger voice. There was no significant difference between male and female participants.^[8] Furthermore, transmission loss results in speech becoming unintelligible, even if the mask has as few layers as a cloth mask.^[6] ^[10]

Future research

Based on the mini-study conducted, I would advise others to run more trials to determine if there is a change in amplitude. Furthermore, social context should be taken into account. I also believe a more in-depth literature review should be conducted, as I feel there is much more research out there that I may not yet understand. Thank you.

References

[1] Shekaraiah, S.; Suresh, K. (2021). "Effect of face mask on voice production during COVID-19 pandemic: A systematic review". Journal of Voice.
[2] Rao, Isabelle J.; et al. (2021). "Effectiveness of Face Masks in Reducing the Spread of COVID-19: A Model-Based Analysis". Medical Decision Making.
[3] Johnson, Keith (2012). Acoustic and Auditory Phonetics. Wiley-Blackwell.
[4] Tokuda, Isao (2021). "The Source–Filter Theory of Speech". Oxford Research Encyclopedia of Linguistics.
[5] Foley, Dennis (17 May 2018). "What Is Attenuation?".
[6] Pörschmann, C.; Lübeck, T.; Arend, J. M. (2020). "Impact of face masks on voice radiation". The Journal of the Acoustical Society of America.
[7] Oxford Languages Definition via Google.
[8] Lin, Y.; Cheng, L.; Wang, Q.; Xu, W. (2021). "Effects of medical masks on voice assessment during the COVID-19 pandemic". Journal of Voice.
[9] Wood, Sidney (15 Jan 2005). "What are formants?".
[10] Gama, R.; Castro, Maria E.; Titske van Lith-Bijl, Julia; Desuter, Gauthier (22 Sep 2021). "Does the wearing of masks change voice and speech parameters?". European Archives of Oto-Rhino-Laryngology. European Laryngological Society. doi:10.1007/s00405-021-07086-9.
