Effect of tone of voice on language acquisition in infants

Introduction

What do we plan to do: Proposed project

We plan to run an experimental study of language acquisition that examines whether tone of voice affects how 8-month-old infants learn unfamiliar words. We will conduct a transitional probability study with three groups of babies. Each group will be presented with a list of syllables read in a different tone of voice by a speaker on a screen. We will investigate whether the babies' learning differs when the stream of syllables is presented in one of three tones: monotonous, happy or angry. Will the tone of voice affect the baby's ability to segment words from a random sequence of syllables, and thereby to recognise and learn new words?

Hypothesis: There will be a difference in the looking times for the recognised words in the assessment stage between the three tone-of-voice conditions.

Why is this important: What is new to the table

Previous studies have shown that babies at the age of 8 months are capable of using statistical information to distinguish word boundaries (Aslin, Saffran & Newport 1998). Distinguishing between syllables and segmenting words is an important part of the linguistic knowledge of a native language. These studies examined the acquisition of an artificial language that relied on one source of information, transitional probability. Natural word segmentation relies on many types of information. Prosody, pauses and utterance boundaries are examples, but tone of voice is also an important source of information. In this experiment, we will combine two aspects of spoken language: we will investigate how babies acquire a linguistic property, word segmentation, when a nonlinguistic aspect, the emotional tone of voice, is changed. We will study whether tone of voice affects a child’s ability to segment words and, if so, what that effect is. This adds tone of voice to what is already known about word segmentation, which could help us understand the conditions under which babies best learn their native language.

The reason we think that tone of voice has an impact on language acquisition comes from another study (Nygaard & Queen 2008). This study focusses on how one source of information, varying emotional tone, is used in processing linguistic content. It shows that a nonlinguistic property, emotional tone of voice, influences the timing of lexical processing and selection. It also states that linguistic content and the identification of emotional tone of voice are dissociable, yet dependent on each other. However, this study examines adults, who can combine what is being said with how it is being said. They understand the meaning of the words and are not acquiring new words, as the babies in our study are.

When looking at research around our chosen topic, we found that studies either looked at language acquisition in infants across a range of different languages (Pelucchi, Hay & Saffran 2009), or looked at adults processing a single language spoken in different tones of voice (Nygaard & Queen 2008).

This combination of research is what inspired our project: we look at language acquisition in infants, but also at how tone of voice affects it, so we are using different aspects of two different studies to shape our research project. We also feel that the project is important because our results may be able to show the importance of tone of voice when teaching a child, in order to maximize the chances of them learning words and thereby enhance their learning process from a very young age.

Background Literature

Language acquisition – Word segmentation
Word segmentation is the link between low-level properties of language, such as phoneme perception, and high-level properties, such as syntax. Spoken language mostly provides no explicit cues to word boundaries, so segmentation is a perceptual task. Infants can segment words without knowing any words in advance. Word recognition plays a large role in segmentation for adults, but infants can segment words without this recognition. Babies at the age of 6 months have no segmentation ability; the ability is almost fully developed at the age of 10.5 months (Daland 2009). However, as the studies mentioned above show, babies can complete a word segmentation task without word recognition by using statistical information (Aslin, Saffran & Newport 1998).

Transitional probability study
This is where we take our starting point. The study by Saffran and colleagues (Aslin, Saffran & Newport 1998) looks at 8-month-old babies’ ability to segment a continuous stream of syllables into new words. There are no acoustic cues to the beginning and end of the words. The study shows that the infants can use statistical information to solve this word segmentation task. The test words and part-words differ in transitional probability but occur with the same frequency. Based on the transitional probabilities, the infants can recognise the syllable pairs that always follow one another as a new word. Here the transitional probability of the words is the independent variable, and the dependent variable is the baby's recognition and segmentation of the word. We use this design to investigate whether the tone of voice affects this ability to learn new words.
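To make the idea of transitional probability concrete, the following is a minimal sketch of how it could be computed from a syllable stream. It assumes the usual definition (the number of times syllable X is followed by syllable Y, divided by the number of times X occurs with a successor). The function name and the toy stream are hypothetical and are not our actual stimulus list.

```python
from collections import Counter


def transitional_probabilities(stream):
    """Return P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count a syllable only where it has a successor, so the final
    # syllable of the stream does not deflate the probabilities.
    first_counts = Counter(stream[:-1])
    return {
        (first, second): count / first_counts[first]
        for (first, second), count in pair_counts.items()
    }


# Hypothetical toy stream: "ti do" always occur together, while "la" is
# followed by different syllables on different occasions.
stream = ["ti", "do", "la", "fa", "ti", "do", "la", "ti", "do", "fa", "la", "fa"]
tp = transitional_probabilities(stream)
print(tp[("ti", "do")])  # pair that always occurs together
print(tp[("la", "fa")])  # pair that only sometimes occurs together
```

In this toy stream the pair "ti do" gets a transitional probability of 1, which is the property that would let an infant treat it as a word.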

Method

Set up

For our experiment, we will be working with 8-month-old babies of both genders. We have decided to use only babies from a westernized culture. We are doing this in order to minimize the differences in the home lives of the children in our experiment, as they will all come from the same cultural background. The sample will consist of 120 babies, 40 in each group, meaning that no baby is used in more than one condition. In order to make sure that the babies are all in a relatively similar state, we require that they have had at least 9 hours of sleep by 2 hours before the experiment, and that they have eaten an hour before the experiment. This is an attempt to minimize the chances of the child crying during the experiment due to hunger or tiredness. There will be three different experimental conditions, each of which will have a familiarization stage followed by a separate assessment stage.

The three experimental conditions are:

Monotonous tone (control)
Happy/positive tone
Angry/negative tone

These three categories represent the three different tones of voice that will be pre-recorded and presented to the babies in the experiment.
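As an illustration of how 120 babies could be split into three groups of 40, here is a minimal sketch. It assumes random assignment, which is our assumption for the example, and the participant numbers, the function name and the seed are hypothetical.

```python
import random

CONDITIONS = ["monotonous", "happy", "angry"]


def assign_conditions(participant_ids, seed=0):
    """Randomly split participants into equally sized groups, one per condition."""
    shuffled = list(participant_ids)
    if len(shuffled) % len(CONDITIONS) != 0:
        raise ValueError("participants cannot be split evenly across conditions")
    # A seeded generator keeps the assignment reproducible.
    random.Random(seed).shuffle(shuffled)
    group_size = len(shuffled) // len(CONDITIONS)
    return {
        condition: shuffled[i * group_size:(i + 1) * group_size]
        for i, condition in enumerate(CONDITIONS)
    }


# Hypothetical participant numbers 1..120.
groups = assign_conditions(range(1, 121))
for condition, members in groups.items():
    print(condition, len(members))
```

Each baby appears in exactly one group, which matches the between-subjects design described above.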

For the recordings, we will be using a female speaker, the same person for all the conditions. In the actual experiment, every baby within a condition will be played the same recording of the speaker for that condition. The speaker will read a list of syllables constructed using the transitional probability method, and the recordings will then be played to the babies on a screen. The babies will be sitting on the lap of their parent (either mother or father) so that the baby is comfortable during the experiment.

The transitional probability method will go as follows:

Familiarization stage

The speaker will say a list of syllables, for example: la, ti, do, fa, lie, etc. The list used for the recordings will be the same for all three conditions, just read in a different tone of voice. The speaker will pause for the same length of time between each syllable and maintain clarity throughout in order to help prevent our results from being unreliable. In each condition, during the familiarization stage, the baby will be played the recording on a screen on loop for three minutes. The order of the syllables will represent their transitional probability. Two pairs of syllables in the string will have 100% transitional probability, meaning that they will always be played one after the other. For instance, if the syllables “fo, la” had 100% transitional probability, then they would always appear in the recording as “fo, la”. All the other syllables will only have a 30% transitional probability, meaning that there is only a 30% chance of them reoccurring in the same order. We will refer to the words with 100% transitional probability as “test-words” and to the mixed-up syllables in random order as “part-words”. The list of syllables that will be looped for three minutes will consist of 20 syllables, 2 pairs of which will have 100% transitional probability. This method will be used in the familiarization stage for all three conditions: each loop has the same woman saying the same syllables, but in a different tone of voice for each condition. Once each child has participated in the familiarization stage, they will then also complete the assessment stage.
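One way to build such a stream can be sketched as follows. This is only an illustration under stated assumptions: the two test-words ("fo la" and "ti do") and the 16 filler syllables are hypothetical, and the blocks of 20 syllables are reshuffled rather than looped identically, so that only the test-word pairs keep a transitional probability of 1. The sketch does not try to hit the 30% figure for the remaining syllables exactly.

```python
import random
from collections import Counter


def build_stream(words, fillers, n_blocks, seed=0):
    """Concatenate n_blocks shuffled blocks; each word stays intact as a unit."""
    rng = random.Random(seed)
    units = [list(word) for word in words] + [[syllable] for syllable in fillers]
    stream = []
    for _ in range(n_blocks):
        rng.shuffle(units)  # the order of units changes from block to block
        for unit in units:
            stream.extend(unit)
    return stream


def transitional_probability(stream, first, second):
    """Count of 'first second' divided by count of 'first' with a successor."""
    pairs = Counter(zip(stream, stream[1:]))
    firsts = Counter(stream[:-1])
    return pairs[(first, second)] / firsts[first]


# Hypothetical inventory: 2 two-syllable test-words plus 16 filler
# syllables gives 20 syllables per block.
WORDS = [("fo", "la"), ("ti", "do")]
FILLERS = ["ba", "ku", "mi", "re", "so", "ga", "pe", "nu",
           "di", "ro", "bi", "ta", "go", "ke", "lu", "pa"]

stream = build_stream(WORDS, FILLERS, n_blocks=30)
print(len(stream))
print(transitional_probability(stream, "fo", "la"))
print(transitional_probability(stream, "ti", "do"))
```

Because "fo" and "ti" never occur outside their test-words, the syllable that follows them is always the same, whatever order the blocks are shuffled into.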

Assessment stage

During the assessment stage, which will be similar to the familiarization stage, the child will again be listening to syllables whilst sitting on their parent’s lap. Here we are assuming that, in the familiarization stage, the baby will have associated the syllable pairs that had 100% transitional probability as words, and thus will have learned them. For example, if the syllables “ti, do” had 100% transitional probability, then the baby would treat “tido” as a word. In the assessment stage, we will be using the same speaker, and she will speak in the same tone of voice for all three conditions. She will speak in a monotone, so the monotonous condition from the familiarization stage will act as a control. We are doing this so that we can see whether the babies learn a new word better when being spoken to in an angry voice or a happy voice, relative to how much attention they have paid. The part-words will be randomly ordered syllables, the same ones that occurred in the familiarization stage of the experiment, just in a different order. The babies will watch the screen under exactly the same conditions as in the first stage, except that a blinking light will be placed above the screen to hold the baby’s attention. The baby’s gaze will be monitored using an eye-tracking device, so that when their gaze moves from the light to the screen, we can time how long the baby looks at the screen before returning to look at the light. We will refer to this as the looking time. The stream of test-words and part-words here will be 40 syllables long and will only be played once. The assessment stage is thus a way for us as investigators to see how well the baby has learnt or can recognise the words, and whether they recognise the words better after learning them in an angry, monotonous or happy tone of voice.
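The looking-time measurement described above can be sketched as follows. This is a minimal illustration that assumes the eye tracker delivers time-ordered samples labelled either "screen" or "light"; the sample format, the function name and the example values are hypothetical, and a real eye tracker would deliver far denser data.

```python
def looking_times(samples):
    """Return the duration in seconds of each uninterrupted look at the screen.

    samples: time-ordered list of (timestamp_seconds, target) tuples,
    where target is "screen" or "light".
    """
    durations = []
    look_start = None
    for timestamp, target in samples:
        if target == "screen" and look_start is None:
            look_start = timestamp  # gaze moved from the light to the screen
        elif target != "screen" and look_start is not None:
            durations.append(timestamp - look_start)  # gaze returned to the light
            look_start = None
    # A look still in progress when recording stops has no end time
    # and is therefore not counted.
    return durations


# Hypothetical gaze samples, not real data.
samples = [
    (0.0, "light"), (1.0, "screen"), (1.5, "screen"), (3.0, "light"),
    (4.0, "screen"), (4.5, "light"), (5.0, "screen"),
]
print(looking_times(samples))
```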

Discussion

Choice of method

Why did we use children at the age of eight months?
Language acquisition happens throughout our entire life, but in certain periods of life we are more sensitive to environmental input. These periods are called critical periods, and are described in neurobiology as windows of plasticity. Acquisition of a native language doesn’t come all at once, but in a cascade of steps, each with its own critical period. Language discrimination and native phonetic categories are some of the first steps that will later lead to syntactic and semantic understanding of sentences (Werker and Hensch 2015). The critical period of word segmentation occurs at around 8 months (Aslin, Saffran & Newport 1998).

Why did we choose to record the speaker?
With the speaker recorded, all babies in each group are presented with the same speech, so we have extremely high control and there will be no variation from test to test. This is the most important reason for this choice. The drawback is that the baby only receives auditory and visual input. There will be no interaction with an actual person, which has been shown to increase babies’ ability to learn languages.

Why did we choose to play the recorded speaker on a screen?
The presence of a stranger in the room reading the continuous stream of syllables could affect the babies in many different ways. The charisma of the speaker could cause different reactions in each individual infant and thereby affect their concentration on the word segmentation test. Of course, the babies could also react differently to the speaker on the screen, but the effect would not be as profound when there is no actual person in the room.

How long are the words and why?
What we are using are not exactly words but a string of syllables. However, the words that the babies will learn (the ones with 100% transitional probability) will be two syllables long and won’t be words that the baby could have learnt before, meaning they will be nonsense words. For instance, we may combine the two syllables “fo” and “la” to make the word “fola”, which the child would learn in the familiarization stage. This makes the words short, simple and easy for the babies to learn. It also simplifies our study, making it clearer whether or not the babies have learnt the new word.

Why are the babies divided into three groups? Why doesn’t each baby just go through all three conditions? How can this affect our results?
The babies will be divided equally into three separate groups, and no baby will do more than one condition. We chose this because, with such young participants (8-month-old babies), it is likely that they will not have enough energy to complete all three conditions. By the third time they did the experiment, for example, they would be more tired and thus potentially pay less attention, which may confound our results. Also, because the participants are so young, it is unlikely that their environments growing up will have been hugely different, so the differences in personality will be smaller than with older participants. Using different participants in each condition will therefore have less of an impact on our results.

Why did we choose a large sample size?
Having a large sample size makes our results more reliable and the mean more accurate. The sample size is also large because our target population is all female and male 8-month-old children from a westernized background. This is a large target population, so we need a large sample for it to be representative of the population.

How are we measuring the result? How is the time of gaze related to word segmentation and recognition?
Looking time paradigm
To measure whether the infants have learned the test words, and thereby how well they have completed the word segmentation test, we use the looking time paradigm, also called habituation of looking time. This method builds on the assumption that the duration of gaze is representative of the duration of attention. Fantz (1964) observed that infants would rather look at a novel stimulus than at one that was familiar to them. When the infants in the assessment phase hear a recently learned test word or a new part-word, their attention will go to the screen with the speaker. Their gaze will be longer for the new part-words, and the looking time will decrease when they recognise the newly learnt test words from the familiarization stage (Oakes 2010). By looking at the dishabituation, we can extract the results we want to analyze and compare the three tone-of-voice groups. Thus we expect that, in our experiment, when the child hears test words from the familiarization stage, they will spend less time gazing at the screen (shorter looking time) than when they hear new syllable combinations that they do not recognise (part-words), as these did not occur in the familiarization stage. We may also predict that there will be a difference in the looking times for the learnt words between the three tone-of-voice conditions, depending on how well the infants learnt the words in the familiarization stage under the tone of voice they were presented with.

Evidence that infants can learn in this way also comes from an experiment by Pelucchi, Hay and Saffran (2009), who found that 8-month-old infants track backward transitional probabilities. They report that “A paired t-test revealed a significant difference in average looking time for HTP-words (10.06 s) versus LTP-words (8.91 s): t(31)=3.05, p<.01”. This provides evidence of language acquisition through transitional probability, allowing us to predict that the infants will learn the words in the familiarization stage.
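For readers unfamiliar with the paired t-test quoted above, the following is a minimal sketch of how the statistic is formed from two looking times per infant. The function name and the three pairs of values are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not data from any study.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    sd = statistics.stdev(differences)  # sample standard deviation
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(differences) / (sd / math.sqrt(n))
    return t, n - 1


# Placeholder looking times in seconds for three infants (not real data).
looking_a = [10.0, 9.0, 11.0]
looking_b = [9.0, 7.0, 8.0]
t, df = paired_t(looking_a, looking_b)
print(round(t, 2), df)
```

The degrees of freedom are the number of infants minus one, which is why the quoted result with t(31) corresponds to 32 infants.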

Why did we choose to measure our results in this way/use this method?
We decided to use eye tracking to measure the baby’s gaze because it is an accurate way of seeing where the baby is looking, much more accurate than an observational technique. This means that it will be clear to us exactly when a word catches the baby’s attention.

Challenges and solutions to these
Each infant has different preferences for stimuli (Oakes 2010). These can depend on the complexity of the stimulus, on whether it is dynamic or static, and on what the baby is used to. For example, the gender of the baby’s primary caregiver can influence which faces the baby prefers to look at. This is why we use the same speaker in both the familiarization and the assessment phase. Using the same speaker reduces the influence of prior preferences and avoids the infant becoming familiarized with one person and then seeing another person later.

The looking time reflects the competing preferences for novelty and familiarity. If the infant is fully familiarized, it will prefer the novel part-words in the assessment phase. However, if the infant is not fully familiarized, it might prefer the familiarized stimuli, the test words. This would make the results inaccurate. We chose to get around this challenge by using a large sample of infants. In this way, we allow for some variation between infants, but in the big picture we can still see a tendency.

For the habituation of looking time to be usable, the infants should be exposed to a test for which they have minimal prior preferences. In order to achieve this, we use an artificial language and not known English words. The whole point of the research is to see how the tone of voice affects the infant's ability to segment newly learned words. Infants of this age usually don’t recognise words, but by using the artificial language we make sure that they don’t.

Last but not least, there is uncertainty about how long a gaze must last to be considered an indication that the infant recognises the word. Some infants are influenced or distracted more easily than others. One could run a habituation test to see which infants best fit this type of measurement. To keep the experiment simple, we chose to use the blinking light to keep the infants focused and prevent them from being distracted.

What are the limits of our study?
Although we focus on statistical information in language learning and hold as many elements constant as we can as control variables, there may still be third variables. Examples are the accents and vernaculars that the child is exposed to during the critical period, since westernized societies are often multicultural rather than homogeneous. Such exposure could affect whether infants respond to the syllables presented in the experiment with instant recognition. We attempt to limit this third variable by using babies, whose personalities do not vary as profoundly as those of adults; however, it remains a possible factor.

Infants, although they have less than a year of experience in the world, are shaped by their home environment. This makes each participant difficult to profile, as each individual is different and they do not yet have a method of clear communication. So far, we only control the cultural factor by requiring that they come from westernized households, in the hope of ensuring that they are influenced by similar experiences. We are quite limited in what we can control in such a diverse culture.

Expected results

We are expecting to see a difference in the looking times across the three tone-of-voice conditions. Through research, we found one experiment in which tone of voice was varied and recognition was best when the meaning of the word was related to the tone of voice used (Nygaard & Queen 2008). However, this experiment used known words and the participants were adults. So all we can predict is that there will be a difference between the three conditions; we cannot predict in which condition the recognition of the “test-words” will be better. Yet we could predict that the looking time will be shorter for words that the infants recognise than for words they don’t recognise. One study found that infants pay more attention to stimuli that they are unfamiliar with than to those they already recognise (Fantz 1964). Thus, we could say that they will look at the screen for only a short time for recognised words because in their mind they are thinking “I already know this word”. On this reasoning, we would expect the control group to have the shortest looking time, because not only will the infant recognise the word from learning it previously, but the tone of voice is also the same in both stages. This makes it even easier for the infant to recognise the word, so this condition should have the shortest looking time of the three. However, another study using a transitional probability method found that the looking time for words that the participants recognised was in fact longer than the looking time for words that they didn’t recognise (Pelucchi, Hay & Saffran 2009). This contradicts the results from the aforementioned study, so overall we have decided that we cannot predict which way the looking times will fall.
So, we are simply predicting that there will be a difference in the looking times for recognised words between the three tone-of-voice conditions.
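A non-directional prediction of this kind could be tested by comparing the mean looking times of the three groups. The following is a minimal sketch that assumes a one-way analysis of variance would be used, which is our assumption for illustration and not a method fixed by the proposal. The function name and the placeholder values are hypothetical and are not results.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        sum((value - statistics.mean(group)) ** 2 for value in group)
        for group in groups
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder looking times in seconds for three tiny groups (not real data).
monotonous = [1.0, 2.0, 3.0]
happy = [2.0, 3.0, 4.0]
angry = [5.0, 6.0, 7.0]
f_stat, df_between, df_within = one_way_anova_f([monotonous, happy, angry])
print(f_stat, df_between, df_within)
```

With three groups of 40 babies, the degrees of freedom would be 2 between groups and 117 within groups. A large F value would indicate a difference between conditions without saying in which direction it lies, which matches the non-directional hypothesis.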

Multidisciplinary research

Our transitional probability study is multidisciplinary, since it touches on both psychology and linguistics, a combination known as psycholinguistics. Whenever language acquisition is discussed, psychology and linguistics overlap, because the focus is on the ability to learn language. From the psychological point of view, language acquisition concerns the capacity to perceive and comprehend a given stimulus. In our case, the stimuli are the positive, neutral and negative tones of voice that make up our tonal conditions. From the linguistic point of view, we focus on the phonetic and phonological areas of psycholinguistics, as the study is concerned with how effectively the brain, during a critical period, can process speech sounds delivered in tones ranging from positive to negative.

Conclusion

What did we learn from our research?

We have learnt many things from proposing this study, from researching studies to support it and from hearing useful peer review when doing practice proposals. Firstly, we have learnt that infants can use statistical information to segment words without actually knowing what the words mean or being able to recognise a word. This is an extremely useful tool for children, especially when learning languages, and may also explain why children who learn multiple languages from a young age grow up to be fluent in these languages. Despite a lot of time spent researching, we didn’t manage to locate any studies on infants that take into account the influence of tone of voice on language acquisition. Whilst this is a good thing, as it gives us a clear topic to focus on, it also makes it hard for us to make informed predictions of what we expect our experiment to show, which is why we have proposed a non-directional hypothesis. It is also difficult to tell whether an infant has in fact learnt a new word or not. Even though most of the studies that we have found to support our study use the looking time of the participants as an accurate way to locate the attention of the child, it is hard for us to establish whether the looking time of the child is directly related to whether the child has learnt the word. We also had some difficulties when we found that similar studies showed very different results, which again made it hard for us to make an informed prediction of what we expect our results to show. Finally, using infants in experimental studies can prove difficult because it is hard for the child to maintain concentration, and hunger and tiredness are much stronger influencing factors than they would be with older participants.

Our experience
There was quite a learning curve in creating this study. We began with only the in-class knowledge of language acquisition from the psychology lectures, as well as the introduction to phonemes from the linguistics lecture about ambiguity. From those starting points, we dug into our interest in how the difference between motherly and fatherly talk influences a child’s language acquisition. Gradually, we narrowed the study down to what we saw as the core of this cultural phenomenon: a mother’s coo contrasts with a father’s tone of talking to the child as if to a small adult. By narrowing the question down to tone of voice, we were able to come up with controls that attempt to eliminate as many third-variable influences as possible.

Although our study has not been executed, we learned a lot about how parental influence affects “baby babble” and about statistical analyses of parts of speech. Researching transitional probability was the turning point in creating this study, as it was initially a foreign method to our group. Our psychology lecturer, Rebecca Reh, introduced us to how statistical learning would benefit our study, as we had struggled with how to eliminate as many exterior influences as possible. The transitional probability method lets us test learning from the statistical structure of the syllable stream alone, without the environmental influences that would otherwise have disrupted the accuracy of the experimental results.

Plan for future research

In our study, we have focused on how the tone of voice affects the infant's ability to segment and thereby learn new words. Previous research has examined how the relationship between children with neurological insult and their caregiver affects the children’s cognitive and language outcomes. The results were that a harmonious relationship between the caregiver and the child supported the child’s development (Leiser, Heffelfinger & Kaugars 2017). This study, however, was done on older children, not on infants. Building on our study, if the results are as we predict, it would be interesting to combine the two areas and study how the tone of voice and the relationship to the caregiver together support the infant’s language acquisition. Comparing combinations of tone of voice and relationship to the caregiver could give an idea of the optimal conditions for the infant’s language development. Such a study would have a large number of independent and dependent variables, and would require a large group of babies.

Annotated Bibliography

Aslin, Richard N.; Saffran, Jenny R.; Newport, Elissa L. “Computation of Conditional Probability Statistics by 8-Month-Old Infants” Psychological Science, vol. 9, no. 4, 1998, pp. 321-324 (Accessed 20-1-207 21.55)
http://www.jstor.org.ezproxy.library.ubc.ca/stable/pdf/40063345.pdf

This experiment examines whether infants at the age of 8 months can use statistical information to do well in a word segmentation test. We have used this study as a basis for how we should do our experiment. This study introduces the transitional probability method and the use of the blinking light to keep the infants' attention during the assessment stage. However, it mentions nothing about tone of voice or other non-linguistic properties of communication.

Daland, Robert “Word Segmentation, Word Recognition, and Word Learning: A Computational Model of First Language Acquisition” Northwestern University, June 2009
https://www.linguistics.northwestern.edu/documents/dissertations/linguistics-research-graduate-dissertations-dalanddissertation2009.pdf

This dissertation covers word segmentation, a subfield of language acquisition. It explains the terminology and discusses different methods used in language acquisition research. This source doesn’t provide us with new findings in the way the journal articles do, but it provides us with knowledge about what to be mindful of when designing our experiment.

Leiser, Kara; Heffelfinger, Amy; Kaugars, Astrida. “Associations among parent–child relationships and cognitive and language outcomes in a clinical sample of preschool children” The Clinical Neuropsychologist, vol. 31, no. 2, 2017, pp. 423-437
http://www-tandfonline-com.ezproxy.library.ubc.ca/doi/pdf/10.1080/13854046.2016.1268649?needAccess=true

This study examines how the relationship between the child and caregiver affects cognitive and language outcomes. Over repeated meetings, the child had to solve different tasks with help from different caretakers. The authors’ conclusion was that harmonious child-parent interactions support the child’s cognitive and language development. The weakness of this study is that it focusses on preschool children with early neurological insults. However, this study is relevant because it examines the importance of the relationship between the caregiver and the child, which is the additional independent variable we propose for future research.

Nygaard, Lynne C.; Queen, Jennifer S. “Communicating emotion: Linking affective prosody and word meaning.” Journal of Experimental Psychology: Human Perception and Performance, vol. 34, no. 4, 2008, pp. 1017-1030
http://psycnet.apa.org/record/2008-09670-018

The listeners in this study were presented with words in three different emotional tones of voice: happy, sad or neutral. The role of emotional tone of voice was investigated in relation to the perception of a word. The authors’ conclusion was that the emotional state of the speaker influences the understanding and recognition of a word. The weakness of this study is that it focusses on adults, whereas we focus on infants acquiring a spoken language, and it does not take the relationship between the speaker and the listener into account. This study contributes to ours by exploring how the emotional tone of the speaker affects the listener.

Oakes, Lisa M. “Using Habituation of Looking Time to Assess Mental Processes in Infancy” University of California, Davis, J Cogn Dev., 2010
http://www-tandfonline-com.ezproxy.library.ubc.ca/doi/pdf/10.1080/15248371003699977?needAccess=true

The article gives an introduction to habituation of looking time, which is our method for measuring whether the infants have learned the test words. It discusses the challenges of using this method, what you should be careful of and how to establish best practice. It doesn’t give any practical advice on how to run the experiments, but it discusses what to think about before doing the experiment, such as eliminating prior preferences.

Pelucchi, B., Hay, J. F., & Saffran, J. R. (2009). Learning in reverse: 8-month-old infants track backward transitional probabilities. Cognition, 113(2), 244–247.
http://doi.org/10.1016/j.cognition.2009.07.011

This article offers strong evidence for the ability of 8-month-old babies to acquire elements of an unfamiliar language, measured using a head turn method. The results from this study allow us to assume that the babies in our study will successfully learn new words in the familiarisation stage, and thus support our interpretation of the looking times in our experiment: we should be able to say that when they hold their gaze at the screen, this is due to them recognising a word that they have learnt in the familiarisation stage. Even though this study uses words from actual languages, the babies in the experiment were all monolingual. This means that even though we are creating our own words, the babies will be just as unfamiliar with the new words in our experiment as they were in Pelucchi, Hay and Saffran’s experiment.