Resonant frequencies and the vocal tract length

This post is about the resonant frequencies of a tube, in the context of speech and the neutral vocal configuration. Two formulas are given: the first calculates the resonant frequencies when the length is known, and the second calculates the length when the frequency of a formant is known. Finally, there is a real-life example: a calculation of a speaker’s vocal tract length after measuring the formants in schwa.

The speech mechanism in vowels is described by a model that uses the physical properties of tubes. A tube is a simple apparatus that, if attached to a source of sound, can emit harmonic frequencies. When attached to a sound speaker at the end, the tube acts as a resonator that “has an infinite number of resonances, located at frequencies given by odd-quarter wavelength” (Kent and Read 14). The resonant frequencies of a tube closed at one end are calculated by using this formula (Johnson 96):

 F_n = \frac{(2n-1)c}{4L}

Here n is a positive integer (the number of the resonance), L is the length of the tube in cm, and c is the speed of sound (about 35,000 cm/s).

This was very interesting to me, so I decided to experiment with the formula in R. The purpose was to calculate average frequencies of a vocal tract in the neutral configuration (a position of the vocal organs that forms an unobstructed tube from the larynx to the lips). So, the formula written above looks like this in R:

freq <- ((2*i-1)*35000)/(4*tract.len)

For a given speed of sound c = 35000, the formant number i and the tract length, we can calculate estimated formant values. As an example, we can insert L = 17.5 cm in the formula, the average length of the human vocal tract from glottis to lips (15). In this case the first formant, or the first resonance frequency, occurs at 500 Hz, the second at 1500 Hz, the third at 2500 Hz, and so on. Here is the output from the R code located here:

> Resonance(17.5)
Tract length is 17.5 cm.
formant 1: 500 Hz
formant 2: 1500 Hz
formant 3: 2500 Hz
formant 4: 3500 Hz
formant 5: 4500 Hz
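
The script itself is not reproduced in this post, so here is a minimal sketch of a function that gives the same numbers. The name `resonances` and its arguments are placeholders chosen for this sketch, not necessarily the names used in the original script.

```r
# Sketch: resonant frequencies (Hz) of a tube closed at one end.
# `resonances`, `n.formants` and `c.sound` are illustrative names.
resonances <- function(tract.len, n.formants = 5, c.sound = 35000) {
  i <- seq_len(n.formants)                           # formant numbers 1, 2, ...
  freq <- ((2 * i - 1) * c.sound) / (4 * tract.len)  # same formula as above
  names(freq) <- paste("formant", i)
  freq
}

resonances(17.5)
```

Because the formula is vectorised over the formant number, no loop is needed to get all five values at once.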

Of course, we can reverse the calculation: given a formant frequency and the number of that formant, we can estimate the tract length:

prep <- 35000*((formant/2)-0.25)
length <- (prep/freq)

This is the output of the Length function in the code:

> Length(1000, 1)
Estimated tract length is 8.75 cm, where formant number 1 has value of 1000 Hz.

This length corresponds to vocal tract lengths measured in infants.
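
The two lines of R above rearrange the resonance formula for L, since 35000 * (n/2 - 0.25) equals (2n - 1) * 35000 / 4. A self-contained version might look like the sketch below; `tract_length` and its argument names are illustrative and are not taken from the original script.

```r
# Sketch: tube length (cm) from one formant, L = (2n - 1) * c / (4 * F).
# `tract_length`, `formant` and `c.sound` are illustrative names.
tract_length <- function(freq, formant = 1, c.sound = 35000) {
  stopifnot(freq > 0, formant >= 1)          # guard against nonsensical input
  ((2 * formant - 1) * c.sound) / (4 * freq)
}

tract_length(1000, 1)   # 8.75, matching the output above
```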

spectrogram and waveform
A spectrogram and waveform near the end of the word "abjured". The three red lines show the formants, while the vertical line shows the measurement point. Analysed in Praat.

To make the calculations even more interesting, we can measure the frequency of the first formant of speakers, and then “calculate” the lengths of their vocal tracts. Here is an example: we recorded a speaker and examined the sound data. Since the schwa is pronounced in (approximately) the neutral configuration, we measured the formants where this sound (IPA: ə) was articulated. In this case, that was near the end of the word abjured /əbˈdʒʊəd/. The first three formant values for the sample female speaker were:

Time_s   F1_Hz   F2_Hz   F3_Hz
4.633178   549.304326   1750.098455   2915.885791

If we enter 549.3 Hz in the second formula, we get:

> Length(549.304326,1)
Estimated tract length is 15.92 cm, where formant number 1 has value of 549.3043 Hz.

This is, it seems, an acceptable value for this speaker.
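
As a rough cross-check, the same formula can be applied to the second and third formants as well. The sketch below assumes that the simple tube model holds equally for all three formants, which is an idealisation; the variable names are illustrative.

```r
# Sketch: one length estimate per measured formant, L = (2n - 1) * c / (4 * F).
c.sound  <- 35000                                   # speed of sound, cm/s
formants <- c(F1 = 549.304326,                      # Hz, from the measurement above
              F2 = 1750.098455,
              F3 = 2915.885791)
n        <- seq_along(formants)                     # formant numbers 1, 2, 3

est.len <- ((2 * n - 1) * c.sound) / (4 * formants)
round(est.len, 2)
```

With these measurements the three estimates fall between roughly 15.0 and 15.9 cm.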

The measurements and the image were obtained with Praat, a free phonetics program. The calculations and the code example were written in the R programming language.

The Euclidean Distance in Diphthongs – R Graph and Code

Representing and plotting a distance in an F1/F2 graph, in terms of the Euclidean distance, is relatively easy in R. This post shows one way of doing that. First, we provide sample data, which consists of F1 and F2 values for two diphthong targets. Then we draw the diphthong positions with their starting and ending targets and, finally, calculate the distance. This R code does most of the F1/F2 calculations and drawing.

Data sample (Formants in a Diphthong)

First, a data sample.

ESLStudents
    ascii   ipa     f1      f2
1  aw_l_1 ɑʊl_1 900.96 1600.10
2  aw_l_2 ɑʊl_2 373.61 1082.59 

RPSpeaker
    ascii   ipa     f1      f2
1  aw_l_1 ɑʊl_1 823.07 1542.39
2  aw_l_2 ɑʊl_2 411.39 1405.78

These are the values for the first two formants in /ɑʊ/, as measured in a group of 15 female ESL students and one RP speaker (also female). Number 1 in the notation marks the first vowel target and 2 the second (thus, aw_l_1 is /ɑ/ and aw_l_2 is /ʊ/), while the “l” marks a long diphthong. The two targets are the start and end points of a line, and the line’s length is expressed by the Euclidean distance.

The Euclidean Distance

Euclidean distance is the straight-line distance between two points A and B in a Cartesian coordinate system, and it follows from the Pythagorean theorem. Thus, if point A has the coordinates (x1, y1) and point B has the coordinates (x2, y2), the distance between them is calculated using this formula:

distance <- sqrt((x1-x2)^2+(y1-y2)^2)

Our Cartesian coordinate system is defined by the F2 and F1 axes (where F1 is the y-axis), and the distance is measured from one diphthong target to the other. The vowel targets, corresponding to points A and B, are defined by the F1/F2 values in Hertz for a particular vowel. In our example above, A and B are rows 1 and 2, so the x values are the F2 frequencies and the y values are the F1 frequencies.
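
To make this concrete, the sketch below applies the formula to the sample data for /ɑʊ/. The helper name `euclid` and the data frame names are illustrative; the numbers are the F1 and F2 values listed in the data sample.

```r
# Sketch: Euclidean distance between the two targets of a diphthong.
# `euclid`, `esl` and `rp` are illustrative names.
esl <- data.frame(ascii = c("aw_l_1", "aw_l_2"),
                  f1 = c(900.96, 373.61),
                  f2 = c(1600.10, 1082.59))
rp  <- data.frame(ascii = c("aw_l_1", "aw_l_2"),
                  f1 = c(823.07, 411.39),
                  f2 = c(1542.39, 1405.78))

euclid <- function(d) {
  # x = F2, y = F1; row 1 is the first target, row 2 the second
  sqrt((d$f2[1] - d$f2[2])^2 + (d$f1[1] - d$f1[2])^2)
}

round(c(ESLStudents = euclid(esl), RPSpeaker = euclid(rp)), 2)
```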

Plotting in R

The third step in the process is plotting, so that we can see a graphical representation of the distance. We can do that by:

  1. Drawing the F1/F2 “coordinate system”.
  2. Drawing the vowels in A and B positions, and connecting them with a line.
  3. Drawing the arrows showing the direction of pronunciation and placing the IPA symbols.

An example looks like this:

Diphthongs drawn on F1/F2 plot
The English diphthongs as pronounced by the ESL students and a native RP speaker.

The diphthong /ɑʊ/ is plotted in the lower right corner of the graph. Here are the Euclidean distances for that diphthong (in both variants):

          ESLStudents  RPSpeaker
aw_l ɑʊl  738.86       433.75
aw_s ɑʊs  816.08       471.60

The R code used to plot the graph can be found here.
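
For readers who only want the gist, here is a minimal base-graphics sketch of the three plotting steps for a single diphthong. It is not the linked script: the function name `plot_diphthong`, the axis limits and the reversed axes (the usual convention for vowel charts) are assumptions made for this sketch.

```r
# Sketch of the three plotting steps for one diphthong (base graphics).
# `plot_diphthong` and the axis limits are illustrative choices.
plot_diphthong <- function(f1, f2, labels, xlim, ylim) {
  # 1. Empty F1/F2 "coordinate system"; reversed limits flip both axes
  plot(NA, xlim = xlim, ylim = ylim, xlab = "F2 (Hz)", ylab = "F1 (Hz)")
  # 2. The two vowel targets
  points(f2, f1, pch = 19)
  # 3. Arrow from the first target to the second, plus the IPA labels
  arrows(f2[1], f1[1], f2[2], f1[2], length = 0.1)
  text(f2, f1, labels = labels, pos = 3)
  # Return the Euclidean distance between the targets, invisibly
  invisible(sqrt(diff(f2)^2 + diff(f1)^2))
}

# RP speaker, long /ɑʊ/, values from the data sample
d <- plot_diphthong(f1 = c(823.07, 411.39), f2 = c(1542.39, 1405.78),
                    labels = c("\u0251", "\u028A"),
                    xlim = c(2500, 500), ylim = c(1000, 200))
```

The arrow doubles as the connecting line, so steps 2 and 3 of the list need only three drawing calls.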

IPA Symbols in R

This post is an example of how to place IPA (International Phonetic Alphabet) symbols in R charts. I achieved that by using Unicode escapes with the hexadecimal code points of the corresponding symbols. There may be a more direct approach, but I am unaware of one.

A plot is created as usual, but the IPA labels are stored in a separate vector:

diph.names.ipa <- c('e\u026A', 'a\u026A', '\u0254\u026A')

The hex values of IPA symbols are available here.
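
As a small illustration, the sketch below builds the same labels in two ways and draws them with text(). The x positions 1 to 3 are placeholders rather than formant data, and `from.codes` is an illustrative name.

```r
# Sketch: two equivalent ways of writing the same IPA labels.
diph.names.ipa <- c('e\u026A', 'a\u026A', '\u0254\u026A')   # escapes, as above

# The same strings built from numeric code points with intToUtf8()
from.codes <- c(intToUtf8(c(0x65, 0x026A)),
                intToUtf8(c(0x61, 0x026A)),
                intToUtf8(c(0x0254, 0x026A)))

# Placeholder positions (not formant data), only to show the labels on a plot
plot(1:3, rep(1, 3), type = "n", xlab = "", ylab = "")
text(1:3, rep(1, 3), labels = diph.names.ipa, cex = 2)
```

Whether the glyphs actually render depends on the graphics device and the fonts available to it.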

A sample graph created with this R script looks like this:

A sample graph showing IPA symbols drawn by plot() command.

If you are working with R in ESS, there is a difference in how IPA is displayed on Windows and Linux. On Windows the characters are shown in hex notation, at least in my case. On Linux, on the other hand, the symbols are shown as IPA, which makes them much easier to work with:

Screenshot of IPA in ESS on Linux
IPA symbols within a data frame object in R (ESS/Linux)

The table above is sorted and ready to be inserted into a text editor. If you are using Word or Writer, you can copy and paste the table with a quick workaround. You need to have OpenOffice (or LibreOffice) installed. Open the Calc application, select the first cell and paste the table from Emacs. In the options that appear, select “Space” and “Merge delimiters” under “Separated by” and confirm. The next step is to copy the table from Calc and paste it where needed:

Vowel F1 F2 F3
ɑ 891.89 1656.59 2564.01
a 700.65 1389.3 2871.73
ɛ 585.82 1909 2713.09
e 532.55 2197.79 2714.36
ɔ 493.94 1270.26 2604.23
ʊ 383.08 1240.57 2610.09
ɪ 383.48 2308.99 2719.21
ə 480.32 1680.69 2652.19
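
A possible alternative to the copy and paste route is to write the data frame to a tab-separated text file from R and open that file in Calc. The sketch below is only an illustration: the data frame name `formants` and the file name are assumptions, and only the first three rows of the table are repeated.

```r
# Sketch: write a formant table as tab-separated UTF-8 text.
# `formants` and the file name are illustrative.
formants <- data.frame(
  Vowel = c("\u0251", "a", "\u025B"),
  F1 = c(891.89, 700.65, 585.82),
  F2 = c(1656.59, 1389.30, 1909.00),
  F3 = c(2564.01, 2871.73, 2713.09)
)

out.file <- file.path(tempdir(), "formants.tsv")
write.table(formants, file = out.file, sep = "\t", quote = FALSE,
            row.names = FALSE, fileEncoding = "UTF-8")
```

Using a tab as the separator avoids the need to merge repeated spaces on import.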