The Euclidean Distance in Diphthongs – R Graph and Code

Representing and plotting a distance in an F1/F2 graph, in terms of the Euclidean distance, is relatively easy in R. This post shows one way of doing it. First, we provide sample data consisting of F1 and F2 values for the two targets of a diphthong. Then we draw the diphthong positions with their starting and ending targets and, finally, calculate the distance. This R code does most of the F1/F2 calculations and drawing.

Data sample (Formants in a Diphthong)

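First, a data sample. The first table corresponds to the RP speaker and the second to the ESL students (their distances, 738.86 and 433.75, appear in the table further below).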

    ascii   ipa     f1      f2
1  aw_l_1 ɑʊl_1 900.96 1600.10
2  aw_l_2 ɑʊl_2 373.61 1082.59 

    ascii   ipa     f1      f2
1  aw_l_1 ɑʊl_1 823.07 1542.39
2  aw_l_2 ɑʊl_2 411.39 1405.78

These are the values of the first two formants in /ɑʊ/, as measured in a group of 15 female ESL students and one RP speaker (also female). The number 1 in the notation marks the first vowel target and 2 the second (thus, aw_l_1 is /ɑ/ and aw_l_2 is /ʊ/), while the “l” marks a long diphthong. The two targets are the start and end points of a line, and the line’s length is given by the Euclidean distance.

The Euclidean Distance

Euclidean distance is the straight-line distance from point A to point B in a Cartesian system, and it is derived from the Pythagorean theorem. Thus, if a point p has the coordinates (p1, p2) and a point q has the coordinates (q1, q2), the distance between them is calculated using this formula:

distance <- sqrt((p1 - q1)^2 + (p2 - q2)^2)

Our Cartesian coordinate system is defined by the F2 and F1 axes (where F1 is the y-axis), and the distance is measured from one diphthong target to the other. The vowel targets, corresponding to points A and B, are defined by the F1/F2 values in Hertz for a particular vowel. In our example above, A and B are rows 1 and 2 of each table, and their coordinates are the F2 and F1 frequencies.
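As a minimal sketch of this calculation, the snippet below applies the formula to the two sample tables. The object names (`first`, `second`) and the helper `target.distance` are hypothetical, chosen only for this illustration; they are not taken from the original script.

```r
# Sample values copied from the data tables above (f1 and f2 in Hz).
# Row 1 is the first target (aw_l_1), row 2 the second (aw_l_2).
first  <- data.frame(ascii = c("aw_l_1", "aw_l_2"),
                     f1 = c(900.96, 373.61),
                     f2 = c(1600.10, 1082.59))
second <- data.frame(ascii = c("aw_l_1", "aw_l_2"),
                     f1 = c(823.07, 411.39),
                     f2 = c(1542.39, 1405.78))

# Hypothetical helper: Euclidean distance between the two targets
# (rows 1 and 2) of one diphthong in the F1/F2 plane.
target.distance <- function(d) {
  sqrt((d$f1[1] - d$f1[2])^2 + (d$f2[1] - d$f2[2])^2)
}

round(target.distance(first), 2)   # 738.86
round(target.distance(second), 2)  # 433.75
```

The two results agree with the aw_l distances reported for the RP speaker and the ESL students.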

Plotting in R

The third step in the process is plotting, so that we can see a graphical representation of the distance. We can do that by:

  1. Drawing the F1/F2 “coordinate system”.
  2. Drawing the vowels in A and B positions, and connecting them with a line.
  3. Drawing the arrows showing the direction of pronunciation and placing the IPA symbols.

An example looks like this:

Diphthongs drawn on F1/F2 plot
The English diphthongs as pronounced by the ESL students and a native RP speaker.

The diphthong /ɑʊ/ is plotted in the lower right corner of the graph. Here are the Euclidean distances for that diphthong (in both variants):

          RPSpeaker  ESLStudents
aw_l ɑʊl  738.86     433.75
aw_s ɑʊs  816.08     471.60

The R code used to plot the graph can be found here.
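
The three plotting steps can be sketched with base R graphics roughly as follows. This is not the original script: the axis limits and the reversed axis directions are assumptions (reversing both axes is a common vowel-chart convention), and the data frame `d` is a hypothetical name holding the first sample table.

```r
# Targets of the diphthong from the first data table (values in Hz).
# The IPA symbols are written as Unicode escapes.
d <- data.frame(ipa = c("\u0251", "\u028A"),
                f1  = c(900.96, 373.61),
                f2  = c(1600.10, 1082.59),
                stringsAsFactors = FALSE)

# 1. Empty F1/F2 "coordinate system". The limits and the reversed
#    axes are assumptions made for this sketch.
plot(d$f2, d$f1, type = "n",
     xlim = c(2500, 500), ylim = c(1000, 200),
     xlab = "F2 (Hz)", ylab = "F1 (Hz)")

# 2. The two vowel targets as points.
points(d$f2, d$f1, pch = 19)

# 3. An arrow from the first target to the second (arrows() draws the
#    connecting line and the arrowhead), plus the IPA labels.
arrows(d$f2[1], d$f1[1], d$f2[2], d$f1[2], length = 0.1)
text(d$f2, d$f1, labels = d$ipa, pos = 3)
```

Whether the IPA glyphs render correctly depends on the graphics device and the font in use.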

IPA Symbols in R

This post is an example of how to place IPA (International Phonetic Alphabet) symbols in R charts. I achieved that by using Unicode escapes, that is, the hexadecimal code points of the corresponding symbols. There may be a more direct approach, but I am unaware of one.

A plot is created as usual, but the IPA labels are stored in a separate vector:

diph.names.ipa <- c('e\u026A', 'a\u026A', '\u0254\u026A')

The hex values of IPA symbols are available here.

A sample graph created with this R script looks like this:

A sample graph showing IPA symbols drawn by plot() command.
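
For illustration, here is a minimal sketch of how such a label vector can be passed to `text()`. The plotting positions are arbitrary placeholders, not measured values, and the `sprintf()` line is just one possible way to find the hex code point of a symbol you can already type or paste.

```r
# Label vector from the post: e + U+026A, a + U+026A, U+0254 + U+026A.
diph.names.ipa <- c('e\u026A', 'a\u026A', '\u0254\u026A')

# Look up the hex code point of a single symbol.
sprintf("%04X", utf8ToInt("\u026A"))  # "026A"

# Arbitrary positions 1..3, used only to show the labels on a plot.
pos <- seq_along(diph.names.ipa)
plot(pos, pos, type = "n", xlab = "", ylab = "")
text(pos, pos, labels = diph.names.ipa)
```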


If you are working with R in ESS, IPA symbols are displayed differently on Windows and Linux. On Windows the characters are shown in hex notation, at least in my case. On Linux, on the other hand, the symbols are shown as IPA, which makes them much easier to work with:

Screenshot of IPA in ESS on Linux
IPA symbols within a data frame object in R (ESS/Linux)

The table above is sorted and ready to be inserted into a text editor. If you are using Word or Writer, you can copy and paste the table with a quick workaround, for which you need OpenOffice (or LibreOffice) installed. Open Calc, select the first cell and paste the table from Emacs. In the options that appear, select “Space” and “Merge delimiters” under “Separated by” and confirm. The next step is to copy the table from Calc and paste it where needed:

Vowel F1 F2 F3
ɑ 891.89 1656.59 2564.01
a 700.65 1389.3 2871.73
ɛ 585.82 1909 2713.09
e 532.55 2197.79 2714.36
ɔ 493.94 1270.26 2604.23
ʊ 383.08 1240.57 2610.09
ɪ 383.48 2308.99 2719.21
ə 480.32 1680.69 2652.19


Checking Praat’s TextGrids in Python

A TextGrid file contains data about the intervals, segments, times, etc. of the corresponding signal file (audio in wav, mp3, aif…). Because TextGrids are plain text, they can be parsed, checked and analysed automatically.

If you are a linguist or phonetician, you might be using Praat, a small but very powerful programme for phonetic analysis. Chances are you have a lot of speakers and recordings. You will probably segment the signals in Praat and save the segmentation in TextGrids.

Thanks to Margaret Mitchell and Steven Bird, who contributed the parser for Praat TextGrid to Natural Language Toolkit, automated analysis is now much easier.

The TextGrid parser is a part of NLTK and it is located here.

I am grateful to the authors, because they saved me a lot of time during segmentation checks. All that was needed was a Python script that uses the above code to load the TextGrid content and then runs a set of checks on each file/speaker. Its output looks like this:

Checking file  03-speaker-im.TextGrid
    Checking proper tier names...
    Checking if tiers contain 32 items...
    Checking if all tiers have valid text...
    Checking if the diphthongs have pairs...
    Checking if all words are present...
    Checking if the words and diphthongs match...
Mismatch: "ay_l" not allowed in "dice", at position 24.
It should say "ay_s".

Here, for example, my script warned me that I had the wrong label for a diphthong in file number 3. Spotting that “manually” would require a lot of time and attention.

I hope this post might help other researchers, and here is the Python script I wrote for my phonetic research.