FONRYE English Dictionary: Phonetic and Syllable Search

It’s not always possible to find a good searchable phonetic dictionary. That is why I created a free and open source program that searches phonetically transcribed words and filters the results against some basic rules. It uses BEEP and Moby Hyphenator II sources.

Download: FONRYE 0.3.3 (2.2 MB)

In this post: why phonetic dictionary search, what is FONRYE, download and search, settings and results, credits.

Why I needed a searchable phonetic dictionary

For several months I have been working on my M.A. in experimental phonetics. One of the prerequisites is an acceptable corpus. My work is about the English diphthongs. However, the diphthongs have to occur after voiced plosives and before voiced or voiceless plosives, and the words containing them should preferably be monosyllabic.

Making a corpus is not an easy task, and it involves a painstaking search for suitable material. I had no searchable phonetic dictionary of any sort (a version of Macmillan Advanced Dictionary refused to work). It was pure luck, then, to come across a paper whose bibliography listed one interesting source: the University of Cambridge public FTP server. That is where I found BEEP and MH2 and decided to compile my own searchable dictionary, hopefully usable for making the corpus.

What FONRYE is, and what it is not

FONRYE (named after fonetski rječnik, Serbian for “phonetic dictionary”) is a very simple program (or script, if you like) written in Python 2.6. It is a specific piece of software I created for personal use: to search for diphthongs in a phonetic context. It does not have any fancy search rules or regular expression syntax. The plan was to use regular expressions, but they were very slow to run; I guess that can be improved if needed. So please bear in mind that it was not planned for release: the code may contain strange comments, bad spelling, etc.

Its settings are contained in the script itself, in 4 lines of code, which will be explained later. Here’s an example:

before = ('m', 'n', 'r', 'l' ),
after = sounds['voiceless'] + sounds['voiced'],
diphthongs = sounds['diphthongs'],
syllable = 0

The user enters the desired search conditions and runs the program, which then saves the results in a folder along with a short info file.
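As a rough illustration of how such settings fit together, here is a minimal sketch. The contents of the `sounds` table and the `settings` name are assumptions made for this example, not FONRYE's own data; the phoneme codes follow the plain BEEP-style notation used elsewhere in this post. Wrapping the four lines in `dict(...)` is only one guess at why the sample lines end in commas.

```python
# Hypothetical phoneme groups (assumed for illustration; FONRYE's real
# `sounds` table may contain different or additional entries).
sounds = {
    'voiceless': ('p', 't', 'k'),   # voiceless plosives
    'voiced': ('b', 'd', 'g'),      # voiced plosives
    'diphthongs': ('ay', 'ey', 'oy', 'aw', 'ow', 'ia', 'ea', 'ua'),
}

# The four settings, written here as keyword arguments so that the
# trailing commas are valid Python. Tuples are joined with +.
settings = dict(
    before=('m', 'n', 'r', 'l'),
    after=sounds['voiceless'] + sounds['voiced'],
    diphthongs=sounds['diphthongs'],
    syllable=0,
)

print(settings['after'])
```

Joining the two tuples simply yields one longer tuple, so `after` ends up listing every plosive the diphthong may be followed by.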

FONRYE phonetic dictionary search, code view

How to use FONRYE

  1. On Windows/Mac: download Python, but a version earlier than 3.0; version 2.6 is preferred. On Linux: you probably already have Python installed, but make sure you have an “old” version as well (again, earlier than version 3).
  2. Download the FONRYE files and unpack them. Please make sure you do not delete the ‘results’ folder, or the program will not work. On Windows: go to the Start menu and find IDLE inside the Python folder. On Linux: use any plain text editor that supports code editing, such as gedit, or install Python IDLE from your OS repository. Edit the file run.py, enter your settings and save the file. Finally, run the program (double-click run.py or press F5 in IDLE).
  3. The program will start the search, and after it finishes the results will be in results/fonyre_results_n, where n is the search counter.

Settings and results format

In step 2 above you opened the run.py file. Here is how to enter the “settings”. First, locate these lines:

before = (),
after = (),
diphthongs = ()
syllable = 0

Do not modify anything except the content inside the brackets and the syllable number (that is, unless you are familiar with programming). Note that syllable = 0 means words with 1 syllable, syllable = 1 means words with 2 syllables, etc. Enter your phonemes in the brackets. For example, the settings:

before = ('b', 'd'),
after = ('p', 't'),
diphthongs = ('ay',)
syllable = 0

…will search for all words containing the diphthong ay (IPA: aɪ) where the diphthong is between b/d and p/t. After the search is done, go to the ‘results/fonrye_search’ folder and locate search_info.txt (here is a vowel search info, a sample). That file holds information about your search, including a unique mark (ID) placed in all result files to keep track of the searches and results. The folder ‘files’ is where your search results are placed. For the provided sample search the program produced the following results file:

# uniqueid-dzzic
BIGHT		b ay t
BITE		b ay t
BLIGHT		b l ay t
BRIGHT		b r ay t
BRIGHTS		b r ay t s
BY-PASS		b ay p aa s
DIGHT		d ay t

You can find the computer phonetic code in phoncode.txt in the ‘data’ directory or in the file beep-1.0-edited. Please enter only this plain phonetic notation; IPA is not supported.
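To make the idea of searching “in a phonetic context” concrete, here is a minimal sketch of a strict-adjacency filter over transcription lines in the format shown above. It is not FONRYE's actual code: `parse_line` and `matches` are hypothetical helpers, and the rule (the diphthong must sit directly between a `before` and an `after` phoneme) is an assumption. The real program evidently applies a looser rule, since its sample output includes BLIGHT (b l ay t), which this sketch rejects.

```python
def parse_line(line):
    """Split a 'WORD<whitespace>phonemes' line into (word, [phonemes])."""
    parts = line.split()
    return parts[0], parts[1:]


def matches(phonemes, before, after, diphthongs):
    """True if some diphthong has a `before` phoneme directly in front of
    it and an `after` phoneme directly behind it (assumed rule)."""
    for i in range(1, len(phonemes) - 1):
        if (phonemes[i] in diphthongs
                and phonemes[i - 1] in before
                and phonemes[i + 1] in after):
            return True
    return False


# Lines taken from the sample results above.
lines = [
    "BITE\t\tb ay t",
    "BLIGHT\t\tb l ay t",
    "BY-PASS\t\tb ay p aa s",
    "DIGHT\t\td ay t",
]

hits = []
for line in lines:
    word, phonemes = parse_line(line)
    if matches(phonemes, ('b', 'd'), ('p', 't'), ('ay',)):
        hits.append(word)

print(hits)  # ['BITE', 'BY-PASS', 'DIGHT']
```

Plain tuple membership tests like these avoid regular expressions entirely, which fits the remark above that a regexp-based search turned out to be slow.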

Credits

I could create and use this little project, and place it on the Net, thanks to two people who provided its core: a phonetic dictionary and a hyphenation dictionary. The phonetic dictionary was compiled by Tony Robinson from Cambridge University Engineering Department; the Moby Hyphenation dictionary was created by Grady Ward. Both projects were placed into the public domain in 1996. See the Bibliography page for FTP addresses.

My own contribution is some quickly written, not-so-good-looking, slow Python code, which you are free to improve and share.

Feedback is very welcome!

Formant synthesis application

Jonas Beskow at the Centre for Speech Technology, KTH Stockholm, wrote a free computer programme, Formant Synthesis Demo, that runs on Windows and Linux (and on any other OS for which the application can be compiled from the open source code the author kindly uploaded).

The programme synthesises F1, F2, F3 and F4 formants from several sources (rectangle, triangle, sine, sampled and noise). It “demonstrates formant-based synthesis of vowels in real time, in the spirit of Gunnar Fant’s Orator Verbis Electris (OVE-1) synthesiser of 1953” (from the About window).

“Formants are defined by Fant as ‘the spectral peaks of the sound spectrum |P(f)|’ of the voice. Formant is also used to mean an acoustic resonance, and, in speech science and phonetics, a resonance of the human vocal tract. It is often measured as an amplitude peak in the frequency spectrum of the sound, using a spectrogram (in the figure) or a spectrum analyzer, though in vowels spoken with a high fundamental frequency, as in a female or child voice, the frequency of the resonance may lie between the widely-spread harmonics and hence no peak is visible. In acoustics, it refers to a peak in the sound envelope and/or to a resonance in sound sources, notably musical instruments, as well as that of sound chambers” (Wikipedia).
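For readers who want to see the principle in code, here is a minimal sketch of cascade formant synthesis: a source signal (here an impulse train) is passed through one two-pole resonator per formant. It is not taken from the Formant Synthesis Demo; the function names, the formant frequencies and the bandwidths are illustrative assumptions, and the resonator is the standard second-order digital form used in Klatt-style synthesisers.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients of a two-pole digital resonator."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * freq_hz * t))
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    return a, b, c


def resonate(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter `signal` with y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(f0_hz, duration_s, sample_rate):
    """A very simple voice source: one unit impulse per pitch period."""
    period = int(round(sample_rate / float(f0_hz)))
    n = int(round(duration_s * sample_rate))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synthesise_vowel(formants, f0_hz=110, duration_s=0.2, sample_rate=16000):
    """Pass the source through one resonator per (frequency, bandwidth)."""
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonate(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


# Illustrative F1-F4 values for an [a]-like vowel (assumed, in Hz).
A_LIKE = [(700, 80), (1200, 90), (2600, 120), (3300, 150)]
samples = synthesise_vowel(A_LIKE)
print(len(samples))
```

Changing the source (for example to noise) or moving the formant frequencies changes the perceived vowel quality, which is what the demo lets you do interactively.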

The window of the Formant Synthesis Demo

The download link is on the Formant Synthesis Demo site.

Updated Latin: Meanings and Derivation

This is the first major update since the site went online two months ago. pyLatinam has been improved so that it “speaks” English now. There are also some grammatical updates.

Grammar: Meanings and translation

If you have heard about William Whitaker’s free Latin dictionary called WORDS, you probably know that his work is freely available for other applications. Thanks to this, pyLatinam Automatic Grammar now shows the meanings of the word after derivation. Here is an example of a declined noun with highlighted meanings:

This was done by first parsing the main WORDS file and making it available to the pyLatinam code. As the project improves, new parts of speech will be added; next to be implemented are pronouns and adjectives.

Grammar: The code

There are some significant changes in the code, but little of that is visible on the Interactive Pages. pyLatinam has a new class that makes working with its internal dictionaries easier. As mentioned before, the new code displays the meanings. In terms of grammatical correctness, the program will now recognize words like puer, pueri, m and gener, generi, m as those that keep the sound e during declension (the so-called fleeting e issue).
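The fleeting e can be illustrated with a small sketch. This is not pyLatinam's code; the function names are hypothetical and only the singular of second-declension masculine nouns is covered. The idea is that the stem is taken from the genitive singular, so puer, pueri keeps its e, while a noun like ager, agri loses it.

```python
def second_declension_stem(genitive):
    """Return the stem by stripping the genitive singular ending -i."""
    if not genitive.endswith('i'):
        raise ValueError('expected a genitive singular in -i')
    return genitive[:-1]


def keeps_e(nominative, genitive):
    """True for -er nouns whose stem retains the e (puer, gener)."""
    return (nominative.endswith('er')
            and second_declension_stem(genitive) == nominative)


# Singular endings of the second declension (masculine).
ENDINGS_SG = (('gen', 'i'), ('dat', 'o'), ('acc', 'um'), ('abl', 'o'))


def decline_singular(nominative, genitive):
    """Build the singular forms from the dictionary entry."""
    stem = second_declension_stem(genitive)
    forms = {'nom': nominative}
    for case, ending in ENDINGS_SG:
        forms[case] = stem + ending
    return forms


print(keeps_e('puer', 'pueri'))                  # True
print(keeps_e('ager', 'agri'))                   # False
print(decline_singular('gener', 'generi')['acc'])  # generum
print(decline_singular('ager', 'agri')['acc'])     # agrum
```

Deriving every oblique form from the genitive rather than from the nominative is what lets the same code handle both kinds of -er nouns.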

Visual stuff

This matters less for the functionality of the program than for making a visit to the site more pleasant: I have changed the default, almost generic design into, hopefully, something classical-looking.

FAQ Page

There is also a new page, FAQ, available in two languages (Serbian in both Latin and Cyrillic script):

English: pyLatinam Frequently Asked Questions
Serbian, Latin: pyLatinam česta pitanja (srpski, latinica)
Serbian, Cyrillic: pyLatinam честа питања (српски, ћирилица)

pyLatinam is online grammar software. It is free and open software, still in its early stages.

Your comments and suggestions are more than welcome!

See more articles about Automated Latin Grammar.