The complexity of spoken language is revealed when we segment it into small building blocks. Language has multiple levels: sentences are made up of words, words are made up of syllables and syllables are made up of small speech sound units called phonemes (oral vowels and consonants). 

These phonemes, specific to each language, are essential to distinguish words from each other. For example, the difference between the auditory words “lot” and “not” is based solely on the difference between the first phoneme of each word. The /l/ and the /n/ are phonemes of the French language. Phonemes are pronounced differently according to their position and their environment within a word, and they also vary as a function of regional accents. The different pronunciations that are allowed within a language, that is, those that lead to the same percept, are called “phones”.

Our SyllabO+ project was launched several years ago to fill knowledge gaps about the use of words, syllables and phonemes in Quebec French, which differs from other varieties of French, such as the French spoken in France, Senegal or Switzerland. Each variety of French is indeed unique in several respects.

The SyllabO+ project started in 2013. Funded by several @SSHRCCANADA grants, the project, led by Pascale Tremblay, began by collecting speech recordings from 225 people aged 18 to 97. These recordings were first transcribed into the International Phonetic Alphabet (IPA), a notation system used by linguists, phoneticians, terminologists, and dictionary makers to represent the sounds of all languages easily and unambiguously. The transcriptions were then cut up into words, then into syllables, and finally into phonemes. The corpus comprises 360,000 syllables! This represents an enormous amount of work, spread over several years.

SyllabO+ is a free online tool that allows people to search several databases: words, lemmas, syllables, and phones. The words database contains ~16,000 different words, the syllables database contains 5,614 different syllables, and the phones database contains 49 different phones.

The project is still in development. We are now working on morphemes, which are the units of meaning that make up words. This work is led by a team made up of Pascale Tremblay, Noémie Auclair-Ouellet, formerly a professor-researcher at McGill University, and Alexandra Lavoie, a laboratory assistant who segmented all the words in the corpus into morphemes. The paper is now written, and we are about to submit it. We are also working to integrate the morpheme database into our website. Stay tuned!

Over the years, many people have contributed to this gigantic project: Pascale Bédard, who made it her master’s subject and is still working with the team to create the new morpheme database; Johanna-Pascale Roy, professor at Université Laval; Patrick Drouin, professor at Université de Montréal; and several research assistants: Anne-Marie Audet, Julie Rivard, Émilie Belley, Claudie Ouellet, Chloé Chagnon-Dumesnil, Catherine Savard, and Catherine Denis.

What is SyllabO+ used for? SyllabO+ allows people to search for syllables and phonemes and obtain their usage statistics, including the frequency of occurrence of each syllable and each phoneme, among several other statistics. It provides a unique map of Quebec’s oral language! This information is useful for creating language assessment materials and language exercises for research or clinical purposes, which must be controlled for parameters such as the frequency of use of words or syllables. In the lab, we use SyllabO+ to develop the stimuli for each of our studies. This information is also useful for teachers and learners of French who want to find out which words are used most often. Finally, it is for all language lovers who would like to know, for example, whether the word “automobile” is more frequent than the word “car” or “vehicle”. The answer is on SyllabO+!
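For readers curious about what a "frequency of occurrence" looks like in practice, here is a minimal sketch in Python. It is only an illustration: the tiny corpus, its transcriptions, and the function name `syllable_frequencies` are all made up for this example and are not taken from SyllabO+, whose actual processing is not described in this post. The sketch assumes the words have already been segmented into syllables, and simply counts how often each syllable appears.

```python
from collections import Counter

# Hypothetical toy corpus: each word is a list of syllables.
# These transcriptions are illustrative only, NOT SyllabO+ data.
toy_corpus = [
    ["bɔ̃", "ʒuʁ"],
    ["o", "to", "mo", "bil"],
    ["o", "to"],
    ["mo", "to"],
]


def syllable_frequencies(corpus):
    """Return {syllable: (count, relative frequency)} for a segmented corpus."""
    # Count every syllable token across all words.
    counts = Counter(syl for word in corpus for syl in word)
    total = sum(counts.values())
    # Relative frequency = occurrences of the syllable / all syllable tokens.
    return {syl: (n, n / total) for syl, n in counts.items()}


freqs = syllable_frequencies(toy_corpus)

# List the syllables from most to least frequent.
for syl, (n, rel) in sorted(freqs.items(), key=lambda kv: -kv[1][0]):
    print(f"{syl}\t{n}\t{rel:.3f}")
```

The same counting idea applies to words or phonemes: only the unit being counted changes.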

Read also:

Our article on SyllabO+

Pascale Bédard’s master’s thesis, available on CorpusUL, Université Laval’s institutional repository, which aims to make scientific production freely accessible in order to increase its visibility and promote the sharing of knowledge in a sustainable way:

Suggested readings: