We are proud to introduce SyllabO+ [SilabO+], the first corpus and database dedicated to spoken French in Québec! SyllabO+ contains the recordings of 225 adult native speakers of Quebec French ranging in age from 20 to 97 years old. The recordings were made in formal contexts (e.g. courses, radio interviews) and during informal conversations, in order to represent the different language registers, which have a huge impact on word use and therefore on distributional statistics. All recordings were manually transcribed into the International Phonetic Alphabet (IPA), segmented into syllables and saved as annotated XML files. Next, we created two sub-lexical databases: one for the phones and one for the syllables of spoken Quebec French. We then developed a set of analytical tools to calculate absolute and normalized frequencies, transition probabilities, and mutual information for single syllables and phones, and for syllable and phone collocations. The syllable database contains 364,302 syllables (including 5,613 unique syllables). The phone database contains 830,189 phones (including 49 unique phones). The database and tools are now integrated into an open-access web application. Moreover, a description of SyllabO+ is now published in Behavior Research Methods.
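To give a feel for the kinds of statistics listed above, here is a minimal Python sketch. It is not the SyllabO+ implementation: the function name `sublexical_stats`, the toy utterances, the per-million normalization, the forward definition of transition probability, P(b | a), and the base-2 pointwise mutual information are all assumptions chosen for illustration, and the definitions used by the actual tools may differ.

```python
import math
from collections import Counter


def sublexical_stats(utterances):
    """Compute toy distributional statistics over syllabified utterances.

    `utterances` is a list of utterances, each a list of syllable strings.
    Returns (freq, colloc): per-syllable frequencies and per-bigram statistics.
    All definitions here are illustrative assumptions, not the SyllabO+ ones.
    """
    unigrams = Counter()
    bigrams = Counter()
    for utt in utterances:
        unigrams.update(utt)
        # Adjacent syllable pairs within one utterance (no cross-utterance pairs).
        bigrams.update(zip(utt, utt[1:]))

    total = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    # Absolute frequency and frequency normalized per million syllables.
    freq = {
        syl: {"absolute": n, "per_million": n / total * 1_000_000}
        for syl, n in unigrams.items()
    }

    # Marginal counts of each syllable as first / second element of a pair.
    first_counts = Counter()
    second_counts = Counter()
    for (a, b), n in bigrams.items():
        first_counts[a] += n
        second_counts[b] += n

    colloc = {}
    for (a, b), n in bigrams.items():
        p_ab = n / total_bigrams
        p_a = first_counts[a] / total_bigrams
        p_b = second_counts[b] / total_bigrams
        colloc[(a, b)] = {
            "absolute": n,
            # Forward transition probability: P(b | a).
            "forward_tp": n / first_counts[a],
            # Pointwise mutual information in bits.
            "pmi": math.log2(p_ab / (p_a * p_b)),
        }
    return freq, colloc


# Hypothetical toy data: three short utterances, already split into syllables.
toy_utterances = [["bɔ̃", "ʒuʁ"], ["bɔ̃", "swaʁ"], ["bɔ̃", "ʒuʁ"]]
freq, colloc = sublexical_stats(toy_utterances)
```

With real data, the same function could be run separately on subsets of speakers (for example by age group, sex, or formal versus informal context) to compare distributions across groups.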
This project was started in summer 2013 and was carried out by Pascale Tremblay and her team at the Speech and Hearing Neuroscience Laboratory at Université Laval. It forms the bulk of the master's thesis of Pascale Bédard, M.Sc., who completed this work under the supervision of Pascale Tremblay, and in collaboration with Patrick Drouin, Ph.D., Université de Montréal, and Johanna-Pascale Roy, Ph.D., Université Laval. Several other students have been involved in the project, including Anne-Marie Audet, Julie Rivard, and Claudie Ouellet. More recently, the work of Chloé Chagnon-Dumesnil, Micaël Carriel and Catherine Denis has allowed us to increase the size of the corpus from 184 to 225 speakers.
The project was funded by the Social Sciences and Humanities Research Council of Canada (SSHRC) through an Insight Development Grant and a Connexion grant to P. Tremblay, and made possible through a Leaders Opportunity Fund (LOF) from the Canada Foundation for Innovation (CFI) also to P. Tremblay. We also thank our research centre (CERVO Brain Research Centre) where our laboratory is located, as well as Université Laval.
SyllabO+ was created to enable the scientific study of sub-lexical phenomena in oral language. No other database focuses on sub-lexical phenomena in spoken Quebec French, and no other tool provides information about the use of syllables and phones as a function of the age and sex of the speakers, as well as communication context. SyllabO+ is therefore a unique research tool that will serve to create controlled stimuli as well as to describe syllable use in spoken Quebec French. It provides a complete list of all syllabic structures and phones in spoken French in Québec, as well as distributional information for each of these structures. It will therefore greatly enhance current knowledge of sub-lexical phenomena in spoken language. The databases can also be used to provide a linguistic description of the contemporary use of spoken French in Quebec in terms of phonological and phonetic phenomena, including phonotactic rules, phonetic inventory, etc. The main fields of application for SyllabO+ are cognitive neuroscience of language, psycholinguistics, neurolinguistics, phonetics, phonology, and the study of first and second language learning.
SyllabO+ is also a unique source of information to support the work of clinicians. We hope that it will be used to develop targeted and age-appropriate speech interventions to rehabilitate speech perception and production, including strategies based on phonological awareness. It provides information on the use of different syllabic structures as a function of age, sex, and communication context (formal, informal). SyllabO+ will also be a great resource for developing evaluation materials, allowing clinicians and researchers to create material controlled in terms of structure and frequency of use.
SyllabO+ was developed with open access in mind, so that both the research community and knowledge users can use it, thus maximizing the impact of publicly funded research. The database is now available here.
We are currently developing a third database, containing all the words of the spoken language! In December 2016, we received new funding from the SSHRC (Connexion Program) to complete the word database. We expect that the word database will be available in 2017! Visit our website regularly to be informed of this exciting development.