Teaching

Here is a list of the courses I have taught. Materials can be found in the highlighted sections. [materials will be uploaded soon]


2023

This workshop dealt with the automatic processing of phonetic data (acoustics, EMA) and how to use tools for their automatic annotation. The workshop had two parts: the first part covered forced alignment using the Montreal Forced Aligner (sessions 1 & 2), and the second part (session 3) presented methods for the automatic detection of gestural landmarks in EMA data. A GitHub repository contains the materials for this course, but you will also find the materials below. All data used in this course comes from publicly available datasets, with the exception of the third session, where I used data from one of my experiments (the speaker's consent is available).


Session 1 – Forced alignment, Montreal Forced Aligner I (10/03/2023, MDR, Salle Claude Simon)

This session introduced the basic principles of forced alignment and demonstrated how to use the MFA. It also included a short introduction to Python and the Anaconda data science platform. Since Windows users had problems installing packages via Anaconda, a solution was provided; it can be found in the README of session 1 in the GitHub repository. All files can be found below.
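As a small illustration of what such a call looks like, here is a minimal sketch (not part of the course materials) of driving the MFA command line from Python inside a conda environment; the model name french_mfa and the paths my_corpus/ and aligned/ are placeholders.

```python
# Minimal sketch (not from the course materials): driving the MFA command line
# from Python. Assumes the MFA is installed in the active conda environment;
# "my_corpus/" and "aligned/" are placeholder paths, "french_mfa" is a model
# name from the MFA 2.x repository (check `mfa model download --help`).
import subprocess

def run(cmd):
    """Run a command and raise if it fails."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Download a pretrained French acoustic model and pronunciation dictionary.
run(["mfa", "model", "download", "acoustic", "french_mfa"])
run(["mfa", "model", "download", "dictionary", "french_mfa"])

# Align the corpus: one audio file plus one .lab/.TextGrid transcription per utterance.
run(["mfa", "align", "my_corpus/", "french_mfa", "french_mfa", "aligned/"])
```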

Theory: This presentation introduced the basic principles of forced alignment. Download the PDF here.

After the theoretical introduction, several tutorials explored the capabilities of the MFA on data from various languages. Use the links to obtain the primary data.

Use Case 1a: French alignments – This tutorial demonstrates the basic functionality of the MFA using a French corpus as an example. The data comes from the SIWIS French Speech Synthesis dataset. Downloads: [PDF], [annotations]
Use Case 1b: Modifying the dictionary – You may notice that the "vanilla" dictionary does not account for variability in pronunciation. This tutorial demonstrates how to use and modify the dictionary in order to account for certain variants; a small sketch of the dictionary format follows after this list. Downloads: [PDF]
Use Case 2: Aligning a French audiobook – This tutorial demonstrates how to align utterances from an audiobook. The data comes from a freely available LibriVox audiobook of Victor Hugo's "Le dernier jour d'un condamné"; the corresponding text can be found here. Note that the data in the download folder have already been converted to wav files. Downloads: [PDF]
Use Case 3: English alignments – English data from the DARPA TIMIT dataset are aligned in this tutorial. The .lab files containing the corresponding transcriptions of the utterances are provided in the download folder. Two different acoustic models and dictionaries were used for the alignments. Downloads: [PDF]
Use Case 4: Aligning Swahili broadcast speech – This use case demonstrates how to align Swahili broadcast speech data. The full corpus is available here. Downloads: [PDF]
Use Case 5: Aligning Hausa speech data – The last tutorial shows the alignment of Hausa data from the Mozilla Common Voice initiative. The entire dataset can be found here. Two pretrained acoustic models are available: the hausa_cv model without an accompanying dictionary, and the hausa_mfa model, which includes a dictionary. Downloads: [PDF]
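The dictionary modifications in Use Case 1b boil down to editing a plain text file. As a hedged sketch (not the course script), the snippet below writes pronunciation variants in the basic MFA dictionary format, where each line holds a word followed by whitespace-separated phones and a word may appear on several lines, one per variant; the word and phone symbols are purely illustrative.

```python
# Sketch of the basic MFA dictionary format (illustrative entries, not taken
# from the course data): one line per pronunciation variant, the word followed
# by whitespace-separated phone symbols.
variants = {
    "quatre": ["k a t ʁ", "k a t"],  # hypothetical full and reduced variant
}

with open("french_variants.dict", "w", encoding="utf-8") as f:
    for word, pronunciations in variants.items():
        for pron in pronunciations:
            f.write(f"{word}\t{pron}\n")

# The resulting file can be merged with the downloaded dictionary or passed to
# the MFA as a custom dictionary when aligning.
```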

Session 2 – Forced alignment, Montreal Forced Aligner II (24/03/2022, ENS, Salle Paul Langevin)

This session showed how to validate a corpus, how to train acoustic and G2P models, and how these models are used in the MFA. The session started with an introduction to these functions; afterwards, several tutorials applied them to real data. First, the introductions to these functions:

Useful functions in PRAAT and Audacity – This tutorial introduces some options for pre-segmenting the utterances in a corpus using either PRAAT or Audacity, since the MFA works best with one utterance per audio file. After the rough segmentation is done, the scripts in the scripts folder extract all utterances and create a corpus folder that conforms to the corpus structure required by the MFA. Downloads: [PDF], [scripts]
Corpus validation – This tutorial shows how to use the validate function of the MFA to detect errors in a corpus and a dictionary; a minimal sketch of the call follows after this list. Downloads: [PDF]
G2P model: dictionary creation – This tutorial introduces grapheme-to-phoneme (G2P) models and shows how they can be used to modify or create pronunciation dictionaries. Downloads: [PDF]
Training of acoustic models – This tutorial shows how to train a new acoustic model, which is necessary for forced alignment. A Basque dataset is used for illustration. Downloads: [PDF]
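As announced in the corpus validation entry above, here is a minimal sketch of the validate call (assuming an MFA 2.x installation reachable from Python); the corpus path and dictionary name are placeholders.

```python
# Minimal sketch: validating a corpus and a dictionary before alignment.
# "my_corpus/" and "french_mfa" are placeholders.
import subprocess

subprocess.run(["mfa", "validate", "my_corpus/", "french_mfa"], check=True)
# The validator reports issues such as missing transcriptions, out-of-vocabulary
# words and unreadable sound files, so they can be fixed before running
# `mfa align` or `mfa train`.
```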

Afterwards, three tutorials applied these functions to real data.

Use Case 1: French audiobook alignment 2 – This tutorial shows how to use Audacity/PRAAT for pre-segmentation and the subsequent alignment of the data. Downloads: [PDF]
Use Case 2: Training of a new acoustic model on Mandarin speech data – This tutorial walks through the training of a new acoustic model using a large dataset (~10 h) of Mandarin; a minimal sketch of the training call follows after this list. Downloads: [PDF]
Use Case 3: Xhosa data – In this notebook we work on Xhosa data, a language for which the MFA repository provides no acoustic model, dictionary or G2P model. Downloads: [PDF]
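For Use Case 2, the core of the training step is a single MFA call. The sketch below is an assumption about how such a call can be scripted from Python, not the tutorial itself; all paths are placeholders.

```python
# Sketch of training a new acoustic model with the MFA command line
# (placeholder paths; consult `mfa train --help` for the exact options of your
# MFA version).
import subprocess

subprocess.run(
    [
        "mfa", "train",
        "mandarin_corpus/",       # corpus: audio files + transcriptions
        "mandarin.dict",          # pronunciation dictionary
        "mandarin_acoustic.zip",  # output path of the trained acoustic model
    ],
    check=True,
)
# The trained model can then be used with `mfa align` like any pretrained model.
```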

Session 3 – Automatic landmark detection (EMA data) (14/04/2022, ENS, Salle Paul Langevin)

This session introduced gestural landmarks and approaches to their automatic annotation.

EMA cheat sheet – A basic introduction to gestural landmarks and their detection based on the trajectories' velocity and tangential velocity. Downloads: [PDF]
ema2wav – A short introduction to the ema2wav converter. Downloads: [PDF]
Automatic landmark detection – This tutorial shows how to use the Python-based landmark_detection.py script, which implements several methods for gestural landmark detection based on 20% of peak velocity (velocity and tangential velocity); a small illustration of the criterion follows after this list. Downloads: [PDF], [script]
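To make the 20% peak-velocity criterion mentioned above concrete, here is a small, self-contained illustration (an assumption-level sketch, not the landmark_detection.py script from the course): it computes the tangential velocity of one EMA sensor over one movement interval and places gesture onset and offset where the velocity first and last exceeds 20% of its peak.

```python
# Illustration of the 20% peak-velocity criterion for one movement interval
# (a sketch, not the course's landmark_detection.py).
import numpy as np

def landmarks_20_percent(x, y, fs):
    """Return (onset, peak, offset) sample indices for one movement interval."""
    # Tangential velocity from the first differences of the x/y trajectories.
    vx = np.gradient(x) * fs
    vy = np.gradient(y) * fs
    vtang = np.sqrt(vx**2 + vy**2)

    peak = int(np.argmax(vtang))
    threshold = 0.2 * vtang[peak]

    above = np.where(vtang >= threshold)[0]
    onset, offset = int(above[0]), int(above[-1])
    return onset, peak, offset

# Synthetic example: a smooth opening/closing-like movement sampled at 250 Hz.
fs = 250
t = np.linspace(0, 1, fs)
x = np.tanh(6 * (t - 0.5))
y = np.zeros_like(t)
print(landmarks_20_percent(x, y, fs))
```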

2022


2018

Speech Signal Databases and their Processing, PRAAT scripting course (=Module P4.3, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)

Language Documentation and Fieldwork, Tutorial (=Module P9a.2, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)

Phonetics and Phonology II, Transcription course (=Module K3.2/K3.3, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)

A few tasks I designed for this tutorial (in German). Click the button to download the PDF.

2017

Phonetics and Phonology I, Tutorial (=Module K2.2, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)

Typological Analysis, Tutorial (=Module K10.2, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)

Phonetics and Phonology II, Transcription course (=Module K3.2/K3.3, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)


2016

Phonetics and Phonology I, Tutorial (=Module K2.2, B.A. Empirical Linguistics, Goethe University Frankfurt am Main)