EL5102 Phonetics and Phonology Assignment 2: Crime-Solving Mystery
Published: 2025-09-29
EL5102 Phonetics and Phonology
Sem 1, AY2025-26
Assignment 2: Crime-Solving Mystery
DUE: Monday, 29 September, 3 PM (class time)
Introduction
Forensic phonetics is the application of phonetics and phonology to investigating crimes and other legal issues. While forensic phonetics is considered a subfield of forensic linguistics in some parts of the world, in certain regions where this method of investigating crime is popular (e.g., the UK), a terminological distinction is made between forensic phonetics, which focuses on speech data, and forensic linguistics, which examines patterns in text-based evidence.
The use of phonetic data to narrow down the regional or social characteristics of a speaker on a recording has played an important role in criminal cases over the years, and has involved some major scholars in linguistics. In the 1980s, the founder of modern sociolinguistics, William Labov, was a key witness in a bomb threat case; his testimony convinced the jury that a series of bomb threat phone calls had been made by someone with a Boston accent, while the defendant had a New York accent.
Can we use the phonetics skills we have learned in this course so far to identify the origins of two suspects?
Learning Objectives
• Become more familiar with the use of ELAN and Praat to transcribe speech data.
• Gain experience using Praat scripts to analyze vowels.
• Gain experience using spectrograms to identify phonetic features of speech.
• Gain experience plotting vowels.
• Gain familiarity with identifying dialectal differences.
Scenario
The police have asked you to analyze one recording each from Suspect A and Suspect B.
Let’s see if we can figure out where they are from! (Note: Don’t worry, this is an imaginary crime investigation! These audio clips are just from two online content creators.)
Assignment Instructions
Phase 1: Transcription
1.a. ELAN transcription
• In the Assignment 2 folder you can find an orthographic transcription of both Suspect A and B’s audio data.
• For both Suspect A and B, use ELAN to create a transcription with one tier, IP, in which the orthographic transcription of the speaker is divided approximately into intonational phrases – you can just segment the annotations according to pauses in speech.
• Save each of the transcriptions as an .eaf file (the normal transcription file).
• Export each of the transcriptions to Praat .TextGrid format. You will use these in the next stage.
Notes and Tips for ELAN transcription
• You can simply copy and paste segments of the orthographic transcription, which is already given to you, into ELAN. There is no need to retype it.
• Don’t forget to save frequently so that you don’t lose your work!
• You can export your transcription using File > Export As > Praat TextGrid.
1.b. Praat transcription
• For Speakers A and B, using the TextGrid files exported from ELAN, create two additional interval tiers in each TextGrid: word and phone. You can do this by opening the TextGrid in Praat, clicking View & Edit, and using Tier > Add interval tier. Remember to save your updated TextGrid file!
• For both speakers, use Praat to segment two clear tokens of each of the following twelve vowels (indicated below using Wells lexical keywords – the corresponding IPA for these vowels in various English varieties is given on Slides 63 and 64 of the Week 2 lecture):
1. FLEECE (equivalent to SHEEP on the Week 2 Slide 63 summary of vowels)
2. KIT (equiv. to SHIP on the Week 2 Slide 63 summary)
3. DRESS (equiv. to SET on the Week 2 Slide 63 summary)
4. TRAP
5. GOOSE
6. FOOT
7. STRUT
8. THOUGHT
9. LOT
10. BATH (equiv. to CLASS on the Week 2 Slide 63 summary)
11. FACE (equiv. to MAY on the Week 2 Slide 64 summary)
12. GOAT (equiv. to MOW on the Week 2 Slide 64 summary)
• For each token, segment the vowel on the phone tier and label it with the Wells lexical keyword (not with IPA). Then segment the word that it occurs in on the word tier, and label it with that word. You do not need to segment and label any words or phones other than the words that correspond to the two tokens per vowel that you want to analyze.
Notes and Tips for Praat transcription
• By ‘clear token’, we mean one where the vowel is pronounced clearly, preferably in a stressed syllable of a content word (i.e., not a function word like ‘and’, ‘or’, ‘but’, etc.), and not next to glides or liquids that obscure where the vowel starts or stops.
• However, if you can’t find two tokens of each vowel like this, you can compromise and use a token with an unclear start or stop, or in an unstressed syllable.
• Don’t worry if you can’t find two tokens for each vowel; just do your best. If you can’t find two tokens for a particular vowel, you may note this in your writeup.
• Only segment the vowel you want to analyze on the phone tier; do not segment any of the other phones in the word.
• Remember to segment the entire word for the word tier, and just the vowel for the phone tier. You can see an example of this in the Week 4 sample files (starr sample vowels), but note that this sample uses Arpabet notation instead of Wells keywords to indicate the vowels.
• Don’t forget to save frequently so that you don’t lose your work!
Phase 2: Plotting the vowels
2.a Collecting the vowel data
• Using the Praat script that we practiced using in Week 4, collect the formant information from the midpoint of each vowel token for each speaker (the script should do this automatically). Use these results to create a spreadsheet that reports the results of the script; you can simply import the results.txt file into a spreadsheet program.
• Using the data from the script, calculate the average F1 and F2 value for each of the 12 vowels above, for each speaker. Create a new spreadsheet (or new tab on the same spreadsheet) that shows these average values.
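If you would like to double-check your spreadsheet averages, the calculation can be sketched in Python as below. This is only a minimal sketch: the column names (speaker, vowel, F1, F2) and the tab-separated layout are assumptions, so adjust them to match the results.txt file that the Week 4 script actually produces. The numbers in the sample are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def average_formants(rows):
    """Return {(speaker, vowel): (mean_F1, mean_F2)} from per-token rows."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["speaker"], row["vowel"])
        grouped[key].append((float(row["F1"]), float(row["F2"])))
    return {
        key: (mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens))
        for key, tokens in grouped.items()
    }


# Illustrative, made-up measurements (not real data); column names are assumed.
sample = (
    "speaker\tvowel\tF1\tF2\n"
    "A\tFLEECE\t300\t2500\n"
    "A\tFLEECE\t320\t2700\n"
    "A\tTRAP\t800\t1700\n"
)
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
averages = average_formants(rows)
for (speaker, vowel), (f1, f2) in sorted(averages.items()):
    print(speaker, vowel, round(f1), round(f2))
```

To use it on your own data, replace the sample string with the contents of your file, for example by passing an open file object to csv.DictReader instead of io.StringIO.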
Notes and Tips for collecting the vowel data
• This Praat script is prone to making certain types of errors: for high front vowels, it may perceive a ‘ghost’ formant between F1 and F2 and mistake it for F2. If that is the case, then the value that it reports for ‘F3’ is actually the real F2 value. If you see an F2 value that looks suspiciously low, and an F3 value that is more similar to what the F2 value is for the average [i] vowel (check Slide 66 from Week 3 for the F2 value for female speakers), then the F3 value it recorded is most likely the actual F2 value.
• The value that the script uses by default as the max frequency of the range to look for five formants is 5500 Hz, for an adult female speaker – you can see this value in the window that appears when you run the script. This means that the script looks for F1 through F5 within the range of 0 Hz to 5500 Hz. If the script is making a number of mistakes like the one above, where it perceives F2 as too low, you can simply raise the max value to something like 6500 Hz, and it will be more likely to look for higher F2 values. If you are having the opposite problem, and it is giving you values that are too high, you can lower the max frequency to 5000 Hz.
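The ‘ghost formant’ check described above can be sketched as a small Python function. This is a rough illustration rather than part of the course script: the function name, the 20% tolerance, and the example numbers are all assumptions, and the expected F2 value should be read off Slide 66 rather than taken from this sketch.

```python
def correct_ghost_f2(f2, f3, expected_f2, tolerance=0.2):
    """Return the more plausible F2 for a high front vowel token.

    If the reported F2 falls well below the expected value while the
    reported F3 lies within `tolerance` (as a proportion) of the expected
    F2, treat the reported F3 as the real F2. Otherwise keep the reported F2.
    """
    low = expected_f2 * (1 - tolerance)
    high = expected_f2 * (1 + tolerance)
    if f2 < low and low <= f3 <= high:
        return f3
    return f2


# Placeholder reference value for illustration only; use the Slide 66 figure.
EXPECTED_F2 = 2700

print(correct_ghost_f2(1500, 2650, EXPECTED_F2))  # suspicious token
print(correct_ghost_f2(2600, 3300, EXPECTED_F2))  # plausible token
```

A check like this only flags likely errors. It is still worth looking at the spectrogram for any flagged token, and re-running the script with a different max frequency if many tokens are affected.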
2.b Create vowel charts
• Create two vowel charts that show the data for Suspects A and B (each speaker on a separate chart).
• One simple way to do this would be to use NORM:
https://lingtools.uoregon.edu/norm/norm1.php
o Use this template to upload your vowel data:
https://lingtools.uoregon.edu/norm/downloads/NORM_template.xls
o Keep the ‘result type’ at ‘Formant values, un-normalized’
• Another option is to plot the vowels in Excel. Here is a video explaining how to do this: https://www.youtube.com/watch?v=JOQb_jY7KDE
• As an example, here is a plot of my vowels using Arpabet notation made using Excel:
Notes and Tips for plotting the vowels
• Remember that we are looking for two separate plots for Suspect A and Suspect B.
• Remember that each plot should include one point per vowel, indicating the average of the F1 and F2 values for the two data points you collected for each vowel.
• If you use NORM to create your plots, the template includes columns to the right for optional ‘glide’ information in the case of diphthongs; you may leave those columns blank.
• NORM creates a plot that is in .eps format, which is difficult to read; you may simply right-click on the image and save it as a .png file.
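If you prefer to script your plots instead of using NORM or Excel, a minimal Python sketch is given below. It assumes the third-party matplotlib library is installed, and the function names and the 100 Hz padding are illustrative choices rather than course requirements. Following the usual convention for vowel charts, both axes run from high to low values so that high front vowels appear at the top left.

```python
def axis_limits(points, pad=100):
    """Return ((x_max, x_min), (y_max, y_min)) so both axes run high-to-low.

    `points` maps a vowel keyword to its average (F1, F2) in Hz.
    """
    f1s = [f1 for f1, _ in points.values()]
    f2s = [f2 for _, f2 in points.values()]
    return (max(f2s) + pad, min(f2s) - pad), (max(f1s) + pad, min(f1s) - pad)


def plot_vowels(points, title, outfile):
    """Save a labelled F1/F2 scatter plot (one point per vowel) to a file."""
    # matplotlib is third-party; it is imported here so that axis_limits
    # can still be used on a machine where it is not installed.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file without opening a window
    import matplotlib.pyplot as plt

    xlim, ylim = axis_limits(points)
    fig, ax = plt.subplots()
    for vowel, (f1, f2) in points.items():
        ax.scatter(f2, f1)
        ax.annotate(vowel, (f2, f1))
    ax.set_xlim(*xlim)  # left limit > right limit reverses the axis
    ax.set_ylim(*ylim)
    ax.set_xlabel("F2 (Hz)")
    ax.set_ylabel("F1 (Hz)")
    ax.set_title(title)
    fig.savefig(outfile)
    plt.close(fig)


# Made-up averages for illustration only (not real data).
example_points = {"FLEECE": (300, 2600), "TRAP": (800, 1700)}
print(axis_limits(example_points))
```

Calling plot_vowels(example_points, "Suspect A", "suspect_a.png") would then write one chart; call it once per speaker to get the two separate plots.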
Phase 3: Analysis
Please create a writeup of approximately 500 words that addresses the following topics (3.a to 3.c):
3.a Is Suspect A probably from Glasgow, Liverpool, London, or Yorkshire?
• Please make a claim regarding which of the above four regions Suspect A is most likely to be from. To support your claim, you should discuss at least two key pieces of evidence. This evidence should refer to the features listed in Table 1 below:
Table 1. Selected key features distinguishing English varieties from Glasgow, Liverpool, London, and Yorkshire (see further information in the notes and tips section below).
| Features | Glasgow | Liverpool | London | Yorkshire |
|---|---|---|---|---|
| Monophthongal FACE and GOAT | Yes | No | No | Yes |
| FOOT-STRUT merger | No | Yes | No | Yes |
| Postvocalic rhoticity | Yes | No | No | No |
| TRAP-BATH split | No | No | Yes | No |
| HAPPY | [e] | [i] | [i] | [ɛ] |
• For each piece of evidence, please explain how this evidence helps to distinguish your chosen variety from at least one of the other listed varieties.
• Each piece of evidence should include screenshots of at least one spectrogram and a brief explanation of how the spectrogram supports your analysis.
• Your evidence can also refer to the F1 and F2 values given for American English from Hillenbrand et al. 1995 (Week 3, Slide 66).
• After presenting your evidence, briefly address the following questions: How confident do you feel about your identification? Is there any other accent of the four choices that you feel could potentially also be correct?
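The process of elimination behind this kind of argument can be sketched in Python. The dictionary below simply re-encodes Table 1; the function name and the feature labels used as keys are illustrative. The sketch only shows why two well-chosen features can be enough to narrow four candidates down to one. It is not a substitute for the spectrogram evidence that your writeup needs.

```python
# Table 1 re-encoded as a dictionary (values copied from the table).
TABLE_1 = {
    "Glasgow": {
        "monophthongal FACE/GOAT": "yes", "FOOT-STRUT merger": "no",
        "postvocalic rhoticity": "yes", "TRAP-BATH split": "no", "HAPPY": "e",
    },
    "Liverpool": {
        "monophthongal FACE/GOAT": "no", "FOOT-STRUT merger": "yes",
        "postvocalic rhoticity": "no", "TRAP-BATH split": "no", "HAPPY": "i",
    },
    "London": {
        "monophthongal FACE/GOAT": "no", "FOOT-STRUT merger": "no",
        "postvocalic rhoticity": "no", "TRAP-BATH split": "yes", "HAPPY": "i",
    },
    "Yorkshire": {
        "monophthongal FACE/GOAT": "yes", "FOOT-STRUT merger": "yes",
        "postvocalic rhoticity": "no", "TRAP-BATH split": "no", "HAPPY": "ɛ",
    },
}


def consistent_regions(observed, table=TABLE_1):
    """Return the regions whose listed features match every observed feature."""
    return [
        region
        for region, features in table.items()
        if all(features[name] == value for name, value in observed.items())
    ]


# Hypothetical observations, purely to show how the filtering works.
print(consistent_regions({"FOOT-STRUT merger": "yes"}))
print(consistent_regions({"FOOT-STRUT merger": "yes",
                          "monophthongal FACE/GOAT": "yes"}))
```

Real speakers do not always match every feature in the table, which is one reason the writeup asks how confident you are and whether another accent could also fit.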
3.b Is Suspect B probably from Kenya, India, Jamaica, or South Africa?
Please address the same topics as listed above in 3a. In the case of Suspect B, please refer to the following key features:
Table 2. Selected key features distinguishing English varieties from South Africa, Kenya, India, and Jamaica (see further information in the notes and tips section below).
| Features | South Africa | Kenya | India | Jamaica |
|---|---|---|---|---|
| SQUARE | [e] | [ea] | [ɛ] | [iɛ] |
| NURSE | [ɜ] | [a] | [ɜ] | [a] |
| Postvocalic rhoticity | Variable | No | Variable | Variable |
| TRAP-BATH split | Yes | No | Yes | Yes |
| TH/DH stopping | No | Yes | Yes | Yes |
3.c Reflection
Please conclude your writeup with a brief reflection that addresses the following:
• Which parts of this assignment did you each find the most challenging?
• What challenges did you encounter in this assignment, and how did you overcome them?
Notes and Tips for the analysis
• Regarding the features for Tables 1 and 2, you may find further information here:
o FOOT-STRUT merger: this refers to the pattern found in accents of northern England in which words with the vowel in STRUT are pronounced with the same rounded vowel that is used in FOOT (e.g., [ʊ]). https://en.wikipedia.org/wiki/Phonological_history_of_English_close_back_vowels#FOOT%E2%80%93STRUT_split
o TRAP-BATH split: this refers to the pattern in dialects like RP in which BATH-class words are pronounced with a low back vowel (e.g., [ɑ]) while TRAP-class words are pronounced with a front vowel (e.g., [æ]). Accents such as American English merge both these vowel classes into a single vowel (e.g., [æ]). https://en.wikipedia.org/wiki/Trap%E2%80%93bath_split
o TH/DH stopping: this refers to pronouncing [θ] as [t] and [ð] as [d].
o Postvocalic rhoticity: this refers to the pronunciation of [r] following vowels and before consonants or pauses. Dialects labelled as ‘no’ in the chart are similar to RP and delete [r] in these contexts, while dialects labelled as ‘yes’ or ‘variable’ will always/sometimes pronounce this [r].
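For features that involve comparing two vowel classes (such as FOOT vs. STRUT, or TRAP vs. BATH), one simple way to put a number on how close two vowels are is the straight-line distance between their average (F1, F2) points. The Python sketch below illustrates this; the function name and the example values are made up. It deliberately sets no cut-off for what counts as ‘merged’, and because raw Hz distances do not correspond directly to perceived differences, the result should support your spectrogram evidence rather than replace it.

```python
import math


def vowel_distance(a, b):
    """Euclidean distance in Hz between two (F1, F2) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Made-up averages for illustration only (not real data).
foot = (400, 1000)
strut = (700, 1400)
print(vowel_distance(foot, strut))
```

Comparing this distance with the distance between two vowels that you are sure are distinct for the same speaker (for example, FLEECE and TRAP) gives a rough sense of scale.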
Submission
This assignment is due on Week 7 at 3 PM (our class time).
Please submit the following online in Canvas > Assignments > Assignment 2
1. ELAN .eaf files for Suspects A and B (two files total)
2. Praat .TextGrid files for Suspects A and B (two files total)
3. Spreadsheet (.xlsx) files for the raw formant values for each vowel token for Suspects A and B, plus two additional spreadsheets showing the average F1 and F2 values for each vowel for each speaker (separate files or separate spreadsheet tabs in one file are both okay)
4. Vowel plots for Suspects A and B (two plots total, can be part of the Excel files or separate image files)
5. Writeup for Phase 3 analysis for Suspects A and B (one file total, .docx or .pdf format is fine)
Assessment
Section: Marks
ELAN files: 10
Praat files: 10
Spreadsheets: 10
Vowel plots: 20
Analysis Writeup: 50
  Suspect A: 20
  Suspect B: 20
  Reflection: 10
Total: 100
The above sections of your project will be marked with the following criteria in mind:
• Whether the transcripts were completed according to the instructions
• The accuracy of the vowel segmentation
• The appropriateness of the vowel selection
• Whether the spreadsheets have been generated according to the instructions
• The accuracy of the vowel formants
• The clarity of the vowel plots
• The quality and clarity of the evidence provided for each identification
• The thoughtfulness of the reflections
