ELAN Exercise 2
Creating a single-language transcript with multiple speakers
Today’s files are found in the folder: eng-television.
Step one: Create a new ELAN file.
Follow the steps in ELAN Exercise 1 to create a new ELAN file with the audio file 710-eng-television.wav or the video file 710-asl-conversation.
Step two: Define linguistic types.
Because we’re starting to work with multiple tiers, it’s a good time to start thinking about ELAN’s linguistic types. This transcript will have one tier for each of the people in the conversation, and each tier will contain a textual representation of the actual words that are signed or spoken on the recording. Remember that text representations of the contents of the recordings are considered to be top-level parent tiers, so we’ll need a linguistic type that is associated directly with the timeline of the audio stream. Any tiers we make using this type will be used for dividing the audio stream into time-based portions. ELAN already assigns a default type like this when you create a new file. It’s called default-lt, but let’s give it a better name.
- Go to Type > Change Linguistic Type. This opens up the Change Type dialog box.
- Let’s give default-lt a better name now: something like text (or anything else you can easily remember). When picking a name for your type, remember that you’re NOT giving a name to an actual tier here. For example, it’s a bad idea to call this type Joe’s utterances, because you can use this type for any tier that is associated directly with the audio timeline: not just Joe’s utterances, but also Mary’s utterances, or even Mary’s individual words. Save those more specific names for your tiers.
The other items in the dialog box contain some information about the behavior of this linguistic type. You’ll see that the stereotype is None. This is the correct stereotype for this linguistic type (we’ll talk about the other stereotypes later). The menu called Use Controlled Vocabulary can be ignored for now, as can the text box called ISO Data Category and the References to Graphics Allowed check box. Notice that the Time-alignable check box is checked. This is correct for this type, since tiers made with this type are aligned directly to the audio timeline.
- Click Change to accept the name change to this type. Notice that the name changes in the window at the top of the dialog box.
- This is the only type we need for our single-language transcript, so click Close to close the dialog box.
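For readers curious about what this dialog changes behind the scenes: ELAN saves its documents as XML (.eaf) files, and a linguistic type is one small element in that file. The sketch below is an illustration only, not a substitute for doing the rename in ELAN. It assumes the EAF attribute names LINGUISTIC_TYPE_ID, TIME_ALIGNABLE, GRAPHIC_REFERENCES, CONSTRAINTS and LINGUISTIC_TYPE_REF, works on a tiny hand-written fragment rather than a complete file, and uses a hypothetical helper name (rename_linguistic_type).

```python
import xml.etree.ElementTree as ET


def rename_linguistic_type(root, old_name, new_name):
    """Rename a linguistic type and update every tier that refers to it."""
    for lt in root.iter("LINGUISTIC_TYPE"):
        if lt.get("LINGUISTIC_TYPE_ID") == old_name:
            lt.set("LINGUISTIC_TYPE_ID", new_name)
    for tier in root.iter("TIER"):
        if tier.get("LINGUISTIC_TYPE_REF") == old_name:
            tier.set("LINGUISTIC_TYPE_REF", new_name)


# Hand-written fragment standing in for a freshly created file (assumed layout,
# not a complete or valid .eaf document).
doc = ET.fromstring(
    "<ANNOTATION_DOCUMENT>"
    '<TIER TIER_ID="default" LINGUISTIC_TYPE_REF="default-lt"/>'
    '<LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="default-lt" '
    'TIME_ALIGNABLE="true" GRAPHIC_REFERENCES="false"/>'
    "</ANNOTATION_DOCUMENT>"
)

rename_linguistic_type(doc, "default-lt", "text")

lt = doc.find("LINGUISTIC_TYPE")
# The stereotype None is assumed to correspond to a missing CONSTRAINTS attribute,
# so lt.get("CONSTRAINTS") returns Python's None here.
print(lt.get("LINGUISTIC_TYPE_ID"), lt.get("TIME_ALIGNABLE"), lt.get("CONSTRAINTS"))
```

Note that the helper also updates the tier’s reference to the type. This mirrors the point made above: the type name is shared by every tier built on it, which is why it should not be named after one speaker.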
Step three: Define tiers.
Today we are going to make a transcript for the television.wav audio clip or the asl-conversation.mp4 video clip. If you like, you can follow along with the pre-typed transcript for the audio clip by opening eng-television.txt in a text editor (where the symbol @ stands for a pulse of laughter). You’ll see that we have four speakers in this transcript, so we need to create four tiers: one each for Alvin, Lea, Peter and Allison. Each of these will be a top-level parent tier, directly associated with the audio timeline. Again, ELAN gives you a default tier of the correct configuration (but only one), so we’ll give it a better name and then create more tiers just like the first one for each speaker.
- Go to Tier > Change Tier Attributes.
The first drop-down menu contains all the tiers you have set up for this file (it only contains one, called ‘default’). Below that is an input box with the current tier’s name. Let’s give it a better name. Since we need a tier for each speaker, let’s start by calling this tier ‘Alvin’, named after our first speaker.
The next input box is labeled Participant. Here you can type the full name of the speaker if you wish (Alvin Smith, for example). In the box labeled Annotator you can type your own name if you wish (or the name of whoever annotated this tier for you).
The drop-down menu labeled Parent Tier already contains None. This is correct: remember, this is going to be a parent tier, so it has None for its parent. The menu labeled Linguistic Type already has the correct type selected, the text type that we previously defined. The Default Language menu allows you to pick a language for this tier.
Click Change to accept the renamed tier. Notice that the name has changed both in the upper portion of the dialog box and in the main ELAN window.
Now we have to set up three more tiers just like this one for the other speakers in our audio file. Leave the dialog box open and click on the Add tab (or, if you’ve closed the box, go to Tier > Add New Tier).
In the Tier Name box, type the name of the second speaker: Peter. Again you can type the speaker’s full name in the Participant box, and your own name in the Annotator box. Leave the Parent Tier and Linguistic Type settings as they are (None and text respectively). When you’ve defined everything correctly, click Add to add this tier (if you make a mistake, click on the Change tab and fix it!).
Now add tiers just like this for the other speaker(s). Click Close to close the dialog box.
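The tier setup above can be summarized as a small data sketch: four tiers that share one linguistic type and have no parent. The Python below builds that structure as XML in the style of an .eaf file. It assumes the EAF attribute names TIER_ID, PARTICIPANT, ANNOTATOR, LINGUISTIC_TYPE_REF and PARENT_REF; the helper add_parent_tier is hypothetical, and the result is a fragment for illustration rather than a file ELAN could open.

```python
import xml.etree.ElementTree as ET

SPEAKERS = ["Alvin", "Lea", "Peter", "Allison"]


def add_parent_tier(root, tier_id, participant="", annotator="", ling_type="text"):
    """Append a top-level tier; leaving out PARENT_REF means the parent is None."""
    if any(t.get("TIER_ID") == tier_id for t in root.iter("TIER")):
        raise ValueError(f"tier {tier_id!r} already exists")
    return ET.SubElement(
        root,
        "TIER",
        {
            "TIER_ID": tier_id,
            "PARTICIPANT": participant,
            "ANNOTATOR": annotator,
            "LINGUISTIC_TYPE_REF": ling_type,
        },
    )


doc = ET.Element("ANNOTATION_DOCUMENT")
for name in SPEAKERS:
    add_parent_tier(doc, name, participant=name)

tier_ids = [t.get("TIER_ID") for t in doc.iter("TIER")]
print(tier_ids)
```

Every tier points at the same type name, so the four tiers differ only in their own name and participant, just as in the dialog box.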
Step four: Add annotations.
Now you can begin to add annotations for each speaker. Just like in the previous exercise, use the media controllers to listen to selected portions. When you are ready to transcribe, double-click on a blue-highlighted area to open up a text box.
Note that the multi-tier structure of ELAN allows you to overlap annotations in time whenever the recording contains overlapping speech. You may need to listen carefully for the overlaps!
As you work, try to keep the edges of the annotations as close as possible to the actual beginnings and ends of the utterances you are transcribing.
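To make the idea of overlapping annotations concrete, here is a minimal sketch that treats each annotation as a tier name plus start and end times in milliseconds, and reports where two different tiers overlap. The times and placeholder texts are invented for illustration and are not taken from the recording; find_overlaps is a hypothetical helper, not an ELAN function.

```python
from itertools import combinations

# (tier, start_ms, end_ms, text): invented values for illustration only
annotations = [
    ("Alvin", 0, 1800, "utterance A1"),
    ("Lea", 1500, 2600, "utterance L1"),
    ("Peter", 2600, 3400, "utterance P1"),
    ("Alvin", 3000, 4000, "utterance A2"),
]


def find_overlaps(anns):
    """Return (tier_a, tier_b, overlap_start, overlap_end) for each overlapping pair."""
    overlaps = []
    for a, b in combinations(anns, 2):
        different_tiers = a[0] != b[0]
        # Two intervals overlap when each one starts before the other ends.
        if different_tiers and a[1] < b[2] and b[1] < a[2]:
            overlaps.append((a[0], b[0], max(a[1], b[1]), min(a[2], b[2])))
    return overlaps


for tier_a, tier_b, start, end in find_overlaps(annotations):
    print(f"{tier_a} and {tier_b} overlap from {start} ms to {end} ms")
```

In this sketch, annotations that merely touch (one ends at 2600 ms exactly where the next begins) do not count as overlapping. That is one reason careful placement of annotation edges matters: sloppy boundaries can create overlaps that are not in the recording, or hide ones that are.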
If you would like to cut-and-paste from the text version of the transcript, feel free. Keep in mind this is a good time-saving technique if you are working with files that have already been transcribed in a separate text editor, which you are now time-aligning in ELAN for the first time.
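If you are pasting from an existing text transcript, it can help to group the utterances by speaker first, so that you can work through one tier at a time. The sketch below assumes, purely for illustration, that each line of the transcript has the form "NAME: utterance"; the actual layout of eng-television.txt may differ, and the sample lines are invented. The helper group_by_speaker is hypothetical.

```python
from collections import defaultdict


def group_by_speaker(lines):
    """Collect utterances per speaker from 'NAME: utterance' lines (assumed format)."""
    by_speaker = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a speaker label
        speaker, utterance = line.split(":", 1)
        by_speaker[speaker.strip()].append(utterance.strip())
    return dict(by_speaker)


# Invented sample lines; @ marks a pulse of laughter, as in the transcript.
sample = [
    "Alvin: first invented utterance",
    "Lea: @@ second invented utterance",
    "",
    "Alvin: third invented utterance",
]

grouped = group_by_speaker(sample)
for speaker, utterances in grouped.items():
    print(speaker, utterances)
```

The grouping keeps utterances in their original order within each speaker, which matches the order in which you would add them to that speaker’s tier.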