Emu Workshop

Hywel Stoakes
31/08/2017

What is Emu?

  • A flightless bird native to the Australian continent (Sahul)
  • A way to manage annotated speech files and analyse them statistically in an efficient way

More Specifically?

Emu is a speech database management system:

  • Developed by Raphael Winkelmann at IPS in Munich.
  • Grew out of the Emu Speech Database developed at Macquarie University by Steve Cassidy, Jonathan Harrington and others.

Why use Emu?

  • A fast and powerful way to process speech files, particularly when they are part of large corpora.
  • Hierarchical analysis
  • Access to statistical tools in R

What we will learn in this workshop.

  • How to visualise speech files in the Emu Web App
  • Formats for your data
  • Best practice for naming files
  • How to set up a database in emuR
  • Some basic visualisation in R

Comparisons between Praat and Emu

  • Many of you will be familiar with Praat.
  • What does Emu do that Praat does not?
    • Platform independent and browser based (works best in Chrome or Firefox)
    • Interfaces easily with R

Importing from Praat TextGrids

  • Navigate to: http://ips-lmu.github.io/EMU-webApp/
  • Drag and drop a pair of files (.wav and .TextGrid)
  • These can be up to an hour long, depending on the audio quality
    • (there is a limit of 2 GB per tab in Chrome)
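
Before dragging files into the web app, it can help to confirm that each recording has a TextGrid with the same base name, and to see how large the audio files are. The sketch below does this in base R. The helper name `check_pairs` and the folder name in the usage comment are hypothetical and not part of Emu.

```r
# Report, for every .wav file in a folder, whether a TextGrid with the
# same base name exists and how large the audio file is.
# `check_pairs` is a hypothetical helper written for this illustration.
check_pairs = function(dir) {
  wavs = list.files(dir, pattern = "\\.wav$", ignore.case = TRUE)
  tgs  = list.files(dir, pattern = "\\.TextGrid$", ignore.case = TRUE)

  # Compare file names with their extensions removed.
  wav_base = tools::file_path_sans_ext(wavs)
  tg_base  = tools::file_path_sans_ext(tgs)

  data.frame(
    wav          = wavs,
    has_textgrid = wav_base %in% tg_base,
    size_mb      = round(file.size(file.path(dir, wavs)) / 1024^2, 2),
    stringsAsFactors = FALSE
  )
}

# Usage (the folder name is an assumption):
# check_pairs("my_recordings")
```

For whole folders of matching wav/TextGrid pairs, emuR also offers convert_TextGridCollection(), which builds an emuDB from them.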

Emu Web App: annotation

  • See help in the Emu Web App
  • Full set of Keyboard shortcuts
  • Feature requests welcomed: Link

Emu Web App: options

  • Spectrogram
  • Download TextGrid
  • Download JSON

Emu Web App: limitations

  • No access to speech analysis tools
  • No direct access to statistical and visualisation tools within R

Advanced: Installing R, RStudio

Advanced: Installing emuR

Once R and RStudio are installed:

install.packages("emuR")

To load the library, use:

library("emuR")

Quickstart in R:

  • Create and load a demo emuDB
create_emuRdemoData()

followed by

ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"))
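
Once the demo database is loaded, a few calls give a quick overview of what it contains. This is a minimal sketch that assumes emuR is installed. The check on the demo folder is an added precaution so the snippet can be re-run in the same R session.

```r
library(emuR)

# Create the demo data only if it is not already in the temporary folder.
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

summary(ae)                 # overview of the database
list_levelDefinitions(ae)   # annotation levels, e.g. "Phonetic"
list_bundles(ae)            # one row per recording (bundle)
```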

Quickstart in R:

  • Query the emuDB for all /n/ segments of the “Phonetic” annotation level
    sl = query(ae, "Phonetic == n")
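
The query returns a segment list, which behaves like a data frame. The sketch below looks at it and then uses the annotation hierarchy to find the syllables that contain each /n/. It assumes the demo database from the previous step. The hierarchical query string follows the pattern used in the emuR introduction vignette.

```r
library(emuR)

# Load the demo database (created first if it is missing).
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

sl = query(ae, "Phonetic == n")
nrow(sl)                     # number of matching segments
summary(sl$end - sl$start)   # durations; start and end are in milliseconds

# Hierarchical query: return the Syllable items that dominate each /n/.
syl = query(ae, "[Phonetic == n ^ #Syllable =~ .*]")
table(syl$labels)

# A similar lookup, starting from the existing segment list.
syl2 = requery_hier(ae, sl, level = "Syllable")
```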

Quickstart in R:

  • Extract the corresponding formant values
td = get_trackdata(ae, sl, onTheFlyFunctionName = "forest")
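
As a first visualisation, the formant values can be reduced to one frame per segment and plotted. This is a sketch under assumptions: it relies on a recent emuR version in which get_trackdata() accepts resultType = "tibble" and names the formant columns T1, T2 and so on. Older versions return a different object, so the column access may need adjusting.

```r
library(emuR)

# Load the demo database (created first if it is missing).
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

sl = query(ae, "Phonetic == n")

# cut = 0.5 keeps only the frame at the temporal midpoint of each segment.
mid = get_trackdata(ae, sl, onTheFlyFunctionName = "forest",
                    cut = 0.5, resultType = "tibble", verbose = FALSE)

# F2 against F1 with reversed axes, as is conventional for formant plots.
plot(mid$T2, mid$T1,
     xlim = rev(range(mid$T2)), ylim = rev(range(mid$T1)),
     xlab = "F2 (Hz)", ylab = "F1 (Hz)",
     main = "/n/ at segment midpoint")
```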

Quickstart in R:

  • Serve the emuDB to the EMU-webApp for annotation / visual inspection

serve(ae)

For a more detailed introduction, see

vignette("emuR_intro")

Advanced: Database construction

TBA