class: inverse, center, middle

# Introduction to Fieldwork: From elicitation to ELAN <br>

## Session 3: Analysing field data

## Naomi Peck

### Albert-Ludwigs-Universität Freiburg <br> 2022-02-12 (updated: 2022-02-12)

<img src="freiburg-logo.png" height="125px"/>

<!-- insert VJS logo too? figure this out -->

---
class: middle, center

# Welcome back!

---

# Recap

Yesterday, we discussed what fieldwork is, how to collect and manage data, and ways of eliciting data.

Hopefully you managed to work with someone to create a recording that you'll be using today!

---
class: middle, center

# How did you go with recording?

???

For those of you who weren't able to make a recording before today, supplementary recordings of me (!) are available on my website [here](https://naomipeck.com/project/fieldwork-workshop/).

---

# Elicitation Tasks

- translation tasks
  - Leipzig-Jakarta Word List
  - TMA Questionnaire
- grammatical judgements
- stimuli tasks
  - Jackal and the Crow
  - Pear Story
- text elicitation
  - sociolinguistic interview
  - "can you tell me about...?"
- keeping data security in mind while trying to get naturalistic data
- working with a consultant for the first time

---
class: middle, center, inverse

# Analysing Recordings

---

# Analysing Recordings

Processing any kind of data takes time. <br><br>

For language documentation purposes especially, this takes quite a long time. A common rule of thumb is that one minute of data will take ten minutes to transcribe and translate (by a native speaker). More complicated data (e.g., multiple speakers, overlaps) can take even longer. This is known as the *transcription bottleneck*.

--

### It is highly unlikely that you will ever transcribe everything that you record.

---

# Analysing Recordings

When deciding to create any sort of transformative work from your data (even transcription and translation!), there are a number of analytic decisions you will have to make.

--

1. Units of Analysis

--

2. Methods of Description

---

# Unit of Analysis

When we work with any kind of audiovisual signal, we have to decide how we will divide it up for analysis. Unfortunately, this analytical decision is still often not well discussed in the literature, especially in descriptive grammars.

--

Traditionally, linguists segment audiovisual media into clauses or sentences. What could be a problem with this?

--

Himmelmann (2006) recommends segmenting on the basis of intonation units. Intonation units are posited to be universal for spoken languages (Himmelmann et al. 2018) and therefore should be a robust unit for comparison across corpora.

The boundaries of intonation units are best distinguished by pauses, a clear final rise/fall in pitch, and the lengthening of final syllables. Sometimes the boundaries are a little fuzzier!

???

Himmelmann, Nikolaus P. 2006. The challenges of segmenting spoken language. In Jost Gippert, Nikolaus P. Himmelmann & Ulrike Mosel (eds.), *Essentials of language documentation*, 253–274. Berlin: Mouton de Gruyter.

Himmelmann, Nikolaus P., Meytal Sandler, Jan Strunk & Volker Unterladstetter. 2018. On the universality of intonational phrases in spontaneous speech: a cross-linguistic interrater study. *Phonology* 35. 207–245.

---

# Unit of Analysis

The further units of analysis that are relevant will depend on what you would like to research.

--

What kinds of analytic units would be relevant for these research domains?

- morphology
- phonetic variation
- grammatical relations
- gesture

--

Units like the 'word' can also be quite thorny issues (Himmelmann 2006). Dixon and Aikhenvald (2002) is a great introduction to this topic.

???

Dixon, Robert M.W. and Alexandra Y. Aikhenvald (eds.). 2002. *Word: A cross-linguistic typology*. Cambridge: Cambridge University Press.

Himmelmann, Nikolaus P. 2006. The challenges of segmenting spoken language. In Jost Gippert, Nikolaus P.
Himmelmann & Ulrike Mosel (eds.), *Essentials of language documentation*, 253–274. Berlin: Mouton de Gruyter.

---

# Methods of Description

By this, I mean *how* you will conduct your analysis.

You may normally think of this as what abbreviations you will use for glossing or what other kinds of shorthand you may use in your writing.

--

However, this can be as broad as how you decide to transcribe your file.

--

>#### "...the act of transcription [...] is often undertaken as a purely methodological activity, as if it were theory neutral. ... Decisions as seemingly straightforward as to how to lay out the text, to those more nuanced - like how much non-verbal information to include and how to encode minutiae such as pause length and utterance overlap - have far-reaching effects on the utility of a transcript and the directions in which the transcript may lead analysts."
>
> (Kendall 2008:337)

???

Kendall, Tyler. 2008. On the history and future of sociolinguistic data. *Language and Linguistics Compass* 2(2), 332–351.

Nagy, Naomi & Devyani Sharma. 2013. Transcription. In Robert J. Podesva & Devyani Sharma (eds.), *Research Methods in Linguistics*, 235–256. Cambridge: Cambridge University Press.

---

# Methods of Description

Some questions you may ask yourself about transcription:

- what should I transcribe?
- who should be doing the transcription?
- should the transcription be phonetic or phonemic?
- is there a standard orthography for this language? is there more than one?
- will other linguists understand the transcription?
- will the language community understand my transcription?
- where do I put spaces? do I need spaces?
- what role does punctuation play in my transcription?
- do I use an automatic speech recogniser?
- do I transcribe false starts and non-linguistic sounds (e.g., coughs)?
- how much detail should I go into?

???

Nagy, Naomi & Devyani Sharma. 2013. Transcription. In Robert J. Podesva &
Devyani Sharma (eds.), *Research Methods in Linguistics*, 235–256. Cambridge: Cambridge University Press.

Jung, Dagmar & Nikolaus P. Himmelmann. 2011. Retelling data: Working on transcription. In Geoffrey Haig, Nicole Nau, Stefan Schnell & Claudia Wegener (eds.), *Documenting endangered languages*, 201–220. Berlin: Mouton de Gruyter.

---

# Methods of Description

Similar questions apply to translation.

- do I need to translate the transcription?
- what language is appropriate for my research purposes?
- what language is appropriate for my language community?
- do I translate literally or freely?

---

# Methods of Description

If you are creating glosses, the most common rules to follow are the [Leipzig Glossing Rules](https://www.eva.mpg.de/lingua/resources/glossing-rules.php) (Comrie et al. 2015). The rules also include an appendix of commonly used abbreviations.

The most important things to consider are:

1. Word-by-word alignment and morpheme-by-morpheme correspondence
1. Capitals (small caps) for grammatical glossing (i.e., 1SG/<span style="font-variant:small-caps;">1sg</span> not 1sg)
1. Glosses with multiple parts to them should be aligned to one word/morpheme and separated by the appropriate punctuation

???

Comrie, Bernard, Martin Haspelmath, and Balthasar Bickel. 2015. *Leipzig glossing rules: Conventions for Interlinear Morpheme-by-Morpheme Glosses*. Leipzig: Max Planck Institute for Evolutionary Anthropology.

---

# Methods of Description

These punctuation rules include:

- periods between grammatical categories (e.g., <span style="font-variant:small-caps;">-dat.pl</span>)
- underscores between multi-word glosses (e.g., be_at)
- colon between grammatical categories you do not want to gloss separately (e.g.,
eat:<span style="font-variant:small-caps;">3sg</span>)
- backslashes instead of hyphens if a grammatical property is indicated by a morphophonological change (e.g., father\<span style="font-variant:small-caps;">pl</span>-<span style="font-variant:small-caps;">dat.pl</span>)
- greater than sign indicating the direction of action if two arguments are expressed simultaneously (e.g., <span style="font-variant:small-caps;">2du>3sg</span>)

???

Comrie, Bernard, Martin Haspelmath, and Balthasar Bickel. 2015. *Leipzig glossing rules: Conventions for Interlinear Morpheme-by-Morpheme Glosses*. Leipzig: Max Planck Institute for Evolutionary Anthropology.

---
class: middle

# What kinds of decisions will you have to make in transcribing your recording today?

---
class: middle, center, inverse

# Analysing the Data

---

# Analysing the Data

Unfortunately, I do not have the time today to go into this! Here, however, are some good resources for those who wish to go in the field:

Ameka, Felix, Alan Dench, and Nicholas Evans (eds.). 2006. *Catching Language: The Standing Challenge of Grammar Writing*. Berlin/New York: Mouton de Gruyter.

Jun, Sun-Ah (ed.). 2005. *Prosodic Typology: The Phonology of Intonation and Phrasing*. Oxford: Oxford University Press.

Kroeger, Paul R. 2005. *Analyzing Grammar*. Cambridge and New York: Cambridge University Press.

Payne, Thomas E. 1997. *Describing morphosyntax: a guide for field linguists*. Cambridge: Cambridge University Press.

Tagliamonte, Sali A. 2006. *Analysing Sociolinguistic Variation*. Cambridge: Cambridge University Press.

---

# Further Sources

Chelliah, Shobhana L. and Willem J. De Reuse. 2011. *Handbook of Descriptive Linguistic Fieldwork*. Dordrecht: Springer.

Gippert, Jost, Nikolaus P. Himmelmann & Ulrike Mosel (eds.). 2006. *Essentials of language documentation*. Berlin: Mouton de Gruyter.

Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. *Linguistics* 36, 161–195.

Himmelmann, Nikolaus P. 2018. Meeting the transcription challenge.
*Language Documentation and Conservation* 15, 33–40.

Kendall, Tyler. 2008. On the history and future of sociolinguistic data. *Language and Linguistics Compass* 2(2), 332–351.

---
class: inverse, center, middle

# Short Break