Linguistics graduate student Kyle Vanderniet will present a talk about his work at Emory University on May 20th at 3pm in Modern Languages, Room 201.
Abstract: Text corpora are the predominant tool used by linguists to study language. However, because they are composed chiefly of written sources, they present a problem when exploring natural, authentic language use. I will present results showing how Reddit can enable linguists to build corpora and models that more accurately reflect spontaneous, spoken language. I’ll illustrate this by walking through the analysis of an idiom, ‘[verb]in [one’s] *ss off’, that I used as a medium of comparison. I will also give a brief overview of software tools in Perl, Bash shell, Java, and Excel and how they aided me in this research. My results show that Reddit was dramatically more representative of how idioms are used in spoken language than traditional text corpora.