Big data and language

In this talk for the seminar series organised by Zaragoza Lingüística, Dr Valenzuela discusses the possibilities that big data-based approaches offer to language research. In short, big data, or “macro data”, is the massive body of data that users generate in their interactions with the digital world, whose enormous volume and heterogeneous nature require specialized treatment. The talk first reviews the main characteristics of big data and then focuses on the problems that can arise from using it in linguistic analysis. The final section reviews specific studies that apply this approach to multimodality research, that is, the study of language that includes not only the verbal component but also multimodal aspects such as gestures or intonation. The talk concludes with an overview of the advantages and drawbacks of using this type of data.