Chunking in verbal art and speech

How do we learn to organize a language into chunks and to use those chunks creatively? To address this question, FORMULEARN rethinks formulaicity and creativity through discriminative and integrative learning in a complex dynamic system, shifting the focus from the written to the spoken word.

Theories of chunking are based on abstract rules or on the storage of large numbers of exemplars, always with a view of linguistic knowledge as linear combinations of units, such as phonemes or morphemes. Recently, Baayen and collaborators have proposed that linguistic chunking is based on discriminative learning, which creates statistical expectations within the complex dynamic system of cues and outcomes underlying language, without discrete units being necessary.

The Parry-Lord theory of oral composition-in-performance argued that oral singers produce complex poems out of rehearsed improvisation through the mastery of a system of formulas, chunks that integrate phrasal, metrical, and semantic structures. Ubiquitous throughout the history of the species, oral traditional performance is the original form not only of poetry but also of public discourse in general. Its comparison with general speech and text can shed light on chunk formation, on how chunking is learned, and on how it relates to meaning and creativity.

FORMULEARN reconsiders formulaicity and creativity by contrasting these theories. It seeks to design the first quantitative studies of formulaic creativity based not on morphosyntactic patterns but on sequences of multimodal cues directly linked to semantic contrasts and cognitive states, using statistical and machine-learning techniques such as discriminative learning models and generalized additive models.


Pagán Cánovas, C. In preparation. Oral poetics and multimodal language models. In Rethinking Orality III: From Homer to Neuroscience. Berlin, Boston: De Gruyter.

Pagán Cánovas, C. 2020. Learning formulaic creativity: Chunking in verbal art and speech. In T. Hoffmann (ed.), Construction Grammar and creativity: Evolution, psychology, and cognitive science, special issue of Cognitive Semiotics 13(1).

Pagán Cánovas, C. 2019. Creatividad formular: aprendizaje y segmentación en la poesía oral y en el discurso hablado. In R. González Ruiz, I. Olza & Ó. Loureda Lamas (eds.), Homenaje a Manuel Casado Velarde: textos breves. Pamplona: Universidad de Navarra. 51-56.

Antović, M. & Pagán Cánovas, C. 2018. Not dictated by metrics: Function words in the speech introductions of South-Slavic oral epic. Language and Communication 58: 11-23.

Pagán Cánovas, C. & Antović, M. (eds.) 2016. Oral Poetics and Cognitive Science. Berlin: Mouton de Gruyter.

Pagán Cánovas, C. & Antović, M. 2016. Introduction: Oral poetics and cognitive science. In C. Pagán Cánovas and M. Antović (eds.) Oral Poetics and Cognitive Science. Berlin: Mouton de Gruyter. 1-11.

Pagán Cánovas, C. & Antović, M. 2016. Construction grammar and oral formulaic theory. In C. Pagán Cánovas and M. Antović (eds.) Oral Poetics and Cognitive Science. Berlin: Mouton de Gruyter. 79-98.

Pagán Cánovas, C. & Antović, M. 2016. Formulaic creativity: Oral poetics and cognitive grammar. Language and Communication 47: 66-74.


(all by FORMULEARN’s PI, Cristóbal Pagán Cánovas)

Cognition et culture dans la créativité formulaire : la segmentation du discours à travers de l’art verbal et la parole. 10èmes Ateliers d’été du CerLiCO : Langage, cognition, …et cultures. Université Bayonne, 8-9 Oct 2021.

The FORMULEARN project: Formulaic creativity and the origins of language and music. Language, Music, and Meaning Colloquium, University of Cologne, 13 July 2021.

The cross-cultural patterns of poetic performance in oral traditions: multimodality, communication, and cognition. Plenary lecture, International Conference on Intercultural Multimodal Communication. Hunan Normal University, Changsha, 12-13 December 2020.

Formulaic creativity and cognitive oral poetics. Creativity and Construction Grammar Workshop. KU Eichstätt-Ingolstadt, 19-20 Mar 2019.

ORALMIND: Oral poetic traditions and the origins of human cognition, communication, and creativity. CCLS Lecture Series, Faculty of Arts, University of Cologne, 21 Jan 2019.

Formulaic creativity: Connecting the main tenets of cognitive linguistics and oral poetics. Formula: Units of Speech – ‘Words’ of Verbal Art. Helsinki, 17-19 May 2017.

Oral poetic discourse and the construction of culture: Cognitive, literary, and linguistic approaches. Panel organized at DiscourseNet 15. Belgrade, 19-21 Mar 2015. Two talks: “Oral poetic discourse: an introduction” and “Construction grammar and oral formulaic theory” (with Mihailo Antović).

Composition in performance: Construction grammar and oral formulaic poetry, with Mihailo Antović. 8th Conference on Construction Grammar. Osnabrück, 3-6 Sep 2014.

Oral Poetics and Cognitive Science. Organized with Mihailo Antović. Freiburg Institute for Advanced Studies, 24-26 Jan 2013. Two talks: “Oral Poetics and Cognitive Science: An Introduction” and “Formulaic Style and Construction Grammar: Speech Introductions”.

Towards a Cognitive Oral Poetics: Constructions, Frames and Mappings in the Study of Oral Epic Poetry. DGKL 5, German Cognitive Linguistics Association. Freiburg, 10-12 Oct 2012.


FORMULEARN is currently supported by an Alexander von Humboldt Fellowship for Experienced Researchers (2019-2022), which allows Cristóbal Pagán Cánovas to spend time at the Department of Quantitative Linguistics of the Eberhard Karls Universität Tübingen, in Germany, where he is working with Professor Harald Baayen and his collaborators.

In its initial phases, the project was generously supported by a Tandem Fellowship in Linguistics and Literary Studies from the Freiburg Institute for Advanced Studies (2012-2013), awarded to Cristóbal Pagán Cánovas and Mihailo Antović.