Automating the understanding of the global narrative structure of long texts and stories remains a major challenge for state-of-the-art natural language understanding systems, particularly because annotated data is scarce and existing annotation workflows do not scale well to complex narrative phenomena. In this work, we focus on identifying narrative levels in texts, which correspond to stories that can be embedded (stories within stories) or otherwise coordinated within narratives. Lacking sufficient pre-annotated training data, we explore a common machine learning remedy for data scarcity: automatically augmenting a small existing data set of annotated samples through data synthesis. We present a workflow for narrative level detection that includes the operationalization of the task, a model, and a data augmentation protocol for automatically generating narrative texts annotated with breaks between narrative levels. Our experiments suggest that narrative levels in long texts constitute a challenging phenomenon for state-of-the-art NLP models, but that synthetically generated training data improves prediction results considerably.
@inproceedings{Reiter2022aa,
  Title = {{Exploring Text Recombination for Automatic Narrative Level Detection}},
  Author = {Nils Reiter and Judith Sieker and Svenja Guhr and Evelyn Gius and Sina Zarrieß},
  Booktitle = {{Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)}},
  Pages = {3346--3353},
  Location = {Marseille, France},
  Month = {June},
  Year = {2022}
}
TY  - CONF
TI  - Exploring Text Recombination for Automatic Narrative Level Detection
AU  - Nils Reiter
AU  - Judith Sieker
AU  - Svenja Guhr
AU  - Evelyn Gius
AU  - Sina Zarrieß
PY  - 2022
CY  - Marseille, France
J2  - Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)
ER  - 