Research – Digital Forensic Linguistics

Research Topics

Our group focuses on computational linguistic analyses of digital media. Particular research topics of interest are:

harmful language online, such as disinformation and hate speech
discourse and dialog structure
machine learning approaches to social media discourse
computational analysis of non-literal meaning
linguistic variability in digital media
emojis and other linguistic social media phenomena
computational approaches to QUDs

Research Output

Current Funded Projects

Semantics and pragmatics of emojis in digital communication 1/2

DFG Priority Programme "Visual Communication" ViCom; DFG, 2022-2028
PIs: Tatjana Scheffler, Patrick Grosz (Oslo)
project website

In this project, we investigate the explanatory power of iconic and symbolic approaches to emoji semantics via corpus analysis and a set of experiments that explores the continuum between pictorial and symbolic emojis. Starting from the hypothesis that face emojis are composed of (iconic or symbolic) minimal units, we study how existing and novel emojis are processed semantically. This empirical basis will allow us to differentiate between the proposed iconic and lexical approaches to emoji meaning and develop a hybrid semantics.

Virtual Common Ground: Language and Virtual Identity

SFB1567, subproject D03; DFG, 2022-2026
PI: Tatjana Scheffler
project website

The subproject investigates which linguistic means are used to construct and maintain the Common Ground in virtual communication. Using corpus-linguistic, computational, and experimental methods, it examines two primary phenomena: the indication of group membership through positive speech (the counterpoint to hate speech), and the construction of virtual identities via linguistic self-representation and via adaptation to interlocutors in virtual life worlds.

Disagreements and their limits in discourse structure annotation: From insights to machine learning

SFB1287 (U Potsdam), subproject C11; DFG, 2026-2029
PIs: Manfred Stede (U Potsdam), Tatjana Scheffler
project website

Project C11 investigates Human Label Variation (HLV) in pragmatic annotation, challenging the traditional “single ground truth” assumption, and examining its potential as a meaningful signal for discourse structure. The project will identify causes of both inter- and intra-annotator disagreement, to inform new HLV-enriched discourse models. The project will develop input representations and evaluation schemes for machine learning under HLV, which ultimately may benefit downstream tasks. By addressing the hidden limits of discourse annotation variability, C11 contributes to CRC Cluster C’s focus on Hidden Variability.

Metaphor and social positioning in religious online forums

SFB1475, subproject C04; DFG, 2022-2025
PIs: Tatjana Scheffler, Frederik Elwert (RUB CERES)
project website

The project (German title: „Metapher und soziale Positionierung in religiösen Online-Foren“) investigates the extent to which metaphors of social roles are used in the utterances of religious laypeople in online forums. In terms of content, the focus is on the use of metaphors to demarcate different groups. On the methodological side, the project explores computational linguistic methods that exploit the discourse structure of the forums in order to find and interpret metaphors semi-automatically in large amounts of data.

Other Projects

"DEDICAITE – DEtecting AI-generated TExts in a DIdactic Context", in collaboration with St. Josef-Hospital and the Department of Medical Informatics, Biometry and Epidemiology
PostDocLab "A Multi-dimensional Approach to Advancing Disinformation Research: Narratives, Effects and Counter Structures", College UA Ruhr

Recently Completed Projects

SFB1287 (U Potsdam), subproject T01 "Transforming text across media"; DFG, 2021-2025
noFake: Crowdworking, digital platforms, social media, project website; BMBF, 2021-2024
SFB1287, subproject A03 "Discourse Strategies across Social Media: Variability in Individuals, Groups, and Channels"; DFG, 2017-2021
Teaching project: Digitale Analyse großer Textkorpora (Digital analysis of large text corpora); RUB "Forschendes Lernen", 2021-2022
Teaching project: LIMELDAS, empirical work with linguistic data; RUB "Forschendes Lernen", 2020-2021

Resources

TwiBloCoP – Multimedia corpus of parenting bloggers: tweets & blogs

Linguistics of Emojis

We maintain a public Zotero library of papers related to the linguistic analysis of emojis. You can view or use the library here.

If any references are missing, you can suggest additions to the library via this online form.

Face Emoji Dataset