Program

The preliminary program of the international conference Historical Languages and AI contains the abstracts for most of the contributions.

The final version will be available here from February 2026.

Thursday, 5 March 2026

Program item Time Speaker
Arrival & Coffee 09:00 - 09:30
Opening & The Daidalos Project (hybrid) 09:30 - 10:00
Panel 1: Datasets & Data Augmentation I (hybrid)
ALMeD: Medieval French Gold-Standard 10:00 - 10:30 Ragini Menon & Sabine Tittel (Heidelberg)
Abstract

Full Title: ALMeD: A domAin-specific goLd-standard for MeDieval French

All authors:

  • Ragini Menon
  • Sabine Tittel

Medieval French is one of the Romance vernacular languages with a significant history of transmission and an enormous influence on other vernacular languages of the time. The sources convey valuable knowledge about all aspects of culture. Studying this language increasingly relies on NLP approaches. There have been numerous efforts to parse medieval French; however, these efforts use corpora comprising legal texts and—primarily—literary texts as their foundation and training data. This means that, with the exception of juridical resources, scientific literature is overlooked.

We present ALMeD, an annotated, semantically rich corpus (work in progress) of medical and surgical treatises produced in medieval French, i.e., Old and Middle French including Anglo-Norman. Our semantically disambiguated gold standard goes beyond existing resources by focusing on medical terminology, and accounts for morphological development, evolving syntactic structures, and the lack of spelling normalisation in these languages.

Unified Old Church Slavonic Dataset 10:30 - 11:00 Anastasija Margolina (Belgrade)
Abstract

Full Title: Down the Slavic Memory Lane: A Unified Old Church Slavonic Corpus for Core NLP Tasks

All authors:

  • Anastasija Margolina

This work introduces a large, unified Old Church Slavonic (OCS) corpus to address the under-representation of OCS texts in a standardized digital format. The openly licensed corpus merges 45 heterogeneous sources into a single dataset containing 256,000 text segments, which amounts to 4.4 million tokens. All texts feature standardized Unicode encoding and retain their full diacritics. To demonstrate the corpus’s usability for core NLP tasks, a six-class genre classifier was developed by fine-tuning a RuBERT model, achieving a weighted F1 score of 0.8.

The complete corpus, preprocessing scripts, and the fine-tuned model are made publicly available at HuggingFace OCS-Dataset.
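For readers curious what such a fine-tuning setup looks like in practice, here is a minimal sketch using the Hugging Face transformers API; the checkpoint id, dataset files, label column, and hyperparameters are illustrative assumptions, not the author's actual configuration.

```python
# Sketch: fine-tuning a BERT-style model for six-class genre classification.
# Model id, data files, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "DeepPavlov/rubert-base-cased"  # a public RuBERT checkpoint (assumed)
NUM_GENRES = 6

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=NUM_GENRES)

# Hypothetical dataset with "text" (OCS segment) and "label" (genre id) columns.
ds = load_dataset("json", data_files={"train": "ocs_train.jsonl", "test": "ocs_test.jsonl"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

ds = ds.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ocs-genre", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
)
trainer.train()
```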

Coffee break 11:00 - 11:30
Panel 2: Datasets & Data Augmentation II (hybrid)
Data Augmentation Strategies for Premodern Greek LM 11:30 - 12:00 Jacob Murel et al. (Princeton)
Abstract

Full Title: A Comparison of Data Augmentation Strategies for Premodern Greek Language Models

All authors:

  • Jacob Murel
  • Sarah Yuan
  • Barbara Graziosi

This paper compares standard and original data augmentation (DA) strategies for premodern Greek language models (LMs). We also propose an original DA method intended to reflect manuscript variance. We evaluate DA’s impact on LM performance for downstream philological tasks: gap-filling, error detection, and correction.

Initial results show DA yields marginal gains, with our variance-based DA performing worse than noisier methods. We suggest this may stem from the degree of variation introduced, highlighting the need to balance grammatical accuracy with textual diversity in classical language LMs.
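To make the idea of variance-based augmentation concrete, here is a toy sketch that perturbs text with substitutions drawn from a table of plausible scribal confusions; the confusion pairs and perturbation rate are invented for illustration and are not the authors' method.

```python
# Toy variance-style augmentation: swap character units according to a
# hypothetical table of scribal confusions. Pairs and rate are invented.
import random

CONFUSIONS = {"ει": "ι", "ι": "ει", "ο": "ω", "ω": "ο", "αι": "ε"}

def augment(text: str, rate: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    out, i = [], 0
    while i < len(text):
        # Try two-character units first, then single characters.
        for span in (2, 1):
            chunk = text[i:i + span]
            if chunk in CONFUSIONS and rng.random() < rate:
                out.append(CONFUSIONS[chunk])
                i += span
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(augment("ειπον ο ανθρωπος", rate=0.5))
```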

InviTE Corpus: Annotating Tudor Invectives 12:00 - 12:45 Sophie Spliethoff et al. (Bielefeld)
Abstract

Full Title: The InviTE Corpus: Annotating Invectives in Tudor English Texts for Computational Modeling

All authors:

  • Sophie Spliethoff
  • Sanne Hoeken
  • Silke Schwandt
  • Sina Zarrieß
  • Özge Alaçam

In this paper, we apply Natural Language Processing (NLP) techniques to historical research, specifically the study of religious invectives in the context of the Protestant Reformation in Tudor England. We outline a workflow spanning from raw data, through pre-processing and data selection, to an iterative annotation process.

As a result, we introduce the InviTE corpus – a corpus of almost 2000 Early Modern English (EModE) sentences, which are enriched with expert annotations regarding invective language throughout 16th-century England. Subsequently, we assess and compare the performance of fine-tuned BERT-based models and zero-shot prompted instruction-tuned large language models (LLMs), which highlights the superiority of models pre-trained on historical data and fine-tuned to invective detection.
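As an illustration of the zero-shot LLM baseline mentioned above, the following sketch classifies a sentence with a plain prompt via the OpenAI API; the model name, prompt wording, and YES/NO protocol are assumptions for demonstration, not the authors' setup.

```python
# Sketch of a zero-shot invective classifier via prompting.
# Model choice and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "You are annotating 16th-century English religious polemics.\n"
    "Does the following sentence contain invective language? Answer YES or NO.\n\n"
    "Sentence: {sentence}"
)

def classify_zero_shot(sentence: str) -> bool:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": PROMPT.format(sentence=sentence)}],
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")
```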

Lunch 12:45 - 13:45
Pitch your Research Idea (hybrid) 13:45 - 14:30
Mobility.AI. Detecting Mobility in Antique Sources András Handl (Leuven)
Introducing the COALA Project Barbara McGillivray (London)
Technical Standards for Latin and Ancient Greek Konstantin Schulz (Berlin)
Science Speed Dating (in person only) 14:30 - 16:00
Coffee break 16:00 - 16:30
Panel 3: LLMs (hybrid)
Evaluating Latin/Greek Preverbs with LLMs 16:30 - 17:15 Andrea Farina & Michele Ciletti (London)
Abstract

Full Title: Probing Preverbs: Evaluating Large Language Models on Latin and Ancient Greek Preverbed Motion Verbs

All authors:

  • Andrea Farina
  • Michele Ciletti

Preverbs, i.e., prefixes that modify verbal bases, play a central role in the semantics of Latin and Ancient Greek. Their meanings range from fully compositional to highly lexicalized, making them an ideal test case for evaluating the semantic capacities of Large Language Models (LLMs). We investigate the ability of 13 LLMs to interpret preverb semantics across a dataset of 2,834 manually annotated preverbed motion verbs, under zero-, one-, two-, and five-shot prompting conditions.

We find that LLMs perform moderately well overall, with GPT-5 achieving the highest F1 of 0.629. Performance improves with increased context examples and is generally higher for compositional, non-lexicalized preverbs, Ancient Greek, and earlier historical periods. Qualitative evaluation highlights systematic differences in consistency, sensitivity to lexicalization, and handling of polysemy, revealing that reasoning-enabled and larger models tend to generate more accurate and internally coherent interpretations.
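A toy sketch of how such n-shot prompts can be assembled; the example verbs and the compositional/lexicalized label set are placeholders rather than the paper's annotation scheme.

```python
# Sketch: assembling an n-shot prompt for preverb interpretation.
# Demonstration examples and labels are invented placeholders.
EXAMPLES = [
    ("exeo", "ex-", "compositional: motion out of"),
    ("intellego", "inter-", "lexicalized: no spatial meaning"),
]

def build_prompt(verb: str, preverb: str, n_shots: int = 2) -> str:
    lines = ["Decide whether the preverb is compositional or lexicalized."]
    for v, p, label in EXAMPLES[:n_shots]:
        lines.append(f"Verb: {v} | Preverb: {p} -> {label}")
    lines.append(f"Verb: {verb} | Preverb: {preverb} ->")
    return "\n".join(lines)

print(build_prompt("abeo", "ab-", n_shots=2))
```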

LLM for Middle French Translation (online) 17:15 - 18:00 Raphael Rubino et al. (Geneva)
Abstract

Full Title: Prompting Large Language Model for 16th Century Middle French Text Normalization and Modernization

All authors:

  • Raphael Rubino
  • Mathilde Fontanet
  • Sandra Coram-Mekkey
  • Christophe Chazalon
  • Pierrette Bouillon

This paper presents a study on 16th-century Middle French text normalization and modernization through pretrained large language model (LLM) prompting. The modernization process is decomposed into several steps performed by domain experts, leading to four textual variants manually derived from the original text. We explore the use of several few-shot sampling techniques for LLM prompting and compare them to full model fine-tuning, showing a trade-off between computation time and performance on the downstream tasks under similar hardware specifications.

The evaluation, in terms of automatic metrics, shows that the normalization task leads to lower error rates than modernization. All sampling methods outperform random selection of examples, while surface-based sampling methods outperform the embedding-based approaches tested in our study for n-shot selection.
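As an illustration of surface-based n-shot selection of the kind the study found effective, this sketch ranks candidate demonstrations by character-trigram overlap with the input sentence; the data structures and example pairs are illustrative stand-ins.

```python
# Sketch: surface-based few-shot example selection by character-trigram
# Jaccard similarity. Pool contents are invented examples.
def char_ngrams(text: str, n: int = 3) -> set:
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def select_examples(source: str, pool: list, k: int = 5) -> list:
    """pool holds (original, modernized) pairs; return the k most similar
    by Jaccard overlap of character trigrams with the source sentence."""
    src = char_ngrams(source)

    def jaccard(pair):
        cand = char_ngrams(pair[0])
        union = src | cand
        return len(src & cand) / len(union) if union else 0.0

    return sorted(pool, key=jaccard, reverse=True)[:k]

pool = [("Ladicte ville", "Ladite ville"), ("Et oultre plus", "Et en outre")]
print(select_examples("Ladicte somme", pool, k=1))
```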

Conference dinner from 18:00

Friday, 6 March 2026

Program item Time Speaker
Arrival & Coffee 08:30 - 09:00
Panel 4: Specific NLP Tasks (hybrid)
Supervised Ancient Greek Translation Alignment 09:00 - 09:45 Alek Keersmaekers (Leuven)
Abstract

Full Title: Lexically-oriented word alignment for Ancient Greek: a learning-to-rank approach

All authors:

  • Alek Keersmaekers

This paper introduces a new approach to translation alignment, called lexically-oriented word alignment, which aligns words based on their lexical content.

This approach makes translation alignment more flexible for specific purposes, such as automatically creating annotated datasets and bilingual dictionaries. We develop a new method to perform lexically-oriented word alignment between Ancient Greek and English, based on learning-to-rank supervised machine learning and automatic phrase detection. This approach outperforms earlier, unsupervised techniques for Ancient Greek-English word alignment, in particular for content words.
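The following toy sketch conveys the learning-to-rank idea: score each candidate target word with a model trained on simple alignment features and keep the best-ranked one. The features, training rows, and the pointwise logistic-regression stand-in are illustrative, not the paper's model.

```python
# Sketch: rank candidate English words for a Greek word by a learned score.
# Features and training data are invented; a pointwise ranker stands in
# for the paper's learning-to-rank setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical feature rows: [relative-position distance, string-similarity
# cue], with 1 marking a correct alignment.
X = np.array([[0.05, 0.9], [0.40, 0.1], [0.10, 0.7], [0.60, 0.0]])
y = np.array([1, 0, 1, 0])
ranker = LogisticRegression().fit(X, y)

def best_candidate(cands: list) -> str:
    """cands holds (english_word, feature_vector) pairs; return the word
    with the highest predicted alignment probability."""
    scores = ranker.predict_proba([f for _, f in cands])[:, 1]
    return cands[int(np.argmax(scores))][0]

print(best_candidate([("went", [0.05, 0.8]), ("the", [0.5, 0.0])]))
```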

Topic Modeling of Latin Sermons 09:45 - 10:30 Tam Johnson et al. (Stockholm)
Abstract

Full Title: Topic Modeling of Latin Sermons

All authors:

  • Tam Johnson
  • Jacob Langeloh
  • Beáta Megyesi

When examining the works of influential thinkers and theologians from the late medieval period, scholarship has often shown a curious reluctance to engage with the sometimes substantial collections of sermons. Can the latest advances in Natural Language Processing (NLP), and particularly in topic modeling, help to access the contents of these vital sources, in order to chart them more efficiently and understand them better?

This study investigates the viability of using an unsupervised clustering technique to discover the various themes underlying and connecting these historical documents. Using BERTopic as the analytical tool, we examine parameter and data preprocessing alternatives and assess their impact on the semantic quality of the generated topics, using standard quantitative metrics paired with a qualitative evaluation based on expert human judgments.

The best-performing model achieved scores of −0.031 for topic coherence and 0.840 for topic diversity, while the expert human evaluator correctly identified 55% of intruder words in a word intrusion task, suggesting that automated unsupervised methods such as BERTopic could be useful for historians.
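A minimal sketch of the BERTopic workflow described above; the embedding model, corpus placeholder, and parameters below are assumptions for illustration, not the study's tuned configuration.

```python
# Sketch: topic modeling a sermon corpus with BERTopic.
# Embedding model and min_topic_size are placeholder choices.
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

sermons = ["...preprocessed Latin sermon text..."]  # hundreds of documents in practice

embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
topic_model = BERTopic(embedding_model=embedder, min_topic_size=10)

topics, probs = topic_model.fit_transform(sermons)
print(topic_model.get_topic_info().head())  # topic sizes and top words
```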

Evaluating Sentence Embeddings for Intertextuality 10:30 - 11:00 Michael Wittweiler et al. (Zürich/Konstanz)
Abstract

Full Title: Context Matters: Probing the Robustness of Sentence Embeddings for Intertextuality Detection in Latin Text

All authors:

  • Michael Wittweiler
  • Marie Revellio
  • Julian Schelb

This paper explores the use of sentence embeddings for detecting intertextual references and paraphrased citations in Latin texts. We fine-tune a sentence transformer (SPhilBERTa) on a curated dataset of 544 intertextual pairs from Jerome and Lactantius, paired with candidate sentences from Virgil and other classical authors. Our analysis includes controlled modifications of target sentences to test robustness against inflectional variation, paraphrasing, and contextual changes.

Results show that embeddings retrieve longer verbatim quotations with high accuracy and are tolerant of morphological variation and paraphrase. However, performance decreases for short allusions and when surrounding semantic contexts diverge strongly. We find that context plays a decisive role: condensing target sentences to citation material can improve retrieval, but additional contextual cues can also strengthen similarity. We conclude that combining embedding-based and n-gram methods may yield broader coverage of intertextuality in Latin, while philological interpretation remains essential.
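A minimal sketch of embedding-based candidate retrieval in the spirit of this study; the checkpoint id is an assumption, and the toy sentences stand in for real target/candidate pairs.

```python
# Sketch: rank candidate sentences by cosine similarity to a target.
# The checkpoint id below is assumed; substitute the fine-tuned model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bowphs/SPhilBerta")

target = "arma virumque cano"
candidates = [
    "arma uirumque canit poeta",
    "gallia est omnis diuisa in partes tres",
]

emb_t = model.encode(target, convert_to_tensor=True)
emb_c = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(emb_t, emb_c)[0]

for sent, score in zip(candidates, scores):
    print(f"{score:.3f}  {sent}")
```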

Coffee break 11:00 - 11:30
Workshop 1 (hybrid)
Tailored LLM for Ancient Texts 11:30 - 13:00 Premshay Hermon et al. (Jerusalem)
Abstract

Full Title: First Steps towards a Tailored LLM for Ancient Scientific Texts (and beyond)

Organisers:

  • Premshay Hermon
  • Orly Lewis
  • Gideon Manelis
  • Gabriele Torcoletti

Beginner Level: No technical skills or knowledge is required of participants.

Requirements: Please bring your own laptop.

The workshop serves as an initial collaborative step toward developing domain-specific small LLMs for classical scholarship. Large language models (LLMs) are advanced AI systems trained on extensive text corpora to predict and generate language by learning patterns of words, concepts, and relationships. They create internal representations that support flexible reasoning and text generation. Fine-tuning these models with carefully curated question-answer pairs from specific domains such as ancient texts allows them to specialize in recognizing historical languages, terminology, and scholarly inquiry styles. Small LLMs with fewer parameters can be effectively fine-tuned to perform specific tasks with less computational cost, making them accessible for humanistic research. This process improves their accuracy and relevance by focusing on domain-specific knowledge, producing reliable tools that assist scholars in analysing and interpreting classical literature.

The workshop will focus on a key and early step in building a tailored LLM: creating a suitable dataset for training it. Such a dataset consists of a large number of human-formulated questions and answers related to given data (e.g. an edition, a commentary, or a cluster of editions), which are then used to train the model to perform the required tasks. For the sake of feasibility and efficiency, we will focus on sources related to ancient medical and philosophical texts in Greek and Latin. However, we perceive this workshop as a case study and model for broader LLMs for all Greco-Roman literature, or for other historical languages and scholarly fields relying on them.

At the core of the workshop stand the questions and answers that participants will compose concerning the ancient texts. We will provide templates and guidelines for writing suitable questions and answers for training the LLM. Questions will pertain to the content, style, and terminology of ancient texts in Latin and Ancient Greek. While the emphasis will be on investigating these broad themes in ancient scientific and philosophical texts, the workshop will be open to those interested in ancient texts from any domain. Direct access to the ancient texts is not strictly necessary, but participants may want to consult some texts to fine-tune their questions and answers. The organisers will provide key ancient sources – Galen, Hippocrates, Aristotle, Cicero, Lucretius, Seneca – in a searchable format, and participants may bring their own files for other texts.
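To give a sense of the artifact the workshop aims to produce, here is a sketch of a question-answer record serialized as JSONL for later supervised fine-tuning; the field names and the example pair are hypothetical, not the organisers' template.

```python
# Sketch: one hypothetical question-answer training record, appended to a
# JSONL file of the kind used for supervised fine-tuning.
import json

record = {
    "source": "Galen, De methodo medendi (hypothetical citation)",
    "question": "Which humoral imbalance does Galen associate with this fever?",
    "answer": "He attributes it to an excess of yellow bile.",
}

with open("training_pairs.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```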

Lunch 13:00 - 14:00
Workshops 2 & 3 (parallel sessions, in person only)
WS 2: Logion: Machine Learning for Classical Philology 14:00 - 15:30 Jacob Murel (Princeton)
Abstract

Full Title: Logion: Machine Learning for Classical Philology

Organisers:

  • Jacob Murel

Beginner Level: Suitable for participants with no technical background.

Requirements: Please bring your own laptop for which you have permissions to download software, as well as a Greek or Latin text you are interested in editing.

This workshop introduces classics scholars and philologists to Logion, a no-code application that allows researchers to leverage language models for concrete suggestions and inspiration on common philological tasks, namely gap-filling and error detection/correction, for Latin and Greek.
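For illustration, masked-language-model gap-filling of the kind Logion surfaces can be sketched in a few lines with the transformers fill-mask pipeline; the checkpoint id is a placeholder for a Greek BERT-style model, and this is not Logion's own code.

```python
# Sketch: masked-LM gap-filling with a fill-mask pipeline.
# The model id is an assumed placeholder for a Greek BERT-style checkpoint.
from transformers import pipeline

fill = pipeline("fill-mask", model="pranaydeeps/Ancient-Greek-BERT")

# Replace a lacuna with the mask token and inspect the top suggestions.
for cand in fill(f"ἐν ἀρχῇ ἦν ὁ {fill.tokenizer.mask_token}"):
    print(f"{cand['score']:.3f}  {cand['token_str']}")
```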

The workshop begins with a 15-minute presentation about the Logion project, followed by a 10-15 minute demo of the Logion application. Participants will then be walked through downloading and installing the application, using its current features for gap-filling and error detection/correction, and accessing the software documentation. The rest of the time will be devoted to participants exploring the software and asking questions. Participants are encouraged to bring a Greek or Latin text on which they are working or which they are interested in editing, in order to explore how Logion may assist them in editing and compiling emendations. The final 20 minutes of the workshop will be devoted to discussing in-development features and gathering participants' feedback on how the software may be improved.

Outcomes for the workshop are three-fold: participants will 1) have the opportunity to test LMs for their own philological work in a user-friendly environment without prior knowledge of code, 2) provide feedback on how the application may be improved to better assist their research, and 3) foster ongoing collaborations between the Logion project and international philologists compiling critical editions of premodern texts.

WS 3: Finetuning and Low-resource Languages 14:00 - 15:30 Thomas Renkert & Florian Nieser (Heidelberg)
Abstract

Full Title: How to train your own LLM on historical and low-resource languages — a hands-on workshop

Organisers:

  • Thomas Renkert
  • Florian Nieser

Advanced Level

Requirements: Basic knowledge of

  1. how LLMs are structured and what fine-tuning is
  2. Jupyter Notebooks, Python, and the Linux terminal

Deeper programming skills are not necessary.

Other requirements: Please bring your own laptop.

In this workshop, participants will learn how to train large language models on historical and low-resource languages, based on our experiences in the ParzivAI project. To facilitate reproducible and shareable results, we will focus on the use of open-source AI models. Participants will gain insights into the full pipeline, from creating their own datasets to fine-tuning, evaluating, and using their own models in user-friendly ways. We will give an outlook on how to use historical chatbots in academic teaching.

The aims of the workshop are 1) to show how datasets are created and how data is harvested, formatted, and structured, as well as how different pipelines for dataset creation and testing need to be organized; 2) to present training techniques for instruction tuning (non-reasoning and reasoning) and what to consider regarding low-resource languages; and 3) to look into different ways of using the trained LLMs (RAG, Streamlit frontend, Open WebUI). The whole workshop will be structured as an open format with hands-on parts and discussions, in which we will show live examples of simple fine-tuning tasks and test the trained LLM.

The workshop will demonstrate methods and pipelines for building datasets for low-resource languages, especially Middle High German, as well as training approaches and the process of choosing the right LLM for a given task. By the end, participants will be able to create their own fine-tuning pipelines and apply the concepts of the workshop to other historical languages and their own research projects.
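As a taste of the kind of pipeline covered, here is a minimal parameter-efficient fine-tuning sketch using PEFT's LoRA on an open checkpoint; the model id, target modules, and hyperparameters are illustrative assumptions, not the workshop's actual recipe.

```python
# Sketch: attach LoRA adapters to an open causal LM for parameter-efficient
# fine-tuning. Model id and hyperparameters are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "mistralai/Mistral-7B-v0.1"  # any open checkpoint works here

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```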

Coffee break 15:30 - 16:00
Panel 5: Applications beyond Research (hybrid)
Libraries and the Digital Humanities 16:00 - 16:45 Katharina Ost (Düsseldorf)
Abstract

Full Title: Data Holdings of Academic Libraries as an Opportunity for the Digital Humanities?

All authors:

  • Katharina Ost

The paper outlines the services offered by academic libraries in the digital humanities based on four roles: as data providers, networkers, trainers, and developers.

A practical example illustrates the preparation of a digitized incunable for NLP workflows. The case study shows that a considerable amount of curatorial and technical effort lies between making page scans available and achieving genuine DH reusability.

A Classical Language AI Query Assistant 16:45 - 17:30 Eleni Bozia et al. (Florida)
Abstract

Full Title: Automating Classical Language Philology: The Classical Language AI Query Assistant

All authors:

  • Audrey Barber
  • Thomas Cerniglia
  • Eleni Bozia

In an effort to make classical research more accessible, the Data-Driven Humanities Research Group has developed an AI assistant focused on computational philology. Historically, researchers have manually parsed Latin and Greek texts to translate and reference ancient sources. The Classical Language AI Query Assistant aims to streamline this process by utilizing an automated querying system. Based on ChatGPT-4, this model was developed to answer specific grammatical and syntactic questions. School and higher-education instructors can use this assistant to enhance traditional pedagogical approaches. Additionally, scholars of other disciplines who wish to learn Ancient Greek and Latin to broaden their research sources can have easier access to AI-assisted language tools.

Leveraging such modern computational methods opens up several new avenues for teaching and research in classical studies.

Closing Discussion & Feedback 17:30 - 18:00

You can download the program with all abstracts as a PDF here.