
Project Homerides
Building AI models to decode Antiquity.
Scansion is the practice of marking the rhythm of a line of verse according to the lengths of its syllables. It is subject to dozens of specialised rules and edge cases, so it is still mainly carried out by human experts rather than computer programs. Our AI model, homerides-scansion3, reaches 99.7% accuracy on hexameter scansion, making it the most accurate automatic tool available today. By comparison, the most advanced foundation models right now (Gemini 2.5 Pro, GPT o4-mini-high) score only in the 20-30% range.
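To make the task concrete, here is a minimal Python sketch of the search space a scanner works in. It is an illustration only, not how homerides-scansion3 works. It assumes the textbook simplification that each of the first five feet of a hexameter is either a dactyl (long, short, short) or a spondee (long, long), and that the sixth foot is a long syllable followed by one of either length. The function name and the L/S/x notation are hypothetical choices made for this example.

```python
from itertools import product

DACTYL = "LSS"   # long, short, short
SPONDEE = "LL"   # long, long
FINAL = "Lx"     # long + anceps (either length) in the sixth foot


def hexameter_patterns(n_syllables):
    """List every foot-by-foot pattern that has exactly n_syllables syllables.

    Simplified model: feet 1-5 are each a dactyl or a spondee, and the
    sixth foot always has two syllables.
    """
    patterns = []
    for feet in product((DACTYL, SPONDEE), repeat=5):
        line = list(feet) + [FINAL]
        if sum(len(foot) for foot in line) == n_syllables:
            patterns.append("|".join(line))
    return patterns


if __name__ == "__main__":
    # A 15-syllable line has three dactyls somewhere in its first five feet.
    for pattern in hexameter_patterns(15):
        print(pattern)
```

Under this model a line has between 12 and 17 syllables and there are 32 candidate patterns in total. Counting syllables only narrows the candidates; choosing the right one requires knowing the actual length of each syllable, which is where the specialised rules and edge cases come in.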
About Our Work
Large language models often struggle with questions in classical literature, either because little training data is available or because the questions are themselves open research problems.
Our research aims to develop models capable of converting English prose novels into Homeric verse, advancing scholarship on the Homeric Question, and contributing to the decipherment of Linear A. Along the way, we're also open-sourcing a suite of research tools designed to support the classical studies community.
From the Blog
Applying Transfer Learning CORAL Loss To Language Deciphering
Applying pre-trained models to decipher unknown ancient scripts.
August 15, 2024
Alvin Zhou
Soft Voting Ensembles For Automatic Scansion
Combining multiple AI models to improve Homeric meter analysis.
July 28, 2024
Sebastian DeLorenzo
Weighted Finite-State Transducers
Using WFSTs for efficient modeling of linguistic rules and patterns.
June 10, 2024
Sebastian DeLorenzo
Reproducing Kernel Hilbert Spaces
Leveraging RKHS for advanced pattern recognition in textual data.
May 05, 2024
Sebastian DeLorenzo