Project Homerides

Building AI models to decode Antiquity.

Scansion is the practice of assigning rhythm to verse based on syllable lengths. It is governed by dozens of specialised rules and edge cases, so it is mainly carried out by human experts rather than computer programs. Our AI model, homerides-scansion3, scans hexameter with 99.7% accuracy, making it the most accurate automatic hexameter scansion tool available today. By comparison, the most advanced foundation models right now (Gemini 2.5 Pro, GPT o4-mini-high) score only in the 20-30% range.
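To make the task concrete, here is a minimal sketch of the final, mechanical step of scansion: grouping already-known syllable lengths into the six feet of a dactylic hexameter line. It is not how homerides-scansion3 works. The function name `split_feet` and the "L"/"S" encoding are assumptions made for illustration, and the hard part (deciding which syllables are long or short in the first place) is left out entirely.

```python
from typing import List, Optional

def split_feet(quantities: str) -> Optional[List[str]]:
    """Group syllable lengths ("L" = long, "S" = short) into hexameter feet.

    Feet 1-5 are each a dactyl ("LSS") or a spondee ("LL"); foot 6 is a long
    syllable followed by one syllable of either length. Returns the list of
    feet, or None if the sequence does not fit this pattern.
    (Hypothetical helper for illustration only.)
    """
    feet = []
    i = 0
    for _ in range(5):
        if quantities[i:i + 3] == "LSS":    # dactyl
            feet.append("LSS")
            i += 3
        elif quantities[i:i + 2] == "LL":   # spondee
            feet.append("LL")
            i += 2
        else:
            return None
    last = quantities[i:]
    # The final foot has exactly two syllables and must start with a long.
    if len(last) == 2 and last[0] == "L" and last[1] in "LS":
        feet.append(last)
        return feet
    return None

# A made-up quantity string, not taken from any particular line of Homer.
print(split_feet("LSSLSSLLLSSLSSLS"))
```

Because a dactyl and a spondee differ in their second syllable, the grouping is unambiguous once the lengths are known; the difficulty in practice lies in the rules and edge cases that determine those lengths.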

About Our Work

Large language models often struggle with questions in classical literature, either because little training data is available or because the tasks themselves are open research problems.

Our research aims to develop models capable of converting English prose novels into Homeric verse, advancing scholarship on the Homeric Question, and contributing to the decipherment of Linear A. Along the way, we're also open-sourcing a suite of research tools designed to support the classical studies community.

From the Blog

Applying Transfer Learning CORAL Loss To Language Deciphering

Applying pre-trained models to decipher unknown ancient scripts.

August 15, 2024

Alvin Zhou

Soft Voting Ensembles For Automatic Scansion

Combining multiple AI models to improve Homeric meter analysis.

July 28, 2024

Sebastian DeLorenzo

Weighted Finite-State Transducers

Using WFSTs for efficient modeling of linguistic rules and patterns.

June 10, 2024

Sebastian DeLorenzo

Reproducing Kernel Hilbert Spaces

Leveraging RKHS for advanced pattern recognition in textual data.

May 05, 2024

Sebastian DeLorenzo