Abstract
Benchmarking for literary analysis is complicated by a persistent mismatch between the fixed context windows of classification models and the emergent properties of literary forms. Here, I approach this challenge by reconsidering semantic parallelism in Chinese regulated verse (lüshi 律詩) as a problem of scale. I first employ a “teacher” model to label parallelism at the couplet (meso) level and then test which “student” model architecture—micro (character), meso (couplet), or macro (poem)—can most effectively recover this labeling rule. The experiment points to a Goldilocks hypothesis: performance is maximized when the classifier is structurally aligned with the scale at which the feature has been encoded. This finding yields further practical insights: (1) bottom-up aggregation of local predictions sacrifices raw performance but offers greater interpretability by exposing the specific decisions of a misaligned model; (2) top-down inference requires additional training computation to compensate for global noise and achieve performance comparable to aligned models; (3) if the goal is to better understand how artificial intelligence represents specific literary phenomena internally (“vector poetics”), aligned classifiers afford the most direct and promising access. By examining different forms of (mis)alignment between texts and models, the study invites discussion on whether meaningful benchmarking requires matching the computational “unit of analysis” with the humanistic “unit of inquiry.”
| Original language | English |
|---|---|
| Article number | 34 |
| Number of pages | 11 |
| Journal | Journal of Open Humanities Data |
| Volume | 12 |
| Early online date | 25 Feb 2026 |
| DOIs | |
| Publication status | Published - 25 Feb 2026 |
Bibliographical note
The author would like to thank the anonymous reviewers and the editors of this issue for their helpful comments and questions. Special thanks are also due to Xiaotong Xu 徐曉童 and Yu Feng 馮宇 for their assistance in data curation.
Keywords
- vector poetics
- regulated verse
- benchmarking
- contextual embeddings
- Transformer attention
Title
From Character to Poem: Nested Contexts and Scalar Limits of Parallelism Detection in Classical Chinese Poetry