From Character to Poem: Nested Contexts and Scalar Limits of Parallelism Detection in Classical Chinese Poetry

Research output: Journal Publications, Journal Article (refereed), peer-review

Abstract

Benchmarking for literary analysis is complicated by a persistent mismatch between the fixed context windows of classification models and the emergent properties of literary forms. Here, I approach this challenge by reconsidering semantic parallelism in Chinese regulated verse (lüshi 律詩) as a problem of scale. I first employ a “teacher” model to label parallelism at the couplet (meso) level and then test which “student” model architecture—micro (character), meso (couplet), or macro (poem)—can most effectively recover this labeling rule. The experiment points to a Goldilocks hypothesis: performance is maximized when the classifier is structurally aligned with the scale at which the feature has been encoded. This finding yields further practical insights: (1) bottom-up aggregation of local predictions sacrifices raw performance but offers greater interpretability by exposing the specific decisions of a misaligned model; (2) top-down inference requires additional training computation to compensate for global noise and achieve performance comparable to aligned models; (3) if the goal is to better understand how artificial intelligence represents specific literary phenomena internally (“vector poetics”), aligned classifiers afford the most direct and promising access. By examining different forms of (mis)alignment between texts and models, the study invites discussion on whether meaningful benchmarking requires matching the computational “unit of analysis” with the humanistic “unit of inquiry.”
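The bottom-up route mentioned in point (1) can be illustrated with a minimal sketch. It assumes a character-level classifier has already produced one parallelism score per aligned character pair in a couplet. The function names, the mean-and-threshold rule, and the scores are hypothetical and chosen for illustration; the abstract does not specify the study's actual aggregation rule.

```python
from statistics import mean


def aggregate_couplet(char_pair_scores, threshold=0.5):
    """Combine micro-level (character-pair) scores into one meso-level label.

    Assumption: a plain mean followed by a threshold. This is one possible
    aggregation rule, not necessarily the one used in the study.
    """
    if not char_pair_scores:
        raise ValueError("need at least one character-pair score")
    score = mean(char_pair_scores)
    return score >= threshold, score


def weakest_positions(char_pair_scores, k=2):
    """Return the k positions with the lowest local scores.

    Exposing these positions is what makes the bottom-up route inspectable:
    they show which local decisions pull the couplet-level label down.
    """
    ranked = sorted(range(len(char_pair_scores)),
                    key=lambda i: char_pair_scores[i])
    return ranked[:k]


# Hypothetical scores for the five aligned positions of a pentasyllabic couplet.
scores = [0.9, 0.8, 0.2, 0.7, 0.9]
is_parallel, couplet_score = aggregate_couplet(scores)
suspect_positions = weakest_positions(scores)
```

The trade-off described in the abstract is visible here: the couplet-level label depends on a crude pooling step, but each local score remains available for inspection.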
Original language: English
Article number: 34
Number of pages: 11
Journal: Journal of Open Humanities Data
Volume: 12
Early online date: 25 Feb 2026
Publication status: Published - 25 Feb 2026

Bibliographical note

The author would like to thank the anonymous reviewers and the editors of this issue for their helpful comments and questions. Special thanks are also due to Xiaotong Xu 徐曉童 and Yu Feng 馮宇 for their assistance in data curation.

Keywords

  • vector poetics
  • regulated verse
  • benchmarking
  • contextual embeddings
  • Transformer attention
