The Stylometry of Maoism: Quantifying the Language of Mao Zedong

Research output: Book Chapters | Papers in Conference Proceedings; Conference paper (refereed); peer-reviewed


Recent advances in computational stylometry have enabled scholars to detect authorial signals with a high degree of precision, but the focus on accuracy comes at the expense of explainability: powerful black-box models are often of little use to traditional humanistic disciplines. With this in mind, we have conducted stylometric experiments on Maospeak, a language style shaped by the writings and speeches of Mao Zedong. We measure per-token perplexity across different GPT models, compute Kullback–Leibler divergences between local and global vocabulary distributions, and train a TF-IDF classifier to examine how the modern Chinese language has been transformed to convey the tenets of Maoist doctrine. We offer a computational interpretation of ideology as reduction in perplexity and increase in systematicity of language use.
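As a rough illustration of one of the measures named in the abstract, the following minimal sketch computes the Kullback–Leibler divergence between the vocabulary distribution of each local window and the global distribution of the whole text. It is not the authors' implementation: the function names, the non-overlapping window scheme, the unigram estimates, and the use of base-2 logarithms are all assumptions made for illustration, and the input is assumed to be already tokenized.

```python
import math
from collections import Counter


def unigram_dist(tokens):
    """Relative frequency of each token type in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def kl_divergence(local, global_):
    """D_KL(local || global) in bits.

    Assumes every token type in `local` also occurs in `global_`,
    which holds when the local window is drawn from the global text.
    """
    return sum(p * math.log2(p / global_[w]) for w, p in local.items())


def windowed_kl(tokens, window):
    """KL divergence of each non-overlapping window against the whole text.

    The windowing scheme is a hypothetical choice for this sketch.
    """
    global_dist = unigram_dist(tokens)
    return [
        kl_divergence(unigram_dist(tokens[i:i + window]), global_dist)
        for i in range(0, len(tokens) - window + 1, window)
    ]
```

Under these assumptions, a window whose vocabulary mirrors the whole text scores near zero, while a window dominated by a few repeated terms scores higher.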
Original language: English
Title of host publication: Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages
Editors: Mika Hämäläinen, Emily Öhman, Flammie PIRINEN, Khalid ALNAJJAR, So MIYAGAWA, Yuri BIZZONI, Niko PARTANEN, Jack RUETER
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 5
ISBN (Print): 9798891760127
Publication status: Published - Dec 2023

Bibliographical note

The author would like to thank the anonymous reviewers for their comments and suggestions. Special thanks are due to Aaron Gilkison and Heidi Huang for their feedback and support.


