Perplexity Games: Maoism vs. Literature through the Lens of Cognitive Stylometry

daron · June 11, 2024, 5:22am

Speaker: Maciej Kurzynski @daron

Affiliation: The Advanced Institute for Global Chinese Studies, Lingnan University

Title: Perplexity Games: Maoism vs. Literature through the Lens of Cognitive Stylometry.

Abstract (long version below): This paper explores the impact of linguistic engineering on language predictability and cognitive processing within the framework of modern Chinese literature and culture. Employing computational stylometry, information theory, and large language models, the study quantifies the stylistic features of Maospeak with measures such as perplexity, entropy, and TF-IDF, comparing its predictability and vocabulary with those of other literary styles (Mo Yan and Eileen Chang, among others). The findings suggest that Maospeak, characterized by its engineered simplicity and repetitiveness, limits variability and creativity in language use by reinforcing phrases with a limited scope of possible sequence continuations. The results of the stylometric experiments are then explained from the perspective of predictive processing and the computational theory of ideology, positing that ideologies function similarly to error-minimization (likelihood-maximization) text generation techniques.

Long abstract

The article delves into the computational analysis of “Maospeak,” a language style used during Mao Zedong’s era, to understand its impact on language predictability and cognitive processing. It leverages information theory and stylometric methods to compare the stylistic features of Maospeak against contemporary Chinese literature, the works of Eileen Chang, and the writings of Mo Yan, focusing on aspects such as perplexity, vocabulary, and entropy.
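As a rough illustration of the entropy measure mentioned above, here is a minimal sketch that computes the Shannon entropy of a text's empirical unigram distribution. The function name `unigram_entropy` and the toy token sequences are assumptions made for illustration; they are not the paper's code or data, and for Chinese text the tokens could just as well be characters or segmented words.

```python
import math
from collections import Counter


def unigram_entropy(tokens):
    """Shannon entropy (bits per token) of the empirical unigram distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy, made-up sequences (hypothetical, not taken from any corpus).
repetitive = "a b a b a b a b".split()   # two token types, evenly repeated
varied = "a b c d e f g h".split()       # eight token types, each used once

print(unigram_entropy(repetitive))  # 1.0
print(unigram_entropy(varied))      # 3.0
```

A text that keeps reusing a small vocabulary yields lower entropy than one that spreads its probability mass over many different tokens, which is the intuition behind reading low entropy as a narrower vocabulary.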

A key finding is that Maospeak exhibits lower perplexity than other literary styles across various Chinese GPT and BERT models, indicating higher predictability and a narrower range of vocabulary. This is attributed to its use of repetitive, redundant phrases and a limited set of politically charged vocabulary, aligning closely with Maoist ideological goals. Through the lens of cognitive science, the paper then suggests that such engineered language styles not only reflect political ideologies but also influence cognitive processes by limiting the range of possible sequence continuations, thereby shaping the way individuals think and perceive the world. In particular, Kitto and Boschetti’s study of entropy minimization and Wheeler’s study of error minimization further elucidate how ideologies, like Maospeak, function to constrain linguistic and cognitive variability, promoting a homogenized thought process that minimizes prediction errors in perception and cognition and promotes social cohesion.
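To make the notion of perplexity concrete, the sketch below computes it as the exponential of the average negative log-probability per predicted token. It swaps in a tiny add-one-smoothed bigram model as a stand-in for the Chinese GPT and BERT models used in the study, so that the example runs with the standard library alone. The helper names (`train_bigram`, `perplexity`) and the toy English corpus are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Collect context counts, bigram counts, and the vocabulary."""
    contexts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens, tokens[1:]))
    return contexts, bigrams, set(tokens)


def perplexity(tokens, model):
    """exp of the mean negative log-probability under the bigram model."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens")
    contexts, bigrams, vocab = model
    v = len(vocab) + 1  # one extra slot reserved for unseen tokens
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing, an assumption of this sketch.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))


# Toy corpus: a slogan repeated four times plus one descriptive sentence.
training = ("serve the people " * 4 + "the old house stood by a quiet river").split()
model = train_bigram(training)

slogan = "serve the people serve the people".split()
literary = "the old house stood by a quiet river".split()

print(perplexity(slogan, model))
print(perplexity(literary, model))
```

Because the slogan's transitions are reinforced by repetition, the model assigns them higher probability and the slogan receives the lower perplexity of the two sequences. This is only a toy analogue of the comparison described above, not a reproduction of the paper's experiments.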

The research also explores the cognitive implications of language predictability, drawing upon the theory of predictive processing in cognitive science. It suggests that human cognition is inclined towards reducing sensory and cognitive discrepancies, and that predictable language styles like Maospeak facilitate this process by reducing cognitive load, making them effective tools for ideological reinforcement. However, this also leads to a form of cognitive closure, where the language’s predictability limits the individual’s exposure to diverse linguistic structures and ideas, potentially stifling creativity and critical thinking. This process bears similarity to “over-fitting” in machine learning, whereby a model fits its training data so closely that it loses the ability to generalize beyond them.

Further, the article examines the role of literature and the arts in counteracting the cognitive constraints imposed by politically engineered languages. It posits that exposure to a wide range of literary styles can increase cognitive flexibility, allowing individuals to navigate and understand more complex and unpredictable linguistic environments. This, in turn, could counterbalance the effects of ideological language, promoting a more open and critical engagement with language and ideas. If the Bayesian model is valid for both artificial and natural intelligence, then reading widely and increasing one’s exposure to various language data counters the influences of ideologies on our linguistically mediated perceptions of the world and increases the perplexity of our imaginations. While the benevolent influence of literature is by no means guaranteed, with various “authoritarian” or “foundational” fictions and romans à clef actually aiming to reduce entropy, on the whole literary texts have the perplexing quality of introducing opaque personalities which cannot be fully explained within the local cognitive frame of the reader and thus pose a challenge to predictive processing. Encounters with characters that resist straightforward interpretations might improve mind-reading skills, provoke reflective changes, and lead to greater misalignment between the global (ideological) and local (individual) cognitive frames.

In conclusion, the paper argues that the computational analysis of language styles like Maospeak offers valuable insights into the relationship between language, ideology, and cognition. By demonstrating how political language can influence predictability and cognitive processing, the study contributes to a broader understanding of the power of language in shaping political discourse and individual thought processes. The findings underscore the importance of linguistic diversity and exposure to a wide range of literary styles as means of fostering cognitive flexibility and resisting ideological conformity.