Text mining assisted reading comprehension and experience

:speech_balloon: Speaker: Akshay Mendhakar @Akshay

:classical_building: Affiliation: ELIT, University of Warsaw

Title: Text mining assisted reading comprehension and experience

Abstract (long version below): Digital reading tools can support reading, especially for readers who are new to a language. Text mining is one such tool with potential as a reading assistant. It has already been applied to classic fiction, but without clear empirical studies of assisted reading behaviour. In this project, we report a series of eye-tracking experiments on text mining and its influence on reading behaviour. We discuss the assistive effect of text mining on the reading of excerpts from Pride & Prejudice (fiction) and Limitless (nonfiction). The current findings comment on individual processing abilities and on how literary reading can be influenced by the external assistance of text mining.


Akshay_ELIT_poster_2023.pdf (1.1 MB)


:newspaper: Long abstract

Reading is arguably a non-natural, culturally acquired behaviour (Land & Tatler, 2009) that requires mental effort to learn and master. It is not surprising that people reading in a foreign language need more cognitive effort than expert readers of that language (McLaughlin et al., 1983; Block, 1986; Eskey, 1988). These differences show up in measures of comprehension (Fitzgerald, 1995), in the perception of reading (Carrell et al., 1988), and in the reading strategies employed (Erler & Finkbeiner, 2007). The cognitive effort also varies with factors such as the purpose of reading (academic or recreational), the text type (fiction or nonfiction), and the medium (paper or digital).

The popularity of digital reading is constantly on the rise (Liu, 2012). Digital reading tools can reduce the cognitive load on foreign language readers. This can affect reading comprehension and the overall reading experience, as a few studies have shown for nonfictional text (Chambers et al., 2011; Chen & Chen, 2014; Ben‐Yehudah & Eshet‐Alkalai, 2021). Multiple cross-sectional studies have also suggested that technology can help reading, especially for those new to a language (Edyburn, 2007; Biancarosa & Griffiths, 2012; Cheung & Slavin, 2012).

One such technology that is influencing digital reading is text mining. Text mining is the process of analyzing textual data to obtain valuable insights, trends and patterns (Marzouk & Enaba, 2019). It has evolved considerably since its inception and is no longer a simple method that only measures how often certain textual elements occur (Berry & Kogan, 2010). Typical modern use cases include text classification, clustering, regression and association (Tan, Steinbach, & Kumar, 2006). Initial reports have shown the potential of text mining as a reading assistant for non-literary text. Methods such as keyword extraction, text summarization, vocabulary builders, word meaning assistants and concept maps are used as reading assistants (Hyerle, 1996; Guastello, Beasley, & Sinatra, 2000; Marzano, Pickering, & Pollock, 2001; Reategui et al., 2012; Hofmann & Chisholm, 2016; Reategui et al., 2019; Barcellos et al., 2020; Reategui et al., 2022).
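To make the idea of keyword extraction as a reading aid concrete, here is a minimal Python sketch. It is not the method used in this project: it assumes the simplest frequency-based approach, a tiny hand-written stopword list, and a made-up sample text. The function names `extract_keywords` and `highlight` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools use much larger lists).
STOPWORDS = {"the", "and", "for", "that", "with", "when", "this", "are", "was"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword tokens in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Skip stopwords and very short tokens such as "a" or "of".
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def highlight(text, keywords):
    """Wrap each whole-word keyword occurrence in ** ** markers."""
    if not keywords:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "**" + m.group(0) + "**", text)


# Made-up sample text, not taken from the study materials.
sample = ("Reading tools help readers. "
          "Readers use reading tools when reading a new language.")
keywords = extract_keywords(sample)
print(keywords)
print(highlight(sample, keywords))
```

A reading interface could show the highlighted version next to the plain text. Pure frequency counting is exactly the "simple method" that modern text mining has moved beyond, so this should be read as a baseline only.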

Even though text mining has been successfully applied in distant reading (Moretti, 2013; Jockers, 2013), its application to literature is still in its infancy. Text mining-powered digital reading tools are currently being implemented for both fictional and nonfictional texts by Amazon KDP, De Gruyter, Penguin Random House and others. Major tech companies have paid particular attention to developing these technologies and building them into digital reading solutions. Text mining has been applied to the reading of classic fiction without clear empirical studies of assisted reading behaviour (e.g., the Amazon Kindle versions of classics by Austen (1998) and Fitzgerald (2003)). The use of text mining methods as a reading tool has to be empirically tested before it can be applied to fictional text reading.

Additionally, most text mining methods are designed for nonfictional text types. For these methods to be used universally on fictional and nonfictional texts, the computational linguistic parameters they produce have to be compared across text types (fiction and nonfiction) to assess their validity. In this project, we highlight the computational linguistic feature analysis and the performance of text mining across literary and non-literary text materials. We report eye-tracking experiments on text mining and its influence on reading behaviour in L2 speakers of English. The results are discussed with respect to keyword highlighting and visualizations and their influence on reading comprehension and perception. We discuss the assistive effect of text mining on the reading of excerpts from Pride & Prejudice (fiction) and Limitless (nonfiction). The findings are interpreted in terms of the fast immersive processing route and the slow aesthetic processing route of the Neurocognitive Poetics Model (NCPM) of literary reading (Jacobs, 2011; 2015). The current findings comment on individual processing abilities and on how the different routes can be influenced by external stimulus or assistance (Mak & Willems, 2019; Tilmatine, Hubers, & Hintz, 2021). Using text mining, we also comment on the drawback of the NCPM.