TY - JOUR
T1 - The Reasoning-Like Capabilities of Large Language Models across Different Languages: Insights from Representational Similarity Analysis
AU - STOLLE, Chris M.
AU - YU, Rongjun
AU - HUANG, Yi
PY - 2026/1/20
Y1 - 2026/1/20
N2 - Recent research shows that Large Language Models (LLMs) demonstrate human-comparable performance on various cognitive tasks, suggesting reasoning-like capabilities. However, the language dependency of these capabilities and the contribution of their neural network states remain underexplored. This study investigates how different prompts and languages influence the reasoning performance of LLMs compared to humans, while exploring the internal cognitive-like processes of LLMs through representational similarity analysis (RSA). Using scenario-based and mathematical Cognitive Reflection Test (CRT) questions across four languages, we evaluated the reasoning capabilities of the LLM Qwen 2.5 (including replications with Gemma 2.9 and Llama 3.1). Results showed that language significantly impacts performance on scenario-based CRT items, which require nuanced semantic processing. However, RSA of the internal state activations revealed that the LLM processed identical questions similarly across languages, suggesting that the model encodes semantics in a language-independent latent space. Additionally, the LLM’s performance improved when it verbalised its reasoning, and this verbalisation increased similarity in activations. Layer-wise analyses revealed a U-shaped similarity pattern from early to late layers in Qwen and Gemma but not Llama. Furthermore, scenario-based and equivalent mathematical CRT versions elicited similar activation patterns for the paired questions, even after controlling for input and output confounds, pointing to format-agnostic reasoning mechanisms. These results highlight that while LLMs exhibit language-invariant semantic representations and format-agnostic reasoning, their performance remains sensitive to linguistic nuances and self-generated verbalisations, offering insights into both the strengths and limitations of their cognitive-like processing.
KW - Large Language Models
KW - Language effect
KW - Representational similarity analysis
KW - Neural network
KW - Reasoning capabilities
DO - 10.1016/j.chbah.2026.100250
M3 - Journal Article (refereed)
SN - 2949-8821
JO - Computers in Human Behavior: Artificial Humans
JF - Computers in Human Behavior: Artificial Humans
M1 - 100250
ER -