Researchers from the Mohamed bin Zayed University of Artificial Intelligence and the Technical University of Darmstadt have published a paper proposing that large language models can function as cultural archives, storing fragments of how different societies navigate everyday life. The study introduces a method for extracting cultural commonsense knowledge graphs from LLMs, capturing sequences of actions and expectations that reflect ordinary life in specific cultures.
The research tested this approach across five countries — China, Indonesia, Japan, England, and Egypt — generating knowledge graphs in both English and each country’s native language. The method moves beyond extracting isolated facts to map culturally grounded commonsense: the implicit rules about what happens before and after a given event, what emotions are appropriate in specific situations, and how social interactions are expected to proceed. The findings will be presented at EACL 2026 in Rabat, Morocco.
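To make the idea concrete, a cultural commonsense knowledge graph of the kind described can be pictured as a set of assertions linking everyday events to culture-specific expectations. The schema and example data below are invented for illustration; the paper's actual relation set, node types, and extraction prompts are not specified in this article.

```python
from dataclasses import dataclass

# Hypothetical schema: each assertion ties an everyday event to an
# expectation (a preceding/following action, an emotion, a social norm)
# within one culture. Relation names here are illustrative, not the
# paper's actual inventory.
@dataclass(frozen=True)
class Assertion:
    head: str      # an everyday event, e.g. "attend a wedding"
    relation: str  # e.g. "happens_before", "evokes_emotion"
    tail: str      # the related action, emotion, or expectation
    culture: str   # culture tag, e.g. "Japan"

# Invented sample assertions showing how the same event can carry
# different culture-specific expectations.
graph = {
    Assertion("attend a wedding", "happens_before",
              "prepare a cash gift in a decorated envelope", "Japan"),
    Assertion("attend a wedding", "happens_before",
              "buy an item from the couple's gift registry", "England"),
    Assertion("attend a wedding", "evokes_emotion", "joy", "Japan"),
}

def query(graph, head, culture):
    """Return every assertion about one event within one culture."""
    return [a for a in graph if a.head == head and a.culture == culture]

for a in query(graph, "attend a wedding", "Japan"):
    print(a.relation, "->", a.tail)
```

Keying every assertion by culture is what distinguishes this from a generic commonsense graph: the same `head` event can coexist with different, even contradictory, expectations across cultures.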
The hypothesis draws on cognitive science: humans use culturally grounded commonsense to infer past events, predict future outcomes, and interpret others’ behavior. If language models absorb this knowledge during training, they become — intentionally or not — repositories of cultural information that can be systematically extracted and studied.
The practical implications extend beyond academic interest. AI systems that operate across cultures — customer service bots, content moderation tools, translation services — regularly fail when they apply one culture’s assumptions to another’s context. A system trained primarily on English-language data may apply Western social norms to interactions in contexts where different rules apply. The cultural knowledge graph approach offers a way to identify and quantify these biases before they cause problems in deployment.
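One simple way such graphs could be used to quantify cross-cultural divergence is to compare the assertion sets two cultures attach to the same event. The sketch below uses a Jaccard-style distance over invented sample data; this is an assumed measure for illustration, not the paper's methodology.

```python
# Invented sample data: per-culture sets of (event, relation, expectation)
# assertions, as might be extracted from an LLM.
graphs = {
    "England": {("attend a wedding", "happens_before", "buy from the gift registry"),
                ("attend a wedding", "evokes_emotion", "joy")},
    "Japan":   {("attend a wedding", "happens_before", "prepare a cash gift envelope"),
                ("attend a wedding", "evokes_emotion", "joy")},
}

def divergence(a: set, b: set) -> float:
    """1 minus the Jaccard overlap of two assertion sets:
    0.0 means identical expectations, 1.0 means fully disjoint."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

# The two cultures share one of three distinct assertions,
# so divergence = 1 - 1/3.
print(divergence(graphs["England"], graphs["Japan"]))
```

A deployed system could use a score like this as a warning signal: events where a model's English-language graph diverges sharply from the target culture's graph are exactly where Western-default assumptions are most likely to misfire.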
This work connects to a broader movement in AI research toward cultural awareness. MBZUAI developed Jais, a 13-billion-parameter Arabic language model trained on a 395-billion-token bilingual dataset. Related projects include Nanda for Hindi and Sherkala for Kazakh. A companion MBZUAI paper accepted at EACL 2026 introduces JEEM, a benchmark for evaluating how AI interprets images across Arabic dialects and cultures.
The underlying question — whether AI models are neutral tools or cultural artifacts that carry the biases and assumptions of their training data — has implications for every deployment of AI in multilingual, multicultural contexts. This research provides a concrete methodology for answering it.
