ChatGPT changes its personality and adopts cultural stereotypes in response to the language used in the chat
The study, carried out by researchers at the UOC, was based on a personality test used by psychologists. The results raise questions about whether this artificial intelligence tool may reflect prejudices based on stereotypes.

A study by researchers at the Universitat Oberta de Catalunya (UOC) has shown that ChatGPT displays different "personalities" depending on the language used, a phenomenon also common in humans, known as Cultural Frame Switching (CFS). The research shows that the system's personality also changes when conversing with English speakers from different countries, adopting cultural stereotypes from each country even though the language is the same. The paper, "Exploring the Impact of Language Switching on Personality Traits in LLMs", published as open access, was presented at the 31st International Conference on Computational Linguistics, an international meeting of professionals working in the field of natural language processing.
"We wanted to know if we could evaluate the personality of artificial intelligence systems like ChatGPT using traditional psychological assessment tools, and see if the personality of systems like GPT varied depending on the language of the questionnaires, replicating differences found in humans," explained Rubén Nieto, a researcher at the eHealth-TransLab Research Group (eHealth Lab), which is affiliated with the UOC's research unit on Digital Health and Planetary Well-being, and a professor at the UOC's Faculty of Psychology and Education Sciences.
Cultural stereotypes reproduced by AI
In the analysis, the researchers used the Eysenck Personality Questionnaire-Revised (EPQR-A), which is commonly used in psychology and measures four aspects: extraversion, neuroticism, psychoticism, and lie scale. ChatGPT (version GPT-4o) was asked to complete the questionnaire in six different languages (English, Hebrew, Brazilian Portuguese, Slovak, Spanish, and Turkish) and also to simulate responses as a native English speaker in five different countries (UK, USA, Canada, Australia and Ireland).
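Scoring a questionnaire like the one described works by summing, for each scale, the items answered in the keyed direction. The sketch below illustrates this scoring scheme for yes/no items; the item-to-scale mapping and the keys are invented for illustration and are not the real EPQR-A key.

```python
# Sketch: scoring a yes/no personality questionnaire, one point per item
# answered in the scale's keyed direction. The mapping below is illustrative,
# not the actual EPQR-A scoring key.

SCALES = {
    "extraversion": {1: "yes", 5: "yes", 9: "no"},
    "neuroticism":  {2: "yes", 6: "yes", 10: "yes"},
    "psychoticism": {3: "no",  7: "yes", 11: "no"},
    "lie":          {4: "no",  8: "yes", 12: "no"},
}

def score(answers: dict[int, str]) -> dict[str, int]:
    """Return, per scale, the count of items matching the keyed answer."""
    return {
        scale: sum(1 for item, key in keyed.items() if answers.get(item) == key)
        for scale, keyed in SCALES.items()
    }

# Hypothetical answers from one questionnaire run (e.g. one GPT-4o session).
answers = {1: "yes", 2: "no", 3: "no", 4: "no", 5: "yes", 6: "yes",
           7: "no", 8: "yes", 9: "no", 10: "yes", 11: "yes", 12: "no"}
print(score(answers))
```

Repeating this over many sessions, one per language or simulated nationality, yields the scale profiles whose differences the study compares.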
"Our preliminary results support the initial hypothesis: the responses elicited from GPT-4o to the personality tests varied significantly depending on the language used. We also found that these differences aren't due exclusively to the translation of the test items, but rather to implicit cultural factors associated with each language or country. When it simulates native English speakers from five different countries, the personalities displayed matched the national stereotypes of each one, revealing the strong influence exerted by cultural biases in the data used to train GPT-4o," noted Andreas Kaltenbrunner, leader of the Artificial Intelligence and Data for Society (AID4So) group, which is affiliated with the UOC's research unit on Digital Transformation, AI and Technology, and with Turin's ISI Foundation.
The four authors (Jacopo Amidei, Gregorio Ferreira and Andreas Kaltenbrunner, from the AID4So group, and Rubén Nieto, from the eHealth Lab) are concerned because the results indicate that "GPT-4o resorts to cultural stereotypes when asked to simulate a person from a specific country, and these biases could be amplified in automatic translations or tasks involving multilingual text generation". They recommend a number of measures to prevent this, such as incorporating human checks into the translation process, using more than one translation system and comparing the results (the translation tool used in this study was Google Translate), and developing models with greater awareness of cultural and social context, not just language.
Antoni Oliver, an expert in machine translation and professor at the UOC's Faculty of Arts and Humanities, makes a distinction between neural machine translation (NMT) models, which are systems trained only to translate (automatic translation tools), and large language models (LLMs), which can perform other functions in addition to translating, such as ChatGPT and Microsoft's AI tool, Copilot. "There are hundreds of large language models, all with different degrees of multilingualism. The more languages a model has been trained on, the better its translation capabilities will be. It seems, however, that NMT models are more accurate, while LLMs, which work with larger contexts, may reproduce more stereotypes."
Psychological tests useful for AI research
Another interesting finding of the study is that psychological tests designed to explore personality in humans also appear to be able to assess language models such as GPT. "Our results show that GPT is sociable, emotionally stable, and follows social norms," said Nieto.
Systems like GPT can also be used to create virtual population samples with significant potential for health research. "Our study shows that GPT-4o can generate coherent responses with acceptable reliability values on some scales, such as Extraversion and Neuroticism. However, on other scales (such as Psychoticism) it's less consistent. So, without further validation, while we can say that the tests give some useful indicators, they can't be treated as exact measures or as directly comparable with human results," said Amidei.
Starting point for future research
The UOC team is now working to extend the study to include languages and models other than GPT-4o, such as Claude, LLaMA, and DeepSeek, as well as other personality tests, in order to assess the consistency of these results. "We need to better understand how AI systems produce biases based on stereotypes, so we'll design studies to replicate our results using other questionnaires and improving the processes for defining virtual populations," Nieto explained.
The research described is part of the UOC’s research missions: Ethical and human-centred technology and Culture for a critical society, and contributes to Sustainable Development Goal 9: Industry, Innovation and Infrastructure.
Jacopo Amidei, Gregorio Ferreira, Rubén Nieto, and Andreas Kaltenbrunner. 2025. Exploring the Impact of Language Switching on Personality Traits in LLMs. Proceedings of the 31st International Conference on Computational Linguistics, 2370–2378.
Research at the UOC
Specializing in the digital realm, the UOC's research contributes to the construction of future society and the transformations required to tackle global challenges.
Over 500 researchers and more than 50 research groups make up five research units, each with a mission: Culture for a critical society, Lifelong education, Digital health and planetary well-being, Ethical and human-centred technology and Digital transition and sustainability.
The university's Hubbik platform fosters the development of knowledge transfer and entrepreneurship initiatives within the UOC community.
The goals of the United Nations 2030 Agenda for Sustainable Development and open knowledge are strategic pillars that underpin the UOC's teaching, research and knowledge transfer activities. For more information, visit research.uoc.edu.
Press contact
Anna Sánchez-Juárez