Summary: People have difficulty distinguishing between human voices and AI-generated voices, correctly identifying them only about half the time. Despite this, brain scans revealed different neural responses to human and AI voices: human voices triggered areas related to memory and empathy, while AI voices activated regions for error detection and attention regulation.
These findings highlight both the challenges and potential of advanced AI speech technology. Further research will investigate how personality traits influence the ability to distinguish a voice's origin.
Key Facts:
- Identification struggle: Participants correctly identified human voices 56% of the time, AI voices 50.5%.
- Neural responses: Human voices activated memory and empathy areas; AI voices activated error-detection and attention-regulation areas.
- Perception bias: Neutral voices were often seen as AI, while happy voices were seen as human.
Source: VENZEN
Humans are not very good at distinguishing between human voices and voices generated by artificial intelligence (AI), but our brains do respond differently to human and AI voices, according to research presented today (Tuesday) at the Federation of European Neuroscience Societies (FENS) Forum 2024.
The study was presented by PhD student Christine Skjegstad and conducted by Ms Skjegstad and Professor Sascha Frühholz, both from the Department of Psychology at the University of Oslo (UiO), Norway.
Ms Skjegstad said: “We already know that AI-generated voices have become so sophisticated that they are almost indistinguishable from real human voices. It is now possible to clone a person’s voice from just a few seconds of recording, and scammers have used this technology to impersonate a loved one in need and trick victims into transferring money.
“While machine learning experts have developed technological solutions to detect AI voices, much less is known about the human brain’s response to these voices.”
The study involved 43 people listening to human and AI-generated voices expressing five different emotional states: neutrality, anger, fear, happiness and pleasure. They were asked to identify the voices as synthetic or natural while their brains were studied using functional magnetic resonance imaging (fMRI).
fMRI is used to detect changes in blood flow in the brain, indicating which parts of the brain are active. The participants were also asked to rate the characteristics of the voices they heard in terms of naturalness, reliability and authenticity.
Participants correctly identified human voices only 56% of the time and AI voices 50.5% of the time, meaning they were similarly poor at identifying both types of voice.
People were more likely to correctly identify a ‘neutral’ AI voice as AI (75% compared to 23% who could correctly identify a neutral human voice as human), suggesting that people assume neutral voices are more AI-like.
Female AI-neutral voices were correctly identified more often than male AI-neutral voices. For happy human voices, the correct identification rate was 78%, compared to just 32% for happy AI voices, suggesting that people associate happiness with human voices.
Both AI and human neutral voices were perceived as the least natural, reliable and authentic, while human happy voices were perceived as the most natural, reliable and authentic.
However, when they looked at brain imaging, researchers found that human voices elicited stronger responses in areas of the brain related to memory (right hippocampus) and empathy (right inferior frontal gyrus).
AI voices elicited stronger responses in areas related to error detection (right anterior mid-cingulate cortex) and attention regulation (right dorsolateral prefrontal cortex).
Ms Skjegstad said: “My research indicates that we are not very accurate at identifying whether a voice is human or AI-generated. The participants also often mentioned how difficult it was for them to tell the difference between the voices. This suggests that current AI voice technology can mimic human voices to a point where it is difficult for humans to reliably tell them apart.
“The results also indicate a perception bias: neutral voices were more likely to be identified as AI-generated and happy voices were more likely to be identified as human, regardless of whether they actually were. This was especially true for neutral female AI voices, which may be because we are familiar with female voice assistants like Siri and Alexa.
“Although we are not very good at distinguishing between human and AI voices, there does seem to be a difference in the brain’s response. AI voices can induce increased alertness, while human voices can evoke a sense of connection.”
The researchers now plan to investigate whether personality traits, for example extroversion or empathy, make people more or less sensitive to noticing the differences between human and AI voices.
Professor Richard Roche is Chairman of the Communications Committee of the FENS Forum and Deputy Head of the Department of Psychology at Maynooth University, Maynooth, County Kildare, Ireland, and was not involved in the research.
He said: “Examining the brain’s responses to AI voices is crucial as this technology continues to develop. This research will help us understand the potential cognitive and social implications of AI voice technology, which can inform policy and ethical guidelines.
“The risks of using this technology to defraud and fool people are clear. However, there are also potential benefits, such as providing voice replacements for people who have lost their natural voice. AI voices can also be used in therapy for certain mental health conditions.”
About this AI and neuroscience research news
Author: Kerry Noble
Source: VENZEN
Contact: Kerry Noble – FENS
Image: The image is credited to Neuroscience News
Original research: The findings will be presented at FENS Forum 2024