
From identity theft to scams: concerns about the use of AI tools for voice cloning

Published on 2024-05-25

Artificial Intelligence (AI) is spreading across many fields, and one of them is voice cloning, a technology for which more and more tools are becoming available. Its growth is raising concerns among users and professionals about possible misuse, as in the recent controversy involving actress Scarlett Johansson, who is seeking answers about whether OpenAI used her voice without consent to create Sky, one of the voices of the 'ChatGPT' chatbot.

Voice cloning with AI tools consists of creating synthetic copies of a person's voice using algorithms and machine learning. Some of these tools can replicate someone's speech from audio samples only a few seconds long, with realistic results.

Thus, unlike purely synthetic voices generated by a computer with text-to-speech technology, voice cloning starts from a person's real voice and produces a realistic rendition of the original.
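To make the idea concrete, the following is a minimal sketch of what such a cloning workflow can look like, using the open-source Coqui TTS library and its XTTS model; the model name reflects Coqui's documented usage, while the reference recording and output paths are illustrative, and any use of this technique should of course rely on a voice recorded with the speaker's consent.

```python
# Minimal voice-cloning sketch using the open-source Coqui TTS library
# (pip install TTS). File paths are illustrative placeholders.
from TTS.api import TTS

# Load a multilingual model that supports zero-shot voice cloning.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# A few seconds of reference audio are enough to condition the model
# on the target speaker (use only a voice recorded with consent).
tts.tts_to_file(
    text="This is a synthetic rendition of the reference speaker's voice.",
    speaker_wav="reference_speaker.wav",  # short sample of the target voice
    language="en",
    file_path="cloned_output.wav",
)
```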

In this regard, various AI-powered tools facilitate voice cloning. One example is Microsoft's VALL-E, which, from an audio recording of just three seconds, can mimic the speaker's voice while preserving even the emotion and acoustic environment of the message.

The same is true for OpenAI's recently announced tool, Voice Engine, a new AI model capable of creating personalized and natural voices with a single 15-second audio sample.

These tools thus offer many advantages for working with voice in different contexts, whether for creating personalized voice assistants, helping people with speech impairments, developing video games, or in work settings, with applications in marketing and content translation.

However, voice cloning technologies are also raising concerns among users and voice professionals who, given the potential for misuse of AI, foresee problems such as the impersonation of a voice and, with it, an identity.

POSSIBLE USES OF VOICE WITHOUT CONSENT

These concerns have materialized in cases like that of actress Scarlett Johansson, who is currently seeking legal answers about OpenAI's use of a voice very similar to hers in its 'ChatGPT' chatbot. Specifically, it is the voice known as Sky, which, as a result of the controversy, has been temporarily withdrawn.

The company led by Sam Altman initially proposed that the actress voice ChatGPT, but Johansson declined the offer for "personal reasons"; OpenAI instead opted to work with professional voice actors, talent agencies, casting directors, and industry consultants.

In fact, the company has stated that Sky's voice is not an imitation of the American actress but "belongs to a different professional actress, who uses her own natural voice" and who was hired even before the offer was made to Johansson.

However, Johansson maintains that OpenAI imitated her voice despite her rejection of the offer, and she is therefore seeking to clarify what happened through legal means. "When I heard the released demo, I was stunned, furious, and incredulous to see that Mr. Altman was using a voice that sounded so eerily similar to mine," she said in a statement.

This case reflects one of the possible consequences of these cloning technologies, which give rise to confusing situations in which it is difficult to assert and protect a person's identity on the internet, in this case through their voice.

VOICE PROFESSIONALS ON ALERT

The emergence of AI tools capable of cloning voices also puts voice industry professionals on alert, since the technology affects them directly and can, in some cases, end up replacing their work, for example in dubbing and voice-over performances.

This concern has already been voiced by groups such as the Dubbing Actors and Voice Talents Union of Madrid, which has requested that any speech generated with an AI "be duly identified as such." In this way, the organization aims to ensure that no listener can be misled into thinking they are hearing a human when they are actually hearing an AI.

Furthermore, the union has warned about the consequences that this type of technology and its uncontrolled use can have on the professional sector. For this reason, last year it called for legislation that would, among other things, require developers of AI cloning tools to include "an equalization or sound effect" that makes the content identifiable just by listening to it.
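The union's proposal is not spelled out in technical detail, but one naive way such an audible marker could work is to mix a constant low-level tone into every synthetic clip; the frequency and level in the sketch below are hypothetical choices made only to illustrate the idea.

```python
import numpy as np

def add_audible_marker(audio: np.ndarray, sample_rate: int,
                       marker_hz: float = 1000.0,
                       level: float = 0.02) -> np.ndarray:
    """Mix a continuous low-level tone into synthetic speech so that a
    listener can recognize the clip as machine-generated.

    `marker_hz` and `level` are hypothetical values chosen only to
    illustrate the audible-marker idea, not any standard.
    """
    t = np.arange(len(audio)) / sample_rate          # time axis in seconds
    marker = level * np.sin(2 * np.pi * marker_hz * t)
    return np.clip(audio + marker, -1.0, 1.0)        # avoid clipping after mixing
```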

As a middle ground between the use of AI and the work of voice professionals, initiatives like that of the company Speechless have emerged. Last April it launched a hybrid AI that lets video game developers use its AI-powered voice tools, but built on a real voice provided by a voice actor; the professional then receives a commission each time their voice is used in a video game.

IMPERSONATION OF CELEBRITIES

Beyond these labor concerns, voice cloning tools have on other occasions been used directly to impersonate famous personalities, specifically to carry out malicious activities such as promoting hateful behavior.

One such case occurred last year with the tech startup ElevenLabs, which reported misuse of its voice cloning technology after a series of audio clips appeared that seemed to feature celebrities such as Joe Rogan, Ben Shapiro, and Emma Watson making racist and violent statements.

This was confirmed by an investigation by Motherboard, which found that the clips had initially been posted on 4chan. Following this, ElevenLabs said it would implement a series of measures to curb the misuse, such as requiring a text sample to verify the rights to a voice.

SCAMS AND 'DEEPFAKES'

These voice impersonations are becoming more frequent, especially on social networks such as Facebook and Instagram, which have become preferred distribution channels for the perpetrators of these scams: millions of people use them daily, so any malicious campaign can reach a wide audience.

According to a survey conducted by Voicebot and Pindrop, two developers of voice software solutions, this trend worries more than 57 percent of users, who say they feel uneasy about their exposure to it.

At a time when users continually encounter 'deepfakes,' false information, and voice impersonations, certain characteristics are worth checking when consuming content, such as the consistency of the voice: cloned voices can have unusual tones or present inconsistent patterns.
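As a rough illustration of what "checking voice consistency" can mean at the signal level, the sketch below estimates the pitch contour of a clip with the librosa library and reports its variability. This is not a reliable deepfake detector, only a hedged example of the kind of inconsistency one might look for; the sample rate and pitch range are assumptions.

```python
import librosa
import numpy as np

def pitch_variability(path: str) -> float:
    """Estimate the pitch (f0) contour of a speech clip and return its
    standard deviation in Hz. Natural speech usually shows noticeable
    pitch movement; an unusually flat or erratic contour can be one weak
    hint, among many, that audio is synthetic."""
    y, sr = librosa.load(path, sr=16000)  # assumed sample rate
    f0, voiced_flag, voiced_probs = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),  # assumed pitch range for speech
        fmax=librosa.note_to_hz("C7"),
        sr=sr,
    )
    return float(np.nanstd(f0))  # NaN frames are unvoiced and ignored
```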

Similarly, in addition to evaluating sources, it is advisable to examine the context of the content and to be wary of posts that promise implausible things, such as large monetary rewards.
