Introduction

Emerging advancements in Artificial Intelligence (AI) technology have led to an explosion of both legitimate and nefarious opportunities. The use of AI for voice cloning and deepfakes has existed for almost two decades, but it previously required significant technical knowledge and access to resources. With recent advancements, legitimate AI tools can be used at little or no cost to clone a victim's voice. Cyber Threat Investigations and Expert Services (CTIX) analysts have gathered references from federal agencies and news sources to provide additional background on the most significant recent events.

What Is AI Voice Cloning?

AI voice cloning uses machine learning and natural language processing (NLP) to replicate human speech. The technology can create a unique human voice after studying data sets of human speech. What once took massive numbers of data samples now requires as little as five minutes, in aggregate, of a specific person's recorded voice to produce replicated output from user-provided text.

One of the most popular resources available to the broadest audience is ElevenLabs, which offers AI text-to-speech (TTS or T2S) software. The web-based tool is offered in tiers ranging from free to paid memberships, with higher tiers unlocking increasingly valuable resources for the user. ElevenLabs' viral growth makes it an easily referenced example of a commercial off-the-shelf tool that is revolutionizing AI voice cloning, also known as voice synthesis or voice replication. CNET highlighted the simple process of using voice cloning in one of their recent vlogs.

Benefits

AI voice cloning has many useful applications that support individual as well as corporate needs. People who have accessibility needs or have lost the ability to speak have leveraged AI voice cloning to support their everyday lives. Various AI voice cloning services have even "given people their voice back." The entertainment industry has capitalized on AI voice cloning and synthesis technology to bring back characters voiced by deceased actors, provide artistic dubbing in music, and supply budget-friendly voice extras for films. Various customer service-based companies have used AI voice synthesis for virtual chat and call assistants to minimize the need for large and costly call centers. Even authors now have cost-effective AI voice options for audiobook publishing.

Threats

As with many technological advancements in history, individuals have developed nefarious adaptations of AI voice cloning. Its most prevalent malicious use has been in spear-phishing attacks. Con artists and cyber criminals have used AI voice cloning technology to spoof a specific individual's voice and target unwitting family members or acquaintances with believable emergency scenarios such as DUIs, vehicle accidents, or faux kidnapping schemes. The technology has also revitalized familiar phone schemes such as insurance or banking impersonation. Because the cloned voice so closely resembles that of a real person known to the victim, it has become significantly more challenging for victims to separate fact from fiction. Recorded Future provides a detailed breakdown in their threat advisory dated May 2023. Several security leaders have found evidence of threat actors providing Voice Cloning as a Service (VCaaS) to individuals or organizations willing to pay. Dark web monitoring has revealed that some threat actors are selling ElevenLabs accounts to facilitate VCaaS offerings. Other threat actors are using VCaaS in multi-layer cyber-attacks for hire.

Advanced cyber threat actors are leveraging multiple resources to develop targeted penetration campaigns that incorporate generative AI against high-value and high-payoff targets. Because the banking and healthcare industries hold large amounts of sensitive information and valuable assets, many companies in these sectors have enhanced their security posture and taken preventative measures against breaches that use AI voice cloning. Even so, one Vice News journalist, Joseph Cox, was able to use free software to clone his voice and use it to gain access to a bank account in England. His findings highlight how little information, and how few resources and voice samples, are required to successfully access funds or vital information.

Data privacy and awareness of personal data online are at the heart of the vulnerabilities associated with generative AI technology, including voice cloning. Most people find it eye-opening that social media posts, especially videos, can be used as data points to inform a TTS platform or similar tool. Nation-state threat actors have increased disinformation campaigns, using social media as an accelerant. The United States Congress has begun to invest time and energy into revising privacy laws and regulations to better protect the nation's citizens. Leaders are exploring generative AI frameworks to prepare for future AI integration. Governments across the globe are exploring best practices for addressing generative AI and exposing disinformation campaigns to ensure security and stability.

Summary

  • AI voice cloning technology is now accessible to entry-level users with little to no experience, giving them the capability to clone a person's voice.
  • The technology is leveraged by actors ranging from con artists to advanced cyber threat actors and nation-state threat actors.
  • Threat actors can obscure their use of AI technologies by leveraging free trials and services offered by AI vendors.
  • One of the most popular resources available to the broadest audience is ElevenLabs, which offers AI text-to-speech software.
  • Threat actors on the dark web are offering Voice Cloning as a Service (VCaaS) to "clients" willing to pay.
  • AI voice cloning has been used to defeat some Multifactor Authentication (MFA).
  • Currently, the most common AI voice cloning threat is spear-phishing in one-time scams for monetary gain.
  • U.S. legislators and various leaders have called for closer attention to AI strategy, policy, and laws to protect citizens.

Mitigation

  • Adjust privacy settings on social media to friends and family only, leaving minimal exposure to public view online. Remove unknown or questionable individuals from friend listings.
  • If a suspicious call is received, hang up and contact the individual directly using known contact information.
  • Set verbal codes with trusted individuals, family, and friends for emergency circumstances that will verify identity in those instances.
  • Reduce online presence by removing unused memberships, and invest time to utilize digital data removal resources.
  • Verify information read online and on social media against multiple credible sources to spot and minimize disinformation.
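
For organizations that want to apply the verbal-code idea above in a help-desk or call-handling tool, the check could be sketched roughly as follows. This is a minimal illustration in Python, not a vetted authentication design. The function names (enroll_code, verify_code), the normalization rule, and the iteration count are all assumptions made for this example and do not come from any product or service mentioned in this article.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; tune to your own environment.
ITERATIONS = 200_000


def _normalize(phrase: str) -> bytes:
    # Spoken codes get typed inconsistently, so ignore case and extra spaces.
    return " ".join(phrase.lower().split()).encode("utf-8")


def enroll_code(phrase: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store instead of the plain verbal code."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", _normalize(phrase), salt, ITERATIONS)
    return salt, digest


def verify_code(spoken: str, salt: bytes, digest: bytes) -> bool:
    """Check a code given during a call against the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", _normalize(spoken), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, digest)
```

The point of the sketch is that the check depends on something the caller knows rather than on how the caller sounds, so a cloned voice alone would not pass it.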

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.