“Deepfakes” have become a major societal concern with the advent of video and audio content generated by artificial intelligence, or AI. A deepfake is a convincing imitation that blurs the line between fantasy and reality. Deepfakes can make it difficult to determine, for example, whether a politician actually made a troubling statement or was sabotaged by those seeking to interfere in an election.

“Until recently, the sound of a recorded voice was universally accepted as genuinely human,” says Visar Berisha, a professor of electrical engineering in the Ira A. Fulton Schools of Engineering at Arizona State University with a joint appointment in the university’s College of Health Solutions. “There was no reason to doubt its authenticity. With the advent of voice cloning technology, this trust is eroding and skepticism, rather than trust, will become the new norm.”

Because deepfakes have the potential to ruin reputations and erode faith in institutions, the U.S. Federal Trade Commission, or FTC, held the FTC Voice Cloning Challenge, inviting creative multidisciplinary methods to combat AI-generated deepfake audio, with winners sharing $35,000 in prize money.

One of the contest’s winners is OriginStory, a project built around a new kind of microphone that first verifies a live human speaker is producing the speech being recorded, then watermarks that speech as authentically human. The watermark can be shown to listeners, establishing a chain of trust from recording to retrieval.
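The article does not detail OriginStory's internal design, but the chain-of-trust pattern it describes, in which a capture device attests that a human produced the audio and listeners later check that attestation, resembles a digital signature scheme. The sketch below is a hypothetical illustration in Python using the open-source cryptography package, not OriginStory's actual method; the names attest_recording and verify_recording are invented, and a real device would perform the liveness check and signing in trusted hardware.

# Hypothetical sketch of a capture-to-retrieval chain of trust.
# Requires the third-party "cryptography" package (pip install cryptography).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

# In a real device, the private key would live in tamper-resistant hardware,
# and signing would happen only after the microphone's liveness check passes.
device_key = Ed25519PrivateKey.generate()
device_public_key = device_key.public_key()

def attest_recording(audio_bytes: bytes, human_verified: bool) -> bytes:
    """Sign the captured audio, but only if the liveness sensor confirmed
    a human speaker. The signature serves as the 'watermark' metadata."""
    if not human_verified:
        raise ValueError("liveness check failed; refusing to attest")
    return device_key.sign(audio_bytes)

def verify_recording(audio_bytes: bytes, signature: bytes,
                     public_key: Ed25519PublicKey) -> bool:
    """A listener checks the watermark against the device's public key."""
    try:
        public_key.verify(signature, audio_bytes)
        return True
    except InvalidSignature:
        return False

audio = b"\x00\x01\x02\x03"  # stand-in for recorded waveform bytes
tag = attest_recording(audio, human_verified=True)
assert verify_recording(audio, tag, device_public_key)             # authentic
assert not verify_recording(audio + b"x", tag, device_public_key)  # tampered

Because the signature covers the exact audio bytes, any edit after capture breaks verification, which is what would let such a watermark carry trust from recording all the way to retrieval.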

OriginStory is firmly rooted at ASU: the project was developed with university resources and patented through SkySong Innovations.

Berisha leads the development team, which includes fellow ASU faculty members Daniel Bliss, a professor of electrical engineering in the School of Electrical, Computer and Energy Engineering, part of the Fulton Schools, and Julie Liss, associate dean of the College of Health Solutions and professor of speech and hearing science.
