Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio
On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything, and do it in a …