"Microsoft Unveils VASA-1, an AI That Can Animate Single Photo With Audio in Real-time, Paving Way For Lifelike Avatars"

Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

Pics and it didn't happen: YouTube videos of 6K celebrities helped train AI model to animate photos in real time.

Image: A sample image from Microsoft for "VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time."

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future, it could power virtual avatars that render locally and don't r...