Meta recently showcased two significant artificial intelligence (AI) research projects: Emu Edit and Emu Video.
Emu Edit performs image editing from natural-language instructions, while Emu Video generates short videos from text and image prompts. Both build on Emu, Meta's previously announced image-generation model.
Emu Video can generate video from text, a single image, or a combination of both. Its architecture is simpler than that of earlier models such as Make-A-Video, which allows for faster generation. In human evaluations, raters preferred Emu Video's outputs over those of every competing model, with an average preference rate of 96%; the only model that came close on faithfulness to the text prompt was Imagen Video.
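Emu Video is not publicly available, so the sketch below is purely illustrative: the function names (`generate_image`, `animate_image`, `generate_video`) are hypothetical stand-ins, and the string-returning stubs stand in for real diffusion models. It only shows the shape of a factorized two-step text-to-video pipeline, in which an image is first synthesized from the prompt and the video stage is then conditioned on both the prompt and that image, rather than running a deep cascade of models.

```python
def generate_image(prompt: str) -> str:
    """Stage 1 stand-in: text -> image (a real system would run a
    text-to-image diffusion model here)."""
    return f"image({prompt})"

def animate_image(prompt: str, image: str) -> str:
    """Stage 2 stand-in: (text, image) -> video. Conditioning on the
    stage-1 image is what makes the approach 'factorized'."""
    return f"video({prompt}, {image})"

def generate_video(prompt: str) -> str:
    """Factorized text-to-video: compose the two stages."""
    image = generate_image(prompt)
    return animate_image(prompt, image)

print(generate_video("a dog surfing a wave"))
# prints: video(a dog surfing a wave, image(a dog surfing a wave))
```

Because generation is split into two conditioned steps instead of one long cascade, each stage can stay comparatively small, which is consistent with the faster processing reported above.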
Emu Edit offers detailed image customization, allowing users to extract objects from backgrounds and tailor styles to their liking. The research team found that Emu Edit benefits from training on fundamental computer vision tasks such as object detection and segmentation, which gives the model precise control over regions of the image, including character poses and movements.
At present, Meta does not provide external access to the Emu models.
TLDR: Meta showcased two AI research projects, Emu Edit and Emu Video. Emu Video generates videos from text or images and outperformed other models in human evaluations, while Emu Edit enables detailed, instruction-based image editing thanks in part to training on computer vision tasks such as detection and segmentation. Currently, Meta does not offer external access to the Emu models.