Text-to-video model

A text-to-video model is a machine learning model which takes as input a natural language description and produces a video matching that description.[1]

Video prediction that renders realistic objects against a stable background can be performed with a recurrent neural network used as a sequence-to-sequence model, connected to a convolutional neural network that encodes and decodes each frame pixel by pixel,[2] so that the video is generated using deep learning.[3]
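The data flow described above can be illustrated with a minimal sketch in plain Python. It is a toy under stated assumptions, not the method of the cited sources: the weights are random and untrained, the decoder is simplified to a linear readout rather than a convolutional network, and every name (predict_video, encode, rnn_step, decode, hidden_size) is hypothetical. Context frames are passed through a convolutional encoder and folded into a recurrent hidden state; future frames are then decoded one at a time and fed back as input.

```python
import math
import random


def conv2d_valid(frame, kernel):
    """Valid 2D cross-correlation of a frame (list of rows) with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(frame) - kh + 1
    out_w = len(frame[0]) - kw + 1
    return [
        [
            sum(frame[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def encode(frame, kernel):
    """Convolutional encoder: convolve, apply ReLU, flatten to a vector."""
    fmap = conv2d_valid(frame, kernel)
    return [max(0.0, v) for row in fmap for v in row]


def rnn_step(hidden, features, w_h, w_x):
    """Elman-style recurrent update: h' = tanh(W_h h + W_x x)."""
    return [
        math.tanh(
            sum(w_h[i][j] * hidden[j] for j in range(len(hidden)))
            + sum(w_x[i][j] * features[j] for j in range(len(features)))
        )
        for i in range(len(hidden))
    ]


def decode(hidden, w_out, height, width):
    """Simplified decoder (assumption): linear readout plus sigmoid per pixel."""
    flat = [
        1.0 / (1.0 + math.exp(-sum(w_out[p][j] * hidden[j]
                                   for j in range(len(hidden)))))
        for p in range(height * width)
    ]
    return [flat[r * width:(r + 1) * width] for r in range(height)]


def predict_video(context_frames, n_future, hidden_size=8, seed=0):
    """Read context frames, then generate n_future frames autoregressively."""
    rng = random.Random(seed)
    height, width = len(context_frames[0]), len(context_frames[0][0])
    n_feat = (height - 2) * (width - 2)  # size of a valid 3x3 convolution output

    def rand_matrix(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    # Untrained random weights: this sketch shows data flow only.
    kernel = rand_matrix(3, 3)
    w_h = rand_matrix(hidden_size, hidden_size)
    w_x = rand_matrix(hidden_size, n_feat)
    w_out = rand_matrix(height * width, hidden_size)

    # Encoder phase: summarise the observed frames in the hidden state.
    hidden = [0.0] * hidden_size
    for frame in context_frames:
        hidden = rnn_step(hidden, encode(frame, kernel), w_h, w_x)

    # Decoder phase: emit a frame, then feed it back as the next input.
    outputs = []
    for _ in range(n_future):
        frame = decode(hidden, w_out, height, width)
        outputs.append(frame)
        hidden = rnn_step(hidden, encode(frame, kernel), w_h, w_x)
    return outputs


# Usage: two 5x5 grayscale context frames, three predicted frames.
context = [[[0.0] * 5 for _ in range(5)], [[1.0] * 5 for _ in range(5)]]
future = predict_video(context, n_future=3)
```

Because the weights are untrained, the predicted frames carry no meaning; a real system would learn the encoder, recurrent, and decoder parameters from video data.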

Methodology

Models

Several text-to-video models exist, including open-source ones. The code for CogVideo has been published on GitHub.[4] Meta Platforms presents its text-to-video model, Make-A-Video, at makeavideo.studio.[5][6][7] Google uses Imagen Video to convert text to video.[8][9][10][11][12]

Antonia Antonova has presented another text-to-video model.[13]

References

  1. Artificial Intelligence Index Report 2023 (PDF) (Report). Stanford Institute for Human-Centered Artificial Intelligence. p. 98. Multiple high quality text-to-video models, AI systems that can generate video clips from prompted text, were released in 2022.
  2. "Leading India" (PDF).
  3. Narain, Rohit (2021-12-29). "Smart Video Generation from Text Using Deep Neural Networks". Retrieved 2022-10-12.
  4. THUDM (2022-10-12). "CogVideo". GitHub. Retrieved 2022-10-12.
  5. Davies, Teli (2022-09-29). "Make-A-Video: Meta AI's New Model For Text-To-Video Generation". W&B. Retrieved 2022-10-12.
  6. Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt". Medium. Retrieved 2022-10-12.
  7. "Meta's Make-A-Video AI creates videos from text". www.fonearena.com. Retrieved 2022-10-12.
  8. "google: Google takes on Meta, introduces own video-generating AI - The Economic Times". m.economictimes.com. Retrieved 2022-10-12.
  9. Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt". Medium. Retrieved 2022-10-12.
  10. "Nuh-uh, Meta, we can do text-to-video AI, too, says Google". www.theregister.com. Retrieved 2022-10-12.
  11. "Papers with Code - See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction". paperswithcode.com. Retrieved 2022-10-12.
  12. "Papers with Code - Text-driven Video Prediction". paperswithcode.com. Retrieved 2022-10-12.
  13. "Text to Video Generation". Antonia Antonova. Retrieved 2022-10-12.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.