Generative artificial intelligence
Generative artificial intelligence or generative AI (also GenAI) is a type of artificial intelligence (AI) system capable of generating text, images, or other media in response to prompts.[1][2] Generative models learn the patterns and structure of the input data, and then generate new content that is similar to the training data but with some degree of novelty (rather than only classifying or predicting data).[3] Generative AI can be either unimodal or multimodal; unimodal systems take only one type of input (for example, text) whereas multimodal systems can take more than one type of input (for example, text and images).[4]
The most prominent frameworks for approaching generative AI include generative adversarial networks (GANs) and generative pre-trained transformers (GPTs).[5][6] GANs consist of two parts: a generator network that creates new data samples, and a discriminator network that evaluates whether the samples are real or fake. The two networks are trained together in a competitive process, with the generator network continually trying to produce better and more realistic samples, while the discriminator network tries to accurately identify the fake ones. GPTs are artificial neural networks that are based on the transformer architecture, pre-trained on large datasets of unlabeled text, and able to generate novel human-like text.[7][8] They use large language models to produce data based on the training data set that was used to create them.[9]
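The adversarial loop described above can be illustrated in a deliberately tiny setting. The sketch below is not any cited system: the one-parameter generator, the logistic discriminator, the data distribution, and the learning rate are all illustrative assumptions chosen so the hand-derived gradients stay simple.

```python
import numpy as np

# Toy GAN sketch (illustrative only): real data are samples from N(4, 0.5).
# Generator: g(z) = theta + z shifts noise toward the real distribution.
# Discriminator: D(x) = sigmoid(a*x + b) scores samples as real vs. fake.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

theta = 0.0          # generator parameter; should drift toward the real mean, 4
a, b = 0.1, 0.0      # discriminator parameters
lr = 0.02

for step in range(3000):
    real = rng.normal(4.0, 0.5, size=64)
    z = rng.normal(0.0, 0.5, size=64)
    fake = theta + z

    # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)),
    # i.e. it tries to score real samples high and generated samples low.
    d_real = sigmoid(a * real + b)
    d_fake = sigmoid(a * fake + b)
    a += lr * np.mean((1 - d_real) * real - d_fake * fake)
    b += lr * np.mean((1 - d_real) - d_fake)

    # Generator step: gradient ascent on log D(fake) (the non-saturating
    # loss), i.e. it tries to make its samples fool the discriminator.
    d_fake = sigmoid(a * fake + b)
    theta += lr * np.mean((1 - d_fake) * a)

print(round(theta, 2))  # theta has moved from 0 toward the real mean of 4
```

Because the two updates pull against each other, theta typically oscillates around the real mean rather than settling exactly on it, which mirrors the instability adversarial training is known for at scale.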
Generative AI has many potential applications, including in creative fields such as art, music, and writing, as well as in fields such as healthcare, finance, and gaming. However, there are also concerns about the potential misuse of generative AI, such as in creating fake news or deepfakes, which can be used to deceive or manipulate people.
Notable generative AI systems include ChatGPT (and its variant Bing Chat), a chatbot built by OpenAI using their GPT-3 and GPT-4 foundational large language models,[10] and Bard, a chatbot built by Google using their LaMDA foundation model.[11] Other generative AI models include artificial intelligence art systems such as Stable Diffusion, Midjourney, and DALL-E.[12]
Generative AI has potential applications across a wide range of industries, including software development, marketing, and fashion.[13][14] Investment in generative AI surged during the early 2020s, with large companies such as Microsoft, Google, and Baidu as well as numerous smaller firms developing generative AI models.[1][15][16]
Modalities

A generative AI system is constructed by applying unsupervised or self-supervised machine learning to a data set. The capabilities of a generative AI system depend on the modality or type of the data set used.
- Text: Generative AI systems trained on words or word tokens include GPT-3, LaMDA, LLaMA, BLOOM, GPT-4, and others (see List of large language models). They are capable of natural language processing, machine translation, and natural language generation and can be used as foundation models for other tasks.[17] Data sets include BookCorpus, Wikipedia, and others (see List of text corpora).
- Code: In addition to natural language text, large language models can be trained on programming language text, allowing them to generate source code for new computer programs.[18] Examples include OpenAI Codex.
- Images: Generative AI systems trained on sets of images with text captions include Imagen, DALL-E, Midjourney, Stable Diffusion, and others (see Artificial intelligence art, Generative art, Synthetic media). They are commonly used for text-to-image generation and neural style transfer.[19] Datasets include LAION-5B and others (see Datasets in computer vision).
- Molecules: Generative AI systems can be trained on amino acid sequences or on molecular representations such as SMILES strings. These systems, such as AlphaFold, are used for protein structure prediction and drug discovery.[20] Training data come from biological databases such as the Protein Data Bank.
- Music: Generative AI systems such as MusicLM can be trained on the audio waveforms of recorded music along with text annotations, in order to generate new musical samples based on text descriptions such as "a calming violin melody backed by a distorted guitar riff".[21]
- Video: Generative AI trained on annotated video can generate temporally coherent video clips. Examples include Gen-1 by Runway[22] and Make-A-Video by Meta Platforms.[23]
- Multimodal: A generative AI system can be built from multiple generative models, or one model trained on multiple types of data. For example, one version of OpenAI's GPT-4 accepts both text and image inputs.[24]
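The self-supervised training behind the text modality can be sketched at toy scale: a bigram model learns next-word statistics from raw, unlabeled text and then generates new sequences, the same predict-the-next-token objective that transformer language models apply at vastly larger scale. The corpus and greedy decoding below are illustrative assumptions, not any system's actual training setup.

```python
from collections import Counter, defaultdict

# Minimal self-supervised "language model" sketch: learn next-word
# statistics from raw text (no labels needed), then generate new text.
# This tiny corpus is an illustrative stand-in for web-scale training data.
corpus = (
    "generative models learn patterns from data . "
    "generative models generate new data . "
    "models learn from data ."
).split()

# "Training" is just counting bigrams: for each word, tally what follows it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start, length=5):
    """Greedily emit the most likely next word at each step."""
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

print(generate("generative"))
```

Real systems replace the count table with a neural network and greedy lookup with learned, probabilistic sampling, but the objective is the same: predict the next token given the tokens so far.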
References
- Griffith, Erin; Metz, Cade (2023-01-27). "Anthropic Said to Be Closing In on $300 Million in New A.I. Funding". The New York Times. Retrieved 2023-03-14.
- Lanxon, Nate; Bass, Dina; Davalos, Jackie (March 10, 2023). "A Cheat Sheet to AI Buzzwords and Their Meanings". Bloomberg News. Retrieved March 14, 2023.
- Pasick, Adam (2023-03-27). "Artificial Intelligence Glossary: Neural Networks and Other Terms Explained". The New York Times. ISSN 0362-4331. Retrieved 2023-04-22.
- "A History of Generative AI: From GAN to GPT-4". MarkTechPost. March 21, 2023. https://www.marktechpost.com/2023/03/21/a-history-of-generative-ai-from-gan-to-gpt-4/
- "Generative AI and Future". Towards AI. https://pub.towardsai.net/generative-ai-and-future-c3b1695876f2
- Computer. IEEE Computer Society. October 2022. https://www.computer.org/csdl/magazine/co/2022/10/09903869/1H0G6xvtREk
- World Economic Forum. January 2023. https://www.weforum.org/agenda/2023/01/davos23-generative-ai-a-game-changer-industries-and-society-code-developers/
- "The A to Z of Artificial Intelligence". Time. https://time.com/6271657/a-to-z-of-artificial-intelligence/
- Andrej Karpathy; Pieter Abbeel; Greg Brockman; Peter Chen; Vicki Cheung; Yan Duan; Ian Goodfellow; Durk Kingma; Jonathan Ho; Rein Houthooft; Tim Salimans; John Schulman; Ilya Sutskever; Wojciech Zaremba (2016-06-16). "Generative models". OpenAI.
- Metz, Cade (2023-03-14). "OpenAI Plans to Up the Ante in Tech's A.I. Race". The New York Times. ISSN 0362-4331. Retrieved 2023-03-31.
- Thoppilan, Romal; De Freitas, Daniel; Hall, Jamie; Shazeer, Noam; Kulshreshtha, Apoorv; Cheng, Heng-Tze; Jin, Alicia; Bos, Taylor; Baker, Leslie; Du, Yu; Li, YaGuang; Lee, Hongrae; Zheng, Huaixiu Steven; Ghafouri, Amin; Menegali, Marcelo; Huang, Yanping; Krikun, Maxim; Lepikhin, Dmitry; Qin, James; Chen, Dehao; Xu, Yuanzhong; Chen, Zhifeng; Roberts, Adam; Bosma, Maarten; Zhao, Vincent; Zhou, Yanqi; Chang, Chung-Ching; Krivokon, Igor; Rusch, Will; Pickett, Marc; Srinivasan, Pranesh; Man, Laichee; Meier-Hellstern, Kathleen; Ringel Morris, Meredith; Doshi, Tulsee; Delos Santos, Renelito; Duke, Toju; Soraker, Johnny; Zevenbergen, Ben; Prabhakaran, Vinodkumar; Diaz, Mark; Hutchinson, Ben; Olson, Kristen; Molina, Alejandra; Hoffman-John, Erin; Lee, Josh; Aroyo, Lora; Rajakumar, Ravi; Butryna, Alena; Lamm, Matthew; Kuzmina, Viktoriya; Fenton, Joe; Cohen, Aaron; Bernstein, Rachel; Kurzweil, Ray; Aguera-Arcas, Blaise; Cui, Claire; Croak, Marian; Chi, Ed; Le, Quoc (January 20, 2022). "LaMDA: Language Models for Dialog Applications". arXiv:2201.08239 [cs.CL].
- Roose, Kevin (2022-10-21). "A Coming-Out Party for Generative A.I., Silicon Valley's New Craze". The New York Times. Retrieved 2023-03-14.
- "Don't fear an AI-induced jobs apocalypse just yet". The Economist. 2023-03-06. Retrieved 2023-03-14.
- Harreis, H.; Koullias, T.; Roberts, Roger. "Generative AI: Unlocking the future of fashion".
- "The race of the AI labs heats up". The Economist. 2023-01-30. Retrieved 2023-03-14.
- Yang, June; Gokturk, Burak (2023-03-14). "Google Cloud brings generative AI to developers, businesses, and governments".
- Bommasani, R; Hudson, DA; Adeli, E; Altman, R; Arora, S; von Arx, S; Bernstein, MS; Bohg, J; Bosselut, A; Brunskill, E; Brynjolfsson, E (2021-08-16). "On the opportunities and risks of foundation models". arXiv:2108.07258 [cs.LG].
- Chen, Mark; Tworek, Jerry; Jun, Heewoo; Yuan, Qiming; Pinto, Henrique Ponde de Oliveira; Kaplan, Jared; Edwards, Harri; Burda, Yuri; Joseph, Nicholas; Brockman, Greg; Ray, Alex (2021-07-06). "Evaluating Large Language Models Trained on Code". arXiv:2107.03374 [cs.LG].
- Ramesh, Aditya; Pavlov, Mikhail; Goh, Gabriel; Gray, Scott; Voss, Chelsea; Radford, Alec; Chen, Mark; Sutskever, Ilya (2021). "Zero-shot text-to-image generation". International Conference on Machine Learning. PMLR. pp. 8821–8831.
- Heaven, Will Douglas (2023-02-15). "AI is dreaming up drugs that no one has ever seen. Now we've got to see if they work". MIT Technology Review. Massachusetts Institute of Technology. Retrieved 2023-03-15.
- Agostinelli, Andrea; Denk, Timo I.; Borsos, Zalán; Engel, Jesse; Verzetti, Mauro; Caillon, Antoine; Huang, Qingqing; Jansen, Aren; Roberts, Adam; Tagliasacchi, Marco; Sharifi, Matt; Zeghidour, Neil; Frank, Christian (26 January 2023). "MusicLM: Generating Music From Text". arXiv:2301.11325 [cs.SD].
- Metz, Cade (April 4, 2023). "Instant Videos Could Represent the Next Leap in A.I. Technology". The New York Times.
- Queenie Wong (Sep 29, 2022). "Facebook Parent Meta's AI Tool Can Create Artsy Videos From Text". cnet.com. Retrieved Apr 4, 2023.
- "Explainer: What is Generative AI, the technology behind OpenAI's ChatGPT?". Reuters. March 17, 2023. Retrieved March 17, 2023.