Stable Diffusion is a machine learning model developed by Stability AI, in collaboration with EleutherAI and LAION. This powerful text-to-image model is capable of generating digital images from natural language descriptions, making it a revolutionary tool for digital art and visual storytelling. Unlike other models such as DALL-E, Stable Diffusion makes its source code and model weights publicly available, making it more accessible to artists, designers, and developers alike.
Using a subset of the LAION-Aesthetics V2 dataset and the computing power of 256 Nvidia A100 GPUs, Stable Diffusion has been trained to produce images that capture the essence of natural language prompts. This model is not only capable of generating original digital art pieces but can also be used for other tasks, such as image-to-image translation guided by a text prompt.
Check out some examples of the images it has generated on our prompt gallery and image gallery.
Stable Diffusion is based on an image generation technique called latent diffusion models (LDMs). Unlike other popular image synthesis methods such as generative adversarial networks (GANs) and the auto-regressive technique used by DALL-E, LDMs generate images by iteratively "de-noising" data in a latent representation space, then decoding the representation into a full image. LDM was developed by the Machine Vision and Learning research group at the Ludwig Maximilian University of Munich and described in a paper presented at the recent IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Earlier this year, InfoQ covered Google's Imagen model, another diffusion-based image generation AI.
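The iterative de-noising process can be sketched in a few lines of code. The snippet below is a toy illustration only: the `toy_denoiser` and `decode` functions are hypothetical stand-ins for Stable Diffusion's prompt-conditioned U-Net and VAE decoder, which are large trained neural networks.

```python
import numpy as np

# Minimal sketch of the reverse-diffusion loop used by latent diffusion
# models. The denoiser and decoder here are toy placeholders, NOT the
# real Stable Diffusion components.

rng = np.random.default_rng(0)
NUM_STEPS = 50

def toy_denoiser(z, t):
    # Stand-in noise predictor: in a real LDM this is a U-Net conditioned
    # on the text prompt; here we simply shrink the latent toward zero.
    return z * (t / NUM_STEPS)

def decode(z):
    # Stand-in for the VAE decoder that maps latents back to pixel space.
    return np.tanh(z)

# Start from pure Gaussian noise in the (much smaller) latent space.
z = rng.standard_normal((4, 8, 8))

# Iteratively remove a small amount of predicted noise at each step.
for t in range(NUM_STEPS, 0, -1):
    predicted_noise = toy_denoiser(z, t)
    z = z - predicted_noise / NUM_STEPS

# The final latent is decoded into an image-like array.
image = decode(z)
```

Working in a compressed latent space rather than directly on pixels is the key efficiency gain of LDMs: the expensive de-noising loop runs on a small tensor, and only the final decode step produces a full-resolution image.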
The Stable Diffusion model can support several operations. Like DALL-E, it can be given a text description of a desired image and generate a high-quality image that matches that description. It can also generate a realistic-looking image from a simple sketch plus a textual description of the desired image. Meta AI recently released a model called Make-A-Scene that has similar image-to-image capabilities.
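The sketch-plus-text mode differs from pure text-to-image mainly in its starting point: instead of beginning from pure noise, the input image is encoded into the latent space, partially noised, and then de-noised for the remaining steps. The toy sketch below illustrates that idea; the `encode`, `decode`, and `toy_denoiser` functions are hypothetical placeholders for the real encoder, decoder, and prompt-conditioned denoiser.

```python
import numpy as np

# Toy illustration of image-to-image generation with a diffusion model.
# All components are stand-ins, not the real Stable Diffusion networks.

rng = np.random.default_rng(1)
NUM_STEPS = 50
STRENGTH = 0.6  # fraction of the diffusion process to apply (0 = copy input)

def encode(image):
    # Stand-in VAE encoder mapping pixels to latents.
    return np.arctanh(np.clip(image, -0.999, 0.999))

def decode(z):
    # Stand-in VAE decoder mapping latents back to pixels.
    return np.tanh(z)

def toy_denoiser(z, t):
    # Stand-in for the prompt-conditioned U-Net noise predictor.
    return z * (t / NUM_STEPS)

sketch = np.zeros((4, 8, 8))  # the user's rough input image
z = encode(sketch)

# Partially noise the latent, then denoise only the remaining steps,
# so the output stays anchored to the input sketch.
start = int(NUM_STEPS * STRENGTH)
z = z + rng.standard_normal(z.shape) * (start / NUM_STEPS)

for t in range(start, 0, -1):
    z = z - toy_denoiser(z, t) / NUM_STEPS

result = decode(z)
```

The `STRENGTH` parameter controls how closely the output follows the input: lower values preserve more of the original sketch, while higher values give the model more freedom to reinterpret it according to the text prompt.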
Many users have publicly posted examples of generated images; Katherine Crowson, lead developer at Stability AI, has shared many images on Twitter. Some commenters are troubled by the impact that AI-based image synthesis will have on artists and the art world. The same week that Stable Diffusion was released, an AI-generated artwork won first prize in an art competition at the Colorado State Fair.