What is new in Stable Diffusion 2.0?

Stable Diffusion 2.0 is trained with a brand new text encoder, OpenCLIP, and adds a Depth-to-Image diffusion model.
text-to-image
image generation
Published

November 24, 2022

While the world was still amazed by the performance of the open-source text-to-image model Stable Diffusion 1.x, Stability AI released a new version with many improvements.

List of notable updates:

  • Trained using a new text encoder, OpenCLIP, developed by LAION with support from Stability AI.
  • The text-to-image models can now generate images with default resolutions of both 512x512 pixels and 768x768 pixels.
  • The models are trained on a subset of the LAION-5B dataset, with adult content filtered out using an NSFW filter.
  • Stable Diffusion 2.0 comes with an Upscaler Diffusion model that enhances the resolution of images by a factor of 4.
  • Depth-to-Image: It extends the version 1 image-to-image feature by also using the depth of the input image when generating a new image.
  • It also brings a new text-guided image inpainting diffusion model, fine-tuned on the new Stable Diffusion 2.0 base text-to-image model.
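
To make the list above more concrete, here is a minimal sketch of how each variant might be loaded with the Hugging Face diffusers library. The article itself names no library, so everything here is an assumption: the Hub repository IDs, the pairing of each variant with a diffusers pipeline class, and the helper names `SD2_VARIANTS`, `upscaled_size`, and `load_pipeline`, which are made up for illustration.

```python
import importlib

# Assumed mapping from each 2.0 feature to a Hugging Face Hub repository and
# a diffusers pipeline class. None of these names come from the article.
SD2_VARIANTS = {
    "text-to-image-512": ("stabilityai/stable-diffusion-2-base", "StableDiffusionPipeline"),
    "text-to-image-768": ("stabilityai/stable-diffusion-2", "StableDiffusionPipeline"),
    "upscaler-x4": ("stabilityai/stable-diffusion-x4-upscaler", "StableDiffusionUpscalePipeline"),
    "depth-to-image": ("stabilityai/stable-diffusion-2-depth", "StableDiffusionDepth2ImgPipeline"),
    "inpainting": ("stabilityai/stable-diffusion-2-inpainting", "StableDiffusionInpaintPipeline"),
}


def upscaled_size(width, height, factor=4):
    """Return the output resolution after upscaling by `factor` (4 for the 2.0 upscaler)."""
    return width * factor, height * factor


def load_pipeline(task):
    """Load the pipeline for one of the keys in SD2_VARIANTS.

    diffusers is imported lazily, so the rest of this module works without it.
    Calling this function downloads several gigabytes of model weights.
    """
    repo_id, class_name = SD2_VARIANTS[task]
    diffusers = importlib.import_module("diffusers")
    pipeline_cls = getattr(diffusers, class_name)
    return pipeline_cls.from_pretrained(repo_id)


# Hypothetical usage (not executed here because it downloads the model):
#   pipe = load_pipeline("text-to-image-768")
#   image = pipe("a photo of an astronaut riding a horse").images[0]
#   image.save("astronaut.png")
```

As a quick check on the upscaler arithmetic, `upscaled_size(512, 512)` gives `(2048, 2048)`: a factor of 4 applies to each side, so the pixel count grows sixteenfold.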