Categories: Image

Stable Diffusion v2-1-unCLIP model released

Information taken from the GitHub page: https://github.com/Stability-AI/stablediffusion/blob/main/doc/UNCLIP.MD

HuggingFace checkpoints and diffusers integration: https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip

Public web-demo: https://clipdrop.co/stable-diffusion-reimagine


unCLIP is the approach behind OpenAI’s DALL·E 2, trained to invert CLIP image embeddings. We finetuned SD 2.1 to accept a CLIP ViT-L/14 image embedding in addition to the text encodings. This means the model can be used to produce image variations, and it can also be combined with a prior that maps text to CLIP image embeddings, yielding a full text-to-image model at 768×768 resolution.

If you would like to try a demo of this model on the web, please visit https://clipdrop.co/stable-diffusion-reimagine

This model essentially uses an input image as the ‘prompt’ rather than requiring a text prompt. It does this by first converting the input image into a CLIP embedding, then feeding that embedding into a Stable Diffusion 2.1-768 model fine-tuned to generate images from such embeddings, allowing users to produce multiple variations of a single image. Note that this is distinct from img2img: the structure of the original image is generally not preserved.
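For those using the diffusers integration linked above, the image-variation workflow looks roughly like the sketch below. The StableUnCLIPImg2ImgPipeline class and the checkpoint name come from the HuggingFace page; exact argument names may vary across diffusers versions, and the input image path is a placeholder.

```python
import torch
from diffusers import StableUnCLIPImg2ImgPipeline
from diffusers.utils import load_image

# Load the SD 2.1 unCLIP checkpoint from the HuggingFace page above.
pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-unclip", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The input image acts as the "prompt": it is encoded into a CLIP image
# embedding, which conditions the 768x768 diffusion model.
init_image = load_image("input.png")  # placeholder path

# Each call samples a new variation of the input image.
images = pipe(init_image).images
images[0].save("variation.png")
```

Running the pipeline repeatedly (or with a different seed) produces different variations, since only the CLIP embedding of the input image is kept, not its spatial structure.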

Blog post: https://stability.ai/blog/stable-diffusion-reimagine


Published by AI Generated Robotic Content
Tags: ai images
