Stable Diffusion is changing everything

Open source is accelerating progress in AI image generation, and it is incredible. Back in August 2022, Stable Diffusion was released and made open source by Stability AI and Runway. This gave developers the ability to generate images using AI with a latent text-to-image diffusion model. The latent diffusion model was trained on a subset of images in the LAION-5B database and is relatively lightweight. Since the release of the original open source project, numerous forks of it have popped up, and developers are extending it further and faster than ever.

Many community repositories are adding user interfaces on top of Stable Diffusion, making it easy for non-technical people to generate images with the AI. The content below outlines some examples and insights based on my findings around Stable Diffusion.

Invoke AI

Invoke AI is a fork of Stable Diffusion that provides end users with a Stable Diffusion toolkit that comes with some serious features. With this repository you’ll be able to run a web server and enter a prompt that the AI will use to generate images.

Basic image generation

After going through all installation and setup steps, you can start up the web server with the command below:

python scripts/dream.py --web

Once the web server is running, you are able to use a web interface for generating images. For instance, entering the prompt "leaves changing colors falling off a tree in the fall season" will generate an image to match:

leaves changing colors falling off a tree in the fall season

Image generation using a reference image

Alongside generating images from nothing, you can also provide a reference image, give it a description, and see how the AI generates a new image for you.

Take this reference image, for example:

mountains

You are able to generate new images based on this reference image. Below is an example of the CLI command that you can use instead of the web server:

"photo realistic snowy mountains in the clouds" --init_img=./init-images/mountains.png --strength=0.85 --steps=200 -n4 --cfg_scale=15

With this prompt, it generated 4 new images:

mountains with a beautiful sky mountains that looks more abstract mountains
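To get a feel for what two of the flags in that command do, here is a rough Python sketch. It is not Invoke AI's code, and the function names are hypothetical. It assumes the behavior of the original Stable Diffusion img2img script, where --strength decides how many of the requested --steps are actually run on the reference image, and it uses the standard classifier-free guidance formula to show how --cfg_scale pushes the result toward the prompt.

```python
from typing import List, Sequence


def img2img_denoising_steps(strength: float, steps: int) -> int:
    """Hypothetical helper: number of denoising steps run on the reference image.

    Assumption: mirrors the original Stable Diffusion img2img script, which
    noises the reference image part of the way and then denoises it for
    int(strength * steps) steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return int(strength * steps)


def apply_guidance(
    uncond: Sequence[float], cond: Sequence[float], cfg_scale: float
) -> List[float]:
    """Hypothetical helper: classifier-free guidance on two noise predictions.

    uncond is the prediction without the prompt, cond is the prediction with
    it. A larger cfg_scale moves the result further toward the prompt.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    # Values taken from the command above: --strength=0.85 --steps=200
    print(img2img_denoising_steps(0.85, 200))
    # Toy two-value "noise predictions" with --cfg_scale=15
    print(apply_guidance([0.0, 1.0], [1.0, 1.0], 15))
```

Under these assumptions, a strength close to 1.0 leaves little of the reference image intact, while a lower strength keeps the output closer to the original mountains.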

Face Restoration with GFPGAN and Real-ESRGAN

A really exciting feature is the ability to restore realistic details in very low-quality images. The toolkit provides face restoration and upscaling by utilizing GFPGAN and Real-ESRGAN.

face restoration image

Insights

The world of AI is rapidly progressing, and there is a lot of interest in AI image generation thanks to the open sourcing of Stable Diffusion. It has provided developers with a means to run AI image generation without needing a deep understanding of AI engineering.

A tool for artists

With this in mind, I believe we will see a new era of creation through AI image and video generation like we've never seen. We've started to see this throughout communities across the web that are generating "digital art" using image generation AI. This could lead to serious disruption in the art industry and could reshape the way we create art as a whole. As with most change, there will be those who argue that an artificial intelligence cannot generate true art. However, I urge skeptics to consider how this technology can elevate their craft and be used as a tool to do things they would previously not have thought possible.

For instance, Sir Peter Jackson, film director, screenwriter, and producer, used artificial intelligence and machine learning to restore century-old World War I footage in his critically acclaimed documentary They Shall Not Grow Old. The technology turned black and white video into full color, which made for a more immersive and moving experience for the viewer. This gives viewers a glimpse into reality over a hundred years ago.

They Shall Not Grow Old (source: https://www.imdb.com/title/tt7905466/)

Concerns around Deepfakes

There is no doubt that artificial intelligence (AI) has made some incredible breakthroughs in recent years, particularly in the realm of image and video generation. However, with these advances comes a growing concern over the use of AI to create deepfake images and videos, which could be used to spread misinformation and cause major problems in the real world.

To ensure that these breakthroughs are used for good, it is crucial that the tech community takes great care in how we innovate. We must be mindful of the potential risks and make sure that we are always working towards creating a safer, more transparent world. With deepfakes, as with any other new technology, we have a responsibility to use our powers for good. Let's make sure we do just that.

Thanks for reading!

Jonathan