Stable Diffusion
What is Stable Diffusion?
Stable Diffusion is a deep learning-based model developed by Stability AI that generates high-quality images from text descriptions. It's part of the broader category of "text-to-image" models, which use neural networks to interpret and visualize the content described in a text prompt. Unlike traditional methods, Stable Diffusion can create highly detailed and varied images, making it valuable for creative tasks like digital art, concept design, and visual storytelling. The model's ability to understand and render complex scenes makes it a powerful tool in AI-driven content creation.
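The "diffusion" in the name refers to how these models are trained: an image is gradually corrupted with noise, and the network learns to reverse that corruption step by step. A toy NumPy sketch of the forward-noising idea (a deliberately simplified illustration, not the model's actual schedule):

```python
import numpy as np

rng = np.random.default_rng(0)

# A "clean image": here just a 1-D signal standing in for pixel values.
signal = np.linspace(-1.0, 1.0, 64)

# Forward diffusion: blend in Gaussian noise a little more at each step,
# so the signal gradually becomes indistinguishable from pure noise.
def add_noise(x, t, num_steps=50):
    alpha = 1.0 - t / num_steps          # fraction of signal kept at step t
    noise = rng.standard_normal(x.shape)
    return np.sqrt(alpha) * x + np.sqrt(1.0 - alpha) * noise

slightly_noisy = add_noise(signal, t=5)   # still close to the original
mostly_noise = add_noise(signal, t=45)    # almost pure noise

# A diffusion model is trained to run this process in reverse: starting
# from pure noise, it removes a little noise each step, guided by the
# text prompt, until a coherent image remains.
```

Generation then amounts to running the learned reverse process from random noise, which is why the step count and the seed (covered below) matter so much.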
See the pixaroma YouTube channel for step-by-step instructions.

FIND THE RIGHT MODEL
You can get a sense of what each model produces by browsing example works online; Civitai is one of the most popular sites for downloading models.
Note that each model is tuned for different needs, such as generating humanoids, robots, or a specific product design.
It is worth testing several models to see which one comes closest to generating what you want.

Model Experiment
1. Install Stable Swarm:
Download the Windows installer and save it to the D drive, or any secondary drive partition that does not contain the Windows operating system or program data. Run the installer and allow it to proceed; the Command Prompt (cmd) will run additional updates automatically. Once the updates finish, the Command Prompt closes and an installation window appears, where you can select your desired options and complete the installation.
(Make sure you have an RTX 20-series or newer graphics card.)
2. Download any desired model online:
Search for models online or browse Civitai; note that some models require you to register before downloading.

Setting the Core Parameters
Testing Core Parameters provides different outcomes based on the prompt you've entered. Below, you'll find key options to adjust:
Images: The number of images to generate per batch (1-4).
Seed: Fixes the random noise the generation starts from; the same seed with the same settings reproduces the same image. Setting it to -1 picks a random seed each time.
Steps: The number of denoising steps, which controls the quality of the generated images. A setting of 40 usually suffices.
CFG Scale: Adjusts how strongly the prompt steers the generation. Higher values follow the prompt more strictly and increase contrast; too high can lead to distorted or "burnt" images, while too low can cause nonsensical results. A value of 7 is a good starting point, with normal usage ranging from 5 to 9.
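The parameter conventions above can be sketched as a small helper. This is illustrative only (the function and parameter names are hypothetical, not StableSwarm's actual API), but it captures the -1 seed convention and the usual CFG range:

```python
import random

# Hypothetical helper (not StableSwarm's internals): resolve UI-style
# parameters into concrete values before generation.
def resolve_params(images=1, seed=-1, steps=40, cfg_scale=7.0):
    if not 1 <= images <= 4:
        raise ValueError("images must be between 1 and 4 per batch")
    # A seed of -1 means "pick a fresh random seed for this run".
    if seed == -1:
        seed = random.randint(0, 2**32 - 1)
    if not 5.0 <= cfg_scale <= 9.0:
        print(f"warning: cfg_scale={cfg_scale} is outside the usual 5-9 range")
    return {"images": images, "seed": seed, "steps": steps, "cfg_scale": cfg_scale}

params = resolve_params(images=2, seed=-1, steps=40, cfg_scale=7.0)
```

Recording the resolved seed is what lets you reproduce a good result later: rerun with the same seed, prompt, and settings and you get the same image.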

Init image
1. Image to Image:
In the latest beta release of Stable Swarm, the 'Image to Image' option has been renamed 'Init Image'. When selecting an image to develop, it is recommended to upload one that resembles a plain white (clay-render) model with minimal materials defined, for optimal results.
2. Try a prompt:
"An exterior eye-level view of a residential villa by Lake Washington in the Pacific Northwest modern style with a landscape-integrated design. The villa features a linear form with a glass curtain wall and red cedar wood facade, bathed in bright sunlight. The scene is captured on a clear sunny day with the serene lake in the background, shot using a Canon EOS 5D Mark IV paired with a Canon EF 16-35mm f/2.8L II USM lens"
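Why the init image matters can be seen in how image-to-image works: instead of starting from pure noise, generation starts from a noised copy of your upload, and a "strength"-style setting controls how much noise is blended in. A toy NumPy sketch of that idea (a simplified illustration, not StableSwarm's internals):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for an uploaded init image: a flat 8x8 grayscale patch.
init_image = np.full((8, 8), 0.5)

# Illustrative sketch: image-to-image denoising starts from the init
# image with noise blended in. Higher strength adds more noise, so the
# result may drift further from the input.
def noised_start(image, strength):
    noise = rng.standard_normal(image.shape)
    return (1.0 - strength) * image + strength * noise

close_to_input = noised_start(init_image, strength=0.3)
mostly_new = noised_start(init_image, strength=0.9)
```

This is why a clean white-model upload works well: low-strength runs stay close to its geometry while the prompt fills in materials and lighting.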

ControlNet
ControlNet influences the structure of your image: this advanced extension enables more precise control over image generation by letting users provide additional inputs, such as images, sketches, or specific features, to steer the generation process in a more detailed and targeted manner. For this experiment, we will use MLSD and Canny.
MLSD:
MLSD (Mobile Line Segment Detection) is a technique that detects the straight line segments in an image and turns them into a line map.

MLSD is used to preserve the linear structure of an image during generation, making it especially suited to architectural and interior scenes where straight edges matter.
Canny:
When using Canny ControlNet, you can input an image, and the model uses the Canny edge detection algorithm to create an edge map of the image. This edge map is then used as a guide to control the generation process in Stable Diffusion, ensuring that the generated image closely follows the structure and edges of the original input image.
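A hedged illustration of the edge-map idea, using a simplified gradient-magnitude detector as a stand-in for full Canny (which additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding):

```python
import numpy as np

# A synthetic input: a bright square on a dark background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0

# Simplified stand-in for Canny: horizontal and vertical intensity
# gradients combined into a gradient magnitude, then thresholded.
gy, gx = np.gradient(image)
magnitude = np.hypot(gx, gy)
edge_map = (magnitude > 0.25).astype(np.uint8)

# The edge map is 1 along the square's outline and 0 elsewhere;
# ControlNet conditions on a map like this so the generated image
# keeps the same structure and edges as the input.
```

In practice you would feed your uploaded image through the Canny preprocessor in the ControlNet panel and the resulting edge map, not the original image, is what constrains generation.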