SDXL training

Overview

Stable Diffusion XL (SDXL) is a second-generation image generation model tailored toward more photorealistic output, with more detailed imagery and composition. SDXL can generate realistic faces, legible text within images, and better image composition, all while using shorter and simpler prompts.

Note: LoRA + Text-embedding is currently the only option for fine-tuning SDXL.

Input training images

Output images

Training tips

The default token for SDXL should be ohwx. It is set automatically if none is specified.

Inference tips

Do not copy and paste prompts from SD15.
Do not use textual inversions from SD15, such as easynegative or badhands.
Consider activating face-swap and face-inpainting (which in turn requires super-resolution). This gives the biggest boost in similarity to the subject.
Use clean, short, concise prompts, usually up to 15 words.
Avoid long negatives, as they decrease similarity to the subject.
Start with baseline SDXL 1.0 inference before trying other base models. Most custom SDXL models are biased and may reduce similarity. Models we have found to work reasonably well are ZavyChromaXL and ClearChromaXL.

All of the above tips help increase similarity to the original subject.
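The prompt-related tips above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of any official tooling: the function name `check_sdxl_prompt` and the `MAX_NEGATIVE_WORDS` threshold are assumptions made for illustration (the tips only say to avoid "long" negatives, without giving a number), while the 15-word limit and the two textual-inversion names come from the tips themselves.

```python
MAX_PROMPT_WORDS = 15  # "usually up to 15 words", per the tips above
MAX_NEGATIVE_WORDS = 10  # assumption: the tips give no number for "long" negatives
SD15_TEXTUAL_INVERSIONS = ("easynegative", "badhands")  # examples named in the tips


def check_sdxl_prompt(prompt, negative_prompt="", max_negative_words=MAX_NEGATIVE_WORDS):
    """Return a list of warnings for a prompt; an empty list means no issues found.

    Hypothetical helper: it only applies the heuristics listed in the tips.
    """
    warnings = []
    prompt_words = prompt.split()
    negative_words = negative_prompt.split()

    if len(prompt_words) > MAX_PROMPT_WORDS:
        warnings.append(
            f"prompt has {len(prompt_words)} words; aim for at most {MAX_PROMPT_WORDS}"
        )
    if len(negative_words) > max_negative_words:
        warnings.append(
            f"negative prompt has {len(negative_words)} words; long negatives reduce similarity"
        )

    # Look for SD15 textual inversions in either the prompt or the negative prompt.
    combined = f"{prompt} {negative_prompt}".lower()
    for name in SD15_TEXTUAL_INVERSIONS:
        if name in combined:
            warnings.append(f"remove SD15 textual inversion '{name}'")

    return warnings
```

For example, `check_sdxl_prompt("ohwx man, studio portrait, soft light")` returns an empty list, while passing `negative_prompt="easynegative, badhands"` yields one warning per textual inversion.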

Aspect ratios

The aspect ratios below are recommended for SDXL inference because they were also used during training.

aspect: width, height
0.5: 704, 1408
0.52: 704, 1344
0.57: 768, 1344
0.6: 768, 1280
0.68: 832, 1216
0.72: 832, 1152
0.78: 896, 1152
0.82: 896, 1088
0.88: 960, 1088
0.94: 960, 1024
1.0: 1024, 1024
1.07: 1024, 960
1.13: 1088, 960
1.21: 1088, 896
1.29: 1152, 896
1.38: 1152, 832
1.46: 1216, 832
1.67: 1280, 768
1.75: 1344, 768
1.91: 1344, 704
2.0: 1408, 704
2.09: 1472, 704
2.4: 1536, 640
2.5: 1600, 640
2.89: 1664, 576
3.0: 1728, 576
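To pick a resolution from the table above for an arbitrary target size, one option is to choose the entry whose width/height ratio is closest to the target's. The sketch below is illustrative only: the helper name `closest_sdxl_resolution` is hypothetical, and using the absolute difference of ratios as the distance measure is an assumption rather than a documented rule.

```python
# (width, height) pairs copied from the table above.
SDXL_RESOLUTIONS = [
    (704, 1408), (704, 1344), (768, 1344), (768, 1280), (832, 1216),
    (832, 1152), (896, 1152), (896, 1088), (960, 1088), (960, 1024),
    (1024, 1024), (1024, 960), (1088, 960), (1088, 896), (1152, 896),
    (1152, 832), (1216, 832), (1280, 768), (1344, 768), (1344, 704),
    (1408, 704), (1472, 704), (1536, 640), (1600, 640), (1664, 576),
    (1728, 576),
]


def closest_sdxl_resolution(width, height):
    """Return the recommended (width, height) whose aspect ratio is nearest to width / height.

    Hypothetical helper; distance is the absolute difference between ratios.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = width / height
    return min(SDXL_RESOLUTIONS, key=lambda wh: abs(wh[0] / wh[1] - target))


if __name__ == "__main__":
    # A 16:9 target maps to the 1.75 entry in the table.
    print(closest_sdxl_resolution(1920, 1080))
```

The returned pair can then be used as the width and height of the inference request.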

API usage

See here for API usage
