Update 3: Enhanced Model with Captions

I have released a new model trained with captions. The key changes were setting the caption extension to “.txt” in the GUI and setting alpha to 64. Training time and results are close to those of the previous models. Prompt adherence is noticeably better when training with captions, and likeness is reached sooner. See the Saori Hara – v1.0 model trained with captions, a Stable Diffusion LoRA on Civitai. Feedback on which version you prefer is welcome.

Update 2: Exploring Caption Training Settings

I am still working out the best settings for training with captions, but so far the results are not as good as training without captions. New models trained without captions have been released, and a caption-trained version is in progress. When training without captions, limiting the dataset to 30 face photos and 20 high-quality body photos and training for approximately 1500 steps gives better results and reduces overfitting.

Update: Rectifying Guide Errors

The guide contained errors. In particular, kohya_ss training did not read the captions because the caption format was set incorrectly. To fix this, change the caption format to “.txt” in the GUI. Also, when naming the image folder (e.g., “15_j03d4”), be sure to append the class name at the end (e.g., “15_j03d4 man”). Models trained with these faults still give acceptable results, but I will use the corrected workflow from now on. My apologies for any inconvenience.

Additional Notes:

  • Training a single LoRA on an RTX 3080 takes approximately one hour.
  • ADetailer is essential for correcting images where the face is not the focal point.
  • This guide reflects my own experiments based on various sources, and there is room for improvement (e.g., handling moles and facial marks on celebrities in the final LoRA).
  • Feedback and suggestions for improving the workflow are welcome.

Workflow Steps:

1. Image Collection:

  • Gather 30-50 face images showcasing various angles, lighting, and hairstyles.
  • Obtain 10-30 body images with diverse angles, lighting, and clothing.
  • Crop face images to 512×512 and body images to 512×768.
  • Organize images into folders with a standardized naming convention.
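
The crop sizes above (512×512 for faces, 512×768 for bodies) come down to simple aspect-ratio arithmetic. The following is a minimal sketch, not part of the original workflow, that computes a centered crop box for a given target size. The function name center_crop_box is hypothetical, and a centered crop is an assumption: for many photos you will still want to position the crop by hand so the face is not cut off.

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) of the largest centered region
    with the target aspect ratio that fits inside a width x height image."""
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Image is wider than the target shape: keep full height, trim the sides.
        crop_h = height
        crop_w = round(height * target_ratio)
    else:
        # Image is taller than (or equal to) the target shape:
        # keep full width, trim top and bottom.
        crop_w = width
        crop_h = round(width / target_ratio)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


if __name__ == "__main__":
    # A 1920x1080 photo cropped for a 512x512 face image.
    print(center_crop_box(1920, 1080, 512, 512))
    # A 1000x2000 photo cropped for a 512x768 body image.
    print(center_crop_box(1000, 2000, 512, 768))
```

If Pillow is installed, the returned box can be passed to Image.crop and the result resized to the target size with Image.resize.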

2. Folder Naming:

  • Use a format like <Shortened Name> (e.g., Joe Danger becomes j03d4).
  • Create subfolders for face and body images, prefixing each folder name with its repeat count and ending it with the class name (e.g., “15_j03d4 man”).
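
Since a wrongly named folder was one of the errors corrected at the top of this guide, it can help to build and check folder names in code. This is a small sketch assuming the “<repeats>_<name> <class>” pattern shown in the “15_j03d4 man” example. The helper names make_folder_name and parse_folder_name are hypothetical.

```python
import re

# <repeats>_<name> <class>, e.g. "15_j03d4 man"
FOLDER_PATTERN = re.compile(r"^(\d+)_(\S+) (.+)$")


def make_folder_name(repeats, name, class_name):
    """Build an image folder name from repeat count, short name and class."""
    return f"{repeats}_{name} {class_name}"


def parse_folder_name(folder_name):
    """Split a folder name into (repeats, name, class_name).

    Raises ValueError if the class name is missing or the format is wrong.
    """
    match = FOLDER_PATTERN.match(folder_name)
    if match is None:
        raise ValueError(f"Unexpected folder name: {folder_name!r}")
    return int(match.group(1)), match.group(2), match.group(3)


if __name__ == "__main__":
    print(make_folder_name(15, "j03d4", "man"))
    print(parse_folder_name("15_j03d4 man"))
```

Calling parse_folder_name("15_j03d4") raises ValueError, which flags the missing class name before any training time is spent.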

3. Captioning and Tag Management:

  • Apply kohya_ss and WD14 captioning methods for face and body images.
  • Use BooruDatasetTagManager to remove unwanted tags.
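
BooruDatasetTagManager does the tag cleanup interactively. For bulk removal, a short script can do the same job. The sketch below assumes captions are stored as comma-separated tags in “.txt” files next to the images. The function names and the example tags are hypothetical, and the script overwrites caption files in place, so work on a copy of the dataset.

```python
from pathlib import Path


def remove_tags(caption, unwanted):
    """Drop unwanted tags (case-insensitive) from a comma-separated caption."""
    unwanted_lower = {tag.lower() for tag in unwanted}
    tags = [tag.strip() for tag in caption.split(",")]
    kept = [tag for tag in tags if tag and tag.lower() not in unwanted_lower]
    return ", ".join(kept)


def clean_caption_files(folder, unwanted, extension=".txt"):
    """Rewrite every caption file in folder without the unwanted tags.

    Returns the number of files that were changed. Spacing between tags
    is normalized to ", " as a side effect.
    """
    changed = 0
    for path in sorted(Path(folder).glob(f"*{extension}")):
        original = path.read_text(encoding="utf-8").strip()
        cleaned = remove_tags(original, unwanted)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    # Example tags only; choose the tags that are unwanted in your dataset.
    print(remove_tags("1girl, solo, watermark, smile", {"watermark"}))
```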

4. Folder Structure:

  • Establish ‘model’ and ‘logs’ folders within the parent folder.
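
The folder layout can also be created with a few lines of code. This sketch makes the ‘model’ and ‘logs’ folders from this step plus the image subfolders from step 2. The parent folder name “image” and the function name create_training_folders are assumptions, so rename them to match your own setup.

```python
from pathlib import Path


def create_training_folders(parent, image_subfolders):
    """Create image/<subfolder>, model and logs folders under parent.

    Returns the list of created (or already existing) folder paths.
    """
    parent = Path(parent)
    folders = [parent / "image" / name for name in image_subfolders]
    folders += [parent / "model", parent / "logs"]
    for folder in folders:
        # parents=True also creates the parent and image folders if needed.
        folder.mkdir(parents=True, exist_ok=True)
    return folders


if __name__ == "__main__":
    # Hypothetical project path and a single image subfolder.
    for folder in create_training_folders("lora_project/j03d4", ["15_j03d4 man"]):
        print(folder)
```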

5. Directory Configuration:

  • Update directories in the provided JSON file with paths for images, model, and logs.
  • Adjust settings for specific GPUs.
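
Editing the paths by hand is easy to get wrong, so here is a hedged sketch that rewrites them with Python's json module. The key names train_data_dir, output_dir and logging_dir are assumptions about the settings file. Open your own JSON file and check the actual key names before using it. The function name update_config_paths is hypothetical.

```python
import json
from pathlib import Path


def update_config_paths(config_path, image_dir, model_dir, log_dir, output_path=None):
    """Load a JSON settings file, replace the three directory entries, save it.

    Key names are assumed; adjust them to match your settings file.
    Writes to output_path if given, otherwise overwrites config_path.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    config["train_data_dir"] = str(image_dir)   # assumed key for the images folder
    config["output_dir"] = str(model_dir)       # assumed key for the model folder
    config["logging_dir"] = str(log_dir)        # assumed key for the logs folder
    target = Path(output_path) if output_path else Path(config_path)
    target.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

All other settings in the file are left untouched, so GPU-specific values still need to be adjusted separately.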

6. Regularization Images:

  • Download regularization images from GitHub and place them in a designated folder.
  • Organize regularization images in line with the suggested folder structure.

7. Training Configuration:

  • Train for 3 epochs (adjustable).
  • Aim for 1500-2000 steps during training.
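
To check whether a dataset lands in the 1500-2000 step range, the step count can be estimated before training. The sketch below assumes the usual relationship: images times repeats, divided by batch size, times epochs, and doubled when regularization images are used. The doubling factor and the repeat values in the example are assumptions, so compare the estimate with the step count the trainer reports at startup.

```python
import math


def estimate_steps(folders, epochs, batch_size=1, uses_reg_images=False):
    """Estimate total training steps.

    folders is a list of (image_count, repeats) pairs, one per image folder.
    """
    images_times_repeats = sum(count * repeats for count, repeats in folders)
    # Assumption: regularization images double the number of steps.
    reg_factor = 2 if uses_reg_images else 1
    return math.ceil(images_times_repeats / batch_size * epochs * reg_factor)


if __name__ == "__main__":
    # Hypothetical example: 30 face images and 20 body images,
    # 10 repeats each, trained for 3 epochs at batch size 1.
    print(estimate_steps([(30, 10), (20, 10)], epochs=3))
```

If the estimate is outside the target range, adjust the repeat counts in the folder names or the number of epochs.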

8. Final Directory Structure:

  • Ensure proper organization of folders, subfolders, and files.

Note: Install and configure the tools and libraries mentioned in the guide, and adjust folder names and paths to match your own setup.