TY - GEN
T1 - ColorizeDiffusion
T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
AU - Yan, Dingkun
AU - Yuan, Liang
AU - Wu, Erwin
AU - Nishioka, Yuma
AU - Fujishiro, Issei
AU - Saito, Suguru
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Diffusion models have achieved great success in dual-conditioned image generation. However, they still face significant challenges in image-guided sketch colorization, where reference and sketch images usually exhibit different semantics and spatial structures. This mismatch, termed 'distribution shift' in this paper, results in various artifacts and degrades the colorization quality. To address this issue, we conducted thorough investigations into the image-prompted latent diffusion model and developed a two-stage training framework to mitigate the effects of distribution shift based on our analysis. Comprehensive quantitative comparisons, qualitative evaluations, and user studies were performed to demonstrate the superiority of our proposed methods. Additionally, ablation studies were conducted to assess the impact of the distribution shift and the selection of reference embeddings. Code is publicly available at https://github.com/tellurion-kanata/colorizeDiffusion.
AB - Diffusion models have achieved great success in dual-conditioned image generation. However, they still face significant challenges in image-guided sketch colorization, where reference and sketch images usually exhibit different semantics and spatial structures. This mismatch, termed 'distribution shift' in this paper, results in various artifacts and degrades the colorization quality. To address this issue, we conducted thorough investigations into the image-prompted latent diffusion model and developed a two-stage training framework to mitigate the effects of distribution shift based on our analysis. Comprehensive quantitative comparisons, qualitative evaluations, and user studies were performed to demonstrate the superiority of our proposed methods. Additionally, ablation studies were conducted to assess the impact of the distribution shift and the selection of reference embeddings. Code is publicly available at https://github.com/tellurion-kanata/colorizeDiffusion.
KW - image generation
KW - sketch colorization
UR - https://www.scopus.com/pages/publications/105003625586
UR - https://www.scopus.com/inward/citedby.url?scp=105003625586&partnerID=8YFLogxK
U2 - 10.1109/WACV61041.2025.00498
DO - 10.1109/WACV61041.2025.00498
M3 - Conference contribution
AN - SCOPUS:105003625586
T3 - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
SP - 5092
EP - 5102
BT - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 February 2025 through 4 March 2025
ER -