Paper ID: 2412.05325
The Role of Text-to-Image Models in Advanced Style Transfer Applications: A Case Study with DALL-E 3
Ebubechukwu Ike
While DALL-E 3 has gained popularity for generating creative and complex images from textual descriptions, its application to style transfer remains underexplored. This project investigates the integration of DALL-E 3 with traditional neural style transfer techniques to assess how generated style images affect the quality of the final output. DALL-E 3 was used to generate style images from textual descriptions, and these images were then applied to content images with the Magenta Arbitrary Image Stylization model. The integration is evaluated with the Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR), as well as processing-time measurements. The findings reveal that DALL-E 3 significantly enhances the diversity and artistic quality of stylized images. Although the style transfer step itself takes slightly longer, the overall processing time with DALL-E 3 is about 2.5 seconds shorter than with traditional methods, which makes the approach both efficient and visually superior.
Submitted: Dec 4, 2024
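
The abstract names SSIM and PSNR as the evaluation metrics but does not say how they were computed. As a rough illustration only, the sketch below computes PSNR and a simplified single-window ("global") SSIM with NumPy. The function names psnr and global_ssim, the 8-bit value range (max_value=255), and the synthetic test images are all assumptions made for this example. Standard SSIM implementations average the index over local sliding windows, so the global variant here will generally give different numbers from those tools and is not the authors' evaluation code.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images."""
    x = np.asarray(reference, dtype=np.float64)
    y = np.asarray(test, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no noise at all
    return 10.0 * np.log10(max_value ** 2 / mse)


def global_ssim(reference, test, max_value=255.0):
    """Simplified SSIM using whole-image statistics (no sliding window)."""
    x = np.asarray(reference, dtype=np.float64)
    y = np.asarray(test, dtype=np.float64)
    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic stand-ins for a content image and its stylized output.
    rng = np.random.default_rng(0)
    content = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
    stylized = np.clip(content + rng.normal(0, 20, size=content.shape), 0, 255)
    print("PSNR (dB):", psnr(content, stylized))
    print("Global SSIM:", global_ssim(content, stylized))
```

In a style transfer setting, both metrics are usually computed between the content image and the stylized result, so higher values indicate better preservation of the original structure, not better style.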