Opt-In Art: Learning Art Styles Only from Few Examples

Hui Ren*, Joanna Materzynska, Rohit Gandikota, David Bau, Antonio Torralba
1UIUC 2MIT 3Northeastern University

*Co-first authors

Creative AI Track NeurIPS 2025
[Teaser figure]

(a) We introduce Blank Canvas Diffusion, a carefully curated text-to-image model trained only on photographs, serving as the pretraining foundation for our model.

(b) We explore whether a model with no prior exposure to paintings can learn artistic styles using a LoRA Art Adapter trained on a small opt-in sample of an artist's work.

(c) We find that a model trained without paintings can generalize an artistic style from only a few examples through this adapter approach.

We explore whether pre-training on datasets with paintings is necessary for a model to learn an artistic style with only a few examples. To investigate this, we train a text-to-image model exclusively on photographs, without access to any painting-related content.

We show that it is possible to adapt a model trained without paintings to an artistic style, given only a few examples. User studies and automatic evaluations confirm that our model (post-adaptation) performs on par with state-of-the-art models trained on massive datasets that contain artistic content such as paintings, drawings, or illustrations.

Finally, using data attribution techniques, we analyze how both artistic and non-artistic datasets contribute to generating artistic-style images. Surprisingly, our findings suggest that high-quality artistic outputs can be achieved without prior exposure to artistic data, indicating that artistic style generation can occur in a controlled, opt-in manner using only a limited, carefully selected set of training examples.

Blank Canvas Diffusion

Distinguishing "art" from "not art" in natural images is challenging, as artistic expression often emerges in unexpected places, from sculptural designs to logos and branding on everyday objects. Our objective is to separate visual art from natural imagery, ensuring everyday scenes and objects are represented while minimizing intentional artistic elements. This figure illustrates our approach to defining the boundary between art and non-art images. We exclude graphic arts but retain other forms, such as architecture. The spectrum ranges from "definitely art" (e.g., photographs of tapestries, baroque architecture, or paintings) to "maybe art" (e.g., accidental art that isn't the main subject). It further includes "maybe not art" (e.g., artistic elements in daily objects like doors, signboards, or decorative cakes) and "definitely not art" (e.g., nature or landscapes).
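The boundary above can be made concrete with a simple category rule. The sketch below is an illustrative assumption, not the actual curation pipeline (which involves judgment calls along the full spectrum); the category names are hypothetical. It encodes the stated policy: exclude graphic arts, retain other forms such as architecture.

```python
# Hypothetical category labels; the real pipeline judges images along the
# "definitely art" ... "definitely not art" spectrum described above.
GRAPHIC_ARTS = {"painting", "drawing", "illustration", "tapestry"}

def keep_for_blank_canvas(category: str) -> bool:
    """Exclude graphic arts; retain other forms (architecture, nature, everyday objects)."""
    return category not in GRAPHIC_ARTS
```

A rule like this keeps everyday scenes and incidental artistic elements in the dataset while filtering out intentional graphic art.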

We train Blank Canvas Diffusion on a dataset containing minimal graphic content. Our model is built on a latent diffusion architecture; to prevent any art-related knowledge from leaking through the text embeddings, we use a language-only text encoder based on BERT.

[Figure]

Method

To train an Art-Style Adapter, we collect a few examples of artworks in a specific style, X0 ∈ A, and caption the content of each artwork; this can be done automatically or manually. To connect the newly learned style with specific tokens in the prompt, we append the text "in the style of V* art" to the content prompt, yielding the styled prompt C*. To let the model learn this new artistic style, we fine-tune the U-Net module using LoRA. When prompted with a caption C* that includes the style prefix V*, the generated image should match the style of the small exemplar dataset. For example, if C* = "People walking along a riverside path with colorful trees in the style of V*", the image should reflect both the scene (content) and the specified artistic style. A content loss ensures that the visual elements of the prompt C = "People walking along a riverside path with colorful trees" are accurately depicted, while a style loss maintains the distinct artistic qualities associated with V*.
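The adapter update can be sketched numerically. The snippet below is a minimal NumPy illustration of a LoRA-style low-rank update on a frozen weight matrix; the class name, rank, and scaling are illustrative assumptions, not the paper's exact configuration. Only `A` and `B` would receive gradients, and the zero-initialized `B` means the adapted layer starts out identical to the base model:

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer with a trainable low-rank (LoRA) update: W + (alpha/r) * B @ A."""

    def __init__(self, weight: np.ndarray, rank: int = 4, alpha: float = 4.0, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.weight = weight                            # frozen base weight, shape (out, in)
        out_dim, in_dim = weight.shape
        self.A = rng.normal(0.0, 0.01, (rank, in_dim))  # trainable down-projection
        self.B = np.zeros((out_dim, rank))              # trainable up-projection, zero-init
        self.scale = alpha / rank

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Base path plus scaled low-rank path; with B = 0 the output matches the base layer.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T
```

Because only the few style examples drive updates into `A` and `B`, the photographic base model stays untouched and the adapter can be attached or removed at will.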

[Figure]

Demo

Evaluation

Our Blank Canvas Diffusion model shows limited style transfer with training-free methods, suggesting that traditional models may rely on inherent artistic biases. Unlike our model, traditional models have seen vast amounts of art, enabling them to internalize stylistic patterns for effective style transfer.

[Figure]

(Left) Results of the perceptual user study: the Blank Canvas Diffusion with Adapter method (green bar) is preferred over image-editing baselines, performs on par with the Adapter on the SD1.4 backbone, and is favored less than StyleAligned (SD1.4), though the margin of preference between the baselines is narrow. (Right) Quantitative evaluation of the baselines: Blank Canvas Diffusion with the Art Adapter achieves a good trade-off between style and content.

[Figure]

Qualitative Comparison - Image Stylization

Comparison of our method and other image stylization baselines for the artist Van Gogh. All captions contain the suffix “in the style of Vincent van Gogh / V* art”.

[Figure]

Qualitative Results

Results of art generation and image stylization, along with the training images, are shown in Figures 20–36. We demonstrate our model’s ability to replicate diverse artistic styles: Impressionism (Monet, van Gogh, Corot), Art Nouveau (Klimt), Fauvism (Derain), Abstract Expressionism (Matisse, Pollock, Richter), Abstract Art (Kandinsky), Cubism (Picasso, Gleizes), Pop Art (Lichtenstein, Warhol), Ukiyo-e (Hokusai), Expressionism (Escher), and Postmodern and Geometric Abstraction (Miró, Battiss). The captions and reference images are sampled from the LAION Pop dataset. The examples shown are generated after training on just 10–15 samples from the artist, representing the model's only exposure to the artistic style.



Data Attribution

We find that our Art Adapter can generalize from a small Art-Style training set and generate seemingly novel images that are coherent with the given artistic style. To better understand which training images contributed to a synthesized image, and to check whether the art filtering may have overlooked art content that influenced the result, we applied an off-the-shelf data attribution technique. For each generated image, we retrieved the top five attributed images from both the Blank Canvas Dataset and the Art-Style examples. While we expected stylistic elements to dominate, real-world influences from the Blank Canvas Dataset play a significant role: in style-inspired generations, distinctive artistic features capture the essence of the style, yet the attribution method uncovers real-world elements beneath, as though the style has been gently overlaid on the content.
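As a rough sketch of the retrieval step, a similarity-based attribution can be approximated by ranking training-image features against the generated image's feature. The function below is an illustrative stand-in for the off-the-shelf attribution method (which computes proper influence scores rather than plain cosine similarity), and the feature representation is an assumption:

```python
import numpy as np

def top_attributed(query_feat: np.ndarray, train_feats: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k training features most cosine-similar to the query feature."""
    q = query_feat / np.linalg.norm(query_feat)
    t = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = t @ q                      # cosine similarity of each training image to the query
    return np.argsort(-sims)[:k]      # indices of the top-k most similar training images
```

Applied once to the Blank Canvas features and once to the Art-Style features, a retrieval like this yields the per-source top-five lists discussed above.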


BibTeX

@misc{ren2025optinartlearningart,
        title={Opt-In Art: Learning Art Styles Only from Few Examples}, 
        author={Hui Ren and Joanna Materzynska and Rohit Gandikota and David Bau and Antonio Torralba},
        year={2025},
        eprint={2412.00176},
        archivePrefix={arXiv},
        primaryClass={cs.CV},
        url={https://arxiv.org/abs/2412.00176}, 
      }