Neural Style Transfer with pystiche

This example showcases how a basic Neural Style Transfer (NST), i.e. image optimization, could be performed with pystiche.


This is an example of how to implement an NST, not a tutorial on how NST works. As such, it will not explain why a specific choice was made or how a component works. If you have never worked with NST before, we strongly suggest you read the Gist first.


We start this example by importing everything we need and setting the device we will be working on.

import pystiche
from pystiche import demo, enc, loss, optim
from pystiche.image import show_image
from pystiche.misc import get_device, get_input_image

print(f"I'm working with pystiche=={pystiche.__version__}")

device = get_device()
print(f"I'm working with {device}")

Multi-layer Encoder

The content_loss and the style_loss operate on the encodings of an image rather than on the image itself. These encodings are generated by a pretrained encoder. Since we will be using encodings from multiple layers we load a multi-layer encoder. In this example we use the vgg19_multi_layer_encoder() that is based on the VGG19 architecture introduced by Simonyan and Zisserman [SZ2014] .

multi_layer_encoder = enc.vgg19_multi_layer_encoder()
print(multi_layer_encoder)
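The idea behind a multi-layer encoder can be sketched in plain PyTorch: a single forward pass records the activations of several named layers, so encodings for multiple losses are obtained without re-running the network once per layer. The `ToyMultiLayerEncoder` below is a hypothetical stand-in for illustration, not pystiche's implementation:

```python
import torch
from torch import nn


class ToyMultiLayerEncoder(nn.Module):
    """Toy stand-in for a multi-layer encoder (not pystiche's implementation)."""

    def __init__(self):
        super().__init__()
        # a tiny sequential network with named layers
        self.layers = nn.ModuleDict(
            {
                "conv1": nn.Conv2d(3, 4, 3, padding=1),
                "relu1": nn.ReLU(),
                "conv2": nn.Conv2d(4, 8, 3, padding=1),
                "relu2": nn.ReLU(),
            }
        )

    def forward(self, image, layers):
        # run the image through every layer once, recording the
        # activations of the requested layers along the way
        encs = {}
        x = image
        for name, layer in self.layers.items():
            x = layer(x)
            if name in layers:
                encs[name] = x
        return encs


encoder = ToyMultiLayerEncoder()
encs = encoder(torch.rand(1, 3, 16, 16), {"relu1", "relu2"})
```

This is why the losses below only name the layers they operate on: the shared encoder takes care of computing and caching the actual encodings.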

Perceptual Loss

The core components of every NST are the content_loss and the style_loss. Combined they make up the perceptual loss, i.e. the optimization criterion.

In this example we use the FeatureReconstructionLoss introduced by Mahendran and Vedaldi [MV2015] as content_loss. We first extract the content_encoder that generates encodings from the content_layer. Together with the content_weight we can construct the content_loss.

content_layer = "relu4_2"
content_encoder = multi_layer_encoder.extract_encoder(content_layer)
content_weight = 1e0
content_loss = loss.FeatureReconstructionLoss(
    content_encoder, score_weight=content_weight
)
print(content_loss)
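Conceptually, the feature reconstruction loss is the mean squared error between the encodings of the input image and the content image. A minimal sketch in plain PyTorch, with random tensors standing in for real VGG encodings (the shapes are illustrative):

```python
import torch
import torch.nn.functional as F


def feature_reconstruction(input_enc, target_enc):
    # mean squared error between the two encodings
    return F.mse_loss(input_enc, target_enc)


# random tensors standing in for encodings of the content_layer
target_enc = torch.rand(1, 512, 32, 32)
input_enc = torch.rand(1, 512, 32, 32)
score = feature_reconstruction(input_enc, target_enc)
```

Identical encodings give a loss of zero, so minimizing this term pushes the input image toward reproducing the content image's high-level features.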

We use the GramLoss introduced by Gatys, Ecker, and Bethge [GEB2016] as style_loss. Unlike before, we use multiple style_layers. The individual losses can be conveniently bundled in a MultiLayerEncodingLoss.

style_layers = ("relu1_1", "relu2_1", "relu3_1", "relu4_1", "relu5_1")
style_weight = 1e3


def get_style_op(encoder, layer_weight):
    return loss.GramLoss(encoder, score_weight=layer_weight)


style_loss = loss.MultiLayerEncodingLoss(
    multi_layer_encoder, style_layers, get_style_op, score_weight=style_weight,
)
print(style_loss)
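The Gram matrix that gives the GramLoss its name measures the correlations between feature channels while discarding the spatial layout, which is what makes it a useful style statistic. A minimal sketch of the computation (the normalization here is illustrative and may differ from pystiche's internals):

```python
import torch


def gram_matrix(encodings):
    # encodings: batch of feature maps with shape (B, C, H, W)
    batch, channels, height, width = encodings.shape
    # flatten the spatial dimensions: (B, C, H*W)
    features = encodings.view(batch, channels, height * width)
    # channel-wise correlations; normalizing by the number of spatial
    # positions makes the statistic independent of the image size
    return features.bmm(features.transpose(1, 2)) / (height * width)


gram = gram_matrix(torch.rand(2, 8, 16, 16))  # shape (2, 8, 8)
```

Because the spatial dimensions are summed out, Gram matrices from images of different sizes remain comparable, which is why the style image need not match the content image's dimensions.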

We combine the content_loss and style_loss into a joined PerceptualLoss, which will serve as the optimization criterion.

perceptual_loss = loss.PerceptualLoss(content_loss, style_loss).to(device)
print(perceptual_loss)


We now load and show the images that will be used in the NST. The images will be resized to size=500 pixels.

images = demo.images()
images.download()

size = 500


images.download() downloads all demo images upfront. If you only want to download the images for this example, remove this line; they will be downloaded at runtime instead.


If you want to work with other images you can load them with read_image():

from pystiche.image import read_image

my_image = read_image("my_image.jpg", size=size, device=device)
content_image = images["bird1"].read(size=size, device=device)
show_image(content_image, title="Content image")

style_image = images["paint"].read(size=size, device=device)
show_image(style_image, title="Style image")

Neural Style Transfer

After loading the images they need to be set as targets for the optimization criterion.

perceptual_loss.set_content_image(content_image)
perceptual_loss.set_style_image(style_image)

As a last preliminary step we create the input image. We start from the content_image since this way the NST converges quickly.

starting_point = "content"
input_image = get_input_image(starting_point, content_image=content_image)
show_image(input_image, title="Input image")


If you want to start from a white noise image, use starting_point = "random" instead:

starting_point = "random"
input_image = get_input_image(starting_point, content_image=content_image)

Finally we run the NST with image_optimization() for num_steps=500 steps.

In every step the perceptual_loss is calculated and propagated backward to the pixels of the input_image. If get_optimizer is not specified, as is the case here, the default_image_optimizer(), i.e. LBFGS, is used.

output_image = optim.image_optimization(input_image, perceptual_loss, num_steps=500)
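Under the hood, image optimization treats the pixels of the input image themselves as the parameters being optimized. A minimal sketch of such a loop with torch.optim.LBFGS, using a simple pixel-space target in place of the perceptual loss (the target and sizes here are purely illustrative):

```python
import torch

# stand-in for the perceptual loss: pull the image toward a fixed target
target = torch.full((1, 3, 8, 8), 0.5)

# the pixels of the input image are the optimization parameters
input_image = torch.rand(1, 3, 8, 8).requires_grad_(True)

optimizer = torch.optim.LBFGS([input_image], max_iter=10)


def closure():
    # LBFGS re-evaluates the loss several times per step,
    # so it requires a closure that computes it from scratch
    optimizer.zero_grad()
    step_loss = torch.mean((input_image - target) ** 2)
    step_loss.backward()
    return step_loss


for _ in range(5):  # num_steps
    optimizer.step(closure)
```

The real criterion is of course the perceptual_loss evaluated on VGG encodings rather than raw pixels, but the mechanics of the loop are the same.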

After the NST is complete we show the result.

show_image(output_image, title="Output image")


If you started with the basic NST example without pystiche this example hopefully convinced you that pystiche is a helpful tool. But this was just the beginning: to unleash its full potential head over to the more advanced examples.
