Perceptual loss

Identifying content and style is a core element of Neural Style Transfer (NST). How well two images agree in content and in style is measured by the content_loss and the style_loss, respectively.

Operators

In pystiche these losses are implemented as Operators. There are two types of Operator: RegularizationOperator and ComparisonOperator. A RegularizationOperator works without any context, while a ComparisonOperator compares two images. Furthermore, pystiche distinguishes two domains an Operator can work on, represented by PixelOperator and EncodingOperator. A PixelOperator operates directly on the input_image, while an EncodingOperator encodes it first.
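The two distinctions can be illustrated with a minimal, dependency-free sketch. It does not use the pystiche API: the function names, the toy encode function, the concrete penalties, and the nested-list image representation are all assumptions chosen for illustration. It only shows how the combinations differ in what they take as input and where they measure.

```python
from typing import List

Image = List[List[float]]  # toy grayscale image: rows of pixel values


def mean_squared_error(a: Image, b: Image) -> float:
    """Mean of squared element-wise differences between two equally sized grids."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def mean_square(grid: Image) -> float:
    """Mean of squared values of a grid (a simple magnitude penalty)."""
    values = [x ** 2 for row in grid for x in row]
    return sum(values) / len(values)


def total_variation(image: Image) -> float:
    """Simplified total-variation-style penalty: squared neighbour differences."""
    height, width = len(image), len(image[0])
    vertical = sum(
        (image[i + 1][j] - image[i][j]) ** 2
        for i in range(height - 1)
        for j in range(width)
    )
    horizontal = sum(
        (image[i][j + 1] - image[i][j]) ** 2
        for i in range(height)
        for j in range(width - 1)
    )
    return vertical + horizontal


def encode(image: Image) -> Image:
    """Hypothetical stand-in for an encoder: horizontal differences as a 'feature map'."""
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in image]


# Regularization: needs only the input image. Comparison: also needs a target.
# Pixel: works on the raw values. Encoding: works on encode(...) of the image(s).
def pixel_regularization(input_image: Image) -> float:
    return total_variation(input_image)


def encoding_regularization(input_image: Image) -> float:
    return mean_square(encode(input_image))


def pixel_comparison(input_image: Image, target_image: Image) -> float:
    return mean_squared_error(input_image, target_image)


def encoding_comparison(input_image: Image, target_image: Image) -> float:
    return mean_squared_error(encode(input_image), encode(target_image))


input_image = [[0.0, 1.0], [1.0, 0.0]]
target_image = [[0.0, 0.0], [0.0, 0.0]]
print(pixel_regularization(input_image))
print(encoding_regularization(input_image))
print(pixel_comparison(input_image, target_image))
print(encoding_comparison(input_image, target_image))
```

In a real NST setup the images and encodings would be tensors and the encoder a pretrained network. The nested lists here only keep the sketch self-contained.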

In total pystiche supports four archetypes:

Operator                         Builtin examples
-------------------------------  ------------------------------------
PixelRegularizationOperator      TotalVariationOperator [MV15]
EncodingRegularizationOperator   -
PixelComparisonOperator          -
EncodingComparisonOperator       FeatureReconstructionOperator [MV15]
                                 GramOperator [GEB16]
                                 MRFOperator [LW16]

Multi-layer encoder

One of the main improvements of NST over traditional approaches is that the agreement is measured not in pixel space or a handcrafted feature space, but in the learned feature space of a Convolutional Neural Network, called the encoder. Variants of the style_loss in particular depend on encodings, i.e. feature maps, from several layers of the encoder.
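As one example of a loss computed on feature maps, the following sketch contrasts a Gram-matrix-based style comparison in the spirit of [GEB16] with a plain feature reconstruction. It is a simplified illustration, not the pystiche implementation: the function names, the list-based feature maps (one row per channel, spatial positions flattened), and the normalization by the number of positions are assumptions.

```python
from typing import List

FeatureMap = List[List[float]]  # one row per channel, flattened spatial positions


def gram_matrix(features: FeatureMap) -> List[List[float]]:
    """Channel-by-channel inner products, normalized by the number of positions."""
    num_positions = len(features[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / num_positions for ch_j in features]
        for ch_i in features
    ]


def mean_squared_error(a: List[List[float]], b: List[List[float]]) -> float:
    """Mean of squared element-wise differences between two equally sized grids."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def feature_reconstruction_loss(input_features: FeatureMap, target_features: FeatureMap) -> float:
    """Compares the feature maps position by position."""
    return mean_squared_error(input_features, target_features)


def gram_loss(input_features: FeatureMap, target_features: FeatureMap) -> float:
    """Compares channel correlations, ignoring where in the image they occur."""
    return mean_squared_error(gram_matrix(input_features), gram_matrix(target_features))


features = [[1.0, 2.0], [3.0, 4.0]]
# Same activations with the two spatial positions swapped.
shuffled = [[2.0, 1.0], [4.0, 3.0]]

print(feature_reconstruction_loss(features, shuffled))
print(gram_loss(features, shuffled))
```

Swapping spatial positions changes the feature reconstruction score but leaves the Gram matrix untouched. This is why Gram-based comparisons are typically associated with style and position-wise comparisons with content.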

pystiche offers a MultiLayerEncoder that extracts all required encodings in a single forward pass. To apply the same operator to several layers of a MultiLayerEncoder, a MultiLayerEncodingOperator can be used.
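The idea of collecting several encodings in one pass can be sketched without any framework. The code below is not the MultiLayerEncoder API: extract_encodings, traced, multi_layer_score, the layer names, and the scalar "layers" are hypothetical and exist only to show the mechanism. Each layer runs at most once, and the outputs of the requested layers are kept along the way.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Layer = Tuple[str, Callable[[float], float]]

calls: List[str] = []  # records which layers actually ran


def traced(name: str, fn: Callable[[float], float]) -> Layer:
    """Wrap a toy layer so that every invocation is recorded in `calls`."""
    def layer(x: float) -> float:
        calls.append(name)
        return fn(x)
    return name, layer


def extract_encodings(layers: Sequence[Layer], x: float, wanted: Sequence[str]) -> Dict[str, float]:
    """Run the layers in order once and keep the outputs of the requested ones."""
    remaining = set(wanted)
    encodings: Dict[str, float] = {}
    for name, layer in layers:
        if not remaining:
            break  # deeper layers are not needed
        x = layer(x)
        if name in remaining:
            encodings[name] = x
            remaining.discard(name)
    return encodings


def multi_layer_score(input_encs: Dict[str, float], target_encs: Dict[str, float]) -> float:
    """Apply the same comparison (squared difference) to every layer and sum the results."""
    return sum((input_encs[name] - target_encs[name]) ** 2 for name in input_encs)


layers = [
    traced("conv1", lambda x: 2.0 * x),
    traced("relu1", lambda x: max(x, 0.0)),
    traced("conv2", lambda x: x - 3.0),
    traced("relu2", lambda x: max(x, 0.0)),
]

encodings = extract_encodings(layers, 1.0, wanted=["relu1", "conv2"])
print(encodings)  # outputs of the two requested layers
print(calls)      # each needed layer ran once; relu2 was skipped
```

Applying one comparison across all requested layers, as multi_layer_score does, mirrors the role described for a MultiLayerEncodingOperator. Any per-layer weighting is left out of the sketch.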

Perceptual loss

The PerceptualLoss combines all Operators into a single measure that acts as the joint optimization criterion. How the optimization is performed is detailed in the next section.
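A minimal sketch of such a joint criterion follows, assuming the individual scores are combined as a weighted sum. The function perceptual_loss, the toy content and style terms, and the weights are hypothetical and do not reproduce the PerceptualLoss class.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # toy image: a flat list of pixel values
Term = Tuple[float, Callable[[Image], float]]  # (weight, operator)


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def make_content_loss(target: Image) -> Callable[[Image], float]:
    """Toy content term: mean squared pixel difference to a fixed target."""
    def content_loss(image: Image) -> float:
        return mean([(x - t) ** 2 for x, t in zip(image, target)])
    return content_loss


def make_style_loss(target_mean: float) -> Callable[[Image], float]:
    """Toy style term: squared difference of a global statistic (the mean)."""
    def style_loss(image: Image) -> float:
        return (mean(image) - target_mean) ** 2
    return style_loss


def perceptual_loss(image: Image, terms: Sequence[Term]) -> float:
    """Assumed combination rule: weighted sum of all operator scores."""
    return sum(weight * operator(image) for weight, operator in terms)


terms = [
    (1.0, make_content_loss([1.0, 3.0])),
    (0.5, make_style_loss(5.0)),
]

print(perceptual_loss([1.0, 3.0], terms))  # matches the content target exactly
print(perceptual_loss([2.0, 4.0], terms))  # trades content agreement for style agreement
```

With these weights the second image scores lower overall although it no longer matches the content target. This is the kind of trade-off a joint criterion lets an optimizer make.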