Universal Style Transfer via Feature Transforms

Universal Style Transfer via Feature Transforms
Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang
UC Merced, Adobe Research, NVIDIA Research
Presented: Dong Wang (Refer to slides by Ibrahim Ahmed and Trevor Chan)
August 31, 2018

Problem
Transfer arbitrary visual styles to content images.

Related Work
Existing works often trade off between generalization, quality, and efficiency.
Not efficient during inference: Image Style Transfer Using Convolutional Neural Networks (CVPR 2016).
Style-specific networks: Perceptual Losses for Real-Time Style Transfer and Super-Resolution (ECCV 2016).
Poor generalization in terms of visual quality: Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization (ICCV 2017).

Proposed Method
Image reconstruction + feature transforms.
Train an autoencoder for image reconstruction, then fix its weights.
Apply the whitening/coloring transform in feature space.

Image Reconstruction
Encoder: VGG-19 trained on the ImageNet classification task.
Decoder: trained to reconstruct the image.
More than one decoder is trained for reconstruction: 5 decoders in total.
Image source: Li et al.

Loss Function for Reconstruction Decoder
Combination of a pixel reconstruction loss and a feature loss:
$L = \| I_o - I_i \|_2^2 + \lambda \, \| \Phi(I_o) - \Phi(I_i) \|_2^2$  (1)
$I_i$ and $I_o$ are the input image and the reconstruction output.
$\Phi(\cdot)$ is the VGG encoder.
$\lambda$ is the weight that balances the two losses.
Note: no style image is used during training.
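As a rough illustration of Eq. (1), the loss can be sketched in NumPy. The function name `reconstruction_loss`, the default `lam=1.0`, and the toy encoder `toy_phi` (a fixed random linear map followed by ReLU) are assumptions made for this example only; they stand in for the real VGG encoder, which is not reproduced here.

```python
import numpy as np

def reconstruction_loss(i_in, i_out, phi, lam=1.0):
    """Squared pixel error plus lam times squared feature error, as in Eq. (1)."""
    pixel_term = np.sum((i_out - i_in) ** 2)
    feature_term = np.sum((phi(i_out) - phi(i_in)) ** 2)
    return pixel_term + lam * feature_term

# Hypothetical stand-in for the VGG encoder (NOT the real network):
# a fixed random linear projection followed by ReLU.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 48))

def toy_phi(img):
    return np.maximum(W @ img.ravel(), 0.0)

img = rng.random((4, 4, 3))                          # toy "input image"
noisy = img + 0.1 * rng.standard_normal(img.shape)   # imperfect "reconstruction"

loss_same = reconstruction_loss(img, img, toy_phi)    # perfect reconstruction
loss_noisy = reconstruction_loss(img, noisy, toy_phi) # imperfect reconstruction
```

A perfect reconstruction gives zero loss, and any deviation in pixel space is penalized both directly and through the encoder features.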

Feature Transforms

Feature Transforms by Whitening/Coloring
Content features are transformed at intermediate levels using the statistics of the style features.
In each layer, the content features need to exhibit the same characteristics as the style features of the same layer.
WCT (Whitening/Coloring Transform) achieves this.

Whitening Transform
Transform a random vector (a) so that its components are uncorrelated and have unit variance:
decorrelate the components of the original vector (b);
scale the components so they have unit variance (c).
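The two steps above (decorrelate, then rescale) can be sketched with an eigendecomposition of the sample covariance. The function name `whiten`, the small stabilizer `eps`, and the toy mixing matrix `A` are assumptions for illustration, not part of the slides.

```python
import numpy as np

def whiten(x, eps=1e-8):
    """Whiten samples stored as columns of x (shape: dims x samples)."""
    xc = x - x.mean(axis=1, keepdims=True)          # center each component
    cov = xc @ xc.T / (x.shape[1] - 1)              # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # cov = E D E^T
    # E D^{-1/2} E^T rotates to decorrelate, rescales to unit variance, rotates back
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return w @ xc

rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0], [1.5, 0.5]])              # toy mixing matrix (assumption)
x = A @ rng.standard_normal((2, 5000))              # correlated 2-D samples
x_white = whiten(x)
cov_white = np.cov(x_white)                         # close to the identity matrix
```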

Coloring Transform
Coloring is the inverse of the whitening transform.
It transforms white noise into a random vector with a desired covariance matrix.
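A small numerical sketch of coloring: white noise is multiplied by a symmetric square root of the desired covariance. The function name `color` and the example matrix `target_cov` are chosen here for illustration only.

```python
import numpy as np

def color(z, target_cov):
    """Map white samples z (dims x samples) to samples with covariance target_cov."""
    eigvals, eigvecs = np.linalg.eigh(target_cov)   # target_cov = E D E^T
    eigvals = np.clip(eigvals, 0.0, None)           # guard against tiny negatives
    c = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T   # E D^{1/2} E^T
    return c @ z

rng = np.random.default_rng(1)
target_cov = np.array([[4.0, 1.2], [1.2, 1.0]])     # example covariance (assumption)
z = rng.standard_normal((2, 200000))                # white noise: identity covariance
y = color(z, target_cov)
cov_y = np.cov(y)                                   # approximately target_cov
```

Because the noise is sampled, `cov_y` matches `target_cov` only up to sampling error, which shrinks as the number of samples grows.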

Apply WCT to Style Transfer
Dissociate the content image from its own style and associate it with the style of the style image.
From the content image $I_c$ and the style image $I_s$, extract their vectorized feature maps $f_c$ and $f_s$.
WCT directly transforms $f_c$ to match the covariance matrix of $f_s$.

Whitening and Coloring Transform
Whitening:
$\Sigma_c = f_c f_c^T = E_c D_c E_c^T$
$\hat{f}_c = E_c D_c^{-1/2} E_c^T f_c$
$\hat{f}_c \hat{f}_c^T = I$
Coloring:
$\Sigma_s = f_s f_s^T = E_s D_s E_s^T$
$\hat{f}_{cs} = E_s D_s^{1/2} E_s^T \hat{f}_c$
$\hat{f}_{cs} \hat{f}_{cs}^T = f_s f_s^T$
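The whitening and coloring formulas can be combined into one function on feature matrices of shape (channels, positions). This is a minimal sketch, not the authors' code. It makes assumptions that the slide formulas do not show: features are mean-centered before the transform, the style mean is added back afterwards, covariances are normalized by the number of positions minus one, and eigenvalues are clipped at `eps` for numerical stability. The names `wct` and `_sym_power` are hypothetical.

```python
import numpy as np

def _sym_power(cov, power, eps=1e-5):
    """Return E D^power E^T for a symmetric matrix cov = E D E^T."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, eps, None)           # assumption: clip for stability
    return eigvecs @ np.diag(eigvals ** power) @ eigvecs.T

def wct(f_c, f_s):
    """Whiten content features f_c, then color them with the statistics of f_s."""
    mu_c = f_c.mean(axis=1, keepdims=True)
    mu_s = f_s.mean(axis=1, keepdims=True)
    fc, fs = f_c - mu_c, f_s - mu_s                 # assumption: center first
    cov_c = fc @ fc.T / (fc.shape[1] - 1)
    cov_s = fs @ fs.T / (fs.shape[1] - 1)
    f_hat_c = _sym_power(cov_c, -0.5) @ fc          # whitening: E_c D_c^{-1/2} E_c^T f_c
    f_hat_cs = _sym_power(cov_s, 0.5) @ f_hat_c     # coloring:  E_s D_s^{1/2} E_s^T f_hat_c
    return f_hat_cs + mu_s                          # assumption: re-add style mean

rng = np.random.default_rng(0)
f_c = rng.standard_normal((8, 500))                 # toy content features
M = np.eye(8) + 0.5 * rng.standard_normal((8, 8))   # toy channel mixing for the style
f_s = M @ rng.standard_normal((8, 300)) + 3.0       # toy style features
f_cs = wct(f_c, f_s)
```

The output keeps the spatial layout of the content features (same number of positions) while its channel covariance matches that of the style features.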

Whitened Image Feature
Inverting whitened features. We invert the whitened VGG Relu_4_1 feature as an example.
Left: original images. Right: inverted results (pixel intensities are rescaled for better visualization).
The whitened features still maintain global content structures.

Multi-level Stylization

Multi-level Stylization
Single-level stylization using different VGG features.
(a)-(c) Intermediate results of our coarse-to-fine multi-level stylization framework. $I_1$ is the final output of the multi-level pipeline.
(d) Reversed fine-to-coarse multi-level pipeline.
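The coarse-to-fine idea can be sketched as a loop in which each level's output becomes the content input of the next level. Everything concrete below is an assumption for illustration: the function name `multi_level_stylize`, the layer names, the toy encoder/decoder pairs (simple scaling instead of VGG and trained decoders), and the toy `mean_match` transform standing in for WCT.

```python
import numpy as np

def multi_level_stylize(content, style, levels, transform):
    """Apply transform at each level, feeding each result to the next level.

    levels: list of (name, encode, decode), ordered from coarse to fine.
    """
    image, trace = content, []
    for name, encode, decode in levels:
        f_c = encode(image)                  # features of the current result
        f_s = encode(style)                  # style features at the same level
        image = decode(transform(f_c, f_s))  # stylize, then reconstruct an image
        trace.append(name)
    return image, trace

# Toy stand-ins (assumptions): each "encoder" scales by k, its "decoder" undoes it.
def make_level(name, k):
    return (name, lambda x, k=k: k * x, lambda f, k=k: f / k)

levels = [make_level(f"relu{i}_1", float(i)) for i in (5, 4, 3, 2, 1)]

def mean_match(f_c, f_s):
    """Toy transform: shift content features to the style mean (not WCT)."""
    return f_c - f_c.mean() + f_s.mean()

rng = np.random.default_rng(0)
content = rng.random((8, 8))
style = rng.random((8, 8)) + 2.0
result, trace = multi_level_stylize(content, style, levels, mean_match)
```

Reversing the order of `levels` gives the fine-to-coarse variant shown in panel (d).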

Experiment Results
Compared with other models, the other methods were inferior in terms of:
handling arbitrary styles;
efficiency;
being learning-free.

Experiment Results

Parameters
Image size.
Style weight: controls the balance between stylization and content preservation.
$\hat{f}_{cs} = \alpha \hat{f}_{cs} + (1 - \alpha) f_c$
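The style-weight formula is a linear blend between the transformed features and the original content features. A minimal sketch, where the function name `blend` and the range check on `alpha` are assumptions:

```python
import numpy as np

def blend(f_cs, f_c, alpha):
    """Return alpha * f_cs + (1 - alpha) * f_c."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return alpha * f_cs + (1.0 - alpha) * f_c

f_c = np.zeros((2, 3))          # toy content features
f_cs = np.ones((2, 3))          # toy transformed features
no_style = blend(f_cs, f_c, 0.0)    # equals f_c: content fully preserved
full_style = blend(f_cs, f_c, 1.0)  # equals f_cs: full stylization
half = blend(f_cs, f_c, 0.5)        # halfway between the two
```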

Spatial Control

Takeaways
Works with arbitrary styles.
Does not have to be trained on style images.
Scale and weight of the style transfer can be changed on the fly.