JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts

Similar documents
This file is part of the following reference:

FACE DETECTION. Lab on Project Based Learning May Xin Huang, Mark Ison, Daniel Martínez Visual Perception

PREDICTING the outcomes of sporting events

FIRE PROTECTION. In fact, hydraulic modeling allows for infinite what if scenarios including:

Attacking and defending neural networks. HU Xiaolin ( 胡晓林 ) Department of Computer Science and Technology Tsinghua University, Beijing, China

Validation of 12.5 km Resolution Coastal Winds. Barry Vanhoff, COAS/OSU Funding by NASA/NOAA

Windcube FCR measurements

Bayesian Optimized Random Forest for Movement Classification with Smartphones

Sensing and Modeling of Terrain Features using Crawling Robots

Estimating the Probability of Winning an NFL Game Using Random Forests

CS 528 Mobile and Ubiquitous Computing Lecture 7a: Applications of Activity Recognition + Machine Learning for Ubiquitous Computing.

Calibration and Validation of the Simulation Model. Xin Zhang

A Novel Approach to Predicting the Results of NBA Matches

The Intrinsic Value of a Batted Ball Technical Details

Neural Networks II. Chen Gao. Virginia Tech Spring 2019 ECE-5424G / CS-5824

INTRODUCTION TO PATTERN RECOGNITION

Introduction to Pattern Recognition

Navigate to the golf data folder and make it your working directory. Load the data by typing

Advanced PMA Capabilities for MCM

Environmental Science: An Indian Journal

BASKETBALL PREDICTION ANALYSIS OF MARCH MADNESS GAMES CHRIS TSENG YIBO WANG

DEVELOPMENT OF A SET OF TRIP GENERATION MODELS FOR TRAVEL DEMAND ESTIMATION IN THE COLOMBO METROPOLITAN REGION

Predicting NBA Shots

Evaluation of Regression Approaches for Predicting Yellow Perch (Perca flavescens) Recreational Harvest in Ohio Waters of Lake Erie

CALIBRATION OF THE PLATOON DISPERSION MODEL BY CONSIDERING THE IMPACT OF THE PERCENTAGE OF BUSES AT SIGNALIZED INTERSECTIONS

Performance of Fully Automated 3D Cracking Survey with Pixel Accuracy based on Deep Learning

Design of a Pedestrian Detection System Based on OpenCV. Ning Xu and Yong Ren*

A Bag-of-Gait Model for Gait Recognition

The risk assessment of ships manoeuvring on the waterways based on generalised simulation data

Evaluating and Classifying NBA Free Agents

DEPARTMENT OF THE NAVY DIVISION NEWPORT OFFICE OF COUNSEL PHONE: FAX: DSN:

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions

Inferring land use from mobile phone activity

Investigation of Gait Representations in Lower Knee Gait Recognition

A study of advection of short wind waves by long waves from surface slope images

Figure 2: Principle of GPVS and ILIDS.

Uninformed Search (Ch )

Fingerprint Recompression after Segmentation

Deconstructing Data Science

2 When Some or All Labels are Missing: The EM Algorithm

Title: Modeling Crossing Behavior of Drivers and Pedestrians at Uncontrolled Intersections and Mid-block Crossings

Smart-Walk: An Intelligent Physiological Monitoring System for Smart Families

Climate Change Scenarios for the Agricultural and Hydrological Impact Studies

Improving Accuracy of Frequency Estimation of Major Vapor Cloud Explosions for Evaluating Control Room Location through Quantitative Risk Assessment

COMPLETING THE RESULTS OF THE 2013 BOSTON MARATHON

Interim Operating Procedures for SonTek RiverSurveyor M9/S5

Singularity analysis: A poweful technique for scatterometer wind data processing

CS145: INTRODUCTION TO DATA MINING

Real-Time Electricity Pricing

THE CORRELATION BETWEEN WIND TURBINE TURBULENCE AND PITCH FAILURE

Decision Trees. Nicholas Ruozzi University of Texas at Dallas. Based on the slides of Vibhav Gogate and David Sontag

Compaction, Permeability, and Fluid Flow in Brent-type Reservoirs Under Depletion and Pressure Blowdown

Estimating Paratransit Demand Forecasting Models Using ACS Disability and Income Data

Atmospheric Waves James Cayer, Wesley Rondinelli, Kayla Schuster. Abstract

Analysis of Car-Pedestrian Impact Scenarios for the Evaluation of a Pedestrian Sensor System Based on the Accident Data from Sweden

Introduction to Machine Learning NPFL 054

THe rip currents are very fast moving narrow channels,

An experimental validation of a robust controller on the VAIMOS autonomous sailboat. Fabrice LE BARS

E. Agu, M. Kasperski Ruhr-University Bochum Department of Civil and Environmental Engineering Sciences

Predictors for Winning in Men s Professional Tennis

QUESTION BANK PERIOD : JULY - NOV 2018 BATCH: UNIT I DIGITAL IMAGE FUNDAMENTALS

Chapter 4 Traffic Analysis

A Chiller Control Algorithm for Multiple Variablespeed Centrifugal Compressors

Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held

GLMM standardisation of the commercial abalone CPUE for Zones A-D over the period

Rotation Centers of the Equine Digit and their Use in Quantifying Conformation

a) List and define all assumptions for multiple OLS regression. These are all listed in section 6.5

Climate Change Scenarios for the Agricultural and Hydrological Impact Studies

Cricket umpire assistance and ball tracking system using a single smartphone camera

hot hands in hockey are they real? does it matter? Namita Nandakumar Wharton School, University of

Legendre et al Appendices and Supplements, p. 1

Cycling Volume Estimation Methods for Safety Analysis

A SEMI-PRESSURE-DRIVEN APPROACH TO RELIABILITY ASSESSMENT OF WATER DISTRIBUTION NETWORKS

Trouble With The Curve: Improving MLB Pitch Classification

THE WAVE CLIMATE IN THE BELGIAN COASTAL ZONE

EE 364B: Wind Farm Layout Optimization via Sequential Convex Programming

Gait Recognition. Yu Liu and Abhishek Verma CONTENTS 16.1 DATASETS Datasets Conclusion 342 References 343

67. Sectional normalization and recognization on the PV-Diagram of reciprocating compressor

Lab 11: Introduction to Linear Regression

SUPPLEMENT MATERIALS

Robust Enhancement of Depth Images from Depth Sensors

An algorithm for Sea Surface Wind Speed from MWR

An Efficient Code Compression Technique using Application-Aware Bitmask and Dictionary Selection Methods

Modeling Pedestrian Volumes on College Campuses

HIGH RESOLUTION DEPTH IMAGE RECOVERY ALGORITHM USING GRAYSCALE IMAGE.

Novel empirical correlations for estimation of bubble point pressure, saturated viscosity and gas solubility of crude oils

Predicting Human Behavior from Public Cameras with Convolutional Neural Networks

Linear Compressor Suction Valve Optimization

Advanced Test Equipment Rentals ATEC (2832) OMS 600

AutoGait: A Mobile Platform that Accurately Estimates the Distance Walked

Object Recognition. Selim Aksoy. Bilkent University

Provably Secure Camouflaging Strategy for IC Protection

To Illuminate or Not to Illuminate: Roadway Lighting as It Affects Traffic Safety at Intersections

Early Skip Decision based on Merge Index of SKIP for HEVC Encoding

Introduction to Pattern Recognition

Currents measurements in the coast of Montevideo, Uruguay

Robocup 2010 Version 1 Storyboard Prototype 13 th October 2009 Bernhard Hengst

FAULT DIAGNOSIS IN DEAERATOR USING FUZZY LOGIC

Integration of time-lapse seismic and production data in a Gulf of Mexico gas field

4. Learning Pedestrian Models

Transcription:

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts Jan Kodovský, Jessica Fridrich May 16, 2012 / IH Conference 1 / 19

What is JPEG-compatibility steganalysis? Detects embedding changes in the spatial domain using the fact that the cover image was previously JPEG compressed All pixels in a cover image must be obtainable by decompressing the corresponding quantized DCT coefficients Relevancy Photographs are commonly stored in the JPEG format JPEG format may not allow sufficient capacity Vast majority of publicly available steganographic tools hide messages in raster formats Most steganographic algorithms do not take JPEG compatibility into account 2 / 19

Overview of the process Step 1 Step 2 Estimate JPEG compression parameters θ from image Y Y... (stego) image in raster format θ... JPEG quality factor (quantization table), DCT specifics, chrominance tables, color subsampling, etc. Detect embedding changes using the recompressed image Ŷ Ŷ = JPEG 1 θ (JPEG θ (Y))... recompressed image (predictor) JPEG θ... JPEG compression (many-to-one mapping) JPEG 1 θ... JPEG decompression 3 / 19

Prior art Fridrich, Goljan, Du (SPIE 2001) Mathematical guarantee of JPEG incompatibility Brute-force search, growing complexity for higher quality factors Böhme (IH 2007) Weighted Stego-image for decompressed images (WSJPG) Targeted to LSB replacement (LSBR) Uniform weights, predictor = recompressed image Ŷ ˆβ WS = 1 n n (y i ȳ i )(y i ŷ i ) i=1 y i... pixel value ȳ i... value of the pixel with flipped LSB ŷ i... cover pixel predictor (image recompression) 4 / 19

Prior art, cont d Luo, Wang, Huang (IEEE SPL 2011) The same recompression predictor as WSJPG Decision based on the number of mismatched pixels ˆβ LUO = 1 n {i y i ŷ i } y i... pixel value ŷ i... cover pixel predictor (image recompression) Ability to detect embedding operations other than LSBR Much less robust to inaccurate estimate of θ Does not distinguish between embedding changes and natural recompression artifacts 5 / 19

Image recompression BOSSbase image 7347.pgm, JPEG quality factor 80 Decompressed image Y Residual R = Y Ŷ 6 / 19

Image recompression BOSSbase image 7347.pgm, JPEG quality factor 80 Decompressed image Y ˆβ LUO = 1 n nnz(r) Residual R = Y Ŷ LSBR @ change rate β = 0.01 6 / 19

Introducing a simple feature vector Goal: distinguish between emb. changes and recomp. artifacts Recompression artifacts... patterns over 8 8 pixel blocks Embedding changes... individual changed pixels Simple pattern descriptor ρ (k)... number of mismatched pixels in the kth block h = (h m )... histogram of ρ (k) over the image h m = 64 {k ρ (k) = m}, m = 0,..., 64 n Relationship to the predictor of Luo et al. ˆβ LUO = 1 n {i y i ŷ i } = 1 64 m h m 64 m=0 7 / 19

Properties of the proposed features 7 10 3 Feature value 6 5 4 3 2 1 h 1 Quality factor 80 0 0 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 Change rate β 8 / 19

Properties of the proposed features 7 10 3 Feature value 6 5 4 3 2 1 h 1 h 5 Quality factor 80 0 0 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 Change rate β 8 / 19

Properties of the proposed features 7 10 3 Feature value 6 5 4 3 2 1 h 1 h 5 h 10 Quality factor 80 0 0 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 Change rate β 8 / 19

Properties of the proposed features 7 10 3 Feature value 6 5 4 3 2 1 h 1 Quality factor 80 h 5 h 10 h15 h 20 h25 h30 0 0 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 Change rate β Feature vector h covers all change rates β Next step: supply h to a machine-learning engine 8 / 19

Comparison to Luo et al. 0.5 ˆβLUO 0.4 0.3 0.2 0.1 Quality factor 80 0.02 0.01 0 0 100 200 300 400 500 Number of changes 0 0 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 Change rate β Large variance due to recompression artifacts Inaccurate for detecting small change rates (our focus) 9 / 19

Clairvoyant detector Distinguish between cover images and stego images of known β Tested over a range of different quality factors and change rates Our focus: very small change rates (even a single change) Binary classifier trained for every tested change rate β Ensemble classifier with FLD as a base learner Image database: BOSSbase Threshold set to minimize total average error under equal priors 1 P E = min P FA 2 (P FA + P MD ) 10 / 19

Clairvoyant detector, cont d A single changed pixel (β = 0.0000038) LSBR JPEG quality 70 80 85 90 92 94 96 98 WSJPG.387.425.470.490.491.497.498.498 Proposed 0 0.010.085.089.489.497.498 100 changed pixels (β = 0.00038) JPEG quality 70 80 85 90 92 94 96 98 WSJPG.157.147.166.219.197.329.378.391 Proposed 0 0 0 0 0.012.251.301 262 changed pixels (β = 0.001) JPEG quality 70 80 85 90 92 94 96 98 WSJPG.076.063.064.088.072.148.215.245 Proposed 0 0 0 0 0 0.049.174 Note: For such small values of β, LUO performed consistently worse than WSJPG 11 / 19

Detection of schemes other than LSBR LSB matching (±1 embedding) similar results as LSBR HUGO (adaptive algorithm) less detectable Number of changed pixels Change rate β (cpp) QF 1 10 25 100 0.001 0.005 0.01 80.0213.0017.0022.0018.0017.0007.0006 0 0 0 0 0 0 0 90.1235.0160.0065.0049.0035.0024.0024.0852.0046.0007 0 0 0 0 95.4953.4627.3974.2415.0859.0191.0076.4948.4472.3680.0977.0003 0 0 (Recompression artifacts correlate with texture/edges) 12 / 19

Detector for unknown β Dropping the assumption of known change rate β One-sided hypothesis testing problem: β = 0 vs β > 0 We construct a single classifier and set a threshold according to the predefined value of FA rate (based on covers only) How to train a classifier? Pevný (SPIE 2011) uniform mixture of change rates Our scenario: even a small number of changes reliably detected train on a fixed small β with a reasonable error rate Error too high difficult to find optimal decision boundary Error too low many decision boundaries equally good, but only some of them useful for smaller change rates 13 / 19

Detector for unknown β, cont d PD 1.0 0.8 0.6 0.4 0.2 0.0 QF 90 P FA = 0.030 P FA = 0.020 P FA = 0.010 P FA = 0.005 1 10 100 1.0 0.8 0.6 0.4 0.2 0.0 P FA = 0.20 P FA = 0.10 P FA = 0.05 P FA = 0.02 P FA = 0.01 QF 95 1 10 100 1000 Number of testing changes Number of testing changes P D... probability of detection QF 90: P FA = 1% detects everything with > 10 changes QF 95: P FA = 1% detects everything with > 300 changes (β 0.0011) 14 / 19

Quantitative detector Quantitative detector built using Support Vector Regression (SVR) Methodology described by Pevný et al. (IEEE TIFS, 2012) ν-svr with a Gaussian kernel (libsvm library) 3 parameters: kernel width, misclassification cost, bound on the number of support vectors 3D grid-search + cross-validation BOSSbase, training change rates chosen uniformly from [0, b] Relative measures of accuracy: B r (β) = 1 β (median( ˆβ) β) 100% M r (β) = 1 β median( ˆβ median( ˆβ) ) 100% (More informative for change rates of very different magnitudes) 15 / 19

Quantitative detector, cont d β b = 0.5 b = 0.05 b = 0.005 b = 0.0005 10/n 2.78 ± 4.84 50/n 9.04 ± 8.06 +0.64 ± 2.34 100/n 15.6 ± 28.5 3.36 ± 4.13 0.22 ± 2.00 0.001 5.326 ± 10.9 0.19 ± 1.75 3.83 ± 1.72 0.0035 0.47 ± 3.06 +0.11 ± 0.71 16.4 ± 1.37 0.01 16.3 ± 17.2 0.00 ± 1.06 0.90 ± 0.80 43.7 ± 1.07 0.035 3.74 ± 4.68 +0.05 ± 0.40 0.1 1.17 ± 1.74 21.1 ± 1.17 0.2 0.57 ± 0.94 0.3 0.26 ± 0.79 0.4 +0.02 ± 0.51 0.5 0.90 ± 1.52 Numbers... B r(β) ± M r(β). Crosses correspond to failures (either B r or M r is larger than 50%). JPEG quality fixed to 90. Trained on [0, b]. Very different magnitudes of testing change rates Training on [0, 0.5] only 4 training samples with < 100 changes Smaller b higher training-sample density more accurate on [0, b] 16 / 19

Quantitative detector, cont d Heuristic cascading algorithm to cover all change rates: 1. Set b = (b 1,..., b k ), b i > b i+1, b i [0, 0.5], initialize i = 1. 2. Compute ˆβ i by training on [0, b i]. If i = k, terminate and output ˆβ i. 3. If ˆβ i b i+1, increment i = i + 1, go to Step 2. 4. Output ˆβ i. b 1 = 0.5 b 2 = 0.05 b 3 = 0.005 b 4 = 0.0005 Cascade 10/n 2.78 ± 4.84 2.78 ± 4.84 50/n 9.04 ± 8.06 +0.64 ± 2.34 +0.65 ± 2.35 100/n 15.6 ± 28.5 3.36 ± 4.13 0.22 ± 2.00 0.10 ± 2.02 0.001 5.326 ± 10.9 0.19 ± 1.75 3.83 ± 1.72 0.19 ± 1.75 0.0035 0.47 ± 3.06 +0.11 ± 0.71 16.4 ± 1.37 +0.13 ± 0.71 0.01 16.3 ± 17.2 0.00 ± 1.06 0.90 ± 0.80 43.7 ± 1.07 0.00 ± 1.06 0.035 3.74 ± 4.68 +0.05 ± 0.40 +0.07 ± 0.40 0.1 1.17 ± 1.74 21.1 ± 1.17 1.27 ± 1.67 0.2 0.57 ± 0.94 0.57 ± 0.94 0.3 0.26 ± 0.79 0.24 ± 0.74 0.4 +0.02 ± 0.51 +0.04 ± 0.47 0.5 0.90 ± 1.52 0.96 ± 1.49 17 / 19

Cascade vs. LUO (LSBR) Br(β) Br(β) 5 0-5 -10-15 -30 Cascade LUO QF = 80 10 5 10 4 10 3 10 2.1.2.3.4.5 30 15 0 QF = 95 10 5 10 4 10 3 10 2.1.2.3.4.5 Change rate β 5 0-5 -10 QF = 90 10 5 10 4 10 3 10 2.1.2.3.4.5 40 20 0-20 QF = 100-40 10 5 10 4 10 3 10 2.1.2.3.4.5 Change rate β 18 / 19

Summary Accurate JPEG-compatibility steganalysis using block-histogram of the number of mismatched pixels after recompression Limitations of prior art WSJPG limited to LSB replacement LUO does not distinguish between emb. changes and recompression artifacts Three types of detectors constructed Clairvoyant known β, binary classification Unknown β one-sided hypothesis testing, CFAR detector Quantitative outputs estimate of β Accurate detection of fewer than 100 changes for QF up to 94 Proposed method requires images for classifier training Trade off between robustness w.r.t. the estimate of θ and the ability to detect embedding schemes other than LSBR 19 / 19