Image authentication using LBP-based perceptual image hashing

Feature extraction is a main step in all perceptual image hashing schemes in which robust features will lead to better results in perceptual robustness. Simplicity, discriminative power, computational efficiency and robustness to illumination changes are counted as distinguished properties of Local Binary Pattern features. In this paper, we investigate the use of local binary patterns for perceptual image hashing. In feature extraction, we propose to use both sign and magnitude information of local differences. So, the algorithm utilizes a combination of gradient-based and LBP-based descriptors for feature extraction. To provide security needs, two secret keys are incorporated in feature extraction and hash generation steps. Performance of the proposed hashing method is evaluated with an important application in perceptual image hashing scheme: image authentication. Experiments are conducted to show that the present method has acceptable robustness against perceptual content-preserving manipulations. Moreover, the proposed method has this capability to localize the tampering area, which is not possible in all hashing schemes.


Introduction
In recent years, we have witnessed the development of multimedia information in many aspects of our daily lives. Multimedia finds its application in various areas including, but not limited to, advertisements, art, entertainment, engineering, medicine, business, scientific research and spatial temporal applications. Many of the advantages of digital multimedia have led to the fast progress on media acquisition tools, powerful hardware, sophisticated editing software and network technologies that provide various media sharing and streaming services. However, digital multimedia data have suffered from illegal access and unauthorized distributions. Professional forgers with advanced technology can alter multimedia data without any trail on forged information. Therefore, it is necessary to create new tools and techniques to discover the authenticity and integrity of digital media [1]. In recent decade, several methods have been extensively studied for intellectual property protection of digital images and image forgeries detection including image watermarking-based schemes [2], digital image forensic-based schemes [3] and perceptual image hashing-based schemes [4,5]. For security/authentication of multimedia data, perceptual image hashing is introduced based on traditional cryptosystems. Traditional cryptographic hash functions have been used in applications involving data integrity issues and data retrieval [6]. However, it should be noted that the purpose of a cryptographic hash function and a perceptual image hash function are totally different. Hash functions in traditional cryptosystems are very secure, but they are very sensitive, i.e. changing even one bit of the input will change the output considerably. However, digital images should undergo content-preserving manipulations such as compression, enhancement, cropping, and scaling. An image hash function should tolerate such changes and produce hash values similar to the original image hash values with the same visual appearance [7]. On the other hand, the perceptual hashing system should be sensitive to content-changing distortions and reject malicious manipulations and attacks. Perceptual image hash functions generally consist of two steps, as illustrated in figure 1 [8]. The first step extracts a feature vector from the image which is depending on the image content or characteristics itself. In the second step, this feature vector is compressed and quantized into a binary or real number sequence to form the final hash value. Since an image hash also serves as a secure tag of the image, a secret key is incorporated into either feature extraction or hash generation or both to guarantee that the hash is hardly obtained by unauthorized adversaries without the secret key. This paper is organized as follows. Section 2 and 3 review the background literature of perceptual image hashing methods and center-symmetric local binary patterns, respectively. Section 4 presents the proposed new image hashing scheme. Experimental results are given in section 5. Finally, conclusion is drawn in section 6.

Related works
Feature extraction is the key step in the image hashing which differentiates perceptual hashing methods. The aim of feature extraction is to present a compact and robust representation of the image content. The extracted features are quantized into a binary or real numbers sequence to form the final hash value. By reviewing the literature, the existing hash generation algorithms can be roughly classified into three categories based on their feature extraction method: transform based schemes, matrix factorization based schemes, and local feature pattern based schemes.

Transform based schemes
Many approaches are utilized classic image transforms to extract image features. Various properties of the DCT can be utilized to create perceptual image hash functions. For example, low-frequency DCT coefficients of an image are mostly stable under image manipulations. Fridrich and Goljan [9] proposed a robust hashing algorithm based on the stability of low-frequency DCT coefficients. Another method using scale-and rotation-invariant property of Fourier-Mellin transform (FMT) was proposed in [10]. The authors presented a robust image hashing method using the magnitudes of the Fourier transform coefficients which were randomly weighted and summed. Their method is robust to various content-preserving manipulations such as geometric distortions, filtering operations, and etc. In a more recent development, several methods have been proposed by Radon transform. To provide robustness against geometric distortions in image hashing, the Radon transform was first used in [11]. An expanded method was proposed in [12]. The method uses Radon transform to divide an image into radial projections and build a RAdial Variance (RAV) vector of image pixels. Then, the first 40 DCT coefficients of the RAV vectors are converted into the robust image hash called RASH. The method is computationally simple and is resilient to scaling and rotation but its discriminative capability needs to be strengthened. Venkatesan et al. [13] proposed a perceptual image hashing technique based on quantized statistics of randomized rectangles in the discrete wavelet domain (DWT). In their method, averages or variances of the random blocks in wavelet image are computed and then quantized with randomized rounding to form a secure binary hash. This method is sensitive to contrast adjustment and gamma correction and robust against a limited range of geometric attacks.

Matrix factorization based schemes
In the second type of image hashing schemes, the advantage of matrix factorization or decomposition are used to extract the image features. In this category singular value decomposition (SVD) and non negative matrix factorization (NMF) are the most popular decomposition tools for image matrixes. Kozat et al. [14] proposed a hashing algorithm in two steps using SVD. The method is robust against some small variations in rotation and scaling. In another work in [15], Non-negative Matrix Factorization (NMF) is used as a dimensionality reduction technique to generate image hashing. Although, the method is resilient to a large class of perceptually insignificant attacks but it is vulnerable to brightness changes and large hashing methods using dimension reduction techniques, a new robust and secure image hash algorithm was also proposed based on Fast Johnson-Lindenstrauss transform (FJLT) [16].

Image
Image Hash Hash Generation Feature Extraction geometric transforms. In the category of image

Local feature pattern based schemes
Finally, the methods like [17], [18] and [19] can be included in the third group of image hashing algorithms called local feature patterns-based hashing schemes. In [18] and [19] SIFT keypoints are used as most significant and robust local features for image hashing. In this work we propose a novel and robust perceptual image hashing method based on Center-Symmetric Local Binary Pattern (CSLBP) features. CSLBP features are extracted from each non-overlapping block of the original gray-scale image. For each block, the final hash code is obtained by inner product of its CSLBP feature vector and a pseudorandom weight vector. To provide security needs, two secret keys are incorporated in feature extraction and hash generation steps. Many experiments are conducted and the results show that our algorithm can reach a good balance between robustness and discrimination and outperforms some well-known algorithms. High robustness against luminance changes, applicability in image tampering detection, acceptable hash length and running time can be also counted as the advantage of our proposed hashing method.

Center-symmetric local binary patterns (CSLBP)
Among the feature descriptors, Local Binary Patterns (LBP) is one of the most famous and powerful ones. The idea of LBP is originally proposed by Ojala et al. [20,21] for texture classification. Then, it has gained increasing attention in many image analyses applications due to its low computational complexity, invariance to monotonic gray-scale changes and texture description ability [20]. LBP has been utilized in various image analyses applications such as dynamic texture recognition [22], face recognition [23], detecting moving objects [24], image forgery [25], image region descriptors [26] and so on. In the original LBP, signed gray level differences of each pixel with its neighboring pixels are described as a binary form. However, the LBP operator produces rather long histograms and it is therefore difficult to use in the context of a region descriptor. Furthermore, the original LBP feature is not robust on flat images. To address the problems, center-symmetric local binary patterns, CSLBP, as a modified version of LBP was proposed in [26].
where, T is a small value used to threshold the gray-level difference to increase the robustness of the CSLBP feature on flat image regions. CSLBP captures better gradient information than the basic LBP, because instead of comparing the gray-level of each pixel with the center pixel, gray-level differences between center-symmetric pairs of opposite pixels in a neighborhood are compared.
Circularly symmetric neighborhoods and CSLBP feature for radius R and P=8 neighborhood pixels.

Proposed algorithm for perceptual image hashing
In this section, we propose image hashes generation based on block feature extraction using center-symmetric local binary patterns (CSLBP). As shown in figure 3, the proposed technique consists of three steps: pre-processing, feature extraction, and hash generation. To guarantee the security requirements, two secret keys, K1 and K2, are also incorporated in feature extraction and hash generation steps. The following section describes details of our hashing algorithms.

Pre-processing
In this work, we are dealing with gray-level images, so RGB images are first converted to grayscale images using standard color space conversion.
Since real images may have different size, to ensure that the final generated hash has a fixed length, all input images are first rescaled into a standard resolution of MN  by bi-linear interpolation. Then, the resized input image is divided into non-overlapping blocks of BB  pixels and feature extraction is applied to each image block. We need to keep a trade-off between the hash length, the discriminative capability and perceptual robustness in choosing the size of blocks. This is because a large block size means few features which will inevitably reduce discriminative capability. When the size of block is decreased, discrimination can be improved but perceptual robustness is easily affected by minor modification. In addition, a smaller block size will increase the hash size. In experiments, we find that 32 B  is an acceptable moderate size for 256 256  images. In this scheme, before feature extraction, we first filter each block with an edge-preserving adaptive low-pass filter. The adaptive filter is more selective than a comparable linear filter, because of preserving edges and other high-frequency parts of an image. For this purpose, we use a pixel-wise adaptive Wiener method based on statistics estimated from a local neighborhood of each pixel. Our experiments have shown that this filtering has considerably improvements on robustness of image hash. The consecutive operations in the pre-processing step are drawn in figure 4.

Feature extraction
In CSLBP-based image hashing, CSLBP features are extracted to represent the main content of the image compactly. Then, in hash generation step, the extracted features are converted into a real number sequence to form the final hash value. The details of these feature extraction method is presented as follows.  CSLBP operator discards the magnitude information of the difference between the centersymmetric pairs of pixels and uses only the sign information (see (1)). In other words, for a given MN  image, after identifying the CSLBP pattern of each pixel ( , ) ij, the normalized histogram of CSLBP codes is computed over the image and it is used as a feature vector, (4): where, B is the maximal CSLBP pattern value. In a gradient-based descriptor, image gradients are used to obtain magnitude and orientation for each image pixel. Since, definition of magnitude in CSLBP operator, , pq m , is similar to image gradient, it contains useful information for image description [27]. In our proposed CSLBP-based image hashing, the motivation behind proposed feature descriptor is an efficient combination of CSLBP-based descriptor and gradient-based descriptor using sign and magnitude information of local differences. We extract two features for each pixel of the image by using the CSLBP operator. The sign feature is the same as the original CSLBP code defined by (1) and the magnitude feature vector corresponds to gradient information which is achieved by the magnitude of local differences. The CSLBP operator has three parameters: radius R, number of neighboring pixels P, and threshold on the gray level difference T. In our experiments best results are achieved by R=1, P=8, and T=0.01. Using (2) and (3). Four histograms are built considering four components of magnitude vector. Each pixel of the image with a given CSLBP code is assigned to a bin in the histogram according to its magnitude. In other words, the magnitude feature, MV, is used as an adaptive weight in histogram calculation of CSLBP codes, (6).
Finally, the obtained histograms are joined together to create the feature vector of an image block, FV, which is used for hash generation, (7). Each FV has 64 elements and to enhance the security of the scheme, all the elements in each FV are randomly scrambled according to a secret key K1.

Hash generation
In the previous step, a feature vector was constructed for each non-overlapping block of input image. We generate pseudorandom weights

Experiments
In this section, we provide a comprehensive evaluation of the proposed algorithm in image authentication [19]. Furthermore, some experiments are also included to analyze the performances of the proposed hashing scheme with respect to forged region detection.

Evaluation of image authentication
Image authentication experiment is designed to measure the sensitivity of our method to distinguish malicious attacks from contentpreserving distortions. Figure 5  In our method, since hash values are approximately linearly changed, the correlation coefficient is used as a distance measure.
We considered the problem of image authentication as a hypothesis testing problem with two hypotheses: (H0: Image is authentic) and (H1: Image is not authentic). Each test image is classified into one of the hypothesis states. We use the receiver operating characteristics (ROC) curve to examine the discriminative capabilities of various hashing schemes in image authentication. True positive rate (TPR) and false positive rate (FPR) are two axes of ROC curve, which are defined by (9) and (10) Based on a given threshold (T), TPR gives us an estimate of the probability of true detection and FPR obtains the percentage of images that are falsely classified as original image. To implement authentication experiment, we construct three databases: an original images database, a similar images database and a forged images database. The original images database is constructed by a collection of 1000 color images from two databases: The Corel database [28] and Columbia photographic images and photorealistic computer graphics dataset [29]. All the color images are resized to 256  256 and converted as gray-scale images in the experiments. Similar images are obtained by applying several contentpreserving operations on each original image. The operations are listed in table 1. Our forged image database is constructed by splicing image forgery. Image splicing is a simple form of photomontage technology where is defined by simple combining image fragments from two or more different images without further post-processing. A total of 1000 tampered images are created with perform splicing forgery on each original image. In each forged image the pasted area is 10% of the host image. Figure 6 compares the ROC curves of the proposed method and those of [30] and two methods in [19]. Refer to (9) and (10), it is clear that TPR and FPR indicate robustness and discrimination, respectively. It can be observed from figure 6 (f) that the proposed method has shown stronger ability than the other four methods to distinguish content-preserving distortions from malicious attack. For example with the same probability of false detection in FPR = 0.2, CSLBP achieves higher probability of correct detection (TPR = 0.77) than other hashing

Effect of block size
To show effect of block size on hash performances of our methods, we change the block size and conduct the experiment under a combination of geometric and processing attacks, (in the last row of Table 1

Tampering localization
A forger can easily modify local regions of an image and alter its original content by available photo-editing software. In digital image forensics, many techniques have been proposed to address the problem of forged region detection [3,31]. Block-based matching is one of the main methods in forged region localization. In the method, first, the image is divided into overlapping or nonoverlapping blocks. Then, the perceptual hash function generates a hash value for each block of the original image. During forensic analysis, hashing codes are extracted from the corresponding blocks of the suspect image and a block-wise comparison can reveal potential tampered regions. For tampering localization, the choice of block size controls the trade-off between hash length and detection performance. Larger block size gives a smaller hash length but can introduce higher false detection than a smaller block size. An illustration of tampering localization functionality of the proposed method is provided by the following experiment. We used a database of 100 image pairs, which includes the original image and tampered copy. Tampered images are generated by splicing technique where some regions of the original image are replaced with foreign blocks. We assess the accuracy of tampering localization by ROC analysis. In the results, the receiver operating curve is a plot of the probability of true positive rate versus the false positive rate as the system threshold is varied. The two probabilities are defined as follows: Number of tampered blocks detected as tampered () Total number of tampered blocks TPR T 

Number of genuine blocks detected as tampered () Total number of genuine blocks FPR T 
where, T is the variant threshold parameter.
The result of ROC analysis is presented in figure  8, in which the true positive rates and false positive rates are averaged based on the results from 100 image pairs. The figure compares the sensitivity of forged region detection in our method and the method in [18]. Roy et al. uses quantized edge direction histogram features to localize tempered blocks. We can observe from figure 8 that the proposed method attain better accuracy than [18]. Because our method for hash generation utilize combine LBP-based descriptors and gradient-based descriptors. It is worth noting that FP [30] is not applicable for tampering detection. In this sense, the proposed hashing scheme is more generally applicable. An example of forged region detection, using CSLBP-based hashing method, is also illustrated in figure 9. From left to right, columns 1 to 3 are original images, tampered images and detection results, respectively.

Hash length and CPU running time
Clearly, hash length in the methods based on the block feature extraction is proportional to the number of image blocks. In the local feature point-based schemes, the number of keypoints determines the hash length. According to the parameters used in the experiment, the hash length of different methods is listed in table 2. The hash length of our method is 64 decimal digits. Table 2  An example of forged region detection, using CSLBP-based hashing method, is also illustrated in figure 9. From left to right, columns 1 to 3 are original images, tampered images and detection results, respectively.

Hash length and CPU running time
Clearly, hash length in the methods based on the block feature extraction is proportional to the number of image blocks. In the local feature point-based schemes, the number of keypoints determines the hash length. According to the parameters used in the experiment, the hash length of different methods is listed in table 2.
The hash length of our method is 64 decimal digits.

Conclusion
In this paper we proposed a new perceptual image hashing method based on local binary patterns (LBP). In this algorithm a simple and efficient version of LBP feature, called center-symmetric LBP (CSLBP), was used for feature extraction.
The original CSLBP descriptor uses only the sign information of local differences. In this paper, however, both sign and magnitude information are utilized for image hashing to make benefit of gradient-based and LBP-based descriptor simultaneously. To increase security of our hashing method, two secret keys were used in feature extraction and hash generation steps. To evaluate our proposed method, an experiment was conducted for image authentication. The results demonstrated that our proposed scheme could distinguish legal distortions from malicious manipulations. Finally, applicability in image tampering detection, acceptable hash length and running time can be also counted as the advantage of our proposed hashing method.