TMCnet News

Copy-Paste Forgery Image Blind Detection Algorithm Based on Histogram Invariant Moments [Sensors & Transducers (Canada)]
[April 22, 2014]

Copy-Paste Forgery Image Blind Detection Algorithm Based on Histogram Invariant Moments [Sensors & Transducers (Canada)]


(Sensors & Transducers (Canada) Via Acquire Media NewsEdge) Abstract: Considered that general detection algorithms against common copy-paste image forgery have poor robustness, this paper proposes a forgery image blind detection algorithm based on histogram invariant moments. The algorithm first carries out discrete wavelet transform on the image to be detected to extract low frequency part, then divides the low-frequency image into blocks. After that, histogram invariant moments characteristic vectors of these blocks are extracted and a characteristic matrix is constructed using these vectors. The characteristic vectors in the matrix are sorted by dictionary, and finally the sorted adjacent blocks. The block is judged by confidence distance to determine whether some of them are copy-paste image blocks. Experiments show that the proposed algorithm can more accurately locate the forged area of copy-paste images, and has better robustness on anti-noise, anti-compression, anti-rotation and anti-scaling. Meanwhile, the amount of computation is effectively reduced and detection efficiency is improved. Copyright © 2013 IFSA.



Keywords: Histogram invariant moments, Discrete wavelet, Image matching, Image forgery.

(ProQuest: ... denotes formulae omitted.) 1. Introduction Detection algorithm against digital image tampering can be divided into two categories: one of them is image authentication based on fragile watermark and digital signature. This category belongs to active algorithms. It needs to add a watermark or auxiliary information into the image in advance, but in reality not all of the images can involve this information in advance. The other category is blind detection technology, which belongs to passive authentication. It relies on characteristics of image itself for authentication, but this authentication method is more complex and challenging. However, it has broad application prospects.


There are many image forgery methods [1], where the copy-paste region tampering of same image is extremely common. That is, some part of an image is copied and then pasted on the other part of the image. This can obscure the human figure or object in the image. Detection algorithms against this image forgery belong to blind detection technology. Among them, exhaustive search method is the most commonly used method. Fridrich [2] first analyzed the exhaustive search algorithm, and then proposed a block matching detection method based on discrete cosine transform (DCT) to improve the efficiency of the exhaustive search algorithm. Popescu [3] proposed a similar approach, using principal component analysis (PCA) rather than discrete cosine transform to reduce the dimensions of the characteristic vectors and the dimensions of computation. Since then many scholars have made further researches based on the block matching algorithm. Wu Qiong [4] used discrete wavelet transform to extract the low frequency part of the original image, and then divided the low frequency part into blocks. Singular value decomposition was used to reduce dimensions, and finally the blocks were sorted by dictionary, the amount of computation was further reduced. Jing Li, et al. [5] proposed detection and locating algorithm based on phase to improve the efficiency of the block matching algorithm. In general, these algorithms are timeconsuming, need large amount of computation and have poor robustness.

In this paper, the author carries out research on taking histogram invariant moments as the image characteristic vectors and improving the anti-rotation, anti-noise, anti-compression and scaling performance of the algorithm. The proposed algorithm first carries out discrete wavelet transform on the forgery image to extract low frequency part, then extracts histogram invariant moments characteristic vectors of the low frequency part blocks. The characteristic vectors are sorted by dictionary, and finally the sorted adjacent blocks are judged by confidence distance to determine whether some of them are copy-paste image blocks.

2. Detection Algorithm Based on Wavelet Transform and Histogram Invariant Moments 2.1. Haar Wavelet Transform If an image is divided into blocks, for the (M-a+l) x (N-b+l) blocks that can be overlapped, the amount of histogram invariant moments computation is extremely large. For this end, the author first employs DWT (Discrete Wavelet Transform) as Fig. 1 shows to reduce the size of the image. After dividing the low frequency part into blocks, histogram invariant moments characteristic vectors are extracted from low frequency blocks. The number of blocks can be dramatically reduced in this way. Meanwhile, due to the low frequency area's ability of maintaining the profile and space characteristic of the image, the robustness of extracted image characteristics is enhanced.

2.2. Normalized Histogram Invariant Moments The histogram of an image can reflect the statistical properties of the image. It reflects the proportion of areas and number of pixels with different gray values in the entire image. The amount of information contained in an image can also obtained through histogram. Therefore, when the imaging conditions of one area changes, although the look and feel of the image has undergone great changes, the corresponding histogram will not change a lot. So we can utilize the information contained in histogram to define the characteristics of the image, and to find matching points between the images. In order to take advantage of the information of the histogram, article [6] and [7] have defined several image histogram invariant moments which have nothing to do with displacement and scale changes.

For a given digital image J(x, y), we assume that A represents the total number of pixels of the image, and A(t) represents the total number pixels of an image with the gray value t, thus the histogram can be defined as: ... (1) When the image fix,y) is converted to histogram P{t), the items which need to be processed change from two-dimensional to one-dimensional, the amount of data is largely reduced, and the computation time is subsequently reduced and the processing speed is improved.

For the one-dimensional function P(t), the k order moments are defined as ... (2) Central moment of k-order is defined as ... (3) where t=m\lmo.

The normalized k-order central moments are defined as ... (4) In the process of image matching, we expect that we can choose moments with such math characteristics as shift, rotation and scale invariance, i.e. invariant moments characteristics. If using central moments to represent image characteristics, it will only have shift invariance; if we normalize the central moments, it will have shift and scale invariance. Therefore, ordinary moments or central moments alone cannot achieve the desired goals. But if we perform some transform on them, the expected goals can be achieved theoretically. We present three invariant moments f, f?, fi, with invariant shift, rotation and scale characteristics based on normalized central moments: ...(5) ...(6) ...(7) In this paper, circular blocks are employed as feature extraction block template to remain the rotation invariance of histogram invariant characteristic. Histogram invariant eigenvalues of rectangular image blocks' effective area after being performed image rotation will have larger changes. Therefore, it is essentially infeasible to take rectangular image blocks as the image rotation invariant characteristic of image matching.

2.3. Sorting Normalized Histogram Invariant Moment Vectors by Dictionary Sort The image is divided into blocks. After that, histogram invariant moments vectors of these circular blocks are extracted and a characteristic matrix is constructed using these vectors. It will need very huge amount of computation if traditional methods are used to find the most relevant vector the current characteristic vector in the characteristic matrix.

The copy area and paste area of the image are the combination of many similar image blocks. These image blocks have consistent shift vectors. This fact can convert the locating problem of forged image area into finding image block pairs with similar shift vectors. We first carry out dictionary sort on the characteristic matrix, and then compare the two adjacent characteristic vectors to find similar image pairs.

2.4. Matching Detection of Similar Image Blocks In this paper, the Pearson correlation coefficient in statistics is used for similarity matching detection: The correlation coefficient of the two variables is defined as a quotient, dividing covariance of two variables by the product of standard deviations of them, which is ...(8) where x, y are two characteristic vectors (with same size) of one image block, cov(x,y) is the covariance of x, y, and ox , oy are standard deviations for x and y respectively.

The last formula (8) defines a population relevant coefficient, which is indicated by Greek alphabet p. If using the covariance and standard deviation of samples instead of population covariance and standard deviation, the calculated result will become sample correlation coefficient, which is generally denoted by r: ... (9) where L is the number of image blocks.

3. Specific Algorithm Flowchart and Steps Considered that general detection algorithms against common copy-paste image forgery have poor robustness and long computation time of block matching, this paper proposes a detection algorithm based on wavelet transform and normalized histogram invariant moments. Theoretically, the algorithm can not only detect shift transform but also can detect rotation and scaling transformation. The flowchart of the algorithm is shown in Fig. 2: The algorithm first carries out discrete wavelet transform on the image to extract low frequency part, then divides the low-frequency image into circular blocks. After that, histogram projection invariant moments characteristic vectors of these blocks are extracted. Finally, similarity of adjacent blocks is judged by confidence distance to determine whether some of them are copy-paste image blocks.

The specific algorithm is described as following: 1) Perform discrete wavelet decomposition on the Mx N image and extract low frequency part S, whose size is M/2xN/2; 2) Divide S into small overlapped circular blocks with a diameter of a, the size of the block is L=[M/2-a+l\ [N/2-a+l]. Calculate projection histogram invariant moment characteristic vectors of every circular block using (l)-(7).

3) Form a row vector matrix K using the characteristic vectors of each small block, where i is the row vector number. The total number of row vectors is L. The obtained matrix K has a dimension of Lx H, where H is the number of elements in each row vector.

4) Sort matrix K in a dictionary sequence. Denote the sorted matrix as Kl, which has the same dimension as K. Kh is the i01 row vector of Kl. Record the small block in K and Kli with (Xi, Yi) as its upper left comer coordinates.

5) Calculate the position difference between Kli and Klj and their relevant coefficient p(x,y) in the matrix K\. For each pairXl,, Klj, if p^l^Kl^T and \AXij\>a, \AYij\>b are satisfied, it indicates that this region of the image is tampered. Record the corresponding (X, Yi) and (Xj, Yj) and mark the area whose top left coordinate is (X, Yi), (Xj, Yj) with white color.

4. Experimental Results and Analysis All experiments are carried out on a computer with a dual core (TM) 2.4 GHz CPU, 2 GB RAM and Windows XP operating system. The image blocks are circles with a diameter of 16. Matlab 7 programming algorithm and related algorithms are adopted to carry out testing.

4.1. Robustness Testing of Anti-Rotation We first select a JPEG image show in Fig. 1(a) with a size of 512^512, and then modify it using Adobe Photoshop. Rotate the copied area "sunflower"clockwise by 0°, 90°, 180° and 270° respectively and paste them. The detection results are shown in the following Fig. 3. The results show that both un-rotated and rotated tampering can be detected correctly by the algorithm.

4.2. Robustness Testing of Anti-Scaling We also select the same JPEG image with a size of 512x512, and then modify it using Adobe Photoshop. Taking into account that the tampering will become apparent due to too much scaling, we will just scale down an image area by 1/5 and scale it up by 1/5 respectively and then paste them. The detection results are shown in the following Fig. 4. As can be seen from detection results, the proposed algorithm can correctly detect scaling tampering.

4.3. Robustness Testing of Anti-noise Addition and Lossy Compression In order to verify the robustness of anti-noise addition and lossy compression, we add different Gaussian noise p=0, o2=0.001, p=0, o2=0.005, p=0, a2=0.01 on copy-paste forgery image respectively and use different Gaussian low pass filter with 3x3 template, 5x5 template, 7x7 template, and different compression factors 50, 75, 85.

According to the experiment results, for noise addition show in Fig. 5, the detection effect declines slightly with the increase of noise factor, however, tampered area can still be detected correctly .The robustness against Gaussian low-pass filtering does not decrease as the template increases as show in Fig. 6. For addition of compression factor, the detection effect declines accordingly, but tampered area can still be detected correctly as show in Fig. 7. All this shows that detection algorithms based on histogram invariant moments based have better robustness.

4.4. Algorithm Comparison We have compared the proposed algorithm in this paper with those algorithms in [2, 3, 8], the results are shown in Table 1: The algorithm in paper [2] has poor robustness against area copy-paste tampering. The algorithm in paper[3] has better robustness because it improves the algorithm in paper [2]. The algorithm in paper [8] can detect rotation, but it cannot detect scaling. Compared with paper[2, 3, 8], the proposed algorithm based on histogram invariant moments not only has good robustness, but also can detect tampered area which has been rotated and scaled.

5. Conclusions Image region copy-paste forgery is a simple and common method of tampering. Detection algorithm must have certain robustness in anti-noise and anticompression, as well as anti-geometry change. For this type of tampering, this paper proposes a detection algorithm based on normalized histogram invariant moments. The algorithm first carries out discrete wavelet transform on the tampered image to extract low frequency part, then divides the lowfrequency image into circular blocks. After that, histogram invariant moments characteristic vectors of these blocks are extracted and a characteristic matrix is constructed using these vectors. The characteristic vectors in the matrix are sorted by dictionary to reduce matching space. And similarity of characteristic vectors of two adjacent blocks is compared to determine whether some of them are copy-paste image blocks using judging threshold. Simulation experiments show that the algorithm can not only accurately locate the forged regions of copypaste images, but also can effectively resist noise and lossy JPEG compression, as well as rotation and scaling attacks. Meanwhile, the amount of computation is effectively reduced and detection efficiency is improved.

References [1] . B. L. Shivakumar and S. Santhosh Baboo, Detecting copy-move forgery in digital images: a survey and analysis of current methods, Global Journal of Computer Science and Technology, Vol. 10, Issue 7, 2011, pp. 61-65.

[2] . F. D. Soukal, and J. Lukás, Detection of copy move forgery in digital images, in Proceedings of the Digital Forensic Research Workshop, August 2003.

[3] . A. C. Popescu and H. Farid, Exposing digital forgeries by detecting duplicated image regions, Technical Report, TR2004-515, Department of Computer Science, Dartmouth, 2004.

[4] . W. Qiong, L. Guohui, S. Shaojie, T. Dan, Detection of copy forgery regions in the image based on wavelet and singular value decomposition, Journal of Chinese Computer Systems, Vol. 29, No. 4, 2008, pp. 730-733.

[5] . J. Li, Z. Huijuan, Detection and location of image region duplication forgery based on phase correlation, Journal of Computer Applications, Vol. 33, Issue 3, 2013, pp. 660-662, 687.

[6] . Z. Wei, Image feature extraction based on unitary projection histogram moment invariant, Computer Engineering, Vol. 37, No. 1, January 2011, pp. 226-228.

[7] . W. Jue, S. Xiaowei, Image matching based on the invariant feature of circle projection and histogram, Automation Technology and Application, Vol. 26, Issue 8,2007, pp. 80-82.

[8] . Z. Junhong, Research on passive digital image forensics based on image content, South China University of Technology, October 2011, pp. 58-62.

1 Junliu ZHONG,2 Yanfen GAN 1 College of Information Engineering, Guangdong Mechanical & Electrical College, Guangzhou, China 2 Information Science and Technology Department Guangdong University of Foreign Studies South China Business College, Guangzhou, China 1 Tel:+8615013207300 1 E-mail:[email protected] Received: 16 September 2013 /Accepted: 22 November 2013 /Published: 30 December 2013 (c) 2013 International Frequency Sensor Association

[ Back To TMCnet.com's Homepage ]