
DATA HIDING THROUGH IMAGE STEGANOGRAPHY

[B.Tech Project Report]


A Project By
Md Rameez Akhtar, Ishita Chel, Kushal Kannungo, Soumyajit Das

Project Guide Prof. Anjan Payra

Dr Sudhir Chandra Sur Degree Engineering College [Computer Science and Engineering Dept]

West Bengal University of Technology


(2009-2013)

DATA HIDING THROUGH IMAGE STEGANOGRAPHY


Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of Bachelor of Technology in Computer Science and Engineering by
Md Rameez Akhtar (09255001001), Ishita Chel (09255001013), Kushal Kannungo (09255001049), Soumyajit Das (09255001051)

Project Guide Prof. Anjan Payra

Dr Sudhir Chandra Sur Degree Engineering College Computer Science and Engineering Dept [Affiliated to WBUT]

West Bengal
2009-2013


CERTIFICATE
This is to certify that this project report entitled "DATA HIDING THROUGH IMAGE STEGANOGRAPHY" by Md Rameez Akhtar, Ishita Chel, Kushal Kanungo and Soumyajit Das, submitted in partial fulfilment of the requirements for the degree of Bachelor of Technology in Computer Science and Engineering of the West Bengal University of Technology, West Bengal, during the academic year 2012-13, is a bona fide record of work carried out under our guidance and supervision.

The results embodied in this report have not been submitted to any other University or Institution for the award of any degree or diploma.

(Guide) Prof. Anjan Payra

(Head of Dept) Prof. Samir Ghosh

Acknowledgement

It is our privilege to express our sincerest regards to our project coordinator, Prof. Anjan Payra, for his valuable inputs, able guidance, encouragement, whole-hearted cooperation and constructive criticism throughout the duration of our project. We deeply express our sincere thanks to our Head of Department Prof. S Kundu for encouraging and allowing us to present the project on the topic Data Hiding Through Image Steganography at our department premises for the partial fulfillment of the requirements leading to the award of B-Tech degree. We take this opportunity to thank all our lecturers who have directly or indirectly helped our project. We pay our respects and love to our parents and all other family members and friends for their love and encouragement throughout our career. Last but not the least we express our thanks to our friends for their cooperation and support.

ABSTRACT

The Internet as a whole does not use secure links, so information in transit may be vulnerable to interception. Reducing the chance that information is detected during transmission has therefore become an important issue. One solution, discussed here, is to pass information in such a way that the very existence of the message is unknown, so that it does not attract the attention of a potential attacker. Besides hiding data for confidentiality, this approach to information hiding can be extended to copyright protection for digital media. In this report, we explain what steganography is, why it is important, and which techniques are used to implement it. We focus on the Least Significant Bit (LSB) technique for hiding messages in an image. The system enhances the LSB technique by randomly dispersing the bits of the message across the image, making it harder for unauthorized people to extract the original message.

Keywords: Steganography, information hiding

Introduction
Steganography brings science to the art of hiding information. The purpose of steganography is to convey a message inside a conduit of misinterpretation such that the existence of the message is hidden and, if it is discovered, the message is difficult to recover. The information hiding process in a steganographic system starts by identifying the redundant bits of a cover medium. The embedding process creates a stego medium by replacing these redundant bits with data from the hidden message. The basic purpose is to make communication unintelligible to those who do not possess the right keys.

The first step in embedding and hiding information is to pass both the secret message and the cover message into the encoder. Inside the encoder, one or several protocols are implemented to embed the secret information into the cover message. A key is needed in the embedding process. Using a key reduces the chance that a third-party attacker who gets hold of the stego object can decode it and find the secret information. In general, the embedding process inserts a mark $X$ into an object $Y$. A key $K$, usually produced by a random number generator, is used in the embedding process, and the resulting marked object $Y'$ is generated by the mapping $X \times Y \times K \to Y'$.

Having passed through the encoder, a stego object is produced. A stego object is the original cover object with the secret information embedded inside. This object should look almost identical to the cover object, as otherwise a third-party attacker can see that information has been embedded. Having been produced, the stego object is sent off via some communication channel. At the receiving end, the stego object is fed into the system, together with the public or private key that corresponds to the key used in the encoding process, which is needed to detect the secret information.

One of the reasons that intruders can be successful is that most of the information they acquire from a system is in a form that they can read and comprehend. Intruders may reveal the information to others, modify it to misrepresent an individual or organization, or use it to launch an attack. One solution to this problem is the use of steganography. Steganography is a technique of hiding information in digital media. In contrast to cryptography, its aim is not to keep others from knowing the hidden information but to keep others from thinking that the information even exists.

Background of the Problem


Steganography becomes more important as more people join the cyberspace revolution. Steganography is the art of concealing information in ways that prevent the detection of hidden messages. It includes an array of secret communication methods that hide the message from being seen or discovered. The goal of steganography is to avoid drawing suspicion to the existence of a hidden message. This approach to information hiding has recently become important in a number of application areas. Digital audio, video, and pictures are increasingly furnished with distinguishing but imperceptible marks, which may contain a hidden copyright notice or serial number, or even help to prevent unauthorized copying directly. Military communications systems make increasing use of traffic security techniques which, rather than merely concealing the content of a message using encryption, seek to conceal its sender, its receiver or its very existence. Similar techniques are used in some mobile phone systems and in schemes proposed for digital elections. Some of the techniques used in steganography are image-domain tools (simple systems such as least significant bit (LSB) insertion and noise manipulation) and transform-domain tools (which involve manipulation algorithms and image transformations such as the discrete cosine transformation and the wavelet transformation). There are also techniques that share characteristics of both image-domain and transform-domain tools, such as patchwork, pattern block encoding, spread spectrum methods and masking.

Objective
This project has the following objectives:
(i) To produce a security tool based on steganographic techniques.
(ii) To explore techniques of hiding data using steganography.

Results obtained hiding the message 111 in the pixel 10101000-10101000-10101000 with the LSB method

Motivation

The primary reason for selecting steganography from the list of possible project topics was the unfamiliarity of the word, which sparked an interest in the subject. Another motivation for researching the topic came from reading an online article in USA Today titled "Terror groups hide behind Web encryption", which claims that terrorists, and in particular Osama bin Laden and the al-Qaida network, may be using steganography to communicate with each other in planning terrorist attacks. It is thought that images with hidden messages are placed on bulletin boards or dead drops for other terrorists to pick up and retrieve the hidden messages. Thus far, this supposition has yet to be proven.

The Scope Of Steganography


With the boost in computer power, the internet, and the development of digital signal processing (DSP), information theory and coding theory, steganography has gone digital. In the realm of this digital world, steganography has created an atmosphere of corporate vigilance that has spawned various interesting applications, so its continuing evolution is guaranteed. Cyber-crime is believed to benefit from this digital revolution. Hence an immediate concern is to find the best possible attacks for carrying out steganalysis and, simultaneously, to find techniques that strengthen existing steganography techniques against such attacks.

Cryptography
Cryptography encodes information in such a way that nobody can read it except the person who holds the key. More advanced crypto techniques ensure that the information being transmitted has not been modified in transit. There is a difference between cryptography and steganography: in cryptography the existence of the message is always visible, because the cipher text is transmitted openly even though it cannot be read, whereas in steganography the hidden message is invisible.

Steganography Versus Cryptography


The comparison and contrast between steganography and cryptography is illustrated in the following table.

S.no.  Context         Steganography                                   Cryptography
1      Host Files      Image, Audio, Text, etc.                        Mostly Text Files
2      Hidden Files    Image, Audio, Text, etc.                        Mostly Text Files
3      Result          Stego File                                      Cipher Text
4      Type of Attack  Steganalysis: analysis of a file with the       Cryptanalysis
                       objective of finding whether or not it is a
                       stego file

Table: Comparison and contrast between steganography and cryptography.

Steganalysis
Steganalysis is a relatively new research discipline, with few articles appearing before the late 1990s. Steganalysis is "the process of detecting steganography by looking at variances between bit patterns and unusually large file sizes". It is the art of discovering and rendering useless covert messages. The goal of steganalysis is to identify suspected information streams, determine whether or not they have hidden messages encoded into them, and, if possible, recover the hidden information. The challenges of steganalysis are:
1. The suspect information stream, such as a signal or a file, may or may not have hidden data encoded into it.
2. The hidden data, if any, may have been encrypted before being inserted into the signal or file.
3. Some of the suspect signals or files may have noise or irrelevant data encoded into them (which can make analysis very time consuming).
4. Unless it is possible to fully recover, decrypt and inspect the hidden data, often one has only a suspect information stream and cannot be sure that it is being used for transporting secret information.

Modern Terminology and Framework

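[Figure: framework of a steganographic system. The secret message and the cover message are passed, with a secret key, to the embedding algorithm, which produces the stego message. A check "Is stego message?" lets the message pass (no) or suppresses it (yes). The message retrieval algorithm, using the secret key, recovers the secret message.]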

Histogram

A histogram is one of the basic quality tools. It is used to graphically summarize and display the distribution and variation of a process data set. A frequency distribution shows how often each different value in a set of data occurs. The main purpose of a histogram is to clarify the presentation of data. You can present the same information in a table; however, the graphic presentation format usually makes it easier to see relationships. It is a useful tool for breaking out process data into regions or bins for determining frequencies of certain events or categories of data. These charts can help show the most frequent values.

Typical applications of histograms in root cause analysis include:
- Presenting data to determine which causes dominate
- Understanding the distribution of occurrences of different problems, causes, consequences, etc.

A histogram can typically help you answer the following questions:
- What is the most common system response?
- What distribution (center, variation and shape) does the data have?
- Does the data look symmetric, or is it skewed to the left or right?

A histogram is a specialized type of bar chart. Individual data points are grouped together in classes, so that you can get an idea of how frequently data in each class occur in the data set. High bars indicate more points in a class, and low bars indicate fewer points.
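For an 8-bit grayscale image, the histogram is simply a count of how many pixels take each of the 256 gray levels. The following is a minimal sketch in Python, assuming the image is available as a flat list of integer pixel values; the function name `gray_histogram` and the sample data are hypothetical and not taken from the report.

```python
def gray_histogram(pixels, levels=256):
    """Count how often each gray level occurs in a flat list of pixel values."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts

# Hypothetical 8-bit grayscale samples (not taken from the report's images).
sample = [0, 0, 12, 12, 12, 255, 128, 128]
hist = gray_histogram(sample)

# The tallest bar of the histogram marks the most frequent gray level.
most_common_level = max(range(256), key=lambda level: hist[level])
```

Comparing the histogram of a cover image with that of its stego image is one simple way to look for the statistical traces that embedding leaves behind.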

Original image

Fig. 1 grayscale image

Histogram of Fig. 1

Least Significant Bit Substitution


In LSB steganography, the least significant bits of the cover media's digital data are used to conceal the message. The simplest of the LSB steganography techniques is LSB replacement. LSB replacement steganography sets the last bit of each of the data values to reflect the message that needs to be hidden. Consider an 8-bit grayscale bitmap image where each pixel is stored as a byte representing a grayscale value. Suppose the first eight pixels of the original image have the following grayscale values:

11010010 01001010 10010111 10001100 00010101 01010111 00100110 01000011

To hide the letter C, whose ASCII code in binary is 01000011, we would replace the LSBs of these pixels to obtain the following new grayscale values:

11010010 01001011 10010110 10001100 00010100 01010110 00100111 01000011

Note that, on average, only half the LSBs need to change; in this example five of the eight pixels are modified. The difference between the cover (i.e. original) image and the stego image will be hardly noticeable to the human eye. Figures (a) and (b) show a cover image and a stego image (with data embedded); there is no visible difference between the two images.
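LSB replacement can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `lsb_replace` and `lsb_extract` are hypothetical, the eight cover values are the grayscale values listed above, and the message is the standard ASCII code of the letter C (01000011).

```python
def lsb_replace(pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    stego = list(pixels)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & ~1) | bit  # clear the LSB, then set it
    return stego

def lsb_extract(pixels, count):
    """Read back the first `count` least significant bits."""
    return [pixel & 1 for pixel in pixels[:count]]

cover = [0b11010010, 0b01001010, 0b10010111, 0b10001100,
         0b00010101, 0b01010111, 0b00100110, 0b01000011]
message_bits = [int(bit) for bit in format(ord("C"), "08b")]  # 0,1,0,0,0,0,1,1

stego = lsb_replace(cover, message_bits)
changed = sum(1 for c, s in zip(cover, stego) if c != s)
```

Each pixel value changes by at most one gray level, which is why the stego image is visually indistinguishable from the cover.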

Fig. a Cover image

Fig. b Stego image

LSB steganography, as described above, replaces the LSBs of data values to match bits of the message. It can equally alter the data value by a small amount, ensuring that the legal range of data values is preserved. The difference is that the choice of whether to add or subtract one from the cover image pixel is random. This has the same effect as LSB replacement in terms of not being able to perceive the existence of the hidden message. This steganographic technique is called LSB matching. Both LSB replacement and LSB matching leave the data value unchanged if the message bit matches the LSB. When the message bit does not match the LSB, LSB replacement replaces the LSB with the message bit, whereas LSB matching randomly increments or decrements the data value by one. LSB matching is also known as ±1 embedding.
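The difference between the two techniques can be made concrete with a short Python sketch of LSB matching. The function name `lsb_match` and the sample values are hypothetical; the handling of the boundary values 0 and 255 (always stepping back into the legal range) is one reasonable reading of "ensuring the legal range of data values is preserved", not a detail given in the text.

```python
import random

def lsb_match(pixels, bits, rng):
    """Embed bits by randomly adding or subtracting 1 where the LSB differs."""
    stego = list(pixels)
    for index, bit in enumerate(bits):
        if (stego[index] & 1) == bit:
            continue                      # LSB already matches: leave the pixel alone
        if stego[index] == 0:
            stego[index] = 1              # cannot go below 0
        elif stego[index] == 255:
            stego[index] = 254            # cannot go above 255
        else:
            stego[index] += rng.choice((-1, 1))
    return stego

rng = random.Random(0)                    # fixed seed only to make the sketch repeatable
cover = [0, 255, 100, 101, 37]
bits = [1, 0, 1, 0, 0]
stego = lsb_match(cover, bits, rng)
```

The receiver extracts the message exactly as in LSB replacement, by reading the least significant bit of each pixel.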

In the case of still grayscale images of type bitmap, every pixel is represented using 8 bits, with 11111111 (=255) representing white and 00000000 (=0) representing black. Thus, there are 256 different grayscale shades from black to white which are used in grayscale bitmap images. In LSB steganography, the LSBs of the cover image are changed. As the message bit to be substituted into the LSB position of the cover image is either 0 or 1, one can state without loss of generality that the LSBs of about 50 percent of the pixels change. There are three possibilities:
1. The intensity value of a pixel remains unchanged.
2. An even value changes to the next higher odd value.
3. An odd value changes to the previous lower even value.

MATERIALS AND METHODS


First Component Alteration Technique For Image Steganography
In this technique, a new image steganography scheme based on first component alteration is introduced. In a computer, images are represented as arrays of values. These values represent the intensities of the three colors R (Red), G (Green) and B (Blue), where a value for each of the three colors describes a pixel. Each pixel is a combination of three components (R, G and B). In this scheme, the bits of the first component (blue component) of the pixels of the image are replaced with data bits, which are applied only when a valid key is used. The blue channel is selected because research conducted by Hecht reveals that the visual perception of intensely blue objects is less distinct than the perception of red and green objects.

For example, suppose one can hide a message in three pixels of an image (24-bit colors). Suppose the original 3 pixels are:
(00100111 11101001 11001000) (00100111 11001000 11101001) (11001000 00100111 11101001)
A steganographic program could hide the letter "A", which has position 65 in the ASCII character set and the binary representation "01000001", by altering the blue channel bits of the pixels:
(01000001 11101001 11001000) (00100111 11001000 11101000) (11001000 00100111 11101001)

A. Embedding phase
The embedding process is as follows.
Inputs: Image file and the text file
Output: Text embedded image
Procedure:
Step 1: Extract all the pixels in the given image and store them in the array called Pixel-array.
Step 2: Extract all the characters in the given text file and store them in the array called Character-array.
Step 3: Extract all the characters from the stego key and store them in the array called Key-array.
Step 4: Choose the first pixel, pick characters from Key-array and place them in the first component of the pixel. If there are more characters in Key-array, place the rest in the first component of the next pixels; otherwise follow Step 5.
Step 5: Place some terminating symbol to indicate the end of the key. 0 has been used as the terminating symbol in this algorithm.
Step 6: Place the characters of Character-array in the first component (blue channel) of the next pixels by replacing it.
Step 7: Repeat Step 6 till all the characters have been embedded.
Step 8: Again place some terminating symbol to indicate the end of the data.
Step 9: The obtained image hides all the characters that were input.
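The embedding steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: pixels are lists of three integers whose index 0 stands for the "first component", characters are limited to codes 1 to 255 so that 0 can serve as the terminating symbol, and the names `embed`, `extract` and `TERMINATOR` are hypothetical. A matching extractor, following the extraction phase described in the next subsection, is included so that the round trip can be checked.

```python
TERMINATOR = 0  # terminating symbol used for both the key and the data

def embed(pixels, key, message):
    """Write key, terminator, message, terminator into the first components."""
    if any(not 1 <= ord(ch) <= 255 for ch in key + message):
        raise ValueError("characters must have codes 1..255")
    payload = ([ord(ch) for ch in key] + [TERMINATOR]
               + [ord(ch) for ch in message] + [TERMINATOR])
    if len(payload) > len(pixels):
        raise ValueError("cover image has too few pixels")
    stego = [list(pixel) for pixel in pixels]   # copy so the cover is untouched
    for index, code in enumerate(payload):
        stego[index][0] = code                  # replace the first component
    return stego

def extract(pixels, key):
    """Check the embedded key, then read the message up to the terminator."""
    values = [pixel[0] for pixel in pixels]
    key_end = values.index(TERMINATOR)
    if "".join(chr(v) for v in values[:key_end]) != key:
        raise ValueError("Key is not matching")
    message_end = values.index(TERMINATOR, key_end + 1)
    return "".join(chr(v) for v in values[key_end + 1:message_end])

cover = [[39, 233, 200] for _ in range(16)]     # hypothetical 16-pixel cover
stego = embed(cover, "k1", "Hi")
recovered = extract(stego, "k1")
```

Because a whole character replaces the first component, each carrying pixel can change by far more than one intensity level, unlike the LSB methods described earlier.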

B. Extraction phase
Inputs: Embedded image file
Output: Secret text message
Procedure:
Step 1: Consider three arrays. Let them be Character-array, Key-array and Pixel-array.
Step 2: Extract all the pixels in the given image and store them in the array called Pixel-array.
Step 3: Start scanning pixels from the first pixel, extract key characters from the first (blue) component of the pixels and place them in Key-array. Repeat Step 3 until the terminating symbol is reached, then follow Step 4.
Step 4: If the extracted key matches the key entered by the receiver, follow Step 5; otherwise terminate the program by displaying the message "Key is not matching".
Step 5: If the key is valid, continue scanning the next pixels, extract secret message characters from the first (blue) component of these pixels and place them in Character-array. Repeat Step 5 until the terminating symbol is reached, then follow Step 6.
Step 6: Extract the secret message from Character-array.

The primary motivation of the current work is to increase the PSNR (peak signal-to-noise ratio). For this purpose we employ an approach which hides a secret image in a cover image with the help of logic gates.

Algorithm:
Step 1: Read the image to be embedded (the secret image).
Step 2: Read the image inside which the message is to be embedded (the cover image).
Step 3: Set numSignificantBits = n, where n = 1, 2, ..., 8.
Step 4: size1 = size(secret) and size2 = size(coverImage).
Step 5: Set the n least significant bits of each byte of the cover image to zero by applying a bitwise AND operation to the cover image.
Step 6: Embed the n most significant bits of the secret image to create the stego image, using stego = coverZero + secret / 2^(8-n), where coverZero is the result of Step 5 and the division is an integer division.
Step 7: Recover the embedded image by using a bitwise shift operation.
Step 8: Display figures of the cover image, the image to be hidden, the stego image and the recovered image.
Step 9: End.
Note: As the value of n increases, the quality of the stego image degrades, because more of its bits are overwritten, while the recovered image retains more bits of the secret image.

The proposed method is applicable to both 24-bit color and 8-bit gray images. The conversion of a 24-bit color image to an 8-bit grayscale image is done as follows:
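Before turning to the grayscale conversion, Steps 5 to 7 of the algorithm above can be sketched in Python. This is a minimal illustration, assuming both images are flat lists of 8-bit values of equal length; the names `embed_image` and `recover_image` and the sample values are hypothetical.

```python
def embed_image(cover, secret, n):
    """Hide the n most significant bits of each secret byte in the n LSBs of the cover."""
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")
    if len(cover) != len(secret):
        raise ValueError("cover and secret must have the same size")
    keep_mask = (0xFF << n) & 0xFF              # AND mask that zeroes the n LSBs
    return [(c & keep_mask) | (s >> (8 - n))    # s >> (8 - n) equals s // 2**(8 - n)
            for c, s in zip(cover, secret)]

def recover_image(stego, n):
    """Shift the n hidden bits back to the most significant positions."""
    low_mask = (1 << n) - 1
    return [(p & low_mask) << (8 - n) for p in stego]

cover = [200, 17, 255, 0]
secret = [255, 128, 64, 7]
stego = embed_image(cover, secret, 3)
recovered = recover_image(stego, 3)
```

With n = 3, each stego value differs from the cover by at most 7 gray levels, and the recovered image keeps only the top three bits of the secret image.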

Conversion Of Color Image Into Greyscale Image


Conversion of a color image to grayscale can be done using several approaches. Different weightings of the primary colors effectively represent the effect of obtaining a black-and-white image from a color image. A common strategy is to match the luminance of the grayscale image to the luminance of the color image. The proposed method is applicable to both 24-bit color and 8-bit gray images.

To convert any color to a grayscale representation of its luminance, one must first obtain the values of its red, green, and blue (RGB) primaries in linear intensity encoding, by gamma expansion. Then, add together 30% of the red value, 59% of the green value, and 11% of the blue value (these weights depend on the exact choice of the RGB primaries, but are typical). Regardless of the scale employed (0.0 to 1.0, 0 to 255, 0% to 100%, etc.), the resultant number is the desired linear luminance value; it typically needs to be gamma compressed to get back to a conventional grayscale representation. To convert a gray intensity value to RGB, simply set all three primary color components red, green and blue to the gray value, correcting to a different gamma if necessary. The method adopted in the current work for experimental evaluation is to obtain the RGB values of individual pixels and to take their average, normalized to fit in the scale 0 to 255.
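Both conversions described above are short enough to sketch in Python. The sketch below is illustrative only: the function names are hypothetical, and the gamma expansion and compression steps are deliberately omitted, so the weighted version treats the stored values as if they were already linear.

```python
def luminance_gray(r, g, b):
    """Weighted conversion: 30% red, 59% green, 11% blue (gamma handling omitted)."""
    return round(0.30 * r + 0.59 * g + 0.11 * b)

def average_gray(r, g, b):
    """Conversion adopted in this work: plain average of the three components."""
    return round((r + g + b) / 3)

def to_rgb(gray):
    """Convert a gray value back to RGB by copying it into all three components."""
    return (gray, gray, gray)
```

For neutral colors (r = g = b) the two methods agree; they differ for saturated colors, where the weighted version reflects the eye's greater sensitivity to green.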

Pixel-value differencing
Pixel-value differencing (PVD) based steganography is one of the popular approaches for secret data hiding in the spatial domain. However, based on extensive experiments, we find that some statistical artifacts will inevitably be introduced, even with a low embedding capacity, in most existing PVD-based algorithms. In this paper, we first analyze the common limitations of the original PVD and its modified versions, and then propose a more secure steganography based on a content adaptive scheme. In our method, a cover image is first partitioned into small squares. Each square is then rotated by a random angle of 0°, 90°, 180° or 270°. The resulting image is then divided into non-overlapping embedding units of three consecutive pixels, and the middle one is used for data embedding. The number of embedded bits depends on the differences among the three pixels. To preserve the local statistical features, the sort order of the three pixel values remains the same after data hiding. Furthermore, the new method can adaptively use sharper edge regions for data hiding first, while preserving other smoother regions by adjusting a parameter.

The experimental results evaluated on a large image database show that our method achieves much better security compared with the previous PVD-based methods. Based on the embedding domains, image steganographic algorithms can be classified into two types: those embedding in the spatial domain, such as LSB (Least Significant Bit) based approaches and PVD-based approaches, and those embedding in the transform domain, such as F5 and OutGuess. In this paper, we focus on an adaptive and secure steganography in the spatial domain. LSB-based steganography is one of the well-known approaches in the spatial domain, in which the least significant bits of a cover image along a pseudo-random route are changed according to the secret bit stream to be embedded. Those methods assume that all pixels within an image can tolerate equal amounts of change without causing visual artifacts to an observer. However, this is not true, especially for images with smoother and/or more regular regions.

Based on the fact that human vision is sensitive to slight changes in smooth regions, while it can tolerate more severe changes in edge regions, the PVD-based methods have been proposed to enhance the embedding capacity without introducing obvious visual artifacts into stego images. In PVD-based schemes, the number of embedded bits is determined by the difference between a pixel and its neighbor. The larger the difference is, the more secret bits can be embedded. Usually, PVD-based approaches achieve more imperceptible results than typical LSB-based approaches with the same embedding capacity. However, based on extensive experiments and analysis, we find that most existing PVD-based algorithms resist some statistical analyses poorly, even with a low embedding capacity, e.g. 10% bpp (bits per pixel).

Analysis on PVD-based steganography


Overview of PVD-based steganography
In the original PVD scheme proposed by Wu and Tsai, the procedure of data embedding is as follows.

Step 1: A cover image is first rearranged as a row vector by running through all rows in a raster scanning manner. The vector is then divided into non-overlapping 1-by-2 embedding units. For each unit, say $[g_i, g_{i+1}]$, a difference $d$ is calculated by $d = g_{i+1} - g_i$, where $g_i, g_{i+1} \in [0, \ldots, 255]$.

Step 2: The absolute difference $|d|$, where $|d| \in [0, \ldots, 255]$, is classified into a number of contiguous ranges denoted as $R_i$, where $i = 1, 2, \ldots, n$. The lower bound, upper bound and width of region $R_i$ are denoted as $l_i$, $u_i$ and $w_i$, respectively. A typical setting of the regions is $[0, 7]$, $[8, 15]$, $[16, 31]$, $[32, 63]$, $[64, 127]$ and $[128, 255]$. Assume $|d|$ belongs to region $R_k$, where $k \in \{1, 2, \ldots, 6\}$.

Step 3: Determine the number of embedded bits by $n = \log_2(w_k)$, then select the next sub-stream of $n$ bits from the secret message and convert it into a decimal value $b$.

Step 4: Calculate the new difference $d'$ by

$$d' = \begin{cases} l_k + b, & d \ge 0 \\ -(l_k + b), & d < 0 \end{cases}$$

The pixel pair is then modified to $(g'_i, g'_{i+1})$ so that $g'_{i+1} - g'_i = d'$.

If $g'_i$ or $g'_{i+1}$ is out of the range [0, 255], then the candidate embedding unit is marked as abandoned. Note that such unused units are few in most natural images and are very easy to detect. It can be proven that the new absolute difference $|d'| = |g'_{i+1} - g'_i|$ in the stego image falls into the same region $R_k$ as the difference $|d| = |g_{i+1} - g_i|$ in the cover image. So in data extraction, if $|d'| \in R_k$, the embedded value can be extracted correctly by $b = |d'| - l_k$.

A subsequent analysis shows that the fixed regions $R_i$ employed in the original PVD scheme (Step 2) introduce some unusual steps in the histogram of pixel value differences (PVD histogram) over all the embedding units, which can be used to expose the presence of a hidden message and further to estimate the size of the hidden bits. To make the steganography immune to PVD histogram based analysis, its authors employ random regions instead of the fixed regions when hiding data. Their experimental results show that the new approach can effectively eliminate those undesired steps.

To improve the embedding capacity, a new PVD-based steganography combined with the LSB technique is proposed by Wu et al. In this method, Steps 1 and 2 are the same as in the original PVD scheme. After that, if $R_k$ belongs to the higher level, namely $|d| > 15$, the method embeds secret bits using the original method. Otherwise, if $|d|$ belongs to the lower level, namely $|d| \le 15$, it embeds three bits into each of $g_i$ and $g_{i+1}$ using 3-LSB replacement. Assume that the pixel values after data hiding are denoted as $g'_i$ and $g'_{i+1}$; then calculate $|d'| = |g'_{i+1} - g'_i|$. If the new difference $|d'|$ does not belong to the lower level, the pixel values are readjusted by

[Equation missing from source.]

A more recent work divides the pixel differences into three levels: lower level, middle level and higher level. Similar to Wu's approach, $k$ secret bits are embedded into each of $g_i$ and $g_{i+1}$, where $k = l, m, h$ if $|d|$ belongs to the lower level, middle level and higher level, respectively. After that, the method applies the modified LSB approach [1, 5] to the resulting pixel pair and gets the new pixel pair $(g'_i, g'_{i+1})$. If the new difference $|d'| = |g'_{i+1} - g'_i|$ belongs to a different level, the pair is readjusted into the following forms:

[Equation missing from source.]

Finally, the method selects the best choice of the new values $g'_i, g'_{i+1}$, which satisfies the conditions that the new difference $|d'|$ and the original one belong to the same level and that the value of $|g_i - g'_i| + |g_{i+1} - g'_{i+1}|$ is the smallest among the candidates. The new method can provide stego images with larger embedding capacity as well as higher objective quality.
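To make the original Wu and Tsai procedure (Steps 1 to 4 above) concrete, the following Python sketch embeds into and extracts from a single pixel pair using the typical region setting. Several details are assumptions rather than statements from the text: the function names are hypothetical, the change $d' - d$ is split as evenly as possible between the two pixels, and a pair whose new values leave [0, 255] is simply reported as abandoned. A complete implementation would also need an abandonment test that the receiver can repeat.

```python
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def find_range(abs_d):
    """Return (lower, upper) of the region that contains |d|."""
    for lower, upper in RANGES:
        if lower <= abs_d <= upper:
            return lower, upper
    raise ValueError("difference out of range")

def embed_pair(g0, g1, bitstream):
    """Embed leading bits of `bitstream`; return (new_pair or None, bits_used)."""
    d = g1 - g0
    lower, upper = find_range(abs(d))
    n = (upper - lower + 1).bit_length() - 1       # n = log2(width)
    b = int(bitstream[:n], 2)                      # assumes at least n bits remain
    new_d = lower + b if d >= 0 else -(lower + b)
    m = new_d - d
    half_floor, half_ceil = m // 2, -(-m // 2)     # floor(m/2) and ceil(m/2)
    if d % 2:
        pair = (g0 - half_ceil, g1 + half_floor)
    else:
        pair = (g0 - half_floor, g1 + half_ceil)
    if not all(0 <= g <= 255 for g in pair):
        return None, 0                             # unit is abandoned
    return pair, n

def extract_pair(g0, g1):
    """Recover the embedded bits from a stego pair via b = |d'| - l_k."""
    d = g1 - g0
    lower, upper = find_range(abs(d))
    n = (upper - lower + 1).bit_length() - 1
    return format(abs(d) - lower, "0{}b".format(n))

pair_a, used_a = embed_pair(100, 110, "101")       # d = 10 lies in [8, 15], 3 bits
pair_b, used_b = embed_pair(200, 180, "0011")      # d = -20 lies in [16, 31], 4 bits
```

Pairs with larger differences fall into wider regions and therefore carry more bits, which is how the scheme concentrates the payload in edge regions.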

Properties of PVD-based steganography


As described in the previous section, the main idea of the existing PVD-based approaches is that they first divide a cover image into non-overlapping embedding units of two consecutive pixels in a raster scanning manner, and then deal with the embedding units separately in a pseudo-random order. For a given embedding unit, the difference is first calculated and then classified into one of several regions (random or fixed). Usually, more secret bits are embedded in pixel pairs located in edge regions (with larger differences) than in those located in smooth regions (with smaller differences). The embedding strategies may differ according to the steganographic approach and/or the region that the difference belongs to. In order to guarantee the validity of data extraction, the difference in each embedding unit must belong to the same region after data hiding. Otherwise, those approaches need to readjust the pixel values or mark the units as unused. Based on the characteristics of the HVS (human visual system), the original PVD approaches can embed more secret bits into an image with fewer visual artifacts. Up to now, several modified approaches have been proposed to enhance the embedding capacity and/or improve the objective quality of the stego images. However, just a few works address the security issues of the embedding schemes. In the following, we first analyze the common limitations of the existing PVD-based approaches, and then propose a novel and secure scheme based on the modified PVD with more adaptability to the image contents.

SOME POPULAR METHODS AND ALGORITHMS

JPEG File Interchange Format


The file format defined by the Joint Photographic Experts Group (JPEG) stores image data in lossy compressed form as quantised frequency coefficients. Fig. 1 shows the compression steps performed. First, the JPEG compressor cuts the uncompressed bitmap image into parts of 8 by 8 pixels. The discrete cosine transformation (DCT) transfers 8 × 8 brightness values into 8 × 8 frequency coefficients (real numbers). After the DCT, the quantisation suitably rounds the frequency coefficients to integers in the range −2048 ... 2047 (lossy step). The histogram in Fig. 2 shows the discrete distribution of the coefficients' frequency of occurrence. If we look at the distribution in Fig. 2, we can recognise two characteristic properties:
1. The coefficients' frequency of occurrence decreases with increasing absolute value.
2. The decrease of the coefficients' frequency of occurrence itself decreases with increasing absolute value, i.e. the difference between two bars of the histogram is larger in the middle than on the margin.

After the lossy quantisation, the Huffman coding ensures the redundancy-free coding of the quantised coefficients. The following sections mainly refer to the distribution in Fig. 2. Statements of file sizes and steganographic capacities relate to the true colour image Expo shown in Fig. 3.
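The DCT and quantisation steps described above can be illustrated with a small pure-Python sketch. It is not a JPEG encoder: the orthonormal DCT-II scaling, the level shift by 128 and the single uniform quantisation step are simplifying assumptions (real JPEG files use an 8 × 8 quantisation table), and the function names are hypothetical.

```python
import math

N = 8  # JPEG works on blocks of 8 by 8 pixels

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of 8 rows."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs

def quantise(coeffs, step):
    """Lossy step: divide by a uniform step size and round to integers."""
    return [[round(c / step) for c in row] for row in coeffs]

# A flat block of brightness 138, level-shifted by 128 before the transform.
flat_block = [[138 - 128] * N for _ in range(N)]
quantised = quantise(dct_2d(flat_block), 16)
```

For a flat block, all the energy ends up in the single DC coefficient at position (0, 0) and every other coefficient is zero, which is consistent with the peak around zero in the coefficient histogram.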

Jsteg
This algorithm by Derek Upham serves as the starting point here, because it is resistant to visual attacks and nevertheless offers an admirable capacity for steganographic messages (e.g. 12.8 % of the steganogram's size). After quantisation, Jsteg replaces the least significant bits (LSB) of the frequency coefficients with the secret message. The embedding mechanism skips all coefficients with the values 0 or 1. Fig. 4 shows Derek Upham's embedding function of Jsteg in C source code.

However, the statistical attack on Jsteg reliably discovers the existence of embedded messages. Because Jsteg replaces bits, it introduces a dependency between the frequencies of occurrence of values that differ only in this bit position (here: the LSB). Jsteg influences pairs of the coefficients' frequency of occurrence, as Fig. 5 shows. Let c_i be the histogram of JPEG coefficients. The assumption for a modified image is that adjacent frequencies c_{2i} and c_{2i+1} are similar. We take the arithmetic mean

$$n_i^* = \frac{c_{2i} + c_{2i+1}}{2} \qquad (1)$$

to determine the expected distribution and compare against the observed distribution

$$n_i = c_{2i} \qquad (2)$$

The difference between the distributions n_i and n_i^* is given as

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - n_i^*)^2}{n_i^*} \qquad (3)$$

with k − 1 degrees of freedom, which is the number of different categories in the histogram minus one. Fig. 6 shows the statistical attack on a Jsteg steganogram (with 50 % of the capacity used, i.e. 7680 bytes). The diagram presents the probability of embedding

$$p = 1 - \int_0^{\chi^2} \frac{t^{(k-1)/2-1}\, e^{-t/2}}{2^{(k-1)/2}\, \Gamma\!\left((k-1)/2\right)}\, dt \qquad (4)$$

as a function of an increasing sample: initially, the sample comprises the first 1 % of the JPEG coefficients, then the first 2 %, 3 %, and so on. The probability is 1.00 up to 54 % and 0.45 at 56 %; a sample of 59 % and more contains enough unchanged coefficients to let the p-value drop to 0.00.
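To make the pair dependency concrete, the following sketch overwrites LSBs in the manner described for Jsteg and then computes the chi-square statistic over pairs of values. It is an illustration, not Upham's C code: the function names are hypothetical, the coefficients are a toy list rather than data from a JPEG file, and the observed count of each pair is assumed to be the count of its even member.

```python
def jsteg_embed(coeffs, bits):
    """Overwrite the LSB of every coefficient except 0 and 1 with message bits."""
    out, remaining = [], iter(bits)
    for c in coeffs:
        if c in (0, 1):
            out.append(c)          # skipped by the embedding mechanism
            continue
        bit = next(remaining, None)
        out.append(c if bit is None else (c & ~1) | bit)
    return out

def jsteg_extract(coeffs, n_bits):
    """Read the LSBs back from all coefficients that are neither 0 nor 1."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]

def pair_chi_square(coeffs):
    """Chi-square statistic over value pairs (2i, 2i+1); returns (chi2, df)."""
    hist = {}
    for c in coeffs:
        if c not in (0, 1):
            hist[c] = hist.get(c, 0) + 1
    chi2, categories = 0.0, 0
    for even in sorted({c & ~1 for c in hist}):
        observed = hist.get(even, 0)
        expected = (hist.get(even, 0) + hist.get(even + 1, 0)) / 2
        chi2 += (observed - expected) ** 2 / expected
        categories += 1
    return chi2, categories - 1

# Toy cover with clearly unequal pairs, and an alternating message.
cover = [2] * 90 + [3] * 10 + [-2] * 80 + [-1] * 20
message = [0, 1] * 100
stego = jsteg_embed(cover, message)
```

In this toy case the statistic falls from 50 for the cover to 0 for the stego data, because embedding equalises the two counts in every pair. A small statistic corresponds to a high probability of embedding; if SciPy is available, `scipy.stats.chi2.sf(chi2, df)` is one way to turn the pair of values into that probability.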

F3 ALGORITHM
The algorithm F3 serves as a tutorial example. It differs from Jsteg in two respects:

1. Instead of overwriting bits, it decrements the coefficients' absolute values in case their LSB does not match, except for coefficients with the value zero, where we cannot decrement the absolute value. Hence, we do not use zero coefficients steganographically. The LSBs of nonzero coefficients match the secret message after embedding, but we did not overwrite bits, because the Chi-square test can easily detect such changes. So we can hope that no steps will occur in the distribution. In contrast to Jsteg, F3 uses coefficients with the value 1. The symmetry of 1 and −1 visible in Fig. 2 consequently remains.

2. Some embedded bits fall victim to shrinkage. Shrinkage occurs every time F3 decrements the absolute value of 1 or −1, producing a 0. The receiver cannot distinguish a zero coefficient that is steganographically unused from a 0 produced by shrinkage. It skips all zero coefficients. Therefore, the sender repeatedly embeds the affected bit, since he notices when he produces a zero.

In comparison to Fig. 2, the histogram shows a relative surplus of even coefficients. This phenomenon results from the repeated embedding after shrinkage. Shrinkage occurs only if we embed a zero bit. The repetition of these zero bits shifts the (originally equalised) ratio of steganographic values in favour of the steganographic zeroes. Hence, the F3 embedding process produces more even coefficients than odd. The steganographic interpretation of coefficients with the values −1 or 1 is 1 (because their LSB is 1). For this reason the embedding function keeps them unchanged when it embeds a 1. Fig. 7 shows the conspicuous frequency of occurrence for even and odd coefficients, which we can detect by statistical means.

If we simply ignore the shrinkage, the surplus of even coefficients disappears. Unfortunately, the receiver gets only fragments of the message in this case. The application of an error-correcting code could possibly solve the problem. If we extract putative messages from unchanged carrier media with F3, these messages will have a distribution with more ones than zeroes. Therefore, if we embed more ones than zeroes (in a suitable ratio), the surplus in the histogram disappears as well. A more elegant solution to this problem (F4) makes use of the symmetry in Fig. 2.
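The decrement rule and the repeated embedding after shrinkage can be sketched as follows. This is a minimal illustration on a plain list of integers, with hypothetical function names; it is not taken from an existing F3 implementation.

```python
def f3_embed(coeffs, bits):
    """Embed bits by decrementing absolute values whose LSB does not match."""
    out = list(coeffs)
    bits = list(bits)
    pos = 0                              # index of the next message bit
    for i, c in enumerate(out):
        if pos == len(bits):
            break
        if c == 0:
            continue                     # zero coefficients are never used
        if (c & 1) != bits[pos]:
            c = c - 1 if c > 0 else c + 1
            out[i] = c
        if c != 0:
            pos += 1                     # bit is embedded
        # otherwise shrinkage occurred: the same bit is embedded again
    if pos < len(bits):
        raise ValueError("cover has too few usable coefficients")
    return out

def f3_extract(coeffs, n_bits):
    """Read the LSBs of all nonzero coefficients."""
    return [c & 1 for c in coeffs if c != 0][:n_bits]

cover = [3, -1, 2, 1, 0, -4, 5, 1, -2]
message = [0, 0, 1, 1, 0]
stego = f3_embed(cover, message)
```

In this run the coefficient −1 shrinks to 0 while a zero bit is being embedded, so that bit is carried by the next nonzero coefficient instead. Only zero bits can trigger this retry, which is the source of the surplus of even coefficients discussed above.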

F4 ALGORITHM

F3 has two weaknesses:

1. Because of the exclusive shrinkage of steganographic zeroes, F3 effectively embeds more zeroes than ones, and produces (as does Jsteg, but in a different way) statistically detectable peculiarities in the histogram.

2. The histogram of JPEG files (Fig. 2) contains more odd than even coefficients (excluding 0). Therefore, unchanged carrier media contain (from Jsteg's or F3's perspective) more steganographic ones than zeroes.

The algorithm F4 eliminates these two weaknesses in one stroke by mapping negative coefficients to the inverted steganographic value: even negative coefficients represent a steganographic one, odd negative a zero; even positive represent a zero (as before with Jsteg and F3), and odd positive a one. In Fig. 8 each two bars of the same height represent coefficients with inverse steganographic value (steganographic zeroes are black, steganographic ones white). Fig. 9 shows the embedding loop of F4 in Java source code. The array coeff[] holds all the JPEG coefficients of the carrier medium.

Suppose we have two random variables X, Y for observed coefficients before and after F4 embeds a message. P(X = x) denotes the probability for JPEG producing a coefficient with a given value x, and P(Y = y) denotes the probability for F4 producing a coefficient with a given value y. We can write the two characteristic properties (cf. Sect. 2) for some coefficient values:

$$P(X=1) > P(X=2) > P(X=3) > P(X=4) \qquad (5)$$

$$P(X=1) - P(X=2) > P(X=2) - P(X=3) > P(X=3) - P(X=4) \qquad (6)$$

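If the message bits are uniformly distributed, each coefficient is kept with probability 1/2 and decremented with probability 1/2, so we deduce

$$P(Y=1) = \tfrac{1}{2} P(X=1) + \tfrac{1}{2} P(X=2) \qquad (7)$$

$$P(Y=2) = \tfrac{1}{2} P(X=2) + \tfrac{1}{2} P(X=3) \qquad (8)$$

$$P(Y=3) = \tfrac{1}{2} P(X=3) + \tfrac{1}{2} P(X=4) \qquad (9)$$

Considering (5), this leads to

$$P(Y=1) - P(Y=2) = \tfrac{1}{2} P(X=1) - \tfrac{1}{2} P(X=3) > 0 \qquad (10)$$

$$P(Y=2) - P(Y=3) = \tfrac{1}{2} P(X=2) - \tfrac{1}{2} P(X=4) > 0 \qquad (11)$$

$$P(Y=1) > P(Y=2) > P(Y=3) \qquad (12)$$

If we add P(X = 2) − P(X = 3) to the first and last terms of (6), we find

$$P(X=1) - P(X=3) > P(X=2) - P(X=4) \qquad (13)$$

With (13) we see that the right hand side of (10) is greater than that of (11). So the left hand sides give the second characteristic property for Y:

$$P(Y=1) - P(Y=2) > P(Y=2) - P(Y=3) \qquad (14)$$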

Similarly we can show these characteristic properties for other values modified by F4, i. e. decreasing occurrence with increasing absolute value (cf. (12)), and decreasing decrease with increasing absolute value (cf. (14)).
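The F4 mapping of coefficients to steganographic values, together with the decrement rule, can be sketched as below. This is an illustrative Python version with hypothetical names, not the Java embedding loop referred to as Fig. 9.

```python
def f4_value(c):
    """Steganographic value of a nonzero coefficient under F4.

    Positive: odd -> 1, even -> 0.  Negative: even -> 1, odd -> 0.
    """
    return (c & 1) if c > 0 else 1 - (c & 1)

def f4_embed(coeffs, bits):
    """Decrement the absolute value wherever the F4 value does not match."""
    out = list(coeffs)
    bits = list(bits)
    pos = 0                              # index of the next message bit
    for i, c in enumerate(out):
        if pos == len(bits):
            break
        if c == 0:
            continue                     # zero coefficients are never used
        if f4_value(c) != bits[pos]:
            c = c - 1 if c > 0 else c + 1
            out[i] = c
        if c != 0:
            pos += 1                     # bit is embedded
        # otherwise shrinkage occurred: the same bit is embedded again
    if pos < len(bits):
        raise ValueError("cover has too few usable coefficients")
    return out

def f4_extract(coeffs, n_bits):
    """Read the F4 values of all nonzero coefficients."""
    return [f4_value(c) for c in coeffs if c != 0][:n_bits]

cover = [1, -1, 2, -2, 3, -3, 0, 4]
message = [0, 1, 0, 1, 1]
stego = f4_embed(cover, message)
```

Unlike F3, shrinkage here can affect either bit value: a 1 shrinks when a zero bit is embedded, and a −1 shrinks when a one bit is embedded. That is why the repeated embedding no longer favours one steganographic value.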

RESULTS AND DISCUSSIONS

The method is applicable to both grayscale (8-bit) and colour (24-bit) images. We categorized images with respect to their JPEG quality factor, and observed the effect on the performance of the steganalyzers. Apart from the JPEG quality factor, image properties such as image texture could also be used to categorize the images. There are many approaches to quantifying the texture of an image. A crude measure of image texture would be the mean variance of JPEG blocks. This measure is simple and can be efficiently computed, even with our large data set.
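One way to compute such a crude texture measure is sketched below. It assumes a grayscale image given as a list of pixel rows and works on pixel values in 8 × 8 blocks; the function name and the choice to ignore incomplete edge blocks are assumptions made for illustration.

```python
def mean_block_variance(pixels, size=8):
    """Average, over all full size x size blocks, of the pixel variance."""
    height, width = len(pixels), len(pixels[0])
    variances = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            block = [pixels[y][x]
                     for y in range(top, top + size)
                     for x in range(left, left + size)]
            mean = sum(block) / len(block)
            variances.append(sum((p - mean) ** 2 for p in block) / len(block))
    return sum(variances) / len(variances) if variances else 0.0

# 8 x 16 test image: a flat block on the left, a fine checkerboard on the right.
image = [[100] * 8 + [((x + y) % 2) * 2 for x in range(8)] for y in range(8)]
texture = mean_block_variance(image)
```

Smooth images give values near zero, while heavily textured images give large values, so the result can serve as a simple key for grouping images.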

HISTOGRAM CLASSIFICATION OF ORIGINAL TO STEGO IMAGES

[Figure: text steganography example, showing the cover image, the text file and the resulting stego image]

Future Scope

Why steganography? Who needs steganography? What are the uses for steganography? Where can one use steganography? According to Richard E. Smith, a data security expert, steganography has few practical uses because it only works as long as nobody expects you to use it. We respectfully take exception to this statement. On first reading it, we were reminded of the myth that Charles H. Duell, Commissioner of Patents in 1899, had declared that the Patent Office should be closed because everything that could possibly be invented had already been invented. By the same reasoning, the computer security community should give up on endless patches, security applications and so on, because they only work if nobody expects that they are in use. To quote Dale Carnegie, "Most of the important things in the world have been accomplished by people who have kept on trying when there seemed to be no hope at all."

There are ongoing studies to harden steganographic images against steganalysis. In his paper "Defending Against Statistical Steganalysis", Provos presents new methods which would allow one to select a file in which a message might be safely hidden and resistant to standard statistical analysis.

How we propose to extend our project


This project on steganography has the potential to be scaled up considerably. We have only scratched the surface of the research on data security through steganography, and what we can do in future to modify and extend this project is limited only by our imagination. Some of our ideas are discussed below.

The first idea is to develop a multi-bit image steganography method that is much more efficient at hiding large volumes of data in a single image file. Such a technique would enable hassle-free transmission of large volumes of secure data over the network from a sender to an intended receiver.

A second, more radical idea is to train the proposed software, with the help of technologies such as neural networks, to distinguish a normal image from a stego image. Once trained to recognise a stego image, the software could perform steganalytic operations autonomously, without human intervention.

There are many more such ideas, and we intend to keep working on them to improve this work. The current techniques can be improved in several ways, and careful work may reveal many further possibilities in the field of steganography.

ACRONYMS
LSB - Least Significant Bit. In computing, the least significant bit (lsb) is the bit position in a binary integer giving the units value, that is, determining whether the number is even or odd. The lsb is sometimes referred to as the right-most bit, due to the convention in positional notation of writing less significant digits further to the right.

RGB - Red Green Blue colour model. The RGB colour model is an additive colour model in which red, green, and blue light are added together in various ways to reproduce a broad array of colours.

PSNR - Peak Signal-to-Noise Ratio. PSNR is one of the metrics used to determine the degradation of the embedded image with respect to the host image. Values over 36 dB in PSNR are acceptable in terms of degradation, which means no significant degradation is observed by the human eye.

ASCII - The American Standard Code for Information Interchange is a character-encoding scheme originally based on the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that use text.

DSP - Digital signal processing is the mathematical manipulation of an information signal to modify or improve it in some way. It is characterized by the representation of discrete time, discrete frequency, or other discrete domain signals by a sequence of numbers or symbols and the processing of these signals.
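As a worked illustration of the PSNR entry, the sketch below uses the common definition PSNR = 10 log10(peak² / MSE), where MSE is the mean squared pixel difference and the peak value is assumed to be 255 for 8-bit images. The function name and the tiny test images are hypothetical.

```python
import math

def psnr(host, embedded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(host, embedded)
               for a, b in zip(row_a, row_b)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf                  # identical images
    return 10 * math.log10(peak ** 2 / mse)

host = [[100, 100], [100, 100]]
embedded = [[101, 101], [101, 101]]     # every pixel changed by one level
value = psnr(host, embedded)
```

Changing every pixel by a single grey level gives roughly 48.13 dB. Since flipping an LSB alters a pixel by at most one level, plain LSB embedding cannot fall below this figure, which is well above the 36 dB threshold mentioned above.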

Appendix

Cover image - The image into which a message is embedded.

Cipher text - Refers to encrypted data.

Cryptography - The art of protecting information by encrypting it into an unreadable format, called cipher text. A secret key is used to decrypt the message into plain text.

Encryption - The translation of data into a secret code.

Least significant bit (LSB) - The bit contributing the least value in a string of bits.

Lossless compression - For most types of data, lossless compression techniques can reduce the space needed by only about 50%. No data is lost in the process. For greater compression, one must use a lossy compression technique.

Lossy compression - Lossy compression technologies attempt to eliminate redundant or unnecessary information. Some amount of data is lost in the process.

Plain text - Refers to any message that is not encrypted, also called clear text.

Steganalysis - The art of discovering and rendering useless covert messages.

Steganography - A means of overlaying one set of information (the "message") on another (the "cover").

Stego image - The result of combining the cover image and the embedded message.

Stego text - The result of applying some steganographic process to a plain text (not necessarily encrypted).

TCP/IP - The Transmission Control Protocol / Internet Protocol is the standard protocol suite used on the Internet.

CONCLUSION

Steganography can be used for hidden communication. We have explored the limits of steganography theory and practice. We pointed out the enhancement of the image steganographic system using the LSB approach to provide a means of secure communication. A stego-key has been applied to the system during embedding of the message into the cover image. In our proposed approach, the message bits are embedded randomly into the cover-image pixels instead of sequentially. Finally, we have shown that steganography that uses a key offers better security than non-key steganography, because without knowledge of the valid key it is difficult for a third party to recover the embedded message. However, there are still some issues that need to be tackled to implement LSB embedding on a digital image as a cover object using random pixels.
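The idea of key-dependent, non-sequential LSB embedding can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this project: the function names are hypothetical, the image is a flat list of 8-bit pixel values, and Python's `random.Random` seeded with the stego-key stands in for a proper keyed generator (it is not cryptographically secure).

```python
import random

def keyed_positions(n_pixels, n_bits, key):
    """Pixel indices chosen pseudo-randomly, without repetition, from the key."""
    rng = random.Random(key)             # assumption: seed doubles as stego-key
    return rng.sample(range(n_pixels), n_bits)

def embed(pixels, bits, key):
    """Write each message bit into the LSB of a key-selected pixel."""
    out = list(pixels)
    for pos, bit in zip(keyed_positions(len(out), len(bits), key), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out

def extract(pixels, n_bits, key):
    """Regenerate the same positions from the key and read the LSBs."""
    return [pixels[pos] & 1
            for pos in keyed_positions(len(pixels), n_bits, key)]

cover = [(37 * i) % 256 for i in range(64)]     # toy 64-pixel image
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, message, key="secret")
recovered = extract(stego, len(message), key="secret")
```

Sender and receiver must share both the key and the message length. Without the key, an observer does not know which pixels carry message bits or in what order to read them.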

