A Cautionary Note on Image Downgrading

Charles Kurak and John McHugh
Department of Computer Science, The University of North Carolina, Chapel Hill, NC 27599-3715
Abstract
The results of an experiment showing that it is very simple to contaminate digital images with information that can later be extracted are presented. This contamination cannot be detected when the image is displayed on a good-quality graphics workstation. Based on these results, it is recommended that image downgrading based on visual display of the image to be downgraded not be performed if there is any threat of image contamination by Trojan horse programs. Such Trojan horses may reside in untrusted image processing software.
1 Introduction
For as long as there has been classified information, there has been a need to declassify or downgrade information that has been classified above its currently appropriate level. On some occasions, wholesale reclassification is permitted, as in the case of automatic downgrading of an entire document after some predetermined time period has elapsed. In other cases, piecemeal downgrading is done, using a combination of cutting and pasting from the source document combined with the obliteration or marking out of individual words or phrases. The preparation of an unclassified summary of a classified document is a good example of the latter process. In the pen and paper world, this process may be followed by the copying or transcription of the resulting document to ensure that the obliterated material is truly unreadable by the recipient of the downgraded document. In this world, it is reasonable to assume that the individual performing the downgrading is trustworthy and that no additional secrets have been hidden in the source document encoded by means of word order, spacing, etc. Under these assumptions, the resulting document can be assumed to be properly classified at a level lower than that of the source from which it was derived. For information stored in electronic form, the situation is not so simple. It is possible to encode substantial amounts of information in a text file in ways that are not immediately obvious when the information is printed or viewed on a display. In the presence of untrusted software that may contain Trojan horse code, care must be taken to ensure that the downgrading is done in such a way that the resulting product is free of contamination even though its sources may be compromised. The issues surrounding high assurance downgrading of text files are discussed in detail in [5]. To the authors' knowledge, no high assurance downgraders for text files exist.

For the past several years, one of the authors (McHugh) has been part of a team developing a high assurance windowing system based on the MIT X Window System. Targeted at B3 evaluation, this system [3] has attracted substantial interest in the security community. The developers have been told by a number of potential users that an image downgrading capability is a requirement for many MLS windowing systems. The purpose of this note is to relate the results of some simple experiments, performed by the authors at the University of North Carolina, that cast doubt on the notion of trustworthy image downgrading.

This paper continues with a brief description of the steps that are believed to be necessary for high assurance downgrading of text information in an environment in which contamination by Trojan horse programs is considered a credible threat. We then describe our experiments, in which we contaminated images with other images. We note that our approach, though simplistic, is easily extended to more sophisticated forms that would be, we believe, extremely difficult to detect. We conclude with an outline of areas in which we think research should be performed before operational downgrading of images is undertaken in an environment in which contamination by Trojan horse programs is a threat. Because the techniques and terminology used in digital image processing are unfamiliar to many in the computer security community, the appendix contains a brief introduction to the subject and defines some of the terms used in the body of the paper. Readers unfamiliar with digital image processing may want to read the appendix first.
To appear in the proceedings of the 1992 Computer Security Applications Conference, San Antonio, TX, December 1992.
2 Downgrading Text
Like many actions involving classified information, downgrading first and foremost involves accountability. Some human is required to accept responsibility for the downgrading process. In the event that classified information is compromised, this person will be held accountable. When software is used to aid an individual in the downgrading of, say, a text file, the individual needs to know that the software will not, on its own, pass through information that is not supposed to be downgraded, and that it will operate in a way that helps to prevent the user from making careless mistakes. The downgrader proposed in [5] does this in several ways.

1. The downgrader is trusted software, formally specified and verified to have exactly and only the functionality needed to do its job. A major portion of the effort in building such a downgrader would be devoted to achieving assured functionality in such areas as information display and user interaction.

2. The downgrader converts its input into a canonical form that excludes invisible and unprintable characters as well as information that might be encoded via spacing or formatting (a sketch of such a canonicalization step appears below). The user sees only this form, and the downgrader output is provably derived from its input using a series of subtractive operations.

3. The downgrader interacts with its user in a way that forces the review of downgraded material as small segments (typically sentences) viewed in context, preventing the deliberate or accidental downgrading of large quantities of information without appropriate review.

The output of such a downgrader resembles the results of the cut, paste, and obliterate operation described earlier. It is claimed that this style of downgrading offers acceptable performance coupled with a high degree of assurance and immunity to Trojan horse attacks.
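To make the canonicalization step of item 2 concrete, the following is a minimal sketch in Python. It is our illustration, not the downgrader of [5]; in particular, the set of characters allowed to survive is an assumption chosen for the example.

    import string

    # The surviving alphabet is an assumption for illustration only; a real
    # downgrader would define its canonical character set precisely.
    VISIBLE = set(string.ascii_letters + string.digits + string.punctuation)

    def canonicalize(text: str) -> str:
        """Reduce text to a canonical form: replace invisible or
        unprintable characters with spaces, then collapse whitespace
        runs so nothing can be encoded in spacing or formatting."""
        cleaned = "".join(ch if ch in VISIBLE else " " for ch in text)
        return " ".join(cleaned.split())

    # Control characters, tabs, and doubled spaces all normalize away:
    print(canonicalize("secret\x07  word\tlist"))  # prints: secret word list

Because every output character is drawn from a fixed visible alphabet and all spacing is collapsed, covert channels based on unprintable characters or formatting are closed; only the visible word sequence remains to be reviewed.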
3 Downgrading Images
Visual information presents more difficult problems. In the computer, an image is an array of numbers that represent light intensities at various points in the image. In displayable form, the image typically has 8 to 24 bits per pixel. Display screens are typically 1024 × 768 pixels (Super VGA on a PC class platform) or 1280 × 1024 pixels (many workstations). An 8-bit-per-pixel image occupying a quarter of the screen might be 500 by 600 pixels and contain 300 kilobytes of data. Examining each byte of data in a small local context, as is done with textual data, does not seem to be fruitful. Displaying the entire image on the computer screen does not eliminate the possibility of contamination, as will be seen. While the form of contamination that we have chosen is extremely simple and would be easy to detect if it were suspected, one can postulate contamination mechanisms that would be much more difficult to detect.
To recover a hidden image embedded in the low-order n bits of a contaminated image, each pixel is shifted left so that the contaminant's bits become the most significant:

    do (for each pixel)
        contaminated image: shift left by 8 - n
        extracted image: set to the shifted value of the contaminated image
    end do

The extracted image of the airport is shown in Figure 10. Why is this approach so successful? Part of the reason is that even good computer displays simply do not support the degree of grey-scale resolution needed to distinguish 2^8, or 256, distinct shades. In reality, about 100 levels is all that we can distinguish under ideal circumstances. In a noisy picture, such as that of the airport, it is all but impossible to detect tampering with the low-order 4 bits. Even in the case of a picture having large flat areas (such as the text), tampering with the low-order 3 bits is undetectable.
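For concreteness, the following is a minimal sketch of the contamination scheme and its inverse in Python with NumPy. The function names and the embedding step are our reconstruction; the pseudocode above gives only the extraction step.

    import numpy as np

    def embed(host: np.ndarray, secret: np.ndarray, n: int) -> np.ndarray:
        """Replace the low-order n bits of each 8-bit host pixel with
        the high-order n bits of the corresponding secret pixel."""
        host_part = host & ~np.uint8((1 << n) - 1)  # zero the low n bits
        secret_part = secret >> (8 - n)             # keep the top n bits
        return host_part | secret_part

    def extract(contaminated: np.ndarray, n: int) -> np.ndarray:
        """Recover the hidden image by shifting left by 8 - n, as in
        the pseudocode above; n bits of precision survive per pixel."""
        return (contaminated << (8 - n)).astype(np.uint8)

    # Hypothetical usage, assuming two 8-bit grey-scale arrays of equal shape:
    # contaminated = embed(text_image, airport_image, n=4)
    # recovered = extract(contaminated, n=4)

With n = 4, the contaminated image differs from the host only in its low-order 4 bits, which, per the discussion above, is below what the display and the human visual system together can reveal.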
Nor does it appear feasible to apply seals to the image data, because it must be processed by complex and necessarily untrusted software in order to be made useful. These programs transform the data and are possible sources of contamination. Hiding images within images is an obvious form of contamination, but it is clear that any form of information can be hidden. For example, if only one bit per pixel can be appropriated, a 200 by 200 image allows 40 kilobits, or 5 kilobytes, for information hiding. This is the equivalent of about a page and a half of text. Suitably encrypted, such information would appear as random noise in the low-order bit of the image. We suspect that it would be possible to hide small numbers of bits of information in almost any image and to do it in ways that would be both difficult to detect and largely immune to attempts at removal.

One area of particular interest is the transmission of the contaminated image in hardcopy format. The question arises whether or not a contaminating image can be successfully extracted from a photo-quality paper image. In other words, can an image be hidden, printed on photo paper, re-digitized, and extracted? We hope to address these questions in the near future.

For the present, we wish to issue a cautionary note. Downgrading is a risky process at best and not something to be undertaken lightly. Most computer displays simply do not display enough of the information contained in even low-quality digital images to allow visual inspection of the displayed image as a basis for assuming that the image can be safely downgraded. We know of no work that would lead to the necessary assurance, and we recommend that images not be downgraded based on visual display if there is any possibility that the image has been contaminated by a Trojan horse program at any time in its history.
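As a quick check of the capacity arithmetic quoted above (one appropriated bit per pixel in a 200 by 200 image), in Python:

    width, height, bits_per_pixel = 200, 200, 1
    capacity_bits = width * height * bits_per_pixel
    capacity_bytes = capacity_bits // 8
    print(capacity_bits, capacity_bytes)  # 40000 bits, 5000 bytes

Five kilobytes is indeed roughly a page and a half of ordinary text.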
Appendix: Digital Image Processing

[Figure 1: Digital image processing. The figure shows the pipeline from scene through image energy, scanner (sensor), process, and display to the visual image seen by the observer.]
The number of different grey-levels in the digital representation of the image is a function of the energy distribution of the original image, as well as the capabilities of the digitizing device. (Color images are typically represented as three grey-levels, one each for the energies of the red, green, and blue portions of the spectrum; energies ranging from audio frequencies through the X-ray portion of the spectrum can be sensed and digitized.) Currently, digitizing devices, sometimes referred to as scanners, are readily available for 4- or 8-bit digitizing. That is, with a 4-bit scanner, 2^4, or 16, grey-levels are discernible. With an 8-bit scanner, 2^8, or 256, grey-levels are discernible. Color scanners capable of 24-bit resolution are also available; these use 8 bits for each of the red, blue, and green energies. Images using 12 bits (2^12, or 4096, grey-levels) are commonly employed by the medical industry for radiological images. The images used in this work use only 8 bits; the consequences associated with higher-precision acquisition devices can readily be extrapolated.

The spatial resolution of scanners used for digitizing photographs is typically 200 to 900 pixels per inch in both the X and Y dimensions. Capacities up to 8.5 by 11 inches are common, with larger sizes available. Direct image sensing is usually done with some sort of digital camera. Charge-coupled devices with spatial resolutions from about 100 by 100 pixels up to 4096 by 4096 pixels are available for use in these cameras. These are combined with analog-to-digital converters and appropriate scanning circuitry to provide a digital data stream representing the image. Devices to digitize directly from analog video signals are also available.

Once a digital representation has been acquired, it can be processed or transformed in a variety of ways. It is possible to duplicate most of the photographer's darkroom tricks on the computer. The contrast, brightness, color balance, etc. of the image can
all be altered. The image can be magnified, reduced, cropped, etc. False colors can be used to emphasize energy distributions. In addition, the image can be processed to compensate for defects in the sensor; for example, out-of-focus images can be sharpened. Finally, the digital image can be interpreted, i.e., features such as edges and structures can be identified. The result of digital image processing can be either another digital image or a data structure that contains information about the image. We are primarily interested in the former case. Image processing programs are often large and complex.

Once an image has been processed, it is typically viewed, either directly on the screen of a workstation or similar display device or via some form of photographic reproduction. Many workstation display devices can display 8-bit images (256 grey-scales or 256 different colors at a time), although higher resolution (24-bit) display devices are readily available. On a color display, the 256 grey-levels are produced by setting the red, green, and blue values to equal intensities, thereby mixing the three colors equally to achieve grey. The problem is that the human visual system (HVS) is not as good at distinguishing between levels as the display is at producing them. The appropriate metric for the HVS is the just noticeable difference, or JND. Under ideal circumstances the eye can only distinguish on the order of 100 JNDs [4, 6] in grey scale. It is thus apparent that the capabilities of currently available display devices exceed those of the HVS, and the HVS is not capable of discerning all of the information presented to it.

The inability of the HVS to discriminate precisely among small differences in image intensity makes possible a number of strategies for reducing the space required to store digital images. When we consider that a typical weather satellite image is 1200 pixels by 600 pixels by 8 bits and requires nearly 3/4 of a megabyte of disk space, it is clear that compression is useful if not essential. There are two kinds of compression: lossless and lossy.
Lossless compression reduces the amount of storage space required; it is always possible to reconstruct the image exactly with this method of compression. This method is preferred when there is a requirement that the original information not be modified, as might be the case when subsequent processing is needed. Lossless compression algorithms are widely used to compress text and data files and are usually based on the statistical characteristics of the data to be compressed. Lossy compression algorithms also save space, and the reconstructed image is very close (or possibly identical) to the original, but exact reconstruction is not always possible. This may be adequate for many applications, especially when the user only wants to use or display the image in a fashion that does not require exact data representation. Lossy compression algorithms have a greater potential for reducing the size of the data than do lossless ones. They are typically based on a combination of the allowable tolerances for image reconstruction and the statistical characteristics of the data to be compressed.
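To illustrate the just-noticeable-difference point that underlies both the contamination results and lossy compression, the following sketch (ours, not from the references) requantizes an 8-bit grey-scale image to a chosen number of levels; at roughly 100 levels and above, the result is visually indistinguishable from the original on most displays.

    import numpy as np

    def requantize(img: np.ndarray, levels: int) -> np.ndarray:
        """Map 8-bit grey values (0-255) onto `levels` evenly spaced
        values, discarding the distinctions the eye cannot see."""
        step = 256.0 / levels
        bins = np.floor(img / step)                       # bin index, 0..levels-1
        return (bins * step + step / 2).astype(np.uint8)  # bin centers

    # Hypothetical usage: reduced = requantize(image, levels=100)

This is, in effect, a crude lossy compressor: the requantized image can be stored with fewer bits per pixel, and the information thrown away is exactly the kind the HVS cannot discern.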
References
[1] James M. Coggins. Designing C++ libraries. The C++ Journal, 1(1):25-32, June 1990.

[2] Benjamin M. Dawson. Introduction to image processing algorithms. BYTE, pages 169-186, March 1987.

[3] Jeremy Epstein et al. A prototype B3 trusted X Window system. In 1991 Computer Security Applications Conference, December 1991.

[4] James Foley et al. Computer Graphics. Addison-Wesley, 2nd edition, 1990.

[5] John McHugh. An EMACS based downgrader for the SAT. In Marshall D. Abrams and Harold J. Podell, editors, Computer and Network Security, pages 228-237. IEEE Computer Society Press, 1986.

[6] Stephen M. Pizer et al. Evaluation of the number of discernable levels produced by a display. In Information Processing in Medical Imaging. INSERM, Paris, 1980.