Emir Turajli
Faculty of Electrical Engineering,
University of Sarajevo
Sarajevo, Bosnia and Herzegovina
emir.turajlic@etf.unsa.ba
Abstract—Efficient compression of medical images is needed to decrease the storage space and enable efficient image transfer over network for access of electronic patient records. Since medical images contain diagnostically relevant information, it is necessary for the process of image compression to preserve high levels of image fidelity, especially when the images are compressed at low bit rates. This paper investigates the capacity of an artificial neural network framework for medical image compression. Specifically, the performance of the proposed image compression method is evaluated on a database of computed tomography images of lungs, where PSNR and MSE are used as the principal image quality metrics. The compressed image data is derived from the hidden layer outputs, where the artificial neural networks are trained to reconstruct the network input features. The results of image block segmentation are used as the network training features. The paper proposes the use of Kohonen's self-organizing maps for segmentation of the feature space and the use of multiple finely tuned multi-layer perceptrons to achieve improved compression performance. This paper presents a study on how the choice of block size, network architecture, and training method affects the compression performance. An attempt is made to optimize the artificial neural network framework for the compression of computed tomography lung images.

Keywords—Signal processing, Image processing, Artificial intelligence.

I. INTRODUCTION

For a long time, image compression has had an important role in the development of a range of multimedia computer services and telecommunication applications, including teleconferencing, digital broadcast codec and video technology, etc. With the recent significant growth in e-health, telemedicine, teleconsultation and teleradiology, there is an increasing research interest in the field of medical image compression [1]. Digital representation of medical images has an indispensable role in medical diagnostics, and thus, it is necessary for image compression to effectively preserve the resolution as well as the perceptual quality of medical images. In fact, the principal aim of image compression is to impose the least amount of degradation on the diagnostically relevant information, while enabling effective archiving and transfer of medical images with respect to the available communication and storage channels [2]. Another important issue in medical image compression that needs to be considered is the fact that acceptable fidelity and compression ratios differ among various types of medical images, e.g. CT, MRI, etc.

Digital image compression methods can be broadly classified into two groups, lossy and lossless methods [3]. Lossless techniques achieve compression by exploiting statistical and/or spatial redundancies in an image and as such are able to recover the original image perfectly. On the other hand, lossy image compression methods can also reduce psychovisual image redundancies and can generally attain much higher compression ratios at the cost of irreversibly degrading image quality.

Some examples of widely employed lossless image compression methods include arithmetic encoding [4], Huffman encoding [5] and run-length encoding [3]. Successful approaches to lossy image compression include fractal coding [6-8], vector quantization [9, 10], DCT transform coding [11-13], wavelet transform coding [14-16] and neural network based compression [17-20].

Artificial neural networks have been successfully applied to a broad spectrum of applications, from speech processing [21], medicine [22] and power engineering [23], to finance [24]. This paper investigates the capacity of artificial neural networks for compression of computed tomography images of lungs. Artificial neural network based image compression commonly relies on adopting a feedforward neural network structure that is trained to reconstruct the input features at its output. Since the hidden layer has a smaller number of neurons than the input and output layers of the network, the outputs of the hidden layer in fact represent the compressed image data. Commonly, network training relies on block segmentation of images to generate the training data for the network.

This paper investigates how image segmentation block size, network architecture, and training method influence image compression performance. In addition, the paper proposes the use of Kohonen's Self-Organizing Maps (SOM) for feature space segmentation and the use of multiple finely tuned multi-layer perceptrons to improve the compression performance for computed tomography images of lungs. An attempt is made to optimize the artificial neural network based image compression method so as to attain the best CT image quality, as defined by various objective quality measures, at a given bit-rate.
The remainder of this paper is organized as follows. In section II, a review of the multilayer perceptron, along with the considered training methods, is presented. A brief overview of self-organizing maps is also presented in this section. The proposed system for compression of CT images of lungs is presented in section III. Section IV presents and discusses the experimental results. Section V concludes the paper.

II. ARTIFICIAL NEURAL NETWORKS

A. A Multilayer Perceptron

Fig. 1 presents a schematic diagram of a three-layer multilayer perceptron consisting of an input layer with L inputs, a hidden layer with H neurons and an output layer with K neurons. Each neuron computes the weighted sum of its inputs and subsequently passes the result through an activation function to obtain the neuron response. Here, the input and output layers are fully connected to the hidden layer. The outputs of the hidden layer are passed to the decoder, which consists of the neural network structure that links the outputs of the hidden layer to the outputs of the neural network. In the process of image compression, the number of inputs, L, is equated to the number of outputs, K, and the neural network is trained to reconstruct the input features at its output. Thus, image compression is achieved by selecting a smaller number of neurons in the hidden layer, H, relative to the input layer size.

The network training data is obtained through the process of block image segmentation, where the entire image is divided into rectangular N×N blocks. Each block is vectorised to form a feature vector that is used both as an input and as a target vector during the training process. Thus, the image segmentation block size is directly related to the dimensionality of the input feature vectors. Generally, the most important aspects of artificial neural network design are the network architecture and the choice of training method. The choice of image segmentation block size and the choice of the number of neurons in the hidden layer have a direct effect on the attained compression ratios. Their effect on the quality of image reconstruction will be further studied in this paper. This paper will consider two neural network training methods, specifically backpropagation (gradient descent) and the Scaled Conjugate Gradient (SCG) algorithm. Supervised network learning is an iterative procedure of the general form:

w_{m+1} = w_m + Δw_m = w_m + u_m p_m,   m ≥ 0   (1)

where w_m denotes the weight at the mth iteration, and u_m and p_m represent the learning rate and the direction of weight adaptation, respectively. While the weight update under standard backpropagation is in the negative direction of the error function gradient, the conjugate gradient search uses a second order approximation of the error function. Conjugate gradient methods are well suited to handle large-scale problems in an effective way [25, 26]. Scaled conjugate gradient adopts the Levenberg-Marquardt approach to scale the learning rate and is demonstrated to be significantly faster than standard backpropagation and other conjugate gradient methods [26].

B. Kohonen's Self-Organizing Maps

Kohonen's Self-Organizing Maps (SOM) correspond to a feed-forward artificial neural network with a single computational layer that adopts an unsupervised, competitive form of learning to produce a low-dimensional representation of the input space. SOMs are able to adaptively transform any incoming signal pattern of arbitrary dimension into a low-dimensional map, and in the process preserve the topological ordering. Although higher dimensional maps are also possible, Kohonen's self-organizing maps are typically used to produce one or two dimensional discrete maps. Fig. 2 illustrates a self-organizing map with a two-dimensional grid topology.

The process of self-organization involves four distinct elements: a) Initialization, where small random values are assigned to the connection weights; b) Competition, where a discriminant function is used to declare a single neuron as the competition winner; c) Cooperation, where the spatial location of a topological neighborhood, as determined by the location of the winning neuron, provides the basis for cooperation among neighboring neurons; d) Adaptation, where connection weights are adjusted in order to decrease the corresponding discriminant function values in relation to the input pattern. In this paper, the Euclidean distance is adopted as the discriminant function and the weight update rule is defined as:

Δw_ji = η(t) T_{j,I(x)}(t) (x_i − w_ji)   (2)

where x_i represents the ith element of the input pattern x. Here, w_ji denotes the network weight connecting the ith input unit and the jth neuron in the output layer, while T_{j,I(x)}(t) represents the size of the topological neighborhood as a function of the winning neuron index I(x) and time (epoch) t. Finally, the weight update uses an exponentially decreasing time-dependent learning rate, denoted as η(t). In this paper, one-dimensional self-organizing maps are used to partition the input space into a small number of subspaces and as such facilitate the use of multiple finely tuned multilayer perceptrons to perform image compression.

III. IMAGE COMPRESSION

Fig. 3 illustrates the proposed image compression method. In the first stage of the image compression process, a computed tomography image of lungs is segmented into rectangular N×N blocks. The blocks are converted into vector form to produce the input and target patterns for supervised artificial neural network learning. The training of the artificial neural network, labeled ANN, involves the available dataset in its entirety. In the next stage, the outputs of this artificial neural network are passed to a Kohonen self-organizing map to partition the feature space and enable the training of multiple artificial neural networks in the subsequent stage of the image compression process. Multiple artificial neural networks, labeled ANN 1 to ANN M, are trained to process only the specific input subspaces defined by the trained self-organizing map. The design of each artificial neural network involves two stages. Firstly, all individual artificial neural networks in this hierarchical layer of the encoder are initialized as the previously trained ANN, for which the entire available dataset was used in training. Subsequently, each neural network is trained on a particular subset of the available feature vectors. This two-stage process ensures that each of the networks, ANN 1 to ANN M, represents a finely tuned version of the ANN network that is specifically designed to process a particular input subspace.
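The encoder training pipeline can be sketched as follows. This is a minimal illustration under our own assumptions, not the paper's experimental configuration: the MLP is replaced by a linear stand-in trained with plain gradient descent rather than SCG, the SOM neighborhood is shrunk to winner-only updates, and the block size N, hidden size H and map length M are arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(0)

def segment(image, N):
    """Split an image into non-overlapping N x N blocks and vectorise each block."""
    h, w = image.shape
    return np.array([image[r:r + N, c:c + N].ravel()
                     for r in range(0, h - N + 1, N)
                     for c in range(0, w - N + 1, N)])

def train_ann(X, H, epochs=200, lr=1e-3, init=None):
    """Linear stand-in for the MLP: trained to reconstruct X at its output
    through an H-neuron hidden layer (H smaller than the input length)."""
    L = X.shape[1]
    if init is None:
        We = rng.normal(0, 0.1, (L, H))   # input -> hidden weights
        Wd = rng.normal(0, 0.1, (H, L))   # hidden -> output weights
    else:
        We, Wd = (w.copy() for w in init) # start from the base ANN's weights
    for _ in range(epochs):
        Hid = X @ We
        E = Hid @ Wd - X                  # reconstruction error
        Wd -= lr * Hid.T @ E / len(X)
        We -= lr * X.T @ (E @ Wd.T) / len(X)
    return We, Wd

def train_som(X, M, epochs=30, eta0=0.5):
    """One-dimensional SOM with Euclidean discriminant, winner-only adaptation."""
    W = X[rng.choice(len(X), M, replace=False)].astype(float)
    for t in range(epochs):
        eta = eta0 * np.exp(-t / epochs)  # exponentially decaying learning rate
        for x in X[rng.permutation(len(X))]:
            j = np.argmin(np.linalg.norm(W - x, axis=1))  # competition
            W[j] += eta * (x - W[j])                      # adaptation
    return W

def som_assign(W, X):
    return np.array([int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in X])

# Demo on a synthetic "image"; N, H, M are illustrative only.
image = rng.random((32, 32))
N, H, M = 4, 6, 3
X = segment(image, N)                     # 64 feature vectors of length N*N = 16
base = train_ann(X, H)                    # base ANN, trained on the entire dataset
outputs = X @ base[0] @ base[1]           # base network outputs, passed to the SOM
clusters = som_assign(train_som(outputs, M), outputs)
# Each finely tuned network is initialized from the base ANN, then trained on its subset.
tuned = {int(m): train_ann(X[clusters == m], H, init=base) for m in np.unique(clusters)}
```

The two-stage fine-tuning appears in the last line: `init=base` reproduces the paper's idea of initializing every subspace network from the network trained on all data.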
After the training of the unsupervised and supervised neural networks is completed, the process of image encoding is rather straightforward. Image segmentation and vectorization transform an image into a set of feature vectors. Each of these vectors is allocated to one particular finely tuned neural network according to the SOM outputs. That neural network processes the feature vector, and its hidden layer outputs are passed to the decoder. Since the number of hidden layer neurons is significantly smaller than the feature vector length, image compression is achieved.
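To make the attained compression concrete: each N×N block (N² pixel values) is represented by only H hidden layer outputs, so the compression ratio is N²/H. The helper below is our own illustration and assumes hidden outputs are stored at the same precision as the original pixels, which the paper does not specify.

```python
def compression_ratio(N, H):
    """Ratio of block size (N*N pixel values) to the H hidden-layer outputs,
    assuming both are stored at the same numeric precision."""
    return (N * N) / H

def bits_per_pixel(N, H, bits_per_value=8):
    """Effective bit rate: H coded values per block of N*N pixels."""
    return H * bits_per_value / (N * N)

# Example: 8x8 blocks coded by 16 hidden neurons -> 4:1 compression at 2 bpp.
print(compression_ratio(8, 16), bits_per_pixel(8, 16))
```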
Fig. 3. Block diagram of the encoder

The process of image reconstruction is illustrated in Fig. 4. In the decoder design, the principal challenge is the fact that the decoder inputs can come from M different artificial neural networks in the encoder, and the decoder must ascertain from which specific neural network an input comes before it can make an accurate reconstruction of that specific image segment. Thus, the decoder functionality requires two distinct stages: the allocation of an encoder input to a specific neural network, and the reconstruction of the image segment associated with that encoder input.
In general, the reconstruction of image segments requires only the part of the ANN structure that links the hidden layer outputs to the ANN outputs. In order to make this distinction clear, the superscript H is used to denote the reduced neural network structure. Thus, ANN^H represents the reduced ANN network.
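The reduced network is simply the output half of a trained network. Under the simplifying assumption of a single hidden layer with identity activations and weight matrices W1 (input to hidden) and W2 (hidden to output), both of which are our own notation for the example, the split can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)
L, H = 16, 6                      # L inputs/outputs, H hidden neurons (H < L)
W1 = rng.normal(size=(L, H))      # encoder half: input -> hidden
W2 = rng.normal(size=(H, L))      # decoder half (the reduced network): hidden -> output

def ann_forward(x):               # full network, input to output
    return (x @ W1) @ W2

def encode(x):                    # encoder transmits only the hidden outputs
    return x @ W1

def ann_h(h):                     # reduced network: reconstruction from hidden outputs
    return h @ W2

x = rng.normal(size=L)
# Decoding the transmitted hidden code reproduces the full network's output.
assert np.allclose(ann_forward(x), ann_h(encode(x)))
```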
Fig. 4. Image reconstruction

Initially, in the process of neural network identification, a coarse reconstruction of the feature vectors is performed. At this stage, when the encoder input is not yet associated with a particular finely tuned network, ANN constitutes the best means for feature vector reconstruction. This neural network can be thought of as the average of the set of finely tuned networks. The ANN^H outputs are passed to the SOM network, where the reconstructed feature vector is associated with a particular finely tuned neural network.
In the second stage, an improved reconstruction of the feature vector can be obtained. If a particular encoder input is the result of some feature vector being processed by the ith neural network, ANN i, at the encoder, then by correctly identifying the ith neural network at the decoder, ANN^H i can be used to obtain a more accurate reconstruction of that feature vector. Once the feature vectors are reconstructed, the inverse process of image segmentation and vectorization can be applied to reconstruct the image.
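The two-stage decoding just described can be sketched as follows. All shapes and weights here are toy assumptions of ours: a coarse reconstruction with the base network's decoder half, SOM-based identification of the finely tuned network, then a refined reconstruction with that network's decoder half.

```python
import numpy as np

rng = np.random.default_rng(2)
L, H, M = 16, 6, 3
Wd_base = rng.normal(size=(H, L))                  # decoder half of the base ANN
Wd = [Wd_base + 0.05 * rng.normal(size=(H, L))     # decoder halves of ANN 1 .. ANN M
      for _ in range(M)]
som = rng.normal(size=(M, L))                      # trained 1-D SOM prototypes

def decode(h):
    coarse = h @ Wd_base                           # stage 1: coarse reconstruction
    i = int(np.argmin(np.linalg.norm(som - coarse, axis=1)))  # identify network i
    return i, h @ Wd[i]                            # stage 2: refined reconstruction

i, x_hat = decode(rng.normal(size=H))
```

Note that the network index is never transmitted; as in the paper, the decoder recovers it from the coarse reconstruction alone.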
Note that the performance of image reconstruction relies on
the assumption that neural network identification can be
performed with high levels of accuracy. The results of
experimental evaluation presented later in this paper will show
that this is indeed the case.