
International Journal of Advanced Computer Science, Vol. 1, No. 6, Pp. 240-249, Dec. 2011.

Manuscript received: 25 Sep. 2011; revised: 21 Nov. 2011;
accepted: 15 Dec. 2011; published: 15 Jan. 2012

Keywords: bacteriophage lambda, watermarking, information
hiding, DNA computing, E. coli bacteria, data deletion,
cloning, mutation

Abstract: Genetically engineered machines take advantage of
the computational power of DNA and appear promising for the
future of computing. In this paper, a solution is proposed
and simulated for securing images in the context of DNA
computing. Three genetically engineered machines are
proposed for hiding, watermarking and deletion of images
using cloning techniques from genetic engineering. The
proposed methods have several advantages over previously
proposed methods. In the proposed watermarking scheme,
unlike previous schemes, the size of the watermark picture
is not limited and embedding takes place naturally. The
proposed in-vivo data deletion procedure uses a one-way
natural process and, unlike data deletion in silicon
computers, is guaranteed to be leakage free.


1. Introduction
Biomolecules constitute the fundamental basis of the
architecture of life on earth. Their great potential can be
easily verified by observing the complexity of the different
organisms on our planet. Of these molecules, DNA
(deoxyribonucleic acid) is the biochemical basis
of heredity in all organisms and, in spite of being
constructed from only four kinds of nucleotides, it is
regarded as the source of all the existing complexity and
diversity of living beings.
The physical potential of DNA molecules for
computational purposes was first shown by Adleman [1]
in his historic experiment on the Hamiltonian
path problem, an NP-complete problem, which he solved by
encoding the problem in synthetic DNA strands and
programming these molecules to find the answer.
He discovered some intrinsic properties of
DNA molecules, such as the ability to hide information
and the massive parallelism inherent in these molecules.

This work was done at Iran University of Science and Technology and
was financially supported by the Iran Telecommunication Research Centre
(ITRC).
Arash Karimi and Hadi Shahriar Shahhoseini are with Iran University
of Science and Technology (Emails: ar-karimi@elec.iust.ac.ir,
h.shsh@iust.ac.ir)

Based on the work of Kari et al., the ability of DNA
molecules to perform computations dates back millions of
years ago, when a species of protozoa (ciliated protozoa)
solved a problem similar to finding a Hamiltonian path in a
graph [2], [3]. The outstanding work of Tom Head [4], which
showed that the operation of splicing DNA strands is Turing
powerful, supports the idea arising from Adleman's
practical experiments that it may be possible to build a
universal Turing machine from operations and materials
that are purely biological. Based on this
theoretical foundation, as well as on the practical and
experimental capabilities of genetic engineering and
biotechnology, many efforts have been made to
construct finite automata and simple Turing machines from
biological units following genetic engineering principles.
Some DNA computing models that simulate Turing machines
were also proposed. Some of the most notable
works in this direction are as follows. In [5], some
challenges in designing a molecular computer are discussed.
In [6], mutagenesis is considered as the basis for the design
and implementation of molecular computers. The
first step towards implementing an in-vivo finite automaton
was taken in 2004 by Shapiro et al. [7]; their automaton can
distinguish strings with an odd number of input symbols from
strings with an even number. Since the researchers who
initiated this interdisciplinary area were
cryptologists or computation theorists, and because of the
similarity between genetic codes and cryptology,
providing security using biomolecules has received much
attention since the first papers in this area. For
instance, an implementation of the only
information-theoretically secure cipher, the Vernam
one-time pad, using DNA molecules was proposed
by Gehani et al. [8]. A molecular computer scheme to break
the Data Encryption Standard [9] based on in-vitro
manipulation of synthetic DNA was proposed by Boneh et al.
[10] and afterwards by Adleman [11].
A DNA chip-based implementation of a steganography
scheme was proposed in [8]. We previously introduced
in-vivo solutions for multiclient authentication and
watermarking in [12] and [13].
In this paper, we propose in-vivo security mechanisms
that are well suited to securing images. The first
proposed system is an in-vivo solution for hiding
information, which is generally applicable to hiding images
as well as to steganography. In the second scheme, a
watermarking scheme is introduced in which we use
the infection of E. coli bacteria by phage
lambda as our model. The third proposed scheme deals
with in-vivo annihilation of information and is modeled on
the Lytic cycle of infection of E. coli.

Achieving Secrecy for Images Using in-vivo DNA
Cloning Techniques
Arash Karimi & Hadi Shahriar Shahhoseini
International Journal Publishers Group (IJPG)

The first proposed scheme gives an in-vivo solution for
hiding messages, which can be regarded as an important
secrecy primitive in DNA-based computers.
We use the infection of the E. coli bacterium as a
model for two other security mechanisms in
DNA-based computers, in which the reproduction state of
the virus in E. coli defines the secrecy mechanism (Lysogeny
for the watermarking primitive and the Lytic cycle for the
data deletion primitive).
A novel algorithm for encoding information into
the cells of a living being is also proposed, based on
the multiform property of amino acids.
The rest of this paper is organized as follows. In Section 2,
the preliminary background is reviewed. Section 3 gives an
overview of the proposed schemes and discusses the
information hiding scheme, the watermarking scheme and the
data annihilation procedure in turn. Section 4
describes simulation results. Section 5 provides a security
analysis of the proposed schemes, and in Section
6, conclusions are drawn.
2. Preliminary Background
In this section, to clarify the relationship between
the proposed schemes and the biological phenomena that
motivate them, we discuss the power of the genetic code
table for encoding data, as well as the natural process by
which E. coli bacteria are infected by phage lambda.
A. Genetic Code
In the following, we describe a property of
the genetic code that provides fertile ground
for the proposed scheme for encoding data in DNA
nucleotides. This property can be seen in the
genetic code table of Fig. 1. Three consecutive nucleotides
in the DNA sequence of a gene form a codon, which specifies
an amino acid; the resulting chain of amino acids forms a
functional protein. There are 20 amino acids in the life
forms on earth, but 64 codons can be formed from
the combinations of four bases in three positions. As a
consequence, the code is redundant: on average there are
about three codons per amino acid, and most amino acids are
specified by more than one codon. So the
phenotype does not change even if we replace a codon with
another codon for the same amino acid. What we can see in
Fig. 1 is that, for most codons that specify the same amino
acid, the first and second nucleotides are the same and only
the third nucleotide changes. The codons highlighted in
yellow in the figure are groups of codons that specify the
same amino acid and are therefore translated into the same
functional protein. A mutation in a DNA molecule that does
not change the phenotype is called a silent mutation.
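
The redundancy described above can be checked with a short sketch. It assumes the standard genetic code, written in the usual compact order with bases T, C, A, G; the names `CODON_TABLE`, `degeneracy` and `single_codon` are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# bases in the order T, C, A, G (third position varying fastest) and the
# string below gives the amino acid (one-letter code) for each codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid ('*' stands for the stop signal).
degeneracy = Counter(CODON_TABLE.values())

# Amino acids with only one codon cannot carry a bit by silent mutation.
single_codon = sorted(aa for aa, n in degeneracy.items() if n == 1)

for aa, n in sorted(degeneracy.items(), key=lambda item: item[1]):
    print(aa, n)
```

Under the standard code, methionine and tryptophan each have a single codon, which is one reason Step 1 of Algorithm I restricts the carrier to codons that have multiple forms.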
Based on the above, we describe our
proposed scheme for encoding data into DNA segments
using the multiform property of amino acids in the
algorithm shown below.
Algorithm I. Encoding data into DNA segments
Step 1. Prepare a gene or a sequence of genes in which
the data is to be encoded. Note that all codons involved in
these genes should be selected from the highlighted codons,
for which multiple forms exist.
Step 2. To encode, synthesize another gene
sequence that is a modified copy of the first gene
sequence, as follows: to encode a
logical zero, copy the codon unchanged;
to encode a logical
one, change the last nucleotide of the codon according to
the yellow-highlighted parts of the genetic code table shown
in Fig. 1.
Step 3. To decode, compare the two sequences
codon by codon. If a codon is unchanged, its
corresponding bit is zero; if the codon has been
changed to another codon for the same amino acid
according to the genetic code table, its corresponding
bit is one.
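
The steps of Algorithm I can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard genetic code, picks the alphabetically first synonymous codon for a one bit (the paper only requires some silent mutation), and the function names `synonyms`, `encode` and `decode` are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (assumption: table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonyms(codon):
    """Other codons that specify the same amino acid as `codon`."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


def encode(carrier, bits):
    """Step 2: copy the codon for a 0 bit, silently mutate it for a 1 bit."""
    codons = [carrier[i:i + 3] for i in range(0, len(carrier), 3)]
    if len(bits) > len(codons):
        raise ValueError("carrier has fewer codons than there are bits")
    out = []
    for i, codon in enumerate(codons):
        if i < len(bits) and bits[i] == 1:
            alternatives = synonyms(codon)
            if not alternatives:
                # Step 1 forbids such codons (e.g. ATG) in the carrier.
                raise ValueError(f"codon {codon} has no synonymous form")
            out.append(alternatives[0])
        else:
            out.append(codon)
    return "".join(out)


def decode(original, modified, n_bits):
    """Step 3: a changed codon reads as 1, an unchanged codon as 0."""
    return [int(original[i:i + 3] != modified[i:i + 3])
            for i in range(0, 3 * n_bits, 3)]


carrier = "AAAGGAAACGGC"          # illustrative carrier, not from the paper
coded = encode(carrier, [1, 0, 1, 1])
print(coded, decode(carrier, coded, 4))
```

Decoding needs the original carrier sequence, which is why only a party holding the unmodified gene can read the bits.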

Fig. 1 The genetic code table
B. The process of infection of E. coli
Bacteriophages, or simply phages, are viruses that
infect bacterial cells. A well-studied test case for
observing the infection of a bacterium by phages is the
infection of E. coli by phage lambda.
Phages fall into two categories: mild (temperate) and
destructive (virulent). In the destructive form, when a
phage infects a bacterial cell, the phage DNA is replicated
in hundreds of copies and the genes that encode new coat
proteins are expressed. This takes place in a synchronized
manner, so that no new phage is released before destruction
of the host cell. When the host cell is destroyed, the new
phages are released and go on to destroy other host cells.
Lambda phage is a virus that infects E. coli cells and whose
DNA takes a circular form inside the host. Its genome is
about 50 kbp long and contains about 50 genes.
When the phage finds a host E. coli cell, it binds to a
specific structure on the surface of the cell; the lambda
DNA is then ejected from the phage head and enters the
interior of the bacterium. To avoid
being destroyed by exonuclease enzymes, the linear DNA forms
a circular structure: its two ends are joined at specific
sites located at the two ends of the linear strand. These
specific sites are shown in Fig. 2.


Fig. 2 Specific site on the end of the linear strand
After this operation, the ligase enzyme encoded by the host
cell seals the cut sites on both sides and produces a closed
circular lambda molecule. Injection of phage DNA into an
E. coli cell is shown in Fig. 3.


Fig. 3 Injection of DNA of phage in the E. coli cell
Lambda is a mild phage, for which there are
two modes of proliferation, called the Lytic cycle and
Lysogeny. In the Lysogeny cycle, the phage genome,
instead of proliferating, is integrated into the genome of
the bacterium and the genes for the coat proteins are not
expressed. This integrated, inactive phage is called a
prophage. The prophage is replicated during cell
division as part of the bacterial chromosome, in its
inactive form. Therefore, each of the two daughter cells
produced by division of a lysogenic cell carries the
prophage. The Lysogeny state
can be maintained for a long time, but a change of state to
the Lytic cycle remains possible. This change from
Lysogeny to Lytic is called induction; it proceeds by
excision of the prophage DNA from the bacterial genome,
followed by proliferation and activation of the genes
required to produce the coat and regulator proteins of the
Lytic cycle.
The Lysogeny cycle is quite stable under normal conditions,
but if the cell is exposed to damaging conditions, the
prophage can efficiently switch to the Lytic
cycle. Which of the two
cycles is followed depends on whether the Lytic or the
Lysogeny gene expression program is adopted. The program
responsible for the Lysogeny cycle can be maintained in the
cell for many generations, but during induction
this cycle is switched to the Lytic cycle with high
efficiency. The infection of an E. coli cell and the
proliferation cycles of bacteriophages are shown in Fig.
4.

Fig. 4 Different cycles of proliferation in infection of E. coli with phages
3. The Proposed Schemes
In this section, we introduce our proposed schemes for
providing secrecy for images using genetically engineered
machines.
A. An in-vivo mechanism for hiding information
In this section, we present our proposed scheme for an
in-vivo hiding system based on procedures that occur
naturally as an integral part of gene expression in all
living organisms. The initial setup of the hiding system is
as follows.
We use the silent mutation property of the genetic code,
described in section (2.A), to encode our messages
into blocks of DNA sequence in the synthetic plasmids
that carry the information. Furthermore, we define the
transmitter and receiver side information of the system as
shown in Equ. (1)-(2). The transmitter side information is
information added to the message at the
transmitter side, and the receiver side information is added
at the receiver side to unveil the hidden information.

$\text{Receiver Information} = (\text{A biochemical indirect activator})$ (Equ. 1)

$\text{Transmitter Information} = (\text{A biochemical inhibitor},\ \text{A known gene with a certain phenotype (a reporter gene)},\ \text{A known sequence of nucleotides for padding after the message sequence})$ (Equ. 2)

Equ. 1 and Equ. 2 show the receiver and transmitter side
information of our proposed data hiding system,
respectively. The first element of each is a
biochemical substance that is found naturally in the
bacteria we work with. As can be seen in Equ. 1, the
receiver information is a biochemical indirect activator,
which indirectly activates expression of the genes that lie
downstream of the promoter of the synthetic gene sequence
that encodes the message.
Furthermore, Equ. 2 shows that the first element of the
transmitter information is a biochemical inhibitor, which
blocks expression of the gene(s) downstream of
the promoter of the plasmid that encodes the message.
The second element of Equ. 2 is a
known gene with a specific phenotype, and the third element
is a known sequence of nucleotides that
marks the end of the message data. Any gene to be
expressed needs a promoter upstream of it,
as shown in Fig. 5.

Fig. 5 A plasmid containing its promoter and a gene
In order to provide an example to demonstrate our hiding
mechanics, we use Equ. (3)-(4) to express the
transmitter-receiver information pairs of the hiding scheme.

$\text{Receiver Information} = (\text{IPTG})$ (Equ. 3)

$\text{Transmitter Information} = (\text{LacI},\ \text{GFP gene},\ \text{A DNA sequence for padding after data})$ (Equ. 4)
In Equ. 3, IPTG, or isopropyl β-D-1-thiogalactopyranoside,
is a biochemical reagent that induces transcription of the
gene that encodes beta-galactosidase, a hydrolase
enzyme that catalyzes the hydrolysis of
β-galactosides into monosaccharides.
In Equ. 4, the transmitter information contains a
biochemical substance (the LacI protein) that inhibits
transcription of the gene(s) downstream of the promoter
of the message-encoding plasmid.
The IPTG molecule (chemical formula $\mathrm{C_9H_{18}O_5S}$),
when bound to LacI, detaches it from the
promoter and unblocks expression of the gene(s)
downstream of the promoter. This process is shown in Fig. 6.
With this explanation at hand, we are now ready to describe
the algorithm by which Alice encrypts a message and sends it
to Bob.
Fig. 6 Detaching LacI from the plasmid by IPTG

Algorithm II. The proposed scenario for secure
communication between Alice and Bob
Step 1. Alice encodes her intended message, using
the silent mutation property of the genetic
code, in some gene(s) that have been cloned into the
message-bearing plasmid.
Step 2. Alice appends the third element of Equ. 4, a
known sequence of nucleotides, to the message she wishes
to send to Bob.
Step 3. Alice, using the transmitter information defined in
Equ. 2, hides the padded message (chosen from the message
space and encoded in the message-encoding plasmid $M_P$
using the silent mutation property of the genetic code). The
transmitter information can be composed by concatenating
LacI and the DNA sequence of a known gene (such as the
green fluorescent protein gene), as shown in Fig. 7. The
reporter gene serves as a marker indicating that hidden
information follows the end of its sequence.
The hiding procedure is accomplished by
binding the synthetic concatenation of Fig. 7 to the plasmid
that carries the message, thereby
blocking expression of the gene(s) that lie downstream of
the promoter of the message-encoding plasmid.
In this way, transcription of these genes is stopped and
Alice hides the message.

Fig. 7 The first and second elements of the transmitter side information
of the proposed data hiding scheme

Step 4. Bob receives the solution that contains the hidden
message sent by Alice. Since Alice has used the transmitter
information to block expression of her
message, only Bob, who possesses the receiver information,
has the means to unveil the information hidden in
the solution received from Alice. He can then extract the
message by unblocking
expression of the genes downstream of the message
promoter. In our example, Bob removes LacI by adding IPTG;
once LacI is removed, the GFP gene is expressed and the
solution that contains the hidden, coded information turns
green.
Step 5. By analyzing the resulting plasmid, Bob can recover
the message sent by Alice, which lies between the GFP gene
and the known sequence of nucleotides that was
previously defined as part of the transmitter information.
Step 6. By decoding the sequence of nucleotides
obtained in Step 5 according to the genetic code
table shown in Fig. 1, Bob finds the message Alice
sent to him.
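
The logic of Algorithm II can be illustrated with a toy model in Python. It is only a logical sketch under loud assumptions: `REPORTER` and `PADDING` are made-up placeholder strings (not the real GFP sequence or the paper's padding sequence), repressor binding is reduced to a boolean flag, and the names `Plasmid`, `alice_hide` and `bob_reveal` are hypothetical.

```python
from dataclasses import dataclass

# Placeholder sequences, chosen only for illustration (assumptions).
REPORTER = "ATGGTGAGC"   # stands in for the GFP reporter gene
PADDING = "TTAATTAA"     # stands in for the known end-of-message sequence


@dataclass
class Plasmid:
    insert: str                    # reporter + message + padding
    repressor_bound: bool = False  # True while LacI blocks the promoter

    def expressed(self):
        """Genes downstream of the promoter are expressed only if unblocked."""
        return not self.repressor_bound


def alice_hide(message_dna):
    """Steps 1-3: pad the coded message and block expression with LacI."""
    return Plasmid(insert=REPORTER + message_dna + PADDING,
                   repressor_bound=True)


def bob_reveal(plasmid, inducer):
    """Steps 4-5: IPTG releases LacI; the message lies between the
    reporter gene and the padding sequence."""
    if inducer == "IPTG":
        plasmid.repressor_bound = False
    if not plasmid.expressed():
        return None                # solution does not turn green
    body = plasmid.insert[len(REPORTER):]
    if not body.endswith(PADDING):
        raise ValueError("padding sequence not found at end of insert")
    return body[:-len(PADDING)]


hidden = alice_hide("AAGGGAAATGGA")
print(bob_reveal(hidden, inducer=None))    # no inducer: nothing is revealed
print(bob_reveal(hidden, inducer="IPTG"))  # inducer added: message recovered
```

The recovered string would still have to be decoded against the original carrier gene, as in Step 6.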
B. A new in-vivo watermarking scheme
A difficulty common to the
watermarking algorithms presented up to now is the
restricted size of the watermark picture. We overcome this
restriction by using DNA molecules to code image
information into the genomes of two microbes. Our proposed
method can encode very large images,
since the total genome length of E. coli and phage lambda
can be used to encode the host image and the watermark
picture, respectively (approximately 4000 × 4000 pixels for
the host image and 127 × 127 pixels for the watermark
image). Furthermore,
we can use larger phages to encode larger watermark
images. It is noteworthy that, to implement this
scheme in the laboratory, we should control the infection
of E. coli so that it remains in the lysogenic cycle;
as shown in [14], this can be achieved with a
probability of at least 90%. Like other watermarking
schemes [15]-[17], the proposed scheme includes two steps,
embedding the watermark image and extracting the
watermark image:

1) Embedding watermark: In the proposed method,
the host image is first converted into a string of
bits, which is then mapped onto the DNA sequence of the
bacterium E. coli.
The mapping between image bits and codons is based on the
concept of silent mutation, a kind of mutation
that does not alter the amino acid and so does not show up
in the phenotype, as explained in section (2.A). In the
sequel, several algorithms are given to explain the detailed
procedure of the proposed watermarking method.

Algorithm III. Coding the host image in
the genome of E. coli:
Step 1. Select the host image.
Step 2. Represent the host image as a sequence of binary
bits using the halftone technique [17].
Step 3. Select a gene from the E. coli genome whose
length in nucleotides is at least three times the number of
bits of the image.
Step 4. Code the sequence of Step 2 in the genome of
E. coli such that, if a bit is zero, the corresponding
codon of the gene is left unchanged;
otherwise, the corresponding codon incurs a
silent mutation.
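
The size requirement in Step 3 follows from one bit per codon, that is, three nucleotides per pixel. A small sketch of this arithmetic is given below; the function names are hypothetical, and the lambda genome length of 48,502 bp is an assumption taken from the commonly cited reference sequence rather than from this paper, which rounds it to 50 kbp. The figure is also an upper bound, since it treats every codon as usable.

```python
from math import isqrt


def required_nucleotides(width, height):
    """Nucleotides needed to carry a width x height binary image
    at one bit per codon (three nucleotides per bit)."""
    return 3 * width * height


def max_square_side(carrier_bp):
    """Side of the largest square binary image that fits in a carrier
    of `carrier_bp` nucleotides, assuming every codon can carry a bit."""
    return isqrt(carrier_bp // 3)


LAMBDA_GENOME_BP = 48502  # assumption: reference lambda genome length

print(required_nucleotides(40, 40))       # 40 x 40 watermark of Section 4
print(max_square_side(LAMBDA_GENOME_BP))  # largest square watermark in lambda
```

With these assumptions the largest square watermark that fits in the lambda genome has a side of 127 pixels, which agrees with the 127 × 127 figure quoted for the watermark image.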
In the next step, we select specific sites in the
genomes of E. coli and phage lambda to serve as
their sticky ends. It is essential that the sticky
ends be unique. Otherwise, in the laboratory,
the circular DNA of E. coli will be cut in several places
and will not yield an appropriate result.

Algorithm IV. Coding the watermark
image in the genome of lambda phage:
Step 1. Select the watermark image.
Step 2. Represent the watermark image as a sequence of
binary bits using the halftone technique.
Step 3. Select a gene from the lambda phage genome
whose length in nucleotides is at least three times the
number of bits of the image.
Step 4. Code the sequence of Step 2 in the genome of
phage lambda such that, if a bit is zero, the corresponding
codon of the gene is left unchanged;
otherwise, the corresponding codon incurs a
silent mutation.
Algorithms III and IV result in two test tubes that contain
the coded DNA strands of the host image and the watermark
image, respectively. In the next algorithm, an appropriate
site is selected for the computer simulation of the
insertion of the phage DNA into the DNA of E. coli.

Algorithm V. Selection of the sticky ends of E. coli and
phage lambda:
Step 1. Receive the genes of E. coli and phage lambda
produced in Algorithms III and IV.
Step 2. Find a cos site in phage lambda and the
corresponding site in the E. coli bacterium.
In Step 2 of Algorithm V, we use the cos site obtained in
[18] as a secondary attachment site. In the
next algorithm, insertion of the phage lambda DNA into the
DNA of E. coli is simulated by computer.

Algorithm VI. Insertion of the DNA of phage lambda into
that of E. coli:
Step 1. Cut the double-stranded DNA of E. coli and
lambda phage at their sticky ends.
Step 2. Insert the lambda phage fragment at the
cut site of the E. coli DNA.

2) The watermark image extraction: In this stage, to
extract the watermark image, the owner must excise
the lambda phage from the solution containing the infected
E. coli, and then extract the phage gene and decode it.
The stages of watermark extraction are
given in Algorithm VII.

Algorithm VII. Excising phage lambda from the infected E.
coli
Step 1. The owner, using his knowledge of the sticky ends of
the phage that he himself inserted, finds the two identical
sticky ends and cuts at each of them in turn.
Step 2. Extract the shorter sequence (phage
lambda).
Step 3. Rejoin the free ends of the remaining
sequence (E. coli).
In the laboratory, Algorithm VII is implemented by
centrifuging the resulting solution. Because the
lambda phage sequence is shorter than that of E. coli, it
moves faster in the test tube and so it can be easily
extracted.
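
At the level of a computer simulation, Algorithms VI and VII amount to string insertion and excision around a known site. The following is a minimal sketch under simplifying assumptions: DNA is treated as a linear, single-stranded string, the attachment site is an arbitrary short string rather than the real cos site, and `integrate` and `excise` are hypothetical names.

```python
def integrate(host, phage, site):
    """Algorithm VI (sketch): insert `phage` right after the unique `site`
    in `host`, followed by a second copy of `site`, so that the phage ends
    up flanked by two identical sites."""
    if host.count(site) != 1:
        raise ValueError("attachment site must occur exactly once in host")
    if (phage + site).find(site) != len(phage):
        # The coded phage region must not contain, or create at its end,
        # another copy of the site.
        raise ValueError("phage sequence produces an extra site")
    cut = host.index(site) + len(site)
    return host[:cut] + phage + site + host[cut:]


def excise(recombinant, site):
    """Algorithm VII (sketch): cut at the two identical sites, take out
    the shorter piece (phage) and rejoin the remaining host sequence."""
    first = recombinant.find(site)
    second = recombinant.find(site, first + len(site)) if first != -1 else -1
    if second == -1:
        raise ValueError("two flanking sites are required")
    start = first + len(site)
    phage = recombinant[start:second]
    host = recombinant[:start] + recombinant[second + len(site):]
    return host, phage


host_dna = "AAACCGGTTT"   # illustrative sequences, not from the paper
phage_dna = "TATATA"
infected = integrate(host_dna, phage_dna, "CCGG")
print(infected)
print(excise(infected, "CCGG"))
```

Only someone who knows the site can locate the inserted segment, which mirrors the owner's role in Step 1 of Algorithm VII.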

Algorithm VIII. Decoding
Step 1. Compare the sequence of the phage extracted in
Step 2 of Algorithm VII with the original gene of the
phage to extract the watermark image.
Step 2. Compare the DNA sequence of the E. coli resulting
from Step 3 of Algorithm VII with the original sequence
of E. coli to extract the host image.

C. An information annihilation scheme
The last scheme we introduce in this paper is a data
annihilation scheme, which can be used to delete a message
in in-vivo computers. This scheme uses
infection of E. coli by bacteriophage lambda in the Lytic
cycle. The scenario in which our wetware data annihilation
scheme is useful is as follows.
Assume that Alice has encoded her message into the
genome of E. coli. For a variety of reasons, she may wish to
delete this message so that no one else can see or recover
it. The mechanics of this scheme are given in the
algorithm below.
Algorithm IX. A wetware data annihilation scheme
Step 1. Alice infects the E. coli bacterium that carries
her encoded information (the data encoding scheme uses the
silent mutation property, as explained in section (2.A)).
Step 2. The infection is carried out using a
bacteriophage (we assume bacteriophage
lambda). Alice controls the infection so that it enters the
Lytic cycle.
Step 3. The E. coli bacterium containing the encoded
message is completely destroyed and, therefore, her data is
completely deleted.
As can be seen in the above algorithm, Alice is able to
delete her message so that it can no longer be recovered.
The proposed scheme is analogous to deleting a file from
the recycle bin of an ordinary silicon computer, where the
user intends to delete a file so that it cannot be retrieved
by any means.
Our wetware scheme has advantages over the usual deletion
method in silicon-based computers: it is natural and incurs
no loss, and the process it uses is
one-way and cannot be reversed, so it guarantees
deletion of the message. In silicon computers, by contrast,
deleted data may remain at its
address in memory even after deletion, so there
are bits in memory from which the deleted
information can leak. One example of such
information leakage in ordinary silicon-based computers is
the cold boot attack introduced in [19].
4. Simulation Results
To show the performance of our proposed scheme, we
ran a simulation in which we embed a
picture of the emblem of our university, of size 40 × 40
pixels, into a 200 × 200 pixel Lenna image (Fig. 8(a)).
Next, we present the results of our computer
simulation in two parts: embedding the watermark and
extracting it.











Fig. 8 (a) Host image (b) watermark image

1) Embedding the watermark: Our proposed method of
embedding the watermark is divided into three parts:
1. Coding the host image in a specific gene of E. coli
The host image is first converted to halftone,
i.e. in the dark parts of the image the density of black
pixels is higher and in the bright parts the density of
black pixels is lower.
0 0 0 0 0 0 1 0
0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 1
0 0 0 0 0 1 0 1
0 1 0 0 1 0 1 1
0 0 0 0 1 0 0 1
0 1 0 1 0 1 1 0
1 1 1 0 1 0 0 1



Fig. 9 Host image
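
The paper does not state which halftone method from [17] is used, so the following sketch substitutes a simple, well-known technique, ordered dithering with a 2 × 2 Bayer matrix, purely to illustrate how a grayscale image becomes a bit matrix in which darker areas contain more black pixels. The function name `halftone` and the convention that 1 means a black pixel are assumptions.

```python
# 2 x 2 Bayer index matrix used for ordered dithering (swapped-in
# technique; the paper's own halftone method is not specified).
BAYER_2X2 = [[0, 2],
             [3, 1]]


def halftone(gray):
    """Convert rows of gray levels (0 = black, 255 = white) into rows of
    bits, where 1 marks a black pixel. Darker regions yield more 1s."""
    bits = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Position-dependent threshold spread evenly over 0..255.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value < threshold else 0)
        bits.append(out_row)
    return bits


dark_patch = [[64, 64], [64, 64]]
bright_patch = [[192, 192], [192, 192]]
print(halftone(dark_patch))
print(halftone(bright_patch))
```

The resulting rows can be flattened into the bit sequence that Step 2 of Algorithm III expects.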

Based on the size of the host image, we have chosen the
leuL gene of E. coli to encode the host image. The part of
the original gene that corresponds to the
selected area shown in Fig. 9 is presented in the matrix
of Equ. 5.




(Equ. 5)

After coding the host image in leuL, the coded matrix of the
selected area of the image is as shown in Equ. 6.




(Equ. 6)

As we can see, if a pixel of the image is one, its
corresponding codon incurs a silent mutation (its amino
acid does not change), and if the pixel is zero, its
corresponding codon does not change.
2. Coding the watermark image in the genome of bacteriophage
lambda
We use lambda phage as the carrier of our watermark and
encode our watermark image in it. Note that the part of the
lambda phage genome chosen for encoding must not contain a
cos site, because the DNA is cut at the sticky ends and a
cut inside the coded region would break up the image.
1 1 1 1 0 0 0 1
1 1 1 1 1 0 0 1
1 1 1 0 0 0 0 1
0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 1
1 0 1 0 0 0 0 1
1 1 1 1 0 0 0 1
1 1 1 1 0 0 0 1




Fig. 10 The watermark image
The part of the phage lambda genome that corresponds to the
selected area of the watermark image is shown in the matrix
of Equ. 7.

$$
M_{\text{Code\_SC\_Lambda}} =
\begin{bmatrix}
\text{AAA GGA AAC GGC TTT TCG CGA TCT} \\
\text{AAG GAT AAG CCA AGA GAT CCG CAG} \\
\text{CTA CAT GGT ATT GTT GTC GGT GTT} \\
\text{ATG GGA TAG TGT CAG GCT ATA CCG} \\
\text{ACG GGA GGG TGG CTT CGA AAG AAA} \\
\text{TTC CTG ATC TGG ATC AAG CTA GAT} \\
\text{GAC AGC AAA TGC CAA TTT CGC CCC} \\
\text{CGG ACT ATC TAT CTC CTT GAA TAG}
\end{bmatrix}
$$
(Equ. 7)

After coding the watermark image in the lambda phage genome,
the coded matrix of the selected area of the image is as
shown in the matrix of Equ. 8.

$$
M_{\text{Code\_SC\_Lambda}} =
\begin{bmatrix}
\text{AAG GGG AAT GGG TTT TCG CGA AGC} \\
\text{AAA GAC AAA CCG AGG GAT CCG CAA} \\
\text{CTG CAC GGG ATT GTT GTC GGT GTG} \\
\text{ATG GGA TAG TGT CAG GCT ATT CCT} \\
\text{ACG GGA GGG TGG CTT CGA AAA AAG} \\
\text{TTT CTG ATA TGG ATC AAG CTA GAC} \\
\text{GAT AGG AAG TGT CAA TTT CGC CCG} \\
\text{AGG ACG ATA TAC CTC CTT GAA TAA}
\end{bmatrix}
$$
(Equ. 8)
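
The decoding step of Algorithm VIII can be tried directly on the two matrices above. The sketch below copies the codon rows of Equ. 7 and Equ. 8 and marks each position 1 if the codon changed and 0 otherwise; the names `EQU7`, `EQU8` and `changed_codon_bits` are illustrative. It only tests whether a codon changed and does not check that each change is synonymous.

```python
# Codon rows copied from Equ. 7 (original) and Equ. 8 (coded).
EQU7 = [
    "AAA GGA AAC GGC TTT TCG CGA TCT",
    "AAG GAT AAG CCA AGA GAT CCG CAG",
    "CTA CAT GGT ATT GTT GTC GGT GTT",
    "ATG GGA TAG TGT CAG GCT ATA CCG",
    "ACG GGA GGG TGG CTT CGA AAG AAA",
    "TTC CTG ATC TGG ATC AAG CTA GAT",
    "GAC AGC AAA TGC CAA TTT CGC CCC",
    "CGG ACT ATC TAT CTC CTT GAA TAG",
]
EQU8 = [
    "AAG GGG AAT GGG TTT TCG CGA AGC",
    "AAA GAC AAA CCG AGG GAT CCG CAA",
    "CTG CAC GGG ATT GTT GTC GGT GTG",
    "ATG GGA TAG TGT CAG GCT ATT CCT",
    "ACG GGA GGG TGG CTT CGA AAA AAG",
    "TTT CTG ATA TGG ATC AAG CTA GAC",
    "GAT AGG AAG TGT CAA TTT CGC CCG",
    "AGG ACG ATA TAC CTC CTT GAA TAA",
]


def changed_codon_bits(original_rows, coded_rows):
    """Compare two codon matrices position by position:
    1 where the codon differs, 0 where it is unchanged."""
    return [
        [int(o != c) for o, c in zip(orig.split(), coded.split())]
        for orig, coded in zip(original_rows, coded_rows)
    ]


for row in changed_codon_bits(EQU7, EQU8):
    print(*row)
```

The recovered bits match the selected area of the watermark image in Fig. 10 row by row.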

3. Infection of Escherichia coli with phage lambda
To infect E. coli, bacteriophage lambda integrates
its genome into the TrpC gene of E. coli at its cos site;
the resulting recombinant DNA sequence is shown in
Fig. 11.

ATGCAAACCGTTTTAGCGAAAATCGTCGCAGACAAGGCGATTTGGGTAGAAGCCCGCAAA
CAGCAGCAACCGCTGGCCAGTTTTCAGAATGAGGTTCAGCCGAGCACGCGACATTTTTAT
GATGCGCTACAGGGTGCGCGCACGGCGTTTATTCTGGAGTGCAAGAAAGCGTCGCCGTCA
AAAGGCGTGATCCGTGATGATTTCGATCCAGCACGCATTGCCGCCATTTATAAACATTAC
GCTTCGGCAATTTCGGTGCTGACTGATGAGAAATATTTTCAGGGGAGCTTTAATTTCCTC
CCCATCGTCAGCCAAATCGCCCCGCAGCCGATTTTATGTAAAGACTTCATTATCGACCCT
TACCAGATCTATCTGGCGCGCTATTACCAGGCCGATGCCTGCTTATTAATGCTTTCAGTA
CTGGATGACGACCAATATCGCCAGCTTGCCGCCGTCGCTCACAGTCTGGAGATGGGGGTG
CTGACCGAAGTCAGTAATGAAGAGGAACAGGAGCGCGCCATTGCATTGGGAGCAAAGGTC
GTTGGCATCAACAACCGCGATCTGCGTGATTTGTCGATTGATCTCAACCGTACCCGCGAG
CTTGCGCCGAAACTGGGGCACAACGTGACGGTAATCAGCGAATCCGGCATCAATACTTAC
GCTCAGGTGCGCGAGTTAAGCCACTTCGCTAACGGTTTTCTGATTGGTTCGGCGTTGATG
GCCCATGACGATTTGCACGCCGCCGTGCGCCGGGTGTTGCTGGGTGAGAATAAAGTATGT
GGCCTGACGCGTGGGCAAGATGCTAAAGCAGCTTATGACGCGGGCGCGATTTACGGTGGG
TTGATTTTTGTTGCGACATCACCGCGTTGCGTCAACGTTGAACAGGCGCAGGAAGTGATG
GCTGCGGCACCGTTGCAGTATGTTGGCGTGTTCCGCAATCACGATATTGCCGATGTGGTG
GACAAAGCTAAGGTGTTATCGCTGGCGGCAGTGCAACTGCATGGTAATGAAGAACAGCTG
TATATCGATACGCTGCGTGAAGCTCTGCCAGCACATGTTGCCATCTGGAAAGCATTAAGC
GTCGGTGAAACCCTGCCCGCCCGCGAGTTTCAGCACGTTGATAAATATGTTTTAGACAAC
GGCCAGGGTGGAAGCGGGCAACGTTTTGACTGGTCACTATTAAATGGTCAATCGCTTGGC
AACGTTCTGCTGGCGGGGGGCTTAGGCGCAGATAACTGCGTGGAAGCGGCACAAACCGGC
TGCGCCGGACTTGATTTTAATTCTGCTGTAGAGTCGCAACCGGGCATCAAAGACGCACGT
CTTTTGGCCTCGGTTTTCCAGACGCTGCGCGCATATTAA

Fig. 11 The specific part of the infected E.coli genome

The specific part of the phage genome that carries the cos
site is depicted in Fig. 12.

TATTTAGCTTTCTGCTTCCTTTTGGATAACCCACTGTTATTCATGTTGCATGGTGCACTG
TTTATACCAACGATATAGTCTATTAATGCATATATAGTATCGCCGAACGATTAGCTCTTC
AGGCTTCTGAAGAAGCGTTTCAAGTACTAATAAGCCGATAGATAGCCACGGACTTCGTAG
CCATTTTTCATAAGTGTTAACTTCCGCTCCTCGCTCATAACAGACATTCACTACAGTTAT
GGCGGAAAGGTATGCATGCTGGGTGTGGGGAAGTCGTGAAAGAAAAGAAGTCAGCTGCGT
CGTTTGACATCACTGCTATCTTCTTACTGGTTATGCAGGTCGTAGTGGGTGGCACACAAA
GCTTTGCACTGGATTGCGAGGCTTTGTGCTTCTCTGGAGTGCGACAGGTTTGATGACAAA
AAATTAGCGCAAGAAGACAAAAATCACCTTGCGCTAATGCTCTGTTACAGGTCACTAATA
CCATCTAAGTAGTTGATTCATAGTGACTGCATATGTTGTGTTTTACAGTATTATGTAGTC
TGTTTTTTATGCAAAATCTAATTTAATATATTGATATTTATATCATTTTACGTTTCTCGT
TCAGCTTTTTTATACTAAGTTGGCATTATAAAAAAGCATTGCTTATCAATTTGTTGCAAC
GAACAGGTCACTATCAGTCAAAATAAAATCATTATTTGATTTCAATTTTGTCCCACTCCC
TGCCTCTGTCATCACGATACTGTGATGCCATGGTGTCCGACTTATGCCCGAGAAGATGTT
GAGCAAACTTATCGCTTATCTGCTTCTCATAGAGTCTTGCAGACAAACTGCGCAACTCGT
GAAAGGTAGGCGGATCCCCTTCGAAGGAAAGACCTGATGCTTTTCGTGCGCGCATAAAAT
ACCTTGATACTGTGCCGGATGAAAGCGGTTCGCGACGAGTAGATGCAATTATGGTTTCTC
CGCCAAGAATCTCTTTGCATTTATCAAGTGTTTCCTTCATTGATATTCCGAGAGCATCAA
TATGCAATGCTGTTGGGATGGCAATTTTTACGCCTGTTTTGCTTTGCTCGACATAAAGAT
ATCCATCTACGATATCAGACCACTTCATTTCGCATAAATCACCAACTCGTTGCCCGGTAA
CAACAGCCAGTTCCATTGCAAGTCTGAGCCAACATGGTGATGATTCTGCTGCTTGATAAA
TTTTCAGGTATTCGTCAGCCGTAAGTCTTGATCTCCTTACCTCTGATTTTGCTGCGCGAG
TGGCAGCGACATGGTTTGTTGTTATATGGCCTTCAGCTATTGCCTCTCGGAATGCATCGC

Fig. 12 The specific part of the lambda bacteriophage genome

The infection of E. coli by phage lambda, and the way the
phage genome enters the circular DNA of E. coli, is
demonstrated in Fig. 13.
Arash Karimi et al.: Achieving Secrecy for Images Using in-vivo DNA Cloning Techniques
International Journal Publishers Group (IJPG)


2) Watermark extraction: In an appropriate watermarking
scheme only the owner of the image can extract his
watermark, and our proposed scheme meets this need. Only
the person who embeds the watermark knows the genes in
which the watermark and the host image are hidden, and he
is also the only one who knows the sticky ends needed to
select the appropriate enzymes for extracting the phage.
Extraction from the infected E. coli is the reverse of the
process of embedding the watermark image into the host
image.


Fig. 13 The specific part of the infected E.coli genome

5. Security Analysis of the Proposed Schemes
In this section we analyze the proposed schemes and justify
their security.
In the first scheme, a wetware hiding mechanism was
proposed which uses a pair of chemical substances on the
transmitter and receiver sides to hide a message, often a
picture, in E. coli cells. Several security parameters
contribute to the secrecy of this scheme. First, the pairs
are chosen for the property that binding one biochemical
substance to a plasmid detaches another biochemical
substance from it; this pair plays an important role in the
security of the scheme. Second, and equally important, is
the strength of our encoding scheme, which is itself a
hiding mechanism. Third, the known sequence appended to the
end of the message marks where the message ends. The
importance and roles of these security parameters are
easier to understand in the security analysis of the
proposed watermarking scheme, which follows.
To analyze the security of the proposed wetware
watermarking scheme, we first note that many
characteristics contribute to its security and let the
owner of the watermark prove his ownership. Each of these
is called a security characteristic of the system. They are
defined as follows:
1. The bacteriophage in which the watermark information is
encoded.
2. The specific gene used by the owner of the watermark for
encoding the watermark picture.
3. The specific location within that gene at which the
watermark picture is encoded.
4. The length of the sticky end.
5. The DNA sequence of the sticky end, which the owner
holds as a secret key.
6. The specific bacterium used for encoding the information
of the host image.
7. The specific gene used for encoding the host image.
8. The specific location in that gene at which the host
image is encoded.
Each of the abovementioned characteristics plays a role in
achieving secrecy in the proposed watermarking scheme. To
better understand these roles, we analyze the attack
conditions on this system.
Assume that an attacker on our watermarking scheme
possesses a test tube containing E. coli bacteria infected
by the lambda phage. He may wish to extract the information
hidden in E. coli and the lambda phage. To do this, he must
analyze the DNA sequence of the E. coli genome, which
contains the genome of the lambda phage. Since he does not
know the location of the hidden information, he must
analyze all 5000 kb of the E. coli genome to find the
hidden information related to the host image. The owner of
the watermark knows the specific gene in which the host
image is hidden, as well as the location of the encoded
information in that gene, and can therefore find it very
easily. The attacker, in contrast, must search all 48502 bp
of the lambda phage genome to find the specific location in
which the watermark information is hidden. Furthermore, an
attacker must also consider all bacteria that are infected
with different phages and then analyze their genomes, which
is a demanding task because of the variety of bacteria and
phages. Only the owner of the watermark knows which phage
has been used for hiding the watermark information, and
obtaining this information is both necessary and difficult.
The problem becomes more complex when a bacterium is
infected by a variety of bacteriophages, since the attacker
then has more difficulty finding the virus in which the
watermark information is hidden.
The specific cross sequence, which is a subsequence of the
15-nucleotide cos site, is also known only to the owner,
and it tells him the exact location in the E. coli genome
at which the lambda phage is integrated. An adversary
therefore has to look for a site in the E. coli genome that
is complementary to a specific sequence of the phage genome
and then make a cut at that specific point.
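The site search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 15-nucleotide sequence is the one printed in Fig. 13, while the helper names (`complement`, `reverse_complement`, `find_sites`) and the toy fragment are hypothetical and not part of the original simulation.

```python
# Illustrative sketch only; helper names and the toy fragment are assumptions.
CORE = "GCTTTTTTATACTAA"  # 15-nucleotide sequence shown in Fig. 13

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement of a DNA string (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand read in the opposite direction."""
    return complement(seq)[::-1]


def find_sites(genome: str, site: str) -> list[int]:
    """Return all 0-based start positions of `site` in `genome`.

    Overlapping matches are reported; only the given strand is scanned.
    """
    positions = []
    start = genome.find(site)
    while start != -1:
        positions.append(start)
        start = genome.find(site, start + 1)
    return positions


# Toy fragment (hypothetical): the core sequence with short flanks.
fragment = "TCA" + CORE + "GTTGGCATT"
sites = find_sites(fragment, CORE)
```

An owner who knows `CORE` performs a single search of this kind, whereas an adversary who does not know it has no pattern to pass to `find_sites` and must fall back on trying candidate sequences.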
(Fig. 13 labels: the lambda phage and E. coli genomes both
carry the sequence GCTTTTTTATACTAA, with complementary
strand CGAAAAAATATGATT.)
Since the cross sequence can have different lengths, each
possible cross sequence can serve as a cut point in the
genome of E. coli. If the length of the cross sequence is
$l$, then each of its $l$ positions can hold one of 4
different bases, so an adversary must analyze
$4^{l} = 2^{2l}$ possibilities in order to find the cut
point in the genome of E. coli. Therefore, the complexity
of finding the cut point in the E. coli genome is of order
$O(2^{2l})$, which is exponential in $l$.
A variety of secondary cos sites can be used as the entry
point of the phage into the E. coli genome, and we used the
sequence derived in [18] as this secondary cross sequence.
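The counting argument can be checked numerically. The following is a minimal sketch, assuming the adversary simply enumerates every DNA string of a given length; the function names are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only; function names are assumptions.
def candidate_count(length: int) -> int:
    """Number of DNA strings of the given length: 4**length == 2**(2*length)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


def total_candidates(max_length: int) -> int:
    """Candidates to try when the length is unknown but at most max_length."""
    return sum(candidate_count(l) for l in range(1, max_length + 1))


# For a 15-nucleotide cross sequence the search space is 4**15 == 2**30.
space_15 = candidate_count(15)
```

For a known length of 15 the adversary faces 1073741824 candidates, and not knowing the length only enlarges the search, as `total_candidates` shows.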
As the above security analysis shows, because of the
variety of security parameters, the proposed watermarking
model has a high level of security against attacks in which
the attacker possesses the test tube containing the
infected bacteria.
The security of the proposed data deletion procedure can
also be justified by the fact that Lytic reproduction and
cell destruction take place naturally and need no extra
control once the infection procedure starts. The procedure
is therefore trustworthy and guarantees deletion of the
encoded message.
D. Considerations of implementing the proposed methods
in cloning laboratories
The proposed schemes for security assurance in images
can be easily implemented in genetic engineering
laboratories. In the first scheme, data encoding in
plasmids can be implemented using UV to change the
nucleotides of a codon, and the usual laboratory techniques
can be used to add LacI as well as IPTG to the synthesized
plasmid. In the second and third proposed schemes, the most
important task is mixing two solutions, one containing the
bacterial species and one containing the bacteriophage
species, and then controlling this procedure so that the
E. coli bacteria are properly infected and the solution
does not enter the Lytic cycle in the watermarking scheme,
or does enter the Lytic cycle in the data annihilation
scheme.
The laboratory setup for the engineered infection process
is explained in [14]. The results of that laboratory
experiment demonstrate that, under the explained
conditions, we can make sure that 99% of bacteria are
infected and that 90% of the bacterial population follows
the Lysogeny cycle.
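To give a feel for what these rates imply, the sketch below combines them for a population of cells. It rests on assumptions that are not stated in [14] or in this paper: cells are treated as independent, and the 90% figure is read as the share of infected cells that follow the Lysogeny cycle. The function names are hypothetical.

```python
# Illustrative sketch only; independence of cells and the reading of the 90%
# figure as conditional on infection are assumptions, not reported results.
def expected_lysogens(n_cells: int,
                      p_infect: float = 0.99,
                      p_lysogeny: float = 0.90) -> float:
    """Expected number of cells that are infected and follow the Lysogeny cycle."""
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    return n_cells * p_infect * p_lysogeny


def prob_at_least_one(n_cells: int,
                      p_infect: float = 0.99,
                      p_lysogeny: float = 0.90) -> float:
    """Probability that at least one cell carries the integrated phage."""
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    p_single = p_infect * p_lysogeny
    return 1.0 - (1.0 - p_single) ** n_cells


per_thousand = expected_lysogens(1000)
```

Under these assumptions roughly 891 of every 1000 cells would retain the watermark, which is the kind of redundancy the watermarking scheme relies on when some cells are lost to the Lytic cycle.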
To make sure that the infection process follows the
Lysogeny cycle, the experiment must be controlled; without
control, the procedure tends toward the Lytic cycle. In
practice, therefore, implementing the second proposed
scheme is the more demanding task.
E. The problems associated with the proposed methods
The proposed approach for providing security in images
has some constraints and drawbacks, which can be classified
into two groups. The first group contains general drawbacks
associated with all computational systems based on DNA
molecules, and the second is specific to the proposed
scheme, which uses cloning techniques as a model for
computing with DNA.
The drawbacks of the first type include the lack of access
to biotechnology facilities, which is fading away as
genetic engineering technology grows. The other problem of
the first type is the high error rate attached to all
operations on DNA sequences. These errors make it necessary
to repeat the biotechnological experiments to obtain
trustworthy results.
The problems of the second kind, associated with the
watermarking scheme, relate to controlling the infection
procedure of the phage so that it is prevented from
entering the Lytic cycle. If it does enter the Lytic cycle,
the cell containing the information will be lost. But as
shown in the laboratory considerations mentioned above, we
are able to control this cycle such that more than 90% of
the infected cells enter the Lysogeny cycle. Precision in
the experiments is the key to achieving this goal.

6. Conclusion
In this paper three genetically engineered machines
were proposed to ensure secrecy in transmitting images in
DNA-based computers. Our proposed schemes are the first
security initiatives for images in the DNA computing
context. They provide some improvements over the existing
methods implementable in silicon computers. The usual
data hiding mechanisms in silicon computers have the
practical drawback that data can be leaked, whereas the
dense medium of the DNA molecule lets us use our proposed
scheme along with the novel data encoding and retrieval
mechanisms. Our in-vivo watermarking scheme also takes
advantage of a natural process which takes place during
infection of the E. coli bacterium by bacteriophage lambda
and solves the problem of size limitation of the existing
watermarking schemes. Our analysis predicts that the
proposed scheme can be implemented in the laboratory with
90% probability of success.
The proposed watermarking scheme relies on infection of
E. coli in the Lysogeny cycle, and we used the Lytic cycle
of infection of E. coli for another security assurance
scheme, which we called the data annihilation scheme. The
process considered in this security primitive is a
naturally one-sided procedure in which bacteriophages
annihilate the cells of an E. coli bacterium. This last
secrecy primitive can be used like the deletion of a file
from a silicon-based computer, but the one-sided property
of our proposed scheme guarantees deletion of the data
(such as an image file) so that it cannot be retrieved.
Since it takes its motif from a one-sided phenomenon which
occurs in nature, it has no leakage, in contrast to
silicon-based computers, for which, as stated in [19],
deleted information is recoverable and leaks into different
locations of memory.

Acknowledgement
The authors would like to thank R. Dastanian for her
useful comments and support in the computer simulations of
this paper, and Iran Telecommunications Research Centre
(ITRC) for supporting this paper financially.
References
[1] L.M. Adleman, Molecular computation of solutions to
combinatorial problems, (1994) Science, vol. 266, pp.
1021-1024.
[2] L. Landweber & L. Kari, The evolution of cellular
computing: nature's solution to a computational problem,
(1998) LNCS, vol. 2950, pp. 207-216.
[3] L. Kari, DNA Computing: Arrival of Biological
Mathematics, (1997) The Mathematical Intelligencer, vol.
19, No.2, pp. 9-22.
[4] T. Head, Formal Language Theory and DNA: an analysis of
the generative capacity of specific recombinant behaviors,
(1987) Bull. Math. Biology, vol. 49, pp. 737-759.
[5] L.M. Adleman, On constructing a molecular computer,
(1996) In R.J. Lipton, E.M. Baum (Eds), DNA based
Computers I, Proceedings of a DIMACS Workshop,
American Mathematical Society, Providence, RI, USA,
pp.1-22.
[6] J. Khodor & D. Gifford, Design and implementation of
computational systems based on programmed mutagenesis,
(1998) In Preliminary Proceedings of 4th DIMACS Workshop
on DNA Based Computers, pp. 101-108.
[7] Y. Benenson, B. Gil, U. Ben-Dor, R. Adar & E. Shapiro, An
autonomous molecular computer for logical control of gene
expression, (2004) Nature 414, pp. 430-434.
[8] A. Gehani, T. LaBean & J. Reif, DNA-based Cryptography,
Aspects of Molecular Computing, (2004) Springer-Verlag
Lecture Notes In Computer Science, vol. 2950.
[9] National Bureau of Standards: Data Encryption Standard,
(1977) U.S. Department of Commerce, FIPS, pub.46.
[10] D. Boneh, C. Dunworth & R. Lipton, Breaking DES Using
a Molecular Computer, (1995) Princeton CS Tech-Report
CS-TR, pp.489-495.
[11] L.M. Adleman, P.W. K. Rothemund, S.Roweis & E. Winfree,
On applying molecular computation to the Data Encryption
Standard, (1999) 2nd annual workshop on DNA Computing,
Princeton University, Eds. L. Landweber and E. Baum,
DIMACS: series in Discrete Mathematics and Theoretical
Computer Science, American Mathematical Society. pp.
31-44.
[12] R. Dastanian, A. Karimi & H.Sh. Shahhoseini, A Novel
Multi-Client Authentication Method Using Infection of
Bacteria, (2011) Proceedings of International Conference on
Communication and Electronics Information (ICCEI), China,
1, pp. 310-314.
[13] A. Karimi, R. Dastanian & H. Sh. Shahhoseini, A New
Watermarking Scheme for An in-vivo Computer Based on
Infection of E. coli, (2010) Proceedings of International
Conference on Computer and Electrical Engineering
(ICCEE), China, 8, pp. 484-488.
[14] B.A. Fry, Conditions for the Infection of Escherichia coli
with Lambda Phage and for the Establishment of Lysogeny,
(1959) Journal of gen. Microbiol, vol. 21, pp. 676-684.
[15] Ch.H. Huang, Sh.Ch. Chuang, Y.L. Huang & J.L. Wu,
Unseen Visible Watermarking: A Novel Methodology for
Auxiliary Information Delivery Via Visual Contents, (2009)
IEEE Trans. on Information Forensics and Security, vol. 4,
No. 2.
[16] Y.C. Hou, Visual cryptography for color images, (2003)
Elsevier Journal of Pattern Recognition, vol. 36, pp.
1619-1629.
[17] X. Wang, Z. Xu & P. Niu, A feature-based digital
watermarking scheme for halftone image, (2010) AEU -
International Journal of Electronics and Communications,
vol. 64, no. 10, pp. 924-933.
[18] G.E. Christie & T.Platt, A secondary attachment site for
bacteriophage lambda in trpC of E.coli, (1979) Cell, vol.16,
no. 2, pp. 407-413.
[19] A. Halderman, S. Schoen, N. Heninger, W. Clarkson, W.
Paul, J. Calandrino, A. Feldman, J. Appelbaum & E. Felten.
Lest we remember: Cold boot attacks on encryption keys
(2008) In Usenix Security Symposium.
Arash Karimi received the B.S. and
M.S. degrees from the Department of
Electrical Engineering of Amirkabir
University of Science and Technology
(Polytechnic of Tehran) and Iran
University of Science and Technology
(IUST), Tehran, Iran, in 2008 and 2011,
respectively. His research interests
include cryptography, unconventional
methods in computation with a focus
on cryptanalysis, biochemical
computing, and formal languages and
automata.


Hadi Shahriar Shahhoseini received
the B.S. degree in electrical engineering
from University of Tehran in 1990,
the M.S. degree in electrical engineering
from Azad University of Tehran in
1994, and the Ph.D. degree in electrical
engineering from Iran University of
Science and Technology in 1999. He
is an assistant professor in the
electrical engineering department of
Iran University of Science and
Technology. His areas of research include networking,
supercomputing and reconfigurable computing. More than 130
papers from his research have been published in scientific
journals and conference proceedings. He is an executive
committee member of IEEE TCSC and serves IEEE TCSC as
regional coordinator for Middle East countries.