
Abstract:

Nowadays almost all information is available on the Internet, but that information is not
secure from threats such as reuse and plagiarism. Our review found that a lot of work has been
done on watermarking audio, video and images, but much less on text watermarking. As a result,
text data is not secure and anyone can reuse it for their own purposes. To overcome this
problem, a watermark has to be embedded in the text. In this paper, we propose a plain text
watermarking algorithm that uses a combined image as the text watermark to fully protect the
text document. The watermark is logically embedded in the text and is extracted later to prove
ownership. Experimental results demonstrate the effectiveness of the proposed algorithm under
localized as well as dispersed tampering attacks on the text.
Project Goal/Objectives
The main objective of this research is to embed a watermark into text using a spatial domain
technique, namely the least significant bit (LSB) technique, then to perform document compression
and decompression, followed by removal of the watermark from the watermarked text.
The project is subdivided as follows:
Embedding the watermark text into the host text:
Compression of the watermarked text:
Decompression of the watermarked text as image:
Removing the watermark from the watermarked text:
Introduction:
In today's digital world, text is the most widely used medium of communication on the Internet
and on intranets. Both are full of information, and most of that information is text. Yet
nowadays text is not secure. One scheme used for securing text is known as watermarking.
Watermarking is a process that discourages text from being copied and also shows that the text is
authentic. A watermark is embedded in data, and data comes in different forms such as text, audio,
images and video. In recent years much work has been done on video, images and audio, but
insufficient work has been done on text.
So the problem is that text has not been secured properly. To protect text from different
threats we have to embed a watermark in it. The same general process is used for images, video,
audio and text.
Many techniques are used for securing information. In recent years, however, various
multimedia and text contents have been utilized as the host that carries the payload [7]; plain
text is a pressing choice for its practicality in the digital domain. Legally and illegally,
several data securing schemes [9] are used for data extraction or data securing. Many others are
used for authentication [10] and, more recently, for steganography [1].
Text-based data securing schemes in the digital domain can be divided into semagram algorithms
and open codes [4]. A semagram embeds the payload by changing the formatting and layout of the
data representation (e.g., a single space between words = '1' and more than one space
character = "0") [1]. White space schemes [8] are among the newest types of text semagram.
Other semagram approaches can be found in [1].
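The space-based example above (a single space for '1', more than one space for "0") can be sketched in Python as follows. This is only an illustrative sketch and not the scheme of [1]; the function names and the choice of exactly two spaces for "0" are assumptions made for the example.

```python
import re

def embed_bits_in_spaces(text, bits):
    """Encode a bit string in inter-word gaps: one space = '1', two spaces = '0'."""
    words = text.split()
    if len(bits) > len(words) - 1:
        raise ValueError("not enough word gaps to carry the payload")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        # Gaps beyond the payload keep a single space (assumed convention).
        gap = "  " if i < len(bits) and bits[i] == "0" else " "
        out.append(gap + word)
    return "".join(out)

def extract_bits_from_spaces(marked, n_bits):
    """Read back the first n_bits gaps: one space -> '1', more than one -> '0'."""
    gaps = re.findall(r" +", marked.strip())
    return "".join("1" if len(g) == 1 else "0" for g in gaps[:n_bits])

marked = embed_bits_in_spaces("text is the most widely used medium", "0110")
print(repr(marked))
print(extract_bits_from_spaces(marked, 4))
```

The payload length has to be known to the extractor, because unused gaps are indistinguishable from gaps that encode '1'. Any tool that normalizes white space destroys the payload.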
Change pictures with our new methods.
Fig (a)
On the other hand, in [5] a line is divided into two sets of words (say A and B) such that the
total space between words (i.e., the total number of pixels between consecutive words) [1] is
the same in both sets. The spaces between the words in each set are then shrunk or expanded so
that certain conditions are satisfied to encode the information [1]. In particular,
|A| > |B| + ε is enforced to embed "1", and the inverse otherwise, where |B| is the total
number of word spaces in set B and ε > 0. The work in [3] extends this by using a multistep
method to achieve a higher carrier capacity. These differences give rise to many scientific
challenges in plain text watermarking as well [1].
Text watermarking techniques should therefore insert private, invisible watermarks into plain
text documents that remain intact after varied tampering attacks such as insertion, removal and
changes to the formatting. The data can then be checked for text privacy and copyright protection.
Three different types of techniques are used for security purposes and provide solutions to the
threats that attackers can create: watermarking, cryptography and steganography. The data may be
in any form, such as audio, video, image or text.
Digital watermarking has developed into a very dynamic area of research and has addressed
challenging issues for digital content. Watermarks can be visible (logos or signatures) or
invisible (encoding and decoding).
Problem motivation and applications:
To analyze users' awareness of how important information security is, a survey was conducted;
the results, with a detailed analysis, are stated below.
A total of 136 users from various categories, fields, genders, age groups and countries
participated in this survey. Educationists (from all majors), students (from different majors),
researchers, users from the IT industry and users from the business industry, from Pakistan,
Saudi Arabia, USA, Canada, Egypt, Australia, Yemen, Algeria, New Zealand, Philippines, Palestine,
India, Lebanon and UK, provided feedback on their awareness of information security.
Analysis of the data shows significant results for the need for information security. The results
show that 100% of participants agreed (76% strongly agreed, 24% agreed) that an author should be
aware of online information threats. 32% strongly agreed and 58% agreed that they are aware of
threats, while 10% were not sure. 92% agreed (66% strongly agreed and 26% agreed). 14% strongly
agreed and 16% were not sure, while 70% disagreed (24% disagreed and 46% strongly disagreed),
that they do not care about their information being copied or misused. 88% agreed (50% strongly
agreed and 38% agreed) that information security is important to them, while 12% were not sure
about it. 78% agreed (54% strongly agreed and 24% agreed) to add their signatures to online data,
12% were not sure, and 10% disagreed (7% disagreed and 3% strongly disagreed) with adding any
extra information to the content. 20% agreed (8% strongly agreed and 12% agreed) that adding
signatures is an extra effort for the author, 22% were not sure, and 58% disagreed (41%
disagreed and 17% strongly disagreed) that it is an effort and encouraged securing content this
way. 30% agreed (22% strongly agreed and 8% agreed) that adding signatures will protect the
information and will not bother an author, 48% were not sure, and 22% disagreed (17% disagreed
and 5% strongly disagreed). 54% of participants agreed (4% strongly agreed and 50% agreed) that a
general author is not aware of online security threats, 12% were not sure, and 34% disagreed (30%
disagreed and 4% strongly disagreed). 100% of participants agreed (68% strongly agreed and 32%
agreed) to have an awareness campaign about information security and its threats. 27% agreed (18%
strongly agreed and 9% agreed) that it should be an individual's responsibility to be aware, 46%
were not sure who is responsible, and 27% disagreed (19% disagreed and 8% strongly disagreed)
that it should be an individual's responsibility to know the threats.
The second half of the survey concerns the awareness campaign: who should play a role in
communicating the threats to online content, and what constitutes responsible behavior towards
the copyright of an online author. Many participants chose more than one of the given options,
and some shared interesting ways of creating and enforcing awareness. 73% of participants think
that it should be the responsibility of the Internet community to educate people about the
subject; 88% think that the website where an author is writing content should educate people
about information security and tell them how to protect online data; 93% think that it should be
part of the learning and teaching curriculum. For this 3rd option it was noticed that, regardless
of the participants' field, the largest number of people voted for the learning and teaching
option. 15% of participants shared other ways of raising awareness. The suggested ways are:
workshops and seminars run by educational institutions for students and the public; seminars by
solution-providing companies and other stakeholders; yearly sessions/meetings run by employers
for their employees; a serious role for the Internet community; a curriculum designed not only
for specific majors but for all majors; online protection guides/manuals; dedicated web pages on
a company's or a university's website; learning videos; governments regulating laws;
international common laws for all; more focus in research; self-responsibility; sharing
knowledge; and using social networks to educate people.
The analysis shows that people are serious about online threats, which leaves a large space for
further research on information security. More algorithms/methods are required to make it a
subject of interest, practice and implementation. It was also noticed that people are concerned
about awareness and expect the different stakeholders to take responsibility for educating more
and implementing more.
Graph 1 and Graph 2 show the survey results.

Literature Review:
Information security has drawn a tremendous amount of attention from researchers and academia [1].
Information security is a way to protect the confidentiality, reliability and availability of
information. With the massive growth of the Internet and its easy, low-cost access for authors,
the Internet has attracted billions of writers. The growth of electronic publishing has had an
effect on print media, but in the meantime copyright protection of electronic text is becoming
more and more elusive. Other than preventing unauthorized access to copy the content, a
discouraging factor can also be added to web-based data.
Many text watermarking techniques have been proposed in the past. These techniques treat the
text as an image or rely on meaning changes, presuppositions, verbs, nouns and pronouns, sentence
and word abbreviations, typographical errors, etc. We can therefore broadly classify the previous
work on digital text watermarking into the following categories: an image-based approach, a
syntactic approach, and a semantic approach.
Bit Level Watermarking
In these techniques, some bits or attributes carry unique values. These bits and values are
determined with a secret key known only to the author of the information, together with the
data's primary key. These bits contain the watermark. The watermark can be detected only with
the help of the secret key. If an attacker finds the secret key of the watermark, the attacker
can easily add or remove data.
In [6] the solution takes as user input the location in the text that is to be copyright
protected. A watermark is applied as copyright protection, with a secret key protecting the
embedding, and part of the text content is inserted in the result. Watermark embedding consists
of two steps: first the input data is divided into two parts, then an encoding method is applied
to each bit of the watermark. The algorithms proposed there prove to be resilient to significant
types of attacks, including subset selection, linear data changes, and random additions,
deletions or alterations. Performing the necessary changes while preserving the given 'usage'
metrics is one of the enduring challenges. To this end, the algorithm applies the initial
watermarking step and then makes small adjustments to the data with respect to the usage metrics.
A. Spread Spectrum
Here the watermark bits (a) are combined with pseudo-random noise (PRN). The resulting signal is
then added to the host signal (s), so that the PRN acts like a secret key. The host amplitude is
greater than the watermarked signal amplitude by about 1%. The PRN signal is detected with a
matched filter or correlation receiver.
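A minimal sketch of the spread-spectrum idea described above, assuming a numeric host signal held in a Python list. The function names, the chip length of 64 samples per bit and the strength `alpha` are illustrative assumptions and are not taken from the cited work.

```python
import random

def prn_sequence(key, length):
    """Pseudo-random +1/-1 sequence; the seed plays the role of the secret key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, bits, key, chip_len=64, alpha=0.5):
    """Add +alpha*PRN (bit 1) or -alpha*PRN (bit 0) to each block of the host."""
    if len(bits) * chip_len > len(host):
        raise ValueError("host signal too short for the payload")
    prn = prn_sequence(key, chip_len)
    marked = list(host)
    for b, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for i in range(chip_len):
            marked[b * chip_len + i] += alpha * sign * prn[i]
    return marked

def detect(marked, n_bits, key, chip_len=64):
    """Correlation receiver: the sign of the correlation with the PRN gives the bit."""
    prn = prn_sequence(key, chip_len)
    bits = []
    for b in range(n_bits):
        block = marked[b * chip_len:(b + 1) * chip_len]
        corr = sum(x * p for x, p in zip(block, prn))
        bits.append(1 if corr > 0 else 0)
    return bits
```

Without the key the detector cannot regenerate the PRN, so the correlation carries no usable information. With a strong host signal the decision is statistical rather than guaranteed, and longer chip lengths make it more reliable.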
B. Line Shift Coding
In this coding scheme each even line is shifted up or down according to the bit value in the
payload [7]. There are two possible bit values: if the bit is one the line is shifted up, and a
0 bit shifts the line down. The remaining odd lines are used as control lines for decoding.
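The line-shift rule above can be illustrated on a list of line baseline positions. This is a sketch under stated assumptions: the lines are evenly spaced before marking, positions grow downward (so "up" means a smaller value), and the function names and the one-unit shift are hypothetical.

```python
def embed_line_shift(baselines, bits, shift=1):
    """Shift every second line up (bit 1) or down (bit 0); the others are controls."""
    # Carrier lines are those with an unshifted control line on both sides.
    carriers = list(range(1, len(baselines) - 1, 2))
    if len(bits) > len(carriers):
        raise ValueError("not enough lines to carry the payload")
    marked = list(baselines)
    for idx, bit in zip(carriers, bits):
        marked[idx] += -shift if bit == 1 else shift
    return marked

def decode_line_shift(marked, n_bits):
    """Compare each carrier line's distance to the control lines above and below."""
    bits = []
    for idx in list(range(1, len(marked) - 1, 2))[:n_bits]:
        above = marked[idx] - marked[idx - 1]
        below = marked[idx + 1] - marked[idx]
        bits.append(1 if above < below else 0)
    return bits

page = [0, 20, 40, 60, 80, 100, 120]   # evenly spaced baselines
marked = embed_line_shift(page, [1, 0, 1])
print(marked)
print(decode_line_shift(marked, 3))
```

Because decoding only compares a carrier line with its two neighbouring control lines, the original document is not needed. The same pattern applies to word shifting, with word groups and horizontal positions in place of lines and baselines.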
C. Word Shifting
In these schemes each line is divided into groups of words. Every group has a sufficient number
of words and characters. Each even group is shifted right or left according to the bit value in
the payload, and the remaining odd groups are used for measuring and comparing the distances
between the groups of words.
D. Feature Coding
Feature coding changes text features in a precise way, chosen according to the bit value in the
payload. After embedding, the watermark can be recovered by comparing the original document with
the watermarked document [8]. Feature coding is applied to real-life text in three steps: (a)
start from the original data; (b) select the words to which feature coding will be applied; and
in the last step, apply feature coding to the selected words [8].
E. Character spacing
In this scheme the author purposely varies the spaces between the words of text lines to
watermark the text data. In this encoding scheme, the spaces between words in different lines
follow the shape of a sine wave, and all the information is stored in the sine wave.
F. Synonym
In this watermarking process, words are changed at the positions where the information is hidden
in the text document. The advantage of this method is that the information survives retyping and
copy-paste. However, this method can fully change the meaning of the words.
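A small sketch of synonym substitution, assuming a shared table of synonym pairs in which the first word encodes bit 0 and the second encodes bit 1. The table `SYNONYM_PAIRS` and both function names are hypothetical, and punctuation and capitalization are not handled.

```python
# Hypothetical synonym table: index 0 encodes bit 0, index 1 encodes bit 1.
SYNONYM_PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]

def embed_synonyms(text, bits):
    """Replace each word found in the table by the synonym that encodes the next bit."""
    lookup = {word: pair for pair in SYNONYM_PAIRS for word in pair}
    remaining = list(bits)
    out = []
    for word in text.split():
        pair = lookup.get(word.lower())
        if pair is not None and remaining:
            word = pair[remaining.pop(0)]
        out.append(word)
    if remaining:
        raise ValueError("text has too few replaceable words for the payload")
    return " ".join(out)

def extract_synonyms(text, n_bits):
    """Read the bits back from which member of each pair appears in the text."""
    index = {word: i for pair in SYNONYM_PAIRS for i, word in enumerate(pair)}
    bits = [index[w.lower()] for w in text.split() if w.lower() in index]
    return bits[:n_bits]

marked = embed_synonyms("a big dog made a quick start", [1, 0, 1])
print(marked)
print(extract_synonyms(marked, 3))
```

The payload lives in the word choice itself, which is why it survives retyping and copy-paste. The quality of the result depends entirely on how close in meaning the paired words really are.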
Text as Image based Approach:
In these digital text watermarking techniques, the text document is treated as an image in which
to embed the watermark [8]. Plain text is very difficult to watermark because of its simplicity,
sensitivity, and low capacity for watermark embedding. Early plain text watermarking therefore
operated on the text as a text image. The watermark was embedded in the layout and appearance of
the text image.
In [9-11] researchers proposed techniques to watermark plain text by treating the text as an
image. Historically, Brassil was the first to propose a line-shift coding algorithm, which
changes the document image by shifting lines of text upward or downward, depending on the bit
value.
The second algorithm was inter-word shifting, which shifts the text horizontally to embed the
watermark. This algorithm can operate in both blind and non-blind modes. The third approach is
the feature coding algorithm [3], which modifies features such as the pixels of words and the
length of lines to embed the watermark bits in the plain text.
All three techniques resist attacks by embedding the watermark with a secret key [3]. The most
widely used, the inter-word algorithm, holds up under many attacks, but it can also be broken.

Maxemchuk et al. [13] [14] [15] analysed the performance of the above methods. A related,
optional technique is centroid based [16]: it treats the text file as a digital signal over
time, considers the length of the word shift, and uses the differences between the centroids of
adjacent plain text blocks to determine the watermark. Low et al. [16][17] further compared the
efficiency of the different approaches. In [17] another algorithm was proposed that adjusts the
inter-word spacing in every sentence of a paragraph. The distances between the words follow a
sine wave of a specific period and frequency. Feature- and pixel-level algorithms were also
developed, which mark plain text documents by changing stroke features such as width [18]. An
algorithm that works on the gray-scale image of the text was also developed [19]. One more
approach, which watermarks the plain text document as an image using the edge direction
histogram, was also proposed [20].
In [23] a novel idea was proposed that depends on an intelligent programming scheme in the field
of text watermarking and has no large effect on the syntax or appearance of the document. It
thus provides a format-independent technique in which information within the text is used to
hide certain information.

In [22] a text watermarking scheme was proposed from an object-based perspective. The solution
also defines the new concept of watermarking an object-based text document, where each text
string is treated as a separate object with its own set of properties.
Statistical Watermarking
In [13] the author proposed an algorithm that hides the watermark information in the mostly
ignored 'time' attribute of the database. 'Time' attributes exist by default, but in many
applications they are not used. Specifically, 'Date' attributes in databases are made of two
fields: 'Time' and 'Date'. The researcher works on the 'Time' field, which is defined by three
values: seconds, hours and minutes. Hiding the watermark information in the seconds value should
have little effect on the usability of the database. The main advantage of using the time
attribute is the large bit capacity available for hiding the watermark, so that large watermarks
can easily be hidden if required. This contrasts with bit-level techniques, where the watermark
bits have a limited number of candidate locations that can be used to hide the information
without being subjected to deletion or destruction.
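As an illustration of hiding watermark bits in the seconds value, the sketch below overwrites the seconds of each timestamp with five payload bits. Storing five bits per timestamp, the zero padding of the last chunk and the function names are assumptions made for this example, not details of the cited algorithm.

```python
from datetime import datetime

BITS_PER_VALUE = 5  # 5 bits give 0..31, which is always a valid seconds value

def embed_in_seconds(timestamps, bits):
    """Overwrite the seconds field of successive timestamps with payload bits."""
    needed = -(-len(bits) // BITS_PER_VALUE)  # ceiling division
    if needed > len(timestamps):
        raise ValueError("not enough timestamps for the payload")
    marked = list(timestamps)
    for i in range(needed):
        chunk = bits[i * BITS_PER_VALUE:(i + 1) * BITS_PER_VALUE]
        chunk = chunk.ljust(BITS_PER_VALUE, "0")  # pad the last chunk with zeros
        marked[i] = timestamps[i].replace(second=int(chunk, 2))
    return marked

def extract_from_seconds(timestamps, n_bits):
    """Concatenate the seconds values as 5-bit strings and keep the first n_bits."""
    chunks = [format(ts.second, "05b") for ts in timestamps]
    return "".join(chunks)[:n_bits]

rows = [datetime(2012, 3, 1, 10, 15, 42), datetime(2012, 3, 2, 11, 0, 7)]
marked = embed_in_seconds(rows, "1011000011")
print([ts.isoformat() for ts in marked])
print(extract_from_seconds(marked, 10))
```

Only the seconds change, so dates, hours and minutes stay usable, which matches the low-impact argument made above.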
Content Watermarking
In [9] the authors proposed new techniques that introduce certain errors into some digital or
numeric attributes of databases, and that also secure text documents. If a tuple in the suspect
data is the same as a tuple in the marked data, the tuple counts as a match. The overall ratio of
matched tuples at detection, with text documents treated as relational databases, can then be
used to confirm the watermark for copyright protection. The watermarking approach is not
associated with the tuples and the secret key. It has sufficient correctness, feasibility and
robustness for copyright protection.
In [10] the researcher proposed a novel watermarking algorithm for scholar assets rights (SAR)
protection for relational data and text documents. The mechanism chooses the best possible
embedding scheme for each document file according to its data description, based on several
predefined or general fundamental schemes, with the main aim of keeping the distortion of the
watermarked data as small as possible. Error correction and many other techniques are applied to
improve the robustness of the watermarking.
Cluster Watermarking
In [8] the researcher proposed cluster watermarking. In this process a clustering code is used
for embedding the watermark, and the watermark is detected with the help of the clustering
result. In this way the watermark information can be made more secure. The paper introduces a
new odd/even methodology. With this technique the original data is modified and made more secure.
Text Steganography
Many approaches have been defined for text steganography. Popular approaches include generating
random words, changing the appearance of the original words (reversed words), tags, and
modifying data attributes.
Other Approaches
Many other approaches also introduce techniques for copyright protection, including:
In [6] the author addresses the main problem of ownership and unauthorized copying of data for
text documents and relational databases. Three main approaches are used to prevent, detect, and
trace data leaking from the documents. Business scenarios are used as the illustrative use case,
but the main point is to secure the information with the same techniques when the information is
shared with other parties.
In [14] the author proposed an algorithm for joint-ownership copyright protection. This
technique divides the original data into two parts and secures each with its own key. Two main
parameters decide the number of parts needed to recover the original data. In [16] a theoretical
framework for relational databases based on secret sharing technology is proposed. The main
secret is broken into several parts, each hidden on its own in a relational database. The method
uses Lagrange interpolating polynomial techniques to recover the original data with the secret
key. In [17] a fragile watermarking scheme is introduced to verify the integrity of plain text
documents and recover the correct database. All tuples in a plain text document are first
divided into groups using a secret key. The watermark is then embedded in each group and later
verified.
In [16] the author's main focus is to improve the localization accuracy of fragile watermarks. A
novel fragile watermarking scheme is proposed for relational databases by summarizing the order
and group information between plain text documents. By combining the row signature with the
attribute signature in tamper localization, the tampered part can be located at the element
level. This scheme requires no extra space and adds nothing to the original plain text data.
Methodology/Algorithm
Embedding the Watermark Text into the Host text:
A host text, A, is selected.
A watermark text, B, is selected.
The least significant bits (LSBs) of the host text A are replaced by the most significant bits
(MSBs) of the watermark text B.
A watermarked text, C, is obtained, which contains the text A with its LSBs replaced by the MSBs
of B.
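The embedding steps above can be sketched as follows, treating both texts as byte strings. The byte-by-byte alignment of A and B, the default of k = 4 replaced bits per byte and the function name `embed_lsb` are assumptions for illustration; the proposal does not fix these details.

```python
def embed_lsb(host: bytes, watermark: bytes, k: int = 4) -> bytes:
    """Replace the k LSBs of each host byte with the k MSBs of a watermark byte."""
    if not 1 <= k <= 7:
        raise ValueError("k must be between 1 and 7")
    if len(watermark) > len(host):
        raise ValueError("watermark is longer than the host")
    keep_mask = (0xFF << k) & 0xFF          # keeps the upper 8-k bits of the host
    out = bytearray(host)
    for i, w in enumerate(watermark):
        out[i] = (host[i] & keep_mask) | (w >> (8 - k))
    return bytes(out)

A = "Host text to be protected".encode("ascii")
B = "owner".encode("ascii")
C = embed_lsb(A, B)
print(C)
```

Applied directly to character codes, changing the low bits changes the characters themselves, so the watermarked bytes are best thought of as pixel-like data. This is consistent with the later steps, which handle the watermarked text as an image.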
Compression of the watermarked text:
The watermarked text C is read.
The Discrete Cosine Transform is applied [21] [22].
Each block is compressed through quantization or Huffman coding.
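A minimal sketch of the transform-and-quantize step on a single block, using a one-dimensional orthonormal DCT-II written in plain Python. The block length of 8, the uniform quantization step and all function names are assumptions; the proposal does not specify the block size or the quantization table, and Huffman coding is not shown.

```python
import math

def dct(block):
    """Orthonormal 1-D DCT-II of a list of numbers."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1 / n)
        s += sum(math.sqrt(2 / n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                 for k, c in enumerate(coeffs) if k > 0)
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform quantization: small coefficients collapse to zero."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [v * step for v in levels]

block = [72, 111, 115, 116, 32, 116, 101, 120]   # eight sample byte values
levels = quantize(dct(block), step=10)
restored = idct(dequantize(levels, step=10))
print(levels)
print([round(v) for v in restored])
```

Quantization is the lossy part: the restored block only approximates the input, and the error grows with the step size.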

Decompression of the watermarked Text as image:
The compressed image is now decompressed.
Removing the watermark from the watermarked Text:
The watermark is removed from the watermarked text C.
This gives the host text A and the watermark text B.
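The removal step can be sketched as the inverse of an LSB embedding, again assuming k = 4 bits per byte and byte-by-byte alignment. The function name `extract_lsb` is hypothetical. Under these assumptions only the upper k bits of each byte of A and of B are recovered, because the remaining bits were overwritten or never stored.

```python
def extract_lsb(marked: bytes, n: int, k: int = 4):
    """Split watermarked bytes into the host's upper bits and the watermark's MSBs.

    n is the number of leading bytes that carry watermark bits.
    """
    low_mask = (1 << k) - 1                 # selects the k LSBs
    keep_mask = 0xFF ^ low_mask             # selects the upper 8-k bits
    watermark_msbs = bytes((c & low_mask) << (8 - k) for c in marked[:n])
    host_upper = bytes(c & keep_mask for c in marked[:n]) + marked[n:]
    return host_upper, watermark_msbs

host_part, watermark_part = extract_lsb(b"\xa1\xcd", n=1)
print(host_part, watermark_part)
```

If an exact copy of A or B is required, the missing low-order bits have to come from somewhere else, for example from the owner's original files.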
The figure shows our algorithm model.
The tools used in this project's implementation are PHP and HTML.
References:
[1] KokSheik Wong, Kok Onn Chee, Lip Yee Por, "UniSpaCh: A text-based data hiding method using Unicode space characters," The Journal of Systems and Software, pp. 1-8, December 2011.
[2] Zunera Jalil, "Copyright Protection of Plain Text," doctoral thesis, FAST National University of Computer and Emerging Sciences, Islamabad, 2010.
[3] X. Zhou, W. Zhao, Z. Wang, and L. Pan, "Security theory and attack analysis for text watermarking", in Proceedings of International Conference on E-Business and Information System Security (EBISS 2009), May 2009, pp. 1-6.
[4] M. Topkara, G. Riccardi, D. Hakkani-Tur, and M. J. Atallah, "Natural language watermarking: Challenges in building a practical system", Proceedings of the SPIE International Conference on Security, Steganography, and Watermarking of Multimedia Contents, San Jose, CA, Jan. 2006.
[5] J. T. Brassil, S. Low, N. F. Maxemchuk, and L. O'Gorman, Electronic Marking and Identification Techniques to Discourage Document Copying, IEEE Journal on Selected Areas in Communications, vol. 13, no. 8, pp. 1495-1504, 1995.
[6] J. T. Brassil, S. Low, N. F. Maxemchuk, L. O'Gorman, "Hiding information in document images", in Proceedings of the 29th Annual Conference on Information Sciences and Systems, Johns Hopkins University, 1995, pp. 482-489.
[7] N. F. Maxemchuk and S. Low, "Marking text documents," in Proceedings of the IEEE International Conference on Image Processing, Washington, DC, Oct. 1997, pp. 13-16.
[8] M. Anwar Mirza, Zunera Jalil, "A Review of Digital Watermarking Techniques for Text Documents," in International Conference on Information and Multimedia Technology, Islamabad, 2009, pp. 1-5.
[9] A. Khan and Anwar M. Mirza, Genetic Perceptual Shaping: Utilizing Cover Image and Conceivable Attack Information Using Genetic Programming, Information Fusion, vol. 8, no. 4, pp. 354-365, 2007.
[10] A. Khan, Intelligent Perceptual Shaping of a digital Watermark, PhD Thesis, Faculty of Computer Science and Engineering, GIK Institute, Pakistan, 2006.
[11] J. T. Brassil, S. Low, N. F. Maxemchuk, and L. O'Gorman, Electronic Marking and Identification Techniques to Discourage Document Copying, IEEE Journal on Selected Areas in Communications, vol. 13, no. 8, pp. 1495-1504, 1995.
[12] J. T. Brassil, S. Low, and N. F. Maxemchuk, "Copyright protection for the electronic distribution of text documents", in Proceedings of the IEEE, vol. 87, no. 7, 1999, pp. 1181-1196.
[13] N. F. Maxemchuk and S. Low, "Marking text documents," in Proceedings of the IEEE International Conference on Image Processing, Washington, DC, Oct. 1997, pp. 13-16.
[14] N. F. Maxemchuk, S. H. Low, Performance Comparison of Two Text Marking Methods, IEEE Journal of Selected Areas in Communications (JSAC), vol. 16, no. 4, pp. 561-572, 1998.
[15] N. F. Maxemchuk, Electronic Document Distribution, AT&T Technical Journal, vol. 6, Sept. 1994, pp. 73-80.
[16] S. H. Low, N. F. Maxemchuk, and A. M. Lapone, "Document identification for copyright protection using centroid detection," IEEE Transactions on Communications, vol. 46, no. 3, March 1998, pp. 372-381.
[17] S. H. Low and N. F. Maxemchuk, "Capacity of text marking channel," IEEE Signal Processing Letters, vol. 7, no. 12, pp. 345-347, Dec. 2000.
WEB BASED TEXT
WATERMARKING
Project Proposal (Phase-1)
Under the Supervision of,
Sir Aihab Khan
Submitted By,
Mehboob Nazim Shehzad (01085)
Najam-Ul-Sahar (01253)
mehboobfraz85@gmail.com
IQRA University Islamabad Campus, Spring 2012
