
CHAPTER 1: INTRODUCTION
1.1 Background
Over the last decades, the amount of web-based information available has increased
dramatically. How to gather useful information from the web has become a
challenging issue for users. Current web information gathering systems attempt to
satisfy user requirements by capturing their information needs. For this purpose, user
profiles are created for user background knowledge description.
User profiles represent the concept models possessed by users when gathering
web information. A concept model is implicitly possessed by users and is generated
from their background knowledge. While this concept model cannot be proven in
laboratories, many web ontologists have observed it in user behaviour. When users
read through a document, they can easily determine whether or not it is of
interest or relevance to them, a judgment that arises from their implicit concept
models. If a user's concept model can be simulated, then a superior representation of
user profiles can be built.
To simulate user concept models, ontologies (a knowledge description and
formalization model) are utilized in personalized web information gathering. Such
ontologies are called ontological user profiles, or personalized ontologies. To
represent user profiles, many researchers have attempted to discover user background
knowledge through global or local analysis.
Global analysis uses existing global knowledge bases for user background
knowledge representation. Commonly used knowledge bases include generic
ontologies (e.g., WordNet), thesauruses (e.g., digital libraries), and online knowledge
bases (e.g., online categorizations and Wikipedia). The global analysis techniques
produce effective performance for user background knowledge extraction. However,
global analysis is limited by the quality of the used knowledge base. For example,
WordNet was reported as helpful in capturing user interest in some areas but useless
for others.
Local analysis investigates user local information or observes user behaviour
in user profiles. For example, Li and Zhong discovered taxonomical patterns from the
users' local text documents to learn ontologies for user profiles. Some groups learned
personalized ontologies adaptively from users' browsing history. Alternatively, Sekine
and Suzuki analysed query logs to discover user background knowledge. In some
works, users were provided with a set of documents and asked for relevance feedback.
User background knowledge was then discovered from this feedback for user profiles.
However, because local analysis techniques rely on data mining or classification
techniques for knowledge discovery, occasionally the discovered results contain noisy
and uncertain information. As a result, local analysis suffers from ineffectiveness at
capturing formal user knowledge.
From this, we can hypothesize that user background knowledge can be better
discovered and represented if we can integrate global and local analysis within a
hybrid model. The knowledge formalized in a global knowledge base will constrain
the background knowledge discovery from the user local information. Such a
personalized ontology model should produce a superior representation of user profiles
for web information gathering.
In this project, an ontology model to evaluate this hypothesis is proposed. This
model simulates users' concept models by using personalized ontologies, and attempts
to improve web information gathering performance by using ontological user profiles.
The world knowledge and a user's local instance repository (LIR) are used in the
proposed model. World knowledge is common sense knowledge acquired by people
from experience and education; an LIR is a user's personal collection of information
items. From a world knowledge base, we construct personalized ontologies by
adopting user feedback on interesting knowledge. A multidimensional ontology
mining method, Specificity and Exhaustivity, is also introduced in the proposed model
for analyzing concepts specified in ontologies. The users' LIRs are then used to
discover background knowledge and to populate the personalized ontologies. The
proposed ontology model is evaluated by comparison against some benchmark
models through experiments using a large standard data set. The evaluation results
show that the proposed ontology model is successful.
1.2 Report Organization
Chapter 1 discusses the problem definition and provides the background knowledge.
Chapter 2 contains the literature survey.
Chapter 3 discusses the design of the system.
Chapter 4 gives the details of the technologies used.
Chapter 5 provides the implementation details.
Chapter 6 contains experimental results.
Chapter 7 gives the conclusion on the work carried out and future scope.
CHAPTER 2: LITERATURE SURVEY
2.1 Ontology Learning
Global knowledge bases were used by many existing models to learn ontologies for
web information gathering. For example, Gauch et al. and Sieg et al. learned
personalized ontologies from the Open Directory Project to specify users' preferences
and interests in web search. On the basis of the Dewey Decimal Classification, King et
al. developed IntelliOnto to improve performance in distributed web information
retrieval. Wikipedia was used by Downey et al. to help understand underlying user
interests in queries. These works effectively discovered user background knowledge;
however, their performance was limited by the quality of the global knowledge bases.
Aiming at learning personalized ontologies, many works mined user
background knowledge from user local information. Li and Zhong used pattern
recognition and association rule mining techniques to discover knowledge from user
local documents for ontology construction. Tran et al. translated keyword queries to
Description Logics' conjunctive queries and used ontologies to represent user
background knowledge. Zhong proposed a domain ontology learning approach that
employed various data mining and natural-language understanding techniques.
Navigli et al. developed OntoLearn to discover semantic concepts and relations from
web documents. Web content mining techniques were used by Jiang and Tan to
discover semantic knowledge from domain-specific text documents for ontology
learning. Finally, Shehata et al. captured user information needs at the sentence level
rather than the document level, and represented user profiles by the Conceptual
Ontological Graph. The use of data mining techniques in these models leads to more
user background knowledge being discovered. However, the knowledge discovered in
these works contained noise and uncertainties.
Additionally, ontologies were used in many works to improve the performance
of knowledge discovery. Using a fuzzy domain ontology extraction algorithm, a
mechanism was developed by Lau et al. in 2009 to construct concept maps based on
the posts on online discussion forums. Quest and Ali used ontologies to help data
mining in biological databases. Jin et al. integrated data mining and information
retrieval techniques to further enhance knowledge discovery. Doan et al. proposed a
model called GLUE and used machine learning techniques to find similar concepts in
different ontologies. Dou et al. proposed a framework for learning domain ontologies
using pattern decomposition, clustering/classification, and association rules mining
techniques. These works attempted to explore a route to model world knowledge
more efficiently.
2.2 User Profiles
User profiles were used in web information gathering to interpret the semantic
meanings of queries and capture user information needs. User profiles were defined
by Li and Zhong as the interesting topics of a user's information need. They also
categorized user profiles into two diagrams: the data diagram user profiles acquired
by analyzing a database or a set of transactions; the information diagram user profiles
acquired by using manual techniques, such as questionnaires and interviews, or
automatic techniques, such as information retrieval and machine learning. Van der
Sluijs and Houben proposed a method called the Generic User Model Component to
improve the quality and utilization of user modelling. Wikipedia was also used to help
discover user interests. In order to acquire a user profile, Chirita et al. and Teevan et
al. used a collection of user desktop text documents and emails, and cached web pages
to explore user interests. Makris et al. acquired user profiles by a ranked local set of
categories, and then utilized web pages to personalize search results for a user. These
works attempted to acquire user profiles in order to discover user background
knowledge.
User profiles can be categorized into three groups: interviewing, semi-interviewing,
and non-interviewing. Interviewing user profiles can be deemed perfect
user profiles. They are acquired by using manual techniques, such as questionnaires,
interviewing users, and analyzing user classified training sets. One typical example is
the TREC Filtering Track training sets, which were generated manually. The users
read each document and gave a positive or negative judgment to the document against
a given topic. Because only users perfectly know their interests and preferences,
these training documents accurately reflect user background knowledge.
Semi-interviewing user profiles are acquired by semi-automated techniques with limited
user involvement. These techniques usually provide users with a list of categories and
ask users for interesting or non-interesting categories. One typical example is the web
training set acquisition model introduced by Tao et al., which extracts training sets
from the web based on user fed back categories. Non-interviewing techniques do not
involve users at all, but ascertain user interests instead. They acquire user profiles by
observing user activity and behaviour and discovering user background knowledge. A
typical model is OBIWAN, proposed by Gauch et al., which acquires user profiles
based on users' online browsing history. The interviewing, semi-interviewing, and
non-interviewing user profiles can also be viewed as manual, semiautomatic, and
automatic profiles, respectively.
2.3 Global Knowledge Bases
World knowledge is important for information gathering. By a common definition,
world knowledge is common sense knowledge possessed by people and
acquired through their experience and education. Also, as pointed out by Nirenburg
and Raskin, "world knowledge is necessary for lexical and referential disambiguation,
including establishing reference relations and resolving ellipsis as well as for
establishing and maintaining connectivity of the discourse and adherence of the text to
the text producer's goal and plans." In this proposed model, user background
knowledge is extracted from a world knowledge base encoded from the Library of
Congress Subject Headings (LCSH).
We first need to construct the world knowledge base. The world knowledge
base must cover an exhaustive range of topics, since users may come from different
backgrounds. For this reason, the LCSH system is an ideal world knowledge base.
The LCSH was developed for organizing and retrieving information from a large
volume of library collections. For over a hundred years, the knowledge contained in
the LCSH has undergone continuous revision and enrichment. The LCSH represents
the natural growth and distribution of human intellectual work, and covers
comprehensive and exhaustive topics of world knowledge. In addition, the LCSH is
the most comprehensive non-specialized controlled vocabulary in English. In many
respects, the system has become a de facto standard for subject cataloging and
indexing, and is used as a means for enhancing subject access to knowledge
management systems.
Table 1
Comparison of Different World Taxonomies
Taxonomy | LCSH | LCC | DDC | RC
# of Topics | 394,070 | 4,214 | 18,462 | 100,000
Structure | Directed Acyclic Graph | Tree | Tree | Directed Acyclic Graph
Depth | 37 | 7 | 23 | 10
Semantic Relations | Broader, Used-for, Related-to | Super- and Sub-class | Super- and Sub-class | Super- and Sub-class
The LCSH system is superior compared with other world knowledge taxonomies used
in previous works. Table 1 presents a comparison of the LCSH with the Library of
Congress Classification (LCC) used by Frank and Paynter, the Dewey Decimal
Classification (DDC) used by Wang and Lee and King et al., and the reference
categorization (RC) developed by Gauch et al. using online categorizations. As shown
in Table 1, the LCSH covers more topics, has a more specific structure, and specifies
more semantic relations. The LCSH descriptors are classified by professionals, and
the classification quality is guaranteed by well-defined and continuously refined
cataloging rules. These features make the LCSH an ideal world knowledge base for
knowledge engineering and management.
CHAPTER 3: DESIGN
3.1 World Knowledge Representation
The structure of the world knowledge base used in this project is encoded from the
LCSH references. The LCSH system contains three types of references: Broader term
(BT), Used-for (UF), and Related term (RT). The BT references are for two subjects
describing the same topic, but at different levels of abstraction (or specificity). In our
model, they are encoded as the is-a relations in the world knowledge base. The UF
references in the LCSH are used for many semantic situations, including broadening
the semantic extent of a subject and describing compound subjects and subjects
subdivided by other topics. The complex usage of UF references makes them difficult
to encode. During the investigation, we found that these references are often used to
describe an action or an object. When object A is used for an action, A becomes a part
of that action (e.g., "a fork is used for dining"); when A is used for another object, B,
A becomes a part of B (e.g., "a wheel is used for a car"). These cases can be encoded
as the part-of relations. Thus, we simplify the complex usage of UF references in the
LCSH and encode them only as the part-of relations in the world knowledge base.
The RT references are for two subjects related in some manner other than by
hierarchy. They are encoded as the related-to relations in our world knowledge base.
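The encoding rule described above can be written as a small lookup. The following Java fragment is only a sketch of that rule, not the project's actual code; the names LcshReference, SemanticRelation, and ReferenceEncoder are hypothetical and chosen for illustration.

```java
/** Reference types found in the LCSH, as listed in the text. */
enum LcshReference { BT, UF, RT }

/** Relation types used in the world knowledge base. */
enum SemanticRelation { IS_A, PART_OF, RELATED_TO }

/** Hypothetical helper: maps an LCSH reference type to its encoded relation. */
final class ReferenceEncoder {
    private ReferenceEncoder() { }

    static SemanticRelation encode(LcshReference ref) {
        switch (ref) {
            case BT: return SemanticRelation.IS_A;       // broader term: same topic, different abstraction level
            case UF: return SemanticRelation.PART_OF;    // used-for: simplified to part-of in this model
            case RT: return SemanticRelation.RELATED_TO; // related term: non-hierarchical link
            default: throw new IllegalArgumentException("Unknown reference: " + ref);
        }
    }
}
```

Keeping the mapping in one place makes the simplification of UF references explicit and easy to revisit.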
The primitive knowledge unit in our world knowledge base is subjects. They
are encoded from the subject headings in the LCSH. These subjects are formalized as
follows:
Definition 1. Let $\mathbb{S}$ be a set of subjects, an element $s \in \mathbb{S}$ is formalized as a 4-tuple
$s := \langle label, neighbour, ancestor, descendant \rangle$, where
Label is the heading of s in the LCSH thesaurus;
Neighbour is a function returning the subjects that have direct links to s in the
world knowledge base;
Ancestor is a function returning the subjects that have a higher level of
abstraction than s and link to s directly or indirectly in the world knowledge
base;
Descendant is a function returning the subjects that are more specific than s
and link to s directly or indirectly in the world knowledge base.
The subjects in the world knowledge base are linked to each other by the
semantic relations of is-a, part-of, and related-to. The relations are formalized as
follows:
Definition 2. Let $\mathbb{R}$ be a set of relations, an element $r \in \mathbb{R}$ is a 2-tuple
$r := \langle edge, type \rangle$, where
An edge connects two subjects that hold a type of relation;
A type of relations is an element of {is-a, part-of, related-to}.
With Definitions 1 and 2, the world knowledge base can then be formalized as
follows:
Definition 3. Let WKB be a world knowledge base, which is a taxonomy constructed as
a directed acyclic graph. The WKB consists of a set of subjects linked by their
semantic relations and can be formally defined as a 2-tuple $WKB := \langle \mathbb{S}, \mathbb{R} \rangle$, where
$\mathbb{S}$ is a set of subjects $\mathbb{S} := \{s_1, s_2, \ldots, s_m\}$;
$\mathbb{R}$ is a set of semantic relations $\mathbb{R} := \{r_1, r_2, \ldots, r_n\}$ linking the subjects in $\mathbb{S}$.
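To make Definitions 1 to 3 concrete, the sketch below stores the world knowledge base as a set of subject labels plus a list of typed edges, and derives the neighbour, ancestor, and descendant functions from them. It is a minimal illustration under stated assumptions: the class and method names are hypothetical, subjects are identified by their labels, and an edge is assumed to point from the more specific subject to the more abstract one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory world knowledge base following Definitions 1-3. */
final class WorldKnowledgeBase {
    /** A typed edge; for is-a and part-of, "from" is the more specific subject (assumption). */
    static final class Relation {
        final String from;
        final String to;
        final String type; // "is-a", "part-of" or "related-to"
        Relation(String from, String to, String type) {
            this.from = from;
            this.to = to;
            this.type = type;
        }
    }

    private final Set<String> subjects = new LinkedHashSet<>();
    private final List<Relation> relations = new ArrayList<>();

    void addRelation(String from, String to, String type) {
        subjects.add(from);
        subjects.add(to);
        relations.add(new Relation(from, to, type));
    }

    /** Subjects that have a direct link to s, in either direction. */
    Set<String> neighbour(String s) {
        Set<String> result = new LinkedHashSet<>();
        for (Relation r : relations) {
            if (r.from.equals(s)) result.add(r.to);
            if (r.to.equals(s)) result.add(r.from);
        }
        return result;
    }

    /** More abstract subjects linked to s directly or indirectly. */
    Set<String> ancestor(String s) { return closure(s, true); }

    /** More specific subjects linked to s directly or indirectly. */
    Set<String> descendant(String s) { return closure(s, false); }

    /** Breadth-first walk over the hierarchical (is-a, part-of) edges only. */
    private Set<String> closure(String start, boolean upwards) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (Relation r : relations) {
                if (r.type.equals("related-to")) continue; // not part of the hierarchy
                String source = upwards ? r.from : r.to;
                String target = upwards ? r.to : r.from;
                if (source.equals(current) && seen.add(target)) queue.add(target);
            }
        }
        return seen;
    }
}
```

A usage example with made-up links: after `addRelation("Economic espionage", "Commercial crimes", "is-a")` and `addRelation("Commercial crimes", "Crime", "is-a")`, calling `ancestor("Economic espionage")` returns both "Commercial crimes" and "Crime".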
Fig. 2. A sample part of the world knowledge base
3.2 Ontology Construction
Definition 4. The structure of an ontology that describes and specifies topic $T$ is a
graph consisting of a set of subject nodes. The structure can be formalized as a
3-tuple $O(T) := \langle S, tax^S, rel \rangle$, where
$S$ is a set of subjects consisting of three subsets $S^+$, $S^-$ and $S^\circ$, where $S^+$ is a set of
positive subjects regarding $T$, $S^- \subseteq S$ is negative, and $S^\circ \subseteq S$ is neutral;
$tax^S$ is the taxonomic structure of $O(T)$, which is a noncyclic and directed
graph $(S, E)$. For each edge $e \in E$ and $type(e)$ = is-a or part-of, iff $\langle s_1 \to s_2 \rangle \in E$,
$tax(s_1 \to s_2) = True$ means $s_1$ is-a or is a part-of $s_2$;
$rel$ is a Boolean function defining the related-to relationship held by two
subjects in $S$.
The subjects of user interest are extracted from the WKB via user interaction. A tool
called Ontology Learning Environment (OLE) is developed to assist users with such
interaction through a web user interface. Regarding a topic, the interesting subjects consist
of two sets: positive subjects are the concepts relevant to the information need, and
negative subjects are the concepts resolving paradoxical or ambiguous interpretation
of the information need. Thus, for a given topic, the OLE provides users with a set of
candidates to identify positive and negative subjects. These candidate subjects are
extracted from the WKB.
For each $s \in \mathbb{S}$, the s and its ancestors are retrieved if the label of s contains
any one of the query terms in the given topic (e.g., "economic" and "espionage").
From these candidates, the user selects positive subjects for the topic. The
user-selected positive subjects are presented on the top-right panel in hierarchical form.
The candidate negative subjects are the descendants of the user-selected
positive subjects. They are shown on the bottom-left panel. From these negative
candidates, the user selects the negative subjects. These user-selected negative
subjects are listed on the bottom-right panel (e.g., "Political ethics" and "Student
ethics"). Note that for the completion of the structure, some positive subjects (e.g.,
"Ethics," "Crime," "Commercial crimes," and "Competition Unfair") are also
included on the bottom-right panel with the negative subjects. These positive subjects
will not be included in the negative set.
The remaining candidates, which are not fed back as either positive or
negative from the user, become the neutral subjects to the given topic.
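The candidate extraction and the derivation of neutral subjects can be sketched as follows. This is an illustrative Java fragment, not the OLE implementation: the class name, the method names, and the choice to match query terms as case-insensitive substrings of the label are assumptions, and the ancestor sets are passed in as a precomputed map.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for selecting candidate and neutral subjects. */
final class CandidateSelection {
    private CandidateSelection() { }

    /**
     * Returns every subject whose label contains a query term, plus its ancestors.
     * ancestorsOf maps each subject label to the labels of its ancestors.
     */
    static Set<String> positiveCandidates(Map<String, Set<String>> ancestorsOf,
                                          List<String> queryTerms) {
        Set<String> candidates = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> entry : ancestorsOf.entrySet()) {
            String label = entry.getKey().toLowerCase(Locale.ROOT);
            for (String term : queryTerms) {
                if (label.contains(term.toLowerCase(Locale.ROOT))) {
                    candidates.add(entry.getKey());
                    candidates.addAll(entry.getValue());
                    break; // one matching term is enough
                }
            }
        }
        return candidates;
    }

    /** Candidates fed back as neither positive nor negative become neutral. */
    static Set<String> neutralSubjects(Set<String> candidates,
                                       Set<String> positive,
                                       Set<String> negative) {
        Set<String> neutral = new LinkedHashSet<>(candidates);
        neutral.removeAll(positive);
        neutral.removeAll(negative);
        return neutral;
    }
}
```

The negative candidates, being the descendants of the selected positive subjects, would be gathered the same way from a descendant map before calling `neutralSubjects`.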
An ontology is then constructed for the given topic using these user fed back
subjects. The structure of the ontology is based on the semantic relations linking these
subjects in the WKB. The ontology contains three types of knowledge: positive
subjects, negative subjects, and neutral subjects. Fig. 3 illustrates the ontology
(partially) constructed for the sample topic "Economic espionage," where the white
nodes are positive, the dark nodes are negative, and the grey nodes are neutral subjects.
Figure 3. An ontology constructed for the sample topic "Economic espionage"
3.3 Ontology Mining for Semantic Specificity
Ontology mining discovers interesting and on-topic knowledge from the concepts,
semantic relations, and instances in an ontology. In this section, a 2D ontology mining
method is introduced: Specificity and Exhaustivity. Specificity (denoted spe)
describes a subject's focus on a given topic. This method aims to investigate the
subjects and the strength of their associations in an ontology.
The semantic specificity is investigated based on the structure of $O(T)$
inherited from the world knowledge base. The strength of such a focus is influenced
by the subject's locality in the taxonomic structure $tax^S$ of $O(T)$. As stated in
Definition 4, the $tax^S$ of $O(T)$ is a graph linked by semantic relations. The subjects
located at upper bound levels toward the root are more abstract than those at lower
bound levels toward the "leaves." The upper bound level subjects have more
descendants, and thus refer to more concepts, compared with the lower bound level
subjects. Thus, in terms of a concept being referred to by both an upper bound and a
lower bound subject, the lower bound subject has a stronger focus because it has fewer
concepts in its space. Hence, the semantic specificity of a lower bound subject is
greater than that of an upper bound subject.
The semantic specificity is measured based on the hierarchical semantic
relations (is-a and part-of) held by a subject and its neighbours in $tax^S$. Because
subjects have a fixed locality on the $tax^S$ of $O(T)$, semantic specificity is also called
absolute specificity and denoted by $spe_a(s)$.
3.4 Architecture of the Search Engine (Ontology Model Backend)
The proposed ontology model aims to discover user background knowledge and
learns personalized ontologies to represent user profiles. Fig. 4 illustrates the
architecture of the ontology model. A personalized ontology is constructed according
to a given topic. Two knowledge resources, the global world knowledge base and the
user's local instance repository, are utilized by the model.
The world knowledge base provides the taxonomic structure for the
personalized ontology. The user background knowledge is discovered from the user
local instance repository. Against the given topic, the specificity of subjects is
investigated for user background knowledge discovery.
Figure 4. Architecture
3.5 Precision and Recall
The performance of the experimental models was measured by three methods: the
precision averages at 11 standard recall levels (11SPR), the mean average precision
(MAP), and the F1 Measure. These are modern methods based on precision and recall,
the standard methods for information gathering evaluation. Precision is the ability of a
system to retrieve only relevant documents. Recall is the ability to retrieve all relevant
documents.
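The report does not spell out how 11SPR and MAP are computed, so the following Java sketch uses the usual textbook definitions as an assumption: average precision sums the precision at each rank where a relevant document appears and divides by the number of relevant documents, and the 11-point values take, for each recall level 0.0, 0.1, ..., 1.0, the best precision reached at that recall or higher. The class and method names are hypothetical.

```java
/** Hypothetical helper for ranked-list evaluation measures. */
final class RankedEvaluation {
    private RankedEvaluation() { }

    /** Average precision of one ranked list; MAP is the mean of this value over all topics. */
    static double averagePrecision(boolean[] relevantAtRank, int totalRelevant) {
        if (totalRelevant == 0) return 0.0;
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < relevantAtRank.length; i++) {
            if (relevantAtRank[i]) {
                hits++;
                sum += (double) hits / (i + 1); // precision at this rank
            }
        }
        return sum / totalRelevant;
    }

    /** Interpolated precision at the recall levels 0.0, 0.1, ..., 1.0. */
    static double[] elevenPointPrecision(boolean[] relevantAtRank, int totalRelevant) {
        double[] result = new double[11];
        if (totalRelevant == 0) return result;
        int hits = 0;
        for (int i = 0; i < relevantAtRank.length; i++) {
            if (!relevantAtRank[i]) continue;
            hits++;
            double precision = (double) hits / (i + 1);
            double recall = (double) hits / totalRelevant;
            // A level takes the best precision observed at any recall >= that level.
            for (int level = 0; level <= 10; level++) {
                if (level / 10.0 <= recall + 1e-9 && precision > result[level]) {
                    result[level] = precision;
                }
            }
        }
        return result;
    }
}
```

Here `relevantAtRank[i]` states whether the document at rank i + 1 is relevant, and `totalRelevant` is the number of relevant documents for the topic.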
The F1 Measure is calculated by
$$Precision = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$
$$Recall = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{total relevant documents}\}|}$$
$$F_1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
The F1 measure is the harmonic mean of the precision and recall.
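The three formulas translate directly into set operations. The Java sketch below is an illustration with hypothetical names (SetMetrics and its methods); documents are identified by string IDs, and empty denominators are mapped to 0.0 as an assumption, since the report does not cover that case.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper computing set-based precision, recall and F1. */
final class SetMetrics {
    private SetMetrics() { }

    static double precision(Set<String> relevant, Set<String> retrieved) {
        if (retrieved.isEmpty()) return 0.0; // assumption: nothing retrieved gives 0
        return (double) intersectionSize(relevant, retrieved) / retrieved.size();
    }

    static double recall(Set<String> relevant, Set<String> retrieved) {
        if (relevant.isEmpty()) return 0.0; // assumption: no relevant documents gives 0
        return (double) intersectionSize(relevant, retrieved) / relevant.size();
    }

    /** Harmonic mean of precision and recall. */
    static double f1(double precision, double recall) {
        if (precision + recall == 0.0) return 0.0;
        return 2 * precision * recall / (precision + recall);
    }

    private static int intersectionSize(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }
}
```

For example, with four relevant documents and three retrieved documents of which two are relevant, precision is 2/3, recall is 1/2, and F1 is 4/7.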
Based on these, we can conclude that the Ontology model is very close to the
TREC model, and significantly better than the baseline models. These evaluation
results are promising and reliable.
CHAPTER 4: TECHNOLOGIES USED
This project is implemented using the following technologies on "MyEclipse
Enterprise Workbench 10.0" with Apache Tomcat 6 installed:
1. Java EE 7
2. Apache Struts
3. Dojo JavaScript Framework
4. MySQL
Java EE 7:
The algorithms, the reading of the global knowledge base (which is in XML form),
and the validation of the login form, registration form, etc., are implemented using Java
EE. Form validation is done using JavaBeans. The JavaServer Pages of Java EE are
used for rendering the customized pages to the users. Servlets are used with Struts on
the server side.
Apache Struts:
Apache Struts is an open-source web application framework for developing Java EE
web applications. It uses and extends the Java Servlet API to encourage developers to
adopt a model-view-controller (MVC) architecture. In a standard Java EE web
application, the client will typically call the server via a web form. The information
is then either handed over to a Java Servlet, which interacts with a database and
produces an HTML-formatted response, or it is given to a JavaServer Pages (JSP)
document that intermingles HTML and Java code to achieve the same result. Both
approaches are often considered inadequate for large projects because they mix
application logic with presentation and make maintenance difficult.
The goal of Struts is to separate the model (application logic that interacts with
a database) from the view (HTML pages presented to the client) and the controller
(instance that passes information between view and model). Struts provides the
controller (a servlet known as ActionServlet) and facilitates the writing of templates
for the view or presentation layer (typically in JSP, but XML/XSLT and Velocity are
also supported). The web application programmer is responsible for writing the model
code, and for creating a central configuration file struts-config.xml that binds together
model, view and controller.
Requests from the client are sent to the controller in the form of "Actions"
defined in the configuration file; if the controller receives such a request it calls the
corresponding Action class that interacts with the application-specific model code.
The model code returns an "ActionForward", a string telling the controller what
output page to send to the client. Information is passed between model and view in the
form of special JavaBeans. A powerful custom tag library allows it to read and write
the content of these beans from the presentation layer without the need for any
embedded Java code. Struts is categorized as a request-based web application
framework.
Dojo JavaScript Framework:
Dojo Toolkit is an open source modular JavaScript library (or more specifically
JavaScript toolkit) designed to ease the rapid development of cross-platform,
JavaScript/Ajax-based applications and web sites.
Dojo contains the Dijit widget system, a feature-rich component set
that can be readily used by our web application. Dojo widgets are components
comprising JavaScript code, HTML mark-up, and CSS style declarations that provide
cross-browser, interactive features such as:
--Menus, tabs, and tooltips
--Sortable tables
--Dynamic charts
--2D vector drawings
--Animated effects
--Tree widgets (used in our Ontology Learning Environment)
--Various forms and routines for validating form input
--Calendar-based date selector, time selector, and clock
--Core widgets
--Maps
MySQL:
MySQL is one of the world's most widely used relational database management
systems (RDBMS). It runs as a server providing multi-user access to a number of databases.
In our project we used MySQL as the database backend to store user credentials,
ontology tables and the sample links database.
CHAPTER 5: IMPLEMENTATION
5.1 Introduction
The project is developed using MyEclipse 10.0 with Apache Tomcat 6 installed. User
profiles are created when the user logs onto the system. A new table is created for storing
the credentials of the user and his personalized background knowledge. The user
registration and login forms are validated using JavaBeans.
After login, the user is given a menu with options to select from. The data set content
and the precision-recall values calculation module are present in the menu. After the user
selects the data set option, the data set content is presented to select a subject. When a
subject is selected, our ontology learning module is presented, allowing the user to select
positive and negative subjects. When the search option is selected to search the sample
index of links with descriptions, we calculate the semantic specificity for that subject,
and the relevant results are displayed according to the background knowledge learnt
from the Ontology Learning Environment.
The Ontology Learning Environment is implemented using Dojo Dijit widgets
and the results page is implemented using JavaServer Pages. The ontology is
personalized because it takes the user's personal interests into account. It is a semantic
web search because the search is not done using the given keyword blindly.
5.2 Pseudo Code Snippets
5.2.1. Algorithm 1. Analyzing semantic relations for specificity
Input : a personalized ontology $O(T) := \langle tax^S, rel \rangle$; a coefficient $\theta$ between (0,1).
Output : $spe_a(s)$ applied to specificity.
1 set $k = 1$, get the set of leaves $S_0$ from $tax^S$, for ($s_0 \in S_0$) assign $spe_a(s_0) = k$;
2 get $S'$ which is the set of leaves in case we remove the nodes $S_0$ and the related
  edges from $tax^S$;
3 if ($S' == \emptyset$) then return; //the terminal condition;
4 foreach $s' \in S'$ do
5   if ($isA(s') == \emptyset$) then $spe_a^1(s') = k$;
6   else $spe_a^1(s') = \theta \times \min\{spe_a(s) \mid s \in isA(s')\}$;
7   if ($partOf(s') == \emptyset$) then $spe_a^2(s') = k$;
8   else $spe_a^2(s') = \frac{\sum_{s \in partOf(s')} spe_a(s)}{|partOf(s')|}$;
9   $spe_a(s') = \min(spe_a^1(s'), spe_a^2(s'))$;
10 end
11 $k = k \times \theta$, $S_0 = S_0 \cup S'$, go to step 2.
The determination of a subject's $spe_a$ is described in Algorithm 1. $isA(s')$
and $partOf(s')$ are two functions in the algorithm. $isA(s')$ returns the set of
subjects $s \in tax^S$ that satisfy $tax(s \to s') = True$ and $type(s \to s')$ = is-a.
$partOf(s')$ returns the set of subjects $s \in tax^S$ that satisfy $tax(s \to s') = True$
and $type(s \to s')$ = part-of. Algorithm 1 is efficient, and it terminates eventually
because $tax^S$ is a directed acyclic graph, as defined in Definition 4.
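Algorithm 1 can be sketched in Java as a level-by-level pass from the leaves towards the root. This is an illustrative reading of the pseudo code, not the project's source: the class and method names are hypothetical, subjects are identified by labels, the coefficient from the algorithm's input is passed in as `theta`, and the two maps are assumed to give, for each subject, the subjects that are is-a or part-of it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical implementation sketch of Algorithm 1 (absolute specificity). */
final class SpecificitySketch {
    private SpecificitySketch() { }

    static Map<String, Double> absoluteSpecificity(
            Set<String> subjects,
            Map<String, Set<String>> isA,     // s' -> subjects s with "s is-a s'"
            Map<String, Set<String>> partOf,  // s' -> subjects s with "s part-of s'"
            double theta) {
        Map<String, Double> spe = new HashMap<>();
        Set<String> done = new HashSet<>();   // S_0 in the pseudo code
        double k = 1.0;
        // Step 1: leaves have no is-a or part-of children.
        for (String s : subjects) {
            if (children(isA, s).isEmpty() && children(partOf, s).isEmpty()) {
                spe.put(s, k);
                done.add(s);
            }
        }
        while (true) {
            // Step 2: subjects that become leaves once the processed ones are removed.
            Set<String> next = new HashSet<>();
            for (String s : subjects) {
                if (!done.contains(s)
                        && done.containsAll(children(isA, s))
                        && done.containsAll(children(partOf, s))) {
                    next.add(s);
                }
            }
            if (next.isEmpty()) {
                return spe;                    // Step 3: terminal condition
            }
            for (String s : next) {            // Steps 4-10
                double byIsA = k;
                if (!children(isA, s).isEmpty()) {
                    double min = Double.MAX_VALUE;
                    for (String c : children(isA, s)) min = Math.min(min, spe.get(c));
                    byIsA = theta * min;
                }
                double byPartOf = k;
                if (!children(partOf, s).isEmpty()) {
                    double sum = 0.0;
                    for (String c : children(partOf, s)) sum += spe.get(c);
                    byPartOf = sum / children(partOf, s).size();
                }
                spe.put(s, Math.min(byIsA, byPartOf));
            }
            k *= theta;                        // Step 11
            done.addAll(next);
        }
    }

    private static Set<String> children(Map<String, Set<String>> relation, String s) {
        return relation.getOrDefault(s, Collections.<String>emptySet());
    }
}
```

With a chain "C is-a B is-a A", a leaf "D part-of A", and theta = 0.9, the leaves C and D receive 1.0, B receives 0.9, and A receives 0.81, so specificity decreases towards the root as Section 3.3 describes.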
5.2.2. Dojo script for selection of positive and negative subjects
<SCRIPT type="text/javascript">
dojo.require("dojo.data.ItemFileReadStore");
dojo.require("dijit.Tree");
dojo.require("dojo.parser");
// Index of the next free output row / hidden form field.
var count = 1;
// Copies the subject currently selected in the tree into the next row.
function setPositive() {
    var tree = dijit.byId("ptree");
    var node = tree.attr("selectedItem");
    var dataId = "data" + count;
    var posId = "pos" + count;
    count = (count + 1);
    var row = document.getElementById(dataId);
    var hidden = document.getElementById(posId);
    // Show the subject name and store it in the hidden field for form submission.
    row.innerHTML = node.name;
    hidden.value = node.name;
}
</SCRIPT>
This script is intended for the selection of positive and negative subjects from
the Dojo tree widget (dijit.Tree), which is a core user interface component of our
ontology learning environment.
CHAPTER 6: RESULTS
6.1 Output Screenshots
Figure 5. Login page
Figure 6. Registration page
Figure 7. Password recovery
Figure 8. Home screen
Figure 9. Selection of subjects
Figure 10. Ontology environment-1
Figure 11. Ontology Learning Environment-2
Figure 12. Search results
Figure 13. Selection of precision from home screen menu
Figure 14. The calculated precision, recall and F-measure values
Figure 15. Graph showing values upon submission
Figure 16. Logout screen
6.2 Testing
Test cases can be divided into two types: positive test cases and negative test
cases. Positive test cases are conducted by the developer with the intention of
obtaining the output. Negative test cases are conducted by the developer with the
intention of not obtaining the output.
POSITIVE TEST CASES
S.No | Test case description | Actual value | Expected value | Result
1 | Create the new user registration process | New user created successfully | To update the database in MS Access | True
2 | Enter the Java program's root directory for the identification of faults using the dynamic test case generation process | Execute the total structure of the bug's information | Identify the bugs of information | True
3 | Using the HTML validator to specify the related bugs finder generation process | Identify the bugs position-wise | Bugs can be identified as a report generation process | True
4 | Check the verification process of tags | It can contain the state manager report process | It can work as a constraint solver process | True
NEGATIVE TEST CASES
S.No | Test case description | Actual value | Expected value | Result
1 | Create the new user registration process | New user cannot be created successfully | There is no updating process of the database | False
2 | Enter the Java program's root directory for the identification of faults using the dynamic test case generation process | Only a limited number of tags in the representation process | Contains only a small number of tags in the specification process | False
3 | Using the HTML validator to specify the related bugs finder generation process | It can process the execution process semi-permanently | It can be registered as a limited faults specification process | False
4 | Check the verification process of tags | Constraint solver was not executed perfectly | It should work with fewer constraints in the generation process | False
CHAPTER 7: CONCLUSION
7.1 Work Carried Out
An ontology model is proposed for representing user background knowledge for
personalized web information gathering. The model constructs user personalized
ontologies by extracting world knowledge from the LCSH system and discovering
user background knowledge from user local instance repositories. A two-dimensional
ontology mining method, Specificity and Exhaustivity, is also introduced for user
background knowledge discovery. In evaluation, the standard topics and a large test bed
were used for experiments. The model was compared against benchmark models by
applying it to a common system for information gathering. The experiment results
demonstrate that our proposed model is promising. A sensitivity analysis was also
conducted for the ontology model. In this investigation, we found that the combination
of global and local knowledge works better than using any one of them. In addition, the
ontology model using knowledge with both is-a and part-of semantic relations works
better than using only one of them. When using only global knowledge, these two kinds
of relations have the same contributions to the performance of the ontology model.
While using both global and local knowledge, the knowledge with part-of relations is
more important than that with is-a.
7.2 Scope for Further Development
The proposed ontology model in this project provides a solution to emphasizing
global and local knowledge in a single computational model. The findings in this
project can be applied to the design of web information gathering systems. The model
also has extensive contributions to the fields of Information Retrieval, Web
Intelligence, Recommendation Systems, and Information Systems. In future work, we
will investigate the methods that generate user local instance repositories to match the
representation of a global knowledge base. The present work assumes that all user local
instance repositories have content-based descriptors referring to the subjects; however,
a large volume of documents existing on the web may not have such content-based
descriptors. For this problem, strategies like ontology mapping and text
classification/clustering were suggested. These strategies will be investigated in future
work to solve this problem. The investigation will extend the applicability of the
ontology model to the majority of the existing web documents and increase the
contribution and significance of the present work.
References:
[1] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval. Addison
Wesley, 1999.
[2] G.E.P. Box, J.S. Hunter, and W.G. Hunter, Statistics For Experimenters. John
Wiley & Sons, 2005.
[3] C. Buckley and E.M. Voorhees, "Evaluating Evaluation Measure Stability," Proc.
ACM SIGIR '00, pp. 33-40, 2000.
[4] Z. Cai, D.S. McNamara, M. Louwerse, X. Hu, M. Rowe, and A.C. Graesser,
"NLS: A Non-Latent Similarity Algorithm," Proc. 26th Ann. Meeting of the Cognitive
Science Soc. (CogSci '04), pp. 180-185, 2004.
[5] L.M. Chan, Library of Congress Subject Headings: Principle and Application.
Libraries Unlimited, 2005.
[6] P.A. Chirita, C.S. Firan, and W. Nejdl, "Personalized Query Expansion for the
Web," Proc. ACM SIGIR ('07), pp. 7-14, 2007.
[7] R.M. Colomb, Information Spaces: The Architecture of Cyberspace. Springer,
2002.
[8] A. Doan, J. Madhavan, P. Domingos, and A. Halevy, "Learning to
Map between Ontologies on the Semantic Web," Proc. 11th Int'l Conf. World Wide
Web (WWW '02), pp. 662-673, 2002.
[9] D. Dou, G. Frishkoff, J. Rong, R. Frank, A. Malony, and D. Tucker, "Development
of Neuroelectromagnetic Ontologies (NEMO): A Framework for Mining Brainwave
Ontologies," Proc. ACM SIGKDD ('07), pp. 270-279, 2007.
[10] D. Downey, S. Dumais, D. Liebling, and E. Horvitz, "Understanding the
Relationship between Searchers' Queries and Information Goals," Proc. 17th ACM
Conf. Information and Knowledge Management (CIKM '08), pp. 449-458, 2008.
[11] S. Gauch, J. Chaffee, and A. Pretschner, "Ontology-Based Personalized Search
and Browsing," Web Intelligence and Agent Systems, vol. 1, nos. 3/4, pp. 219-234,
2003.
