1 A Web address
A Name the different parts of this URL (Uniform Resource Locator):
http://www.
When people use the term search engine in relation to the Web, they are usually referring to
the actual search forms that search through databases of HTML documents, initially gathered
Crawler-based search engines are those that use automated software agents (called crawlers)
that visit a Web site, read the information on the actual site, read the site's meta tags and also
follow the links that the site connects to, performing indexing on all linked Web sites as well.
The crawler returns all that information back to a central depository, where the data is
indexed. The crawler will periodically return to the sites to check for any information that has
changed. The frequency with which this happens is determined by the administrators of the
search engine.
In both cases, when you query a search engine to locate information, you're actually searching
through the index that the search engine has created; you are not actually searching the
Web. These indices are giant databases of information that is collected and stored and
subsequently searched. This explains why sometimes a search on a commercial search engine,
such as Yahoo! or Google, will return results that are, in fact, dead links. Since the search
results are based on the index, if the index hasn't been updated since a Web page became
invalid the search engine treats the page as still an active link even though it no longer is. It
will remain that way until the index is updated.
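The way a stale index produces dead links can be sketched in a few lines of Python. This is only an illustration: the page contents, the example.com addresses and the function names build_index and search are assumptions made for the sketch, not part of any real search engine.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, word):
    """Answer a query from the index only; the live Web is never consulted."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical snapshot of the Web at crawl time.
web = {
    "http://example.com/a": "search engines index pages",
    "http://example.com/b": "crawlers follow links between pages",
}
index = build_index(web)

# Page b disappears from the Web, but the index has not been refreshed yet.
del web["http://example.com/b"]
stale_results = search(index, "links")   # still lists the dead page

# Once the crawler has revisited the sites, the index is rebuilt.
index = build_index(web)
fresh_results = search(index, "links")   # the dead link is gone
```

Here stale_results still contains the removed page because the query is answered from the old index, while fresh_results is empty once the index has been rebuilt.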
So why will the same search on different search engines produce different results? Part of the
answer to that question is because not all indices are going to be exactly the same. It depends
on what the spiders find or what the humans submitted. But more important, not every search
engine uses the same algorithm to search through the indices. The algorithm is what the
search engines use to determine the relevance of the information in the index to what the user
is searching for.
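To see why two engines can return different results from identical data, here is a minimal Python sketch with two made-up relevance rules. Both rules (counting occurrences of the query word, and looking at how early it first appears), the page texts and the site names are assumptions chosen for illustration; real ranking algorithms are far more elaborate.

```python
def rank_by_frequency(pages, keyword):
    """Order URLs by how many times the keyword occurs in the page text."""
    keyword = keyword.lower()
    counts = {url: text.lower().split().count(keyword) for url, text in pages.items()}
    hits = [url for url in counts if counts[url] > 0]
    return sorted(hits, key=lambda url: (-counts[url], url))

def rank_by_position(pages, keyword):
    """Order URLs by how early the keyword first appears in the page text."""
    keyword = keyword.lower()
    positions = {}
    for url, text in pages.items():
        words = text.lower().split()
        if keyword in words:
            positions[url] = words.index(keyword)
    return sorted(positions, key=lambda url: (positions[url], url))

# Hypothetical indexed pages shared by both "engines".
pages = {
    "site-a": "cheap flights and more cheap flights for cheap travel",
    "site-b": "flights to Paris with a trusted airline",
}
by_frequency = rank_by_frequency(pages, "flights")
by_position = rank_by_position(pages, "flights")
```

With the same pages and the same query, the first rule puts site-a on top and the second puts site-b on top, which is the effect the paragraph describes.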
One of the elements that a search engine algorithm scans for is the frequency and location of
keywords on a Web page. Those with higher frequency are typically considered more
relevant. But search engine technology is becoming sophisticated in its attempt to discourage
what is known as keyword stuffing, or spamdexing.
Another common element that algorithms analyze is the way that pages link to other pages in
the Web. By analyzing how pages link to each other, an engine can both determine what a
page is about (if the keywords of the linked pages are similar to the keywords on the original
page) and whether that page is considered "important" and deserving of a boost in ranking.
Just as the technology is becoming increasingly sophisticated to ignore keyword stuffing, it is
also becoming more savvy to Web masters who build artificial links into their sites in order to
1. A program that searches documents for specified keywords and returns a list of the
documents where the keywords were found: ........
2. A special HTML tag that provides information about a Web page: ........
3. The process by which search engines select pieces of relevant code from the web page
and catalog them: ........
4. It refers to the practice of loading a Web page with information without substantial
added value: ........
5. A link on the Web that points to a Web page or server that is permanently unavailable:
........
6. The position of something on a scale showing its importance in relation to other
similar things: ........
3 Web Server Error Messages
Errors on the Internet, and those annoying error messages, occur quite frequently and
can be quite frustrating, especially if you do not know the difference between a 404 error
and a 502 error. Many times they have more to do with the Web servers you're trying to
access rather than something being wrong with your computer.
Provide the HTTP status codes (also called error messages) relating to the
following respective meanings:
1- Usually means the syntax used in the URL is incorrect (e.g., uppercase letter should be
lowercase letter; wrong punctuation marks): ........
2- Server is looking for some encryption key from the client and is not getting it. Also,
wrong password may have been entered. Try it again, paying close attention to case
sensitivity: ........
3- Similar to 401; special permission needed to access the site: a password and/or
username if it is a registration issue. Other times you may not have the proper
permissions set up on the server or the site's administrator just doesn't want you to be
able to access the site: ........
4- Server cannot find the file you requested. File has either been moved or deleted, or
you entered the wrong URL or document name. Look at the URL. If a word looks
misspelled, then correct it and try it again. If that doesn't work, backtrack by deleting
information between each slash, until you come to a page on that site that isn't a
404. From there you may be able to find the page you're looking for: ........
5- Client stopped the request before the server finished retrieving it. A user will either hit
the stop button, close the browser, or click on a link before the page loads. Usually
occurs when servers are slow or file sizes are large: ........
6- Web server doesn't support a requested feature: ........
7- Server congestion; too many connections; high traffic. Keep trying until the page
loads: ........
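The difference between a client-side and a server-side code can be checked with Python's standard library. The sketch below looks up only the three codes named in the introduction above (401, 404 and 502); the helper name describe and the wording of its output are assumptions made for this example.

```python
from http import HTTPStatus

def describe(code):
    """Return the standard reason phrase and which side the error points to."""
    status = HTTPStatus(code)  # raises ValueError for a code the library does not know
    if 400 <= code < 500:
        side = "client error"
    elif 500 <= code < 600:
        side = "server error"
    else:
        side = "not an error"
    return f"{code} {status.phrase} ({side})"

summaries = [describe(code) for code in (401, 404, 502)]
```

Codes in the 400 range generally point to the request (the address typed or the credentials supplied), while codes in the 500 range point to the Web server, which matches the remark that the problem often lies with the server rather than with your computer.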
Help box
A collocation is a pair or group of words that are often used together. Learning collocations
Verb + particle: hack into a computer, log onto a bank account
Complete the gaps to make collocations, and then say what type they are:
1. ........ the Web, ........ chat room, or ........ emails
2. Instant ........ can be a great way to communicate with friends.
3. This software may not be ........ compatible with older operating systems.
Make collocations related to web browsers with words from the box,
and then complete the following passage:
the home page   the current page   ........   favourites   image   links
The toolbar of a web page shows all the navigation icons, which let you 1
Tab buttons let you view different sites at the same time, and the built-in 5
page won't load, you can 7 ........, meaning the page reloads (downloads
again). If you want to mark a website address so that you can easily revisit the page at a
later time, you can add it to your favourites (favorites in American English), or bookmark
it. When you want to visit it again you simply click 8 ........
On the web page itself, most sites feature 9 ........ and 10 ........
Together, these are known as hyperlinks and take you to other web pages when clicked.
Simple Mail Transfer protocol
SMTP is used to transfer messages between one mail server and another. It's also used
by email programs on PCs to send mail to the server. SMTP is very straightforward,
providing only facilities to deliver messages to one or more recipients in batch mode.
Once a message has been delivered, it can't be recalled or cancelled. It's also deleted from
the sending server once it's been delivered. SMTP uses 'push' operation, meaning that the
connection is initiated by the sending server rather than the receiver. This makes it
unsuitable for delivering messages to desktop PCs, which aren't guaranteed to be switched
on at all times.
In host-based mail systems, such as Unix and Web mail, SMTP is the only protocol the
server uses. Received messages are stored locally and retrieved from the local file system
by the mail program. In the case of Web mail, the message is then translated into HTML
and transmitted to your browser. SMTP is the only protocol for transferring messages
between servers. How they are then stored varies from system to system.
POP is a message-retrieval protocol used by many PC mail clients to get messages
from a server, typically your ISP's mail server. It only allows you to download all
messages in your mailbox at once. It works in 'pull' mode, the receiving PC initiating the
connection. PC-based POP3 mail clients can do this automatically at a preset interval.
When you use your Web mail account to access a POP3 mailbox, the mail server opens a
connection to the POP3 server just as a PC-based application would. The messages are
then copied into your Web mailbox and read via a browser.
Since POP3 downloads all the messages in your mailbox, there's an option to leave
messages on the server, so that they can be picked up from different machines without
losing any. This does mean that you'll get every message downloaded every time you
connect to the server. If you don't clean your mailbox regularly, this could mean long
downloads. When using a Web mail account to retrieve your POP3 mail, be careful about
leaving messages on the server; if too many build up, each download will take a long time
and fill up your inbox. Many Web mail systems won't recognize messages you've already
downloaded, so you will get duplicates of ones you haven't deleted.
IMAP is similar in operation to POP, but allows you more choice over what messages
you download. Initially, only message headers are retrieved, giving information about the
sender and subject. You can then download just those messages you want to read. You can
also delete individual messages from the server, and some IMAP4 servers let you organize
your mail into folders. This makes download times shorter and there's no danger of losing
messages.
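The contrast between POP-style and IMAP-style retrieval can be imitated with a toy mailbox in Python. Nothing here talks to a real mail server: the MailServer class, the function names and the two sample messages are all assumptions invented for the sketch.

```python
class MailServer:
    """Toy mailbox: maps a message id to a (subject, body) pair."""
    def __init__(self, messages):
        self.messages = dict(messages)

def pop_download(server, leave_on_server):
    """POP-style retrieval: fetch every message in the mailbox at once."""
    downloaded = list(server.messages.items())
    if not leave_on_server:
        server.messages.clear()
    return downloaded

def imap_headers(server):
    """IMAP-style first step: fetch only the headers (here, the subjects)."""
    return {msg_id: subject for msg_id, (subject, _body) in server.messages.items()}

def imap_fetch(server, msg_id):
    """IMAP-style second step: download the body of one chosen message."""
    return server.messages[msg_id][1]

server = MailServer({1: ("Hello", "Hi there"), 2: ("Invoice", "Please pay")})

# Leaving mail on the server means the same messages arrive on every connection.
first = pop_download(server, leave_on_server=True)
second = pop_download(server, leave_on_server=True)

# With the IMAP-style calls, only the headers travel until a message is chosen.
headers = imap_headers(server)
body = imap_fetch(server, 2)
```

The second POP-style download returns exactly the same two messages as the first, which is the duplicate problem described for mailboxes left on the server, whereas the IMAP-style calls transfer one chosen body only.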
Find answers to these questions in the text:
1) Which email protocol is used to transfer messages between server computers?
2) Why is SMTP unsuitable for delivering messages to desktop PCs?
3) Name two host-based mail systems mentioned in the text.
3. An email transfer process in which the receiving computer initiates the connection:
........
4. A simple mail transfer protocol that is used to send messages between servers:
........
5. A message-retrieval protocol that downloads all email messages at the same time:
........
Listen to this recording, which explains how a browser finds the web page you want.
Note down the different stages in the appropriate order.
8 Writing
Write your own description of how a browser finds the web page you want, using
appropriate time clauses.
Data Security
1 Read the text "THE ANATOMY OF A VIRUS"
and then find answers to the questions that follow:
A biological virus is a very small, simple organism that infects living cells, known as the host,
by attaching itself to them and using them to reproduce itself. This often causes harm to the
host cells.
Similarly, a computer virus is a very small program routine that infects a computer system and
uses its resources to reproduce itself. It often does this by patching the operating system to
enable it to detect program files, such as COM or EXE files. It then copies itself into those
files. This sometimes causes harm to the host computer system.
When the user runs an infected program, it is loaded into memory carrying the virus. The virus
uses a common programming technique to stay resident in memory. It can then use a
reproduction routine to infect other programs. This process continues until the computer is
switched off.
The virus may also contain a payload that remains dormant until a trigger event activates it,
such as the user pressing a particular key. The payload can have a variety of forms. It might do
something relatively harmless such as displaying a message on the monitor screen or it might
do something more destructive such as deleting files on the hard disk.
1. How are computer viruses like biological viruses?
2. What is the effect of a virus patching the operating system?
3. Why are some viruses designed to be loaded into memory?
4. What examples of payload does the writer provide?
5. What kind of programs do viruses often attach to?
6. Provide the function of each virus routine:
misdirection
reproduction
trigger
payload
2- You are going to hear Jon, a bank security officer, answer some questions about his job.
Before you listen, try to complete the sentences about bank security.
a. A ........ hacker is a hacker who helps organizations protect themselves against criminal
hackers.
b. A ........ is a process to check to see who is connected to a network.
c. ........ fingerprinting gives information about what operating system people are using.
d. 128-bit SSL ........ encrypt data.
e. Anti-virus software can protect against viruses and ........
f. ........ phishing is a more targeted form of phishing.
Now listen to Jon and check your answers.
3- Internet Crimes
People shouldn't buy cracked software or download music illegally from the Internet.
Be suspicious of wonderful offers. Don't buy if you aren't sure.
It's dangerous to give personal information to people you contact in chat rooms.
Don't open attachments from people you don't know even if the subject looks attractive.
Scan your email and be careful about which websites you visit.
Check with your bank before sending information.
4- Read the following text, and then deal with the related exercises:
1. What does data encryption provide?
b. integrity c. authentication
2. A message encrypted with the recipient's public key can only be decrypted with
a- the sender's private key b- the sender's public key c- the recipient's private key
3. What system is commonly used for encryption?
4. What is the opposite of 'encrypt'?
5. A message-digest function is used to:
In English, words, particularly adjectives and nouns, are combined into compound structures in a
variety of ways. There are three forms of compound words:
1- The closed form, in which the words are melded together, such as: keyboard.
2- The hyphenated form, such as: a hands-free device.
3- The open form, such as: a control panel.
A- Compounds consist of a headword and one or more modifiers which refer to different things:
2- Use or function: e.g. search-engine = a program used to find information on the web.
1- When the second noun belongs to or is part of the first: e.g. college library;
but words denoting quantity (e.g. piece) cannot be used in this way: e.g. a piece of text (not
text piece).
2- The first noun can indicate the place of the second: e.g. online shop.
3- The first noun can indicate the time of the second: e.g. pre-service training.
4- The first can state the material of which the second is made: e.g. carbon fiber.
5- The first noun can also state the power used to operate the second: e.g. laser disk = a storage
disk with stored data readable by laser beam.
6- The first can indicate the purpose (aim) of the second: e.g. traffic signal = a signal that
controls the movement of traffic.
7- Work areas, such as factory, can be preceded by the name of the article produced:
e.g. a machine shop = a workshop where different metals are cut, shaped and worked.
8- Some combinations are often used of occupations, sports, hobbies, and the people
who practise them: e.g. computer consultant, car rally.
9- The first noun can show what the second is about or concerned with:
e.g. income tax, car insurance.
6- Listening: Listen to this recording on 'Basic Cyber Safety', and then note down
some of the given pieces of advice.
7- Language Work
A- Explain the following compounds, extract some others from the text, and then
study the related notes: "a message-integrity scheme", "data encryption
methods".
B- Read the following sentences, and then form compounds that refer to them:
1- A website which is designed in a good way.
2- A display which is mounted on the user's head.
3- An operation which doesn't require hands.
4- A computer which runs on batteries.
5- A hard drive which integrates two different technologies.
6- A special file which redirects to another file or program.
7- A peripheral device which reads and writes flash memory.
8- A file which can be retrieved and displayed, but not changed or deleted.
9- A device which is worked by the user's voice.
10- A source of power which may fail.
11- An unauthorized access of a website.
12- Strategies against malware.
The average computer user has between 5 and 15 username/password combinations to log in to
email accounts, social networking sites, discussion boards, news and entertainment sites, online
stores, online banking accounts, or other websites. For people who use email or other internet
applications at work, the number of required username/password combinations may surpass 30.
Some of these accounts demand that you use a specific number of symbols and digits, while others
require you to change your password every 50 days. When you add to this list the codes needed to
access things like ATMs (devices to perform financial transactions), home alarm systems, padlocks,
or voicemail, the number of passwords becomes staggering. The feeling of frustration that results
from maintaining a memorized list of login credentials has grown so prevalent that it actually has a
name: password fatigue.
Having to remember so many different passwords is irritating but it can also be dangerous. Because
it is virtually impossible to remember a unique password for each of these accounts, many people
leave handwritten lists of usernames and passwords on or next to their computers. Others solve this
problem by using the same password for every account or using extremely simple passwords. While
these practices make it easier to remember login information, they also make it exponentially easier
for thieves to hack into accounts. Single sign-on (SSO) authentication and password management
software can help mitigate this problem, but there are drawbacks to both approaches. SSO
authentication can be used for related, but independent software systems. With SSO, users log in
once to access a variety of different applications. Users only need to remember one password to log
in to the main system; the SSO software then automatically logs the user in to other accounts within
the system. SSO software is typically used by large companies, schools, or libraries. Password
management software, such as KeePass and Password Safe, is most often used on personal
computers. These software programs, which have been built into many major web browsers, store
passwords in a remote database and automatically "remember" users' passwords for a variety of
sites.
The problem with both SSO authentication and password management software is that the feature
that makes them useful is also what makes them vulnerable. If a user loses or forgets the password
required to log in to SSO software, the user will then lose access to all of the applications linked to
the SSO account. Furthermore, if a hacker can crack the SSO password, he or she will then have
access to all of the linked accounts. Users who rely on password management software are
susceptible to the same problems, but they also incur the added threat of passwords being
compromised because of computer theft.
Although most websites or network systems allow users to recover or change lost passwords by
providing email addresses or answering a prompt, this process can waste time and cause further
frustration. What is more, recovering a forgotten password is only a temporary solution; it does not
address the larger problem of password fatigue.
Some computer scientists have suggested that instead of passwords, computers rely on biometrics.
This is a method of recognizing human users based on unique traits, such as fingerprints, voice, or
DNA. Biometric identification is currently used by some government agencies and private
companies, including the Department of Defense and Disney World. While biometrics would
certainly eliminate the need for people to remember passwords, the use of biometrics raises ethical
questions concerning privacy and can also be expensive to implement.
The problems associated with SSO, password management software, and biometrics continue to
stimulate software engineers and computer security experts to search for the cure to password
fatigue. Until they find the perfect solution, however, everyone will simply have to rely on the
solutions as
inadequate.
d. The author explains a problem and then persuades readers to agree with his or her solution to the
problem.
e. The author explains a problem, contextualizes the problem, and ultimately dismisses it as an
unnecessary concern.
2) The passage discusses all of the following solutions to password fatigue except
b. voice-recognition software
c. KeePass
d. intelligent encryption
3) As used in paragraph 9, which is the best synonym for mitigate?
a. predict
b. postpone
c. investigate
d. lessen
e. complicate
4) According to the passage, SSO authentication software may be safer than password management
software because
II. if a user of password management software forgets his or her login credentials, the user can no
longer access any of the applications protected by the password
III. hackers who access password management software can gain access to all of the applications
protected by that password
a. I only
b. II only
c. I and II only
5) Which of the following statements from the passage represents an opinion, as opposed to a fact?
a. "For people who use email or other internet applications at work, the number of required
username/password combinations may surpass 30."
b. "The feeling of frustration that results from maintaining a memorized list of login credentials has
grown so prevalent that it actually has a name: password fatigue."
c. "Having to remember so many different passwords is irritating but it can also be dangerous."
d. "Additionally, recovering a forgotten password is only a temporary solution; it does not address
the larger problem of password fatigue."
e. "The problems associated with SSO, password management software, and biometrics continue to
stimulate software engineers and computer security experts to search for the cure to password
fatigue."
6) In the passage, the author notes that "the use of biometrics raises ethical questions concerning
privacy." Which of the following situations could be used as an example to illustrate this point?
a. A thief steals a personal computer with password management software and gains access to
private email accounts, credit card numbers, and bank statements.
b. An employee at a company uses a voice recognition system to log in to his computer, only to be
called away by his boss. While he is away from the computer but still logged in, another employee
snoops on his computer and reads personal email correspondence.
c. A computer hacker gains access to a system that uses SSO software by cracking the password, thus
gaining private access to all linked accounts.
d. A company that employs fingerprint identification security software turns over its database of
fingerprints to the local police department when a violent crime occurs on its grounds.
e. Even when a person is on password-protected websites, an internet browser tracks the person's
internet use and collects information in order to tailor advertisements to his or her interests.
a. angry
d. hopeful
e. depressed
U.S.T.H.B / C.E.I.L UNIT 3
Neural networks look at the rules of using data, which are based on the connections found or
on a sample set of data. As a result, the software continually analyses value and compares it to
the other factors, and it compares these factors repeatedly until it finds patterns emerging.
These patterns are known as rules. The software then looks for other patterns based on these
rules or sends out an alarm when the trigger value is hit.
Clustering divides data into groups based on similar features or limited data ranges. Clusters
are used when data is not labelled in a way that is favourable to mining. For instance, an
insurance company that wants to find instances of fraud wouldn't have its records labelled as
fraudulent or not fraudulent. But after analyzing patterns within clusters, the mining software
can start to figure out the rules that point to which claims are likely to be false.
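Grouping by "limited data ranges" can be sketched in Python as follows. This is a deliberately simple illustration rather than a real clustering algorithm: the claim amounts, the range width of 1000 and the function name cluster_by_range are assumptions made for the example.

```python
from collections import defaultdict

def cluster_by_range(values, width):
    """Group numbers into clusters that each cover a range of the given width."""
    clusters = defaultdict(list)
    for value in values:
        low = (value // width) * width      # lower bound of the range holding the value
        clusters[(low, low + width)].append(value)
    return dict(clusters)

# Hypothetical, unlabelled insurance claim amounts.
claims = [120, 180, 150, 4100, 4300, 130, 4200]
clusters = cluster_by_range(claims, 1000)
```

None of the claims carries a "fraudulent" label, yet two groups emerge, one of small claims and one of unusually large ones, and an analyst could then examine the second group for rules that point to false claims.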
Decision trees, like clusters, separate the data into subsets and then analyse the subsets to
divide them into further subsets, and so on (for a few more levels). The final subsets are then
small enough that the mining process can find interesting patterns and relationships within the
data.
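The repeated division into subsets can be sketched in Python as a short recursive function. It is a simplification under stated assumptions: a real decision-tree algorithm chooses each split by a statistical criterion, whereas this sketch simply splits on a fixed list of fields, and the records, field names and max_size limit are all invented for the example.

```python
def split_into_subsets(records, keys, max_size):
    """Recursively divide records by one key per level until subsets are small."""
    if len(records) <= max_size or not keys:
        return records                      # small enough, or nothing left to split on
    key, remaining = keys[0], keys[1:]
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record)
    return {value: split_into_subsets(group, remaining, max_size)
            for value, group in groups.items()}

# Hypothetical insurance claims.
records = [
    {"region": "north", "type": "car", "amount": 900},
    {"region": "north", "type": "home", "amount": 300},
    {"region": "north", "type": "car", "amount": 950},
    {"region": "south", "type": "car", "amount": 200},
]
tree = split_into_subsets(records, ["region", "type"], max_size=1)
```

The data is first divided by region; the "north" subset is still too large, so it is divided again by type, while the single "south" record needs no further division.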
Once the data to be mined is identified, it should be cleansed. Cleansing data frees it from
duplicate information and erroneous data. Next, the data should be stored in a uniform format
within relevant categories or fields. Mining tools can work with all types of data storage, from
large data warehouses to smaller desktop databases to flat files. Data warehouses and data
marts are storage methods that involve archiving large amounts of data in a way that makes it
easy to access when necessary.
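A minimal Python sketch of the cleansing step might look like this. The record layout (a name and an age), the rule that a negative or missing age counts as erroneous, and the function name cleanse are assumptions chosen for illustration.

```python
def cleanse(records):
    """Drop duplicate and erroneous records and store names in one uniform format."""
    seen = set()
    cleaned = []
    for name, age in records:
        name = name.strip().title()         # uniform format for the name field
        if not name or age is None or age < 0:
            continue                        # erroneous data
        key = (name, age)
        if key in seen:
            continue                        # duplicate information
        seen.add(key)
        cleaned.append(key)
    return cleaned

# Hypothetical raw customer records.
raw = [("alice smith", 34), ("Alice Smith ", 34), ("bob jones", -5), ("Carol White", 41)]
clean = cleanse(raw)
```

The two spellings of the same customer collapse into one record once the names share a format, and the record with an impossible age is discarded before mining begins.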
When the process is complete, the mining software generates a report. An analyst goes over
the report to see if further work needs to be done, such as refining parameters, using other
data analysis tools to examine the data, or even scrapping the data if it's unusable. If no
further work is required, the report proceeds to the decision makers for appropriate action.
The power of data mining is being used for many purposes, such as analyzing Supreme Court
decisions, discovering patterns in health care, pulling stories about competitors from
newswires, resolving bottlenecks in production processes, and analysing sequences in the
1. Separate data into subsets and then analyse the subsets to divide them into further
subsets for a number of levels: ........
2. Storage method of archiving large amounts of data to make it easy to access: ........
3. Data free from duplicate and erroneous information: ........
4. A process of filtering through large amounts of raw data for useful information: ........
5. A computing tool that tries to operate in a way similar to the human brain: ........
2- Mark the following as true or false:
An analyst goes over the report to see if further work needs to be done.
What does the underlined word represent?
Q1 - It's ........ than mine.
smaller
more small
Either could be used here.
Q2 - Yours is ........ than mine.
bigger
more big
Q10 - It was ........ than I thought it would be.
quicker
more quick
Either could be used here.
E.g. The more I study, the more I know. (As/Because I study more, my knowledge grows
more).
E.g. The smaller the problem, the less challenging it is for the programmer.
(As/Because the problem is simple, it does not present a great challenge to the programmer).
E.g. The differences in resolution are noticeable. The more dots per inch, the clearer the
image. (As/Because there are a lot of dots per inch, the image becomes very clear).
Sometimes the subject and the verb (to be) are omitted:
E.g. The sooner, the better.
E.g. The less said about it, the better.
Now that data mining has been introduced, it is interesting to deal with some other related
aspects.
Benefits of data mining
Direct marketing: the ability to predict who is most likely to be interested in what products
can save companies immense amounts in marketing expenditures.
Fraud detection: data mining techniques can help discover which insurance claims, cellular
phone calls or credit card purchases are likely to be fraudulent.
Forecasting in financial markets: data mining techniques are extensively used to help model
financial markets.
Mining online: web sites today find themselves competing for customer loyalty.
Comparing data mining with some other techniques
Query tools versus data mining tools: with a query tool a user can ask a question. However, a
data mining process tackles a broader underlying goal of a user.
Data mining tasks: the most common types of data mining tasks, classified based on the
kind of knowledge they are looking for, are listed as follows:
Sequence detection: by observing patterns in the data, sequences are determined.
Deviation analysis: for example, John went to the bank on Saturday, but he did not go to the
grocery store after that; instead he went to a football game. With this task anomalous
Techniques for data mining: Data mining is an integration of multiple technologies. These
include:
o Decision trees: decision trees, or a series of if/then rules, as a commonly used
machine learning algorithm, are powerful and popular tools for classification and
prediction.
o Neural networks (NN): Neural networks constitute another popular data mining
technique. A neural network is a system of software programs and data structures that
approximates the operation of the brain.
Data mining process: The process overview: in general, when people talk about data mining
they focus primarily on the actual mining and discovery aspects. The idea sounds intuitive and
attractive. However, mining data is only one step in the overall process. The latter is indeed a
multistep, iterative process.
Mark the following as true or false:
1- Targeting potential clients and products is one of the tasks of data mining.
2- Data mining forecast concerning trends in the marketplace is not essential, but is
useful in reducing costs.
3- Concerning insurance claims and unlike cellular phone calls, credit card purchases are
fraudulent.
3 Report writing