
1. Explain Teradata architecture

Major components of the Teradata architecture

NODE: A node is made up of various hardware and software components.

Components that make up a node are:
1. Parsing Engine (PE)
2. BYNET
3. Access Module Processor (AMP)
4. Disks

Parsing Engine: The Parsing Engine (PE) is a component that interprets SQL requests, receives input records, and passes data. To do that, it sends messages through the BYNET to the AMPs.

BYNET: The BYNET is the message-passing layer. It determines which AMP(s) (Access Module Processors) should receive a message.

Access Module Processor (AMP): The AMP is a virtual processor designed for and dedicated to managing a portion of the entire database. It performs all database management functions such as sorting, aggregating, and formatting data. The AMP receives data from the PE, formats rows, and distributes them to the disk storage units it controls. The AMP also retrieves the rows requested by the Parsing Engine.

Disks: Disks are disk drives associated with an AMP that store the data rows. On current systems, they are implemented using a disk array.

All applications run under UNIX, Windows NT or Windows 2000, and all Teradata software runs under PDE. All share the resources of CPU and memory on the node. AMPs and PEs are virtual processors running under control of the PDE. Their numbers are software configurable. In addition to user applications, gateway software and channel driver support may also be running.

The Teradata RDBMS has a "shared-nothing" architecture, which means that the vprocs (which are the PEs and AMPs) do not share common components. For example, each AMP manages its own dedicated memory space (taken from the memory pool) and the data on its own vdisk; these are not shared with other AMPs. Each AMP uses system resources independently of the other AMPs, so they can all work in parallel for high overall system performance.

Symmetric Multi-Processor (SMP): A single node is a Symmetric Multi-Processor (SMP).

Massively Parallel Processing (MPP): When multiple SMP nodes are connected to form a larger configuration, we refer to this as a Massively Parallel Processing (MPP) system.

2. Functionality of each component included in the Teradata architecture

Parsing Engine: A Parsing Engine (PE) is a virtual processor (vproc). It is made up of the following software components:
1. Session Control
2. Parser
3. Optimizer
4. Dispatcher

Session Control: The major functions performed by Session Control are logon and logoff. Logon takes a textual request for session authorization, verifies it, and returns a yes or no answer. Logoff terminates any ongoing activity and deletes the session's context.

Parser: The Parser interprets SQL statements, checks them for proper SQL syntax and evaluates them semantically. The PE also consults the Data Dictionary to ensure that all objects and columns exist and that the user has authority to access these objects.

Optimizer: The Optimizer is responsible for developing the least expensive plan to return the requested response set. Processing alternatives are evaluated and the fastest alternative is chosen. This alternative is converted to executable steps, to be performed by the AMPs, which are then passed to the Dispatcher.

Dispatcher: The Dispatcher controls the sequence in which the steps are executed and passes the steps on to the BYNET. It is composed of execution-control and response-control tasks. Execution control receives the step definitions from the Parser and transmits them to the appropriate AMP(s) for processing, receives status reports from the AMPs as they process the steps, and passes the results on to response control once the AMPs have completed processing. Response control returns the results to the user. The Dispatcher sees that all AMPs have finished a step before the next step is dispatched. Depending on the nature of the SQL request, a step will be sent to one AMP or broadcast to all AMPs.

BYNET: The BYNET handles the internal communication of the Teradata RDBMS. All communication between PEs and AMPs is done via the BYNET. When the PE dispatches the steps for the AMPs to perform, they are dispatched onto the BYNET. The messages are routed to the appropriate AMP(s), where result sets and status information are generated. This response information is also routed back to the requesting PE via the BYNET. Depending on the nature of the dispatch request, the communication may be:
- Broadcast: the message is routed to all nodes in the system.
- Point-to-point: the message is routed to one specific node in the system.

Once the message is on a participating node, PDE handles the multicast (it carries the message to just the AMPs that should get it). So, while a Teradata system does do multicast messaging, the BYNET hardware alone cannot do it; the BYNET can only do point-to-point and broadcast between nodes.

FEATURES OF BYNET: The BYNET has several unique features:
- Fault tolerant: each network has multiple connection paths. If the BYNET detects an unusable path in either network, it will automatically reconfigure that network so all messages avoid the unusable path. Additionally, in the rare case that BYNET 0 cannot be reconfigured, hardware on BYNET 0 is disabled and messages are re-routed to BYNET 1 (or equally distributed if there are more than two BYNETs present), and vice versa.
- Load balanced: traffic is automatically and dynamically distributed between both BYNETs.
- Scalable: as you add nodes to the system, overall network bandwidth scales linearly, meaning an increase in system size without loss of performance.
- High performance: an MPP system typically has two or more BYNET networks. Because all networks are active, the system benefits from the full aggregate bandwidth of all networks. Since the number of networks can be scaled, performance can also be scaled to meet the needs of demanding applications.

The technology of the BYNET is what makes the Teradata parallelism possible.

The Access Module Processor (AMP)

The Access Module Processor (AMP) is the virtual processor. An AMP controls some portion of each table on the system. AMPs do the physical work associated with generating an answer set, including sorting, aggregating, formatting and converting. An AMP can control up to 64 physical disks. The AMPs perform all database management functions in the system. An AMP responds to Parser/Optimizer steps transmitted across the BYNET by selecting data from or storing data to its disks. For some requests, the AMPs may redistribute a copy of the data to other AMPs.

The Database Manager subsystem resides on each AMP. The Database Manager:
- Receives the steps from the Dispatcher and processes the steps. It has the ability to:
  - Lock databases and tables.
  - Create, modify, or delete definitions of tables.
  - Insert, delete, or modify rows within the tables.
  - Retrieve information from definitions and tables.
- Collects accounting statistics, recording accesses by session so users can be billed appropriately.
- Returns responses to the Dispatcher.

The Database Manager provides a bridge between the logical organization and the physical organization of the data on disks. The Database Manager performs a space-management function that controls the use and allocation of space.

A disk array is a configuration of disk drives that utilizes specialized controllers to manage and distribute data and parity across the disks while providing fast access and data integrity. Each AMP vproc must have access to an array controller that in turn accesses the physical disks. AMP vprocs are associated with one or more ranks of data. The total disk space associated with an AMP vproc is called a vdisk. A vdisk may have up to three ranks.

Teradata supports several protection schemes:
- RAID Level 5: data and parity protection striped across multiple disks.
- RAID Level 1: each disk has a physical mirror replicating the data.
- RAID Level S: data and parity protection similar to RAID 5 but used for EMC disk arrays.

The disk array controllers are referred to as dual active array controllers, which means that both controllers are actively used in addition to serving as backup for each other.

3. How is Teradata parallel?

Teradata is parallel for the following reasons:

Each PE can support up to 120 user sessions in parallel. Each session may handle multiple requests concurrently. While only one request at a time may be active on behalf of a session, the session itself can manage the activities of 16 requests and their associated answer sets.

The MPL is implemented differently for different platforms; this means that it will always be well within the needed bandwidth for each particular platform's maximum throughput.

Each AMP can perform up to 80 tasks in parallel. This means that AMPs are not dedicated at any moment in time to the servicing of only one request, but rather are multi-threading multiple requests concurrently. Because AMPs are designed to operate on only one portion of the database, they must operate in parallel to accomplish their intended results. In addition to this, the optimizer may direct the AMPs to perform certain steps in parallel if there are no contingencies between the steps. This means that an AMP might be concurrently performing more than one step on behalf of the same request.

Query Parallelism: Breaking the request into smaller components, all components being worked on at the same time, with one single answer delivered. Parallel execution can incorporate all or part of the operations within a query, and can significantly reduce the response time of an SQL statement, particularly if the query reads and analyzes a large amount of data. Query parallelism is enabled in Teradata by hash-partitioning the data across all the VPROCs defined in the system. A VPROC provides all the database services on its allocation of data blocks. All relational operations such as table scans, index scans, projections, selections, joins, aggregations, and sorts execute in parallel across all the VPROCs simultaneously and unconditionally. Each operation is performed on a VPROC's data independently of the data associated with the other VPROCs.

4. Explain mechanism in data distribution and data retrieval

Data Distribution: Teradata uses hash partitioning and distribution to randomly and evenly distribute data across all AMPs. The rows of every table are distributed among all AMPs, ideally evenly, according to their Primary Index value. The Primary Index value goes into the hashing algorithm and the output is a 32-bit Row Hash. The high-order 16 bits are referred to as the "bucket number" and are used to identify a hash map entry. The "hash bucket" is also referred to as the DSW (Destination Selection Word). This entry, in turn, is used to identify the AMP that will be targeted. The remaining 16 bits are not used to locate the AMP. Each hash map is simply an array that associates DSW values (or bucket numbers) with specific AMPs.

To locate a row, the AMP file system searches through a memory-resident structure called the Master Index. An entry in the Master Index will indicate that if a row with this Table ID and row hash exists, then it must be on a specific disk cylinder. The file system will then search through the designated Cylinder Index. There it will find an entry that indicates that if a row with this Table ID and row hash exists, it must be in one specific data block on that cylinder. The file system then searches the data block until it locates the row(s) or returns a No Rows Found condition code.
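The path from Primary Index value to target AMP described under Data Distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Teradata's actual hashing algorithm is not given in these notes, so CRC-32 is used as a stand-in 32-bit hash, and `NUM_AMPS`, `HASH_MAP`, `row_hash`, `bucket_number` and `target_amp` are hypothetical names invented for the example.

```python
import zlib

NUM_AMPS = 4             # hypothetical system size
NUM_BUCKETS = 1 << 16    # a 16-bit bucket number gives 65,536 hash map entries

# Hypothetical hash map: an array that associates each bucket number with an AMP.
HASH_MAP = [bucket % NUM_AMPS for bucket in range(NUM_BUCKETS)]


def row_hash(primary_index_value: str) -> int:
    """Stand-in 32-bit row hash (CRC-32); NOT Teradata's real algorithm."""
    return zlib.crc32(primary_index_value.encode("utf-8")) & 0xFFFFFFFF


def bucket_number(rh: int) -> int:
    """The high-order 16 bits of the 32-bit row hash select the hash map entry."""
    return rh >> 16


def target_amp(primary_index_value: str) -> int:
    """Look up the AMP that owns rows with this Primary Index value."""
    return HASH_MAP[bucket_number(row_hash(primary_index_value))]


if __name__ == "__main__":
    for value in ("1001", "1002", "1003"):
        print(value, "-> AMP", target_amp(value))
```

The same Primary Index value always hashes to the same bucket and therefore to the same AMP, which is why a lookup by Primary Index value only has to involve one AMP.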

Data Retrieval: Retrieving data from the Teradata RDBMS simply reverses the storage model process. A request made for data is passed on to a Parsing Engine (PE). The PE optimizes the request for efficient processing and creates tasks for the AMPs to perform, which results in the request being satisfied. Tasks are then dispatched to the AMPs via the BYNET. Often, all AMPs must participate in creating the answer set, such as when returning all rows of a table to a client application. Other times, only one or a few AMPs need to participate. The PE will ensure that only the AMPs that need to will be assigned tasks. Once the AMPs have been given their assignments, they retrieve the desired rows from their respective disks. The AMPs will sort, aggregate, or format if needed. The rows are then returned to the requesting PE via the BYNET. The PE takes the returned answer set and returns it to the requesting client application.

5. If PI is not defined on a Teradata table, what will happen?

Teradata tables must have a primary index. If none is specified while creating the table, Teradata supplies an automatically created one.

6. What are the types of indexes in Teradata?

- Unique Primary Index (UPI)
- Unique Secondary Index (USI)
- Non-Unique Primary Index (NUPI)
- Non-Unique Secondary Index (NUSI)
- Join Index

7. What is secondary index? What are its uses?

A secondary index is an alternate path to the data. Secondary indexes are used to improve performance by allowing the user to avoid scanning the entire table during a query. A secondary index is like a primary index in that it allows the user to locate rows. Unlike a primary index, it has no influence on the way rows are distributed among AMPs. Secondary indexes are optional and can be created and dropped dynamically. Secondary indexes require separate subtables, which require extra I/O to maintain the indexes.

Compared to primary indexes, secondary indexes allow access to information in a table by alternate, less frequently used paths. Teradata automatically creates a Secondary Index Subtable. The subtable will contain:

- Secondary Index Value
- Secondary Index Row ID
- Primary Index Row ID

When a user writes an SQL query that has an SI in the WHERE clause, the Parsing Engine will hash the Secondary Index Value. The output is the Row Hash of the SI. The PE creates a request containing the Row Hash and gives the request to the Message Passing Layer (which includes the BYNET software and network). The Message Passing Layer uses a portion of the Row Hash to point to a bucket in the Hash Map. That bucket contains an AMP number to which the PE's request will be sent. The AMP gets the request and accesses the Secondary Index Subtable pertaining to the requested SI information. The AMP will check to see if the Row Hash exists in the subtable and double-check the subtable row against the actual secondary index value. Then, the AMP will create a request containing the Primary Index Row ID and send it back to the Message Passing Layer. This request is directed to the AMP with the base table row, and the AMP easily retrieves the data row.

Secondary indexes can be useful for:

- Satisfying complex conditions
- Processing aggregates
- Value comparisons
- Matching character combinations
- Joining tables

8. Why is secondary index needed?

Secondary indexes are used to improve performance by allowing the user to avoid scanning the entire table during a query. Secondary indexes are frequently used in the WHERE clause. The base table data is not redistributed when secondary indexes are defined.

Secondary indexes can be useful for:

- Satisfying complex conditions
- Processing aggregates
- Value comparisons
- Matching character combinations
- Joining tables

9. What are the different types of locks in Teradata?

Locking prevents multiple users who are trying to change the same data at the same time from violating the data's integrity. Locks are automatically acquired during the processing of a request and released at the termination of the request.

There are four types of locks:

Exclusive Lock: Exclusive locks are only applied to databases or tables, never to rows. They are the most restrictive type of lock; all other users are locked out. Exclusive locks are used rarely, most often when structural changes are being made to the database.

Read Lock: Read locks are used to ensure consistency during read operations. Several users may hold concurrent read locks on the same data, during which no modification of the data is permitted.

Write Lock: Write locks enable users to modify data while locking out all other users except readers not concerned about data consistency (Access lock readers). Until a Write lock is released, no new read or write locks are allowed.

Access Lock: Access locks can be specified by users who are not concerned about data consistency. The use of an access lock allows for reading data while modifications are in process. Access locks are designed for decision support on large tables that are updated only by small single-row changes. Access locks are sometimes called "stale read" locks, i.e. you may get 'stale data' that hasn't been updated.

Locks may be applied at three levels:
- Database: applies to all tables/views in the database
- Table/View: applies to all rows in the table/view
- Row Hash: applies to all rows with the same row hash

Lock types are automatically applied based on the SQL command:
- SELECT: applies a Read lock
- UPDATE: applies a Write lock
- CREATE TABLE: applies an Exclusive lock

10. When is ACCESS lock used?

Access locks are used for quick access to tables in a multi-user environment, even if other requests are updating the data. They also have minimal effect on locking out others: when you use an access lock, virtually all requests are compatible with yours.
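The compatibility rules stated above for the four lock types can be written down as a small lookup table. The Python sketch below encodes only what these notes say (Exclusive locks everyone out, Read coexists with Read and Access, Write coexists only with Access); the names `COMPATIBLE`, `DEFAULT_LOCK` and `can_grant` are hypothetical and are not part of any Teradata interface.

```python
# For each lock already held, the set of lock types that may still be granted.
COMPATIBLE = {
    "ACCESS": {"ACCESS", "READ", "WRITE"},
    "READ": {"ACCESS", "READ"},
    "WRITE": {"ACCESS"},
    "EXCLUSIVE": set(),
}

# Lock type applied automatically for each SQL command, as listed in the notes.
DEFAULT_LOCK = {
    "SELECT": "READ",
    "UPDATE": "WRITE",
    "CREATE TABLE": "EXCLUSIVE",
}


def can_grant(requested: str, held: list) -> bool:
    """Return True if `requested` is compatible with every lock in `held`."""
    return all(requested in COMPATIBLE[lock] for lock in held)


if __name__ == "__main__":
    print(can_grant("READ", ["READ"]))      # concurrent readers
    print(can_grant("WRITE", ["READ"]))     # writer must wait for readers
    print(can_grant("ACCESS", ["WRITE"]))   # "stale read" alongside a writer
```

Reading the table row by row shows why an Access lock has so little effect on others: it is compatible with everything except an Exclusive lock.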

11. How to set default database?

Setting the default database: The user name you log on with is normally your default database. For example, if you log on as

.logon abc; password:xyz

then abc is normally the default database. Queries you make that do not specify a database name will be made against your default database.

Changing the default database: The DATABASE command is used to change the default database. For example:

DATABASE birla;

sets your default database to birla, and subsequent queries are made against the birla database.

12. What is a cluster?

A cluster is a group of AMPs that act as a single fallback unit. Clustering has no effect on primary row distribution of the table, but the fallback row copy will always go to another AMP in the same cluster. Should an AMP fail, the primary and fallback row copies stored on that AMP cannot be accessed. However, their alternate copies are available through the other AMPs in the same cluster. The loss of an AMP in one cluster has no effect upon other clusters. It is possible to lose one AMP in each cluster and still have full access to all fallback-protected table data. If there are two AMP failures in the same cluster, the entire Teradata system halts. While an AMP is down, the remaining AMPs in the cluster must do their own work plus the work of the down AMP. The example shows an 8-AMP system set up in two clusters of 4 AMPs each.
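The cluster rule above (one failed AMP per cluster is survivable, two in the same cluster is not) can be checked with a short Python sketch. It models the 8-AMP, two-cluster layout mentioned in the answer; the AMP numbering and the names `CLUSTERS` and `fallback_data_available` are assumptions made for illustration.

```python
# Hypothetical layout: 8 AMPs in two clusters of 4 AMPs each.
CLUSTERS = [{0, 1, 2, 3}, {4, 5, 6, 7}]


def fallback_data_available(failed_amps, clusters=CLUSTERS) -> bool:
    """True while no cluster has lost more than one AMP.

    With at most one failure per cluster, every row on a failed AMP still
    has its alternate copy on another AMP of the same cluster.
    """
    failed = set(failed_amps)
    return all(len(cluster & failed) <= 1 for cluster in clusters)


if __name__ == "__main__":
    print(fallback_data_available([0, 4]))   # one AMP down in each cluster
    print(fallback_data_available([0, 1]))   # two AMPs down in one cluster
```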

13. What are the connections involved in Channel attached system?

In channel-attached systems, there are three major software components, which play important roles in getting the requests to and from the Teradata RDBMS.

The client application is either written by a programmer or is one of Teradata's provided utility programs. Many client applications are written as "front ends" for SQL submission, but they are also written for file maintenance and report generation. Any client-supported language may be used provided it can interface to the Call Level Interface (CLI).

The Call Level Interface (CLI) is the lowest-level interface to the Teradata RDBMS. It consists of system calls which create sessions, allocate request and response buffers, create and de-block "parcels" of information, and fetch response information to the requesting client.

The Teradata Director Program (TDP) is a Teradata-supplied program that must run on any client system that will be channel-attached to the Teradata RDBMS. The TDP manages the session traffic between the Call-Level Interface and the RDBMS. Its functions include session initiation and termination, logging, verification, recovery, and restart, as well as physical input to and output from the PEs (including session balancing) and the maintenance of queues. The TDP may also handle system security.

The Host Channel Adapter is a mainframe hardware component that allows the mainframe to connect to an ESCON or Bus/Tag channel. The PBSA (PCI Bus ESCON Adapter) is a PCI adapter card that allows a WorldMark server to connect to an ESCON channel. The PBCA (PCI Bus Channel Adapter) is a PCI adapter card that allows a WorldMark server to connect to a Bus/Tag channel.

14. What are the connections involved in Network attached system?

In network-attached systems, there are four major software components that play important roles in getting the requests to and from the Teradata RDBMS.

The Call Level Interface (CLI) is a library of routines that resides on the client side. Client application programs use these routines to perform operations such as logging on and off, submitting SQL queries and receiving responses which contain the answer set. These routines are 98% the same in a network-attached environment as they are in a channel-attached one.

The Teradata ODBC (Open Database Connectivity) driver uses an open standards-based ODBC interface to provide client applications access to Teradata across LAN-based environments. NCR has ODBC drivers for both UNIX and Windows-based applications.

The Micro Teradata Director Program (MTDP) is a Teradata-supplied program that must be linked to any application that will be network-attached to the Teradata RDBMS. The MTDP performs many of the functions of the channel-based TDP, including session management. The MTDP does not control session balancing across PEs. Connect and Assign Servers that run on the Teradata system handle this activity.

The Micro Operating System Interface (MOSI) is a library of routines providing operating system independence for clients accessing the RDBMS. By using MOSI, we only need one version of the MTDP to run on all network-attached platforms.

15. How do you replace a null value with a default value while loading?

Using the COALESCE function. Syntax:

COALESCE(COL, 'DEFAULT')

16. What is COMPRESS?

By default, COMPRESS compresses null values. To compress any other values, we need to list those characters or values explicitly.

17. How many values can we compress in Teradata?

Any column can be compressed except an indexed column, and the table must be non-volatile.

18. Difference between volatile and global temporary table?

Global Temporary Tables (GTT):
1. When they are created, the definition goes into the Data Dictionary.
2. When materialized, the data goes into temp space.
3. That is why the data is active until the session ends, while the definition remains until it is dropped using a DROP TABLE statement. If dropped from some other session, it should be DROP TABLE ALL;
4. You can collect stats on a GTT.

Volatile Temporary Tables (VTT):
1. The table definition is stored in the system cache.
2. The data is stored in spool space.
3. That is why the data and the table definition are both active only until the session ends.
4. No collect stats for a VTT.

19. Difference between PK and PI?

Primary Key:
- A relational concept used to determine relationships among entities and to define referential constraints.
- Not required, unless referential integrity checks are to be performed.
- Defined by the CREATE TABLE statement.
- Unique.
- Identifies a row uniquely.
- Value cannot be changed.
- Cannot be null.
- Not related to access path.

Primary Index:
- Used to store rows on disk.
- Defined by the CREATE TABLE statement.
- Unique or non-unique.
- Used to distribute rows.
- Values can be changed.
- Can be null.
- Related to access path.

20. What is multiple statement processing?

Multiple statement processing increases performance when loading into large tables. All statements are sent to the parser simultaneously. All statements are executed in parallel.

21. What is TDPID?

TDPID is the IP address of the Teradata server machine.

22. What is tenacity?

Specifies the number of hours that Teradata FLOAD continues trying to log on when the maximum number of load jobs is already running on the Teradata database.

23. What is Sleep?

Specifies the number of minutes that Teradata FLOAD pauses before retrying a logon operation.

24. What is database skewing?

Skew occurs when the primary index column selected is not a good candidate. That is, if the PI selected for a table has highly non-unique values, the rows are distributed unevenly and the skew factor rises. Ideally the skew factor is zero; a skew factor greater than 25 is not a good sign.

25. What is soft Referential Integrity and Batch Referential Integrity?

Soft Referential Integrity: It provides a mechanism to allow user-specified Referential Integrity (RI) constraints that are not enforced by the database. It enables optimization techniques such as Join Elimination.

Batch Referential Integrity: Tests an entire insert, delete, or update batch operation for referential integrity. If insertion, deletion, or update of any row in the batch violates referential integrity, then the parsing engine software rolls back the entire batch and returns an abort message.

26. Difference between MLOAD & FLOAD

MLOAD:

It does the loading in 5 phases:
- Phase 1: It gets the import file and checks the script.
- Phase 2: It reads the records from the base table and stores them in the work table.
- Phase 3: In this Application phase it locks the table header.
- Phase 4: The DML operations are done on the tables.
- Phase 5: The table locks are released and the work tables are dropped.

MultiLoad allows non-unique secondary indexes and automatically rebuilds them after loading. MultiLoad can load at most 5 tables at a time and can also update and delete data.

FastLoad:

FastLoad loads the data in 2 phases and does not need a work table, so it is faster. It follows the steps below to load the data into the table:
- Phase 1: It moves all the records to all the AMPs first, without any hashing.
- Phase 2: After the end loading command is given, each AMP hashes the records and sends them to the appropriate AMPs.

FastLoad is used to load empty tables and is very fast; it can load one table at a time.

27. Advantages of PPI

PPI: Partitioned Primary Index. When the column on which a table is partitioned is also given as the primary index, then, if there are many partitions, it is faster to scan the table, and that with the PI value itself.

28. Disadvantages of PPI

If no partition is declared for the row to be inserted into, then it is a waste to declare the primary index itself. It is better to use a secondary index on the partitioning column for better performance.

29. Teradata joins?

Join Processing: A join is the combination of two or more tables in the same FROM clause of a single SELECT statement. When writing a join, the key is to locate a column in both tables that is from a common domain. Like the correlated subquery, joins are normally based on an equal comparison between the join columns. The following is the original join syntax for a two-table join:

SELECT [<table-name>.]<column-name> [,<table-name>.<column-name> ]
FROM <table-name1> [ AS <alias-name1> ] ,<table-name2> [ AS <alias-name2> ]
[ WHERE [<table-name1>.]<column-name> = [<table-name2>.]<column-name> ]

The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.

Common Join Types in Teradata:
1. Self Join
2. Inner Join
3. Outer Join

The three formats of an OUTER JOIN are:

Left_table LEFT OUTER JOIN Right_table - left table is the outer table
Left_table RIGHT OUTER JOIN Right_table - right table is the outer table
Left_table FULL OUTER JOIN Right_table - both are outer tables

Self Join: A Self Join is simply a join that uses the same table more than once in a single join operation. The first requirement for this type of join is that the table must contain two different columns of the same domain. This may involve de-normalized tables. For instance, if the Employee table contained a column for the manager's employee number and the manager is an employee, these two columns have the same domain. By joining on these two columns in the Employee table, the managers can be joined to the employees.

Example:

SELECT Mgr.Last_name (Title 'Manager Name', format 'X(10)')
      ,Department_name (Title 'For Department ')
FROM Employee_table AS Emp
INNER JOIN Employee_table AS Mgr
  ON Emp.Manager_Emp_ID = Mgr.Employee_Number
INNER JOIN Department_table AS Dept
  ON Emp.Department_number = Dept.Department_number
ORDER BY 2 ;

INNER JOIN: The INNER JOIN keyword returns rows when there is at least one match in both tables.

INNER JOIN Syntax:

SELECT column_name(s)
FROM table_name1
INNER JOIN table_name2
ON table_name1.column_name=table_name2.column_name

LEFT OUTER JOIN: The LEFT OUTER JOIN keyword returns all rows from the left table (table_name1), even if there are no matches in the right table (table_name2).

LEFT OUTER JOIN Syntax:

SELECT column_name(s)
FROM table_name1
LEFT OUTER JOIN table_name2
ON table_name1.column_name=table_name2.column_name

RIGHT OUTER JOIN: The RIGHT OUTER JOIN keyword returns all rows from the right table (table_name2), even if there are no matches in the left table (table_name1).

RIGHT OUTER JOIN Syntax:

SELECT column_name(s)
FROM table_name1
RIGHT OUTER JOIN table_name2
ON table_name1.column_name=table_name2.column_name

FULL OUTER JOIN: The FULL OUTER JOIN keyword returns rows when there is a match in one of the tables.

FULL OUTER JOIN Syntax:

SELECT column_name(s)
FROM table_name1
FULL OUTER JOIN table_name2
ON table_name1.column_name=table_name2.column_name

A FULL OUTER JOIN uses both of the tables as outer tables. The exceptions are returned from both tables, and the missing column values from either table are extended with NULL.
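To make the difference between the join types above concrete, here is a minimal Python sketch that reproduces their row-matching behaviour on in-memory lists. It says nothing about how Teradata executes joins; rows are modelled as hypothetical `(join_key, value)` tuples, `None` stands in for NULL, and all function names and sample data are invented for the example.

```python
def inner_join(left, right):
    """Rows whose join key appears in both tables."""
    return [(k, lv, rv) for k, lv in left for rk, rv in right if k == rk]


def left_outer_join(left, right):
    """All left rows; unmatched ones are extended with None (NULL)."""
    result = []
    for k, lv in left:
        matches = [rv for rk, rv in right if rk == k]
        if matches:
            result.extend((k, lv, rv) for rv in matches)
        else:
            result.append((k, lv, None))
    return result


def right_outer_join(left, right):
    """All right rows; implemented by swapping the roles of the tables."""
    return [(k, lv, rv) for k, rv, lv in left_outer_join(right, left)]


def full_outer_join(left, right):
    """Both tables act as outer tables."""
    result = left_outer_join(left, right)
    left_keys = {k for k, _ in left}
    result.extend((k, None, rv) for k, rv in right if k not in left_keys)
    return result


# Hypothetical sample data keyed by department number.
employees = [(10, "Ann"), (20, "Bob"), (30, "Cy")]
departments = [(10, "Sales"), (20, "HR"), (40, "Ops")]

if __name__ == "__main__":
    print(inner_join(employees, departments))
    print(left_outer_join(employees, departments))
    print(right_outer_join(employees, departments))
    print(full_outer_join(employees, departments))
```

With this sample data, employee 30 has no department and department 40 has no employee, so those two rows are exactly the "exceptions" that the outer joins add on top of the inner join result.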

Product Join: It is very important to use an equality condition in the WHERE clause. Otherwise you get a product join. This means that one row of a table is joined to multiple rows of another table. A mathematical product means that multiplication is used.

30. Difference between Primary index and secondary index?

1. A primary index cannot be created after table creation, whereas a secondary index can be created dynamically.
2. A primary index access is a 1-AMP operation, a unique secondary index access is a 2-AMP operation, and a non-unique secondary index access is an all-AMP operation.

31. What are Journals?

Journaling is a data protection mechanism in Teradata. Journals are generated to maintain pre-images and post-images of a DML transaction starting/ending at/from a checkpoint. When a DML transaction fails, the table is restored back to the last available checkpoint using the journal images.

There are two types of journals: (1) Permanent journal and (2) Transient journal.

The purpose of the permanent journal is to provide selective or full database recovery to a specified point in time. It permits recovery from unexpected hardware or software disasters. The permanent journal also reduces the need for full table backups, which can be costly in both time and resources.

1. Permanent Journal

Permanent journals are explicitly created at database and/or table creation time. This journaling can be implemented depending upon the need and available disk space. PJ processing is a user-selectable option on a database which allows the user to select extra journaling for changes made to a table. There are more options, and the data can be rolled forward or backward (depending on whether you selected the correct options) at points of the customer's choosing. They are permanent because the changes are kept until the customer deletes them or unloads them to a backup tape. They are usually kept in conjunction with backups of the database and allow partial rollback or roll forward for some corrupted data or an operational error, such as someone deleting a month's worth of data because they messed up the WHERE clause.

2. Transient Journal

The transient journal permits the successful rollback of a failed transaction (TX). Transactions are not committed to the database until the AMPs have received an End Transaction request, either implicitly or explicitly. There is always the possibility that the transaction may fail. If so, the participating table(s) must be restored to their pre-transaction state. The transient journal maintains a copy of the before images of all rows affected by the transaction. In the event of transaction failure, the before images are reapplied to the affected tables, then deleted from the journal, and a rollback operation is completed. In the event of transaction success, the before images for the transaction are discarded from the journal at the point of transaction commit. Transient journal activities are automatic and transparent to the user.

32. Teradata FastExport script?

.LOGTABLE RestartLog1_fxp;
.RUN FILE logon ;
.BEGIN EXPORT SESSIONS 4 ;
.LAYOUT Record_Layout ;
.FIELD in_City 1 CHAR(20) ;
.FIELD in_Zip * CHAR(5) ;
.IMPORT INFILE city_zip_infile LAYOUT Record_Layout ;
.EXPORT OUTFILE cust_acct_outfile2 ;
SELECT A.Account_Number
     , C.Last_Name
     , C.First_Name
     , A.Balance_Current
FROM Accounts A
INNER JOIN Accounts_Customer AC
INNER JOIN Customer C
   ON C.Customer_Number = AC.Customer_Number
   ON A.Account_Number = AC.Account_Number
WHERE A.City = :in_City
AND A.Zip_Code = :in_Zip
ORDER BY 1 ;
.END EXPORT ;
.LOGOFF ;

33. Teradata statistics.
Statistics collection is essential for the optimal performance of the Teradata query optimizer. The query optimizer relies on statistics to help it determine the best way to access data. Statistics also help the optimizer ascertain how many rows exist in the tables being queried and predict how many rows will qualify for given conditions. A lack of statistics, or out-of-date statistics, might result in the optimizer choosing a less-than-optimal method for accessing data tables.

Points:
1: Once a collect stats is done on the table (on an index or column), where is this information stored so that the optimizer can refer to it?
Ans: Collected statistics are stored in DBC.TVFields or DBC.Indexes. However, you cannot query these two tables.

2: How often does collect stats have to be run for a table that is frequently updated?
Ans: You need to refresh stats when 5 to 10% of the table's rows have changed. Collecting stats can be quite resource-consuming for large tables, so it is advisable to schedule the job in an off-peak period, normally after approximately 10% of the data has changed.

3: Once a collect stats has been done on the table, how can I be sure that the optimizer is considering it before execution? That is, until the next collect stats is done, will the optimizer refer to it?
Ans: Yes, the optimizer will use the stats for the query execution plan if they are available. That is why stale stats are dangerous: they may mislead the optimizer.

What is a HOT AMP?
When the workload is not distributed across all the AMPs, only a few AMPs end up overburdened with the work. This is a hot AMP condition. It typically occurs when the volume of data you are dealing with is high and
(a) you are trying to retrieve data from a Teradata table that is not well distributed across the AMPs on the system (bad primary index), OR
(b) you are trying to join on a column with highly non-unique values, OR
(c) you apply the DISTINCT operator on a column with highly non-unique values.

4: How can I know the tables for which collect stats has been done?
Ans: Run the HELP STATISTICS command on that table, e.g. HELP STATISTICS TABLE_NAME ; This gives you the date and time when stats were last collected. You will also see the stats for the columns of the table for which stats were defined. You can use Teradata Manager too.

5: To what extent will there be performance issues when a collect stats is not done? Can a performance issue be related only to collect stats? Probably a HOT AMP could be the reason for the lack of spool space that is leading to performance degradation?
Ans, 1st part: Teradata uses a cost-based optimizer, and cost estimates are based on statistics. If you do not have statistics collected, the optimizer uses dynamic AMP sampling to estimate them. If your table is big and the data is unevenly distributed, dynamic sampling may not get the right information and your performance will suffer.
2nd part: No, performance problems can also be related to a bad selection of indexes (most importantly the PI) and to the access path of a particular query.

Q: Also let me know what can lead to a lack of spool space apart from a HOT AMP?
Ans: One reason that comes to mind: a product join on two big data sets may lead to a lack of spool space.

34. Where will you define error tables in the script?

In FastLoad we define them in the BEGIN LOADING statement; in MultiLoad we define them in the .BEGIN MLOAD statement.

35. I have to load data daily. Which load utility will be good?
TPUMP.

36. What are the different SPACES available in Teradata?
Perm space, temp space and spool space.

Perm Space: All databases have a defined upper limit of permanent space. Permanent space is used for storing the data rows of tables. Perm space is not pre-allocated; it represents a maximum limit.

Spool Space: All databases also have an upper limit of spool space. If there is no limit defined for a particular database or user, limits are inherited from parents. Theoretically, a user could use all unallocated space in the system for their query. Spool space is temporary space used to hold intermediate query results or formatted answer sets to queries. Once the query is complete, the spool space is released.
Example: You have a database with a total disk space of 100GB. You have 10GB of user data and an additional 10GB of overhead. What is the maximum amount of spool space available for queries?
Answer: 80GB (100 - 10 - 10). All of the remaining space in the system is available for spool.

Temp Space: The third type of space is temporary space. Temp space is used for global temporary tables (volatile tables are held in spool space), and the contents remain available to the user until the session is terminated. Tables created in temp space will survive a restart.

37. Different options that we can specify in the CREATE TABLE statement?
There are two different table type philosophies, so there are two different types of tables: SET and MULTISET. It has been said, "A man with one watch knows the time, but a man with two watches is never sure". When Teradata was originally designed it did not allow duplicate rows in a table. If any row in the same table had the same values in every column, Teradata would throw one of the rows out. They believed a second row was a mistake. Why would someone need two watches, and why would someone need two rows exactly the same? This is SET theory, and a SET table kicks out duplicate rows.
The ANSI standard believed in a different philosophy. If two rows that are exact duplicates are entered into a table, this is acceptable. If a person wants to wear two watches, they probably have a good reason. This is a MULTISET table, and duplicate rows are allowed. If you do not specify SET or MULTISET, one is used as a default. Here is the issue: the default in Teradata mode is SET and the default in ANSI mode is MULTISET.
Therefore, to eliminate confusion it is important to explicitly define which one is desired. Otherwise, you must know in which mode the CREATE TABLE will execute so that the correct type is used for each table. The implication of using a SET or MULTISET table is discussed further below.

SET and MULTISET Tables
A SET table does not allow duplicate rows, so Teradata checks to ensure that no two rows in a table are exactly the same. This can be a burden. One way around the duplicate row check is to have a column in the table defined as UNIQUE. This could be a Unique Primary Index (UPI), a Unique Secondary Index (USI) or even a column with a UNIQUE or PRIMARY KEY constraint. Since all of these must be unique, a duplicate row can never exist. Therefore, the check on either the index or the constraint eliminates the need for the row to be examined for uniqueness. As a result, inserting new rows can be much faster because the duplicate row check is eliminated.
However, if the table is defined with a NUPI and the table uses SET as the table type, a duplicate row check must be performed. Since SET tables do not allow duplicate rows, a check must be performed every time a NUPI DUP (a duplicate of an existing row's NUPI value) is inserted or updated in the table. Do not be fooled! A duplicate row check can be a very expensive operation in terms of processing time, because every new row inserted must be compared with every existing row that has the same NUPI row hash value. The number of comparisons keeps growing as more rows with the same NUPI value are added to the table.
What is the solution? There are two: either make the table a MULTISET table (only if you want duplicate rows to be possible) or define at least one column or composite of columns as UNIQUE. If neither is an option, the SET table with no unique columns will work, but inserts and updates will take more time because of the mandatory duplicate row check.
Below is an example of creating a SET table:

CREATE SET TABLE TomC.employee
( emp INTEGER
 ,dept INTEGER
 ,lname CHAR(20)
 ,fname VARCHAR(20)
 ,salary DECIMAL(10,2)
 ,hire_date DATE )
UNIQUE PRIMARY INDEX(emp);

Notice the UNIQUE PRIMARY INDEX on the column emp. Because this is a SET table, it is much more efficient to have at least one unique key so the duplicate row check is eliminated.
The following is an example of creating the same table as before, but this time as a MULTISET table:

CREATE MULTISET TABLE employee
( emp INTEGER
 ,dept INTEGER
 ,lname CHAR(20)
 ,fname VARCHAR(20)
 ,salary DECIMAL(10,2)
 ,hire_date DATE )
PRIMARY INDEX(emp);

Notice also that the PI is now a NUPI because it does not use the word UNIQUE. This is important! As mentioned previously, if a UPI is requested, no duplicate rows can be inserted, so the table acts more like a SET table. This MULTISET example allows duplicate rows, and because it is MULTISET its inserts do not pay for the duplicate row check.

38. What is a macro? Advantages of it.
Macros: A macro is a predefined, stored set of one or more SQL commands and report-formatting commands. Macros are used to simplify the execution of frequently used SQL commands. Macros do not require permanent space.

39. What are the functions of AMPs in Teradata?
Each AMP is designed to hold a portion of the rows of each table. An AMP is responsible for the storage, maintenance and retrieval of the data under its control. Teradata uses hash partitioning to randomly and evenly distribute data across all AMPs for balanced performance.

40. How does Teradata store rows?
- Teradata uses hash partitioning and distribution to randomly and evenly distribute data across all AMPs.
- The rows of every table are distributed among all AMPs, and ideally will be evenly distributed among all AMPs.
- Each AMP is responsible for a subset of the rows of each table.
- Evenly distributed tables result in evenly distributed workloads.

Fallback & Down AMP recovery journal???
Hi,
When a Fallback-protected AMP goes down during a write operation, the update takes place in the Fallback AMP in the same cluster, to be applied later to the original AMP when it recovers. When an AMP goes down, the updates are also recorded in the Down AMP Recovery Journal, to be applied later when the AMP recovers. My doubt is: when an AMP goes down, are the updates made in both the Fallback AMP and the Down AMP Recovery Journal? Because if yes, it looks like a redundant recovery measure. Or is the Down AMP Recovery Journal used only for non-Fallback-protected AMPs, or for Fallback-protected AMPs when both AMPs in the cluster are down?
Regards,
Annal T

Hi Annal,
According to my knowledge:
1. The Down AMP Recovery Journal starts when an AMP goes down, in order to restore the data for the down AMP.
2. Fallback is redundant data: if one AMP in the cluster goes down, it won't affect your queries. The query will use the data from the fallback rows. The down AMP won't be updated; the data from fallback is used.
For your doubt: if you run the update while the AMP is down, the fallback rows will be updated. If the AMP is still down when you run the query, the query will use the updated rows. When the down AMP becomes active again, it will use the Down AMP Recovery Journal and its data will be updated.
Hope this helps.
Regards,
Syam Prasad
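Fallback in this exchange is a table-level protection option. As a minimal sketch, assuming a hypothetical database sandbox and a made-up table, it can be requested when the table is created:

```sql
-- Sketch only: sandbox.orders_fb and its columns are hypothetical names.
CREATE TABLE sandbox.orders_fb ,FALLBACK     -- keep a second copy of each row on another AMP in the cluster
( order_id    INTEGER NOT NULL
 ,order_total DECIMAL(10,2)
)
UNIQUE PRIMARY INDEX (order_id);
```

With this option, queries can be answered from the fallback rows while one AMP in the cluster is down, which is the situation the reply describes.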

41. Which one will take care when an AMP goes down?
1. The Down AMP Recovery Journal starts when an AMP goes down, in order to restore the data for the down AMP.
2. Fallback is redundant data: if one AMP in the cluster goes down, it won't affect your queries. The query will use the data from the fallback rows. The down AMP won't be updated; the data from fallback is used.
If you run an update while the AMP is down, the fallback rows will be updated. If the AMP is still down when you run the query, the query will use the updated rows. When the down AMP becomes active again, it will use the Down AMP Recovery Journal and its data will be updated.

42. Which one will take care when a NODE goes down?
In the event of node failure, all virtual processors can migrate to another available node in the clique. All nodes in the clique must have access to the same disk arrays.

43. What is the use of the EXPLAIN plan?
The EXPLAIN facility allows you to preview how Teradata will execute a requested query. It returns a summary of the steps the Teradata RDBMS would perform to execute the request. EXPLAIN also discloses the strategy and access method to be used, how many rows will be involved, and the estimated cost in minutes and seconds. Use EXPLAIN to evaluate a query's performance and to develop an alternative processing strategy that may be more efficient. EXPLAIN works on any SQL request. The request is fully parsed and optimized, but not run. The complete plan is returned to the user in readable English statements. EXPLAIN provides information about locking, sorting, row selection criteria, join strategy and conditions, access method, and parallel step processing. EXPLAIN is useful for performance tuning, debugging, pre-validation of requests, and technical training.

44. Use of the COALESCE function?
The newer ANSI standard COALESCE can also convert a NULL to a zero. However, it can convert a NULL value to any data value as well.
COALESCE searches a value list, ranging from one to many values, and returns the first non-NULL value it finds. It returns a NULL if all values in the list are NULL. To use COALESCE, the SQL must pass the name of a column to the function. The data in the column is then tested for a NULL. Although one column name is all that is required, normally more than one column is passed to it. Additionally, a literal value, which is never NULL, can be returned to provide a default value if all of the previous column values are NULL.
The syntax for COALESCE follows:

SELECT COALESCE (<column-list> [,<literal>] )
      ,<Aggregate>( COALESCE(<column-list>[,<literal>] ) )
FROM <table-name>
GROUP BY 1 ;

In the above syntax the <column-list> is a list of columns. It is written as a series of column names separated by commas.

SELECT COALESCE(NULL,0) AS Col1
      ,COALESCE(NULL,NULL,NULL) AS Col2
      ,COALESCE(3) AS Col3
      ,COALESCE('A',3) AS Col4 ;

45. Diff between role, privilege and profile?
A role can be assigned a collection of access rights in the same way a user can.

You then grant the role to a set of users, rather than granting each user the same rights. This cuts down on maintenance, adds standardisation (hence reducing erroneous access to sensitive data) and reduces the size of the dbc.allrights table, which is very important in reducing DBC blocking in a large environment.
Profiles assign different characteristics to a user, such as spool space, perm space and account strings. Again this helps with standardisation. Note that spool assigned to a profile will overrule spool assigned on a CREATE USER statement. Check the online manuals for the full list of properties.
Data Control Language is used to restrict or permit a user's access. It can selectively limit a user's ability to retrieve, add, or modify data. It is used to grant and revoke access privileges on tables and views.

46. Diff between database and user?
Both may own objects such as tables, views, macros, procedures, and functions. Both users and databases may hold privileges. However, only users may log on, establish a session with the Teradata Database, and submit requests.
A user performs actions, whereas a database is passive. Users have passwords and startup strings; databases do not. Users can log on to the Teradata Database, establish sessions, and submit SQL statements; databases cannot.
Creator privileges are associated only with a user, because only a user can log on and submit a CREATE statement. Implicit privileges are associated with either a database or a user, because each can hold an object and an object is owned by the named space in which it resides.

47. How many MLOAD scripts are required for the below scenario?
First I want to load data from the source to a volatile table. After that I want to load data from the volatile table to a permanent table.

48. What are the types of CASE statements available in Teradata?
The CASE function provides an additional level of data testing after a row is accepted by the WHERE clause.
The additional test allows for multiple comparisons on multiple columns with multiple outcomes. It also incorporates logic to handle a situation in which none of the values compares equal. When using CASE, each row retrieved is evaluated once by every CASE function. Therefore, if two CASE operations are in the same SQL statement, each row has a column checked twice, or two different values each checked one time.
The basic syntax of the CASE follows:

CASE <column-name>
  WHEN <value1> THEN <true-result1>
  WHEN <value2> THEN <true-result2>
  WHEN <valueN> THEN <true-resultN>
  [ ELSE <false-result> ]
END

Types:

1. Flexible Comparisons within CASE
When it is necessary to compare more than just equal conditions within the CASE, the format is modified slightly to handle the comparison. Many people prefer the following format because it is more flexible and can compare inequalities as well as equalities:

CASE
  WHEN <condition-test1> THEN <true-result1>
  WHEN <condition-test2> THEN <true-result2>
  WHEN <condition-testN> THEN <true-resultN>
  [ ELSE <false-result> ]
END

The above syntax shows that multiple tests can be made within each CASE. The value stored in the column continues to be tested until a true condition is found. At that point, the THEN portion is executed and the CASE logic exits by going directly to the END.

2. Comparison Operators within CASE
In this section, we will investigate adding more power to the CASE statement. In the above examples, a literal value was returned. In most cases, it is necessary to return data. The returned value can come from a column name, just like any selected column, or from a mathematical operation. Additionally, the above examples used a literal '=' as the comparison operator. The CASE comparisons also allow the use of IN, BETWEEN, NULLIF and COALESCE. In reality, BETWEEN is a compound comparison: it checks for values that are greater than or equal to the first number and less than or equal to the second number.
The next example uses both formats of the CASE in a single SELECT, with each one producing a column display. It also uses AS to establish an alias after the END:

SELECT CASE WHEN Grade_pt IS NULL THEN 'Grade Point Unknown'
            WHEN Grade_pt IN (1,2,3) THEN 'Integer GPA'
            WHEN Grade_pt BETWEEN 1 AND 2 THEN 'Low Decimal value'
            WHEN Grade_pt < 3.99 THEN 'High Decimal value'
            ELSE '4.0 GPA'
       END AS Grade_Point_Average
      ,CASE Class_code
            WHEN 'FR' THEN 'Freshman'
            WHEN 'SO' THEN 'Sophomore'
            WHEN 'JR' THEN 'Junior'
            WHEN 'SR' THEN 'Senior'
            ELSE 'Unknown Class'
       END AS Class_Description
FROM Student_table
ORDER BY Class_code ;

3. CASE for Horizontal Reporting
Another interesting usage for the CASE is to perform horizontal reporting. Normally, SQL does vertical reporting. This means that every row returned is shown on the next output line of the report as a separate line. Horizontal reporting shows the output of all information requested on one line as columns instead of vertically as rows.
Previously, we discussed aggregation. It eliminates detail data and outputs only one line, or one line per unique value in the non-aggregate column(s) when utilizing GROUP BY. That is how vertical reporting works: one output line below the previous. Horizontal reporting shows the next value on the same line as the next column, instead of on the next line.
Using the next SELECT statement, we achieve the same information in a horizontal reporting format by making each value a column:

SELECT AVG(CASE Class_code WHEN 'FR' THEN Grade_pt ELSE NULL END) (format 'Z.ZZ') AS Freshman_GPA
      ,AVG(CASE Class_code WHEN 'SO' THEN Grade_pt ELSE NULL END) (format 'Z.ZZ') AS Sophomore_GPA
      ,AVG(CASE Class_code WHEN 'JR' THEN Grade_pt ELSE NULL END) (format 'Z.ZZ') AS Junior_GPA
      ,AVG(CASE Class_code WHEN 'SR' THEN Grade_pt ELSE NULL END) (format 'Z.ZZ') AS Senior_GPA
FROM Student_Table
WHERE Class_code IS NOT NULL ;
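For contrast, the vertical form of the same report would look roughly like the query below. It is a sketch that reuses the Student_Table, Class_code and Grade_pt names from the example above; the Avg_GPA alias is an assumption.

```sql
-- Vertical reporting: one output row per class code instead of one column per class.
SELECT   Class_code
        ,AVG(Grade_pt) (FORMAT 'Z.ZZ') AS Avg_GPA   -- Avg_GPA is a made-up alias
FROM     Student_Table
WHERE    Class_code IS NOT NULL
GROUP BY 1
ORDER BY 1 ;
```

Each class average that appears here on its own line is what the horizontal version turns into a separate column on a single line.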

4. Nested CASE Expressions
After becoming comfortable with the previous examples of the CASE, it may become apparent that a single check on a column is not sufficient for more complicated requests. When that is the situation, one CASE can be embedded within another. This is called a nested CASE statement. The CASE may be nested to check data in a second column in a second CASE before determining what value to return. It is common to have more than one CASE in a single SQL statement; the CASE is also powerful enough to allow a CASE statement within a CASE statement. Example:

SELECT Last_name
      ,CASE Class_code
            WHEN 'JR' THEN 'Junior ' ||(CASE WHEN Grade_pt < 2 THEN 'Failing'
                                             WHEN Grade_pt < 3.5 THEN 'Passing'
                                             ELSE 'Exceeding'
                                        END)
            ELSE 'Senior ' ||(CASE WHEN Grade_pt < 2 THEN 'Failing'
                                   WHEN Grade_pt < 3.5 THEN 'Passing'
                                   ELSE 'Exceeding'
                              END)
       END AS Current_Status
FROM Student_Table
WHERE Class_code IN ('JR','SR')
ORDER BY class_code, last_name;

49. How will you ALTER a table in Teradata?
ALTER TABLE (ADD) syntax:
ALTER TABLE employee ADD Street VARCHAR(30) ,ADD City VARCHAR(20);
ALTER TABLE (DROP) syntax:
ALTER TABLE employee DROP Phone ,DROP Pref;
ALTER TABLE (RENAME) syntax:
ALTER TABLE employee RENAME Street TO StreetAddr;

50. Mention the order of SQL execution.
The clauses are written in the order SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY. Logically they are evaluated in the order FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY.

51. What is the SQL to find the base AMP and the number of records stored for a particular table?

52. When a PI is not mentioned on a table, how will Teradata choose the PI for that table?
If you don't specify a PI at table create time, Teradata must choose one. For instance, if the DDL is ported from another database that uses a primary key instead of a primary index, the CREATE TABLE contains a PRIMARY KEY (PK) constraint. Teradata is smart enough to know that primary keys must be unique and cannot be null. So, the first level of default is to use the PRIMARY KEY column(s) as a UPI.
If the DDL defines no PRIMARY KEY, Teradata looks for a column defined as UNIQUE. As a second-level default, Teradata uses the first column defined with a UNIQUE constraint as a UPI.
If none of the above attributes are found, Teradata uses the first column defined in the table as a NON-UNIQUE PRIMARY INDEX (NUPI).

53. What is a covered query in Teradata?
If a SELECT query references only columns that are defined in the JOIN INDEX as join columns, the query is called a COVERED query. The columns of a multi-column USI can also be used to cover a query.

54. What is NUSI bit mapping?

55. What are data demographics?
Data demographics give us information related to frequently updated columns. The data demographics are:
- maximum rows per value
- typical rows per value
- distinct values

56. Diff between logical and physical data modeling?
Logical Versus Physical Database Modeling
After all business requirements have been gathered for a proposed database, they must be modeled. Models are created to visually represent the proposed database so that business requirements can easily be associated with database objects, to ensure that all requirements have been completely and accurately gathered. Different types of diagrams are typically produced to illustrate the business processes, rules, entities, and organizational units that have been identified. These diagrams often include entity relationship diagrams, process flow diagrams, and server model diagrams. An entity relationship diagram (ERD) represents the entities, or groups of information, and their relationships maintained for a business. Process flow diagrams represent business processes and the flow of data between the different processes and entities that have been defined. Server model diagrams represent a detailed picture of the database as transformed from the business model into a relational database with tables, columns, and constraints. Basically, data modeling serves as a link between business needs and system requirements.
Two types of data modeling are as follows:
- Logical modeling
- Physical modeling

If you are going to be working with databases, it is important to understand the difference between logical and physical modeling, and how they relate to one another. Logical and physical modeling are described in more detail in the following subsections.

Logical Modeling
Logical modeling deals with gathering business requirements and converting those requirements into a model. The logical model revolves around the needs of the business, not the database, although the needs of the business are used to establish the needs of the database. Logical modeling involves gathering information about business processes, business entities (categories of data), and organizational units. After this information is gathered, diagrams and reports are produced, including entity relationship diagrams, business process diagrams, and eventually process flow diagrams. The diagrams produced should show the processes and data that exist, as well as the relationships between business processes and data. Logical modeling should accurately render a visual representation of the activities and data relevant to a particular business.
The diagrams and documentation generated during logical modeling are used to determine whether the requirements of the business have been completely gathered. Management, developers, and end users alike review these diagrams and documentation to determine whether more work is required before physical modeling commences.
Typical deliverables of logical modeling include:

Entity relationship diagrams
An Entity Relationship Diagram is also referred to as an analysis ERD. The point of the initial ERD is to provide the development team with a picture of the different categories of data for the business, as well as how these categories of data are related to one another.

Business process diagrams
The process model illustrates all the parent and child processes that are performed by individuals within a company. The process model gives the development team an idea of how data moves within the organization. Because process models illustrate the activities of individuals in the company, the process model can be used to determine how a database application interface is designed.

User feedback documentation

Physical Modeling
Physical modeling involves the actual design of a database according to the requirements that were established during logical modeling. Logical modeling mainly involves gathering the requirements of the business, with the latter part of logical modeling directed toward the goals and requirements of the database. Physical modeling deals with the conversion of the logical, or business, model into a relational database model. When physical modeling occurs, objects are defined at the schema level. A schema is a group of related objects in a database. A database design effort is normally associated with one schema.
During physical modeling, objects such as tables and columns are created based on entities and attributes that were defined during logical modeling. Constraints are also defined, including primary keys, foreign keys, other unique keys, and check constraints. Views can be created from database tables to summarize data or to simply provide the user with another perspective of certain data. Other objects such as indexes and snapshots can also be defined during physical modeling. Physical modeling is when all the pieces come together to complete the process of defining a database for a business.
Physical modeling is database-software specific, meaning that the objects defined during physical modeling can vary depending on the relational database software being used. For example, most relational database systems differ in the way data types are represented and the way data is stored, although basic data types are conceptually the same among different implementations. Additionally, some database systems have objects that are not available in other database systems.

57. What is a derived table?
Derived tables are always local to a single SQL request. They are built dynamically using an additional SELECT within the query. The rows of the derived table are stored in spool and discarded as soon as the query finishes.
The DD has no knowledge of derived tables. Therefore, no extra privileges are necessary. Its space comes from the user's spool space.
Following is a simple example using a derived table named DT with a column alias called avgsal, whose data value is obtained using the AVG aggregation:

SELECT * FROM (SELECT AVG(salary) FROM Employee_table) DT(avgsal) ;

58. What is the use of WITH CHECK OPTION in Teradata?
In Teradata, the additional key phrase WITH CHECK OPTION indicates that the WHERE clause conditions of the view should be applied during the execution of an INSERT or UPDATE against the view. This is not a concern if views are not used for maintenance activity due to restricted privileges.

59. What is soft referential integrity and batch referential integrity?
Soft RI is just an indication that there is a PK-FK relation between the columns; it is not enforced on the Teradata side. Having it helps in cases such as join processing.
Batch RI:
- Tests an entire insert, delete, or update batch operation for referential integrity.
- If the insertion, deletion, or update of any row in the batch violates referential integrity, the parsing engine software rolls back the entire batch and returns an abort message.

Let's say that I had a table called X with some number of rows and I wanted to insert these rows into table Y (insert into Y select * from X). However, some of the rows violated an RI constraint that table Y had. From reading the manuals, it seemed to me that with standard RI, all of the valid rows would be inserted but the invalid ones would not. But with batch RI (which is "all or nothing") I would expect nothing to get inserted, since it would check for problem rows up front and return an error right away.
If in fact there is no difference except in how Teradata processes things internally (i.e. where it checks for invalid rows), then why would you want to use one over the other? Wouldn't you always want to use batch, since it does the checking up front and saves processing time?

Points:
Let's suppose that we have 3 dimension tables and 1 fact table (as in the example above), and that the join index (or AJI) is based on the 3 dims and the facts (all tables inner joined).
1. With or without referential integrity: if you submit a query that joins dim1, dim2, dim3 and facts, the index can be used.
2. With referential integrity: if you submit a query that joins dim1 and facts, the index can be used, because the optimizer knows that the fact rows reference rows in the other dims (so it knows that the inner join will not throw away those records).
3. Without referential integrity: if you submit a query that joins dim1 and facts, the index cannot be used, because the optimizer does not know whether the fact rows reference rows in the other dims, nor whether the relationship is one-to-many, many-to-one or anything else.

"Hard" referential integrity is the "normal" referential integrity that enforces any RI constraints and ensures that any data loaded into the tables meets the RI rules. You should keep in mind that neither MultiLoad nor FastLoad allows the target table to have foreign key references. TPump does allow this. "Soft" referential integrity is a feature that is more about accessing the data than about loading it. Soft referential integrity does not enforce any RI constraints. However, when you specify soft RI, you are telling the optimizer that the foreign key references do exist. Therefore, it is your job to make sure that is true.
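The three flavours differ only in the REFERENCES clause. The following is a sketch under assumptions: the child table names are hypothetical, and a parent table department with a unique, NOT NULL department_number column is assumed to exist.

```sql
-- Standard ("hard") RI: the constraint is enforced.
CREATE TABLE emp_hard_ri
( employee_number   INTEGER NOT NULL
 ,department_number INTEGER
 ,FOREIGN KEY (department_number) REFERENCES department (department_number)
) UNIQUE PRIMARY INDEX (employee_number);

-- Batch RI: the whole request is tested and succeeds or fails as a unit.
CREATE TABLE emp_batch_ri
( employee_number   INTEGER NOT NULL
 ,department_number INTEGER
 ,FOREIGN KEY (department_number) REFERENCES WITH CHECK OPTION department (department_number)
) UNIQUE PRIMARY INDEX (employee_number);

-- Soft RI: nothing is validated; the optimizer simply trusts the relationship.
CREATE TABLE emp_soft_ri
( employee_number   INTEGER NOT NULL
 ,department_number INTEGER
 ,FOREIGN KEY (department_number) REFERENCES WITH NO CHECK OPTION department (department_number)
) UNIQUE PRIMARY INDEX (employee_number);
```

Since MultiLoad and FastLoad reject targets with foreign key references, the choice of flavour also affects which load utility can be used for the child table.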

Soft Referential Integrity (Soft RI) is a mechanism by which you can tell the optimizer that even though no formal RI constraints have been placed on the table(s), the data in the tables conform to the requirements of RI enforced tables.

This means that the user has ensured the following:

- The PK of the parent table has unique, not null values.
- The FK of the child table contains only values which are contained in the PK column of the parent table.

Soft RI:

- Does not create or maintain reference indexes.
- Does not validate referencing constraints.
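Because Soft RI validates nothing, the two conditions listed for the user have to be checked outside the database before the constraint is declared. A minimal sketch of such a check, written in Python purely for illustration: the function name, the sample key values and the decision to let NULL foreign keys pass are assumptions, not part of Teradata.

```python
def soft_ri_violations(parent_keys, child_keys):
    """Return the problems that would make a Soft RI declaration unsafe."""
    problems = []
    # Condition 1: parent key values are unique and not null.
    if any(key is None for key in parent_keys):
        problems.append("parent key contains NULL")
    if len(set(parent_keys)) != len(parent_keys):
        problems.append("parent key contains duplicates")
    # Condition 2: every child key exists in the parent.
    # Assumption: NULL foreign keys are treated as acceptable.
    parents = set(parent_keys)
    orphans = sorted({k for k in child_keys if k is not None and k not in parents})
    if orphans:
        problems.append(f"orphan child keys: {orphans}")
    return problems


departments = [100, 201, 301]           # hypothetical department_number values
employees_ok = [100, 201, 201, None]    # every non-null value has a parent
employees_bad = [100, 999]              # 999 has no parent row

print(soft_ri_violations(departments, employees_ok))
print(soft_ri_violations(departments, employees_bad))
```

An empty list means the data satisfies both conditions; anything else is a reason not to declare the soft constraint until the data is cleaned.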

By allowing the optimizer to assume that RI constraints are implicitly in force (even though no formal RI is assigned to the table), you enable the optimizer to eliminate join steps in queries such as the one seen previously.

Implementing Soft RI

Soft RI is implemented using slightly different syntax than standard RI. The REFERENCES clause for the column definition will add the key words 'WITH NO CHECK OPTION'.

Examples

Create the employee table with a soft RI reference to the department table.

    CREATE TABLE employee
    ( employee_number INTEGER NOT NULL,
      manager_employee_number INTEGER,
      department_number INTEGER,
      job_code INTEGER,
      last_name CHAR(20) NOT NULL,
      first_name VARCHAR(30) NOT NULL,
      hire_date DATE NOT NULL,
      birthdate DATE NOT NULL,
      salary_amount DECIMAL(10,2) NOT NULL,
      FOREIGN KEY ( department_number )
        REFERENCES WITH NO CHECK OPTION department( department_number ))
    UNIQUE PRIMARY INDEX (employee_number);

The parent table must be created with a unique, not null referenced column. Either of the examples below may be used.

    CREATE TABLE department
    ( department_number INTEGER NOT NULL CONSTRAINT primary_1 PRIMARY KEY
     ,department_name CHAR(30) UPPERCASE NOT NULL UNIQUE
     ,budget_amount DECIMAL(10,2)
     ,manager_employee_number INTEGER);

    CREATE TABLE department
    ( department_number INTEGER NOT NULL
     ,department_name CHAR(30) UPPERCASE NOT NULL UNIQUE
     ,budget_amount DECIMAL(10,2)
     ,manager_employee_number INTEGER)
    UNIQUE PRIMARY INDEX (department_number);

Executing the same query as before, notice the join elimination step takes place just as it did when standard RI was enforced.

Find all employees in valid departments.

    EXPLAIN
    SELECT employee_number, department_number
    FROM employee e, department d
    WHERE e.department_number = d.department_number
    ORDER BY 2,1;

An EXPLAIN of this query produces the following partial result:

    3) We do an all-AMPs RETRIEVE step from [...].e by way of an all-rows scan
       with a condition of ("NOT ([...].e.department_number IS NULL)") into
       Spool 1 (group_amps), which is built locally on the AMPs. Then we do a
       SORT to order Spool 1 by the sort key in spool field1. The size of
       Spool 1 is estimated with no confidence to be [...] rows. The estimated
       time for this step is [...] seconds.

Again, the department table does not need to participate in the join for the same reason as seen in the previous example.

Soft RI Caution: Note that the responsibility for this query to produce accurate results lies with the user. If the table data violates the rules of RI, then the join elimination step can have consequences for the accuracy of the results. It is assumed that the validation of the data for referential integrity takes place external to Teradata, or is enforced on Teradata through other application methods.

CHECKSUM:

Problems in a disk drive or disk array can corrupt data. This kind of corruption is not easy to spot, but queries that touch the corrupted data return wrong answers. It can be found with the SCANDISK and CheckTable utilities. These errors, called disk I/O errors, reduce the availability of the data warehouse.

To guard against them, Teradata provides the Disk I/O Integrity Check. A checksum is used to perform the check at table level. It is a protection technique that lets you choose how much corruption checking is done, and the feature detects and logs disk I/O errors. Teradata provides predefined data integrity check levels: DEFAULT, NONE, LOW, MEDIUM, HIGH and ALL. The checksum can be enabled per table through DDL such as CREATE TABLE; at system level, use the DBSControl utility to set the parameter.

For a more hands-on check, use the SCANDISK and CheckTable utilities. Run CheckTable at level 3 so that it diagnoses entire rows, byte by byte.
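The idea behind a disk I/O integrity check can be illustrated in a few lines. This is only a sketch of the general checksum technique: it uses CRC-32 from Python's standard library as a stand-in and makes no claim about the algorithm or block layout Teradata actually uses. The `write_block` and `read_block` helpers are hypothetical.

```python
import zlib


def write_block(data: bytes) -> dict:
    """Store a block together with a checksum computed at write time."""
    return {"data": data, "checksum": zlib.crc32(data)}


def read_block(block: dict) -> bytes:
    """Recompute the checksum on read; a mismatch signals corruption."""
    if zlib.crc32(block["data"]) != block["checksum"]:
        raise IOError("disk I/O integrity check failed")
    return block["data"]


block = write_block(b"row 1|row 2|row 3")
clean_read = read_block(block)          # passes: nothing changed on "disk"

# Simulate a single corrupted byte in the stored block.
block["data"] = b"row 1|row X|row 3"
try:
    read_block(block)
    detected = False
except IOError:
    detected = True

print(detected)
```

Checking more of each block catches more corruption but costs more CPU per I/O, which is the trade-off the integrity levels expose.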
60. What is an identity column?

The example below is from Teradata V2R5.1: a table with one column (INTEGER data type) that is defined as an identity column. Here's the DDL:

    CREATE SET TABLE test_table, NO FALLBACK,
        NO BEFORE JOURNAL,
        NO AFTER JOURNAL,
        CHECKSUM = DEFAULT
    ( PRIM_REGION_ID INTEGER GENERATED ALWAYS AS IDENTITY
        (START WITH 1
         INCREMENT BY 1
         MINVALUE -2147483647
         MAXVALUE 2147483647
         NO CYCLE),
      PRIM_REGION_CD CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( PRIM_REGION_ID );

Teradata has had a concept of identity columns on its tables since around V2R6.x. These columns differ from Oracle's sequence concept in that the number assigned is not guaranteed to be sequential. The identity column in Teradata is simply used to guarantee row uniqueness. Example:

    CREATE MULTISET TABLE MyTable
    ( ColA INTEGER GENERATED BY DEFAULT AS IDENTITY
        (START WITH 1
         INCREMENT BY 20),
      ColB VARCHAR(20) NOT NULL )
    UNIQUE PRIMARY INDEX pidx (ColA);

Granted, ColA may not be the best primary index for data access or joins with other tables in the data model. It just shows that you could use it as the PI on the table.

61. How to implement UPSERT logic in Teradata using SQL?

We have the MERGE INTO option available in Teradata, which works as UPSERT logic.

Example:

    MERGE INTO dept_table1 AS Target
    USING (SELECT dept_no, dept_name, budget
           FROM dept_table
           WHERE dept_no = 20) Source
    ON (Target.dept_no = 20)
    WHEN MATCHED THEN
      UPDATE SET dept_name = 'Being Renamed'
    WHEN NOT MATCHED THEN
      INSERT (dept_no, dept_name, budget)
      VALUES (Source.dept_no, Source.dept_name, Source.budget);

UPSERT (UPDATE ... ELSE INSERT):

    UPDATE T1 FROM T2 b
    SET last_dt = b.last_dt
    WHERE T1.msisdn = b.msisdn
    ELSE INSERT INTO tp_tmp.sa_telenor_mshare_backup
    VALUES (T2.msisdn, T2.oper_cd, T2.outg, T2.incom, T2.fst_dt,
            T2.last_dt, T2.second_last_dt, T2.third_last_dt)
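The update-else-insert decision behind an UPSERT can be sketched outside SQL. The snippet below is an illustration only: a Python dict keyed by `dept_no` stands in for the target table, and the `upsert` helper and the sample rows are hypothetical.

```python
def upsert(table, key, update_values, insert_row):
    """Update the row for `key` if it exists, otherwise insert a new row."""
    if key in table:
        table[key].update(update_values)    # WHEN MATCHED THEN UPDATE
        return "updated"
    table[key] = dict(insert_row)           # WHEN NOT MATCHED THEN INSERT
    return "inserted"


dept_table = {10: {"dept_name": "Sales", "budget": 500}}
source_row = {"dept_name": "Research", "budget": 900}

first = upsert(dept_table, 20, {"dept_name": "Being Renamed"}, source_row)
second = upsert(dept_table, 20, {"dept_name": "Being Renamed"}, source_row)
print(first, second, dept_table[20])
```

The first call finds no department 20 and inserts the source row; the second call finds it and only changes the name, leaving the budget untouched.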

62. What is SAMPLEID in Teradata?

Multiple sample sets may be generated in a single query if desired. To identify the specific set, a tag called the SAMPLEID is made available for association with each set. The SAMPLEID may be selected, used for ordering, or used as a column in a new table. Since SAMPLEID is a column, it can be used as the sort key.

Get three samples from the department table, one with 25% of the rows, another with 25% and a third with 50%.

    SELECT department_number, sampleid
    FROM department
    SAMPLE .25, .25, .50
    ORDER BY sampleid;

The result lists each sampled department_number next to its SampleId (1, 2 or 3), ordered by SampleId.
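What the sample tag does can be shown with a small simulation. This is a sketch under assumptions: samples are drawn without replacement and do not overlap, fractional sizes are simply truncated, and the eight department numbers are made up so that the fractions divide evenly. `sample_with_ids` is a hypothetical helper, not a Teradata function.

```python
import random


def sample_with_ids(rows, fractions, seed=0):
    """Draw non-overlapping samples and tag each row with a 1-based sample id."""
    rng = random.Random(seed)       # fixed seed only so the run is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    tagged, start = [], 0
    for sample_id, fraction in enumerate(fractions, start=1):
        size = int(len(shuffled) * fraction)
        tagged.extend((row, sample_id) for row in shuffled[start:start + size])
        start += size
    return tagged                   # already ordered by sample id


departments = [100, 201, 301, 302, 401, 402, 403, 501]   # hypothetical rows
result = sample_with_ids(departments, [0.25, 0.25, 0.50])
for department_number, sample_id in result:
    print(department_number, sample_id)
```

With eight rows the three samples hold two, two and four rows, and every row carries the id of the sample it was drawn into.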

63. What other options are available with the SAMPLE function in Teradata?

The SAMPLE function is used to retrieve a random set of rows from a table.

Example 1:

    SELECT * FROM emp SAMPLE 10

Example 2:

    SELECT * FROM tab
    SAMPLE WHEN prod_code = 'AS' THEN 10
           WHEN prod_code = 'CM' THEN 10
           WHEN prod_code = 'DQ' THEN 10
    END

A related forum question on the SAMPLE function:

Hi, I have an order table which has order details along with a product code such as "AS", "BU", "CM", "DQ", "ER", "FA". I want to select a random 10 records for each of the product codes "AS", "CM" and "DQ". Can I use the Teradata SAMPLE feature to achieve this? If yes, how can it be done in a single query, so that I get 30 records, 10 for each of the 3 product codes? Is there a better way to get these results? Thanks, Sam

Reply (dnoeth):

Hi Sam,

    SELECT * FROM tab
    SAMPLE WHEN prod_code = 'AS' THEN 10
           WHEN prod_code = 'CM' THEN 10
           WHEN prod_code = 'DQ' THEN 10
    END

Dieter

RANDOM Function
The RANDOM function may be used to generate a random number within a specified range. RANDOM(lower limit, upper limit) returns a random number between the lower and upper limits, inclusive. Both limits must be specified, otherwise a random number between 0 and approximately 4 billion is generated.

Consider the department table, which consists of nine rows.

    SELECT department_number FROM department;

This returns the nine department numbers.

Example: assign a random number between 1 and 9 to each department.

    SELECT department_number, RANDOM(1,9) FROM department;

This returns each department number with a value between 1 and 9 beside it; several of the values repeat.

Note: it is possible for random numbers to repeat. The RANDOM function is activated for each row processed, thus duplicate random values are possible.
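The per-row behaviour is easy to reproduce. The sketch below uses Python's `random.randint`, which, like RANDOM(1,9), includes both limits; the department numbers are placeholders and the seed is fixed only to make the run repeatable.

```python
import random

rng = random.Random(42)

departments = list(range(1, 10))     # nine placeholder department rows
assigned = [(dept, rng.randint(1, 9)) for dept in departments]
for dept, value in assigned:
    print(dept, value)

# Each row gets an independent draw, so repeats are allowed. With ten draws
# from only nine possible values a repeat is unavoidable.
ten_draws = [rng.randint(1, 9) for _ in range(10)]
has_repeat = len(set(ten_draws)) < len(ten_draws)
print(has_repeat)
```

If unique values per row are needed, RANDOM on its own is the wrong tool; something like a row number or an identity column fits better.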

64. What are the considerations to choose a Primary Index?

The Primary Index determines which AMP stores an individual row of a table. The PI data is converted into the Row Hash using a mathematical hashing formula. The result is used as an offset into the Hash Map to determine the AMP number. Since the PI value determines how the data rows are distributed among the AMPs, requesting a row using the PI value is always the most efficient retrieval mechanism for Teradata.

Points: it determines how data will be distributed, and it is also the most efficient access path.

65. What is the maximum number of roles that can be assigned to a user?

66. Consider MLoad or TPump according to the volume of the data. In what different situations should TPump and MLoad be used?

In general, the more you tend to accumulate your updates into large batches before applying them to your tables, the more likely it is that you'll want to use MLoad. MLoad is more efficient at applying a large number of updates. However, MLoad has certain limitations: it can't update unique secondary indexes or join indexes, it can't fire triggers, and you can't use it on a table with referential integrity defined. Also, MLoad will lock the entire table with a write lock when it's in the APPLY phase (when it's applying the updates).

TPump, on the other hand, is best used if you are applying updates throughout the day in small batches (or using a queue). TPump is not as fast, especially as the update volumes grow. Its advantages are that it doesn't lock the entire table for write, but only locks the specific row-hash values that are being updated, and it only locks them for the duration of the update. Also, since there is no special code inside the DBMS for TPump, it supports all DBMS features (it updates unique secondary indexes and join indexes, fires triggers, etc.).

If you are applying updates on a weekly or daily basis, I would tend to use MultiLoad. As you start to apply updates more frequently throughout the day, you may start to find that TPump is the better option.

67. Why 3rd NF in the Teradata LDM?

Because the Teradata model is in third normal form, you only have to enter data once. That significantly reduces data redundancy and means you don't have to reorganize the entire model every time you want to ask a new business question or add a new data source.

68. How many subject areas are there in the FS-LDM?

10:
1. Party
2. Asset
3. Product
4. Agreement
5. Event
6. Location
7. Campaign
8. Channel
9. Financial Management
10. Internal Organization

69. Explain about MLoad and SI.

MLOAD will not work with unique secondary indexes (non-unique secondary indexes are allowed).

70. PPI and UPI in a table creation statement.

UPI:

    CREATE SET TABLE TomC.employee
    ( emp       INTEGER
     ,dept      INTEGER
     ,lname     CHAR(20)
     ,fname     VARCHAR(20)
     ,salary    DECIMAL(10,2)
     ,hire_date DATE )
    UNIQUE PRIMARY INDEX (emp);

PPI (the interval unit is missing in the source; MONTH is used here as an assumption):

    CREATE TABLE test
    ( column1  SMALLINT,
      column2  DATE FORMAT 'YYYYMMDD',
      loaddate DATE FORMAT 'yyyy-mm-dd')
    UNIQUE PRIMARY INDEX (column2)
    PARTITION BY RANGE_N(column2 BETWEEN DATE '2000-01-01'
                                 AND     DATE '2100-01-01'
                                 EACH INTERVAL '1' MONTH);
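How a range expression like this assigns a row to a partition can be sketched in a few lines. The snippet assumes one partition per month between 2000-01-01 and 2100-01-01 (the interval unit is an assumption here) and a start date that falls on the first of a month. `range_n_month` is a hypothetical helper that mimics the numbering, not Teradata code.

```python
from datetime import date


def range_n_month(value, start, end):
    """Return the 1-based monthly partition number for `value`, or a label
    for values that fall outside every range."""
    if value is None:
        return "UNKNOWN"        # a NULL cannot be placed in any range
    if not (start <= value <= end):
        return "NO RANGE"       # outside BETWEEN start AND end
    return (value.year - start.year) * 12 + (value.month - start.month) + 1


START, END = date(2000, 1, 1), date(2100, 1, 1)

print(range_n_month(date(2000, 1, 15), START, END))
print(range_n_month(date(2001, 1, 1), START, END))
print(range_n_month(date(1999, 12, 31), START, END))
print(range_n_month(None, START, END))
```

In a real table, rows that land in the last two cases are only accepted if NO RANGE and UNKNOWN partitions have been declared in the RANGE_N expression.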
71. What is a value-ordered NUSI?

When we define a value-ordered NUSI on a column, the rows in the secondary index subtable are sorted by the secondary index value. The column should be of integer or date type. This is used for range queries and to avoid full table scans on large tables.

72. Difference between Oracle and Teradata.

Both databases have their advantages and disadvantages, and there are a lot of factors to take into consideration before deciding which is better. If you are talking about OLTP systems, Oracle is far better than Teradata. Oracle is more flexible in terms of programming: you can write packages, procedures and functions. Teradata is useful if you want to generate reports on a very large database, although recent versions of Oracle such as 10g are quite good and contain a lot of features to support data warehouses.

Teradata is an MPP system which can process complex queries very fast. Another advantage is the uniform distribution of data through unique primary indexes without any overhead. Recently we had an evaluation with experts from both Oracle and Teradata for an OLAP system, and they were really impressed with the performance of Teradata over Oracle.

A counter-view: Oracle supports MPP in the form of grid computing. Uniform distribution of data based on the primary key is not much use when accessing a huge amount of data requires a full scan. So far we found Teradata almost equal in performance to Oracle 10g. Based on benchmarks and after consulting different people, we found the following problems with Teradata:
- It is too expensive; you need deep pockets to work with Teradata.
- It has only one type of index, while Oracle has many types of indexes, notably the bitmap index.
- Teradata does not have materialized views. Oracle has materialized views, which decrease I/O bandwidth and make the system more scalable.
- Oracle has a very wide variety of analytic functions for SQL.
- Oracle has 3 types of partitioning, and Oracle 11g adds some new partitioning options.
- Oracle gives the ability to use clusters without having to statically partition data.

Further, these are remarks I found on some Oracle discussion forums:
- The largest databases in the world run on Oracle: http://biz.yahoo.com/prnews/031114/sff029_1.html
- They count a) all disk on the computer, not just database disk, and b) the sum of all databases a customer is using, not individual databases.

Still, we saw that the best database is the one you have the technical resources to work with and especially to tune.

73. What are the DBQL tables?

Database Query Log tables are the tables in the DBC database which store the history of all the operations performed on the tables in the databases. The history can get very large, so these tables should be purged when the data is no longer needed.

74. Different kinds of users in Teradata.

75. Explain RAID 1 and RAID 5.

RAID Protection

There are many forms of disk array protection in Teradata. RAID 1 and RAID 5 are commonly used and will be discussed here. The disk array controllers manage both.

RAID 1 is a disk-mirroring technique. Each physical disk is mirrored elsewhere in the array. This requires the array controllers to write all data to two separate locations, which means data can be read from two locations as well. In the event of a disk failure, the mirror disk becomes the primary disk to the array controller and performance is unchanged. RAID 1 may be configured as RAID 1 + 0, which uses mirrored striping.

RAID 5 is a parity-checking technique. For every three blocks of data (spread over three disks), there is a fourth block on a fourth disk that contains parity information. This allows any one of the four blocks to be reconstructed by using the information on the other three. If two of the disks fail, the rank becomes unavailable. The array controller does the recalculation of the information for the missing block. Recalculation will have some impact on performance, but at a much lower cost in terms of disk space.

76. What is the difference between SAMPLE and TOP?

The sampling function (SAMPLE) permits a SELECT to randomly return rows from a Teradata database table. It allows the request to specify either an absolute number of rows or a percentage of rows to return. Additionally, it provides the ability to return rows from multiple samples.

    SELECT * FROM student_course_table SAMPLE 5;

TOP clause: the TOP clause is used to specify the number of records to return. It can be very useful on large tables with thousands of records, since returning a large number of records can impact performance. Note: not all database systems support the TOP clause.

Example:

    1. SELECT TOP 50 PERCENT * FROM EMP
    2. SELECT TOP 2 * FROM EMP

There is a TOP function in V2R6, but if you want to try it out in V2R5 you need to use an analytical function:

    SELECT * FROM vinod_1 QUALIFY ROW_NUMBER() OVER (ORDER BY empno) <= 5
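The practical difference between the two can be shown side by side. This is an illustrative sketch with made-up EMP rows: `top_n` models TOP combined with an ordering (as in the ROW_NUMBER version), and `sample_n` models SAMPLE as a random pick without replacement. Both helper names are hypothetical.

```python
import random


def top_n(rows, n, key):
    """TOP n over an ordering: always the same first n rows after sorting."""
    return sorted(rows, key=key)[:n]


def sample_n(rows, n, rng):
    """SAMPLE n: n distinct rows picked at random."""
    return rng.sample(rows, n)


emp = [{"empno": number} for number in range(1, 11)]    # made-up EMP rows

first_two = top_n(emp, 2, key=lambda row: row["empno"])
random_two = sample_n(emp, 2, random.Random(7))

print(first_two)
print(random_two)
```

Repeated runs of `top_n` return the same rows, while `sample_n` returns a different pair whenever the random generator is seeded differently.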
77. How to improve the performance of a query?

78. Explain Primary Index and how do we select it?

See question 64; the answer is the same. The Primary Index determines which AMP stores each row, and access by PI value is the most efficient path.

79. What is the difference between Role, Privilege and Profile?

A role can be assigned a collection of access rights in the same way a user can. You then grant the role to a set of users, rather than granting each user the same rights. This cuts down on maintenance, adds standardization (hence reducing erroneous access to sensitive data) and reduces the size of the dbc.allrights table, which is very important in reducing DBC blocking in a large environment.

Profiles assign different characteristics to a user, such as spool space, temporary space and account strings. Again this helps with standardization. Note that spool assigned through a profile will overrule spool assigned on a CREATE USER statement. Check the online manuals for the full list of properties.

Data Control Language is used to restrict or permit a user's access. It can selectively limit a user's ability to retrieve, add, or modify data. It is used to grant and revoke access privileges on tables and views.

80. What are the different spaces in Teradata and how do they differ?

Perm space, temp space and spool space.

Perm Space: all databases have a defined upper limit of permanent space. Permanent space is used for storing the data rows of tables. Perm space is not pre-allocated; it represents a maximum limit.

Spool Space: all databases also have an upper limit of spool space. If there is no limit defined for a particular database or user, limits are inherited from parents. Theoretically, a user could use all unallocated space in the system for their query. Spool space is temporary space used to hold intermediate query results or formatted answer sets to queries. Once the query is complete, the spool space is released.

Example: you have a database with total disk space of 100GB. You have 10GB of user data and an additional 10GB of overhead. What is the maximum amount of spool space available for queries? Answer: 80GB. All of the remaining space in the system is available for spool.

Temp Space: the third type of space is temporary space. Temp space is used for global and volatile temporary tables, and these results remain available to the user until the session is terminated. Tables created in temp space will survive a restart.

81. If your skew factor is going up, what are the remedies?

Skew occurs when the primary index column selected is not a good candidate, meaning that the PI chosen for the table has highly non-unique values. Ideally the skew factor is zero; a skew factor greater than 25 is not a good sign.

82. When, how and why do we use Secondary Indexes?

A secondary index is an alternate path to the data. Secondary indexes are used to improve performance by allowing the user to avoid scanning the entire table during a query. A secondary index is like a primary index in that it allows the user to locate rows. Unlike a primary index, it has no influence on the way rows are distributed among AMPs. Secondary indexes are optional and can be created and dropped dynamically. Secondary indexes require separate subtables, which require extra I/O to maintain the indexes.

83. What is the difference between Primary Key and Primary Index?
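The link between primary index choice and skew (question 81) can be made concrete with a toy model. The sketch below is an assumption-heavy illustration: MD5 stands in for Teradata's row hash, a modulo stands in for the hash map, and the skew formula, 100 × (1 − average AMP load / busiest AMP load), is one commonly quoted definition rather than something stated in these notes. The 24 AMPs match the production figure given under question 89.

```python
import hashlib
from collections import Counter


def amp_for(pi_value, amp_count):
    """Map a primary index value to an AMP number via a hash (illustrative)."""
    digest = hashlib.md5(str(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def skew_factor(pi_values, amp_count):
    """Percentage by which the average AMP load falls short of the busiest AMP."""
    counts = Counter(amp_for(value, amp_count) for value in pi_values)
    loads = [counts.get(amp, 0) for amp in range(amp_count)]
    return 100.0 * (1 - (sum(loads) / amp_count) / max(loads))


unique_pi = range(10_000)                      # e.g. an employee number
lumpy_pi = [i % 3 for i in range(10_000)]      # only three distinct values

print(round(skew_factor(unique_pi, 24), 1))
print(round(skew_factor(lumpy_pi, 24), 1))
```

The near-unique column spreads rows almost evenly, while the three-valued column piles every row onto at most three AMPs. That is the situation a better PI choice is meant to fix.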

84. What is the difference between a database and a user in Teradata? What are the things you can or cannot do in each?

Both may own objects such as tables, views, macros, procedures, and functions. Both users and databases may hold privileges. However, only users may log on, establish a session with the Teradata Database, and submit requests. A user performs actions, whereas a database is passive. Users have passwords and startup strings; databases do not. Users can log on to the Teradata Database, establish sessions, and submit SQL statements; databases cannot.

Creator privileges are associated only with a user, because only a user can log on and submit a CREATE statement. Implicit privileges are associated with either a database or a user, because each can hold an object and an object is owned by the named space in which it resides.

85. What is Checkpoint?

86. When do you use BTEQ? What other software have you used, or can we use, rather than BTEQ?

We use BTEQ when the query operates on a smaller amount of data in a table. It handles any kind of SQL operation, such as SELECT, UPDATE, INSERT and DELETE, and can be used for import, export and reporting purposes. Macros and stored procedures can also be run using BTEQ. The other utilities which can be used instead of BTEQ are FASTLOAD and MLOAD for loading and FASTEXPORT for exporting, but these are used when accessing large amounts of data.

87. How many types of files have you loaded, and what are their differences (Fixed and Variable)?

88. How do you execute your jobs in a Teradata environment?

In a channel environment, i.e. mainframes, the load utilities can be executed through JCL. In a network environment, i.e. from a command prompt, the load scripts can be run with the following command:

    <utility name> <scriptname>

89. What was the environment of your latest project (number of AMPs, nodes, Teradata server number etc.)?

Number of AMPs: production and integration 24, development 12.
Number of nodes: production and integration 4, development 2.

90. What is the process to restart the MultiLoad if it fails?

If MLoad failed in the Acquisition phase, just rerun the job. If MLoad failed in the Application phase:
a) Try to drop the error tables, work tables and log tables, release MLoad if required, and submit the job from .BEGIN IMPORT onwards.
b) If your table is fallback protected, you need to make it non-fallback and use the RELEASE MLOAD ... IN APPLY statement. Then resubmit the job.

In short:
1. Release the MLoad on the target table.
2. Drop all error tables and work tables.
3. Resubmit the MLoad script.

91. How does indexing improve query performance?

Indexing is a way to physically reorganize the records so that some frequently used queries run faster. The index can be used as a pointer into the large table: it helps to locate the required row quickly and return it to the user. Alternatively, frequently used queries need not hit the large table at all, because they can get what they want from the index itself (covered queries).

An index comes with the overhead of maintenance. Teradata maintains its indexes by itself: each time an insert, update or delete is done on the table, the indexes also need to be updated and maintained. Indexes cannot be accessed directly by users. Only the optimizer has access to the index.

92. What is the difference between MultiLoad, FastLoad and TPump?

93. What are the different functions you use in BTEQ (ErrorCode, ErrorLevel, etc.)?

ErrorLevel assigns severity to errors. You can assign an error level (severity) to each error code returned, and decisions can then be based on the error level.

94. What is the difference between ZEROIFNULL and NULLIFZERO?

The ZEROIFNULL function passes zero when the incoming data is NULL. The NULLIFZERO function passes NULL when the incoming data is zero.

95. What is RANGE_N?

RANGE_N is defined on a partitioned primary index to specify the ranges of values of a column that should be assigned to each partition. The number of partitions = the number of ranges specified + NO RANGE + UNKNOWN.
- NO RANGE: for values that do not belong to any range.
- UNKNOWN: for values such as NULLs, where the range cannot be determined.

96. Explain PPI.

Partitioned Primary Indexes are created to divide the table into partitions based on ranges or values as required. The data is first hashed to the AMPs and then stored within each AMP by partition. A retrieval for a single partition or multiple partitions is therefore an all-AMPs scan, but not a full table scan. This is effective for larger tables, especially those partitioned on a date. There is no extra overhead on the system (no special tables are created, etc.).

97. What is casting in Teradata?

It converts the data type. The syntax is similar to DDL:

    CAST('02/03/2009-01:25:11' AS TIMESTAMP FORMAT 'MM/DD/YYYY-HH:MI:SS')

98. What is the difference between UNION and MINUS?

Both are set operators, generally applied to two tables. UNION gives all rows from both tables, eliminating duplicate rows. MINUS gives the records from the first table, excluding the records common to both tables. It is just like EXCEPT in Teradata.

    Operator    Returns
    UNION       All distinct rows selected by either query.
    UNION ALL   All rows selected by either query, including all duplicates.
    INTERSECT   All distinct rows selected by both queries.
    MINUS       All distinct rows selected by the first query but not the second.

UNION Example

The following statement combines the results with the UNION operator, which eliminates duplicate selected rows. This statement shows that you must match datatypes (using the TO_DATE and TO_NUMBER functions) when columns do not exist in one or the other table:

    SELECT part, partnum, to_date(null) date_in FROM orders_list1
    UNION
    SELECT part, to_number(null), date_in FROM orders_list2;

    PART        PARTNUM  DATE_IN
    ----------  -------  --------
    SPARKPLUG   3323165
    SPARKPLUG            10/24/98
    FUEL PUMP   3323162
    FUEL PUMP            12/24/99
    TAILPIPE    1332999
    TAILPIPE             01/01/01
    CRANKSHAFT  9394991
    CRANKSHAFT           09/12/02

    SELECT part FROM orders_list1
    UNION
    SELECT part FROM orders_list2;

    PART
    ----------
    SPARKPLUG
    FUEL PUMP
    TAILPIPE
    CRANKSHAFT

MINUS Example

The following statement combines results with the MINUS operator, which returns only rows returned by the first query but not by the second:

    SELECT part FROM orders_list1
    MINUS
    SELECT part FROM orders_list2;

    PART
    ----------
    SPARKPLUG
    FUEL PUMP
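The four operators map directly onto set arithmetic, which makes their differences easy to check by hand. The snippet below is a sketch with made-up rows that only reuse the part names from the example; it does not reproduce the contents of the orders_list tables.

```python
orders_list1 = ["SPARKPLUG", "FUEL PUMP", "FUEL PUMP", "TAILPIPE"]   # made-up rows
orders_list2 = ["CRANKSHAFT", "TAILPIPE", "TAILPIPE"]

union = sorted(set(orders_list1) | set(orders_list2))       # UNION: duplicates removed
union_all = orders_list1 + orders_list2                     # UNION ALL: duplicates kept
intersect = sorted(set(orders_list1) & set(orders_list2))   # INTERSECT
minus = sorted(set(orders_list1) - set(orders_list2))       # MINUS / EXCEPT

print(union)
print(len(union_all))
print(intersect)
print(minus)
```

Only UNION ALL keeps the repeated FUEL PUMP and TAILPIPE rows; the other three operators work on distinct values.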

99. What is EXPLAIN in Teradata?

Ans: EXPLAIN is a function with which you can find the execution plan of any query in SQL Assistant. To use it, type EXPLAIN before the query and run it, or press F6 after writing the query. It also gives the estimated time, join confidence and memory needed to execute that query. It's advisable to use EXPLAIN before executing any complex query.

100. What will you do if you get low confidence in the explain plan?

When the EXPLAIN plan shows low confidence on a column, we run COLLECT STATISTICS for that particular column. From then on the PE prepares the plan with high confidence.

101. What will you do if you get high confidence in the explain plan?

Then we will run the query without hesitation.

102. I have one SQL query; when I ran the explain plan it showed a product join. What factors will you look into in the query to get a merge join?

Product joins are the only join type that can join two tables without a bind term. The only way to avoid a product join and get a merge join is to supply a connecting term between the tables where the operator of the term is =. (These terms are called BIND TERMS.) I.e. we can add another join condition like X = X.

110. How does indexing improve query performance?

See question 91; the answer is the same.
111. Can we collect stats on a table while the table is being updated?

No.

112. What is a Join Index in TD and how does it work?

Ans: a join index is nothing but a pre-join of 2 or more tables or views which are commonly joined, in order to reduce the joining overhead. Teradata then uses the join index instead of resolving the joins in the participating base tables. Join indexes increase the efficiency and performance of join queries. They can have different primary indexes than the base tables, and they are automatically updated as and when the base rows are updated. They can have repeating values.

There are 3 types of join indexes:
1) Single table join index: here the rows are distributed based on the foreign key hash value of the base table.
2) Multi table join index: joining two tables.
3) Aggregate join index: performing the aggregates, but only SUM and COUNT.

113. I have two tables and the index of one table is defined as a UPI or USI. The second table has any of the indexes UPI, NUPI, USI or NUSI. In this scenario, what type of join strategy will the optimizer use?

Merge join strategy.

114. I have two tables. Most of the time I am joining on the same columns. Which type of join index will improve the performance in this scenario?

Multi table join index.

115. When will you create a PPI and when will you create secondary indexes?

Partitioned Primary Indexes are created to divide the table into partitions based on ranges or values as required. This is effective for larger tables partitioned on date and integer columns. There is no extra overhead on the system (no special tables are created, etc.).

Secondary indexes are created on the table as an alternate way to access data. This is the second fastest method to retrieve data from a table, next to the primary index. Subtables are created.

PPI and secondary indexes do not perform full table scans; they access only a defined set of data on the AMPs.

116. What is optimization and performance tuning, and how does it really work in practical projects? Can I get an example to understand it better?

117. Explain Skew Factor.

See question 81; the answer is the same.

118. When do you choose a primary index and when will you choose a secondary index?

The primary index is chosen at the time of table creation. It helps with data distribution, data retrieval and join operations. Secondary indexes can be created and dropped at any time. They are used as an alternate path to access data other than the primary index.
119. When will you go for a Join Index?

When we have two tables which are joined on the same join condition very frequently, we go for a Join Index.

120. When will you go for a hash index?

a. A hash index organizes the search keys with their associated pointers into a hash file structure.
b. We apply a hash function on a search key to identify a bucket, and store the key and its associated pointers in the bucket (or in overflow buckets).
c. Strictly speaking, hash indices are only secondary index structures, since if a file itself is organized using hashing, there is no need for a separate hash index structure on it.

Scenario-based questions

121. In case of replacement loading, which utility do you prefer: MLoad or FLoad?

FLoad.

122. I have a scenario where I update one column in a table using a flat file as source. At the same time, the same column is getting updated because of another flat file. Which utility will be more applicable in this case?

TPump is better, as it locks at row level.

The table got loaded with wrong data using FastLoad and it failed. The error message shown was: "RDBMS error 2652: Operation not allowed: <db>.<table> is being Loaded." How to release the lock on this table?
When the data got loaded completely and the table is still locked, submit another FastLoad script with the BEGIN LOADING and END LOADING statements alone.

I need to create a delimited file using FastExport. As FastExport does not support delimited format, I have written the following select to get the delimited output:
select trim(col1) || '|' || trim(col2) || '|' || trim(col3) || '|' || ... trim(col50) from table
but the above script prefixes each line with 2 junk characters. How to get the data without the junk characters?

When the FastLoad checkpoint value is <= 60 and > 60, how is that going to matter?
When the checkpoint interval is <= 60, it indicates a time interval in minutes. If the value is more than 60, it is treated as a number of records, not a time.

123. I am loading a delimited flat file with a time format as the following: HH:MM PM/AM. Examples would be:
9:45 AM
10:25 PM
There is no leading zero if the hour is a single-digit value.

Is there any way that I would get the MLoad acquisition phase count in the MLoad script? The MLOAD support environment provides different variables (total ins, upd, del etc.) at the application phase, but not at the acquisition phase. Is there any way other than scanning the log file?
There are various variables available for the same:
SYSAPLYCNT
SYSNOAPLYCNT
SYSRCDCNT
SYSRJCTCNT
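For the time format in question 123, the notes give no answer. One possible approach, offered here only as an assumption, is to normalize the field in a small pre-processing step before the file is handed to the load utility. The function name is hypothetical, and the sketch relies on the default (English) locale for the AM/PM marker.

```python
from datetime import datetime


def normalize_time(raw):
    """Turn 'H:MM AM' or 'HH:MM PM' into a zero-padded 24-hour 'HH:MM:SS'."""
    # %I accepts one- or two-digit hours, so '9:45 AM' parses without padding.
    parsed = datetime.strptime(raw.strip().upper(), "%I:%M %p")
    return parsed.strftime("%H:%M:%S")


morning = normalize_time("9:45 AM")
evening = normalize_time("10:25 PM")
```

The normalized value has a fixed width, which makes it easier to describe in a load script's field layout.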

124. I have this requirement: when an error table gets generated during the MLOAD, I want to send an email. How can I achieve this?
After the MLoad, use a BTEQ script to query for the error table. If it is present, quit with some value, say '99', and use your OS to send the mail when the return code is 99.

I am using the following syntax to log on to Teradata Demo through BTEQ/BTEQWin:
.logon demotdat/dbc,dbc;
and I get the following error:
*** Error: Invalid logon!
*** Total elapsed time was 1 second.
Teradata BTEQ 08.02.00.00 for WIN32. Enter your logon or BTEQ command:
The hosts file shows the following:
127.0.0.1 localhost DemoTDAT DemoTDATcop1

However, when I use .logon demotdat/dbc without specifying the password, it prompts for a password, and when I type in the password I am able to log on.

125. What is the reason?
When we use BTEQ in interactive mode, we can't give the ID and password directly. We have to give the logon ID first and press Enter; only after that do we enter the password.

126. Can we make an MLOAD script fail when the error tables are created? Currently the MLoad script exits with a return code = 0, which means loading is successful even though it is not. It has created some error tables, which indicate that some data has been rejected.
There are various variables to do this operation:
.logoff &SYSUVCNT + &SYSRJCTCNT + &SYSETCNT + &SYSRC;
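The idea behind the .logoff expression in question 126 is that the exit code becomes non-zero as soon as any row lands in an error table. The arithmetic can be sketched as follows; the function names and the sample counts are hypothetical and stand in for the values MultiLoad substitutes for the variables at run time.

```python
def logoff_return_code(uv_count, rejected_count, et_count, sys_rc):
    """Mimic '.logoff &SYSUVCNT + &SYSRJCTCNT + &SYSETCNT + &SYSRC;'."""
    return uv_count + rejected_count + et_count + sys_rc


def load_is_clean(return_code):
    # A calling shell script or scheduler would treat any non-zero code as a failure.
    return return_code == 0


clean_run = logoff_return_code(0, 0, 0, 0)
dirty_run = logoff_return_code(3, 0, 12, 0)   # rows landed in the UV and ET tables
```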

TROUBLE SHOOTING

1) Open batch session failed because of the following error.

WRITER_1_*_1> WRT_8229 Database errors occurred:
FnName: Execute -- [NCR][ODBC Teradata][Teradata Database] Duplicate unique prime key error in CFDW2_DEV_CNTL.CFDW_ECTL_CURRENT_BATCH.
FnName: Execute -- [DataDirect][ODBC lib] Function sequence error

Solution: Whenever you want to open a fresh batch ID, first close the existing batch ID and then open the fresh one.

2) The source is a flat file and I am staging this flat file in Teradata. I found that the leading zeros are being truncated in Teradata. What could be the reason?
Solution: The column data type in Teradata is defined as INTEGER; that is why the leading zeros are truncated. Change the target table data type to VARCHAR. A VARCHAR column won't truncate the leading zeros.

3) Can't determine current batch ID for Data Source 47.
Solution: For any fresh stage load you should open a batch ID for the current data source ID.

4) Unique primary key violation on the CFDW_ECTL_CURRENT_BATCH table.
Solution: In the CFDW_ECTL_CURRENT_BATCH table a unique primary key is defined on the ECTL_DATA_SRCE_ID and ECTL_DATA_SRCE_INST_ID columns. At any point in time you should have only one record for a given combination of ECTL_DATA_SRCE_ID and ECTL_DATA_SRCE_INST_ID.

5) Can't insert a NULL value in a NOT NULL column.
Solution: First find all the NOT NULL columns in the target table, cross-verify them with the corresponding source columns, identify the source column that is delivering the NULL value, and take the necessary action.

6) The source is a flat file and I am staging this flat file in Teradata. I found that the leading zeros are being truncated in Teradata. What could be the reason?
Solution: The column data type in Teradata is defined as INTEGER; that is why the leading zeros are truncated. Change the target table data type to VARCHAR. A VARCHAR column won't truncate the leading zeros.

7) I am passing one record to the target lookup, but the lookup is not returning the matching record. I know that the record is present in the lookup. What action will you take?
Solution: Use LTRIM and RTRIM in the lookup SQL override. This removes the unwanted blank spaces, and the lookup will then find the matching record.

8) I am getting duplicate records for the natural key (ECTL_DATA_SRCE_KEY). What will you do to eliminate duplicate natural keys?
Solution: We concatenate 2, 3 or more source columns and check for duplicate records. If there are no duplicates after concatenating, use those columns to populate the ECTL_DATA_SRCE_KEY column in the target.

9) Accti_id is a NOT NULL column in the AGREEMENT table. You are getting a NULL value from the CFDW_AGREEMENT_XREF lookup. What will you do to eliminate NULL records?
Solution: After the stage load, I will populate the CFDW_AGREEMENT_XREF table (this table basically contains surrogate keys). Once the XREF table is populated, you won't get NULL values in the Accti_id column.

10) Unique primary key violation on the CFDW_ECTL_BATCH_HIST table.
Solution: In the CFDW_ECTL_BATCH_HIST table a unique primary index is defined on the ectl_btch_id column, so there should be only one record per ectl_btch_id value.

11) When will you use the ECTL_PGM_ID column in the target lookup SQL override?
Solution: When you are populating a single target table (the AGREEMENT table) from multiple mappings in the same Informatica folder, use ECTL_PGM_ID in the target lookup SQL override. This eliminates unnecessary record updates.

12) You have defined the primary keys as per the ETL spec, but you are getting duplicate records. How will you handle it?
Solution: In addition to the primary key columns in the spec, I will first add another column (one that is not among the primary key columns in the spec) to the primary key and check for duplicate records. If I don't get any duplicates, I will ask the modeller to add this column to the primary key.

13) In Teradata the error is reported as: "no more room in database".
Solution: I spoke with the DBA to add space for that database.

14) Though the column is available in the target table, when I try to load using MLoad it shows that the column is not available in the table. Why?
Solution: The load was going through a view, and the view had not been refreshed to include the new column; that caused the error message. Refresh the view definition to add the new column.

15) While deleting from the target table, I wanted to delete only some data, but by mistake all the data got deleted from the Development table.
Solution: Add ECTL_DATA_SRCE_ID and PGM_ID to the WHERE clause of the query.

16) While updating the target table, an error message says multiple rows are trying to update a single row.
Solution: There are duplicates in the table matching the WHERE condition of the update query. These duplicate records need to be eliminated.

17) I have a file with a header, data records and a trailer. The data records are comma-delimited, and the header and trailer are fixed width. The header and trailer start with (HDR, TRA). I need to skip the header and trailer while loading the file with MultiLoad. Please help me in this case.
Solution: Code the MLoad utility to consider only the data records, excluding the header and trailer records.
.APPLY label WHERE REC_TD_IN NOT IN ('HDR','TRA');
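The same filtering rule can be tried out on a sample file before the load script is written. The sketch below is only an illustration of the rule "skip records that start with HDR or TRA"; the function name and the sample records are invented for the example and are not part of MultiLoad.

```python
def data_records(lines, skip_prefixes=("HDR", "TRA")):
    """Keep only the data records, dropping header and trailer lines."""
    return [line for line in lines if not line.startswith(skip_prefixes)]


sample = [
    "HDR20240101FILE01",   # fixed-width header (made-up layout)
    "1001,Smith,250.00",   # comma-delimited data records
    "1002,Jones,99.50",
    "TRA0000000002",       # fixed-width trailer (made-up layout)
]
kept = data_records(sample)
```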

****MORE ON JOINS & INDEXES****

- Teradata itself decides whether to use an index or not. If you are not careful, you spend time in table updates to maintain an index that is not used at all (one cannot give the query optimizer hints to use a particular index, though collecting statistics may affect the optimizer's strategy).
- In the MP-RAS environment, look at the script "/etc/gsc/bin/perflook.sh". It provides a system-wide snapshot in a series of files. The GSC uses this data for incident analysis.
- When using an index, make sure that the index condition is met in the subqueries (using IN, nested queries, or derived tables).
- An indication of proper index use is found in the explain log entry: a "ROW HASH MATCH SCAN" across ALL-AMPS.
- If the index is not used, the result of the analysis is a 'FULL TABLE SCAN', where the run time grows as the size of the history table grows.
- Keeping index information up to date is a time- and space-consuming issue. Sometimes Teradata does much better when you (manually) imitate the index by just building it from scratch.
- Keeping a join index might help, but you cannot MultiLoad into a table that is part of a join index. Loading with 'tpump' or pure 'SQL' is OK but does not perform as well. Dropping and re-creating a join index on a big table takes time and space.
- When your Teradata "explain" gives many steps for your query (even without the update of the results) and the actual query is a join of six or more tables

Case example:
We had already given up updating the secondary indexes, because we did not have much use for them. After some trial and error we ended up with a strategy where the actual "purchase frequency analysis" is never made directly against the history table. Instead:
1) There is a one-shot run to build the initial "customer's previous purchase" from the "purchase history". It takes time, but that time is saved later.
2) The purchase frequency is calculated by joining the "latest purchase" with the "customer's previous purchase".
3) When the "latest purchase" rows are inserted into the "purchase history", the "customer's previous purchase" table is dropped and recreated by merging the "customer's previous purchase" with the "latest purchase".
4) Following these steps, the performance is still not very fast (a matter of minutes on our two-node system) for a batch of almost 1,000,000 latest receipts, but it is tolerable now.
(We also tested adding both the previous and latest purchase to the same table, but because its size was on average much bigger than the pure "latest purchase", the self-join was slower in that case.)
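Steps 2 and 3 of the case above can be sketched with two small in-memory tables. This is only an illustration of the strategy, not the SQL that was actually run; the function names, the customer IDs and the dates are invented, and each table is reduced to a mapping from customer to purchase date.

```python
from datetime import date


def purchase_intervals(previous, latest):
    """Step 2: join the latest purchases with each customer's previous purchase."""
    return {
        customer: (when - previous[customer]).days
        for customer, when in latest.items()
        if customer in previous
    }


def merge_previous(previous, latest):
    """Step 3: rebuild the 'previous purchase' table by merging in the latest batch."""
    merged = dict(previous)
    for customer, when in latest.items():
        if customer not in merged or when > merged[customer]:
            merged[customer] = when
    return merged


previous = {"c1": date(2024, 1, 1)}
latest = {"c1": date(2024, 1, 11), "c2": date(2024, 1, 5)}

intervals = purchase_intervals(previous, latest)   # only customers seen before
previous = merge_previous(previous, latest)        # ready for the next batch
```

The point of the design is that the join only ever touches the small "latest" batch and the one-row-per-customer "previous" table, never the full history.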
;;;;;;;;;

MANAGING CONCURRENT WORKLOADS


Integrated e-commerce efforts present many warehouse challenges. Here's how Teradata can help.

The word e-commerce means many things to many people. Although for some it connotes only the Web, the real value of e-commerce can only be realized when all channels of a business are integrated and have full access to all customer information and transactions. In fact, to me, e-commerce means using the rich technology available today to bring added value to the customer and additional value to the business through all customer interaction channels. Under this definition of e-commerce, an active warehouse is at the epicenter, providing the storage and access for decision making in the e-commerce world.

As more and more companies adopt active warehousing for this purpose, data warehouse workloads are expanding and changing. If your warehouse relies on a Teradata DBMS, you'll find that handling the challenge of high-volume, widely varying, disparate service-level workloads is one of its core competencies.

One of the biggest concerns I hear from customers is how to deal with the quickly rising number of concurrent queries and concurrent users that can result from active warehousing and e-commerce initiatives. Expected service levels vary widely among different groups of users, as do query types.

And, of course, the entire workload must scale upward linearly as the demand increases, ideally with a minimum of effort required from users and systems staff. Here's a look at some of the most frequent questions I receive on the subject of mixed workloads and concurrency requirements.

How do I balance the work coming in across all nodes of my Teradata configuration?
You don't. Teradata automatically balances sessions across all nodes to evenly distribute work across the entire parallel configuration. Users connect to the system as a whole rather than a specific node, and the system uses a balancing algorithm to assign their sessions to a node. Balancing requires no effort from users or system administrators.

Does Teradata balance the work queries cause?
The even distribution of data is the key to parallelism and scalability in Teradata. Each query request is sent to all units of parallelism, each of which has an even portion of the data to process, resulting in even work distribution across the entire system. For short queries and update flow typical of Web interactions, the optimizer recognizes that only a single unit of parallelism is needed. A query coordinator routes the work to the unit of parallelism needed to process the request. The hashing algorithm does not cluster related data, but spreads it out across the entire system. For example, this month's data and even today's data is evenly distributed across all units of parallelism, which means the work to update or look at that data is evenly distributed.

Will many concurrent requests cause bottlenecks in query coordination?
Query coordination is carried out by a fully parallel parsing engine (PE) component. Usually, one or more PEs are present on each node. Each PE handles the requests for a set of sessions, and sessions are spread evenly across all configured PEs. Each PE is multithreaded, so it can handle many requests concurrently. And each PE is independent of the others with no required cross-coordination. The number of users logged on and requests in flight are limited only by the number of PEs in the configuration.
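The even, hash-based spreading of rows across units of parallelism described above can be illustrated with a toy model. This is a sketch under stated assumptions: MD5 is used only as a convenient stand-in and is not Teradata's hashing algorithm, and the function names, key format and AMP count are made up.

```python
import hashlib
from collections import Counter


def amp_for_key(key, amp_count):
    """Map a primary-index value to one unit of parallelism (AMP)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def rows_per_amp(keys, amp_count):
    """Count how many rows each AMP would own."""
    return Counter(amp_for_key(key, amp_count) for key in keys)


# Sequential order numbers (think: today's data) still scatter across all AMPs.
keys = [f"order-{n}" for n in range(10000)]
distribution = rows_per_amp(keys, amp_count=8)
```

Because the hash ignores any ordering in the keys, consecutive values do not pile up on one AMP, which is the property the article relies on for evenly distributed work.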

How do you avoid bottlenecks when the query coordinator must retrieve information from the data dictionary?
In Teradata, the DBMS itself manages the data dictionary. Each dictionary table is simply a relational table, parallelized across all nodes. The same query engine that manages user workloads also manages the dictionary access, using all nodes for processing dictionary information to spread the load and avoid bottlenecks. The PE even caches recently used dictionary information in memory. Because each PE has its own cache, there is no coordination overhead. The cache for each PE learns the dictionary information most likely to be needed by the sessions assigned to it.

With a large volume of work, how can all requests execute at once?
As in any computer system, the total number of items that can execute at the same time is always limited to the number of CPUs available. Teradata uses the scheduling services Unix and NT provide to handle all the threads of execution running concurrently. Some requests might also exist on other queues inside the system, waiting for I/O from the disk or a message from the BYNET, for example. Each work item runs in a thread; each thread gets a turn at the CPU until it needs to wait for some external event or until it completes the current work. Teradata configures several units of parallelism in each SMP node. Each unit of parallelism contains many threads of execution that aren't restricted to a particular CPU; therefore, every thread gets to compete equally for the CPUs in the SMP node.

There is a limit, of course, to the number of pieces of work that can actually have a thread allocated in a unit of parallelism. Once that limit is reached, Teradata queues work for the threads. Each thread is context free, which means that it is not assigned to any session, transaction, or request. Therefore, each thread is free to work on whatever is next on the queue. The unit of work on the queue is a processing step for a request. Combining the queuing of steps with context-free threads allows Teradata to share the processing service equally across all the concurrent requests in the system. From the users' point of view, all the requests in the system are running, receiving service, and sharing system resources.

How does Teradata avoid resource contention and the resulting performance and management problems?
Teradata algorithms are very resource efficient. Other DBMSs optimize for single-query performance by giving all resources to the single query. But Teradata optimizes for throughput of many concurrent queries by allocating resources sparingly and using them efficiently. This kind of optimization helps avoid wide performance variations that can occur depending on the number of concurrent queries. When faced with a workload that requires more system resources than are available, Teradata tunes itself to that workload. Thrashing, a common performance failure mode in computer systems, occurs when the system has fewer resources than the current workload requires and begins using more processing time to manage resources than to do the work. With most databases, a DBA would tune the system to avoid thrashing. However, Teradata adjusts automatically to workload changes by adjusting the amount of running work and internally pushing back incoming work. Each unit of parallelism manages this flow control mechanism independently.
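The combination of a step queue with context-free threads, described in the answer on executing all requests at once, can be sketched as a shared queue served by a small pool of workers. This is a simplified model and not Teradata's internals; the function name, the worker count and the sample steps are assumptions made for the example.

```python
import queue
import threading


def run_steps(steps, worker_count=3):
    """Process (request_id, step_name) items with workers not tied to any request."""
    work = queue.Queue()
    for step in steps:
        work.put(step)

    finished = []
    finished_lock = threading.Lock()

    def worker():
        # A context-free worker simply takes whatever step is next on the queue.
        while True:
            try:
                request_id, step_name = work.get_nowait()
            except queue.Empty:
                return
            with finished_lock:
                finished.append((request_id, step_name))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return finished


steps = [("q1", "scan"), ("q2", "scan"), ("q1", "join"), ("q3", "scan")]
finished = run_steps(steps)
```

No worker belongs to q1, q2 or q3, so a long request cannot monopolize the pool: its steps wait in the same queue as everyone else's.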
If all concurrent work shares resources evenly, how are different service levels provided to different users?
The Priority Scheduler Facility (PSF) in Teradata manages service levels among different parts of the workload. PSF allows granular control of system resources. The system administrator can define up to five resource partitions; each partition contains four available priorities. Together, they provide 20 allocation groups (AGs) to which portions of the workload are assigned by an attribute of the logon ID for the user or application. The administrator assigns each AG a portion of the total system resources and a scheduling policy. For example, the administrator can assign short queries from the Web site a guaranteed 20 percent of system resources and a high priority. In contrast, the administrator might assign medium priority and 10 percent of system resources to more complex queries with lower response-time requirements. Similarly, the administrator might assign data mining queries a low priority and five percent of the total resources, effectively running them in the background. You can define policies so that the resources adjust to the work in the system. For example, you could allow data mining queries to take up all the resources in the system if nothing else is running.

Unlike other scheduling utilities, PSF is fully integrated into the DBMS, not managed at the task or thread level, which makes it easier to use for parallel database workloads. Because PSF is an attribute of the session, it follows the work wherever it goes in the system. Whether that piece of work is executed by a single thread in a single unit of parallelism or in thousands of threads across hundreds of units of parallelism, PSF manages it without system administrator involvement. CPU scheduling is a primary component of PSF, using all the normal techniques (such as quantum size, CPU queues by priority, and so on). However, PSF is endemic throughout the Teradata DBMS.
There are many queues inside a DBMS handling a large volume mixed workload. All of those queues are prioritized based on the priority of the work. Thus, a high priority query entered after several lower priority requests that are awaiting their turn to run will go to the head of the queue and will be executed first. I/O is managed by priority. Data warehouse workloads are heavy I/O users, so a large query performing a lot of I/O could hold up a short, high-priority request. PSF puts the high-priority request I/Os to the head of the queue, helping to deliver response time goals.

Data warehouse databases often set the system environment to allow for fast scans. Does Teradata performance suffer when the short work is mixed in?
Because Teradata was designed to handle a high volume of concurrent queries, it doesn't count on sequential scans to produce high performance for queries. Although other DBMS products see a large fall in request performance when they go from a single large query to multiple queries or when a mixed workload is applied, Teradata sees no such performance change. Teradata never plans on sequential access in the first place. In fact, Teradata doesn't even store the data for sequential accesses. Therefore, random accesses from many concurrent requests are just business as usual. Sync scan algorithms provide additional optimization. When multiple concurrent requests are scanning or joining the same table, their I/O is piggybacked so that only a single I/O is performed to the disk. Multiple concurrent queries can run without increasing the physical I/O load, leaving the I/O bandwidth available for other parts of the workload.

What if work demand exceeds Teradata's capabilities?
There are limits to how much work the engine can handle. A successful data warehouse will almost certainly create a demand for service that is greater than the total processing power available on the system. Teradata always puts into execution any work presented to the DBMS. If the total demand is greater than the total resources, then controls must be in place before the work enters the DBMS. When your warehouse reaches this stage, you can use Database Query Manager (DBQM) to manage the flow of user requests into the warehouse.
DBQM, inserted between the users' ODBC applications and the DBMS, evaluates each request and then applies a set of rules created by the system administrator. If the request violates any of the rules, DBQM notifies the user that the request is denied or deferred to a later time for execution. Rules can include, for example, system use levels, query cost parameters, time of day, objects accessed, and authorized users. You can read more about DBQM in a recent Teradata Review article ("Field Report: DBQM," Summer 1999, available online at www.teradatareview.com/summer99/truet.html).

How do administrators and DBAs stay on top of complex mixed workloads?
The Teradata Manager utility provides a single operational system view for administrators and DBAs. The tool provides real-time performance, logged past performance, users and queries currently executing, management of the schema, and more.

STAYING ACTIVE
The active warehouse is a busy place. It must handle all decision making for the organization, including strategic, long-range data mining queries, tactical decisions for daily operations, and event-based decisions necessary for effective Web sites. Nevertheless, managing this diversity of work does not require a staff of hundreds running a complex architecture with multiple data marts, operational data stores, and a multitude of feeds. It simply requires a database management system that can manage multiple workloads at varying service levels, scale with the business, and provide continuous availability year round with a minimum of operational staff.

2. Use COMPRESS on whichever attributes possible. This helps in reducing I/O and hence improves performance, especially for attributes having lots of NULL values/unique known values.

3. COLLECT STATISTICS on a daily basis (after every load) in order to improve performance.
4. Drop and recreate secondary indices before and after every load. This helps in improving load performance (if critical).
5. Regularly check for EVEN data distribution across all AMPs using Teradata Manager or through Queryman.
6. Check the combination of CPUs, AMPs, PEs and nodes for performance optimization. Each AMP can handle 80 tasks and each PE can handle 120 sessions.

MLOAD - Customize the number of sessions for each MLOAD job depending on the number of concurrent MLOAD jobs and the number of PEs in the system, e.g.

SCENARIO 1
# of AMPs = 10
# of max load jobs handled by Teradata = 5 (parameter which can be set; values 5 to 15)
# of sessions per load job = 1 (parameter that can be set globally or at each MLOAD script level)
# of PEs = 1
So 10 * 5 * 1 = 50 + 10 (2 per job overhead) = 60 is the max sessions on the Teradata box. This is LESS than 120, which is the max # of sessions a PE can handle.

SCENARIO 2
# of AMPs = 16
# of max load jobs handled by Teradata = 15
# of sessions per load job = 1
# of PEs = 1
So 16 * 15 * 1 = 240 + 30 (2 per job overhead) = 270 (max sessions on the Teradata box). This is MORE than 120, which is the max sessions a PE can handle. Hence MLOAD fails, in spite of the usage of the SLEEP & TENACITY features.

Use the SLEEP and TENACITY features of MLOAD for scheduling MLOAD jobs. Check the TABLEWAIT parameter. If it is omitted, a load job can fail immediately when you submit two MLOAD jobs that are trying to update the same table.

JOIN INDEX - Check the limit on the number of fields for a join index (max 16 fields). It may vary by version. A join index is like building the table physically. Hence it has advantages such as BETTER performance, since the data is physically stored and not calculated ON THE FLY. The cons are LOADING time (MLOAD needs join indices to be dropped before loading) and additional space, since it is a physical table.
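The session arithmetic from the two MLOAD scenarios can be written down as a small check. This is a sketch of the rule of thumb used in these notes (AMPs * load jobs * sessions per job, plus 2 sessions of overhead per job, compared against 120 sessions per PE), not an official sizing formula; the function names are hypothetical, and scaling the limit by the number of PEs is an assumption that matches the single-PE scenarios.

```python
SESSIONS_PER_PE = 120      # per-PE limit quoted in the notes
OVERHEAD_PER_JOB = 2       # "2 per job overhead"


def total_sessions(amps, load_jobs, sessions_per_job):
    """Sessions needed when all load jobs run at the same time."""
    return amps * load_jobs * sessions_per_job + OVERHEAD_PER_JOB * load_jobs


def fits_on_system(amps, load_jobs, sessions_per_job, pe_count=1):
    # Assumption: total capacity is simply the per-PE limit times the PE count.
    return total_sessions(amps, load_jobs, sessions_per_job) <= SESSIONS_PER_PE * pe_count


scenario_1 = total_sessions(amps=10, load_jobs=5, sessions_per_job=1)
scenario_2 = total_sessions(amps=16, load_jobs=15, sessions_per_job=1)
```

Running the numbers this way before scheduling concurrent loads shows quickly whether the session count per job has to be reduced or the jobs staggered.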
