[Figure: Teradata MPP, with data partitions connected by the BYNET, compared with Oracle SMP and its data partitions.]
Tablespaces are individual units of storage.
Tablespaces have associated data files.
Data files can be added to, extended, or removed from a tablespace dynamically.
Segments consist of extents.
Object space is allocated one extent at a time, which may cause fragmentation.
[Figure: SMP node with PE v-procs and BYNET connections. Each AMP has its own M-T, lock, log, buffer pool (BPool), and I/O.]
[Figure: DBC hierarchy with databases such as Stores and Products, each holding tables and indexes.]
Teradata
Execute Once
CREATE USER tpcd3000g AS PERM= 5400E9, PASSWORD= tpcd3000g;
Transient Journal
Permanent Journal
Spool Space
Default Database
Set at Table Level for Each Teradata
Memory Management for Different Blocksizes is Automatic
XCTL, Vconfig.out
Teradata
CREATE TABLE Table1, FALLBACK,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL
    (Col1 INTEGER,
     Col2 INTEGER,
     Col3 INTEGER)
UNIQUE PRIMARY INDEX (Col1);
Col2 NUMBER,
Col3 NUMBER ) TABLESPACE Tablespace1 STORAGE (INITIAL 6144 NEXT 6144
MINEXTENTS 1 MAXEXTENTS 5 );
or... CREATE TABLE <expl|impl column specification> AS SELECT <any query>;
[Figure: DBC with the Product tables.]
Fallback
Permanent journaling
Table block size
Primary index (built into the table)
MULTISET
Partitioning
Partitioning tables and indexes allows Oracle and Teradata to store large volumes of data.
With Oracle, choosing partitioning methods and partitioning keys means balancing query access paths, performance, and data load requirements.
You specifically manage the partitioning constraints and their relationship to disk storage.
With Teradata, the hash partitioning algorithm is very good at evenly distributing (loading) data partitions and is the basis for high-performance data access and ease of user access.
Provide reasonable partitioning columns when defining the table and Teradata does the rest.
Partitioning columns are chosen for even data distribution in both Teradata and Oracle.
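The effect of the partitioning column on data distribution can be sketched with a small simulation. This is only an illustration: MD5 stands in for the database's hash function (it is not Teradata's actual row-hash algorithm), and the function names and the four-AMP system are assumptions made for the example.

```python
import hashlib
from collections import Counter


def amp_for_key(primary_index_value, num_amps):
    """Map a primary index value to an AMP number (0-based).

    MD5 is only a stand-in; it is NOT Teradata's real row-hash algorithm.
    """
    digest = hashlib.md5(str(primary_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def distribution(keys, num_amps):
    """Count how many rows land on each AMP."""
    counts = Counter(amp_for_key(k, num_amps) for k in keys)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# A unique key (such as an order number) spreads rows almost evenly.
unique_keys = range(100_000)
even = distribution(unique_keys, 4)

# A low-cardinality column (such as a one-character status) skews the load:
# three distinct values can reach at most three of the four AMPs.
status_keys = ["O", "F", "P"] * 33_333
skewed = distribution(status_keys, 4)

print("unique key:", even)
print("status key:", skewed)
```

The second run shows why a "reasonable" partitioning column matters: with only three distinct values, at least one AMP holds no rows at all while others carry a third or more of the table.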
[Figure: range partitioning (99Q1, 99Q2, 99Q3), list partitioning (Florida, New York, Texas), and hash partitioning (Hash1, Hash2).]
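The three methods named above differ only in how a row's key is mapped to a partition. A minimal sketch, assuming quarterly date ranges for the 99Q1 to 99Q3 labels, hypothetical partition names for the three states, and CRC32 as a stand-in hash (none of these details come from the slides):

```python
import bisect
import zlib
from datetime import date

# Range partitioning: a row goes to the first partition whose upper bound
# (exclusive) is greater than its date. Quarter boundaries are assumed.
RANGE_BOUNDS = [date(1999, 4, 1), date(1999, 7, 1), date(1999, 10, 1)]
RANGE_NAMES = ["99Q1", "99Q2", "99Q3"]


def range_partition(order_date):
    idx = bisect.bisect_right(RANGE_BOUNDS, order_date)
    if idx >= len(RANGE_NAMES):
        raise ValueError(f"no partition covers {order_date}")
    return RANGE_NAMES[idx]


# List partitioning: each partition owns an explicit list of key values.
# The partition names are hypothetical.
LIST_PARTITIONS = {
    "Florida": "p_florida",
    "New York": "p_new_york",
    "Texas": "p_texas",
}


def list_partition(state):
    if state not in LIST_PARTITIONS:
        raise ValueError(f"no partition lists {state!r}")
    return LIST_PARTITIONS[state]


# Hash partitioning: a hash of the key picks the partition.
# CRC32 is a stand-in, not the hash either database uses.
def hash_partition(key, names=("Hash1", "Hash2")):
    return names[zlib.crc32(str(key).encode("utf-8")) % len(names)]


print(range_partition(date(1999, 5, 20)))  # a May date falls in 99Q2
print(list_partition("Texas"))
print(hash_partition(1001))
```

Range and list partitioning require someone to choose and maintain the boundaries or value lists, while hash partitioning needs only the key column.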
Partitioning Comparisons: Teradata
Teradata partitioning is a fact of the system, with hash data distribution based on primary index (partitioning) columns and system-managed disk.
[Figure: rows distributed across AMP1, AMP2, AMP3, and AMP4.]
Teradata
CREATE MULTISET TABLE ORDERTBL, DATABLOCKSIZE = 29.5 KILOBYTES
    (O_ORDERKEY      DECIMAL(15,0) NOT NULL
    ,O_CUSTKEY       INTEGER NOT NULL
    ,O_ORDERSTATUS   CHAR(1) CASESPECIFIC NOT NULL
    ,O_TOTALPRICE    DECIMAL(15,2) NOT NULL
    ,O_ORDERDATE     DATE FORMAT 'yyyy-mm-dd' NOT NULL
    ,O_ORDERPRIORITY CHAR(15) CASESPECIFIC NOT NULL
    ,O_CLERK         CHAR(15) CASESPECIFIC NOT NULL
    ,O_SHIPPRIORITY  INTEGER NOT NULL
    ,O_COMMENT       VARCHAR(79) CASESPECIFIC NOT NULL)
UNIQUE PRIMARY INDEX (O_ORDERKEY);
Oracle Parallelism
Oracle parallelism is not directly related to table partitioning because of its shared disk architecture.
Dynamically splits data over parallel processing units for Selects/Inserts, which means one or more parallel processing units per partition.
Considers partitioning when distributing data to parallel processing units.
One parallel processing unit is applied to each partition for Updates/Deletes.
Each user/query can get varying amounts of parallelism, or run serially, depending upon the resources available at query run time.
May need to manually control parallelism to improve system throughput and to ensure fair distribution of parallel resources.
Parallel processes may funnel down to serial processing for final sort/merge and aggregate activity.
Teradata Parallelism
Teradata parallelism is directly related to its shared-nothing architecture.
Automatically applied by the database.
Architecture ensures that each major unit of parallelism (the VAMP) has similar amounts of data and memory.
Pipelining and query step parallelism are performed within the VAMP.
Utility parallelism and Query/Data Manipulation Language parallelism (Select, Insert, Update, Delete) are all the same.
All system parallelism is available to ALL operations.
Teradata parallelism is automatic, pervasive, and database managed.
All users/queries take advantage of all the system parallel resources.
You do not manage and control parallelism.
Data Types
Oracle: CHAR, VARCHAR2, NCHAR, NVARCHAR2, NUMBER, LONG, LONG RAW, RAW, DATE, BLOB, CLOB, NCLOB, BFILE, ROWID, UROWID
Teradata: CHAR, VARCHAR, CHAR VARYING, LONG VARCHAR, NUMERIC, DECIMAL, DOUBLE PRECISION, FLOAT, INTEGER, SMALLINT, BYTEINT, BYTE, VARBYTE, GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC, DATE, REAL
Datatypes
In Oracle, the maximum precision "m" for NUMBER is 38. In Teradata it is 19.
NUMBER (without precision) has no direct counterpart in Teradata. Determine the migration target from the column contents.
ROWID has no counterpart in Teradata.
Constraints
Creating Indexes
Similarities between Teradata and Oracle:
Indexes take up space on disk.
Indexes can be unique and non-unique.
Indexes and secondary indexes provide alternate ways to access data.
Differences:
Teradata indexes are not stored in a B-tree structure; they use hash subtables.
Teradata automatically partitions indexes across the AMPs.
Teradata uses a Primary Index for each table.
indexes
OLTP workloads required fast access paths to few rows.
Decision support solutions continue Oracle's use of indexes where tactical queries with OLTP-like response time requirements are given more emphasis than throughput performance.
Teradata solutions have traditionally not used lots of indexes.
Teradata's efficient parallel architecture emphasizes throughput performance requirements, a result of its DSS background.
system
Saves on disk storage.
Reduces table maintenance windows where affected by existing indexes.
Fewer database objects to manage and monitor.
Most of the indexes found in Oracle may not be used on Teradata.
Indexes can provide clues to ad hoc query support requirements.
No bitmap indexes for Teradata.
Add indexes to Teradata only as workloads (or anticipated workloads) require them.
Monitor their usage or lack of usage.
Statistics are key: you WANT to collect statistics in Teradata!
Drive towards using single or few AMP operations for queries: By minimizing the number of AMPs in an operation, all other AMPs are freed to perform other tasks. Scalability is increased because the freed AMPs can execute more single or few AMP operations, creating greater throughput by increasing the number of tactical queries executed in parallel.
Create efficient all-AMP operations: Reduce the resource consumption on each AMP for all-AMP operations. Since all-AMP operations are virtually impossible to remove from a data warehousing environment, reducing the impact on each AMP is important.
Gain scalability for two reasons:
Individual queries execute faster, freeing the AMPs to execute other queries.
Decrease in resource consumption allows more queries to use the shared resources, such as spool space or CPU.
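The difference between a single-AMP and an all-AMP operation can be sketched with a toy table spread over four simulated AMPs. Everything here is an assumption made for illustration: CRC32 stands in for the row hash, the table layout and function names are hypothetical, and the clerk column is assumed to have no secondary index.

```python
import zlib

NUM_AMPS = 4


def amp_of(key):
    """Pick the AMP that owns a primary index value (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_AMPS


# Distribute a toy order table across the AMPs by its primary index.
amps = [dict() for _ in range(NUM_AMPS)]
for order_key in range(1, 1001):
    row = {"order_key": order_key, "clerk": f"Clerk#{order_key % 50}"}
    amps[amp_of(order_key)][order_key] = row


def select_by_primary_index(order_key):
    """Single-AMP operation: hash the key and ask only the owning AMP."""
    owner = amp_of(order_key)
    return amps[owner].get(order_key), 1  # (row, AMPs touched)


def select_by_clerk(clerk):
    """All-AMP operation: with no index on clerk, every AMP scans its rows."""
    rows = []
    for amp in amps:
        rows.extend(r for r in amp.values() if r["clerk"] == clerk)
    return rows, len(amps)  # (rows, AMPs touched)


row, touched = select_by_primary_index(42)
print(row, "AMPs touched:", touched)

rows, touched = select_by_clerk("Clerk#7")
print(len(rows), "rows, AMPs touched:", touched)
```

The primary index lookup leaves three of the four AMPs free for other work, which is the scalability argument made above for tactical queries.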
Miscellaneous DDL
Triggers
Triggers function just the same (pre- and post-, insert/update/delete) as in Teradata.
Oracle options not in Teradata
Database Links: reference (use) objects in another instance.
Synonyms: named references to objects.
Sequences: number generators, often used to generate surrogate keys.
Hierarchies
ANALYZE/Collect Statistics
ANALYZE and the DBMS_STATS package are intended for collecting database object statistics for Oracle's Cost Based Optimizer (CBO).
Goal is to collect statistics that give queries good access paths.
Once plans are good and stable, stop analyzing tables to preserve plans.
Teradata's Collect Statistics command collects database object statistics.
Optimizer reacts to changing demographics (growing tables, changing column value cardinality, etc.).
Keeping statistics up to date ensures good plans.
Don't freeze statistics: old statistics encourage old access plans that may not be effective as the database changes.
both Oracle and Teradata to assign user access.
Oracle users may be granted:
Object Privileges
System Privileges
rights:
Tables
Views
Macros
Object Privileges
Granting Oracle Privileges    Granting Teradata Privileges
Grant Alter                   Grant Drop
Grant Delete                  Grant Delete
Grant Index                   Grant Index or Grant Drop Table
Grant Select                  Grant Select
Grant Insert                  Grant Insert
Grant Update                  Grant Update
Grant Execute                 Grant Execute Procedure / Grant Execute
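Reading the two privilege lists above in order, the pairing can be captured as a small lookup for use in a migration script. This is a sketch only: the dictionary and function names are hypothetical, and where two Teradata candidates are listed, choosing between them is left to whoever reviews the migration.

```python
# Oracle object privilege -> candidate Teradata privileges, taken from the
# paired lists above. The pairing by position is an assumption.
ORACLE_TO_TERADATA = {
    "ALTER": ["DROP"],
    "DELETE": ["DELETE"],
    "INDEX": ["INDEX", "DROP TABLE"],
    "SELECT": ["SELECT"],
    "INSERT": ["INSERT"],
    "UPDATE": ["UPDATE"],
    "EXECUTE": ["EXECUTE PROCEDURE", "EXECUTE"],
}


def teradata_candidates(oracle_privilege):
    """Return the Teradata privileges that may replace an Oracle object privilege."""
    key = oracle_privilege.strip().upper()
    if key not in ORACLE_TO_TERADATA:
        raise KeyError(f"no mapping recorded for {oracle_privilege!r}")
    return list(ORACLE_TO_TERADATA[key])


for privilege in ("Alter", "Index", "Select"):
    print(privilege, "->", " or ".join(teradata_candidates(privilege)))
```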
System Privileges
Granting Oracle Privileges
Grant Create Table
Grant Alter Any Table
Grant Delete Any Table
Grant Drop Any Table
Grant Insert Any Table
Grant Update Any Table
Grant Select Any Table
Grant Create View
Grant Create Any View
Grant Drop Any View
Questions ?