
Implementing Packages and Control Flow in SQL Server Integration Services 2008

Course 10057

Table of Contents
Creating Integration Services Packages
  Introduction
    Lesson Introduction
    Lesson Objectives
  Introduction to Packages
    Package Properties in SSIS
  Creating Data Sources and Data Source Views
  Creating Connection Managers in a Package
  Importing Packages

Implementing Control Flow Tasks: Part 1
  Introduction
    Lesson Introduction
    Lesson Objectives
  Data Tasks
    Data Flow task
    Bulk Insert task
    Execute SQL task
  Using the Data Profiling Task
    Data Profiling Task Properties
  File and Network Tasks
    File System task
    FTP task
    XML task
    Send Mail task
    Web Service task
    Message Queue task
  Scripting Tasks
    Script task
    ActiveX Script task
    Execute Process task
  Database Object Transfer Tasks
    Transfer Database task
    Transfer SQL Server Objects task
    Transfer Error Messages task
    Transfer Jobs task
    Transfer Logins task
    Transfer Master Stored Procedure task
  Adding Tasks to the Control Flow

Implementing Control Flow Tasks: Part 2
  Introduction
    Lesson Introduction
    Lesson Objectives
  Package Execution Tasks
    Execute Package task
    Execute DTS 2000 Package task
  Analysis Services Tasks
    Analysis Services Processing task
    Data Mining Query task
    Analysis Services Execute DDL task
  Maintenance Tasks
    Back Up Database task
    Check Database Integrity task
    Execute SQL Server Agent Job task
    Notify Operator task
    Execute T-SQL Statement task
    Rebuild Index task and Reorganize Index task
    Update Statistics task
    Shrink Database task
    History Cleanup task
    Maintenance Cleanup task
  Windows Management Instrumentation Tasks
    WMI Event Watcher task
    WMI Data Reader task

Working with Precedence Constraints and Containers
  Introduction
    Lesson Introduction
    Lesson Objectives
  Introduction to Precedence Constraints
  Outcome-based Precedence Constraints
    Success
    Failure
    Completion
  Expressions within Precedence Constraints
  Multiple Constraints
  Introduction to Containers
  Working with For Loop Containers
  Working with Foreach Loop Containers
    Foreach Loop Enumerator Types

Working with Variables
  Introduction
    Lesson Introduction
    Lesson Objectives
  Introduction to Variables in SSIS
  System Variables in a Package
    Package-based variables
    Task-based variables
    Container-based variables
  Creating User-Defined Variables

Business Practices

Lab: Implementing Packages and Control Flow in Microsoft SQL Server 2008
  Lab Overview
    Lab Introduction
    Lab Objectives
  Scenario
  Exercise Information
    Exercise 1: Creating SSIS Packages
    Exercise 2: Working with Control Flow Tasks
    Exercise 3: Managing Control Flow Tasks
    Exercise 4: Working with Variables
  Lab Instructions: Implementing Packages and Control Flow in Microsoft SQL Server 2008
    Exercise 1: Creating SSIS Packages
    Exercise 2: Working with Control Flow Tasks
    Exercise 3: Managing Control Flow Tasks
    Exercise 4: Working with Variables
  Lab Review
    What is the difference between data sources and connection managers?
    At what location should you save a package to create a package template?
    What is the difference between a Data Flow task and a Bulk Insert task?
    You want to control the execution of multiple SSIS packages. What would be the best task to use?
    What type of precedence constraints can be used with the control flow component of SSIS?
    What is the purpose of containers?
    What is the purpose of variables?

Module Summary
  Creating Integration Services Packages
  Implementing Control Flow Tasks: Part 1
  Implementing Control Flow Tasks: Part 2
  Working with Precedence Constraints and Containers
  Working with Variables
  Lab: Implementing Packages and Control Flow in Microsoft SQL Server 2008

Glossary

Creating Integration Services Packages


Introduction
Lesson Introduction

Packages represent the starting point for creating an SSIS solution. In addition to defining new packages, you can add packages from other locations. After creating a package, the first step is to define data sources that connect to a wide range of data stores. You can also create data source views to specify the tables and views that will be used within the Integration Services package. If new packages are created on a regular basis, package templates can be defined that include common items, such as data sources and tasks, to save time when a new package is created.

Lesson Objectives

After completing this lesson, you will be able to:

- Explain what packages are.
- Create data sources and data source views.
- Use connection managers in a package.
- Describe how to create a package template.
- Import packages.

Introduction to Packages
Packages are the basic unit for holding the logic of extract, transform and load (ETL) operations within SSIS. When an SSIS project is created in Business Intelligence Development Studio, a default SSIS package named Package.dtsx is created; it can be renamed. Within the package, the core elements of the package logic can be configured, including:

- Control Flow elements
- Data Flow elements
- Event handler elements
- SSIS package variables
- SSIS package configurations

There are additional elements that can also be configured. SSIS provides a Package Explorer window that helps you to conveniently view all of the package components in a single window. This is useful in situations where you inherit packages and require a single place to view all of the SSIS package components. At the package level, there are several properties that can be configured. Similar to many of the properties that exist within Business Intelligence projects, these properties are categorized for organization purposes. The categories of properties that are available at the package level include:

- Checkpoints
- Execution
- Forced Execution Value
- Identification
- Misc
- Security
- Transactions
- Versions

Package Properties in SSIS

Checkpoints


Checkpoints can be configured within a package to define restart points should the package fail during its execution. This can be useful for packages that take a long time to execute. Without checkpoints, a failed package would have to be restarted from the beginning. With checkpoints, a failed package can be restarted from a checkpoint within the package, reducing the time it takes to rerun the package.

- CheckpointFileName: The name of the file that captures the checkpoint information that enables a package to restart.
- CheckpointUsage: Specifies when a package can be restarted. The values are Never, IfExists and Always. The default value of this property is Never.
- SaveCheckpoints: Specifies whether the checkpoints are written to the checkpoint file when the package runs. The default value of this property is False.
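The restart behaviour these properties describe can be sketched outside SSIS. The following is a minimal Python illustration of the idea, not the SSIS implementation: completed tasks are recorded in a checkpoint file when a run fails, and a restarted run skips them. The file name and step names are invented for the example.

```python
# Minimal sketch of checkpoint-style restart: completed steps are saved
# to a file on failure (like SaveCheckpoints=True) and skipped on rerun
# (like CheckpointUsage=IfExists). Names here are illustrative only.
import json
import os

CHECKPOINT_FILE = "run.checkpoint"  # hypothetical CheckpointFileName

def run_package(steps, fail_at=None):
    done = []
    if os.path.exists(CHECKPOINT_FILE):      # resume from a checkpoint if present
        with open(CHECKPOINT_FILE) as f:
            done = json.load(f)
    for name, action in steps:
        if name in done:
            continue                         # a restart skips completed tasks
        if name == fail_at:                  # simulate a task failure
            with open(CHECKPOINT_FILE, "w") as f:
                json.dump(done, f)           # record progress before failing
            raise RuntimeError(f"task {name} failed")
        action()
        done.append(name)
    if os.path.exists(CHECKPOINT_FILE):
        os.remove(CHECKPOINT_FILE)           # a successful run clears the checkpoint
    return done
```

On the first run, only the tasks before the failure execute; on the rerun, only the remaining tasks execute, which is the time saving the checkpoint properties provide.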

Execution
The properties in this category determine the behaviour of the package execution at run time. They determine, for example, whether the package is disabled (the Disable property) and whether event handlers can be used within the package (the DisableEventHandlers property). You can also control how the package handles errors.

- DelayValidation: Indicates whether package validation is delayed until the package runs. The default value of this property is False.
- Disable: Indicates whether the package is disabled. The default value of this property is False.
- DisableEventHandlers: Specifies whether the package event handlers run. The default value of this property is False.
- FailPackageOnFailure: Specifies whether the package fails if an error occurs in a package component. At package level, the only valid value of this property is False.
- FailParentOnError: Specifies whether the parent container fails if an error occurs in a child container. The default value of this property is False.
- MaxConcurrentExecutables: The number of executables that the package can run concurrently. The default value of this property is -1, which allows as many concurrent executables as the number of logical processors plus two.
- MaximumErrorCount: The maximum number of errors that can occur before a package stops running. The default value of this property is 1.
- PackagePriorityClass: The Win32 thread priority class of the package thread. The values are Default, AboveNormal, Normal, BelowNormal and Idle. The default value of this property is Default.

Forced Execution Value


This category of properties helps you to force the execution value that is set on the package.

- ForcedExecutionValue: If ForceExecutionValue is set to True, the execution value that the package returns. The default value of this property is 0.
- ForcedExecutionValueType: The data type of ForcedExecutionValue.
- ForceExecutionValue: Specifies whether the execution value of the package is forced. The default value of this property is False.

Identification
You can define supplementary information for the package, including the package creator's name, the creation date and the package name. This area also holds the package's globally unique identifier (GUID).

- CreationDate: The date that the package was created.
- CreatorComputerName: The name of the computer on which the package was created.
- CreatorName: The name of the person who created the package.
- Description: A description of package functionality.

Misc
Miscellaneous settings pertaining to the package can be set in this category of package properties.

- Configurations: Click the browse button (…) to view and configure package configurations.
- Expressions: Click the browse button (…) to create expressions for package properties.
- ForceExecutionResult: The execution result of the package. The values are None, Success, Failure and Completion. The default value of this property is None.
- LocaleId: A Microsoft Win32 locale. The default value of this property is the locale of the operating system on the local computer.
- LoggingMode: A value that specifies the logging behaviour of the package. The values are Disabled, Enabled and UseParentSetting. The default value of this property is UseParentSetting.
- OfflineMode: Indicates whether the package is in offline mode. This property is read-only; it is set at the project level. Normally, the SSIS Designer tries to connect to each data source used by your package to validate the metadata associated with sources and destinations. You can enable Work Offline from the SSIS menu, even before you open a package, to prevent these connection attempts and the resulting validation errors when the data sources are not available. You can also enable Work Offline to speed up operations in the designer, and disable it only when you want your package to be validated.
- SuppressConfigurationWarnings: Indicates whether the warnings generated by configurations are suppressed. The default value of this property is False.

Security
Security provides you with the ability to set package-level security, including a package password for the encryption of the package, through various settings.

- PackagePassword: The password for the package protection levels that require passwords (EncryptSensitiveWithPassword and EncryptAllWithPassword).
- ProtectionLevel: The protection level of the package. The values are DontSaveSensitive, EncryptSensitiveWithUserKey, EncryptSensitiveWithPassword, EncryptAllWithPassword and ServerStorage. The default value of this property is EncryptSensitiveWithUserKey.

Transactions
Packages can have transaction settings to maintain the integrity of the ETL operation that the package performs. In this manner, if a package fails, the data being loaded can be rolled back, maintaining data integrity. The transaction isolation level is also defined in this category to determine the amount of concurrent access to the data.

- IsolationLevel: The isolation level of the package transaction. The values are Unspecified, Chaos, ReadUncommitted, ReadCommitted, RepeatableRead, Serializable and Snapshot. The default value of this property is Serializable.
- TransactionOption: The transactional participation of the package. The values are NotSupported, Supported and Required. The default value of this property is Supported.
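The rollback behaviour described above can be sketched in a few lines. The following uses Python's sqlite3 module in place of a SQL Server connection, with an invented staging table, purely to illustrate the principle: when the whole load runs in one transaction, a failure part-way through leaves no partial data behind.

```python
# Sketch of transactional loading: all inserts run in one transaction,
# so a failure rolls back every row (no partial load survives).
# sqlite3 stands in for SQL Server here; the table is illustrative.
import sqlite3

def transactional_load(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()                       # table creation is outside the load transaction
    try:
        with conn:                      # one transaction for the whole load:
            for r in rows:              # commits on success, rolls back on error
                conn.execute("INSERT INTO staging VALUES (?, ?)", r)
    except sqlite3.IntegrityError:
        pass                            # the failed load was rolled back in full
    return conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
```

A clean batch commits all rows; a batch containing a duplicate key fails on the bad row and the earlier, valid rows are rolled back with it, which is the integrity guarantee the Transactions category provides.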

Versions
This category of package properties lets you provide user-defined version information about the package, such as the build number. You can also specify comments about the build and the major and minor version of the package. VersionGUID is a read-only property that provides a GUID for the version of the package.

- VersionBuild: The version number of the build of the package.
- VersionComments: Comments about the version of the package.
- VersionGUID: The GUID of the version of the package. This property is read-only.
- VersionMajor: The latest major version of the package.
- VersionMinor: The latest minor version of the package.

Creating Data Sources and Data Source Views


Creating data sources and data source views is typically the first step performed after a package has been created. Data sources provide the connection information that gives access to source data. Data source views are objects, based on a data source, that provide an abstraction over a subset of the tables, columns and relationships in the data source. SSIS provides you with the ability to define data sources and data source views within an SSIS project. This ensures that data sources and data source views need to be defined only once and can be used multiple times by different packages in the same project.

Creating Connection Managers in a Package


Connection managers define connection information for different elements of a package. These can include tasks within the Control Flow or Event Handlers SSIS components, such as a Simple Mail Transfer Protocol (SMTP) connection manager for the Send Mail task. The Data Flow task can also make use of connection managers for defining data sources and data destinations. Connection managers differ from data sources and data source views in that they are embedded within the package itself. This means that when a package is deployed, the connection managers are deployed with the package, unlike data sources and data source views, which are part of the SSIS project files. However, connection managers can be created based on the data sources that are defined within the SSIS project file.
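The pattern behind connection managers is define-once, reference-by-name. This small Python sketch (not the SSIS object model; all names and settings are invented) shows the shape of that idea: connection details live in one named registry inside the package, and each task resolves the connection it needs rather than embedding its own settings.

```python
# Sketch of the connection-manager pattern: connection settings are
# defined once, by name, and tasks look them up at run time.
# The Package class and all settings below are illustrative only.
class Package:
    def __init__(self):
        self.connection_managers = {}

    def add_connection_manager(self, name, **settings):
        self.connection_managers[name] = settings

    def resolve(self, name):
        # A task asks for a connection by name instead of storing its own copy.
        return self.connection_managers[name]

pkg = Package()
pkg.add_connection_manager("SmtpServer", host="mail.example.com", port=25)
pkg.add_connection_manager("Warehouse", server="SQL01", database="DW")

# Two different tasks can share one definition.
send_mail_settings = pkg.resolve("SmtpServer")
data_flow_settings = pkg.resolve("Warehouse")
```

Because the definitions travel inside the package object, deploying the package deploys its connections with it, which mirrors the embedding behaviour described above.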

Importing Packages
You can import existing packages into SSIS project files. Packages can be imported from the file system or from SQL Server and be integrated within Business Intelligence Development Studio for additional editing and redeployment. A package stored in the file system is stored on a logical drive on the server. A package stored in SQL Server is saved within SQL Server and can take advantage of the added security provided by SQL Server. The third option is the SSIS package store, which is a reference to the default package storage location on the file system. By default, this is C:\Program Files\Microsoft SQL Server\100\DTS\Packages.

Implementing Control Flow Tasks: Part 1


Introduction
Lesson Introduction

There are many tasks available in SSIS. These tasks provide great flexibility when defining the structure of the control flow. The tasks can be categorized into groups based on the functionality that they provide, such as working with data or working with the file system and network locations. The control flow is also extensible through script and program execution tasks, which provide the ability to interact with .NET languages, ActiveX scripts and command-prompt programs. You can also use Transfer Database Objects tasks to move database objects from one instance of SQL Server to another.

Lesson Objectives

After completing this lesson, you will be able to:

- Describe the data tasks.
- Work with the Data Profiling task.
- Work with file and network tasks.
- Work with scripting tasks.
- Work with Transfer Database Objects tasks.
- Add tasks to the control flow.


Data Tasks
Data tasks are Control Flow tasks that help you to work with data from various sources. The following topics describe the data tasks in SSIS.

Data Flow task

The Data Flow task is a core task used within SSIS to extract, transform and load data from a wide variety of data sources and destinations. You can define one or more Data Flow tasks within an SSIS package. The Data Flow task is unique within the control flow: it has its own separate designer, which you can access by double-clicking the task. The Data Flow Designer is a rich development environment in which you can configure data sources and destinations. In addition, you can configure a wide variety of transformations to change the data that is moved as part of an ETL operation. The Data Flow task is commonly used in SSIS packages that load data warehouses. Separate Data Flow tasks may be created for each stage of the load; for example, one Data Flow task might populate the staging tables, another the dimension tables, and a third the fact tables.

Bulk Insert task

The Bulk Insert task can only transfer data from a text file into a SQL Server table. Note that the Bulk Insert task supports only Object Linking and Embedding, Database (OLE DB) connections for the destination database. While the Data Flow task can provide this capability, the Bulk Insert task is the most efficient method for this type of transfer, precisely because it performs no transformations or error logging on the data as it moves from the source file to the table. If the destination table or view already contains data, the new data is appended to the existing data when the Bulk Insert task runs. If you want to replace the data, run an Execute SQL task that runs a DELETE or TRUNCATE statement before you run the Bulk Insert task. The Bulk Insert Task Editor contains four property categories:

General. This category includes a Name and Description property.

Connection. This category includes a File property to define the source text file connection, together with a Connection and DestinationTable property to define the destination SQL Server database and table. The RowDelimiter and ColumnDelimiter properties can be used to define the text file format or, if the Format property is set to Use File, a FormatFile property can point to a format file that defines custom mappings between the text file and the SQL Server table. If you have a format file that was created by the bcp utility, you can specify its path in the Bulk Insert task. The Bulk Insert task supports both XML and non-XML format files.

Options. On the Options page, the CodePage and DataFileType properties are used to define the type of text file. The BatchSize property determines how many rows are inserted into the table as a batch. A setting of zero means all rows in the text file are inserted as one batch. If a batch size is set, each batch represents a transaction that is committed when the batch finishes running. The FirstRow and LastRow properties determine the starting and ending row of the data in the text file. You can also use the MaxErrors property to determine the number of errors that are allowed before the Bulk Insert task fails.

Expressions. This category helps you to use property expressions to dynamically update the properties of the Bulk Insert task at run time by using variables.

The Bulk Insert task is particularly useful in situations where data needs to be retrieved from a system that cannot be connected to SSIS. Text files can act as a great intermediary data source, and the Bulk Insert task will minimise the time required to import the data into a SQL Server table.

Execute SQL task
The Execute SQL task helps you to run a SQL statement or stored procedure from within a package. The Execute SQL Task Editor contains many properties, broken down into the following pages.

General. On this page, you can define a name and description for the task. A TimeOut property can be used to limit the time SQL statements have to execute, and the CodePage property determines the code page used for variables generated by the Execute SQL task. The ResultSet property allows you to set the type of results returned: Single Row, Full Result Set, XML or None. The SQL Statement category of properties provides connection information for the source database and defines the type of statement the Execute SQL task will run. These properties include ConnectionType, Connection, SQLSourceType and SQLStatement, which holds the SQL statement itself. Within this same area, the BypassPrepare property, when set to False, can improve the performance of the Execute SQL task if it is run regularly, because the statement is prepared once and then reused.

Parameter Mappings. This page can be used to map parameters within the SQL statement or stored procedure to variable values that exist within the package. For OLE DB, Open Database Connectivity (ODBC) and Microsoft Office Excel connection managers, the parameters within the statements are represented as question marks (?), known as parameter markers. For example, a SQL statement might be issued as follows.
SELECT FirstName, LastName, Title FROM Person.Contact WHERE ContactID = ?

Using OLE DB and Excel connections, the parameter marker in this statement would be named 0 (zero). If a second parameter marker were added to the query, it would be named 1. ODBC connections start parameter markers at number 1, so using an ODBC connection, the marker in this query would be named 1. This name is used to populate the Parameter Name property on the Parameter Mappings page. You can use the Data Type property to define the data type of the parameter and the Parameter Size property to define the size of variable-length data types. You can use the Add and Remove buttons to manage parameter mappings. The Variable Name property is used to select a new or existing variable to which the parameter maps, and the Direction property determines the type of parameter, which can be an input parameter, an output parameter or a return code.

Result Set. This page can be used to map the results of a SQL statement or stored procedure to variables that exist within the package. You can use the Add button to add a result set or the Remove button to remove an existing result set. The Result Name property allows you to define a name for the result set; however, this name is dictated by the option chosen in the ResultSet property on the General page. If Single Row is selected, the Result Name property can use either the name of a column returned by the query or the number that represents the ordinal position of the column in the column list of the query.

If the ResultSet property is set to Full Result Set or XML, the Result Name property must be named 0 (zero). The Variable Name property is used to map the result set to a new or existing variable.

Expressions. This page helps you to use property expressions to dynamically update the properties of the Execute SQL task at run time by using variables.

You can use the Execute SQL task for the following purposes:

- Truncating a table or view in preparation for inserting data.
- Creating, altering and dropping database objects, such as tables and views.
- Re-creating fact and dimension tables before loading data into them.
- Running stored procedures.
- Saving the rowset returned from a query into a variable.
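The question-mark parameter markers and Single Row result-set mapping described above can be illustrated outside SSIS with Python's sqlite3 module, which also uses positional ? markers. This is only a hedged sketch; the table and column names are invented stand-ins for the Person.Contact example.

```python
import sqlite3

# Hypothetical contact table, standing in for Person.Contact in the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (contact_id INTEGER, first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO contact VALUES (?, ?, ?)", (1, "Ada", "Lovelace"))

# As with an OLE DB Execute SQL task, parameters are positional question marks;
# the first marker corresponds to parameter 0, the second to 1, and so on.
row = conn.execute(
    "SELECT first_name, last_name FROM contact WHERE contact_id = ?", (1,)
).fetchone()

# A "Single Row" result set mapped into variables, as on the Result Set page.
first_name, last_name = row
print(first_name, last_name)  # -> Ada Lovelace
```

The positional numbering is the key point: the marker's ordinal position, not its name, decides which value it receives.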


Using the Data Profiling Task


The Data Profiling task is a new feature in SQL Server 2008 that helps you to analyze and profile the data within tables stored in SQL Server 2000 or later versions. This task does not work with third-party or file-based data sources. To run a package that contains the Data Profiling task, you must use an account that has read/write permissions, including CREATE TABLE permissions, on the tempdb database.

Data Profiling Task Properties

General. Allows you to specify the DestinationType for the data profiler results, which can be an XML file connection or a variable. You then use the Destination property to specify the XML file name or variable name. The OverwriteDestination property allows you to specify whether data that exists within the destination can be overwritten. You also have a Timeout property to limit the time it takes for the information to be loaded into the destination.

Profile Requests. Allows you to determine the data that you want to profile and the type of profiling you want to perform. As a result, this page is broken into two parts: the top part lists the profile requests you wish to perform, and for each selected request, the bottom part of the page displays the properties for that request.

Expressions. Enables the use of property expressions to dynamically update the properties of the Data Profiling task at run time by using variables.
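The kind of statistics the Data Profiling task computes — null ratios, distinct counts and column lengths — can be approximated with a few lines of ordinary code. The sketch below is purely illustrative; the column data and profile names are made up, not the task's actual output format.

```python
# A minimal, hypothetical column profiler approximating what the Data Profiling
# task computes (Column Null Ratio, Column Value Distribution, Column Length).
def profile_column(values):
    non_null = [v for v in values if v is not None]
    return {
        "null_ratio": (len(values) - len(non_null)) / len(values),
        "distinct_count": len(set(non_null)),
        "min_length": min(len(str(v)) for v in non_null),
        "max_length": max(len(str(v)) for v in non_null),
    }

# Sample column data (invented for illustration).
profile = profile_column(["GB", "US", None, "DE", "US"])
print(profile)  # {'null_ratio': 0.2, 'distinct_count': 3, 'min_length': 2, 'max_length': 2}
```

In SSIS, the equivalent results are written to the XML destination defined on the General page rather than returned in memory.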


File and Network Tasks


There are a number of tasks within SSIS that allow you to interact with files of specific types and with folders, in local or remote locations.

File System task
The File System task helps you to interact with the file system so that you can add, remove or move a single folder or file. To work with multiple folders and files, you can use the File System task within a Foreach Loop container. The File System task contains predefined operations that can be selected within the task, including:

- Copy directory
- Copy file
- Create directory
- Delete directory
- Delete directory content
- Delete file
- Move directory
- Move file
- Rename file
- Set attributes
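The predefined operations above map naturally to ordinary file-system calls. As a hedged illustration only, here is how several of them look with Python's standard os and shutil modules, run inside a throwaway temporary directory:

```python
import os
import shutil
import tempfile

# Illustrative sketch: File System task operations as plain Python calls.
root = tempfile.mkdtemp()

os.makedirs(os.path.join(root, "incoming"))          # Create directory
with open(os.path.join(root, "incoming", "a.txt"), "w") as f:
    f.write("sample")

shutil.copy(os.path.join(root, "incoming", "a.txt"),
            os.path.join(root, "incoming", "b.txt")) # Copy file
os.rename(os.path.join(root, "incoming", "b.txt"),
          os.path.join(root, "incoming", "c.txt"))   # Rename file
shutil.copytree(os.path.join(root, "incoming"),
                os.path.join(root, "archive"))       # Copy directory
os.remove(os.path.join(root, "incoming", "a.txt"))   # Delete file

files = sorted(os.listdir(os.path.join(root, "archive")))
print(files)  # ['a.txt', 'c.txt']
shutil.rmtree(root)                                  # Delete directory
```

Each File System task instance performs exactly one such operation; looping over many files is the job of the enclosing Foreach Loop container.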

These options are used to populate the Operation property within the File System task. Additional properties include the Name and Description properties. The OverwriteDestination property can be set to True or False to determine whether files and folders in the destination can be overwritten. There are SourceConnection and DestinationConnection properties that help you to point to source and destination connection managers. However, if IsSourcePathVariable or IsDestinationPathVariable is set to True, the SourceConnection or DestinationConnection property changes to SourceVariable or DestinationVariable respectively. This helps you to retrieve the source or destination file path from a variable that may have been set in an earlier task within the package. The Expressions page in the File System Task Editor helps you to use property expressions to dynamically update the properties of the File System task at run time by using variables. A File System task may be used to create a folder on the file system so that it is ready to receive files that are retrieved from elsewhere. For example, an FTP task may use the folder created by the File System task as a destination to move files.

FTP task
The FTP task can be used to download and upload files to a remote server. This task can also manage folders on the server. To connect to the FTP site, an FTP connection manager needs to be defined, and this is used to populate the FtpConnection property in the FTP Task Editor.

You can also provide a Name and Description. In addition, you can set the StopOnFailure property to True or False to determine whether the FTP task stops should it encounter a failure. These properties are defined on the General page of the FTP Task Editor. On the File Transfer page, there is a LocalPath property that allows you to set a file connection manager for the location of local files; however, if the IsLocalPathVariable property is set to True, the LocalPath property changes to LocalVariable, allowing you to define the path from a variable that may have been set earlier in the package. The RemotePath property works in the same way for the location of remote files: if the IsRemotePathVariable property is set to True, the RemotePath property changes to RemoteVariable. You also have the OverwriteFileAtDest property to determine whether destination files can be overwritten. The Operation property is used to define the action that takes place within the FTP task, including:

- Send files. Sends a file from the local computer to the FTP server.
- Receive files. Saves a file from the FTP server to the local computer.
- Create local directory. Creates a folder on the local computer.
- Create remote directory. Creates a folder on the FTP server.
- Remove local directory. Deletes a folder on the local computer.
- Remove remote directory. Deletes a folder on the FTP server.
- Delete local files. Deletes a file on the local computer.
- Delete remote files. Deletes a file on the FTP server.
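The server-side operations above correspond closely to the methods of a standard FTP client. As a hedged sketch, the mapping below pairs the task's Operation values with methods of Python's ftplib.FTP class without opening any connection; the mapping itself is an illustration, not part of SSIS.

```python
import ftplib

# Hypothetical mapping from FTP task Operation values to ftplib.FTP methods.
# No FTP connection is opened; this only names the corresponding client calls.
OPERATION_TO_FTP_METHOD = {
    "Send files": "storbinary",        # upload a file to the server
    "Receive files": "retrbinary",     # download a file from the server
    "Create remote directory": "mkd",  # make a directory on the server
    "Remove remote directory": "rmd",  # remove a directory on the server
    "Delete remote files": "delete",   # delete a file on the server
}

# Every mapped name should be a real method of ftplib.FTP.
all_real = all(hasattr(ftplib.FTP, m) for m in OPERATION_TO_FTP_METHOD.values())
print(all_real)  # True
```

The local-side operations (create or remove a local directory, delete local files) are plain file-system calls and need no FTP client at all.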

The Expressions page helps you to use property expressions to dynamically update the properties of the FTP task at run time by using variables. An FTP task is a useful means of retrieving files from a third-party source on the Internet. Using the FTP task, you can connect to an FTP site, retrieve files and load them into a local folder, where they may then be used as data within the SSIS package.

XML task
You can work with XML documents by using the XML Control Flow task. With this task, you can modify an existing XML document, or portions of an XML document, using a number of prebuilt operations for working with XML data. The prebuilt functions include the following:

Merge. This function provides the ability to merge a number of XML documents into one. You do this by defining a base XML document and merging the contents of a second document into the base document.

Merge Function Operations in the XML Control Flow Task


The following options are available for the Merge function when Merge is specified as the OperationType on the General page of the XML task. The key to using these options is that settings need to be specified for both the base document and the second document on which the Merge operation is based.

XPathStringSourceType. Select the source type of the XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

XPathStringSource. If XPathStringSourceType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box. If XPathStringSourceType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If XPathStringSourceType is set to Variable, select an existing variable, or click <New variable...> to create a new variable. If the XPath statement returns multiple nodes, only the first node is used; the contents of the second document are merged under the first node that the XPath query returns.

SaveOperationResult. Specify whether the XML task saves the output of the Merge operation.

OverwriteDestination. Specify whether to overwrite the destination file or variable.

DestinationType. Select the destination type of the XML document. This property has the following options:
- File connection. Select a file that contains the XML document.
- Variable. Set the destination to a variable that contains the XML document.

Destination. If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

SecondOperandType. Select the source type of the second XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

SecondOperand. If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box. If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
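The Merge behaviour described above — placing the second document's content under the first node that the XPath query selects in the base document — can be sketched, purely as an illustration, with Python's standard xml.etree module. The element names here are invented:

```python
import xml.etree.ElementTree as ET

# Hedged sketch of the Merge idea: children of a second document are merged
# under the first node an XPath-style query selects in the base document.
base = ET.fromstring("<catalog><books/></catalog>")
second = ET.fromstring("<books><book>SSIS</book><book>SQL</book></books>")

target = base.find(".//books")   # first matching node, as in the XML task
target.extend(list(second))      # merge the second document's content under it

merged = ET.tostring(base, encoding="unicode")
print(merged)  # <catalog><books><book>SSIS</book><book>SQL</book></books></catalog>
```

The XML task performs the equivalent work internally, writing the merged document to the file or variable named by the Destination property.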

Diff. This function compares two XML documents and detects the differences between them. The differences are written to an XML DiffGram document. When the Diff function is selected, a number of options are available that allow you to define the precision of the comparison being made between the two XML documents.

Diff Function Operations in the XML Control Flow Task
The following options are available for the Diff operation when Diff is selected as the OperationType on the General page of the XML task.

DiffAlgorithm. Select the Diff algorithm to use when comparing documents. This property has the following options:

- Auto. Let the XML task determine whether to use the fast or precise algorithm.
- Fast. Use a fast, but less precise, Diff algorithm.
- Precise. Use a precise Diff algorithm.

Diff Options. Set the Diff options to apply to the Diff operation. The options are as follows.

- IgnoreComments. A value that specifies whether comment nodes are compared.
- IgnoreNamespaces. A value that specifies whether the namespace uniform resource identifier (URI) of an element and its attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace are considered to be identical.
- IgnorePrefixes. A value that specifies whether prefixes of element and attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace URI and prefix are considered identical.
- IgnoreXMLDeclaration. A value that specifies whether the XML declarations are compared.
- IgnoreOrderOfChildElements. A value that specifies whether the order of child elements is compared. If this option is set to true, child elements that differ only in their position in a list of siblings are considered to be identical.
- IgnoreWhiteSpaces. A value that specifies whether white spaces are compared.
- IgnoreProcessingInstructions. A value that specifies whether processing instructions are compared.
- IgnoreDTD. A value that specifies whether the DTD is ignored.

FailOnDifference. Specify whether the task fails if the Diff operation fails.

SaveDiffGram. Specify whether to save the comparison result, a DiffGram document.

SaveOperationResult. Specify whether the XML task saves the output of the Diff operation.

OverwriteDestination. Specify whether to overwrite the destination file or variable.

DestinationType. Select the destination type of the XML document. This property has the following options:
- File connection. Select a file that contains the XML document.
- Variable. Set the destination to a variable that contains the XML document.

Destination. If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

SecondOperandType. Select the source type of the second XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

SecondOperand. If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box. If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
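The effect of a Diff option such as IgnoreWhiteSpaces can be illustrated by normalising both documents before comparing them. This hedged sketch is not the DiffGram algorithm itself, only a demonstration of whitespace-insensitive comparison using Python's standard library:

```python
import xml.etree.ElementTree as ET

# Illustration of the idea behind IgnoreWhiteSpaces: strip layout whitespace
# from both documents, then compare the normalised forms.
def normalise(xml_text):
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text is not None:
            elem.text = elem.text.strip() or None   # drop whitespace-only text
        if elem.tail is not None:
            elem.tail = elem.tail.strip() or None
    return ET.tostring(root)

a = "<order><id>42</id></order>"
b = "<order>\n  <id> 42 </id>\n</order>"

same = normalise(a) == normalise(b)
print(same)  # True
```

With IgnoreWhiteSpaces set, the XML task's Diff operation would likewise report no difference between these two documents; with it unset, the indentation alone would count as a change.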

Patch. This function applies the XML DiffGram document produced by the Diff function to a source XML document, creating a new XML document from the output of the Diff operation.

Patch Function Operations in the XML Control Flow Task
The following options are available for the Patch operation when Patch is specified as the OperationType on the General page of the XML task.

SaveOperationResult. Specify whether the XML task saves the output of the Patch operation.

OverwriteDestination. Specify whether to overwrite the destination file or variable.

DestinationType. Select the destination type of the XML document. This property has the following options:
- File connection. Select a file that contains the XML document.
- Variable. Set the destination to a variable that contains the XML document.

Destination. If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

SecondOperandType. Select the source type of the second XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

SecondOperand. If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box. If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Validate. The Validate function helps you to validate the contents of an XML document against an XML Schema definition (XSD) schema or a document type definition (DTD).

Validate Function Operations in the XML Control Flow Task
The following options are available for the Validate function when Validate is specified as the OperationType on the General page of the XML task.

SaveOperationResult. Specify whether the XML task saves the output of the Validate operation.

OverwriteDestination. Specify whether to overwrite the destination file or variable.

DestinationType. Select the destination type of the XML document. This property has the following options:
- File connection. Select a file that contains the XML document.
- Variable. Set the destination to a variable that contains the XML document.

Destination. Select an existing File connection manager, or click <New connection...> to create a new connection manager.

ValidationType. Select the validation type. This property has the following options:
- DTD. Use a document type definition (DTD).
- XSD. Use an XML Schema definition (XSD) schema. Selecting this option displays the dynamic options described in the following section.

FailOnValidationFail. Specify whether the operation fails if the document fails to validate.

ValidationType Dynamic Options


ValidationType = XSD

SecondOperandType. Select the source type of the second XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

SecondOperand. If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box. If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

XPath. You can run XPath statements against the XML document and use the results.

XPath Function Operations in the XML Control Flow Task
The following options are available for the XPath function when XPath is specified as the OperationType on the General page of the XML task.


SaveOperationResult. Specify whether the XML task saves the output of the XPath operation.

OverwriteDestination. Specify whether to overwrite the destination file or variable.

DestinationType. Select the destination type of the XML document. This property has the following options:
- File connection. Select a file that contains the XML document.
- Variable. Set the destination to a variable that contains the XML document.

Destination. If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

SecondOperandType. Select the source type of the second XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

SecondOperand. If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box. If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

PutResultInOneNode. Specify whether the result is written to a single node.

XPathOperation. Select the XPath result type. This property has the following options:
- Evaluation. Returns the results of an XPath function.
- Node list. Returns the selected nodes as an XML fragment.
- Values. Returns the inner text value of all selected nodes, concatenated into a string.
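The difference between the Node list and Values result types can be sketched with the limited XPath support in Python's standard xml.etree module. This is only an illustration with invented element names, not the XML task's own output:

```python
import xml.etree.ElementTree as ET

# Hedged sketch of XPathOperation result types: "Node list" returns matched
# nodes as XML fragments; "Values" concatenates their inner text.
doc = ET.fromstring("<authors><name>Ada</name><name>Grace</name></authors>")

nodes = doc.findall(".//name")                       # the matched nodes
fragments = [ET.tostring(n, encoding="unicode") for n in nodes]  # Node list
values = "".join(n.text for n in nodes)              # Values, concatenated

print(fragments)  # ['<name>Ada</name>', '<name>Grace</name>']
print(values)     # AdaGrace
```

An Evaluation result would instead return the value of an XPath function, such as a count of the matched nodes.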

XSLT. You can customise the output of the XML document by applying Extensible Stylesheet Language Transformations (XSLT) to the XML document.

XSLT Function Operations in the XML Control Flow Task
The following options are available for the XSLT function when XSLT is specified as the OperationType on the General page of the XML task.

SaveOperationResult. Specify whether the XML task saves the output of the XSLT operation.

OverwriteDestination. Specify whether to overwrite the destination file or variable.

DestinationType. Select the destination type of the XML document. This property has the following options:
- File connection. Select a file that contains the XML document.
- Variable. Set the destination to a variable that contains the XML document.

Destination. If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

SecondOperandType. Select the source type of the second XML document. This property has the following options:
- Direct input. Set the source to an XML document.
- File connection. Select a file that contains the XML document.
- Variable. Set the source to a variable that contains the XML document.

SecondOperand. If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box. If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager. If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

On the General page of the XML Task Editor, the OperationType option allows you to select the specific function that you wish to use within the XML task. This adjusts the XML Task Editor to show the options available for that function. Use the SourceType options to define the source of the XML document: Direct input, File connection or Variable. Direct input allows you to supply the XML directly. File connection requires that a file connection manager pointing to the XML document is created separately and then used within the XML task. You can also point the source to a variable that holds the XML document. The Expressions page helps you to use property expressions to dynamically update the properties of the XML task at run time by using variables. Note that the XML task is not used to connect to an XML document that will be used as source data to be loaded into a separate destination; the XML task allows you to manipulate the XML data itself. An example could include using the Merge function to merge the contents of many XML documents into a single XML file for consolidation purposes.

Send Mail task
The Send Mail task helps you to connect to an SMTP server and interact with e-mail services within an SSIS package. Within the Send Mail Task Editor, the General page helps you to define a Name and Description for the Send Mail task. The Mail page helps you to populate the SmtpConnection property with an SMTP connection manager that points to the e-mail server. The remaining properties relate to fields that are typically defined within an e-mail message. This includes the following properties:

- From
- To
- Cc
- BCc
- Subject
- Priority
- Attachments
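The fields listed above correspond to standard e-mail headers. As a hedged sketch, they can be shown with Python's standard email library; the addresses are invented and no SMTP connection is made:

```python
from email.message import EmailMessage

# Illustrative only: the Send Mail task's Mail-page properties expressed as
# ordinary e-mail headers. Addresses are hypothetical; nothing is sent.
msg = EmailMessage()
msg["From"] = "etl@example.com"
msg["To"] = "admin@example.com"
msg["Cc"] = "team@example.com"
msg["Subject"] = "Package failure notification"
msg.set_content("The nightly load package reported a failure.")  # MessageSource

print(msg["Subject"])  # Package failure notification
```

In SSIS, the SMTP connection manager supplies the server details, and the message body comes from the MessageSource property described next.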

The MessageSourceType property helps you to determine the source of the message within the e-mail message. A setting of DirectInput allows you to type a message into the MessageSource property directly. However, MessageSourceType can be set to FileConnection, which allows you to point to a file connection manager that retrieves the message from a file, or to Variable, which retrieves the message contents from a variable that exists within the package. The Expressions page helps you to use property expressions to dynamically update the properties of the Send Mail task at run time by using variables. The Send Mail task is commonly used within SSIS packages to send an e-mail message to SSIS administrators about failures that occur within the package itself. This provides a useful notification mechanism if a package fails and you need to be informed about the failure.

Web Service task
The Web Service task can use a Hypertext Transfer Protocol (HTTP) connection manager to connect to a Web service and execute the methods that are available within a specific Web service. The HTTP connection manager provides the connection information but supports only anonymous and basic authentication; note that Windows authentication is not available within the HTTP connection manager. The HTTP connection manager can be used to point to a Web site or a Web Service Description Language (WSDL) file. A WSDL file is an XML document that defines the network endpoints of the Web service, and it can be reused by other HTTP connection managers. Note that the WSDL file must be available locally on the server on which the SSIS package runs. Web services allow SSIS to communicate with software components using open protocols, such as HTTP, SOAP and XML, regardless of the platform on which the software component runs.

When connected to the Web service through the HTTP connection manager, you can invoke a particular Web service method or property to add custom functionality to the SSIS package. Many Web services are available, and they provide methods, which perform an action from within a Web service, and properties, which enable the Web service to read data. In the Web Service task, methods are used to enable the task to perform an action, such as modifying data.

General page. On the General page, you can define the connection by using the HTTPConnection property. The WSDLFile property can be used to provide the local path to the WSDL file. The Name and Description properties are used to define the name and description of the Web Service task, and the OverwriteWSDLFile property determines whether the WSDL file for the Web Service task can be overwritten.

Input page. The Input page helps you to control how the Web Service task interacts with a method within a Web service. The Service property helps you to select the Web service to use, and the Method property helps you to select the specific method within the Web service that performs the action. You can type a description of the Web method by using the WebMethodDocumentation property. If the method requires inputs to run, you can specify the name and data type of each input by using the Name and Type properties respectively. The Variables property allows you to specify the use of package variables to provide the inputs to the Web Service task, and the Value property is used to specify the value that is provided by the variable.

Output page. If results are to be returned by the Web service method used, the output page is configured to determine how to store the results. The OutputType determines if the data is stored in a File or a Variable. If the option of File is selected, you must select a file connection manager to point to the file in which the results are stored or create a new File Connection Manager. If Variable is selected, you must select a variable or create a new Variable to store the results. Expressions page. The Expression page helps you to use the property expressions to dynamically update the properties of the Web Service task at run time by using variables.
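Under the covers, a Web service method call of this kind travels as a SOAP envelope over HTTP. Purely as an illustration, the sketch below builds a minimal envelope with Python's standard library; the method name and input element are hypothetical, and nothing is actually sent:

```python
import xml.etree.ElementTree as ET

# Hedged sketch: a minimal SOAP request envelope for a hypothetical Web
# service method. The method and parameter names are invented; no HTTP
# request is made here.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
method = ET.SubElement(body, "GetReportList")        # hypothetical method
ET.SubElement(method, "Folder").text = "/Sales"      # hypothetical input

request = ET.tostring(envelope, encoding="unicode")
print("GetReportList" in request)  # True
```

The Web Service task generates the equivalent request from the WSDL and the Input-page settings, then maps the response to the file or variable chosen on the Output page.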

Suppose you have an SSIS package that is used to populate a data warehouse and process an Analysis Services cube. A Web Service task could be used at the end of the package to call the ReportService2005 Web service. You can use the SetExecutionOptions method within that service to create a snapshot of the report that uses the data being populated. This way, report snapshots are created after the data warehouse is loaded and the cube processed, rather than relying on the snapshot being created on a schedule.

Message Queue task
The Message Queue task allows SSIS packages to integrate with the Microsoft Message Queue (MSMQ) component of Microsoft Windows. MSMQ helps you to send and receive messages between different systems that are not continuously connected; messages can be queued and delivered later if the destination is unavailable. This is useful in situations where an SSIS package is working with asynchronous data loads. When defining a queue to hold messages in Microsoft Message Queue, you define whether the queue is configured to send or receive messages. You also define the type of message that can be sent and received. If a message meets the defined message type, the Message Queue task will proceed to process the message. For example, you can configure a queue to proceed only if a specific string message is received by the queue. The message types that are available include:

- String message. When receiving messages, you can configure the task to compare the received string with a user-defined string and take action depending on the comparison. The comparison can be exact (case-sensitive), case-insensitive, or based on a substring.
- Variable message. You can configure the task to specify the names of the variables included in the message. When receiving messages, you can configure the task to specify both the package from which it can receive messages and the variable that is the destination of the message.
- Data file message. When receiving messages, you can configure the task to save the file, overwrite an existing file, and specify the package from which the task can receive messages.
- String message to variable. This option specifies the source message as a string that is sent to a destination variable. The String message to variable option is only available if the Message property of the Message Queue task is set to Receive message.
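The string comparison modes above can be sketched as follows. The mode names in this Python sketch are illustrative, not the exact SSIS option names.

```python
# Hypothetical sketch of the Message Queue task's string comparison modes:
# exact (case-sensitive), case-insensitive, or substring.
def string_matches(received, expected, compare="Exact"):
    if compare == "Exact":
        return received == expected            # case-sensitive comparison
    if compare == "IgnoreCase":
        return received.lower() == expected.lower()
    if compare == "Substring":
        return expected in received
    raise ValueError(f"unknown comparison mode: {compare}")
```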


Within the Message Queue Task Editor, the General page helps you to define a name and description for the task. The MSMQConnection property allows you to define an MSMQ connection manager, and the Message property defines whether the task sends or receives messages in the queue. If the MSMQ component is running on Microsoft Windows 2000, set the Use2000Format property to True to allow the task to format the messages in a Windows 2000 format. The Message property will change the Message Queue Task Editor pages based on the setting defined. If the Message property is set to Send message, a Send page appears and consists of the following properties:

- UseEncryption. This property determines whether the messages sent are encrypted.
- EncryptionAlgorithm. If UseEncryption is set to True, the EncryptionAlgorithm is the algorithm used to protect the message, which can be RC2 or RC4.
- MessageType. This property includes the options of String message, which adds a StringMessage property to type in a string value; Variable message, which adds a VariableMessage property to map to a package variable value; and Data file message, which adds a DataFileMessage property to point to a specific file.

If the Message property is set to Receive message, a Receive page appears and consists of the following properties:

- RemoveFromMessageQueue. This property indicates whether to remove the message from the queue after it is received. By default, this value is set to False.
- ErrorIfMessageTimeOut. This property indicates whether the task fails when the message times out, displaying an error message. The default is False.
- TimeoutAfter. If you set ErrorIfMessageTimeOut to True, specify the number of seconds to wait before displaying the time-out message.
- MessageType. This can be a value of String message, Variable message, Data file message, or String message to variable. These options dynamically change the remaining properties on the page to determine how the queue manages a particular message. For example, with the String message value selected, a Compare property and a CompareString property are added to the editor to configure the criteria for the string comparison that determines whether the receiving queue should proceed to process the message.
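The TimeoutAfter and ErrorIfMessageTimeOut behaviour can be sketched using Python's standard queue module as a local stand-in for an MSMQ queue; this is an illustration of the logic, not the MSMQ API.

```python
import queue

# Sketch of TimeoutAfter / ErrorIfMessageTimeOut: wait up to a number of
# seconds for a message; on timeout, either fail with an error or give up
# quietly (illustrative names, not the MSMQ API).
def receive_message(q, timeout_after=None, error_if_timeout=False):
    try:
        return q.get(timeout=timeout_after)
    except queue.Empty:
        if error_if_timeout:
            # Mirrors ErrorIfMessageTimeOut = True: the task fails.
            raise TimeoutError(f"no message within {timeout_after} seconds")
        return None
```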

The Message Queue task is useful in scenarios where your package is dependent on receiving data from other sources. For example, you may have a central data warehouse that relies on updates to the fact table from branch offices. There may be a number of Message Queue tasks that wait to receive the data from each branch office. On arrival, the Message Queue task will pass the data into your SSIS package to add the data to the fact table. When all of the branch office data has arrived, it will then allow the SSIS package to continue and complete the load of the data warehouse.
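The branch-office scenario above boils down to a simple readiness check: proceed only once a message has arrived from every expected branch. A minimal sketch, with illustrative names:

```python
# Sketch of the branch-office scenario: the warehouse load continues only
# when a message has arrived from every expected branch office.
def all_branches_arrived(received_messages, expected_branches):
    """received_messages: iterable of (branch, payload) pairs seen so far."""
    arrived = {branch for branch, _ in received_messages}
    return expected_branches.issubset(arrived)
```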


Scripting Tasks
You can add custom functionality in SSIS packages. A number of tasks are included to meet this requirement. Some of these tasks are as follows:

Script task
The Script task provides the ability to create custom tasks and transformations that are not provided by the built-in tasks in SSIS by using Microsoft Visual Basic (VB) or Microsoft Visual C# 2008. To use the Script task, the local machine on which the package runs must have Microsoft Visual Studio Tools for Applications installed. This provides a rich environment for building the custom scripts, including IntelliSense and its own Object Explorer. You can access Microsoft Visual Studio Tools for Applications from within the Script task by clicking the Edit Script button in the Script Task Editor. When Edit Script is clicked, Microsoft Visual Studio Tools for Applications opens an empty new project or reopens the existing project. The creation of this project does not affect the deployment of the package, because the project is saved inside the package file; the Script task does not create additional files. There are also additional properties that require configuration when using the Script task.

- General page. The General page is used to provide a name and description for the Script task.
- Script page. The ScriptLanguage property allows you to define the language that is used to create the script within the Script task; this can be set to either Microsoft Visual Basic 2008 or Microsoft Visual C# 2008. The EntryPoint provides the starting point within the Script task that the SSIS run-time engine calls first when running the Script task within the package. The name defined in this property must match the name of the method within the script in the Microsoft Visual Studio Tools for Applications Designer. You can define which variables can be used by the script by using the ReadOnlyVariables and ReadWriteVariables properties to select the variables to use. Multiple variables must be specified as a comma-delimited list.
- Expressions page. The Expressions page in the Script Task Editor helps you to use property expressions to dynamically update the properties of the Script task at run time by using variables.
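How the EntryPoint name and the comma-delimited variable lists behave can be sketched as follows. The class and method names in this Python sketch are hypothetical; the real host is the SSIS runtime, and the real script would be VB or C#.

```python
# Sketch of the Script task's EntryPoint and comma-delimited
# ReadOnlyVariables / ReadWriteVariables settings (illustrative names).
class ScriptHost:
    def __init__(self, read_only_variables="", read_write_variables=""):
        # Variable lists are comma-delimited strings.
        split = lambda s: [v.strip() for v in s.split(",") if v.strip()]
        self.read_only = split(read_only_variables)
        self.read_write = split(read_write_variables)

    def run(self, entry_point):
        # EntryPoint must match a method name, otherwise the task fails.
        method = getattr(self, entry_point, None)
        if method is None:
            raise AttributeError(f"entry point '{entry_point}' not found")
        return method()

    def Main(self):
        return self.read_only + self.read_write
```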

The Script task is a powerful feature that helps you to overcome situations where a built-in task or transformation cannot provide the required functionality. Examples include connecting to custom data sources.

ActiveX Script task
The ActiveX Script task helps you to add custom logic to SSIS packages by using VBScript, JScript, or any scripting language that is installed on the local machine on which the package resides. In Data Transformation Services (DTS), ActiveX scripts provided the only means to add custom functionality to a DTS package. The ActiveX Script task, therefore, is provided for backward compatibility and is a deprecated feature of SSIS 2008. You should consider migrating the logic of scripts from within an ActiveX Script task to a Script task. Within the ActiveX Script task, you can configure the following options:


- General page. The General page is used to provide a name and description for the ActiveX Script task.
- Script page. The Language property helps you to specify the scripting language used to add the custom functionality and is dependent on the languages installed on the local computer. The Script property helps you to type in the script for the custom logic; however, you can click the Browse button in the ActiveX Script Editor to browse for an existing script file, which would overwrite any script typed within the Script property. The Save button can save the script typed in the Script property to a file, and the Parse button can be used to verify the syntax of the script defined within the Script property. The EntryMethod property defines the entry point to the script when the SSIS package calls the ActiveX Script task. The name defined here must match the name defined within the script.
- Expressions page. The Expressions page in the ActiveX Script Task Editor helps you to use property expressions to dynamically update the properties of the ActiveX Script task at run time by using variables.

Prior to SSIS, the ActiveX Script task was used in Data Transformation Services (DTS), the predecessor to SSIS, to perform loops within DTS packages. Within SSIS, the For Loop Container and Foreach Loop Container now provide the same functionality. Consider replacing ActiveX Script tasks that perform this logic with For Loop Containers and Foreach Loop Containers.

Execute Process task
The Execute Process task is typically used to call batch files that can perform command prompt tasks, such as creating text files of data or calling applications. The following properties can be configured within the Execute Process task:

- General page. Use the General page to define a name and description for the Execute Process task.
- Process page. The Process page is used to determine the behaviour of the process that is to be executed by the task. The RequireFullFileName property can be set to True or False and will fail the task if the full file name for the process is not found. The Executable property defines the name of the executable, and the Arguments property can be used to pass in arguments as the executable is running. The path for the executable can be specified by using the WorkingDirectory property. The executable can also work with variables: the StandardInputVariable provides the variable inputs to a process, and the StandardOutputVariable can capture output from a process. In both cases, you must specify a variable to which these properties map, should they be required. You can use the SuccessValue property to define which integer value indicates a successful execution of a process; the default is 0 (zero). The FailTaskIfReturnCodeIsNotSuccessValue property relates to this and fails the task if the success value is not returned to the task. You can also specify a TimeOut value for the number of seconds a process can run and, in conjunction with the TerminateProcessAfterTimeOut property, indicate whether the process is forced to end if the time-out value is reached. You can also use the WindowStyle property to determine whether a window is shown while the process is running.


- Expressions page. The Expressions page in the Execute Process Task Editor helps you to use property expressions to dynamically update the properties of the Execute Process task at run time by using variables.

Consider using the Execute Process task if your SSIS package can only interact with a third-party application through a batch file. Create the batch file to interact with the third-party application first, and then use the Execute Process task to call the batch file when needed within the package.
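The SuccessValue, TimeOut and FailTaskIfReturnCodeIsNotSuccessValue behaviour can be sketched with Python's subprocess module standing in for the task; this illustrates the logic, not how SSIS itself invokes processes.

```python
import subprocess

# Sketch of the Execute Process task's return-code and time-out handling:
# run an executable, enforce a timeout, and fail unless the return code
# matches the expected success value (illustrative names).
def execute_process(executable, arguments=None, success_value=0,
                    timeout=None, fail_if_not_success=True):
    try:
        completed = subprocess.run([executable] + list(arguments or []),
                                   timeout=timeout)
    except subprocess.TimeoutExpired:
        # Mirrors TerminateProcessAfterTimeOut ending the process.
        raise TimeoutError(f"process exceeded {timeout} seconds")
    if fail_if_not_success and completed.returncode != success_value:
        raise RuntimeError(f"return code {completed.returncode}, "
                           f"expected {success_value}")
    return completed.returncode
```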


Database Object Transfer Tasks


SSIS provides a number of tasks that help you to access, copy, insert, delete, and modify SQL Server objects and data. This can be useful for one-off tasks that may need to be performed by a database administrator.
Transfer Database task
The Transfer Database task helps you to transfer a database from one instance of SQL Server 2000 or later to another instance of SQL Server. This process can occur while the database is online or offline; however, the transfer is quicker if you perform it while the database is offline. You can also move the database between separate instances that exist on the same physical server. You can configure the following properties in the Transfer Database task:

- General page. The General page is used to provide a name and description for the Transfer Database task.
- Databases page. You can define the source and destination connection information by using the SourceConnection and DestinationConnection properties. The DestinationDatabaseName and DestinationDatabaseFiles properties can be used to determine the database name and the file names and locations of the database files of the destination database. You can also use the DestinationOverwrite property to determine whether the database on the destination server can be overwritten. The Action property determines whether the database is copied or moved, and the Method property determines whether the database is transferred in online or offline mode. You use the SourceDatabaseName and SourceDatabaseFiles properties to specify the database that is transferred, and the ReattachSourceDatabase property determines whether the task will try to reattach the source database should the task fail.
- Expressions page. The Expressions page in the Transfer Database Task Editor helps you to use property expressions to dynamically update the properties of the Transfer Database task at run time by using variables.

The Transfer Database task is useful if you want to move a database from an instance of SQL Server on old hardware to a new instance of SQL Server on a server with new hardware. Organizations may also use this task to create a copy of the database on a development SQL Server.

Transfer SQL Server Objects task
You can transfer specific SQL Server objects from a database on an instance of SQL Server 2000 or later by using the Transfer SQL Server Objects task. However, if you transfer objects from SQL Server 2000, objects such as schemas, partition functions, and partition schemes will not be supported. You use the Transfer SQL Server Objects task to transfer all objects, all objects of a type, or only specified objects of a type. You can configure the Transfer SQL Server Objects task to include schema names, data, extended properties of transferred objects, and dependent objects in the transfer. When copying data, you can specify whether to replace or append the existing data. In the Transfer SQL Server Objects task, you can configure the following properties:


- General page. The General page is used to provide a name and description for the Transfer SQL Server Objects task.
- Databases page. You can define the source and destination connection information by using the SourceConnection and DestinationConnection properties. The SourceDatabase and DestinationDatabase properties can be used to determine the databases that objects are moved from and to. There is a wide range of additional properties that can be configured, and some are dependent on how other properties are configured.

Additional Properties in the Transfer SQL Server Objects Task

Static Options
- DropObjectsFirst. Select whether the selected objects will be dropped first on the destination server before copying.
- IncludeExtendedProperties. Select whether extended properties will be included when objects are copied from the source to the destination server.
- CopyData. Select whether data will be included when objects are copied from the source to the destination server.
- ExistingData. Specify how data will be copied to the destination server. This property has the options listed in the following table:

Value    Description
Replace  Data on the destination server will be overwritten.
Append   Data copied from the source server will be appended to existing data on the destination server.

Note: The ExistingData option is only available when CopyData is set to True.

- CopySchema. Select whether the schema is copied during the Transfer SQL Server Objects task.
- UseCollation. Select whether the transfer of objects should include the collation specified on the source server.
- IncludeDependentObjects. Select whether copying the selected objects will cascade to include other objects that depend on the objects selected for copying.
- CopyAllObjects. Select whether the task will copy all objects in the specified source database or only selected objects. Setting this option to False gives you the option to select the objects to transfer, and displays the dynamic options listed under CopyAllObjects = False in the Dynamic Options section.
- ObjectsToCopy. Expand ObjectsToCopy to specify which objects should be copied from the source database to the destination database.

Note: ObjectsToCopy is only available when CopyAllObjects is set to False.
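The ExistingData options above can be sketched as follows; this Python sketch only illustrates the Replace-versus-Append semantics, not the task's internals.

```python
# Sketch of the ExistingData options: Replace overwrites the destination
# data, Append adds the source rows after the existing destination rows.
def copy_data(source_rows, destination_rows, existing_data="Replace"):
    if existing_data == "Replace":
        return list(source_rows)
    if existing_data == "Append":
        return list(destination_rows) + list(source_rows)
    raise ValueError("ExistingData must be 'Replace' or 'Append'")
```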


The options to copy the following types of objects are supported only on SQL Server 2005 or later:

- Assemblies
- Partition functions
- Partition schemes
- Schemas
- User-defined aggregates
- User-defined types
- XML schema collections

- CopyDatabaseUsers. Specify whether database users should be included in the transfer.
- CopyDatabaseRoles. Specify whether database roles should be included in the transfer.
- CopySqlServerLogins. Specify whether SQL Server logins should be included in the transfer.
- CopyObjectLevelPermissions. Specify whether object-level permissions should be included in the transfer.
- CopyIndexes. Specify whether indexes should be included in the transfer.
- CopyTriggers. Specify whether triggers should be included in the transfer.
- CopyFullTextIndexes. Specify whether full-text indexes should be included in the transfer.
- CopyPrimaryKeys. Specify whether primary keys should be included in the transfer.
- CopyForeignKeys. Specify whether foreign keys should be included in the transfer.
- GenerateScriptsInUnicode. Specify whether the generated transfer scripts are in Unicode format.

Dynamic Options
CopyAllObjects = False

- CopyAllTables. Select whether the task will copy all tables in the specified source database or only the selected tables.
- TablesList. Click to open the Select Tables dialog box.
- CopyAllViews. Select whether the task will copy all views in the specified source database or only the selected views.
- ViewsList. Click to open the Select Views dialog box.
- CopyAllStoredProcedures. Select whether the task will copy all user-defined stored procedures in the specified source database or only the selected procedures.
- StoredProceduresList. Click to open the Select Stored Procedures dialog box.


- CopyAllUserDefinedFunctions. Select whether the task will copy all user-defined functions in the specified source database or only the selected UDFs.
- UserDefinedFunctionsList. Click to open the Select User Defined Functions dialog box.
- CopyAllDefaults. Select whether the task will copy all defaults in the specified source database or only the selected defaults.
- DefaultsList. Click to open the Select Defaults dialog box.
- CopyAllUserDefinedDataTypes. Select whether the task will copy all user-defined data types in the specified source database or only the selected user-defined data types.
- UserDefinedDataTypesList. Click to open the Select User-Defined Data Types dialog box.
- CopyAllPartitionFunctions. Select whether the task will copy all user-defined partition functions in the specified source database or only the selected partition functions. Supported only on SQL Server 2005 or later.
- PartitionFunctionsList. Click to open the Select Partition Functions dialog box.
- CopyAllPartitionSchemes. Select whether the task will copy all partition schemes in the specified source database or only the selected partition schemes. Supported only on SQL Server 2005 or later.
- PartitionSchemesList. Click to open the Select Partition Schemes dialog box.
- CopyAllSchemas. Select whether the task will copy all schemas in the specified source database or only the selected schemas. Supported only on SQL Server 2005 or later.
- SchemasList. Click to open the Select Schemas dialog box.
- CopyAllSqlAssemblies. Select whether the task will copy all SQL assemblies in the specified source database or only the selected SQL assemblies. Supported only on SQL Server 2005 or later.
- SqlAssembliesList. Click to open the Select SQL Assemblies dialog box.
- CopyAllUserDefinedAggregates. Select whether the task will copy all user-defined aggregates in the specified source database or only the selected user-defined aggregates. Supported only on SQL Server 2005 or later.
- UserDefinedAggregatesList. Click to open the Select User-Defined Aggregates dialog box.
- CopyAllUserDefinedTypes. Select whether the task will copy all user-defined types in the specified source database or only the selected UDTs. Supported only on SQL Server 2005 or later.
- UserDefinedTypes. Click to open the Select User-Defined Types dialog box.
- CopyAllXmlSchemaCollections. Select whether the task will copy all XML schema collections in the specified source database or only the selected XML schema collections. Supported only on SQL Server 2005 or later.
- XmlSchemaCollectionsList. Click to open the Select XML Schema Collections dialog box.


- Expressions page. The Expressions page in the Transfer SQL Server Objects Task Editor helps you to use property expressions to dynamically update the properties of the Transfer SQL Server Objects task at run time by using variables.

The Transfer SQL Server Objects task allows you to be more selective than the Transfer Database task about the specific objects to move between databases in SQL Server. Use this task when you want to incorporate objects from one SQL Server database into your own database.

Transfer Error Messages task
The Transfer Error Messages task helps you to transfer one or more user-defined error messages from one instance of SQL Server to another. User-defined error messages are any messages that have a message identity (ID) of 50,000 or above. You can also configure the Transfer Error Messages task to move only messages that are defined in a specific language. In the Transfer Error Messages task, you can configure the following properties:

- General page. The General page is used to provide a name and description for the Transfer Error Messages task.
- Messages page. You can define the source and destination connection information by using the SourceConnection and DestinationConnection properties. You can also determine what should happen with error messages that already exist within the destination instance by using the IfObjectExists property to overwrite existing user-defined error messages, skip existing messages, or fail the task if error messages already exist. The TransferAllErrorMessages property allows you to set the value to True or False to determine whether all or only specific error messages are transferred. If set to False, use the ErrorMessageList property to select the specific messages to move. You can use the ErrorMessageLanguageList property to select the languages of the error messages to move across.
- Expressions page. The Expressions page in the Transfer Error Messages Task Editor enables the use of property expressions to dynamically update the properties of the Transfer Error Messages task at run time by using variables.
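The two rules that drive this task — only IDs of 50,000 or above are user-defined, and a collision is resolved by the overwrite/skip/fail choice — can be sketched as follows. The function name and option strings are illustrative, not the SSIS API.

```python
# Sketch of the Transfer Error Messages rules: messages below ID 50,000
# are system messages and are never transferred; on a collision, the
# IfObjectExists-style choice decides the outcome (illustrative names).
def transfer_messages(source, destination, if_object_exists="Overwrite"):
    """source/destination: dicts mapping message ID -> message text."""
    for msg_id, text in source.items():
        if msg_id < 50000:
            continue  # system messages are never transferred
        if msg_id in destination:
            if if_object_exists == "Skip":
                continue
            if if_object_exists == "FailTask":
                raise RuntimeError(f"message {msg_id} already exists")
        destination[msg_id] = text
    return destination
```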

Suppose you create user-defined error messages for use within a SQL Server instance. SSIS provides a task exclusively for the purpose of transferring these user-defined messages from one SQL Server instance to another, negating the need to manually recreate the messages.

Transfer Jobs task
The Transfer Jobs task helps you to transfer one or more SQL Server Agent jobs from one instance of SQL Server to another. You can choose all jobs or specific jobs to move and determine whether the jobs should be enabled at the destination server.

- General page. The General page is used to provide a name and description for the Transfer Jobs task.
- Jobs page. You can define the source and destination connection information by using the SourceConnection and DestinationConnection properties. The TransferAllJobs property allows you to set the value to True or False to determine whether all or only specific jobs are transferred. If set to False, you use the JobsList property to select the specific jobs to move. You can also determine what should happen with jobs of the same name that already exist within the destination instance by using the IfObjectExists property to overwrite existing jobs, skip existing jobs, or fail the task if the job already exists. You can set the EnableJobsAtDestination property to True or False to determine whether the jobs are enabled at the destination.
- Expressions page. The Expressions page in the Transfer Jobs Task Editor helps you to use property expressions to dynamically update the properties of the Transfer Jobs task at run time by using variables.

Similar to the Transfer Error Messages task, you can use the Transfer Jobs task to move all or specific jobs from one instance of SQL Server Agent to another without the need to manually recreate the jobs.

Transfer Logins task
Using the Transfer Logins task, you can transfer one or more SQL Server logins from one instance of SQL Server to another. This excludes the sa account, even if it has been renamed. You can choose all logins or specific logins to move and determine whether the security identifier (SID) of the login should be copied to the destination server.

- General page. The General page is used to provide a name and description for the Transfer Logins task.
- Logins page. You can define the source and destination connection information by using the SourceConnection and DestinationConnection properties. The LoginsToTransfer property allows you to set the value to AllLogins, SelectedLogins, or AllLoginsFromSelectedDatabases. If set to SelectedLogins, you use the LoginsList property to select the specific logins to transfer. You can also determine what should happen with logins of the same name that already exist within the destination instance by using the IfObjectExists property to overwrite existing logins, skip existing logins, or fail the task if the login already exists. You can set the CopySids property to True or False to determine whether the login's SID is copied to the destination.
- Expressions page. The Expressions page in the Transfer Logins Task Editor helps you to use property expressions to dynamically update the properties of the Transfer Logins task at run time by using variables.

You can use the Transfer Logins task to move logins from one instance of SQL Server to another without having to manually recreate them.

Transfer Master Stored Procedures task
If you create user-defined stored procedures and store them within the master database, you can use the Transfer Master Stored Procedures task to transfer one or more of those stored procedures from one instance of SQL Server to another. In the Transfer Master Stored Procedures task, you can configure the following properties:

- General page. The General page is used to provide a name and description for the Transfer Master Stored Procedures task.
- Stored Procedures page. You can define the source and destination connection information by using the SourceConnection and DestinationConnection properties. The TransferAllStoredProcedures property allows you to set the value to True or False. If set to False, you use the StoredProceduresList property to select the specific stored procedures to transfer. You can also determine what should happen with stored procedures of the same name that already exist within the destination instance by using the IfObjectExists property to overwrite existing stored procedures, skip existing stored procedures, or fail the task if the stored procedures already exist.
- Expressions page. The Expressions page in the Transfer Master Stored Procedures Task Editor helps you to use property expressions to dynamically update the properties of the Transfer Master Stored Procedures task at run time by using variables.

The Transfer Master Stored Procedures task can only be used if you store your own user-defined stored procedures within the master database.


Adding Tasks to the Control Flow


Adding tasks to the Control Flow Designer is straightforward. You can add tasks by dragging a task from the Toolbox in Business Intelligence Development Studio to the Control Flow Designer. You can also connect Control Flow tasks together through precedence constraints.


Implementing Control Flow Tasks: Part 2


Introduction
Lesson Introduction
SSIS provides additional categories of Control Flow tasks that can be used to interact with components outside of SSIS.
Lesson Objectives
After completing this lesson, you will be able to:

- Work with Package Execution tasks.
- Work with Analysis Services tasks.
- Work with Maintenance tasks.
- Work with Windows Management Instrumentation tasks.
- Configure Control Flow tasks.


Package Execution Tasks


You can manage multiple SSIS and DTS packages within a single SSIS package by using one of the Package Execution tasks. This can prove very useful when complex data loads need to be broken into modular packages for management purposes. In addition, a package can be reused in different SSIS packages.

Execute Package task
The Execute Package task enables an SSIS package to run other SSIS packages. This helps you to break down complex data loads into separate packages and then bring them together under a single package. You can then add constraints to control the workflow of all the packages within a single SSIS package. The package that is used to run the SSIS packages within it is referred to as the parent package. The packages running within it are referred to as child packages. By default, child packages use the same process as the parent package. However, if the ExecuteOutOfProcess property is set to True, the child package will run in its own process. This independence means that if a child package fails, the parent package can continue to run; however, more memory is required to run the child package's process. You can also pass variable values from a parent package to a child package so that you can customize the settings within a child package based on variable values that are defined at the parent level. For example, a Parent Package System Variable named System::StartTime may be passed on to the child package as a basis for determining the date that it should use as the starting point for retrieving data. You must use a Package Configuration to pass a Parent Package Variable to a child package. In the Execute Package task, you can configure the following properties:

- General page. The General page is used to provide a name and description for the Execute Package task.
- Package page. You use the Location property to determine whether the package is retrieved from SQL Server or the File System. If SQL Server is selected, you must configure the Connection and PackageName properties. If File System is selected, you configure a connection instead, which populates the PackageNameReadOnly property with the name of the File System package. If the child package is password protected, you must specify the password in the Password property. You can also determine whether the child package should execute in its own process by setting the ExecuteOutOfProcess property to True. The default is False.
- Expressions page. The Expressions page in the Execute Package Task Editor helps you to use property expressions to dynamically update the properties of the Execute Package task at run time by using variables.

The Execute Package task is commonly used to bring together multiple SSIS packages and add constraints to control the order in which the packages should run. This can be useful when loading a data warehouse as well. You can encapsulate separate packages to load the different parts of the data warehouse and then integrate those separate packages within a single package to control the workflow by using the Execute Package task.
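The parent-to-child variable passing described above amounts to mapping named parent values onto named child variables through a configuration. A minimal Python sketch, where the mapping dict and function names are hypothetical stand-ins for the SSIS package configuration mechanism:

```python
# Sketch of passing a Parent Package Variable to a child package through
# a package configuration (illustrative names, not the SSIS API).
def run_child_package(child, parent_variables, configuration):
    """configuration maps parent variable names to child variable names."""
    child_variables = {child_name: parent_variables[parent_name]
                       for parent_name, child_name in configuration.items()}
    # The child package then reads its own variables as usual.
    return child(child_variables)
```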


Execute DTS 2000 Package task
If you run SQL Server DTS packages, SSIS contains an Execute DTS 2000 Package task to run the DTS package from within an SSIS package. DTS makes use of outer and inner variables: outer variables are the equivalent of parent package variables in SSIS, and inner variables are the equivalent of child package variables. These can be used to pass variable values from a parent package to a DTS package. To run this successfully, you must first install the run-time support for DTS packages. This provides a DTS shell that enables you to manage DTS packages on SQL Server 2008 and run the Execute DTS 2000 Package task. In the Execute DTS 2000 Package task, you can configure the following properties:

- General page. The General page is used to provide a name and description for the Execute DTS 2000 Package task. The StorageLocation property allows you to specify where the DTS package is located; this can have a value of SQL Server, Structured Storage File, or Embedded in Task. After this is set, you use the PackageName property to select the DTS package to execute. Any password associated with the DTS package must be defined in the Password property. You can update the PackageID property, and the EditPackage property allows you to edit the DTS package if DTS design-time support has been configured.
- Inner Variables page. You use the Name and Type properties to define the name and data type of the inner variable. The Value property is used to set the value of the inner variable. Within this page, a New button is available to create a new inner variable, and a Delete button is available to remove inner variables.
- Outer Variables page. You use the Name property to define the name of the outer variable that is provided by a parent package. On this page, a New button is available to create a new outer variable, and a Delete button is available to remove outer variables.
- Expressions page. The Expressions page in the Execute DTS 2000 Package Task Editor helps you to use property expressions to dynamically update the properties of the Execute DTS 2000 Package task at run time by using variables.

Although DTS support is enabled within SSIS, consider upgrading DTS packages to SSIS packages, because this is a deprecated feature available for backward compatibility only.


Analysis Services Tasks


Interaction with SQL Server Analysis Services (SSAS) is possible by utilizing the Analysis Services tasks within your SSIS packages.

Analysis Services Processing task
The Analysis Services Processing task helps you to process objects within Analysis Services, including cubes, partitions, dimensions and data mining models. You can process everything or be selective about the processing that occurs. You can also process different objects at the same time within a batch by specifying that the objects are processed in parallel, which can speed up the processing of Analysis Services objects. Alternatively, you can process the objects in sequence. You can configure the following properties in the Analysis Services Processing task:

- General page. The General page is used to provide a name and description for the Analysis Services Processing task.
- Analysis Services page. The Analysis Services connection manager property allows you to define an existing connection manager to Analysis Services; alternatively, you can use the New button to create a new connection manager. The object list allows you to specify the Object Name, Type, Process Options and Settings for each Analysis Services object to be processed, and you can use the Add or Remove button to add an object to, or remove it from, the list. The Batch Settings Summary helps you to define the Processing order and Transaction mode for the batch. You can also define the behavior of the task when there are Dimension errors, specify a path for the log file by using the Dimension key error log path, and control whether dependent objects are processed by using the Process affected objects option. These settings can be changed by using the Change Settings button. The Impact Analysis button opens another dialog box that shows the impact of processing an object on dependent objects.
- Expressions page. The Expressions page in the Analysis Services Processing Task Editor helps you to use property expressions to dynamically update the properties of the Analysis Services Processing task at run time by using variables.
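Under the covers, Analysis Services processing is expressed as an XMLA Process command sent to the server. The following is a minimal sketch of such a command; the database and cube identifiers are assumptions for illustration:

```xml
<!-- Hypothetical XMLA Process command; DatabaseID and CubeID are assumed names -->
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
    <CubeID>Adventure Works</CubeID>
  </Object>
  <Type>ProcessFull</Type>
</Process>
```

ProcessFull fully rebuilds the object; other values such as ProcessUpdate or ProcessData correspond to the process options you choose in the object list.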

The Analysis Services Processing task is commonly used in SSIS packages that load data warehouses. Typically, it is the last task added to the package and is used to process objects within Analysis Services after the data has been loaded into the data warehouse.

Data Mining Query task
The Data Mining Query task helps you to run Data Mining Extensions (DMX) statements, which use prediction statements against a mining model. Prediction queries help you to use data mining to make predictions about, for example, sales or inventory figures. You can then output the results to a new table. Unlike other SSIS tasks, the Data Mining Query task presents its properties in separate tabs. Note that the Name and Description properties are available on all tabs.

- Mining Model tab. The Mining Model tab is used to provide an existing connection to the Analysis Services database; you can specify a new connection by clicking the New button. The Mining Structure property allows you to specify the data mining structure that is used as the basis for analysis.
- Query tab. The Query tab consists of three additional tabs. The Build Query tab allows you to write the DMX prediction query. If parameters are defined within the prediction query, you can use the Parameter Mapping tab to map a DMX parameter to an SSIS variable. The Result Set tab allows you to map the results returned from a prediction query to an SSIS variable, which could be defined as a single row result or a full result set.
- Output tab. The Output tab can be used to output the results to a dataset by using ADO.NET or to a table by using OLE DB connection managers. Within this tab, you can define an existing connection or create a new connection by using the New button. Use the output table property to define the table in which the prediction query results are stored, and choose whether to drop and re-create the output table.
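As an illustration, a prediction query entered on the Build Query tab might look like the following DMX; the mining model, source data and column names are all hypothetical:

```sql
-- Hypothetical DMX prediction query; model, data source and columns are assumed
SELECT
  t.[CustomerKey],
  Predict([TM Decision Tree].[Bike Buyer]) AS [PredictedBuyer]
FROM
  [TM Decision Tree]
PREDICTION JOIN
  OPENQUERY([Adventure Works DW],
    'SELECT CustomerKey, Age, YearlyIncome FROM dbo.ProspectiveBuyer') AS t
ON
  [TM Decision Tree].[Age] = t.[Age] AND
  [TM Decision Tree].[Yearly Income] = t.[YearlyIncome]
```

A parameter such as @MinimumAge could also be referenced in the query and mapped to an SSIS variable on the Parameter Mapping tab.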

Note that this is one of the few tasks where property expressions cannot be used. The Data Mining Query task helps you to issue DMX prediction queries against a pre-existing data mining structure. If your data load requirements involve automated prediction on existing data, you can use the Data Mining Query task to fulfill this requirement.

Analysis Services Execute DDL task
The Analysis Services Execute DDL task runs data definition language (DDL) statements that can create, drop or alter multidimensional objects such as cubes, dimensions and mining models. You can configure the following properties in the Analysis Services Execute DDL task:

- General page. The General page is used to provide a name and description for the Analysis Services Execute DDL task.
- DDL page. You can use the Connection property to connect to an instance of Analysis Services. The DDL statement can come from one of three sources, controlled by the Source Type property. With Direct Input, the statement is entered manually into the task editor in the Source Direct property. With File, the page changes to show a Source property that lets you define a file connection to the file that holds the DDL statement. With Variable, the Source property instead refers to an existing variable, or you can create a new variable.
- Expressions page. The Expressions page in the Analysis Services Execute DDL Task Editor helps you to use property expressions to dynamically update the properties of the Analysis Services Execute DDL task at run time by using variables.
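Note that the DDL in question is XMLA/ASSL rather than Transact-SQL. A minimal sketch of a statement that could be supplied as Direct Input, with an assumed database identifier:

```xml
<!-- Hypothetical XMLA command that drops an Analysis Services database -->
<Delete xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>StagingCubeDB</DatabaseID>
  </Object>
</Delete>
```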

You can use the Analysis Services Execute DDL task if you want to automate the creation of Analysis Services objects such as cubes or dimensions.


Maintenance Tasks
Many database maintenance operations can be performed by using the Maintenance tasks. The same tasks are used by maintenance plans in SQL Server Management Studio; they are integrated within SSIS to provide maintenance capabilities inside packages.

Back Up Database task
You can perform database backups within your SSIS packages for a database, filegroup, database file or log file. You can perform the following types of backups:

- Full backup
- Differential backup
- Filegroup backup
- File backup
- Log backup

You can also use many of the options that are available when performing a backup in SQL Server Management Studio, including appending backups to a backup file and compressing backups.

Check Database Integrity task
The Check Database Integrity task helps you to check the integrity of all the objects within one or more databases and to check how the database is allocated to the file system. Within this task, a connection needs to be specified to the instance of SQL Server on which the databases reside, and then you can select the databases on which to perform the integrity check.

Execute SQL Server Agent Job task
You can interact with SQL Server Agent jobs by using the Execute SQL Server Agent Job task. By providing connection information to an instance of SQL Server Agent, you are presented with a list of jobs that are present on the server. You can then select the job that you want the task to execute.

Notify Operator task
Your SSIS packages can interact with operators that have been defined within an instance of SQL Server Agent. By providing connection information to the instance of SQL Server Agent, a list of operators that are present within the instance appears. You can also add a specific subject and message within the task so that the operator knows the notification came from an SSIS package.
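The Back Up Database and Check Database Integrity tasks wrap standard Transact-SQL commands. Rough equivalents are sketched below; the database name and backup path are assumptions:

```sql
-- Approximate T-SQL equivalents; database name and path are assumed
BACKUP DATABASE AdventureWorks
TO DISK = N'C:\Backups\AdventureWorks.bak'
WITH COMPRESSION, NOINIT;

DBCC CHECKDB (AdventureWorks) WITH NO_INFOMSGS;
```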


Execute T-SQL Statement task
This task helps you to execute a Transact-SQL statement. By providing the connection information, the statement runs against the instance that is defined within the connection property. You can also define a connection time-out in the task. The Execute T-SQL Statement task is not as flexible as the Execute SQL task: it does not allow parameters to be mapped to variables that exist within the package, and you cannot map the result set of a Transact-SQL statement to a variable.

Rebuild Index task and Reorganize Index task
You can use the Rebuild Index task and the Reorganize Index task to manage the indexes in one or more databases. Provide the connection information to the instance of SQL Server that holds the databases, and then select the databases to which the task will apply. If a single database is selected, you can also choose the specific tables or views that the task will affect; if multiple databases are selected, you do not get this choice.

Update Statistics task
You can update the statistical information of one or more databases by using the Update Statistics task. If one database is selected within the task, you can choose which tables or views the Update Statistics task will affect. If multiple databases are selected, the task will update all the statistics within each database.

Shrink Database task
The Shrink Database task executes the DBCC SHRINKDATABASE statement against one or more databases to reduce the size of each database as much as possible. After providing connection information, you can select the databases that the task will affect. You then determine the amount of unused space to remain in the database after it is shrunk (the larger the percentage, the less the database can shrink). The value is based on the percentage of the actual data in the database.
For example, a 100 MB database containing 60 MB of data and 40 MB of free space, with a free space percentage of 50 percent, would result in 60 MB of data and 30 MB of free space (because 50 percent of 60 MB is 30 MB). Only excess space in the database is eliminated.
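The scenario above corresponds to running the underlying statement with a target of 50 percent free space; a sketch, with the database name assumed:

```sql
-- Shrink the database, leaving 50 percent free space (database name is assumed)
DBCC SHRINKDATABASE (AdventureWorks, 50);
```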


History Cleanup task
The History Cleanup task deletes entries in the following history tables in the SQL Server msdb database:

- backupfile
- backupfilegroup
- backupmediafamily
- backupmediaset
- backupset
- restorefile
- restorefilegroup
- restorehistory

By using the History Cleanup task, a package can delete historical data related to backup and restore activities, SQL Server Agent jobs and database maintenance plans. The task includes a property for specifying the oldest date of data retained in the history tables. You can indicate the date by a number of days, weeks, months or years from the current day, and the task automatically translates the interval to a date.

Maintenance Cleanup task
The Maintenance Cleanup task removes files related to maintenance plans, including database backup files and reports created by maintenance plans. By using the Maintenance Cleanup task, a package can remove the backup files or maintenance plan reports on the server that is defined by the connection information. The Maintenance Cleanup task includes an option to remove a specific file or to remove a group of files in a folder. Optionally, you can specify the extension of the files to delete.
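For the backup and restore tables, the retention behavior of the History Cleanup task resembles calling the msdb sp_delete_backuphistory procedure; a hedged sketch, with the four-week cutoff chosen purely for illustration:

```sql
-- Delete backup and restore history older than four weeks (interval is assumed)
DECLARE @cutoff datetime;
SET @cutoff = DATEADD(WEEK, -4, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @cutoff;
```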


Windows Management Instrumentation Tasks


SSIS can use Windows Management Instrumentation (WMI) to return information about the state of the computer's hardware and software. This can be useful for triggering SSIS packages based on the condition of the computer by using the following tasks.

WMI Event Watcher task
Leveraging WMI, a component of the Windows subsystem, you can use the WMI Query Language (WQL) to set up events that notify an SSIS package about events that occur on the Windows system. This can include monitoring the free space on a disk or available memory. The following properties can be configured for the WMI Event Watcher task:

- General page. The General page is used to provide a name and description for the WMI Event Watcher task.
- WMI Options page. You define a connection by using the WMIConnectionName property. The WQL query can be defined from one of three sources by using the WMIQuerySourceType property: Direct Input, File or Variable. This adjusts the WMIQuerySource property, allowing you to type in the WQL query directly, point to a file by using a file connection manager, or use an existing or new variable, respectively. Additional properties include ActionAtEvent, which determines whether the event is logged or a package action is initiated, and AfterEvent, which determines whether the task succeeds or fails after receiving the WMI event. The Timeout property specifies a time-out value for the event; ActionAtTimeout determines whether the time-out is logged or a package action is initiated, and AfterTimeout determines whether the task succeeds or fails when the time-out occurs. You can also specify the number of events to watch for by using the NumberOfEvents property.
- Expressions page. The Expressions page in the WMI Event Watcher Task Editor helps you to use property expressions to dynamically update the properties of the WMI Event Watcher task at run time by using variables.

A WMI Event Watcher task can be used to monitor an event in the Windows system that triggers a package to execute. For example, a WMI Event Watcher task could monitor files being added to a folder. When the event occurs, the WMI Event Watcher task receives a notification, which can then allow the package to run by using the files that have been loaded. The WQL to monitor files added to a folder is:
SELECT * FROM __InstanceCreationEvent WITHIN 10 WHERE TargetInstance ISA "CIM_DirectoryContainsFile" and TargetInstance.GroupComponent= "Win32_Directory.Name=\"c:\\\\WMIFileWatcher\""


WMI Data Reader task
You can use the WMI Query Language (WQL) along with the WMI Data Reader task to return information about the server on which the SSIS packages run. Information that is provided by the WMI Data Reader task can then be stored in variables that are used to populate values in other properties of the SSIS package. In the WMI Data Reader task, you can configure the following properties:

- General page. The General page is used to provide a name and description for the WMI Data Reader task.
- WMI Options page. You define a connection by using the WMIConnectionName property. The WQL query can be defined from one of three sources by using the WMIQuerySourceType property: Direct Input, File or Variable. This adjusts the WMIQuerySource property, allowing you to type in the WQL query directly, point to a file by using a file connection manager, or use an existing or new variable, respectively. Additional properties include OutputType, which determines whether the results returned by the WQL query are held as a data table, a property value, or name and value pairs. The DestinationType property can be used to save the result of the WQL query to a file connection or a variable, and the OverwriteDestination property determines whether to keep, overwrite or append the results in the destination file or variable.
- Expressions page. The Expressions page in the WMI Data Reader Task Editor helps you to use property expressions to dynamically update the properties of the WMI Data Reader task at run time by using variables.
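As an illustration, a WQL query supplied as Direct Input could read the free space on local fixed disks, with the result written to the configured destination variable or file:

```sql
-- WQL: name and free space of local fixed disks (DriveType = 3)
SELECT Name, FreeSpace FROM Win32_LogicalDisk WHERE DriveType = 3
```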

You can store the results that are returned by the WMI Data Reader task in a variable that can then be passed on to other tasks within the SSIS package.


Working with Precedence Constraints and Containers


Introduction
Lesson Introduction
Precedence constraints bring the control flow alive by defining a workflow that occurs between different tasks. This helps you to control what happens when a task succeeds, fails or completes. Expressions can also be used to further refine the behavior of the workflow logic. By default, every task that is added to the control flow is part of its own task host container, which extends variables and event handlers to the task. You can add tasks to the additional containers that are available in the Toolbox in Business Intelligence Development Studio to provide greater flexibility. By using Sequence containers, you can group multiple tasks into a single container so that the tasks succeed or fail as a unit, ensuring the integrity of the work that is performed by the SSIS package. The capabilities of containers are extended by the For Loop and Foreach Loop containers, which create repeating workflows based on defined expressions.

Lesson Objectives
After completing this lesson, you will be able to:

- Describe what precedence constraints are.
- Describe outcome-based precedence constraints.
- Use expressions within precedence constraints.
- Explain how multiple constraints work.
- Explain what containers are.
- Work with Sequence containers.
- Work with For Loop containers.
- Work with Foreach Loop containers.


Introduction to Precedence Constraints


Precedence constraints play an important role in controlling the order in which tasks run within the control flow. You can use outcome-based precedence constraints that direct the control flow based on the success, failure or completion of a task. You can also use expression-based precedence constraints that direct the control flow based on a predefined expression being met, or create precedence constraints based on a combination of the two.

When a Control Flow task is added to the Control Flow designer, a green arrow appears underneath it. You can click and drag the arrow to connect the precedence executable to another, constrained executable. This indicates that on successful execution of the precedence executable, the control flow moves on to the constrained executable. If you click the task again, another green arrow appears under the original Control Flow task, which you can drag to a further constrained executable.

You can also change a constraint so that the arrow indicates completion. This means that the control flow moves on to the next task regardless of whether the precedence executable succeeds or fails, and the arrow appears in blue. Alternatively, you can change a constraint so that the arrow indicates failure. This means that on failed execution of the precedence executable, the control flow moves on to the constrained executable that is connected to it, and the arrow appears in red. If expressions are used to control the tasks in the control flow, a blue line appears with an fx icon next to the constraint.


Outcome-based Precedence Constraints


Outcome-based precedence constraints control the workflow based on the success, failure or completion of the task on which the precedence constraint is based.

- Success. The precedence executable must complete successfully for the constrained executable to run.
- Failure. The precedence executable must fail for the constrained executable to run.
- Completion. The precedence executable only has to complete, without regard to outcome, for the constrained executable to run.


Expressions within Precedence Constraints


Expressions can be added to precedence constraints to determine the behavior of the constraint. If the expression defined within the precedence constraint evaluates to true, the control flow proceeds to the constrained executable; if it evaluates to false, the constrained executable does not run. The expression defined within the precedence constraint properties can make use of literals, operators, functions and variables. Variables in particular are powerful when used in expressions, and SSIS provides an Expression Builder that helps you to build expressions and check that the syntax is correct.

For example, suppose an SSIS package loads a text file with product information from a table in the AdventureWorks database, but this must not occur unless there are more than 500 rows of product data. First, a package variable named @Rows is created. Then, an Execute SQL task is added, which runs the following statement:
SELECT COUNT(*) FROM Production.Product

In the Execute SQL task, the result of the query is mapped to the pre-created user-defined variable @Rows, which will hold the result of the above query. A second task is added, a Data Flow task, which transfers a copy of the Production.Product table in the AdventureWorks database to a text file named products.txt. Then, a precedence constraint is connected between the Execute SQL task and the Data Flow task. In the properties of the precedence constraint, the evaluation operation is set to Expression, and the following statement is added in the Expression text box:
@Rows>500

When the package is executed, the Execute SQL task counts the number of rows in the Production.Product table. If the result is greater than 500, the Data Flow task executes as defined by the precedence constraint; if not, the Data Flow task does not run.

Expressions can also be used in combination with outcome-based precedence constraints by using the following options in the Evaluation operation list:

- Expression and Constraint. Both the expression and the outcome constraint must be satisfied.
- Expression or Constraint. Either the expression or the outcome constraint must be satisfied.


Multiple Constraints
Any Control Flow task can have multiple precedence constraints connecting to it. When this occurs, you need to configure when the constrained executable should execute. If the constrained executable may only execute when all precedence constraints are satisfied, set the Multiple constraints property in the precedence constraint properties to Logical AND. If only one of the precedence constraints must be satisfied before the constrained task executes, set the Multiple constraints property to Logical OR. These settings allow you to refine the control flow for complex SQL Server Integration Services packages.


Introduction to Containers
Control flow containers enable you to group a set of Control Flow tasks or containers within a single container so that they can be organized and managed as a single entity. Containers can also have precedence constraints connecting to and from them. SQL Server Integration Services provides three types of containers:

- Sequence containers
- For Loop containers
- Foreach Loop containers

A Sequence container is typically used to contain a number of tasks so that they act as one logical unit. If the container has a precedence constraint, all the tasks within the container must complete before the control flow path moves on to the next Control Flow task. Furthermore, properties that are defined at the container level, such as the DisableEventHandlers property, are inherited by the tasks within it. Sequence containers can also act as a scope for variables.

For Loop containers allow you to define a repeating workflow that uses an expression and loops until the expression evaluates to false. Within the For Loop container, you must define an initialization value as the basis to start the loop. To end the loop, an EvalExpression property specifies the condition for exiting the loop, and an AssignExpression property is used to increment the loop counter. The For Loop container is useful when you know how many times the container should run through a loop.

When the number of loops required is not known, a Foreach Loop container can loop through a collection of objects until no objects remain. Objects can include files, ADO recordsets or XML nodes. A task can be performed on the objects as they are enumerated by the Foreach Loop container.


Working with For Loop Containers


If you require a repeating workflow with a fixed number of loops, you can use the For Loop container. The For Loop container makes use of a number of properties to control the frequency of the loop, and usually refers to a variable that holds the value of the loop counter, for example, a variable called @Counter. Within the For Loop container, the InitExpression property is used to set the initial value of the counter:
@Counter = 0

The AssignExpression property is used to add an incremental value when one cycle of the loop has been completed, with an expression such as:
@Counter = @Counter + 1

The EvalExpression property assigns a condition that determines when the loop is exited.
@Counter < 5

The For Loop container will then execute the tasks that are held within it repeatedly for as long as the EvalExpression condition evaluates to true. Within the For Loop Editor, the Name property helps you to define a unique name for the container.


Working with Foreach Loop Containers


Foreach Loop containers are useful control flow executables that allow you to loop through a collection of objects, such as files in a folder. The useful aspect of the Foreach Loop container is that you are not constrained by the number of loops performed: the container loops through the collection until it has processed every object that exists. The Foreach Loop container can enumerate seven built-in object types.

Foreach Loop Enumerator Types
The Foreach Loop container defines a repeating control flow in a package. The loop implementation is similar to the Foreach looping structure in programming languages. In a package, looping is enabled by using a Foreach enumerator; the Foreach Loop container repeats the control flow for each member of a specified enumerator. SSIS provides the following enumerator types:

- Foreach ADO enumerator to enumerate rows in tables. For example, you can get the rows in an ADO recordset.
- Foreach ADO.NET Schema Rowset enumerator to enumerate the schema information about a data source. For example, you can enumerate and get a list of the tables in the AdventureWorks SQL Server database.
- Foreach File enumerator to enumerate files in a folder. The enumerator can traverse subfolders. For example, you can read all the files that have the *.log file name extension in the Windows folder and its subfolders.
- Foreach From Variable enumerator to enumerate the enumerable object that a specified variable contains. The enumerable object can be an array, an ADO.NET DataTable, an Integration Services enumerator and so on. For example, you can enumerate the values of an array that contains the names of servers.
- Foreach Item enumerator to enumerate items that are collections. For example, you can enumerate the names of executables and working directories that an Execute Process task uses.
- Foreach Nodelist enumerator to enumerate the result set of an XML Path Language (XPath) expression. For example, this expression enumerates and gets a list of all the authors in the classical period: /authors/author[@period='classical'].
- Foreach SMO enumerator to enumerate SQL Server Management Objects (SMO) objects. For example, you can enumerate and get a list of the views in a SQL Server database.

Enumerators are configurable, and you must provide different information, depending on the enumerator.


The following list summarizes the information each enumerator type requires.

- Foreach ADO: Specify the ADO object source variable and the enumerator mode.
- Foreach ADO.NET Schema Rowset: Specify the connection to a database and the schema to enumerate.
- Foreach File: Specify a folder and the files to enumerate, the format of the file name of the retrieved files, and whether to traverse subfolders.
- Foreach From Variable: Specify the variable that contains the objects to enumerate.
- Foreach Item: Define the items in the Foreach Item collection, including columns and column data types.
- Foreach Nodelist: Specify the source of the XML document and configure the XPath operation.
- Foreach SMO: Specify the connection to a database and the SMO objects to enumerate.


Working with Variables


Introduction
Lesson Introduction
To add dynamic capability to packages, variables can be used to pass information between different components in a package, or between packages. There are many system-defined variables that can be used to populate the settings of components within an SSIS package, including containers, data flows and event handlers. You can also create user-defined variables to accommodate scenarios for dynamic packages that cannot be met by the system-supplied variables. Understanding how to make use of variables provides a powerful tool for building intelligent packages.

Lesson Objectives
After completing this lesson, you will be able to:

- Describe variables in SSIS.
- Describe system variables in a package.
- Create a user-defined variable.
- Use variables.


Introduction to Variables in SSIS


Variables can be created and used within SQL Server Integration Services packages to add flexibility to the package logic. Variables store values that can be used by numerous components of a package, including control flow executables, data flow components, precedence constraint expressions and For Loop expressions, as some examples.

Static values can be assigned to a variable within the variable definition itself, by setting a default value. You can also dynamically assign values to a variable from Control Flow tasks. For example, an Execute SQL task can be used to perform a count of the rows within a result set; you can then use the result mapping page of the Execute SQL task to pass the result to a variable. The variable can then be used as an input to other Control Flow tasks within the package.

Many control flow executables contain an Expressions page within their properties, which allows you to set the properties of an executable dynamically. Variables are commonly used within property expressions to set these values at run time. For example, you can use the Expressions page in the FTP task to dynamically set the LocalPath property. Variables can also be used in data flow components: the Row Count transformation must use a variable to store the result of the row count and will not work otherwise. You can also use variables to populate options within package configurations.

When a SQL Server Integration Services package is created, a set of system variables is automatically created that can be used within the package. There is no fixed list of system variables; the number of system variables increases as you add components to the package, and these additional system variables are specific to the components added. You can also create user-defined variables.
When creating a user-defined variable, you must define a scope. The scope determines in which part of the package the variable can be used. For example, if you have a package that consists of a single Data Flow task and you create a user-defined variable while the Data Flow task is selected, the scope of the variable is limited to the Data Flow task only. If you click the package itself and create a variable, the scope of the variable is the package, and all components defined within the package can make use of the variable, including the Data Flow task. You must also define a data type as you create the variable to determine the acceptable values for the variable. To view the system variables within a package and to create user-defined variables in Business Intelligence Development Studio, click SSIS on the menu bar and then click Variables.


System Variables in a Package


There are many system variables that are created when you create a new SSIS package, add a task to the SSIS package, or make use of containers. The following categories describe common system variables that can be used.

Package-based variables
Package-based variables are system variables that hold information about the package. Some of the common variables include the following:

System::PackageName. The name of the package.
System::PackageID. The unique ID of the package.
System::MachineName. The machine on which the package was executed.
System::UserName. The name of the user that executed the package.
System::StartTime. The time at which the package started to execute.

The information provided by package system variables can be useful in custom logging solutions that log the values of these variables to a specific table for later review.

Task-based variables
Task-based variables can be specific to the task that you add to the SSIS package, including the following:
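A custom logging solution along these lines might use an Execute SQL task whose statement inserts the variable values into an audit table. The table name below is hypothetical; each ? placeholder would be mapped to a system variable (for example System::PackageName, System::MachineName, System::StartTime) on the Parameter Mapping page of the task:

```sql
-- Hypothetical audit statement for an Execute SQL task (OLE DB connection).
-- Each ? placeholder is mapped to a system variable on the
-- Parameter Mapping page of the Execute SQL task.
INSERT INTO dbo.PackageExecutionLog (PackageName, MachineName, StartTime)
VALUES (?, ?, ?);
```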

System::TaskName. The name of the task.
System::TaskID. The ID for the task within the package.
System::TaskTransactionOption. This variable holds information about the transaction setting that the task is using.

Task-based variables can be useful when you want to log information about a specific task or use the variables within other SSIS components. For example, you can refer to a task by using the TaskName variable within a Script task to perform additional operations against the task using a script.

Container-based variables
The container-based system variables can be used with Sequence, For Loop, and Foreach Loop containers. Not all of the system variables are applicable to all container types. Some of the common variables include the following:

System::ParentContainerGUID. Holds the globally unique ID of the parent container that holds the container.
System::ContainerStartTime. Holds the time when the container started to execute.
System::LocaleID. Holds the value of the locale that the container uses.

Container-based variables can be used to pass values into tasks that exist within the container. For example, you can use the ContainerStartTime variable value to populate the subject line of a Send Mail task that exists within the container, sending an e-mail message containing the time that the container started to execute.

Creating User-Defined Variables


Creating user-defined variables is a straightforward process. Using the Variables pane in Business Intelligence Development Studio, you can define a variable name, scope, data type, and default value. Passing variable values in and out of package components can add flexibility to your package.


Business Practices

Create an SSIS package that performs a single operation to simplify troubleshooting of packages.
Create package templates that contain common SSIS package components.
Define data sources within Solution Explorer when they need to be available to multiple packages.
Create connection managers in an SSIS package so that the connection information is embedded within the package.
Identify the Control Flow tasks that are required to complete the operations of the package.
Identify the Control Flow tasks that will provide error control within a package.
Use precedence constraints to control the flow of task execution within a package.
Use the Data Profiling task to gain familiarity with new data or to identify quality issues with existing data.
Use the Bulk Insert task to move data from a text file to a SQL Server table when no error checking or transformations are required.
Use the Data Flow task for data loads that require error checking and transformations.
Use the Script task to create custom Control Flow operations that cannot be met by existing tasks.
Use the Execute Package task to control the flow of multiple packages using precedence constraints.
Use Sequence containers to group related tasks within a single container that can have package properties set from one object.
Use For Loop containers when you know the number of times that the iteration of a loop needs to occur.
Use the Foreach Loop container when you are required to iterate through objects but do not know how many times the loop should iterate.
Use variables as the platform for passing values between tasks when you are required to pass values from one Control Flow task to another.
Ensure that variables are set at the correct scope to operate correctly.


Lab: Implementing Packages and Control Flow in Microsoft SQL Server 2008
Lab Overview
Lab Introduction The purpose of this lab is to create packages in SSIS that will contain the Control Flow tasks required to load data into the AdventureWorksDWDev database. You will work with the control flow components exploring the capabilities of common tasks that would be used within an organization to create SSIS packages. You will also explore the use of precedence constraints to control the workflow of the tasks that are defined within the package. Finally, you will add containers and variables to manage the flow and to group tasks together as a logical unit. Lab Objectives After completing this lab, you will be able to:

Create SSIS packages.
Work with control flow.
Use containers and precedence constraints to manage the control flow.
Work with variables.


Scenario
You are a database developer for Adventure Works, a manufacturing company that sells bicycles and bicycle components through the Internet and a reseller distribution network. You have just installed the business intelligence components of SQL Server 2008. There is also a data warehouse on a development database named AdventureWorksDWDev.

You want to explore common control flow components that might be used to populate data within a data warehouse. You have been tasked to create a simple data warehouse that will enable you to determine the process for populating a data warehouse. The data warehouse consists of three dimension tables named ProductDim, EmployeeDim and TimeDim that connect to the FactResellerSales table in a star schema.

The first step is to utilize the staging tables in the AdventureWorksDWDev database as a holding area for data that is transferred from numerous data sources. The ProductStage table will retrieve information from the Production.Product table in the AdventureWorks database. The EmployeeStage table will retrieve its information from a supplied text file named Employee.txt located in the D:\Labfiles\Starter folder. You will create two simple packages that will load the data into the staging tables. You will also make use of precedence constraints to control the workflow of the SSIS package and inform a user named student with the e-mail address of student@adventure-works.com if a component of the control flow has failed. You will also explore the use of containers and variables within packages.

Before undertaking this, you will copy all the tables from the AdventureWorks database to a database named AdventureWorks-Copy by using the appropriate SSIS task.


Exercise Information
Exercise 1: Creating SSIS Packages
In this exercise, in the AW_SSIS project, in the AW_BI solution, you will rename the default package named Package.dtsx to AWSSISTemplate.dtsx. In this package, you will create connection managers for the AdventureWorks2008 and the AdventureWorksDWDev databases. You will then create a package template from this package. You will create three packages by using the AWSSISTemplate, named AWStaging, AWDataWarehouse and LoadAWDWDev, to start the process of populating a data warehouse. You will then import the ResellerText package into the AW_SSIS project.
Exercise 2: Working with Control Flow Tasks
In this exercise, you will edit the ResellerText package by changing some properties to add meaning to the package. You will also add a File System task that will ensure that the Resellers.txt file is deleted in the D:\Labfiles\Starter folder. You will verify that these tasks function correctly. You will edit the AWStaging package that will use three Control Flow tasks. These tasks will help load the staging tables within the AdventureWorksDWDev database. The first task will use an Execute SQL task to truncate the staging tables. You will then add a second Control Flow task known as a Bulk Insert task that will bulk load data from the Resellers.txt file into the StageResellers table in the AdventureWorksDWDev database. You will verify that these tasks function correctly. You will then add a Data Flow task to the AWStaging package but not complete its configuration. Finally, you will edit the LoadAWDWDev package by adding three Execute Package tasks. You will name and provide a description for each of the tasks and configure them to execute the packages currently stored within the AW_SSIS project.
Exercise 3: Managing Control Flow Tasks
In this exercise, you will edit the ResellerText package by adding a Success precedence constraint between two tasks that exist within the package.
You will take this further in the AWStaging package by adding both Success precedence constraints and Failure precedence constraints that will send an e-mail message should any task within the package fail. You will then explore the use of Completion precedence constraints within the LoadAWDWDev package. You will then develop the AWDataWarehouse package by adding two Sequence containers. One Sequence container will hold the Control Flow tasks that will load the dimension tables. A second Sequence container will follow and will hold the loading of the fact table data. An e-mail message will be sent should the contents of either Sequence container fail. Using the AWDataWarehouse package, you will use Sequence containers to group Control Flow tasks that perform the same type of operation. One Sequence container will contain the tasks that populate


the dimension tables in the AdventureWorksDWDev data warehouse. The other Sequence container will contain the tasks that will load the fact table in the data warehouse.
Exercise 4: Working with Variables
In this exercise, to help you manage the changing data that can exist within the product dimension table, you will add a record to the ExtractLog table in the AdventureWorksDWDev database. You can use this table to keep track of when extractions of product data have occurred. You will edit the AWStaging package and create a user-defined variable named ProductLastExtract. Then, you will use an Execute SQL task to set the variable to the value defined within the ExtractDate column in the ExtractLog table. You will create another Execute SQL task to update the ExtractDate column in the ExtractLog table with today's date by using the GetDate function after the products have been loaded into the StageProduct table.


Lab Instructions: Implementing Packages and Control Flow in Microsoft SQL Server 2008
Exercise 1: Creating SSIS Packages
Exercise Overview
In this exercise, within the AW_SSIS project within the AW_BI solution, you will rename the default package named Package.dtsx to AWSSISTemplate.dtsx. Within this package, you will create connection managers for the AdventureWorks2008 and the AdventureWorksDWDev databases. You will then create a package template from this package. You will create three packages by using the AWSSISTemplate, named AWStaging, AWDataWarehouse and LoadAWDWDev, to start the process of populating a data warehouse. You will then import the ResellerText package into the AW_SSIS project.
Task 1: You are logged on to the MIAMI server with the user name Student and password Pa$$w0rd. Proceed to the next task.
Log on to the MIAMI server.
a. To display the logon screen, press CTRL+ALT+DELETE.
b. On the Logon screen, click the Student icon.
c. In the Password box, type Pa$$w0rd and then click the Forward button.

Task 2: Open SQL Server Business Intelligence Development Studio and open the AW_BI solution file located in the D:\Labfiles\Starter\AW_BI folder
1. Open SQL Server Business Intelligence Development Studio.
2. Open the AW_BI solution file in the D:\Labfiles\Starter\AW_BI folder.

Task 3: Rename Package.dtsx to AWSSISTemplate
Navigate to Package.dtsx in the AW_SSIS project and rename it to AWSSISTemplate.dtsx.

Task 4: Add the AdventureWorks2008 and AdventureWorksDWDev connection managers by using OLE DB connections in the AWSSISTemplate package
1. Add the AdventureWorks2008 Connection Manager to the AWSSISTemplate package.
2. Add the AdventureWorksDWDev Connection Manager to the AWSSISTemplate package.
3. Save the AW_BI solution and close down Business Intelligence Development Studio.

Task 5: Create a package template by using the AWSSISTemplate.dtsx file and then remove the AWSSISTemplate.dtsx file from the AW_SSIS project
1. Navigate to the AWSSISTemplate.dtsx file.
2. Create the AWSSISTemplate package template.
3. Remove the AWSSISTemplate package in the AW_BI Business Intelligence Development solution.

Task 6: Use the package template AWSSISTemplate to create three new SSIS packages named AWStaging, AWDataWarehouse, and LoadAWDWDev
1. Create the AWStaging package by using the AWSSISTemplate package template.
2. Create the AWDataWarehouse package by using the AWSSISTemplate package template.
3. Create the LoadAWDWDev package by using the AWSSISTemplate package template.

Task 7: Import an SSIS package into an existing SSIS project and then confirm that the AWStaging, AWDataWarehouse, LoadAWDWDev, and ResellerText packages are created in the AW_SSIS project
1. Import the ResellerText.dtsx package located in the D:\Labfiles\Starter folder into the AW_SSIS Integration Services project within the AW_BI Business Intelligence solution.
2. Save the AW_BI solution.
3. Confirm that the AWStaging, AWDataWarehouse, LoadAWDWDev, and ResellerText packages exist within the AW_SSIS project in the AW_BI solution.

Task 8: You have completed all tasks in this exercise
A successful completion of this exercise results in the following outcomes:
You have created a package template named AWSSISTemplate.
You have created the AWStaging, AWDataWarehouse, and LoadAWDWDev SSIS packages by using the AWSSISTemplate package template.
You have imported the ResellerText package into the AW_SSIS project in the AW_BI Business Intelligence solution.
You have verified the creation of the AWStaging, AWDataWarehouse, LoadAWDWDev, and ResellerText packages in the AW_SSIS project.

Exercise 2: Working with Control Flow Tasks


Exercise Overview
In this exercise, you will edit the ResellerText package by changing some properties to add meaning to the package. You will also add a File System task that will ensure that the Resellers.txt file is deleted in the D:\Labfiles\Starter folder. You will verify that these tasks function correctly.
You will edit the AWStaging package that will use three Control Flow tasks. These tasks will help load the staging tables within the AdventureWorksDWDev database. The first task will use an Execute SQL task to truncate the staging tables. You will then add a second Control Flow task known as a Bulk Insert task that will bulk load data from the Resellers.txt file into the StageResellers table in the AdventureWorksDWDev database. You will verify that these tasks function correctly. You will then add a Data Flow task to the AWStaging package but not complete its configuration.
Finally, you will edit the LoadAWDWDev package by adding three Execute Package tasks. You will name and provide a description for each of the tasks and configure them to execute the packages currently stored within the AW_SSIS project.

Task 1: Open the ResellerText package in the AW_SSIS project in the AW_BI solution
Open the ResellerText package in Business Intelligence Development Studio.

Task 2: Edit the package ResellerText, renaming the data flow task and adding a File System task to manage the creation of the Resellers.txt file located in D:\Labfiles\Starter. Save and close the ResellerText package.
1. In the ResellerText package, rename the data flow task from Data Flow Task 1 to Create Reseller Text File.
2. Add a File System Task named Delete Reseller Text File to the Control Flow Designer.
3. Configure the Delete Reseller Text File File System Task to delete the Resellers.txt file located in the D:\Labfiles\Starter folder.
4. Navigate to the Resellers.txt file that exists in the D:\Labfiles\Starter folder.
5. Verify that the Delete Reseller Text File task and the Create Reseller Text File task work as expected.
6. Save and close the ResellerText package.

Task 3: Edit the package AWStaging that will contain Control Flow tasks that will truncate the StageReseller and StageProduct tables in the AdventureWorksDWDev database
1. Open the AWStaging package in Business Intelligence Development Studio.
2. Add an Execute SQL Task named Truncate Staging Tables to the Control Flow designer.
3. Configure the Truncate Staging Tables Execute SQL Task to truncate the StageReseller and StageProduct tables in the AdventureWorksDWDev database.
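The Truncate Staging Tables Execute SQL task described above would typically run a statement like the following sketch, assuming both staging tables can be truncated directly:

```sql
-- Statement for the Truncate Staging Tables Execute SQL task,
-- run against the AdventureWorksDWDev connection manager.
TRUNCATE TABLE dbo.StageReseller;
TRUNCATE TABLE dbo.StageProduct;
```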

Task 4: Edit the package AWStaging containing a Bulk Insert task that will bulk insert data from the Resellers.txt file located in the D:\Labfiles\Starter folder into the dbo.StageReseller table in the AdventureWorksDWDev database
1. Add a Bulk Insert Task named Load Resellers to the Control Flow designer.
2. Configure the Load Resellers Bulk Insert Task to insert data from the Resellers.txt file located in the D:\Labfiles\Starter folder into the StageReseller table in the AdventureWorksDWDev database.
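The Bulk Insert task wraps the Transact-SQL BULK INSERT statement. A rough equivalent of what the Load Resellers task performs is sketched below; the field and row terminator values are assumptions about the Resellers.txt layout, not settings taken from the lab files:

```sql
-- Approximate Transact-SQL equivalent of the Load Resellers Bulk Insert task.
-- FIELDTERMINATOR and ROWTERMINATOR are assumptions about the text file format.
BULK INSERT dbo.StageReseller
FROM 'D:\Labfiles\Starter\Resellers.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
```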

Task 5: Verify that the Load Resellers Bulk Insert task and the Truncate Staging Tables Execute SQL task work as expected
Verify that the Load Resellers Bulk Insert task and the Truncate Staging Tables Execute SQL task work as expected.

Task 6: Edit the package AWStaging by adding a Data Flow task named Load Products. You will then save the AWStaging package and close it.
1. Add a Data Flow Task named Load Products to the Control Flow designer.
2. Save and then close the AWStaging package.

Task 7: Open the LoadAWDWDev package in the AW_SSIS project in the AW_BI solution
Open the LoadAWDWDev package in Business Intelligence Development Studio.

Task 8: Edit the package LoadAWDWDev using Execute Package tasks that will be used to control the execution of the packages that exist within the AW_SSIS project
1. Add an Execute Package Task named ExecResellerText to the Control Flow designer.
2. Configure the Execute Package Task named ExecResellerText to execute the ResellerText package located in the D:\Labfiles\Starter\AW_BI\AW_SSIS folder.
3. Add an Execute Package Task named ExecAWStaging to the Control Flow Designer.
4. Configure the Execute Package Task named ExecAWStaging to execute the AWStaging package located in the D:\Labfiles\Starter\AW_BI\AW_SSIS folder.
5. Add an Execute Package Task named ExecAWDataWarehouse to the Control Flow Designer.
6. Configure the Execute Package Task named ExecAWDataWarehouse to execute the AWDataWarehouse package located in the D:\Labfiles\Starter\AW_BI\AW_SSIS folder.
7. Save and then close the LoadAWDWDev package.


Task 9: You have completed all tasks in this exercise
A successful completion of this exercise results in the following outcomes:
You have created and configured the File System Task in the ResellerText package.
You have created and configured an Execute SQL Task in the AWStaging package.
You have created and configured a Bulk Insert Task in the AWStaging package.
You have verified that the above tasks have worked.
You have added a Data Flow Task in the AWStaging package.
You have created and configured three Execute Package Tasks in the LoadAWDWDev package.

Exercise 3: Managing Control Flow Tasks


Exercise Overview
In this exercise, you will edit the ResellerText package by adding a Success Precedence Constraint between two tasks that exist within the package. You will take this further in the AWStaging package by adding both Success Precedence Constraints and Failure Precedence Constraints that will send an e-mail message should any task within the package fail. You will then explore the use of Completion Precedence Constraints within the LoadAWDWDev package.
You will then develop the AWDataWarehouse package by adding two Sequence containers. One Sequence container will hold the Control Flow tasks that will load the dimension tables. A second Sequence container will follow and will hold the loading of the fact table data. An e-mail message will be sent should the contents of either Sequence container fail. Using the AWDataWarehouse package, you will use Sequence containers to group Control Flow tasks that perform the same type of operation. One Sequence container will contain the tasks that populate the dimension tables in the AdventureWorksDWDev data warehouse. The other Sequence container will contain the tasks that will load the fact table in the data warehouse.
Task 1: Edit the package ResellerText adding a Success Precedence Constraint from the Delete Reseller Text File File System Task to the Create Reseller Text File Data Flow Task. Verify that the tasks work, then save and close the ResellerText package.
1. Open the ResellerText package in Business Intelligence Development Studio.
2. In the ResellerText package, create a Success Precedence Constraint from the Delete Reseller Text File File System Task to the Create Reseller Text File Data Flow Task.
3. Verify that the Delete Reseller Text File task and the Create Reseller Text File task work as expected. Navigate to the Reseller.txt file that exists in the D:\Labfiles\Starter folder.
4. Stop the execution of the ResellerText package.
5. Save and then close the ResellerText package.

Task 2: Edit the package AWStaging that will contain Success Precedence Constraints between the Truncate Staging Tables task and the Load Resellers task and a Success Constraint that connects to the Load Products task
1. Open the AWStaging package in Business Intelligence Development Studio.
2. In the AWStaging package, create a Success Precedence Constraint from the Truncate Staging Tables Execute SQL Task to the Load Resellers Bulk Insert Task.
3. In the AWStaging package, create a Success Precedence Constraint from the Load Resellers Bulk Insert Task to the Load Products Data Flow Task.


Task 3: Edit the package AWStaging by adding a Send Mail task that will send an e-mail message to Student@Adventure-works.com if any of the existing AWStaging Control Flow tasks fail
1. Add a Send Mail Task named Email Student to the Control Flow Designer.
2. Configure the Send Mail Task to e-mail Student@Adventure-works.com with the Subject of AWStaging SSIS failure notification from SSIS@Adventure-works.com.
3. In the AWStaging package, create a Failure Precedence Constraint from the Truncate Staging Tables Execute SQL Task to the Email Student Send Mail Task.
4. In the AWStaging package, create a Failure Precedence Constraint from the Load Resellers Bulk Insert Task to the Email Student Send Mail Task.
5. In the AWStaging package, create a Failure Precedence Constraint from the Load Products Data Flow Task to the Email Student Send Mail Task.
6. Save and close the AWStaging package.

Task 4: Edit the LoadAWDWDev package containing Completion Precedence Constraints between the ExecResellerText task and the ExecAWStaging task and a Completion Constraint that connects the ExecAWStaging task to the ExecAWDataWarehouse task
1. Open the LoadAWDWDev package in Business Intelligence Development Studio.
2. In the LoadAWDWDev package, create a Completion Precedence Constraint from the ExecResellerText Execute Package Task to the ExecAWStaging Execute Package Task.
3. In the LoadAWDWDev package, create a Completion Precedence Constraint from the ExecAWStaging Execute Package Task to the ExecAWDataWarehouse Execute Package Task.

Task 5: Edit the package LoadAWDWDev that will contain a Send Mail task that will send an e-mail message to Student@Adventure-works.com if any of the existing LoadAWDWDev Control Flow tasks fail.
1. Add a Send Mail Task named Email Student to the Control Flow Designer.
2. Configure the Send Mail Task to e-mail Student@Adventure-works.com with the Subject of LoadAWDWDev SSIS failure notification from SSIS@Adventure-works.com.
3. In the LoadAWDWDev package, create a Failure Precedence Constraint from the ExecResellerText Execute Package Task to the Email Student Send Mail Task.
4. In the LoadAWDWDev package, create a Failure Precedence Constraint from the ExecAWStaging Execute Package Task to the Email Student Send Mail Task.
5. In the LoadAWDWDev package, create a Failure Precedence Constraint from the ExecAWDataWarehouse Execute Package Task to the Email Student Send Mail Task.
6. Save and then close the LoadAWDWDev package.

Task 6: Edit the package AWDataWarehouse by adding two sequence containers. The first sequence container will be named LoadDimensions and the second sequence container will be called LoadFact. You will then create a Success Precedence Constraint that connects the LoadDimensions sequence container to the LoadFact sequence container.
1. Open the AWDataWarehouse package in Business Intelligence Development Studio.
2. In the AWDataWarehouse package, drag a Sequence Container from the Toolbox to the Control Flow Designer and name it LoadDimensions.
3. In the AWDataWarehouse package, drag a second Sequence Container from the Toolbox to the Control Flow Designer and name it LoadFact.


4. In the AWDataWarehouse package, create a Success Precedence Constraint from the LoadDimensions sequence container to the LoadFact sequence container.
5. Save the AWDataWarehouse package.

Task 7: Edit the package AWDataWarehouse by adding an Execute SQL Task into the LoadDimensions sequence container that will be called Generate Time Data. You will then use a Transact-SQL script file located in D:\Labfiles\Starter named PopulateTimeDim.sql within the Execute SQL Task to add time data into the DimTime dimension table.
1. In the AWDataWarehouse package, drag an Execute SQL Task from the Toolbox within the LoadDimensions sequence container and name it Generate Time Data.
2. In the Generate Time Data Execute SQL Task, create a file connection named Time Data that points to PopulateTimeDim.sql in the D:\Labfiles\Starter folder.
3. In the Generate Time Data Execute SQL Task, create an OLE DB connection to the AdventureWorksDWDev database on the MIAMI server. Then parse the Transact-SQL statement.
4. Save the AWDataWarehouse package.

Task 8: Edit the package AWDataWarehouse adding two data flow tasks below the Generate Time Data task within the LoadDimensions sequence container. The first will be named Generate Reseller Data and the second named Generate Product Data.
1. In the AWDataWarehouse package, drag a Data Flow Task from the Toolbox within the LoadDimensions sequence container and name it Generate Reseller Data.
2. In the AWDataWarehouse package, drag a Data Flow Task from the Toolbox within the LoadDimensions sequence container and name it Generate Product Data.

Task 9: Edit the package AWDataWarehouse adding a Data Flow Task within the LoadFact sequence container. It will be named Generate FactSales Data.
In the AWDataWarehouse package, drag a Data Flow Task from the Toolbox within the LoadFact sequence container and name it Generate FactSales Data.

Task 10: Edit the package AWDataWarehouse that will contain a Send Mail task that will send an e-mail message to Student@Adventure-works.com if any of the existing AWDataWarehouse sequence containers fail.
1. Add a Send Mail Task named Email Student to the Control Flow Designer.
2. Configure the Send Mail Task to e-mail Student@Adventure-works.com with the Subject of AWDataWarehouse SSIS failure notification from SSIS@Adventure-works.com.
3. In the AWDataWarehouse package, create a Failure Precedence Constraint from the LoadDimensions sequence container to the Email Student Send Mail Task.
4. In the AWDataWarehouse package, create a Failure Precedence Constraint from the LoadFact sequence container to the Email Student Send Mail Task.
5. Save and close the AWDataWarehouse package.

Task 11: You have completed all tasks in this exercise
A successful completion of this exercise results in the following outcomes:
You have created and configured a Success Precedence Constraint.
You have created and configured a Send Mail Task.
You have created and configured a Failure Precedence Constraint.
You have created and configured a Completion Precedence Constraint.


You have created and configured sequence containers.

Exercise 4: Working with Variables


Exercise Overview
In this exercise, to help you manage the changing data that can exist within the product dimension table, you will add a record to the ExtractLog table in the AdventureWorksDWDev database. You can use this table to keep track of when extractions of product data have occurred. You will edit the AWStaging package and create a user-defined variable named ProductLastExtract. Then, you will use an Execute SQL Task to set the variable to the value defined within the ExtractDate column in the ExtractLog table. You will create another Execute SQL Task to update the ExtractDate column in the ExtractLog table with today's date by using the GetDate function after the products have been loaded into the StageProduct table.
Task 1: Add data to the ExtractLog table in the AdventureWorksDWDev database with the DataSource column containing the value of StageProduct and the ExtractDate containing the value 2000-12-31
1. Open SQL Server Management Studio. Connect to the MIAMI SQL Server instance.
2. In the AdventureWorksDWDev database, open the ExtractLog table.
3. Set the DataSource column to contain the value of StageProduct and the ExtractDate column to contain the value of 2000-12-31. Save the table and close SQL Server Management Studio.
Task 2: Open the AW_BI solution in Business Intelligence Development Studio, edit the AWStaging package and add a user-defined package variable named ProductLastExtract
1. Open Microsoft Business Intelligence Development Studio.
2. Open the AW_BI solution file in the D:\Labfiles\Starter\AW_BI folder.
3. Open the AWStaging package in Business Intelligence Development Studio.
4. Create a user-defined variable at the AWStaging package level named ProductLastExtract, with the scope set to AWStaging, the data type set to DateTime and the default value to 12/31/1950.
5. Save the AWStaging package.
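Instead of editing the table directly, the ExtractLog row described in Task 1 could also be added with a simple INSERT statement; this is a sketch that assumes DataSource and ExtractDate are the only columns that need values:

```sql
-- Seed the extraction log with an initial "last extract" date
-- for the product staging load.
INSERT INTO dbo.ExtractLog (DataSource, ExtractDate)
VALUES ('StageProduct', '2000-12-31');
```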

Task 3: Add an Execute SQL Task named Set Product Variable that will set the ProductLastExtract variable to the value that currently resides within the ExtractDate column in the ExtractLog table. Create a Success Precedence Constraint from the Set Product Variable Execute SQL Task to the Truncate Staging Tables Execute SQL Task.
1. In the AWStaging package, drag an Execute SQL Task from the Toolbox above the Truncate Staging Tables Execute SQL Task and name it Set Product Variable.
2. In the Set Product Variable Execute SQL Task, create an OLE DB connection that points to the AdventureWorksDWDev database and write a query that selects the ExtractDate column value where the DataSource column equals StageProduct.
3. In the Set Product Variable Execute SQL Task, set the result set property to Single Row and then map the result name ExtractDate to the variable ProductLastExtract.
4. In the AWStaging package, create a Success Precedence Constraint from the Set Product Variable Execute SQL Task to the Truncate Staging Tables Execute SQL Task.
5. Save the AWStaging package.
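The query for the Set Product Variable task could look like the following sketch; with the result set property set to Single Row, the ExtractDate result column is then mapped to the User::ProductLastExtract variable on the Result Set page:

```sql
-- Returns the last extraction date for the product staging load.
-- Map the ExtractDate result to the ProductLastExtract variable.
SELECT ExtractDate
FROM dbo.ExtractLog
WHERE DataSource = 'StageProduct';
```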


Task 4: Add an Execute SQL Task named Reset Product ExtractDate that will update the ExtractDate column in the ExtractLog table with today's date. Create a Success Precedence Constraint from the Load Products Execute SQL Task to the Reset Product ExtractDate Execute SQL Task.
1. In the AWStaging package, drag an Execute SQL Task from the Toolbox above the Load Products Execute SQL Task.
2. In the Reset Product ExtractDate Execute SQL Task, create an OLE DB connection that points to the AdventureWorksDWDev database and write a query that updates the ExtractDate column value with today's date where the DataSource column equals StageProduct.
3. In the AWStaging package, create a Success Precedence Constraint from the Load Products Execute SQL Task to the Reset Product ExtractDate Execute SQL Task.
4. Save the AWStaging package and close Business Intelligence Development Studio.
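The update query for this task could be written as follows (a sketch; GETDATE() supplies today's date, as the task description requires):

```sql
-- Resets the extraction watermark after the products have been staged.
UPDATE dbo.ExtractLog
SET ExtractDate = GETDATE()
WHERE DataSource = 'StageProduct';
```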

Task 5: You have completed all tasks in this exercise
A successful completion of this exercise results in the following outcomes:
You have entered data within a table.
You have created a user-defined variable.
You have set the value of a user-defined variable to be equal to a value in a table.
You have reset a value within a table.


Lab Review
In this lab, you created packages in SSIS that loaded the staging tables in the AdventureWorksDWDev database. You worked with the control flow components, exploring the capabilities of common tasks that would be used in an organization to create SSIS packages. You then used precedence constraints to control the workflow of the tasks that were defined within the package. Finally, you added containers and variables to manage the flow and to group tasks together as a logical unit.

What is the difference between data sources and connection managers?
Data sources are defined within Solution Explorer in Business Intelligence Development Studio. Data sources that are defined can be used by any package that is created within the project. Connection managers are connection information objects that are embedded and deployed with a package. A connection manager can be created by using a data source defined in Solution Explorer, or can be created without the need for a data source.

At what location should you save a package to create a package template?
C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems

What is the difference between a Data Flow task and a Bulk Insert task?
A Data Flow task allows you to extract data from any data source and load it into any destination. You can also transform data by using a Data Flow task. A Bulk Insert task allows the transfer of data from a text file into a SQL Server table. No transformations can be performed by using this task. However, it is the most efficient method of transferring data from a text file to SQL Server tables.

You want to control the execution of multiple SSIS packages. What would be the best task to use?
The Execute Package task would be the best task to use to coordinate the execution of multiple SSIS packages.

What type of precedence constraints can be used with the control flow component of SSIS?
Outcome-based precedence constraints determine the flow of Control Flow tasks based on the success, failure or completion of a task. Expression-based precedence constraints determine the flow of the control flow based on a user-defined expression.
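As an illustration, an expression-based constraint is written in the SSIS expression language. For example, a hypothetical constraint that lets the next task run only when a user-defined row-count variable is positive might use:

```
@[User::RowCount] > 0
```

Combined with an outcome, the same expression could be evaluated only when the preceding task succeeds.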


However, you can also use a combination of outcome-based and expression-based precedence constraints to dictate the flow of the Control Flow tasks.

What is the purpose of containers?
Every task that is added to the control flow is, by default, part of its own task host container, which extends variables and event handlers to the task. Containers help you to group a set of Control Flow tasks or containers within a single container so that they can be organized and managed as a single entity. SSIS provides three additional types of containers as follows:

Sequence containers
For Loop containers
Foreach Loop containers

What is the purpose of variables?
Variables are objects that can be used to store values, which can be used by numerous components of SSIS packages.
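For example, the ProductLastExtract variable created in this lab could be mapped to a query parameter so that only rows changed since the last extraction are selected. This is a sketch: the ? placeholder is the OLE DB parameter syntax, and the source table and column names here are illustrative:

```sql
-- The ? parameter would be mapped to User::ProductLastExtract on the task's
-- Parameter Mapping page (OLE DB connections use ? as the placeholder).
SELECT ProductID, Name, ListPrice
FROM Production.Product
WHERE ModifiedDate > ?;
```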


Module Summary
Creating Integration Services Packages
In this lesson, you have learned the following key points:

The components that make up an SSIS package include:
o Control flow elements to define the overall logic of the SSIS package.
o Data flow elements that help you to focus on the extraction, transformation and loading of data between data sources and destinations.
o Event handler elements that help you to implement robust package logic when events occur in an SSIS package.
o SSIS package variables that allow you to pass variable values between tasks in an SSIS package.
o SSIS package configurations that help you to set properties of an SSIS package at run time.
Data sources are specific to SSIS project files and can be shared by multiple packages created within a single SSIS project.
Connection managers must be created in an SSIS package so that the connection information can be deployed with the package; connection managers can be based on existing data sources.
Data source views can be used to target specific tables or views from a data source and can be customized without affecting the underlying data source.
You can create multiple packages within a single SQL Server Integration Services project.
You can import an existing package into an SSIS project so that it can be edited.

Implementing Control Flow Tasks: Part 1


In this lesson, you have learned the following key points:

You can use the Data Flow task to perform data transfers that require transformations or error logging.
The Bulk Insert task is the most efficient method to transfer data from a text file into a SQL Server table. However, transformation and error logging cannot be performed by this task.
You can use the Execute SQL task to run a Transact-SQL statement, such as a query that defines the source data of a data transfer.
When working with new database systems, using the Data Profiling task can help you gain familiarity with the database.
There is a wide range of file and network tasks that help your SSIS packages interact with external services such as e-mail, the file system and FTP sites.
To create custom components for SSIS packages, a number of scripting tasks are available, including the Script task and the ActiveX Script task.
SSIS packages include a number of maintenance tasks that help you to automate and control the flow of database administrative tasks.
You can easily add tasks to the Control Flow designer. Different tasks have different properties that must be configured correctly.
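For comparison, the Bulk Insert task wraps the same mechanism as the Transact-SQL BULK INSERT statement. A sketch, with a hypothetical file path and delimiter settings:

```sql
-- Fast, transformation-free load from a delimited text file (path is illustrative).
BULK INSERT dbo.StageProduct
FROM 'D:\Labfiles\Products.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
```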

Implementing Control Flow Tasks: Part 2


In this lesson, you have learned the following key points:

You can use the Execute Package task to coordinate the execution of multiple packages for complex data loads.
To use the Execute DTS 2000 Package task, the DTS runtime must be installed on the computer on which the package runs.
You can use the Analysis Services Processing task at the end of an SSIS package workflow that loads a data warehouse used as the source data for OLAP cubes.
You can use the range of tasks available in the Toolbox under the maintenance section to automate the administration of SQL Server.
WMI tasks can be used to respond to events that WMI is set to watch, so that event-driven package logic can be added.

Working with Precedence Constraints and Containers


In this lesson, you have learned the following key points:

You can use outcome-based precedence constraints to control the workflow of an SSIS package based on the success, failure or completion of a task.
You can use expression-based precedence constraints to further refine the workflow logic of an SSIS package.
Multiple precedence constraints must be managed to determine whether a task executes when all constraints are satisfied or when only one constraint is satisfied.
Sequence containers are useful for grouping together related tasks that require the same properties applied or need to be transactionally consistent as a unit.
For Loop containers are useful when you can predict the number of loops required.
Foreach Loop containers are useful when you cannot predict the number of loops required.

Working with Variables


In this lesson, you have learned the following key points:

System-defined variables are automatically available in an SSIS package.
You can create user-defined variables to store data from Execute SQL tasks and other tasks.
System variables and user-defined variable values can be passed between tasks in the same package and between different packages to provide flexibility within a package.
You can use a user-defined variable to affect SSIS package logic.


Lab: Implementing Packages and Control Flow in Microsoft SQL Server 2008


Glossary
.NET Framework
An integral Windows component that supports building, deploying and running the next generation of applications and Web services. It provides a highly productive, standards-based, multilanguage environment for integrating existing investments with next-generation applications and services, as well as the agility to solve the challenges of deployment and operation of Internet-scale applications. The .NET Framework consists of three main parts: the common language runtime, a hierarchical set of unified class libraries and a componentized version of ASP called ASP.NET.

ad hoc report
An .rdl report created with Report Builder that accesses report models.

aggregation
A table or structure that contains precalculated data for a cube.

aggregation design
In Analysis Services, the process of defining how an aggregation is created.

aggregation prefix
A string that is combined with a system-defined ID to create a unique name for a partition's aggregation table.

ancestor
A member in a superior level in a dimension hierarchy that is related through lineage to the current member within the dimension hierarchy.

attribute
The building block of dimensions and their hierarchies that corresponds to a single column in a dimension table.

attribute relationship
The hierarchy associated with an attribute containing a single level based on the corresponding column in a dimension table.


axis
A set of tuples. Each tuple is a vector of members. A set of axes defines the coordinates of a multidimensional data set.

ActiveX Data Objects
Component Object Model objects that provide access to data sources. This API provides a layer between OLE DB and programming languages such as Visual Basic, Visual Basic for Applications, Active Server Pages and Microsoft Internet Explorer Visual Basic Scripting.

ActiveX Data Objects (Multidimensional)
A high-level, language-independent set of object-based data access interfaces optimized for multidimensional data applications.

ActiveX Data Objects MultiDimensional.NET
A managed data provider used to communicate with multidimensional data sources.

ADO MD
See Other Term: ActiveX Data Objects (Multidimensional)

ADOMD.NET
See Other Term: ActiveX Data Objects MultiDimensional.NET

AMO
See Other Term: Analysis Management Objects

Analysis Management Objects
The complete library of programmatically accessed objects that let an application manage a running instance of Analysis Services.

balanced hierarchy
A dimension hierarchy in which all leaf nodes are the same distance from the root node.

calculated column
A column in a table that displays the result of an expression instead of stored data.

calculated field
A field, defined in a query, that displays the result of an expression instead of stored data.


calculated member
A member of a dimension whose value is calculated at run time by using an expression.

calculation condition
An MDX logical expression that is used to determine whether a calculation formula will be applied against a cell in a calculation subcube.

calculation formula
An MDX expression used to supply a value for cells in a calculation subcube, subject to the application of a calculation condition.

calculation pass
A stage of calculation in a multidimensional cube in which applicable calculations are evaluated.

calculation subcube
The set of multidimensional cube cells that is used to create a calculated cells definition. The set of cells is defined by a combination of MDX set expressions.

case
In data mining, a case is an abstract view of data characterized by attributes and relations to other cases.

case key
In data mining, the element of a case by which the case is referenced within a case set.

case set
In data mining, a set of cases.

cell
In a cube, the set of properties, including a value, specified by the intersection when one member is selected from each dimension.

cellset
In ADO MD, an object that contains a collection of cells selected from cubes or other cellsets by a multidimensional query.


changing dimension
A dimension that has a flexible member structure, and is designed to support frequent changes to structure and data.

chart data region
A report item on a report layout that displays data in a graphical format.

child
A member in the next lower level in a hierarchy that is directly related to the current member.

clickthrough report
A report that displays related report model data when you click data within a rendered Report Builder report.

clustering
A data mining technique that analyzes data to group records together according to their location within the multidimensional attribute space.

collation
A set of rules that determines how data is compared, ordered and presented.

column-level collation
Supporting multiple collations in a single instance.

composite key
A key composed of two or more columns.

concatenation
The combining of two or more character strings or expressions into a single character string or expression, or of two or more binary strings or expressions into a single binary string or expression.

concurrency
A process that allows multiple users to access and change shared data at the same time. SQL Server uses locking to allow multiple users to access and change shared data at the same time without conflicting with each other.


conditional split
A Data Flow transformation that routes data rows to different outputs depending on the content of the data.

config file
See Other Term: configuration file

configuration
In reference to a single microcomputer, the sum of a system's internal and external components, including memory, disk drives, keyboard, video and generally less critical add-on hardware, such as a mouse, modem or printer.

configuration file
A file that contains machine-readable operating specifications for a piece of hardware or software, or that contains information about another file or about a specific user.

configurations
In Integration Services, a name or value pair that updates the value of package objects when the package is loaded.

connection
An interprocess communication (IPC) linkage established between a SQL Server application and an instance of SQL Server.

connection manager
In Integration Services, a logical representation of a run-time connection to a data source.

constant
A group of symbols that represent a specific data value.

container
A control flow element that provides package structure.

control flow
The ordered workflow in an Integration Services package that performs tasks.


control-break report
A report that summarizes data in user-defined groups or breaks. A new group is triggered when different data is encountered.

cube
A set of data that is organized and summarized into a multidimensional structure defined by a set of dimensions and measures.

cube role
A collection of users and groups with the same access to a cube.

custom rollup
An aggregation calculation that is customized for a dimension level or member, and that overrides the aggregate functions of a cube's measures.

custom rule
In a role, a specification that limits the dimension members or cube cells that users in the role are permitted to access.

custom variable
An aggregation calculation that is customized for a dimension level or member and overrides the aggregate functions of a cube's measures.

data dictionary
A set of system tables, stored in a catalog, that includes definitions of database structures and related information, such as permissions.

data explosion
The exponential growth in size of a multidimensional structure, such as a cube, due to the storage of aggregated data.

data flow
The ordered workflow in an Integration Services package that extracts, transforms and loads data.

data flow engine
An engine that executes the data flow in a package.


data flow task
Encapsulates the data flow engine that moves data between sources and destinations, providing the facility to transform, clean and modify data as it is moved.

data integrity
A state in which all the data values stored in the database are correct.

data manipulation language
The subset of SQL statements that is used to retrieve and manipulate data.

data mart
A subset of the contents of a data warehouse.

data member
A child member associated with a parent member in a parent-child hierarchy.

data mining
The process of analyzing data to identify patterns or relationships.

data processing extension
A component in Reporting Services that is used to retrieve report data from an external data source.

data region
A report item that displays repeated rows of data from an underlying dataset in a table, matrix, list or chart.

data scrubbing
Part of the process of building a data warehouse out of data coming from multiple (OLTP) systems.

data source
In ADO and OLE DB, the location of a source of data exposed by an OLE DB provider. Also, the source of data for an object such as a cube or dimension, or the specification of the information necessary to access source data; it sometimes refers to an object of ClassType clsDataSource. In Reporting Services, a specified data source type, connection string and credentials, which can be saved separately to a report server and shared among report projects or embedded in a .rdl file.


data source name
The name assigned to an ODBC data source.

data source view
A named selection of database objects that defines the schema referenced by OLAP and data mining objects in an Analysis Services database.

data warehouse
A database specifically structured for query and analysis.

database role
A collection of users and groups with the same access to an Analysis Services database.

data-driven subscription
A subscription in Reporting Services that uses a query to retrieve subscription data from an external data source at run time.

datareader
A stream of data that is returned by an ADO.NET query.

dataset
In OLE DB for OLAP, the set of multidimensional data that is the result of running an MDX SELECT statement. In Reporting Services, a named specification that includes a data source definition, a query definition and options.

decision support
Systems designed to support the complex analysis required to discover business trends.

decision tree
A treelike model of data produced by certain data mining methods.

default member
The dimension member used in a query when no member is specified for the dimension.


delimited identifier
An object in a database that requires the use of special characters (delimiters) because the object name does not comply with the formatting rules of regular identifiers.

delivery channel type
The protocol for a delivery channel, such as Simple Mail Transfer Protocol (SMTP) or File.

delivery extension
A component in Reporting Services that is used to distribute a report to specific devices or target locations.

density
In an index, the frequency of duplicate values. In a data file, a percentage that indicates how full a data page is. In Analysis Services, the percentage of cells that contain data in a multidimensional structure.

dependencies
Objects that depend on other objects in the same database.

derived column
A transformation that creates new column values by applying expressions to transformation input columns.

descendant
A member in a dimension hierarchy that is related to a member of a higher level within the same dimension.

destination
An Integration Services data flow component that writes the data from the data flow into a data source or creates an in-memory dataset.

destination adapter
A data flow component that loads data into a data store.

dimension
A structural attribute of a cube, which is an organized hierarchy of categories (levels) that describe data in the fact table.

dimension granularity
The lowest level available to a particular dimension in relation to a particular measure group.

dimension table
A table in a data warehouse whose entries describe data in a fact table. Dimension tables contain the data from which dimensions are created.

discretized column
A column that represents finite, counted data.

document map
A navigation pane in a report arranged in a hierarchy of links to report sections and groups.

drill down/drill up
To navigate through levels of data ranging from the most summarized (up) to the most detailed (down).

drill through
In Analysis Services, to retrieve the detailed data from which the data in a cube cell was summarized. In Reporting Services, to open related reports by clicking hyperlinks in the main drillthrough report.

drilldown/drillup
A technique for navigating through levels of data ranging from the most summarized (up) to the most detailed (down).

drillthrough
In Analysis Services, a technique to retrieve the detailed data from which the data in a cube cell was summarized. In Reporting Services, a way to open related reports by clicking hyperlinks in the main drillthrough report.

drillthrough report
A report with the 'enable drilldown' option selected. Drillthrough reports contain hyperlinks to related reports.


dynamic connection string
In Reporting Services, an expression that you build into the report, allowing the user to select which data source to use at run time. You must build the expression and data source selection list into the report when you create it.

Data Mining Model Training
The process a data mining model uses to estimate model parameters by evaluating a set of known and predictable data.

entity
In Reporting Services, an entity is a logical collection of model items, including source fields, roles, folders and expressions, presented in familiar business terms.

executable
In Integration Services, a package, Foreach Loop, For Loop, Sequence or task.

execution tree
The path of data in the data flow of a SQL Server 2008 Integration Services package from sources through transformations to destinations.

expression
In SQL, a combination of symbols and operators that evaluate to a single data value. In Integration Services, a combination of literals, constants, functions and operators that evaluate to a single data value.

ETL
Extraction, transformation and loading. The complex process of copying and cleaning data from heterogeneous sources.

fact
A row in a fact table in a data warehouse. A fact contains values that define a data event such as a sales transaction.

fact dimension
A relationship between a dimension and a measure group in which the dimension main table is the same as the measure group table.


fact table
A central table in a data warehouse schema that contains numerical measures and keys relating facts to dimension tables.

field length
In bulk copy, the maximum number of characters needed to represent a data item in a bulk copy character format data file.

field terminator
In bulk copy, one or more characters marking the end of a field or row, separating one field or row in the data file from the next.

filter expression
An expression used for filtering data in the Filter operator.

flat file
A file consisting of records of a single record type, in which there is no embedded structure information governing relationships between records.

flattened rowset
A multidimensional data set presented as a two-dimensional rowset in which unique combinations of elements of multiple dimensions are combined on an axis.

folder hierarchy
A bounded namespace that uniquely identifies all reports, folders, shared data source items and resources that are stored in and managed by a report server.

format file
A file containing meta information (such as data type and column size) that is used to interpret data when being read from or written to a data file.

File connection manager
In Integration Services, a logical representation of a connection that enables a package to reference an existing file or folder or to create a file or folder at run time.

For Loop container
In Integration Services, a container that runs a control flow repeatedly by testing a condition.


Foreach Loop container
In Integration Services, a container that runs a control flow repeatedly by using an enumerator.

Fuzzy Grouping
In Integration Services, a data cleaning methodology that examines values in a dataset and identifies groups of related data rows and the one data row that is the canonical representation of the group.

global assembly cache
A machine-wide code cache that stores assemblies specifically installed to be shared by many applications on the computer.

grant
To apply permissions to a user account, which allows the account to perform an activity or work with data.

granularity
The degree of specificity of information that is contained in a data element.

granularity attribute
The single attribute that is used to specify the level of granularity for a given dimension in relation to a given measure group.

grid
A view type that displays data in a table.

grouping
A set of data that is grouped together in a report.

hierarchy
A logical tree structure that organizes the members of a dimension such that each member has one parent member and zero or more child members.

hybrid OLAP
A storage mode that uses a combination of multidimensional data structures and relational database tables to store multidimensional data.


HTML Viewer
A UI component consisting of a report toolbar and other navigation elements used to work with a report.

input member
A member whose value is loaded directly from the data source instead of being calculated from other data.

input set
The set of data provided to an MDX value expression upon which the expression operates.

isolation level
The property of a transaction that controls the degree to which data is isolated for use by one process and is guarded against interference from other processes. Setting the isolation level defines the default locking behavior for all SELECT statements in your SQL Server session.

item-level role assignment
A security policy that applies to an item in the report server folder namespace.

item-level role definition
A security template that defines a role used to control access to or interaction with an item in the report server folder namespace.

key
A column or group of columns that uniquely identifies a row (primary key), defines the relationship between two tables (foreign key) or is used to build an index.

key attribute
The attribute of a dimension that links the non-key attributes in the dimension to related measures.

key column
In an Analysis Services dimension, an attribute property that uniquely identifies the attribute members. In an Analysis Services mining model, a data mining column that uniquely identifies each case in a case table.

key performance indicator
A quantifiable, standardized metric that reflects a critical business variable (for instance, market share), measured over time.

KPI
See Other Term: key performance indicator

latency
The amount of time that elapses between when a data change is completed at one server and when that change appears at another server.

leaf
In a tree structure, an element that has no subordinate elements.

leaf level
The bottom level of a clustered or nonclustered index.

leaf member
A dimension member without descendants.

level
The name of a set of members in a dimension hierarchy such that all members of the set are at the same distance from the root of the hierarchy.

lift chart
In Analysis Services, a chart that compares the accuracy of the predictions of each data mining model in the comparison set.

linked dimension
In Analysis Services, a reference in a cube to a dimension in a different cube.

linked measure group
In Analysis Services, a reference in a cube to a measure group in a different cube.

linked report
A report that references an existing report definition by using a different set of parameter values or properties.

list data region
A report item on a report layout that displays data in a list format.


local cube
A cube created and stored with the extension .cub on a local computer using PivotTable Service.

lookup table
In Integration Services, a reference table for comparing, matching or extracting data.

many-to-many dimension
A relationship between a dimension and a measure group in which a single fact may be associated with many dimension members and a single dimension member may be associated with many facts.

matrix data region
A report item on a report layout that displays data in a variable columnar format.

measure
In a cube, a set of values that are usually numeric and are based on a column in the fact table of the cube. Measures are the central values that are aggregated and analyzed.

measure group
All the measures in a cube that derive from a single fact table in a data source view.

member
An item in a dimension representing one or more occurrences of data.

member property
Information about an attribute member, for example, the gender of a customer member or the color of a product member.

mining structure
A data mining object that defines the data domain from which the mining models are built.

multidimensional OLAP
A storage mode that uses a proprietary multidimensional structure to store a partition's facts and aggregations or a dimension.

multidimensional structure
A database paradigm that treats data as cubes that contain dimensions and measures in cells.


MDX
A syntax used for defining multidimensional objects and querying and manipulating multidimensional data.

Mining Model
An object that contains the definition of a data mining process and the results of the training activity.

Multidimensional Expression
A syntax used for defining multidimensional objects and querying and manipulating multidimensional data.

named set
A set of dimension members or a set expression that is created for reuse, for example, in MDX queries.

natural hierarchy
A hierarchy in which at every level there is a one-to-many relationship between members in that level and members in the next lower level.

nested table
A data mining model configuration in which a column of a table contains a table.

nonleaf
In a tree structure, an element that has one or more subordinate elements. In Analysis Services, a dimension member that has one or more descendants. In SQL Server indexes, an intermediate index node that points to other intermediate nodes or leaf nodes.

nonleaf member
A member with one or more descendants.

normalization rules
A set of database design rules that minimizes data redundancy and results in a database in which the Database Engine and application software can easily enforce integrity.

Non-scalable EM
A Microsoft Clustering algorithm method that uses a probabilistic method to determine the probability that a data point exists in a cluster.


Non-scalable K-means
A Microsoft Clustering algorithm method that uses a distance measure to assign a data point to its closest cluster.

object identifier
A unique name given to an object. In Metadata Services, a unique identifier constructed from a globally unique identifier (GUID) and an internal identifier.

online analytical processing
A technology that uses multidimensional structures to provide rapid access to data for analysis.

online transaction processing
A data processing system designed to record all of the business transactions of an organization as they occur. An OLTP system is characterized by many concurrent users actively adding and modifying data.

overfitting
The characteristic of some data mining algorithms that assigns importance to random variations in data by viewing them as important patterns.

ODBC data source
The location of a set of data that can be accessed using an ODBC driver. Also, a stored definition that contains all of the connection information an ODBC application requires to connect to the data source.

ODBC driver
A dynamic-link library (DLL) that an ODBC-enabled application, such as Excel, can use to access an ODBC data source.

OLAP
See Other Term: online analytical processing

OLE DB
A COM-based API for accessing data. OLE DB supports accessing data stored in any format for which an OLE DB provider is available.


OLE DB for OLAP
Formerly, the separate specification that addressed OLAP extensions to OLE DB. Beginning with OLE DB 2.0, OLAP extensions are incorporated into the OLE DB specification.

package
A collection of control flow and data flow elements that runs as a unit.

padding
A string, typically added when the last plaintext block is short. The space allotted in a cell to create or maintain a specific size.

parameterized report
A published report that accepts input values through parameters.

parent
A member in the next higher level in a hierarchy that is directly related to the current member.

partition
In replication, a subset of rows from a published table, created with a static row filter or a parameterized row filter. In Analysis Services, one of the storage containers for data and aggregations of a cube. Every cube contains one or more partitions. For a cube with multiple partitions, each partition can be stored separately in a different physical location. Each partition can be based on a different data source. Partitions are not visible to users; the cube appears to be a single object. In the Database Engine, a unit of a partitioned table or index.

partition function
A function that defines how the rows of a partitioned table or index are spread across a set of partitions based on the values of certain columns, called partitioning columns.

partition scheme
A database object that maps the partitions of a partition function to a set of filegroups.

partitioned index
An index built on a partition scheme, and whose data is horizontally divided into units which may be spread across more than one filegroup in a database.
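The partition function, partition scheme and partitioned table entries fit together in a fixed order: create the function, map it to filegroups with a scheme, then build the table on the scheme. A minimal Transact-SQL sketch (the table, column and filegroup names are illustrative, and the filegroups must already exist in the database):

```sql
-- Partition function: maps OrderDate values into four ranges.
CREATE PARTITION FUNCTION pfOrderDate (datetime)
AS RANGE RIGHT FOR VALUES ('2007-01-01', '2008-01-01', '2009-01-01');

-- Partition scheme: maps each of the four partitions to a filegroup.
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate TO (FG2006, FG2007, FG2008, FG2009);

-- Partitioned table: OrderDate is the partitioning column.
CREATE TABLE dbo.FactOrders
(
    OrderKey  int      NOT NULL,
    OrderDate datetime NOT NULL
) ON psOrderDate (OrderDate);
```

Three boundary values with RANGE RIGHT yield four partitions, which is why the scheme lists four filegroups.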


partitioned snapshot
In merge replication, a snapshot that includes only the data from a single partition.

partitioned table
A table built on a partition scheme, and whose data is horizontally divided into units which may be spread across more than one filegroup in a database.

partitioning
The process of replacing a table with multiple smaller tables.

partitioning column
The column of a table or index that a partition function uses to partition a table or index.

perspective
A user-defined subset of a cube.

pivot
To rotate rows to columns, and columns to rows, in a crosstabular data browser. To choose dimensions from the set of available dimensions in a multidimensional data structure for display in the rows and columns of a crosstabular structure.

polling query
A singleton query that returns a value Analysis Services can use to determine whether changes have been made to a table or other relational object.

precedence constraint
A control flow element that connects tasks and containers into a sequenced workflow.

predictable column
A data mining column that the algorithm will build a model around based on values of the input columns.

prediction
A data mining technique that analyzes existing data and uses the results to predict values of attributes for new records or missing attributes in existing records.
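A polling query is typically a one-row, one-column statement whose result changes whenever the tracked table changes. A sketch, with an illustrative table and column name:

```sql
-- Singleton polling query: Analysis Services compares the returned
-- value to the one from the previous poll to detect changes.
SELECT MAX(LastUpdated) FROM dbo.FactSales;
```

Any expression that reliably changes on update (a maximum timestamp, a row count) can serve the same purpose.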


proactive caching
A system that manages data obsolescence in a cube by which objects in MOLAP storage are automatically updated and processed in cache while queries are redirected to ROLAP storage.

process
In a cube, to populate a cube with data and aggregations. In a data mining model, to populate a data mining model with data mining content.

profit chart
In Analysis Services, a chart that displays the theoretical increase in profit that is associated with using each model.

properties page
A dialog box that displays information about an object in the interface.

property
A named attribute of a control, field or database object that you set to define one of the object's characteristics, such as size, color or screen location; or an aspect of its behavior, such as whether it is hidden.

property mapping
A mapping between a variable and a property of a package element.

property page
A tabbed dialog box where you can identify the characteristics of tables, relationships, indexes, constraints and keys.

protection level
In Integration Services, determines the protection method, the password or user key and the scope of package protection.

ragged hierarchy
See Other Term: unbalanced hierarchy

raw file
In Integration Services, a native format for fast reading and writing of data to files.


recursive hierarchy
A hierarchy of data in which all parent-child relationships are represented in the data.

reference dimension
A relationship between a dimension and a measure group in which the dimension is coupled to the measure group through another dimension. This behaves like a snowflake dimension, except that attributes are not shared between the two dimensions.

reference table
The source table to use in fuzzy lookups.

refresh data
The series of operations that clears data from a cube, loads the cube with new data from the data warehouse and calculates aggregations.

relational database
A database or database management system that stores information in tables as rows and columns of data, and conducts searches by using the data in specified columns of one table to find additional data in another table.

relational database management system
A system that organizes data into related rows and columns.

relational OLAP
A storage mode that uses tables in a relational database to store multidimensional structures.

rendered report
A fully processed report that contains both data and layout information, in a format suitable for viewing.

rendering
A component in Reporting Services that is used to process the output format of a report.

rendering extension(s)
A plug-in that renders reports to a specific format.

rendering object model
Report object model used by rendering extensions.


replay
In SQL Server Profiler, the ability to open a saved trace and play it again.

report definition
The blueprint for a report before the report is processed or rendered. A report definition contains information about the query and layout for the report.

report execution snapshot
A report snapshot that is cached.

report history
A collection of report snapshots that are created and saved over time.

report history snapshot
A report snapshot that appears in report history.

report intermediate format
A static report history that contains data captured at a specific point in time.

report item
Any object, such as a text box, graphical element or data region, that exists on a report layout.

report layout
In report designer, the placement of fields, text and graphics within a report. In report builder, the placement of fields and entities within a report, plus applied formatting styles.

report layout template
A predesigned table, matrix or chart report template in report builder.

report link
A URL to a hyperlinked report.

report model
A metadata description of business data used for creating ad hoc reports in report builder.

report processing extension
A component in Reporting Services that is used to extend the report processing logic.

report rendering
The action of combining the report layout with the data from the data source for the purpose of viewing the report.

report server database
A database that provides internal storage for a report server.

report server execution account
The account under which the Report Server Web service and Report Server Windows service run.

report server folder namespace
A hierarchy that contains predefined and user-defined folders. The namespace uniquely identifies reports and other items that are stored in a report server. It provides an addressing scheme for specifying reports in a URL.

report snapshot
A static report that contains data captured at a specific point in time.

report-specific schedule
A schedule defined inline with a report.

resource
Any item in a report server database that is not a report, folder or shared data source item.

role
A SQL Server security account that is a collection of other security accounts that can be treated as a single unit when managing permissions. A role can contain SQL Server logins, other roles, and Windows logins or groups. In Analysis Services, a role uses Windows security accounts to limit scope of access and permissions when users access databases, cubes, dimensions and data mining models. In a database mirroring session, the principal server and mirror server perform complementary principal and mirror roles. Optionally, the role of witness is performed by a third server instance.

role assignment
Definition of user access rights to an item. In Reporting Services, a security policy that determines whether a user or group can access a specific item and perform an operation.

role definition
A collection of tasks performed by a user (for example, browser or administrator). In Reporting Services, a named collection of tasks that defines the operations a user can perform on a report server.

role-playing dimension
A single database dimension joined to the fact table on different foreign keys to produce multiple cube dimensions.

RDBMS
See Other Term: relational database management system

RDL
See Other Term: Report Definition Language

Report Definition Language
A set of instructions that describe layout and query information for a report.

Report Server service
A Windows service that contains all the processing and management capabilities of a report server.

Report Server Web service
A Web service that hosts, processes and delivers reports.

ReportViewer controls
A Web server control and Windows Form control that provides embedded report processing in ASP.NET and Windows Forms applications.

scalar
A single-value field, as opposed to an aggregate.

scalar aggregate
An aggregate function, such as MIN(), MAX() or AVG(), that is specified in a SELECT statement column list that contains only aggregate functions.

scale bar
The line on a linear gauge on which tick marks are drawn, analogous to an axis on a chart.
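A scalar aggregate produces a single row because the SELECT column list contains only aggregate functions. A sketch with illustrative table and column names:

```sql
-- Scalar aggregates: no GROUP BY, column list is all aggregates,
-- so exactly one row is returned.
SELECT MIN(UnitPrice) AS MinPrice,
       MAX(UnitPrice) AS MaxPrice,
       AVG(UnitPrice) AS AvgPrice
FROM dbo.OrderDetails;
```

Adding a non-aggregated column to the list would require a GROUP BY clause, at which point the aggregates are no longer scalar.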

scope
The extent to which a variable can be referenced in a DTS package.

script
A collection of Transact-SQL statements used to perform an operation.

security extension
A component in Reporting Services that authenticates a user or group to a report server.

semiadditive
A measure that can be summed along one or more, but not all, dimensions in a cube.

serializable
The highest transaction isolation level. Serializable transactions lock all rows they read or modify to ensure the transaction is completely isolated from other tasks.

server
The network location from which report builder is launched and where a report is saved, managed and published.

server admin
A user with elevated privileges who can access all settings and content of a report server.

server aggregate
An aggregate value that is calculated on the data source server and included in a result set by the data provider.

shared data source item
Data source connection information that is encapsulated in an item.

shared dimension
A dimension created within a database that can be used by any cube in the database.

shared schedule
Schedule information that can be referenced by multiple items.

sibling
A member in a dimension hierarchy that is a child of the same parent as a specified member.
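The serializable isolation level is set per session in Transact-SQL. A sketch, with an illustrative table name:

```sql
-- Serializable: the strictest isolation level. Rows (and ranges) read
-- inside the transaction are locked until it commits or rolls back.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
    -- This range stays stable for the life of the transaction; other
    -- sessions cannot insert a qualifying row until it completes.
    SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = 42;
COMMIT TRANSACTION;
```

The isolation comes at a concurrency cost, which is why less restrictive levels are the default.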

slice
A subset of the data in a cube, specified by limiting one or more dimensions by members of the dimension.

smart tag
A feature that exposes key configurations directly on the design surface to enhance overall design-time productivity in Visual Studio 2005.

snowflake schema
An extension of a star schema such that one or more dimensions are defined by multiple tables.

source
An Integration Services data flow component that extracts data from a data store, such as files and databases.

source control
A way of storing and managing different versions of source code files and other files used in software development projects. Also known as configuration management and revision control.

source cube
The cube on which a linked cube is based.

source database
In data warehousing, the database from which data is extracted for use in the data warehouse. A database on the Publisher from which data and database objects are marked for replication as part of a publication that is propagated to Subscribers.

source object
The single object to which all objects in a particular collection are connected by way of relationships that are all of the same relationship type.

source partition
An Analysis Services partition that is merged into another and is deleted automatically at the end of the merger process.

sparsity
The relative percentage of a multidimensional structure's cells that do not contain data.


star join
A join between a fact table (typically a large fact table) and at least two dimension tables.

star query
A query that joins a fact table and a number of dimension tables.

star schema
A relational database structure in which data is maintained in a single fact table at the center of the schema with additional dimension data stored in dimension tables.

subreport
A report contained within another report.

subscribing server
A server running an instance of Analysis Services that stores a linked cube.

subscription
A request for a copy of a publication to be delivered to a Subscriber.

subscription database
A database at the Subscriber that receives data and database objects published by a Publisher.

subscription event rule
A rule that processes information for event-driven subscriptions.

subscription scheduled rule
One or more Transact-SQL statements that process information for scheduled subscriptions.

Secure Sockets Layer (SSL)
A proposed open standard for establishing a secure communications channel to prevent the interception of critical information, such as credit card numbers. Primarily, it enables secure electronic financial transactions on the World Wide Web, although it is designed to work on other Internet services as well.

Semantic Model Definition Language
A set of instructions that describe layout and query information for reports created in report builder.
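A star query joins one fact table to several dimension tables on their surrogate keys. A sketch against a typical warehouse star schema (the table and column names are illustrative):

```sql
-- Star query: the fact table joins to two dimension tables, and the
-- measures are aggregated by dimension attributes.
SELECT d.CalendarYear,
       p.ProductName,
       SUM(f.SalesAmount) AS TotalSales
FROM dbo.FactSales  AS f
JOIN dbo.DimDate    AS d ON f.DateKey    = d.DateKey
JOIN dbo.DimProduct AS p ON f.ProductKey = p.ProductKey
GROUP BY d.CalendarYear, p.ProductName;
```

The same shape with three or more dimension tables is what the star join entry describes.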


Sequence container
A container that defines a control flow that is a subset of the package control flow.

table data region
A report item on a report layout that displays data in a columnar format.

tablix
A Reporting Services RDL data region that contains rows and columns resembling a table or matrix, possibly sharing characteristics of both.

target partition
An Analysis Services partition into which another is merged, and which contains the data of both partitions after the merger.

temporary stored procedure
A procedure placed in the temporary database, tempdb, and erased at the end of the session.

time dimension
A dimension that breaks time down into levels such as Year, Quarter, Month and Day. In Analysis Services, a special type of dimension created from a date/time column.

transformation
In data warehousing, the process of changing data extracted from source data systems into arrangements and formats consistent with the schema of the data warehouse. In Integration Services, a data flow component that aggregates, merges, distributes and modifies column data and rowsets.

transformation error output
Information about a transformation error.

transformation input
Data that is contained in a column, which is used during a join or lookup process, to modify or aggregate data in the table to which it is joined.

transformation output
Data that is returned as a result of a transformation procedure.
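A temporary stored procedure is created with a # prefix, which places it in tempdb and scopes it to the session. A sketch with illustrative names:

```sql
-- The # prefix makes the procedure session-local and stores it in
-- tempdb; it is dropped automatically when the session ends.
-- (A ## prefix would make it global to all sessions instead.)
CREATE PROCEDURE #GetOrderCount
    @CustomerID int
AS
    SELECT COUNT(*) AS OrderCount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
GO

EXEC #GetOrderCount @CustomerID = 42;
```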


tuple
Uniquely identifies a cell, based on a combination of attribute members from every attribute hierarchy in the cube.

two-phase commit
A process that ensures transactions that apply to more than one server are completed on all servers or on none.

unbalanced hierarchy
A hierarchy in which one or more levels do not contain members in one or more branches of the hierarchy.

unknown member
A member of a dimension for which no key is found during processing of a cube that contains the dimension.

unpivot
In Integration Services, the process of creating a more normalized dataset by expanding data columns in a single record into multiple records.

value expression
An expression in MDX that returns a value. Value expressions can operate on sets, tuples, members, levels, numbers or strings.

variable interval
An option on a Reporting Services chart that can be specified to automatically calculate the optimal number of labels that can be placed on an axis, based on the chart width or height.

vertical partitioning
To segment a single table into multiple tables based on selected columns.

very large database
A database that has become large enough to be a management challenge, requiring extra attention to people and processes.

visual
A displayed, aggregated cell value for a dimension member that is consistent with the displayed cell values for its displayed children.
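The unpivot entry describes an Integration Services transformation, but Transact-SQL expresses the same operation with the UNPIVOT operator. A sketch, where dbo.QuarterlySales with columns Q1 to Q4 is an illustrative table:

```sql
-- UNPIVOT rotates the four quarter columns of each row into four rows,
-- producing a more normalized (ProductID, Quarter, Sales) shape.
SELECT ProductID, Quarter, Sales
FROM dbo.QuarterlySales
UNPIVOT (Sales FOR Quarter IN (Q1, Q2, Q3, Q4)) AS u;
```

The pivot entry earlier in this glossary is the inverse rotation, from rows back to columns.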

VLDB
See Other Term: very large database

write back
To update a cube cell value, member or member property value.

write enable
To change a cube or dimension so that users in cube roles with read/write access to the cube or dimension can change its data.

writeback
In SQL Server, the update of a cube cell value, member or member property value.

Web service
In Reporting Services, a service that uses Simple Object Access Protocol (SOAP) over HTTP and acts as a communications interface between client programs and the report server.

XML for Analysis
A specification that describes an open standard that supports data access to data sources that reside on the World Wide Web.

XMLA
See Other Term: XML for Analysis

