
INTRODUCTION

Most modern database management systems are based on the SQL language standard. You'll find, however, that most are based somewhat loosely on that standard, choosing instead to employ their own variations and additions to the SQL language. There is also some confusion as to what is considered part of the SQL language and what has been added as part of manufacturers' DBMS products. In this chapter, you'll get a formal introduction to the SQL language standard and to the types of variations that have been added by different DBMSs. You will learn about basic language components through simple command examples. The chapter introduces representative DDL and DML command statements through the standard command syntax and gives some insight into how some DBMS providers have modified these statements to meet their particular needs and design visions.

6.1 Introducing the SQL Language


There are two aspects of data management: data definition and data manipulation. Data definition requires a data definition language (DDL) that instructs the DBMS software as to what tables will be in the database, what attributes will be in the tables, which attributes will be indexed, and so forth. Data manipulation refers to the four basic operations performed on data stored in any DBMS: data retrieval, data update, insertion of new records, and deletion of existing records. Data manipulation requires a special language with which users can communicate data manipulation commands to the DBMS. As a class, these commands are known as data manipulation language (DML). Data retrieval statements are sometimes placed in a different category, separate from DML, as data query language (DQL) statements. For the purposes of this chapter, retrieval statements are treated as part of the DML set.

The current relational database management system language, Structured Query Language (SQL), grew out of work at IBM in the 1970s, was commercialized by the early 1980s, and incorporates both DDL and DML features. SQL has been declared a standard by the American National Standards Institute (ANSI) and by the International Organization for Standardization (ISO), with several standard versions issued since its initial development. Many manufacturers have produced versions of SQL, most quite similar, including mainstream DBMSs such as DB2, Oracle, MySQL, Informix, and Microsoft SQL Server.

6.1.1 Understanding SQL Features

The SQL standard defines the features supported by standard SQL implementations. Most vendor implementations, however, go beyond the SQL standard. They offer the standard features and support most, if not all, of the standard command set, but provide their own extensions to the language, giving them added functionality. These extensions usually take the form of additional commands or additional command options.

There are also differences in how features that are not fully defined in the SQL standard are implemented. For example, the SQL standard defines basic database objects such as tables, views, and indexes. It does not define how these features are physically implemented. Microsoft SQL Server stores database objects in one or more database files, with each file containing one or more database objects. MySQL, on the other hand, creates separate files for database objects. Each table in MySQL can effectively be thought of as stored in its own operating system file. Basic features of SQL can be described as falling within the following categories:

Data definition language (DDL): statements used to create and maintain database objects.
Data manipulation language (DML): statements used to retrieve and manipulate data.
Command operators: operator symbols and keywords used to run arithmetic, comparison, and logical operations.
Functions: special executables that return values.
Transaction control: statements used to initiate and complete or abort transaction processing.

What this means is that some things that you might have considered part of SQL, because they are common to relational database systems, are not. Instead, they are features and command extensions implemented first by one DBMS provider and sometimes copied by others. For example, the SQL language specifies nothing about scheduled command execution, but this is a feature supported by Microsoft SQL Server and many others. The SQL language defines procedures, which are compiled sets of executable statements, in the context of the statements used to create and modify them. It does not include any specific management procedures to be included with a DBMS, but most DBMSs do support a wide array of predefined management procedures that install with the database server and are treated as part of the vendor's language set.

6.1.2 Using SQL

There are two basic options for executing SQL commands: interactive SQL and embedded SQL. Interactive SQL, sometimes loosely called dynamic SQL, refers to statements that you run directly, interacting with the database server. Embedded SQL refers to functionality that is embedded in a procedure or part of an application written in a different programming language. Query mode execution, another term for interactive execution, and embedded SQL are both supported by all current DBMS products. Where DBMSs differ is in how they support them and what tools they provide. There is also, in many cases, a difference in what is called the language set included with the DBMS. For example, when you run statements in SQL Server, you're running Transact-SQL (or T-SQL) statements. Many of these support command options not supported by standard SQL. Also, a small number of commands defined in standard SQL are not included as part of the Transact-SQL language.

Interactive SQL

All DBMSs provide some sort of command interface or command prompt for running SQL commands interactively. Microsoft SQL Server, for example, supports two basic options. The first is a character-based command prompt where statements can be typed in one at a time. Statements can be executed individually or as a set of executable statements known as a batch. You can also load and run scripts, which are groups of SQL Server commands stored as a file. Microsoft has released various versions of its command interface over the years, retaining older versions to provide backward compatibility as new ones are released. The preferred command interface for SQL Server 2005, sqlcmd, is shown in Figure 6-1.

One of the biggest problems with a character-based interface is that it can be confusing and difficult to use. Sqlcmd, for example, supports a large number of startup options to let you configure the command environment. The options let you log in to a specific server, choose a working database, specify a script file to load and run automatically, and set options such as communication time-out values and network packet sizes for network communication. You can also set default options for how data returned by queries is displayed. If set incorrectly, these options can impair performance and even keep you from doing what you need to do.
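The idea of running a stored script as a batch can be sketched in a few lines. This is only an illustration: it assumes Python's standard sqlite3 module and an in-memory SQLite database rather than sqlcmd and SQL Server, and the note table and script text are hypothetical.

```python
import sqlite3

# Hypothetical script text; in practice this could be read from a .sql file.
script = """
CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT);
INSERT INTO note (body) VALUES ('first');
INSERT INTO note (body) VALUES ('second');
"""

conn = sqlite3.connect(":memory:")

# Run every statement in the script as one batch.
conn.executescript(script)

# Run a single statement on its own, as you would at a command prompt.
row_count = conn.execute("SELECT COUNT(*) FROM note").fetchone()[0]
print(row_count)
```

A command-line client does essentially the same thing when it is pointed at a script file, which is what makes scheduling through operating system batch files possible.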

Using a character-based interface also requires either a thorough understanding of SQL language commands or a readily available reference. The interface is unforgiving, and the error messages returned by most DBMSs are usually brief and not always helpful. Editing command strings in the interface can be difficult, and even small typographic errors could lead to major problems.

The biggest advantage of using a character-based interface is that it provides a way of executing SQL language commands through operating system batch files. This can be significant with a DBMS that does not internally support scheduled command execution. In that case, you can still run periodic activities, such as data backups, through command scheduling supported by the operating system.

SQL Server, as well as many other popular DBMS options, also provides a windowed command environment using a graphical user interface (GUI). In the case of SQL Server, it is included as part of SQL Server Management Studio, shown in Figure 6-2. The interface has menu-driven option support, making it relatively easy to manage the command environment. You also have windowed help available with access to database and database object structures, and easy access to documentation. You can check commands for syntax errors before running them and easily edit your commands using a tool called Query Analyzer.

The biggest problem with using Query Analyzer is that it is resource intensive and is not always available. If you want to run it from a client computer, which is what you would normally want to do for security reasons, then you must install the SQL Server client tools on that computer. The client tools include SQL Server Management Studio.

Embedded SQL

Embedded SQL is best suited to activities that must be performed periodically and that you want performed in the same way each time. Embedded SQL is also a critical part of any database application. Embedded SQL uses the same SQL language commands as when you are running interactive SQL statements. The difference is that they are included as part of an executable program.

SQL Server, for example, has two basic programmable objects that provide embedded SQL support. These are stored procedures and user-defined functions. Both are similar in that they are sets of executable statements, they can accept parameters to help control execution, and they can perform actions and return results. The primary difference is how they are used. Stored procedures are most often used to automate periodic procedures or to hide the details of those procedures from the users. User-defined functions are used when you want to return either a scalar value to a user or a result formatted as a table.

For example, you might create a stored procedure to process customer payments. The user would need to provide the stored procedure name, the customer ID, and the payment amount. Statements inside the stored procedure would handle the details of exactly what changes need to be made to the table or tables involved and could detect and respond to errors that might occur during the process. Not only does this ensure that the payments are posted properly, it also helps to ensure data security. Users do not need to know how the tables involved in the process are structured. Because they don't know how the data is structured, it's harder for someone to retrieve information from the database without proper authorization.

Embedded SQL is also used in application programs. The programming environment provides the connectivity tools to communicate with the database server and execute SQL commands.
This can be done by passing literal command strings to the database server for execution or through application programming interfaces (APIs) that provide the necessary functionality. For example, SQL Server 2005 provides a set of .NET Framework management objects that make it easy to build management applications. Also, the data objects provided with ADO.NET support direct manipulation of database tables and other database objects. In fact, ADO.NET is able to mirror the database and table structure, down to column names and table constraints, in memory.

6.1.3 Understanding Command Basics

Whether running DDL or DML commands (as well as DQL commands, if you want to consider them separately), and whether using interactive or embedded SQL, there are a few basic things that you need to understand. The first is that each SQL command has a specific command syntax that describes how you run the command. The command syntax includes the command name and any command keywords and parameters. Command keywords, also known as command clauses, are used to describe specific actions that a command will take. Command parameters are values that you supply so the command can run. Take a look at the following example, a simplified version of the SELECT statement:

    SELECT select_list FROM table [WHERE qualifying_conditions]

Let's break this down. SELECT is the command name. The SELECT command has various uses, but is most commonly used in queries to retrieve values from tables and views. The select_list, when SELECT is used in this manner, is the list of columns that you want the command to return. The select_list is one of the SELECT command parameters. Command syntax descriptions typically use italicized type to identify replaceable parameters such as select_list, table, and qualifying_conditions.

FROM is a command keyword defining the FROM clause. When using SELECT to retrieve data, FROM is a required keyword. It identifies the object or objects (typically tables and views) from which you are retrieving data.

WHERE is an optional keyword and is used to filter the retrieval result. When retrieving data, if you don't include a WHERE clause, all rows are returned. The WHERE clause lets you set qualifying conditions, also called the search argument, that limit the rows returned.
For example, you might want to filter a numeric column by the value it contains, or a date/time column by a range of dates.

Square brackets are typically used in command syntax descriptions to identify optional clauses. In actual command strings, square brackets can be used to enclose object names.

When you run any command, SQL Server parses the command before it does anything else. In this context, parsing is the process of reviewing the command string to verify that you've used the correct command syntax. For example, attempting to run the following would result in a syntax error:

    SELECT name, street, city FORM employees

This would return an error that FORM is not recognized. The exact wording of the error depends on your DBMS. In most cases, you will receive an error number and a textual description of the error. With some DBMSs, you can use the help system to look up the error number and get more information. However, some DBMSs just repeat the textual error statement that you've already seen.

You need to understand that most commands, with the exception of some DDL commands, run in the context of a database. When you connect to a SQL Server database server, for example, through a command-line or graphical interface, your connection is associated with a default database, also known as the working database. When you execute a command that depends on a database, the default database is assumed. For example, when you specify a table by name only, or by schema and name only, SQL Server assumes the table is in the default database. Some commands, mostly maintenance commands, can run against the default database only. Other database products have similar defaults for connections, but the terminology used can vary between different DBMS products.
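The parsing behavior described above can be observed with a short sketch. It assumes Python's standard sqlite3 module and an in-memory SQLite database instead of SQL Server, so the error wording differs from what SQL Server would report; the employees rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical employees table with two made-up rows.
conn.execute("CREATE TABLE employees (name TEXT, street TEXT, city TEXT)")
conn.execute("INSERT INTO employees VALUES ('Lee', '1 Main St', 'Austin')")
conn.execute("INSERT INTO employees VALUES ('Kim', '9 Oak Ave', 'Denver')")

# Correct syntax: a select list, a FROM clause, and an optional WHERE clause.
rows = conn.execute(
    "SELECT name, city FROM employees WHERE city = 'Denver'"
).fetchall()

# Misspelled keyword: the statement is rejected while it is being parsed,
# before any data is read.
try:
    conn.execute("SELECT name, street, city FORM employees")
    parse_error = None
except sqlite3.OperationalError as exc:
    parse_error = str(exc)

print(rows)
print(parse_error)
```

As the text notes, the message is brief. Here it only says where the parser gave up, not which keyword was intended.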

Each connection will also have an associated user, which is the user specified when you connected to the database server. The actions you can take depend on the permissions assigned to that user. In some cases, you can temporarily override this by specifying that a command run using the security context of a different user, but this feature is not supported by all DBMSs.

The result returned by the command after successful completion is somewhat command specific. When you run a SELECT command to retrieve data, it returns the requested columns and qualifying rows, known as the result set. You might also see this referred to as a relational result. It will also return a message stating

6.2 Understanding SELECT Fundamentals


One of the most commonly used SQL commands is the SELECT command. As you have already learned, you use SELECT to retrieve data. You can also use SELECT to perform calculations and to retrieve the results from functions, which are special commands designed to operate on data and return a result.

6.2.1 Working with SELECT

You have already been introduced to the SELECT statement and its basic syntax. We're going to talk a bit more about SELECT because you will likely use it more than any other command. The complete SELECT command syntax is beyond the scope of this chapter, but we'll take a closer look at a few additional syntax options. SELECT statement command examples used in this chapter are all based on Microsoft SQL Server's Transact-SQL variation of the SQL standard language. Command syntax for the SELECT command, as well as for other commands discussed later in the chapter, varies for other DBMS products.

SELECT commands are considered declarative statements rather than being procedural in nature. This means that you specify what data you are looking for rather than provide a logical sequence of steps that guide the system in how to find the data. The relational DBMS analyzes the declarative SQL SELECT statement and creates an access path, a plan for what steps to take to respond to the query. The SELECT command also allows the user, in some circumstances, to exert a certain amount of logical control over the data retrieval process.

Another point is that SQL SELECT commands can be run in either an interactive query mode or an embedded mode. In the query mode, the user types the command at a workstation and presses the Enter key. In the embedded mode, the SELECT command is embedded within the lines of a higher-level language program and functions as an input or read statement for the program. When the program is run and the program logic reaches the SELECT command, the program executes the SELECT.
The SELECT command is sent to the DBMS where, as in the query mode case, the DBMS processes it against the database and returns the results, this time to the program that issued it. The program can then use and further process the returned data. The only tricky part is that some programming environments are designed to retrieve one record at a time. In that case, the program that issued the SQL SELECT command and receives the result set must process the returned rows one at a time. However, many newer languages and programming environments are designed to recognize and process result sets. The Microsoft .NET Framework, for example, includes data objects that let you access rows one at a time, in the order received, or store the entire result set in memory and randomly access its contents.

6.2.2 Using Simple Data Retrieval

Let's look at a couple of examples of simple retrieval statements. We're retrieving data from the SALESPERSON table shown in Figure 6-3. Let's start with a simple request: Find the commission percentage and year of hire of salesperson number 186. The SQL statement to accomplish this would be:

    SELECT COMMPERCT, YEARHIRE FROM SALESPERSON WHERE SPNUM=186;
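As a rough illustration of the embedded mode described above, the same request can be issued from a host program. This sketch assumes Python's standard sqlite3 module and an in-memory SQLite database rather than SQL Server; the column types are assumptions, and the single row loaded uses the values the chapter reports for salesperson 186.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types are assumed; the chapter shows only column names and values.
conn.execute(
    """CREATE TABLE SALESPERSON (
           SPNUM INTEGER PRIMARY KEY, SPNAME TEXT,
           COMMPERCT INTEGER, YEARHIRE INTEGER, OFFNUM INTEGER)"""
)
conn.execute("INSERT INTO SALESPERSON VALUES (186, 'Adams', 15, 2001, 1253)")

# The ? placeholder lets the host program supply the salesperson number.
query = "SELECT COMMPERCT, YEARHIRE FROM SALESPERSON WHERE SPNUM = ?"

# Style 1: process the returned rows one at a time, in the order received.
one_at_a_time = []
for row in conn.execute(query, (186,)):
    one_at_a_time.append(row)

# Style 2: hold the entire result set in memory and index into it freely.
whole_set = conn.execute(query, (186,)).fetchall()

print(one_at_a_time, whole_set)
```

Both styles receive the same rows. The choice mainly affects memory use and whether the program can revisit rows after reading them.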

The desired attributes are listed in the SELECT clause, the required table is listed in the FROM clause, and the restriction or predicate indicating which row or rows are involved is shown in the WHERE clause in the form of an equation. Notice that the SELECT ends with a single semicolon (;) at the very end of the entire statement. This is used to identify the end of the statement. The SQL standard specifies the semicolon as the end-of-statement delimiter, but its use is optional with Microsoft SQL Server. Different vendors vary in how they use end-of-statement delimiters. The result of this statement is:

    COMMPERCT  YEARHIRE
    15         2001

As is evident from this query, a column used in a qualifying condition (or search argument), like SPNUM, which filters the result set by searching for the required rows, does not have to appear in the query result. Its inclusion in the select list is optional as long as its absence does not make the result ambiguous, confusing, or meaningless. To retrieve the entire record for salesperson 186, the statement would change to:

    SELECT * FROM SALESPERSON WHERE SPNUM=186

This gives the result:

    SPNUM  SPNAME  COMMPERCT  YEARHIRE  OFFNUM
    186    Adams   15         2001      1253

The * in the SELECT clause indicates that all columns (fields) are to be retrieved. Now, let's look at an example where you want a specific set of columns, but don't limit the rows, such as: List the salesperson number and salesperson name of all of the salespersons. Here, you would run:

    SELECT SPNUM, SPNAME FROM SALESPERSON

This gives the result:

    SPNUM  SPNAME
    137    Baker
    186    Adams
    204    Dickens
    361    Carlyle

Finally, if you want to return all rows and all columns, that is, all of the data in the SALESPERSON table, you would run:

    SELECT * FROM SALESPERSON

The result would be identical to the table shown in Figure 6-3.

6.2.3 Retrieving Other Values

When using Microsoft SQL Server, you can use SELECT to evaluate expressions. These can be, for example, mathematical expressions or expressions

6.3 Understanding Operators and Functions


A key part of writing and understanding expressions is understanding operators. SQL operators include both arithmetic and logical operators. Logical operators are used in the WHERE clause to build search conditions that depend on more than one qualifying condition. You can also categorize operators as follows:

Unary operator: an operator applied to only one operand at a time, typically in the format operator operand. For example, NOT A.
Binary operator: an operator applied to two operands at a time, typically in the format operand operator operand. For example, A OR B.

Because of variations between different DBMS products, specific examples in this chapter are based primarily on Microsoft SQL Server.

6.3.1 Arithmetic Operators

Arithmetic operators are used for arithmetic computations. The use of the arithmetic operators is very intuitive, and they can be used in virtually every clause of a SQL statement. Because of its effect when used with numeric data, we've also included a concatenation operator in our discussion. The full list of arithmetic operators is given in Table 6-1.

While doing arithmetic in SQL is relatively easy, you must pay attention to the data types used in the operations. For numeric values, this includes the precision and scale of the result; for datetime values, the range of the resulting values; and so on. Most DBMSs can convert similar data types automatically for evaluation through a process known as implicit conversion, but dissimilar types require explicit conversion (manual conversion) to a compatible type.
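The effect of data types on arithmetic can be seen in a small sketch. It assumes Python's standard sqlite3 module and SQLite's rules, which are not the same as SQL Server's: SQLite spells concatenation as || where Transact-SQL uses +, and its conversion behavior is looser than most DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One SELECT evaluating three expressions:
#   7 / 2                  -> both operands are integers
#   CAST(7 AS REAL) / 2    -> explicit conversion before dividing
#   'SQL' || '-' || ...    -> concatenation after converting a number to text
int_div, real_div, concat = conn.execute(
    "SELECT 7 / 2, CAST(7 AS REAL) / 2, 'SQL' || '-' || CAST(99 AS TEXT)"
).fetchone()

print(int_div)   # 3
print(real_div)  # 3.5
print(concat)    # SQL-99
```

The first two results differ only because of the operand types, which is why the precision and scale of each operand deserve a second look before trusting a computed column.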

6.3.3 Standard SQL Functions

SQL functions exist to make your life easier when you need to manipulate data. Think of the SQL functions as tools designed to accomplish a single well-defined task, for example, calculating a square root or converting lowercase letters into uppercase. You invoke a function within a SQL query by name (usually a single keyword). Some functions accept arguments and some do not, but they always return a value.

All functions can be divided into two broad categories: deterministic functions and nondeterministic functions. Deterministic functions always return the same result if you pass in the same arguments; nondeterministic functions might return different results, even if they are called with exactly the same arguments. For example, ABS, which returns the absolute value of a number passed to it as an argument, is a deterministic function. No matter how many times you call it with, say, the argument 5, it will always return 5 as a result. The Microsoft SQL Server function GETDATE() accepts no arguments and returns the current date and time, and so is an example of a nondeterministic function. Each time you call it, a new date and time is returned, even if the difference is less than one second.

One reason this is important is that some DBMSs restrict use of nondeterministic functions in database objects such as indexes or views. For example, SQL Server disallows use of such functions for indexed computed columns and indexed views.

The list of functions changes slightly with each new release of the ANSI SQL standard. However, the functions supported by the various DBMS providers vary widely from the standard function list and from each other. As an example, the functions specified in the SQL-99 standard (as well as earlier standard versions) are described in Table 6-5. SQL-99 refers to the version of the SQL standard released in 1999. New SQL standard versions are released every few years.

Rather than expecting a standard implementation of any of these functions, you should refer to the documentation specific to your DBMS for available functions. Limit use of functions to statements, procedures, and other executables to be used with a specific SQL implementation.
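The distinction between deterministic and nondeterministic functions can be sketched as follows. The example assumes Python's standard sqlite3 module and SQLite, so it uses CURRENT_TIMESTAMP in place of SQL Server's GETDATE(); function names and availability in other DBMSs may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Deterministic: the same argument always produces the same result.
abs_first = conn.execute("SELECT ABS(-5)").fetchone()[0]
abs_second = conn.execute("SELECT ABS(-5)").fetchone()[0]
upper_value = conn.execute("SELECT UPPER('sql')").fetchone()[0]

# Nondeterministic: no arguments, yet the value depends on when it is called.
stamp = conn.execute("SELECT CURRENT_TIMESTAMP").fetchone()[0]

print(abs_first, abs_second, upper_value)
print(stamp)
```

Running the script twice prints the same first line each time, while the second line changes with the clock.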

6.4 Understanding DML Commands


Many DML commands vary in how they are implemented by various DBMSs. Rather than trying to introduce you to all of the statements you might use or focusing on the statement syntax for a particular DBMS, this section's goal is to give you a general introduction to the statements. The section includes a few ANSI standard command syntax examples as representative examples. Statement examples are based on Microsoft SQL Server.

Popular DBMS products have full product documentation readily available. For example, you can install complete product documentation, including command syntax and command use examples, when you install SQL Server. Because of the variations in how statements are implemented, you should refer to your product-specific documentation for information about statement syntax and specific functionality.

The ANSI SQL standard defines three basic DML statements: INSERT, UPDATE, and DELETE. You use the INSERT statement to add rows to a table. UPDATE lets you modify values in specified columns and rows (existing records). Run DELETE to remove rows from tables. Keep in mind with any of these statements that you might be prevented from taking actions due to constraints or other limits placed on a table. For example, you might not be allowed to delete a row that is referenced by another row in a different table through a foreign key constraint.

Our discussion is limited primarily to the ANSI SQL standard syntax of each of these DML statements and their basic use. We use the statement syntax as defined in the ANSI SQL-99 standard, the standard on which most DBMS implementations of the statements are based. Details of how to use advanced functionality, such as using queries to pass retrieved values for use in DML statements, are beyond the scope of this chapter.
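The three DML statements, and the way a foreign key constraint can block a DELETE, can be sketched as follows. This is an illustration under assumptions: it uses Python's standard sqlite3 module and SQLite (where foreign key enforcement must be switched on per connection), and the office and salesperson tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforce foreign keys

# Hypothetical tables: each salesperson row references an office row.
conn.execute("CREATE TABLE office (offnum INTEGER PRIMARY KEY, city TEXT)")
conn.execute(
    """CREATE TABLE salesperson (
           spnum INTEGER PRIMARY KEY, spname TEXT,
           offnum INTEGER REFERENCES office (offnum))"""
)

# INSERT adds rows.
conn.execute("INSERT INTO office (offnum, city) VALUES (1, 'Springfield')")
conn.execute("INSERT INTO office (offnum, city) VALUES (2, 'Shelbyville')")
conn.execute(
    "INSERT INTO salesperson (spnum, spname, offnum) VALUES (10, 'Example', 1)"
)

# UPDATE changes values in the rows selected by the WHERE clause.
conn.execute("UPDATE office SET city = 'Capital City' WHERE offnum = 2")
updated_city = conn.execute(
    "SELECT city FROM office WHERE offnum = 2"
).fetchone()[0]

# DELETE removes rows; office 2 is not referenced, so this succeeds.
conn.execute("DELETE FROM office WHERE offnum = 2")

# Office 1 is referenced by a salesperson row, so the constraint blocks it.
try:
    conn.execute("DELETE FROM office WHERE offnum = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

remaining = conn.execute("SELECT offnum FROM office").fetchall()
print(updated_city, delete_blocked, remaining)
```

Omitting the WHERE clause from UPDATE or DELETE would apply the change to every row in the table, so it is worth checking the condition with a SELECT first.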

6.5 Understanding DDL Commands


As with DML commands, DDL commands vary widely in how they are implemented by various DBMSs. Once again, this section's goal is to give you a general introduction to the statements. The section again includes a few ANSI standard command statements as representative examples.

DDL commands are used to create and manage server and database objects. Server objects, as the name implies, are objects implemented at the server level. In Microsoft SQL Server, for example, logins (used for server access) are server-level objects. Database objects are those objects created as database-specific, such as tables. There are three basic commands that relate to object management. They are as follows:

CREATE: used to create server and database objects, such as CREATE TABLE and CREATE INDEX.
ALTER: used to modify server and database objects, such as ALTER TABLE and ALTER INDEX.
DROP: used to delete server and database objects, such as DROP TABLE and DROP INDEX.
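The three object management commands can be tried in a short sketch. It assumes Python's standard sqlite3 module and SQLite, which supports only part of the DDL described here (there is no ALTER INDEX, for instance); the customer table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def object_names(connection):
    """List the objects recorded in SQLite's catalog table, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# CREATE: a table and an index on one of its columns.
conn.execute("CREATE TABLE customer (custnum INTEGER PRIMARY KEY, custname TEXT)")
conn.execute("CREATE INDEX idx_customer_name ON customer (custname)")
after_create = object_names(conn)

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE customer ADD COLUMN city TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

# DROP: remove the index, then the table.
conn.execute("DROP INDEX idx_customer_name")
conn.execute("DROP TABLE customer")
after_drop = object_names(conn)

print(after_create, columns, after_drop)
```

The catalog query is SQLite-specific. Other DBMSs expose the same information through their own system tables or views.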
