For developers
The Basis, by Hugo Bernachea (australiano@gmail.com, http://sqldata.blogspot.com)
Pre-requisites
1. Knowledge of Transact-SQL
2. Knowledge of SQL Server administration (basic)
3. Knowledge of data integrity concepts
4. Relational database design skills
5. Programming skills
Avoid Select *

SELECT * causes issues with views, even non-schema-bound ones (see sp_refreshview). Indicating only the required fields results in reduced disk I/O and better performance.
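A minimal sketch of the view problem mentioned above (the table, view and column names are hypothetical):

```sql
-- A non-schema-bound view created with SELECT * captures its column list
-- at creation time and does not see columns added later.
CREATE TABLE dbo.Customers (Id INT, Name VARCHAR(50));
GO
CREATE VIEW dbo.vCustomers AS SELECT * FROM dbo.Customers;
GO
ALTER TABLE dbo.Customers ADD Email VARCHAR(100);
GO
-- The view still returns only Id and Name.
SELECT * FROM dbo.vCustomers;
GO
-- Refresh the view's metadata so the new column appears.
EXEC sp_refreshview 'dbo.vCustomers';
GO
SELECT * FROM dbo.vCustomers;  -- now includes Email
```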
SET NOCOUNT ON
Use SET NOCOUNT ON at the beginning of your SQL batches, stored procedures and triggers in production environments, as this suppresses messages like (1 row(s) affected) after executing INSERT, UPDATE, DELETE and SELECT statements. This improves the performance of stored procedures by reducing network traffic.
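A minimal sketch of the recommendation (the table and procedure names are illustrative):

```sql
CREATE PROCEDURE dbo.usp_UpdateCustomerEmail
    @CustomerId INT,
    @Email VARCHAR(100)
AS
SET NOCOUNT ON;  -- suppresses "(1 row(s) affected)" messages, reducing network traffic
UPDATE dbo.Customers
SET Email = @Email
WHERE Id = @CustomerId;
GO
```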
Top vs RowCount

Use TOP instead of SET ROWCOUNT, because SET ROWCOUNT will no longer affect data-modification statements in a future version of SQL Server.
If we are not managing schemas, then in order to avoid any ambiguity the stored procedures must belong to the dbo schema:

CREATE PROCEDURE dbo.SPR_CasListEntity_XML

instead of:

CREATE PROCEDURE SPR_CasListEntity_XML
Use the schema name with the object name: an object name is fully qualified when used with its schema name. The schema name should be used with the stored procedure name and with all objects referenced inside the stored procedure. This helps SQL Server find the compiled plan directly, instead of searching the objects in other possible schemas before finally deciding to use a cached plan, if one is available. This process of searching and deciding on a schema for an object takes a COMPILE lock on the stored procedure and decreases its performance. Therefore, always refer to objects by their qualified names inside the stored procedure.
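A short sketch of fully qualified references (the procedure and table names are hypothetical):

```sql
-- Every referenced object carries its schema, so the cached plan is found
-- directly without a COMPILE lock while schemas are probed.
CREATE PROCEDURE dbo.usp_GetOrders
AS
SET NOCOUNT ON;
SELECT o.OrderId, o.OrderDate, c.Name
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId;
GO

-- Likewise, call the procedure with its schema name:
EXEC dbo.usp_GetOrders;
```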
Defensive programming
Handling special characters in searches

Suppose you want to search for something containing special characters, for example:

SELECT * FROM table1 WHERE name LIKE 'pro_' OR name LIKE '[ab]'

Here '_' and '[ab]' are interpreted as LIKE wildcards rather than literal text, so the patterns must be escaped:

CREATE PROCEDURE dbo.SelectMessagesBySubjectBeginning
    @SubjectBeginning VARCHAR(50)
AS
SET NOCOUNT ON;
DECLARE @ModifiedSubjectBeginning VARCHAR(150);
SET @ModifiedSubjectBeginning =
    REPLACE(REPLACE(@SubjectBeginning, '[', '[[]'), '%', '[%]');
SELECT @SubjectBeginning AS [@SubjectBeginning],
       @ModifiedSubjectBeginning AS [@ModifiedSubjectBeginning];
SELECT Subject, Body
FROM dbo.Messages
WHERE Subject LIKE @ModifiedSubjectBeginning + '%';
GO
With Recompile
If a stored procedure contains queries whose optimal execution plans differ depending on the parameters (e.g. branches using IF or CASE), remember that SQL Server caches only one execution plan per procedure. If you have to write such a procedure, consider using the WITH RECOMPILE option to get optimum results: it is often better to recompile on each call than to reuse a poor plan.
In many cases, we do not need sophisticated error handling. Quite frequently, all we need to do in case of an error is roll back all the changes and throw an exception, so that the client knows there is a problem and will handle it. In such situations, a perfectly reasonable approach is to make use of the XACT_ABORT setting.

By default, this setting is OFF in SQL Server, which means that in some circumstances SQL Server can continue processing when a T-SQL statement causes a run-time error. In other words, for less severe errors, it may be possible to roll back only the statement that caused the error and to continue processing the other statements in the transaction. If XACT_ABORT is turned ON, SQL Server stops processing as soon as a T-SQL run-time error occurs, and the entire transaction is rolled back.

When handling unexpected, unanticipated errors, there is often little choice but to cease execution and roll back to a point where the system is in a "known state." Otherwise, you risk seeing partially completed transactions persisted to your database, compromising data integrity. In dealing with such cases, it makes sense to have XACT_ABORT turned ON.

Data modifications via OLE DB: note that in some cases XACT_ABORT is already set to ON by default; for example, OLE DB will do that for you. However, it is usually preferable to set it explicitly, because we do not know in which context our code will be used later.
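A minimal sketch of the pattern, assuming a hypothetical dbo.Accounts table:

```sql
CREATE PROCEDURE dbo.usp_TransferFunds
    @FromAccount INT,
    @ToAccount   INT,
    @Amount      MONEY
AS
SET NOCOUNT ON;
SET XACT_ABORT ON;  -- any run-time error aborts the batch and rolls back the whole transaction
BEGIN TRANSACTION;
UPDATE dbo.Accounts SET Balance = Balance - @Amount WHERE AccountId = @FromAccount;
UPDATE dbo.Accounts SET Balance = Balance + @Amount WHERE AccountId = @ToAccount;
COMMIT TRANSACTION;
GO
```

With XACT_ABORT ON, if the second UPDATE fails the first one is rolled back as well, so no half-completed transfer is persisted.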
IF EXISTS (SELECT 1)

Use IF EXISTS (SELECT 1) instead of IF EXISTS (SELECT *). To check for the existence of a record in another table, we use the IF EXISTS clause. IF EXISTS returns TRUE if the inner statement returns any value at all, whether a single value 1 or all the columns of a record or a complete recordset; the output of the inner statement is not otherwise used. Hence, to minimize the data processed and transferred over the network, use 1 in the SELECT clause of the inner statement, as shown below:

IF EXISTS (SELECT 1 FROM sysobjects WHERE name = 'MyTable' AND type = 'U')
The aggregate function COUNT must not be used for presence checking. Queries using the following patterns in a condition block (IF, WHILE or WHERE) are prohibited:

(SELECT COUNT(*) FROM [ ]) > 0
(SELECT COUNT(*) FROM [ ]) = 0

EXISTS must be used to do this check:

- using COUNT, the SQL engine iterates over all the matching rows;
- using EXISTS, the SQL engine stops at the first row it finds.
IF (SELECT COUNT(*) FROM PRM_BILL_DETAIL WHERE PRM_BILL_ID = @nPrmBillId) = 0
    DELETE FROM PRM_BILL WHERE PRM_BILL_ID = @nPrmBillId

must be replaced by:

IF NOT EXISTS (SELECT 1 FROM PRM_BILL_DETAIL WHERE PRM_BILL_ID = @nPrmBillId)
    DELETE FROM PRM_BILL WHERE PRM_BILL_ID = @nPrmBillId
CREATE TABLE EMP_MASTER
(
    EMP_NBR  INT NOT NULL PRIMARY KEY,
    EMP_NAME VARCHAR(20),
    MGR_NBR  INT NULL
)
GO
INSERT INTO EMP_MASTER VALUES (1, 'DON', 5);
INSERT INTO EMP_MASTER VALUES (2, 'HARI', 5);
INSERT INTO EMP_MASTER VALUES (3, 'RAMESH', 5);
INSERT INTO EMP_MASTER VALUES (4, 'JOE', 5);
INSERT INTO EMP_MASTER VALUES (5, 'DENNIS', NULL);
INSERT INTO EMP_MASTER VALUES (6, 'NIMISH', 5);
INSERT INTO EMP_MASTER VALUES (7, 'JESSIE', 5);
INSERT INTO EMP_MASTER VALUES (8, 'KEN', 5);
INSERT INTO EMP_MASTER VALUES (9, 'AMBER', 5);
INSERT INTO EMP_MASTER VALUES (10, 'JIM', 5);
GO

NOT EXISTS

SELECT COUNT(*)
FROM emp_master T1
WHERE NOT EXISTS
(
    SELECT 1 FROM emp_master T2
    WHERE T2.mgr_nbr = T1.emp_nbr
);

COUNT(*): 9

Performance implications: with NOT IN, the query performs nested full table scans, whereas with NOT EXISTS the query can use an index within the sub-query.
Optional Method
Another way of doing this is to use an outer join and check for NULL values in the other table:

SELECT COUNT(*)
FROM EMP_MASTER T1
LEFT OUTER JOIN EMP_MASTER T2 ON T1.EMP_NBR = T2.MGR_NBR
WHERE T2.MGR_NBR IS NULL
The UNION operator offers two ways to combine the results of two queries in a single statement.

UNION: the duplicates (i.e. the records present in both results) are removed from the result. Keep in mind that, by default, UNION performs the equivalent of a SELECT DISTINCT on the final result set: it takes the results of two like recordsets, combines them, and then performs a SELECT DISTINCT to eliminate any duplicate rows. This process occurs even if there are no duplicate records in the final recordset. If you know that there are duplicate records and this presents a problem for your application, then by all means use UNION to eliminate the duplicate rows.

UNION ALL: the Transact-SQL engine returns all the rows, including the duplicates.

For performance reasons, if you are sure that there are no duplicates (which is generally the case), the keyword ALL must be used. The result sets combined by using UNION must all have the same structure: they must have the same number of columns, and the corresponding result set columns must be strictly identical (same data type, same length, same precision, same collation).
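A small sketch of the recommendation (the table names are hypothetical; the two sets are assumed disjoint):

```sql
-- UNION ALL skips the implicit DISTINCT step, so no sort or hash is
-- needed to deduplicate the combined result.
SELECT OrderId, Amount FROM dbo.Orders2010
UNION ALL
SELECT OrderId, Amount FROM dbo.Orders2011;
```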
Datetime
In Transact-SQL, datetime constants are written as strings. The ISO formats must be used:

'yyyymmdd'
'yyyymmdd hh:mm:ss'
'yyyy-mm-ddThh:mm:ss.mmm' (ISO 8601)
International data
SQL Server offers three data types for managing Unicode: NCHAR, NVARCHAR and NTEXT. For the development of an international web application, it is essential that these data types be used instead of CHAR, VARCHAR and TEXT.
Varchar or NVarchar
VARCHAR (8-bit characters) vs NVARCHAR (16-bit characters); consider:

- page compression
- internationalization

It is important to note that it is preferable to use the N prefix in front of each string literal when querying or inserting data in SQL tables. Thus, the following scripts should be used as examples:

INSERT INTO APP_USERS VALUES (N'Chetan', N'Chetan')
SELECT * FROM APP_USERS WHERE USERNAME = N'?'
Avoid Blob

Use:

VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX)

instead of:

TEXT, NTEXT and IMAGE

TEXT, NTEXT and IMAGE are deprecated and will be removed in a future version of SQL Server.

http://www.sqlmag.com/article/tsql3/varbinary-max-tames-the-blob
Avoid prefixing your stored procedure names with sp_. The prefix sp_ is reserved for the system stored procedures that ship with SQL Server. Whenever SQL Server encounters a procedure name starting with sp_, it first tries to locate the procedure in the master database, then it looks for any qualifiers (database, owner) provided, then it tries dbo as the owner. So you can really save lookup time by avoiding the sp_ prefix.

http://sqlserverpedia.com/blog/sql-server-bloggers/storedprocedure-performance-using-%E2%80%9Csp_%E2%80%9D-prefix%E2%80%93-myth-or-fact/ (around 15% faster with any prefix other than sp_)
Finding dependencies
http://blog.sqlauthority.com/2010/02/04/sql-server-get-the-list-of-object-dependencies-sp_dependsand-information_schema-routines-and-sys-dm_sql_referencing_entities/

SELECT referencing_schema_name, referencing_entity_name, referencing_id,
       referencing_class_desc, is_caller_dependent
FROM sys.dm_sql_referencing_entities ('YourObject', 'OBJECT');
GO
Custom Query

SELECT routine_name, routine_definition
FROM information_schema.routines
WHERE routine_definition LIKE '%field01%'

Another method:

SELECT routine_name, routine_definition
FROM information_schema.routines
WHERE CHARINDEX('field01', routine_definition) > 0
In dynamic code, use the sp_executesql stored procedure instead of the EXECUTE statement. sp_executesql supports parameters, so using it instead of EXECUTE improves the reusability of your code. The execution plan of a dynamic statement can be reused only if each and every character, including case, spaces, comments and parameters, is the same between two statements.

If we again execute the above batch using a different @Age value, the execution plan for the SELECT statement created for @Age = 25 would not be reused.

DECLARE @Query NVARCHAR(100)
SET @Query = N'SELECT * FROM dbo.tblPerson WHERE Age = @Age'
EXECUTE sp_executesql @Query, N'@Age int', @Age = 25
In this case the compiled plan of this SELECT statement will be reused for different values of the @Age parameter. The reuse of the existing compiled plan results in improved performance.
SQL Injection
y
Avoid writing code that is susceptible to SQL injection. Concatenated SQL statements transfer the SQL injection risk from the inline code directly to the stored procedure. This commonly occurs in reports or in search queries, where the assumption is that a SQL statement in a stored procedure needs to be generated dynamically. Frequently a WHERE clause is generated by the client web application and then passed to the stored procedure. Here is an example:

CREATE PROCEDURE pMyQuery
(
    @Where varchar(8000)
)
AS
DECLARE @sql varchar(8000)
SET @sql = 'SELECT * FROM MyTable WHERE ' + @Where + ' AND MyField = 4 ORDER BY This, That'
EXEC(@sql)

Pass this as the WHERE clause and a table could get dropped:

1 = 1; DROP TABLE MyTable --

Note that the comment dashes at the end of the line prevent any subsequent code in the concatenated SQL statement from being executed. In the above case the additional filter and the ORDER BY clause would not be executed.

http://palpapers.plynt.com/issues/2006Jun/injection-stored-procedures/
http://blogs.msdn.com/b/raulga/archive/2007/01/04/dynamic-sql-sql-injection.aspx

Use sp_executesql instead of EXECUTE in dynamic queries.
Identity
Use SCOPE_IDENTITY() instead of @@IDENTITY. @@IDENTITY can give the wrong value if a trigger inserts data into another table: it then returns the identity generated in the table inserted by the trigger, instead of the identity generated in the table used by the procedure.
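A short sketch of the difference, assuming a hypothetical dbo.Orders table with an IDENTITY column and an AFTER INSERT trigger that writes to an audit table that also has an IDENTITY column:

```sql
INSERT INTO dbo.Orders (CustomerId) VALUES (42);

SELECT SCOPE_IDENTITY();  -- identity generated in the current scope: dbo.Orders
SELECT @@IDENTITY;        -- last identity in the session: may come from the trigger's audit table
```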
NewSequentialID

NEWSEQUENTIALID() creates a GUID that is greater than any GUID previously generated by this function on a specified computer since Windows was started. After restarting Windows, the GUID can start again from a lower range, but it is still globally unique. This avoids the performance issues of random GUIDs used as clustered primary keys (note that primary keys can also be non-clustered).

http://msdn.microsoft.com/en-us/library/ms189786.aspx

Remember that non-clustered indexes use and store the key of the clustered index to identify a row (for this reason, deleting and recreating the clustered index recreates all the non-clustered indexes of the same table).
For low amounts of data, use table variables: they are mainly memory resident (although with high data volumes they can spill to tempdb).
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2008/03/30/sql-server-table-variable-vs-localtemporary-table.aspx
For large amounts of data, use temporary tables (optionally with indexes): they live in tempdb.
http://decipherinfosys.wordpress.com/2007/05/04/temporary-tables-ms-sql-server/
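A minimal sketch of the two options (the names are illustrative):

```sql
-- Table variable: suited to small row counts; no statistics are kept on it.
DECLARE @Small TABLE (Id INT PRIMARY KEY, Val VARCHAR(50));

-- Temporary table: suited to large row counts; supports indexes and statistics.
CREATE TABLE #Large (Id INT, Val VARCHAR(50));
CREATE INDEX IX_Large_Id ON #Large (Id);
DROP TABLE #Large;
```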
Output Clause
USE AdventureWorks;
GO
CREATE TABLE TestTable (ID INT, TEXTVal VARCHAR(100))
DECLARE @TmpTable TABLE (ID INT, TEXTVal VARCHAR(100))
INSERT TestTable (ID, TEXTVal)
OUTPUT Inserted.ID, Inserted.TEXTVal INTO @TmpTable
VALUES (1, 'FirstVal')
INSERT TestTable (ID, TEXTVal)
OUTPUT Inserted.ID, Inserted.TEXTVal INTO @TmpTable
VALUES (2, 'SecondVal')
Imagine you want to find all the NULL values in a column of a database table (SQL Server):

x
----
1
2
NULL
4
5

Here is the SQL that, at first sight, performs the task as required:

SELECT x,
       CASE x WHEN NULL THEN 'yes' ELSE 'no' END AS result
FROM someTable

The expected result:

x      result
-----  ------
1      no
2      no
NULL   yes
4      no
5      no

But that isn't what it returns. The actual result is:

x      result
-----  ------
1      no
2      no
NULL   no
4      no
5      no

The simple CASE form compares with the equality operator, and NULL = NULL does not evaluate to true. Use IS NULL instead:

SELECT x,
       CASE WHEN x IS NULL THEN 'yes' ELSE 'no' END AS result
FROM someTable
A primary key serves several purposes. It enforces entity integrity of the table. Next, a primary key can be used along with a foreign key to ensure that referential integrity is maintained between tables.

By itself, a primary key does not have a direct effect on performance. But indirectly, it does, because when you add a primary key to a table, SQL Server creates a unique index (clustered by default) that is used to enforce entity integrity. As you have already discovered, you can create your own unique indexes on a table. So, strictly speaking, the primary key does not affect performance, but the index behind the primary key does.

You don't have to have primary keys on your tables if you don't care about the benefits that arise from using them. If you like, you can keep your current indexes and, assuming they are good choices for the types of queries you run against the table, performance will be enhanced by having them; replacing those indexes with primary keys will not in itself help performance.

Assign primary keys to every table, because this is a best database-design practice. Evaluate whether the primary key's index should be clustered or non-clustered, and choose accordingly. Since you can have only one clustered index, it should be chosen well: it is not a good idea, from a performance perspective, to accept the default of a clustered index on the primary key, as it may not be the best use of the clustered index. In addition, it is not a good idea to double up on indexes: don't put a non-clustered primary-key index on a column and a clustered index on the same column (this is possible, although never a good idea).

A primary key is also needed for replication!
Complex logic can be implemented in a more flexible language like C# and delivered via CLR functions and procedures. T-SQL will always perform better for standard CRUD (Create, Read, Update, Delete) operations, whereas CLR code will perform better for complex math, string manipulation and other tasks that go beyond data access.

The CLR complements Transact-SQL code; it does not replace it. Standard data access, such as SELECTs, INSERTs, UPDATEs and DELETEs, is best done via Transact-SQL code, not the CLR. Computationally or procedurally intensive business logic can often be encapsulated as functions running in the CLR. Use the CLR for error handling, as it is more robust than what Transact-SQL offers. Use the CLR for string manipulation, as it is generally faster than Transact-SQL. Use the CLR when you need to take advantage of the large base class library. Use the CLR when you want to access external resources, such as the file system, the Event Log, a web service, or the registry. Set CLR code access security to the most restrictive permissions possible.
Parameter Sniffing
The first execution of a stored procedure "sniffs" the parameters and builds an execution plan optimized for that set of parameters. This can lead to poor execution plans for any later call that uses a different set of parameters for which a different plan would be better.

Suppose we are looking for the surname Perez. SQL Server may choose a table scan, because Perez is a very common surname in Argentina and has a high density in the statistics: using an index and then fetching the data for every matching row to recover all the requested columns could be more expensive than a table scan. But when looking for the surname Schwarzenegger, the engine may decide to use the index, because of the low statistical density of this surname: it is cheap to look it up in the index and then fetch the remaining data. The problem arises when parameter sniffing makes SQL Server use the same execution plan for both searches.
Parameter Sniffing
CREATE PROC GetCustOrders
(
    @FirstCust int,
    @LastCust int
)
AS
DECLARE @FC int
DECLARE @LC int
SET @FC = @FirstCust
SET @LC = @LastCust
SELECT *
FROM Sales.SalesOrderHeader
WHERE CustomerID BETWEEN @FC AND @LC

Here you can see that two local variables (@FC and @LC) are declared and then populated with the values of the parameters passed to the SP. Because of this small change, the actual parameter values are no longer visible in the BETWEEN clause of the SELECT statement at compile time; only the local variables are. The query optimizer therefore looks at the statistics of the objects in the query and determines, on average, what might be the best execution plan, rather than one tailored to the sniffed values.
SET vs SELECT
We cannot assume that SET and SELECT always change the values of variables. If we rely on that incorrect assumption, our code may not work as expected, so we need to eliminate it. The following slides demonstrate a case where SELECT leaves the value of a variable unchanged when the result set is empty.
SET vs SELECT
This demonstrates a case where SELECT leaves the value of a variable unchanged if the result set is empty:

SET NOCOUNT ON;
DECLARE @i INT;
SELECT @i = -1;
SELECT @i AS [@i before the assignment];
SELECT @i = 1 WHERE 1 = 2;
SELECT @i AS [@i after the assignment];

@i before the assignment
------------------------
-1

@i after the assignment
-----------------------
-1
SET vs SELECT
But, in the same case, SET changes the value of the variable even though the result set is empty:

SET NOCOUNT ON;
DECLARE @i INT;
SELECT @i = -1;
SELECT @i AS [@i before the assignment];
SET @i = ( SELECT 1 WHERE 1 = 2 );
SELECT @i AS [@i after the assignment];

@i before the assignment
------------------------
-1

@i after the assignment
-----------------------
NULL
XML is best used for modeling data with a flexible or rich structure. Don't use XML for conventional, structured data. Store object properties as XML data in XML columns. When possible, use typed XML over untyped XML to boost performance. Add primary and secondary indexes to XML columns to speed retrieval.
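A minimal sketch of the indexing advice (the table and index names are hypothetical; the column is shown untyped, but would be typed against a schema collection where one is available):

```sql
CREATE TABLE dbo.Products
(
    ProductId  INT PRIMARY KEY,   -- clustered key, required for XML indexes
    Properties XML
);

-- Primary XML index: shreds the XML into an internal structure.
CREATE PRIMARY XML INDEX PXML_Products_Properties
    ON dbo.Products (Properties);

-- Secondary XML index (FOR PATH) to speed path-based queries.
CREATE XML INDEX IXML_Products_Properties_Path
    ON dbo.Products (Properties)
    USING XML INDEX PXML_Products_Properties FOR PATH;
```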
Upcoming

Next document: SQL Server Best Practices, Advanced Issues