Performance
David Peter Hansen, Microsoft Certified Master
david@davidpeterhansen.com | @dphansen

1) Parallelise with a queue

Scenario: lots and lots of child packages, with no dependencies between them; some take long, some take short.

Create a queue:
- Add the path of every child package to the queue.

Pop from the queue:
- Pop an item (a child package path) from the queue.
- Execute that child package.
- Keep doing this until the queue is empty.

Parallelise:
- Run the poppers in parallel, e.g. as many as the number of logical cores.
- Measure: because some packages take long and some short, the queue keeps the load balanced across workers.

In short: create a queue, pop from the queue, continue until done, and parallelise.

2) More rows per buffer

A buffer is a group of data: memory allocated for rows and columns, dynamically sized. The data flow is not a waterfall — data does not move from transformation to transformation. Instead, a group of transformations passes over the buffers, making in-place changes to the data.

Less is more:
- Fewer columns = more rows per buffer.
- Narrower data types = more rows per buffer.

Unused columns:
- Remove them; don't do SELECT *.
- With RunInOptimizedMode, the engine will not allocate memory for unused columns.

To get more rows per buffer: remove unused columns, narrow your data types, and use RunInOptimizedMode.

3) Buffers, buffers, size your buffers

The size of the buffers is calculated by the data flow engine:
- sizerow = the estimated size of a single row of data.
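As a minimal sketch of the per-row estimate, you can sum the byte widths of the columns in the pipeline. The column names and widths below are assumptions chosen to add up to the 60-byte example row used in this deck — this is an illustration of the arithmetic, not how the engine itself computes the estimate.

```python
# Hypothetical column layout; widths are assumptions for illustration.
column_bytes = {
    "OrderID": 4,     # DT_I4: 4 bytes
    "CustomerID": 4,  # DT_I4: 4 bytes
    "Quantity": 8,    # DT_I8: 8 bytes
    "Amount": 8,      # DT_R8: 8 bytes
    "Status": 36,     # DT_WSTR(18): 18 chars * 2 bytes
}

sizerow = sum(column_bytes.values())    # estimated bytes per row
default_buffer_size = 10 * 1024 * 1024  # DefaultBufferSize default: 10 MB
rows_per_buffer = default_buffer_size // sizerow

print(sizerow)          # 60
print(rows_per_buffer)  # 174762
```

With a 60-byte row, a default 10 MB buffer holds roughly 174,762 rows — the same figure the worked example below arrives at.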
- The engine sizes the buffer as close as possible to the target buffer size (sizebuffer), decreasing or increasing the number of rows per buffer to fit.
- Change the values via DefaultBufferSize and DefaultBufferMaxRows.

Goal (if you have enough memory): a small number of large buffers — fit as many rows into a buffer as possible. Watch out for paging to disk (use perfmon).

DefaultBufferSize: default is 10 MB, max is 100 MB.

DefaultBufferMaxRows — worked example:
- databytes = 45,552 KB * 1,024 = 46,645,248 bytes
- sizerow = 46,645,248 / 776,286 rows = 60 bytes
- DefaultBufferMaxRows = DefaultBufferSize / sizerow = 10 MB * 1,024 * 1,024 / 60 = 174,762

Buffer sizing approach:
- Begin with the default values.
- Make sure you have enough memory.
- Turn on logging: BufferSizeTuning (shows how many rows end up in each buffer).
- Change DefaultBufferMaxRows / DefaultBufferSize accordingly.

4) Do not block the road

Every transformation has a blocking nature: non-blocking, semi-blocking, or blocking.

Non-blocking:
- Streaming, row-based.

Semi-blocking:
- Hold up rows for a period of time.
- What if one input is slower than the other? You can (potentially) exceed the buffer memory while waiting.
- Throttling the source helps (new feature in 2012).

Blocking:
- All rows must be read into memory before any output is produced.
- Try to avoid these.

Real world: ~30 sources with millions of rows hit a low-memory condition; SSIS was notified and spilled to disk. This was in production!
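The difference between streaming and blocking behaviour can be illustrated outside SSIS with a small Python sketch — an analogy, not SSIS itself. A streaming transformation emits each row as soon as it is processed; a blocking one, like a sort or aggregate, must consume its entire input before producing its first output row.

```python
def non_blocking(rows):
    """Streaming, row-based: each row is emitted as soon as it is
    processed, so memory use stays constant regardless of input size."""
    for row in rows:
        yield row * 10

def blocking(rows):
    """A sort must see every input row before it can emit the first
    output row -- the whole input is held in memory."""
    return sorted(row * 10 for row in rows)

# The streaming version yields its first row immediately,
# without buffering the (huge) input:
first = next(non_blocking(iter(range(10**9))))
print(first)  # 0

# The blocking version must materialise all rows before returning:
print(blocking([3, 1, 2]))  # [10, 20, 30]
```

This is why a Sort or Aggregate in the data flow can exhaust buffer memory on large inputs, while row-based transformations do not.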
Push down:
- Try to push work down to the source, e.g. GROUP BY and ORDER BY in the source query, instead of Aggregate and Sort in the data flow.

In short — don't block:
- Favour row-based, non-blocking transformations.
- Be careful with semi-blocking transformations.
- Avoid blocking transformations.

5) Bulk insert your data

Data destinations:
- SQL Server Destination: shared memory only.
- OLE DB Destination: TCP/IP or named pipes; use fast load.

Batch size — Maximum Insert Commit Size (MICS):
- MICS > buffer size: one commit for every buffer.
- MICS = 0: the entire batch is committed in one big batch.
- MICS < buffer size: commit after MICS rows and after every buffer.

Smaller commit size:
- When inserting into a table with indexes, a sort must happen for every index.
- A smaller commit size makes that sort fit in memory.
- Beware of fragmentation.

Indexes:
- Disable/drop indexes before the bulk load; rebuild/create them after.
- Clustered index: if the source data is ordered by the cluster key, specify the ORDER hint in FastLoadOptions.
- The load is minimally logged if the table is empty or trace flag 610 is on.

In short: choose the right destination (SQL Server Destination vs. OLE DB Destination), tune Maximum Insert Commit Size, and take care with tables with indexes and inserts into a clustered index.

6) Don't spool your blob data

Blob data: xml, varchar(max) / nvarchar(max), varbinary(max) — DT_TEXT / DT_NTEXT, DT_IMAGE.

Blobs in the pipeline:
- Half of a buffer is reserved for in-row data, half for blob data.
- Blobs that don't fit in memory are spooled to disk.

If you really, really have to spool to disk:
- Use SSDs.
- Set BufferTempStoragePath and BLOBTempStoragePath — the default is TMP/TEMP (C:\...).

Minimize the spool size via DefaultBufferSize and DefaultBufferMaxRows:
1) Find the max blob buffer: with DefaultBufferSize = e.g. 100 MB, MaxBufferSizeblob = 100 MB / 2 = 50 MB.
2) Estimate the size of the blob data per row.
3) Set DefaultBufferMaxRows < MaxBufferSizeblob / estimated size of the blob data.

7) Divide and conquer

Real world: very complicated business logic — almost 100 transformations in one data flow. In production!
Very hard to debug or tune.

Divide — split up your package:
- Extract/stage, dimensions, facts.
- Very large data flows? Stage your data in SQL Server or RAW files.

Conquer:
- Use a master package.
- Parallelise if possible.

8) Optimize your query

T-SQL queries live in source components and lookup transformations.

Query plan:
- Tune your query; use SQL Sentry Plan Explorer (free).
- Look at indexing and cardinality estimation.

Problems with WITH (NOLOCK) — use with care:
- Dirty reads / inconsistent data.
- Make sure nobody is writing to the table.

In short: tune your query, and treat WITH (NOLOCK) with care.

9) Get the data, already

Sometimes you don't want to wait for the query to return all the rows — blocking iterators in the query plan delay the first row.

Running time without the hint: CPU time = 498 ms, elapsed time = 2043 ms.
With the FAST hint: CPU time = 1186 ms, elapsed time = 2231 ms.

OPTION (FAST n):
- Optimizes the plan for the first n rows.
- Can help remove blocking iterators, getting data into SSIS faster — at a cost to overall query performance.
- Try n = DefaultBufferMaxRows.

10) Don't guess, measure it

Approach:
- Measure performance.
- Come up with a hypothesis.
- Tune your package.
- Measure performance again.

10 tips and tricks
1) Parallelise with a queue
2) More rows per buffer
3) Buffers, buffers, size your buffers
4) Do not block the road
5) Bulk insert your data
6) Don't spool your blob data
7) Divide and conquer
8) Optimize your query
9) Get the data, already
10) Don't guess, measure it

References
- Troubleshooting SSIS Package Performance Issues — blogs.msdn.com/b/mattm/archive/2011/08/07/troubleshooting-ssis-package-performance-issues.aspx
- Top 10 SQL Server Integration Services Best Practices