If you’re a DBA or a developer chances are you’ve looked at a query plan or two. While looking at your plan you may have noticed that each operator has a cost. Did you know that the cost is measurable?
When you’re looking at your plan where do you start? Do you look at the highest cost operator? Do you look for key or RID lookups? Do you look for scans? There’s a lot of places to start and each of these are valid. I generally look for Spools, Filters, Probe Residuals, Warnings, and more. Look for posts on each of these in the future. You can also read my other posts on Query tuning here: SQL Server Central and SQL Tech Blog.com.
When looking at an operator such as an index seek or index scan you’ll notice that there are a few metrics for cost. One of those is the cost for IO. What would you say if I told you I could tell you what the cost of a scan was before you ever ran the query? This is another little gem that I can credit to my friend Joe Sack.
Let’s take a look at the following AdventureWorks2012 query:
SELECT ModifiedDate, rowguid
When we analyze the Clustered Index Scan we find that the Estimated I/O Cost is 0.257199.
How much does an I/O cost?
I/O cost can be measured with two different measurements (Random and Sequential). A single random I/O costs 0.003125, while a single sequential I/O costs 0.000740741.
Keep in mind the plan does not know if the page(s) are in cache or somewhere out on storage. These numbers are what the plan thinks it will need. Also note that the I/O costs do not take your storage hardware into account. It assumes one for all so SSD and Flash get the same score as spinning disks.
Let’s put this to a test with the following query:
WHERE AddressID = 207
Below you’ll see that the cost is 0.003125 for reading the single data page.
Now let’s examine the math for a scan. You should have a single random I/O at 0.003125 followed by sequential I/O’s at 0.000740741 for the rest of the index. Since we know the cost of each I/O type, we can determine the cost of the scan ahead of time using the following query:
.003125 + (.000740741 * (in_row_data_page_count-1)) as [Cost for Scan]
WHERE object_id = OBJECT_ID(‘Person.Address’)
We can see that the cost of the Clustered Index Scan we did earlier would be 0.257199.
That’s great but now what?
Don’t get me wrong, this is not the first place I’d start with tuning; however, this knowledge can be helpful for a whole host of things. Knowing this, I’d ask “What if the cost of the scan was 5.0 or greater? What affect may that have on the optimization process and parallelism?”
I’d like to hear more from you. Please feel free to comment and let me know your thoughts.