The Power of Indexes
As I mentioned in a previous tip, Giving Your Database Updates Enough Time, I had a client contact me with a problem: The updates for an unusually large batch of data in their online application was taking so long that the updates were timing out. As a short-term fix, we increased the update time to just over two minutes but we all recognized the right, long-term solution was to reduce the time the updates were taking.
I am a developer so we discussed some code-based solution but, before I touched the keyboard, I looked at the database. I wanted to see if I could apply some indexes to speed up processing. I was somewhat surprised to discover that none of the tables had any indexes or primary keys on them (though the tables did ... usually ... have columns that would uniquely identify each row in a table). This wasn't a huge system but some tables had as many as half-a-million rows in them.
Without indexes, every Join and every Where clause had to scan the whole table to find the rows it needed. It's a testament to SQL Server that the application ran as fast as it did (and it was certainly "fast enough" -- except for this problem update).
Rather than do any testing or analysis, I just went through the stored procedures involved in the update and added a primary key or an index to each table that was involved in a Join or a Where clause. If the Join or Where clause used two columns from the same table, I created a primary key or index that included both columns.
The results, as is usual with indexes, were miraculous. Well, it certainly looked like a miracle to my client: With no code changes, that update that was taking over two minutes now took less than fifteen seconds -- almost an order of magnitude speed improvement (900 percent, to be exact). In addition, every other transaction that used those tables now executed faster.
Because the overall load on the database dropped, even transactions that didn't use those indexes were completing marginally faster.
Of course, adding indexes isn't free: Each index is, effectively, a table that needs to be updated. The index must always be updated when a row is inserted or deleted from the parent table, but updates only affect the index if one of the indexed columns is changed.
Even if you're updating an indexed column, though, the net result for updates is usually positive because the update statement probably includes a Where clause that will run faster when there's an index in place (and, in my experience, the indexed columns are often the columns that are least likely to be updated). I find that I can add up to a half dozen indexes to a table before update performance starts to degrade. Certainly, my client didn't see any impact.
If you're not paying attention to the indexes on your tables, you're missing an opportunity.
Posted by Peter Vogel on 01/08/2015