Friday, March 30, 2012

Poor Performing Query

Here is the situation: I have a process that auto-generates reports and e-mails them to end users (no, we are not allowed to use SQL Mail due to security issues and the fact that we are a Lotus shop). The process runs the query, saves the data to a text file and then mails the file as an attachment.

If the particular report to be run has multiple recipients, we make multiple passes (ie, we run the same query multiple times and send each recipient a "personalized" version of the report).

The issue is that one of my auto reports fails for just the first recipient. Subsequent recipients all receive the report normally, but the first recipient consistently fails to receive the data.

When I run the T-SQL multiple times in Query analyzer, I get the following results:

Pass 1: 87 seconds
Pass 2: 2 seconds
Pass 3: 3 seconds

I know that the process that we wrote (which is wrapped in a DLL) uses an ADO connection/command timeout setting of 60 seconds (both properties are set to the same value).

The SQL is not a stored proc. When I run the Index tuning wizard, no additional indexes are recommended.

I'm struggling to determine the next step.

1. I am loathe to increase the command timeout setting, since I have already done that once (from 30 seconds to one minute). I'm pretty sure this will only defer final resolution.

2. I suppose I could create a stored proc (so that it doesn't have to recompile the execution plan each time), but that would make this one report different from the hundreds of other reports that my super users have created; it would also mean that my super users would not be able to control the layout of the report without coming to me.

3. I have a feeling I should be using query hints or some such, but I am completely unfamiliar with these optimization tools.

Is there another approach that I am overlooking?

Regards,

Hugh ScottIf these are similar then I would guess that for the first call it has to read the data from disk. For subsequent calls the data is in memory so is much quicker.

Maybe put in a dummy call first?|||Yes, the calls are identical. I agree with you that on subsequent passes, the data is being read from memory. What has me puzzled is what I can do about it. I have considered putting in a dummy call first (as you suggested), but I wanted to try something a little more orthodox first.

Do you think that adding memory to the server would help? The server currently has 2.6 GB of memory. The database itself is a little over 30 GB now. I have run performance monitor on the server. While pages/sec does spike during this operation, it usually averages around 4.5. It does not seem sufficiently spiky to warrant $4,000 for additional memory.

Some more information:

There are three tables involved:

table a left join view b left join table c

Table a: 5.4 million rows (~ 4 GB)
View b: 1.3 million rows (~ 2.3 GB)
View c: 15 rows

View b uses the UNION operator and thus cannot be indexed.
Table a is indexed six ways from Sunday.

Thanks again,

Hugh Scott

Originally posted by nigelrivett
If these are similar then I would guess that for the first call it has to read the data from disk. For subsequent calls the data is in memory so is much quicker.

Maybe put in a dummy call first?|||What's your UNION view code? UNION queries are very handy, and I've used them many times myself, but just as often I've seen them used unnecessarily. Perhaps there is a more efficient method of combining your view code directly in your procedure code.

blindman|||UNION views can be indexed. Also, have you tried to see what you get on IO stats ON when doing a SELECT from just that view using WHERE that would match your JOIN?|||Views can be indexed, but unless you use a clustered index then the index must be recreated each time the view is called, and you lose the benefit of indexes on the underlying tables.

Clustered indexes on views can greatly boost performance, but they cannot be used on UNION queries. From books-online:

A view must meet these requirements before you can create a clustered index on it:
.
.
.
The SELECT statement in the view cannot contain these Transact-SQL syntax elements:
.
.
.UNION operator.

blindman|||blindman: thanks for the reminder. UNION is not allowed in indexed views. But you're also referring to non-clustered indexes. I haven't found any info that would suggest that a non-clustered index on a view needs to be rebuilt each time the view is called. Can you enlighten me here?

hmscott: adding memory will not resolve 87-second processing time on the first call. Adding indexes to base tables may.|||Normally the results of a view are not stored in the database, but if a clustered index is created on the view then the results are stored just like a table, and the values in the view are updated whenever the values in the underlying table are updated. (This of course adds overhead to processing changes on the underlying tables.) The clustered index is necessary in order to update the values on the view.

Long story short, if you don't use a clustered index then the results are not stored and thus any indexing must be recreated each time.

Honestly, I don't know whether non-clustered indexes are maintained on views that also have clustered indexes. I think I also read somewhere that you can't create an index on a view unless it also has a clustered index, so maybe UNION queries can't have indexes at all.

It's also possible that the optimizer might be smart enough to apply filters from the procedure to the underlying tables prior to creating the UNION view when it is called, but this probably depends on a lot of factors.

I've just seen too many instances when a UNION view was used instead of a more appropriate WHERE clause criteria.

blindmansql

No comments:

Post a Comment