Wednesday, March 28, 2012

Poor performance processing data

I have a problem with performance during cube processing (using ProcessData).

When I start processing a number of partitions (in parallel), the network utilisation is initially 100%. Over the next 30 minutes the network utilisation drops to 10 % on the AS box. I have checked the database server and it is a similar story there. The processing eventually completes after a number of hours. Way over what our existing AS 2000 setup would take.

In order to checkout my setup I wrote a .Net 1.1 utility which runs the partition queries in parallel and writes the data to file. It is able to max out the Network utilisation for the duration of the queries on both the AS and SQL boxes.(5 in parallel, totalling 250 million rows came accross in 30 mins). This would seem to prove that the network and SAN are not the issue.

Does anyone any ideas why AS is slowing down as the partition processing progresses.

My setup is as follows:

The cube is on a 4 way 64-bit AS2005 Enterprise SP1 (v 9.0.2047.0) server reading data from a database on 4 way 32-bit SQL 2000 (v8.00.818) SP3 Enterprise

The database contains 27 tables containing fact data. Approx 50 million rows in each. Total 1.25 billion rows. 600GB)

Both machines are SAN attached and the SAN has been check by the supplier and exceeds the service level.

Any help would be appreciated.

Are you using incremental update or full processing of the cube? Incremental update can result in higher utliztion of disks.

Regards

/Thomas

|||

Hi Rob,

You might get some mileage out of the following paper on performance, particularly with regard to Max threads (esp. processing). Also you might want to review your Attribute Relationship and swap to Rigid if you can from the default Flexible. I gained considerable processing perfromance using the Max Threads and Rigid Attribute Relationships.

Paper is at : http://download.microsoft.com/download/4/f/8/4f8f2dc9-a9a7-4b68-98cb-163482c95e0b/SSASProperties.doc#_Toc136412586

Good Luck, Dave.

|||

Thanks for you reply Thomas. No it was full processing using ProcessData.

Besides which, I have now resolved this problem. It was caused by some named calculations that were defined in one of the dimension queries in the DSV. I had defined a few simple concatenations of Product ID and Product Description to use as the member name at each level of the hierarchy. Although I wasn't able to check the SQL I suspect it was doing some kind of Cartesian operation between the view that the dimension was based on and the same view that the named calc's were based on. When I moved the named calculations back to the DB view the problem was completely resolved. Processing now takes 7.5 hours as opposed to 40. An acquaintance has written about a similar issue in his blog: http://blogs.conchango.com/christianwade/archive/2006/07/12/4219.aspx

My advice is be careful with named Calculations in the DSV.

|||

Thanks Dave, I found that paper the other day. I posted the solution just as you reply came up, but I will check out the attribute relationships as you say. Also, I'm not convinced that I understand the cardinality option properly.

With regard to the Max Threads what have you set yours to and what kit have you got?

|||Retracting my comments earlier about the problem in the DSV. I had confused the issue by unbinding a dimension from the cube and the complexity had reduced and therefore speed increased. I now believe my problem to be related to write speed on SAN and not to do with the DSV.

No comments:

Post a Comment