Anyone who has done development in SharePoint has likely had to deal with SPList, SPListItem and SPListItemCollection. Recently I have been working on some projects that were experiencing performance issues. Looking at the code, I realized that while I had heard a lot from Microsoft (specifically the SharePoint product team) about how much SharePoint list performance had improved, I had not actually seen much published about where those improvements are, how they work, and good ways to take advantage of them. Certainly there are articles about indexed columns, and many about keeping views under 1,000-2,000 items. I decided to dig a little deeper, and I think you might be as surprised as I was by what I found.
Test Data / New Item Performance
For part of the test, I decided I should combine the testing with creating the test data. One of the performance items I wanted to test was how SharePoint handled new items. Specifically, I wanted to know how quickly new items could be added, and I wanted a list in the end with 100,000 list items.
What I found was that SharePoint could really tear in when it came to creating new items. My test list was very simple; it consisted of:
- SingleText - A single text column that I used to put GUIDs into
- Number - A number column I used to put a counter into
- Choice - A choice column I used to put a text choice into
- Bool - A boolean column I put either true or false into (what choice did I have, right?)
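If you prefer to build that schema in code rather than through the browser, a minimal sketch might look like this (the site URL and the choice values here are placeholders I made up for the example):

using (SPSite site = new SPSite("http://localhost"))  // placeholder URL
using (SPWeb web = site.OpenWeb())
{
    Guid listId = web.Lists.Add("TestDB", "Performance test list",
        SPListTemplateType.GenericList);
    SPList list = web.Lists[listId];

    list.Fields.Add("SingleText", SPFieldType.Text, false);
    list.Fields.Add("Number", SPFieldType.Number, false);
    list.Fields.Add("Choice", SPFieldType.Choice, false);
    list.Fields.Add("Bool", SPFieldType.Boolean, false);

    // Fill in the choice column's options (placeholder values).
    SPFieldChoice choiceField = (SPFieldChoice)list.Fields["Choice"];
    choiceField.Choices.Add("Alpha");
    choiceField.Choices.Add("Beta");
    choiceField.Choices.Add("Gamma");
    choiceField.Update();

    list.Update();
}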
On my test hardware I could push about 50 new items per second. I was actually pretty happy with that performance; it means I could populate 10,000 items in a little over three minutes. So how did I come by these numbers? I set up a list as above using the Custom List template, and I used the following code:
// Assumes an SPList named list, plus the test scaffolding:
// string[] choice = { "Alpha", "Beta", "Gamma" };  // placeholder choice values
// Random rand = new Random();
// int itemCount = 0;  // running count of items already in the list
DateTime start1 = DateTime.Now;
for (int i = 0; i < numItems; i++)
{
    int cn = rand.Next(0, 3);  // pick one of the three choices
    SPListItem item = list.Items.Add();
    item["SingleText"] = Guid.NewGuid().ToString();
    item["Number"] = itemCount + i;
    item["Choice"] = choice[cn];
    item["Bool"] = (cn <= 1);
    item.Update();
}
DateTime end1 = DateTime.Now;
TimeSpan timer1 = new TimeSpan(end1.Ticks - start1.Ticks);
As you can see, this code records the time before and after the for loop, which we can use to figure out the total seconds spent in the loop. I tested this by adding 1, 10, 100, 1,000, 10,000 and finally around 85,000 items to the list, and in general the rate of around 50 list items per second held true on my hardware.
Updating Items Performance
There are really two ways to update items in a SharePoint list: either you use the SPListItem.Update() method, or you use the Lists web service UpdateListItems() web method. I decided to try them both, and what I found really surprised me.
SPListItem.Update()
The first thing I stumbled onto was SPListItemCollection's Count property. Apparently, using it forces some enumeration of the SPListItemCollection. In my initial code to update items, I was using the Count property in the for loop condition, and the performance of that trial was just awful. When I looked into it, I found that by reading the Count property once, outside the for loop, and storing it in an integer variable, I was able to increase performance dramatically. At the time I had around 5,000 items in my list, and every call to the Count property was costing me around 2.5 seconds. So an important note for anyone who uses the Count property of an SPListItemCollection: make sure you only call it once, and store it in a variable you control so you can refer to that instead of calling it again.
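In code, the difference looks something like this (a contrived sketch; query here is whatever CAML you are running):

SPListItemCollection items = list.GetItems(query);

// Slow: Count is evaluated on every pass through the loop, and each
// call was costing me around 2.5 seconds on a 5,000 item list.
for (int i = 0; i < items.Count; i++)
{
    // ... work with items[i] ...
}

// Fast: read Count once and cache it in a local variable.
int count = items.Count;
for (int i = 0; i < count; i++)
{
    // ... work with items[i] ...
}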
After I tracked down the Count snafu, I was well on my way to getting numbers for the Update performance. What I found is that I was only able to update 1 list item every 2.8 seconds. Surprised? I was. Putting that together with what I found with the Count property, I have to wonder whether a list item collection is being enumerated somewhere under the covers. At that rate it seems quicker to delete an item and create a new copy. Here is the code I used to test the update:
DateTime start2 = DateTime.Now;
for (int i = 0; i < numItems; i++)
{
    int cn = rand.Next(0, 3);
    // Note: each access to list.Items below builds a fresh
    // SPListItemCollection, which may well be part of the cost.
    list.Items[i]["SingleText"] = Guid.NewGuid().ToString();
    list.Items[i]["Number"] = itemCount + i;
    list.Items[i]["Choice"] = choice[cn];
    list.Items[i]["Bool"] = (cn <= 1);
    list.Items[i].Update();
}
DateTime end2 = DateTime.Now;
TimeSpan timer2 = new TimeSpan(end2.Ticks - start2.Ticks);
I am doing the same thing here, setting the time before and after the for loop. I am afraid I only had the patience to test this with 1, 10, 100 and 1,000 items; it was just taking too long.
Web Service UpdateListItems() Performance
I came to this part of the testing a little biased. I had just witnessed the poor performance of SPListItem.Update() and, as I stated earlier, I fully expected the web service to perform even worse. I was pretty much as wrong as a person can be. UpdateListItems() lets you submit items in batch using an XML update definition, so you can submit multiple items at once if you have them ready (I did find that around 400 is the most you want to submit in a single batch). I hit my test button fully expecting that I would have to go make a sandwich while I waited, but as I was starting to stand up it came back as completed. I blinked a little, thinking it was broken, but it had updated the 100 items I told it to. As surprising as it sounds, the web service was pushing updates at the comparatively blazing speed of 50 items per second, matching the creation speed of new items. If you are as puzzled as I am about why, well, I am thinking of starting a support group. So, as a performance tip for your code:
Do - use the web service to update list items if you need to update them yourself.
Don't - use SPListItem.Update() in your code if you have to update more than one item at a time.
Here is the code I used for testing:
// Assumes listService is a proxy for the SharePoint Lists web service
// (e.g. a web reference to http://<site>/_vti_bin/Lists.asmx).
DateTime start1 = DateTime.Now;
System.Text.StringBuilder xml = new System.Text.StringBuilder();
System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
System.Xml.XmlElement batchElement = doc.CreateElement("Batch");
batchElement.SetAttribute("OnError", "Continue");
batchElement.SetAttribute("ListVersion", "1");
batchElement.SetAttribute("ViewName", "{F15B3020-EF3F-4458-B8A2-FD7F10725F82}");
for (int i = 1; i < numItems + 1; i++)
{
    int cn = rand.Next(0, 3);
    // Build one <Method> element per item update.
    xml.AppendFormat("<Method ID='{0}' Cmd='Update'>" +
        "<Field Name='ID'>{0}</Field>" +
        "<Field Name='SingleText'>{1}</Field>" +
        "<Field Name='Number'>{2}</Field>" +
        "<Field Name='Choice'>{3}</Field>" +
        "<Field Name='Bool'>{4}</Field></Method>",
        i, Guid.NewGuid(), itemCount + i, choice[cn], (cn <= 1));
}
batchElement.InnerXml = xml.ToString();
listService.UpdateListItems("TestDB", batchElement);
DateTime end1 = DateTime.Now;
TimeSpan timer1 = new TimeSpan(end1.Ticks - start1.Ticks);
I have quoted more code here because I was also timing the additional cost of setting up the web service connection, and even with that overhead it was over 100 times faster than using SPListItem.Update().
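Since around 400 methods per batch was the practical ceiling in my tests, larger updates need to be split across multiple UpdateListItems() calls. Here is a rough sketch of that chunking; buildMethodXml is a hypothetical helper standing in for whatever builds your <Method> elements:

// Split a large update into batches of at most 400 Method elements.
// buildMethodXml(i) is a hypothetical helper that returns the
// <Method ...>...</Method> string for a single item; the other Batch
// attributes (ListVersion, ViewName) are omitted here for brevity.
const int batchSize = 400;
for (int offset = 0; offset < numItems; offset += batchSize)
{
    System.Text.StringBuilder xml = new System.Text.StringBuilder();
    int upper = Math.Min(offset + batchSize, numItems);
    for (int i = offset; i < upper; i++)
    {
        xml.Append(buildMethodXml(i));
    }

    System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
    System.Xml.XmlElement batchElement = doc.CreateElement("Batch");
    batchElement.SetAttribute("OnError", "Continue");
    batchElement.InnerXml = xml.ToString();
    listService.UpdateListItems("TestDB", batchElement);
}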
Reading / Querying Item Performance
Once I had my 100,000 list items, I was not expecting spectacular performance on my list queries, but I was pleasantly surprised by what I found. While the queries could take some time, there are certainly ways to work around that. I was able to query the list with various CAML queries at an average speed of around 2,000 items per second. This speed held true when querying through the Lists web service GetListItems() method as well. That meant when I queried my 100,000 item list and returned 89,000 items in the result set, it took around 44 seconds to complete the query. I found that pre-sorting the items in the query using <OrderBy> didn't noticeably affect the performance at all. I ran these two queries:
<Where><Contains><FieldRef Name='SingleText'/><Value Type='Text'>8</Value></Contains></Where>
<OrderBy><FieldRef Name='Number' Ascending='FALSE'/></OrderBy><Where><Contains><FieldRef Name='SingleText'/><Value Type='Text'>8</Value></Contains></Where>
There was less than a tenth of a second difference in the average time to process the query. So a performance hint:
Do - use the RowLimit property on the SPQuery class if you don't need to return all the results, and make the limit as small as you can, while using <OrderBy> to make sure you get the most relevant results first (see the sketch after the query list below).
Queries I tested:
<Where><Contains><FieldRef Name='SingleText'/><Value Type='Text'>8</Value></Contains></Where> (search a GUID for a single character of text)
<OrderBy><FieldRef Name='Number' Ascending='FALSE'/></OrderBy><Where><Contains><FieldRef Name='SingleText'/><Value Type='Text'>8</Value></Contains></Where> (search a GUID for a single character of text and return a sorted list)
<Where><And><Contains><FieldRef Name='SingleText'/><Value Type='Text'>8</Value></Contains><Contains><FieldRef Name='SingleText'/><Value Type='Text'>a</Value></Contains></And></Where> (search a GUID for the character '8' and the character 'a')
<Where><And><Lt><FieldRef Name='Number'/><Value Type='Number'>95000</Value></Lt><Gt><FieldRef Name='Number'/><Value Type='Number'>10000</Value></Gt></And></Where> (search for all values in the Number field between 10,000 and 95,000)
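To make the RowLimit tip concrete, here is a sketch of running one of the queries above through SPQuery, sorted so the most relevant items come back first and capped at 100 rows (the limit of 100 is an arbitrary value for the example):

// Cap the result set at 100 rows and sort by Number descending
// so the most relevant items are the ones returned.
SPQuery query = new SPQuery();
query.Query =
    "<OrderBy><FieldRef Name='Number' Ascending='FALSE'/></OrderBy>" +
    "<Where><Contains><FieldRef Name='SingleText'/>" +
    "<Value Type='Text'>8</Value></Contains></Where>";
query.RowLimit = 100;
SPListItemCollection results = list.GetItems(query);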
Manipulating Data
As part of this performance testing, I decided it was worth checking how different methods of consuming the data perform. One thing I see a lot is the result of a query being loaded into a DataSet or DataTable to be combined with content from a backend system or SQL table. I had found a new method on SPListItemCollection that I wanted to test: GetDataTable(). It does exactly what you would expect and returns a DataTable populated with all the data in the SPListItemCollection. Once you have a DataTable, there are several time-saving tricks that make SharePoint lists more attractive as a data source. I found that, in general, it took more time to use GetDataTable() than to run a foreach loop and copy the data into a new DataTable myself. This seemed really odd to me, but perhaps I was running into the same performance issue I saw with the SPListItemCollection Count property. GetDataTable() processed about 1,400 items per second, so my result sets of around 85,000 items took about a minute to convert, versus about 48 seconds with the foreach loop (putting its performance at about 1,800 items per second).
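For reference, the two approaches I compared look roughly like this (a sketch using the four columns from my test list; the manual column types are my assumptions about what SharePoint hands back):

SPListItemCollection items = list.GetItems(query);

// Approach 1: the built-in conversion (about 1,400 items/sec for me).
System.Data.DataTable builtIn = items.GetDataTable();

// Approach 2: copy the fields by hand (about 1,800 items/sec for me).
System.Data.DataTable manual = new System.Data.DataTable();
manual.Columns.Add("SingleText", typeof(string));
manual.Columns.Add("Number", typeof(double));
manual.Columns.Add("Choice", typeof(string));
manual.Columns.Add("Bool", typeof(bool));
foreach (SPListItem item in items)
{
    manual.Rows.Add(item["SingleText"], item["Number"],
        item["Choice"], item["Bool"]);
}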
Wrap Up
To summarize what I found:
Action                              | Performance
----------------------------------- | ---------------
Add New Items                       | 50 items/sec
SPListItem.Update()                 | 2.8 secs/item
Web Service UpdateListItems()       | 50 items/sec
SPQuery                             | 2,000 items/sec
SPListItemCollection.GetDataTable() | 1,400 items/sec
Manually Create DataTable           | 1,800 items/sec
Keep in mind that this is of course on my hardware, which is a 64-bit dual core machine with 4GB RAM. Your mileage may vary.
One more discovery worth repeating: limit your calls to the SPListItemCollection.Count property, and cache the value whenever you need it more than once.