Optimizing C# String Performance

Strings in C# are highly optimized but also potentially very wasteful. They give programmers a safe, fast way to handle character data. However, there are a few tricks you need to know about strings and memory if you want to write efficient code. Without this information, you could easily write code that squanders both memory and CPU cycles.

Sharing Memory

To understand C# strings, you need to understand the answer to one fairly simple question. Suppose you have two string variables called MyString1 and MyString2. How can you get them both to point at the same place in memory? The goal here is not just to have two strings that contain the same value, but to have two string variables that reference a single block of memory that contains a string.

It turns out that the answer to this question is very simple and intuitive. The reasons behind the answer, however, are less obvious. Understanding those reasons will give you the power to write code that is fast and efficient.

This post emerged from a thread on the C# forum. As often happens, I learned something in the course of the discussion. I've attempted to repackage that information and present it here. The post begins with a look at Strings and StringBuilders, but the focus quickly switches to an exploration of how the String class handles memory.

Strings vs. StringBuilders

C# Strings are immutable. This means you can't modify an existing string. If you try to change it with the concatenation operator, or with the Replace, Insert, PadLeft, PadRight, or Substring methods, you end up with an entirely new string while the original stays as it was. As a result, the operations you perform on a String frequently cause a new allocation of memory.
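A minimal sketch of what immutability looks like in practice. The class and method names (ImmutabilityDemo, ReplaceFoo) are made up for this illustration and are not part of the original article:

```csharp
using System;

static class ImmutabilityDemo
{
    // Hypothetical helper: returns the string produced by Replace.
    // The string passed in is never modified.
    public static string ReplaceFoo(string source)
    {
        return source.Replace("foo", "bar");
    }

    static void Main()
    {
        string original = "foo data";
        string changed = ReplaceFoo(original);

        Console.WriteLine(original);                                  // foo data
        Console.WriteLine(changed);                                   // bar data
        Console.WriteLine(object.ReferenceEquals(original, changed)); // False
    }
}
```

The original variable still holds "foo data" after the call. Replace handed back a second string object rather than editing the first one.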

Allocations of memory are costly in terms of both memory and performance. As a result, there are times when you don't want to use the String class.

Developers who want to work with a single string and take it through an arbitrary number of changes in a loop can use the StringBuilder class. The StringBuilder class has many of the same methods as the String class. You can, however, change the contents of a StringBuilder without allocating a new object for every change, because it writes into an internal buffer that grows only when it fills up. This means that in certain situations the StringBuilder class will be much faster than the String class. In other situations, however, the opposite will be true.

What's a developer to do? The String class is highly optimized and very efficient in most cases. However, if you need to modify a string, then the String class tends to be a bit wasteful of resources. How concerned should developers be about this problem? How often should they abandon the String class and use StringBuilder? The answer, as it turns out, is "not very often."

You should only use the StringBuilder class if you need to modify a single string many times in a loop, or in a relatively small section of code. To fully understand why this is the case, you need to understand just how smart the String class can be when it comes to handling memory in typical programming scenarios.
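The loop case described above might look something like this. It is only a sketch: BuilderDemo, JoinWithConcat, and JoinWithBuilder are names invented for the example, and no timings are claimed here. Both methods produce the same text. The difference is that the first creates a new string on every pass, while the second appends into one reusable buffer.

```csharp
using System;
using System.Text;

static class BuilderDemo
{
    // Concatenation: every += produces a brand-new string object.
    public static string JoinWithConcat(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i + ",";
        }
        return result;
    }

    // StringBuilder: appends go into an internal buffer,
    // and a string is created only once, in ToString().
    public static string JoinWithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i).Append(',');
        }
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinWithConcat(5));  // 0,1,2,3,4,
        Console.WriteLine(JoinWithBuilder(5)); // 0,1,2,3,4,
    }
}
```

For a handful of iterations the difference is unlikely to matter. It is the "many times in a loop" case where the builder starts to pay off.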

What Makes a C# String Sharp?

The big win for Strings is the tricks they perform to limit unnecessary memory allocations. Look at this code:

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.Text;
   4:   
   5:  namespace CSharpConsoleApplication3
   6:  {
   7:      class Program
   8:      {
   9:          static void Main(string[] args)
  10:          {
  11:              String foo = "foo data";
  12:              String bar = foo;
  13:              Console.WriteLine(ReferenceEquals(foo, bar));
  14:              Console.WriteLine(foo.Equals(bar));
  15:              foo = "a";
  16:              Console.WriteLine(foo.Equals(bar));
  17:              Console.WriteLine(ReferenceEquals(foo, bar));
  18:              String goober1 = "foo";
  19:              String goober2 = "foo";
  20:              Console.WriteLine(ReferenceEquals(goober1, goober2));
  21:          }
  22:      }
  23:  }

The goal of getting two string variables to reference the same memory is achieved in lines 11 - 12. In this case, both foo and bar point at the same place in memory. To check, call the ReferenceEquals method. (The == operator is not a substitute here: for strings it compares values, not references.) In this code, the call to ReferenceEquals prints True in line 13. We can also call the Equals method (line 14) of the String class to see that the two strings are equal in that they both have the same value. That is, they both point at the eight characters that spell "foo data".
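To see the difference between "same value" and "same memory", it helps to compare a literal with a string that is assembled at run time. The sketch below uses an invented helper, BuildAtRunTime, to create a second "foo data" object. For variables typed as string, == and Equals both compare characters, so only ReferenceEquals reveals that there are two objects.

```csharp
using System;

static class EqualityDemo
{
    // Hypothetical helper: builds "foo data" from a char array at run time,
    // which always yields a new string object.
    public static string BuildAtRunTime()
    {
        return new string(new[] { 'f', 'o', 'o', ' ', 'd', 'a', 't', 'a' });
    }

    static void Main()
    {
        string literal = "foo data";
        string built = BuildAtRunTime();

        Console.WriteLine(literal == built);                      // True: same characters
        Console.WriteLine(literal.Equals(built));                 // True: same characters
        Console.WriteLine(object.ReferenceEquals(literal, built)); // False: two objects
    }
}
```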

Now change the value of foo, as we do in line 15. A C/C++ programmer might then expect that both foo and bar would still reference the same memory, and hence both have the value "a". This is not the case. Lines 16 and 17 both print False. The assignment of "a" to foo broke the connection between the two variables: foo now refers to a different string, while bar still refers to "foo data". Intuitively, this is what we would expect. It's only our "deeper understanding" of computer languages that makes us see this as odd.

The final twist in this saga is that line 20 also prints True. Here we have assigned two separate string literals to two different variables. Our expectation is that these two variables should not point at the same place in memory. But line 20 shows that they do reference the same block of memory.

The runtime maintains something called an "intern table." This is a table of unique string values, and the string literals in your program are added to it. When literals like those in lines 18 and 19 are loaded, the intern table is checked. If your string is already in there, then both variables will point at the same block of memory maintained by the intern table. The string is not duplicated. Strings that are built at run time, for example by concatenation or with a StringBuilder, are not added automatically; you have to call String.Intern for that. Again, this is intuitively what we want, but our understanding of computers makes us think that this is not what will happen. C# tries to conform to what we would intuitively expect to happen, not to what we think a computer is likely to do.

Some of the details of the intern table are discussed in the reference documentation for the String.Intern method.
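Here is a small sketch of String.Intern at work, assuming the default behavior where string literals are placed in the intern table. The InternDemo class and its CopyOf helper are invented for this illustration. A copy made at run time is a separate object, but passing it to String.Intern returns the reference already held in the table.

```csharp
using System;

static class InternDemo
{
    // Hypothetical helper: returns a run-time copy of the text,
    // which is a new object rather than the interned instance.
    public static string CopyOf(string text)
    {
        return new string(text.ToCharArray());
    }

    static void Main()
    {
        string literal = "foo";              // literal, interned automatically
        string copy = CopyOf(literal);       // same characters, different object
        string pooled = string.Intern(copy); // looks the value up in the intern table

        Console.WriteLine(object.ReferenceEquals(literal, copy));   // False
        Console.WriteLine(object.ReferenceEquals(literal, pooled)); // True
    }
}
```

Interning everything is not free, since entries in the table tend to live for a long time. It is best treated as a tool for a few heavily repeated values, not as a default.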

Summary

This post explains a little bit about how C# handles memory allocations for the String class. Knowing this information is helpful if you want to write optimized code. It is also interesting information that intrigues us in part because it explains one small corner of the great wonder that is the C# language.

How important is it that one understands this information? That depends. For some people, it will be information they use every day. For others, it is just background noise. Writing safe, error-free code is my most important task. Once that is accomplished, then I like to find time to work on optimization issues like those outlined here.

 

Original article: http://blogs.msdn.com/b/charlie/archive/2006/10/11/optimizing-c_2300_-string-performance.aspx
