Everyone's Favorite Linear, Direct Access, Homogeneous Data Structure: The Array (English-to-Chinese Translation)

 

Arrays are one of the simplest and most widely used data structures in computer programs. Arrays in any programming language all share a few common properties:

  • The contents of an array are stored in contiguous memory.
  • All of the elements of an array must be of the same type or of a derived type; hence arrays are referred to as homogeneous data structures.
  • Array elements can be directly accessed. If you know you want the ith element of an array, you can reach it with one line of code: arrayName[i].

The common operations performed on arrays are:

  • Allocation
  • Accessing

In C#, when an array (or any reference type variable) is initially declared, it has a null value. That is, the following line of code simply creates a variable named booleanArray that equals null:

bool [] booleanArray;

Before we can begin to work with the array, we must create an array instance that can store a specific number of elements. This is accomplished using the following syntax:

booleanArray = new bool[10];

Or more generically:

arrayName = new arrayType[allocationSize];

This allocates a contiguous block of memory in the CLR-managed heap large enough to hold allocationSize elements of type arrayType. If arrayType is a value type, then allocationSize unboxed arrayType values are created. If arrayType is a reference type, then allocationSize arrayType references are created. (If you are unfamiliar with the difference between reference and value types and the managed heap versus the stack, check out Understanding .NET's Common Type System.)

To help hammer home how the .NET Framework stores the internals of an array, consider the following example:

bool [] booleanArray;
FileInfo [] files;

booleanArray = new bool[10];
files = new FileInfo[10];

Here, the booleanArray is an array of the value type System.Boolean, while the files array is an array of a reference type, System.IO.FileInfo. Figure 1 shows a depiction of the CLR-managed heap after these four lines of code have executed.

Figure 1. The contents of an array are laid out contiguously in the managed heap.

The thing to keep in mind is that the ten elements in the files array are references to FileInfo instances. Figure 2 hammers home this point, showing the memory layout if we assign some of the values in the files array to FileInfo instances.

Figure 2. The contents of an array are laid out contiguously in the managed heap.
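
The difference between the two arrays can also be checked in code. The following is a minimal sketch, not part of the original article: it assumes a C# 9 or later compiler (it uses top-level statements), and "example.txt" is a hypothetical file name chosen only for illustration.

```csharp
using System;
using System.IO;

// Value-type elements are stored inline in the array,
// so every slot starts out as default(bool), which is false.
bool[] booleanArray = new bool[10];

// Reference-type elements are references, so every slot starts out as null
// until a FileInfo instance is assigned to it.
FileInfo[] files = new FileInfo[10];

Console.WriteLine(booleanArray[0]);    // False
Console.WriteLine(files[0] == null);   // True

// Assigning an instance stores a reference in slot 0; the FileInfo object
// itself lives elsewhere on the managed heap. "example.txt" is hypothetical,
// and constructing a FileInfo does not require the file to exist.
files[0] = new FileInfo("example.txt");

Console.WriteLine(files[0].Name);      // example.txt
Console.WriteLine(files[1] == null);   // True (untouched slots are still null)
```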

All arrays in .NET allow their elements to be both read and written. The syntax for accessing an array element is:

// Read an array element
bool b = booleanArray[7];

// Write to an array element
booleanArray[0] = false;

The running time of an array access is denoted O(1) because it is constant. That is, regardless of how many elements are stored in the array, it takes the same amount of time to look up an element. This constant running time is possible solely because an array's elements are stored contiguously, hence a lookup only requires knowledge of the array's starting location in memory, the size of each array element, and the index of the element being accessed.
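
The arithmetic behind that constant-time lookup can be sketched as follows. This is a simplified model rather than the CLR's actual implementation: ElementAddress is a hypothetical helper, the base address 1000 and the 4-byte element size are made-up values, and the sketch ignores any bookkeeping data the runtime keeps alongside the array. It assumes a C# 9 or later compiler (top-level statements).

```csharp
using System;

// Hypothetical helper: the address of element `index` is the array's starting
// address plus index * elementSize. One multiplication and one addition,
// no matter how many elements the array holds.
static long ElementAddress(long baseAddress, int elementSize, int index)
{
    return baseAddress + (long)elementSize * index;
}

// Made-up values: an array of 4-byte elements starting at address 1000.
Console.WriteLine(ElementAddress(1000, 4, 0));   // 1000
Console.WriteLine(ElementAddress(1000, 4, 7));   // 1028
```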

Realize that in managed code, array lookups are slightly more involved than this, because with each array access the CLR checks that the requested index is within the array's bounds. If the array index specified is out of bounds, an IndexOutOfRangeException is thrown. This check helps ensure that when stepping through an array we do not accidentally step past the last array index and into some other memory. This check, though, does not affect the asymptotic running time of an array access because the time to perform such checks does not increase as the size of the array increases.
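
The bounds check is easy to observe. The sketch below is an illustration added here, not code from the original article; it assumes a C# 9 or later compiler (top-level statements), and the variable named caught exists only for the demonstration.

```csharp
using System;

bool[] booleanArray = new bool[10];   // valid indexes are 0 through 9
bool caught = false;

try
{
    int index = 10;                   // one past the last valid index
    booleanArray[index] = true;       // the CLR's bounds check rejects this write
}
catch (IndexOutOfRangeException)
{
    // The write never reached memory outside the array.
    caught = true;
    Console.WriteLine("Index 10 is outside the bounds of the array.");
}
```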

**Note** This index-bounds check comes at a slight performance cost for applications that make a large number of array accesses. With a bit of unmanaged code, though, the bounds check can be bypassed. For more information, refer to Chapter 14 of Applied Microsoft .NET Framework Programming by Jeffrey Richter.

When working with an array, you might need to change the number of elements it holds. To do so, you'll need to create a new array instance of the specified size and copy the contents of the old array into the new, resized array. This process can be accomplished with the following code:

// Create an integer array with three elements
int [] fib = new int[3];
fib[0] = 1;
fib[1] = 1;
fib[2] = 2;

// Resize to a 10-element array
int [] temp = new int[10];

// Copy the fib array to temp
fib.CopyTo(temp, 0);

// Assign temp to fib
fib = temp;

After the last line of code, fib references a ten-element Int32 array. The elements 3 through 9 in the fib array will have the default Int32 value—0.
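
The same allocate-and-copy steps are also available as a single call in .NET 2.0 and later, Array.Resize. The article itself does not use this method; the sketch below is offered as an alternative and assumes a C# 9 or later compiler (top-level statements).

```csharp
using System;

// Create an integer array with three elements
int[] fib = new int[3];
fib[0] = 1;
fib[1] = 1;
fib[2] = 2;

// Array.Resize allocates a new ten-element array, copies the old contents
// into it, and points fib at the new array (hence the ref parameter).
Array.Resize(ref fib, 10);

Console.WriteLine(fib.Length);   // 10
Console.WriteLine(fib[2]);       // 2 (copied from the old array)
Console.WriteLine(fib[9]);       // 0 (default Int32 value)
```

Either way, resizing costs a full copy of the existing elements, so it is worth avoiding in a tight loop.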

Arrays are excellent data structures for storing a collection of homogeneous types that you only need to access directly. Searching an unsorted array has linear running time. While this is acceptable when working with small arrays, or when performing very few searches, if your application stores large arrays that are searched frequently, a number of other data structures are better suited to the job. We'll look at some of them in upcoming pieces of this article series. Realize that if you are searching an array on some property and the array is sorted by that property, you can use an algorithm called binary search to search the array in O(log n) running time, which is on par with the search times for binary search trees. In fact, the Array class contains a static BinarySearch() method. For more information on this method, check out an earlier article of mine, Efficiently Searching a Sorted Array.
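
To make the two search strategies concrete, here is a short sketch comparing a hand-written linear search with Array.BinarySearch on a sorted array. LinearSearch is a hypothetical helper written for this illustration, the sample values are arbitrary, and the code assumes a C# 9 or later compiler (top-level statements).

```csharp
using System;

// The array must already be sorted for binary search to work.
int[] sorted = { 1, 2, 3, 5, 8, 13, 21, 34, 55, 89 };

// Linear search: examines elements one by one, O(n) in the worst case.
static int LinearSearch(int[] items, int target)
{
    for (int i = 0; i < items.Length; i++)
    {
        if (items[i] == target) return i;
    }
    return -1;   // not found
}

int linearIndex = LinearSearch(sorted, 13);
int binaryIndex = Array.BinarySearch(sorted, 13);   // halves the range each step, O(log n)
int missing = Array.BinarySearch(sorted, 4);        // negative when the value is absent

Console.WriteLine(linearIndex);    // 5
Console.WriteLine(binaryIndex);    // 5
Console.WriteLine(missing < 0);    // True
```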

**Note** The .NET Framework allows for multi-dimensional arrays as well. Multi-dimensional arrays, like single-dimensional arrays, offer a constant running time for accessing elements. Recall that the running time to search through an n-element single-dimensional array was denoted O(n). For an n x n two-dimensional array, the running time is denoted O(n^2) because the search must check n^2 elements. More generally, a k-dimensional array has a search running time of O(n^k). Keep in mind here that n is the number of elements in each dimension, not the total number of elements in the multi-dimensional array.
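
As an illustration of the two-dimensional case, the sketch below searches a rectangular array with two nested loops, visiting up to n x n elements. Contains is a hypothetical helper written for this example, and the code assumes a C# 9 or later compiler (top-level statements).

```csharp
using System;

// Searches every cell of a two-dimensional array: for an n x n array,
// the nested loops check up to n^2 elements.
static bool Contains(int[,] grid, int target)
{
    for (int row = 0; row < grid.GetLength(0); row++)
    {
        for (int col = 0; col < grid.GetLength(1); col++)
        {
            // Each individual access is still constant time.
            if (grid[row, col] == target) return true;
        }
    }
    return false;
}

int[,] grid = new int[3, 3];   // all nine elements start at 0
grid[2, 1] = 42;

Console.WriteLine(Contains(grid, 42));   // True
Console.WriteLine(Contains(grid, 7));    // False
```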

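Everyone's Favorite Linear, Direct-Access, Homogeneous Data Structure: The Array
Arrays are among the most convenient and most widely used data structures in computer programs. In any programming language, arrays share a few common properties:
  • The contents of an array are stored in contiguous memory.
  • All elements of an array must be of the same type or of a derived type; arrays are therefore called homogeneous data structures.
  • Array elements can be accessed directly. To access the ith element of an array, a single line of code is enough: arrayName[i].
The common operations on arrays are:
  • Allocation
  • Accessing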
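In C#, when an array (or any reference-type variable) is first declared, its value is null. In other words, the following line of code simply creates a variable named booleanArray whose value is null:
bool [] booleanArray;
Before working with the array, we must create an array instance that can hold a specific number of elements, using the following syntax:
booleanArray = new bool[10];
Or, more generally:
arrayName = new arrayType[allocationSize];
This allocates a contiguous block of memory in the CLR-managed heap large enough to hold allocationSize elements of type arrayType. If arrayType is a value type, allocationSize unboxed arrayType values are created. If arrayType is a reference type, allocationSize arrayType references are created. (If you are unfamiliar with the difference between value types and reference types, or between the managed heap and the stack, see Understanding .NET's Common Type System.)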
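To help you understand how the .NET Framework stores the internals of an array, consider the following example:
bool [] booleanArray;
FileInfo [] files;

booleanArray = new bool[10];
files = new FileInfo[10];
Here, booleanArray is an array of the value type System.Boolean, while files is an array of the reference type System.IO.FileInfo. Figure 1 depicts the CLR-managed heap after these four lines of code have executed.
Note that the ten elements of the files array are references to FileInfo instances. Figure 2 emphasizes this point by showing the memory layout after some elements of the files array have been assigned FileInfo instances.
Figure 2. The contents of an array are laid out contiguously in the managed heap.
All arrays in .NET allow their elements to be both read and written. The syntax for accessing an array element is: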
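// Read an array element
bool b = booleanArray[7];

// Write to an array element
booleanArray[0] = false;
The running time of an array access is denoted O(1) because it is constant. In other words, no matter how many elements the array stores, looking up an element takes the same amount of time. This constant running time is possible only because the array's elements are stored contiguously, so a lookup needs only the array's starting location in memory, the size of each element, and the index of the element.
Note that in managed code, array lookups are slightly more involved, because on every array access the CLR checks that the requested index is within the array's bounds. If the index is out of bounds, an IndexOutOfRangeException is thrown. This check helps ensure that, when stepping through an array, we do not accidentally step past the last index into other memory. The check does not, however, affect the asymptotic running time of an array access, because the time it takes does not grow with the size of the array.
Note: For applications that perform a large number of array accesses, this bounds check carries a slight performance cost. With a bit of unmanaged code, though, the check can be bypassed. For more information, see Chapter 14 of Applied Microsoft .NET Framework Programming by Jeffrey Richter.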
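When working with an array, you may need to change the number of elements it holds. To do so, create a new array instance of the desired size and copy the contents of the old array into the new one. The following code accomplishes this:
// Create an integer array with three elements
int [] fib = new int[3];
fib[0] = 1;
fib[1] = 1;
fib[2] = 2;

// Resize to a 10-element array
int [] temp = new int[10];

// Copy the fib array to temp
fib.CopyTo(temp, 0);

// Assign temp to fib
fib = temp;
After the last line of code, fib references a ten-element Int32 array. Elements 3 through 9 of the fib array hold the default Int32 value, 0.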
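Arrays are an excellent data structure for storing a collection of homogeneous types that you only need to access directly. Searching an unsorted array takes linear time. That is acceptable for small arrays or for very few searches, but if your application stores large arrays that are searched frequently, other data structures are better suited to the job; later articles in this series cover some of them. Note that if you are searching an array on some property and the array is sorted by that property, you can use an algorithm called binary search to search it in O(log n) time, which is on par with the search time of a binary search tree. In fact, the Array class contains a static BinarySearch() method. For more information on this method, see my earlier article, Efficiently Searching a Sorted Array.
Note: The .NET Framework also supports multi-dimensional arrays. Like single-dimensional arrays, they offer constant running time for accessing elements. Recall that searching an n-element single-dimensional array takes O(n) time. For an n x n two-dimensional array the running time is O(n^2), because the search must check n^2 elements. More generally, searching a k-dimensional array takes O(n^k) time. Note that n is the number of elements in each dimension, not the total number of elements in the multi-dimensional array.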

 
