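Since C# 3.0 (released alongside LINQ in .NET Framework 3.5), collections have had a Distinct method, one of many LINQ additions that make collections richer to work with. Searching online turns up quite a few good articles on the topic, including some very effective ways to work around the places where LINQ is awkward to use. My own work has not yet needed the more advanced features, so this post collects and shares only the simplest uses.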
-
Using Distinct on a simple one-dimensional collection:
    List<int> ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };
    List<string> names = new List<string> { "wang", "li", "zhang", "li", "wang", "chen", "he", "wang" };

    IEnumerable<int> distinctAges = ages.Distinct();
    Console.WriteLine("Distinct ages:");
    foreach (int age in distinctAges)
    {
        Console.WriteLine(age);
    }

    var distinctNames = names.Distinct();
    Console.WriteLine("\nDistinct names:");
    foreach (string name in distinctNames)
    {
        Console.WriteLine(name);
    }
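- This is the simplest use of Distinct(). The result is stored once in a variable declared with the collection interface IEnumerable<int> and once in an implicitly typed var. The two styles are equivalent: var only asks the compiler to infer the type, and the type it infers for names.Distinct() is IEnumerable<string>.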
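- To make the comparison of the two declaration styles concrete, here is a minimal sketch. StaticTypeOf is a hypothetical helper introduced only for this illustration; it reports the compile-time type of its argument, which should be the same for both declarations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarDemo
{
    // Hypothetical helper (not part of LINQ): T is inferred from the
    // argument's compile-time type, so typeof(T) is the static type.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        List<int> ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };

        IEnumerable<int> explicitlyTyped = ages.Distinct();
        var implicitlyTyped = ages.Distinct();

        // Both variables have the compile-time type IEnumerable<int>.
        Console.WriteLine(StaticTypeOf(explicitlyTyped) == StaticTypeOf(implicitlyTyped)); // True
        Console.WriteLine(StaticTypeOf(implicitlyTyped) == typeof(IEnumerable<int>));      // True
    }
}
```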
- However, code like the following is wrong and does not compile:
    List<int> disAge = ages.Distinct();
- The correct way is:
    List<int> ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };
    List<int> disAge = ages.Distinct().ToList();
    foreach (int a in disAge)
        Console.WriteLine(a);
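- In other words, Distinct() returns the interface type IEnumerable<T>, not a concrete collection, so ToList() is needed to turn the result into a List<int>.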
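- A related point that the interface return type hints at: Distinct() is evaluated lazily, and ToList() is what actually runs the query and copies the result. The sketch below is an illustration under that assumption; the class and method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns { count seen by the lazy query, count stored in the snapshot }.
    public static int[] Run()
    {
        List<int> ages = new List<int> { 21, 46, 46, 55 };

        IEnumerable<int> lazyQuery = ages.Distinct();   // nothing is evaluated yet
        List<int> snapshot = ages.Distinct().ToList();  // evaluated and copied now

        ages.Add(17); // change the source after both statements above

        // The lazy query reads the list only when it is enumerated,
        // so it sees the new element; the snapshot does not.
        return new[] { lazyQuery.Count(), snapshot.Count };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine(counts[0]); // 4
        Console.WriteLine(counts[1]); // 3
    }
}
```

- So ToList() is useful not only to satisfy the List<int> declaration but also when a fixed copy of the distinct values is wanted.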
-
Using Distinct with a custom class:
- First, look at the example given on MSDN, which starts by defining a product class:
    public class Product : IEquatable<Product>
    {
        public string Name { get; set; }
        public int Code { get; set; }

        public bool Equals(Product other)
        {
            // Check whether the compared object is null.
            if (Object.ReferenceEquals(other, null)) return false;

            // Check whether the compared object references the same data.
            if (Object.ReferenceEquals(this, other)) return true;

            // Check whether the products' properties are equal.
            return Code.Equals(other.Code) && Name.Equals(other.Name);
        }

        // If Equals() returns true for a pair of objects
        // then GetHashCode() must return the same value for these objects.
        public override int GetHashCode()
        {
            // Get hash code for the Name field if it is not null.
            int hashProductName = Name == null ? 0 : Name.GetHashCode();

            // Get hash code for the Code field.
            int hashProductCode = Code.GetHashCode();

            // Calculate the hash code for the product.
            return hashProductName ^ hashProductCode;
        }
    }
- The main function uses it like this:
    static void Main(string[] args)
    {
        Product[] products =
        {
            new Product { Name = "apple", Code = 9 },
            new Product { Name = "orange", Code = 4 },
            new Product { Name = "apple", Code = 9 },
            new Product { Name = "lemon", Code = 12 }
        };

        // Exclude duplicates.
        IEnumerable<Product> noduplicates = products.Distinct();

        foreach (var product in noduplicates)
            Console.WriteLine(product.Name + " " + product.Code);
    }
- The output is:
    /*
        This code produces the following output:

        apple 9
        orange 4
        lemon 12
    */
- The problem appears if we change the main function to this:
    static void Main(string[] args)
    {
        Product[] products =
        {
            new Product { Name = "Smallapple", Code = 9 },
            new Product { Name = "orange", Code = 4 },
            new Product { Name = "Bigapple", Code = 9 },
            new Product { Name = "lemon", Code = 12 }
        };

        // Exclude duplicates.
        IEnumerable<Product> noduplicates = products.Distinct();

        foreach (var product in noduplicates)
            Console.WriteLine(product.Name + " " + product.Code);
    }
- The output is now:
    /*
        This code produces the following output:

        Smallapple 9
        orange 4
        Bigapple 9
        lemon 12
    */
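- All four products are kept, because Equals compares both Name and Code. If we want to deduplicate by Code alone and keep one product per Code value, we have to define a separate comparer class that compares only Code, or generalize it into a generic class, and that is very tedious.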
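- For reference, the tedious approach might look like the sketch below. ProductCodeComparer and CodeComparerDemo are hypothetical names chosen for this illustration, and Product is a simplified copy of the class above (without IEquatable<Product>) so that the block stands on its own. It relies on the Distinct overload that accepts an IEqualityComparer<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified copy of the product class, kept minimal for this sketch.
public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Hypothetical single-purpose comparer: products are equal when Code matches.
public class ProductCodeComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Code == y.Code;
    }

    // Must agree with Equals: products with equal Codes get equal hash codes.
    public int GetHashCode(Product obj)
    {
        return ReferenceEquals(obj, null) ? 0 : obj.Code.GetHashCode();
    }
}

public static class CodeComparerDemo
{
    public static List<Product> DistinctByCode(IEnumerable<Product> source)
    {
        // Pass the comparer to the Distinct overload that accepts one.
        return source.Distinct(new ProductCodeComparer()).ToList();
    }

    public static void Main()
    {
        Product[] products =
        {
            new Product { Name = "Smallapple", Code = 9 },
            new Product { Name = "orange", Code = 4 },
            new Product { Name = "Bigapple", Code = 9 },
            new Product { Name = "lemon", Code = 12 }
        };

        // Prints three products, one for each of the Code values 9, 4 and 12.
        foreach (Product p in DistinctByCode(products))
            Console.WriteLine(p.Name + " " + p.Code);
    }
}
```

- Every additional field we might want to deduplicate by would need another class like this one, which is the repetition the next section avoids.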
-
An improvement from the 鹤冲天 blog (everything below is reproduced from that blog)
- First, create a general-purpose comparer class that implements the IEqualityComparer<T> interface:
    public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
    {
        private Func<T, V> keySelector;

        public CommonEqualityComparer(Func<T, V> keySelector)
        {
            this.keySelector = keySelector;
        }

        public bool Equals(T x, T y)
        {
            return EqualityComparer<V>.Default.Equals(keySelector(x), keySelector(y));
        }

        public int GetHashCode(T obj)
        {
            return EqualityComparer<V>.Default.GetHashCode(keySelector(obj));
        }
    }
- With this class, a Distinct extension method can be written like this:
    public static class DistinctExtensions
    {
        public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector)
        {
            return source.Distinct(new CommonEqualityComparer<T, V>(keySelector));
        }
    }
- Using it is then very simple:
    Product[] products =
    {
        new Product { Name = "Smallapple", Code = 9 },
        new Product { Name = "orange", Code = 4 },
        new Product { Name = "Bigapple", Code = 9 },
        new Product { Name = "lemon", Code = 12 }
    };

    var p1 = products.Distinct(p => p.Code);
    foreach (Product pro in p1)
        Console.WriteLine(pro.Name + "," + pro.Code);

    var p2 = products.Distinct(p => p.Name);
    foreach (Product pro in p2)
        Console.WriteLine(pro.Name + "," + pro.Code);
- As you can see, with a lambda expression as the key selector, Distinct can easily be applied to any field of a custom class.
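- The same extension method should also handle a combination of fields if the key selector returns an anonymous type, because anonymous types compare by the values of their properties. The sketch below repeats the comparer and extension method so that it can be compiled on its own; the sample data, the simplified Product class, and the CompositeKeyDemo class are assumptions added for illustration and are not from the original blog.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified copy of the product class, kept minimal for this sketch.
public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly Func<T, V> keySelector;

    public CommonEqualityComparer(Func<T, V> keySelector)
    {
        this.keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        return EqualityComparer<V>.Default.Equals(keySelector(x), keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        return EqualityComparer<V>.Default.GetHashCode(keySelector(obj));
    }
}

public static class DistinctExtensions
{
    public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector)
    {
        return source.Distinct(new CommonEqualityComparer<T, V>(keySelector));
    }
}

public static class CompositeKeyDemo
{
    // Hypothetical sample data with repeats in Name, in Code, and in both.
    public static Product[] Sample()
    {
        return new[]
        {
            new Product { Name = "apple", Code = 9 },
            new Product { Name = "apple", Code = 9 },
            new Product { Name = "apple", Code = 4 },
            new Product { Name = "orange", Code = 4 }
        };
    }

    public static void Main()
    {
        Product[] products = Sample();

        // Key on a single field.
        Console.WriteLine(products.Distinct(p => p.Code).Count()); // 2
        Console.WriteLine(products.Distinct(p => p.Name).Count()); // 2

        // Key on both fields through an anonymous type (value equality).
        Console.WriteLine(products.Distinct(p => new { p.Name, p.Code }).Count()); // 3
    }
}
```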