LINQ Deferred Execution
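LINQ defines a set of standard query operators that can be used, through either query syntax or method syntax, to query a data source. Defining a query does not read the data source immediately. The source is only queried when the returned result is enumerated, for example with foreach. This technique is known as LINQ deferred execution. For example: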
    // Deferred execution
    int[] numbers = new int[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    int i = 0;
    var q = numbers.Where(x => { i++; return x > 2; });
    foreach (var v in q)
    {
        Console.WriteLine("v = {0}, i = {1}", v, i);
    }
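This code prints:

    v = 5, i = 1
    v = 4, i = 2
    v = 3, i = 4
    v = 9, i = 5
    v = 8, i = 6
    v = 6, i = 7
    v = 7, i = 8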
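So the query runs only while the foreach loop executes: i grows step by step as the loop advances. Now modify the program slightly and convert the query result to a List: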
    // Immediate execution: ToList() runs the whole query up front
    int[] numbers = new int[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    int i = 0;
    var q = numbers.Where(x => { i++; return x > 2; }).ToList();
    foreach (var v in q)
    {
        Console.WriteLine("v = {0}, i = {1}", v, i);
    }
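Output:

    v = 5, i = 10
    v = 4, i = 10
    v = 3, i = 10
    v = 9, i = 10
    v = 8, i = 10
    v = 6, i = 10
    v = 7, i = 10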
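The query has already run by the time ToList returns, so i is 10 on every line. To see why LINQ behaves this way, look at the source of the Where extension method, which is defined in the static Enumerable class in the System.Linq namespace (http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,577032c8811e20d3):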
    public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
    {
        if (source == null) throw Error.ArgumentNull("source");
        if (predicate == null) throw Error.ArgumentNull("predicate");
        if (source is Iterator<TSource>) return ((Iterator<TSource>)source).Where(predicate);
        if (source is TSource[]) return new WhereArrayIterator<TSource>((TSource[])source, predicate);
        if (source is List<TSource>) return new WhereListIterator<TSource>((List<TSource>)source, predicate);
        return new WhereEnumerableIterator<TSource>(source, predicate);
    }
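In other words, the statement var q = numbers.Where(x => { i++; return x > 2; }); only returns a WhereArrayIterator<TSource> object. That object holds a reference to the source array and a reference to the lambda expression, and nothing has been filtered yet. The source of the class is: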
    class WhereArrayIterator<TSource> : Iterator<TSource>
    {
        TSource[] source;
        Func<TSource, bool> predicate;
        int index;

        public WhereArrayIterator(TSource[] source, Func<TSource, bool> predicate)
        {
            this.source = source;
            this.predicate = predicate;
        }

        public override Iterator<TSource> Clone()
        {
            return new WhereArrayIterator<TSource>(source, predicate);
        }

        public override bool MoveNext()
        {
            if (state == 1)
            {
                while (index < source.Length)
                {
                    TSource item = source[index];
                    index++;
                    if (predicate(item))
                    {
                        current = item;
                        return true;
                    }
                }
                Dispose();
            }
            return false;
        }

        public override IEnumerable<TResult> Select<TResult>(Func<TSource, TResult> selector)
        {
            return new WhereSelectArrayIterator<TSource, TResult>(source, predicate, selector);
        }

        public override IEnumerable<TSource> Where(Func<TSource, bool> predicate)
        {
            return new WhereArrayIterator<TSource>(source, CombinePredicates(this.predicate, predicate));
        }
    }
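One way to check that calling Where by itself runs no filtering is to count predicate calls before and after enumeration. The following is only a small sketch: the names DeferredCheck and CountCalls are made up for this illustration and are not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredCheck
{
    // Returns the number of predicate calls observed right after Where()
    // and again after the query has been fully enumerated.
    public static (int afterWhere, int afterEnumeration) CountCalls(int[] numbers)
    {
        int calls = 0;
        IEnumerable<int> q = numbers.Where(x => { calls++; return x > 2; });
        int afterWhere = calls;            // Where() has only built the iterator object

        foreach (int unused in q) { }      // enumerating calls MoveNext(), which runs the predicate
        return (afterWhere, calls);
    }

    public static void Main()
    {
        var result = CountCalls(new[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 });
        // Prints: after Where: 0, after foreach: 10
        Console.WriteLine("after Where: {0}, after foreach: {1}",
            result.afterWhere, result.afterEnumeration);
    }
}
```

The counter stays at 0 until the loop starts, and reaches 10 because every element of the array is tested once, including the ones that are filtered out.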
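Note the statement if (predicate(item)) inside MoveNext(): this is where each element of the source is tested. So when the returned WhereArrayIterator<TSource> object is enumerated with foreach, the compiler translates the foreach statement into code of roughly the following form, and the filtering happens as MoveNext is called.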
    // Approximate expansion of: foreach (var v in q) { ... }
    using (IEnumerator<int> enumerator = q.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            int v = enumerator.Current;
            Console.WriteLine("v = {0}, i = {1}", v, i);
        }
    }
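The same behaviour can be reproduced with a hand-written filter. The sketch below uses a C# iterator block (yield return) rather than the hand-coded iterator classes of the reference source, so it is a stand-in and not the actual implementation. MyWhere, LazyFilter and CallsPerStep are hypothetical names, and argument validation is left out for brevity.

```csharp
using System;
using System.Collections.Generic;

public static class LazyFilter
{
    // Hypothetical stand-in for Enumerable.Where, written as an iterator block.
    // The body does not run until the caller starts calling MoveNext().
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }

    // Drives the enumerator by hand and records how many predicate calls
    // had happened each time MoveNext() produced an element.
    public static List<int> CallsPerStep(int[] numbers)
    {
        int calls = 0;
        var steps = new List<int>();
        using (IEnumerator<int> e = numbers.MyWhere(x => { calls++; return x > 2; }).GetEnumerator())
        {
            while (e.MoveNext())
                steps.Add(calls);
        }
        return steps;
    }

    public static void Main()
    {
        // Prints: 1, 2, 4, 5, 6, 7, 8
        Console.WriteLine(string.Join(", ",
            CallsPerStep(new[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 })));
    }
}
```

The recorded counts match the i values printed by the first example: each MoveNext call tests only as many elements as it needs to find the next match.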
Why does ToList fetch all the results immediately? Look at the ToList extension method in the Enumerable class:
    public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null) throw Error.ArgumentNull("source");
        return new List<TSource>(source);
    }
ToList returns a new List<TSource> object. Next, look at the constructor of List<T>:
    public List(IEnumerable<T> collection)
    {
        if (collection == null)
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
        Contract.EndContractBlock();

        ICollection<T> c = collection as ICollection<T>;
        if (c != null)
        {
            int count = c.Count;
            if (count == 0)
            {
                _items = _emptyArray;
            }
            else
            {
                _items = new T[count];
                c.CopyTo(_items, 0);
                _size = count;
            }
        }
        else
        {
            _size = 0;
            _items = _emptyArray;
            // This enumerable could be empty.  Let Add allocate a new array, if needed.
            // Note it will also go to _defaultCapacity first, not 1, then 2, etc.

            using (IEnumerator<T> en = collection.GetEnumerator())
            {
                while (en.MoveNext())
                {
                    Add(en.Current);
                }
            }
        }
    }
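The constructor first tries to cast the source to ICollection<T>. Here the source is the WhereArrayIterator<int> returned by Where, which does not implement ICollection<T>, so the as cast yields null and the else branch runs. In that branch the loop over en.MoveNext() enumerates the whole sequence, and that is what executes the query against the data source.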
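The branch taken by the constructor can be checked from user code. The sketch below is illustrative only; CollectionCheck and IsCollection are hypothetical names. It tests whether a sequence implements ICollection<T>, then confirms that ToList runs the predicate over every element. Newer .NET versions may route ToList through a different internal path than the reference source quoted above, but the observable result here should be the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionCheck
{
    // True when the sequence would take the ICollection<T> fast path
    // (Count + CopyTo) in the List<T> constructor shown above.
    public static bool IsCollection<T>(IEnumerable<T> sequence)
    {
        return sequence is ICollection<T>;
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        int calls = 0;
        IEnumerable<int> q = numbers.Where(x => { calls++; return x > 2; });

        Console.WriteLine(IsCollection(numbers)); // the array implements ICollection<int>
        Console.WriteLine(IsCollection(q));       // the Where result does not

        List<int> list = q.ToList();              // enumerates q to completion
        Console.WriteLine("count = {0}, calls = {1}", list.Count, calls);
    }
}
```

Because the Where result has no Count to read in advance, the only way to fill the list is to walk the sequence with MoveNext, so the predicate has been called for all ten elements by the time ToList returns.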