jQuery Code Analysis, Part 2: Selector Engine

tomorrow_be_good

Categories: web, jQuery

Article link: https://blog.csdn.net/sandstom_1985/article/details/8034398

jQuery Code Analysis, Part 2: The Sizzle Selector Engine

The code analysis in this article is based on jQuery 1.7.

Sizzle(selector,context,results,seed)

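Return value: the set of elements that match selector

selector: a CSS selector

context: the HTML element within which the CSS selector is applied; it defaults to document. For example, suppose we have the following HTML structure: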

<html>
<body>
<div id="parent">
<div>
<span>hello world</span>
</div>
</div>
</body>
</html>

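When context is not set, the selector is matched against the whole document, which here means the descendants of body. We can also set context to #parent, in which case only the descendants of #parent are matched.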

results: optional; a collection into which the matched elements are inserted. For example, jQuery's own implementation passes in a jQuery object.
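To make the role of context and results concrete, here is a minimal sketch that uses plain objects in place of DOM nodes. The helpers node and findByTag are hypothetical and are not part of Sizzle; they only imitate how a narrower context shrinks the search space and how matches are appended to a caller-supplied results collection.

```javascript
// Plain objects stand in for DOM nodes; this is not Sizzle's code.
function node(tag, id, children = []) {
  return { tag, id, children };
}

// Collect every descendant of `context` whose tag equals `tag`,
// in document order, appending to `results` (like Sizzle's third argument).
function findByTag(tag, context, results = []) {
  for (const child of context.children) {
    if (child.tag === tag) results.push(child);
    findByTag(tag, child, results);
  }
  return results;
}

const parent = node("div", "parent", [node("div", null, [node("span", null)])]);
const outside = node("span", "outside");
const body = node("body", null, [parent, outside]);

const fromBody = findByTag("span", body);     // searches the whole tree
const fromParent = findByTag("span", parent); // searches only inside #parent
console.log(fromBody.length, fromParent.length);
```

With the wider context both span nodes are found; with #parent as the context only the nested one is.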

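How Sizzle Works

Mainstream browsers have begun to provide built-in support for CSS selectors in their newer versions. The W3C document defines the following interfaces: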

	partial interface Document {
		Element? querySelector (DOMString selectors);
		NodeList querySelectorAll (DOMString selectors);
	}
	partial interface DocumentFragment {
		Element? querySelector (DOMString selectors);
		NodeList querySelectorAll (DOMString selectors);
	}
	partial interface Element {
		Element? querySelector (DOMString selectors);
		NodeList querySelectorAll (DOMString selectors);
	}

The querySelector() methods on the Document, DocumentFragment, and Element interfaces must return the first matching Element node within the subtrees of the context node. If there is no matching Element, the method must return null.

The querySelectorAll() methods on the Document, DocumentFragment, and Element interfaces must return a NodeList containing all of the matching Element nodes within the subtrees of the context node, in document order. If there are no matching nodes, the method must return an empty NodeList.

The NodeList object returned by the querySelectorAll() method must be static, not live ([DOM-LEVEL-3-CORE], section 1.1.1). Subsequent changes to the structure of the underlying document must not be reflected in the NodeList object. This means that the object will instead contain a list of matching Element nodes that were in the document at the time the list was created.

Support for these two methods in the mainstream desktop browsers (earliest supporting version):

Chrome	Firefox	Internet Explorer	Opera	Safari
1	3.5	8	10	3.2

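As the table shows, support for these methods is not uniform across the mainstream browsers. Firefox and Chrome release new versions quickly, but IE6 and IE7 still hold a certain market share in China. The CSS selector parsing that Sizzle provides runs even on very old browser versions, which I think is exactly where Sizzle's advantage lies. Sizzle runs on: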

IE6+
FF3.0+
Chrome 5+
Safari 3+
Opera 9+

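As users upgrade their browsers, Sizzle's performance improves as well, because it uses more and more of the browser's built-in functionality. This is what we often call "graceful degradation". So how does Sizzle achieve this?

Let's take querySelectorAll as the example again.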

if ( document.querySelectorAll ) {
	(function() {
		var oldSizzle = Sizzle,
			div = document.createElement("div"),
			id = "__sizzle__";

		// First check that querySelectorAll behaves the way we need
		div.innerHTML = "<p class='TEST'></p>";

		// Safari can't handle uppercase or unicode characters when
		// in quirks mode.
		if ( div.querySelectorAll && div.querySelectorAll(".TEST").length === 0 ) {
			return;
		}

		Sizzle = function( query, context, extra, seed ) {
			// Try the native fast paths first (elided here);
			// otherwise fall back to the original Sizzle function
			return oldSizzle( query, context, extra, seed );
		};

		// Copy the properties of the original Sizzle object onto the new one
		for ( var prop in oldSizzle ) {
			Sizzle[ prop ] = oldSizzle[ prop ];
		}
	})();
}
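The same feature-detection pattern can be sketched outside of jQuery. The helper makeSelectAll and the slowSelectAll callback below are hypothetical names used only for illustration; the mock roots are plain objects so that the snippet runs without a browser. It is a sketch of the idea (prefer the native engine, fall back when it is missing or rejects the selector), not Sizzle's actual implementation.

```javascript
// Hypothetical helper: prefer the native engine, fall back to a slower manual walk.
function makeSelectAll(root, slowSelectAll) {
  if (typeof root.querySelectorAll === "function") {
    return function (selector) {
      try {
        // Copy into a real array so both paths return the same type
        return Array.prototype.slice.call(root.querySelectorAll(selector));
      } catch (e) {
        // The native engine rejected the selector; use the fallback instead
        return slowSelectAll(root, selector);
      }
    };
  }
  // No native support at all: always use the fallback
  return function (selector) {
    return slowSelectAll(root, selector);
  };
}

// Mock roots (assumptions for the demo, not real DOM objects)
const fallback = (root, selector) => ["fallback:" + selector];
const modernRoot = { querySelectorAll: (s) => ["native:" + s] };
const legacyRoot = {};

console.log(makeSelectAll(modernRoot, fallback)("div"));
console.log(makeSelectAll(legacyRoot, fallback)("div"));
```

The decision is made once, when the function is created, so each later call pays only for the path that the environment actually supports.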

We can also take the use of compareDocumentPosition as another example.

First, let's look at how the W3C defines this function:

compareDocumentPosition (introduced in DOM Level 3)
Compares the reference node, i.e. the node on which this method is being called, with a node, i.e. the one passed as a parameter, with regard to their position in the document and according to the document order.

Parameters
other of type Node: The node to compare against the reference node.

Return Value
unsigned short: Returns how the node is positioned relatively to the reference node.

Exceptions
DOMException NOT_SUPPORTED_ERR: when the compared nodes are from different DOM implementations that do not coordinate to return consistent implementation-specific results.

Its return value is a bitmask built from the following flags:

DOCUMENT_POSITION_DISCONNECTED = 1;

DOCUMENT_POSITION_PRECEDING = 2;

DOCUMENT_POSITION_FOLLOWING = 4;

DOCUMENT_POSITION_CONTAINS = 8;

DOCUMENT_POSITION_CONTAINED_BY = 16;
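Because the return value is a bitmask, several flags can be set at once, and callers test individual bits with the & operator. The following sketch decodes a mask into flag names; POSITION_FLAGS and describePosition are illustrative names, not part of the DOM API.

```javascript
// Flag values as listed above
const POSITION_FLAGS = {
  DISCONNECTED: 1,
  PRECEDING: 2,
  FOLLOWING: 4,
  CONTAINS: 8,
  CONTAINED_BY: 16
};

// Return the names of all flags that are set in `mask`
function describePosition(mask) {
  return Object.keys(POSITION_FLAGS).filter(
    (name) => (mask & POSITION_FLAGS[name]) !== 0
  );
}

// 20 = 16 + 4: the other node follows the reference node and is contained by it,
// which is what a parent sees when it is compared with one of its descendants.
console.log(describePosition(20));

// Testing a single bit, as a sort function would
const otherFollows = (20 & POSITION_FLAGS.FOLLOWING) !== 0;
console.log(otherFollows);
```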

Now look at the following code:

if ( document.documentElement.compareDocumentPosition ) {
	sortOrder = function( a, b ) {
		if ( a === b ) {
			hasDuplicate = true;
			return 0;
		}

		if ( !a.compareDocumentPosition || !b.compareDocumentPosition ) {
			return a.compareDocumentPosition ? -1 : 1;
		}

		return a.compareDocumentPosition(b)&4 ? -1 : 1;
	};

} else {
	sortOrder = function( a, b ) {
		// The nodes are identical, we can exit early
		if ( a === b ) {
			hasDuplicate = true;
			return 0;

		// Fallback to using sourceIndex (in IE) if it's available on both nodes
		} else if ( a.sourceIndex && b.sourceIndex ) {
			return a.sourceIndex - b.sourceIndex;
		}

		var al, bl,
			ap = [],
			bp = [],
			aup = a.parentNode,
			bup = b.parentNode,
			cur = aup;

		// If the nodes are siblings (or identical) we can do a quick check
		if ( aup === bup ) {
			return siblingCheck( a, b );

		// If no parents were found then the nodes are disconnected
		} else if ( !aup ) {
			return -1;

		} else if ( !bup ) {
			return 1;
		}

		// Otherwise they're somewhere else in the tree so we need
		// to build up a full list of the parentNodes for comparison
		while ( cur ) {
			ap.unshift( cur );
			cur = cur.parentNode;
		}

		cur = bup;

		while ( cur ) {
			bp.unshift( cur );
			cur = cur.parentNode;
		}

		al = ap.length;
		bl = bp.length;

		// Start walking down the tree looking for a discrepancy
		for ( var i = 0; i < al && i < bl; i++ ) {
			if ( ap[i] !== bp[i] ) {
				return siblingCheck( ap[i], bp[i] );
			}
		}

		// We ended someplace up the tree so do a sibling check
		return i === al ?
			siblingCheck( a, bp[i], -1 ) :
			siblingCheck( ap[i], b, 1 );
	};
}
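The fallback branch above can be hard to follow, so here is a simplified sketch of the same idea on a mock tree: build the ancestor path of each node, walk both paths from the root until they diverge, and compare the two diverging siblings. The helpers el, pathTo and documentOrder are hypothetical; unlike Sizzle, the sketch compares sibling indexes in a children array instead of walking nextSibling.

```javascript
// Build a mock node and wire up parentNode on its children
function el(name, children = []) {
  const n = { name, children, parentNode: null };
  for (const c of children) c.parentNode = n;
  return n;
}

// Path from the root down to the node itself
function pathTo(n) {
  const path = [];
  for (let cur = n; cur; cur = cur.parentNode) path.unshift(cur);
  return path;
}

// Negative when a comes first in document order, positive when b does
function documentOrder(a, b) {
  if (a === b) return 0;
  const ap = pathTo(a);
  const bp = pathTo(b);
  let i = 0;
  while (i < ap.length && i < bp.length && ap[i] === bp[i]) i++;
  if (i === 0) return -1;         // different trees: order is arbitrary
  if (i === ap.length) return -1; // a is an ancestor of b, so a comes first
  if (i === bp.length) return 1;  // b is an ancestor of a
  // ap[i] and bp[i] share a parent: compare their positions among the siblings
  const siblings = ap[i].parentNode.children;
  return siblings.indexOf(ap[i]) - siblings.indexOf(bp[i]);
}

const span = el("span");
const inner = el("inner", [span]);
const aside = el("aside");
const root = el("root", [inner, aside]);

const sorted = [aside, span, root, inner].sort(documentOrder).map((n) => n.name);
console.log(sorted);
```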

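In addition, the Selectors API part of the W3C documents is still being updated, and browsers will provide more and more selector functionality in the future. For example: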

	partial interface Element {
		boolean matches(DOMString selectors, optional (Element or sequence<Node>)? refNodes);
	}

The matches() method on the Element interface must return true if the context object is a matching Element node. Otherwise, the method must return false.
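Before matches() was standardized, browsers shipped it under vendor-prefixed names, so code that wants to use it typically probes for whichever name exists. The sketch below shows that probing on mock objects; findMatchesName and elementMatches are hypothetical helper names, and the fallback callback stands in for a script-based matcher.

```javascript
// Standard name first, then names that browsers shipped before standardization
const MATCHES_NAMES = [
  "matches",
  "matchesSelector",
  "webkitMatchesSelector",
  "mozMatchesSelector",
  "msMatchesSelector"
];

// Return the first method name the element supports, or null
function findMatchesName(element) {
  return MATCHES_NAMES.find((name) => typeof element[name] === "function") || null;
}

// Use the native method when available, otherwise the supplied fallback
function elementMatches(element, selector, fallback) {
  const name = findMatchesName(element);
  return name ? element[name](selector) : fallback(element, selector);
}

// Mock elements (assumptions for the demo, not real DOM nodes)
const prefixed = { webkitMatchesSelector: (s) => s === "span.warning" };
const legacy = {};
const manual = (element, selector) => false;

console.log(findMatchesName(prefixed), elementMatches(prefixed, "span.warning", manual));
console.log(findMatchesName(legacy), elementMatches(legacy, "span.warning", manual));
```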

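How does Sizzle improve the parsing performance of common selectors?

Before getting into this topic, we first need to look at how Sizzle handles complex selectors. See the code: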

var Sizzle = function( selector, context, results, seed ) {
	results = results || [];
	context = context || document;

	var origContext = context;

	if ( context.nodeType !== 1 && context.nodeType !== 9 ) {
		return [];
	}
	
	if ( !selector || typeof selector !== "string" ) {
		return results;
	}

	var m, set, checkSet, extra, ret, cur, pop, i,
		prune = true,
		contextXML = Sizzle.isXML( context ),
		parts = [],
		soFar = selector;
	
	// Reset the position of the chunker regexp (start from head)
	do {
		chunker.exec( "" );
		m = chunker.exec( soFar );

		if ( m ) {
			soFar = m[3];
		
			parts.push( m[1] );
		
			if ( m[2] ) {
				extra = m[3];
				break;
			}
		}
	} while ( m );

	if ( parts.length > 1 && origPOS.exec( selector ) ) {

		if ( parts.length === 2 && Expr.relative[ parts[0] ] ) {
			set = posProcess( parts[0] + parts[1], context, seed );

		} else {
			set = Expr.relative[ parts[0] ] ?
				[ context ] :
				Sizzle( parts.shift(), context );

			while ( parts.length ) {
				selector = parts.shift();

				if ( Expr.relative[ selector ] ) {
					selector += parts.shift();
				}
				
				set = posProcess( selector, set, seed );
			}
		}

	} else {
		// Take a shortcut and set the context if the root selector is an ID
		// (but not if it'll be faster if the inner selector is an ID)
		if ( !seed && parts.length > 1 && context.nodeType === 9 && !contextXML &&
				Expr.match.ID.test(parts[0]) && !Expr.match.ID.test(parts[parts.length - 1]) ) {

			ret = Sizzle.find( parts.shift(), context, contextXML );
			context = ret.expr ?
				Sizzle.filter( ret.expr, ret.set )[0] :
				ret.set[0];
		}

		if ( context ) {
			ret = seed ?
				{ expr: parts.pop(), set: makeArray(seed) } :
				Sizzle.find( parts.pop(), parts.length === 1 && (parts[0] === "~" || parts[0] === "+") 
				&& context.parentNode ? context.parentNode : context, contextXML );

			set = ret.expr ?
				Sizzle.filter( ret.expr, ret.set ) :
				ret.set;

			if ( parts.length > 0 ) {
				checkSet = makeArray( set );

			} else {
				prune = false;
			}

			while ( parts.length ) {
				cur = parts.pop();
				pop = cur;

				if ( !Expr.relative[ cur ] ) {
					cur = "";
				} else {
					pop = parts.pop();
				}

				if ( pop == null ) {
					pop = context;
				}

				Expr.relative[ cur ]( checkSet, pop, contextXML );
			}

		} else {
			checkSet = parts = [];
		}
	}

	if ( !checkSet ) {
		checkSet = set;
	}

	if ( !checkSet ) {
		Sizzle.error( cur || selector );
	}

	if ( toString.call(checkSet) === "[object Array]" ) {
		if ( !prune ) {
			results.push.apply( results, checkSet );

		} else if ( context && context.nodeType === 1 ) {
			for ( i = 0; checkSet[i] != null; i++ ) {
				if ( checkSet[i] && (checkSet[i] === true || checkSet[i].nodeType === 1 && 
				Sizzle.contains(context, checkSet[i])) ) {
					results.push( set[i] );
				}
			}

		} else {
			for ( i = 0; checkSet[i] != null; i++ ) {
				if ( checkSet[i] && checkSet[i].nodeType === 1 ) {
					results.push( set[i] );
				}
			}
		}

	} else {
		makeArray( checkSet, results );
	}

	if ( extra ) {
		Sizzle( extra, origContext, results, seed );
		Sizzle.uniqueSort( results );
	}

	return results;
};
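The first step of the function above is to cut the selector into parts and to set aside everything after the first comma as extra. The sketch below imitates that step with a much simpler splitter. splitSelector is a hypothetical stand-in for the chunker regexp and assumes the input has no attribute selectors, pseudo-class arguments or quoted strings.

```javascript
// Simplified stand-in for Sizzle's chunker regexp (see the assumptions above)
function splitSelector(selector) {
  // Only the first comma-separated group is split; the rest is returned as `extra`
  const comma = selector.indexOf(",");
  const first = comma === -1 ? selector : selector.slice(0, comma);
  const extra = comma === -1 ? "" : selector.slice(comma + 1).trim();
  const parts = first
    .replace(/\s*([>+~])\s*/g, " $1 ") // put spaces around combinators
    .trim()
    .split(/\s+/);
  return { parts, extra };
}

const { parts, extra } = splitSelector("#parent > div span, p");
console.log(parts, extra);

// In the non-positional branch, matching starts from the last part
const keyPart = parts[parts.length - 1];
console.log(keyPart);
```

For this input the parts are "#parent", ">", "div" and "span", and the remaining group "p" is handled by a recursive call, after which the combined results are sorted and de-duplicated.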

After reading this code, we can see that its execution efficiency depends on the functions Sizzle.find and Sizzle.filter. Here is their implementation:

Sizzle.find = function( expr, context, isXML ) {
	var set, i, len, match, type, left;

	if ( !expr ) {
		return [];
	}

	for ( i = 0, len = Expr.order.length; i < len; i++ ) {
		type = Expr.order[i];
		
		if ( (match = Expr.leftMatch[ type ].exec( expr )) ) {
			left = match[1];
			match.splice( 1, 1 );

			if ( left.substr( left.length - 1 ) !== "\\" ) {
				match[1] = (match[1] || "").replace( rBackslash, "" );
				set = Expr.find[ type ]( match, context, isXML );

				if ( set != null ) {
					expr = expr.replace( Expr.match[ type ], "" );
					break;
				}
			}
		}
	}

	if ( !set ) {
		set = typeof context.getElementsByTagName !== "undefined" ?
			context.getElementsByTagName( "*" ) :
			[];
	}

	return { set: set, expr: expr };
};

Sizzle.filter = function( expr, set, inplace, not ) {
	var match, anyFound,
		type, found, item, filter, left,
		i, pass,
		old = expr,
		result = [],
		curLoop = set,
		isXMLFilter = set && set[0] && Sizzle.isXML( set[0] );

	while ( expr && set.length ) {
		for ( type in Expr.filter ) {
			if ( (match = Expr.leftMatch[ type ].exec( expr )) != null && match[2] ) {
				filter = Expr.filter[ type ];
				left = match[1];

				anyFound = false;

				match.splice(1,1);

				if ( left.substr( left.length - 1 ) === "\\" ) {
					continue;
				}

				if ( curLoop === result ) {
					result = [];
				}

				if ( Expr.preFilter[ type ] ) {
					match = Expr.preFilter[ type ]( match, curLoop, inplace, result, not, isXMLFilter );

					if ( !match ) {
						anyFound = found = true;

					} else if ( match === true ) {
						continue;
					}
				}

				if ( match ) {
					for ( i = 0; (item = curLoop[i]) != null; i++ ) {
						if ( item ) {
							found = filter( item, match, i, curLoop );
							pass = not ^ found;

							if ( inplace && found != null ) {
								if ( pass ) {
									anyFound = true;

								} else {
									curLoop[i] = false;
								}

							} else if ( pass ) {
								result.push( item );
								anyFound = true;
							}
						}
					}
				}

				if ( found !== undefined ) {
					if ( !inplace ) {
						curLoop = result;
					}

					expr = expr.replace( Expr.match[ type ], "" );

					if ( !anyFound ) {
						return [];
					}

					break;
				}
			}
		}

		// Improper expression
		if ( expr === old ) {
			if ( anyFound == null ) {
				Sizzle.error( expr );

			} else {
				break;
			}
		}

		old = expr;
	}

	return curLoop;
};
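The division of labor between the two functions is: find uses the cheapest applicable lookup to get a candidate set and strips the part of the expression it consumed, then filter narrows that set with whatever is left. A minimal sketch of this two-step strategy over plain records follows; the names elements, finders, find, filter and select are illustrative, and only ID, TAG and class conditions are handled.

```javascript
// Elements are plain records here; a real engine would query the DOM instead.
const elements = [
  { id: "nav", tag: "ul", classes: ["menu"] },
  { id: "", tag: "li", classes: ["item", "active"] },
  { id: "", tag: "li", classes: ["item"] },
  { id: "", tag: "a", classes: ["active"] }
];

// Tried in order, like Expr.order: the most selective lookup comes first
const finders = [
  { type: "ID", re: /#([\w-]+)/, find: (m, set) => set.filter((e) => e.id === m[1]) },
  { type: "TAG", re: /^([\w-]+)/, find: (m, set) => set.filter((e) => e.tag === m[1]) }
];

// Step 1: use the first finder that applies and strip what it consumed
function find(expr, set) {
  for (const f of finders) {
    const m = f.re.exec(expr);
    if (m) return { set: f.find(m, set), expr: expr.replace(f.re, "") };
  }
  return { set: set.slice(), expr }; // nothing matched: start from everything
}

// Step 2: narrow the candidates with the remaining class conditions
function filter(expr, set) {
  const classes = (expr.match(/\.[\w-]+/g) || []).map((c) => c.slice(1));
  return set.filter((e) => classes.every((c) => e.classes.includes(c)));
}

function select(expr) {
  const found = find(expr, elements);
  return found.expr ? filter(found.expr, found.set) : found.set;
}

console.log(select("li.active").length, select("#nav").length, select(".active").length);
```

A selector that starts with an ID or a tag shrinks the candidate set before any filtering happens, while a bare class selector has to filter every element in this sketch.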

Regular expressions are used to match some common selectors, which are then executed directly. See the code:

var Expr = Sizzle.selectors = {
	order: [ "ID", "NAME", "TAG" ],

	match: {
		ID: /#((?:[\w\u00c0-\uFFFF\-]|\\.)+)/,
		CLASS: /\.((?:[\w\u00c0-\uFFFF\-]|\\.)+)/,
		NAME: /\[name=['"]*((?:[\w\u00c0-\uFFFF\-]|\\.)+)['"]*\]/,
		ATTR: /\[\s*((?:[\w\u00c0-\uFFFF\-]|\\.)+)\s*(?:(\S?=)\s*(?:(['"])(.*?)\3|(#?(?:[\w\u00c0-\uFFFF\-]|\\.)*)|)|)\s*\]/,
		TAG: /^((?:[\w\u00c0-\uFFFF\*\-]|\\.)+)/,
		CHILD: /:(only|nth|last|first)-child(?:\(\s*(even|odd|(?:[+\-]?\d+|(?:[+\-]?\d*)?n\s*(?:[+\-]\s*\d+)?))\s*\))?/,
		POS: /:(nth|eq|gt|lt|first|last|even|odd)(?:\((\d*)\))?(?=[^\-]|$)/,
		PSEUDO: /:((?:[\w\u00c0-\uFFFF\-]|\\.)+)(?:\((['"]?)((?:\([^\)]+\)|[^\(\)]*)+)\2\))?/
	}
	// This object has many other properties, mainly the callback functions
	// needed while processing the various kinds of CSS selectors
};
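To see what these patterns capture, the following sketch runs three of them (copied from the listing above) against a sample selector. The classify helper and the sample strings are illustrative only.

```javascript
// Three of the patterns from Expr.match above
const match = {
  ID: /#((?:[\w\u00c0-\uFFFF\-]|\\.)+)/,
  CLASS: /\.((?:[\w\u00c0-\uFFFF\-]|\\.)+)/,
  TAG: /^((?:[\w\u00c0-\uFFFF\*\-]|\\.)+)/
};

// Report which selector types occur in `selector` and what each pattern captured
function classify(selector) {
  const found = {};
  for (const type of Object.keys(match)) {
    const m = match[type].exec(selector);
    if (m) found[type] = m[1];
  }
  return found;
}

console.log(classify("div#main.box"));
console.log(classify("#only"));
```

Note that TAG is anchored to the start of the string, so it only fires when the expression begins with a tag name, whereas ID and CLASS can match anywhere.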

Sizzle's Extensibility

This feature is useful when we write custom widgets.
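One common form of extension is a custom pseudo-selector. The sketch below imitates the mechanism with a stand-in registry so that it runs without jQuery: filters plays the role that jQuery.expr[":"] plays in jQuery 1.7, and the callback signature (elem, index, match, set) with the argument text in match[3] should be treated as an assumption to verify against the jQuery source. The :hasData pseudo and the applyPseudo driver are hypothetical.

```javascript
// Stand-in for the pseudo-selector registry (an assumption, see the text above)
const filters = {};

// Register a custom :hasData(key) pseudo for a hypothetical widget
filters.hasData = function (elem, index, match) {
  // match[3] carries the text between the parentheses
  return Object.prototype.hasOwnProperty.call(elem.data || {}, match[3]);
};

// Minimal driver: parse ":name(arg)" and run the registered filter over a set
function applyPseudo(expr, set) {
  const match = /:([\w-]+)(?:\((['"]?)(.*?)\2\))?/.exec(expr);
  if (!match || !filters[match[1]]) {
    throw new Error("Unsupported pseudo: " + expr);
  }
  return set.filter((elem, i) => filters[match[1]](elem, i, match, set));
}

// Mock elements carrying widget data
const widgets = [{ data: { widget: 1 } }, { data: {} }, {}];
console.log(applyPseudo(":hasData(widget)", widgets).length);
```

The engine itself stays unchanged; supporting a new pseudo only requires adding one function to the registry.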

Categories of CSS Selectors

ID
CLASS
NAME
TAG
ATTR
CHILD
POS
PSEUDO
Universal

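How do we write efficient CSS selectors?

There are many valuable articles on this. In Chinese there is, for example, "CSS选择器的优化" (Optimizing CSS Selectors), which also links to many good articles in other languages. Note that there are many CSS optimization techniques, and CSS selectors are only one of the strategies; others include sprites, minification, and so on. I will just give a brief summary here: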

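When a browser parses a CSS selector, it works from right to left. For ul li a span, for example, the browser first finds span, then matches a, and then li and ul in turn. The rightmost selector is therefore called the key selector, and it directly affects matching efficiency. So a basic way to improve matching performance is to make the key selector as specific as possible.
Do not use the universal selector
Make rules as specific as possible
Avoid unnecessary qualifiers: do not put a tag name before an ID or a class name after it, and do not put a tag name before a class name either
Avoid descendant selectors. For example, suppose we use the styles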
```
ul li {color: blue;}
ol li {color: red;}
```
to set the foreground color of different list items. This can be rewritten as
```
.unordered-list-item {color: blue;}
.ordered-list-item {color: red;}
```
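If a descendant selector really is necessary, we can at least use the child selector >, so that only one level of elements needs to be matched instead of walking all the way up through the element's ancestors.
Do not use the :hover pseudo-class on non-link elements. This can cause performance problems in IE7 or IE8; the onmouseover event can be used instead.
The ultimate answer, from David Hyatt: if you really care about page performance, don't use CSS! ;-)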
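The right-to-left matching described in the first point can be sketched on a mock tree. makeNode and matchesChain are hypothetical helpers that handle only tag names joined by the descendant combinator; the point is that a failing key selector rejects an element immediately, while a passing one triggers a walk up the ancestors.

```javascript
function makeNode(tag, parent = null) {
  return { tag, parentNode: parent };
}

// Match a descendant chain such as ["ul", "li", "a", "span"] against `elem`,
// starting from the rightmost (key) selector.
function matchesChain(elem, chain) {
  let i = chain.length - 1;
  if (elem.tag !== chain[i]) return false; // key selector fails: stop immediately
  i--;
  // Walk up the ancestors, consuming the chain from right to left
  for (let cur = elem.parentNode; cur && i >= 0; cur = cur.parentNode) {
    if (cur.tag === chain[i]) i--;
  }
  return i < 0; // true only if every part of the chain was found
}

// ul > li > a > span, plus a stray span directly inside a div
const ul = makeNode("ul");
const li = makeNode("li", ul);
const a = makeNode("a", li);
const spanInList = makeNode("span", a);
const strayParent = makeNode("div");
const straySpan = makeNode("span", strayParent);

const chain = ["ul", "li", "a", "span"];
console.log(matchesChain(spanInList, chain), matchesChain(straySpan, chain), matchesChain(a, chain));
```

Every span on the page passes the key selector here and pays for the ancestor walk, which is why a more specific key selector, such as a class, reduces the work.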

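What happens when several CSS selectors match the same element?

In the development of a large website there are usually many CSS files, quite possibly maintained in pieces by several people. In that situation CSS selector conflicts arise easily; that is, the same element can be matched by several CSS selectors. For example, take the following HTML structure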

<div>
<span class="warning">Server is busy, please do it later again!</span>
</div>

with the following CSS styles applied:

.warning { background-color:#E0F04D; color: black; }
span { background-color:#e0e0e0; color:#643B57; }

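How will the browser decide the final background-color and color values of the span?

Several rules apply:

Each of the CSS selector categories mentioned above has a different priority; we can use weight values to picture the difference more concretely.
If two matching rules have the same total weight, the load order decides: the rule loaded last overrides earlier rules of the same weight.
Declarations marked !important take precedence over normal declarations.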
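The first two rules can be sketched as a small weight calculation. The functions specificity, compareSpecificity and winner are illustrative; the sketch counts only IDs, classes and tag names, ignores attribute selectors, pseudo-classes and !important, and assumes that every rule passed in already matches the element.

```javascript
// Weight as [ids, classes, tags]; compared from left to right
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const rest = selector.replace(/[#.][\w-]+/g, " ");
  const tags = (rest.match(/[a-zA-Z][\w-]*/g) || []).length;
  return [ids, classes, tags];
}

function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Pick the winning declaration for one property among rules that all match the element
function winner(rules, property) {
  let best = null;
  for (const rule of rules) {
    if (!(property in rule.declarations)) continue;
    const spec = specificity(rule.selector);
    // ">=" lets a later rule override an earlier one of equal weight
    if (!best || compareSpecificity(spec, best.spec) >= 0) best = { rule, spec };
  }
  return best ? best.rule.declarations[property] : undefined;
}

// The two rules from the example above, in source order
const rules = [
  { selector: ".warning", declarations: { "background-color": "#E0F04D", color: "black" } },
  { selector: "span", declarations: { "background-color": "#e0e0e0", color: "#643B57" } }
];

console.log(winner(rules, "background-color"), winner(rules, "color"));
```

Under these assumptions the class selector outweighs the tag selector, so the span takes its values from .warning even though the span rule comes later.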

References

https://github.com/jquery/sizzle/wiki/Sizzle-Documentation
http://www.w3cplus.com/css/css-selector-performance
https://developers.google.com/speed/docs/best-practices/rendering?hl=zh-CN
http://stevesouders.com/efws/css-selectors/csscreate.php
https://developer.mozilla.org/en-US/docs/CSS/Writing_Efficient_CSS?redirectlocale=en-US&redirectslug=Writing_Efficient_CSS
http://www.noupe.com/css/15-effective-tips-and-tricks-from-the-masters-of-css.html
http://css-tricks.com/efficiently-rendering-css