Translated from: How to sort strings in JavaScript
I have a list of objects I wish to sort based on a field attr of type string. I tried using -
list.sort(function (a, b) {
return a.attr - b.attr
})
but found that - doesn't appear to work with strings in JavaScript. How can I sort a list of objects based on an attribute of type string?
Reply #1
Reference: https://stackoom.com/question/DJF/如何在JavaScript中对字符串排序
Reply #2
This had bothered me for a long time, so I finally researched it. Here is the long-winded reason why things are the way they are.
Section 11.9.4 The Strict Equals Operator ( === )
The production EqualityExpression : EqualityExpression === RelationalExpression
is evaluated as follows:
- Let lref be the result of evaluating EqualityExpression.
- Let lval be GetValue(lref).
- Let rref be the result of evaluating RelationalExpression.
- Let rval be GetValue(rref).
- Return the result of performing the strict equality comparison
rval === lval. (See 11.9.6)
So now we go to 11.9.6:
11.9.6 The Strict Equality Comparison Algorithm
The comparison x === y, where x and y are values, produces true or false.
Such a comparison is performed as follows:
- If Type(x) is different from Type(y), return false.
- If Type(x) is Undefined, return true.
- If Type(x) is Null, return true.
- If Type(x) is Number, then
...
- If Type(x) is String, then return true if x and y are exactly the
same sequence of characters (same length and same characters in
corresponding positions); otherwise, return false.
That's it. The triple equals operator applied to strings returns true iff the arguments are exactly the same strings (same length and same characters in corresponding positions).
So === will work in the cases when we're trying to compare strings which might have arrived from different sources, but which we know will eventually have the same values - a common enough scenario for inline strings in our code. For example, if we have a variable named connection_state, and we wish to know which one of the states ['connecting', 'connected', 'disconnecting', 'disconnected'] it is in right now, we can directly use ===.
But there's more. Just above 11.9.4, there is a short note:
NOTE 4
Comparison of Strings uses a simple equality test on sequences of code
unit values. There is no attempt to use the more complex, semantically oriented
definitions of character or string equality and collating order defined in the
Unicode specification. Therefore Strings values that are canonically equal
according to the Unicode standard could test as unequal. In effect this
algorithm assumes that both Strings are already in normalized form.
Hmm. What now? Externally obtained strings can, and most likely will, contain weird Unicode, and our gentle === won't do them justice. In comes localeCompare to the rescue:
15.5.4.9 String.prototype.localeCompare (that)
...
The actual return values are implementation-defined to permit implementers
to encode additional information in the value, but the function is required
to define a total ordering on all Strings and to return 0 when comparing
Strings that are considered canonically equivalent by the Unicode standard.
We can go home now.
tl;dr
To compare strings in JavaScript, use localeCompare; if you know that the strings have no non-ASCII components because they are, for example, internal program constants, then === also works.
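A minimal sketch of the difference described above, using two spellings of "é" that Unicode treats as canonically equivalent. The variable names are mine, and the normalize() call is an addition of my own (it is not part of the spec text quoted above); the localeCompare result depends on the engine's locale data, so treat it as expected rather than guaranteed.

```javascript
// Two spellings of "é" that are canonically equivalent in Unicode.
const composed = '\u00e9';    // single code point U+00E9
const decomposed = 'e\u0301'; // 'e' followed by combining acute accent U+0301

// === compares sequences of code units, so the two spellings differ.
const strictlyEqual = composed === decomposed;

// localeCompare is specified to return 0 for canonically equivalent strings;
// the actual value depends on the engine's locale data.
const localeResult = composed.localeCompare(decomposed);

// Assumption: String.prototype.normalize (ES2015) is available.
// Normalizing both sides first makes === usable again.
const equalAfterNormalizing =
  composed.normalize('NFC') === decomposed.normalize('NFC');

console.log(strictlyEqual, localeResult, equalAfterNormalizing);
```

If the strings are known to be plain ASCII constants, none of this matters and === is enough.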
Reply #3
In your initial question, you are performing the following operation:
item1.attr - item2.attr
So, assuming those are numbers stored as strings (i.e. item1.attr = "1", item2.attr = "2"), you can still use the "===" operator (or other strict comparisons) provided that you ensure the type. The following should work:
return parseInt(item1.attr, 10) - parseInt(item2.attr, 10);
If they are alphanumeric, then do use localeCompare().
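To make that concrete, here is a small sketch with made-up sample data. It shows why subtraction fails on ordinary text (the result is NaN) and how parsing numeric strings first gives a usable comparator. The items array and variable names are hypothetical.

```javascript
// Hypothetical sample data: numeric values stored as strings.
const items = [{ attr: '10' }, { attr: '9' }, { attr: '1' }];

// Subtracting non-numeric strings yields NaN, which is why the comparator
// in the question cannot order ordinary text.
const textDifference = 'b' - 'a';

// For numeric strings, parse explicitly (radix 10) and subtract.
const byNumber = items
  .slice() // copy so the original array is left untouched
  .sort((item1, item2) => parseInt(item1.attr, 10) - parseInt(item2.attr, 10))
  .map((item) => item.attr);

console.log(Number.isNaN(textDifference), byNumber); // true [ '1', '9', '10' ]
```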
Reply #4
An updated answer (October 2014)
I was really annoyed about this string natural sorting order, so I took quite some time to investigate the issue. I hope this helps.
Long story short
localeCompare() character support is badass, just use it. As pointed out by Shog9, the answer to your question is:
return item1.attr.localeCompare(item2.attr);
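Dropped into a complete sort call, that one-liner might look like the sketch below. The sample list is made up; attr is the string field from the question.

```javascript
// Hypothetical sample list; `attr` is the string field from the question.
const list = [{ attr: 'cherry' }, { attr: 'apple' }, { attr: 'banana' }];

// localeCompare returns a negative, zero, or positive number, which is
// what Array.prototype.sort expects from a comparator.
list.sort((item1, item2) => item1.attr.localeCompare(item2.attr));

const sortedAttrs = list.map((item) => item.attr);
console.log(sortedAttrs); // [ 'apple', 'banana', 'cherry' ]
```

Note that sort() reorders the array in place; copy it first with slice() if the original order is still needed.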
Bugs found in all the custom JavaScript "natural string sort order" implementations
There are quite a few custom implementations out there that try to do string comparison in what is more precisely called "natural string sort order".
When "playing" with these implementations, I always noticed some strange "natural sorting order" choices, or rather mistakes (or omissions in the best cases).
Typically, special characters (space, dash, ampersand, brackets, and so on) are not processed correctly.
You will then find them appearing mixed up in different places, typically:
- some will be between the uppercase 'Z' and the lowercase 'a'
- some will be between the '9' and the uppercase 'A'
- some will be after lowercase 'z'
One would have expected the special characters to all be "grouped" together in one place, except perhaps for the space character (which would always be the first character). That is, either all before numbers, or all between numbers and letters (lowercase and uppercase being "together", one after another), or all after letters.
My conclusion is that they all fail to provide a consistent order once I start adding barely unusual characters (i.e. characters with diacritics, or characters such as dash, exclamation mark and so on).
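The three positions listed above match where punctuation sits in plain code unit order, so my guess (an assumption, I have not checked each library's source) is that these implementations fall back to code unit comparison for such characters. A quick sketch with a few sample characters of my choosing:

```javascript
// Sample characters: a space, a digit, punctuation, and letters in both cases.
const chars = ['a', 'Z', '_', '9', '~', ' ', 'A', '@'];

// With no comparator, sort() orders strings by UTF-16 code unit value.
const byCodeUnit = chars.slice().sort();

console.log(byCodeUnit);
// [ ' ', '9', '@', 'A', 'Z', '_', 'a', '~' ]
```

Here '@' lands between '9' and 'A', '_' between 'Z' and 'a', and '~' after the lowercase letters, which is the scattered placement described above.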
Research on the custom implementations:
- Natural Compare Lite https://github.com/litejs/natural-compare-lite : fails at sorting consistently, see https://github.com/litejs/natural-compare-lite/issues/1 and http://jsbin.com/bevututodavi/1/edit?js,console ; basic Latin characters sorting: http://jsbin.com/bevututodavi/5/edit?js,console
- Natural Sort https://github.com/javve/natural-sort : fails at sorting consistently, see issue https://github.com/javve/natural-sort/issues/7 and see basic Latin characters sorting: http://jsbin.com/cipimosedoqe/3/edit?js,console
- Javascript Natural Sort https://github.com/overset/javascript-natural-sort : seems rather neglected since February 2012; fails at sorting consistently, see issue https://github.com/overset/javascript-natural-sort/issues/16
- Alphanum http://www.davekoelle.com/files/alphanum.js : fails at sorting consistently, see http://jsbin.com/tuminoxifuyo/1/edit?js,console
Browsers' native "natural string sort order" implementations via localeCompare()
The oldest localeCompare() implementation (without the locales and options arguments) is supported by IE6+, see http://msdn.microsoft.com/en-us/library/ie/s4esdbwz(v=vs.94).aspx (scroll down to the localeCompare() method). The built-in localeCompare() method does a much better job at sorting, even with international and special characters. The only problem with using the localeCompare() method is that "the locale and sort order used are entirely implementation dependent". In other words, when using localeCompare such as stringOne.localeCompare(stringTwo), Firefox, Safari, Chrome and IE each have a different sort order for strings.
Research on the browser-native implementations:
- http://jsbin.com/beboroyifomu/1/edit?js,console - basic Latin characters comparison with localeCompare()
- http://jsbin.com/viyucavudela/2/ - basic Latin characters comparison with localeCompare(), for testing on IE8
- http://jsbin.com/beboroyifomu/2/edit?js,console - basic Latin characters in string comparison: consistency check of a character inside a string vs. when the character is alone
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare - IE11+ supports the new locales and options arguments
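For engines that do support the locales and options arguments mentioned in the last item, a rough sketch of how they can be used for number-aware ordering follows. The sample names and the choice of the 'en' locale are my own assumptions; on older engines without these arguments the extra parameters are ignored or rejected.

```javascript
// Hypothetical file-like names mixing letters and numbers.
const names = ['item10', 'item2', 'item1'];

// Assumption: the engine supports the `locales` and `options` arguments.
// `numeric: true` asks the collator to compare runs of digits as numbers,
// so "item2" comes before "item10".
const natural = names
  .slice() // keep the original array untouched
  .sort((a, b) => a.localeCompare(b, 'en', { numeric: true }));

console.log(natural); // [ 'item1', 'item2', 'item10' ]
```

Passing an explicit locale also sidesteps part of the cross-browser inconsistency, since the sort no longer depends on each browser's default locale.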
Difficulty of "string natural sorting order"
Implementing a solid algorithm (meaning: consistent but also covering a wide range of characters) is a very tough task. Unicode (of which UTF-8 is one encoding) defines well over 100,000 characters and covers more than 120 scripts (writing systems). There is a specification for this task, the "Unicode Collation Algorithm", which can be found at http://www.unicode.org/reports/tr10/ . You can find more information about this in this question I posted: https://softwareengineering.stackexchange.com/questions/257286/is-there-any-language-agnostic-specification-for-string-natural-sorting-order
Final conclusion
So, considering the current level of support provided by the JavaScript custom implementations I came across, we will probably never see anything getting anywhere close to supporting all these characters and scripts (languages). Hence I would rather use the browsers' native localeCompare() method. Yes, it does have the downside of being inconsistent across browsers, but basic testing shows it covers a much wider range of characters, allowing solid and meaningful sort orders.
So, as pointed out by Shog9, the answer to your question is:
return item1.attr.localeCompare(item2.attr);
Further reading:
- https://softwareengineering.stackexchange.com/questions/257286/is-there-any-language-agnostic-specification-for-string-natural-sorting-order
- How do you do string comparison in JavaScript?
- Javascript : natural sort of alphanumerical strings
- Sort Array of numeric & alphabetical elements (Natural Sort)
- Sort mixed alpha/numeric array
- https://web.archive.org/web/20130929122019/http://my.opera.com/GreyWyvern/blog/show.dml/1671288
- https://web.archive.org/web/20131005224909/http://www.davekoelle.com/alphanum.html
- http://snipplr.com/view/36012/javascript-natural-sort/
- http://blog.codinghorror.com/sorting-for-humans-natural-sort-order/
Thanks to Shog9's nice answer, which I believe put me in the "right" direction.
Reply #5
list.sort(function(item1, item2){
return +(item1.attr > item2.attr) || +(item1.attr === item2.attr) - 1;
})
Samples showing how it works:
+('aaa'>'bbb')||+('aaa'==='bbb')-1
+(false)||+(false)-1
0||0-1
-1
+('bbb'>'aaa')||+('bbb'==='aaa')-1
+(true)||+(false)-1
1||0-1
1
+('aaa'>'aaa')||+('aaa'==='aaa')-1
+(false)||+(true)-1
0||1-1
0
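The same comparator as a runnable sketch. The function name and sample data are mine. Because > and === compare code units, uppercase letters sort before lowercase ones here, unlike with localeCompare.

```javascript
// The comparator from this reply, given a name for reuse (the name is mine).
function compareAttr(item1, item2) {
  // || binds more loosely than -, so this reads as:
  //   +(a > b) || (+(a === b) - 1)
  return +(item1.attr > item2.attr) || +(item1.attr === item2.attr) - 1;
}

// Hypothetical sample data with mixed case.
const fruit = [{ attr: 'banana' }, { attr: 'Cherry' }, { attr: 'apple' }];

const sortedFruit = fruit
  .slice() // copy so the original array is left untouched
  .sort(compareAttr)
  .map((item) => item.attr);

console.log(sortedFruit); // [ 'Cherry', 'apple', 'banana' ]
```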
Reply #6
Answer (in Modern ECMAScript)
list.sort((a, b) => (a.attr > b.attr) - (a.attr < b.attr))
Or
list.sort((a, b) => +(a.attr > b.attr) || -(a.attr < b.attr))
Description
Casting a boolean value to a number yields the following:
- true -> 1
- false -> 0
Consider three possible patterns:
- x is larger than y: (x > y) - (x < y) -> 1 - 0 -> 1
- x is equal to y: (x > y) - (x < y) -> 0 - 0 -> 0
- x is smaller than y: (x > y) - (x < y) -> 0 - 1 -> -1
(Alternative)
- x is larger than y: +(x > y) || -(x < y) -> 1 || 0 -> 1
- x is equal to y: +(x > y) || -(x < y) -> 0 || 0 -> 0 (strictly -0, which sort treats the same as 0)
- x is smaller than y: +(x > y) || -(x < y) -> 0 || -1 -> -1
So this logic is equivalent to a typical sort comparator function:
if (x == y) {
return 0;
}
return x > y ? 1 : -1;
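A quick sanity check of that equivalence, as a sketch: the three comparators below (the names are mine) are run over one pair for each of the three patterns and their signs compared. Adding 0 after Math.sign turns the -0 that the || variant produces for equal inputs into a plain 0.

```javascript
// The two one-liners from this reply and the if/else version, with names of my own.
const byDifference = (x, y) => (x > y) - (x < y);
const byOr = (x, y) => +(x > y) || -(x < y);
const byBranch = (x, y) => {
  if (x == y) {
    return 0;
  }
  return x > y ? 1 : -1;
};

// One pair per pattern: larger, equal, smaller.
const pairs = [['b', 'a'], ['a', 'a'], ['a', 'b']];

// Sign of each comparator's result for each pair; "+ 0" normalizes -0 to 0.
const signs = pairs.map(([x, y]) =>
  [byDifference, byOr, byBranch].map((compare) => Math.sign(compare(x, y)) + 0)
);

// True when all three comparators agree on every pair.
const allAgree = signs.every((row) => row[0] === row[1] && row[1] === row[2]);

console.log(allAgree); // true
```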