How to sort strings in JavaScript

This article is translated from: How to sort strings in JavaScript

I have a list of objects that I wish to sort based on a field attr of type string. I tried using the - operator:

list.sort(function (a, b) {
    return a.attr - b.attr
})

but found that - doesn't appear to work with strings in JavaScript. How can I sort a list of objects based on an attribute of type string?
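
As a minimal sketch of why the subtraction comparator fails (the sample values below are made up for illustration): subtracting two non-numeric strings coerces both to numbers and yields NaN, which gives sort() no usable ordering.

```javascript
// Subtracting non-numeric strings converts each to NaN first.
const diff = "apple" - "banana";
console.log(Number.isNaN(diff)); // true

// Hypothetical sample data; attr is the field named in the question.
const list = [{ attr: "pear" }, { attr: "apple" }, { attr: "fig" }];

// The comparator returns NaN for every pair, so the result
// is not an alphabetical ordering.
const attempted = list.slice().sort(function (a, b) {
    return a.attr - b.attr;
});
console.log(attempted.map(function (o) { return o.attr; }));
```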


Reply #1

Reference: https://stackoom.com/question/DJF/如何在JavaScript中对字符串排序


Reply #2

I had been bothered by this for a long time, so I finally researched it, and here is the long-winded reason why things are the way they are.

From the spec:

Section 11.9.4   The Strict Equals Operator ( === )

The production EqualityExpression : EqualityExpression === RelationalExpression
is evaluated as follows: 
- Let lref be the result of evaluating EqualityExpression.
- Let lval be GetValue(lref).
- Let rref be the result of evaluating RelationalExpression.
- Let rval be GetValue(rref).
- Return the result of performing the strict equality comparison 
  rval === lval. (See 11.9.6)

So now we go to 11.9.6:

11.9.6   The Strict Equality Comparison Algorithm

The comparison x === y, where x and y are values, produces true or false. 
Such a comparison is performed as follows: 
- If Type(x) is different from Type(y), return false.
- If Type(x) is Undefined, return true.
- If Type(x) is Null, return true.
- If Type(x) is Number, then
...
- If Type(x) is String, then return true if x and y are exactly the 
  same sequence of characters (same length and same characters in 
  corresponding positions); otherwise, return false.

That's it. The triple equals operator applied to strings returns true iff the arguments are exactly the same strings (same length and same characters in corresponding positions).

So === will work when we compare strings that may have arrived from different sources but that we know will eventually have the same values, a common enough scenario for inline strings in our code. For example, if we have a variable named connection_state and we wish to know which of the states ['connecting', 'connected', 'disconnecting', 'disconnected'] it is in right now, we can use === directly.

But there's more. Just above 11.9.4, there is a short note:
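
A small sketch of that scenario, assuming connection_state is built at runtime rather than written as a literal (the value used here is invented for illustration):

```javascript
// connection_state is the variable named above; this value is made up
// and deliberately assembled at runtime instead of being a literal.
const connection_state = "connect" + "ed";

const states = ["connecting", "connected", "disconnecting", "disconnected"];

// === compares the character sequences, not where the strings came from.
if (connection_state === "connected") {
    console.log("ready");
}

// Array.prototype.includes compares strings the same way as === does.
const isKnown = states.includes(connection_state);
console.log(isKnown); // true
```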

NOTE 4     
  Comparison of Strings uses a simple equality test on sequences of code 
  unit values. There is no attempt to use the more complex, semantically oriented
  definitions of character or string equality and collating order defined in the 
  Unicode specification. Therefore Strings values that are canonically equal
  according to the Unicode standard could test as unequal. In effect this 
  algorithm assumes that both Strings are already in normalized form.

Hmm. What now? Externally obtained strings can, and most likely will, contain odd Unicode sequences, and our gentle === won't do them justice. localeCompare comes to the rescue:

15.5.4.9   String.prototype.localeCompare (that)
    ...
    The actual return values are implementation-defined to permit implementers 
    to encode additional information in the value, but the function is required 
    to define a total ordering on all Strings and to return 0 when comparing
    Strings that are considered canonically equivalent by the Unicode standard. 
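
To make the note concrete, here is a hedged sketch with two canonically equivalent spellings of "é". The variable names are mine, and the localeCompare result assumes an engine with standard Unicode collation support:

```javascript
const composed = "\u00e9";    // "é" as a single code point
const decomposed = "e\u0301"; // "e" followed by a combining acute accent

// Different code unit sequences, so strict equality fails.
console.log(composed === decomposed); // false

// localeCompare is required to treat canonically equivalent strings as equal;
// engines with Unicode collation support are expected to return 0 here.
console.log(composed.localeCompare(decomposed));

// Normalizing both sides first makes === usable again.
const sameAfterNormalize =
    composed.normalize("NFC") === decomposed.normalize("NFC");
console.log(sameAfterNormalize); // true
```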

We can go home now.

tl;dr

To compare strings in JavaScript, use localeCompare; if you know that the strings have no non-ASCII components because they are, for example, internal program constants, then === also works.


Reply #3

In your initial question, you are performing the following operation:

item1.attr - item2.attr

So, assuming those are numbers (i.e. item1.attr = "1", item2.attr = "2"), you may still use the "===" operator (or other strict evaluators) provided that you ensure the type. The following should work:

return parseInt(item1.attr, 10) - parseInt(item2.attr, 10);

If they are alphanumeric, then do use localeCompare().
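
A short sketch contrasting the two cases, using invented sample data. An explicit radix of 10 is passed to parseInt so the strings are always read as decimal:

```javascript
// Hypothetical items whose attr values are numeric strings.
const items = [{ attr: "10" }, { attr: "9" }, { attr: "2" }];

// Numeric comparison: parse first, then subtract.
const byNumber = items.slice().sort(function (item1, item2) {
    return parseInt(item1.attr, 10) - parseInt(item2.attr, 10);
});
console.log(byNumber.map(function (o) { return o.attr; })); // ["2", "9", "10"]

// Text comparison: "10" sorts before "2" because "1" precedes "2".
const byText = items.slice().sort(function (item1, item2) {
    return item1.attr.localeCompare(item2.attr);
});
console.log(byText.map(function (o) { return o.attr; })); // ["10", "2", "9"]
```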


Reply #4

An updated answer (October 2014)

I was really annoyed by this string natural sorting order, so I took quite some time to investigate the issue. I hope this helps.

Long story short

localeCompare() character support is badass, just use it. As pointed out by Shog9, the answer to your question is:

return item1.attr.localeCompare(item2.attr);
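
For context, a minimal sketch of that comparator inside a full sort() call. The list contents are invented, and the exact placement of accented or mixed-case entries depends on the runtime's default locale:

```javascript
// Hypothetical data; attr is the string field from the question.
const list = [
    { attr: "pear" },
    { attr: "Apple" },
    { attr: "éclair" },
    { attr: "banana" }
];

// sort() reorders the array in place using the locale-aware comparison.
list.sort(function (item1, item2) {
    return item1.attr.localeCompare(item2.attr);
});

const sortedAttrs = list.map(function (o) { return o.attr; });
console.log(sortedAttrs);
```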

Bugs found in all the custom JavaScript "natural string sort order" implementations

There are quite a few custom implementations out there that try to do string comparison in what is more precisely called "natural string sort order".

When "playing" with these implementations, I always noticed some strange "natural sorting order" choices, or rather mistakes (or omissions in the best cases).

Typically, special characters (space, dash, ampersand, brackets, and so on) are not processed correctly.

You will then find them mixed up in different places, typically:

  • some will be between the uppercase 'Z' and the lowercase 'a'
  • some will be between the '9' and the uppercase 'A'
  • some will be after lowercase 'z'

One would have expected special characters to all be "grouped" together in one place, except perhaps for the space character (which would always be the first character). That is, either all before numbers, or all between numbers and letters (lowercase and uppercase being "together", one after another), or all after letters.

My conclusion is that they all fail to provide a consistent order once I start adding even slightly unusual characters (i.e. characters with diacritics, or characters such as dash, exclamation mark and so on).
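
The scattering described above can be reproduced with the default sort(), which orders strings by UTF-16 code unit. This sketch is not one of the custom implementations I tested; it only shows how that kind of ordering spreads special characters around (the sample characters are my own choice):

```javascript
const samples = ["a", "Z", "_", "9", "A", "~", "-", " ", "z", "@"];

// With no comparator, sort() compares strings by UTF-16 code unit values.
const byCodeUnit = samples.slice().sort();

console.log(byCodeUnit);
// [" ", "-", "9", "@", "A", "Z", "_", "a", "z", "~"]
// "@" lands between "9" and "A", "_" between "Z" and "a", "~" after "z".
```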

Research on the custom implementations:

Browsers' native "natural string sort order" implementations via localeCompare()

The oldest implementation of localeCompare() (without the locales and options arguments) is supported by IE6+, see http://msdn.microsoft.com/en-us/library/ie/s4esdbwz(v=vs.94).aspx (scroll down to the localeCompare() method). The built-in localeCompare() method does a much better job at sorting, even with international and special characters. The only problem with using localeCompare() is that "the locale and sort order used are entirely implementation dependent". In other words, when calling stringOne.localeCompare(stringTwo), Firefox, Safari, Chrome and IE may each produce a different sort order for strings.

Research on the browser-native implementations:

Difficulty of "string natural sorting order"

Implementing a solid algorithm (meaning: consistent but also covering a wide range of characters) is a very tough task. Unicode (of which UTF-8 is one encoding) defines well over 100,000 characters and covers more than 120 scripts (languages). There is a specification for this task, called the "Unicode Collation Algorithm", which can be found at http://www.unicode.org/reports/tr10/ . You can find more information in this question I posted: https://softwareengineering.stackexchange.com/questions/257286/is-there-any-language-agnostic-specification-for-string-natural-sorting-order
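
Where the locales and options arguments are available, passing them explicitly is one way to reduce that implementation dependence. A hedged sketch: the "en" locale, the numeric option, and the file names are my assumptions for illustration, not something the original tests covered:

```javascript
// Hypothetical file names with embedded numbers.
const names = ["file10", "file2", "file1"];

// Pin the locale and ask for numeric collation, so that digit runs
// are compared by value rather than character by character.
const natural = names.slice().sort(function (a, b) {
    return a.localeCompare(b, "en", { numeric: true });
});

console.log(natural); // ["file1", "file2", "file10"]
```

Engines that only support the oldest form of localeCompare() ignore the extra arguments, so the result there may differ.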

Final conclusion

So considering the current level of support provided by the custom JavaScript implementations I came across, we will probably never see anything come close to supporting all these characters and scripts (languages). Hence I would rather use the browsers' native localeCompare() method. Yes, it does have the downside of being inconsistent across browsers, but basic testing shows it covers a much wider range of characters, allowing solid and meaningful sort orders.

So, as pointed out by Shog9, the answer to your question is:

return item1.attr.localeCompare(item2.attr);

Further reading:

Thanks to Shog9's nice answer, which put me in the "right" direction, I believe.


Reply #5

list.sort(function(item1, item2){
    return +(item1.attr > item2.attr) || +(item1.attr === item2.attr) - 1;
}) 

Samples showing how it works:

+('aaa'>'bbb')||+('aaa'==='bbb')-1
+(false)||+(false)-1
0||0-1
-1

+('bbb'>'aaa')||+('bbb'==='aaa')-1
+(true)||+(false)-1
1||0-1
1

+('aaa'>'aaa')||+('aaa'==='aaa')-1
+(false)||+(true)-1
0||1-1
0
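
A quick usage sketch of the same comparator on an invented list, to show the three cases above combining into a full sort:

```javascript
// Hypothetical data with a duplicate value to exercise the "equal" case.
const list = [{ attr: "bbb" }, { attr: "aaa" }, { attr: "ccc" }, { attr: "aaa" }];

list.sort(function (item1, item2) {
    // 1 when greater; otherwise 0 when equal, -1 when smaller.
    return +(item1.attr > item2.attr) || +(item1.attr === item2.attr) - 1;
});

const ordered = list.map(function (o) { return o.attr; });
console.log(ordered); // ["aaa", "aaa", "bbb", "ccc"]
```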

Reply #6

Answer (in Modern ECMAScript)

list.sort((a, b) => (a.attr > b.attr) - (a.attr < b.attr))

Or

list.sort((a, b) => +(a.attr > b.attr) || -(a.attr < b.attr))

Description

Casting a boolean value to a number yields the following:

  • true -> 1
  • false -> 0

Consider three possible patterns:

  • x is larger than y: (x > y) - (x < y) -> 1 - 0 -> 1
  • x is equal to y: (x > y) - (x < y) -> 0 - 0 -> 0
  • x is smaller than y: (x > y) - (x < y) -> 0 - 1 -> -1

(Alternative)

  • x is larger than y: +(x > y) || -(x < y) -> 1 || 0 -> 1
  • x is equal to y: +(x > y) || -(x < y) -> 0 || 0 -> 0
  • x is smaller than y: +(x > y) || -(x < y) -> 0 || -1 -> -1

So this logic is equivalent to a typical sort comparator function:

if (x == y) {
    return 0;
}
return x > y ? 1 : -1;
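
As a sanity check, the three forms can be compared side by side. The function names below are mine, chosen for illustration:

```javascript
// The two one-line comparators from this answer, written for plain values.
const cmpSubtract = (x, y) => (x > y) - (x < y);
const cmpOr = (x, y) => +(x > y) || -(x < y);

// The typical long-form comparator.
const cmpClassic = (x, y) => {
    if (x == y) {
        return 0;
    }
    return x > y ? 1 : -1;
};

// One pair for each pattern: smaller, larger, equal.
const pairs = [["aaa", "bbb"], ["bbb", "aaa"], ["aaa", "aaa"]];
const results = pairs.map(([x, y]) => [
    cmpSubtract(x, y),
    cmpOr(x, y),
    cmpClassic(x, y)
]);

// Each row should hold three values with the same sign. For equal inputs
// the || form yields -0, which sort() treats the same as 0.
console.log(results);
```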
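
Sorting the first letters of the words in a JavaScript string can be done in the following steps:

1. Split the string into an array of words with the string's `split()` method, using a space as the separator.
2. Use the array's `map()` method to take the first letter of each word with `charAt()` and convert it to uppercase with `toUpperCase()`.
3. Sort the array of first letters with the array's `sort()` method. By default, strings are sorted by their UTF-16 code unit values.
4. Use `map()` again to convert the sorted letters to lowercase, then use the array's `join()` method with a space as the separator to concatenate them into a single string.

Here is a simple JavaScript example:

```javascript
let str = "javascript 字符串 首字母排序";
let sortedStr = str.split(' ')                  // split the string into an array of words
    .map(word => word.charAt(0).toUpperCase())  // take each word's first letter, uppercased
    .sort()                                     // sort the first letters by code unit
    .map(letter => letter.toLowerCase())        // convert the sorted letters to lowercase
    .join(' ');                                 // join the letters with a space separator
console.log(sortedStr); // prints: "j 字 首"
```

The code above extracts the first letter of each word in the original string, sorts those letters, and returns a string containing the lowercase form of each letter in sorted order.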