JavaScript: Array Deduplication Summary Notes

These are my personal notes after reading an article on array deduplication; I tested every method myself.
Article links:
juejin.im/post/5b0284… github.com/mqyqingfeng…

The test code is as follows:

var arr = [];
// Generate 100000 random integers in the range [0, 100000]
for (let i = 0; i < 100000; i++) {
  arr.push(0 + Math.floor((100000 - 0 + 1) * Math.random()))
}

console.time('Elapsed');
arr.unique();
console.timeEnd('Elapsed');

Note: every method is run against the same not-yet-deduplicated array arr.
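A minimal sketch of a reusable timing helper, assuming a hypothetical function name `timeUnique` that is not part of the original test code. It hands each method a fresh copy of the input, so a method that sorts in place cannot influence the next measurement. The sample size and the Set-based callback are only illustrative.

```javascript
// Hypothetical helper: times one deduplication function on a copy of the input.
function timeUnique(label, uniqueFn, input) {
  const copy = input.slice();            // shallow copy, so in-place sorts do not leak out
  const start = Date.now();
  const result = uniqueFn(copy);
  const elapsedMs = Date.now() - start;
  console.log(`${label}: ${elapsedMs}ms, ${result.length} unique values`);
  return result;
}

// Illustrative input: 1000 random integers in [0, 100].
const sample = [];
for (let i = 0; i < 1000; i++) {
  sample.push(Math.floor(101 * Math.random()));
}

const deduped = timeUnique('Set', (values) => [...new Set(values)], sample);
```

Passing the method as a function also avoids redefining Array.prototype.unique between runs.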

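1. The brute-force approach: nested double loop

There is no algorithm a for loop can't solve; if one isn't enough, nest a few more o.o....

Variant 1: loop over the target array's full length at both levels, using a flag inside the loop to record whether a duplicate was found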

Array.prototype.unique = function(){
    let newArr = [];
    let arrLen = this.length;
    for(let i = 0; i < arrLen; i++){
        let bRepeat = true;
        for(let j = 0;j < arrLen; j++){
            if(this[i] === newArr[j]){
                bRepeat = false;
                break;
            }
        }
        
        if(bRepeat){
            newArr.push(this[i]);
        }
    }
    return newArr;
}

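Variant 2: loop over the target array in the outer loop and the new array in the inner loop, using a flag inside the loop to record whether a duplicate was found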

Array.prototype.unique = function(){
    let newArr = [];
    
    for(let i = 0, arrLen = this.length; i < arrLen; i++){
        let bRepeat = true;
        for(let j = 0, resLen = newArr.length; j < resLen; j++){
            if(this[i] === newArr[j]){
                bRepeat = false;
                break;
            }
        }
        
        if(bRepeat){
            newArr.push(this[i]);
        }
    }
    return newArr;
}

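Variant 3: loop over the target array and the new array without a flag; compare the inner loop counter j with the new array's length to decide whether a duplicate exists.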

Array.prototype.unique = function(){
    let newArr = [];
    
    for(let i = 0, arrLen = this.length; i < arrLen; i++){
        for(var j = 0, resLen = newArr.length; j < resLen; j++){
            if(this[i] === newArr[j]){
                break;
            }
        }
        
        // If j equals the new array's length, the inner loop ran to completion and found no duplicate of this element
        if(j === resLen){
            newArr.push(this[i]);
        }
    }
    return newArr;
}

Double-loop timings:

Variant 1: 28014.736083984375ms
Variant 2: 3643.562255859375ms
Variant 3: 2853.471923828125ms

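Pros: good compatibility, simple and easy to understand.
Further thought: cutting down the number of loop iterations and the temporary variables created can improve execution speed.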

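2. Using indexOf

ES5 added the indexOf() method to the array prototype. It returns the index of the first matching element, or -1 if the element is not present.

Method 1: check with indexOf inside a for loop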

Array.prototype.unique = function(){
    let newArr = [];
    
    for(let i = 0, arrLen = this.length; i < arrLen; i++){
        if(newArr.indexOf(this[i]) === -1){
            newArr.push(this[i]);    
        }
    }
    return newArr;
}

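Method 2: Array.prototype.filter() + Array.prototype.indexOf().

filter() tests every element with the given function and creates a new array containing all elements that pass. Here the filter compares each element's index with the value returned by indexOf(current element). Only the first occurrence of a value has an index equal to that result, so only first occurrences are kept.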

Array.prototype.unique = function(){
    let newArr = this.filter((item, index) => {
        return this.indexOf(item) === index;
    });
    return newArr;
}
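To make the index comparison concrete, here is a small worked example on a hypothetical array `letters` (the names are only for illustration). It prints, for each position, the index that indexOf reports, and then applies the same filter.

```javascript
const letters = ['a', 'b', 'a', 'c', 'b'];

// For each position, look up the index of that element's first occurrence.
const firstIndexes = letters.map((item) => letters.indexOf(item));
console.log(firstIndexes);    // [0, 1, 0, 3, 1]

// Positions 0, 1 and 3 match their own index, so those elements are kept.
const uniqueLetters = letters.filter((item, index) => letters.indexOf(item) === index);
console.log(uniqueLetters);   // ['a', 'b', 'c']
```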

Method 3: Array.prototype.forEach() + Array.prototype.indexOf()

Array.prototype.unique = function(){
    let newArr = [];
    this.forEach(item => {
        if(newArr.indexOf(item) === -1){
            newArr.push(item);
        }
    });
    return newArr;
}

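Method 4: for-of loop + indexOf

ES6 added the for-of loop, which is often more convenient than for-in or forEach: it iterates over values directly and supports break and continue.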

Array.prototype.unique = function(){
    let newArr = [];
    for(let item of this){
        if(newArr.indexOf(item) === -1){
            newArr.push(item);
        }
    }
    return newArr;
}

Test:

Method 1: 4826.351318359375ms
Method 2: 4831.322265625ms
Method 3: 4717.027099609375ms
Method 4: 4776.078857421875ms

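Further thought: the single-loop variants all perform about the same. For iteration, for-of is recommended.

3. Array.prototype.includes()

includes checks whether an element is present, returning true if it is found and false otherwise. Compared with indexOf, includes reads more naturally and can also detect NaN.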

 Array.prototype.unique = function(){
    let newArr = [];
    for(let item of this){
         if(!newArr.includes(item)){
            newArr.push(item);
        }
    }
    return newArr;
}

Test: 3700.76220703125ms
Conclusion: faster than indexOf in this test, and the code is more elegant.
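The NaN difference matters for deduplication. A minimal sketch, using two hypothetical standalone functions (`uniqueWithIndexOf` and `uniqueWithIncludes`, not from the original article) that differ only in the membership check:

```javascript
// Hypothetical helper: membership check via indexOf (strict equality).
function uniqueWithIndexOf(values) {
  const result = [];
  for (const item of values) {
    if (result.indexOf(item) === -1) {
      result.push(item);
    }
  }
  return result;
}

// Hypothetical helper: membership check via includes (treats NaN as equal to NaN).
function uniqueWithIncludes(values) {
  const result = [];
  for (const item of values) {
    if (!result.includes(item)) {
      result.push(item);
    }
  }
  return result;
}

const input = [1, NaN, 1, NaN];
const byIndexOf = uniqueWithIndexOf(input);    // [1, NaN, NaN]: NaN is never found, so it is pushed twice
const byIncludes = uniqueWithIncludes(input);  // [1, NaN]
```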

4. Array.prototype.sort()

Method 1: sort the array first, then check during iteration whether each element equals the next one

Array.prototype.unique = function(){
    let newArr = [];
    this.sort();
    for(let i = 0, arrLen = this.length; i < arrLen; i++){
        if(this[i] !== this[i+1]){
            newArr.push(this[i]);    
        }
    }
    return newArr;
}

Method 2: sort the array first, then check during iteration whether each element equals the last element of the new array

Array.prototype.unique = function(){
    let newArr = [];
    this.sort();
    for(let i = 0, arrLen = this.length; i < arrLen; i++){
        if(this[i] !== newArr[newArr.length - 1]){
            newArr.push(this[i]);    
        }
    }
    return newArr;
}

Test:

 100.182861328125ms
 89.683837890625ms

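Further thought:

The sort-based versions are many times faster than the earlier ones in these measurements. The most likely reason is that, after sorting, each element is compared only with a neighbor, so the per-element scan of the earlier methods disappears. The gain comes from doing less work, not from sorted arrays being inherently quicker to traverse. The same reasoning applies in other languages such as Java and Python. If a large array has to be searched repeatedly in practice, sorting it first is worth considering. Note that sort() without a comparator compares elements as strings and sorts in place, so methods 1 and 2 modify the original array.

Combining the analysis above, a further optimization: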

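Array.prototype.unique = function(){
    // concat() makes a copy so the original array is not modified (a shallow copy, so this only suits one-dimensional arrays)
    // The callback's third argument is the sorted copy; `this` is still the unsorted original
    return this.concat().sort().filter((item, index, sorted) =>{
        return item !== sorted[index+1];
    });
}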

Test:

 89.2060546875ms
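Two caveats of sort-based deduplication are easy to check: sort() without a comparator orders elements by their string form, and values that share a string form but differ in type are not merged. A small sketch, assuming a hypothetical helper `uniqueSortedNumbers` for purely numeric input; the mixed-type result relies on sort() being stable, which the language has guaranteed since ES2019.

```javascript
// Hypothetical helper: sort a copy numerically, then keep each element
// that differs from its right-hand neighbor.
function uniqueSortedNumbers(numbers) {
  return numbers
    .slice()
    .sort((a, b) => a - b)
    .filter((item, index, sorted) => item !== sorted[index + 1]);
}

// Default sort compares string forms: '1' < '10' < '9'.
const defaultOrder = [10, 9, 1, 10].sort();
console.log(defaultOrder);            // [1, 10, 10, 9]

const numericUnique = uniqueSortedNumbers([10, 9, 1, 10]);
console.log(numericUnique);           // [1, 9, 10]

// 1 and '1' compare as equal strings, so a stable sort leaves them interleaved
// and the neighbor check removes nothing.
const mixed = [1, '1', 1].slice().sort();
const mixedUnique = mixed.filter((item, index, sorted) => item !== sorted[index + 1]);
console.log(mixedUnique.length);      // 3
```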

5. reduce

The reduce() method takes a reducer function and applies it to each value in the array (from left to right), folding them into a single result.

Array.prototype.unique = function(){
    // One-dimensional arrays only
    return this.concat().sort().reduce((total, item) =>{
        if(!total.includes(item)){
            total.push(item);
        }
        return total;
    }, []);
}

Test:

4022.2578125ms

This one is much slower o.o....
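The slowdown most likely comes from includes, which rescans the accumulated array for every element even though the input is already sorted. A sketch of a reduce callback that only looks at the last element it kept, under the assumption that the array holds primitives of a single type and no NaN (the function name `uniqueByReduce` is hypothetical):

```javascript
// Hypothetical helper: after sorting, equal values are adjacent, so comparing
// with the last kept element is enough.
function uniqueByReduce(values) {
  return values.slice().sort().reduce((total, item) => {
    if (total.length === 0 || total[total.length - 1] !== item) {
      total.push(item);
    }
    return total;
  }, []);
}

const reduced = uniqueByReduce([3, 1, 3, 2, 1]);
console.log(reduced);   // [1, 2, 3]
```

I have not timed this variant, so treat any speed expectation as a guess until measured.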

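6. Object key-value pairs

The idea is to exploit the uniqueness of an object's keys.
There are some caveats, though:

Values that become identical after implicit conversion to a string cannot be told apart, such as a number and a string with the same literal form, for example 1 and '1';
Complex data types such as objects cannot be handled (an object used as a key becomes [object Object]);
Special values are a problem, such as the key '__proto__', because an object's __proto__ property cannot be overwritten like an ordinary key.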
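The three caveats can be reproduced in a few lines. A small sketch; the variable names are only for illustration:

```javascript
const seen = {};

// 1. Keys are converted to strings, so 1 and '1' land on the same key.
seen[1] = true;
const numberAndStringCollide = seen['1'] === true;

// 2. Every plain object converts to the same key string.
const objectKey = String({ name: 1 });
const objectsCollide = String({ a: 1 }) === String({ b: 2 });

// 3. Assigning a primitive to __proto__ is silently ignored...
const tmp = {};
tmp['__proto__'] = 1;
const protoUnchanged = tmp['__proto__'] === Object.prototype;
// ...so a check such as `if (!tmp[key])` treats '__proto__' as already seen.
const protoLooksSeen = Boolean(tmp['__proto__']);

console.log(numberAndStringCollide, objectKey, objectsCollide, protoUnchanged, protoLooksSeen);
// true '[object Object]' true true true
```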

6.1 Plain object

Implementation 1, which addresses the first and third caveats:

Array.prototype.unique = function () {
  const newArray = [];
  const tmp = {};
  for (let i = 0, arrLen = this.length; i < arrLen; i++) {
    if (!tmp[typeof this[i] + this[i]]) {
      tmp[typeof this[i] + this[i]] = 1;
      newArray.push(this[i]);
    }
  }
  return newArray;
}

Implementation 2, which addresses the second caveat:

Array.prototype.unique = function () {
  const newArray = [];
  const tmp = {};
  for (let i = 0, arrLen = this.length; i < arrLen; i++) {
    // Serialize the element with JSON.stringify()
    if (!tmp[typeof this[i] + JSON.stringify(this[i])]) {
      // Use the serialized form as the key
      tmp[typeof this[i] + JSON.stringify(this[i])] = 1;
      newArray.push(this[i]);
    }
  }
  return newArray;
}
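Keys built from JSON.stringify have limits of their own that are worth knowing before relying on this approach. A short sketch of a few well-defined JSON.stringify behaviors (variable names are illustrative):

```javascript
// Same properties in a different insertion order produce different strings,
// so these two objects would NOT be treated as duplicates.
const sameContentSameKey =
  JSON.stringify({ a: 1, b: 2 }) === JSON.stringify({ b: 2, a: 1 });

// NaN serializes to 'null'; the typeof prefix keeps it apart from null...
const nanKey = typeof NaN + JSON.stringify(NaN);
const nullKey = typeof null + JSON.stringify(null);

// ...but Infinity also serializes to 'null', so NaN and Infinity share a key.
const nanAndInfinityCollide =
  nanKey === typeof Infinity + JSON.stringify(Infinity);

// Properties whose value is undefined or a function are dropped entirely.
const droppedProps = JSON.stringify({ a: undefined, f() {} });

console.log(sameContentSameKey, nanKey, nullKey, nanAndInfinityCollide, droppedProps);
// false 'numbernull' 'objectnull' true '{}'
```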

Optimization (more concise, though findIndex rescans the array for every element):

// Compare elements by their JSON.stringify output
Array.prototype.unique = function () {
    return this.filter((item, index) => {
        return this.findIndex(element => {
            return JSON.stringify(item) === JSON.stringify(element);
        }) === index;
    });
}

Test:

Implementation 1: 104.859130859375ms
Implementation 2: 120.89697265625ms

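6.2 ES6

ES6 added the Set and Map data structures, which are similar in nature to their Java counterparts. A Set, for example, resembles an array, but all of its members are unique.

Array.from() + Set()
Array.from is a static method of Array that converts array-like or iterable objects into a real array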

Array.prototype.unique = function(){
    return Array.from(new Set(this));
}

Test:

   20.884033203125ms

Simplifying further with the spread operator:

Array.prototype.unique = function(){
    return [...new Set(this)];
}

Test:

      16.0419921875ms

Even faster, isn't it??

Map

Method 1:

  Array.prototype.unique = function () {
      let newArray = [];
      let map = new Map();
      for (let item of this) {
          if (!map.has(item)) {
            map.set(item, 1);
            newArray.push(item);
          }
      }
       return newArray;
  }

Method 2: a more elegant version

  Array.prototype.unique = function () {
      let map = new Map();
      return this.filter(item => {
          // Keep the item only on first sight; map.set() returns the map itself, which is truthy
          return !map.has(item) && map.set(item, 1);
      });
  }

Test:

  Method 1: 20.84130859375ms
  Method 2: 16.893798828125ms
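Because a Map stores a value alongside each key, the same pattern extends to deduplicating objects by one of their fields. A sketch under that assumption; the helper `uniqueBy`, the key function, and the sample data are all hypothetical and not from the original article:

```javascript
// Hypothetical helper: keep the first element seen for each key returned by keyFn.
function uniqueBy(items, keyFn) {
  const firstByKey = new Map();
  for (const item of items) {
    const key = keyFn(item);
    if (!firstByKey.has(key)) {
      firstByKey.set(key, item);
    }
  }
  // A Map iterates in insertion order, so the original order is preserved.
  return [...firstByKey.values()];
}

const users = [
  { id: 1, name: 'Ann' },
  { id: 2, name: 'Bob' },
  { id: 1, name: 'Ann (duplicate)' },
];
const uniqueUsers = uniqueBy(users, (user) => user.id);
console.log(uniqueUsers.map((user) => user.name));   // ['Ann', 'Bob']
```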

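7. Conclusion

With some of the new ES6 features, array deduplication can be more than a hundred times faster, and the code becomes much more concise and elegant. There are many ways to deduplicate an array, with different code and different speeds, but the fastest is not automatically the best; fitting the scenario is the first requirement. Take the following array, for example: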

let arr = [1, '1', { name: 1, age: 12 }, { name: 1, age: 12 }, , , {},{}, undefined,undefined ,NaN,NaN,null,null, [],[]];

Not every method above is suitable for an array like this one.
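As an illustration, here is what two of the approaches do with that array. This is a sketch written as standalone expressions rather than as Array.prototype.unique; the variable names are only for illustration.

```javascript
const mixed = [1, '1', { name: 1, age: 12 }, { name: 1, age: 12 }, , , {}, {},
  undefined, undefined, NaN, NaN, null, null, [], []];

// Set: holes are read as undefined, NaN is deduplicated, and objects and
// arrays are compared by reference, so the look-alike pairs all survive.
const viaSet = [...new Set(mixed)];

// filter + indexOf: holes are skipped, and NaN is dropped completely
// because indexOf(NaN) is always -1.
const viaIndexOf = mixed.filter((item, index) => mixed.indexOf(item) === index);

console.log(viaSet.length);       // 11
console.log(viaIndexOf.length);   // 10
```

Neither result merges the two { name: 1, age: 12 } objects; that needs a content-based comparison such as the JSON.stringify approach, with its own limits.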

Another example:

var arr = [1, 2, NaN];
arr.indexOf(NaN); // -1
arr.includes(NaN) // true

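So the choice of deduplication method has to take into account the business requirements, the types of the elements in the data, browser compatibility, and similar factors.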