[Python Tutorial] 02. Other Data Types (Sequences, Sets, Dictionaries, Ordered Dictionaries, Counters, Matrices)

浪啦里格朗

Last modified: 2022-07-21 13:05:54

First published: 2022-07-18 21:35:39

Article link: https://blog.csdn.net/songxia928_928/article/details/125860799

Other Data Types

1 Sequences (tuple, list)

# Two sequence types are covered here: tuple (immutable) and list (mutable)
s1 = (2, 1.3, 'love', 5.6, 9, 12, False)    # tuple: elements cannot be changed
s2 = [True, 5, 'smile']                     # list: elements can be changed

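(1) Creating lists ([0] * n, range)

(a) Creating by hand ([0] * n)

# Multiplying a list by n repeats its contents n times.
# === repeat a single number
n = 4
dp = [0] * n     # n copies of 0; ints are immutable, so sharing the same object is harmless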
print(dp)
dp[0], dp[1] = 1, 2
print(dp)
[0, 0, 0, 0]
[1, 2, 0, 0]

dp = [0 for _ in range(n)]  # equivalent to [0] * n
print(dp)
dp[0], dp[1] = 1, 2
print(dp)
[0, 0, 0, 0]
[1, 2, 0, 0]

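# === repeat a list
dp = [[0]] * n    # n x 1, but all n entries are the SAME inner list (references are repeated, nothing is copied)
print(dp)
dp[0][0], dp[3][0] = 1, 4
print(dp)
dp = [[0] for _ in range(n)] # n x 1; the comprehension creates a new inner list on each iteration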
print(dp)
dp[0][0], dp[3][0] = 1, 4
print(dp)
[[0], [0], [0], [0]]
[[4], [4], [4], [4]]
[[0], [0], [0], [0]]
[[1], [0], [0], [4]]

# === repeat two numbers
dp = [1, 2] * n     # flat list of length 2*n
print(dp)
dp = [[1, 2]] * n   # n x 2, every row is the same list object
print(dp)
dp = [[1, 2] for _ in range(n)]   # n x 2, independent rows
print(dp)
[1, 2, 1, 2, 1, 2, 1, 2]
[[1, 2], [1, 2], [1, 2], [1, 2]]
[[1, 2], [1, 2], [1, 2], [1, 2]]

# === 2-D array
m = 3
dp = [[0] * n ] * m   # m x n, but every row is the same list object
print(dp)
dp[0][0], dp[0][1] = 1, 2
print(dp)
[ [0, 0, 0, 0], 
[0, 0, 0, 0], 
[0, 0, 0, 0] ]
[ [1, 2, 0, 0], 
[1, 2, 0, 0], 
[1, 2, 0, 0] ]

dp = [[0] * n  for _ in range(m) ]  # m x n, independent rows
print(dp)
dp[0][0], dp[0][1] = 1, 2
print(dp)
[ [0, 0, 0, 0], 
[0, 0, 0, 0], 
[0, 0, 0, 0] ]
[ [1, 2, 0, 0], 
[0, 0, 0, 0], 
[0, 0, 0, 0] ]

# First row and first column set to 1
#      1 x n     (m-1) x n
dp = [[1]*n] + [[1]+[0]*(n-1) for _ in range(m-1)] 
print(dp)
[[1, 1, 1, 1], 
[1, 0, 0, 0], 
[1, 0, 0, 0]]
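
The difference between `[[0] * n] * m` and the comprehension form can be checked directly with `is`, which compares object identity. This is a small sketch; the names `shared` and `independent` are made up for illustration.

```python
n, m = 4, 3

shared = [[0] * n] * m                     # m references to one row
independent = [[0] * n for _ in range(m)]  # m separate rows

# `is` compares object identity, not contents
print(shared[0] is shared[1])              # True: same list object
print(independent[0] is independent[1])    # False: different objects

shared[0][0] = 1
independent[0][0] = 1
print(shared)        # the write shows up in every row
print(independent)   # only the first row changes
```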

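(b) Evenly spaced integers (range)

Signature: range(start, stop, step)
Parameters:
start: where counting begins; defaults to 0.
stop: where counting ends (exclusive).
step: the increment between values; defaults to 1.
Returns: in Python 3, not a list but a lazy range object; wrap it in list() to materialize the values.

Note: the result never includes stop.

list(range(5))        # [0, 1, 2, 3, 4]
list(range(0,6))      # [0, 1, 2, 3, 4, 5]
list(range(0,10,2))   # [0, 2, 4, 6, 8]
list(range(4,-4,-1))  # [4, 3, 2, 1, 0, -1, -2, -3]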
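
In Python 3 a range behaves like a read-only sequence rather than a list: it supports len(), indexing and membership tests without storing every value. A minimal sketch, with illustrative variable names:

```python
r = range(0, 10, 2)

print(type(r).__name__)   # range
print(len(r))             # 5
print(r[1], r[-1])        # 2 8
print(6 in r, 7 in r)     # True False

evens = list(r)           # materialize only when a real list is needed
print(evens)              # [0, 2, 4, 6, 8]
```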

(2) Joining into a string (''.join(list))

seq1 = ['hello','good','boy','doiido']
str1 = '_'.join(seq1)  # every element of seq1 must be a string
str2 = ''.join(seq1)

(3) Adding elements (.append, .extend, +)

# ==== .append
list1 = ['a', 'b']
list2 = ['c', 'd']
list3 = ['e', 'f']
list4 = ['1', '2']

list1.append(list4)
print(list1)    # ['a', 'b', ['1', '2']]

# ==== .extend
list2.extend(list4)
print(list2)    # ['c', 'd', '1', '2']

# ==== + 
print(list3 + list4) # ['e', 'f', '1', '2']

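(4) Removing elements (.remove, .pop, del)

(a) .remove: removes a single element by value (the first match)

nums=[1,2,3,4,5,2,6]    # avoid naming a variable str: it shadows the built-in type
nums.remove(2)    # [1, 3, 4, 5, 2, 6]

(b) .pop: removes a single element by index and returns it (the last element if no index is given)

>>> nums=[0,1,2,3,4,5,6]
>>> nums.pop(1)   # pop returns the removed element
1
>>> nums
[0, 2, 3, 4, 5, 6]
>>> words=['abc','bcd','dce']
>>> words.pop(2)
'dce'
>>> words
['abc', 'bcd']

(c) del: removes by index (position)

>>> nums=[1,2,3,4,5,2,6]
>>> del nums[1]
>>> nums
[1, 3, 4, 5, 2, 6]
>>> words=['abc','bcd','dce']
>>> del words[1]
>>> words
['abc', 'dce']

del can also remove a slice.
>>> nums=[0,1,2,3,4,5,6]
>>> del nums[2:4]  # removes indices 2 and 3 (the end index is excluded)
>>> nums
[0, 1, 4, 5, 6]

del can also remove the name itself (for a list, a set, and so on).
>>> nums=[0,1,2,3,4,5,6]
>>> del nums
>>> nums         # after deletion the name no longer exists

Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
    nums
NameError: name 'nums' is not defined

Note: del removes the reference (the name), not the object (the data). The object is reclaimed by Python's memory management once nothing refers to it.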
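
The point that del removes a name rather than the data can be seen by binding two names to one list. This is a minimal sketch; `name_gone` is a helper flag introduced only for the demonstration.

```python
a = [1, 2, 3]
b = a            # second name for the same list object
del a            # removes only the name `a`

print(b)         # [1, 2, 3]: the list is still alive through `b`

try:
    a
except NameError:
    name_gone = True
else:
    name_gone = False
print(name_gone)  # True
```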

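(5) Sorting (.sort, sorted)

(a) One-dimensional sorting

a = [7, 3, 5 ,1]
b = sorted(a)                   # ascending; sorted returns a new list and leaves a unchanged
print(a, b)
b = sorted(a, reverse=True)     # descending
print(a, b)
b = a.sort()                    # ascending; .sort modifies a in place and returns None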
print(a, b)
[7, 3, 5, 1] [1, 3, 5, 7]
[7, 3, 5, 1] [7, 5, 3, 1]
[1, 3, 5, 7] None

(b) Two-dimensional sorting

a = [[7,0], [4,4], [7,1], [5,0], [6,1], [5,2]]
b = sorted(a, key = lambda x: (-x[0], x[1]))  # element 0 descending, then element 1 ascending
print(b)

a = [[7,0], [4,4], [7,1], [5,0], [6,1], [5,2]]
a.sort( key = lambda x: (-x[0], x[1]) )  
print(a)
[[7, 0], [7, 1], [6, 1], [5, 0], [5, 2], [4, 4]]
[[7, 0], [7, 1], [6, 1], [5, 0], [5, 2], [4, 4]]

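(c) Getting the indices that would sort the list
This is easiest after converting to a NumPy array (np.argsort); in pure Python, the index range can be sorted with a key function.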
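
Without NumPy, the sorting indices can be obtained by sorting the positions by the value each one points at. A small sketch using only built-ins; the name `order` is illustrative.

```python
a = [7, 3, 5, 1]

# indices of a, ordered by the value they point at
order = sorted(range(len(a)), key=lambda i: a[i])
print(order)                   # [3, 1, 2, 0]
print([a[i] for i in order])   # [1, 3, 5, 7]
```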

(d) Sorting lists of strings

def list_sort_string():  # case-sensitive
    a=["delphi","Delphi","python","Python","c++","C++","c","C","golang","Golang"]
    a.sort() # ascending by code point, so uppercase letters come before lowercase
    print("ascending:",a)
    a.sort(reverse=True) # descending
    print("descending:",a)

ascending: ['C', 'C++', 'Delphi', 'Golang', 'Python', 'c', 'c++', 'delphi', 'golang', 'python']
descending: ['python', 'golang', 'delphi', 'c++', 'c', 'Python', 'Golang', 'Delphi', 'C++', 'C']


def list_sort_by_length():
    a=["delphi","Delphi","python","Python","c++","C++","c","C","golang","Golang"]
    a.sort(key=lambda ele:len(ele)) # ascending by length
    print("ascending:",a)
    a.sort(key=lambda ele:len(ele),reverse=True) # descending by length
    print("descending:",a)

ascending: ['c', 'C', 'c++', 'C++', 'delphi', 'Delphi', 'python', 'Python', 'golang', 'Golang']
descending: ['delphi', 'Delphi', 'python', 'Python', 'golang', 'Golang', 'c++', 'C++', 'c', 'C']

(6) Reversing (reversed)

a = [1,2,5,4,3]

b = reversed(a)   # an iterator
b1 = list(b)
b3 = a[: :-1] 

print('=====================')
print(type(b), b)
print(b1)
print('=====================')
print(b3)

=====================
<class 'list_reverseiterator'> <list_reverseiterator object at 0x000001C45B43F9B0>
[3, 4, 5, 2, 1]
=====================
[3, 4, 5, 2, 1]

(7) Shuffling (shuffle)

from random import shuffle
L = [1, 2, 4, 5]
shuffle(L)
print(L)

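(8) Finding elements (in, .index)

(a) in

a = 3
b = [1, 3, 5]
if a in b:
    print('found')

(b) .index
list.index(x) returns the position of the first occurrence of x (and raises ValueError if x is absent)

a = ["ab","cd",1,3]
print(a.index(1))     # prints 2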

(c) Binary search (bisect)

import bisect

a = [1,4,6,8,12,15,20]
position = bisect.bisect(a,13)   # bisect is an alias for bisect_right
print(position)

a.insert(position,13)  # insert with the list's built-in insert method
print(a)

a2 = [1,4,6,8,12,15,20]
bisect.insort(a2,13)  # bisect.insort does both steps in one call: it finds the position, then inserts (the insertion itself is still O(n))
print(a2)
5
[1, 4, 6, 8, 12, 13, 15, 20]
[1, 4, 6, 8, 12, 13, 15, 20]


L = [1,3,3,6,8,12,15]  
x = 3

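x_insert_point = bisect.bisect_left(L, x)  # if x is present, returns the index of its first occurrence; otherwise the insertion point
print(x_insert_point)  # 1

x_insert_point = bisect.bisect_right(L, x)  # if x is present, returns the index just past its last occurrence; otherwise the insertion point
print(x_insert_point)  # 3

# bisect_right only finds and returns the insertion position; it does not insert.
# Signature: bisect_right(a, x, lo=0, hi=None)
#     a   the sorted list
#     x   the value to locate
#     lo  start of the search range, default 0
#     hi  end of the search range, default len(a)

x_insort_left = bisect.insort_left(L, x)  # inserts x into L, to the left of any equal values
print(x_insort_left, L)  # None [1, 3, 3, 3, 6, 8, 12, 15]

x_insort_right = bisect.insort_right(L, x) # inserts x into L, to the right of any equal values
print(x_insort_right, L) # None [1, 3, 3, 3, 3, 6, 8, 12, 15]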

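A hand-written binary search:

def search(nums, target, left, right): # finds one index of target in sorted nums[left..right]; with duplicates such as [5,7,7,8,8,10] it may return any matching index
    while left <= right:
        pivot = (left + right) // 2
        if nums[pivot] == target:
            return pivot
        else:
            if target < nums[pivot]:
                right = pivot - 1
            else:
                left = pivot + 1
    return -1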

from typing import List

class Solution:
    def search_left(self, nums, target):  # binary search for the left boundary
        lo = 0   # lo need not start at -1: mid uses floor division, so it can coincide with lo
        hi = len(nums) # half-open range [lo, hi), so the last element is still examined

        idx = -1
        F_in = False   # flag: whether target was seen in nums
        while lo < hi:  # stop when lo == hi
            mid = (lo + hi) // 2  # midpoint index
            if nums[mid] == target :   # mid == target: keep searching the left half
                F_in = True
                hi = mid
            elif nums[mid] > target : # mid > target: search the left half
                hi = mid
            else:
                lo = mid+1
        if F_in : idx = lo

        return idx

    def search_right(self, nums, target):  # binary search for the right boundary
        lo = 0
        hi = len(nums) 

        idx = -1
        F_in = False
        while lo < hi:
            mid = (lo + hi) // 2  # midpoint index
            if nums[mid] == target :   # mid == target: keep searching the right half
                F_in = True
                lo = mid+1
            elif nums[mid] > target : # mid > target: search the left half
                hi = mid
            else:
                lo = mid+1
        if F_in : idx = hi-1   # hi ends just past the last occurrence of target, hence the -1

        return idx

    def searchRange(self, nums: List[int], target: int) -> List[int]:
        if (not nums) or nums[0] > target or nums[-1] < target: # empty, or target outside the value range
            return [-1, -1]

        left_idx = self.search_left(nums, target)        
        if left_idx == -1: return [-1, -1]

        right_idx = self.search_right(nums, target)

        return [left_idx, right_idx]
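
The same left and right boundaries can also be obtained from the bisect module shown earlier. The following is a sketch of that alternative, not the author's solution; `search_range` is a name chosen here for illustration.

```python
from bisect import bisect_left, bisect_right

def search_range(nums, target):
    """Return [first, last] index of target in sorted nums, or [-1, -1]."""
    left = bisect_left(nums, target)
    # target is absent if left runs off the end or lands on a different value
    if left == len(nums) or nums[left] != target:
        return [-1, -1]
    right = bisect_right(nums, target) - 1
    return [left, right]

print(search_range([5, 7, 7, 8, 8, 10], 8))   # [3, 4]
print(search_range([5, 7, 7, 8, 8, 10], 6))   # [-1, -1]
print(search_range([], 0))                    # [-1, -1]
```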

(9) Clearing a list

Imgs.clear()

(10) Concatenation (+) and slicing

(a) Concatenation

List3 = List1 + List2

(b) Slicing
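
The slicing subsection has no example in the original, so here is a minimal sketch of the common forms `a[start:stop:step]`. The list contents are arbitrary.

```python
a = [0, 1, 2, 3, 4, 5, 6]

print(a[2:5])     # [2, 3, 4]   (stop index is excluded)
print(a[:3])      # [0, 1, 2]
print(a[-2:])     # [5, 6]
print(a[::2])     # [0, 2, 4, 6]

b = a[:]          # a shallow copy of the whole list
b[0] = 99
print(a[0], b[0]) # 0 99
```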

(11) Removing duplicates (set)

set1=set([1,2,3,4])
set2=set(['A','B','D','C'])
set3=set(['A','C', 'B', 'D'])
set4=set(['A','B','D','C','B'])
print('1.',set1)
print('2.',set2)
print('3.',set3)
print('4.',set4)

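1. {1, 2, 3, 4}
2. {'A', 'C', 'B', 'D'}
3. {'A', 'C', 'B', 'D'}
4. {'A', 'C', 'B', 'D'}

Note:
Membership tests on a set take O(1) time on average (hash lookup); on a list they take O(n). The display order of a set of strings can differ between runs.
Hash lookup: the element's hash value selects a slot in the table, so the element can be found without scanning the others.

Why is hash/dictionary lookup O(1)? Computing the hash and jumping to the slot does not depend on how many elements are stored (collisions aside).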
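
Because set() discards the original order, a common alternative for order-preserving deduplication is dict.fromkeys, which relies on dicts keeping insertion order (Python 3.7+). A short sketch with made-up names:

```python
items = ['A', 'B', 'D', 'C', 'B', 'A']

unique_any_order = set(items)                 # order is not guaranteed
unique_in_order = list(dict.fromkeys(items))  # keeps first occurrences in order

print(len(unique_any_order))    # 4
print(unique_in_order)          # ['A', 'B', 'D', 'C']
print('D' in unique_any_order)  # True
```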

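(12) *list

● A star before a list unpacks it into separate positional arguments when calling a function
● Two stars before a dictionary unpack it into keyword arguments, with each key matched to a parameter name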

def add(a, b):
    return a + b

if __name__ == '__main__':
    #  ==== d1
    d1 = [4, 3]
    print('== d1:', d1)
    print('== *d1:', *d1)
    
    print(add(*d1))
    
    #  ==== d2
    d2 = {'a': 4, 'b': 3}
    print('== d2:', d2)
    # `**d2` is only valid inside a call or a dict literal;
    # a bare assignment such as `d2_tp = **d2` is a SyntaxError.
    
    print(add(**d2))

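2 Sets

A set is an unordered collection of unique elements.
In Python (as with Java's HashSet), insertion and lookup rely on hashing and take O(1) time on average.

(1) Creating ({}, set())

A set can be created with braces { } or the set() function. Note: an empty set must be created with set(), not { }, because { } creates an empty dictionary.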

set_a = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
print(type(set_a), set_a)
<class 'set'> {'banana', 'orange', 'apple', 'pear'}

# set list
list1 = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
set_a = set(list1)
print(type(set_a), set_a)
<class 'set'> {'banana', 'orange', 'apple', 'pear'}
# set str
set_a = set('abracadabra')
print(type(set_a), set_a)
<class 'set'> {'d', 'a', 'c', 'b', 'r'}

(2) Adding (.add(), .update())

# add a single element
set_a = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
set_a.add( 'orange' )
print(set_a)
set_a.add( 'crabgrass' )
print(set_a)
{'banana', 'orange', 'apple', 'pear'}
{'orange', 'pear', 'crabgrass', 'banana', 'apple'}

# update adds every element of one or more iterables (sets, lists, tuples, dict keys, and so on)
set_a = set(("Google", "Runoob", "Taobao"))
set_a.update({1,3})  
print(set_a)
set_a.update([1,4],[5,6]) # update accepts several iterables, separated by commas
print(set_a)
{'Taobao', 1, 'Runoob', 3, 'Google'}
{'Taobao', 1, 'Runoob', 3, 4, 5, 6, 'Google'}

(3) Removing (.remove(), .discard(), .pop())

# remove a given element
set_a = set(("Google", "Runoob", "Taobao"))
set_a.remove("Taobao")
print(set_a)
'''
set_a.remove("Facebook")   # raises an error if the element is absent
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Facebook'
'''
{'Google', 'Runoob'}
# discard a given element
set_a = set(("Google", "Runoob", "Taobao"))
set_a.discard("Facebook")  # no error if the element is absent
print(set_a)
{'Taobao', 'Google', 'Runoob'}
# pop removes and returns an arbitrary element
set_a = set(("Google", "Runoob", "Taobao"))
print(set_a)
set_a.pop()
print(set_a) # the result can differ from run to run
# Sets have no defined order, so which element pop removes is an implementation detail and should not be relied on.
{'Taobao', 'Google', 'Runoob'}
{'Google', 'Runoob'}

(4) Length (len())

set_a = set(("Google", "Runoob", "Taobao"))
print(len(set_a))
3

(5) Clearing (.clear())

set_a = set(("Google", "Runoob", "Taobao"))
set_a.clear()
print(set_a)
set()

(6) Membership (in)

Sets have no equivalent of a list's a.index('x'), because their elements have no positions.

set_a = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
print('orange' in set_a)
print('crabgrass' in set_a)
True
False

(7) Set operations (-, |, &, ^)

a = set('123')
b = set('234')
print(a)         
print(b)
print(a-b)  # elements in a but not in b
print(a|b)  # elements in a or b
print(a&b)  # elements in both a and b
print(a^b)  # elements in exactly one of a and b
{'2', '3', '1'}
{'2', '4', '3'}
{'1'}
{'4', '3', '2', '1'}
{'2', '3'}
{'4', '1'}

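Built-in set methods:

Method	Description
add()	Adds an element to the set
clear()	Removes all elements from the set
copy()	Returns a shallow copy of the set
difference()	Returns the elements of this set that are in none of the given sets
difference_update()	Removes from this set every element that is also in the given sets
discard()	Removes the given element if present
intersection()	Returns the intersection as a new set
intersection_update()	Keeps only the elements also found in the given sets (in place)
isdisjoint()	Returns True if the two sets have no element in common, otherwise False
issubset()	Returns True if this set is a subset of the argument
issuperset()	Returns True if the argument is a subset of this set
pop()	Removes and returns an arbitrary element
remove()	Removes the given element; raises KeyError if it is absent
symmetric_difference()	Returns the elements found in exactly one of the two sets
symmetric_difference_update()	Replaces this set with its symmetric difference with the given set (in place)
union()	Returns the union of the sets
update()	Adds the elements of the given iterables to the set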
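
The methods in the table fall into two groups: those that return a new set and the `_update` variants that modify the set in place and return None. A brief sketch with arbitrary values:

```python
a = {1, 2, 3}
b = {2, 3, 4}

c = a.intersection(b)     # new set; a is unchanged
print(c == {2, 3}, a == {1, 2, 3})   # True True

a.intersection_update(b)  # modifies a in place and returns None
print(a == {2, 3})        # True

print({1, 2}.issubset({1, 2, 3}))    # True
print({1, 2}.isdisjoint({3, 4}))     # True
```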

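3 Dictionaries

Keys must be hashable, so numbers, strings, and tuples of hashable items work, while lists do not.
A key cannot appear twice. If the same key is assigned twice at creation, the later value is kept.

d = {'Name': 'Zara', 'Age': 7, 'Name': 'Manni'} 
print( "d['Name']: ", d['Name'])
d['Name']:  Manni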

(1) Creating ({}, .setdefault(), dict(), defaultdict)

(a) Plain assignment (dic['k1'] = 'v1')

dic = {}
# dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
dic['k3'] = 'v3'
print(dic)
{'k1': 'v1', 'k2': 'v2', 'k3': 'v3'}
#OrderedDict([('k1', 'v1'), ('k2', 'v2'), ('k3', 'v3')]) # output with OrderedDict()

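(b) .setdefault()

# ===== setting default values in a dict
x = {}
x.setdefault(1, 0)   # key 1 is absent, so it is created with the default 0
print(x)
{1: 0}
x[2] = 10
print(x)
{1: 0, 2: 10}
x.setdefault(2, 1)  # key 2 already exists, so its value is not changed
print(x)
{1: 0, 2: 10}
x.setdefault(3, 1) # key 3 is absent, so it is added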
print(x)
{1: 0, 2: 10, 3: 1}

(c) dict()

>>> dict()                        # create an empty dict
{} 
>>> dict(a='a', b='b', t='t')     # from keyword arguments
{'a': 'a', 'b': 'b', 't': 't'} 
>>> dict(zip(['one', 'two', 'three'], [1, 2, 3]))   # from zipped keys and values
{'one': 1, 'two': 2, 'three': 3}
>>> dict([('one', 1), ('two', 2), ('three', 3)])    # from an iterable of key-value pairs
{'one': 1, 'two': 2, 'three': 3}

a_list = ['one', 'two', 'three']
b_list = [1, 2, 3]
dic = dict(a_list=a_list, b_list=b_list)     # from keyword arguments
print(dic)
{'a_list': ['one', 'two', 'three'], 'b_list': [1, 2, 3]}

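(d) defaultdict
With a plain dict, looking up a missing key raises KeyError. To get a default value for missing keys instead, use defaultdict:

from collections import defaultdict
like = defaultdict(list)  # an empty dict whose missing keys default to []

dd = defaultdict(lambda: 'N/A')
dd['key1'] = 'abc'

print(dd['key1']) # key1 exists
print(dd['key2']) # key2 is missing: the default is returned and also stored under key2

Note that the default comes from calling the factory function passed in when the defaultdict is created.
Apart from this handling of missing keys, a defaultdict behaves exactly like a dict.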
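
A typical use of defaultdict(list) is grouping values under a key without checking whether the key exists first. The word list below is hypothetical, chosen only to illustrate the pattern.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

groups = defaultdict(list)
for w in words:
    groups[w[0]].append(w)    # missing keys start as an empty list

print(dict(groups))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Caveat: merely reading a missing key also inserts it
_ = groups['z']
print('z' in groups)   # True
```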

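# Find the number that occurs exactly once
from collections import defaultdict
from typing import List

class Solution:
    def singleNumber(self, nums: List[int]) -> int:
        hash_table = defaultdict(int) # new hash table
        for num in nums:
            print(hash_table[num])  # a missing num yields 0
            hash_table[num] += 1   # key: num, value: how many times it occurs

        for num in hash_table: # iterate over the keys
            if hash_table[num] == 1:
                return num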

(2) Clearing (.clear())

dic = {}
# dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
dic.clear()
print(dic)
{}
# OrderedDict()   # output with OrderedDict()

(3) Copying (.copy())

dic = {}
# dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
new_dic = dic.copy()
print(new_dic)
{'k1': 'v1', 'k2': 'v2'}
# OrderedDict([('k1', 'v1'), ('k2', 'v2')])  # output with OrderedDict()

(4) keys, vals ==> dict (.fromkeys(), zip + dict())

(a) .fromkeys()

dic = {}
#dic = OrderedDict()
name = ['tom','lucy','sam']
age = [12, 1, 3]
print(dic.fromkeys(name))
print(dic.fromkeys(name, 20))
print(dic.fromkeys(name, age))
{'tom': None, 'lucy': None, 'sam': None}
{'tom': 20, 'lucy': 20, 'sam': 20}
{'tom': [12, 1, 3], 'lucy': [12, 1, 3], 'sam': [12, 1, 3]}  
#OrderedDict([('tom', None), ('lucy', None), ('sam', None)])  # OrderedDict()
#OrderedDict([('tom', 20), ('lucy', 20), ('sam', 20)])
#OrderedDict([('tom', [12, 1, 3]), ('lucy', [12, 1, 3]), ('sam', [12, 1, 3])])
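
One thing the fromkeys(name, age) output above hints at: every key receives the same value object, so a mutable default is shared. A minimal sketch of the pitfall and a dict-comprehension alternative; the names `shared` and `separate` are illustrative.

```python
names = ['tom', 'lucy', 'sam']

shared = dict.fromkeys(names, [])        # one list shared by all keys
shared['tom'].append(12)
print(shared)    # {'tom': [12], 'lucy': [12], 'sam': [12]}

separate = {name: [] for name in names}  # a new list per key
separate['tom'].append(12)
print(separate)  # {'tom': [12], 'lucy': [], 'sam': []}
```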

(b) dict()

x = [1, 2, 3]
y = ["one", "two", "three"]
z = zip(x,y)
dic = dict(z)            # dict is the built-in type; do not use it as a variable name
#dic = OrderedDict(z)
print(z)
print(dic)
<zip object at 0x000001358C032588>
{1: 'one', 2: 'two', 3: 'three'}
#OrderedDict([(1, 'one'), (2, 'two'), (3, 'three')]) # OrderedDict()

(5) items (key-value pairs) ==> dict (dict())

dic = {2:'b', 1:'a', 3:'c'}
items = dic.items()
print(items)

dic1 = dict(items)
dic2 = dict( list(items) )      # convert to a list first, then to a dict
#dic1 = OrderedDict(items)
#dic2 = OrderedDict( list(items) ) 
print(dic1)
print(dic2)
dict_items([(2, 'b'), (1, 'a'), (3, 'c')])
{2: 'b', 1: 'a', 3: 'c'}
{2: 'b', 1: 'a', 3: 'c'}
#OrderedDict([(2, 'b'), (1, 'a'), (3, 'c')])    # output with OrderedDict()
#OrderedDict([(2, 'b'), (1, 'a'), (3, 'c')])

(6) dict ==> items (.items())

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
print(dic.items())   # wrap in list() to get a list
dict_items([('k1', 'v1'), ('k2', 'v2')])
# odict_items([('k1', 'v1'), ('k2', 'v2')])  # output with OrderedDict()

# ==== items to keys and vals
keys, vals = zip(*dic.items())   # unzip the pairs; items_list[:][0] would only return the first pair
print(keys)
print(vals)
('k1', 'k2')
('v1', 'v2')

(7) All keys (.keys())

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
print(dic.keys())  
dict_keys(['k1', 'k2'])
# odict_keys(['k1', 'k2'])       # output with OrderedDict()

(8) All values (.values())

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
dic['k3'] = 'v3'
print(dic.values())
dict_values(['v1', 'v2', 'v3'])
# odict_values(['v1', 'v2', 'v3'])    # output with OrderedDict()

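(9) Value for a given key (.get(), .setdefault())

When the key is missing, .get() does not create a new key-value pair, whereas .setdefault() does.

(a) .get()

Syntax: dict.get(key, default=None)
Parameters:
key: the key to look up.
default: the value returned when the key is missing. It can be any object.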

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'

val = dic.get('k2') # present
print(val)
val = dic.get('k3') # missing: returns None
print(val, dic)
val = dic.get('k3', 0) # missing: returns 0
print(val, dic)
v2
None {'k1': 'v1', 'k2': 'v2'}
0 {'k1': 'v1', 'k2': 'v2'}
#v2                                          # output with OrderedDict()
#None OrderedDict([('k1', 'v1'), ('k2', 'v2')])
#0 OrderedDict([('k1', 'v1'), ('k2', 'v2')])

(b) .setdefault()

val = dic.setdefault('k2')    # present
print(val, dic)
val = dic.setdefault('k4')    # missing: created with val=None
print(val, dic)
val = dic.setdefault('k5', 0) # missing: created with val=0
print(val, dic)
v2 {'k1': 'v1', 'k2': 'v2'}
None {'k1': 'v1', 'k2': 'v2', 'k4': None}
0 {'k1': 'v1', 'k2': 'v2', 'k4': None, 'k5': 0}
#v2 OrderedDict([('k1', 'v1'), ('k2', 'v2')])          # output with OrderedDict()
#None OrderedDict([('k1', 'v1'), ('k2', 'v2'), ('k4', None)])
#0 OrderedDict([('k1', 'v1'), ('k2', 'v2'), ('k4', None), ('k5', 0)])

(10) Remove a key-value pair and return the value (.pop())

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
dic['k3'] = 'v3'
val = dic.pop('k2')
print(val, dic)
v2 {'k1': 'v1', 'k3': 'v3'}
#v2 OrderedDict([('k1', 'v1'), ('k3', 'v3')])    # output with OrderedDict()

(11) Remove the most recently added item and return the key-value pair (.popitem())

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
dic['k3'] = 'v3'
print(dic.popitem(), dic)     # last in, first out by default; O(1)
print(dic.popitem(last=False), dic) # first in, first out: removes the oldest pair (OrderedDict only)
('k3', 'v3') {'k1': 'v1', 'k2': 'v2'}
TypeError: popitem() takes no keyword arguments   # a plain dict raises this error
#('k3', 'v3') OrderedDict([('k1', 'v1'), ('k2', 'v2')]) # output with OrderedDict()
#('k1', 'v1') OrderedDict([('k2', 'v2')])         # OrderedDict accepts last=False

(12) Testing for a key (in)

dic = {}
#dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
if 'k1' in dic: print(' in ')    # hash lookup, O(1) on average
else:           print(' not in ')

if 'k4' in dic: print(' in ')
else:           print(' not in ')
in 
not in

(13) Length (len)

from collections import OrderedDict

dic = {1:'a', 2:'b', 3:'c'}
dic2 = OrderedDict(dic)     # build an OrderedDict from a dict
dic3 = dict(dic2)            # build a dict from an OrderedDict
print(dic)
print(dic2)
print(dic3)
{1: 'a', 2: 'b', 3: 'c'}
OrderedDict([(1, 'a'), (2, 'b'), (3, 'c')])
{1: 'a', 2: 'b', 3: 'c'}

n = len(dic)
n2 = len(dic2)
print(n)
print(n2)
3
3

(14) Extremes (max(d, key=d.get))

(a) key_maxKey

# the largest key in the dict
d = {2:'a', 3:'b', 1:'c'}
key_maxKey = max(d)  # without a key function, max iterates over the dict's keys and returns the largest
print(key_maxKey)
3

(b) key_maxVal

# the key whose value is largest
d = {2:'a', 3:'b', 1:'c'}
key_maxVal = max(d, key=d.get) # O(n): each key is compared by its value, and the key with the largest value is returned
print(key_maxVal)
key_maxVal = max(d, key=lambda k: d[k])  # same as d.get
print(key_maxVal)
1
1

(15) Deep-copying a dictionary

https://blog.csdn.net/LeonTom/article/details/82761319
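
The section only links out, so here is a minimal sketch contrasting dict.copy() (shallow) with copy.deepcopy() from the standard library. The nested dictionary is invented for the demonstration.

```python
import copy

d = {'k1': [1, 2], 'k2': {'x': 1}}

shallow = d.copy()          # new dict, same nested objects
deep = copy.deepcopy(d)     # new dict, nested objects copied recursively

d['k1'].append(3)
d['k2']['x'] = 99

print(shallow)  # {'k1': [1, 2, 3], 'k2': {'x': 99}}
print(deep)     # {'k1': [1, 2], 'k2': {'x': 1}}
```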

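(30) Sorting a dictionary (sorted)

(a) Take the items, then sort

# take the items, then sort
dic = {2:'b', 1:'c', 3:'a', 4:'a', -1:'a'}
items0 = sorted(dic.items(), key=lambda x: x[0])  # sort items by key, ascending
items1 = sorted(dic.items(), key=lambda x: x[1])  # sort items by value, ascending
print(items0)
print(items1)              # equal values keep their original relative order (the sort is stable); keys are not used as a tie-breaker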
[(-1, 'a'), (1, 'c'), (2, 'b'), (3, 'a'), (4, 'a')]
[(3, 'a'), (4, 'a'), (-1, 'a'), (2, 'b'), (1, 'c')]

def dic_sort_by_val_up(dic):
    items1 = sorted(dic.items(), key=lambda x: x[1])  # sort items by value, ascending
    dic_new = dict(items1)
    return dic_new

def dic_sort_by_val_down(dic):
    items1 = sorted(dic.items(), key=lambda x: x[1])[::-1]  # sort items by value, descending
    dic_new = dict(items1)
    return dic_new


def dic_sort_by_key_up(dic):
    items0 = sorted(dic.items(), key=lambda x: x[0])  # sort items by key, ascending
    dic_new = dict(items0)
    return dic_new

def dic_sort_by_key_down(dic):
    items0 = sorted(dic.items(), key=lambda x: x[0])[::-1]   # sort items by key, descending
    dic_new = dict(items0)
    return dic_new

(b) Take the keys, then sort by key

# sort by key in ascending order and return the sorted keys with their values
def sort_key(dic):
    keys = dic.keys()
    new_keys = sorted(keys)
    new_vals = [dic[key]  for key  in new_keys]
    return new_keys, new_vals

dic = {2:'b', 1:'a', 3:'c'}
new_keys, new_vals = sort_key(dic)
print(new_keys, new_vals)  
[1, 2, 3] ['a', 'b', 'c']

(31) str <=> dict (json.loads(), eval(), str())

(a) str => dict

# ======== json.loads()
import json
d_str = '{"name":"john", "gender":"male", "age":28}'
# d_str = '{1:"a", 2:"b", 3:"c"}'  # fails: JSON object keys must be strings
d_dic = json.loads(d_str)
print(type(d_dic), d_dic)
<class 'dict'> {'name': 'john', 'gender': 'male', 'age': 28}

d_str = "{'name':'john', 'gender':'male', 'age': 28}"  # fails: JSON strings require double quotes
d_dic = json.loads(d_str)   # raises json.JSONDecodeError
print(type(d_dic), d_dic)

# ========  eval()
#d_str = '{"name" : "john", "gender" : "male", "age": 28}'
d_str = '{1:"a", 2:"b", 3:"c"}'   # eval accepts both forms
d_dic = eval(d_str)  # eval runs arbitrary code, so use it only on trusted input
print(type(d_dic), d_dic)
<class 'dict'> {1: 'a', 2: 'b', 3: 'c'}
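
When the string is a Python literal rather than JSON, ast.literal_eval from the standard library is a safer alternative to eval, since it only accepts literals. A small sketch; the `rejected` flag is introduced just to show the failure case.

```python
import ast

d_str = "{'name': 'john', 1: 'a', 'age': 28}"
d_dic = ast.literal_eval(d_str)     # accepts Python literals only
print(type(d_dic), d_dic)

try:
    ast.literal_eval("__import__('os').getcwd()")  # a call, not a literal
except (ValueError, SyntaxError):
    rejected = True
else:
    rejected = False
print(rejected)   # True
```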

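(b) dict => str (str())

from collections import OrderedDict

d_dic = {1:'a', 2:'b', 3:'c'}
d_dic2 = OrderedDict(d_dic)
d_str = str(d_dic)
d_str2 = str(d_dic2)
print(type(d_str), d_str)  
print(type(d_str2), d_str2)   # str() of an OrderedDict includes the class name; convert to a dict first if that is unwanted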
<class 'str'> {1: 'a', 2: 'b', 3: 'c'}
<class 'str'> OrderedDict([(1, 'a'), (2, 'b'), (3, 'c')])

(32) Merging (dict(d1.items()+d2.items()), .update(d1), dict(d1, **d2))

https://www.cnblogs.com/lmh001/p/9888156.html

(a) dict(d1.items()+d2.items())
(This does not work in Python 3, where items() returns a view that does not support +.)

(b) .update(d1)

(c) dict(**d1, **d2), d3 = dict(d1, **d2)

Note:
Both forms require the unpacked keys to be strings. When d1 and d2 share a key, dict(**d1, **d2) raises an error:
TypeError: type object got multiple values for keyword argument
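
A merge that tolerates duplicate keys can be written with dict unpacking in a literal (Python 3.5+) or with update on a copy; in both, later values win. A minimal sketch with invented dictionaries:

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 30}

merged = {**d1, **d2}      # later values win on duplicate keys
print(merged)              # {'a': 1, 'b': 20, 'c': 30}

d3 = dict(d1)              # copy first so d1 is left untouched
d3.update(d2)
print(d3 == merged)        # True

try:
    dict(**d1, **d2)       # duplicate keyword 'b'
except TypeError:
    duplicate_rejected = True
else:
    duplicate_rejected = False
print(duplicate_rejected)  # True
```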

(33) Counting with a dict: dic.get(i,0)

li = ['a','a','a','b','b','b','c','c','d']
dic = {}
for i in li:
    dic[i] = dic.get(i,0) + 1

{'a': 3, 'b': 3, 'c': 2, 'd': 1}

(34) dict <==> namespace

(a) dict ==> namespace (argparse, munch)

1) argparse

import argparse
b = argparse.Namespace(**dic)

2) munch
https://blog.csdn.net/weixin_30894389/article/details/99624590
Purpose: wraps a dict so that values can be read as attributes, in the form a.b

from munch import Munch
b = Munch({'hello': 'world'})

(b) namespace ==> dict (vars)

1) vars

dic = vars(ns)
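
Putting the two directions together, a round trip through argparse.Namespace might look like the sketch below. The settings in `dic` are hypothetical.

```python
import argparse

dic = {'lr': 0.01, 'epochs': 10}      # hypothetical settings
ns = argparse.Namespace(**dic)        # keys must be valid identifiers
print(ns.lr, ns.epochs)               # 0.01 10

back = vars(ns)                       # the namespace's attribute dict
print(back == dic)                    # True
```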

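4 Ordered dictionaries (OrderedDict)

An OrderedDict combines a dictionary with a linked list. Before Python 3.7 the key order of a plain dict was not guaranteed, so iteration order could not be relied on, and OrderedDict was the way to keep keys in order. Since Python 3.7 a plain dict also preserves insertion order, but OrderedDict still offers order-specific operations such as move_to_end() and popitem(last=False).
Note: an OrderedDict keeps keys in insertion order; it does not sort them:
from collections import OrderedDict

(1) Creating
(same as dict)
(2) Clearing (.clear())
(same as dict)
(3) Copying (.copy())
(same as dict)
(4) list to dict (.fromkeys(), zip + OrderedDict())
(same as dict)
(5) items (key-value pairs) to dict (OrderedDict())
(same as dict)
(6) dict to items (.items())
(same as dict)
(7) All keys (.keys())
(same as dict)
(8) All values (.values())
(same as dict)
(9) Value for a given key (.get(), .setdefault())
(same as dict)
(10) Remove a key-value pair and return the value (.pop())
(same as dict)
(11) Remove the most recently added item and return the key-value pair (.popitem())
(see the dict section; the behavior differs slightly)
(12) Testing for a key (in)
(same as dict)
(13) Length (len)
(same as dict)

(14) Move a given key-value pair to the end (.move_to_end())

dic = OrderedDict()
dic['k1'] = 'v1'
dic['k2'] = 'v2'
dic['k3'] = 'v3'
dic.move_to_end('k1')    # O(1); a plain dict has no such method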
print(dic)
OrderedDict([('k2', 'v2'), ('k3', 'v3'), ('k1', 'v1')])
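
A common place where move_to_end() and popitem(last=False) are used together is a small least-recently-used cache. The class below is an illustrative sketch, not something from the original article; `LRUCache` and its method names are assumptions.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny least-recently-used cache (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the oldest entry

cache = LRUCache(2)
cache.put('k1', 'v1')
cache.put('k2', 'v2')
cache.get('k1')          # k1 becomes the most recent
cache.put('k3', 'v3')    # evicts k2
print(list(cache.data))  # ['k1', 'k3']
```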

5 Counters (Counter)

from collections import Counter

(1) Creating a Counter

c = Counter()             # an empty Counter
print('[0]===0:', type(c), c)  

str1 = 'programming'
c = Counter(str1)   # initialize from an iterable
print('[0]===1:', c)

c1 = Counter()
for ch in str1:
     c1[ch] = c1[ch] + 1    # count one element at a time
print('[0]===1:', c1)

c = Counter({'red': 4, 'blue': 2})   # initialize from a mapping
print('[0]===2:', c)

c = Counter(cats=4, dogs=8)  # initialize from keyword arguments
print('[0]===3:', c)
print('[0]===3:', c['cats'])

[0]===0: <class 'collections.Counter'> Counter()
[0]===1: Counter({'r': 2, 'g': 2, 'm': 2, 'p': 1, 'o': 1, 'a': 1, 'i': 1, 'n': 1})
[0]===1: Counter({'r': 2, 'g': 2, 'm': 2, 'p': 1, 'o': 1, 'a': 1, 'i': 1, 'n': 1})
[0]===2: Counter({'red': 4, 'blue': 2})
[0]===3: Counter({'dogs': 8, 'cats': 4})
[0]===3: 4

(2) Counting elements

a = ['eggs', 'ham', 'eggs']
c2 = Counter(a) 
print('[1]===0:', c2) 
print('[1]===1:', c2['eggs'])
print('[1]===2:', c2['bacon'])   # a missing element returns 0 instead of raising KeyError

[1]===0: Counter({'eggs': 2, 'ham': 1})
[1]===1: 2
[1]===2: 0

(3) Getting the items (.items)

c = Counter(a=4, b=2, c=0, d=-2)
items = c.items()                        # a view of (elem, cnt) pairs
l1 = list(items)
print(type(items), items)
print('[x]===0:', l1)
<class 'dict_items'> dict_items([('a', 4), ('b', 2), ('c', 0), ('d', -2)])
[x]===0: [('a', 4), ('b', 2), ('c', 0), ('d', -2)]

(4) Getting all keys and values (.keys, .values)

c = Counter(a=4, b=2, c=0, d=-2)
l1 = list(c)               # all keys of the Counter as a list
print('[x]===0:', l1)
[x]===0: ['a', 'b', 'c', 'd']

key = c.keys()
val = c.values()
print('[x]===0:', type(key), key, type(val), val)
[x]===0: <class 'dict_keys'> dict_keys(['a', 'b', 'c', 'd']) <class 'dict_values'> dict_values([4, 2, 0, -2])

c = Counter(a=4, b=2, c=0, d=-2)
l1 = set(c)                # all keys of the Counter as a set
print('[x]===1:', l1)
[x]===1: {'c', 'd', 'a', 'b'}

(5) Counter to dict (dict)

c2 = dict(c)                             # convert the Counter to a dict
print('[x]===1:', c2)
[x]===1: {'a': 4, 'b': 2, 'c': 0, 'd': -2}

(6) Deleting elements (del)

a = ['eggs', 'ham', 'eggs']
c2 = Counter(a) 
c2['sausage'] = 0  # setting a count to 0 does not remove the element
print('[2]===0:',c2)  # to remove an element from a Counter, use del

del c2['sausage']  
print('[2]===1:',c2)  
[2]===0: Counter({'eggs': 2, 'ham': 1, 'sausage': 0})
[2]===1: Counter({'eggs': 2, 'ham': 1})

(7) Rebuilding the sequence from a Counter (.elements)

c3 = Counter(a=4, b=2, c=0, d=-2)  
c3_e1 = c3.elements()    # returns an iterator; elements with counts below 1 are skipped
c3_e2 = c3.elements()
l1 = sorted(c3_e1)
l2 = list(c3_e2)

print('[3]===0:', c3)
print('[3]===1:', c3_e1)
print('[3]===1:', c3_e2)
print('[3]===2:', l1)   
print('[3]===2:', l2) 
[3]===0: Counter({'a': 4, 'b': 2, 'c': 0, 'd': -2})
[3]===1: <itertools.chain object at 0x0000016A0A0742B0>
[3]===1: <itertools.chain object at 0x0000016A0A074D68>
[3]===2: ['a', 'a', 'a', 'a', 'b', 'b']
[3]===2: ['a', 'a', 'a', 'a', 'b', 'b']

(8) The k most common elements (.most_common)

c4 = Counter('abracadabra')
l3 = c4.most_common(3)  # returns a list of tuples
print('[4]===0:', l3)   # each tuple is an (element, count) pair from the Counter

n = 2
l3 = c4.most_common()[:-n-1:-1]       # the n least common elements
print('[4]===0:', l3)   
[4]===0: [('a', 5), ('b', 2), ('r', 2)]
[4]===0: [('d', 1), ('c', 1)]

(9) Addition (.update, +)

c7 = Counter(a=4, b=2, c=0, d=-2)
c8 = Counter(a=1, b=2, c=3, d=4, e=5, f=-1)  
c10 = c7 + c8              # + keeps only elements whose resulting count is positive
c7.update(c8)  
print('[6]===0:', c10)   
print('[6]===1:', c7) 
[6]===0: Counter({'a': 5, 'e': 5, 'b': 4, 'c': 3, 'd': 2})
[6]===1: Counter({'a': 5, 'e': 5, 'b': 4, 'c': 3, 'd': 2, 'f': -1})

(10) Subtraction (.subtract, -)

c5 = Counter(a=4, b=2, c=0, d=-2)
c6 = Counter(a=1, b=2, c=3, d=4, e=5)
c9 = c5 - c6              # - keeps only elements whose resulting count is positive
c5.subtract(c6)
print('[5]===0:', c9) 
print('[5]===1:', c5)  
[5]===0: Counter({'a': 3})
[5]===1: Counter({'a': 3, 'b': 0, 'c': -3, 'e': -5, 'd': -6})

(11) Intersection (&)

c5 = Counter(a=4, b=2, c=0, d=-2)
c6 = Counter(a=1, b=2, c=3, d=4, e=5, f=-1)
c9 = c5 & c6              # intersection: keeps shared elements with the smaller count (positive counts only)
print('[7]===0:', c9)  
 [7]===0: Counter({'b': 2, 'a': 1})

(12) Union (|)

c5 = Counter(a=4, b=2, c=0, d=-2)
c6 = Counter(a=1, b=2, c=3, d=4, e=5, f=-1)
c9 = c5 | c6              # union: keeps every element with the larger count (positive counts only)
print('[8]===0:', c9)  
 [8]===0: Counter({'e': 5, 'a': 4, 'd': 4, 'c': 3, 'b': 2})

(13) Clearing

c = Counter(a=4, b=2, c=0, d=-2)
c.clear()                            # remove every element from the Counter
print('[9]===0:', c)
[9]===0: Counter()

(14) Sum of the counts

c = Counter(a=4, b=2, c=0, d=-2, f=-1)
sum1 = sum(c.values())                 # total of all counts, including zero and negative ones
print(sum1)
3

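(15) Dropping pairs whose count is 0 or negative (+)

c2 = +c                              # removes pairs whose count is 0 or negative
print(c2)
Counter({'a': 4, 'b': 2})

(16) Dropping pairs whose count is 0 or positive (-)

c2 = -c                              # removes pairs whose count is 0 or positive and turns the negative counts into positive ones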
print(c2)
Counter({'d': 2, 'f': 1})

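6 Matrices (numpy)

(1) Creating

(a) General matrices (np.matrix, np.array, np.asarray)

import numpy as np

a = [[4,3,5],[1,2,1]]
b1 = np.matrix(a)
b2 = np.array(a)
b3 = np.asarray(a)

Both array and asarray convert structured data to an ndarray. The main difference appears when the source is already an ndarray: array makes a copy by default and uses new memory, whereas asarray returns the input itself when no dtype conversion is needed.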

a=np.random.random((3,3))
print(a.dtype)
b=np.array(a,dtype='float64')
c=np.asarray(a,dtype='float64')
a[2]=2
print(a)
print(b)
print(c)

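(b) All-zeros matrix (np.zeros)

num_zero = np.zeros( (1, 3) )
num_zero = np.zeros( (1, 3) ,dtype=np.int16)

Signature: zeros(shape, dtype=float, order='C')
Parameters:
- shape: the shape of the array
- dtype: the data type, optional, default numpy.float64. Type codes:
  - t, bit field, e.g. t4 means 4 bits
  - b, boolean, true or false
  - i, signed integer, e.g. i8 (64-bit)
  - u, unsigned integer, e.g. u8 (64-bit)
  - f, floating point, e.g. f8 (64-bit)
  - c, complex floating point
  - O, object
  - S, a, byte string, e.g. S24
  - U, unicode string, e.g. U24
- order: optional; 'C' means row-major as in C, 'F' means column-major
Returns: an array of the given shape and type, filled with zeros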

(c) All-ones matrix (np.ones)

bb=np.ones((3,1))

(d) Ones on a diagonal (np.eye)

(e) Identity matrix (np.identity)
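
The two headings above have no example in the original. As a brief sketch (assuming NumPy is installed): np.identity always returns a square identity matrix, while np.eye also accepts a non-square shape and a diagonal offset k.

```python
import numpy as np

i3 = np.identity(3)        # always square, ones on the main diagonal
e3 = np.eye(3)             # same result for a square shape
e34 = np.eye(3, 4, k=1)    # 3 x 4, ones on the first superdiagonal

print(np.array_equal(i3, e3))   # True
print(e34)
```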

(f) Evenly spaced values (np.linspace)

# numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)

>>> np.linspace(1, 10, 10)
array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.])

>>> np.linspace(1, 10, 10, endpoint = False)
array([ 1. ,  1.9,  2.8,  3.7,  4.6,  5.5,  6.4,  7.3,  8.2,  9.1])    # the endpoint is excluded

>>> np.linspace(1, 10, 10, endpoint = False, retstep= True)
 (array([ 1. ,  1.9,  2.8,  3.7,  4.6,  5.5,  6.4,  7.3,  8.2,  9.1]), 0.9)    # the step size is returned as well

(g) Arrays of infinities (np.inf)

a = np.ones([5])*np.inf
print(a)

b = [-np.inf, 1.0, np.inf]
c = np.array(b)
d = -c
print(type(b), b)
print(type(c), c)
print(type(d), d)
'''
[inf inf inf inf inf]
<class 'list'> [-inf, 1.0, inf]
<class 'numpy.ndarray'> [-inf   1.  inf]
<class 'numpy.ndarray'> [ inf  -1. -inf]
'''

(2) Modifying (conditional assignment, diagonal assignment)

(a) Conditional assignment

arr[arr > 255] = x
e[e < t] = 0 
e[e >= t] = 1
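
One caveat with the two sequential assignments above: the second mask is evaluated after the first write, so when t is zero or negative the freshly written zeros satisfy e >= t and become 1. A sketch of the difference, assuming NumPy is installed and using np.where, which evaluates the condition once on the original values; the array and threshold are invented.

```python
import numpy as np

e = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
t = -1.0   # hypothetical threshold

binary = np.where(e >= t, 1, 0)   # both branches read the ORIGINAL e
print(binary)                     # [0 1 1 1 1]

# sequential in-place version for comparison
seq = e.copy()
seq[seq < t] = 0     # -2.0 becomes 0, which is >= t
seq[seq >= t] = 1    # so this line turns it into 1 as well
print(seq)           # [1. 1. 1. 1. 1.]
```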

(b) Diagonal assignment (np.diag_indices_from())

data = np.array([[80, 89, 86, 67, 79],
                 [78, 97, 89, 67, 81],
                 [90, 94, 78, 67, 74],
                 [91, 91, 90, 67, 69],
                 [76, 87, 75, 67, 86]])

row, col = np.diag_indices_from(data)
data[row,col] = np.array([1, 2, 3, 4, 5])

print(type(row), type(col))
print(row, col)
print(type(data))
print(data)

# ------------------------- print ----------------------------
<class 'numpy.ndarray'> <class 'numpy.ndarray'>

[0 1 2 3 4] [0 1 2 3 4]

<class 'numpy.ndarray'>

[[ 1 89 86 67 79]
 [78  2 89 67 81]
 [90 94  3 67 74]
 [91 91 90  4 69]
 [76 87 75 67  5]]

(3) Inserting (np.append, padding)

(a) Appending (np.append)

a = np.zeros((3,3))
b1 = np.ones((1,3))
b2 = np.ones((3,1))
c1 = np.append(a, b1, axis = 0)  # append b1 below a (as new rows)
c2 = np.append(a, b2, axis = 1)  # append b2 to the right of a (as new columns)
print(a)
print(b1)
print(c1)
print(b2)
print(c2)

# ---------------------- print -------------------------
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

[[1. 1. 1.]]

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [1. 1. 1.]]

[[1.]
 [1.]
 [1.]]

[[0. 0. 0. 1.]
 [0. 0. 0. 1.]
 [0. 0. 0. 1.]]

(b) Padding (np.pad)

Signature: np.pad(array, pad_width, mode, **kwargs)
Parameters:
- array: the array to pad
- pad_width: how many values to add before and after along each axis. For example ((1, 2), (2, 2)) pads 1 before and 2 after along the first axis, and 2 before and 2 after along the second axis. A single integer applies the same width to every side of every axis.
- mode: how the padding is filled, e.g. "constant" or "edge". In constant mode the fill value is given by constant_values and defaults to 0.
Returns: the padded array


# ======== 1-D array
a = np.array([1, 1, 1])
b = np.pad(a, (1,2), 'constant')                        # (1,2): pad 1 value at the front and 2 at the end
c = np.pad(a, (1,2), 'constant', constant_values=(2)) 
d = np.pad(a, (1,2), 'constant', constant_values=(0,2)) # constant_values=(0,2): 0 at the front, 2 at the end

print(b)
print(c)
print(d)

# ------------------------- print ----------------------------
[0 1 1 1 0 0]
[2 1 1 1 2 2]
[0 1 1 1 2 2]

# ======== 2-D matrix
a = np.array([[1, 1],[2,2]])
b = np.pad(a, ((1,2),(3,4)), 'constant') 
# ((1,2),(3,4)): along the first axis (rows) add 1 row before and 2 rows after, i.e. above and below;
#                along the second axis (columns) add 3 columns before and 4 after, i.e. left and right.
c = np.pad(a, ((1,2),(3,4)), 'constant', constant_values=(3)) 
d = np.pad(a, ((1,2),(3,4)), 'constant', constant_values=(0,3))   # constant_values=(0,3): 0 before and 3 after, on every axis

print(b)
print(c)
print(d)

# ------------------------- print ----------------------------
[[0 0 0 0 0 0 0 0 0]
 [0 0 0 1 1 0 0 0 0]
 [0 0 0 2 2 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0]]

[[3 3 3 3 3 3 3 3 3]
 [3 3 3 1 1 3 3 3 3]
 [3 3 3 2 2 3 3 3 3]
 [3 3 3 3 3 3 3 3 3]
 [3 3 3 3 3 3 3 3 3]]

[[0 0 0 0 0 3 3 3 3]
 [0 0 0 1 1 3 3 3 3]
 [0 0 0 2 2 3 3 3 3]
 [0 0 0 3 3 3 3 3 3]
 [0 0 0 3 3 3 3 3 3]]
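
The parameter list also mentions the "edge" mode, which the examples above do not exercise. A minimal sketch, assuming a small 1-D array chosen only for illustration:

```python
import numpy as np

a = np.array([1, 2, 3])

# 'edge' repeats the nearest border value instead of a constant
e = np.pad(a, (1, 2), 'edge')
print(e)   # [1 1 2 3 3 3]
```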

(4) Deduplication (np.unique)

(a) Remove duplicate values from an array (np.unique)

b = np.unique(a)   # sorted unique values of a
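
A short sketch of np.unique on concrete data (the input array is made up for illustration). The optional return_counts flag also reports how often each value occurs:

```python
import numpy as np

a = np.array([3, 1, 2, 3, 1, 3])

values = np.unique(a)                               # sorted, duplicates removed
values2, counts = np.unique(a, return_counts=True)  # also count occurrences

print(values)   # [1 2 3]
print(counts)   # [2 1 3]
```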

(5) Shape (a.shape, a.reshape(8,-1))

a = np.array([ [1, 2, 3],
               [4, 5, 6] ])

print(a)
print('number of dim:',a.ndim)
print('shape:', a.shape)
print('size:', a.size)


arr=np.arange(16).reshape(2,8)
print(arr)

# -------------------------- print ----------------------------
[[1 2 3]
 [4 5 6]]
number of dim: 2
shape: (2, 3)
size: 6

[[ 0  1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]]
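
The heading mentions reshape(8,-1), which the example does not show. A minimal sketch: passing -1 for one dimension asks NumPy to infer it from the total number of elements.

```python
import numpy as np

arr = np.arange(16)

b = arr.reshape(8, -1)   # -1 is inferred as 16 / 8 = 2
c = arr.reshape(-1, 4)   # -1 is inferred as 16 / 4 = 4

print(b.shape)   # (8, 2)
print(c.shape)   # (4, 4)
```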

(6) Dimensions (.ndim, np.newaxis, np.swapaxes)

(a) Number of dimensions (.ndim)

a = np.array([ [1, 2, 3],
               [4, 5, 6] ])

print('number of dim:',a.ndim)

(b) Add a dimension (np.newaxis)

>>> a = np.random.rand(4,4)
>>> a
array([[0.45284467, 0.27883581, 0.72870975, 0.03455946],
       [0.74005136, 0.52413785, 0.78433733, 0.80114353],
       [0.16559874, 0.56112999, 0.18464461, 0.38968731],
       [0.05684794, 0.50929997, 0.45789637, 0.63199181]])


>>> b = a[:,np.newaxis]   # shape (4, 4) -> (4, 1, 4)
>>> b
array([[[0.45284467, 0.27883581, 0.72870975, 0.03455946]],
       [[0.74005136, 0.52413785, 0.78433733, 0.80114353]],
       [[0.16559874, 0.56112999, 0.18464461, 0.38968731]],
       [[0.05684794, 0.50929997, 0.45789637, 0.63199181]]])


>>> c = a[0:2,np.newaxis]   # the previous call adds an axis to all of the data; a slice adds it to only part of the data, here giving shape (2, 1, 4)
>>> c
array([[[0.45284467, 0.27883581, 0.72870975, 0.03455946]],
       [[0.74005136, 0.52413785, 0.78433733, 0.80114353]]])

(c) Swap axes (np.swapaxes)

x = np.array([[1,2,3]])     # a 1x3 2-D array; larger matrices work the same way
y1 = np.swapaxes(x, 0, 1)   # equivalent to np.swapaxes(x, 1, 0)
y2 = x.swapaxes(0, 1)       # method form, same result
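
To see what swapping axes does beyond the 2-D case, here is a small sketch on a 3-D array (the shape is chosen only for illustration). Element x[i, j, k] ends up at y[k, j, i]:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

y = np.swapaxes(x, 0, 2)          # exchange the first and last axes
same = y[3, 1, 0] == x[0, 1, 3]   # the same element, addressed with swapped indices

print(y.shape)   # (4, 3, 2)
print(same)      # True
```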

(7) Joining arrays (np.append, np.concatenate, np.stack, np.vstack)

(a) np.append()

a =   [[ 0,  1,  2,  3],
       [ 4,  5,  6,  7], 
       [ 8,  9, 10, 11]]

b =   [[1,1,1,1]]

a = np.array(a)
b = np.array(b)

c = np.append(a, b, axis=0)

print(a.shape)
print(a)

print(b.shape)
print(b)

print(c.shape)
print(c)

# ------------------------------------------------------
(3, 4)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(1, 4)
[[1 1 1 1]]
(4, 4)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [ 1  1  1  1]]

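(b) np.concatenate()
np.append() joins two arrays per call (it calls np.concatenate internally), whereas np.concatenate() can join any number of arrays in one call, which makes it the better fit for large-scale joining.
numpy.concatenate((a1, a2, …), axis=0)
a1, a2, … are array-like arguments; they must have the same shape except along the axis being joined.
axis sets the direction of the join. The default axis=0 stacks the arrays row by row (vertically); axis=1 joins them side by side (horizontally).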

a=np.array([1,2,3])
b=np.array([11,22,33])
c=np.array([44,55,66])
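np.concatenate((a,b,c),axis=0)  # axis=0 is the default and may be omitted
# result: array([ 1,  2,  3, 11, 22, 33, 44, 55, 66])
# 1-D arrays have only axis 0, so axis=0 is the only valid choice here (axis=1 raises an error)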

a=np.array([[1,2,3],[4,5,6]])
b=np.array([[11,21,31],[7,8,9]])
np.concatenate((a,b),axis=0)
'''
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [11, 21, 31],
       [ 7,  8,  9]])
'''
np.concatenate((a,b),axis=1)  # axis=1 joins the corresponding rows side by side
'''
array([[ 1,  2,  3, 11, 21, 31],
       [ 4,  5,  6,  7,  8,  9]])
'''

(c) np.stack()
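
The original leaves this subsection empty. As a hedged sketch with made-up inputs: unlike np.concatenate, np.stack joins equally shaped arrays along a new axis.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

s0 = np.stack((a, b), axis=0)   # new leading axis: shape (2, 3)
s1 = np.stack((a, b), axis=1)   # new trailing axis: shape (3, 2)

print(s0)   # [[1 2 3], [4 5 6]]
print(s1)   # [[1 4], [2 5], [3 6]]
```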

(d) np.vstack()
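
This subsection is also empty in the original. A minimal sketch with illustrative inputs: np.vstack stacks arrays vertically (row-wise), treating a 1-D array of length N as a single row of shape (1, N).

```python
import numpy as np

a = np.array([1, 2, 3])          # shape (3,), treated as one row
b = np.array([[4, 5, 6],
              [7, 8, 9]])        # shape (2, 3)

v = np.vstack((a, b))            # shape (3, 3)
print(v)
```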

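(8) Repeating data (np.repeat, np.tile)

repeat: copies element by element; when axis is given, a repeat count can be set for each entry along that axis.
tile: copies the array as a whole; it cannot repeat one element 3 times and another 2 times.

(a) np.repeat()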


####  Simple case
a = np.arange(3)   # array([0, 1, 2])
    
np.repeat(a, 3)   # array([0, 0, 0, 1, 1, 1, 2, 2, 2])
    
np.tile(a, 3)     # array([0, 1, 2, 0, 1, 2, 0, 1, 2])
    
####  Multi-dimensional case
b = np.arange(6).reshape(2,3)
# array([[0, 1, 2],
#         [3, 4, 5]])

np.repeat(b, [3,2], axis=0) # along axis 0, repeat the first row 3 times and the second row 2 times
'''
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [3, 4, 5],
       [3, 4, 5]])
'''

np.tile(b, (3, 2))
'''
array([[0, 1, 2, 0, 1, 2],
       [3, 4, 5, 3, 4, 5],
       [0, 1, 2, 0, 1, 2],
       [3, 4, 5, 3, 4, 5],
       [0, 1, 2, 0, 1, 2],
       [3, 4, 5, 3, 4, 5]])
'''

(b) np.tile()

np.tile is shown side by side with np.repeat in the examples under (a).

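(9) Computation (x.min/max, x.sum, np.dot)

(a) Minimum and maximum (x.min)
import numpy as np
a = np.array([[1,5,3],[4,2,6]])
print(a.min())  # no argument: minimum over all elements -> 1
print(a.min(0)) # axis=0: minimum of each column -> [1 2 3]
print(a.min(1)) # axis=1: minimum of each row -> [1 2]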

(b) Sum (x.sum)

a = np.array([[0, 2, 1]])
print(a.sum())
print(a.sum(axis=0))
print(a.sum(axis=1))
The results are, in order: 3, [0 2 1], [3]

(c) Multiplication (np.dot)

1) Inner product / matrix product: np.dot(x, y)

2) Element-wise product: np.multiply(), or *

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [4, 7, 1]])

c = a * b
d = np.multiply(a, b)
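
To contrast the two kinds of product, here is a small sketch that reuses the arrays a and b from above. Transposing b for np.dot is an assumption made so that the shapes are compatible:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
b = np.array([[7, 8, 9], [4, 7, 1]])   # shape (2, 3)

elementwise = a * b          # same as np.multiply(a, b), shape (2, 3)
matrix = np.dot(a, b.T)      # (2, 3) x (3, 2) -> (2, 2)
inner = np.dot(a[0], b[0])   # 1-D inputs give a scalar inner product

print(elementwise)   # [[ 7 16 27], [16 35  6]]
print(matrix)        # [[ 50  21], [122  57]]
print(inner)         # 50
```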

(d) Square (np.square)

np.square(x_data)   # element-wise square of x_data

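(e) Mean / standard deviation (np.mean, np.std)

# X.mean(axis=1, keepdims=True)  mean of each row of X (keepdims keeps the result 2-D)


a = [5, 6, 16, 9]
np.mean(a)   # 9.0

np.std([1,2,3])              # population standard deviation (ddof=0)
np.std([1,2,3], ddof=1)      # sample standard deviation (divides by N - 1) -> 1.0
np.std(X, axis=0, ddof=1)    # sample standard deviation of each column of X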

(f) Norm (np.linalg.norm)

np.linalg.norm(X, axis=1, keepdims=True)  # L2 norm of each row vector of X
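
A common use of the row-wise norm is to normalise each row to unit length. A minimal sketch, assuming a small made-up matrix X:

```python
import numpy as np

X = np.array([[3.0, 4.0],
              [6.0, 8.0]])

norms = np.linalg.norm(X, axis=1, keepdims=True)   # shape (2, 1): [[5.], [10.]]
X_unit = X / norms                                 # broadcasting divides each row by its norm

print(norms)
print(X_unit)   # every row is [0.6 0.8]
```

keepdims=True keeps norms as a column, so the division broadcasts across each row without a reshape.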

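(g) Logarithms (base e, 2, 10, or any base)
https://blog.csdn.net/Fantine_Deng/article/details/104749807

import math
import numpy as np

# ======== numpy 
x = 2

a1 = np.log(x)         #  natural logarithm (base e)
b1 = np.log10(x)       #  base-10 logarithm
c1 = np.log2(x)        #  base-2 logarithm
d1 = np.log1p(x)       #  equivalent to np.log(x + 1)

e1 = np.log(x)/np.log(3)   # log of 2 to base 3; NumPy needs the change-of-base formula for an arbitrary base


# ========  math  (same as NumPy, except for the arbitrary base)
a2 = math.log(x)	# natural logarithm (base e)
b2 = math.log10(x)	# base-10 logarithm
c2 = math.log2(x)	# base-2 logarithm
d2 = math.log1p(x)	# equivalent to math.log(x + 1); used for data smoothing

e2 = math.log(x, 3)  # the second argument, 3, is the base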

print(a1, b1, c1, d1, e1)
print(a2, b2, c2, d2, e2)

# ----------------------- print -----------------------------
0.6931471805599453 0.3010299956639812 1.0 1.0986122886681098 0.6309297535714574
0.6931471805599453 0.3010299956639812 1.0 1.0986122886681098 0.6309297535714574

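Note: np.expm1(x) is equivalent to np.exp(x) - 1 and is the inverse of np.log1p(x).

Differences between numpy and math:

In the math library, the input x must be a single number.
In the NumPy library, the input x can be a single number, a list, or a NumPy array.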

(h) Exponentials
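
The original gives no example for this subsection. A hedged sketch using np.exp and np.power, mirroring the logarithm functions above (the input values are illustrative):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])

e_x = np.exp(x)          # e ** x, element-wise
two_x = np.power(2, x)   # 2 ** x, element-wise
back = np.log(e_x)       # np.log undoes np.exp

print(e_x)
print(two_x)   # [1. 2. 4.]
print(back)
```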

(i) Check whether two arrays have the same values (np.allclose, ==)

np.allclose(a, b)   # True if all elements are equal within a tolerance
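
The difference between == and np.allclose shows up with floating-point rounding. A small sketch with values chosen to trigger it:

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

exact = (a == b)             # element-wise exact comparison
all_exact = np.array_equal(a, b)
close = np.allclose(a, b)    # comparison within a tolerance

print(exact)       # [False  True]
print(all_exact)   # False
print(close)       # True
```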

(10) All True / any True (np.all(), np.any())

np.all(arr)   # logical AND over all elements of arr: True only if every element is True
np.any(arr)   # logical OR over all elements of arr: True if at least one element is True
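
Both functions also accept an axis argument. A minimal sketch on a made-up boolean matrix m:

```python
import numpy as np

m = np.array([[True, False],
              [True, True]])

all_total = np.all(m)          # False: one element is False
any_total = np.any(m)          # True: at least one element is True
all_cols = np.all(m, axis=0)   # per column: [True, False]
any_rows = np.any(m, axis=1)   # per row:    [True, True]

print(all_total, any_total)
print(all_cols, any_rows)
```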

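(11) Conditional lookup (np.where(), np.argwhere())

(a) np.where(condition, x, y)

With condition, x, and y all given: where condition holds the output takes the value from x, otherwise from y. For 1-D arrays this is equivalent to

[xv if c else yv for (c,xv,yv) in zip(condition,x,y)]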


a = np.arange(10)
b = np.where(a,1,-1)
print(b)

b = np.where(a > 5,1,-1)
print(b)

b = np.where([[True,False], [True,True]],    # example from the official documentation
             [[1,2], [3,4]],
             [[9,8], [7,6]])
print(b)

a = 10
b = np.where([[a > 5,a < 5], [a == 10,a == 7]],
             [["chosen","not chosen"], ["chosen","not chosen"]],
             [["not chosen","chosen"], ["not chosen","chosen"]])
print(b)


[-1  1  1  1  1  1  1  1  1  1]
[-1 -1 -1 -1 -1 -1  1  1  1  1]
[[1 8]
 [3 4]]
[['chosen' 'chosen']
 ['chosen' 'chosen']]

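(b) np.where(condition)
With only condition and no x or y, the output is the coordinates of the elements that satisfy the condition (i.e. are nonzero), equivalent to numpy.nonzero. The coordinates are returned as a tuple that contains one array per dimension of the input, each holding the indices along that dimension.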

a = np.array([2,4,6,8,10])
b = np.where(a > 5)				# returns the indices
print(b)
b = a[np.where(a > 5)]  			# equivalent to a[a>5]
print(b)
b = np.where([[0, 1], [1, 0]])
print(b)

(array([2, 3, 4], dtype=int64),)
[ 6  8 10]
(array([0, 1], dtype=int64), array([1, 0], dtype=int64))

(c) np.argwhere()

a = [[1, 2, 9], [4, 5, 6], [7, 8, 9]]
a = np.array(a)

idx1 = np.where(a>4)
idx2 = np.argwhere(a>4)

print(idx1)
print(idx2)


b1 = a[idx1]
b2 = a[idx2]  # idx2 has shape (N, 2), so this picks whole rows of a (result shape (N, 2, 3)), not single elements

print(b1)
print(b2)


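idxs = idx2
array = a

# group the selected elements by row, using the (row, col) pairs from np.argwhere
out, vec = [], []
for i, idx in enumerate(idxs):
    vec.append(array[idx[0], idx[1]])
    # close the current group at the last index, or when the next index is in a different row
    if i == len(idxs)-1 or idx[0] != idxs[i+1][0]:
        out.append(vec)
        vec = []
print(out)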
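
The relationship between the two return formats can be sketched as follows, reusing the same matrix. np.where returns one index array per axis, while np.argwhere returns one (row, col) pair per match:

```python
import numpy as np

a = np.array([[1, 2, 9], [4, 5, 6], [7, 8, 9]])

rows, cols = np.where(a > 4)   # tuple of per-axis index arrays
pairs = np.argwhere(a > 4)     # shape (6, 2): one (row, col) pair per match

picked = a[rows, cols]                        # element-wise selection
picked_pairs = a[pairs[:, 0], pairs[:, 1]]    # same selection from the pairs

print(pairs.tolist())   # [[0, 2], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2]]
print(picked)           # [9 5 6 7 8 9]
```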

(12) Loading and saving: npy (np.load, np.save), txt (np.loadtxt, np.savetxt)

(a) npy

np.save("a.npy", a)
c = np.load( "a.npy" )
print(c)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

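(b) txt

# ======== load
scores = np.loadtxt(Path_txt, dtype=np.float32)   # Path_txt: path to the text file

# ======== save
np.savetxt('a.txt', a, fmt='%0.8f')
# 1st argument: name of the file to save to
# 2nd argument: the array to save
# 3rd argument: output format of the values; see the documentation for details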
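
A round trip through both formats can be sketched like this. The temporary directory and file names are assumptions for illustration, used so that the example leaves no files behind:

```python
import os
import tempfile
import numpy as np

a = np.arange(12).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "a.npy")
    txt_path = os.path.join(tmp, "a.txt")

    np.save(npy_path, a)                 # binary format, keeps dtype and shape
    from_npy = np.load(npy_path)

    np.savetxt(txt_path, a, fmt="%d")    # text format, one row per line
    from_txt = np.loadtxt(txt_path, dtype=np.int64)

print(from_npy.dtype == a.dtype, np.array_equal(from_npy, a))
print(from_txt.shape, np.array_equal(from_txt, a))
```

The .npy file restores dtype and shape on its own; with the text file the dtype has to be passed to np.loadtxt again.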

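(13) Sorting (np.sort, np.argsort)

(a) Return the sorted result (np.sort)

a = np.array([[4,3,5],[1,2,1]])
print (a)
b = np.sort(a, axis=1) # sort each row of a in ascending order; np.sort has no descending option, so reverse the result (e.g. b[:, ::-1]) if needed
print (b)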
# ======================= 
[[4 3 5]
 [1 2 1]]
[[3 4 5]
 [1 1 2]]
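
Since np.sort only sorts in ascending order, a descending sort has to be built from it. Two common workarounds, sketched on the same array:

```python
import numpy as np

a = np.array([[4, 3, 5], [1, 2, 1]])

desc_reversed = np.sort(a, axis=1)[:, ::-1]   # sort ascending, then reverse each row
desc_negated = -np.sort(-a, axis=1)           # sort the negated values, then negate back

print(desc_reversed)   # [[5 4 3], [2 1 1]]
```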

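(b) Return the indices (np.argsort)

a = np.array([4, 3, 1, 2])
b = np.argsort(a) # indices that sort a in ascending order
print (b)
print (a[b]) # a in ascending order
c = b[::-1]  # reversed indices give descending order
print(c)
print (a[c]) # a in descending order

# ========================
[2 3 1 0]
[1 2 3 4]
[0 1 3 2]
[4 3 2 1]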

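(14) Random numbers (np.random)

from: https://blog.csdn.net/jinxiaonian11/article/details/53143141

Function	Purpose	Parameters
rand(d0, d1, …, dn)	uniformly distributed random numbers in [0, 1)	dn is the size of the n-th dimension
randn(d0, d1, …, dn)	standard normal random numbers	dn is the size of the n-th dimension
randint(low[, high, size, dtype])	random integers	low: lower bound (inclusive); high: upper bound (exclusive); size: output shape
random_sample([size])	random numbers in [0, 1)	size: shape of the output, as a tuple or list; [2,3] gives a 2-D array of shape (2, 3)
random([size])	same as random_sample([size])	same as random_sample([size])
ranf([size])	same as random_sample([size])	same as random_sample([size])
sample([size])	same as random_sample([size])	same as random_sample([size])
choice(a[, size, replace, p])	random selection from a	a: 1-D array; size: shape of the output
bytes(length)	random bytes	length: number of bytes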

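(a) Random integers (np.random.randint)

Function prototype: np.random.randint(low, high=None, size=None, dtype='l')
Parameters:
- low (int): generated values are >= low. When high is None, values are drawn from [0, low) instead.
- high (int): generated values are < high, i.e. drawn from [low, high).
- size (int or tuple of ints): shape of the output ndarray; defaults to None, in which case a single value is returned.
- dtype (optional): desired dtype of the output, e.g. int64 or int; defaults to int.
Returns: an int, or an ndarray of ints when size is given.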

a = np.random.randint(2)
b = np.random.randint(2, 4, size=6)
c = np.random.randint(2, 4, (2,3))

# -------------------------- print --------------------------------------
0 

[2 2 3 3 2 3]

[[3 3 3]
 [2 2 2]]

(b) Uniform distribution (np.random.rand)

np.random.rand(d0, d1, d2, …, dn): returns one sample or an array of samples drawn from the uniform distribution over [0, 1); 1 itself is excluded.

a = np.random.rand(4,3)

# -------------------------- print --------------------------------------
[[0.06545033 0.80108246 0.14400236]
 [0.84810368 0.97065713 0.67829134]
 [0.16764187 0.63566725 0.46231684]
 [0.22415399 0.53846922 0.89692351]]

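(c) Normal distribution (np.random.normal)

Function prototype: numpy.random.normal(loc=0.0, scale=1.0, size=None)
Parameters:
- loc: float. Mean (the centre of the distribution).
- scale: float. Standard deviation (the width of the distribution; a larger scale gives a shorter, wider curve, a smaller scale a taller, narrower one).
- size: int or tuple of ints. Shape of the output; defaults to None, which returns a single value.
Returns: a float, or an ndarray of floats when size is given.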

a = np.random.normal(0,1)
b = np.random.normal(1,2, (2,3))
c = np.random.normal(0, 0.02, b.shape)  # normal random numbers with mean 0, standard deviation 0.02, and the same shape as b

# -------------------------- print --------------------------------------
-2.586554443355694 

[[ 1.59384679  4.08205269  3.69789615]
 [-0.17051646  1.67723559  0.65662088]]

[[ 0.03207627  0.02375576 -0.04414936]
 [ 0.0006749  -0.02088935  0.01176298]]

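(d) Standard normal distribution (np.random.randn(d0, d1, d2, …, dn), standard_normal(size=None))

randn(d0, d1, …, dn): returns an array of shape (d0, d1, …, dn) drawn from the standard normal distribution (mean 0, standard deviation 1). It is called the same way as np.random.rand().
standard_normal(size=None): like randn, returns an array drawn from the standard normal distribution; the difference is that its shape is given by the size argument, which must be a tuple for multi-dimensional arrays.

Notes:

With no arguments, the function returns a single float.
With one argument, it returns a rank-1 (1-D) array, which is neither a row vector nor a column vector.
With two or more arguments, it returns an array of the corresponding dimensions, which can represent a vector or a matrix.
np.random.standard_normal() is similar to np.random.randn(), but np.random.standard_normal() takes its shape as a tuple.
np.random.randn() expects integer arguments; older NumPy versions truncated float arguments to integers, while recent versions raise a TypeError.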

a = np.random.randn(2,3)
b = np.random.standard_normal((2,3))

# -------------------------- print --------------------------------------
[[-0.86736236 -1.01741523  0.28397295]
 [ 0.13975495  0.09078188 -0.38497118]]

[[-0.71248137  0.85849519  0.1336972 ]
 [-0.14742793  0.35682006 -0.28483632]]

(e) Random seed (np.random.seed())

np.random.seed(1) # any value works; the same seed always produces the same sequence

a=[]
for i in range(10):
    a0=np.random.randint(0,10)
    a.append(a0)
print(a) # the result is the same on every run

# -------------------------- print --------------------------------------
[5, 8, 9, 5, 0, 0, 1, 7, 6, 9]

np.random.seed(1) 
a = []
for i in range(10):
    a0=np.random.randint(0,9)  # range changed to [0, 9)
    a.append(a0)
print(a)

# -------------------------- print --------------------------------------
[5, 8, 5, 0, 0, 1, 7, 6, 2, 4]

np.random.seed(1) # any value works; the same seed always produces the same sequence
i=np.random.randint(0,10) # draw one random integer in [0, 10)
j=np.random.randint(0,10)
print(i)
print(j)

# -------------------------- print --------------------------------------
5
8
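
The reproducibility that the seed provides can be checked directly. A minimal sketch that draws twice after resetting the same seed:

```python
import numpy as np

np.random.seed(1)
first = np.random.randint(0, 10, size=5)

np.random.seed(1)                      # reset to the same seed
second = np.random.randint(0, 10, size=5)

same = np.array_equal(first, second)
print(same)   # True
```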

(15) Matrix transpose (.transpose)

test = np.array([[12,4,7,0],[3,7,45,81]])

# value of test
array([[12,  4,  7,  0],
       [ 3,  7, 45, 81]])

# transpose test
test.transpose()

# result after transposing
array([[12,  3],
       [ 4,  7],
       [ 7, 45],
       [ 0, 81]])

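(16) Matrix rotation (np.rot90)

In NumPy, rotation is usually done with numpy.rot90, which rotates an image by multiples of 90 degrees.
The key parameter k is the number of 90-degree rotations. Typical values are 1, 2, 3, meaning 90, 180, and 270 degrees; k can also be negative: -1, -2, -3. A positive k rotates counterclockwise, a negative k clockwise.

import cv2
import numpy as np

def rotateClockWise90ByNumpy(img_file):  # np.rot90(img, -1) rotates 90 degrees clockwise
    img = cv2.imread(img_file)
    img90 = np.rot90(img, -1)
    return img90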
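
The sign convention of k is easy to verify on a tiny matrix instead of an image. A sketch with a made-up 2x2 array:

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])

ccw = np.rot90(m, 1)    # 90 degrees counterclockwise
cw = np.rot90(m, -1)    # 90 degrees clockwise
half = np.rot90(m, 2)   # 180 degrees

print(ccw)    # [[2 4], [1 3]]
print(cw)     # [[3 1], [4 2]]
print(half)   # [[4 3], [2 1]]
```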

(17) Element data type and conversion (a.dtype, a.astype)

(a) Element data type (a.dtype)

print( a.dtype )

(b) Element type conversion (a.astype)

b = a.astype( np.uint8 )
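
A short sketch of what astype does to the values (inputs are illustrative). Converting floats to an integer type truncates toward zero, and astype returns a new array rather than changing the original:

```python
import numpy as np

a = np.array([1.7, -1.7, 3.9])
b = a.astype(np.int32)    # fractional parts are dropped

print(a.dtype)   # float64
print(b.dtype)   # int32
print(b)         # [ 1 -1  3]
```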

(18) Invalid values NaN, Inf (np.isnan, np.isinf, np.nan_to_num)

import numpy as np
a = np.array([[np.nan, np.nan, 1, 2], [np.inf, np.inf, 3, 4], [1, 1, 1, 1], [2, 2, 2, 2]])
print(a)
where_are_nan = np.isnan(a)
where_are_inf = np.isinf(a)
a[where_are_nan] = 0
a[where_are_inf] = 0
print(a)
print(np.mean(a))


b = np.array([[1, 2],
              [3, 4]])

nan_any = np.isnan(a).any()
nan_inf = np.isinf(a).any()
print( nan_any )
print( nan_inf )
nan_any = np.isnan(b).any()
nan_inf = np.isinf(b).any()
print( nan_any )
print( nan_inf )
nan_any = np.isnan(b).all()
nan_inf = np.isinf(b).all()
print( nan_any )
print( nan_inf )


np.nan_to_num(np.nan)  # converts nan to 0.0
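
np.nan_to_num also works on whole arrays. A hedged sketch, assuming NumPy 1.17 or later for the nan, posinf, and neginf keyword arguments; the replacement values are arbitrary:

```python
import numpy as np

a = np.array([np.nan, np.inf, -np.inf, 1.5])

# replace NaN and infinities with chosen finite values
cleaned = np.nan_to_num(a, nan=0.0, posinf=1e6, neginf=-1e6)

print(cleaned)
```

Without the keyword arguments, NaN becomes 0.0 and the infinities become very large finite numbers.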

(19) Array format conversion (<=> bytes, list)

(a) np.array <=> bytes (bytes, .tobytes)

# ======================================  bytes() ==============================
# ========== int
a_int = 12
print('#### a_int: ', type(a_int), a_int)
a_bytes = bytes(a_int)   # note: bytes(12) creates 12 zero bytes; it does not encode the value 12
print('#### a_bytes: ', type(a_bytes), a_bytes)


a_int = np.array(12)
print('#### a_int: ', type(a_int), a_int.dtype, a_int.shape, a_int)
a_bytes = bytes(a_int)
print('#### a_bytes: ', type(a_bytes), a_bytes)


a_int = np.array([[12, 13, 14], 
                  [0, 1, -3]])
print('#### a_int: ', type(a_int), a_int.dtype, a_int.shape, a_int)
a_bytes = bytes(a_int)
print('#### a_bytes: ', type(a_bytes), a_bytes)

# ========== float
print('######################### float')
'''
# ==== ERROR: cannot convert 'float' object to bytes
a_float = 12.0
print('#### a_float: ', type(a_float), a_float)
a_bytes = bytes(a_float)
print('#### a_bytes: ', type(a_bytes), a_bytes)
'''

a_float = np.array(12.0)
a_float = a_float.astype(np.float32)
print('#### a_float: ', type(a_float), a_float.dtype, a_float.shape, a_float)
a_bytes = bytes(a_float)
print('#### a_bytes: ', type(a_bytes), a_bytes)


a_float = np.array([[12.1, 13.3, 14.0], 
                    [0.2, 1.3, -3.1]])
a_float = a_float.astype(np.float32)
print('#### a_float: ', type(a_float), a_float.dtype, a_float.shape, a_float)
a_bytes = bytes(a_float)
print('#### a_bytes: ', type(a_bytes), a_bytes)

# ========== bytes   to   numpy float
'''
# ==== UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9a in position 0: invalid start byte
a_float = bytes.decode(a_bytes)
print('#### a_float: ', type(a_float), a_float.dtype, a_float.shape, a_float)
'''

a_float = np.frombuffer(a_bytes, dtype=np.float32) # numpy array
print('#### a_float: ', type(a_float), a_float.dtype, a_float.shape, a_float)



# ======================================  .tobytes() ==============================
Img = Img.tobytes()  # convert to bytes (Img is assumed to be a NumPy array, e.g. an image)
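
A full round trip with .tobytes() and np.frombuffer can be sketched as follows. The raw bytes carry neither dtype nor shape, so both have to be supplied again when decoding (here they are simply taken from the original array):

```python
import numpy as np

arr = np.array([[12.1, 13.3, 14.0],
                [0.2, 1.3, -3.1]], dtype=np.float32)

raw = arr.tobytes()   # 6 elements * 4 bytes each = 24 bytes

# frombuffer returns a flat, read-only array; reshape restores the layout
restored = np.frombuffer(raw, dtype=arr.dtype).reshape(arr.shape)

print(len(raw))                        # 24
print(np.array_equal(restored, arr))   # True
```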