NumPy Basics
I. Data Types and Array Creation
1. Constants
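- numpy.nan
Represents a missing value ("not a number"). Older NumPy releases also exposed the aliases NaN and NAN; NumPy 2.0 removed them.
Two numpy.nan values never compare equal: np.nan != np.nan evaluates to True.
- numpy.inf
Represents positive infinity. Older NumPy releases also exposed the aliases Inf, infty, Infinity, and PINF; NumPy 2.0 removed them.
- numpy.pi
Represents the mathematical constant π.
- numpy.e
Represents Euler's number, the base of the natural logarithm.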
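Because NaN never compares equal to itself, an equality test cannot be used to find missing values. The following is a minimal sketch of the usual alternative, `np.isnan`; the variable names (`values`, `nan_mask`, and so on) are illustrative and do not come from the original text.

```python
import numpy as np

# NaN is not equal to anything, including itself.
nan_equal = (np.nan == np.nan)

# np.isnan tests for NaN element-wise, so it also works on arrays.
values = np.array([1.0, np.nan, 3.0])
nan_mask = np.isnan(values)
nan_count = int(nan_mask.sum())

# Infinity compares as larger than any finite number.
inf_is_largest = np.inf > 1e308
inf_detected = bool(np.isinf(-np.inf))

# pi and e behave like ordinary Python floats.
e_matches_exp = bool(np.isclose(np.exp(1.0), np.e))

print(nan_equal, nan_mask, nan_count)
print(inf_is_largest, inf_detected, e_matches_exp)
```

Counting the `True` entries of the mask is a common way to report how many values are missing in an array.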
2. Data Types
Character | Type | Notes |
---|---|---|
b | boolean | 'b1' |
i | signed integer | 'i1', 'i2', 'i4', 'i8' |
u | unsigned integer | 'u1', 'u2', 'u4', 'u8' |
f | floating-point | 'f2', 'f4', 'f8' |
c | complex floating-point | 'c8', 'c16' |
m | timedelta64 | time interval (duration) type |
M | datetime64 | date and time type |
O | object | |
S | (byte-)string | 'S3' is a byte string of length 3 |
U | Unicode | Unicode string |
V | void | |
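Every NumPy data type is represented by an instance of the dtype class. A simplified sketch of its constructor:
class dtype(object):
    def __init__(self, obj, align=False, copy=False):
        pass
The examples below build dtype objects from the type strings in the table and print the scalar type and the size in bytes of one element.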
import numpy as np
a = np.dtype('b1')
print(a.type) # <class 'numpy.bool_'>
print(a.itemsize) # 1
a = np.dtype('i1')
print(a.type) # <class 'numpy.int8'>
print(a.itemsize) # 1
a = np.dtype('i2')
print(a.type) # <class 'numpy.int16'>
print(a.itemsize) # 2
a = np.dtype('i4')
print(a.type) # <class 'numpy.int32'>
print(a.itemsize) # 4
a = np.dtype('i8')
print(a.type) # <class 'numpy.int64'>
print(a.itemsize) # 8
a = np.dtype('u1')
print(a.type) # <class 'numpy.uint8'>
print(a.itemsize) # 1
a = np.dtype('u2')
print(a.type) # <class 'numpy.uint16'>
print(a.itemsize) # 2
a = np.dtype('u4')
print(a.type) # <class 'numpy.uint32'>
print(a.itemsize) # 4
a = np.dtype('u8')
print(a.type) # <class 'numpy.uint64'>
print(a.itemsize) # 8
a = np.dtype('f2')
print(a.type) # <class 'numpy.float16'>
print(a.itemsize) # 2
a = np.dtype('f4')
print(a.type) # <class 'numpy.float32'>
print(a.itemsize) # 4
a = np.dtype('f8')
print(a.type) # <class 'numpy.float64'>
print(a.itemsize) # 8
a = np.dtype('S')
print(a.type) # <class 'numpy.bytes_'>
print(a.itemsize) # 0
a = np.dtype('S3')
print(a.type) # <class 'numpy.bytes_'>
print(a.itemsize) # 3
a = np.dtype('U3')
print(a.type) # <class 'numpy.str_'>
print(a.itemsize) # 12
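The type strings from the table can also be passed directly as the `dtype` argument when creating an array. The sketch below is one way to check how the character code, the element size, and the total memory use relate to each other; the array names are hypothetical.

```python
import numpy as np

# A 16-bit signed integer array: kind 'i', 2 bytes per element.
small_ints = np.array([1, 2, 3], dtype='i2')
print(small_ints.dtype.kind, small_ints.itemsize, small_ints.nbytes)

# 'U3' stores up to 3 Unicode characters at 4 bytes each, hence itemsize 12.
names = np.array(['abc', 'de'], dtype='U3')
print(names.dtype.kind, names.dtype.itemsize)

# 'S3' stores up to 3 bytes per element.
raw = np.array([b'abc', b'de'], dtype='S3')
print(raw.dtype.kind, raw.dtype.itemsize)

# The kind character can be read back from any dtype object.
float_kind = np.dtype('f8').kind
bool_kind = np.dtype('b1').kind
```

`nbytes` is simply `itemsize` multiplied by the number of elements, which is why the three-element 'i2' array occupies 6 bytes.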
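3. Dates and Times
NumPy can convert strings into a date-time type (datetime64). The unit of the result (day, month, second, and so on) is inferred from the precision of the string.
Creating datetime64 objects: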
import numpy as np
a = np.datetime64('2020-03-01')
print(a, a.dtype) # 2020-03-01 datetime64[D]
a = np.datetime64('2020-03')
print(a, a.dtype) # 2020-03 datetime64[M]
a = np.datetime64('2020-03-08 20:00:05')
print(a, a.dtype) # 2020-03-08T20:00:05 datetime64[s]
a = np.datetime64('2020-03-08 20:00')
print(a, a.dtype) # 2020-03-08T20:00 datetime64[m]
a = np.datetime64('2020-03-08 20')
print(a, a.dtype) # 2020-03-08T20 datetime64[h]
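When the inferred unit is not the one needed, the unit can be given explicitly. The following is a small sketch under that assumption; the variable names are illustrative only.

```python
import numpy as np

# Pass the unit as a second argument: a month string stored with day precision.
month_as_day = np.datetime64('2020-03', 'D')
print(month_as_day, month_as_day.dtype)

# Convert an existing value to a coarser unit; the day component is dropped.
day_as_month = np.datetime64('2020-03-08').astype('datetime64[M]')
print(day_as_month, day_as_month.dtype)
```

Converting to a coarser unit discards information, so the day cannot be recovered from `day_as_month` afterwards.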
Creating a datetime64 array (all elements share one unit, the finest one found among the input strings):
import numpy as np
a = np.array(['2020-03', '2020-03-08', '2020-03-08 20:00'], dtype='datetime64')
print(a, a.dtype)
# ['2020-03-01T00:00' '2020-03-08T00:00' '2020-03-08T20:00'] datetime64[m]
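Rather than relying on the unit inferred from the input strings, the unit can be fixed in the dtype string. This is a minimal sketch; `stamps` is a hypothetical name.

```python
import numpy as np

# Request second precision regardless of how precise each string is.
stamps = np.array(['2020-03', '2020-03-08'], dtype='datetime64[s]')
print(stamps, stamps.dtype)

# Missing components are filled with the earliest possible value.
first_is_month_start = stamps[0] == np.datetime64('2020-03-01T00:00:00')
```

Fixing the unit up front keeps the dtype stable when the input strings vary in precision from one run to the next.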
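Generating an array of dates over a range (the end value is excluded, and the step is one unit of the inferred precision):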
import numpy as np
a = np.arange('2020-08-01', '2020-08-10', dtype=np.datetime64)
print(a)
# ['2020-08-01' '2020-08-02' '2020-08-03' '2020-08-04' '2020-08-05'
# '2020-08-06' '2020-08-07' '2020-08-08' '2020-08-09']
print(a.dtype) # datetime64[D]
a = np.arange('2020-08-01 20:00', '2020-08-10', dtype=np.datetime64)
print(a)
# ['2020-08-01T20:00' '2020-08-01T20:01' '2020-08-01T20:02' ...
# '2020-08-09T23:57' '2020-08-09T23:58' '2020-08-09T23:59']
print(a.dtype) # datetime64[m]
a = np.arange('2020-05', '2020-12', dtype=np.datetime64)
print(a)
# ['2020-05' '2020-06' '2020-07' '2020-08' '2020-09' '2020-10' '2020-11']
print(a.dtype) # datetime64[M]
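The ranges above always advance by one unit. One way to use a different step, sketched here with illustrative names, is to add an array of timedelta64 offsets to a start date; timedelta64 is the 'm' type from the data type table.

```python
import numpy as np

start = np.datetime64('2020-08-01')

# Offsets of 0, 2, 4, 6 and 8 days, expressed as timedelta64 values.
offsets = np.arange(0, 9, 2).astype('timedelta64[D]')
every_other_day = start + offsets
print(every_other_day)

# Subtracting two dates yields a timedelta64 in the same unit.
span = np.datetime64('2020-08-10') - start
print(span, span.dtype)
```

Keeping the offsets as a separate array makes irregular steps just as easy, since any integer array can be converted the same way.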
II. Usage Steps
1. Import the libraries
The code is as follows (example):
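import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
# Suppress all warnings to keep the output uncluttered.
warnings.filterwarnings('ignore')
import ssl
# Disable HTTPS certificate verification so that the download below does not
# fail on certificate errors. This is unsafe outside of a tutorial setting.
ssl._create_default_https_context = ssl._create_unverified_context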
2. Read the data
The code is as follows (example):
data = pd.read_csv(
'https://labfile.oss.aliyuncs.com/courses/1283/adult.data.csv')
print(data.head())
The data here is fetched over the network from the URL.
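Summary
That is all for today. This article only briefly introduced the use of pandas, which provides a large number of functions and methods for processing data quickly and conveniently.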