1 Features of the Python language
- Concise syntax
- Cross-platform
- Rich standard and third-party libraries
- Open source
- Extensible
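2 Python history
2.1 Timeline
1991: Python is first released (development began in late 1989)
2000: Python 2.0 is released
2008: Python 3.0 is released
2010: Python 2.7 is released (the last 2.x version)
2.2 Popular Python distributions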
https://mirror.tuna.tsinghua.edu.cn/help/anaconda/
2.3 Common Python tools
2.3.1 Official Python documentation
https://www.python.org/doc
2.3.2 ipython
https://www.ipython.org
2.3.3 jupyter notebook
http://jupyter-notebook.readthedocs.io/en/latest
2.3.4 sublime text
https://www.sublimetext.com
2.3.5 PyCharm
https://www.jetbrains.com/pycharm/
2.3.6 Pip
https://pip.pypa.io/en/stable/installing/
3 Installing Python
Verification:
4 Python coding conventions
5 Basic data types
Integer (int)
Floating-point number (float)
String (str)
Boolean (bool)
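The four basic types listed above can be inspected with the built-in type() function. The sketch below is only an illustration; the sample values are arbitrary and not taken from the notes.

```python
# Each literal below belongs to one of the four basic types.
samples = [123, 3.14, 'hello', True]
type_names = [type(value).__name__ for value in samples]
print(type_names)

# bool is a subclass of int, so True behaves like 1 in arithmetic.
is_int = isinstance(True, int)
print(is_int, True + 1)
```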
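6 Defining variables and common operations
Example: a network bandwidth calculator
Bit
Line speed is measured in bits (bit).
Byte
Computers store data in bytes (byte); 1 byte = 8 bits.
Variable
a = 123
Here "a" is the variable name, "=" is the assignment operator, and "123" is the value assigned to the variable.
Example:
print('hello network_bandwidth')
# Network bandwidth calculation: divide the line speed in bits by 8 to get bytes
print(100/8)
bandwidth = 100
ratio = 8
print(bandwidth/ratio)
# Naming conventions:
# Short names such as a are acceptable for temporary variables
# Otherwise use meaningful English words: lowercase (bandwidth), CamelCase (BandWidth), or snake_case (band_width)
# A name may start with a letter or an underscore, but a leading underscore is rare and reserved for names with a special meaning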
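The bit-to-byte conversion above can be wrapped in a small helper. This is a minimal sketch; the names BITS_PER_BYTE and bits_to_bytes are hypothetical and chosen only for illustration.

```python
BITS_PER_BYTE = 8  # 1 byte = 8 bits


def bits_to_bytes(bits_per_second):
    """Convert a line speed in bits per second to bytes per second."""
    return bits_per_second / BITS_PER_BYTE


download_rate = bits_to_bytes(100)
print(download_rate)
```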
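7 Sequences
Sequence: a type whose members are ordered and can be accessed, one or several at a time, by index offset. Strings, lists, and tuples are all sequences.
String: "abcd"
List: [0, "abcd"]
Tuple: ("abc", "def")
Example:
# Store the Chinese zodiac animals and look one up by year
# When to use single quotes and when to use double quotes:
# if the text contains an apostrophe, as in "that's", wrap it in double quotes; otherwise single quotes are fine
chinese_zediac = '猴鸡狗猪鼠牛虎兔龙蛇马羊猴鸡狗猪'
# [0:4] selects the elements at indexes 0 to 3 (everything before index 4)
print(chinese_zediac[0:4])
# [-1] selects the last element
print(chinese_zediac[-1])
Result: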
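8 Defining and using strings
year = 2018
# The string starts with 猴 (Monkey) because a year divisible by 12 is a Monkey year
print(year%12)
print(chinese_zediac[year%12])
Result: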
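9 Common string operations
Basic sequence operations
Membership (in, not in): object [not] in sequence
Concatenation operator (+): sequence + sequence
Repetition operator (*): sequence * integer
Slice operator ([:]): sequence[start:stop]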
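The four sequence operations above can be tried on a string. The snippet is a small sketch using a 12-character version of the zodiac string; the variable names are illustrative only.

```python
zodiac = '猴鸡狗猪鼠牛虎兔龙蛇马羊'

has_dog = '狗' in zodiac             # membership test
joined = zodiac[0:2] + zodiac[-2:]   # concatenation of two slices
repeated = zodiac[0] * 3             # repetition
first_four = zodiac[:4]              # slice from the start up to index 4

print(has_dog, joined, repeated, first_four)
```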
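10 Defining tuples and common operations
Tuples are compared element by element, from left to right. As a mnemonic, (1,20) can be read as 120 and (2,20) as 220: (1,20) > (2,20) asks whether 120 is greater than 220, so the result is False.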
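A short check of how tuple comparison behaves. Note that the "read it as one number" shortcut is only a memory aid: the last case below is an example (chosen here for illustration) where the shortcut would give the wrong answer, because the comparison is really element by element.

```python
first = (1, 20) > (2, 20)    # 1 < 2 decides the comparison
second = (2, 19) < (2, 20)   # first elements tie, so 19 < 20 decides
third = (10, 5) > (9, 30)    # 10 > 9 decides, although 105 < 930

print(first, second, third)
```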
11 Defining lists and common operations
Sequences: strings, lists, tuples
Numbers: integers, booleans
Variables: assignment, keywords, variable names, naming conventions
constellation_name = (u'魔蝎座',u'水瓶座',u'双鱼座',u'白羊座',u'金牛座',u'双子座',
    u'巨蟹座',u'狮子座',u'处女座',u'天秤座',u'天蝎座',u'射手座')
constellation_days = ((1,20),(2,19),(3,21),(4,21),(5,21),(6,22),
    (7,23),(8,23),(9,23),(10,23),(11,23),(12,23))
(month,day)=(10,10)
# filter() returns a lazy iterator in Python 3, so convert it to a list before printing or counting
constellation_day = list(filter(lambda x:x<=(month,day),constellation_days))
print(constellation_day)
constellation_len = len(constellation_day) % 12
print(constellation_name[constellation_len])
Result:
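12 Conditional statements
An if statement consists of:
- the keyword
- a condition expression
- a code block that runs when the condition is true
> if expression:
>     code block
An if statement can be combined with elif (else-if) and else to express more complex decisions
> if expression:
>     code block
> elif expression:
>     code block
> else:
>     code block
x = 'abc'
# 'abc' differs from 'abcd', so the else branch runs
if x =='abcd':
    print('x的值 和abc相等')
else:
    print('x和abc 不相等')
# Look up the zodiac animal for a birth year
chinese_zediac = '猴鸡狗猪鼠牛虎兔龙蛇马羊猴鸡狗猪'
year = int(input('请用户输入出生年份:'))
print(chinese_zediac[year%12])
Result: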
x和abc 不相等
请用户输入出生年份:2000
龙
Process finished with exit code 0
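13 for loops
chinese_zediac = '猴鸡狗猪鼠牛虎兔龙蛇马羊猴鸡狗猪'
# Iterating over a string yields one character at a time; cz is a one-character string
for cz in chinese_zediac:
    print(cz)
# range(1,13) takes a start and a stop value and yields 1 through 12 (the stop value is excluded)
for i in range(1,13):
    print(i)
# %s is a placeholder that is replaced by the value of a variable
for year in range(2000,2019):
    print('%s 年的生肖是 %s' %(year,chinese_zediac[year % 12]))
Result: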
猴
鸡
狗
猪
鼠
牛
虎
兔
龙
蛇
马
羊
猴
鸡
狗
猪
1
2
3
4
5
6
7
8
9
10
11
12
2000 年的生肖是 龙
2001 年的生肖是 蛇
2002 年的生肖是 马
2003 年的生肖是 羊
2004 年的生肖是 猴
2005 年的生肖是 鸡
2006 年的生肖是 狗
2007 年的生肖是 猪
2008 年的生肖是 鼠
2009 年的生肖是 牛
2010 年的生肖是 虎
2011 年的生肖是 兔
2012 年的生肖是 龙
2013 年的生肖是 蛇
2014 年的生肖是 马
2015 年的生肖是 羊
2016 年的生肖是 猴
2017 年的生肖是 鸡
2018 年的生肖是 狗
Process finished with exit code 0
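14 while loops
Conditions and loops
- if statement
- for statement
- while statement
Example:
num =5
while True:
    print('a')
    num = num+1
    # break leaves the loop once num exceeds 10
    if num >10:
        break
import time
num =5
# This loop has no break, so it runs until the process is stopped manually
while True:
    print(num)
    num = num +1
    # continue skips the rest of this iteration, so 10 is printed only once
    if num ==10:
        continue
    print(num)
    time.sleep(1)
Result: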
a
a
a
a
a
a
5
6
6
7
7
8
8
9
9
10
11
11
12
12
Process finished with exit code -1
15 Nesting if inside a for loop
constel_name = (u'魔蝎座',u'水瓶座',u'双鱼座',u'白羊座',u'金牛座',u'双子座',
    u'巨蟹座',u'狮子座',u'处女座',u'天秤座',u'天蝎座',u'射手座')
constel_days = ((1,20),(2,19),(3,21),(4,21),(5,21),(6,22),
    (7,23),(8,23),(9,23),(10,23),(11,23),(12,23))
# Read the month and day from the user
int_month = int(input('请输入月份:'))
int_day = int(input('请输入日期:'))
for zd_num in range(len(constel_days)):
    # The first end date on or after the birthday identifies the constellation
    if constel_days[zd_num] >= (int_month,int_day):
        print(constel_name[zd_num])
        break
    # Dates after December 23 wrap around to the first entry
    elif int_month == 12 and int_day >23:
        print(constel_name[0])
        break
Result:
16 Nesting if inside a while loop
constel_name = (u'魔蝎座',u'水瓶座',u'双鱼座',u'白羊座',u'金牛座',u'双子座',
    u'巨蟹座',u'狮子座',u'处女座',u'天秤座',u'天蝎座',u'射手座')
constel_days = ((1,20),(2,19),(3,21),(4,21),(5,21),(6,22),
    (7,23),(8,23),(9,23),(10,23),(11,23),(12,23))
# Read the month and day from the user
int_month = int(input('请输入月份:'))
int_day = int(input('请输入日期:'))
n = 0
# Advance n until the end date at index n is on or after the birthday
while constel_days[n] <(int_month,int_day):
    # Dates after December 23 wrap around to the first entry, so stop with n = 0
    if int_month ==12 and int_day >23:
        break
    n +=1
print(constel_name[n])
Result:
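The lookup in sections 15 and 16 can also be packaged as a function, which makes it easy to check without typing input. This is a sketch under assumptions: the English constellation names are translations introduced here, and find_constellation is a hypothetical helper, not part of the original notes.

```python
# English names are an assumption made for this sketch; the end dates follow the table above.
CONSTEL_NAMES = ('Capricorn', 'Aquarius', 'Pisces', 'Aries', 'Taurus', 'Gemini',
                 'Cancer', 'Leo', 'Virgo', 'Libra', 'Scorpio', 'Sagittarius')
CONSTEL_DAYS = ((1, 20), (2, 19), (3, 21), (4, 21), (5, 21), (6, 22),
                (7, 23), (8, 23), (9, 23), (10, 23), (11, 23), (12, 23))


def find_constellation(month, day):
    """Return the first constellation whose end date is on or after (month, day)."""
    for end_date, name in zip(CONSTEL_DAYS, CONSTEL_NAMES):
        if (month, day) <= end_date:
            return name
    # Dates after December 23 wrap around to the first entry.
    return CONSTEL_NAMES[0]


print(find_constellation(10, 10))
print(find_constellation(12, 25))
```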
17 Defining dictionaries and common operations
Mapping type: dictionary
A dictionary stores hashed keys together with the objects they map to
{"length": "180"}
chinese_constel = '猴鸡狗猪鼠牛虎兔龙蛇马羊猴鸡狗猪'
constel_name = (u'魔蝎座',u'水瓶座',u'双鱼座',u'白羊座',u'金牛座',u'双子座',
    u'巨蟹座',u'狮子座',u'处女座',u'天秤座',u'天蝎座',u'射手座')
constel_days = ((1,20),(2,19),(3,21),(4,21),(5,21),(6,22),
    (7,23),(8,23),(9,23),(10,23),(11,23),(12,23))
# This initialization is verbose; a dictionary comprehension (section 18) is more concise
cz_num = {}
for i in chinese_constel:
    cz_num[i] = 0
z_num = {}
for i in constel_name:
    z_num[i] = 0
while True:
    # Read the birth year, month, and day from the user
    year = int(input('请输入年份:'))
    month = int(input('请输入月份:'))
    day = int(input('请输入日期:'))
    n =0
    while constel_days[n] <(month,day):
        if month ==12 and day >23:
            break
        n += 1
    # Print the zodiac animal and the constellation
    print(constel_name[n])
    print('%s 年的生肖是 %s' %(year,chinese_constel[year % 12]))
    # Update the counters
    cz_num[chinese_constel[year %12]] +=1
    z_num[constel_name[n]] +=1
    # Print the statistics for zodiac animals and constellations
    for each_key in cz_num.keys():
        print('生肖 %s 有 %d 个' %(each_key,cz_num[each_key]))
    for each_key in z_num.keys():
        print('星座 %s 有 %d 个' %(each_key,z_num[each_key]))
Result:
请输入年份:2000
请输入月份:12
请输入日期:12
射手座
2000 年的生肖是 龙
生肖 猴 有 0 个
生肖 鸡 有 0 个
生肖 狗 有 0 个
生肖 猪 有 0 个
生肖 鼠 有 0 个
生肖 牛 有 0 个
生肖 虎 有 0 个
生肖 兔 有 0 个
生肖 龙 有 1 个
生肖 蛇 有 0 个
生肖 马 有 0 个
生肖 羊 有 0 个
星座 魔蝎座 有 0 个
星座 水瓶座 有 0 个
星座 双鱼座 有 0 个
星座 白羊座 有 0 个
星座 金牛座 有 0 个
星座 双子座 有 0 个
星座 巨蟹座 有 0 个
星座 狮子座 有 0 个
星座 处女座 有 0 个
星座 天秤座 有 0 个
星座 天蝎座 有 0 个
星座 射手座 有 1 个
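Counting with a dictionary does not require pre-filling every key with 0. The sketch below shows two alternatives, dict.get with a default and collections.Counter; the list of birth years is made up for illustration.

```python
from collections import Counter

zodiac = '猴鸡狗猪鼠牛虎兔龙蛇马羊'
birth_years = [2000, 2012, 1988, 2001]  # sample data, not from the notes

# dict.get returns the default 0 when the key has not been seen yet.
counts = {}
for year in birth_years:
    animal = zodiac[year % 12]
    counts[animal] = counts.get(animal, 0) + 1

# Counter does the same bookkeeping in one step.
counter = Counter(zodiac[year % 12] for year in birth_years)

print(counts)
print(counter.most_common(1))
```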
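18 List comprehensions and dictionary comprehensions
Example:
# Build a list of the squares of the even numbers from 1 to 10 with a loop
alist =[]
for i in range(1,11):
    if( i%2 == 0):
        alist.append(i*i)
print(alist)
# The same list as a list comprehension
blist = [i*i for i in range(1,11) if( i % 2) == 0]
print(blist)
chinese_constel = '猴鸡狗猪鼠牛虎兔龙蛇马羊猴鸡狗猪'
constel_name = (u'魔蝎座',u'水瓶座',u'双鱼座',u'白羊座',u'金牛座',u'双子座',
    u'巨蟹座',u'狮子座',u'处女座',u'天秤座',u'天蝎座',u'射手座')
constel_days = ((1,20),(2,19),(3,21),(4,21),(5,21),(6,22),
    (7,23),(8,23),(9,23),(10,23),(11,23),(12,23))
# Initialize the counters with a loop
z_num = {}
for i in constel_name:
    z_num[i] = 0
# The same dictionary as a dictionary comprehension
z_num = {i:0 for i in constel_name}
print(z_num)
Result: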
[4, 16, 36, 64, 100]
[4, 16, 36, 64, 100]
{'魔蝎座': 0, '水瓶座': 0, '双鱼座': 0, '白羊座': 0, '金牛座': 0, '双子座': 0, '巨蟹座': 0, '狮子座': 0, '处女座': 0, '天秤座': 0, '天蝎座': 0, '射手座': 0}
Process finished with exit code 0
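19 Built-in file functions and methods
Built-in file functions and methods:
- open() opens a file
- read() reads the content
- readline() reads one line
- seek() moves the position within the file
- write() writes content
- close() closes the file
Example:
# 'w' opens the file for writing and truncates any existing content
file1 = open(r'C:\Users\Administrator\Desktop\name.txt','w')
file1.write('上海')
file1.close()
# Read (the default mode is 'r')
file2 = open(r'C:\Users\Administrator\Desktop\name.txt')
print(file2.read())
file2.close()
# 'a' appends to the end of the file
file3 = open(r'C:\Users\Administrator\Desktop\name.txt','a')
file3.write('广州')
# Close the file so that the appended text is flushed to disk
file3.close()
Result: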
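The same write, append, and read sequence can be written with the with statement, which closes each file automatically. In this sketch a temporary directory stands in for the Desktop folder used above, and the UTF-8 encoding is an assumption made so that the example behaves the same on every platform.

```python
import os
import tempfile

# A temporary directory replaces the Desktop path from the notes (assumption).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'name.txt')
    with open(path, 'w', encoding='utf-8') as f:   # create or truncate
        f.write('上海')
    with open(path, 'a', encoding='utf-8') as f:   # append to the end
        f.write('广州')
    with open(path, encoding='utf-8') as f:        # read everything back
        content = f.read()

print(content)
```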
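20 Common file operations
Example:
file = open(r'C:\Users\Administrator\Desktop\name.txt','rb')
print('当前文件指针的位置 %s' %file.tell())
print('当前读取到了一个字符,字符的内容是 %s' %file.read(1))
print('当前文件指针的位置 %s' %file.tell())
# seek(offset, whence): the first argument is the offset, the second says where the offset is measured from
# 0 means offset from the beginning of the file
# 1 means offset from the current position
# 2 means offset from the end of the file
# Note: seek(5,2) moves 5 bytes past the end, so the read below returns nothing; a negative offset such as seek(-5,2) moves back from the end
file.seek(5,2)
print('我们进行了seek操作')
print('当前文件指针的位置 %s' %file.tell())
print('当前读取到了一个字符,字符的内容是 %s' %file.read(1))
print('当前文件指针的位置 %s' %file.tell())
file.close()
Result: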
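The three whence values of seek() can be tried without a file on disk. The sketch below uses io.BytesIO as an in-memory stand-in for a file opened in 'rb' mode; the ten-byte content is made up for illustration.

```python
import io

buffer = io.BytesIO(b'0123456789')   # in-memory stand-in for a binary file

buffer.seek(5, 0)                    # 5 bytes from the start
from_start = buffer.read(1)

buffer.seek(2, 1)                    # skip 2 more bytes from the current position
from_current = buffer.read(1)

buffer.seek(-3, 2)                   # 3 bytes before the end
from_end = buffer.read(1)
position = buffer.tell()

print(from_start, from_current, from_end, position)
```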
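21 Detecting and handling exceptions
An exception is an action taken outside the normal control flow when an error occurs. The general flow is: an error is detected and an exception is raised; the exception is then caught and handled.
try:
    <code monitored for exceptions>
except Exception as reason:
    <exception handling code>
finally:
    <runs whether or not an exception occurred>
Files and input/output
- File objects
- Built-in functions
Exception handling:
- Raising exceptions
- Detecting exceptions
- Handling exceptions
# Without try, non-numeric input stops the program with a ValueError
year = int(input('input year:'))
try:
    year = int(input('input year:'))
except ValueError:
    print('年份要输入数字')
# An int has no append method, so the call below raises AttributeError
a = 123
a.append()
# Catch several exception types at once:
# except (ValueError, AttributeError, KeyError)
# Print the exception message
try:
    print(1/0)
# "as e" binds the exception object to the name e
except ZeroDivisionError as e:
    print('0不能做为除数 %s' %e)
try:
    print(1/0)
except Exception as e:
    print('%s' %e)
# raise triggers an exception explicitly
try:
    raise NameError('helloError')
except NameError:
    print('my custom error')
# Initialize a first so that finally can tell whether open() succeeded
a = None
try:
    a = open('name.txt')
except Exception as e:
    print(e)
finally:
    if a:
        a.close()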
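A try statement may also carry an else clause, which runs only when no exception was raised. The sketch below records which clauses run for valid and invalid input; parse_year is a hypothetical helper written for this illustration.

```python
def parse_year(text):
    """Return (year, log), where log records which clauses ran."""
    log = []
    try:
        year = int(text)
    except ValueError:
        log.append('except')   # runs only when int() fails
        year = None
    else:
        log.append('else')     # runs only when no exception was raised
    finally:
        log.append('finally')  # always runs
    return year, log


print(parse_year('2018'))
print(parse_year('abc'))
```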
22 Defining functions and common operations
A function is a way of structuring program logic into reusable units
Defining a function
def function_name():
    code
    return value_to_return
Calling a function
function_name()
Example:
import re
def find_item(hero):
    with open(r'C:\Users\Administrator\Desktop\nanhai.txt',encoding='GB18030') as f:
        data = f.read().replace('\n','')
        # findall returns every occurrence of the name in the text
        name_num = re.findall(hero,data)
        print('主角 %s 出现 %s 次' %(hero,len(name_num)))
    return len(name_num)
# Read the character names
name_dict = {}
with open(r'C:\Users\Administrator\Desktop\name.txt') as f:
    for line in f:
        names = line.split('|')
        for n in names:
            # print(n)
            name_num = find_item(n)
            name_dict[n] = name_num
# Sort by count in descending order, using an anonymous function as the key
name_sorted = sorted(name_dict.items(),key =lambda item:item[1], reverse=True)
print(name_sorted[0:10])
Result:
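23 Variable-length function arguments
# end sets what print appends after the text; '\n' is the default
print('abc',end ='\n')
print('abc')
def func(a,b,c):
    print('a = %s' %a)
    print('b = %s' %b)
    print('c = %s' %c)
# Positional arguments come first; c is passed as a keyword argument
func(1,2,c=3)
# Count the arguments: *other collects the extra positional arguments into a tuple
def howlong(first,*other):
    return 1+len(other)
# The return value (3) is not printed here, so it does not appear in the result
howlong(123,234,456)
Result: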
abc
abc
a = 1
b = 2
c = 3
Process finished with exit code 0
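Besides *name for extra positional arguments, Python also accepts **name for extra keyword arguments. The sketch below shows both collected in one function; describe and its parameter names are hypothetical.

```python
def describe(first, *others, **options):
    """Collect extra positional arguments in a tuple and extra keyword arguments in a dict."""
    return {
        'count': 1 + len(others),
        'others': others,
        'options': options,
    }


summary = describe(123, 234, 456, sep='-')
print(summary)
```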
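24 Variable scope in functions
var1 = 123
def func():
    # global makes the assignment below change the module-level var1 instead of creating a local variable
    global var1
    var1 = 456
    print(var1)
func()
print(var1)
Result: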
456
456
Process finished with exit code 0
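25 Iterators and generators
# Iterator
list1 = [1,2,3]
it = iter(list1)
print(next(it))
print(next(it))
print(next(it))
# The built-in range only accepts integer steps; int(1.5) is 1
for i in range(10,20,int(1.5)):
    print(i)
# Generator: a function containing yield produces its values one at a time
# It is named frange here so that it does not shadow the built-in range
def frange(star,stop,step):
    x = star
    while x < stop:
        # print(x)
        yield x
        x = x +step
for i in frange(10,20,0.5):
    print(i)
Result: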
1
2
3
10
11
12
13
14
15
16
17
18
19
10
10.5
11.0
11.5
12.0
12.5
13.0
13.5
14.0
14.5
15.0
15.5
16.0
16.5
17.0
17.5
18.0
18.5
19.0
19.5
Process finished with exit code 0
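A generator produces values lazily and can be consumed only once. The sketch below makes both points visible with next() and StopIteration; countdown is a hypothetical generator written for this illustration.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)
first = next(gen)     # runs the body up to the first yield
rest = list(gen)      # consumes the remaining values

# An exhausted generator raises StopIteration on the next call.
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True

print(first, rest, finished)
```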
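26 lambda expressions
def func2(item):
    return item[1]
adict = {'a':'aa','b':'bb'}
for i in adict.items():
    func2(i)
# The equivalent lambda expression is: lambda item: item[1]
# i still holds the last item after the loop ends
print(i)
Result: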
('b', 'bb')
Process finished with exit code 0
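A lambda expression is most useful where a short function is passed as an argument. The sketch below sorts dictionary items by value, once with a named function and once with the equivalent lambda; the sample dictionary is made up for illustration.

```python
adict = {'a': 'bb', 'b': 'aa'}


def value_of(item):
    return item[1]


# Both calls sort the (key, value) pairs by value.
by_value_def = sorted(adict.items(), key=value_of)
by_value_lambda = sorted(adict.items(), key=lambda item: item[1])

print(by_value_lambda)
```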
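27 Python built-in functions
# filter keeps the items for which the function returns true
help(filter)
a =[1,2,3,4,5,6,7]
list(filter(lambda x:x>2,a))
# map applies the function to every item
help(map)
a = [1,2,3]
map(lambda x:x,a)
s1 =list(map(lambda x:x,a))
print(s1)
s2 = list(map(lambda x:x+1,a))
print(s2)
b = [4,5,6]
s3 = list(map(lambda x,y:x+y,a,b))
print(s3)
# reduce folds a sequence into a single value; in Python 3 it lives in functools
import functools
from functools import reduce
help(functools.reduce)
# ((1+2)+3)+4 = 10, where 1 is the initial value
m = reduce(lambda x,y:x+y,[2,3,4],1)
print(m)
# zip pairs up items from several iterables
help(zip)
for i in zip((1,2,3),(4,5,6)):
    print(i)
dicta = {'a':'aa','b':'bb'}
# Swap the keys and values
dictb = zip(dicta.values(),dicta.keys())
print(dict(dictb))
Result: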
class filter(object)
| filter(function or None, iterable) --> filter object
|
| Return an iterator yielding those items of iterable for which function(item)
| is true. If function is None, return the items that are true.
|
| Methods defined here:
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __iter__(self, /)
| Implement iter(self).
|
| __next__(self, /)
| Implement next(self).
|
| __reduce__(...)
| Return state information for pickling.
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
Help on class map in module builtins:
class map(object)
| map(func, *iterables) --> map object
|
| Make an iterator that computes the function using arguments from
| each of the iterables. Stops when the shortest iterable is exhausted.
|
| Methods defined here:
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __iter__(self, /)
| Implement iter(self).
|
| __next__(self, /)
| Implement next(self).
|
| __reduce__(...)
| Return state information for pickling.
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
[1, 2, 3]
[2, 3, 4]
[5, 7, 9]
Help on built-in function reduce in module _functools:
reduce(...)
reduce(function, sequence[, initial]) -> value
Apply a function of two arguments cumulatively to the items of a sequence,
from left to right, so as to reduce the sequence to a single value.
For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5). If initial is present, it is placed before the items
of the sequence in the calculation, and serves as a default when the
sequence is empty.
10
Help on class zip in module builtins:
class zip(object)
| zip(*iterables) --> A zip object yielding tuples until an input is exhausted.
|
| >>> list(zip('abcdefg', range(3), range(4)))
| [('a', 0, 0), ('b', 1, 1), ('c', 2, 2)]
|
| The zip object yields n-length tuples, where n is the number of iterables
| passed as positional arguments to zip(). The i-th element in every tuple
| comes from the i-th iterable argument to zip(). This continues until the
| shortest argument is exhausted.
|
| Methods defined here:
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __iter__(self, /)
| Implement iter(self).
|
| __next__(self, /)
| Implement next(self).
|
| __reduce__(...)
| Return state information for pickling.
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
(1, 4)
(2, 5)
(3, 6)
{'aa': 'a', 'bb': 'b'}
Process finished with exit code 0
28 Defining closures
# Closure: when an inner function references a variable of its outer function, the inner function together with that variable is called a closure
def func():
    a =1
    b =2
    print(a+b)
# The closure pattern
def sum(a):
    def add(b):
        return a+b # references the outer function's variable a
    return add # returns the inner function
# add on its own is the function name, i.e. a reference to the function
# add() is a call to the function
num1 = func()
num2 = sum(2)
print(num2(4))
print(type(num1)) #<class 'NoneType'>
print(type(num2)) #<class 'function'>
# First version
# def counter():
#     cnt =[0]
#     def add_one():
#         cnt[0] +=1
#         return cnt[0]
#     return add_one
# num1 =counter()
# print(num1())
# print(num1())
# print(num1())
# print(num1())
# print(num1())
# Second version: the starting value is passed in as an argument
def counter(FIRST=0):
    cnt =[FIRST]
    def add_one():
        cnt[0] +=1
        return cnt[0]
    return add_one
num5 =counter(5)
num10 = counter(10)
print(num5())
print(num5())
print(num5())
print(num10())
print(num10())
Result:
3
6
<class 'NoneType'>
<class 'function'>
6
7
8
11
12
Process finished with exit code 0
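The counter above stores its state in a one-element list because an inner function cannot rebind an outer variable by plain assignment. In Python 3 the nonlocal statement removes the need for that workaround. The sketch below is an alternative written for illustration, not the version used in the notes.

```python
def counter(first=0):
    count = first

    def add_one():
        nonlocal count   # rebind the variable of the enclosing function
        count += 1
        return count

    return add_one


num5 = counter(5)
num10 = counter(10)
values = [num5(), num5(), num10()]   # each closure keeps its own count
print(values)
```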
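29 Using closures
# Closures versus plain functions: with a closure, you pass a function around instead of passing variables
# A closure also needs fewer arguments at each call than a plain function, which makes the calling code cleaner
# First version:
def a_line(a,b):
    def arg_y(x):
        return a*x+b
    return arg_y
line1 = a_line(3,5)
print(line1(10))
# A plain function would need all three arguments on every call: def func1(a,b,x)
# Second version: a lambda expression
def a_line(a,b):
    return lambda x: a * x + b
line1 = a_line(3,5)
print(line1(10))
Result: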
35
35
Process finished with exit code 0
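30 Defining decorators
# First version: timing code written by hand
import time
print(time.time())
def i_can_sleep(): # the function to be timed
    time.sleep(3)
start_time = time.time()
i_can_sleep()
stop_time =time.time()
print('函数运行了 %s 秒' %(stop_time-start_time))
# Second version: using a decorator
import time
print(time.time())
# Difference between a decorator and a closure: a closure captures a variable, a decorator captures a function
# The decorator function
def timer(func):
    def wrapper(): # inner function (a closure)
        start_time = time.time()
        func()
        stop_time =time.time()
        print("运行时间是 %s 秒" % (stop_time-start_time))
    return wrapper # returns the inner function
@timer # applies the decorator
def i_can_sleep(): # the decorated function
    time.sleep(3)
# First call: the inner call i_can_sleep() already runs the decorated function; timer(...) then only wraps its return value (None) and never calls it
# Without the @timer line, the equivalent call would be timer(i_can_sleep)()
timer(i_can_sleep())
# Second call: call the decorated function directly
i_can_sleep()
Result: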
1659025189.2046003
函数运行了 3.0 秒
1659025192.2046003
运行时间是 3.0 秒
运行时间是 3.0 秒
Process finished with exit code 0
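31 Using decorators
# First version: a decorator without arguments
def tips(func):
    def nei(a,b):
        print('start')
        func(a,b)
        print('stop')
    return nei
# tips takes only the function, so it is applied as @tips without arguments
@tips
def add(a,b):
    print(a+b)
@tips
def sub(a,b):
    print(a-b)
# nei returns nothing, so each print below also prints None
print(add(4,5))
print(sub(9,5))
# Second version: a decorator that takes an argument needs one more level of nesting
def new_tips(argv):
    def tips(func):
        def nei(a,b):
            print('start %s %s' %(argv,func.__name__))
            func(a,b)
            print('stop')
        return nei
    return tips
@new_tips('add_module')
def add(a,b):
    print(a+b)
@new_tips('sub_module')
def sub(a,b):
    print(a-b)
print(add(4,5))
print(sub(9,5))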
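The wrappers above fix the signature to (a,b) and drop the return value of the decorated function. A more general sketch, offered as one possible refinement rather than the approach of the notes, forwards any arguments, returns the result, and uses functools.wraps so that the decorated function keeps its name. The calls list is a hypothetical log used here instead of print.

```python
import functools

calls = []  # hypothetical log that records what the wrapper did


def tips(func):
    @functools.wraps(func)            # keep func.__name__ and the docstring
    def wrapper(*args, **kwargs):     # accept any arguments
        calls.append('start ' + func.__name__)
        result = func(*args, **kwargs)
        calls.append('stop')
        return result                 # pass the return value through
    return wrapper


@tips
def add(a, b):
    return a + b


total = add(4, 5)
print(total, add.__name__, calls)
```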
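32 Custom context managers
Functions
- Calling functions
- Defining functions
- Variable-length arguments
- Variable scope
- Anonymous functions
- Generators
- Iterators
- Decorators
- Closures
Example:
# try/finally guarantees that the file is closed
fd = open(r'C:\Users\Administrator\Desktop\name.txt')
try:
    for line in fd:
        print(line)
finally:
    fd.close()
# The with statement closes the file automatically
with open(r'C:\Users\Administrator\Desktop\name.txt') as f:
    for line in f:
        print(line)
Result: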
上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州
上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州上海广州
Process finished with exit code 0
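33 Defining modules
Modules
Once a code base grows large, organized pieces of code that need to be reused are grouped into modules. A module can be attached to an existing program, and the process of attaching it is called importing (import).
Common ways to import a module:
import module_name
from module_name import function_name
Example:
import os
import time
# sleep requires the number of seconds as an argument
time.sleep(1)
from time import sleep
sleep(1)
# mymod.py
def print_me():
    print('me')
# print_me()
# mymod_test.py
import mymod
mymod.print_me()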
34 The PEP 8 coding style
34.1 Installing autopep8
Commands:
cd /usr/local/venv/bin
./python3.6 -m pip install autopep8
34.2 Configuring autopep8 in PyCharm
Settings:
Parameters: --in-place --aggressive --aggressive $FilePath$
Working directory: $ProjectFileDir$
Regular expression to match output: $FILE_PATH$\:$LINE$\:$COLUMN$\:.*
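35 Classes and instances
35.1 First version (procedural programming)
user1 = {'name': 'tom', 'hp': 100}
user2 = {'name': 'jerry', 'hp': 80}
def print_role(rolename):
    print('name is %s, hp is %s' % (rolename['name'], rolename['hp']))
print_role(user1)
Result:
name is tom, hp is 100
Process finished with exit code 0
35.2 Second version (object-oriented programming)
class Player(): # define a class
    def __init__(self, name, hp):
        self.name = name
        self.hp = hp
    def print_role(self): # define a method
        print('%s %s' % (self.name, self.hp))
user1 = Player('tom', 100) # instantiate the class
user2 = Player('jorn', 90)
user1.print_role()
user2.print_role()
Result:
tom 100
jorn 90
Process finished with exit code 0
35.3 Summary
- Define a class with the class keyword; by convention the class name starts with an uppercase letter.
- The functions defined in a class are called methods. Object-oriented code is closer to how people think; procedural code is closer to how the machine runs, executing from top to bottom.
- __init__ initializes a new instance, and self refers to the instance itself.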
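36 Adding attributes and methods to a class
# An attribute named name can be accessed from outside the instance
# An attribute named __name cannot be accessed directly from outside, because Python stores it under a mangled name
class Player(): # define a class
    def __init__(self, name, hp,occu):
        self.__name = name # variables stored on the instance are called attributes
        self.hp = hp
        self.occu = occu
    def print_role(self): # define a method
        print('%s %s %s' % (self.__name, self.hp,self.occu))
    def updateName(self,newname):
        self.__name = newname
class Monster():
    'Defines the monster class'
    pass
user1 = Player('tom', 100,'war') # instantiate the class
user2 = Player('jorn', 90,'master')
user1.print_role()
user2.print_role()
user1.updateName('wilson')
user1.print_role()
# This creates a new attribute called name; __name is unchanged, so the output stays the same
user1.name =('aaa')
user1.print_role()
Result: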
tom 100 war
jorn 90 master
wilson 100 war
wilson 100 war
Process finished with exit code 0
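The reason user1.name = 'aaa' has no effect on the printed name is name mangling: inside the class body, __name is stored as _Player__name. The sketch below is a reduced version of the class, written only to make that visible; get_name is a hypothetical accessor.

```python
class Player:
    def __init__(self, name):
        self.__name = name       # stored on the instance as _Player__name

    def get_name(self):
        return self.__name


user = Player('tom')
user.name = 'aaa'                # creates a new, unrelated attribute

# Outside the class the name is not mangled, so this lookup fails.
try:
    user.__name
    hidden = False
except AttributeError:
    hidden = True

print(user.get_name(), user.name, hidden, user._Player__name)
```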
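37 Class inheritance
class Player(): # define a class
    def __init__(self, name, hp,occu):
        self.__name = name # variables stored on the instance are called attributes
        self.hp = hp
        self.occu = occu
    def print_role(self): # define a method
        print('%s %s %s' % (self.__name, self.hp,self.occu))
    def updateName(self,newname):
        self.__name = newname
class Monster():
    'Defines the monster class'
    def __init__(self,hp):
        self.hp = hp
    def run(self):
        print('移动到某个位置')
    def whoami(self):
        print('我是怪物的父类')
class Animals(Monster):
    'Ordinary monster'
    def __init__(self,hp=10):
        # super() calls the initializer of the parent class
        super().__init__(hp)
class Boss(Monster):
    'Boss monster'
    # Overrides the method of the same name in the parent class
    def whoami(self):
        print('我是怪物,我怕谁')
a1 = Monster(200)
print(a1.hp)
# run() prints its message and returns None, so print shows None as well
print(a1.run())
a2= Animals(1)
print(a2.hp)
print(a2.run())
a3 = Boss(800)
a3.whoami()
# An instance of a subclass is also an instance of the parent class
print(isinstance(a2,Monster))
Result: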
200
移动到某个位置
None
1
移动到某个位置
None
我是怪物,我怕谁
True
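Process finished with exit code 0
Summary:
- A class describes a collection of objects that share the same attributes.
- A class cannot be used directly; it is instantiated to create an instance.
- The operations are then performed on the instance.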
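38 Using classes to define a custom with statement
class Testwith():
    def __enter__(self):
        print('run')
    # exc_type, exc_val, and exc_tb describe the exception raised inside the with block; all three are None if there was none
    def __exit__(self,exc_type,exc_val,exc_tb):
        if exc_tb is None:
            print('正常结束')
        else:
            print('has error %s' %exc_tb)
with Testwith():
    print('Test is running')
    raise NameError('testNameError')
Result: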
run
Test is running
has error <traceback object at 0x0000000002685F00>
Traceback (most recent call last):
File "E:/python/test.py", line 593, in <module>
raise NameError('testNameError')
NameError: testNameError
Process finished with exit code 1
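Two details of the protocol are not shown above: the value returned by __enter__ is what the name after "as" is bound to, and a true return value from __exit__ suppresses the exception. The sketch below illustrates both; Recorder is a hypothetical class written for this example.

```python
class Recorder:
    """Hypothetical context manager that records what happened in the block."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self                      # bound to the name after "as"

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            self.events.append('exit ok')
        else:
            self.events.append('exit with %s' % exc_type.__name__)
        return True                      # a true value suppresses the exception


with Recorder() as rec:
    raise NameError('testNameError')

# Execution continues here because __exit__ returned True.
print(rec.events)
```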
39 Defining multithreaded programs
import threading
import time
from threading import current_thread
def myThread(arg1,arg2):
    print(current_thread().getName(),'start')
    print('%s %s' %(arg1,arg2))
    time.sleep(1)
    print(current_thread().getName(),'stop')
for i in range(1,6,1):
    # t1 = myThread(i,i+1)
    t1= threading.Thread(target=myThread,args=(i,i+1))
    t1.start()
# The main thread does not wait for the workers here, so its output interleaves with theirs
print(current_thread().getName(),'end')
# Second version: subclass Thread and override run
import threading
from threading import current_thread
class Mythread(threading.Thread):
    def run(self):
        print(current_thread().getName(),'start')
        print('run')
        print(current_thread().getName(),'stop')
t1 = Mythread()
t1.start()
# join makes the main thread wait until t1 has finished
t1.join()
print(current_thread().getName(),'end')
Result:
Thread-1 start
1 2
Thread-2 start
2 3
Thread-3 start
3 4
Thread-4 start
4 5
Thread-5MainThread end
start
5 6
Thread-6 start
run
Thread-6 stop
MainThread end
Thread-3 stop
Thread-4 stop
Thread-2 stop
Thread-1 stop
Thread-5 stop
Process finished with exit code 0
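In the first version above the main thread prints "end" before the workers finish. The sketch below, written for illustration, keeps a reference to every thread and joins them all, so the results are complete before the main thread continues; the worker function and the lock are assumptions of this example.

```python
import threading

results = []
lock = threading.Lock()   # protects the shared list


def worker(arg1, arg2):
    with lock:
        results.append(arg1 + arg2)


threads = [threading.Thread(target=worker, args=(i, i + 1)) for i in range(1, 6)]
for t in threads:
    t.start()
for t in threads:
    t.join()              # wait for every worker before continuing

print(sorted(results))
```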
40 The classic producer-consumer problem
import queue
q= queue.Queue() # a FIFO queue
q.put(1)
q.put(2)
q.put(3)
q.get()
from threading import Thread,current_thread
import time
import random
from queue import Queue
# A queue with at most 5 items; note that this name shadows the queue module imported above
queue =Queue(5)
class ProductThread(Thread):
    def run(self):
        name = current_thread().getName()
        nums = range(100)
        global queue
        while True:
            num = random.choice(nums)
            # put blocks while the queue is full
            queue.put(num)
            print('生产者 %s 生产了数据 %s' %(name,num))
            t = random.randint(1,3)
            time.sleep(t)
            print('生产者 %s 睡眠了 %s 秒' %(name,t))
class ConsumerThread(Thread):
    def run(self):
        name = current_thread().getName()
        global queue
        while True:
            # get blocks while the queue is empty
            num = queue.get()
            queue.task_done()
            print('消费者 %s 消耗了数据 %s' %(name,num))
            t = random.randint(1,5)
            time.sleep(t)
            print('消费者 %s 睡眠了 %s 秒' %(name,t))
p1 = ProductThread(name ='p1')
p1.start()
p2 = ProductThread(name = 'p2')
p2.start()
p3 = ProductThread(name = 'p3')
p3.start()
c1 = ConsumerThread(name = 'c1')
c1.start()
c2 = ConsumerThread(name = 'c2')
c2.start()
Result:
生产者 p1 生产了数据 26
生产者 p2 生产了数据 88
生产者 p3 生产了数据 61
消费者 c1 消耗了数据 26
消费者 c1 消耗了数据 88
消费者 c1 消耗了数据 61
生产者 p2 睡眠了 1 秒
生产者 p2 生产了数据 58
消费者 c2 消耗了数据 58
生产者 p3 睡眠了 2 秒
生产者 p3 生产了数据 91消费者 c1 消耗了数据 91
生产者 p1 睡眠了 2 秒
生产者 p1 生产了数据 60
消费者 c2 消耗了数据 60
生产者 p3 睡眠了 1 秒
消费者 c1 消耗了数据 80
生产者 p3 生产了数据 80
生产者 p2 睡眠了 2 秒
生产者 p2 生产了数据 35
消费者 c2 消耗了数据 35
生产者 p1 睡眠了 2 秒
生产者 p1 生产了数据 76
生产者 p3 睡眠了 1 秒
生产者 p3 生产了数据 52
消费者 c2 消耗了数据 76
消费者 c1 消耗了数据 52
生产者 p1 睡眠了 1 秒
生产者 p1 生产了数据 45
消费者 c1 消耗了数据 45
生产者 p2 睡眠了 3 秒
生产者 p2 生产了数据 21
消费者 c2 消耗了数据 21
生产者 p3 睡眠了 3 秒
生产者 p3 生产了数据 37
生产者 p1 睡眠了 2 秒
生产者 p1 生产了数据 11
消费者 c1 消耗了数据 37
消费者 c1 消耗了数据 11
生产者 p2 睡眠了 2 秒
生产者 p2 生产了数据 13消费者 c1 消耗了数据 13
生产者 p1 睡眠了 2 秒
生产者 p1 生产了数据 75
消费者 c2 消耗了数据 75
Process finished with exit code -1
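The program above runs until it is stopped by hand (hence the exit code -1). One common way to let such a program end on its own is a sentinel value that tells the consumer to stop. The sketch below assumes one producer and one consumer; SENTINEL, tasks, and consumed are names introduced for this example.

```python
import threading
from queue import Queue

SENTINEL = None          # special value that tells the consumer to stop
tasks = Queue(5)         # bounded queue, as in the example above
consumed = []


def producer():
    for num in range(10):
        tasks.put(num)   # blocks while the queue is full
    tasks.put(SENTINEL)


def consumer():
    while True:
        num = tasks.get()    # blocks while the queue is empty
        if num is SENTINEL:
            break
        consumed.append(num)


p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
print(consumed)
```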
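41 The Python standard library
Summary:
1. Text processing: re
2. Date and time: time, datetime
3. Numeric and mathematical: math, random
4. File and directory access: pathlib, os.path
5. Data compression and archiving: tarfile
6. Generic operating system services: os, logging, argparse
7. Concurrent execution: threading, queue
8. Internet data handling: base64, json, urllib
9. Structured markup processing: html, xml
10. Development tools: unittest
11. Debugging and profiling: timeit
12. Software packaging and distribution: venv
13. Python runtime services: __main__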
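42 The regular expression library
import re
p = re.compile('a')
# print(p.match('a'))
# 'b' does not match the pattern 'a', so match returns None
print(p.match('b'))
Result:
None
Process finished with exit code 0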
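43 Regular expression metacharacters
. ^ $ * + ? {m} {m,n} [] | \d \D \s ()
. matches any single character except a newline; ... matches any three characters
^ matches at the start of the string
$ matches at the end of the string
* matches the preceding character zero or more times
+ matches the preceding character one or more times
? matches the preceding character zero or one time
.*? matches as few characters as possible (non-greedy)
{m} matches the preceding character exactly m times
{m,n} matches the preceding character m to n times
[] matches any one of the characters inside the brackets
| matches either the expression on its left or the one on its right
\d matches one digit, equivalent to [0-9]; \d+ matches one or more digits
\D matches any character that is not a digit
\s matches a whitespace character
() groups part of the pattern, e.g. (2018)-(03)-(04) matches 2018-03-04
(03|04) matches either month, as in 2018-03-04 and 2018-04-12
^$ matches an empty line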
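A few of the metacharacters above, tried with the re module. The patterns and input strings are small examples made up for this sketch.

```python
import re

any_three = re.match(r'...', 'bat').group()                    # . matches any character
exactly_three = re.match(r'\d{3}', '12345').group()            # {m} repeats exactly m times
greedy = re.match(r'<.*>', '<a><b>').group()                   # * takes as much as possible
lazy = re.match(r'<.*?>', '<a><b>').group()                    # *? takes as little as possible
month = re.match(r'2018-(03|04)-\d+', '2018-04-12').group(1)   # | chooses, () captures
empty_line = bool(re.match(r'^$', ''))                         # ^$ matches an empty string
words = re.split(r'\s+', 'a b\tc')                             # \s matches whitespace

print(any_three, exactly_three, greedy, lazy, month, empty_line, words)
```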
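44 Regular expression grouping examples
import re
# Match any three characters
p = re.compile('...')
# Match exactly three characters with a repetition count; note the braces, since '.(3)' would require a literal 3
p = re.compile('.{3}')
print(p.match('bat'))
p = re.compile('....-..-..')
print(p.match('2018-05-10'))
# A raw string (r'...') keeps backslashes as they are
print(r'\nx\n')
p = re.compile(r'(\d+)-(\d+)-(\d+)')
# group(1) is the first group, group() is the whole match, groups() returns all groups as a tuple
print(p.match('2018-05-10').group(1))
print(p.match('2018-05-10').group())
print(p.match('2018-05-10').groups())
year,month,day = p.match('2010-05-10').groups()
print(year)
print(month)
print(day)
Result:
<re.Match object; span=(0, 3), match='bat'>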
<re.Match object; span=(0, 10), match='2018-05-10'>
\nx\n
2018
2018-05-10
('2018', '05', '10')
2010
05
10
Process finished with exit code 0
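45 match and search
import re
# match succeeds only when the pattern matches at the start of the string; search scans the whole string for the first match
# With leading characters such as 'sss', match returns None, so check the result before calling group
p = re.compile(r'(\d+)-(\d+)-(\d+)')
print(p.match('sss2018-05-10bbbb')) # None
print(p.search('sss2018-05-10bbbb').group(2)) # 05
print(p.search('sss2018-05-10bbbb')) # <re.Match object; span=(3, 13), match='2018-05-10'>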
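46 The substitution function sub
# sub(pattern, replacement, string), e.g. re.sub('c','*','abcd') returns 'ab*d'
phone = '123-456-789 # 这是我的电话'
# Remove the comment that starts at #
p2 = re.sub(r'#.*$','',phone)
print(p2) #123-456-789
# Remove every non-digit character
p3 = re.sub(r'\D','',p2)
print(p3) #123456789
# findall returns every match instead of only the first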
47 Date and time functions
import time
print(time.time()) #1635778753.5328877
print(time.localtime())#time.struct_time(tm_year=2021, tm_mon=11, tm_mday=1, tm_hour=22, tm_min=59, tm_sec=13, tm_wday=0, tm_yday=305, tm_isdst=0)
print(time.strftime('%y-%m-%d %H:%M:%S'))#21-11-01 22:59:13
print(time.strftime('%y%m%d %H:%M:%S'))#211101 22:59:13
import datetime
print(datetime.datetime.now()) #2021-11-01 23:01:03.058052
# Get the time ten minutes from now
newtime = datetime.timedelta(minutes=10)
print(datetime.datetime.now()+newtime) #2021-11-01 23:05:53.899077
one_day =datetime.datetime(2011,11,1)
new_date = datetime.timedelta(days=10)
print(one_day+new_date) #2011-11-11 00:00:00
print(datetime.datetime.now()+new_date) #2021-11-11 23:05:53.899136
Result:
1659364390.8206003
time.struct_time(tm_year=2022, tm_mon=8, tm_mday=1, tm_hour=22, tm_min=33, tm_sec=10, tm_wday=0, tm_yday=213, tm_isdst=0)
22-08-01 22:33:10
220801 22:33:10
2022-08-01 22:33:10.822600
2022-08-01 22:43:10.822600
2011-11-11 00:00:00
2022-08-11 22:33:10.822600
Process finished with exit code 0
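Date arithmetic with fixed dates gives reproducible results, unlike now(). The sketch below also shows strptime, the inverse of strftime, which parses a string into a datetime; the dates reuse the 2011-11-01 example from above.

```python
from datetime import datetime, timedelta

one_day = datetime(2011, 11, 1)
ten_days_later = one_day + timedelta(days=10)

# strftime formats a datetime as text; %Y is the four-digit year.
text = ten_days_later.strftime('%Y-%m-%d %H:%M:%S')

# strptime parses text back into a datetime.
parsed = datetime.strptime('2011-11-11', '%Y-%m-%d')

# Subtracting two datetimes gives a timedelta.
gap = parsed - one_day

print(text, gap.days)
```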
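48 Math-related libraries
import random
# randint(a, b) takes exactly two arguments and returns an integer from a to b inclusive
print(random.randint(1,5))
# choice takes a single sequence and returns one of its elements
print(random.choice(['aa','bb','cc']))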
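49 Working with files from the command line
In the output of ls -l, a leading - marks a file and a leading d marks a directory
os.path
Basic Linux command-line operations
Windows command-line operations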
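50 The directory library os.path
# os.path
import os
print(os.path.abspath('..')) # absolute path of the parent directory
print(os.path.exists('..')) # whether the path exists
print(os.path.isfile('/users')) # whether the path is a file
print(os.path.isdir('/users')) # whether the path is a directory
# pathlib
from pathlib import Path
p =Path('.') # path of the current directory
print(p.resolve()) # absolute path corresponding to the relative path
p.is_dir() # whether the path is a directory
q = Path('/tmp/a/b/c') # path of the directory to create
# parents=True also creates missing parent directories; exist_ok=True avoids an error when the directory already exists
q.mkdir(parents=True, exist_ok=True)
Result: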
E:\
True
False
False
E:\python
Process finished with exit code 0
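The pathlib calls above can be tried safely inside a temporary directory, which is removed afterwards. This is a sketch; the directory names a/b/c follow the example above, while note.txt and its content are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    target = Path(root) / 'a' / 'b' / 'c'        # the / operator joins path parts
    target.mkdir(parents=True, exist_ok=True)    # create missing parents as well
    created = target.is_dir()

    note = target / 'note.txt'
    note.write_text('hello', encoding='utf-8')
    is_file = note.is_file()
    names = [child.name for child in target.iterdir()]

print(created, is_file, names)
```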
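51 The general machine learning workflow and NumPy
Machine learning libraries:
NumPy: used for high-performance scientific computing and data analysis; it is the foundation package for the commonly used high-level data analysis libraries.
import numpy as np
arr1 = np.array([2,3,4])
print(arr1)
print(arr1.dtype)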
52 NumPy arrays and data types
arr1 = np.array([2, 3, 4])
print(arr1) #[2 3 4]
print(arr1.dtype) #int64 on many platforms; the run below shows int32
arr2 = np.array([2.2, 3.4, 4.5])
print(arr2) #[2.2 3.4 4.5]
print(arr2.dtype) #float64
# Element-wise addition of two arrays
print(arr1+arr2) #[4.2 6.4 8.5]
Result:
[2 3 4]
int32
[2.2 3.4 4.5]
float64
[4.2 6.4 8.5]
Process finished with exit code 0
53 Arithmetic between NumPy arrays and scalars
arr2 = np.array([2.2, 3.4, 4.5])
print(arr2 *10)
data =[[1,2,3],[4,5,6]]
arr2 = np.array(data)
print(arr2)
'''
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
'''
print(np.zeros(10))
'''
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
'''
print(np.zeros((3,5)))
'''
[[1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1.]]
'''
print(np.ones((4,6)))
# np.empty allocates the array without initializing it, so the printed values are arbitrary
print(np.empty((4,3,4)))
Result:
[22. 34. 45.]
[[1 2 3]
[4 5 6]]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
[[1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1.]]
[[[1.90452583e-316 2.24427905e-315 1.67488254e-321 nan]
[1.12646967e-321 0.00000000e+000 4.69461772e+257 6.47522107e+170]
[1.48978395e+214 2.28251431e+232 1.89122297e+219 4.27255602e+180]]
[[3.99550968e+252 5.83439496e+252 9.08254283e+223 4.37214811e+156]
[6.01099866e+175 4.71353372e+257 2.44084997e-154 2.58413351e+161]
[6.21040976e+175 6.24379987e-119 6.01346953e-154 1.36456564e+161]]
[[1.96871856e-153 4.71341931e+257 2.44084997e-154 6.08709837e+247]
[1.61674072e+184 2.63433603e-085 6.75854449e+199 2.35288410e+251]
[3.99193922e+252 1.08785657e+155 2.20892690e+161 5.55183541e-048]]
[[1.72053723e+243 1.81596890e-152 1.69375774e+190 2.97399120e+222]
[1.11493509e+277 6.77023426e+223 9.69279043e-072 8.92892213e+271]
[1.06112832e-153 2.32160044e-152 2.73461243e+161 2.43566010e-154]]]
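Process finished with exit code 0
54 Indexing and slicing NumPy arrays
arr4 =np.arange(10)
print(arr4[5:8])
# Assigning to a slice changes the original array
arr4 =np.arange(10)
arr4[5:8]=10
print(arr4)
# Working on a copy of the slice leaves the original array unchanged
arr4 =np.arange(10)
arr_slice = arr4[5:8].copy()
arr_slice[:] = 15
print(arr_slice)
print(arr4)
Result: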
[5 6 7]
[ 0 1 2 3 4 10 10 10 8 9]
[15 15 15]
[0 1 2 3 4 5 6 7 8 9]
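Process finished with exit code 0
55 Installing pandas and the Series structure
# Install pandas: ./python3 -m pip install pandas
from pandas import Series,DataFrame
import pandas as pd
# pandas offers: 1. automatic data alignment 2. handling of missing data by filling in chosen values 3. join operations similar to SQL queries
# A Series is a one-dimensional array
obj = Series([4,5,6,-7])
print(obj)
# A pandas index may contain duplicate labels
print(obj.index)
print(obj.values)
# Hashable types such as int, float, str, and tuple can be dictionary keys
# Unhashable types such as a list ['a'] or a set {'b'} cannot
# The next line therefore raises TypeError
var = {['a']: 1}
Result: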
0 4
1 5
2 6
3 -7
dtype: int64
RangeIndex(start=0, stop=4, step=1)
[ 4 5 6 -7]
Traceback (most recent call last):
File "E:/python/test.py", line 866, in <module>
var = {['a']: 1}
TypeError: unhashable type: 'list'
Process finished with exit code 1
56 Basic Series operations
from pandas import Series,DataFrame
import pandas as pd
obj2 = Series([4,7,-5,3],index =['d','b','c','a'])
print(obj2)
obj2['c'] = 6
print(obj2)
print('a' in obj2)
sdata = {'beijing':35000,'shanghai':'71000','guangzhou':'10000','shenzhen':'5000'}
obj3 = Series(sdata)
print(obj3)
obj3.index = ['bj','gz','sh','sz']
print(obj3)
Result:
d 4
b 7
c -5
a 3
dtype: int64
d 4
b 7
c 6
a 3
dtype: int64
True
beijing 35000
shanghai 71000
guangzhou 10000
shenzhen 5000
dtype: object
bj 35000
gz 71000
sh 10000
sz 5000
dtype: object
Process finished with exit code 0
57 Basic DataFrame operations
from pandas import DataFrame,Series
data = {'city':['shanghai','beijing','guangzhou','nanjing','shenzhen'],
'year':['2016','2017','2018','2019','2020'],
'pop':[1.5,1.7,3.6,2.4,2.9]
}
frame = DataFrame(data)
frame2 = DataFrame(data,columns=['year','city','pop'])
print(frame)
print(frame2)
print(frame2['city'])
print(frame2.year)
frame2['new'] =100
print(frame2)
frame2['cap'] = frame2.city =='beijing'
print(frame2)
pop ={'beijing':{2008:1.5,2009:2.0},
'shanghai':{2008:2.0,2009:3.6}}
frame3 = DataFrame(pop)
print(frame)
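# Swap rows and columns (transpose)
print(frame3.T)
# Reindexing
obj4 = Series([4.5,7.2,-5.3,3.6],index = ['b','d','c','a'])
obj5 = obj4.reindex(['a','b','c','d','e'],fill_value=0)
print(obj5)
obj6 = Series(['blue','purple','yellow'],index =[0,2,4])
# ffill fills a missing value with the previous value
print(obj6.reindex(range(6),method='ffill'))
# bfill fills a missing value with the next value
print(obj6.reindex(range(6),method='bfill'))
# Drop missing data; a Series is one-dimensional
from numpy import nan as NA
# The values must be passed as one list; the original call Series(1,NA,2) raises the TypeError shown in the output below
data =Series([1,NA,2])
print(data)
# Drop missing values
print(data.dropna())
# In a DataFrame, a row may have some values missing or all values missing
data2 = DataFrame([[1,5.5,3],[1,NA,NA],[NA,NA,NA]])
# By default dropna removes every row that contains a missing value
print(data2.dropna())
# how='all' removes only the rows in which every value is missing
print(data2.dropna(how='all'))
# Set a whole column to missing values
data2[4] =NA
print(data2)
# axis=1 applies the same rule to columns
print(data2.dropna(axis=1,how='all'))
# fillna returns a new object; with inplace=True it modifies data2 and returns None
data2.fillna(0)
data2.fillna(0,inplace=True)
print(data2)
Result: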
city year pop
0 shanghai 2016 1.5
1 beijing 2017 1.7
2 guangzhou 2018 3.6
3 nanjing 2019 2.4
4 shenzhen 2020 2.9
year city pop
0 2016 shanghai 1.5
1 2017 beijing 1.7
2 2018 guangzhou 3.6
3 2019 nanjing 2.4
4 2020 shenzhen 2.9
0 shanghai
1 beijing
2 guangzhou
3 nanjing
4 shenzhen
Name: city, dtype: object
0 2016
1 2017
2 2018
3 2019
4 2020
Name: year, dtype: object
year city pop new
0 2016 shanghai 1.5 100
1 2017 beijing 1.7 100
2 2018 guangzhou 3.6 100
3 2019 nanjing 2.4 100
4 2020 shenzhen 2.9 100
year city pop new cap
0 2016 shanghai 1.5 100 False
1 2017 beijing 1.7 100 True
2 2018 guangzhou 3.6 100 False
3 2019 nanjing 2.4 100 False
4 2020 shenzhen 2.9 100 False
city year pop
0 shanghai 2016 1.5
1 beijing 2017 1.7
2 guangzhou 2018 3.6
3 nanjing 2019 2.4
4 shenzhen 2020 2.9
2008 2009
beijing 1.5 2.0
shanghai 2.0 3.6
a 3.6
b 4.5
c -5.3
d 7.2
e 0.0
dtype: float64
0 blue
1 blue
2 purple
3 purple
4 yellow
5 yellow
dtype: object
0 blue
1 purple
2 purple
3 yellow
4 yellow
5 NaN
dtype: object
Traceback (most recent call last):
File "E:/python/test.py", line 916, in <module>
data =Series(1,NA,2)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 281, in __init__
index = ensure_index(index)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 5917, in ensure_index
return Index(index_like)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 372, in __new__
raise cls._scalar_data_error(data)
TypeError: Index(...) must be called with a collection of some kind, nan was passed
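Process finished with exit code 1
58 Hierarchical indexing
# A Series holds one-dimensional data; a DataFrame holds two-dimensional data
import numpy as np
from pandas import Series
data3 =Series(np.random.randn(10),
    index=[['a','a','a','b','b','b','c','c','d','d'],
    [1,2,3,1,2,3,1,2,2,3]])
print(data3)
print(data3['b'])
# Select several outer labels; the original call data3['b','c'] looks up the inner label 'c' under 'b' and raises the KeyError shown in the output below
print(data3.loc[['b','c']])
# Convert to a DataFrame (one dimension to two)
print(data3.unstack())
# Convert back to a Series (two dimensions to one)
print(data3.unstack().stack())
Result: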
a 1 -0.651915
2 0.167612
3 -0.199349
b 1 -1.377333
2 0.519355
3 -0.573552
c 1 -1.049817
2 0.736426
d 2 0.138968
3 -0.560910
dtype: float64
1 -1.377333
2 0.519355
3 -0.573552
dtype: float64
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 98, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index_class_helper.pxi", line 89, in pandas._libs.index.Int64Engine._check_type
KeyError: 'c'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "pandas\_libs\index.pyx", line 705, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
raise KeyError(key) from err
KeyError: 'c'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 859, in __getitem__
result = self._get_value(key)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 961, in _get_value
loc = self.index.get_loc(label)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py", line 2886, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 708, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: ('b', 'c')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 98, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index_class_helper.pxi", line 89, in pandas._libs.index.Int64Engine._check_type
KeyError: 'c'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "pandas\_libs\index.pyx", line 705, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
raise KeyError(key) from err
KeyError: 'c'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py", line 3036, in _get_loc_level
return (self._engine.get_loc(key), None)
File "pandas\_libs\index.pyx", line 708, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: ('b', 'c')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "E:/python/test.py", line 944, in <module>
print(data3['b','c'])
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 867, in __getitem__
return self._get_values_tuple(key)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 930, in _get_values_tuple
indexer, new_index = self.index.get_loc_level(key)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py", line 2965, in get_loc_level
return self._get_loc_level(key, level=level, drop_level=drop_level)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py", line 3038, in _get_loc_level
raise KeyError(key) from e
KeyError: ('b', 'c')
Process finished with exit code 1
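The traceback above comes from indexing with `data3['b','c']`. The following is a minimal sketch of one way to select several outer labels and to round-trip through `unstack()`/`stack()`. It uses fixed values instead of random ones so the run is reproducible; the variable names (`data`, `both`, `table`, `restored`) are illustrative and not from the original notes.

```python
import numpy as np
import pandas as pd

# Fixed values 0.0 .. 9.0 instead of random numbers, so the run is reproducible.
data = pd.Series(
    np.arange(10, dtype=float),
    index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
           [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]],
)

# A list of outer labels selects several groups at once.
both = data.loc[['b', 'c']]
print(both)

# A tuple addresses a single element: outer label 'b', inner label 2.
single = data.loc[('b', 2)]
print(single)

# unstack() pivots the inner level into columns; missing pairs become NaN.
table = data.unstack()
print(table)

# stack() goes back to one dimension; dropna() removes the NaN cells that
# unstack() introduced, whichever pandas version is in use.
restored = table.stack().dropna()
print(restored)
```

The key point is the difference between a list (several labels on the outer level) and a tuple (one label per level).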
59 Installing Matplotlib and plotting
# Draw a single line
import matplotlib.pyplot as plt
# Plot a simple line through the points (1,4), (3,8) and (5,10)
plt.plot([1,3,5],[4,8,10])
plt.show()
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-np.pi,np.pi,100) # 100 evenly spaced x values from -pi to pi (about -3.14 to 3.14)
plt.plot(x,np.sin(x))
# Show the figure
plt.show()
# Plot several curves
x = np.linspace(-np.pi*2,np.pi*2,100) # from -2*pi to 2*pi
plt.figure(1,dpi=50) # create figure 1
for i in range(1,5): # draw 4 curves
    plt.plot(x,np.sin(x/i))
plt.show()
# Histogram
import numpy as np
import matplotlib.pyplot as plt
plt.figure(1,dpi=50)
data =[1,1,1,2,2,2,3,3,4,5,5,6,4]
plt.hist(data)
plt.show()
# Scatter plot
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(1,10)
y = x
fig = plt.figure()
plt.scatter(x,y,c ='r',marker ='o') # c='r' draws the points in red; marker='o' draws each point as a circle
plt.show()
#pandas
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
iris =pd.read_csv("./iris_test.csv")
print(iris.head())
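# Draw a scatter plot ("120" and "4" are column names taken from the header row of the CSV file)
iris.plot(kind='scatter',x ="120",y="4")
plt.show()
# Using seaborn
# Install: python3 -m pip install seaborn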
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
iris =pd.read_csv("./iris_test.csv")
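# Set the style
sns.set(style='white',color_codes=True)
# Joint plot drawn as a scatter plot (height replaces the size argument of older seaborn versions)
sns.jointplot(x="120",y="4",data=iris,height=5)
# displot draws the distribution of column "120"
sns.displot(iris['120'])
# Only needed so that the plots are displayed when run from PyCharm
plt.show()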
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
iris =pd.read_csv("./iris_test.csv")
# Set the style
sns.set(style='white',color_codes=True)
# FacetGrid: general-purpose plotting grid
# hue: color the points by class (0/1/2)
# plt.scatter: draw a scatter plot
# add_legend(): show the legend that describes the classes
sns.FacetGrid(iris,hue="virginica",height=5).map(plt.scatter,"120","4").add_legend()
# The same plot using other column headers
sns.FacetGrid(iris,hue="virginica",height=5).map(plt.scatter,"setosa","versicolor").add_legend()
# Only needed so that the plots are displayed when run from PyCharm
plt.show()
Result:
60 How machine learning classification works
1. Algorithm design
2. Training data
3. The prediction tool: the model
ax+by = c
a = 5
b = 6
c = 7
5x + 6y = 7
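The notes do not say how the equation is used, so the following is only a sketch under an assumption: training fixes the parameters a, b and c, and the resulting line 5x + 6y = 7 then acts as a decision boundary that separates two classes. The function name `classify` and the sample points are hypothetical.

```python
# Parameters as in the notes: the "model" is the line 5x + 6y = 7.
A, B, C = 5, 6, 7

def classify(x, y, a=A, b=B, c=C):
    """Return 1 if a*x + b*y >= c (on or beyond the line), otherwise 0."""
    return 1 if a * x + b * y >= c else 0

# Hypothetical points to classify.
points = [(0, 0), (1, 1), (2, -1), (-1, 3)]
labels = [classify(x, y) for x, y in points]
print(labels)
```

In this reading, the algorithm and the training data produce the three numbers, and prediction is nothing more than evaluating the equation for a new point.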
61 Installing TensorFlow
62 Model and code for analysis based on feature values
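63 Collecting web page data with the urllib library
Network libraries:
1. urllib: a commonly used library for the HTTP protocol
2. requests: a commonly used library for the HTTP protocol
3. BeautifulSoup: a library for parsing XML/HTML documents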
from urllib import request
url = "http://www.baidu.com"
response =request.urlopen(url,timeout=1)
print(response.read().decode('UTF-8'))
64 The two common request methods: GET and POST
Reference: http://httpbin.org
# GET request
from urllib import parse
from urllib import request
# Always pass timeout; without it the call can hang when the server does not respond
response =request.urlopen('http://httpbin.org/get',timeout =1)
print(response.read())
# POST request
from urllib import parse
from urllib import request
data = bytes(parse.urlencode({'word':'hello'}),encoding ='utf-8')
print(data)
response =request.urlopen('http://httpbin.org/post',data=data)
print(response.read().decode('UTF-8'))
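# Exception handling
import urllib.request
import urllib.error
import socket # socket library; provides the timeout exception type
try:
    response3 =urllib.request.urlopen('http://httpbin.org/get',timeout =0.1)
except urllib.error.URLError as e:
    # A connection timeout is reported as a URLError whose reason is socket.timeout
    if isinstance(e.reason,socket.timeout):
        print('TIME OUT')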
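The difference between the two request types can be checked without touching the network. The sketch below builds `Request` objects and inspects them instead of sending them; the extra `lang` field and the variable names are made up for illustration.

```python
from urllib import parse, request

# urlencode turns a dict into a query string; a POST body must be bytes.
query = parse.urlencode({'word': 'hello', 'lang': 'zh-CN'})
body = query.encode('utf-8')

# Without data the request is a GET; passing data switches it to POST.
get_req = request.Request('http://httpbin.org/get?' + query)
post_req = request.Request('http://httpbin.org/post', data=body)

print(query)
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

For GET the parameters travel in the URL, while for POST the same encoded string travels in the request body.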
65 Simulating HTTP request headers
from urllib import request,parse
url = 'http://httpbin.org/post'
headers ={
'Connection': 'keep-alive',
'Cache-Control': 'max-age=0',
'Accept': 'text/html, */*; q=0.01',
'X-Requested-With': 'XMLHttpRequest',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36',
'DNT': '1',
'Referer': 'httpbin.org',
'Accept-Encoding': 'gzip, deflate, sdch',
'Accept-Language': 'zh-CN,zh;q=0.8,ja;q=0.6'
}
# Form fields to send (named form_data so the built-in dict is not shadowed)
form_data = {
    'name':'value'
}
data = bytes(parse.urlencode(form_data),encoding='utf-8')
req = request.Request(url=url,data=data,headers =headers,method='POST')
response = request.urlopen(req)
print(response.read().decode('utf-8'))
Result:
{
"args": {},
"data": "",
"files": {},
"form": {
"name": "value"
},
"headers": {
"Accept": "text/html, */*; q=0.01",
"Accept-Encoding": "gzip, deflate, sdch",
"Accept-Language": "zh-CN,zh;q=0.8,ja;q=0.6",
"Cache-Control": "max-age=0",
"Content-Length": "10",
"Content-Type": "application/x-www-form-urlencoded",
"Dnt": "1",
"Host": "httpbin.org",
"Referer": "httpbin.org",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36",
"X-Amzn-Trace-Id": "Root=1-62e7ed3c-308d28c33ca0d7af03dfb476",
"X-Requested-With": "XMLHttpRequest"
},
"json": null,
"origin": "117.136.39.102",
"url": "http://httpbin.org/post"
}
Process finished with exit code 0
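To see what `urllib` will send before it is sent, the `Request` object can be inspected offline. This is a small sketch with a shortened, made-up header set. One detail worth knowing is that `Request` stores header names in capitalized form (`User-agent`), so lookups have to use that spelling.

```python
from urllib import parse, request

# Shortened header set for illustration only.
headers = {'User-Agent': 'Mozilla/5.0', 'X-Requested-With': 'XMLHttpRequest'}
data = parse.urlencode({'name': 'value'}).encode('utf-8')
req = request.Request('http://httpbin.org/post', data=data,
                      headers=headers, method='POST')

print(req.get_method())
# Header names are stored via str.capitalize(): 'User-agent', 'X-requested-with'.
print(req.header_items())
print(req.get_header('User-agent'))
# The body is the URL-encoded form; its length is what Content-Length reports.
print(req.data, len(req.data))
```

The body `name=value` is 10 bytes long, which matches the `Content-Length` of `10` echoed by httpbin in the result above.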
66 Basic usage of the requests library
# Install: python3 -m pip install requests
# GET request
import requests
url = 'http://httpbin.org/get'
data ={"key":'value','abc':'xyz'}
# requests.get sends a GET request; the dict is passed as params and encoded into the query string automatically
response = requests.get(url,params=data)
print(response.text)
# POST request
import requests
url = 'http://httpbin.org/post'
data = {'key':'value','abc':'xyz'}
response = requests.post(url,data=data)
# Parse the JSON response body into a Python dict
print(response.json())
Result:
{
"args": {
"abc": "xyz",
"key": "value"
},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.25.1",
"X-Amzn-Trace-Id": "Root=1-62e7ed85-007cdcc9324cbf9639748079"
},
"origin": "117.136.39.102",
"url": "http://httpbin.org/get?key=value&abc=xyz"
}
{'args': {}, 'data': '', 'files': {}, 'form': {'abc': 'xyz', 'key': 'value'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Content-Length': '17', 'Content-Type': 'application/x-www-form-urlencoded', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.25.1', 'X-Amzn-Trace-Id': 'Root=1-62e7ed86-0c5aff6665c0c15c5e9e27dc'}, 'json': None, 'origin': '117.136.39.102', 'url': 'http://httpbin.org/post'}
Process finished with exit code 0
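The two results above differ in where the dictionary ends up: in `args` (the query string) for GET and in `form` (the body) for POST. A rough way to confirm this without any network access is to prepare the requests and inspect them, as sketched below; the variable names are illustrative.

```python
import requests

data = {'key': 'value', 'abc': 'xyz'}

# Build the requests without sending them, then inspect what would go on the wire.
get_req = requests.Request('GET', 'http://httpbin.org/get', params=data).prepare()
post_req = requests.Request('POST', 'http://httpbin.org/post', data=data).prepare()

print(get_req.url)    # parameters are appended to the URL
print(get_req.body)   # a GET built this way has no body
print(post_req.url)   # the URL stays unchanged
print(post_req.body)  # parameters are form-encoded into the body
print(post_req.headers['Content-Type'])
```

The prepared POST body is 17 characters long, consistent with the `Content-Length` of `17` in the httpbin response shown above.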
67 Crawling image links with regular expressions
import requests
import re
content = requests.get('http://www.cnu.cc/discoveryPage/hot-人像').text
print(content)
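# findall: capture the link and the title of every matching <a> element
pattern = re.compile(r'<a href="(.*?)".*?title">(.*?)</div>',re.S)
results =re.findall(pattern,content)
print(results)
for result in results:
    url,name = result
    # Remove all whitespace from the title before printing
    print(url,re.sub(r'\s','',name))
Result: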
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Error 500 - Internal Server Error</title>
<meta name="viewport" content="width=device-width">
<style type="text/css">
article, aside, details, figcaption, figure, footer, header, hgroup, nav, section { display: block; }
audio, canvas, video { display: inline-block; *display: inline; *zoom: 1; }
audio:not([controls]) { display: none; }
[hidden] { display: none; }
html { font-size: 100%; -webkit-text-size-adjust: 100%; -ms-text-size-adjust: 100%; }
html, button, input, select, textarea { font-family: sans-serif; color: #222; }
body { margin: 0; font-size: 1em; line-height: 1.4; }
::-moz-selection { background: #E37B52; color: #fff; text-shadow: none; }
::selection { background: #E37B52; color: #fff; text-shadow: none; }
a { color: #00e; }
a:visited { color: #551a8b; }
a:hover { color: #06e; }
a:focus { outline: thin dotted; }
a:hover, a:active { outline: 0; }
abbr[title] { border-bottom: 1px dotted; }
b, strong { font-weight: bold; }
blockquote { margin: 1em 40px; }
dfn { font-style: italic; }
hr { display: block; height: 1px; border: 0; border-top: 1px solid #ccc; margin: 1em 0; padding: 0; }
ins { background: #ff9; color: #000; text-decoration: none; }
mark { background: #ff0; color: #000; font-style: italic; font-weight: bold; }
pre, code, kbd, samp { font-family: monospace, serif; _font-family: 'courier new', monospace; font-size: 1em; }
pre { white-space: pre; white-space: pre-wrap; word-wrap: break-word; }
q { quotes: none; }
q:before, q:after { content: ""; content: none; }
small { font-size: 85%; }
sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
sup { top: -0.5em; }
sub { bottom: -0.25em; }
ul, ol { margin: 1em 0; padding: 0 0 0 40px; }
dd { margin: 0 0 0 40px; }
nav ul, nav ol { list-style: none; list-style-image: none; margin: 0; padding: 0; }
img { border: 0; -ms-interpolation-mode: bicubic; vertical-align: middle; }
svg:not(:root) { overflow: hidden; }
figure { margin: 0; }
form { margin: 0; }
fieldset { border: 0; margin: 0; padding: 0; }
label { cursor: pointer; }
legend { border: 0; *margin-left: -7px; padding: 0; white-space: normal; }
button, input, select, textarea { font-size: 100%; margin: 0; vertical-align: baseline; *vertical-align: middle; }
button, input { line-height: normal; }
button, input[type="button"], input[type="reset"], input[type="submit"] { cursor: pointer; -webkit-appearance: button; *overflow: visible; }
button[disabled], input[disabled] { cursor: default; }
input[type="checkbox"], input[type="radio"] { box-sizing: border-box; padding: 0; *width: 13px; *height: 13px; }
input[type="search"] { -webkit-appearance: textfield; -moz-box-sizing: content-box; -webkit-box-sizing: content-box; box-sizing: content-box; }
input[type="search"]::-webkit-search-decoration, input[type="search"]::-webkit-search-cancel-button { -webkit-appearance: none; }
button::-moz-focus-inner, input::-moz-focus-inner { border: 0; padding: 0; }
textarea { overflow: auto; vertical-align: top; resize: vertical; }
input:valid, textarea:valid { }
input:invalid, textarea:invalid { background-color: #f0dddd; }
table { border-collapse: collapse; border-spacing: 0; }
td { vertical-align: top; }
body
{
font-family:'Droid Sans', sans-serif;
font-size:10pt;
color:#555;
line-height: 25px;
}
.wrapper
{
width:760px;
margin:0 auto 5em auto;
}
.main
{
overflow:hidden;
}
.error-spacer
{
height:4em;
}
a, a:visited
{
color:#2972A3;
}
a:hover
{
color:#72ADD4;
}
</style>
</head>
<body>
<div class="wrapper">
<div class="error-spacer"></div>
<div role="main" class="main">
<h1>Ouch.</h1>
<h2>Server Error: 500 (Internal Server Error)</h2>
<hr>
<h3>What does this mean?</h3>
<p>
Something went wrong on our servers while we were processing your request.
We're really sorry about this, and will work hard to get this resolved as
soon as possible.
</p>
<p>
Perhaps you would like to go to our <a href="http://www.cnu.cc">home page</a>?
</p>
<p>
有问题,请微博联系<a href="https://weibo.com/songhuai">CNU_Will</a>
</p>
</div>
</div>
</body>
</html>
[]
Process finished with exit code 0
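In the run above the server answered with a 500 error page, so `findall` had nothing to match and returned `[]`. To show what the pattern extracts when the expected markup is present, here is a sketch run against a small hand-written HTML fragment. The fragment, its URLs and its titles are invented and only assumed to resemble the structure the pattern targets.

```python
import re

# Hypothetical markup: a link wrapping a <div class="title"> element.
sample_html = '''
<a href="http://example.com/works/1" class="thumbnail">
  <div class="title">
    Sunset
  </div>
</a>
<a href="http://example.com/works/2" class="thumbnail">
  <div class="title">Harbor</div>
</a>
'''

# re.S lets "." match newlines, so one match can span several lines.
pattern = re.compile(r'<a href="(.*?)".*?title">(.*?)</div>', re.S)

# Each match is a (url, title) tuple; whitespace is stripped from the title.
results = [(url, re.sub(r'\s', '', name))
           for url, name in pattern.findall(sample_html)]
print(results)

# On a page without the expected markup the result is an empty list.
print(pattern.findall('<h1>Ouch.</h1>'))
```

Note that `re.sub(r'\s', '', name)` removes every whitespace character, including spaces between words, which is harmless for titles without spaces but would merge the words of a multi-word title.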
68 Installing and using BeautifulSoup
# No need to write regular expressions
# Install: python3 -m pip install beautifulsoup4 lxml
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc,'lxml')
# Pretty-print the parsed document
print(soup.prettify())
# Find the first <p> tag
print(soup.p)
# Get the class of the first <p> tag
print(soup.p['class'])
# Find the first <a> tag
print(soup.a)
# Find all <a> tags
print(soup.find_all('a'))
# Find the tag whose id is link3
print(soup.find(id="link3"))
# Print the link of every <a> tag
for link in soup.find_all('a'):
    print(link.get('href'))
# Print all the text in the document
print(soup.get_text())
Result:
<html>
<head>
<title>
The Dormouse's story
</title>
</head>
<body>
<p class="title">
<b>
The Dormouse's story
</b>
</p>
<p class="story">
Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1">
Elsie
</a>
,
<a class="sister" href="http://example.com/lacie" id="link2">
Lacie
</a>
and
<a class="sister" href="http://example.com/tillie" id="link3">
Tillie
</a>
;
and they lived at the bottom of a well.
</p>
<p class="story">
...
</p>
</body>
</html>
<p class="title"><b>The Dormouse's story</b></p>
['title']
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>
[<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>
http://example.com/elsie
http://example.com/lacie
http://example.com/tillie
The Dormouse's story
The Dormouse's story
Once upon a time there were three little sisters; and their names were
Elsie,
Lacie and
Tillie;
and they lived at the bottom of a well.
...
Process finished with exit code 0
69 Crawling a news site
from bs4 import BeautifulSoup
import requests
headers = {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Accept-Language": "zh-CN,zh;q=0.8",
"Connection": "close",
"Cookie": "_gauges_unique_hour=1; _gauges_unique_day=1; _gauges_unique_month=1; _gauges_unique_year=1; _gauges_unique=1",
"Referer": "http://www.infoq.com",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36 LBBROWSER"
}
url = 'http://www.infoq.com/cn/news'
# Fetch the news titles
def craw2(url):
    response = requests.get(url,headers = headers)
    soup = BeautifulSoup(response.text,'lxml')
    for title_href in soup.find_all('div',class_='news_type_block'):
        print([title.get('title')
               for title in title_href.find_all('a') if title.get('title')])
# craw2(url)
# Paging: request the pages at offsets 15, 30 and 45
for i in range(15,46,15):
    url = 'http://www.infoq.com/cn/news/'+str(i)
    craw2(url)
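The paging loop can be checked offline by generating the URLs first and printing them before any request is made. This is a sketch that assumes the page offset is appended to the base URL as a path segment; the helper name `page_urls` is hypothetical.

```python
BASE_URL = 'http://www.infoq.com/cn/news'

def page_urls(base, step, pages):
    """Return `pages` URLs with offsets step, 2*step, ... appended to base."""
    return [f"{base}/{offset}" for offset in range(step, step * pages + 1, step)]

# Three pages with 15 entries per page, as in the loop above.
urls = page_urls(BASE_URL, step=15, pages=3)
for u in urls:
    print(u)
```

Printing the list first makes it easy to spot a malformed address, such as a missing `/` between the path and the offset, before the crawler starts.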
70 Crawling image links and downloading the images
from bs4 import BeautifulSoup
import requests
import os
import shutil
headers = {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Accept-Language": "zh-CN,zh;q=0.8",
"Connection": "close",
"Cookie": "_gauges_unique_hour=1; _gauges_unique_day=1; _gauges_unique_month=1; _gauges_unique_year=1; _gauges_unique=1",
"Referer": "http://www.infoq.com",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36 LBBROWSER"
}
url = 'http://www.infoq.com/cn/presentations'
# Download an image
# The Requests library wraps the lower-level interfaces in a friendlier HTTP client, but it has no dedicated file-download function.
# Downloading is done by passing the stream parameter. With stream=True,
# the request fetches only the response headers and keeps the connection open;
# the body is downloaded only when it is read, for example through Response.content or Response.raw
def download_jpg(image_url, image_localpath):
    response = requests.get(image_url, stream=True)
    if response.status_code == 200:
        with open(image_localpath, 'wb') as f:
            # Decode gzip/deflate content encoding while reading the raw stream
            response.raw.decode_content = True
            shutil.copyfileobj(response.raw, f)
# Fetch the presentation images
def craw3(url):
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'lxml')
    for pic_href in soup.find_all('div', class_='news_type_video'):
        for pic in pic_href.find_all('img'):
            imgurl = pic.get('src')
            save_dir = os.path.abspath('.')
            filename = os.path.basename(imgurl)
            imgpath = os.path.join(save_dir, filename)
            print('Downloading %s' % imgurl)
            download_jpg(imgurl, imgpath)
# craw3(url)
#
# Paging
j = 0
for i in range(12, 37, 12):
    url = 'http://www.infoq.com/cn/presentations/' + str(i)
    j += 1
    print('Page %d' % j)
    craw3(url)
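The download function relies on `shutil.copyfileobj`, which copies from any object that has a `read()` method. The sketch below shows that step in isolation, with an in-memory `io.BytesIO` standing in for `response.raw`; the fake payload and the file name are invented for the example.

```python
import io
import os
import shutil
import tempfile

# Stand-in for response.raw: any object with a read() method works.
fake_body = io.BytesIO(b'\xff\xd8\xff' + b'\x00' * 100_000)

target = os.path.join(tempfile.mkdtemp(), 'picture.jpg')
with open(target, 'wb') as f:
    # copyfileobj reads and writes in chunks (16 KiB here), so a real
    # network stream never has to be held in memory as a whole.
    shutil.copyfileobj(fake_body, f, length=16 * 1024)

print(os.path.getsize(target))
```

With a real streamed response the same call writes the image to disk piece by piece as the bytes arrive.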
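71 How to analyze source code and produce a sound design
1. Not knowing which modules to start writing first
2. Not knowing how to tie the modules to the business logic
3. Designing the individual modules
# dbdb: persistent storage for a dictionary
python3 -m dbdb.tool
python3 -m dbdb.tool a.db set a 123
python3 -m dbdb.tool a.db get a
python3 -m dbdb.tool a.db delete a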
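To make the idea of a persistent dictionary concrete, here is a deliberately simple sketch that supports the same three operations as the commands above (set, get, delete) and keeps its data in a JSON file. It is not how dbdb itself is implemented; the class name `JsonStore` and the file layout are assumptions made purely for illustration.

```python
import json
import os
import tempfile

class JsonStore:
    """Tiny persistent dictionary: every change is written back to a JSON file."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        # Load existing contents so values survive between runs.
        if os.path.exists(path):
            with open(path, encoding='utf-8') as f:
                self._data = json.load(f)

    def _save(self):
        with open(self.path, 'w', encoding='utf-8') as f:
            json.dump(self._data, f)

    def __contains__(self, key):
        return key in self._data

    def set(self, key, value):
        self._data[key] = value
        self._save()

    def get(self, key):
        return self._data[key]  # raises KeyError when the key is missing

    def delete(self, key):
        del self._data[key]
        self._save()

# Usage mirroring the set / get / delete commands.
path = os.path.join(tempfile.mkdtemp(), 'a.json')
store = JsonStore(path)
store.set('a', '123')

reopened = JsonStore(path)          # a fresh object reads the value from disk
value = reopened.get('a')
print(value)

reopened.delete('a')
still_there = 'a' in JsonStore(path)
print(still_there)
```

Splitting the problem this way (a storage layer, a dictionary-like interface on top, and a command-line tool that calls it) is one possible module breakdown to keep in mind while reading the real source.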