Catalogue
- 1.Introduction
- 2.grammar
- 3. Anaconda
- 4.File & os
- 5. Crawler(request+beautifulsoup)
- 6. Django
- 7. pymysql
- 8. JSON
- 9. pyqt
- 10. re
- 11. numpy
- 12. pandas
- 13. Thread
- 14. socket
- 15.Tools
- 16.Pypi
- 17.design mode
- 18. selenium
- 19. Time
- 20. built in functions
- 21. logging
- 22. decorator
- 23. generator & yield
- 24. async & gevent
- 25. Variable
- 26. unittest
1.Introduction
This document records some common commands and the usage of common third-party packages.
About Pip
pip is an essential tool in Python. Here are a few pip tips for convenience.
Running pip install -r ./requirements.txt
installs all the third-party packages listed in the file in one batch.
requirements.txt
Cython>=0.22
numpy>=1.9.1
scipy>=0.16.0
scikit-learn>=0.18.0
2.grammar
doc:
Class
public protected private
doc:
class MyObject(object):
    def __init__(self):
        self.public_field = 5
        self._protected_field = 8
        self.__private_field = 10

    def get_private_field(self):
        return self.__private_field

foo = MyObject()
# Access the public attribute
print(foo.public_field)
# Access the protected attribute (works, but is discouraged outside the class)
print(foo._protected_field)
# Accessing the private attribute raises an error:
# AttributeError: 'MyObject' object has no attribute '__private_field'
print(foo.__private_field)
protected
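Attributes that start with a single underscore are protected. The naming convention exists to reduce accidental access to internal attributes: it signals that the field is protected and that code outside the class should use it with care. In practice it behaves exactly like a public attribute; only the name expresses the intent.
Prefer protected attributes, and document their proper usage for subclass developers, rather than trying to restrict subclass access with private attributes.
You should deliberately avoid accessing protected attributes from outside the class, although doing so does not raise an error.
private
It works because Python transforms the names of private attributes: for example, the __private_field field of MyObject is actually renamed to _MyObject__private_field. Because the transformed name no longer matches the name being accessed, code outside the class, including subclasses, cannot reach the attribute under its original name.
In other words, the Python compiler does not strictly enforce the privacy of private fields.
Why doesn't Python enforce the privacy of private fields at the syntax level? Put simply: "We are all consenting adults here." Many Python programmers share this view and consider openness better than restriction.
Another reason is that the language already provides attribute hooks (getattr and similar) that let developers manipulate an object's internal data as they need. Given that, why bother blocking access to private attributes?
Finally, do not blindly make attributes private. Plan from the start and allow subclasses more access to the superclass's internal API; consider private attributes only to avoid naming conflicts when the subclasses are outside your control.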
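The name transformation described above can be observed directly. The following is a minimal sketch; the Base and Child classes and the field name __token are hypothetical names chosen for illustration.

```python
class Base:
    def __init__(self):
        # Stored on the instance as _Base__token because of name mangling
        self.__token = "base"

    def base_token(self):
        return self.__token


class Child(Base):
    def __init__(self):
        super().__init__()
        # Stored as _Child__token, so it does not overwrite Base's field
        self.__token = "child"

    def child_token(self):
        return self.__token


obj = Child()
print(obj.base_token())         # base
print(obj.child_token())        # child
print(obj._Base__token)         # base: the mangled name is still reachable
print(hasattr(obj, "__token"))  # False: the unmangled name does not exist
```

This is the naming-conflict case mentioned above: each class keeps its own field even though both spell it the same way.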
lambda
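import time
# 1. Bind a lambda to a variable so it can be called like a function
ans = lambda x, y: x + y
print(ans(1, 3))
# 2. Override the behavior of another function
time.sleep = lambda x: None
# This call now returns immediately without sleeping
time.sleep(3)
# 3. Emulate a switch statement with a dict of lambdas
# (method1, method2 and method3 are assumed to be defined elsewhere)
def switch(options):
    funcdic = {
        1: lambda: method1(),
        2: lambda: method2(),
        3: lambda: method3(),
    }
    return funcdic[options]()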
deepcopy
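When you assign an immutable value such as a number or a string to another variable, the two variables do not affect each other. When you assign a mutable object, such as an instance of a custom class, Python copies only the reference, so both names point to the same object and a change made through one is visible through the other. To get an independent copy, use a deep copy, as shown here.
import copy
# Apple and params stand for any custom class and its constructor arguments
a = Apple(params)
b = copy.deepcopy(a)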
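The difference between sharing a reference, a shallow copy, and a deep copy can be sketched as follows. The Apple class and its tags attribute are hypothetical stand-ins for any custom class.

```python
import copy


class Apple:
    def __init__(self, tags):
        self.tags = tags  # a mutable list held by the instance


a = Apple(["red"])
alias = a                 # same object, just another name
shallow = copy.copy(a)    # new Apple, but it shares the same tags list
deep = copy.deepcopy(a)   # new Apple with its own copy of the tags list

a.tags.append("sweet")

print(alias.tags)    # ['red', 'sweet']
print(shallow.tags)  # ['red', 'sweet']
print(deep.tags)     # ['red']
```

Only the deep copy is unaffected by the later change, because copy.deepcopy also copies the objects that the instance refers to.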
3. Anaconda
doc:
4.File & os
doc:
4.1 os file operations
import os
# Return the current working directory
os.getcwd()
# Return the absolute path of the parent directory
os.path.abspath(os.path.join(os.getcwd(), '..'))
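The same two lookups can also be written with the standard-library pathlib module. This is an alternative sketch, not something the notes above rely on.

```python
import os
from pathlib import Path

cwd = Path.cwd()     # current working directory as a Path object
parent = cwd.parent  # parent directory (the root is its own parent)

print(cwd)
print(parent)

# Both approaches should agree on the parent directory
same = str(parent) == os.path.abspath(os.path.join(os.getcwd(), '..'))
print(same)
```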
5. Crawler(request+beautifulsoup)
doc:
Remember to set the request headers.
5.1 A simple request demo
import requests
base_url = 'https://www.baidu.com/more/'
response = requests.get(base_url)
response.encoding='utf-8'
print(response.text)
5.2 Beautiful soup demo1
from bs4 import BeautifulSoup
import requests

base_url = 'https://www.shanghairanking.cn/rankings/bcur/2022'

if __name__ == '__main__':
    print('[info] start')
    response = requests.get(base_url)
    response.encoding = 'utf-8'
    # Build the soup
    soup = BeautifulSoup(response.text, 'lxml')
    # Get the ranking rows of one page
    uni_html_info_list = soup.find('table').find('tbody').find_all('tr')
    # chr(12288) is the full-width space, used as the fill character so CJK columns align
    cols = "{0:{6}^10}\t{1:{6}^10}\t{2:{6}^10}\t{3:{6}^10}\t{4:^10}\t{5:^10}\t"
    # Header: rank, university name, province/city, type, total score, level
    print(cols.format('排名', '大学名称', '省市', '类型', '总分', '办学层次', chr(12288)))
    # Parse each university row
    for uni_item_html_info in uni_html_info_list:
        tds = uni_item_html_info.find_all('td')
        uni_item_info = [
            tds[0].find('div').string.strip(),
            tds[1].find_all('a')[0].string,
            tds[2].contents[0].strip(),
            tds[3].contents[0].strip(),
            tds[4].contents[0].strip(),
            tds[5].contents[0].strip(),
        ]
        print(cols.format(*uni_item_info, chr(12288)))
    print('[info] end')
5.3 Beautiful soup demo2
from bs4 import BeautifulSoup
import requests

base_url = 'https://www.cae.cn'
home_url = '/cae/html/main/col48/column_48_1.html'

if __name__ == '__main__':
    print('[info] start')
    response = requests.get(base_url + home_url)
    response.encoding = 'utf-8'
    # Build the soup
    soup = BeautifulSoup(response.text, 'lxml')
    # Get the URL of every academician's detail page
    for item_list in soup.find_all(name='div', attrs={'class': 'ysxx_namelist clearfix'}):
        for item in item_list.find_all('a'):
            # ys_name = item.string
            item_url = item.get('href')
            item_response = requests.get(base_url + item_url)
            item_soup = BeautifulSoup(item_response.text, 'lxml')
            ys_description = ''
            for item_p in item_soup.find(name='div', attrs={'class': 'intro'}).find_all('p'):
                # get_text() also works when the paragraph has nested tags,
                # where .string would be None
                ys_description = ys_description + item_p.get_text().strip()
            # Open the file in append mode; the with block closes it automatically
            with open("ys_description.txt", "a+", encoding='utf-8') as f:
                f.write(ys_description + '\n')  # write one description per line
            print(ys_description)
6. Django
doc:
start up
python manage.py runserver 0.0.0.0:8087
7. pymysql
doc:
8. JSON
doc:
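- dumps (convert an object to a JSON string)
import json
# dumps serializes any of the basic data types to a string
data1 = json.dumps([])  # list
print(data1, type(data1))
data2 = json.dumps(2)  # number
print(data2, type(data2))
data3 = json.dumps('3')  # string
print(data3, type(data3))
person = {"name": "Tom", "age": 23}  # dict
data4 = json.dumps(person)
print(data4, type(data4))
# Write the dict to a file
with open("test.json", "w", encoding='utf-8') as f:
    # indent pretty-prints the output; the default is None (most compact),
    # and a value of 0 or less only inserts newlines
    json.dump(person, f, indent=4)  # pass the file object
    # Equivalent to f.write(json.dumps(person, indent=4)); use one or the other,
    # because writing both would put two JSON documents in the file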
- loads (convert a JSON string to an object)
import json
text = '{"name": "Tom", "age": 23}'  # restore the string to a dict
data1 = json.loads(text)
print(data1, type(data1))
with open("test.json", "r", encoding='utf-8') as f:
    data2 = json.loads(f.read())  # loads takes a string
    print(data2, type(data2))
    f.seek(0)  # move the file cursor back to the start of the file
    data3 = json.load(f)  # load takes a file object
    print(data3, type(data3))
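A round trip through dumps and loads does not always return exactly the object that went in. The sketch below uses made-up sample data to show three effects worth knowing: tuples come back as lists, non-string keys come back as strings, and ensure_ascii=False keeps non-ASCII characters readable in the output.

```python
import json

original = {"name": "Tom", "scores": (90, 85), 1: "int key", "city": "上海"}

# ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes
text = json.dumps(original, ensure_ascii=False)
restored = json.loads(text)

print(text)
print(restored["scores"])  # [90, 85]  (the tuple became a list)
print("1" in restored)     # True      (the int key became a string)
print(1 in restored)       # False
```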
9. pyqt
doc:
QMessageBox.about(self, "Notice", "Please select an item")
user = self.lineEdit_user.text()
10. re
doc:
11. numpy
doc:
11.1 numpy file operations
import numpy as np
a = np.random.rand(2, 3)
print(a)
# save as npy
np.save('my_array', a)
# read from npy
a_loaded = np.load('my_array.npy')
print(a_loaded)
# save as csv, then read it back
np.savetxt('my_array.csv', a, delimiter=',')
a_loaded = np.loadtxt('my_array.csv', delimiter=',')
print(a_loaded)
12. pandas
doc:
13. Thread
doc:
The GIL (Global Interpreter Lock) issue
14. socket
doc:
15.Tools
15.1 pyinstaller
doc:
16.Pypi
doc:
- https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/
- Writing your own Python third-party library
- How to make your own Python library -csdn
- setup.py github demo
python setup.py sdist
twine upload dist/*
17.design mode
doc:
18. selenium
doc:
19. Time
doc:
import time
# Print the timestamp: seconds elapsed since the epoch (midnight, January 1, 1970)
print(time.time())
# Print the local time as a struct_time
print(time.localtime(time.time()))
# Print the time formatted according to the given pattern
print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())))
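Formatting has an inverse: time.strptime parses a string back into a struct_time using the same pattern. A minimal sketch with a fixed, made-up timestamp string:

```python
import time

fmt = '%Y-%m-%d %H:%M:%S'
text = '2024-01-02 03:04:05'

# Parse the string into a struct_time
parsed = time.strptime(text, fmt)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)  # 2024 1 2
print(parsed.tm_hour, parsed.tm_min, parsed.tm_sec)   # 3 4 5

# Formatting the parsed value with the same pattern gives the original string back
print(time.strftime(fmt, parsed))  # 2024-01-02 03:04:05
```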
20. built in functions
doc:
21. logging
doc:
demo1: basic usage
import logging
# Set the log level and the output format
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s')
logging.debug('This is a debug-level log message')
logging.info('This is an info-level log message')
logging.warning('This is a warning-level log message')
logging.error('This is an error-level log message')
logging.critical('This is a critical-level log message')
# Shut down logging (flushes and closes all handlers)
logging.shutdown()
demo2: save as log.txt
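import logging
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s',
                    filename="log.txt",
                    filemode="w")
logging.debug('This is a debug-level log message')
logging.info('This is an info-level log message')
logging.warning('This is a warning-level log message')
logging.error('This is an error-level log message')
logging.critical('This is a critical-level log message')
In projects with multiple files, it is recommended to use named loggers to distinguish logs from different packages. If everything logs through the single root logger, its configuration takes effect globally across all packages.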
demo3: use logger
import logging
log = logging.getLogger(__name__)
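To illustrate why named loggers help, the sketch below gives two loggers under a common parent different thresholds. The names myapp, myapp.db and myapp.web are hypothetical; in a real project each module would usually call logging.getLogger(__name__) and get a name like these from its package path.

```python
import logging

# One handler on the root logger; %(name)s shows which logger emitted the record
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s: %(message)s')

# Parent logger for the whole (hypothetical) application
app_log = logging.getLogger('myapp')
app_log.setLevel(logging.WARNING)

# This child overrides the level for its own package only
db_log = logging.getLogger('myapp.db')
db_log.setLevel(logging.DEBUG)

# This child sets no level, so it inherits WARNING from 'myapp'
web_log = logging.getLogger('myapp.web')

db_log.debug('emitted: myapp.db accepts DEBUG and above')
web_log.debug('suppressed: myapp.web inherits WARNING from myapp')
web_log.warning('emitted: WARNING passes the inherited threshold')
```

Because the level is set per logger, one package can be made verbose without changing the output of the others.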
22. decorator
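Python's decorators are one of the language's most powerful tools: they add new functionality to a function without changing its code or the way it is called. Decorators are also something you must master on the way from beginner to proficient in Python. I was nearly overwhelmed by decorators when I first learned Python, so here I try to explain in plain language how they work and how to write your own.
A Python decorator is essentially a nested function: it takes the decorated function (func) as an argument and returns a wrapped function. This lets us add new functionality to the decorated function or program without changing the decorated function's code. Decorators are widely used for caching, permission checks (such as Django's @login_required and @permission_required decorators), performance testing (for example, timing a piece of code), and logging. With decorators, we can factor out a large amount of code that is unrelated to a function's own purpose and make the function more reusable.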
doc:
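- The "@" annotation in Python -csdn
- Python type annotations: everything you need to know -Zhihu
- Understanding Python decorators in one article (a must-read for job interviews) -Zhihu
- [python] One formula that resolves every complex decorator; once you understand it, any decorator becomes easy -csdn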
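The decorator syntax is equivalent to the following
def dec(f):
    return 1

@dec
def double(x):
    return x * 2

# The @dec syntax above is equivalent to this assignment after the definition:
# double = dec(double)
print(double)  # prints 1, because dec replaced the function with its return value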
A more complex example
import time

def record_time(fn):
    def wrapper(x):
        start = time.time()
        ret = fn(x)
        print(time.time() - start)
        return ret
    return wrapper

@record_time
def my_func(x):
    time.sleep(x)

# Equivalent to:
# my_func = record_time(my_func)
# so the call below actually runs wrapper(1), which times the original my_func(1)
my_func(1)
If the decorated function takes arbitrary arguments
demo1: record time
from functools import wraps
import logging
import time

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s')

def record_time(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        # Forward keyword arguments too, and keep the return value
        result = fn(*args, **kwargs)
        used_time = time.time() - start_time
        logging.debug('it cost {0}s'.format(used_time))
        return result
    return wrapper

@record_time
def method():
    count = 0
    for i in range(10000000):
        count += 1

if __name__ == '__main__':
    method()
demo2: a standard decorator
from functools import wraps

# Print a log line before each call
def hint(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('{} is running'.format(func.__name__))
        return func(*args, **kwargs)
    return wrapper

@hint
def hello():
    print("Hello!")
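The role of @wraps in the demo above is easiest to see by comparing a decorator that uses it with one that does not. A minimal sketch; the names plain, preserving, foo and bar are made up for illustration.

```python
from functools import wraps


def plain(func):
    # No @wraps: the returned function keeps its own name and docstring
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def preserving(func):
    # @wraps copies __name__, __doc__ and other metadata from func onto wrapper
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain
def foo():
    """Docstring of foo."""


@preserving
def bar():
    """Docstring of bar."""


print(foo.__name__, foo.__doc__)  # wrapper None
print(bar.__name__, bar.__doc__)  # bar Docstring of bar.
```

Without @wraps, tools that rely on the function name or docstring (logging, debugging, documentation) see the wrapper instead of the decorated function.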
demo3: another standard decorator
# Requires the third-party "decorator" package
from decorator import decorator

@decorator
def hint(fn, *args, **kwargs):
    print('function {0} is running'.format(fn.__name__))
    return fn(*args, **kwargs)
demo4: decorator with params
from functools import wraps
import logging

# Print a log line that includes the author
def hint(author):
    def wrapper(fn):
        @wraps(fn)
        def inner_wrapper(*args, **kwargs):
            logging.debug('function {0} is running now'.format(fn.__name__))
            logging.debug('author is {0}'.format(author))
            return fn(*args, **kwargs)
        return inner_wrapper
    return wrapper

@hint(author='jack')
def hello():
    print("Hello!")
23. generator & yield
doc:
def sqe(n):
    for i in range(n):
        yield i ** 2

list(sqe(5))
# [0, 1, 4, 9, 16]
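A generator computes its values lazily: the function body only runs when a value is requested. The sketch below makes that visible by recording each step in a list; the squares function and the log parameter are hypothetical additions for illustration.

```python
def squares(n, log):
    for i in range(n):
        log.append(i)  # record that this step actually ran
        yield i ** 2


log = []
gen = squares(3, log)
print(log)         # []  (creating the generator runs nothing yet)

first = next(gen)  # runs the body up to the first yield
print(first, log)  # 0 [0]

rest = list(gen)   # consumes the remaining values
print(rest, log)   # [1, 4] [0, 1, 2]

# An exhausted generator yields nothing more; next() can take a default
print(next(gen, 'done'))  # done
```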
24. async & gevent
doc:
I/O multiplexing
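As a small illustration of the async side, the sketch below uses the standard-library asyncio module (not gevent) to run two waiting tasks concurrently in a single thread. The fetch coroutine is a hypothetical stand-in for real I/O; asyncio.sleep simulates the waiting time.

```python
import asyncio


async def fetch(name, delay):
    # While this coroutine waits, the event loop is free to run other tasks
    await asyncio.sleep(delay)
    return f'{name} done'


async def main():
    # gather runs both coroutines concurrently and
    # returns the results in the order the coroutines were passed in
    return await asyncio.gather(fetch('a', 0.02), fetch('b', 0.01))


results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```

Task b finishes first, but the result list still follows the argument order of gather.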
25. Variable
doc:
If you assign to a global variable a inside a function, you get an error. **When a function assigns to a variable, Python treats that variable as local to the function.** See the following example.
a = 10

def change_a():
    print(a)
    a += 1

if __name__ == '__main__':
    change_a()
    # UnboundLocalError: local variable 'a' referenced before assignment
How do you solve this if you want to modify a global variable in a function? Declare it with the global
keyword before you use it.
a = 10

def change_a():
    global a
    print(a)
    a += 1

if __name__ == '__main__':
    change_a()
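Similarly, if you want to modify a variable of an enclosing function from a nested function, use nonlocal. The variable must be defined in the enclosing function, not at module level.
def method():
    a = 10
    def inner():
        nonlocal a
        a += 1
    inner()
    print(a)

def main():
    method()

main()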
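A common use of nonlocal is a closure that keeps state between calls. A minimal sketch; make_counter and increment are hypothetical names.

```python
def make_counter():
    count = 0  # lives in the enclosing function's scope

    def increment():
        nonlocal count  # rebind the enclosing variable instead of creating a local
        count += 1
        return count

    return increment


counter = make_counter()
print(counter())  # 1
print(counter())  # 2

# Each call to make_counter creates an independent count
other = make_counter()
print(other())    # 1
```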
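The following example works, because it mutates a dict in place instead of rebinding the name.
a = {}

def change_a():
    a['info'] = "jack"
    print(a)

if __name__ == '__main__':
    change_a()
Inside a function, a global variable can be read but not rebound without a global declaration. When you operate on a dict, however, the object that the name refers to does not change, so the item assignment succeeds.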
26. unittest
doc: