Python 1-2-3

Latest recommended article published on 2022-03-11 14:14:53

joshuwang0810

Category: python

Article link: https://blog.csdn.net/SARACH_WONG/article/details/94600345

General reference

logs

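import logging
import sys

# 1. Typical setup: configure the root logger
logging.basicConfig(level=logging.DEBUG,
        format='%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s : %(message)s')

# 2. Choosing where the output goes
# A handler instance sends log records to a specific destination; for details see the linked reference: logger, handler, filter, formatter
logger = logging.getLogger('zjuRE')
formatter = logging.Formatter('%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s : %(message)s')
logger.setLevel(logging.INFO)  # the logger's own level filters first, so DEBUG records never reach its handlers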

# 2.1 Console output
stream_handler = logging.StreamHandler(sys.stdout)
stream_handler.setLevel(logging.DEBUG)
stream_handler.setFormatter(formatter)
logger.addHandler(stream_handler)

# 2.2 File output
file_handler = logging.FileHandler(args.log_path)  # args is assumed to be parsed elsewhere (e.g. with argparse)
file_handler.setLevel(logging.INFO)
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)

# Test
logger.info('this is a logging info message')
logger.debug('debug')  # dropped: the logger level is INFO
logger.warning('warning')
logger.error('error')
logger.critical('critical')

# Usage of some functions

`locals()`

Python's locals() function returns all local variables at the current position as a dict.

def main(_):
	model_name = 'a'
	from model.a import a
	from model.b import b
	model = locals()[model_name]
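The lookup-by-name idea above can be sketched in a self-contained form. The classes `ModelA` and `ModelB` and the helper names are hypothetical stand-ins for the imported `model.a.a` and `model.b.b`; this is only an illustration, not the original project's code.

```python
class ModelA:  # hypothetical stand-in for model.a.a
    pass

class ModelB:  # hypothetical stand-in for model.b.b
    pass

def pick_model(model_name):
    # Bind the candidates to local names, as the imports above would.
    a = ModelA
    b = ModelB
    # locals() returns a dict mapping local variable names to their values.
    return locals()[model_name]

def pick_model_explicit(model_name):
    # An explicit registry does the same job without relying on local names.
    registry = {"a": ModelA, "b": ModelB}
    return registry[model_name]

chosen = pick_model("a")
print(chosen.__name__)  # ModelA
```

An explicit dict is usually easier to read and debug than `locals()`, because the set of valid names is visible in one place.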

`format()`

Uses {} and : in place of %.
Reference: https://www.cnblogs.com/wongbingming/p/6848701.html

'{0},{1}'.format('kzc',18)
'{name},{age}'.format(age=18,name='kzc')
# By index
p=['kzc',18]
'{0[0]},{0[1]}'.format(p)
# By object attribute, fill and alignment, etc.
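The attribute, fill, and alignment forms mentioned in the last comment might look like the sketch below. The `Person` namedtuple is an illustrative assumption, not something from the original notes.

```python
from collections import namedtuple

Person = namedtuple("Person", ["name", "age"])  # illustrative type
p = Person("kzc", 18)

by_attr = "{0.name},{0.age}".format(p)  # attribute access
right = "{:>8}".format("kzc")           # right-align in a field of width 8
filled = "{:*^9}".format("kzc")         # centre in width 9, pad with '*'
fixed = "{:.2f}".format(3.14159)        # two decimal places

print(by_attr)      # kzc,18
print(repr(right))  # '     kzc'
print(filled)       # ***kzc***
print(fixed)        # 3.14
```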

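`__name__`

This built-in variable holds the name of the current module while it runs. If the module is executed directly as the program's entry point, __name__ is __main__; otherwise it is the module's own name.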
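A small sketch of the usual idiom built on this variable; the function name `describe_module` is made up for illustration.

```python
def describe_module():
    # __name__ is "__main__" when the file is run directly,
    # and the module's import name otherwise.
    if __name__ == "__main__":
        return "run as a script"
    return "imported as " + __name__

if __name__ == "__main__":
    # Executes on `python thisfile.py`, but not on `import thisfile`.
    print(describe_module())
```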

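Understanding the self parameter in Python

I was long puzzled about why every function defined in a Python class needs a self parameter.

The reason: self is the instance, not the class. Take the class below: for an instance t = Test(), the interpreter treats the call t.add1(a, b) as Test.add1(t, a, b), so three arguments are passed instead of two as for a plain function. This also explains the error below.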

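class Test:
    # Called with 3 arguments at run time; self is the instance
    def add1(self, a, b):
        print(a + b)

    # No self parameter, but the instance is still passed as the first argument
    def add2(a, b):
        print(a + b)

def add3(a, b):
    print(a + b)

t = Test()
t.add1(1, 2)  # works
try:
    t.add2(1, 2)  # the instance is passed as well, so 3 arguments arrive
except TypeError as e:
    print(e)  # ... takes 2 positional arguments but 3 were given
add3(1, 2)  # works; 2 arguments are passed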

Reference: https://blog.csdn.net/daocaoren1543169565/article/details/80626035

exec: execute a string as Python code

string = "[{'a':'a','b':'b'}, {'c':'c','d':'d'}]"
exec("a = " +  string )
print(type(a))  # list
print(type(a[0]))  # dict
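When the string only contains a literal, as in the example above, `ast.literal_eval` from the standard library is a safer alternative, since it cannot run arbitrary code. This is a suggested variation and not part of the original note.

```python
import ast

string = "[{'a':'a','b':'b'}, {'c':'c','d':'d'}]"
# literal_eval only accepts Python literals (lists, dicts, strings, numbers, ...).
parsed = ast.literal_eval(string)
print(type(parsed))     # <class 'list'>
print(type(parsed[0]))  # <class 'dict'>
```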

Reading from a database

Reference: https://blog.csdn.net/u013421629/article/details/77982598

os

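Current working directory: os.getcwd() (the directory the process runs in, which is not necessarily the directory containing the current file)
List all entries in a directory: os.listdir(path)
Run a Linux command: os.system()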
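A short sketch of the three calls above. The helper `directory_of` is a hypothetical name, added to show how to get the directory of a file as opposed to the working directory.

```python
import os

cwd = os.getcwd()          # current working directory of the process
entries = os.listdir(cwd)  # names (not full paths) of the entries in cwd

def directory_of(file_path):
    # Directory containing file_path; in a script, call directory_of(__file__).
    return os.path.dirname(os.path.abspath(file_path))

status = os.system("echo hello")  # runs through the shell; returns the exit status
print(cwd, len(entries), status)
```

For anything beyond a quick one-off command, `subprocess` gives more control over arguments and output than `os.system()`.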

sys

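Check how much memory an object occupies: sys.getsizeof(a).
- help(sys.getsizeof) shows: Return the size of object in bytes. The size covers the object itself only, not the objects it refers to.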

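pickle: .pkl files

import cPickle as pickle # Python 2.x
import pickle # Python 3.x (the C implementation _pickle is used automatically when available)

Usage reference: Learning the Python pickle module (very detailed)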
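A minimal round trip through a .pkl file on Python 3. The sample data and the temporary file location are made up for illustration.

```python
import os
import pickle
import tempfile

data = {"name": "kzc", "scores": [1, 2, 3]}

# Write to a temporary .pkl file; binary mode is required.
path = os.path.join(tempfile.mkdtemp(), "data.pkl")
with open(path, "wb") as f:
    pickle.dump(data, f)

# Read it back. Only unpickle files from sources you trust.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == data)  # True
```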

subprocess: run other programs or commands from the current program

import subprocess
subprocess.call('wget xxx', shell=True)
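On Python 3.7+, `subprocess.run` is a more flexible way to do the same thing. The sketch below launches a child Python process instead of `wget` so that it runs anywhere; that substitution is mine, not the original author's.

```python
import subprocess
import sys

# Passing a list of arguments avoids shell quoting issues, so shell=True is not needed.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,  # collect stdout/stderr instead of inheriting them
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit status
)
print(result.returncode)      # 0
print(result.stdout.strip())  # hello from child
```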

numpy

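import numpy as np
a = np.zeros(10, dtype=int) # [0 0 0 0 0 0 0 0 0 0]; np.int was removed in NumPy 1.24
b = [1,3,6]
a[b] = 1
print(a) # [0 1 0 1 0 0 1 0 0 0]

a = np.array([1,2,3,4,3]) # must be an ndarray: comparing a plain list with 3 yields a single False
print(np.sum(a==3)) # 2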

pandas

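Selecting rows and columns

df = df.loc[0:2, ['A', 'C']] # by label; the slice includes the end label 2
df = df.iloc[0:2, [0, 2]] # by position; the slice excludes position 2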

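Filtering by conditions

Rows matching a single condition: df[df['a'] == 1], df[df['a'] > 1]
Rows matching several conditions: df[(df['a'] == 1) & (df['b'] > 2)] (the parentheses are required because & binds tighter than == and >)
Indices of the matching rows: df[df['a'] == 1].index.tolist(), df[df['a'] > 1].index.tolist()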
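The filters above can be tried on a tiny frame. The data is made up for illustration and pandas is assumed to be installed.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 3], "b": [1, 3, 5, 0]})  # made-up data

single = df[df["a"] > 1]                   # rows with index 2 and 3
both = df[(df["a"] == 1) & (df["b"] > 2)]  # only the row with index 1
idx = df[df["a"] > 1].index.tolist()       # [2, 3]

print(single, both, idx, sep="\n")
```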

Splitting a dataset:

train_df = new_df.sample(frac=0.8)
train_df_index = train_df.index.to_list()
eval_df = new_df.iloc[~new_df.index.isin(train_df_index)]
print(train_df.shape,eval_df.shape)

Reading row by row / iterating over rows

for index, row in df.iterrows():
    print(row["c1"], row["c2"])

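Deduplicating and counting

df['name'].unique() # distinct values
df['name'].value_counts() # count of each category
df.isin({'name':['mike','john','sarah']}) # whether the values of a column are in the given list
df.loc[df['name'].isin(['mike','john','sarah'])] # select the rows whose value is in the given list

Reference: https://vimsky.com/article/3842.html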

Merging

Reference: https://blog.csdn.net/zutsoft/article/details/51498026

apply

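def func(series):
    # The original compared 'c1' with itself (always True unless NaN); 'c2' is presumably what was meant
    return series['c1'] == series['c2']
df.apply(func, axis=1)  # axis=1 passes each row to func as a Series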

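Random sampling

Reference: https://blog.csdn.net/qq_22238533/article/details/71080942

DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

n is the number of rows to draw (e.g. n=20000 draws 20,000 rows).
frac is the fraction to draw. (Sometimes the exact number of rows does not matter and a percentage is wanted instead; for example, frac=0.8 draws 80% of the rows.)
replace: whether to sample with replacement; replace=True samples with replacement.
weights is the weight of each sample; see the official documentation for details.
axis selects whether rows or columns are sampled: axis=0 samples rows and axis=1 samples columns (that is, axis=1 randomly draws n columns and axis=0 randomly draws n rows).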
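The parameters described above can be exercised on a small frame. The data is made up and `random_state=0` is an arbitrary choice that only serves to make the draw repeatable.

```python
import pandas as pd

df = pd.DataFrame({"x": range(10), "y": range(10, 20)})  # made-up data

by_n = df.sample(n=3, random_state=0)                      # exactly 3 rows
by_frac = df.sample(frac=0.8, random_state=0)              # 80% of 10 rows -> 8 rows
with_repl = df.sample(n=15, replace=True, random_state=0)  # n may exceed len(df)
one_col = df.sample(n=1, axis=1, random_state=0)           # 1 randomly chosen column
rest = df.drop(by_frac.index)                              # the rows not drawn by by_frac

print(by_n.shape, by_frac.shape, with_repl.shape, one_col.shape, rest.shape)
```

Dropping the sampled index, as in `rest`, is one simple way to obtain the complementary part of a train/eval split.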

Renaming columns:

df.columns=['a','b'] # rename all columns
df = df.rename(columns={'0':'a','1':'b'}) # rename some columns

Controlling how many rows/columns are displayed

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

Counting how often each value occurs in a column

diff_ct = gt_df['coli'].value_counts()

Sorting

df.sort_values(by="coli", ascending=False, inplace=True)

Characters

unicodedata

UCD stands for the Unicode Character Database; the unicodedata module provides access to it.
Reference:
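A few typical lookups with the standard-library `unicodedata` module, as a minimal sketch; the sample characters are arbitrary.

```python
import unicodedata

letter_name = unicodedata.name("A")      # official character name
letter_cat = unicodedata.category("A")   # general category, "Lu" = uppercase letter
half = unicodedata.numeric("½")          # numeric value of the character

# NFKC normalization folds compatibility characters, e.g. full-width 'Ａ１' -> 'A1'.
normalized = unicodedata.normalize("NFKC", "Ａ１")

print(letter_name)  # LATIN CAPITAL LETTER A
print(letter_cat)   # Lu
print(half)         # 0.5
print(normalized)   # A1
```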

Progress bars

tqdm

tqdm is a fast, extensible progress bar for Python that adds a progress indicator to long-running loops; you only need to wrap any iterable: tqdm(iterator).
Reference: https://blog.csdn.net/langb2014/article/details/54798823

progressbar

PyPI: https://pypi.org/project/progressbar2/
Reference: https://blog.csdn.net/saltriver/article/details/53055942

import time
import progressbar
p = progressbar.ProgressBar()
N = 1000
for i in p(range(N)):
    time.sleep(0.01)

sys

`sys._getframe().f_code.co_name` gets the name of the current function

`sys._getframe().f_lineno` gets the current line number
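Both attributes can be read from the same frame object, as in the sketch below. The function name `whoami` is made up; note that `sys._getframe` is a CPython implementation detail, and `inspect.currentframe()` is the documented alternative.

```python
import sys

def whoami():
    frame = sys._getframe()  # frame of the currently executing function
    # f_lineno is the line being executed when it is read, i.e. this return line.
    return frame.f_code.co_name, frame.f_lineno

name, line = whoami()
print(name, line)
```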

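`sys.path`: when importing a module, the interpreter searches the directories in the `sys.path` list in order

For example, with this code layout:
/src/a/acode.py
/src/b/bcode.py
If bcode.py wants to import acode.py, add the directory /src/a, reached through the parent directory, to the sys.path list (a relative path like the one below is resolved against the current working directory, so it works when the script is run from /src/b):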

sys.path.append('../a/')
import acode
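To avoid depending on the working directory, the path can be built from the location of the importing file instead. This is a sketch under that assumption; the helper `add_sibling_to_path` is a hypothetical name.

```python
import os
import sys

def add_sibling_to_path(current_file, sibling="a"):
    """Append <parent of current_file's directory>/<sibling> to sys.path."""
    here = os.path.dirname(os.path.abspath(current_file))
    target = os.path.normpath(os.path.join(here, os.pardir, sibling))
    if target not in sys.path:
        sys.path.append(target)
    return target

# Inside /src/b/bcode.py one would call add_sibling_to_path(__file__)
# and then `import acode`; a literal path is used here for illustration.
added = add_sibling_to_path(os.path.join(os.sep, "src", "b", "bcode.py"))
print(added)
```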

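rst files

rst files are to Python what javadoc is to Java
Reference: Opening Python-doc rst files
Converted to HTML, the result looks like python3-cookbook

setuptools

Reference: A detailed guide to the Python package management tool setuptools and entry points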

`__init__.py` and `__main__.py` files

__init__.py:
- https://stackoverflow.com/questions/448271/what-is-init-py-for
- https://www.cnblogs.com/Lands-ljk/p/5880483.html
__main__.py:
- https://stackoverflow.com/questions/4042905/what-is-main-py

`psutil`: getting system information in Python

Reference: psutil

zip, izip, zip_longest

Reference: The Python zip function explained, with a comparison of izip and zip_longest
For splitting a list into chunks, see: how-do-you-split-a-list-into-evenly-sized-chunks
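A short comparison in Python 3, where `zip` is already lazy and `izip` no longer exists. The `chunks` helper is a simple slicing-based sketch of the list-splitting idea, not necessarily the approach used in the referenced answer.

```python
from itertools import zip_longest  # Python 2 had izip / izip_longest instead

names = ["a", "b", "c"]
nums = [1, 2]

short = list(zip(names, nums))                        # stops at the shorter input
padded = list(zip_longest(names, nums, fillvalue=0))  # pads the shorter input

def chunks(seq, size):
    """Split seq into consecutive pieces of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

print(short)                       # [('a', 1), ('b', 2)]
print(padded)                      # [('a', 1), ('b', 2), ('c', 0)]
print(chunks([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```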

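Iteration and iterators

Reference: Iterators (Iterator)

Traversing the elements of a container (such as a list or tuple) with a loop (such as a for loop) is called iteration.
An object that has an __iter__() method or a __getitem__() method is called an iterable.
An iterator is an object that follows the iterator protocol, which means implementing the __iter__() and next() methods (note: in Python 3 the method is __next__()). Here __iter__() returns the iterator object itself, and next() returns the next element of the container, raising StopIteration when no elements remain.

[], (), {} and 'abc' are all iterables but not iterators; an iterator for them can be obtained with the built-in function iter(), e.g. iter('abc').

In fact, when Python iterates with a for loop, it first obtains an iterator with the built-in iter() function and then calls next() repeatedly.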
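The protocol described above can be sketched with a small hand-written iterator. The class name `Countdown` is made up for illustration; the `while` loop spells out roughly what a for loop does.

```python
class Countdown:
    """Iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        if self.n <= 0:
            raise StopIteration  # signals that no elements remain
        value = self.n
        self.n -= 1
        return value

# Roughly what `for x in Countdown(3)` does under the hood:
it = iter(Countdown(3))
collected = []
while True:
    try:
        collected.append(next(it))
    except StopIteration:
        break

print(collected)           # [3, 2, 1]
print(list(Countdown(2)))  # [2, 1]
```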

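Python: a beginner's installation guide, the installation directories, and changing pip's default package location

Reference: https://blog.csdn.net/zcshoucsdn/article/details/84990674

pip's default package location

python -m site lists the global package installation paths.
python -m site -help shows the usage information; USER_BASE and USER_SITE can then be changed in the corresponding configuration file.

Installation directories

DLLs: dynamic libraries used by Python itself
Doc: the bundled Python documentation (probably absent if it was not selected during installation; I have not tested this)
include: the shared include directory (C header files)
Lib: library files; custom modules and packages go here
libs: static libraries used by Python itself, produced at build time
Scripts: executables for the various packages/modules. If pip was selected during installation, the pip executable lives here.
tcl: the desktop programming package (Tcl/Tk)

Changing the default Python version on Ubuntu

Reference: https://blog.csdn.net/white_idiot/article/details/78240298

Via a symbolic link: ln -s ~/software/anaconda3/bin/python3 /usr/bin/python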

np.random

Reference: Why you can't use NumPy's random functions well

import numpy as np

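np.random.rand(d0,d1,...,dn): generates data in [0,1) with the given dimensions; includes 0, excludes 1
np.random.randn(d0,d1,...,dn): returns one sample or an array of samples from the standard normal distribution.
np.random.randint(low, high=None, size=None, dtype='l'): returns random integers in [low, high); includes low, excludes high
Generating floats in [0,1):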
- np.random.random_sample(size=(2,2))
- np.random.random(size=(2,2))
- np.random.ranf(size=(2,2))
- np.random.sample(size=(2,2))
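np.random.choice(a, size=None, replace=True, p=None): draws random samples from the given 1-D array
- when a is an integer, the corresponding 1-D array is np.arange(a)
- p gives the probability of each entry in the array; p must have the same length as a, and its entries should sum to 1
np.random.seed() makes the random data reproducible.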
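The functions listed above can be tried together as follows. The shapes, ranges, and seed value are arbitrary choices for illustration, and NumPy is assumed to be installed.

```python
import numpy as np

np.random.seed(0)                    # fix the seed so the run is repeatable
u = np.random.rand(2, 3)             # uniform on [0, 1), shape (2, 3)
z = np.random.randn(4)               # standard normal, shape (4,)
k = np.random.randint(1, 7, size=5)  # integers in [1, 7), like dice rolls
c = np.random.choice(3, size=10, p=[0.2, 0.3, 0.5])  # draws from np.arange(3)

np.random.seed(0)
u_again = np.random.rand(2, 3)       # same seed, same numbers

print(np.array_equal(u, u_again))    # True
```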

Debugging with pdb

Reference
- https://www.cnblogs.com/xiaohai2003ly/p/8529472.html
- https://zhuanlan.zhihu.com/p/37294138

Invoking

Viewing source code

Adding a breakpoint

Adding a temporary breakpoint

Packages

Command to install a custom package: python setup.py install

deepwalk

The core consists of the training code deepwalk and the training samples example_graphs

Decorators

https://foofish.net/python-decorator.html

import logging

def use_logging(func):

    def wrapper():
        logging.warning("%s is running" % func.__name__)  # logging.warn is deprecated
        return func()
    return wrapper

@use_logging
def foo():
    print("i am foo")

foo()
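A slightly more general variant of the decorator above, as a sketch: it forwards arbitrary arguments and uses `functools.wraps` so the wrapped function keeps its name. The `add` function is made up for illustration.

```python
import functools
import logging

def use_logging(func):
    @functools.wraps(func)         # keep func.__name__ and its docstring
    def wrapper(*args, **kwargs):  # accept any positional and keyword arguments
        logging.warning("%s is running", func.__name__)
        return func(*args, **kwargs)
    return wrapper

@use_logging
def add(a, b):
    return a + b

print(add(1, 2))     # 3
print(add.__name__)  # add
```

Without `functools.wraps`, `add.__name__` would report `wrapper`, which makes log messages and tracebacks harder to read.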

Differences between time and datetime

Reference
time is lower-level; datetime can be thought of as a wrapper around time

# Common time functions
import time
print(time.time())
print(time.strftime('%Y-%m-%d %H:%M:%S',time.localtime()) )

# Common datetime functions
import datetime
time_now = datetime.datetime.now()
print(time_now)
print(time_now.strftime('%Y-%m-%d %H:%M:%S'))
delta = datetime.timedelta(hours=24)
print(time_now + delta)
print(time_now - delta)

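[To document] Writing std output to a file at the same time

[To read] NetworkX: networks/graphs

Reference: https://blog.csdn.net/qq_31192383/article/details/53748129

[To read] What is the difference between from . import a and import a?

Reference: Python's Import Pitfalls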
