Python Notes

chief_lin

Tags: tensorflow

Article link: https://blog.csdn.net/weixin_41128293/article/details/83105355

TensorFlow

tf.InteractiveSession()
sess = tf.InteractiveSession()
sess.run(a)
tf.nn.l2_loss(t, name=None)
Computes an error value for tensor t from its L2 norm: the squared norm is used (no square root is taken) and only half of it is returned. Specifically:
output = sum(t ** 2) / 2
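
To make the formula concrete, here is a minimal pure-Python sketch of the same arithmetic. The helper name l2_loss and the sample values are illustrative assumptions; this is not TensorFlow's implementation.

```python
def l2_loss(values):
    """Half the sum of squares of a flat list of numbers (no square root)."""
    return sum(v * v for v in values) / 2


t = [1.0, 2.0, 3.0]
result = l2_loss(t)
print(result)  # (1 + 4 + 9) / 2 = 7.0
```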
tf.unsorted_segment_sum(data, segment_ids, num_segments, name = None)
Sums the rows of data that share the same segment id.

a = tf.constant([[2,3,4],[4,5,5],[3,-2,4]])
id = tf.constant([0,0,1])
b = tf.unsorted_segment_sum(a,id,3)
# b = [[6,8,9],[3,-2,4],[0,0,0]]
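Rows with the same id are added together, so [2,3,4] and [4,5,5] are summed. Only two ids (0 and 1) appear, but num_segments is 3, so the output gets one extra row of zeros.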
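
The grouping rule can be sketched in plain Python. The function below is a hypothetical re-implementation for 2-D lists only, written to mirror the example above rather than the real TensorFlow op.

```python
def unsorted_segment_sum(data, segment_ids, num_segments):
    """Add each row of `data` into the output row named by its segment id."""
    width = len(data[0])
    out = [[0] * width for _ in range(num_segments)]
    for row, seg in zip(data, segment_ids):
        for j, value in enumerate(row):
            out[seg][j] += value
    return out


a = [[2, 3, 4], [4, 5, 5], [3, -2, 4]]
ids = [0, 0, 1]
b = unsorted_segment_sum(a, ids, 3)
print(b)  # [[6, 8, 9], [3, -2, 4], [0, 0, 0]]
```

Segment 2 never appears in ids, so its row keeps the initial zeros.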

tf.app.flags
Use flags to define command-line arguments.

import tensorflow as tf 

# Arguments: the flag name, its default value, and a description
tf.app.flags.DEFINE_string('str_name', 'def_v_1',"descrip1") 
tf.app.flags.DEFINE_integer('int_name', 10,"descript2") 
tf.app.flags.DEFINE_boolean('bool_name', False, "descript3")
FLAGS = tf.app.flags.FLAGS

# Read the flag values
print(FLAGS.str_name)
print(FLAGS.int_name)
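
tf.app.flags belongs to TensorFlow 1.x. As a rough standard-library analogue (a swapped-in technique, not the TensorFlow API), the same three flags could be declared with argparse. The flag names are reused from the example above; passing an empty argument list is an assumption made here so that the defaults are used.

```python
import argparse

parser = argparse.ArgumentParser()
# Each call mirrors one DEFINE_* line: name, default value, description.
parser.add_argument('--str_name', type=str, default='def_v_1', help='descrip1')
parser.add_argument('--int_name', type=int, default=10, help='descript2')
# store_true gives a boolean flag whose default is False.
parser.add_argument('--bool_name', action='store_true', help='descript3')

# An empty list means "no command-line arguments", so every default applies.
FLAGS = parser.parse_args([])
print(FLAGS.str_name)  # def_v_1
print(FLAGS.int_name)  # 10
```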

reduce_sum
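Best read as "sum and collapse": it sums along the given axis and removes that dimension.
# 'x' is [[1, 1, 1],
#         [1, 1, 1]]
# Sum of all elements
tf.reduce_sum(x) ==> 6
# Sum down each column (axis 0)
tf.reduce_sum(x, 0) ==> [2, 2, 2]
# Sum across each row (axis 1)
tf.reduce_sum(x, 1) ==> [3, 3]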
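
The three results can be reproduced with plain Python lists, which shows which dimension disappears in each case. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
x = [[1, 1, 1],
     [1, 1, 1]]

total = sum(sum(row) for row in x)      # collapse both axes
axis0 = [sum(col) for col in zip(*x)]   # collapse axis 0: one sum per column
axis1 = [sum(row) for row in x]         # collapse axis 1: one sum per row

print(total)  # 6
print(axis0)  # [2, 2, 2]
print(axis1)  # [3, 3]
```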
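tf.shape() and x.get_shape().as_list()
tf.shape() returns the shape of its input as a tensor, so it has to be evaluated in a session; it also accepts NumPy arrays and Python lists.
x.get_shape() is available only on tensors. It returns the static shape as a TensorShape object, which .as_list() converts to a Python list.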

import tensorflow as tf 
import numpy as np 
a_array=np.array([[1,2,3],[4,5,6]]) 
b_list=[[1,2,3],[3,4,5]] 
c_tensor=tf.constant([[1,2,3],[4,5,6]]) 
print(c_tensor.get_shape()) 
print(c_tensor.get_shape().as_list()) 
with tf.Session() as sess:
    print(sess.run(tf.shape(a_array)))
    print(sess.run(tf.shape(b_list)))
    print(sess.run(tf.shape(c_tensor)))

tf.argmax(input, dimension, name=None)
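Arguments:
input: the input data
dimension: the axis to search along (newer TensorFlow versions call this argument axis).
  dimension=0: search down each column;
  dimension=1: search across each row.
Returns:
The index of the maximum value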

a = tf.constant([1.,2.,3.,0.,9.,])
b = tf.constant([[1,2,3],[3,2,1],[4,5,6],[6,5,4]])
with tf.Session() as sess:
    print(sess.run(tf.argmax(a, 0)))
    print(sess.run(tf.argmax(b, 0)))
    print(sess.run(tf.argmax(b, 1)))

Output:
4
Output:
[3 2 2]
Output:
[2 0 2 0]
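
The column-versus-row behaviour can be checked with a small pure-Python sketch. The helpers argmax_1d and argmax_2d are hypothetical names introduced for illustration, and ties are resolved in favour of the first maximum, which matches the outputs above.

```python
def argmax_1d(values):
    """Index of the first largest element."""
    return max(range(len(values)), key=lambda i: values[i])


def argmax_2d(matrix, dimension):
    """dimension=0 searches down each column, dimension=1 across each row."""
    if dimension == 0:
        return [argmax_1d(col) for col in zip(*matrix)]
    return [argmax_1d(row) for row in matrix]


a = [1., 2., 3., 0., 9.]
b = [[1, 2, 3], [3, 2, 1], [4, 5, 6], [6, 5, 4]]

print(argmax_1d(a))     # 4
print(argmax_2d(b, 0))  # [3, 2, 2]
print(argmax_2d(b, 1))  # [2, 0, 2, 0]
```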

tf.cast(x, dtype, name=None)
Converts x to the given dtype. For example, if x is bool, casting it to float
yields a sequence of 0s and 1s. The reverse conversion works as well.

a = tf.Variable([1,0,0,1,1])
b = tf.cast(a,dtype=tf.bool)
sess = tf.Session()
sess.run(tf.global_variables_initializer())  # replaces the deprecated tf.initialize_all_variables()
print(sess.run(b))
#[ True False False True True]

tf.equal()
equal(x, y, name=None)
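Tests whether x and y are equal. The comparison is made element by element rather than on the tensors as a whole: each position yields True if the two elements match and False otherwise. Because it is elementwise, x and y need matching shapes (or shapes that broadcast to each other).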

import tensorflow as tf
a = [[1,2,3],[4,5,6]]
b = [[1,0,3],[1,5,1]]
with tf.Session() as sess:
    print(sess.run(tf.equal(a,b)))

Result:
[[ True False  True]
 [False  True False]]

tf.nn.dropout()
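tf.nn.dropout is TensorFlow's function for preventing or reducing overfitting; it is usually applied to fully connected layers.
Dropout randomly discards a subset of neurons on each training step. Each neuron is switched off with some probability p: for that step it takes no part in the network's computation and its weights are not updated. The weights are kept, though (they are simply not updated for now), because the neuron may be active again when the next sample comes in.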

def dropout(x, keep_prob, noise_shape=None, seed=None, name=None)

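Inputs:

x: your own data (training data, test data, and so on)
keep_prob: the probability that each element is kept, not the probability that it is dropped
The remaining parameters are rarely used and are not covered here.

Output:

A Tensor of the same shape as x

A sample run produces the following (the nonzero value 2.5 is consistent with an input of ones and keep_prob = 0.4):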

[[ 0.   0.   2.5  2.5  0.   0.   2.5  2.5  2.5  2.5]
 [ 0.   2.5  2.5  2.5  2.5  2.5  0.   2.5  0.   2.5]
 [ 2.5  0.   0.   2.5  0.   0.   2.5  0.   2.5  0. ]
 [ 0.   2.5  2.5  2.5  2.5  0.   0.   2.5  0.   2.5]
 [ 0.   0.   0.   0.   0.   0.   0.   0.   2.5  2.5]
 [ 2.5  2.5  2.5  0.   2.5  0.   0.   2.5  2.5  2.5]
 [ 0.   2.5  2.5  2.5  0.   2.5  2.5  0.   0.   0. ]
 [ 0.   2.5  0.   2.5  0.   0.   2.5  2.5  0.   0. ]
 [ 2.5  2.5  2.5  2.5  2.5  0.   0.   2.5  0.   0. ]
 [ 2.5  0.   0.   0.   0.   0.   2.5  2.5  0.   2.5]]

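What the output shows:

The input and output tensors do have the same shape.
Every nonzero element has been scaled to 1/keep_prob times its original value.

In short, dropout in TensorFlow sets some elements of the input tensor to 0 and scales the remaining elements by 1/keep_prob, which keeps the expected value of each element unchanged.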
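
The code that produced the matrix above is not included in these notes. The following pure-Python sketch shows the same keep-or-zero-then-rescale rule under the assumption of a 10×10 input of ones and keep_prob = 0.4; the function dropout here is a hypothetical stand-in, not tf.nn.dropout, and the random pattern will differ from the matrix shown.

```python
import random


def dropout(x, keep_prob, seed=None):
    """Keep each element with probability keep_prob and scale it by 1/keep_prob."""
    rng = random.Random(seed)
    scale = 1.0 / keep_prob
    return [[value * scale if rng.random() < keep_prob else 0.0 for value in row]
            for row in x]


x = [[1.0] * 10 for _ in range(10)]
y = dropout(x, keep_prob=0.4, seed=0)
for row in y:
    print(row)  # every entry is either 0.0 or 2.5
```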

tf.sparse_retain(sp_input, to_retain)
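Retains the specified non-empty values of a SparseTensor.
sp_input: the input SparseTensor, with N non-empty elements.
to_retain: a bool vector of length N with M true values.

For example, if sp_input has shape [4, 5] and four non-empty string values: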
[0, 1]: a
[0, 3]: b
[2, 0]: c
[3, 1]: d
and to_retain = [True, False, False, True], then the output is a SparseTensor of shape [4, 5] with two non-empty values:
[0, 1]: a
[3, 1]: d
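
The filtering step can be sketched in plain Python by representing the sparse tensor as parallel lists of indices and values. That representation and the helper name sparse_retain are assumptions for illustration only.

```python
def sparse_retain(indices, values, to_retain):
    """Keep only the (index, value) pairs whose mask entry is True."""
    kept = [(idx, val) for idx, val, keep in zip(indices, values, to_retain) if keep]
    return [idx for idx, _ in kept], [val for _, val in kept]


indices = [[0, 1], [0, 3], [2, 0], [3, 1]]
values = ["a", "b", "c", "d"]
new_indices, new_values = sparse_retain(indices, values, [True, False, False, True])
print(new_indices)  # [[0, 1], [3, 1]]
print(new_values)   # ['a', 'd']
```

The dense shape [4, 5] is unaffected, since only the stored entries are filtered.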

tf.floor(), tf.ceil()
tf.floor(x, name=None) rounds down: 3.6 => 3.0
tf.ceil(x, name=None) rounds up: 3.6 => 4.0
tf.random_uniform((4, 4), minval=low, maxval=high, dtype=tf.float32)
Returns a 4×4 matrix of values drawn from a uniform distribution between low and high.
tf.gather

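Works like array indexing: it picks out the entries at the given indices and returns them as a new tensor, which is useful when the indices are not contiguous. The indices select along the first axis, so on a 2-D tensor it gathers whole rows, as the example shows.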

    import tensorflow as tf 
     
    a = tf.Variable([[1,2,3,4,5], [6,7,8,9,10], [11,12,13,14,15]])
    index_a = tf.Variable([0,2])
     
    b = tf.Variable([1,2,3,4,5,6,7,8,9,10])
    index_b = tf.Variable([2,4,6,8])
     
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(tf.gather(a, index_a)))
        print(sess.run(tf.gather(b, index_b)))
     
    #  [[ 1  2  3  4  5]
    #   [11 12 13 14 15]]
     
    #  [3 5 7 9]

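tf.expand_dims()
Inserts a dimension of size 1.

# tensor 'x' is [[1, 1, 3], [3, 2, 4]]
# x is a tensor of shape [2, 3]
expand_dims(x, 0) ===> shape [1, 2, 3]
expand_dims(x, 1) ===> shape [2, 1, 3]
expand_dims(x, -1) ===> shape [2, 3, 1]
# The number is the position at which the new axis is inserted; -1 means the last position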

tf.unique_with_counts
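Finds the unique elements of a 1-D tensor.
The op returns a tensor y containing every unique element of x, in the order in which they first appear in x. It also returns a tensor idx, the same size as x, that holds for each value of x its index in the unique output y. Finally, it returns a third tensor count containing the number of times each element of y occurs in x: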

# tensor 'x' is [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx, count = unique_with_counts(x)
y ==> [1, 2, 4, 7, 8]
idx ==> [0, 0, 1, 2, 2, 2, 3, 4, 4]
count ==> [2, 1, 3, 1, 2]
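
The three outputs can be reproduced with a short pure-Python sketch. It is a hypothetical re-implementation for lists, written only to make the relationship between y, idx and count explicit.

```python
def unique_with_counts(x):
    """Return (unique values in first-seen order, index of each x in y, counts)."""
    y, idx, count = [], [], []
    position = {}  # value -> its index in y
    for value in x:
        if value not in position:
            position[value] = len(y)
            y.append(value)
            count.append(0)
        idx.append(position[value])
        count[position[value]] += 1
    return y, idx, count


x = [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx, count = unique_with_counts(x)
print(y)      # [1, 2, 4, 7, 8]
print(idx)    # [0, 0, 1, 2, 2, 2, 3, 4, 4]
print(count)  # [2, 1, 3, 1, 2]
```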

tf.scatter_sub(ref, indices, updates)
Subtracts each update from the element of ref at the corresponding index.

ref = tf.Variable([1, 2, 3, 4, 5, 6, 7, 8],dtype = tf.int32) 
indices = tf.constant([4, 3, 1, 7],dtype = tf.int32) 
updates = tf.constant([9, 10, 11, 12],dtype = tf.int32) 
sub = tf.scatter_sub(ref, indices, updates) 
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(sub))
# [ 1 -9  3 -6 -4  6  7 -4]
Reading the result: index 4 holds 5 in ref and its update is 9, so 5 - 9 = -4 and position 4 of ref becomes -4. The other positions work the same way.
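
The same arithmetic written out in plain Python, as a sketch of the update rule only (a hypothetical helper on lists, not the TensorFlow op):

```python
def scatter_sub(ref, indices, updates):
    """Subtract updates[k] from ref[indices[k]] in place and return ref."""
    for i, u in zip(indices, updates):
        ref[i] -= u
    return ref


ref = [1, 2, 3, 4, 5, 6, 7, 8]
indices = [4, 3, 1, 7]
updates = [9, 10, 11, 12]
result = scatter_sub(ref, indices, updates)
print(result)  # [1, -9, 3, -6, -4, 6, 7, -4]
```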

File handling

pkl files

import pickle as pkl
# Read a file
# Python 3: 'r' opens text files and 'rb' opens binary files; to load a pickle written by Python 2, pass encoding (e.g. 'latin1')
word = pkl.load(open("word.pkl", 'rb'), encoding='utf-8')
train = pkl.load(open("train.pkl", 'rb'),encoding='iso-8859-1')
# Otherwise no encoding argument is needed
word = pkl.load(open("word.pkl", 'rb'))
train = pkl.load(open("train.pkl", 'rb'))

# Write a file
temp = [1,2,3,'adb','a']
with open('filename','wb') as f:
    pkl.dump(temp, f, pkl.HIGHEST_PROTOCOL)
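
A complete round trip, writing and then reading back the same object, might look like the sketch below. The temporary directory and the file name temp.pkl are assumptions made so the example does not touch real files.

```python
import os
import pickle as pkl
import tempfile

temp = [1, 2, 3, 'adb', 'a']
path = os.path.join(tempfile.mkdtemp(), 'temp.pkl')  # hypothetical file location

# Write: pickle files are binary, so open with 'wb'.
with open(path, 'wb') as f:
    pkl.dump(temp, f, pkl.HIGHEST_PROTOCOL)

# Read: open with 'rb'; the with-block closes the file afterwards.
with open(path, 'rb') as f:
    restored = pkl.load(f)

print(restored == temp)  # True
```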

strip()

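Python's strip() method removes the given characters from both ends of a string; with no argument it removes whitespace, including spaces and newlines. The argument is treated as a set of characters, not as a substring.

Note: strip() only removes characters at the start or end of the string, never in the middle.

s = "00000003210Runoob01230000000"
print(s.strip('0'))  # strip the character 0 from both ends
s2 = "   Runoob      "
print(s2.strip())    # strip leading and trailing whitespace
# Output
3210Runoob0123
Runoob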

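SciPy

SciPy is a collection of packages that address standard problem domains in scientific computing. Its main packages are:

scipy.integrate: numerical integration routines and differential equation solvers
scipy.linalg: linear algebra routines and matrix decompositions that extend those provided by numpy.linalg
scipy.optimize: function optimizers (minimizers) and root-finding algorithms
scipy.signal: signal processing tools
scipy.sparse: sparse matrices and sparse linear system solvers
scipy.special: a wrapper around SPECFUN, a Fortran library that implements many common mathematical functions such as the gamma function
scipy.stats: standard continuous and discrete probability distributions (density functions, samplers, continuous distribution functions, and so on), various statistical tests, and more descriptive statistics
scipy.weave: a tool for speeding up array computations with inline C++ code (Python 2 only; no longer part of current SciPy releases)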

A.todense()
Returns the matrix in ordinary dense format.
The sparse module
For sparse matrices, the sparse module provides seven storage formats.

lil_matrix
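lil_matrix stores the nonzero elements in two lists of lists: data holds the nonzero values of each row, and rows holds the column indices of those values. The format is well suited to adding elements one at a time, and it gives fast access to row data.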

from scipy.sparse import lil_matrix
l = lil_matrix((6,5))
l[2,3] = 1
l[3,4] = 2
l[3,2] = 3
print(l.toarray())
[[ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 1. 0.]
 [ 0. 0. 3. 0. 2.]
 [ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]]
print(l.data)
[[] [] [1.0] [3.0, 2.0] [] []]
print(l.rows)
[[] [] [3] [2, 4] [] []]
A format like this is generally used to build a matrix by adding nonzero elements incrementally; the matrix is then converted to another storage format that supports fast computation.

vstack
Stacks matrices row-wise (vertically); they must have the same number of columns.

sp.vstack((allx, tx))

csr_matrix
sparse.csr_matrix (CSR: Compressed Sparse Row matrix)

import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([[0,1,0,2,0],[1,1,0,2,0],[2,0,5,0,0]])
b = csr_matrix(arr)
print(b)
Output
(0, 1) 1
(0, 3) 2
(1, 0) 1
(1, 1) 1
(1, 3) 2
(2, 0) 2
(2, 2) 5

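# Shape of b
print(b.shape)
# Number of nonzero values
print(b.nnz)
# The nonzero values
print(b.data)
# Column indices of the nonzero elements
print(b.indices)
# Starts with 0; each later entry is the cumulative count of nonzero elements up to and including that row
print(b.indptr)
# Print as an ordinary dense array
print(b.toarray())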
Worked example
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
# array([[1, 0, 2],
#        [0, 0, 3],
#        [4, 5, 6]])
CSR compresses by row: for row i, the column indices of the nonzero entries are indices[indptr[i]:indptr[i+1]] and their values are data[indptr[i]:indptr[i+1]].
In this example:
Row 0: the nonzero columns are indices[indptr[0]:indptr[1]] = indices[0:2] = [0,2]
and the values are data[indptr[0]:indptr[1]] = data[0:2] = [1,2], so row 0 has 1 in column 0 and 2 in column 2.
Row 1: the nonzero columns are indices[indptr[1]:indptr[2]] = indices[2:3] = [2]
and the values are data[indptr[1]:indptr[2]] = data[2:3] = [3], so row 1 has 3 in column 2.
Row 2: the nonzero columns are indices[indptr[2]:indptr[3]] = indices[3:6] = [0,1,2]
and the values are data[indptr[2]:indptr[3]] = data[3:6] = [4,5,6], so row 2 has 4 in column 0, 5 in column 1 and 6 in column 2.
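
The row-by-row decoding described above can be written as a short pure-Python sketch. The function csr_to_dense is a hypothetical helper for illustration; SciPy's toarray() does this work for you.

```python
def csr_to_dense(data, indices, indptr, shape):
    """Expand CSR arrays into a dense list-of-lists matrix."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Entries indptr[i] .. indptr[i+1]-1 belong to row i.
        for k in range(indptr[i], indptr[i + 1]):
            dense[i][indices[k]] = data[k]
    return dense


indptr = [0, 2, 3, 6]
indices = [0, 2, 2, 0, 1, 2]
data = [1, 2, 3, 4, 5, 6]
dense_csr = csr_to_dense(data, indices, indptr, (3, 3))
print(dense_csr)  # [[1, 0, 2], [0, 0, 3], [4, 5, 6]]
```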

csc_matrix
sparse.csc_matrix (CSC: Compressed Sparse Column matrix)

Worked example
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csc_matrix((data, indices, indptr), shape=(3, 3)).toarray()
# array([[1, 0, 4],
#        [0, 0, 5],
#        [2, 3, 6]])
CSC compresses by column: for column i, the row indices of the nonzero entries are indices[indptr[i]:indptr[i+1]] and their values are data[indptr[i]:indptr[i+1]].
In this example:
Column 0: the nonzero rows are indices[indptr[0]:indptr[1]] = indices[0:2] = [0,2]
and the values are data[indptr[0]:indptr[1]] = data[0:2] = [1,2], so column 0 has 1 in row 0 and 2 in row 2.
Column 1: the nonzero rows are indices[indptr[1]:indptr[2]] = indices[2:3] = [2]
and the values are data[indptr[1]:indptr[2]] = data[2:3] = [3], so column 1 has 3 in row 2.
Column 2: the nonzero rows are indices[indptr[2]:indptr[3]] = indices[3:6] = [0,1,2]
and the values are data[indptr[2]:indptr[3]] = data[3:6] = [4,5,6], so column 2 has 4 in row 0, 5 in row 1 and 6 in row 2.
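
The column-wise decoding differs from CSR only in which index the slice of indices refers to. A minimal pure-Python sketch (csc_to_dense is a hypothetical helper, not a SciPy function):

```python
def csc_to_dense(data, indices, indptr, shape):
    """Expand CSC arrays into a dense list-of-lists matrix."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        # Entries indptr[j] .. indptr[j+1]-1 belong to column j;
        # indices[k] is the row of each entry.
        for k in range(indptr[j], indptr[j + 1]):
            dense[indices[k]][j] = data[k]
    return dense


indptr = [0, 2, 3, 6]
indices = [0, 2, 2, 0, 1, 2]
data = [1, 2, 3, 4, 5, 6]
dense_csc = csc_to_dense(data, indices, indptr, (3, 3))
print(dense_csc)  # [[1, 0, 4], [0, 0, 5], [2, 3, 6]]
```

Fed the same three arrays, this yields the transpose of the CSR result, because the roles of rows and columns are swapped.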

coo_matrix

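import scipy.sparse as sp
A = sp.coo_matrix([[1, 2], [2, 4]])
# Output
print(A)
(0, 0) 1
(0, 1) 2
(1, 0) 2
(1, 1) 4
print(A.toarray())
[[1 2]
 [2 4]]
# row and col are not given in the original notes; the values below are one choice consistent with the output shown
row = [2, 2, 3, 2]
col = [3, 4, 2, 3]
data = [1,2,3,4]
C = sp.coo_matrix((data,(row,col)),shape=(5,6))
print(C.toarray())
[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 5 2 0]
 [0 0 3 0 0 0]
 [0 0 0 0 0 0]]
toarray() outputs the final matrix. When the matrix is actually materialized, values that share the same coordinates are added together, so the total of data is preserved (1+2+3+4 = 5+3+2).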
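
How duplicate coordinates end up summed can be seen in a pure-Python sketch. The row and col lists below are assumptions chosen to reproduce the 5×6 output above (the notes do not state them), and coo_to_dense is a hypothetical helper.

```python
def coo_to_dense(data, row, col, shape):
    """Accumulate (row, col, value) triples into a dense matrix."""
    dense = [[0] * shape[1] for _ in range(shape[0])]
    for value, r, c in zip(data, row, col):
        dense[r][c] += value  # duplicates at the same coordinate add up
    return dense


row = [2, 2, 3, 2]  # assumed: coordinate (2, 3) appears twice
col = [3, 4, 2, 3]
data = [1, 2, 3, 4]
C = coo_to_dense(data, row, col, (5, 6))
for line in C:
    print(line)
# Row 2 is [0, 0, 0, 5, 2, 0] because 1 and 4 share coordinate (2, 3).
```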

sparse.tolil()
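The list-of-lists sparse format (LIL) speeds up element lookups. When you need to access many individual elements of a sparse matrix, it is well worth converting the matrix first with .tolil().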

features = sp.vstack((allx, tx)).tolil()

diags
The diags function builds a sparse diagonal matrix.

r_mat_inv = sp.diags(r_inv)

numpy

a.flatten()
If a is a matrix or array, a.flatten() collapses it to one dimension. By default it goes row by row (C order).

a = np.array([[1,2], [3,4]])
a.flatten()
array([1, 2, 3, 4])
a.flatten('F')  # column by column (Fortran order)
array([1, 3, 2, 4])

np.isfinite()、np.isinf()
These return boolean arrays indicating, respectively, which elements are finite (neither inf nor NaN) and which elements are infinite.

r_inv[np.isinf(r_inv)] = 0.

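X[:,0] and X[:,1]
X[:,0] is NumPy indexing syntax. For a two-dimensional array it takes every entry along the first axis and entry 0 along the second. Put simply, X[:,0] is element 0 of every row and X[:,1] is element 1 of every row.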

import numpy as np
X = np.array([[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14, 15], [16, 17], [18, 19]])
print(X[:, 0])
# Output
[ 0  2  4  6  8 10 12 14 16 18]

X[1:5:2], slicing
Takes elements from index 1 up to index 4 (the stop index 5 is excluded) with a step of 2.

X = [3,4,2,4,2,1,2,3]
A = X[1:5:2]
# A == [4, 4]

transpose()
Matrix transpose

a.transpose()
b = np.transpose(a)

eye()
Builds an identity matrix.

np.eye(3)
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])

full()
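NumPy 1.8 introduced np.full(), a more direct way to create an array filled with a given value than calling empty() followed by fill().

np.full((3, 5), 7)
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])
# Recent NumPy versions take the dtype from the fill value, so 7 gives integers; pass 7. or dtype=float for floats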

hstack(), vstack()
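Combine two arrays

# a and b are not defined in the original notes; these values are inferred from the outputs below
a = np.array([[8., 5.], [1., 6.]])
b = np.array([[1., 9.], [8., 5.]])

# hstack() joins the arrays side by side (each row gets longer)
np.hstack((a,b))
array([[ 8., 5., 1., 9.],
       [ 1., 6., 8., 5.]])

# vstack() stacks the arrays on top of each other (more rows)
np.vstack((a,b))
array([[ 8., 5.],
       [ 1., 6.],
       [ 1., 9.],
       [ 8., 5.]])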

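nans = np.random.randint(0, 2, size=(5,5))
Creates a 5×5 integer matrix of random values in [0, 2), that is, each entry is 0 or 1 (the upper bound is excluded).
For floats in the same range, use np.random.uniform(0, 2, size=(5,5)). np.random.rand takes only the dimensions, as in np.random.rand(5, 5), and samples from [0, 1).
Setting the diagonal
np.fill_diagonal(A, 0)
Sets every diagonal entry of A to 0, modifying A in place.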

.dot

In NumPy, dot follows the rules of matrix multiplication, whereas "*" multiplies arrays element by element.
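
The difference is easiest to see on a small case. The sketch below spells out both products in plain Python, without NumPy, so that each multiplication is visible; the helper names are illustrative, and with NumPy arrays the two results correspond to np.dot(a, b) and a * b.

```python
def matrix_product(a, b):
    """Row-by-column product, the rule np.dot follows for 2-D arrays."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def elementwise_product(a, b):
    """Multiply matching positions, the rule "*" follows for same-shaped arrays."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matrix_product(a, b))       # [[19, 22], [43, 50]]
print(elementwise_product(a, b))  # [[5, 12], [21, 32]]
```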

Printing to the console and saving to a file at the same time

import time
import sys
import os
filename = "/home/XXX/PycharmProjects/XXX/log/{}/tt".format(time.strftime('%m-%d',time.localtime(time.time())))

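# Make sure the log folder exists
def check_folder():
    # Split the directory part off the file path
    file_dir = os.path.split(filename)[0]
    # Create the directory if it does not exist; makedirs also creates intermediate levels
    if not os.path.isdir(file_dir):
        os.makedirs(file_dir)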
        
class Logger(object):
    def __init__(self, fileN="Default.log"):
        self.terminal = sys.stdout
        self.log = open(fileN, "w+")

    def write(self, message):
        self.terminal.write(message)
        self.log.write(message)

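    def flush(self):
        # Flush both streams so the log file stays in sync with the console
        self.terminal.flush()
        self.log.flush()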

check_folder()
sys.stdout = Logger('log/{}/{}.txt'.format(time.strftime('%m-%d',time.localtime(time.time())), time.strftime('%H-%M-%S',time.localtime(time.time()))))
print("hello")

Getting and printing the current date and time

import time

print(time.time())
# Sample output:
1357723206.31

# time.localtime() converts a timestamp into a local-time struct.
time.localtime(time.time())

# Sample output:
time.struct_time(tm_year=2010, tm_mon=7, tm_mday=19, tm_hour=22, tm_min=33, tm_sec=39, tm_wday=0, tm_yday=200, tm_isdst=0)

# Finally, time.strftime() formats that struct into the string we want.
time.strftime('%Y-%m-%d',time.localtime(time.time()))
# 2013-01-09

# Date and time together:
time.strftime('%Y-%m-%d %H:%M:%S',time.localtime(time.time()))

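The argsort function

Returns the indices that would sort the array in ascending order.

x = np.array([3, 1, 2])
np.argsort(x)        # array([1, 2, 0])
x[np.argsort(x)]     # the array sorted via those indices: array([1, 2, 3])
x[0]                 # 3
x[1]                 # 1
x[2]                 # 2
np.argsort(-x)       # indices for descending order: array([0, 2, 1])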
