Machine Learning

狐狸取经

Tags: tensorflow, machine learning, neural networks, deep learning

Article link: https://blog.csdn.net/Lxg101015/article/details/116072241

Models and code

Faster-RCNN

A CSDN blogger's walkthrough of the theory and code: Faster-RCNN

Mask R-CNN

theory and code

RefineDet

Paper and Code

Evaluation metrics

mAP (mean average precision)

code: mAP
theory: mAP, and various model evaluation methods
Fast image augmentation library: albumentations

Dictionary usage

Input:

dic = {}
dic['name'] = 'LXG'  # 'name' and 'LXG' form a key-value pair: dic[key] = value
dic['age'] = 25
print(dic)
print(dic.keys())  # dic.keys() returns a view of all keys in dic
print('name' in dic.keys())
print('name' not in dic.keys())

Output:

{'name': 'LXG', 'age': 25}
dict_keys(['name', 'age'])
True
False

Input:

dataSet = [[1, 1, 'yes'],
           [1, 1, 'yes'],
           [1, 0, 'no'],
           [0, 1, 'no'],
           [0, 1, 'no']]
labelCounts = {}
for featVec in dataSet:  # read one row of the data set
    currentLabel = featVec[-1]  # value in the last column of featVec
    if currentLabel not in labelCounts.keys():
        labelCounts[currentLabel] = 0
    labelCounts[currentLabel] += 1
# Print once, after the loop, so only the final counts are shown
print(labelCounts)
print(labelCounts['yes'])

Output:

{'yes': 2, 'no': 3}
2
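
The counting loop above can be written more compactly. The following is a minimal sketch, not part of the original notes, showing two equivalent tallies: one with dict.get and one with collections.Counter. The name counterCounts is made up for this illustration.

```python
from collections import Counter

# Same toy data set as above: the last column is the class label.
dataSet = [[1, 1, 'yes'],
           [1, 1, 'yes'],
           [1, 0, 'no'],
           [0, 1, 'no'],
           [0, 1, 'no']]

# dict.get(key, default) replaces the explicit "key not in dict" check.
labelCounts = {}
for featVec in dataSet:
    currentLabel = featVec[-1]
    labelCounts[currentLabel] = labelCounts.get(currentLabel, 0) + 1

# Counter performs the same tally in a single call.
counterCounts = Counter(featVec[-1] for featVec in dataSet)

print(labelCounts)          # {'yes': 2, 'no': 3}
print(dict(counterCounts))  # {'yes': 2, 'no': 3}
```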

List usage

Difference between append and extend

Input:

a = [1, 2, 3]  # list a
b = [4, 5, 6]  # list b
c=[7,8,9]
a.append(c)
print(a)
b.extend(c)
print(b)

Output:

[1, 2, 3, [7, 8, 9]]
[4, 5, 6, 7, 8, 9]

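After a.append(c), list a gains a fourth element, and that fourth element is itself a list. After b.extend(c), list b contains all the elements of b followed by all the elements of c.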

Explanation of featList = [example[i] for example in dataSet]

Input:

a=[b[-1] for b in dataSet]
c=[b[0] for b in dataSet]
print(a)
print(c)

Output:

['yes', 'yes', 'no', 'no', 'no']
[1, 1, 1, 0, 0]

That is, c = [b[0] for b in dataSet] walks through dataSet in order and collects the first column of each element (each element is itself a list).

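The set function

set() creates an unordered collection of unique elements, so the order in which the elements are printed may differ between runs.

Input:

print(set(a))
print(set(c))

Output:

{'yes', 'no'}
{0, 1}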

sorted() with key=operator.itemgetter()

Input:

import operator

labelCounts = {'yes': 2, 'no': 3}
p = sorted(labelCounts.items(), key=operator.itemgetter(1), reverse=True)
print(p)
print(p[0][0])

Output:

[('no', 3), ('yes', 2)]
no

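The dictionary method items() returns an iterable view of (key, value) tuples, which looks like [(), ()] when converted to a list. In sorted(iterable, key=function, reverse=...), operator.itemgetter(1) is a function that picks the second item (index 1) of each element of the iterable, and that item is used as the sort key.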
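
As an illustration of the key function described above, the sketch below compares operator.itemgetter(1) with an equivalent lambda, and shows max() as an alternative when only the most frequent label is needed. The names by_itemgetter, by_lambda, and majority_label are made up for this example.

```python
import operator

labelCounts = {'yes': 2, 'no': 3}

# operator.itemgetter(1) and the lambda are interchangeable key functions:
# each takes a (key, value) tuple and returns the value.
by_itemgetter = sorted(labelCounts.items(), key=operator.itemgetter(1), reverse=True)
by_lambda = sorted(labelCounts.items(), key=lambda pair: pair[1], reverse=True)

# When only the most frequent label is needed, max() avoids a full sort.
majority_label = max(labelCounts, key=labelCounts.get)

print(by_itemgetter == by_lambda)  # True
print(majority_label)              # no
```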

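The count method

count() returns the number of times a given element occurs in a string, list, or tuple.

Input:

a = ['yes', 'no', 'no', 'no', 'no']  # a is a list
print(a.count(a[0]))
print(a.count(a[1]))

Output:

1
4

a.count(a[0]) counts how many times the first element of the list occurs in the list.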

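Difference between range() and list(range())

Input:

print(range(5))
print(list(range(5)))

Output:

range(0, 5)
[0, 1, 2, 3, 4]

range(5) is a lazy sequence object; list() materializes its values into a list.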

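The del statement

del operates on lists to remove the whole list, a single element, or a contiguous run of elements. It is a statement rather than a function, so the parentheses are optional.

Example: define a list, data = [1, 2, 3, 4, 5, 6]

1. del data: deletes the name data, and with it the whole list if nothing else refers to it.

2. del data[i]: deletes the element of data at index i.

3. del data[i:j]: deletes the elements of data from index i up to, but not including, index j.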
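
The three cases can be checked with a short sketch. The snapshot variables (after_index_delete, after_slice_delete, name_still_bound) are made up for this illustration and are not part of the original notes.

```python
data = [1, 2, 3, 4, 5, 6]

del data[0]                        # remove the element at index 0
after_index_delete = list(data)
print(after_index_delete)          # [2, 3, 4, 5, 6]

del data[1:3]                      # remove indices 1 and 2 (index 3 is excluded)
after_slice_delete = list(data)
print(after_slice_delete)          # [2, 5, 6]

del data                           # unbind the name data
try:
    data                           # looking the name up now raises NameError
    name_still_bound = True
except NameError:
    name_still_bound = False
print(name_still_bound)            # False
```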

Lists, tuples, dictionaries

list:

a_list = [12, '-23.1', 'Python']

tuple:

a_tuple = (23, '9we', -8.5)

dictionary:

a_dictionary={'Huawei':'China', 'Apple':'USA'}

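Common pandas functions

pd.concat(): concatenates DataFrames or Series along an axis.
.iloc[]: selects rows (and optionally columns) by integer position; it is an indexer used with square brackets rather than called like a function.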
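
A minimal sketch of both operations follows, assuming pandas is installed. The frames df_a and df_b and their column names are invented for this example.

```python
import pandas as pd

# Two small, made-up frames with the same columns.
df_a = pd.DataFrame({'feature': [1, 1], 'label': ['yes', 'yes']})
df_b = pd.DataFrame({'feature': [0, 0, 1], 'label': ['no', 'no', 'no']})

# Stack the rows; ignore_index=True renumbers the result 0..n-1.
combined = pd.concat([df_a, df_b], ignore_index=True)

first_row = combined.iloc[0]       # a single row, returned as a Series
first_three = combined.iloc[0:3]   # a slice of rows (end excluded), a DataFrame
cell = combined.iloc[2, 1]         # row 2, column 1, a scalar

print(combined.shape)  # (5, 2)
print(cell)            # no
```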

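Text processing

Handling text file names (.txt)

Input:

from os import listdir

# Raw string, so the backslashes in the Windows path are not treated as escapes
trainingFileList = listdir(r'E:\Machine Learning\machine learning practice\《机器学习实战》完整资源\machinelearninginaction\Ch02/trainingDigits')
m = len(trainingFileList)  # number of files in the directory
print(m)
# print(trainingFileList)
fileNameStr = trainingFileList[254]  # listdir order is not guaranteed; here it was '1_157.txt'
print(fileNameStr)
fileStr = fileNameStr.split(".")[0]  # drop the .txt extension
print(fileStr)
classNumStr = int(fileStr.split("_")[0])  # the class label is the part before '_'
print(classNumStr)

Output: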

1934
1_157.txt
1_157
1

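Splitting text

Splitting text with string.split treats punctuation as part of a word. A better approach is to split the sentence with a regular expression whose delimiter is any run of characters other than letters and digits.
Splitting text into a list of strings:

Input:

import re

# Split the text into tokens
# Raw string, so the backslashes in the Windows path are not treated as escapes
with open(r'E:\Machine Learning\machine learning practice\《机器学习实战》完整资源\machinelearninginaction\Ch04\email\ham/1.txt') as f:
    emailText = f.read()
listOfTokens = re.split(r'\W+', emailText)
print(listOfTokens)
print([tok.lower() for tok in listOfTokens if len(tok) > 2])

Output:

Note the difference between print(listOfTokens) and print([tok.lower() for tok in listOfTokens if len(tok)>2]): the first prints every token, including empty strings and very short tokens, in its original case; the second keeps only tokens longer than two characters and converts them to lowercase.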
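
Because the e-mail file is not reproduced here, the sketch below applies the same two steps to a made-up sentence so the difference is visible. The sample text and the name filteredTokens are assumptions for this illustration.

```python
import re

# Hypothetical stand-in for the contents of an e-mail file.
emailText = 'Hi Peter, the M.L. book is great! Visit soon.'

# \W+ matches one or more characters that are not letters, digits, or underscore.
listOfTokens = re.split(r'\W+', emailText)
print(listOfTokens)
# ['Hi', 'Peter', 'the', 'M', 'L', 'book', 'is', 'great', 'Visit', 'soon', '']

# Keep tokens longer than two characters and lowercase them.
filteredTokens = [tok.lower() for tok in listOfTokens if len(tok) > 2]
print(filteredTokens)
# ['peter', 'the', 'book', 'great', 'visit', 'soon']
```

The trailing empty string appears because the text ends with a delimiter; the length filter removes it along with short fragments such as 'M' and 'is'.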

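LSTM (long short-term memory) networks

The LSTM network is an improved gated recurrent neural network (gated RNN). It alleviates the long-range dependency problem of recurrent neural networks and effectively mitigates the exploding or vanishing gradient problem of simple recurrent networks.
Structure of the LSTM recurrent unit
The LSTM network uses the following equations, where $\sigma$ is the logistic sigmoid, $\odot$ is the element-wise product, and $i_{t}$, $f_{t}$, $o_{t}$ are the input, forget, and output gates:
$c_{t}=f_{t}\odot c_{t-1}+i_{t}\odot\tilde{c}_{t},$
$h_{t}=o_{t}\odot\tanh\left(c_{t}\right),$
$\tilde{c}_{t}=\tanh\left(W_{c}x_{t}+U_{c}h_{t-1}+b_{c}\right),$
$i_{t}=\sigma\left(W_{i}x_{t}+U_{i}h_{t-1}+b_{i}\right),$
$f_{t}=\sigma\left(W_{f}x_{t}+U_{f}h_{t-1}+b_{f}\right),$
$o_{t}=\sigma\left(W_{o}x_{t}+U_{o}h_{t-1}+b_{o}\right).$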
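
The six equations above can be sketched as a single time step in NumPy. This is an illustrative forward pass only, assuming NumPy is available; the function names (sigmoid, lstm_step), the params layout, and the sizes are invented for this example, and training is not shown.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, the sigma in the gate equations."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: returns (h_t, c_t) from x_t, h_{t-1}, c_{t-1}."""
    W, U, b = params['W'], params['U'], params['b']
    i_t = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])      # input gate
    f_t = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])      # forget gate
    o_t = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])      # output gate
    c_tilde = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])  # candidate state
    c_t = f_t * c_prev + i_t * c_tilde                          # new cell state
    h_t = o_t * np.tanh(c_t)                                    # new hidden state
    return h_t, c_t

# Hypothetical sizes and random weights, only to exercise the step function.
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
params = {
    'W': {g: rng.normal(size=(hidden_size, input_size)) for g in 'ifoc'},
    'U': {g: rng.normal(size=(hidden_size, hidden_size)) for g in 'ifoc'},
    'b': {g: np.zeros(hidden_size) for g in 'ifoc'},
}

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):  # a sequence of 5 inputs
    h, c = lstm_step(x_t, h, c, params)

print(h.shape, c.shape)  # (4,) (4,)
```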

Neural network inner product: the shape of W

Neural network inner product
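
As a small illustration of how the shape of W constrains the inner product, the sketch below multiplies a 2-element input by a 2x3 weight matrix, assuming NumPy is available. The values are made up; the point is that the first dimension of W must match the number of inputs and the second gives the number of outputs.

```python
import numpy as np

X = np.array([1.0, 2.0])           # input, shape (2,)
W = np.array([[1.0, 3.0, 5.0],
              [2.0, 4.0, 6.0]])    # weights, shape (2, 3)

Y = np.dot(X, W)                   # output, shape (3,)

print(X.shape, W.shape, Y.shape)   # (2,) (2, 3) (3,)
print(Y)                           # [ 5. 11. 17.]
```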

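Batch Normalization (BN)

Advantages:

Speeds up learning (a larger learning rate can be used)
Reduces dependence on the initial values (less sensitive to initialization)
Suppresses overfitting (reduces the need for Dropout and similar techniques)

BN normalizes each neuron of an intermediate layer individually.
LN (layer normalization) normalizes across all neurons of an intermediate layer.
Difference between BN and LN: batch normalization normalizes a single neuron across different training samples, whereas layer normalization normalizes all neurons of a layer within a single training sample.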
Frederik Kratzert: Understanding the backward pass through Batch Normalization Layer
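
The difference between BN and LN described above comes down to the axis over which the mean and variance are taken. The sketch below shows only the normalization step, assuming NumPy is available; the learnable scale and shift parameters and the running statistics used at inference time are omitted, and the array values and names (x, bn, ln, eps) are made up.

```python
import numpy as np

# Activations of one layer: 3 training samples (rows) x 4 neurons (columns).
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],
              [3.0, 6.0, 9.0, 12.0]])
eps = 1e-5  # small constant to avoid division by zero

# Batch normalization: statistics per neuron, taken across the batch (axis 0).
bn = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# Layer normalization: statistics per sample, taken across the neurons (axis 1).
ln = (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)

print(bn.mean(axis=0))  # close to 0 for every neuron
print(ln.mean(axis=1))  # close to 0 for every sample
```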

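Variance: variance is commonly used to locate where the model's uncertainty is highest.

Compared with least confident and margin sampling, the entropy method takes into account the model's predicted probabilities over all classes for a given [formula]. Least confident considers only the largest probability, and margin sampling considers only the largest and second-largest probabilities.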
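
To make the comparison concrete, the sketch below computes the three scores for two made-up probability vectors. The function names and the example distributions a and b are assumptions for this illustration; the scoring conventions used here (1 minus the top probability, top minus second probability, Shannon entropy with natural logarithm) are one common choice.

```python
import math

def least_confident(probs):
    """Uncertainty from the top probability only (higher = more uncertain)."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two largest probabilities (smaller = more uncertain)."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def entropy(probs):
    """Shannon entropy over all classes (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Two hypothetical predictions with the same top probability.
a = [0.5, 0.5, 0.0]
b = [0.5, 0.25, 0.25]

for name, probs in (('a', a), ('b', b)):
    print(name, least_confident(probs), margin(probs), entropy(probs))
```

Least confident cannot tell a and b apart because both have a top probability of 0.5. Margin sampling ranks a as more uncertain (its margin is 0), while entropy ranks b as more uncertain because its probability mass is spread over all three classes.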

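Bayesian optimization: BoTorch
A detailed explanation of Bayesian optimization
Papers, explanations, and code implementations of classic deep learning algorithms
The blue region between each pair of red points is symmetric about the dashed line (the mean).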
