Deep Learning Study Notes (1): tf.keras

1. Machine Learning Basics: Linear Regression

Linear regression is mainly used for prediction. Single-variable linear regression has the form f(x) = ax + b.

  • Prediction goal: minimize the overall error between the prediction f(x) and the true value y
  • Loss function: the mean squared error $E = \frac{1}{n}\sum_{i}(f(x_i) - y_i)^2$
  • Optimization goal: find a and b that make $(f(x) - y)^2$ as small as possible
  • Optimization method: gradient descent (a minimal sketch follows this list)
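As a minimal sketch of how gradient descent fits a and b for this loss (plain NumPy; the synthetic data values are made up for illustration):

import numpy as np

# Synthetic 1-D data, roughly following y = 2x + 1 (hypothetical values)
x = np.array([1., 2., 3., 4., 5.])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

a, b = 0.0, 0.0   # parameters of f(x) = a*x + b
lr = 0.05         # learning rate

for step in range(2000):
    pred = a * x + b
    error = pred - y
    # Gradients of E = (1/n) * sum((f(x) - y)^2) with respect to a and b
    grad_a = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    a -= lr * grad_a
    b -= lr * grad_b

print(a, b)  # approaches the least-squares fit (about 1.95 and 1.15 for this data)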

2. Linear Regression with tf.keras

Predicting income from years of education:

import tensorflow as tf
import pandas as pd
# data: a pandas DataFrame with Education and Income columns, loaded beforehand (e.g. via pd.read_csv)
x = data.Education
y = data.Income
model = tf.keras.Sequential()  # sequential model
model.add(tf.keras.layers.Dense(1, input_shape=(1,)))  # one input feature, one output unit
model.compile(optimizer='adam', loss='mse')  # compile: Adam optimizer, mean squared error (mse) loss
history = model.fit(x, y, epochs=10000)  # train for 10000 epochs
model.predict(pd.Series([20]))  # predict the income for 20 years of education

Role of the Dense layer: it fully connects every node of the previous layer to every node of the next layer. In this model, the Dense layer automatically implements f(x) = ax + b.
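A minimal self-contained sketch (synthetic data; all values are hypothetical) showing that after training, the Dense layer's kernel and bias really are the learned a and b:

import numpy as np
import tensorflow as tf

x = np.array([1., 2., 3., 4., 5.])
y = 2 * x + 1                               # data generated from a = 2, b = 1

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1), loss='mse')
model.fit(x, y, epochs=500, verbose=0)      # verbose=0 silences the progress bar

kernel, bias = model.layers[0].get_weights()  # kernel has shape (1, 1), bias has shape (1,)
print(kernel, bias)                           # approach 2 and 1 as training converges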


3. The Multilayer Perceptron

What it does:

  • computes a weighted sum of the input features
  • applies an activation function to that sum to produce the output; the activation function models the behavior of an artificial neuron

Structure of a perceptron with two hidden layers (figure not reproduced here).

Activation functions: 'relu' and 'sigmoid' are the usual choices (see the sketch below).
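As a quick illustration of what these two activations compute (a NumPy sketch; the formulas relu(x) = max(0, x) and sigmoid(x) = 1/(1+e^(-x)) are standard):

import numpy as np

def relu(x):
    # ReLU passes positive values through and clips negative values to 0
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid squashes any real value into the interval (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # values strictly between 0 and 1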


4. A Multilayer Perceptron with tf.keras (sales vs. advertising spend)

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_csv('dataset/Advertising.csv')  # load the data
data.head()  # preview the first rows
plt.scatter(data.TV, data.sales)  # plot TV spend against sales (column names as in the standard Advertising.csv)
plt.scatter(data.radio, data.sales)  # plot radio spend against sales
x = data.iloc[:, 1:-1]  # all rows; every column except the first (index) and the last (sales)
y = data.iloc[:, -1]  # all rows; the last column (sales)
model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(3,), activation='relu'),
                             tf.keras.layers.Dense(1)])  # one hidden layer plus the output layer
model.summary()  # inspect the model architecture
model.compile(optimizer='adam', loss='mse')  # compile: configure the optimizer and loss function
model.fit(x, y, epochs=100)  # train on x and y for 100 epochs
test = data.iloc[:10, 1:-1]  # first ten rows, same feature columns
model.predict(test)  # predict sales for those ten rows

5. Logistic Regression and Cross-Entropy

  • Logistic regression typically uses the sigmoid activation function: given an input, the output is a probability value
  • Loss function for classification problems: cross-entropy is normally used (the smaller its value, the closer the predicted distribution is to the true one).
    Let the probability distribution p be the expected output and the probability distribution q the actual output,
    and let H(p, q) denote the cross-entropy; then $H(p,q) = -\sum_{x} p(x)\log q(x)$ (a small numeric sketch follows this list)
  • In tf.keras, binary_crossentropy computes the binary cross-entropy and categorical_crossentropy computes the multi-class cross-entropy
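A small numeric sketch of the H(p, q) formula above (the distributions are made up for illustration): the closer the predicted distribution q is to the expected distribution p, the smaller the cross-entropy.

import numpy as np

p = np.array([1.0, 0.0, 0.0])        # expected output (true class as a one-hot distribution)
q_good = np.array([0.8, 0.1, 0.1])   # confident, correct prediction
q_bad = np.array([0.3, 0.4, 0.3])    # uncertain prediction

def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum_x p(x) * log(q(x)); eps avoids log(0)
    return -np.sum(p * np.log(q + eps))

print(cross_entropy(p, q_good))  # about 0.22 -- q is close to p
print(cross_entropy(p, q_bad))   # about 1.20 -- q is far from p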

6. Logistic Regression with tf.keras

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_csv('dataset/creadit_a.csv')  # load the data
x = data.iloc[:, :-1]  # all rows; every column except the last (the 15 features)
y = data.iloc[:, -1]  # the last column (the label)
model = tf.keras.Sequential()  # sequential model
model.add(tf.keras.layers.Dense(4, input_shape=(15,), activation='relu'))  # first hidden layer: 4 units, 15 input features
model.add(tf.keras.layers.Dense(4, activation='relu'))  # later layers infer their input shape automatically
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))  # output layer with a single unit
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])  # compile: optimizer, loss and accuracy metric
history = model.fit(x, y, epochs=100)  # train for 100 epochs
plt.plot(history.epoch, history.history.get('loss'))  # plot the loss curve
plt.plot(history.epoch, history.history.get('acc'))  # plot the accuracy curve

7. Softmax Classification

  • sigmoid is used for binary classification; softmax is used for multi-class classification
  • What softmax does: it turns the raw outputs into a probability distribution
  • Softmax requirement: every sample must belong to exactly one class, and the classes must cover all possible samples
  • Softmax property: for each sample, the class probabilities sum to 1
  • Choosing the cross-entropy loss (see the sketch after this list):
    (1) categorical_crossentropy — used with one-hot labels. One-hot encoding example: with three classes cat, dog and cow, cat is encoded as [1,0,0], dog as [0,1,0] and cow as [0,0,1]
    (2) sparse_categorical_crossentropy — used with integer labels, e.g. cat = 1, dog = 2, cow = 3
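A short sketch of the difference between the two label encodings (the class names and label values are hypothetical; note that Keras expects integer labels to start at 0):

import numpy as np
import tensorflow as tf

labels = np.array([0, 2, 1])  # integer labels: 0 = cat, 2 = cow, 1 = dog

# Integer labels like these pair with sparse_categorical_crossentropy as they are.
# For categorical_crossentropy, convert them to one-hot vectors first:
one_hot = tf.keras.utils.to_categorical(labels, num_classes=3)
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]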

8. Multi-class Classification of Fashion-MNIST with Softmax

The Fashion-MNIST dataset contains 70,000 images in 10 classes; here 60,000 images are used for training and 10,000 for testing. Every image is 28*28 pixels.

  • The Flatten layer "flattens" the input, turning a multi-dimensional input into a one-dimensional vector
  • The Dense layer maps one dimensionality to another
import tensorflow as tf
(train_image, train_label), (test_image, test_label) = tf.keras.datasets.fashion_mnist.load_data()  # load the dataset
train_image = train_image / 255
test_image = test_image / 255  # scale pixel values to [0, 1]
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))  # flatten each image into a vector of length 28*28
model.add(tf.keras.layers.Dense(128, activation='relu'))  # hidden layer with 128 units
model.add(tf.keras.layers.Dense(10, activation='softmax'))  # turn the 10 outputs into a probability distribution
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])  # compile
model.fit(train_image, train_label, epochs=5)  # train on the training images and their labels for 5 epochs

9. Optimizers, Learning Rate and Backpropagation (BP)

(1) Whether the learning rate is well chosen can be judged from how the loss changes over time: a suitable learning rate makes the loss decrease steadily, while an unsuitable one can make the loss oscillate.
(2) In deep learning, local minima are generally not a practical concern.
(3) The optimizer is one of the two required arguments when compiling a model.
Common optimizers:

  • SGD: the stochastic gradient descent optimizer. It draws a mini-batch of X independently distributed samples and computes their average gradient
  • RMSprop: an optimization algorithm for deep networks that adds a decay coefficient controlling how much of the gradient history is retained
  • Adam: insensitive to the choice of hyperparameters; it can be seen as a corrected combination of Momentum and RMSprop. A learning rate of 0.001 is recommended (see the sketch after this list)
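A minimal sketch of how to pass an optimizer object with an explicit learning rate instead of the string 'adam' (the single-layer model here is a throwaway example):

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])

# An optimizer object makes the learning rate explicit
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mse')

# Other common choices, configured the same way:
# model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss='mse')
# model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001), loss='mse')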

10. Hyperparameter Selection and Network Optimization

  • Hyperparameters: the parameters we choose ourselves when building the network, rather than parameters optimized by gradient descent. Examples: the learning rate, the number of neurons.
  • Network capacity: the more trainable parameters a network has, the larger its capacity
  • Improving fitting ability: the network's ability to fit the data can be improved by increasing its capacity.
    Ways to increase capacity: (1) add more layers (2) add more neurons per layer (see the sketch below for how this affects the parameter count)
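A small sketch of how capacity grows with layers and units: the two models below (sizes chosen arbitrarily) take the same 15-feature input, and count_params() reports their trainable parameter counts.

import tensorflow as tf

small = tf.keras.Sequential([
    tf.keras.layers.Dense(4, input_shape=(15,), activation='relu'),
    tf.keras.layers.Dense(1)
])
large = tf.keras.Sequential([
    tf.keras.layers.Dense(128, input_shape=(15,), activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1)
])

print(small.count_params())  # 69 trainable parameters
print(large.count_params())  # 18689 trainable parameters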

11. Overfitting and Underfitting

How to detect them:
(1) From the loss curves
As the number of epochs increases, the loss on the training set keeps decreasing, while the loss on the validation/test set flattens out (or starts rising) after a while, so that val_loss >> train_loss.
Overfitting: the loss on the training data is much lower than the loss on the test data
Underfitting: the loss is large on both the training data and the test data

(2) From the accuracy curves
As the number of epochs increases, the test-set accuracy flattens out and stays far below the training-set accuracy.
Overfitting: accuracy is high on the training data but low on the test data
Underfitting: accuracy is low on the training data and even lower on the test data (a sketch for plotting these curves follows)
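One way to obtain these curves is to hold out part of the training data with validation_split when calling fit; the sketch below reuses the Fashion-MNIST model from section 8 (the epoch count is chosen arbitrarily).

import tensorflow as tf
import matplotlib.pyplot as plt

(train_image, train_label), _ = tf.keras.datasets.fashion_mnist.load_data()
train_image = train_image / 255  # scale pixel values to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])

# Hold out 20% of the training data as a validation set
history = model.fit(train_image, train_label, epochs=10, validation_split=0.2)

plt.plot(history.epoch, history.history['loss'], label='training loss')
plt.plot(history.epoch, history.history['val_loss'], label='validation loss')
plt.legend()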


12. Fixing Overfitting and Underfitting

Fixing underfitting: improve the network's fitting ability by increasing its capacity
Fixing overfitting: add Dropout layers to the network

What dropout does: during training it randomly drops some of the hidden-layer neurons; at test time all neurons are used
Why dropout reduces overfitting:
(1) it has an averaging effect (similar to averaging many sub-networks)
(2) it reduces complex co-adaptation between neurons
Where to use it: usually between the hidden layers and the output layer

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))  # flatten each image into a vector of length 28*28
model.add(tf.keras.layers.Dense(128, activation='relu'))  # hidden layer with 128 units
model.add(tf.keras.layers.Dropout(0.5))  # randomly drop 50% of this layer's outputs at each training step
model.add(tf.keras.layers.Dense(10, activation='softmax'))  # the output layer follows the Dropout layer

The argument to Dropout() is the fraction of neurons to drop.


13. Principles for Choosing Network Parameters

The ideal model sits right on the boundary between overfitting and underfitting.
How to build such a model:
Step 1: first build a model that overfits
(1) add more layers
(2) make each layer larger
(3) train for more epochs
Step 2: suppress the overfitting (a regularization sketch follows this section)
(1) dropout
(2) regularization
(3) image augmentation
Step 3: tune the hyperparameters
(1) learning rate
(2) number of hidden units
(3) number of training epochs
General principle for building a network:
1. Increase the network capacity until the model overfits
2. Take measures to suppress the overfitting
3. Keep increasing the capacity until the model is just short of overfitting
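As an example of step 2, a sketch combining dropout with L2 weight regularization on a hidden layer (the regularization factor 0.001 and the layer sizes are arbitrary choices, not values from the text above):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu',
                          kernel_regularizer=tf.keras.regularizers.l2(0.001)),  # L2 penalty on this layer's weights
    tf.keras.layers.Dropout(0.5),                                               # drop 50% of activations during training
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])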
