TensorFlow Environment Setup
First comes Python environment management. TensorFlow versions keep evolving, Python versions keep evolving, and there are version dependencies between the two, so we use pyenv to manage our Python environments.
pyenv is a tool for managing and switching between Python versions.
Download pyenv
Download the installer script:
$ curl -o pyenv-installer.sh -L https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer
Run the script:
$ bash pyenv-installer.sh
pyenv ends up installed under ~/.pyenv.
Configure pyenv
On Ubuntu, add the following to ~/.bashrc:
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)" # 这句可以不加
Initialize:
source ~/.bashrc
Check the currently installed versions:
==> pyenv versions
* system (set by /home/hunter/.pyenv/version)
3.6.5
List the installable versions:
==> pyenv install --list
Available versions:
2.1.3
2.2.3
2.3.7
2.4.0
2.4.1
2.4.2
2.4.3
2.4.4
2.4.5
2.4.6
...
Install the build dependencies:
Ubuntu:
sudo apt-get update
sudo apt-get install make build-essential libssl-dev zlib1g-dev
sudo apt-get install libbz2-dev libreadline-dev libsqlite3-dev wget curl
sudo apt-get install llvm libncurses5-dev libncursesw5-dev
Install a specific Python version:
pyenv install 3.6.5 -s # -s skips the build if this version is already installed
The downloaded archive is cached under ~/.pyenv/cache; you can also download the tarball into that directory manually with wget or another tool, then run the install again.
Create a virtual environment
pyenv virtualenv 3.6.5 env3.6.5
A problem encountered when activating:
==> pyenv activate env3.6.5
Failed to activate virtualenv.
Perhaps pyenv-virtualenv has not been loaded into your shell properly.
Please restart current shell and try again.
Solution:
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
Add these two lines to your .bashrc or .zshrc.
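Once activation works, a quick check from inside Python confirms which interpreter the shell is now using (a minimal sketch; the exact paths depend on your pyenv root):

import sys

# Should report Python 3.6.5 and an executable path inside the
# virtualenv, e.g. ~/.pyenv/versions/env3.6.5/bin/python
print(sys.version)
print(sys.executable)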
Install TensorFlow
Update the pip source:
==> vim ~/.pip/pip.conf
[global]
trusted-host=mirrors.aliyun.com
index-url = https://mirrors.aliyun.com/pypi/simple
Install TensorFlow:
pip install tensorflow==1.4
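After the install completes, a one-line import is enough to confirm the package is usable (for the pin above, the expected output is 1.4.0):

import tensorflow as tf

print(tf.__version__)  # should print 1.4.0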
TensorFlow API
- tf.reduce_mean(): computes the mean of a tensor's elements, optionally along the given axes (see the sketch below).
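A minimal sketch of how tf.reduce_mean behaves, written in the TensorFlow 1.x graph/session style matching the version installed above (the values are illustrative):

import tensorflow as tf

x = tf.constant([[1., 2.],
                 [3., 4.]])

mean_all  = tf.reduce_mean(x)          # mean of every element -> 2.5
mean_cols = tf.reduce_mean(x, axis=0)  # per-column mean -> [2., 3.]
mean_rows = tf.reduce_mean(x, axis=1)  # per-row mean -> [1.5, 3.5]

with tf.Session() as sess:
    print(sess.run([mean_all, mean_cols, mean_rows]))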
epochs
The epochs argument specifies how many times our entire training set will be run through the network during training. The more epochs, the more training will occur. You might think that the more training happens, the better the network will be. However, some networks will start to overfit their training data after a certain number of epochs, so we might want to limit the amount of training we do.
In addition, even if there's no overfitting, a network will stop improving after a certain amount of training. Since training costs time and computational resources, it's best not to train if the network isn't getting better!
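In tf.keras terms, epochs is just an argument to model.fit. A hedged sketch with a hypothetical toy model and random data (the model, data, and values are placeholders, not from the text above):

import numpy as np
import tensorflow as tf

# Hypothetical dataset: 600 examples, matching the discussion below.
x_train = np.random.rand(600, 1).astype(np.float32)
y_train = np.random.rand(600, 1).astype(np.float32)

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='rmsprop', loss='mse')

# 20 epochs = 20 full passes over the training set; if the loss
# stops improving (or validation loss starts rising), more epochs
# only waste time or cause overfitting.
model.fit(x_train, y_train, epochs=20, batch_size=16)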
batch_size
The batch_size argument specifies how many pieces of training data to feed into the network before measuring its accuracy and updating its weights and biases. If we wanted, we could specify a batch_size of 1, meaning we’d run inference on a single datapoint, measure the loss of the network’s prediction, update the weights and biases to make the prediction more accurate next time, and then continue this cycle for the rest of the data.
Because we have 600 datapoints, each epoch would result in 600 updates to the network. This is a lot of computation, so our training would take ages! An alternative might be to select and run inference on multiple datapoints, measure the loss in aggregate, and then update the network accordingly.
If we set batch_size to 600, each batch would include all of our training data. We’d now have to make only one update to the network every epoch—much quicker. The problem is, this results in less accurate models. Research has shown that models trained with large batch sizes have less ability to generalize to new data—they are more likely to overfit.
The compromise is to use a batch size that is somewhere in the middle. In our training code, we use a batch size of 16. This means that we'll choose 16 datapoints at random, run inference on them, calculate the loss in aggregate, and update the network once per batch. If we have 600 points of training data, the network will be updated around 38 times per epoch, which is far better than 600.
When choosing a batch size, we're making a compromise between training efficiency and model accuracy. The ideal batch size will vary from model to model. It's a good idea to start with a batch size of 16 or 32 and experiment to see what works best.
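The updates-per-epoch numbers quoted above follow directly from ceil(num_examples / batch_size); a tiny sketch of that arithmetic:

import math

num_examples = 600
for batch_size in (1, 16, 600):
    updates = math.ceil(num_examples / batch_size)
    print(batch_size, updates)
# batch_size=1   -> 600 updates per epoch
# batch_size=16  -> 38 updates per epoch
# batch_size=600 -> 1 update per epoch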