@dream - CSDN Blog

Original [python3] Maximum integer, maximum float, and positive/negative infinity

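In Python, a value can be initialized to positive or negative infinity. Show me the code:
a = float("inf")   # positive infinity; the string is case-insensitive
b = float("-inf")  # negative infinity
c = -float("inf")  # negative infinity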
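The title also mentions the largest integer and float, which the excerpt above does not show. As a minimal sketch (the names `largest_float`, `big`, and `running_min` are illustrative, not from the post), the standard `sys` and `math` modules expose these limits:

```python
import math
import sys

pos_inf = float("inf")
neg_inf = float("-inf")

# Infinity compares greater than every finite float.
largest_float = sys.float_info.max  # largest finite double
assert pos_inf > largest_float
assert neg_inf < -largest_float
assert pos_inf == math.inf

# Python 3 ints are arbitrary precision, so there is no maximum integer;
# sys.maxsize is only the largest container index on this platform.
big = sys.maxsize + 1
assert big > sys.maxsize


def running_min(values):
    """Return the smallest value, using +inf as the initial sentinel."""
    best = float("inf")
    for v in values:
        if v < best:
            best = v
    return best
```

Seeding a running minimum with `float("inf")` is a typical use: any finite input replaces the sentinel on the first comparison.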

2019-10-19 13:07:41 2380

Original [C++11] override and final

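Contents: override, final. override and final are specifiers introduced in C++11.
override
Reference page: https://zh.cppreference.com/w/cpp/language/override
Purpose: specifies that a virtual function overrides another virtual function.
class A
{
    virtual void foo();
    void bar();
};
...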

2019-09-25 19:58:03 219

Translation Chapter 6. Temporal-Difference Learning

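Temporal-difference (TD) learning is a central and novel idea in reinforcement learning. Characteristics of TD: (1) it needs no complete model of the environment; (2) it bootstraps (the value estimate of one state depends on the estimated values of other states). It combines DP and MC. Policy evaluation: the prediction problem. Control problem: find an optimal policy. ###6.1 TD Predictio...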
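The bootstrapping idea in the excerpt above can be sketched with the standard tabular TD(0) update, V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)). This is a minimal illustration, not the post's own code; the function name, the state labels, and the values of `alpha` and `gamma` are assumptions chosen for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """Apply one TD(0) update to values[state] in place; return the TD error."""
    # The target bootstraps from the current estimate of the next state.
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical value table for a three-state example.
values = {"A": 0.0, "B": 0.5, "terminal": 0.0}

# One observed transition A -> B with reward 0.
err = td0_update(values, "A", 0.0, "B", alpha=0.1, gamma=1.0)
```

The update needs only a single observed transition, so no model of the environment and no complete episode is required.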

2019-09-19 23:43:59 306

Translation Chapter 5. Monte Carlo Methods

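Monte Carlo methods are learning algorithms for estimating value functions and discovering optimal policies. They do not require complete knowledge of the environment. Monte Carlo methods need only experience. (Experience means sequences of states, actions, and rewards from real or simulated interaction with the environment.) No prior knowledge of the environment is needed. Monte Carlo methods solve the reinforcement learning problem by averaging samples. To ensure that the results are valid, we assume M...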
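The averaging idea in the excerpt above can be illustrated with first-visit Monte Carlo prediction. This is a sketch under stated assumptions, not the post's code: each episode is assumed to be a list of `(state, reward)` pairs, where the reward is the one received after leaving that state, and the function name is hypothetical.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate state values as the mean return after each state's first visit."""
    returns = defaultdict(list)
    for episode in episodes:
        # Walk backward to accumulate the discounted return at every step.
        g = 0.0
        step_returns = []
        for state, reward in reversed(episode):
            g = reward + gamma * g
            step_returns.append((state, g))
        step_returns.reverse()

        # Record the return only for the first visit to each state.
        seen = set()
        for state, g in step_returns:
            if state not in seen:
                seen.add(state)
                returns[state].append(g)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}


# Two made-up episodes, used only to exercise the function.
estimates = first_visit_mc([[("A", 0.0), ("B", 1.0)], [("A", 3.0)]])
```

Only sampled experience is used here; no transition probabilities appear anywhere in the code.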

2019-09-19 23:33:08 385

Translation Chapter 4. Dynamic Programming

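Dynamic programming (DP), as introduced in this chapter, refers to a collection of algorithms that compute optimal policies given a perfect model of the environment as a Markov decision process (MDP). Classical DP algorithms place heavy demands on the model and on computation. Other reinforcement learning methods can be viewed as attempts to achieve the same effect as DP with less computation and without a perfect model. Assume the environment is a finite MDP. Concretely, the state space and the action space are both finite, i.e. S and A(s)...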

2019-09-19 23:09:23 155

Translation Chapter 3. The Reinforcement Learning Problem

What is the reinforcement learning problem? The reinforcement learning problem is meant to be a straightforward framing of the problem of learning from interaction to achieve a goal. What is a reinf...

2019-09-19 22:18:05 405

Repost Spaces in TeX

Contents: Style 1, Style 2, code. Show me the code:
\documentclass{article}
\setlength{\parindent}{0pt}% Just for this example
\begin{document}
There are a number of horizontal spacing macros for ...

2019-09-17 00:05:06 2559

Original [python] Saving a pylab figure to memory so that PIL can open it

Use BytesIO. Show me the code:
import io
import matplotlib.pyplot as plt
from PIL import Image
plt.figure()
plt.plot([0, 1])
buf = io.BytesIO()
plt.savefig(buf, format='png')
buf.seek(0)  # key step: rewind the buffer before reading it
im = I...

2019-09-16 23:48:35 1691

Original Summary of sorting algorithms

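Sorting summary. This post covers insertion sort (straight and binary), Shell sort, straight selection sort, heapsort, bubble sort, quicksort, and merge sort. Problem statement: assume the elements are integers, to be sorted in ascending order. I. Straight insertion sort 1. Description...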
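The first algorithm named above, straight insertion sort, can be sketched as follows for the stated problem (integers, ascending order). The post's own description is cut off, so this is the textbook form rather than the author's implementation, and the function name is an assumption.

```python
def insertion_sort(items):
    """Return a new list with the integers in ascending order."""
    result = list(items)  # copy so the caller's list is left untouched
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result
```

Using a strict `>` in the shift loop keeps equal elements in their original relative order, so the sort is stable.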

2019-09-14 18:01:17 333

Original Handling input in C++ interviews

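Contents: I. Single line, single item: 1. cin >>; 2. get() function; 3. getline() function; 4. The C way: 4.1 getchar(); 4.2 scanf(). II. Single line, multiple items. III. Multiple lines, multiple items.
I. Single line, single item
1. cin >>
Header: #include <iostream>
// Eg:
int _int; // input (typed at the console):...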

2019-09-14 17:36:22 309

Original The PageRank algorithm

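Contents: I. Overview of PageRank. II. The two basic assumptions of PageRank. III. How PageRank works: 3.1 Algorithm steps; 3.2 Basic idea; 3.3 How the formula is derived. IV. Reference pages.
I. Overview of PageRank
PageRank is a link-analysis algorithm proposed by Google founders Larry Page and Sergey Brin in 1997, while they were building an early prototype of their search system. PageRank is a method Google uses to indicate the rank/importance of web pages, ...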

2019-09-14 16:48:05 1660

Original TFIDF

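Contents: 1. How TFIDF works. 2. A probabilistic interpretation of TFIDF. 3. TFIDF in practice with Python.
1. How TFIDF works
TFIDF (term frequency - inverse document frequency). Main idea: if a word or phrase appears with high frequency (TF) in one document and rarely in other documents, it is considered to discriminate well between categories. Formula: given a corpus D, a document is denoted d...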
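The post's formula is cut off above, so the sketch below uses one common variant rather than the author's definition: tf is a term's count divided by the document length, and idf is log(N / df), where N is the number of documents and df is the number of documents containing the term. The function name and the tokenized-list input format are assumptions.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Score every term of every document; corpus is a list of token lists."""
    n_docs = len(corpus)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus: "a" appears in both documents, "b" and "c" in one each.
scores = tf_idf([["a", "b"], ["a", "c"]])
```

A term that appears in every document gets idf = log(1) = 0, which matches the main idea stated above: frequent-everywhere terms do not help tell documents apart.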

2019-09-14 16:32:36 548

Repost 好未来 (TAL Education) written test (C++ implementation)

Reference: https://www.nowcoder.com/discuss/100235

2019-09-11 17:01:47 364

nothing's blog