August 2017 - 江南小白龙


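Repost [Machine Learning Series 1] The XGBoost Algorithm

http://skyhigh233.com/blog/2016/12/01/gbdt-and-xgboost/ RF, GBDT, and XGBoost. RF: randomly select m of the M training samples and n of the N features, then build a decision tree. After T trees have been trained this way, the T trees vote on the test set to produce the decision value. RF follows the bagging approach and can be parallelized. GBDT: builds T trees in total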

2017-08-30 20:12:04 697
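
The RF description in this entry (sample rows, sample features, train a tree, let the trees vote) can be sketched in plain Python. This is a minimal illustration, not the reposted article's code: the base learner is a one-level decision stump rather than a full decision tree, rows are assumed to be drawn with replacement, and all function names are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda row: left_label if row[f] <= threshold else right_label

def train_forest(rows, labels, n_trees, m_samples, n_features, seed=0):
    """Train n_trees stumps, each on m sampled rows and n sampled features."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Assumption: rows are sampled with replacement (bootstrap).
        idx = [rng.randrange(len(rows)) for _ in range(m_samples)]
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    rows = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.0], [0.8, 0.8]]
    labels = [0, 0, 0, 1, 1, 1]
    forest = train_forest(rows, labels, n_trees=25, m_samples=30, n_features=1)
    print(predict(forest, [-1.0, -1.0]), predict(forest, [2.0, 2.0]))
```

Because each tree depends only on its own sample, the loop in `train_forest` could be run in parallel, which is the parallelization the entry mentions.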

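Repost [Recruitment Series 2] Hive Interview Questions

Hive's execution logic: what Hive stores is the mapping to HDFS. Hive is a logical data warehouse; what it actually operates on are files in HDFS, and HQL is essentially a MapReduce program written in SQL syntax. Hive's relationship to relational databases: there is none. Hive is a data warehouse and cannot perform real-time CRUD operations the way a database does. Its workload is write once, read many, and it can be regarded as an ETL tool.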

2017-08-29 19:49:19 426

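Repost [Recruitment Series 1] Common Hadoop Questions

Briefly describe Hadoop's MapReduce programming model. Hadoop and Spark both do parallel computation, so what do they have in common and how do they differ? Both use the MapReduce model for parallel computation. A Hadoop unit of work is called a job; a job is divided into map tasks and reduce tasks, each task runs in its own process, and when a task ends its process ends too. A task submitted by a Spark user is called an applicatio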

2017-08-29 19:42:10 297
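
To make the map task / reduce task split in this entry concrete, here is a minimal single-process sketch of the MapReduce model using word count. It only imitates the data flow (map, group by key, reduce); it uses no Hadoop API, and the function names are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def run_job(lines):
    """Run map over every line, then shuffle, then reduce each key group."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

if __name__ == "__main__":
    print(run_job(["a b a", "b c"]))
```

In a real cluster each `map_phase` call and each `reduce_phase` call would run as a separate task on some node; here they run sequentially in one process.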

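Repost [Spark Series 6] Submitting Jobs with spark-submit

According to the official Spark site, you specify --jars when submitting a job, separating the JAR files with commas. The drawback is that the JAR files must be listed on every submission; that is fine for a few of them but tedious for many. spark-submit --master yarn-client --executor-memory 3g --executor-cores 2 --num-executors 2 --jars ***.jar,***.jar(your JAR files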

2017-08-29 19:05:11 640
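
One way to reduce the tedium of listing every JAR by hand is to generate the comma-separated `--jars` value from a directory. The sketch below is an assumption about how that could be scripted, not something from the reposted article: the helper names, the `lib` directory, and `app.jar` are hypothetical, while the flags and their values are copied from the command in this entry. Note that newer Spark releases prefer `--master yarn --deploy-mode client` over `--master yarn-client`.

```python
import shlex
from pathlib import Path

def collect_jars(directory):
    """Return the sorted paths of all .jar files directly inside a directory."""
    return sorted(str(p) for p in Path(directory).glob("*.jar"))

def build_spark_submit(app_jar, dependency_jars):
    """Build the spark-submit argument list with a comma-separated --jars value."""
    cmd = [
        "spark-submit",
        "--master", "yarn-client",      # copied from the entry's command
        "--executor-memory", "3g",
        "--executor-cores", "2",
        "--num-executors", "2",
    ]
    if dependency_jars:
        cmd += ["--jars", ",".join(dependency_jars)]
    cmd.append(app_jar)                 # hypothetical application JAR
    return cmd

if __name__ == "__main__":
    # "lib" and "app.jar" are placeholder names for illustration only.
    command = build_spark_submit("app.jar", collect_jars("lib"))
    print(" ".join(shlex.quote(part) for part in command))
```

The script only prints the command; passing the returned list to `subprocess.run` would be the next step if you wanted it to submit the job.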

Repost [Spark Series 5] The Difference Between cache and persist

The difference between cache and persist is visible in the RDD.scala source: def persist(newLevel: StorageLevel): this.type = { if (storageLevel != StorageLevel.NONE && newLevel != storageLevel) { throw new UnsupportedOp

2017-08-29 15:14:44 934
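
The excerpt in this entry is cut off, so the following is a toy Python model rather than a transcription of RDD.scala. It reproduces the guard shown in the excerpt (a storage level cannot be changed once assigned) and assumes the commonly documented behaviour that `cache()` on an RDD is simply `persist()` with the default `MEMORY_ONLY` level. The class names and the error message are hypothetical.

```python
from enum import Enum

class StorageLevel(Enum):
    NONE = "NONE"
    MEMORY_ONLY = "MEMORY_ONLY"
    MEMORY_AND_DISK = "MEMORY_AND_DISK"

class UnsupportedOperation(Exception):
    """Stand-in for the exception thrown by the guard in the excerpt."""

class ToyRDD:
    def __init__(self):
        self.storage_level = StorageLevel.NONE

    def persist(self, new_level=StorageLevel.MEMORY_ONLY):
        # Same condition as the excerpt: reject a *different* level
        # once some level has already been assigned.
        if self.storage_level != StorageLevel.NONE and new_level != self.storage_level:
            raise UnsupportedOperation("storage level already assigned")
        self.storage_level = new_level
        return self

    def cache(self):
        # Assumption: cache() just delegates to persist() with the default level.
        return self.persist()

if __name__ == "__main__":
    rdd = ToyRDD().cache()
    print(rdd.storage_level)
```

Under this model the only difference is flexibility: `persist` lets the caller choose a level, while `cache` always uses the default.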

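Repost [Spark Series 4] How Spark Shuffle Works

Most of a Spark job's performance cost is incurred in the shuffle stage, because it involves heavy disk I/O, serialization, and network data transfer. To push a job's performance further, the shuffle process therefore needs tuning. Keep in mind, however, that the main factors in a Spark job's performance are still code design, resource parameters, and data skew; shuffle tuning accounts for only a small part of overall Spark performance tuning. In Spar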

2017-08-29 13:08:24 1095

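Repost [Spark Series 3] Spark Optimization

Problems encountered: data skew. Data skew has serious consequences: OOM, slow execution, and unpredictable run time. Locating data skew: 1. The Web UI clearly shows how much data each task processes. 2. The logs clearly show which line caused the OOM and in which stage the skew occurred, usually during a shuffle. 3. Code walkthrough, focusing on key operations such as join, groupByKey, and reduceByKey. 4. For the data feature distribution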

2017-08-29 13:03:00 1157
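
Besides the Web UI, logs, and code review, one simple way to check for skew is to count how often each key occurs in a sample of the data before a join or groupByKey. The sketch below is an assumption about how such a check could look, not the reposted article's method: the function name, the median-based rule, and the default ratio of 10 are all hypothetical choices.

```python
from collections import Counter
from statistics import median

def find_skewed_keys(keys, ratio=10.0):
    """Return {key: count} for keys that occur more than ratio x the median count."""
    counts = Counter(keys)
    if not counts:
        return {}
    typical = median(counts.values())  # median is not pulled up by the hot key itself
    return {k: c for k, c in counts.items() if c > ratio * typical}

if __name__ == "__main__":
    sampled_keys = ["hot"] * 96 + ["a", "b", "c", "d"]
    print(find_skewed_keys(sampled_keys))
```

A key flagged here would end up concentrated in a single shuffle partition, which matches the symptom of one task processing far more data than the others.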

aaai_2020_xai_tutorial_Explainable AI.pdf

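Explainability of AI models is currently a very popular field. This resource is the slide deck from an AAAI 2020 tutorial; it is rich in content and well worth studying.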

2021-01-10

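Engineering Practice of Deep Learning in Baidu Search - Baidu - 曹皓.pdf

About the author: works in Baidu's core search department. Received a master's degree from Peking University in 2012 and joined Baidu the same year; currently responsible for work related to the research architecture of Baidu Search.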

2020-01-28

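Applications of Ultra-Large-Scale Deep Learning at Meituan - 余建平.pdf

Applications of deep learning at Meituan. About the author: received a master's degree from the Department of Computer Science and Technology at Nanjing University in 2011. After graduating, worked on machine learning engineering at Baidu Fengchao (凤巢). After joining Meituan, took charge of the ultra-large-scale machine learning system and built, from scratch, a deep learning system that supports the hundred-billion scale. Works closely with the recommendation, search, and advertising businesses, providing whole-system optimization from recall to ranking on the algorithm side and a full-pipeline solution covering offline, nearline, and online on the engineering side.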

2020-01-28

Adversarial Examples in Modern Machine Learning- A Review.pdf

Recent research has found that many families of machine learning models are vulnerable to adversarial examples: inputs that are specifically designed to cause the target model to produce erroneous outputs. In this survey, we focus on machine learning models in the visual domain, where methods for generating and detecting such examples have been most extensively studied. We explore a variety of adversarial attack methods that apply to image-space content, real world adversarial attacks, adversarial defenses, and the transferability property of adversarial examples. We also discuss strengths and weaknesses of various methods of adversarial attack and defense. Our aim is to provide an extensive coverage of the field, furnishing the reader with an intuitive understanding of the mechanics of adversarial attack and defense mechanisms and enlarging the community of researchers studying this fundamental set of problems.

2020-01-28

Toward AI Security.pdf

This report uses the lens of global AI security to investigate the robustness and resiliency of AI systems, as well as the social, political, and economic systems with which AI interacts. The report introduces a framework for navigating the complex landscape of AI security, visualized in the AI Security Map. This is followed by an analysis of AI strategies and policies from ten countries around the world within this framework to identify areas of convergence and divergence. This comparative exercise highlights significant policy gaps, but also opportunities for coordination and cooperation among all surveyed nations. Five recommendations are provided for policymakers around the world who are hoping to advance global AI security and move us toward a more resilient future. The steps nations take now will shape AI trajectories well into the future, and those governments working to develop global and multistakeholder strategies will have an advantage in establishing the international AI agenda.

2020-01-28

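刘焱 - Web Security and Machine Learning (three-book bundle)

A bundle of three books by 刘焱: 《Web安全之机器学习入门》 (Web Security: Introduction to Machine Learning), 《Web安全之深度学习实战》 (Web Security: Deep Learning in Practice), and 《Web安全之强化学习与GAN》 (Web Security: Reinforcement Learning and GAN).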

2019-01-10

An Introduction to Deep Reinforcement Learning

A very practical textbook introducing deep reinforcement learning. Abstract: Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the aspects related to generalization and how deep RL can be used for practical applications. We assume the reader is familiar with basic machine learning concepts.

2018-12-26
