September 2017_江南小白龙


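Original: [Real-Time Computing Architecture Series 1] How WePay Implements Real-Time Streaming Fraud Detection with Google Cloud Platform (GCP) and Kafka

Original article: https://cloud.google.com/blog/big-data/2017/08/how-wepay-uses-stream-analytics-for-real-time-fraud-detection-using-gcp-and-apache-kafka By Wei Li, Lead Engineer at WePay. First, WePay's anti-fraud scenario: transaction fraud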

2017-09-24 15:59:35 924

Original: [Spark Series 8] Fixing the Spark Shuffle FetchFailedException Error

The first half comes from http://blog.csdn.net/lsshlsw/article/details/51213610; the second half is my own optimization approach, offered for reference. Errors caused by SparkSQL shuffle operations: o

2017-09-24 15:25:21 21756

Repost: [Spark Series 7] How Spark Reads and Writes Hive

Hive tables can be created in Hive itself, or with hiveContext.sql("create table ....") 1) Writing to a Hive table: case class Person(name:String,col1:Int,col2:String) val sc = new org.apache.spark.SparkContext val hiveContext = new org.a

2017-09-24 14:48:37 2782

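Repost: [Data Structures Series 1] Hash_Map

What is the difference between hash_map and map? Constructor: hash_map requires a hash function and an equality function; map only requires a comparison function (less-than). Storage structure: hash_map is stored in a hash table, while map is usually implemented as a red-black tree (RB Tree), so their in-memory data structures differ. When should you use hash_map, and when map? In general, hash_map lookups are faster than map lookups, and its lookup speed, relative to the data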

2017-09-10 15:04:05 330

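Repost: [Flink Series 2] Time Windows

Introduction: For a stream processing system, incoming messages are unbounded, so for operations such as aggregations or joins the system has to split the incoming messages into segments and then aggregate or join over each segment. Such a segment of messages is called a window. Stream processing systems support many window types; the most common is the time window, which segments messages by time interval. This section mainly covers the time windows supported by the Flink stream processing system. In most current stream processing systems, time windows are generally based on the local clock of the node where the Task runs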

2017-09-10 14:12:25 3703
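
As a rough illustration of the time-window idea in the teaser above, the sketch below assigns each message to a fixed, non-overlapping (tumbling) window by its timestamp and counts messages per window and key. It is plain Python rather than Flink's API; the function name `tumbling_window_counts`, the `(timestamp_ms, key)` event shape, and the 5-second window size are assumptions made for the example.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_ms, key) pairs (a hypothetical shape).
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp down to the start of the window it falls into.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample stream: timestamps in milliseconds, keys are message types.
events = [(1000, "a"), (1500, "b"), (4999, "a"), (5000, "a"), (9000, "b")]
result = tumbling_window_counts(events, window_ms=5000)
```

Here the timestamps are carried by the events themselves; a system that windows by the local clock of the processing node would instead read the clock at the moment each message arrives.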

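Repost: [Flink Series 1] Differences Between Flink and Spark

Spark basic architecture. Flink basic architecture. The main abstraction introduced by Spark is the resilient distributed dataset (RDD). Flink supports incremental iterative computation. Performance comparison: First, both can perform real-time computation on an in-memory computing framework, so both offer very good performance. In testing, Flink performed slightly better. Test environment: CPUs: 7000; memory: 128 GB per machine; version: Hadoop 2.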

2017-09-10 12:51:20 15049

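Repost: [Machine Learning Series 2] The FPGrowth Algorithm and Its Spark Implementation

Principles. Basics. Support: support is the likelihood that {X, Y} appears among all itemsets, i.e. the probability that an itemset contains both X and Y. This metric is the first threshold for building strong association rules and measures the "quantity" of the rule under consideration. Confidence: confidence is the probability that the consequent Y occurs given that the antecedent X has occurred. This is the second threshold for generating strong association rules and measures the "quality", that is, the reliability, of the rule under consideration. Lift: lift describes, when X is present, the likelihood of also containing Y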

2017-09-07 17:22:17 3458
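
The three metrics named in the teaser above can be sketched in a few lines of plain Python. This is a minimal illustration, not the FPGrowth algorithm and not Spark's API. The teaser is cut off before it defines lift, so the code assumes the usual textbook definition, confidence(X -> Y) divided by support(Y). The function names and the sample transactions are made up for the example.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Estimated P(Y | X): support of X and Y together divided by support of X."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


def lift(transactions, x, y):
    """Assumed textbook definition: confidence(X -> Y) divided by support(Y)."""
    return confidence(transactions, x, y) / support(transactions, y)


# Hypothetical toy data set of four shopping baskets.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
]

s = support(transactions, {"milk", "bread"})
c = confidence(transactions, {"milk"}, {"bread"})
l = lift(transactions, {"milk"}, {"bread"})
```

A lift below 1, as in this toy data, suggests that seeing X makes Y slightly less likely than its baseline rate; a lift above 1 suggests a positive association.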

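Repost: [Blockchain Series 1] A Blockchain Primer

Preface: A blockchain is essentially a ledger. When a good, an action, or a transaction begins, a block can be created, and the whole life cycle of its movement is recorded in detail, forming a chain. This ledger lives on the internet and, in theory, cannot be taken away, tampered with, or destroyed by anyone. Pros and cons of blockchain. Pros: 1. Distributed and decentralized. Taking Bitcoin as an example, the benefit of decentralization is that no bank-like institution is needed to provide trust and guarantees for the two parties to a transaction. 2. Cannot be tampered with or revoked. Because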

2017-09-06 20:11:00 1098

aaai_2020_xai_tutorial_Explainable AI.pdf

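AI model explainability is currently a very active field. This resource is the slide deck from the AAAI 2020 tutorial; it is rich in content and well worth studying.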

2021-01-10

阿里云企业应用事业部-区块链在企业的落地探索.pdf

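A 2018 talk by Liu Xin, senior product expert at Alibaba Cloud's Enterprise Applications Business Unit, titled "Exploring Blockchain Adoption in the Enterprise" (《区块链在企业的落地探索》), which looks at some applications of blockchain in insurance and finance scenarios.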

2020-01-31

房源质量打分中深度学习应用及算法优化-周玉驰.pdf

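Beike (贝壳). About the author: holds a master's degree from the Chinese Academy of Sciences; previously worked at Huawei, Baidu, and Yidu Cloud; currently at Beike Zhaofang (贝壳找房), mainly responsible for two areas: listing strategy algorithms and the listing-customer-agent relationship graph.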

2020-01-29

深度学习在百度搜索中的工程实践-百度-曹皓.pdf

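About the author: Baidu Core Search Department; received a master's degree from Peking University in 2012 and joined Baidu the same year; currently responsible for work related to the research architecture of Baidu Search.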

2020-01-28

超大规模深度学习在美团的应用-余建平.pdf

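Applications of deep learning at Meituan. About the author: received a master's degree from the Department of Computer Science and Technology at Nanjing University in 2011. After graduating, worked on machine learning engineering at Baidu Fengchao (百度凤巢). Since joining Meituan, has been responsible for the ultra-large-scale machine learning system, building from scratch a deep learning system that supports scale on the order of hundreds of billions. The work involves close collaboration with the recommendation, search, and advertising businesses, providing full-system optimization from recall to ranking on the algorithm side and end-to-end offline, nearline, and online solutions on the engineering side.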

2020-01-28

Adversarial Examples in Modern Machine Learning- A Review.pdf

Recent research has found that many families of machine learning models are vulnerable to adversarial examples: inputs that are specifically designed to cause the target model to produce erroneous outputs. In this survey, we focus on machine learning models in the visual domain, where methods for generating and detecting such examples have been most extensively studied. We explore a variety of adversarial attack methods that apply to image-space content, real world adversarial attacks, adversarial defenses, and the transferability property of adversarial examples. We also discuss strengths and weaknesses of various methods of adversarial attack and defense. Our aim is to provide an extensive coverage of the field, furnishing the reader with an intuitive understanding of the mechanics of adversarial attack and defense mechanisms and enlarging the community of researchers studying this fundamental set of problems.

2020-01-28

Toward AI Security.pdf

This report uses the lens of global AI security to investigate the robustness and resiliency of AI systems, as well as the social, political, and economic systems with which AI interacts. The report introduces a framework for navigating the complex landscape of AI security, visualized in the AI Security Map. This is followed by an analysis of AI strategies and policies from ten countries around the world within this framework to identify areas of convergence and divergence. This comparative exercise highlights significant policy gaps, but also opportunities for coordination and cooperation among all surveyed nations. Five recommendations are provided for policymakers around the world who are hoping to advance global AI security and move us toward a more resilient future. The steps nations take now will shape AI trajectories well into the future, and those governments working to develop global and multistakeholder strategies will have an advantage in establishing the international AI agenda.

2020-01-28

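Liu Yan - Web Security and Machine Learning (three-book bundle)

Liu Yan - Web Security and Machine Learning (three-book bundle): "Web Security: Introduction to Machine Learning" (《Web安全之机器学习入门》), "Web Security: Deep Learning in Practice" (《Web安全之深度学习实战》), and "Web Security: Reinforcement Learning and GAN" (《Web安全之强化学习与GAN》)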

2019-01-10

An Introduction to Deep Reinforcement Learning

A textbook introducing deep reinforcement learning; very practical. Abstract: Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the aspects related to generalization and how deep RL can be used for practical applications. We assume the reader is familiar with basic machine learning concepts.

2018-12-26

Forward Neural Network for Time Series Anomaly Detection

A research paper from Tencent. Abstract: Time series anomaly detection is usually formulated as finding outlier data points relative to some usual data, which is also an important problem in industry and academia. To ensure systems working stably, internet companies, banks and other companies need to monitor time series, which is called KPI (Key Performance Indicators), such as CPU used, number of orders, number of online users and so on. However, millions of time series have several shapes (e.g. seasonal KPIs, KPIs of timed tasks and KPIs of CPU used), so that it is very difficult to use a simple statistical model to detect anomaly for all kinds of time series. Although some anomaly detectors have developed many years and some supervised models are also available in this field, we find many methods have their own disadvantages. In this paper, we present our system, which is based on deep forward neural network and detect anomaly points of time series. The main difference between our system and other systems based on supervised models is that we do not need feature engineering of time series to train deep forward neural network in our system, which is essentially an end-to-end system.

2018-12-26

华为-AI的安全和隐私保护.pdf

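Understanding How to Use Deep Learning for Cybersecurity in One Article

Online Advertising Fraud Detection Methods

Ctrip's Recommendation and Intelligent Algorithms

C Language Tutorial for the 51 Microcontroller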