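[Original] Installing a JAR into the local repository with a Maven command

Installing a JAR into the local repository with a Maven command. A third-party JAR added through the development tool compiles without problems, but starting, debugging, or packaging the project fails with a "JAR not found" error. The JAR has to be uploaded to the Maven repository and referenced in the pom file. Maven command to install a given file into the local repository: mvn install:install-file -DgroupId=<groupId> (sets the package name under which it is uploaded to the repository) -DartifactId=<artifactId> (sets the module name the package belongs to) -Dversion=1.0.0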

2020-08-04 16:50:58 23

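[Original] Giraph parameter reference

Common parameters (columns: parameter name; short name; default; example; note)
giraph.maxNumberOfSupersteps; -; 1; 5; maximum number of iterations
giraph.computationClass; -; -; org.apache.giraph.examples.PageRankComputation; vertex computation class
giraph.vertex.input.dir; -vip; -; /data/data_wdc_600/arc; data input path
giraph.vertexInputFormatCla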

2020-07-23 10:38:05 31

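[Original] DiskBackedPartition

Why was DiskBackedPartition developed? The default Partition implementation is SimplePartition, which stores the mapping from vertex id to vertex for its partition in a ConcurrentMap<I, Vertex<I, V, E>> vertexMap; all of this information is held in memory. With large data volumes this leads to OOM. DiskBackedPartition is an implementation for large data volumes. Its main purpose is to make use of the local file system: if the partition holds a lot of data, it is serialized to disk. After serialization it then releases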

2020-07-20 11:08:40 26

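[Original] Giraph Aggregator Guide

Aggregator: An Aggregator performs an operation that aggregates over all vertices in a superstep. The aggregation can take many forms and is not always a sum: LongSumAggregator sums Long values, LongMinAggregator keeps only the minimum of all values, LongMaxAggregator keeps only the maximum, and LongProductAggregator keeps the product of all aggregated numbers. For example: LongProductAggregator longProductAggregator = ... longProductAg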

2020-07-16 16:41:21 28
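
The four aggregation behaviours named in the excerpt above (sum, minimum, maximum, product) can be illustrated with a minimal Python sketch. This is not the Giraph API: `AGGREGATE_OPS`, `aggregate`, and the sample `vertex_values` are hypothetical names and data chosen only to show what each kind of aggregator would keep after one superstep.

```python
from functools import reduce
import operator

# Hypothetical mapping from aggregator kind to the binary operation that
# combines two values (an illustration, not Giraph's aggregator classes).
AGGREGATE_OPS = {
    "sum": operator.add,      # what a sum aggregator accumulates
    "min": min,               # keeps only the smallest value
    "max": max,               # keeps only the largest value
    "product": operator.mul,  # keeps the product of all values
}


def aggregate(kind, values):
    """Fold the values contributed by all vertices with the chosen operation."""
    return reduce(AGGREGATE_OPS[kind], values)


# Assumed sample values, one per vertex, contributed in a single superstep.
vertex_values = [3, 7, 2]
results = {kind: aggregate(kind, vertex_values) for kind in AGGREGATE_OPS}
print(results)  # {'sum': 12, 'min': 2, 'max': 7, 'product': 42}
```

Each operation here is associative and commutative, so the order in which vertex values arrive does not change the result.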

[Original] The Message Process of Giraph

AbstractComputation.sendMessage /** * Send a message to a vertex id. * * @param id Vertex id to send the message to * @param message Message data to send */@Override public void sendMessage(I id, M2 message) { workerClientRequestProc

2020-06-17 14:12:27 76

[Original] How to use Guava MapMaker

guava.version: 21.0. MapMaker.makeMap: public <K, V> ConcurrentMap<K, V> makeMap() { if (!useCustomMap) { return new ConcurrentHashMap<K, V>(getInitialCapacity(), 0.75f, getConcurrencyLevel()); } return MapMakerInternalMap.

2020-06-17 09:58:18 73

[Original] The Process of Vertex Computation

A thread is created for a partition; processing a partition returns a PartitionStats object. ComputeCallable is a callable that processes a partition. ComputeCallable.call: There are three parts to ComputeCallable.call. public Collection<PartitionStats> cal

2020-06-16 17:30:49 51

[Original] The process of OutOfCoreCallable

Constructor of ServerData: In the constructor of ServerData, if USE_OUT_OF_CORE_GRAPH is set to true, oocEngine is created and partitionStore is wrapped using DiskBackedPartitionStore. PartitionStore<I, V, E> inMemoryPartitionStore = new Simple

2020-06-15 19:18:18 72

[Original] Giraph: The Process of Reading Vertices

Add local cache: The vertex reader reads a vertex and its edge information and first stores them in a local cache. Cache size: 629145. if (workerMessageSize >= maxVerticesSizePerWorker) { call WorkContext.sendMessageToWorker() } maxVerticesSizePerWorker: defau

2020-06-15 17:22:24 99

[Original] Java Garbage Collector (GC) Monitor

List<GarbageCollectorMXBean> mxBeans = ManagementFactory .getGarbageCollectorMXBeans(); final OutOfCoreEngine oocEngine = serviceWorker.getServerData().getOocEngine(); for (GarbageCollectorMXBean gcBean : mxBeans) { Noti

2020-06-15 08:41:27 58

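[Original] Testing Giraph with large data volumes

Algorithm: PageRank. Without Out of Core: data file of 244M, tested with -Xmx1G; the job fails quickly. With Out of Core: data file of 244M, tested with -Xmx1G, with giraph.useOutOfCoreGraph=true and giraph.useOutOfCoreMessages=true, the number of partitions set to 100, and giraph.maxPartitionsInMemory=2. Result: the job runs successfully. With Out of Core: data file of about 1G, tested with -Xmx1G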

2020-06-12 11:41:13 60

[Original] Giraph Partition Strategy

Add local cache: The vertex reader reads a vertex and its edge information and first stores them in a local cache. Cache size: 629145. if (workerMessageSize >= maxVerticesSizePerWorker) { sendOut() } maxVerticesSizePerWorker: default 512k. giraph.useOutOfCoreGrap

2020-06-12 11:22:10 51

[Original] Run Giraph-1.3.0 on Hadoop-2.5.1

Download and install Hadoop: Download hadoop-2.5.1 from Apache. tar -xzf hadoop-2.5.1.tar.gz. vim core-site.xml: <configuration><property> <name>fs.defaultFS</name> <value>hdfs://localhost:9000</value> &

2020-06-10 18:53:04 72

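[Original] The PageRank computation explained with a worked example

Initially, every node has a rank of 1/6. Round 1: take ripple as an example. Its rank is the value contributed by josh, 1/6 * (1/2) * 0.85, plus the base value 0.15/6. After all ranks have been computed, summing them gives 0.15 + 0.5 * 0.85 = 0.575. The leading 0.15 is the sum of the base values, 0.85 is (1 - alpha), and 0.5 is all of the contributing...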

2020-06-08 11:25:34 204
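
The first round described in the excerpt above can be reproduced with a short Python sketch. The excerpt names only josh and ripple, so the six-node graph below is an assumption: it is the common TinkerPop "modern" toy graph, chosen because josh has two out-edges and three nodes have out-edges (which yields the 0.5 of contributing rank). `OUT_EDGES` and `pagerank_round` are hypothetical names.

```python
ALPHA = 0.15  # base weight; the excerpt uses 1 - alpha = 0.85

# Assumed out-edges of a six-node toy graph (not given in the excerpt).
OUT_EDGES = {
    "marko": ["vadas", "josh", "lop"],
    "josh": ["ripple", "lop"],
    "peter": ["lop"],
    "vadas": [],
    "lop": [],
    "ripple": [],
}


def pagerank_round(ranks, out_edges, alpha=ALPHA):
    """One synchronous round: base value plus damped shares from in-neighbours."""
    n = len(ranks)
    new_ranks = {v: alpha / n for v in ranks}  # base value, 0.15/6 here
    for src, targets in out_edges.items():
        for dst in targets:
            # Each source splits its rank evenly over its out-edges.
            new_ranks[dst] += (1 - alpha) * ranks[src] / len(targets)
    return new_ranks


initial = {v: 1 / 6 for v in OUT_EDGES}
round1 = pagerank_round(initial, OUT_EDGES)
print(round(round1["ripple"], 4))      # 0.0958
print(round(sum(round1.values()), 4))  # 0.575
```

The total falls below 1 because the nodes without out-edges pass nothing on in this simple scheme; only 0.5 of the initial rank is redistributed.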

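[Original] Storage formats for sparse matrices (DOK, LIL, COO, CSR, CRS)

Storing sparse matrices: A matrix is usually stored in a two-dimensional array. An element a_{i,j} of the array is accessed through the index values i and j. Conventionally, i is the row index, numbered from top to bottom, and j is the column index, numbered from left to right. For an m × n matrix, the memory needed by this format is proportional to m × n. For a sparse matrix, storing only the non-zero data can save a great deal of memory. Depending on the number and distribution of the non-zero data, different data structures can be used. The trade-off is that accessing a single element becomes more complex and additional data structures are required. These data structures fall into two main groups: those that support efficient modification, such as the keyword ...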

2020-05-26 19:03:03 211
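
As a rough illustration of the trade-off mentioned in the excerpt above (less memory, but more work to reach a single element), here is a minimal pure-Python sketch that converts a coordinate list (COO) into compressed sparse row (CSR) arrays and looks up one element. The function names `coo_to_csr` and `csr_get` and the 3 × 4 sample matrix are assumptions made for the example, not taken from the post.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triples into CSR arrays (indptr, indices, data)."""
    # Visit the entries in row-major order.
    order = sorted(range(len(vals)), key=lambda k: (rows[k], cols[k]))
    indptr = [0] * (n_rows + 1)
    indices, data = [], []
    for k in order:
        indptr[rows[k] + 1] += 1  # count the entries in each row
        indices.append(cols[k])
        data.append(vals[k])
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]  # prefix sums give the row start offsets
    return indptr, indices, data


def csr_get(indptr, indices, data, i, j):
    """Return element (i, j); entries that are not stored are zero."""
    for k in range(indptr[i], indptr[i + 1]):
        if indices[k] == j:
            return data[k]
    return 0


# Assumed sample matrix:
# [[5, 0, 0, 0],
#  [0, 8, 0, 6],
#  [0, 0, 3, 0]]
indptr, indices, data = coo_to_csr(3, [0, 1, 1, 2], [0, 1, 3, 2], [5, 8, 6, 3])
print(indptr, indices, data)  # [0, 1, 3, 4] [0, 1, 3, 2] [5, 8, 6, 3]
print(csr_get(indptr, indices, data, 1, 3))  # 6
```

Only the four non-zero values and their positions are stored, at the cost of a scan over one row's entries for each lookup instead of a direct array access.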

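[Original] Pregel: a large-scale graph processing system

This article is not a translation of the original paper, but it covers all of the key points. To read the original paper, follow this link. 1. Introduction. 1.1 Why Pregel was developed: Writing a custom distributed program for every graph algorithm takes a great deal of work. Existing distributed computing platforms do not meet the needs of graph computation. MapReduce, for example, can handle very large data volumes, but its performance on graph computation is somewhat poor. Single-machine graph algorithms limit the size of the graphs that can be processed. Existing parallel graph computing systems have no fault tolerance, and fault tolerance is very important for big-data systems. The organization of the Pregel framework was inspired by the Bulk Synchronous Parallel model. Pregel's...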

2020-05-25 10:45:51 118

[Original] How to add Oracle JDBC driver in your Maven local repository

Here’s a simple guide to show you how to add an Oracle JDBC driver into your Maven local repository, and also how to reference it in pom.xml. Tested with Oracle database 19c and Java 8. Note: Due to Or...

2020-04-29 16:56:44 50

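[Original] How to write good project documentation

Introduction: Too many programmers (including many senior ones) cannot write documentation. Too many projects have no (complete) documentation. Even where documentation exists, is it up to standard? Do you have the right understanding of documentation? Can you write documentation? Is documentation optional for a software project? Contents: The importance of project documentation. The purpose of documentation: improve the efficiency of communication; improve the management of the "thinking process". More than 50% of the time in a project is spent on communication, so improving its efficiency is very important. Ways of communicating: spoken, ...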

2020-04-19 21:15:23 83

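[Original] Writing LaTeX math formulas in Markdown

To write math formulas with LaTeX in Markdown, put a $ on each side of the LaTeX expression. For example, $\alpha$ renders as α. Note that there must be at least one space before the first $, and there must be no space between the two $ signs and the expression. No space is needed after the second $. 1 Greek letters (the $ on both sides of each expression is omitted). Table columns: letter, expression (repeated across each row). α: \alpha; κ: \kappa...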

2020-04-15 16:08:39 63

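[Original] Eight essentials of writing technical tutorials and documentation

1. Think from the user's point of view: what they need, not what we have. 2. Teach in short segments, combining theory and practice. 3. Explain what something is, and also why. 4. Know ten things for every one you teach, and let added value drive use of the product. 5. Teach theory through examples, turning variables and formulas into real cases. 6. Be rigorous in logic (sensible order, coherent line of thought) and in structure (the pyramid principle). 7. Be witty and engaging; entertaining small cases give the article flavor. 8. A picture is worth a thousand words: illustrate complex theory, because combining text and figures explains more efficiently. ...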

2020-04-14 15:51:36 71

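[Original] Big Data Teahouse: the decision tree series

1. A chat about information entropy 2. What exactly is a decision tree? 3. Information gain, gain ratio, and the Gini index 4. Random forests (three cobblers together match Zhuge Liang) 5. AdaBoost turns out to be this simple 6. XGBoost, the heavy weapon of machine learning...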

2020-04-07 08:54:03 55

[Original] Andrew - Deep Learning - C4-Week2-1 Program Assignment - Keras tutorial - the Happy House

Welcome to the first assignment of week 2. In this assignment, you will: Learn to use Keras, a high-level neural networks API (programming framework), written in Python and capable of running on top ...

2020-04-05 15:50:34 87

[Original] Andrew - Deep Learning - C4-Week1-2 Program Assignment - Convolutional Neural Networks: Application

Convolutional Neural Networks: Application. Welcome to Course 4’s second assignment! In this notebook, you will: Implement helper functions that you will use when implementing a TensorFlow model. Imple...

2020-04-05 13:11:02 65

[Original] jni Java_com_sgx_jni_RaISVNative_decryptAndSealingPartData2 memory leak

The following code does not release memory correctly. JNIEXPORT void JNICALL Java_com_baidu_dragonshare_sgx_jni_RaISVNative_decryptAndSealingPartData2 (JNIEnv *env, jobject thisObj, jobject j_encl...

2020-04-05 13:08:20 39

[Original] Andrew - Deep Learning - C4-Week1-1 Program Assignment - Convolutional Neural Networks: Step by Step

Convolutional Neural Networks: Step by Step. Welcome to Course 4’s first assignment! In this assignment, you will implement convolutional (CONV) and pooling (POOL) layers in numpy, including both forwa...

2020-03-28 11:55:09 65

[Original] C2-Week 2 Quiz - Autonomous driving (case study)

Week 2 Quiz - Autonomous driving (case study). You are just getting started on this project. What is the first thing you do? Assume each of the steps below would take about an equal amount of time (a...

2020-03-27 17:42:48 140

[Original] C3 Week 1 Quiz - Bird recognition in the city of Peacetopia (case study)

Week 1 Quiz - Bird recognition in the city of Peacetopia (case study). Having three evaluation metrics makes it harder for you to quickly choose between two different algorithms, and will slow down t...

2020-03-27 10:57:12 468

[Original] C2-Week3 Program Assignment - TensorFlow Tutorial

TensorFlow Tutorial. Welcome to this week’s programming assignment. Until now, you’ve always used numpy to build neural networks. Now we will step you through a deep learning framework that will allow ...

2020-03-26 10:21:04 68

[Original] C2-Week1 Program Assignment (3 of 3) - Gradient Checking

Gradient Checking. Welcome to the final assignment for this week! In this assignment you will learn to implement and use gradient checking. You are part of a team working to make mobile payments availa...

2020-03-26 08:40:43 69

[Original] C2-Week2 Program Assignment - Optimization Methods

Optimization Methods. Until now, you’ve always used Gradient Descent to update the parameters and minimize the cost. In this notebook, you will learn more advanced optimization methods that can speed u...

2020-03-25 16:21:20 77

[Original] C2-Week1 Program Assignment (2 of 3) - Regularization

Regularization. Welcome to the second assignment of this week. Deep Learning models have so much flexibility and capacity that overfitting can be a serious problem, if the training dataset is not big e...

2020-03-25 10:46:15 66

[Original] C2-Week1 Program Assignment (1 of 3) - Initialization

Initialization. Welcome to the first assignment of “Improving Deep Neural Networks”. Training your neural network requires specifying an initial value of the weights. A well chosen initialization metho...

2020-03-24 10:03:06 66

[Original] C2 - Week 3 Quiz - Hyperparameter tuning, Batch Normalization, Programming Frameworks

Week 3 Quiz - Hyperparameter tuning, Batch Normalization, Programming Frameworks. If searching among a large number of hyperparameters, you should try values in a grid rather than random values, so t...

2020-03-24 08:58:20 251

[Original] C2-Week 2 Quiz - Optimization algorithms

Week 2 Quiz - Optimization algorithms. Which notation would you use to denote the 3rd layer’s activations when the input is the 7th example from the 8th minibatch? a^[3]{8}(7). Note: [i]{j}(k) super...

2020-03-23 09:47:04 196

[Original] C2-Week 1 Quiz - Practical aspects of deep learning

Week 1 Quiz - Practical aspects of deep learning. If you have 10,000,000 examples, how would you split the train/dev/test set? 98% train, 1% dev, 1% test. The dev and test set should: Come from...

2020-03-23 09:30:54 200

[Original] C1-Week4 Program Assignment (part 2 of 2): Deep Neural Network for Image Classification: Application

Deep Neural Network for Image Classification: Application. When you finish this, you will have finished the last programming assignment of Week 4, and also the last programming assignment of this cours...

2020-03-23 08:29:46 78

[Original] C1-Week4 Program Assignment (part 1 of 2): Building your Deep Neural Network: Step by Step

Building your Deep Neural Network: Step by Step. Welcome to your week 4 assignment (part 1 of 2)! You have previously trained a 2-layer Neural Network (with a single hidden layer). This week, you will ...

2020-03-20 09:57:10 117

[Original] Pitfalls encountered with Python notebooks

1. Error when importing a local file: from utils.testCases import * --------------------------------------------------------------------------- ModuleNotFoundError Traceback (most recent call last...

2020-03-19 10:25:25 109

[Original] C1-Week 3 Program: Planar data classification with one hidden layer

Planar data classification with one hidden layer. Welcome to your week 3 programming assignment. It’s time to build your first neural network, which will have a hidden layer. You will see a big differe...

2020-03-17 15:26:39 84

[Original] C1-Week2 Program Assignment: Logistic Regression with a Neural Network mindset

Logistic Regression with a Neural Network mindset. Welcome to your first (required) programming assignment! You will build a logistic regression classifier to recognize cats. This assignment will step...

2020-03-17 09:50:00 56
