[Original] Nginx log to Elasticsearch

nginx-es.conf
input {
  file {
    path => "/opt/logtest/nginx_access.log.1"
    start_position => "beginning"
    sincedb_path => "/opt/logstash-2.3.4/sincedb/"
  }
...

2016-08-03 17:39:22 315

[Original] Install Logstash and Sample Conf

1. Download
# wget https://download.elastic.co/logstash/logstash/logstash-2.3.4.tar.gz
# tar -xzf logstash-2.3.4.tar.gz
# cd logstash-2.3.4
# ./bin/logstash-plugin install logstash-output-webhdfs...

2016-08-01 11:05:52 306

[Original] High-quality blogs on big data mining

https://pkghosh.wordpress.com/2012/09/03/from-item-correlation-to-rating-prediction/
https://pkghosh.wordpress.com/?s=recommendation
sifarish: https://github.com/pranab/sifarish

2016-07-29 14:20:42 426

[Original] Storm: monitor Storm with supervisor

# yum install supervisor
# vi /etc/supervisord.conf
[program:storm-supervisor]
command=/opt/apache-storm-0.9.3/bin/storm supervisor
user=root
autostart=true
autorestart=true
startsecs=10
st...

2015-09-02 15:58:29 234

[Original] Solr: 5.2.1 install and config

1. Upload solr-5.2.1.tgz and install_solr_service.sh to the same directory
2. # install_solr_service.sh solr-5.2.1.tgz
3. # cd /var/solr/
   # vi solr.in.sh
   Modify Solr's JVM configuration:
   #SOLR_HEAP="10...

2015-09-01 18:50:15 160

[Original] Solr: indexing products and prices for sellers, then querying and sorting

In my current project, the seller model has multiple products, each with a price. I want to index the products, query them, and then sort them by price, seller's credit, the distance between the seller and ...

2015-08-25 16:58:53 149

[Original] Top ML software

http://www.predictiveanalyticstoday.com/top-free-software-for-text-analysis-text-mining-text-analytics/

2015-08-05 15:02:02 182

[Original] Curator: delay queue

curator http://curator.apache.org/curator-client/index.html

2015-08-03 16:15:07 142

[Original] Mac shortcut keys

http://my.oschina.net/leejan97/blog/214112?p=1

2015-07-15 10:38:36 152

[Original] MATLAB install on Ubuntu

http://blog.csdn.net/lanbing510/article/details/41698285

2015-07-10 13:59:05 146

[Original] Solr: Using FunctionQuery in Solr Sort Syntax

In my project, I ran into a problem similar to this one:
http://stackoverflow.com/questions/27701533/using-functionquery-in-solr-sort-syntax
I want to sort my documents by a custom score using function ...

2015-07-07 17:36:48 196
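
The excerpt above is cut off before it shows the author's solution, so the following is only a minimal sketch of the general idea: Solr's `sort` parameter accepts a function expression followed by `asc` or `desc`. The host, the core name `products`, and the fields `popularity` and `price` are hypothetical and are not taken from the post.

```python
from urllib.parse import urlencode


def build_sorted_query(base_url, query, sort_function, direction="desc", rows=10):
    """Build a Solr /select URL whose results are ordered by a function query."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    params = {
        "q": query,
        # The sort value is "<function expression> <direction>".
        "sort": f"{sort_function} {direction}",
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and fields: rank documents by popularity per unit price.
url = build_sorted_query(
    "http://localhost:8983/solr/products",
    "*:*",
    "div(popularity,price)",
)
print(url)
```

`urlencode` takes care of escaping the parentheses, comma, and space, so the function expression can be written in its readable form.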

[Original] Ubuntu: common errors

When running
# sudo update-manager
error:
solution:
sudo apt-get update && sudo apt-get dist-upgrade
---------
Update the Firefox Flash plugin
# tar -xzf install_flash_player_11_linux.x86_6...

2015-07-07 09:53:15 147

[Original] Solr: integrate carrot2 with solr-5.1.0

I had already integrated carrot2 with solr-4.x successfully, using my customized Chinese tokenizer. But I ran into some errors when following my series of blog posts http://ylzhj02.iteye.com/blog/2152348 to adopt ca...

2015-07-01 10:42:22 174

[Original] Solr: Spatial Search

1. schema
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers"/>...

2015-06-26 14:59:54 343
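
The excerpt above shows only the field type definition. As a hedged sketch of how such a field is typically queried, Solr's `geofilt` filter keeps documents within a given distance of a point. The field name `location` and the coordinates are hypothetical; the distance is in kilometers, matching `distanceUnits` in the schema line.

```python
from urllib.parse import urlencode


def geofilt_params(field, lat, lon, radius_km, query="*:*"):
    """Return Solr request parameters that filter by distance from a point."""
    return {
        "q": query,
        # {!geofilt} keeps documents whose `field` lies within `d` of `pt`.
        "fq": f"{{!geofilt sfield={field} pt={lat},{lon} d={radius_km}}}",
    }


# Hypothetical field and coordinates.
params = geofilt_params("location", 45.15, -93.85, 5)
print(params["fq"])
print(urlencode(params))
```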

[Original] Solr: Synonym Query

1. Configure schema.xml
<fieldtype name="text_ch" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.lionsoul.jcseg.analyzer.JcsegTokenizerFactory" mode="...

2015-06-18 17:59:03 201

[Original] Solr: Install Solr in production

1. Download solr-5.2.1.tgz
2. Install
# tar xzf solr-5.2.1.tgz solr-5.2.1/bin/install_solr_service.sh --strip-components=2
# ./install_solr_service.sh solr-5.2.1.tgz
3. Check Solr status
# servi...

2015-06-17 16:31:04 129

[Original] Solr: Tika with OCR engine

I want to parse the content of a JPG picture, not just its metadata. The following code is the test class:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
impo...

2015-06-12 15:03:35 482

[Original] Solr: Install tesseract-ocr

Install dependency
# tar -xjf leptonica-1.69.tar.bz2
# cd leptonica-1.69
# ./configure
# make -j4
# sudo make install
--------------------------
Download tesseract-ocr-3.02.02.tar.gz
# tar -xzf  t...

2015-06-11 16:35:45 157

[Original] Understanding information content with Apache Tika

www.ibm.com/developerworks/cn/opensource/tutorials/os-apache-tika/
http://www.tutorialspoint.com/tika/tika_quick_guide.htm

2015-06-09 16:53:20 125

[Original] Android: push notifications

References
http://www.cnblogs.com/hanyonglu/archive/2012/03/04/2378971.html

2015-06-08 16:58:08 112

[Original] Neo4j: Create multiple relationships between the same two nodes

In my case, I want to build an address book in Neo4j in which a person has multiple cellphones, and some cellphones may store the same contact with the same phone number but different nicknames, such as us...

2015-06-03 14:40:54 229

[Original] Jubatus: Data conversion

http://jubat.us/en/fv_convert.html

2015-05-28 15:38:57 163

[Original] Jubatus: Setup in Distributed Mode

References
http://jubat.us/en/tutorial_distributed.html
http://jubat.us/en/admin.html

2015-05-28 14:17:00 118

[Original] Jubatus: Classify Example

1. Create a mvn project with pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0...

2015-05-28 10:26:30 179

[Original] Storm: Trident-ML realtime ML

https://github.com/pmerienne/trident-ml

2015-05-26 14:25:49 147

[Original] Jubatus: Realtime online ML Introduction

http://jubat.us/en/overview.html

2015-05-26 14:24:47 147

[Original] Neo4j: Remote RESTful API (Java)

# git clone https://github.com/neo4j-contrib/java-rest-binding.git
# git tag -l
# git checkout neo4j-rest-graphdb-2.0.1
# mvn clean install
In the mvn project's pom.xml, add
<dependency> <...

2015-05-25 14:29:01 494

[Original] Solr: Using SolrJ to operate Solr

References
http://www.solrtutorial.com/solrj-tutorial.html
https://cwiki.apache.org/confluence/display/solr/Using+SolrJ

2015-05-22 13:29:12 110

[Original] Flume: morphline sink with Solr 5.1.0

1. Download the Flume 1.5.2 source code and change the Solr version to 5.1.0
2. Compile and install
3. Copy the Solr 4.10.1 related jars to the lib dir to solve this error:
CloudSolrServer' (current frame, stack[2])...

2015-05-21 16:38:37 195

[Original] Storm: Trident fields and tuples

https://storm.apache.org/documentation/Trident-tutorial.html The Trident data model is the TridentTuple which is a named list of values. During a topology, tuples are incrementally built up throu...

2015-04-28 10:14:54 122
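
The Trident excerpt above describes a tuple as a named list of values that is built up incrementally. The toy model below illustrates that idea in plain Python. It is not Storm's API: the class `NamedValues` and the helper `apply_function` are hypothetical, and the sketch assumes that an operation reads some input fields and appends its output fields to the tuple.

```python
class NamedValues:
    """Toy stand-in for a tuple that is a named list of values."""

    def __init__(self, fields, values):
        if len(fields) != len(values):
            raise ValueError("fields and values must have the same length")
        self.fields = list(fields)
        self.values = list(values)

    def get(self, field):
        """Look up a single value by its field name."""
        return self.values[self.fields.index(field)]

    def select(self, input_fields):
        """Return the values of the requested fields, in the requested order."""
        return [self.get(field) for field in input_fields]

    def append(self, new_fields, new_values):
        """Return a new tuple with extra named values added at the end."""
        return NamedValues(self.fields + list(new_fields),
                           self.values + list(new_values))


def apply_function(tup, input_fields, func, output_fields):
    """Run func on the selected inputs and append its outputs to the tuple."""
    outputs = func(*tup.select(input_fields))
    return tup.append(output_fields, outputs)


t = NamedValues(["x", "y"], [2, 3])
t2 = apply_function(t, ["x", "y"], lambda x, y: [x + y], ["sum"])
print(t2.fields, t2.values)
```

The original tuple `t` is left unchanged; each operation produces a longer tuple, which is the "incrementally built up" behavior the excerpt mentions.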

[Original] High-quality PPT online

http://www.slideshare.net/yuhuang/large-scale-machine-learning-for-big-data

2015-04-24 15:33:21 116

[Original] Spark: Spark Streaming

Spark Streaming uses a “micro-batch” architecture, where the streaming computation is treated as a continuous series of batch computations on small batches of data. Spark Streaming receives data fro...

2015-04-22 16:02:40 145
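
The micro-batch idea in the Spark Streaming excerpt above can be illustrated without Spark. In this minimal sketch a finite in-memory list of `(timestamp, line)` pairs stands in for the stream, and the names `micro_batches` and `word_count` are hypothetical; it shows only the grouping of a stream into small batches that are each processed by ordinary batch code.

```python
from collections import Counter


def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length batches."""
    batches = {}
    for timestamp, value in events:
        # Events whose timestamps fall in the same interval share a batch.
        index = int(timestamp // batch_interval)
        batches.setdefault(index, []).append(value)
    for index in sorted(batches):
        yield index, batches[index]


def word_count(batch):
    """A plain batch computation, applied to one small batch at a time."""
    return Counter(word for line in batch for word in line.split())


events = [(0.2, "a b"), (0.9, "a"), (1.1, "b c"), (2.5, "c c")]
for index, batch in micro_batches(events, batch_interval=1.0):
    print(index, dict(word_count(batch)))
```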

[Original] Spark: cluster architecture

In distributed mode, Spark uses a master/slave architecture with one central coordinator and many distributed workers. The central coordinator is called the driver. The driver communicates with a p...

2015-04-22 10:51:33 158
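
The coordinator/worker split described in the excerpt above can be mimicked on one machine. This is a rough analogy, not Spark code: local threads stand in for distributed workers, and the names `split`, `worker_task`, and `driver` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, num_partitions):
    """Divide the data into roughly equal partitions, one per worker."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def worker_task(partition):
    """Work done independently by each worker on its own partition."""
    return sum(x * x for x in partition)


def driver(data, num_workers=3):
    """Central coordinator: hand out partitions, then combine partial results."""
    partitions = split(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(worker_task, partitions))
    return sum(partial_results)


print(driver(list(range(10))))
```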

[Original] Spark: deploy cluster in standalone mode

Hosts: 192.168.0.135 192.168.0.136 192.168.0.137
master: 137  workers: 135 136
1. Install Spark on all hosts in the /opt dir
2. Install SSH remote access
137# ssh-keygen
137# ssh-copy-id -i ~/.s...

2015-04-20 12:32:56 134

[Original] Spark: Cluster Mode Overview

https://spark.apache.org/docs/latest/cluster-overview.html This document gives a short overview of how Spark runs on clusters, to make it easier to understand the components involved. Read throug...

2015-04-20 10:15:03 139

[Original] Flume: Avro source and sink

In order to flow the data across multiple agents or hops, the sink of the previous agent and source of the current hop need to be avro type with the sink pointing to the hostname (or IP address) and ...

2015-04-17 11:12:42 125

[Original] Flume: HBase sink

flume.conf
a1.sinks.hbase-sink1.channel = ch1
a1.sinks.hbase-sink1.type = hbase
a1.sinks.hbase-sink1.table = users
a1.sinks.hbase-sink1.columnFamily = info
a1.sinks.hbase-sink1.serializer = org.ap...

2015-04-16 17:04:38 243

[Original] Kite: Morphlines Introduction

http://kitesdk.org/docs/1.0.0/morphlines/
http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/

2015-04-13 11:09:08 216

[Original] Kite: A Data API for Hadoop

http://kitesdk.org/docs/current/ 

2015-04-13 11:04:27 442

[Original] Neo4j: full-text search

Model
@Indexed(indexType = IndexType.FULLTEXT, indexName = "TaskTile")
private String title;
Repository
@Query("START n=node:TaskTile({0}) return n")
Iterable<Task> fin...

2015-04-08 15:03:53 377

Hadoop in Action

Hadoop in Action is a book suited to beginners who want to learn this popular distributed processing technology.

2014-11-24
