Original: Nginx log to Elasticsearch

nginx-es.conf
input {
  file {
    path => "/opt/logtest/nginx_access.log.1"
    start_position => "beginning"
    sincedb_path => "/opt/logstash-2.3.4/sincedb/"
  }
...

2016-08-03 17:39:22 255

Original: Install Logstash And Sample Conf

1. Download
# wget https://download.elastic.co/logstash/logstash/logstash-2.3.4.tar.gz
# tar -xzf logstash-2.3.4.tar.gz
# cd logstash-2.3.4
# ./bin/logstash-plugin install logstash-output-webhdfs...

2016-08-01 11:05:52 240

Original: High-quality blogs on big data mining

https://pkghosh.wordpress.com/2012/09/03/from-item-correlation-to-rating-prediction/
https://pkghosh.wordpress.com/?s=recommendation
sifarish: https://github.com/pranab/sifarish

2016-07-29 14:20:42 357

Original: Storm: monitor storm with supervisor

# yum install supervisor
# vi /etc/supervisord.conf
[program:storm-supervisor]
command=/opt/apache-storm-0.9.3/bin/storm supervisor
user=root
autostart=true
autorestart=true
startsecs=10
st...

2015-09-02 15:58:29 187

Original: Solr: 5.2.1 install and config

1. Upload solr-5.2.1.tgz and install_solr_service.sh to the same directory
2. # install_solr_service.sh solr-5.2.1.tgz
3. # cd /var/solr/
   # vi solr.in.sh
   Modify Solr's JVM configuration
   #SOLR_HEAP="10...

2015-09-01 18:50:15 123

Original: Solr: indexing products and prices for sellers, then performing queries and sorting

In my current project, the seller model has multiple products, each with a price. I want to index the products, query them, and then sort them by price, the seller's credit, the distance between the seller and ...

2015-08-25 16:58:53 119
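
The multi-criteria sort described in this preview could be expressed as Solr request parameters roughly as follows. This is only a sketch under assumptions, not the post's actual solution: the field names `name`, `price`, `seller_credit`, and `location` are hypothetical, and the code relies on Solr's standard `sort`, `sfield`, and `pt` parameters together with the `geodist()` function. It builds a query string and does not contact a server.

```python
from urllib.parse import urlencode, parse_qs


def build_product_query(keyword, lat, lon):
    """Build Solr select parameters sorted by price, seller credit, then distance.

    All field names here (name, price, seller_credit, location) are
    hypothetical; replace them with the fields defined in your schema.
    """
    params = {
        "q": f"name:{keyword}",
        # geodist() reads the spatial field from sfield and the origin from pt.
        "sfield": "location",
        "pt": f"{lat},{lon}",
        # Cheapest first, then highest seller credit, then nearest seller.
        "sort": "price asc, seller_credit desc, geodist() asc",
        "wt": "json",
    }
    return urlencode(params)


query_string = build_product_query("phone", 39.9, 116.4)
print(query_string)
```

The resulting string can be appended to a core's `/select` endpoint. Sort clauses are applied left to right, so later clauses only break ties left by earlier ones.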

Original: Top ML software

http://www.predictiveanalyticstoday.com/top-free-software-for-text-analysis-text-mining-text-analytics/

2015-08-05 15:02:02 144

Original: Curator: delay queue

curator http://curator.apache.org/curator-client/index.html

2015-08-03 16:15:07 102

Original: Mac shortcut keys

http://my.oschina.net/leejan97/blog/214112?p=1

2015-07-15 10:38:36 125

Original: MATLAB install on Ubuntu

http://blog.csdn.net/lanbing510/article/details/41698285

2015-07-10 13:59:05 119

Original: Solr: Using FunctionQuery in SOLR Sort Syntax

In my project, I ran into a problem similar to http://stackoverflow.com/questions/27701533/using-functionquery-in-solr-sort-syntax I want to sort my documents by a custom score using function ...

2015-07-07 17:36:48 156

Original: Ubuntu: common errors

When running
# sudo update-manager
error:
solution:
sudo apt-get update && sudo apt-get dist-upgrade
---------
Update Firefox Flash plugin
# tar -xzf install_flash_player_11_linux.x86_6...

2015-07-07 09:53:15 114

Original: Solr: integrate carrot2 with solr-5.1.0

I had already integrated carrot2 with solr-4.x successfully, using my customized Chinese tokenizer. But I ran into some errors when following my series of blog posts at http://ylzhj02.iteye.com/blog/2152348 to adopt ca...

2015-07-01 10:42:22 138

Original: Solr: Spatial Search

1. schema
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers"/>

2015-06-26 14:59:54 300

Original: Solr: Synonym Query

1. Configure schema.xml
<fieldtype name="text_ch" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.lionsoul.jcseg.analyzer.JcsegTokenizerFactory" mode="...

2015-06-18 17:59:03 164

Original: Solr: Install solr to production

1. Download solr-5.2.1.tgz
2. Install
# tar xzf solr-5.2.1.tgz solr-5.2.1/bin/install_solr_service.sh --strip-components=2
# ./install_solr_service.sh solr-5.2.1.tgz
3. Check Solr status
# servi...

2015-06-17 16:31:04 96

Original: Solr: Tika with OCR engine

I want to parse the content, not just the metadata, of a JPG picture. The following code is the test class:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
impo...

2015-06-12 15:03:35 381

Original: Solr: Install tesseract-ocr

Install dependency
# tar -xjf leptonica-1.69.tar.bz2
# cd leptonica-1.69
# ./configure
# make -j4
# sudo make install
--------------------------
Download tesseract-ocr-3.02.02.tar.gz
# tar -xzf t...

2015-06-11 16:35:45 112

Original: Understanding information content with Apache Tika

www.ibm.com/developerworks/cn/opensource/tutorials/os-apache-tika/
http://www.tutorialspoint.com/tika/tika_quick_guide.htm

2015-06-09 16:53:20 94

Original: Android: Push notifications

Preferences http://www.cnblogs.com/hanyonglu/archive/2012/03/04/2378971.html

2015-06-08 16:58:08 75

Original: Neo4j: Create multiple relationships between the same two nodes

In my case, I want to build an address book in Neo4j in which a person has multiple cellphones, and some cellphones may hold the same contact with the same phone number but different nicknames, such as us...

2015-06-03 14:40:54 186

Original: Jubatus: Data conversion

http://jubat.us/en/fv_convert.html

2015-05-28 15:38:57 129

Original: Jubatus: Setup in Distributed Mode

References
http://jubat.us/en/tutorial_distributed.html
http://jubat.us/en/admin.html

2015-05-28 14:17:00 82

Original: Jubatus: Classify Example

1. Create a mvn project with pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0...

2015-05-28 10:26:30 144

Original: Storm: Trident-ML realtime ML

https://github.com/pmerienne/trident-ml

2015-05-26 14:25:49 120

Original: Jubatus: Realtime online ML Introduction

http://jubat.us/en/overview.html

2015-05-26 14:24:47 105

Original: Neo4j: Remote RESTful API (Java)

# git clone https://github.com/neo4j-contrib/java-rest-binding.git
# git tag -l
# git checkout neo4j-rest-graphdb-2.0.1
# mvn clean install
In the mvn project's pom.xml, add
<dependency> <...

2015-05-25 14:29:01 432

Original: Solr: Using SolrJ to operate Solr

References
http://www.solrtutorial.com/solrj-tutorial.html
https://cwiki.apache.org/confluence/display/solr/Using+SolrJ

2015-05-22 13:29:12 76

Original: Flume: morphline sink with solr 5.1.0

1. Download the Flume 1.5.2 source code and change the Solr version to 5.1.0
2. Compile and install
3. Copy the Solr 4.10.1 related jars to the lib dir to solve this error:
CloudSolrServer' (current frame, stack[2])...

2015-05-21 16:38:37 154

Original: Storm: Trident Fields and tuples

https://storm.apache.org/documentation/Trident-tutorial.html The Trident data model is the TridentTuple which is a named list of values. During a topology, tuples are incrementally built up throu...

2015-04-28 10:14:54 87
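
The idea of a tuple as a named list of values that grows as it moves through a topology can be illustrated with a small conceptual sketch. This is plain Python, not the Storm or Trident API: the helper names `each` and `project` merely echo Trident's operation names, and the field names and sample function are hypothetical.

```python
def each(tuples, input_fields, function, output_fields):
    """Run function on the selected input fields and append its outputs as new fields.

    The original fields are kept, so every emitted tuple is the input tuple
    extended with the newly named values.
    """
    result = []
    for t in tuples:
        args = [t[name] for name in input_fields]
        for emitted in function(*args):
            extended = dict(t)
            extended.update(zip(output_fields, emitted))
            result.append(extended)
    return result


def project(tuples, keep_fields):
    """Keep only the named fields, dropping everything else."""
    return [{name: t[name] for name in keep_fields} for t in tuples]


def add_and_multiply(a, b):
    # Hypothetical function: emits one (sum, product) pair per input tuple.
    yield (a + b, a * b)


stream = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]
extended = each(stream, ["x", "y"], add_and_multiply, ["sum", "product"])
projected = project(extended, ["x", "sum"])
print(extended)
print(projected)
```

Each tuple is modeled as a dict here so that fields stay addressable by name. A function that emits nothing for a given input drops that tuple from the result.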

Original: High-quality PPT online

http://www.slideshare.net/yuhuang/large-scale-machine-learning-for-big-data

2015-04-24 15:33:21 85

Original: Spark: Spark Streaming

Spark Streaming uses a “micro-batch” architecture, where the streaming computation is treated as a continuous series of batch computations on small batches of data. Spark Streaming receives data fro...

2015-04-22 16:02:40 106
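
The micro-batch idea in this preview can be sketched without Spark at all. The following plain-Python illustration is not the Spark Streaming API: the function names, the sample events, and the one-second interval are assumptions chosen for the example. It slices timestamped events into fixed-length intervals and runs an ordinary batch job on each slice.

```python
from collections import defaultdict


def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length batches.

    Intervals that received no events are skipped in this simplified sketch.
    """
    batches = defaultdict(list)
    for timestamp, value in events:
        batch_index = int(timestamp // batch_interval)
        batches[batch_index].append(value)
    return [batches[i] for i in sorted(batches)]


def process_stream(events, batch_interval, batch_job):
    """Treat the stream as a series of small batch computations."""
    return [batch_job(batch) for batch in micro_batches(events, batch_interval)]


# Hypothetical events: (arrival time in seconds, value).
events = [(0.2, 1), (0.7, 2), (1.1, 3), (2.5, 4), (2.9, 5)]
sums = process_stream(events, batch_interval=1.0, batch_job=sum)
print(sums)  # prints [3, 3, 9]
```

A shorter interval lowers latency but produces more, smaller batches, which is the basic trade-off of the micro-batch approach.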

Original: Spark: cluster architecture

In distributed mode, Spark uses a master/slave architecture with one central coordinator and many distributed workers. The central coordinator is called the driver. The driver communicates with a p...

2015-04-22 10:51:33 126

Original: Spark: deploy cluster in standalone mode

Hosts: 192.168.0.135 192.168.0.136 192.168.0.137
master: 137, workers: 135 136
1. Install Spark on all hosts in the /opt dir
2. Install SSH remote access
137# ssh-keygen
137# ssh-copy-id -i ~/.s...

2015-04-20 12:32:56 103

Original: Spark: Cluster Mode Overview

https://spark.apache.org/docs/latest/cluster-overview.html This document gives a short overview of how Spark runs on clusters, to make it easier to understand the components involved. Read throug...

2015-04-20 10:15:03 110

Original: Flume: avro source and sink

In order to flow the data across multiple agents or hops, the sink of the previous agent and source of the current hop need to be avro type with the sink pointing to the hostname (or IP address) and ...

2015-04-17 11:12:42 100

Original: Flume: hbase sink

flume.conf
a1.sinks.hbase-sink1.channel = ch1
a1.sinks.hbase-sink1.type = hbase
a1.sinks.hbase-sink1.table = users
a1.sinks.hbase-sink1.columnFamily = info
a1.sinks.hbase-sink1.serializer = org.ap...

2015-04-16 17:04:38 206

Original: Kite: Morphlines Introduction

http://kitesdk.org/docs/1.0.0/morphlines/
http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/

2015-04-13 11:09:08 154

Original: Kite: A Data API for Hadoop

http://kitesdk.org/docs/current/ 

2015-04-13 11:04:27 403

Original: Neo4j: fulltext search

Model
@Indexed(indexType = IndexType.FULLTEXT, indexName = "TaskTile")
private String title;
Repository
@Query("START n=node:TaskTile({0}) return n")
Iterable<Task> fin...

2015-04-08 15:03:53 337

Hadoop in Action

Hadoop in Action is a book suited to beginners who want to study this popular distributed processing technology.

2014-11-24
