Nginx log to Elasticsearch

nginx-es.conf
input {
  file {
    path => "/opt/logtest/nginx_access.log.1"
    start_position => "beginning"
    sincedb_path => "/opt/logstash-2.3.4/sincedb/"
  }
...

2016-08-03

Install Logstash And Sample Conf

1. Download
# wget https://download.elastic.co/logstash/logstash/logstash-2.3.4.tar.gz
# tar -xzf logstash-2.3.4.tar.gz
# cd logstash-2.3.4
# ./bin/logstash-plugin install logstash-output-webhdfs...

2016-08-01

High-quality blogs on big data mining

https://pkghosh.wordpress.com/2012/09/03/from-item-correlation-to-rating-prediction/
https://pkghosh.wordpress.com/?s=recommendation
sifarish: https://github.com/pranab/sifarish

2016-07-29

Storm: monitor storm with supervisor

# yum install supervisor
# vi /etc/supervisord.conf
[program:storm-supervisor]
command=/opt/apache-storm-0.9.3/bin/storm supervisor
user=root
autostart=true
autorestart=true
startsecs=10
st...

2015-09-02

Solr: 5.2.1 install and config

1. Upload solr-5.2.1.tgz and install_solr_service.sh to the same directory.
2. # install_solr_service.sh solr-5.2.1.tgz
3. # cd /var/solr/
   # vi solr.in.sh
   Modify Solr's JVM configuration:
   #SOLR_HEAP="10...

2015-09-01

Solr: indexing products and prices for sellers, then querying and sorting

In my current project, the seller model has multiple products, each with a price. I want to index the products, query them, and then sort them by price, the seller's credit, the distance between the seller and ...

2015-08-25

Top ML software


2015-08-05

Curator: delay queue

curator http://curator.apache.org/curator-client/index.html

2015-08-03

Mac shortcut keys


2015-07-15

MATLAB install on Ubuntu


2015-07-10

Solr: Using FunctionQuery in SOLR Sort Syntax

In my project, I ran into a problem similar to this one (http://stackoverflow.com/questions/27701533/using-functionquery-in-solr-sort-syntax). I want to sort my documents by a custom score using function ...

2015-07-07

Ubuntu: common errors

When running
# sudo update-manager
error:
solution:
sudo apt-get update && sudo apt-get dist-upgrade
---------
Update the Firefox Flash plugin
# tar -xzf install_flash_player_11_linux.x86_6...

2015-07-07

Solr: integrate carrot2 with solr-5.1.0

I had already integrated carrot2 with solr-4.x successfully, using my customized Chinese tokenizer. But I ran into some errors when following my series of blog posts (http://ylzhj02.iteye.com/blog/2152348) to adopt ca...

2015-07-01

Solr: Spatial Search

1. schema
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers"/>

2015-06-26

Solr: Synonym Query

1. Configure schema.xml
<fieldtype name="text_ch" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.lionsoul.jcseg.analyzer.JcsegTokenizerFactory" mode="...

2015-06-18

Solr: Install solr to production

1. Download solr-5.2.1.tgz
2. Install
# tar xzf solr-5.2.1.tgz solr-5.2.1/bin/install_solr_service.sh --strip-components=2
# ./install_solr_service.sh solr-5.2.1.tgz
3. Check Solr status
# servi...

2015-06-17

Solr: Tika with OCR engine

I want to parse the content, not just the metadata, of a JPG picture. The following code is the test class:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
impo...

2015-06-12

Solr: Install tesseract-ocr

Install dependency
# tar -xjf leptonica-1.69.tar.bz2
# cd leptonica-1.69
# ./configure
# make -j4
# sudo make install
--------------------------
Download tesseract-ocr-3.02.02.tar.gz
# tar -xzf t...

2015-06-11

Understanding information content with Apache Tika

www.ibm.com/developerworks/cn/opensource/tutorials/os-apache-tika/ http://www.tutorialspoint.com/tika/tika_quick_guide.htm

2015-06-09

Android: push messaging


2015-06-08

Neo4j: Create multiple relationships between the same two nodes

In my case, I want to build an address book in Neo4j, in which a person has multiple cellphones, and some cellphones may store the same contact with the same phone number but different nicknames, such as us...

2015-06-03

Jubatus: Data conversion


2015-05-28

Jubatus: Setup in Distributed Mode


2015-05-28

Jubatus: Classify Example

1. Create a mvn project with pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0...

2015-05-28

Storm: Trident-ML realtime ML


2015-05-26

Jubatus: Realtime online ML Introduction


2015-05-26

Neo4j: Remote Restful API (java)

# git clone https://github.com/neo4j-contrib/java-rest-binding.git
# git tag -l
# git checkout neo4j-rest-graphdb-2.0.1
# mvn clean install
In the mvn project's pom.xml, add
<dependency> <...

2015-05-25

Solr: Using SolrJ to operate Solr


2015-05-22

Flume: morphline sink with solr 5.1.0

1. Download the Flume 1.5.2 source code and change the Solr version to 5.1.0
2. Compile and install
3. Copy the Solr 4.10.1 related jars to the lib dir to solve this error:
CloudSolrServer' (current frame, stack[2])...

2015-05-21

Storm: Trident Fields and tuples

https://storm.apache.org/documentation/Trident-tutorial.html The Trident data model is the TridentTuple which is a named list of values. During a topology, tuples are incrementally built up throu...
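
The idea of a tuple as a named list of values that grows as it passes through operations can be sketched in plain Python. This is only an illustration, not the Storm API: the tuple is modeled as a dict, and `each` and `split_words` are hypothetical helpers that loosely imitate how a Trident operation reads some input fields and appends new fields to the tuple.

```python
def each(tuples, input_fields, function, function_fields):
    """Append the fields emitted by `function` to every input tuple."""
    out = []
    for tup in tuples:
        # Select only the fields the function asked for.
        args = [tup[name] for name in input_fields]
        for emitted in function(*args):
            new = dict(tup)  # keep all fields built up so far
            new.update(zip(function_fields, emitted))
            out.append(new)
    return out


def split_words(sentence):
    """Emit one single-value result per word in the sentence."""
    return [(word,) for word in sentence.split()]


stream = [{"sentence": "the cow"}]
stream = each(stream, ["sentence"], split_words, ["word"])
stream = each(stream, ["word"], lambda w: [(len(w),)], ["length"])
```

After the two steps, every tuple carries the fields `sentence`, `word`, and `length`, in the order they were added.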

2015-04-28

High-quality PPT online


2015-04-24

Spark: Spark Streaming

Spark Streaming uses a “micro-batch” architecture, where the streaming computation is treated as a continuous series of batch computations on small batches of data. Spark Streaming receives data fro...
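
The micro-batch idea can be illustrated without Spark itself. The following is a minimal pure-Python sketch, not Spark's API: `micro_batches` and `word_count` are hypothetical names, events are assumed to be (timestamp, line) pairs, and the batch interval is assumed to be given in seconds.

```python
from collections import Counter, defaultdict


def micro_batches(events, interval):
    """Group (timestamp, line) events into batches of `interval` seconds."""
    buckets = defaultdict(list)
    for ts, line in events:
        # Events whose timestamps fall in the same interval share a batch.
        buckets[int(ts // interval)].append(line)
    # Yield the non-empty batches in time order, keyed by batch start time.
    for index in sorted(buckets):
        yield index * interval, buckets[index]


def word_count(lines):
    """An ordinary batch computation, applied to one micro-batch at a time."""
    return Counter(word for line in lines for word in line.split())


events = [(0.2, "a b"), (0.9, "b"), (1.1, "c a"), (2.5, "a")]
results = [(start, dict(word_count(batch)))
           for start, batch in micro_batches(events, 1)]
```

Each entry of `results` is the output of one small batch job, so the stream is handled as a series of batch computations rather than one record at a time.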

2015-04-22

Spark: cluster architecture

In distributed mode, Spark uses a master/slave architecture with one central coordinator and many distributed workers. The central coordinator is called the driver. The driver communicates with a p...

2015-04-22

Spark: deploy cluster in standalone mode

Host: 137  workers: 135 136
1. Install Spark on all hosts in the /opt dir
2. Install SSH Remote Access
137# ssh-keygen
137# ssh-copy-id -i ~/.s...

2015-04-20

Spark: Cluster Mode Overview

https://spark.apache.org/docs/latest/cluster-overview.html This document gives a short overview of how Spark runs on clusters, to make it easier to understand the components involved. Read throug...

2015-04-20

Flume: avro source and sink

In order to flow the data across multiple agents or hops, the sink of the previous agent and source of the current hop need to be avro type with the sink pointing to the hostname (or IP address) and ...

2015-04-17

Flume: hbase sink

flume.conf
a1.sinks.hbase-sink1.channel = ch1
a1.sinks.hbase-sink1.type = hbase
a1.sinks.hbase-sink1.table = users
a1.sinks.hbase-sink1.columnFamily = info
a1.sinks.hbase-sink1.serializer = org.ap...

2015-04-16

Kite: Morphlines Introduction


2015-04-13

Kite: A Data API for Hadoop


2015-04-13

Neo4j: fulltext search

Model
@Indexed(indexType = IndexType.FULLTEXT, indexName = "TaskTile")
private String title;
Repository
@Query("START n=node:TaskTile({0}) return n")
Iterable<Task> fin...

2015-04-08

Hadoop in Action

Hadoop in Action is a book suited to beginners who want to study this popular distributed processing technology.



