Solr5 POST TOOL

Solr includes a simple command line tool for POSTing various types of content to a Solr server. The tool is bin/post. The bin/post tool is a Unix shell script; for Windows (non-Cygwin) usage, see the Windows section below.

To run it, open a terminal window and enter:

bin/post -c gettingstarted example/films/films.json

This will contact the server at localhost:8983. Specifying the collection/core name is mandatory. The '-help' (or simply '-h') option will output information on its usage (i.e., bin/post -help).

Using the bin/post Tool

Specifying either the collection/core name or the full update URL is mandatory when using bin/post.
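
As a rough illustration of how the collection, host, and port options relate to the full update URL, the sketch below assembles one from the other three. The function name solr_update_url is hypothetical, and the /solr/<collection>/update path layout is an assumption based on a default Solr install; adjust it if your server is mounted elsewhere.

```shell
# Hypothetical helper: build the update URL that -url would take.
# Assumes the default layout http://<host>:<port>/solr/<collection>/update.
solr_update_url() {
  collection="$1"
  host="${2:-localhost}"   # same default as -host
  port="${3:-8983}"        # same default as -port
  printf 'http://%s:%s/solr/%s/update\n' "$host" "$port" "$collection"
}
```

Under that assumption, bin/post -url "$(solr_update_url gettingstarted)" films.json should target the same endpoint as bin/post -c gettingstarted films.json.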

The basic usage of bin/post is:

$ bin/post -help
 
Usage: post -c <collection> [OPTIONS] <files|directories|urls|-d ["...",...]>
     or post -help
    collection name defaults to DEFAULT_SOLR_COLLECTION if not specified
 
OPTIONS
=======
   Solr options:
     -url <base Solr update URL> (overrides collection, host, and port)
     -host <host> (default: localhost)
     -port <port> (default: 8983)
     -commit yes|no (default: yes)
   Web crawl options:
     -recursive <depth> (default: 1)
     -delay <seconds> (default: 10)
   Directory crawl options:
     -delay <seconds> (default: 0)
   stdin/args options:
     -type <content/type> (default: application/xml)
   Other options:
     -filetypes <type>[,<type>,...] (default: xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log)
     -params "<key>=<value>[&<key>=<value>...]" (values must be URL-encoded; these pass through to Solr update request)
     -out yes|no (default: no; yes outputs Solr response to console)
...

 

Examples

There are several ways to use bin/post.  This section presents several examples.

Indexing XML

Add all documents with file extension .xml to the collection or core named gettingstarted.

bin/post -c gettingstarted *.xml

Add all documents with file extension .xml to the gettingstarted collection/core on Solr running on port 8984.

bin/post -c gettingstarted -port 8984 *.xml

Send XML arguments to delete a document from gettingstarted.

bin/post -c gettingstarted -d '<delete><id>42</id></delete>'
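
When the id comes from a variable rather than being typed by hand, it may help to build the delete message in one place. The following is a minimal sketch; delete_by_id_xml is a hypothetical helper name, not part of bin/post, and it only covers the delete-by-id form shown above.

```shell
# Hypothetical helper: wrap a document id in Solr's XML delete message.
# The id is XML-escaped so that &, < and > do not break the markup.
delete_by_id_xml() {
  escaped=$(printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
  printf '<delete><id>%s</id></delete>\n' "$escaped"
}
```

It could then be used as bin/post -c gettingstarted -d "$(delete_by_id_xml 42)".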

Indexing CSV

Index all CSV files into gettingstarted:

bin/post -c gettingstarted *.csv

Index a tab-separated file into gettingstarted:

bin/post -c gettingstarted -params "separator=%09" -type text/csv data.tsv

The content type (-type) parameter is required so that the file is treated as the proper type. Without it, the file is ignored and a WARNING is logged, because the tool does not know what type of content a .tsv file is. The CSV handler supports the separator parameter, which is passed through using the -params setting.
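
Because values given to -params must be URL-encoded, a small helper can produce strings such as %09 for a tab. The sketch below is one possible approach using only POSIX utilities; percent_encode is a hypothetical name, and it encodes every byte, which is more than strictly required but still valid percent-encoding.

```shell
# Hypothetical helper: percent-encode every byte of a string.
# Encoding all bytes (even letters) is more than strictly needed,
# but it is valid and keeps the function short and POSIX-only.
percent_encode() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F' | sed 's/../%&/g'
}
```

For a semicolon-separated file, for instance, the setting could be written as -params "separator=$(percent_encode ';')".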

Indexing JSON

Index all JSON files into gettingstarted.

bin/post -c gettingstarted *.json

Indexing rich documents (PDF, Word, HTML, etc.)

Index a PDF file into gettingstarted.

bin/post -c gettingstarted a.pdf

Automatically detect content types in a folder, and recursively scan it for documents for indexing into gettingstarted.

bin/post -c gettingstarted afolder/

Automatically detect content types in a folder, but limit it to PPT and HTML files and index into gettingstarted.

bin/post -c gettingstarted -filetypes ppt,html afolder/
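
Before posting a large folder, it can be useful to preview roughly which files a given -filetypes list would cover. The sketch below approximates that with find and grep; list_by_filetypes is a hypothetical helper, the match is case-sensitive, and the tool's own matching rules may differ, so treat the result as an estimate only.

```shell
# Hypothetical helper: list files under a folder whose extension is in
# a comma-separated list, e.g. "ppt,html". Approximation only; it does
# not call bin/post and may not match its filtering exactly.
list_by_filetypes() {
  dir="$1"
  types="$2"
  # Turn "ppt,html" into the regex alternation "ppt|html".
  pattern=$(printf '%s' "$types" | tr ',' '|')
  find "$dir" -type f | grep -E "\.($pattern)\$" | sort
}
```

For example, list_by_filetypes afolder ppt,html prints the candidate files one per line.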

Windows support

bin/post currently exists only as a Unix shell script; however, it delegates its work to a cross-platform Java program. The SimplePostTool can be run directly in supported environments, including Windows.

SimplePostTool

The bin/post script currently delegates to a standalone Java program called SimplePostTool. This tool, bundled into an executable JAR, can be run directly using java -jar example/exampledocs/post.jar. See the help output and take it from there to post files, recurse a website or file system folder, or send direct commands to a Solr server.

$ java -jar example/exampledocs/post.jar -h
SimplePostTool version 5.0.0
Usage: java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]
.
.
.
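
As a starting point for running the JAR directly, the sketch below prints a java command line for posting files to a collection. It assumes the -Dc system property for the collection/core name, which is not shown in the truncated help output above; confirm it against the output of -h on your version. The function name post_jar_command is hypothetical, and file names are not quoted, so paths containing spaces need extra care.

```shell
# Hypothetical wrapper: print the java command line for posting files
# with SimplePostTool. Assumes the -Dc system property names the
# collection/core, and uses the JAR path given above.
post_jar_command() {
  collection="$1"
  shift
  printf 'java -Dc=%s -jar example/exampledocs/post.jar' "$collection"
  for f in "$@"; do
    printf ' %s' "$f"
  done
  printf '\n'
}
```

The function prints the command instead of running it, so the line can be inspected first and then pasted into a Unix shell or a Windows command prompt.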