Writing to ArangoDB from Spark: which ETL tools work with ArangoDB?

There are many ETL tools out there, but not many are free, and the free ones don't appear to have any knowledge of or support for ArangoDB. If anyone has migrated their data over to ArangoDB and automated the process, I would love to hear how you accomplished it. Below I have listed several ETL tools we could choose from. I took this list from the 2016 Spark Europe presentation by Bas Geerdink.

* IBM InfoSphere DataStage

* Oracle Warehouse Builder

* Pervasive Data Integrator

* Informatica PowerCenter

* SAS Data Management

* Talend Open Studio

* SAP Data Services

* Microsoft SSIS

* Syncsort DMX

* CloverETL

* Jaspersoft

* Pentaho

* NiFi

Solution

I was able to accomplish this with Apache NiFi. Below is a very basic overview of how I got data out of a source database and into ArangoDB.

NiFi can extract data from many of the standard databases. Java (JDBC) drivers already exist for databases such as MySQL, SQLite, Oracle, etc.

I was able to pull data out of a source database using two processors:

* QueryDatabaseTable

* ExecuteSQL

The output of these processors is in Avro format, which I then converted to JSON using the ConvertAvroToJSON processor. This turns the output into a JSON list (an array of records).
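For readers who want to see the shape of the data at this point, here is a minimal Python sketch of the same extract-and-convert step done outside NiFi. It is not what NiFi runs internally. The in-memory SQLite database, the `cities` table, its columns, and the sample rows are all assumptions made up for illustration, and `rows_as_json_list` is a hypothetical helper name.

```python
import json
import sqlite3


def rows_as_json_list(conn: sqlite3.Connection, query: str) -> str:
    """Run a query and return the result set as a JSON array of objects."""
    conn.row_factory = sqlite3.Row  # lets each row be read by column name
    cursor = conn.execute(query)
    return json.dumps([dict(row) for row in cursor])


if __name__ == "__main__":
    # Stand-in source database; the table and rows are invented for the example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (name TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO cities VALUES (?, ?)",
        [("Lyon", "FR"), ("Porto", "PT")],
    )
    print(rows_as_json_list(conn, "SELECT name, country FROM cities"))
```

Each row becomes one JSON object and the whole result set becomes one JSON array, which is the "list" shape the bulk import expects later on.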

While NiFi doesn't really have anything built specifically for ArangoDB, ArangoDB comes with one built-in feature that fills the gap: its HTTP API.

I was able to bulk insert data into ArangoDB using NiFi's InvokeHTTP processor with the POST method, writing into a collection named cities.

The value I used as the Remote URL:

http://localhost:8529/_api/import?collection=cities&type=list&details=true
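The same request can be reproduced outside NiFi, which is handy for checking the endpoint before wiring up the flow. The sketch below uses only the Python standard library and the URL shown above. The helper names are hypothetical, the sample document is made up, and it assumes the server accepts unauthenticated requests; if authentication is enabled on your instance you would need to add credentials.

```python
import json
import urllib.parse
import urllib.request


def build_import_url(base_url: str, collection: str) -> str:
    """Build the bulk import URL with the same query parameters used above."""
    query = urllib.parse.urlencode(
        {"collection": collection, "type": "list", "details": "true"}
    )
    return f"{base_url}/_api/import?{query}"


def build_import_request(base_url: str, collection: str, documents: list) -> urllib.request.Request:
    """Prepare a POST whose body is a single JSON array of documents."""
    body = json.dumps(documents).encode("utf-8")
    return urllib.request.Request(
        build_import_url(base_url, collection),
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def bulk_import(base_url: str, collection: str, documents: list) -> dict:
    """Send the request and return the server's JSON reply as a dict."""
    request = build_import_request(base_url, collection, documents)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Requires a running ArangoDB with an existing "cities" collection.
    print(bulk_import("http://localhost:8529", "cities", [{"name": "Lyon"}]))
```

With `type=list`, the request body is one JSON array, which matches what ConvertAvroToJSON produces.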

[Image: wRUG0.png]

Below is a screenshot of my NiFi flow. Something like this would definitely have helped kick-start my own research, so I hope it helps someone else. Ignore the extra processors; I have them in there for testing, as I was experimenting with JOLT to see whether I could use it to transform my JSON (the "T" in ETL).

[Screenshot: NiFi flow (ZVzLj.png)]
