There are many ETL tools out there, but not many of them are free, and the free ones don't appear to have any knowledge of, or support for, ArangoDB. If anyone has migrated their data over to ArangoDB and automated the process, I would love to hear how you accomplished it. Below I have listed several ETL tools to choose from. I took this list from the 2016 Spark Europe presentation by Bas Geerdink.
* IBM InfoSphere DataStage
* Oracle Warehouse Builder
* Pervasive Data Integrator
* PowerCenter Informatica
* SAS Data Management
* Talend Open Studio
* SAP Data Services
* Microsoft SSIS
* Syncsort DMX
* CloverETL
* Jaspersoft
* Pentaho
* NiFi
Solution
I was able to use Apache NiFi to accomplish this goal. Below is an extremely basic overview of what I did to get data out of a source database and into ArangoDB.
With NiFi you can extract data from many of the standard databases. JDBC drivers already exist for databases such as MySQL, SQLite, Oracle, etc.
I used two processors to pull data out of the source database:
* QueryDatabaseTable
* ExecuteSQL
The output of both processors is in Avro format, which I then converted to JSON using the ConvertAvroToJSON processor. This converts the output to a JSON list (an array of records).
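To make the shape of that JSON list concrete, here is a minimal Python sketch of the same extract-and-convert step done outside NiFi. It is only an illustration, not what NiFi runs internally: the in-memory SQLite database, the `cities` table, its columns, and the sample rows are all hypothetical.

```python
import json
import sqlite3


def rows_as_json_list(conn: sqlite3.Connection, query: str) -> str:
    """Run a query and return the result set as a JSON array of objects."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    rows = conn.execute(query).fetchall()
    return json.dumps([dict(row) for row in rows])


# Hypothetical source table, created in memory purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT, population INTEGER)"
)
conn.executemany(
    "INSERT INTO cities (id, name, population) VALUES (?, ?, ?)",
    [(1, "Springfield", 1000), (2, "Shelbyville", 2000)],
)

# One JSON object per row, all wrapped in a single JSON array.
payload = rows_as_json_list(
    conn, "SELECT id, name, population FROM cities ORDER BY id"
)
print(payload)
```

Each row becomes one JSON object keyed by column name, which is the kind of list that gets sent on to ArangoDB in the next step.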
While NiFi has nothing built specifically for ArangoDB, there is one feature that comes built in with ArangoDB itself, and that is its HTTP API.
I was able to bulk insert the data into a collection named cities using NiFi's InvokeHTTP processor with the POST method.
The value I used for the Remote URL property:
http://localhost:8529/_api/import?collection=cities&type=list&details=true
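For anyone who wants to try the same call without NiFi, the request that InvokeHTTP sends can be sketched in Python roughly as follows. This is a sketch under assumptions: `build_import_request` is a hypothetical helper name, the sample documents are made up, and the actual send is left commented out because it needs a running ArangoDB instance (plus credentials if authentication is enabled on your server).

```python
import json
import urllib.parse
import urllib.request


def build_import_request(docs, collection, base_url="http://localhost:8529"):
    """Build a POST request for ArangoDB's /_api/import endpoint."""
    # Same query parameters as the URL above: a JSON list body, detailed result.
    query = urllib.parse.urlencode(
        {"collection": collection, "type": "list", "details": "true"}
    )
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_api/import?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents standing in for the converted JSON list.
docs = [{"name": "Springfield"}, {"name": "Shelbyville"}]
request = build_import_request(docs, "cities")
print(request.full_url)

# Sending requires a running server, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

With `type=list` the request body is a single JSON array of documents, which matches the output of the ConvertAvroToJSON step.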
Below is a screenshot of my NiFi flow. Something like this would definitely have kick-started my own research, so I hope it helps someone else. Ignore the extra processors; I have them in there for testing purposes and was experimenting with JOLT to see whether I could use it to transform my JSON (the "T" in ETL).