This post covers how to use ElasticSearch-Hadoop to read data from Hadoop and index it in ElasticSearch. The functionality covered is indexing product view counts and the top search query per customer over the last n days. The analyzed data can then be used on the website to display a customer's recently viewed items, product view counts and top search query strings.
In the earlier posts we gathered customer search click data using Flume, stored it in Hadoop HDFS and ElasticSearch, and analyzed it using Hive to generate statistical data. Here we will see how to use that analyzed data to enhance the customer experience on the website and make it more relevant for end customers.
Recently Viewed Items
The first part already covered how to use the Flume ElasticSearch sink to index recently viewed items directly into an ElasticSearch instance, so that the data can be used to display, in real time, the items a customer has clicked.
Elasticsearch-hadoop-hive allows ElasticSearch to be accessed from Hive. As shared in the previous post, the product view counts and the customers' top search queries have already been extracted into Hive tables. We will read that data and index it into ElasticSearch so that it can be displayed on the website.
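To make the shape of the "top search query per customer" data concrete, here is a minimal in-memory sketch in plain Java. It is only an illustration: in this series the aggregation is done in Hive, and the class TopSearchQuery, the SearchEvent type and its field names are hypothetical, not taken from the project code. The "last n days" filter is left out for brevity, and ties are resolved by taking the alphabetically first query.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; the post computes this data in Hive.
final class TopSearchQuery {

    // One search event: which customer searched for which query string (assumed shape).
    static final class SearchEvent {
        final String customerId;
        final String query;

        SearchEvent(String customerId, String query) {
            this.customerId = customerId;
            this.query = query;
        }
    }

    // Returns customerId -> most frequent query string for that customer.
    static Map<String, String> topQueryPerCustomer(List<SearchEvent> events) {
        // customerId -> (query -> number of times searched)
        Map<String, Map<String, Long>> perCustomer = new TreeMap<>();
        for (SearchEvent e : events) {
            perCustomer.computeIfAbsent(e.customerId, k -> new TreeMap<>())
                       .merge(e.query, 1L, Long::sum);
        }

        Map<String, String> top = new TreeMap<>();
        for (Map.Entry<String, Map<String, Long>> customer : perCustomer.entrySet()) {
            String best = null;
            long bestCount = 0;
            // Queries are visited in alphabetical order, so the strict ">" keeps
            // the alphabetically first query when counts are tied.
            for (Map.Entry<String, Long> q : customer.getValue().entrySet()) {
                if (q.getValue() > bestCount) {
                    best = q.getKey();
                    bestCount = q.getValue();
                }
            }
            top.put(customer.getKey(), best);
        }
        return top;
    }
}
```

Each entry of the returned map corresponds to one row that would be indexed into ElasticSearch for display on the website.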
Product views count functionality
Consider a scenario where each product's total views by customers over the last n days are displayed. For a better user experience, the same functionality can be used to show end customers how other customers perceive the same product.
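The counting itself can be sketched in a few lines of plain Java. This is an in-memory illustration under stated assumptions, not the Hive query used in this series: the class ProductViewCounter, the ViewEvent type and its fields are hypothetical names, and the window is taken as the n days before a supplied reference time.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; the post computes these counts in Hive.
final class ProductViewCounter {

    // One click event: which customer viewed which product, and when (assumed shape).
    static final class ViewEvent {
        final String customerId;
        final String productId;
        final Instant viewedAt;

        ViewEvent(String customerId, String productId, Instant viewedAt) {
            this.customerId = customerId;
            this.productId = productId;
            this.viewedAt = viewedAt;
        }
    }

    // Returns productId -> number of views that fall within the last `days` days before `now`.
    static Map<String, Long> countViews(List<ViewEvent> events, Instant now, int days) {
        Instant cutoff = now.minus(Duration.ofDays(days));
        Map<String, Long> counts = new TreeMap<>();
        for (ViewEvent e : events) {
            // Keep only events inside the window [cutoff, now].
            if (!e.viewedAt.isBefore(cutoff) && !e.viewedAt.isAfter(now)) {
                counts.merge(e.productId, 1L, Long::sum);
            }
        }
        return counts;
    }
}
```

Passing the reference time in as a parameter, rather than calling Instant.now() inside the method, keeps the sketch deterministic and easy to test.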
The functionality described above is only a sample and of course needs to be extended to fit a specific business scenario. This may include displaying a search query cloud to customers on the website, or further Business Intelligence.
Spring ElasticSearch has also been included for testing purposes, to create an ESRepository that counts the total records and deletes all of them.
For details, check the service ElasticSearchRepoServiceImpl.java.