Data export
Customize, Export Interaction data, Sitecore

Customization: Sitecore – Export the interaction data to a custom/external database


While working on the Sitecore project, the client had a very specific requirement that they want to export the user interaction data to an external database, and from that database, they wanted to create reports using some reporting tools like Spotfire or Tableau. Initially, we had requirements to export Interactions, Goals, and PageViewEvents data to an external database.

The main challenge was how we export the interaction data in real-time? Which pipeline we intercept should give us maximum output as we want as well as also help us to export data again if we want through the Reporting Database rebuild functionality provided by Sitecore.

Also, there is another requirement that the client wanted to export the older interaction data as well from the xDB so they can have reports from that old data. About this, I’ll write another article about how I implemented it.

How I finished the challenge

With initial scanning of Sitecore documentation and other blogs from ever helping Sitecore community members, I found the interesting article Real-Time Power BI Dashboard from Sitecore xDB where Alexei Vershalovich explained how he exported the real-time data using the Service Plugin implementation. In the article, Alexei explained that they used the Service Plugin for subscribing to the xConnect operations like AddContactOperation or AddInteractionOperation to capture data submitted to Collection Database. Here in the article he also mentioned the limitation of the Collection database that, the Marketing definitions (such as the name of the Goals, Campaign, page item name, etc.) as not available, and for that he suggested exporting those data separately to make Power BI data more meaningful with human-readable friendly names of the items.

For my requirements, first I implemented the Service Plugin which worked well but as mentioned above, the human-readable names are not there with the data.

Also, I observed that the Service Plugin is not getting executed during Reporting database rebuild and hence I should not have any historical data be exported to a custom database. – I have observed this once, but I’ll re-verify and update the article with actual findings.

Now, what to do next?

Based on the ideas presented by Alexei, I started finding some other pipelines or events provided by Sitecore which should fulfill my requirements, and fortunately, I found one pipeline named InteractionsBatch which is getting executed when the user session ends and the interactions data are being submitted to xDB Collection database. This pipeline is getting executed by the Processing Service.

Also, the same pipeline executes when we do the reporting database rebuilding and that fulfilled my requirements that we can rebuild the reporting database and along with that our external database will also be filled with the data required. Of course, first I needed to truncate all the tables from my custom database to avoid duplicate data, but I’m fine with this as I only had to rebuild the reporting database once or twice.

Another big benefit of the InteractionsBatch pipeline, being executed by the Processing service, is that I have access to the Master database and hence with that I’ve all the human-readable names for Goals, Campaigns, Page items, etc., are available to me to export along with interaction data.

Now, my work went ahead with implementation, first I created the following tables into the customer database with all necessary columns to capture the data being exported.

  1. Interactions – In this table, I am storing the interaction data. This is the parent table for the rest other two tables
  2. Goals – This table stores the goals data
  3. PageViewEvents – This table stores the page view events data.

After preparing my database, I implemented a custom processor and added it to the InteractionsBatch pipeline. The Process method of the process accepts Sitecore.Analytics.Aggregation.Pipeline.InteractionBatchAggregationPipelineArgs as input parameter. From the input parameter, I extracted the interaction data, evaluated it, and stored the data in the format I wanted to export into the DataTable.

During the processing, I also read the item names from the Master database and populated respective fields into the data tables.

Then using the SqlBulkCopy, I exported all those data in one go to the external database.

After implementing all the requirements, I deployed the processor to the Processing Service, as I’ve installed XP1 topology on my local. If you have standalone topology installed you just need to deploy it to the web application from where you run the CMS.

Please refer to the code on Github to see how the implementation was done.

Happy Sitecore Customization!!!!!