Jomal Andrews – Senior Consultant
The Sparklore team came across a project that required a lot of manual data processing before it reached the visualization phase. Ideally, processes like this would be scheduled and the reports would refresh automatically. The manual processing was tedious and made QA of the reports difficult and time-consuming. We used Datorama to automate this process.
While trying out one of our automation initiatives, we hit a brick wall: the DataStream was unable to parse the headers. The source let us take the data in either Excel or CSV format. However, a few headers were merged across the first two columns, and in both the Excel and CSV files this led to the headers being spread over separate rows. Since a DataStream reads the first row as headers by default, most of the required values were not being read because their columns had no headers.
Another issue was that we needed to map a Date field to complete the DataStream mapping, but the file contained no field we could use for it. The data belonged to the day before yesterday (there is an 18-hour delay from the source), so a simple ‘TODAY()’ formula would not have produced the right date. Each file held a single day’s data, and the date appeared only in the file name.
Thankfully, Datorama’s Transformation rules provide a solution to the problem. Using Transformation rules, we can convert the data source into a table and make Datorama treat the file as if it had no header row at all. Every column is then named by its column number: Column 01, Column 02 and so on. This enabled us to map the required fields to their relevant dimensions or metrics in Datorama.
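To illustrate the idea outside Datorama, here is a minimal Python sketch of the same approach: ignore whatever the file presents as headers and name every column by its position. It is not Datorama's own implementation. The function name `read_without_headers` and the sample data are hypothetical, chosen only for illustration.

```python
import csv
import io


def read_without_headers(text):
    """Parse CSV text as if it had no header row.

    Every column is named by position ("Column 01", "Column 02", ...),
    so rows that hold broken or split headers are kept as ordinary data.
    """
    rows = list(csv.reader(io.StringIO(text)))
    # Use the widest row so that short rows can be padded to the same length.
    width = max((len(row) for row in rows), default=0)
    names = [f"Column {i:02d}" for i in range(1, width + 1)]
    return [dict(zip(names, row + [""] * (width - len(row)))) for row in rows]


# Hypothetical sample: the headers are split across the first two rows.
sample = "Campaign,,Clicks\n,Region,\nSpring,EU,120\n"
records = read_without_headers(sample)

# Skip the two header rows and pick fields by position.
data_rows = records[2:]
clicks = [int(row["Column 03"]) for row in data_rows]
```

Once every column has a stable positional name, mapping a field no longer depends on the header text being parsed correctly.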
Generating a Date field was the only remaining issue, since the date for the data was in the file name and not within the file. Fortunately, Datorama can also read values from the file name.
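The same idea can be sketched in Python. The exact file name pattern is not given here, so the example assumes a hypothetical name containing an ISO-style date (YYYY-MM-DD). The function `date_from_filename` is likewise illustrative and is not part of Datorama.

```python
import re
from datetime import date, datetime


def date_from_filename(filename):
    """Extract a YYYY-MM-DD date embedded in a file name.

    Assumes the name contains an ISO-style date; adjust the pattern and
    the strptime format to match the real naming scheme.
    """
    match = re.search(r"\d{4}-\d{2}-\d{2}", filename)
    if match is None:
        raise ValueError(f"no date found in file name: {filename!r}")
    # strptime also rejects impossible dates such as 2021-13-40.
    return datetime.strptime(match.group(0), "%Y-%m-%d").date()


# Hypothetical file name, used only to show the call.
report_date = date_from_filename("daily_report_2021-03-14.csv")
```

Taking the date from the file name keeps it tied to the data itself, so the 18-hour delay from the source does not matter the way it would with a formula based on the load date.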