Data architecture from right to left: start with the business needs
Date: Monday, May 18, 2020
The gap between technology and business stubbornly persists
What is “data architecture from right to left”? It’s a data architecture that starts with the business needs and then puts in place the technology solution to meet those needs.
This may seem like Business Intelligence 101, but the gap between analyst needs and the solutions delivered by data engineering teams stubbornly persists in most large organizations. Too often, a team is fielded to build the next great data hub, but fails to consider who the users of the data actually are and what questions they want answered.
As a result, business users don’t trust the solution because they can’t get usable answers to their questions or it takes too long to return answers. The very first step in designing a trusted data hub is to know your use cases. In other words, what questions are the analysts trying to answer, and what are they reporting on?
When impressive technology fails to meet business needs
As a solution architect at Wavicle, I recently worked on a project with a Fortune 500 food and beverage company. The company wanted to look at patterns in the point-of-sale (POS) system and supporting data to determine which sales channels could benefit from greater adoption of digital technology to increase productivity, items sold, and check size.
The company’s IT team built a data warehouse that consolidates 3 billion POS transactions annually across thousands of retail locations into a single database schema, and refreshes data from all locations on a nightly basis.
The intention was to make this data available for digital analysts and data scientists to identify sales patterns and measure customer loyalty with respect to the company’s mobile app.
Technically speaking, it’s an impressive solution, but the data set is so large and the data so raw that two problems ensued: (1) the raw data didn’t make sense to business users, and (2) to answer their questions, they had to write complex queries that didn’t return in a timely manner, if at all.
This single database simply could not meet all possible query requests! As a result, the users didn’t trust or use the solution.
Putting the business questions first
When Wavicle was called in to help solve these challenges, we took the time up front to intimately understand the company’s use cases and made them the cornerstone of the technology solution.
We learned that analysts wanted to compare sales across different times of the day and across individual locations and regions, and that they wanted to understand how different POS locations within the company’s facilities (e.g., kiosk, drive-thru) affected measures of service efficiency.
Understanding that some users were looking to run specific queries against certain data sets, while others wanted the ability to explore the data as a whole, we were able to look at the solution architecture differently. With this insight, it was easy to start modeling a solution that would meet these requests in a timely fashion.
Data mart summarizes data for specific queries
Our solution consisted of a data mart that extends the client’s existing database platform, Amazon Redshift, and integrates POS data with supporting data such as speed of service (SoS), offer, and loyalty data.
In designing this solution, we kept two types of users in mind: business users who wanted summary-level data, and data scientists who wanted to search more granular data for sales and channel usage patterns.
Because the data set is so large, we knew that queries against the raw data would always be slow. We addressed this by creating summary tables that include only the data needed to answer specific questions for business users. We used Redshift stored procedures to create the summarizations.
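To make the summary-table idea concrete, here is a minimal sketch in Python. It is not the client’s implementation: the project used Redshift stored procedures, while this example uses an in-memory SQLite database as a stand-in so that it runs anywhere. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw POS rows: (location_id, channel, sold_at, check_amount).
RAW_ROWS = [
    ("store_001", "kiosk", "2020-05-01 08:15:00", 12.50),
    ("store_001", "kiosk", "2020-05-01 08:40:00", 7.50),
    ("store_001", "drive-thru", "2020-05-01 12:05:00", 20.00),
    ("store_002", "drive-thru", "2020-05-01 12:30:00", 15.00),
]


def build_summary(conn):
    """Load the raw rows, then roll them up to one row per location, channel, and hour."""
    conn.execute(
        "CREATE TABLE pos_transactions "
        "(location_id TEXT, channel TEXT, sold_at TEXT, check_amount REAL)"
    )
    conn.executemany("INSERT INTO pos_transactions VALUES (?, ?, ?, ?)", RAW_ROWS)
    # The summary keeps only what a business user needs to compare
    # sales by location, channel, and time of day.
    conn.execute(
        """
        CREATE TABLE sales_hourly_summary AS
        SELECT location_id,
               channel,
               CAST(strftime('%H', sold_at) AS INTEGER) AS hour_of_day,
               COUNT(*)          AS transaction_count,
               SUM(check_amount) AS total_sales,
               AVG(check_amount) AS avg_check_size
        FROM pos_transactions
        GROUP BY location_id, channel, strftime('%H', sold_at)
        """
    )


conn = sqlite3.connect(":memory:")
build_summary(conn)

# Dashboards would read the small summary table instead of the raw transactions.
summary = conn.execute(
    "SELECT location_id, channel, hour_of_day, "
    "transaction_count, total_sales, avg_check_size "
    "FROM sales_hourly_summary "
    "ORDER BY location_id, channel, hour_of_day"
).fetchall()

for row in summary:
    print(row)
```

The design point is that the expensive aggregation runs once, when the summary is built, rather than every time an analyst opens a dashboard. In a nightly-refresh setup like the one described above, a rebuild of this kind would plausibly follow each load.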
We also looked at how the database technology works. To accommodate large data sets, Amazon Redshift organizes the data via sort keys (how the data is ordered on disk) and distribution keys (how the data is “spread out” among the servers in the cluster).
We took advantage of these design features to enhance the data scientists’ experience with data sets that were manageable in size and ideal for complex queries and detailed analysis. A table built specifically for exploration allows data scientists to search for patterns using a more granular level of data.
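As a rough analogy for what sort keys and distribution keys do, the Python sketch below spreads rows across a handful of “nodes” by hashing one column and keeps each node’s rows ordered by another. It is a conceptual toy, not a model of Redshift internals: the node count, the CRC32 hash, the binary search, and the column names are all assumptions made for illustration.

```python
import bisect
import zlib
from collections import defaultdict

NODE_COUNT = 4  # hypothetical cluster size


def node_for(dist_key_value):
    """Pick a node by hashing the distribution key (CRC32 is a stand-in hash)."""
    return zlib.crc32(dist_key_value.encode("utf-8")) % NODE_COUNT


def load(rows):
    """Distribute rows by location_id, then order each node's rows by sale_date."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row["location_id"])].append(row)
    for node_rows in nodes.values():
        node_rows.sort(key=lambda r: r["sale_date"])  # the "sort key"
    return nodes


def scan_date_range(node_rows, start, end):
    """Use the sort order to jump to the matching dates instead of reading every row."""
    dates = [r["sale_date"] for r in node_rows]
    lo = bisect.bisect_left(dates, start)
    hi = bisect.bisect_right(dates, end)
    return node_rows[lo:hi]


# Hypothetical sample rows.
rows = [
    {"location_id": "store_001", "sale_date": "2020-05-03", "amount": 5.0},
    {"location_id": "store_001", "sale_date": "2020-05-01", "amount": 8.0},
    {"location_id": "store_002", "sale_date": "2020-05-02", "amount": 11.0},
    {"location_id": "store_002", "sale_date": "2020-05-04", "amount": 3.0},
]

nodes = load(rows)

# Each node answers the date-range question for its own slice of the data.
matches = [
    r
    for node_rows in nodes.values()
    for r in scan_date_range(node_rows, "2020-05-02", "2020-05-03")
]
print(sorted(r["amount"] for r in matches))
```

Two properties carry over to the real thing. Rows that share a distribution key value land on the same node, so work on a single location stays local to that node. And because each node’s data is ordered by the sort key, a query filtered on that column can skip most of the data.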
Meeting business users’ needs
What happens when you put the business needs first? Users trust the solution and come back for more!
Once we had the initial set of tables in place and validated, the user community quickly realized the value of the solution and requested additional tables. They even started making plans for expanding the scope of the solution.
Instead of waiting for queries that never came back, or plowing through unwieldy data files that were difficult to understand, this company’s business users could now easily construct dashboards that directly answered their questions and positioned them to answer the next round of questions.
The next time your team sets out to build a data warehouse or data mart, stop and consider whether the business needs have been adequately communicated and understood. Are they at the heart of your technology solution? If not, back up and try again. It’s critical to the success of the solution.
Wavicle’s team of consultants, data architects, and cloud engineers works with global organizations to build a roadmap to success with unmatched technology expertise, creative innovation, and superior customer service. Our toolkit of proprietary accelerators helps clients deliver world-class data analytics solutions in record time. From data management services and cloud migration consulting to dashboard development and data analytics consulting, our professionals enable and empower data-driven enterprises.