- Capabilities
-
-
-
FEATURED SOLUTIONS
-
-
- Industries
-
-
RETAIL
- ActiveInsightsBuild the profiles combining in-store, e‑commerce, loyalty, and third-party data.
-
Retail
A retailer with thousands of franchise locations modernized their data ecosystem to enable critical data analytics use cases.
View Case Study
-
HEALTH & WELLNESS
- EZConvertETLTransforming healthcare data pipeline by enabling patient data integration
-
Healthcare
A leading healthcare RCM company modernized its data governance to enhance security, streamline access, and boost efficiency.
View Case Study
-
MANUFACTURING
- EZForecastGet Insights into supply chain dynamics and predict production back-log
-
Manufacturing
Discover how Vyaire Medical uses Amazon QuickSight for real-time global sales and forecasting insights, boosting production efficiency.
View Case Study
-
FINANCE & INSURANCE
- EZConvertBIAutomate BI asset migration from legacy platforms to modern cloud-native tools.
-
Insurance
A major insurer modernized its operations by implementing a cloud-based data strategy, enabling faster reporting, improved scalability, and better regulatory compliance.
View Case Study
-
-
- Company
-
- Resources
-
- Amazon Athena
- Amazon Redshift
Accelerated Data Validation With Wavicle’s Data Capture Tool
Managing data is always difficult and leads to complex processes. Data quality and ETL testing are critical to validate, verify, and qualify data while preventing duplicate records and data loss. This process ensures data transfer from various heterogeneous environments to a central data repository while adhering to strict quality protocols and compliance regulations.
However, data testing may require manual validation and auditing. This manual work increases the risk of defects leaking into the production environment without proper quality checks. Ultimately, it can be difficult for even the most talented and certified testers to create an accurate testing strategy to validate all possible data combinations and potential failures.
Project background
One of North America’s largest collision repair center operators approached Wavicle for help building testing strategies and solutions within strict deadlines. They needed data validated across two stages: first, when moving from its source to a data lake, and second, when moving from a data lake to a data warehouse. The testing strategies included validating data in various file types and comparing it across the staging and warehouse layers.
This project included several challenges:
- An accelerated timeline of two months
- Large data volumes with complex logic
- Multiple file types, including XML, JSON, CSV & Excel data comparison
- Database-to-database validation for a process that included files coming from the business, then loading into a data lake (Athena), then moving into a data warehouse (RedShift) based on business logic, and then being translated onto dashboards
Wavicle’s approach
The Wavicle team presented a data capture tool that compares various data sources and identifies variances in the same data group. Using multiple connectors to relational and non-relational databases, cloud databases, and filesystems, the tool compares the test data and generates test results into smart reports. The test results are then stored and accessed through user-defined Amazon S3 storage.
This approach ensured the quality of the test deliverables and provided a better focus on testing. It paved a hassle-free path for testers to test high volumes of data containing more than one million records. With this system, testers could easily compare the results as the reports narrowed down the exact differences between the source data and target databases to identify bugs.
Wavicle’s team used their data capture accelerator to address a variety of challenges, including the following:
- Complexity: The tool provides quality assessment to visualize test plans in workspaces to reduce the complexity of managing the test plans and retesting. Output-matching reports allowed for simpler evaluation of results and quicker understanding of actual errors for reporting.
- Time: The tool significantly reduced the time needed to create test plans and corresponding SQL, verify the results, and communicate errors. This resulted in a 50% reduction in data validation efforts in complex data platforms. Proper testing of data flows can create hundreds of test steps for an average-size implementation, so time and cost savings can be significant.
- Effort: Parameterizing the query for multiple batches in the same test plan reduced the effort that would be required to create multiple test plans.
- Validation: Manually doing file-to-file validation took approximately 2 – 3 hours. Execution with Wavicle’s data capture accelerator reduced the time by 75% to validate within 30 minutes.
- Defect identification: The tool rapidly identified bugs using a smart report.
In addition, Wavicle’s data capture tool helped by retesting defects. Using the data capture tool, no new test plans or additional efforts were needed after fixing defects, which reduced effort. Having the ability to edit the existing test plan in response to minor changes was faster and more straightforward than creating a new one. The test plans could also parameterize the queries, which enabled the team to run SQL queries for different conditions. This allowed them to run the test plans with different values for the conditions.
During the project, Wavicle’s team had to adapt to overcome unforeseen challenges. For example, they encountered issues with a server that went down while running high-volume test plans and deployed a fix to bring up the server automatically after 15 minutes of downtime and improve performance through coding solutions. In addition, handling XML and hierarchical JSON comparisons against CSV or DB was challenging. Using JSON conversion utilities, the team converted the XMLs to JSON, loaded them to MongoDB, and queried the MongoDB to flatten data, which enabled them to compare it against flat files like CSVs or DBs.
Project results
Wavicle’s data capture tool reduced risk with the ability to manage hundreds of testing tasks. Plus, the tool made understanding test outputs and identifying key errors easier, allowing for faster and more reliable break/fix cycles.
Wavicle’s consultants completed the two-month project on time, tested vast volumes of data in the order of a few hundred million records, and identified all data mismatches. This success was recognized and appreciated by the customer, as the rate of execution was three times faster than the manual execution, and the testing quality was noteworthy.
Related Posts
- Amazon Quick Suite
Turbocharging Voice of Customer Analytics Using...
- Databricks
- EZForecast
Empowering Planners with Interactive Forecastin...
- Azure
- Databricks
Global Packaging Material Manufacturer Streamli...
- AWS
- Databricks
Global QSR Chain Strengthens Data Governance by...
- Azure
- Databricks
Healthcare Company Optimizes Cloud Costs in Pre...
- Amazon Quick Suite
- Tableau
Seamlessly Migrating 550+ Dashboards from Table...
- Amazon Quick Suite
- AWS
Seamlessly Migrating 550+ Dashboards from Table...
- Amazon Quick Suite
- AWS
Global Digital Platform Migrates from Tableau t...
- Azure
- EZConvertBI
Manufacturer Transforms Forecasting Process Wit...
- Azure
- Databricks
Ensemble Health Partners Modernizes Data Govern...
- Azure
- Microsoft Fabric
Standards Body Centralizes Supply Chain Data to...
- Power BI
- Tableau
U.S. Air Force Leverages Wavicle’s EZConvertBI ...
- MicroStrategy
- SAP Business Objects
International Manufacturer Leverages Wavicle’s ...
- Azure
- Azure ML
Greenhouse Grower Improves Yield Predictions Th...
- Amazon Quick Suite
- Tableau
Rail Technology Services Provider Upgrades Anal...
- Amazon Quick Suite
- Amazon S3
Global Automotive Supplier Modernizes Reporting...
- Microsoft Fabric
- Microsoft SQL Server
Greenhouse Grower Modernizes Data and Insights ...
- Amazon Redshift
- BigQuery
Major Home Builder Leverages Snowflake to Catal...
- Salesforce Net Zero Cloud
- Talend
QSR Improves Sustainability Initiatives With Ac...
- Amazon Athena
- Amazon Quick Suite
International Coffee Chain Modernizes Business ...
- Amazon S3
- AWS
Pilot Company Transforms Data Ecosystem to Unif...
- AWS
- Databricks
Healthcare Product Supplier Launches Feature St...
- Matillion
- Power BI
Merchants Fleet Fuels Growth With Modern Data A...
- Amazon QLDB
- Amazon Redshift
Medical Equipment Manufacturer Saves Millions o...
- Amazon QLDB
- Amazon Redshift
Accelerating Store-Level Speed to Insight for P...
- Amazon Quick Suite
- Amazon S3
Automotive Supplier Leverages Data Modernizatio...
- Amazon Redshift
- Tableau
QSR Maximizes Franchise Performance Using BI Vi...
- Azure
- Profisee
Manufacturer Unlocks Growth With Unified Custom...
- Azure
- Azure ML
Automotive Retailer Modernizes Data Management ...
- AWS Glue
- Snowflake
Global Electronics Manufacturer Saves Millions ...
- Alteryx
- Oracle
Global Manufacturer Overhauls Data Practices wi...
- AWS
- Matillion
Major Insurer Transforms Operations With Modern...
- Amazon DynamoDB
- AWS Lambda
Retail/CPG Leader Accelerates Data Pipeline to ...
- Amazon Quick Suite
Vyaire Medical Gets Global Sales, Inventory, an...
- Amazon Elastic Container Service (ECS)
- AWS Aurora
Travel Center Operator Accelerates Access to Da...
- Amazon S3
- AWS
Travel Center Operator Migrates to Cloud Data W...
- Python
- Tableau
Managed Services Organization Modernizes its Da...
- Amazon Redshift
- Amazon S3
New Ordering System Uses Machine Learning to Op...
- Amazon Redshift
- Talend
Global QSR Uses Micro-Segmentation to Improve C...
- Amazon Redshift
- AWS Aurora
Modernizing ESG Data for Resilience and Compliance
- Amazon Redshift
- AWS Aurora
Master Data Management Delivers Single View of ...
- Amazon Redshift
- Amazon S3
Electronics Manufacturer Optimizes Global Logis...
- Amazon Redshift
- Amazon S3
Modernizing ESG Data for Resilience and Compliance
- AWS
- IBM DataStage
Global QSR Accelerates Migration from Legacy ET...
- Amazon Redshift
- Amazon S3
Integrated Procurement Analytics Platform Drive...
- Amazon Redshift
- Amazon S3
Cloud Migration Brings Agility and Innovation t...
- Amazon Redshift
- AWS
Intuitive POS Data Mart Drives Smarter Analyst ...
- Amazon Quick Suite
- Amazon Redshift
Post-Merger Data Consolidation Reduces Reportin...
- Amazon Redshift
- Tableau
Global QSR Orders Up Fast Data-Driven Solutions
- Amazon Redshift