Feature store to streamline machine learning
Automotive retailer
Wavicle helped this major automotive retailer build a feature store to streamline their machine learning endeavors.
Advanced Analytics
Machine Learning & MLOps
Azure
Azure ML
Cosmos DB
Feathr
Purview
Python
Synapse

Automotive Retailer Modernizes Data Management and ML Performance with a Logical Feature Store

This automotive retailer engaged Wavicle to address the challenges of managing the features and transformations used to train and deploy its machine learning (ML) models, and to ease the difficulties its ML professionals faced when converting data into production-ready features across AI projects. Together, they built a feature store that improved model accuracy, reproducibility, and deployment, strengthening the company’s decision-making capabilities.

Challenges in training and deploying machine learning models

The automotive retailer’s data science and machine learning (DSML) team focused on customer conversion. The team needed a more reproducible model training and inferencing workflow for use cases spanning traditional tabular data and computer vision. In addition, its ML professionals handled many tasks across the ML lifecycle, and each role faced its own challenges in turning data into production-grade features.

Here are the pain points of the DSML team:

  • Data scientists found it difficult to identify features used in the current training of production models.
  • DevOps engineers had problems creating frameworks around deployed models.
  • Data engineers struggled to create accurate data pipelines because they could not find the production models that used features impacted by source data changes.

To tackle these challenges, the automotive retailer engaged Wavicle’s consultants to simplify the work of ML professionals across roles and make it more efficient. The retailer needed Wavicle’s guidance to implement a solution that could bridge the gap between raw data and ML models, improving how its models were trained and deployed.

Leveraging the logical feature store to its full potential

To address some of the most common pain points among ML professionals, Wavicle focused on feature reproducibility, reusability, and reliability.

 

The proposed solution

The retailer needed a solution that could capture version-controlled transformation logic, pull training sets, orchestrate transformations, and monitor the data serving production models. Wavicle recommended a logical feature store.
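
To make the idea of version-controlled transformation logic concrete, here is a minimal sketch in plain Python. It is not the Feathr API and not the retailer’s implementation: the `FeatureDefinition` and `FeatureRegistry` classes, the `price_per_mile` feature, and the sample rows are all hypothetical names chosen for illustration. The sketch assumes a feature is simply a named, versioned function over a raw data row.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]


@dataclass(frozen=True)
class FeatureDefinition:
    """A named, versioned transformation (hypothetical structure)."""
    name: str
    version: int
    transform: Callable[[Row], float]


class FeatureRegistry:
    """In-memory stand-in for a logical feature store registry."""

    def __init__(self) -> None:
        self._features: Dict[Tuple[str, int], FeatureDefinition] = {}

    def register(self, feature: FeatureDefinition) -> None:
        key = (feature.name, feature.version)
        # A published version is immutable; changed logic needs a new version.
        if key in self._features:
            raise ValueError(f"{feature.name} v{feature.version} is already registered")
        self._features[key] = feature

    def get(self, name: str, version: int) -> FeatureDefinition:
        return self._features[(name, version)]

    def build_training_set(
        self, rows: List[Row], requested: List[Tuple[str, int]]
    ) -> List[Dict[str, float]]:
        # Apply exactly the requested feature versions to every raw row.
        features = [self.get(name, version) for name, version in requested]
        return [
            {f"{f.name}_v{f.version}": f.transform(row) for f in features}
            for row in rows
        ]


registry = FeatureRegistry()
registry.register(
    FeatureDefinition("price_per_mile", 1, lambda r: r["price"] / r["mileage"])
)
# Version 2 guards against zero mileage; version 1 stays available for
# reproducing models that were trained with it.
registry.register(
    FeatureDefinition("price_per_mile", 2, lambda r: r["price"] / max(r["mileage"], 1.0))
)

raw_rows = [
    {"price": 20000.0, "mileage": 40000.0},
    {"price": 35000.0, "mileage": 0.0},
]
training_set = registry.build_training_set(raw_rows, [("price_per_mile", 2)])
print(training_set)
```

Because each training set names the exact feature versions it was built from, a model can be retrained later against the same logic even after the feature has evolved.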

 

Wavicle’s team evaluated multiple feature store solutions, assessing implementation complexity, estimated operating costs, and operational maintainability. Given the automotive retailer’s requirements, the most significant consideration was freeing up time for ML professionals in every role.

The practical application of the solution

 

Wavicle recommended the open-source Feathr feature store because it met the critical requirement of a synced online/offline feature flow for both API serving and data science needs. The implementation made several substitutions to the default pattern and included the following components:

  • Cosmos DB met the need for low-latency feature retrieval during online inferencing.
  • SQL Server served pre-transformed offline features for training and validating models.
  • Purview acted as the feature registry, capturing lineage tied directly to transformation logic so that features were easier to understand.
  • Databricks, utilizing its Spark clusters and notebooks, provided the compute power for transformations and backfilling.
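
The lineage captured in the registry is what lets a data engineer answer the question "which production models are affected if this source changes?" The sketch below illustrates that lookup in plain Python. It does not use the Purview or Feathr APIs; the `LineageGraph` class and every source, feature, and model name in it are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class LineageGraph:
    """Toy lineage store: source -> features -> models (hypothetical)."""

    def __init__(self) -> None:
        self._features_by_source: Dict[str, Set[str]] = defaultdict(set)
        self._models_by_feature: Dict[str, Set[str]] = defaultdict(set)

    def add_feature(self, feature: str, sources: List[str]) -> None:
        # Record every source table the feature's transformation reads.
        for source in sources:
            self._features_by_source[source].add(feature)

    def add_model(self, model: str, features: List[str]) -> None:
        # Record every feature the model consumes.
        for feature in features:
            self._models_by_feature[feature].add(model)

    def impacted_models(self, source: str) -> Dict[str, List[str]]:
        """Map each feature derived from `source` to the models that use it."""
        impact: Dict[str, List[str]] = {}
        for feature in sorted(self._features_by_source.get(source, set())):
            impact[feature] = sorted(self._models_by_feature.get(feature, set()))
        return impact


lineage = LineageGraph()
lineage.add_feature("avg_sale_price_30d", ["sales.transactions"])
lineage.add_feature("days_on_lot", ["inventory.vehicles"])
lineage.add_feature("discount_rate", ["sales.transactions", "inventory.vehicles"])
lineage.add_model("lead_conversion", ["avg_sale_price_30d", "days_on_lot"])
lineage.add_model("pricing", ["avg_sale_price_30d", "discount_rate"])

# A schema change is planned for the (hypothetical) sales.transactions table.
impact = lineage.impacted_models("sales.transactions")
print(impact)
```

In a real deployment this graph would be populated automatically when features and models are registered, rather than by hand as in the example.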

 

The components were deployed with Azure Bicep templates, and all registration and execution code ran through the Feathr Python SDK.

New feature store transforms data management and ML performance

With Wavicle’s help, the automotive retailer finally reduced the burden on their ML professionals.

  1. Data scientists consolidated their repetitive and disparate feature discovery and selection processes.
  2. DevOps engineers could ensure consistency between the features used in offline training and online inferencing, and could scale for new models.
  3. Data engineers gained consistent quality monitoring and access across all data pipelines.
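
The consistency between offline training and online inferencing mentioned above can be illustrated with a short sketch. It assumes both stores are filled from a single transformation, with a list standing in for the offline store and a dictionary standing in for the low-latency online store. The function names, the `vehicle_age` feature, and the sample records are hypothetical and are not taken from the retailer’s system or from Feathr.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def vehicle_age(row: Row) -> int:
    """Single source of truth for the (hypothetical) vehicle_age feature."""
    return row["sale_year"] - row["model_year"]


def materialize(
    rows: List[Row], transform: Callable[[Row], int]
) -> Tuple[List[Row], Dict[str, int]]:
    # Offline store: full table of pre-transformed features for training.
    offline = [
        {"vehicle_id": r["vehicle_id"], "vehicle_age": transform(r)} for r in rows
    ]
    # Online store: key-value lookup populated from the same computed values.
    online = {rec["vehicle_id"]: rec["vehicle_age"] for rec in offline}
    return offline, online


def find_skew(offline: List[Row], online: Dict[str, int]) -> List[str]:
    """Return the ids whose online value differs from the offline value."""
    return [
        rec["vehicle_id"]
        for rec in offline
        if online.get(rec["vehicle_id"]) != rec["vehicle_age"]
    ]


rows = [
    {"vehicle_id": "V-001", "sale_year": 2024, "model_year": 2020},
    {"vehicle_id": "V-002", "sale_year": 2024, "model_year": 2015},
]
offline, online = materialize(rows, vehicle_age)
skew_before = find_skew(offline, online)

# Simulate a value written by separate, out-of-sync serving logic.
online["V-002"] = 99
skew_after = find_skew(offline, online)
print(skew_before, skew_after)
```

When both stores are derived from one definition, the skew check stays empty; it only reports ids once serving values are produced by logic that has drifted from the training logic.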

Wavicle’s work ultimately strengthened the automotive retailer’s ML efforts. The new feature store now manages the features and transformations used to train and deploy ML models. The shift lets ML professionals use their time more efficiently, avoiding duplicated effort and missed connections, while scaling their ability to deploy models and to manage and govern new and existing features. By reusing, sharing, and collaborating on features, teams can now handle data consistently across AI initiatives and improve their foresight in decision-making.