Maximizing the value of data science and machine learning models
Many organizations struggle to operationalize advanced analytics and data science models to meet the goals of the business.
Operationalizing advanced analytics means more than simply building and deploying analytics and data science models for specific use cases. It includes managing, monitoring, and refining models so they remain relevant and useful. By operationalizing analytics models, you increase their value to the organization over time.
Organizations are typically good at ideation and model development but may fall short when it comes to the rest of the advanced analytics application lifecycle.
They hire data scientists who know how to build models but often lack the skills (or patience) for data integration and data wrangling. These data scientists might not be able to write production-quality code, put software into production, or understand the change management needed to implement a new application.
The bottom line is that many organizations, thinking they should rely solely on data scientists to build and deploy these models, skip the important steps that come before or after development and deployment. This points to the need for a process to operationalize advanced analytics models and applications within an organization.
A framework for operationalizing analytics models
Five key capabilities create a foundation for operationalizing advanced analytics.
1. Business engagement and strategic planning
Analytics models should align with the organization’s strategic plan, meet the needs of the business, and be accepted by key stakeholders.
Earning a return on investment requires organizations to:
• Prioritize investments to meet strategic goals.
• Clearly define roles and responsibilities.
• Identify business decisions that will deliver the most financial value.
• Decide how to monitor and measure success.
A center of excellence and governance framework will define how to manage and monitor analytics models in production. As data sources and the business change, standards and policies must be in place to ensure that the data feeding the models is accurate and complete, and that the models generate reliable results.
Also, consider an agile development methodology to deliver quick wins, fail and learn fast, iterate on enhancements, and reevaluate priorities over time.
2. Building data pipelines
Analytic models depend on the continuous ingestion of large volumes of data. To achieve this, data engineers build data pipelines, including data ingestion, integration, and transformation capabilities, to move data safely into production environments. This typically involves robust data quality and governance processes, mature standards for data curation, and the development of conceptual and logical data models.
But first, data scientists need to explore and assess raw data to determine where it has value for the business. To develop and test analytic models, they simply need a snapshot of the data; they don’t need continuous data feeds with all the functionality of a fully operational model.
In the process of this exploration, the data scientists will do their own data profiling and quality checks and will design code to do the transformation and loading they need for analysis. This isn’t the complete ETL process that data engineering would do, but it’s enough for them to learn and draw conclusions about the data.
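The kind of lightweight profiling described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the snapshot rows, the column names (customer_id, region, monthly_spend), and the profile_column helper are all hypothetical, and a real project would likely use a dedicated profiling or data quality tool.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column of a snapshot: size, missing values, distinct values."""
    values = [row.get(column) for row in rows]
    # Treat both None and empty strings as missing (an assumption for this sketch).
    present = [v for v in values if v is not None and v != ""]
    missing = len(values) - len(present)
    return {
        "column": column,
        "rows": len(values),
        "missing": missing,
        "missing_rate": missing / len(values) if values else 0.0,
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(1),
    }


# Hypothetical snapshot pulled for exploration, not a continuous feed.
snapshot = [
    {"customer_id": "c1", "region": "east", "monthly_spend": "120.50"},
    {"customer_id": "c2", "region": "", "monthly_spend": "80.00"},
    {"customer_id": "c3", "region": "east", "monthly_spend": None},
    {"customer_id": "c4", "region": "west", "monthly_spend": "45.25"},
]

region_profile = profile_column(snapshot, "region")
spend_profile = profile_column(snapshot, "monthly_spend")
print(region_profile)
print(spend_profile)
```

Code like this is what data scientists would later hand to data engineering, where it can be hardened into production-quality checks.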
Once the data scientists understand which data is valuable, they should share their profiling, quality, and transformation code so the data engineering team can evolve it into production-quality code and build pipelines that ingest data as a continuous flow rather than as a one-time snapshot. This should trigger an ongoing dialog, with both teams sharing new developments with each sprint.
3. Model development, improvement, and measurement
Next comes the ability to develop and refine models and measurements, including iteratively developing, refining, and improving the models to reflect business changes. Companies need robust, automated feedback mechanisms to measure performance, along with tooling and processes for business-as-usual model management.
Best practices include providing advanced training for data science resources, staffing experienced data engineers and report designers, automating model selection, and reconciling standards.
The model pipeline requires code evolution similar to that of data pipelines. Data scientists start by writing code designed to find the best model; that code must be based (initially) upon a snapshot of the data. Scoring engines should be placed into production and designed to score the best (champion and challenger) models.
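One way to picture a scoring engine that handles champion and challenger models is a router that sends a small share of requests to the challenger and records which model produced each score. The sketch below is only an assumption about how this might look: both model functions, their weights, the feature names (recency, frequency), and the 10% challenger share are hypothetical.

```python
import random


def champion_model(features):
    # Hypothetical current production model.
    return 0.3 * features["recency"] + 0.7 * features["frequency"]


def challenger_model(features):
    # Hypothetical candidate model being evaluated against the champion.
    return 0.5 * features["recency"] + 0.5 * features["frequency"]


def score(features, challenger_share=0.1, rng=random):
    """Route a share of requests to the challenger and tag each result with its model."""
    if rng.random() < challenger_share:
        return {"model": "challenger", "score": challenger_model(features)}
    return {"model": "champion", "score": champion_model(features)}


# Seeded generator so the illustration is repeatable.
rng = random.Random(42)
example = {"recency": 0.2, "frequency": 0.8}
results = [score(example, challenger_share=0.1, rng=rng) for _ in range(1000)]
challenger_count = sum(r["model"] == "challenger" for r in results)
print(f"challenger scored {challenger_count} of {len(results)} requests")
```

Tagging every score with the model that produced it is what makes a later side-by-side comparison of champion and challenger possible.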
The production data science team monitors the performance of all models, assessing population drift and model degradation as well as user adoption. The team should also plan, schedule, and run randomized control tests to determine model lift so it can report accurately on model performance and return on investment.
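Two of the measurements mentioned above can be illustrated with simple arithmetic. The sketch below uses the population stability index (PSI), one commonly used drift measure, and a relative lift calculation from a randomized test. The article does not prescribe either formula; the bin counts and conversion numbers are made up for illustration.

```python
import math


def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """PSI over matching bins: sum of (actual - expected) * ln(actual / expected)."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor the proportions at eps to avoid log(0) for empty bins.
        e_pct = max(expected / expected_total, eps)
        a_pct = max(actual / actual_total, eps)
        psi += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return psi


def relative_lift(treated_conversions, treated_total, control_conversions, control_total):
    """Relative improvement of the treated group's rate over the control group's rate."""
    treated_rate = treated_conversions / treated_total
    control_rate = control_conversions / control_total
    return (treated_rate - control_rate) / control_rate


# Hypothetical score distributions, binned the same way at training time and today.
baseline_bins = [100, 200, 400, 200, 100]
shifted_bins = [50, 150, 400, 250, 150]

psi_same = population_stability_index(baseline_bins, baseline_bins)
psi_shifted = population_stability_index(baseline_bins, shifted_bins)

# Hypothetical randomized control test: model-driven group versus holdout group.
lift = relative_lift(60, 1000, 50, 1000)

print(f"PSI (no change): {psi_same:.4f}")
print(f"PSI (shifted):   {psi_shifted:.4f}")
print(f"Relative lift:   {lift:.1%}")
```

An identical distribution yields a PSI of zero, and the value grows as the current population moves away from the baseline. What threshold should trigger a model review is a judgment each team has to make for its own context.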
4. Insights to action
Operationalization of processes involves rapid prototyping of models with engaged business partners, integration of data into operational systems, and the ability to take prescriptive actions with minimal human intervention.
Best practices include maintaining a strong security posture and applying it to new domains, establishing infrastructure as code and platform as a service where appropriate, and measuring outcomes to make needed improvements.
5. Adoption and measurement
The final capability involves creating metrics to measure the business value of analytics. These metrics should relate to financial results, key performance indicators, and other measures of success for the organization.
Best practices include implementing and socializing a RACI (responsible, accountable, consulted, informed) matrix, building and improving a program for each business unit, increasing the number of analytic translators, and deploying field tests to fail fast and adapt.
Success begets more success
With these capabilities, organizations can increase agility; advance into DevOps, DataOps, and MLOps; create business cases that demonstrate value; enforce governance; create role definitions and assignments; and complete portfolio prioritization. They will have achieved the ability to operationalize advanced analytics applications in support of the organization’s objectives.
This article was originally published as Five Steps To Operationalizing Advanced Analytics Models on March 8, 2022, on Forbes.