MLOps Buyers Guide Executive Summary

Written by David Menninger | Jul 31, 2024 1:45:15 PM

Executive Summary

MLOps

Artificial intelligence (AI) has continued to evolve for many decades and several factors have combined to dramatically increase the awareness of and investment in technologies to support its purpose. Since its inception, AI has provided value wherever it has been applied: helping to prevent credit card fraud, segmenting customers for more effective marketing campaigns, making recommendations for the best next action, predicting maintenance routines to prevent machine failures and many other use cases. Recently, the advent of generative AI (GenAI) has brought heightened attention to the AI market. ISG Buyer Behavior research shows that nearly one-half (49%) of enterprise AI budgets are being allocated to GenAI investments. This heightened awareness of AI has brought a focus on the broader issues of developing, deploying and maintaining AI applications in enterprise production environments.

Ventana Research defines AI Platforms as those platforms that include the ability to prepare, deploy and maintain AI models. Preparing a model requires accessing and preparing data used in the modeling process. Training a model requires tooling for data scientists to explore, compare and optimize models developed using different algorithms and parameters. Deploying and maintaining models require governance and monitoring frameworks to ensure that models comply with both internal policies and regulatory requirements. And, when models are out of compliance, the platforms should provide mechanisms to retrain and redeploy new models that are compliant.

AI Platforms have existed for decades, but several challenges have prevented widespread adoption. Among the challenges were the costs and technical difficulties in collecting and processing the volumes of data needed to produce accurate models. For example, the data from many transactions must be collected and analyzed in order to predict fraudulent transactions. But since the overwhelming majority of transactions are legitimate, many transactions must be analyzed so that enough observations of fraudulent transactions can be used by the models to make accurate predictions. As scale-out computing and object storage have driven down costs, it is now much more economically feasible to collect and process all this data.

Another challenge has been the lack of skills needed to create and deploy AI models. Our research shows that AI positions are the most challenging technical roles to fill and retain, and that the lack of expertise is the most significant challenge enterprises face in adopting AI. GenAI has brought new attention to the domain of AI and has the promise to make AI much more accessible and more easily utilized in a broader portion of the workforce and the general public. In fact, 85% of enterprises believe that investment in GenAI technology in the next 24 months is important or critical. And the ISG Buyer Market Lens AI research finds enterprises are experiencing positive outcomes from their AI investments. Nearly 9 in 10 (88%) report positive outcomes when using AI for search that proactively answers questions. A similar proportion (87%) report positive outcomes in the interpretation of tabular data.

While the rise of GenAI has been meteoric, enterprises still plan to invest one-half of their AI spend on predictive or traditional AI. The most common tasks where GenAI is being applied include natural language processing such as chatbots/copilots/assistants, extracting information from and summarizing documents, and assisting with software development tasks such as code generation and application migration. GenAI is expected to have a bigger impact in these areas than predictive AI.

However, in areas such as credit risk, fraud detection, algorithmic trading and even customer acquisition, predictive AI is expected to have a bigger impact. Part of the reason, as noted previously, is that predictive AI is hard. Fine tuning models requires knowledge of not just the algorithms, but also their various parameters and the appropriate data preparation techniques. Data scientists must also understand biases in the data and issues in the training process such as overfitting or poor sampling.

Developing and deploying AI models is a multistep process beginning with collecting and curating the data that will be used to create the model. Once a model is developed and tuned using the training data, it needs to be tested to determine its accuracy and performance. Then the model needs to be applied in an operational application or process.

For example, in a customer service application, a predictive AI model might make a recommendation for how a representative should respond to the customer’s situation. Similarly, a self-service customer application might use a large language model (LLM) to provide a chatbot or guided experience to deliver those recommendations.

The process does not conclude once a model is deployed. Enterprises need to monitor and maintain the models ensuring they continue to be accurate and relevant as market conditions change. Realistically, it is only a matter of time before a model’s accuracy has declined to the point where it can be replaced by another more accurate model. The new model may simply be the result of retraining the old model on new data or it may be the result of using different modeling techniques. In either case, the models must be monitored constantly and updated as and when necessary. In the case of third-party LLMs, the providers are constantly updating and improving their models, so enterprises need to be prepared to deploy the newer models as well.
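As a minimal sketch of the monitoring step described above, the following compares a model's recent predictions with the outcomes later observed and flags the model for retraining or replacement when accuracy drops below a threshold. The function name, the result type and the 0.90 threshold are illustrative assumptions, not part of any specific MLOps product.

```python
from dataclasses import dataclass


@dataclass
class MonitorResult:
    accuracy: float
    needs_retraining: bool


def check_model(predictions, actuals, min_accuracy=0.90):
    """Compare recent predictions with observed outcomes.

    min_accuracy is a hypothetical threshold; each enterprise would
    choose its own based on the use case and its policies.
    """
    if not predictions or len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    accuracy = correct / len(predictions)
    return MonitorResult(accuracy, accuracy < min_accuracy)


# Hypothetical batch of recent fraud predictions (1 = fraud) and actual outcomes.
recent = check_model([1, 0, 0, 1, 0], [1, 0, 1, 1, 1])
if recent.needs_retraining:
    print(f"accuracy {recent.accuracy:.2f} is below threshold; schedule retraining")
```

In practice such a check would run on a schedule against fresh labeled data, and a failing result would trigger the retraining or model replacement discussed above.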

Data is flowing throughout these processes. Considerable time and effort are invested in preparing data to feed into predictive models. Feature engineering requires exploration and experimentation with the data. Once the features are identified, robust repeatable processes are needed to create data pipelines that feed these features into the models. In the case of GenAI, data—often in the form of documents—feeds custom LLM development or fine tuning. Additional data flows through the prompting process to direct LLMs to provide more specific and more accurate responses. Enterprises must govern these data flows to ensure compliance with internal policies and regulatory requirements. The regulatory environment is emerging and evolving with the European Union passing the AI Act, the US issuing an Executive Order on responsible development of AI and dozens of US states either enacting or proposing AI regulations.

The processes of moving AI to production, keeping models up to date and including governance throughout are collectively referred to as machine learning operations (MLOps) or in the case of LLMs, LLMOps. Software providers slowly recognized that the lack of MLOps/LLMOps tooling was inhibiting successful use of AI. Enterprises were left to their own devices to create scripts and cobble together solutions to address these issues. Fortunately, AI software providers have expanded their platforms to address many of these capabilities and specialist providers have emerged with a focus on MLOps/LLMOps. In fact, we assert that by 2026, 4 in 5 enterprises will use MLOps and LLMOps tools to improve the quality and governance of their AI/ML efforts.

All these capabilities are important to maximize success of AI investments. As a result, our evaluation of AI software providers considers each of them. Our separate AI Platform Buyers Guide includes data preparation to ensure quality data is used to train and test models. We also consider the range of modeling algorithms available, as well as the tuning, optimization and testing options supported. Similarly, the range of AutoML functionality is included along with data scientist tooling to understand how well the platform boosts productivity. Today, no platform would be complete without generative AI. And finally, MLOps / LLMOps must be considered to evaluate how effectively the platform can be used to put models into production and to maintain those models over time.

ISG advises enterprises that a methodical approach is essential to maximize competitiveness. To improve the performance of your enterprise’s people, process, information and technology components, it is critical to select the right provider and product. Many need to improve in this regard. Our research analysis placed fewer than 1 in 5 enterprises (18%) at the highest Innovative level of performance in their use of analytics and data. However, caution is appropriate here: technology improvements alone are not enough to improve the use of data in an enterprise. Doing so requires applying a balanced set of upgrades that also include efforts to improve people skills and processes. The research finds fewer than 1 in 6 enterprises (15%) at the highest Innovative level of performance for process in relation to analytics and data, and fewer than 1 in 8 (12%) at the Innovative level of performance for people.

For software providers that are part of this Buyers Guide, only those with specific MLOps support were considered for inclusion in the evaluation. The specific MLOps Buyers Guide uses portions of the AI Platform capability framework and includes the evaluation of specific AI/ML modeling, developer and data scientist tooling, MLOps and advanced model optimization. To be included in the Buyers Guide the software provider must include specific capabilities for deployment, monitoring, and governance of models and developer tooling.

This Buyers Guide research evaluates the following software providers that offer products that address key elements of MLOps: Alibaba Cloud, Altair, Alteryx, Amazon Web Services (AWS), Anaconda, C3 AI, Cloudera, Databricks, Dataiku, DataRobot, Domino Data Lab, Google, H2O.ai, IBM, Microsoft, NVIDIA, Oracle, Palantir, Red Hat, SAP, SAS, Snowflake and Teradata.

Buyers Guide Overview

For over two decades, Ventana Research has conducted market research in a spectrum of areas across business applications, tools and technologies. We have designed the Buyers Guide to provide a balanced perspective of software providers and products that is rooted in an understanding of the business requirements in any enterprise. Utilization of our research methodology and decades of experience enables our Buyers Guide to be an effective method to assess and select software providers and products. The findings of this research undertaking contribute to our comprehensive approach to rating software providers in a manner that is based on the assessments completed by an enterprise.

This Ventana Research Buyers Guide: MLOps is the distillation of over a year of market and product research efforts. It is an assessment of how well software providers’ offerings address enterprises’ requirements for MLOps software. The index is structured to support a request for information (RFI) that could be used in the request for proposal (RFP) process by incorporating all criteria needed to evaluate, select, utilize and maintain relationships with software providers. An effective product and customer experience with a provider can ensure the best long-term relationship and value achieved from a resource and financial investment.

In this Buyers Guide, Ventana Research evaluates the software in seven key categories that are weighted to reflect buyers’ needs based on our expertise and research. Five are product-experience related: Adaptability, Capability, Manageability, Reliability, and Usability. In addition, we consider two customer-experience categories: Validation, and Total Cost of Ownership/Return on Investment (TCO/ROI). To assess functionality, one of the components of Capability, we applied the Ventana Research Value Index methodology and blueprint, which links the personas and processes for MLOps to an enterprise’s requirements.

The structure of the research reflects our understanding that the effective evaluation of software providers and products involves far more than just examining product features, potential revenue or customers generated from a provider’s marketing and sales efforts. We believe it is important to take a comprehensive, research-based approach, since making the wrong choice of MLOps technology can raise the total cost of ownership, lower the return on investment and hamper an enterprise’s ability to reach its full performance potential. In addition, this approach can reduce the project’s development and deployment time and eliminate the risk of relying on a short list of software providers that does not represent a best fit for your enterprise.

Ventana Research believes that an objective review of software providers and products is a critical business strategy for the adoption and implementation of MLOps software and applications. An enterprise’s review should include a thorough analysis of both what is possible and what is relevant. We urge enterprises to do a thorough job of evaluating MLOps systems and tools and offer this Buyers Guide as both the results of our in-depth analysis of these providers and as an evaluation methodology.

How To Use This Buyers Guide

Evaluating Software Providers: The Process

We recommend using the Buyers Guide to assess and evaluate new or existing software providers for your enterprise. The market research can be used as an evaluation framework to establish a formal request for information from providers on products and customer experience and will shorten the cycle time when creating an RFI. The steps listed below provide a process that can facilitate best possible outcomes.

1. Define the business case and goals. Define the mission and business case for investment and the expected outcomes from your organizational and technology efforts.
2. Specify the business needs. Defining the business requirements helps identify what specific capabilities are required with respect to people, processes, information and technology.
3. Assess the required roles and responsibilities. Identify the individuals required for success at every level of the organization from executives to front line workers and determine the needs of each.
4. Outline the project’s critical path. What needs to be done, in what order and who will do it? This outline should make clear the prior dependencies at each step of the project plan.
5. Ascertain the technology approach. Determine the business and technology approach that most closely aligns to your organization’s requirements.
6. Establish technology vendor evaluation criteria. Utilize the product experience: Adaptability, Capability, Manageability, Reliability and Usability, and the customer experience in TCO/ROI and Validation.
7. Evaluate and select the technology properly. Weight the categories in the technology evaluation criteria to reflect your organization’s priorities to determine the short list of vendors and products.
8. Establish the business initiative team to start the project. Identify who will lead the project and the members of the team needed to plan and execute it with timelines, priorities and resources.

The Findings

All of the products we evaluated are feature-rich, but not all the capabilities offered by a software provider are equally valuable to all types of workers or support everything needed to manage products on a continuous basis. Moreover, the existence of too many capabilities may be a negative factor for an enterprise if it introduces unnecessary complexity. Nonetheless, you may decide that a larger number of features in the product is a plus, especially if some of them match your enterprise’s established practices or support an initiative that is driving the purchase of new software.

Factors beyond features and functions or software provider assessments may become a deciding factor. For example, an enterprise may face budget constraints such that the TCO evaluation can tip the balance to one provider or another. This is where the Value Index methodology and the appropriate category weighting can be applied to determine the best fit of software providers and products to your specific needs.

Overall Scoring of Software Providers Across Categories

The research finds Oracle atop the list, followed by AWS and Databricks. Companies that place in the top three of a category earn the designation of Leader. Oracle has done so in five of the seven categories; SAP and AWS in four; Alteryx and Databricks in two; and Dataiku, DataRobot, Google, Microsoft and Teradata in one category.

The overall representation of the research below places the rating of the Product Experience and Customer Experience on the x and y axes, respectively, to provide a visual representation and classification of the software providers. Providers with higher aggregate weighted performance across the five Product Experience categories place farther to the right, while the weighted performance in the two Customer Experience categories determines placement on the vertical axis. In short, software providers that place closer to the upper-right on this chart performed better than those closer to the lower-left.

The research places software providers into one of four overall categories: Assurance, Exemplary, Merit or Innovative. This representation classifies providers’ overall weighted performance.

Exemplary: The categorization and placement of software providers in Exemplary (upper right) represent those that performed the best in meeting the overall Product and Customer Experience requirements. The providers rated Exemplary are: Alteryx, AWS, Databricks, Dataiku, Google, IBM, Microsoft, Oracle, SAP and Teradata.

Innovative: The categorization and placement of software providers in Innovative (lower right) represent those that performed the best in meeting the overall Product Experience requirements but did not achieve the highest levels of requirements in Customer Experience. The providers rated Innovative are: DataRobot and Domino Data Lab.

Assurance: The categorization and placement of software providers in Assurance (upper left) represent those that achieved the highest levels in the overall Customer Experience requirements but did not achieve the highest levels of Product Experience. The providers rated Assurance are: NVIDIA and Snowflake.

Merit: The categorization of software providers in Merit (lower left) represents those that did not exceed the median of performance in Customer or Product Experience or surpass the threshold for the other three categories. The providers rated Merit are: Alibaba Cloud, Altair, Anaconda, C3 AI, Cloudera, H2O.ai, Palantir, Red Hat and SAS.
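The four-way classification can be illustrated with a short sketch. It assumes the cut line on each axis is the median score across providers; the text above mentions the median only in the Merit definition, so the actual thresholds used in the research may differ. Provider names and scores are hypothetical.

```python
from statistics import median


def classify(providers):
    """Map each provider to a category from its two aggregate scores.

    providers: dict of name -> (product_score, customer_score).
    Assumption: a provider is "high" on an axis when it exceeds
    the median score on that axis.
    """
    product_median = median(p for p, _ in providers.values())
    customer_median = median(c for _, c in providers.values())
    result = {}
    for name, (product, customer) in providers.items():
        high_product = product > product_median
        high_customer = customer > customer_median
        if high_product and high_customer:
            result[name] = "Exemplary"   # upper right
        elif high_product:
            result[name] = "Innovative"  # lower right
        elif high_customer:
            result[name] = "Assurance"   # upper left
        else:
            result[name] = "Merit"       # lower left
    return result


# Hypothetical scores, not taken from the research.
scores = {
    "Provider A": (80.0, 85.0),
    "Provider B": (78.0, 60.0),
    "Provider C": (55.0, 82.0),
    "Provider D": (50.0, 58.0),
}
print(classify(scores))
```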

We warn that close provider placement proximity should not be taken to imply that the packages evaluated are functionally identical or equally well suited for use by every enterprise or for a specific process. Although there is a high degree of commonality in how enterprises handle MLOps, there are many idiosyncrasies and differences in how they do these functions that can make one software provider’s offering a better fit than another’s for a particular enterprise’s needs.

We advise enterprises to assess and evaluate software providers based on organizational requirements and use this research as a supplement to internal evaluation of a provider and products.

Product Experience

The process of researching products to address an enterprise’s needs should be comprehensive. Our Value Index methodology examines Product Experience and how it aligns with an enterprise’s life cycle of onboarding, configuration, operations, usage and maintenance. Too often, software providers are not evaluated for the entirety of the product; instead, they are evaluated on market execution and vision of the future, criteria that are flawed because they represent how the provider operates rather than an enterprise’s requirements. As more software providers orient to a complete product experience, evaluations will be more robust.

The research results in Product Experience are ranked using the specific underlying weighted category performance to 80% or four-fifths of the overall rating. Importance was placed on the categories as follows: Capability (40%), Adaptability (10%), Manageability (10%), Reliability (10%) and Usability (10%). This weighting impacted the resulting overall ratings in this research. Oracle, AWS and Dataiku were designated Product Experience Leaders. While not a Leader, Google was also found to meet a broad range of enterprise ML operation requirements.

Many enterprises will only evaluate capabilities for data scientists or data engineers, but the research identified the collection of Adaptability, Manageability, Reliability and Usability as equally important to Capability.

Customer Experience

The customer relationship with a software provider is essential to the actual success of the products and technology. The advancement of the Customer Experience and the entire life cycle an enterprise has with its software provider is critical for ensuring satisfaction in working with that provider. Technology providers that have chief customer officers are more likely to have greater investments in the customer relationship and focus more on their customers’ success. These leaders also need to take responsibility for ensuring this commitment is made abundantly clear on the website and in the buying process and customer journey.

The research results in Customer Experience are ranked using the specific underlying weighted category performance of 20% or one-fifth as it relates to the framework of commitment and value to the software provider-customer relationship. The two evaluation categories are Validation (10%) and TCO/ROI (10%), which are weighted to represent their importance to the overall research.
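To make the weighting concrete, here is a minimal sketch that combines per-category scores into one overall rating using the seven weights stated in this guide (Capability 40%, the other six categories 10% each). The 0-100 score scale and the example scores are assumptions for illustration; the guide does not publish its scoring formula in this summary.

```python
# Category weights as stated in the guide; they sum to 100%.
WEIGHTS = {
    "Capability": 0.40,
    "Adaptability": 0.10,
    "Manageability": 0.10,
    "Reliability": 0.10,
    "Usability": 0.10,
    "Validation": 0.10,
    "TCO/ROI": 0.10,
}


def overall_rating(category_scores, weights=WEIGHTS):
    """Weighted average of category scores.

    Dividing by the total weight lets an enterprise pass its own
    weights without first normalizing them to sum to 1.
    """
    missing = set(weights) - set(category_scores)
    if missing:
        raise KeyError(f"missing category scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted = sum(category_scores[c] * w for c, w in weights.items())
    return weighted / total_weight


# Hypothetical scores on an assumed 0-100 scale.
example = {
    "Capability": 80.0,
    "Adaptability": 70.0,
    "Manageability": 70.0,
    "Reliability": 70.0,
    "Usability": 70.0,
    "Validation": 70.0,
    "TCO/ROI": 70.0,
}
print(round(overall_rating(example), 1))
```

An enterprise with different priorities, for example tighter budget constraints, could pass its own `weights` dictionary to see how the ranking of its short list changes.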

The software providers that rated the highest overall in the aggregated and weighted Customer Experience categories are Databricks, Microsoft and SAP. These category leaders best communicate commitment and dedication to customer needs.

A few of the software providers we evaluated have sufficient information available through the website and presentations including customer case studies to promote success, but others lack depth in articulating a commitment to customer experience and an enterprise’s MLOps journey. As the commitment to a software provider is a continuous investment, the importance of supporting customer experience in a holistic evaluation should be included and not underestimated.

Appendix: Software Provider Inclusion

For inclusion in the Ventana Research MLOps Buyers Guide for 2024, a software provider must be in good standing financially and ethically, have at least 20 million in annual or projected revenue verified using independent sources, sell products and provide support on at least two continents, and have at least 25 customers. The principal source of the relevant business unit’s revenue must be software-related and there must have been at least one major software release in the last 18 months. The provider must have a product that delivers the following functional areas at a minimum, which are mapped into Buyers Guide capability criteria:     

Model deployment
Model monitoring
Model governance
Developer tooling
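The inclusion rules above can be expressed as a simple checklist. The sketch below is illustrative only: the `Provider` fields and the example values are hypothetical, and the revenue threshold is recorded as 20 million with no currency because the text above does not state one.

```python
from dataclasses import dataclass

REQUIRED_AREAS = frozenset(
    {"Model deployment", "Model monitoring", "Model governance", "Developer tooling"}
)


@dataclass(frozen=True)
class Provider:
    name: str
    good_standing: bool                 # financially and ethically
    revenue_millions: float             # annual or projected; currency not stated in the source
    continents: int                     # continents with sales and support
    customers: int
    software_is_principal_revenue: bool
    months_since_major_release: int
    functional_areas: frozenset


def unmet_criteria(p):
    """Return the labels of the inclusion criteria the provider fails."""
    checks = {
        "good standing": p.good_standing,
        "revenue": p.revenue_millions >= 20,
        "continents": p.continents >= 2,
        "customers": p.customers >= 25,
        "software revenue": p.software_is_principal_revenue,
        "recent release": p.months_since_major_release <= 18,
        "functional areas": REQUIRED_AREAS <= p.functional_areas,
    }
    return [label for label, passed in checks.items() if not passed]


# Hypothetical provider that has the capabilities but not the scale.
candidate = Provider(
    name="Example Provider",
    good_standing=True,
    revenue_millions=10.0,
    continents=1,
    customers=20,
    software_is_principal_revenue=True,
    months_since_major_release=6,
    functional_areas=REQUIRED_AREAS,
)
print(unmet_criteria(candidate))
```

A provider is eligible only when the returned list is empty.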

The research is designed to be independent of the specifics of software provider packaging and pricing. To represent the real-world environment in which businesses operate, we include providers that offer suites or packages of products that may include relevant individual modules or applications. If a software provider is actively marketing, selling and developing a product for the general market and it is reflected on the provider’s website that the product is within the scope of the research, that provider is automatically evaluated for inclusion.

All software providers that offer relevant MLOps products and meet the inclusion requirements were invited to participate in the evaluation process at no cost to them.

Software providers that meet our inclusion criteria but did not completely participate in our Buyers Guide were assessed solely on publicly available information. As this could have a significant impact on classification and ratings, we recommend additional scrutiny when evaluating those providers.

Products Evaluated

Provider	Product Names	Version	Release Month/Year
Alibaba Cloud	Platform for AI (PAI)	v. 2023	December 2023
Altair	AI Studio	v. 2024.0	April 2024
Alteryx	Designer; Alteryx Cloud Platform	v. 2024.1; v. 2024	May 2024; May 2024
AWS	SageMaker	v. 2024	April 2024
Anaconda	Data Science & AI Workbench	v. 5.7.1	April 2024
C3 AI	C3 AI Platform	v. 2024	March 2024
Cloudera	CDP Private Cloud Data Services; CDP Public Cloud Data Services	v. 1.5.4; v. 2024	May 2024; May 2024
Databricks	Mosaic AI	v. 2024	April 2024
Dataiku	Dataiku	v. 2024	March 2024
DataRobot	DataRobot	v. 2024	April 2024
Domino Data Lab	Domino Enterprise AI Platform	v. 5.10.0	March 2024
Google	Vertex AI	v. 2024	April 2024
H2O.ai	H2O AI Cloud; H2O	v. 23.10.2; v. 3.46.01	April 2024
IBM	watsonx.ai	v. 4.8.5	April 2024
Microsoft	Azure ML; Azure OpenAI Service	v. 2; v. 2024	February 2024; May 2024
NVIDIA	NVIDIA AI Enterprise	v. 5.0	May 2024
Oracle	Oracle AI	v. 2024.2	May 2024
Palantir	Palantir AIP	v. 2024	May 2024
Red Hat	Red Hat OpenShift AI; Red Hat OpenShift AI Cloud Service 1	v. 2.9; v. 2024	May 2024; May 2024
SAP	SAP AI Core	v. 2024-4-20	April 2024
SAS	SAS Viya	v. 2024	April 2024
Snowflake	Snowflake ML; Snowflake Cortex	v. 1.5.1; v. 2024	May 2024; May 2024
Teradata	Teradata VantageCloud Lake and ClearScape Analytics	v. 2024	May 2024

Providers of Promise

We did not include software providers that, as a result of our research and analysis, did not satisfy the criteria for inclusion in this Buyers Guide. These are listed below as “Providers of Promise.”

Provider	Product	Revenue	Capability	International	Customers
Arthur	Arthur Scope	No	Yes	No	No
Fiddler	Fiddler AI Observability	No	Yes	No	No
