AI and Data Platforms Buyers Guide Executive Summary

Written by David Menninger | Sep 6, 2024 10:00:00 AM

Executive Summary

AI and Data Platforms

Artificial intelligence (AI) is front and center due to a combination of factors that have dramatically increased awareness of and investment in technologies to support its purpose. Since its inception, AI has provided value no matter how or where it has been applied: helping to prevent credit card fraud, segmenting customers for more effective marketing campaigns, making recommendations for the best next action, predicting maintenance routines to prevent machine failures and many other use cases. However, AI is dependent on data. Large amounts of high-quality data are necessary to feed and train models.

Data platforms provide an environment for organizing and managing the storage, processing, analysis and presentation of data across an enterprise, and play a critical role in business operations, supporting and enabling applications used to run and evaluate the business. Today, many business operations and analyses involve a combination of data and AI. As a result, most leading data platform providers have expanded their product capabilities to include AI. There are many synergies between AI platforms and data platforms. The AI model development process requires data preparation and feature engineering that identifies and organizes data in a way that will produce the most accurate models. The volumes of data necessary for accurate AI models place significant demands on data platforms. The costs of these systems can be significant, and any inefficiencies in the process can exacerbate the costs.

To contain these costs, it is important to coordinate AI and data efforts. Enterprises cannot afford to waste time or resources when applying AI in their operations. The risks of being left behind and put at a competitive disadvantage are too great. Software platform providers have recognized the opportunity to help enterprises address these needs, and many are offering platforms that combine AI and data capabilities. You can debate whether it is AI first or data first, but in reality, it is both.

ISG defines AI Platforms as those platforms that include the ability to prepare, deploy and maintain AI models. Preparing a model requires accessing and preparing data used in the modeling process. Training a model requires tooling for data scientists to explore, compare and optimize models developed using different algorithms and parameters. Deploying and maintaining models requires governance and monitoring frameworks to ensure that models comply with both internal policies and regulatory requirements. When models are out of compliance, the platforms should provide mechanisms to retrain and redeploy new models that are compliant.

At the heart of any data platform is the storage and management of a collection of related data. This is typically provided by a database management system, more commonly referred to simply as a database, that provides the data persistence, data management, data processing and data query functionality that enables access to, and interaction with, the stored data. Adoption of cloud computing environments has also led to the widespread use of object stores as a data persistence layer, with query engines such as Apache Spark, Presto and Trino adding the data management, data processing and data query functionality required of a data platform. These object stores also provide storage for unstructured data, which is critical to generative AI applications.

Developing and deploying AI models is a multistep process involving data, beginning with collecting and curating the data that will be used to create the model. Once a model is developed and tuned using the training data, it needs to be tested to determine its accuracy and performance. Then the model needs to be applied in an operational application or process. For example, in a customer service application, a predictive AI model might make a recommendation for how a representative should respond to the customer’s situation. Similarly, a self-service customer application might use a large language model (LLM) to provide a chatbot or guided experience to deliver those recommendations.
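The multistep process described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the customer features, the labels and the nearest-centroid model are stand-ins chosen only to show the sequence of curating data, training, testing on held-out data and then applying the model in an operational setting.

```python
from statistics import mean

# Hypothetical curated data: (days_since_last_purchase, open_tickets) -> label
DATA = [
    ((2, 0), "upsell"), ((5, 1), "upsell"), ((3, 0), "upsell"), ((4, 1), "upsell"),
    ((40, 4), "retain"), ((55, 3), "retain"), ((48, 5), "retain"), ((60, 4), "retain"),
]

def split(rows, test_every=4):
    """Hold out every `test_every`-th row for testing; train on the rest."""
    train = [r for i, r in enumerate(rows) if (i + 1) % test_every != 0]
    test = [r for i, r in enumerate(rows) if (i + 1) % test_every == 0]
    return train, test

def fit(train):
    """Nearest-centroid model: one mean feature vector per label."""
    labels = {label for _, label in train}
    return {
        label: tuple(mean(x[d] for x, y in train if y == label) for d in range(2))
        for label in labels
    }

def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))

def accuracy(model, test):
    """Fraction of held-out rows the model labels correctly."""
    return sum(predict(model, x) == y for x, y in test) / len(test)

train, test = split(DATA)
model = fit(train)
print("held-out accuracy:", accuracy(model, test))
# Operational use: recommend an action for a customer the model has not seen
print("recommendation:", predict(model, (50, 4)))
```

In practice the training step would run on a dedicated AI platform and the final call would sit inside the customer service application; the sketch only shows how the stages hand off to one another.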

The increasing importance of intelligent operational applications driven by AI is blurring the lines that have traditionally divided the requirements for AI platforms and data platforms. Consumers are increasingly engaged with data-driven services that are differentiated by personalization and contextually relevant recommendations. Worker-facing applications are following suit, targeting users based on their roles and responsibilities. The shift to more agile business processes requires machine learning (ML) for more responsive data platforms and applications.

The need for real-time interactivity driven by AI has significant implications for the data platform functionality required to support these applications. While there have always been general-purpose databases that could be used for both analytic and operational workloads, traditional architectures have involved the extraction, transformation and loading of data from the operational data platform into an external analytic or AI platform. This enables the operational and analytic workloads to run concurrently without adversely impacting each other, protecting the performance of both.

Over time, data platforms have evolved differentiated architectural approaches designed to improve workload management and isolation of potentially conflicting workloads. Intelligent applications, while operational in nature, rely on real-time analytic processing to deliver functionality, including contextually relevant recommendations, predictions and forecasting driven by ML and generative AI (GenAI). While data-driven companies continue to use separate data and AI platforms to train models offline, the need for real-time online predictions and recommendations requires that operational data platforms perform ML inferencing.

The popularization of GenAI has had a significant impact on the requirements for data platforms in the last 18 months, particularly in relation to support for storing and processing vector embeddings. These are multi-dimensional mathematical representations of features or attributes of raw data that are used to support GenAI-based natural language processing (NLP) and recommendation systems. Vector search can also improve accuracy and trust with GenAI via retrieval-augmented generation, which is the process of retrieving vector embeddings that represent factually accurate and up-to-date information from a database and combining it with text automatically generated by an LLM. We assert that through 2027, the development of intelligent applications providing personalized experiences driven by GenAI will increase demand for data platforms capable of supporting hybrid operational and analytic processing.
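As a rough illustration of the vector search step in retrieval-augmented generation, the Python sketch below ranks stored texts by cosine similarity to a query embedding and places the best match into a prompt. The three-dimensional embeddings, the sample texts and the function names are all hypothetical; in a real system the embeddings would come from an embedding model, the search would be performed by the data platform and the prompt would be sent to an LLM, none of which is shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical store: each text is paired with a toy, precomputed embedding
DOCUMENTS = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
    ("Premium support is available 24/7.", [0.1, 0.9, 0.1]),
    ("Orders ship from the nearest warehouse.", [0.0, 0.2, 0.9]),
]

def retrieve(query_embedding, k=1):
    """Return the k texts whose embeddings are most similar to the query."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: cosine_similarity(query_embedding, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_embedding, k=1):
    """Combine the retrieved context with the question for an LLM to answer."""
    context = "\n".join(retrieve(query_embedding, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How long do refunds take?", [0.8, 0.2, 0.1]))
```

Grounding the prompt in retrieved, up-to-date text is what lets the generated answer reflect information the LLM was never trained on.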

The overall AI and Data Platform Buyers Guide includes evaluation of platforms that provide both sets of capabilities. To be considered for inclusion in this Buyers Guide, a product must be marketed as a general-purpose data platform, database, database management system, data warehouse, data lake or data lakehouse. The primary use case for the product should be to support worker- and customer-facing operational applications and/or analytics workloads such as business intelligence or data science. The product should provide the following functional areas at a minimum: data persistence, data management, data processing and data query; database administrator functionality; developer functionality; data engineering functionality; and data architect functionality. It must also support the following AI-related capabilities: data preparation, AI/ML modeling, AutoML, GenAI, developer and data scientist tooling, MLOps/LLMOps, model deployment, model tuning and optimization.

The Buyers Guide for AI and Data Platforms is the result of our evaluation of the following software providers that offer products that address key elements of both AI and data platforms to support a full range of AI and data workloads: Alibaba Cloud, AWS, Cloudera, Databricks, Google Cloud, IBM, Microsoft, Oracle, Salesforce, SAP, Snowflake and Teradata.

Buyers Guide Overview

For over two decades, ISG Research has conducted market research in a spectrum of areas across business applications, tools and technologies. We have designed the Buyers Guide to provide a balanced perspective of software providers and products that is rooted in an understanding of the business requirements in any enterprise. Utilization of our research methodology and decades of experience enables our Buyers Guide to be an effective method to assess and select software providers and products. The findings of this research undertaking contribute to our comprehensive approach to rating software providers in a manner that is based on the assessments completed by an enterprise.

This ISG Research Buyers Guide: AI and Data Platforms is the distillation of over a year of market and product research efforts. It is an assessment of how well software providers’ offerings address enterprises’ requirements for AI and Data Platforms software. The index is structured to support a request for information (RFI) that could be used in the request for proposal (RFP) process by incorporating all criteria needed to evaluate, select, utilize and maintain relationships with software providers. An effective product and customer experience with a provider can ensure the best long-term relationship and value achieved from a resource and financial investment.

In this Buyers Guide, ISG Research evaluates the software in seven key categories that are weighted to reflect buyers’ needs based on our expertise and research. Five are product-experience related: Adaptability, Capability, Manageability, Reliability, and Usability. In addition, we consider two customer-experience categories: Validation, and Total Cost of Ownership/Return on Investment (TCO/ROI). To assess functionality, one of the components of Capability, we applied the ISG Research Value Index methodology and blueprint, which links the personas and processes for AI and Data Platforms to an enterprise’s requirements.

The structure of the research reflects our understanding that the effective evaluation of software providers and products involves far more than just examining product features, potential revenue or customers generated from a provider’s marketing and sales efforts. We believe it is important to take a comprehensive, research-based approach, since making the wrong choice of AI and Data Platforms technology can raise the total cost of ownership, lower the return on investment and hamper an enterprise’s ability to reach its full performance potential. In addition, this approach can reduce the project’s development and deployment time and eliminate the risk of relying on a short list of software providers that does not represent a best fit for your enterprise.

ISG Research believes that an objective review of software providers and products is a critical business strategy for the adoption and implementation of AI and Data Platforms software and applications. An enterprise’s review should include a thorough analysis of both what is possible and what is relevant. We urge enterprises to do a thorough job of evaluating AI and Data Platforms systems and tools and offer this Buyers Guide as both the results of our in-depth analysis of these providers and as an evaluation methodology.

How To Use This Buyers Guide

Evaluating Software Providers: The Process

We recommend using the Buyers Guide to assess and evaluate new or existing software providers for your enterprise. The market research can be used as an evaluation framework to establish a formal request for information from providers on products and customer experience and will shorten the cycle time when creating an RFI. The steps listed below provide a process that can facilitate best possible outcomes.

1. Define the business case and goals.
   Define the mission and business case for investment and the expected outcomes from your organizational and technology efforts.
2. Specify the business needs.
   Defining the business requirements helps identify what specific capabilities are required with respect to people, processes, information and technology.
3. Assess the required roles and responsibilities.
   Identify the individuals required for success at every level of the organization from executives to front line workers and determine the needs of each.
4. Outline the project’s critical path.
   What needs to be done, in what order and who will do it? This outline should make clear the prior dependencies at each step of the project plan.
5. Ascertain the technology approach.
   Determine the business and technology approach that most closely aligns to your organization’s requirements.
6. Establish technology vendor evaluation criteria.
   Utilize the product experience: Adaptability, Capability, Manageability, Reliability and Usability, and the customer experience in TCO/ROI and Validation.
7. Evaluate and select the technology properly.
   Weight the categories in the technology evaluation criteria to reflect your organization’s priorities to determine the short list of vendors and products.
8. Establish the business initiative team to start the project.
   Identify who will lead the project and the members of the team needed to plan and execute it with timelines, priorities and resources.
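
The weighting step in the process above can be sketched in a few lines of Python. The priority points, vendor names and category scores below are hypothetical placeholders, not findings from this research; the sketch only shows one way to turn an organization's own priorities into normalized weights and a ranked short list.

```python
CATEGORIES = [
    "Adaptability", "Capability", "Manageability", "Reliability",
    "Usability", "Validation", "TCO/ROI",
]

def normalize(priorities):
    """Scale raw priority points so the resulting weights sum to 1.0."""
    total = sum(priorities[c] for c in CATEGORIES)
    return {c: priorities[c] / total for c in CATEGORIES}

def weighted_score(scores, weights):
    """Weighted sum of a vendor's category scores."""
    return sum(scores[c] * weights[c] for c in CATEGORIES)

def short_list(vendors, priorities, top_n=2):
    """Rank vendors by weighted score and keep the top_n names."""
    weights = normalize(priorities)
    ranked = sorted(
        vendors,
        key=lambda name: weighted_score(vendors[name], weights),
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical priorities (points) and vendor scores (0-100)
priorities = {"Adaptability": 1, "Capability": 4, "Manageability": 2,
              "Reliability": 3, "Usability": 2, "Validation": 1, "TCO/ROI": 3}
vendors = {
    "Vendor A": dict(zip(CATEGORIES, [70, 90, 60, 80, 70, 60, 50])),
    "Vendor B": dict(zip(CATEGORIES, [80, 70, 80, 70, 80, 80, 90])),
    "Vendor C": dict(zip(CATEGORIES, [60, 60, 70, 60, 60, 70, 60])),
}

print(short_list(vendors, priorities))
```

Changing the priority points changes the ranking, which is the point of the exercise: the same category scores can produce a different short list for an enterprise with different priorities.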

The Findings

All of the products we evaluated are feature-rich, but not all the capabilities offered by a software provider are equally valuable to all types of workers or support everything needed to manage products on a continuous basis. Moreover, the existence of too many capabilities may be a negative factor for an enterprise if it introduces unnecessary complexity. Nonetheless, you may decide that a larger number of features in the product is a plus, especially if some of them match your enterprise’s established practices or support an initiative that is driving the purchase of new software.

Factors beyond features and functions or software provider assessments may become a deciding factor. For example, an enterprise may face budget constraints such that the TCO evaluation can tip the balance to one provider or another. This is where the Value Index methodology and the appropriate category weighting can be applied to determine the best fit of software providers and products to your specific needs.

Overall Scoring of Software Providers Across Categories

The research finds Oracle atop the list, followed by IBM and Amazon Web Services. Companies that place in the top three of a category earn the designation of Leader. Oracle has done so in five categories, Amazon Web Services and SAP in four, Databricks in three, Teradata in two and Google Cloud, IBM, Microsoft and Salesforce in one category.

The overall representation of the research below places the ratings for Product Experience and Customer Experience on the x and y axes, respectively, to provide a visual representation and classification of the software providers. Providers with higher aggregate weighted performance across the five Product Experience categories place farther to the right, while the weighted performance in the two Customer Experience categories determines placement on the vertical axis. In short, software providers that place closer to the upper-right on this chart performed better than those closer to the lower-left.

The research places software providers into one of four overall categories: Assurance, Exemplary, Merit or Innovative. This representation classifies providers’ overall weighted performance.

Exemplary: The categorization and placement of software providers in Exemplary (upper right) represent those that performed the best in meeting the overall Product and Customer Experience requirements. The providers rated Exemplary are: IBM, Microsoft, Teradata and Oracle.

Innovative: The categorization and placement of software providers in Innovative (lower right) represent those that performed the best in meeting the overall Product Experience requirements but did not achieve the highest levels of requirements in Customer Experience. The providers rated Innovative are: Amazon Web Services and Google Cloud.

Assurance: The categorization and placement of software providers in Assurance (upper left) represent those that achieved the highest levels in the overall Customer Experience requirements but did not achieve the highest levels of Product Experience. The providers rated Assurance are: Databricks and SAP.

Merit: The categorization of software providers in Merit (lower left) represents those that did not exceed the median of performance in Customer or Product Experience or surpass the threshold for the other three categories. The providers rated Merit are: Alibaba Cloud, Cloudera, Salesforce and Snowflake.
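
The four-way classification described above can be approximated with a short Python sketch. It assumes that the dividing line on each axis is the median score across providers, which is an inference from the Merit description rather than a documented threshold, and the provider names and scores are hypothetical.

```python
from statistics import median

def classify(providers):
    """Map name -> category from (product_experience, customer_experience) scores.

    Assumption: the axis thresholds are the medians of the supplied scores.
    """
    px_median = median(px for px, _ in providers.values())
    cx_median = median(cx for _, cx in providers.values())
    labels = {}
    for name, (px, cx) in providers.items():
        right = px > px_median   # stronger Product Experience
        upper = cx > cx_median   # stronger Customer Experience
        if right and upper:
            labels[name] = "Exemplary"
        elif right:
            labels[name] = "Innovative"
        elif upper:
            labels[name] = "Assurance"
        else:
            labels[name] = "Merit"
    return labels

# Hypothetical providers: (Product Experience, Customer Experience)
sample = {"P1": (80, 75), "P2": (78, 60), "P3": (65, 72), "P4": (60, 58)}
print(classify(sample))
```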

We caution that close proximity in provider placement should not be taken to imply that the packages evaluated are functionally identical or equally well suited for use by every enterprise or for a specific process. Although there is a high degree of commonality in how enterprises handle AI and Data Platforms, there are many idiosyncrasies and differences in how they perform these functions that can make one software provider’s offering a better fit than another’s for a particular enterprise’s needs.

We advise enterprises to assess and evaluate software providers based on organizational requirements and use this research as a supplement to internal evaluation of a provider and products.

Product Experience

The process of researching products to address an enterprise’s needs should be comprehensive. Our Value Index methodology examines Product Experience and how it aligns with an enterprise’s life cycle of onboarding, configuration, operations, usage and maintenance. Too often, software providers are not evaluated for the entirety of the product; instead, they are evaluated on market execution and vision of the future, which are flawed since they do not represent an enterprise’s requirements but how the provider operates. As more software providers orient to a complete product experience, evaluations will be more robust.

Product Experience is weighted at 85% of the overall rating, using the specific underlying weighted category performance. Importance was placed on the categories as follows: Usability (10%), Capability (35%), Reliability (15%), Adaptability (10%) and Manageability (15%). This weighting impacted the overall ratings in the research. Oracle, IBM and Amazon Web Services were designated Product Experience Leaders.

Many enterprises will only evaluate capabilities for workers in IT or administration, but the research identified the criticality of Manageability and Reliability (15% weighting each) across a broader set of requirements that are important in AI and Data Platforms.

Customer Experience

A customer’s relationship with a software provider is essential to the actual success of the products and technology. The advancement of the Customer Experience and the entire life cycle an enterprise has with its software provider is critical for ensuring satisfaction in working with that provider. Technology providers that have chief customer officers are more likely to invest more in the customer relationship and focus more on customer success. These leaders also need to take responsibility for ensuring this commitment is made abundantly clear on the website and in the buying process and customer journey.

Customer Experience is weighted at 15%, or approximately one-seventh, of the overall rating, using the specific underlying weighted category performance as it relates to the framework of commitment and value to the software provider-customer relationship. The two evaluation categories are Validation (10%) and TCO/ROI (5%), which are weighted to represent their importance to the overall research.
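
To make the weighting arithmetic concrete, the Python sketch below combines the seven category weights stated in this guide into a Product Experience score, a Customer Experience score and an overall score. The weights come from the text; the category scores and function names are hypothetical, and the exact formula ISG applies is not published here, so a plain weighted average is assumed.

```python
# Category weights in percent, as stated in this Buyers Guide
PRODUCT_WEIGHTS = {"Usability": 10, "Capability": 35, "Reliability": 15,
                   "Adaptability": 10, "Manageability": 15}
CUSTOMER_WEIGHTS = {"Validation": 10, "TCO/ROI": 5}

def axis_score(scores, weights):
    """Weighted average of the category scores that belong to one axis."""
    return sum(scores[c] * w for c, w in weights.items()) / sum(weights.values())

def overall_score(scores):
    """Weighted average across all seven categories (weights total 100)."""
    all_weights = {**PRODUCT_WEIGHTS, **CUSTOMER_WEIGHTS}
    return sum(scores[c] * w for c, w in all_weights.items()) / sum(all_weights.values())

# Hypothetical category scores (0-100) for a single provider
scores = {"Usability": 80, "Capability": 90, "Reliability": 70,
          "Adaptability": 60, "Manageability": 80,
          "Validation": 75, "TCO/ROI": 65}

print("Product Experience:", axis_score(scores, PRODUCT_WEIGHTS))
print("Customer Experience:", round(axis_score(scores, CUSTOMER_WEIGHTS), 2))
print("Overall:", overall_score(scores))
```

Because Capability alone carries more than a third of the total weight, a strong or weak Capability score moves the overall result more than any other single category.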

The software providers that rated highest overall in the aggregated and weighted Customer Experience categories are Databricks, Microsoft and SAP. These category Leaders best communicate commitment and dedication to customer needs.

Most software providers we evaluated have sufficient information available through the website and presentations to represent the provider’s investment in customer experience. These providers have customer case studies to promote success, with enough depth to articulate a commitment to customer experience and an enterprise’s AI and Data Platforms journey. As the commitment to a software provider is a continuous investment, customer experience should be included in a holistic evaluation and its importance should not be underestimated.

Appendix: Software Provider Inclusion

For inclusion in the ISG Research AI and Data Platforms Buyers Guide for 2024, a software provider must be in good standing financially and ethically, have at least $100 million in annual or projected revenue verified using independent sources, sell products and provide support on at least two continents and have at least 50 customers. The principal source of the relevant business unit’s revenue must be software-related, and there must have been at least one major software release in the last 12 months.

The product must be marketed as a data platform, database, database management system, data warehouse, data lake or data lakehouse, and the primary use case for the product should be to support operational or analytical applications. The provider must have a product that provides, at a minimum, the following data platform functional areas, which are mapped into the Buyers Guide capability criteria:   

Core database functionality (data persistence, management, processing and query)
Database administrator functionality
Developer functionality
Data engineer functionality
Data architect functionality

The provider must also have a product that delivers the following AI platform functional areas at a minimum, which are also mapped into the Buyers Guide capability criteria:

Data preparation
AI/ML modeling
Developer and data scientist tooling
Model deployment
Model tuning and optimization

The research is designed to be independent of the specifics of software provider packaging and pricing. To represent the real-world environment in which businesses operate, we include providers that offer suites or packages of products that may include relevant individual modules or applications. If a software provider is actively marketing, selling and developing a product for the general market and it is reflected on the provider’s website that the product is within the scope of the research, that provider is automatically evaluated for inclusion.

All software providers that offer relevant AI and Data Platforms products and meet the inclusion requirements were invited to participate in the evaluation process at no cost.

Software providers that meet our inclusion criteria but did not completely participate in our Buyers Guide were assessed solely on publicly available information. As this could have a significant impact on classification and ratings, we recommend additional scrutiny when evaluating those providers.

Products Evaluated

Provider

Product Names

Version

Release Month/Year

Alibaba Cloud

Platform for AI

Alibaba Cloud MaxCompute

Alibaba Cloud PolarDB for PostgreSQL

2023

2024-04

14.10.19.0

December 2023

April 2024

Amazon Web Services

Sagemaker

Amazon Redshift

Amazon RDS for PostgreSQL

2024

patch 180

16.2

April 2024

February 2024

Cloudera

CDP Private Cloud Data Services
CDP Public Cloud Data Services

Cloudera Data Platform

1.5.4

2024

March 2024

May 2024

March 2024

Databricks

Mosaic AI

Databricks Data Intelligence Platform

2024

April 2024

Google Cloud

Vertex AI

Google BigQuery

Google AlloyDB for PostgreSQL

2024

April 2024

IBM

watsonx.ai

IBM watsonx.data

IBM Db2

4.8.5

1.1.4

11.5.9

April 2024

March 2024

Microsoft

Azure ML

Microsoft Fabric

Microsoft Azure SQL

May 2024

April 2024

February 2024

May 2024

April 2024

Oracle

Oracle AI

Oracle Autonomous Database

2024.2

April 2024

May 2024

April 2024

Salesforce

Einstein 1 Platform

Salesforce Data Cloud

Summer ’24

Summer ’24

May 2024

SAP

SAP AI Core

SAP Datasphere

SAP HANA Cloud

2024-4-20

2024.08

QRC 1/2024

April 2024

March 2024

Snowflake

Snowflake ML
Snowflake Cortex

Snowflake Platform

1.5.1
2024

8.13

May 2024
May 2024

April 2024

Teradata

Teradata VantageCloud Lake and ClearScape Analytics

Teradata VantageCloud

2024

2.4.4.0

May 2024

February 2024