ISG Research is happy to share insights gleaned from our latest Buyers Guide, an assessment of how well software providers’ offerings meet buyers’ requirements. The AI and Data Platforms: ISG Research Buyers Guide is the distillation of a year of market and product research by ISG Research.
Artificial intelligence (AI) is front and center due to a combination of factors that have dramatically increased awareness of and investment in technologies to support its purpose. Since its inception, AI has provided value no matter how or where it has been applied: helping to prevent credit card fraud, segmenting customers for more effective marketing campaigns, making recommendations for the best next action, predicting maintenance routines to prevent machine failures and many other use cases. However, AI is dependent on data. Large amounts of high-quality data are necessary to feed and train models.
Data platforms provide an environment for organizing and managing the storage, processing, analysis and presentation of data across an enterprise, and play a critical role in business operations, supporting and enabling the applications used to run and evaluate the business. Today, many business operations and analyses involve a combination of data and AI. As a result, most leading data platform providers have expanded their product capabilities to include AI. There are many synergies between AI platforms and data platforms. The AI model development process requires data preparation and feature engineering that identifies and organizes data in a way that will produce the most accurate models. The volumes of data necessary for accurate AI models place significant demands on data platforms. The costs of these systems can be significant, and any inefficiencies in the process can exacerbate the costs.
To mitigate this, it is important to coordinate AI and data efforts. Enterprises cannot afford to waste time or resources when applying AI in their operations. The risks of being left behind and put at a competitive disadvantage are too great. Software platform providers have recognized the opportunity to help enterprises address these needs, and many are offering platforms that combine AI and data capabilities. You can debate whether it is AI first or data first, but in reality, it is both.
ISG defines AI Platforms as those platforms that include the ability to prepare, deploy and maintain AI models. Preparing a model requires accessing and preparing data used in the modeling process. Training a model requires tooling for data scientists to explore, compare and optimize models developed using different algorithms and parameters. Deploying and maintaining models requires governance and monitoring frameworks to ensure that models comply with both internal policies and regulatory requirements. When models are out of compliance, the platforms should provide mechanisms to retrain and redeploy new models that are compliant.
At the heart of any data platform is the storage and management of a collection of related data. This is typically provided by a database management system, more commonly referred to simply as a database, that provides the data persistence, data management, data processing and data query functionality that enables access to, and interaction with, the stored data. Adoption of cloud computing environments has also led to the widespread use of object stores as a data persistence layer, with query engines such as Apache Spark, Presto and Trino adding the data management, data processing and data query functionality required of a data platform. These object stores also provide storage for unstructured data, which is critical to generative AI applications.
Developing and deploying AI models is a multistep process involving data, beginning with collecting and curating the data that will be used to create the model. Once a model is developed and tuned using the training data, it needs to be tested to determine its accuracy and performance. Then the model needs to be applied in an operational application or process. For example, in a customer service application, a predictive AI model might make a recommendation for how a representative should respond to the customer’s situation. Similarly, a self-service customer application might use a large language model (LLM) to provide a chatbot or guided experience to deliver those recommendations.
The increasing importance of intelligent operational applications driven by AI is blurring the lines that have traditionally divided the requirements for AI platforms and data platforms. Consumers are increasingly engaged with data-driven services that are differentiated by personalization and contextually relevant recommendations. Worker-facing applications are following suit, targeting users based on their roles and responsibilities. The shift to more agile business processes requires machine learning (ML) for more responsive data platforms and applications.
The need for real-time interactivity driven by AI has significant implications for the data platform functionality required to support these applications. While there have always been general-purpose databases that could be used for both analytic and operational workloads, traditional architectures have involved the extraction, transformation and loading of data from the operational data platform into an external analytic or AI platform. This enables the operational and analytic workloads to run concurrently without adversely impacting each other, protecting the performance of both.
Over time, data platforms have evolved differentiated architectural approaches designed to improve workload management and isolation of potentially conflicting workloads. Intelligent applications, while operational in nature, rely on real-time analytic processing to deliver functionality, including contextually relevant recommendations, predictions and forecasting driven by ML and generative AI (GenAI). While data-driven companies continue to use separate data and AI platforms to train models offline, the need for real-time online predictions and recommendations requires that operational data platforms perform ML inferencing.
The popularization of GenAI has had a significant impact on the requirements for data platforms in the last 18 months, particularly in relation to support for storing and processing vector embeddings. These are multi-dimensional mathematical representations of features or attributes of raw data that are used to support GenAI-based natural language processing (NLP) and recommendation systems. Vector search can also improve accuracy and trust in GenAI via retrieval-augmented generation, which is the process of retrieving vector embeddings that represent factually accurate and up-to-date information from a database and combining that information with text automatically generated by an LLM. We assert that through 2027, the development of intelligent applications providing personalized experiences driven by GenAI will increase demand for data platforms capable of supporting hybrid operational and analytic processing.
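To make the idea of vector search and retrieval-augmented generation more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any provider's implementation: the three-dimensional embeddings, the sample passages and the helper names (`cosine_similarity`, `top_k`, `build_prompt`) are all hypothetical, and a real system would obtain embeddings from a model and store them in a data platform with vector support.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k passages whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

def build_prompt(question, passages):
    """Combine retrieved passages with the question before calling an LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical toy data: (passage, embedding). Real embeddings typically
# have hundreds or thousands of dimensions and come from an embedding model.
STORE = [
    ("Refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.0, 0.2, 0.9]),
    ("Refund requests require an order number.", [0.8, 0.3, 0.1]),
]

query_vec = [1.0, 0.2, 0.0]  # assumed embedding of the user's question
passages = top_k(query_vec, STORE, k=2)
prompt = build_prompt("How long do refunds take?", passages)
```

In this sketch, the two refund-related passages are retrieved because their embeddings point in nearly the same direction as the query, and the resulting prompt is what would be passed to an LLM. The exhaustive scan over `STORE` is a simplification; data platforms typically use specialized indexes to search large collections of embeddings.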
The overall AI and Data Platform Buyers Guide includes evaluation of platforms that provide both sets of capabilities. To be considered for inclusion in this Buyers Guide, a product must be marketed as a general-purpose data platform, database, database management system, data warehouse, data lake or data lakehouse. The primary use case for the product should be to support worker- and customer-facing operational applications and/or analytics workloads such as business intelligence or data science. The product should provide the following functional areas at a minimum: data persistence, data management, data processing and data query; database administrator functionality; developer functionality; data engineering functionality; and data architect functionality. It must also support the following AI-related capabilities: data preparation, AI/ML modeling, AutoML, GenAI, developer and data scientist tooling, MLOps/LLMOps, model deployment, model tuning and optimization.
The Buyers Guide for AI and Data Platforms is the result of our evaluation of the following software providers that offer products that address key elements of both AI and data platforms to support a full range of AI and data workloads: Alibaba Cloud, AWS, Cloudera, Databricks, Google Cloud, IBM, Microsoft, Oracle, Salesforce, SAP, Snowflake and Teradata.
This research-based index evaluates the full business and information technology value of artificial intelligence software offerings. We encourage you to learn more about our Buyers Guide and its effectiveness as a provider selection and RFI/RFP tool.
We urge enterprises to thoroughly evaluate AI and data platform offerings, using this Buyers Guide both as the results of our in-depth analysis of these software providers and as an evaluation methodology. The Buyers Guide can be used to evaluate existing suppliers and provides evaluation criteria for new projects. Using it can shorten the cycle time for an RFP and the definition of an RFI.
The Buyers Guide for AI and Data Platforms in 2024 finds Oracle first on the list, followed by IBM and AWS.
Software providers that rated in the top three of any category, including the product and customer experience dimensions, earn the designation of Leader.
The Leaders in Product Experience are:
The Leaders in Customer Experience are:
The Leaders across any of the seven categories are:
- Oracle, which has achieved this rating in five of the seven categories.
- AWS and SAP in four categories.
- Databricks in three categories.
- Teradata in two categories.
- Google Cloud, IBM, Microsoft and Salesforce in one category.
The overall performance chart provides a visual representation of how providers rate across product and customer experience. Software providers with products scoring higher in a weighted rating of the five product experience categories place farther to the right. The combination of ratings for the two customer experience categories determines their placement on the vertical axis. As a result, providers that place closer to the upper-right are rated “exemplary” and rank higher than those closer to the lower-left, which are identified as providers of “merit.” Software providers that excelled at customer experience over product experience have an “assurance” rating, and those excelling instead in product experience have an “innovative” rating.
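The placement logic described above can be sketched in a few lines of Python. This is only an illustration of the general idea: the category names, the equal weights, the sample scores and the midpoint of 50 are all hypothetical assumptions, since the actual weights and thresholds used in the Buyers Guide are not stated here.

```python
def weighted_score(scores, weights):
    """Weighted average of category scores (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight

def classify(product, customer, midpoint=50.0):
    """Map horizontal (product) and vertical (customer) placement to a rating.

    The midpoint is a hypothetical dividing line between chart quadrants.
    """
    if product >= midpoint and customer >= midpoint:
        return "exemplary"   # upper-right
    if product >= midpoint:
        return "innovative"  # stronger in product experience
    if customer >= midpoint:
        return "assurance"   # stronger in customer experience
    return "merit"           # lower-left

# Hypothetical scores for one provider: five product experience
# categories and two customer experience categories, equally weighted.
product_scores = {"p1": 80, "p2": 70, "p3": 90, "p4": 60, "p5": 75}
customer_scores = {"c1": 40, "c2": 50}
product_weights = {name: 1.0 for name in product_scores}
customer_weights = {name: 1.0 for name in customer_scores}

x = weighted_score(product_scores, product_weights)
y = weighted_score(customer_scores, customer_weights)
rating = classify(x, y)
```

With these made-up numbers, the provider would land in the lower-right of the chart, scoring well on product experience but below the assumed midpoint on customer experience.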
Note that close provider scores should not be taken to imply that the packages evaluated are functionally identical or equally well-suited for use by every enterprise or process. Although there is a high degree of commonality in how organizations handle AI and data platforms, there are many idiosyncrasies and differences that can make one provider’s offering a better fit than another.
ISG Research has made every effort to encompass in this Buyers Guide the overall product and customer experience from our AI and data platforms blueprint, which we believe reflects what a well-crafted RFP should contain. Even so, there may be additional areas that affect which software provider and products best fit an enterprise’s particular requirements. Therefore, while this research is complete as it stands, utilizing it in your own organizational context is critical to ensure that products deliver the highest level of support for your projects.
You can find more details on our community as well as on our expertise in the research for this Buyers Guide.