
Five Factors Driving Data-Intensive Information Architectures

Written by ISG Software Research | Jul 18, 2022 3:33:00 PM

Analyst Viewpoint

Digital transformation is driving organizations to become much more data intensive. Organizations now process information from an increasing number of internal and external sources, and many more devices, including IoT sensors, are generating information. The consumerization of IT and the focus on improving customer, partner and workplace experiences require organizations to be more responsive. This digitization is leading to a proliferation of database types, each built to process a particular kind of information. Our research shows nearly two-thirds (63%) of organizations are using three or more different database technologies in their analytics and data processes.

Here are five factors creating requirements for data-intensive applications, along with suggestions for information architectures that can handle them.

1. Increasing data volumes are forcing organizations to consider specialized data architectures. While relational databases remain the most common way organizations manage their data, they aren’t the only way and aren’t always the most effective way. Most organizations have adopted multiple data platforms as part of their information architecture. Interestingly, those organizations that are using specialized analytic databases and NoSQL databases show the highest rates of satisfaction and the highest levels of confidence in their ability to make informed business decisions. However, managing multiple databases to support these different data models can lead to additional administration costs, as well as frustration in trying to synchronize and replicate the different sets of information. Instead, consider database platforms that support multiple models.
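
To make the multi-model suggestion concrete, here is a minimal sketch using SQLite’s built-in JSON functions to serve relational columns and document-style JSON from a single engine, so nothing has to be replicated between two stores. The schema and field names are hypothetical, and a production multi-model platform would offer far richer models (graph, key-value, time series) than this illustration.

    # Minimal multi-model sketch: one engine holds both fixed relational
    # columns and flexible document-style JSON, so the two never need to
    # be synchronized across separate databases. Schema is hypothetical.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE customers (
            id      INTEGER PRIMARY KEY,
            name    TEXT,    -- fixed, relational attribute
            profile TEXT     -- flexible, document-style JSON attribute
        )
    """)
    conn.execute(
        "INSERT INTO customers VALUES (1, 'Acme Corp', ?)",
        ('{"segment": "enterprise", "channels": ["web", "mobile"]}',),
    )

    # One query spans both models: no replication between a relational
    # store and a separate document store.
    row = conn.execute("""
        SELECT name, json_extract(profile, '$.segment')
        FROM customers
        WHERE json_extract(profile, '$.segment') = 'enterprise'
    """).fetchone()
    print(row)  # ('Acme Corp', 'enterprise')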

2. The speed with which data is generated and collected places additional demands on an organization’s data architectures. More than one-quarter (28%) of organizations complain that their technologies are too slow. Nearly one-half (48%) of the participants in our research are using streaming data in their operational and/or analytical processes, and eventually eight in 10 plan to do so. Our research shows they consider financial market data, machine data, call center events, social media and IoT data to be important sources of information. These types of data can stream into organizations at rates of hundreds of thousands or even millions of rows per second, making them difficult to process quickly enough to keep up with demand.
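
As an illustration of how architectures absorb those rates, a common pattern is to micro-batch the stream so the database sees bulk writes rather than millions of per-row inserts. The sketch below simulates the pattern in plain Python; the event source, batch size and flush interval are placeholder assumptions, and in practice the source would be a broker such as Kafka or Kinesis.

    # Micro-batching sketch: buffer incoming events and flush in bulk.
    import time
    from itertools import count, islice

    BATCH_SIZE = 10_000      # rows per bulk write (tuning assumption)
    FLUSH_INTERVAL = 0.5     # seconds; flush even if the batch is short

    def event_stream():
        # Stand-in for a broker consumer yielding one event per message.
        for i in count():
            yield {"sensor_id": i % 500, "value": i * 0.1}

    def bulk_write(rows):
        # Stand-in for a bulk insert into the analytic store.
        print(f"flushed {len(rows)} rows")

    batch, last_flush = [], time.monotonic()
    for event in islice(event_stream(), 100_000):   # capped for the demo
        batch.append(event)
        if len(batch) >= BATCH_SIZE or time.monotonic() - last_flush >= FLUSH_INTERVAL:
            bulk_write(batch)
            batch, last_flush = [], time.monotonic()
    if batch:
        bulk_write(batch)   # flush any remaining tail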

3. Not only is information arriving at a rapid rate, but organizations must also be able to use that information instantly to be more responsive to their customers, partners and employees. In financial markets, fractions of a second matter. In online services, more than a few seconds of delay can cause a customer to abandon the service and try a competitor. In healthcare, delays can even mean the difference between life and death. Nearly one-half of the participants in our research indicate that processing events with second or sub-second latency is essential to their organizations. Being able to respond in real time often requires the application of artificial intelligence and machine learning (AI/ML). So, not only must data be processed in real time, but the architecture must also support making predictions in real time.
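
As a rough sketch of what real-time prediction means in practice, the example below applies a pre-trained logistic model to each event as it arrives, so the score is available in the same sub-second window as the event itself. The weights and feature names are invented for illustration; a real deployment would load a trained model artifact.

    # In-stream scoring sketch: the model is applied as each event
    # arrives, not in a later batch job. Weights are hypothetical.
    import math
    import time

    WEIGHTS = {"bias": -3.0, "amount": 0.8, "velocity": 1.2}

    def score(event):
        # Logistic model: probability the event needs immediate action.
        z = WEIGHTS["bias"] + sum(WEIGHTS[k] * event[k] for k in ("amount", "velocity"))
        return 1.0 / (1.0 + math.exp(-z))

    event = {"amount": 2.5, "velocity": 1.0}
    start = time.perf_counter()
    probability = score(event)
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"risk score {probability:.3f} computed in {latency_ms:.3f} ms")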

4. Organizations need to combine many different types of data as part of their information processing. The variety of data sources and the complexity involved in joining these sources together put a significant load on databases. For example, a robust customer-360 analysis requires joining information from sales, marketing, customer service, social media and external demographic data. Our research shows that more than one-half (55%) of organizations are working with six or more data sources, each of which involves multiple tables. The result can be joins involving dozens of tables which, coupled with the need to process information quickly, create heavy demands on the underlying databases. Even worse, if multiple, separate special-purpose databases are used, it may be difficult or impossible to join data across these instances.
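
The sketch below shows the shape of such a customer-360 query on a deliberately tiny, hypothetical schema with just three sources, where a real analysis might span dozens of tables. The scalar subqueries sidestep the row fan-out that makes naive multi-table joins both expensive and error-prone.

    # Customer-360 sketch on a toy, hypothetical schema.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders    (customer_id INTEGER, total REAL);
        CREATE TABLE tickets   (customer_id INTEGER, status TEXT);
        INSERT INTO customers VALUES (1, 'Acme Corp');
        INSERT INTO orders    VALUES (1, 1200.0), (1, 300.0);
        INSERT INTO tickets   VALUES (1, 'open');
    """)

    # Per-source subqueries avoid inflating counts and sums when one
    # customer has many rows in several joined tables.
    rows = conn.execute("""
        SELECT c.name,
               (SELECT SUM(total) FROM orders  WHERE customer_id = c.id) AS lifetime_value,
               (SELECT COUNT(*)   FROM tickets WHERE customer_id = c.id
                                                AND status = 'open')     AS open_tickets
        FROM customers c
    """).fetchall()
    print(rows)  # [('Acme Corp', 1500.0, 1)]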

5. As organizations move from historical to real-time analysis to be more responsive, they create additional concurrency demands. When a larger portion of the workforce relies on data for more of its daily activities, a larger number of people are accessing the databases simultaneously. It also means that the distinction between operational processing and analytical processing becomes blurred. Most databases are organized and optimized for one or the other: typically, operational databases use row-oriented structures and analytical databases use column-oriented structures. Combined operational and analytical workloads need specialized architectures to support the concurrency and performance organizations require. Simply pressing one or the other type of database into a combined workload will fall short on both performance and scalability.
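
A toy demonstration of why orientation matters: the same records held as one tuple per row (the operational layout) and as one list per attribute (the analytical layout), then scanned for a single-column aggregate. The timings are illustrative only; real engines layer compression, vectorization and indexing on top of the layout choice.

    # Row vs. column layout sketch. Numbers are illustrative, not a benchmark.
    import time

    N = 1_000_000
    row_store = [(i, i % 100, i * 0.01) for i in range(N)]   # one tuple per record
    col_store = {                                            # one list per attribute
        "id":     [i       for i in range(N)],
        "region": [i % 100 for i in range(N)],
        "amount": [i * 0.01 for i in range(N)],
    }

    # Analytical query: aggregate a single column.
    t0 = time.perf_counter()
    total_rows = sum(r[2] for r in row_store)    # touches every full record
    t1 = time.perf_counter()
    total_cols = sum(col_store["amount"])        # touches one contiguous list
    t2 = time.perf_counter()

    print(f"row layout: {t1 - t0:.3f}s, column layout: {t2 - t1:.3f}s")

An engine designed for combined workloads typically maintains both representations, or a hybrid of the two, so a single system can serve point lookups and analytical scans concurrently.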

Any one of these factors presents a challenge, but when combined, the hurdles are magnified. Many organizations try to tackle these factors by deploying a different type of database for each workload. While that might make each individual challenge manageable, it complicates data integration, database administration and data governance. It also creates additional licensing, procurement and maintenance requirements. A single database platform that addresses the five factors above can help organizations support their data-intensive application requirements. It also reduces the burden on IT to manage multiple systems and enables the organization to be more responsive and more competitive.