Where not to use a Multi Dimensional Database

 
Published on 2003-11-27 by John Collins.

3. Where not to use a Multi-Dimensional Database

At this stage, given the previously stated advantages, you might be forgiven for thinking that all databases should be converted to multi-dimensional systems. There are examples where this is not required, however, or may not even make any sense at all. Take the example of the students used previously, suppose our fictional college keeps a table of student personal information, like this one below:

Lastname Firstname Student No Age Course
Collins John 111 25 Comp Science
Wall Larry 222 55 Comp Science
Torvalds Linus 333 26 Comp Science

Let is imagine a two-dimensional view or slice of this information, as generated by a multi-dimensional system, where Lastname and Firstname are the dimensions, and Age is contained within the cell values, as below:

Two-dimensional array of student age

Collins 25
Lastname Wall 55
Torvalds 26
Linus Larry John
Firstname

As you can see, we end up with a sparsely populated array, as opposed to the array we would have obtained from Figure 3 if we would have taken Student Name and Exam as dimensions, where we would have a fully populated array of exam results as presented in that diagram.

Our slice above yielded a 3x3 array containing 9 cells, but only 3 of these cells actually contain an age value, leaving 6 empty! If the dataset where larger, which in all likelihood it would be, then our two-dimensional array would be mostly empty.

The dataset in the student personal information table is not multi-dimensional, as there is no inherent relationship between the elements of the different records. No last name is matched to more than one student number, while no age is matched to more than one course, or indeed is the any logical relationship between the two elements. This example produced a sparsely populated multi-dimensional array, whereas our first example in Figure 3 presented a fully populated 3x3 multi-dimensional array, where all dimensions of that array intersected neatly on an exam result.

For performance reasons, it would also be unwise to store such a dataset in a multi-dimensional system. In relational form, a search would require 3 comparisons, whereas in multi-dimensional form a search would require 9 comparisons (3x3), thereby reducing efficiency.

Apart from performance considerations, there is no inherent value in storing non-multi-dimensional data in a multi-dimensional database. Multi-dimensional databases are designed to ease the manipulation and analyzing of complex database structures, which are structures that have large amounts of inter-relationships. In the student personal information example, a user accessing this information would not require that kind of analytical power, they simply want to view or edit a particular student's record. Because there are a limited number of meaningful relationships between the data elements of the dataset, and the information content of this database resides in the informational context of the individual data records, it is not a suitable candidate for a multi-dimensional database system.

This raises a fundamental observation: multi-dimensional database technology is a complementary technology to relational database technology, not a direct replacement.

A further observation can be made for the business application of multi-dimensional systems at this stage (Kenan): The greater the number of inherent inter-relationships between the elements of a dataset, the more likely it is that a study of those inter-relationships will yield business information of value to the company.

It is clear that where these inter-relationships exist, especially in a commercial environment, that such relationships have a real-world meaning to the organization that is worth analyzing over time, that is by the process of trend analysis, and that it is multi-dimensional systems that facility the analysis of these trends, via OLAP (On-Line Analytical Processing). Indeed it has become evident that the process of management is greatly aided by information arranged by subject (dimension) rather than by operational applications (Baum 1996).

Previous - Contents - Next


Updated 2020 : note that this post was originally published in 2003, but is left here for archival purposes.