Our SALES table has just six fields including the record id. Realistic databases are much more complex with, perhaps, over one hundred fields at times.
To fully understand the power of the MultiValue data model, it is useful to understand how this data is represented inside the database. All data is normally stored in character format rather than storing binary values for numeric items. The data in each field is concatenated to form a single character sequence that represents the entire record. To provide a way to know where the boundaries between the fields lie, we insert a special marker character called a field mark between each field. Similarly, where a field contains multiple values, these are separated by value mark characters. Thus order 12001 from the diagram above becomes
14 Jan 11 FM 0783 FM 106 VM 210 FM 1 VM 4 FM 1.75 VM 4.00
Character data that is divided into fields and values is known as a dynamic array, a term that emphasizes that it is a list of items, each of which can be of variable format. Although dynamic arrays are most commonly used to represent database records, they are often used for other purposes within MultiValue applications.
There are several important things to note about this representation of our data record.
Firstly, note that the order number is not in the data. In any database system, we need a way to uniquely identify the records in the table, in this example by use of the order number. In the MultiValue data model, the unique identifier (known as the record id or key) is not considered to be part of the data but is a handle by which we access the data. Although it may at times be helpful to consider it as part of the record as in the diagram above, reading a record from the database returns only the data fields.
Secondly, notice that although there is a value by value relationship (an association) between the values in the product number, quantity and price fields, there is nothing in the stored data to show this. The structure and interpretation of the content of a record is defined by the table’s dictionary.
Thirdly, by use of mark characters, it is possible to store truly variable length data. There will come a time when the customer numbers need to be expanded to five digits. With the MultiValue model, we can store these as variable length so that no changes are needed to the database when the number of digits increases. Use of mark characters also implies that the multivalued fields can hold any number of values. The data model imposes no restrictions.
The database may at times store data internally in a different form than would be used to present this data to a user. Dates are usually stored as a number of days from a reference point in time and fractional values are normally scaled into a lower unit to eliminate the decimal point. Our example record above becomes
15720 FM 0783 FM 106 VM 210 FM 1 VM 2 FM 175 VM 400
The data model allows us to go one more level, breaking values into subvalues. Perhaps our order processing database needs to store the serial numbers of each part sold. We could add a further field that has a value for each part number, further divided into a subvalue for each serial number.
15720 FM 0783 FM 106 VM 210 FM 1 VM 2 FM 175 VM 400 FM 81 VM 98 SM 472
If this looks complex, take the dynamic array apart one layer at a time. First identify the fields, then identify the values in each field and then, if relevant, identify the subvalues in each value.
By storing our data in this form, our orders table is now a four dimensional structure that, once we understand the concept, is far simpler than the multi-table representation forced upon users of the relational model. Also, the additional processing needed to locate and extract items from the dynamic array is usually far less costly than merging data from separate tables as would be needed in a relational database. Note, however, that there is no reason why a MultiValue database cannot store fully normalized data if the developer so wishes.