KnowledgeBase 00024: Performance: Scanning Dynamic Arrays
Applications frequently need to walk through elements of a dynamic array. The manner in which this is done can have a significant impact on performance.
The very nature of dynamic arrays means that it is not possible to index directly to a specific field, value or subvalue. Instead, the system must walk through the array, counting mark characters until it reaches the desired position. Although this is not a serious problem with small arrays, it can become serious for large arrays. Appropriate use of QMBasic language constructs can improve performance in this area.
When a field is extracted from a dynamic array using the VAR<I> notation, the EXTRACT() function, LOCATE, etc., the QMBasic run machine remembers the position of the field that it has just extracted. This allows a subsequent extraction to start from this known position instead of restarting from the beginning of the array.
This mechanism applies only to fields. There is no hint maintained for positions of values within a field or subvalues within a value. When processing dynamic arrays that contain fields that are divided into large numbers of values, it may be possible to gain performance by extracting the field into a temporary variable, using RAISE() to promote all the mark characters, and then walking through this temporary item at the field level.
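The RAISE() technique described above might be sketched as follows. The variable names (REC, TEMP, ITEM) and the choice of field 3 are illustrative assumptions, not taken from any particular application.

```qmbasic
* REC is assumed to hold a record whose field 3 contains many values.
TEMP = RAISE(REC<3>)       ;* value marks are promoted to field marks
N = DCOUNT(TEMP, @FM)      ;* number of elements (formerly values)
FOR I = 1 TO N
   ITEM = TEMP<I>          ;* field-level extraction benefits from the position hint
   CRT ITEM
NEXT I
```

Note that RAISE() promotes every mark character, so any subvalue marks in the original field appear as value marks in TEMP.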
As an example, an application that processes an array of 10000 fields, each of five characters, linearly from field 1 to the end runs approximately 150 times faster than one that walks through the same data delimited by value marks.
The fastest way to walk through a dynamic array, element by element, is to use the REMOVE statement or the REMOVE() function.
Whenever a value is assigned to a character string variable, a pointer known as the remove pointer is set to character position zero (off the front of the string). The REMOVE statement advances this pointer by one character and then extracts data until it finds either a mark character or the end of the string. The remove pointer is left pointing at the mark character or one character beyond the end of the string. By using REMOVE in a loop, a program can walk element by element through a dynamic array with no searching. Use of this technique is approximately twice as fast as extracting fields sequentially and has the further advantage that it works for all mark character delimiters.
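A minimal sketch of such a loop is shown here. REC, ITEM and DELIM are assumed variable names. The SETTING variable receives zero when the end of the string is reached and a non-zero code identifying the mark character otherwise.

```qmbasic
* REC is assumed to hold the dynamic array to be scanned.
LOOP
   REMOVE ITEM FROM REC SETTING DELIM   ;* DELIM is zero at the end of the string
   CRT ITEM                             ;* process the element
WHILE DELIM
REPEAT
```

Because the test is placed after the processing step, the final element (the one terminated by the end of the string) is still processed before the loop exits.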
When necessary, the remove pointer can be set to a specific position in the string by use of the SETREM statement. A commonly used alternative when the pointer is to be reset to the start of the string is simply to copy the variable containing the dynamic array to itself, a process that doesn't alter the string in any way but has the side effect of resetting the remove pointer just like any other assignment of a character string. Note that the string is not physically copied by this operation and hence there is no performance issue with very large strings.
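Both ways of repositioning the remove pointer might look like this in practice. The variable names are illustrative, and the use of GETREM() to capture the current pointer position is an assumption that should be checked against the QM documentation for the release in use.

```qmbasic
* REC is assumed to have been scanned already, leaving its remove pointer at the end.
REC = REC                             ;* remove pointer back to the start; string is not copied

REMOVE ITEM FROM REC SETTING DELIM    ;* first element
SAVED = GETREM(REC)                   ;* assumed: returns the current remove pointer
REMOVE ITEM FROM REC SETTING DELIM    ;* second element
SETREM SAVED ON REC                   ;* return to the saved position
REMOVE ITEM FROM REC SETTING DELIM    ;* second element again
```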
The REMOVEF() function, unique to QM, is an extension to the remove mechanism that allows removal of data separated by any single character delimiter, not just the mark characters. The function gets its name from its default behaviour of removing fields, complete with any embedded values or subvalues.
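As a hedged illustration, a comma-separated string could be scanned as below. The sample data and variable names are invented, and the sketch assumes that STATUS() returns a non-zero value once the string is exhausted; this should be confirmed against the REMOVEF() documentation.

```qmbasic
LINE = "red,green,blue"          ;* sample data, not from the original article
LOOP
   ITEM = REMOVEF(LINE, ",")     ;* next element delimited by a comma
UNTIL STATUS()                   ;* assumed: non-zero when no more data
   CRT ITEM
REPEAT
```

Omitting the delimiter argument gives the default behaviour described above, in which whole fields are removed along with any embedded values or subvalues.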