Posted by: Paul Lefebvre
by Seth Willits
In this tutorial we’re going to write a CapacityString class that can vastly improve string performance in certain situations. Now, I admit this tutorial isn’t exactly going to be eye-catching, but I think for some of you it will be quite an eye-opener.
The Problem
Let’s say that you’re going to be importing some data from a file, processing it, and outputting the results to a string. Since this process can take quite a while you’re going to want to display a progress dialog with a progress bar that increments realistically based on the current position in the file. Normally you’d have something that looks like this:
bin = File.OpenAsBinaryFile
length = bin.Length
For i = 1 To length Step 2048
  s = s + ProcessData( bin.Read(2048) )
  // show progress
Next
There’s nothing extraordinary about this code at all, but what if I told you that it could be sped up by over 50 times? It’s quite possible, and easy.
Note that every time through the loop we assign a value to the string “s”, and during that assignment REALbasic reallocates a block of memory to store the contents of that string. The problem is that allocating memory is actually pretty slow, so doing it over and over again is very inefficient. The solution is simply to allocate enough memory up front so that it never has to be reallocated. That sounds easy, and it is, but REALbasic strings can’t do this, so we need to do it ourselves using a MemoryBlock.
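Before wrapping the idea in a class, here is a minimal sketch of what “allocate up front” might look like applied directly to the loop above. The function name ImportFile is made up for this illustration, ProcessData is the same placeholder routine as before, and the sketch assumes ProcessData never returns more bytes than it is given, so a block the size of the file is always big enough.

```realbasic
// Sketch only: ImportFile is a hypothetical name, and ProcessData is assumed
// never to return more bytes than it receives.
Function ImportFile(file As FolderItem) As String
  Dim bin As BinaryStream = file.OpenAsBinaryFile
  Dim length As Integer = bin.Length
  Dim buffer As New MemoryBlock(length)  // the only allocation, made up front
  Dim used As Integer = 0                 // bytes of buffer filled so far
  Dim chunk As String
  Dim i As Integer

  For i = 1 To length Step 2048
    chunk = ProcessData( bin.Read(2048) )
    buffer.StringValue(used, LenB(chunk)) = chunk  // copy into place
    used = used + LenB(chunk)
    // show progress
  Next

  bin.Close
  Return buffer.StringValue(0, used)  // build the result string once
End Function
```

The two local variables, buffer and used, are exactly what the class below stores as mData and mLength.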
The CapacityString Class
Create a new class called CapacityString and add three properties to it: mCapacity As Integer, mLength As Integer, and mData As MemoryBlock. mData is the chunk of memory that we’re going to use to store the string. mCapacity caches the size of the MemoryBlock; it will always contain the same value that mData.Size returns, but the fewer function calls we make, the faster the code will be. Because the string inside the MemoryBlock will almost never be the same size as the MemoryBlock itself, we use mLength to store the size of the string.
Sub Constructor(capacity as Integer)
  mCapacity = capacity
  mData = New MemoryBlock(mCapacity)
End Sub

Function Operator_Convert() As String
  Return mData.StringValue(0, mLength)
End Function
The constructor initializes the mData MemoryBlock to have the capacity we want, and Operator_Convert is a handy method to return the string that is stored in the CapacityString.
The SetString method below sets the string in the CapacityString. Each of the methods below first checks whether the string will actually fit inside the MemoryBlock. If it doesn’t, the method resizes the MemoryBlock (within the method itself, since function calls would add overhead ;^) and then assigns the string.
Sub SetString(s as String)
  Dim slen As Integer = LenB(s)
  If mCapacity < slen Then
    mCapacity = slen
    mData.Size = mCapacity
  End If
  mData.StringValue(0, slen) = s
  mLength = slen
End Sub

Sub AppendString(s As String)
  Dim slen As Integer = LenB(s)
  If mCapacity < mLength + slen Then
    mCapacity = mLength + slen
    mData.Size = mCapacity
  End If
  mData.StringValue(mLength, slen) = s
  mLength = mLength + slen
End Sub
AppendString is similar to SetString but just adds the string onto the end. This is equivalent to “s = s + …”. The InsertString method below has no direct equivalent in REALbasic’s String type, because with Strings you have to use Mid or Left and Right to insert text in the middle. So this not only speeds things up, but gives us extra functionality. That’s nice. :^)
Sub InsertString(location As Integer, s As String)
  Dim slen As Integer = LenB(s)
  If mCapacity < mLength + slen Then
    mCapacity = mLength + slen
    mData.Size = mCapacity
  End If
  // 0 based
  location = location - 1
  // move everything from the insertion point onward to the right by slen bytes
  mData.StringValue(location + slen, mLength - location) = mData.StringValue(location, mLength - location)
  // copy the new text into the gap
  mData.StringValue(location, slen) = s
  mLength = mLength + slen
End Sub
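To see how the three methods and Operator_Convert fit together, here is a small worked example. It assumes the class exactly as listed above; the capacity of 32 is an arbitrary value chosen so that none of the calls has to resize the MemoryBlock.

```realbasic
Dim cs As New CapacityString(32)  // arbitrary capacity, large enough for this example
cs.SetString "Hello"              // contents: "Hello", mLength = 5
cs.AppendString " World"          // contents: "Hello World", mLength = 11
cs.InsertString 6, ","            // 1-based: the comma becomes the 6th byte

Dim result As String = cs         // Operator_Convert does the conversion
MsgBox result                     // shows: Hello, World
```

Note that location counts bytes, not characters, because the class measures everything with LenB. InsertString also does no range checking, so location is expected to lie between 1 and mLength + 1.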
For a simple test of the class, you can use this code:
Sub Action()
  dim s as String
  dim cs as CapacityString
  dim bin as BinaryStream
  dim i, length as Integer
  dim time as Double
  dim file as FolderItem

  file = GetOpenFolderItem("")
  if file = nil then return

  ///////////////////////
  // Using a String
  ///////////////////////
  time = Microseconds
  bin = File.OpenAsBinaryFile
  length = bin.Length
  for i = 1 to length step 2048
    s = s + bin.Read(2048)
  next
  time = (Microseconds - time) / 1000000
  MsgBox "String: " + Format(time, "###.##") + " seconds"
  bin.Close

  ///////////////////////
  // Using a CapacityString
  ///////////////////////
  time = Microseconds
  bin = File.OpenAsBinaryFile
  length = bin.Length
  cs = New CapacityString(length)
  for i = 1 to length step 2048
    cs.AppendString bin.Read(2048)
  next
  time = (Microseconds - time) / 1000000
  MsgBox "CapacityString: " + Format(time, "###.##") + " seconds"
  bin.Close
End Sub
Finished
This isn’t a completely “finished” class, as it doesn’t take every string possibility into account, but it’s a solid foundation for anyone wanting to take the idea even further.
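As one example of taking it further: once the initial capacity is exceeded, AppendString as listed grows the MemoryBlock to exactly the size needed, so every append after that resizes it again. A common alternative is to grow in bigger steps. The sketch below is a hypothetical variant, not part of the original class; the name AppendStringGrowing and the doubling factor are choices made for this illustration.

```realbasic
// Hypothetical variant of AppendString that at least doubles the capacity
// whenever it has to grow, so later appends are less likely to resize again.
Sub AppendStringGrowing(s As String)
  Dim slen As Integer = LenB(s)
  Dim needed As Integer = mLength + slen

  If mCapacity < needed Then
    mCapacity = mCapacity * 2
    If mCapacity < needed Then mCapacity = needed  // doubling was not enough
    mData.Size = mCapacity
  End If

  mData.StringValue(mLength, slen) = s
  mLength = needed
End Sub
```

The trade-off is memory: the MemoryBlock can end up nearly twice as large as the string it holds. If you know the final size in advance, passing it to the constructor is still the better option.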
Download CapacityString REALbasic project
Originally published by ResExcellence
Reprinted with permission

