This article shows how to use the OLE Automation DLL to create a string without relying on VB to do it. When VB creates a string, it automatically fills it with data (which takes a great deal of time when dealing with large strings). This method bypasses VB and creates the string itself, without filling it with data. The result? 200 times faster! I have updated the tutorial to include Example 1 rewritten using the faster method. The code includes a benchmark and the function described below. Please vote or leave a comment!
Submitted On | 2002-12-31 14:07:10 |
By | jbay101 |
Level | Advanced |
User Rating | 4.9 (69 globes from 14 users) |
Compatibility | VB 5.0, VB 6.0 |
Category | Object Oriented Programming (OOP) |
World | Visual Basic |
Archive File | Create_str15211912312002.zip |
Fast string in Visual Basic, version 2
Visual Basic vs. C++
Visual Basic stores its strings in a type referred to in C++ as a BSTR. This type is completely different from a C string (an array of char): a BSTR carries a 4-byte length prefix in front of its data, so it can contain embedded nulls and does not rely on a terminating null to mark its end. A C string is stored as an array of bytes that ends at the first null byte (character 0x0). Unlike C or C++, when you create a string in VB it is automatically filled with data.
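The length prefix can be inspected from VB itself. The snippet below is a minimal sketch, assuming 32-bit VB 5/6 and a standard module; ShowBstrHeader and lByteLen are names made up for this illustration, and it borrows the RtlMoveMemory call that is introduced later in the article.

```vb
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)

Public Sub ShowBstrHeader()
    Dim s As String
    Dim lByteLen As Long

    s = "Hello"
    'StrPtr returns the address of the character data;
    'the 4 bytes just before it hold the length in bytes
    RtlMoveMemory lByteLen, ByVal StrPtr(s) - 4, 4&

    Debug.Print lByteLen    '10 (5 characters, 2 bytes each)
    Debug.Print LenB(s)     '10, the same value as reported by VB
End Sub
```

Only call this on a non-empty string: for an unassigned string StrPtr returns 0 and there is no header to read.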
The SLOW way - Visual Basic's String creation
When you dynamically create a string in Visual Basic, there are only two
methods that VB supports. These are:
1. Using the String function
Example:
Dim strData As String                          'our string variable
Open "test.bin" For Binary Access Read As #1   'open a file
strData = String(LOF(1), 0)                    'create a buffer
Get #1, , strData                              'read data into the buffer
Close #1                                       'close the file
The String function
takes two parameters, the length of the string and the character to fill the
string with.
2. Using the Space function
This is much like using the String function, except it
automatically fills the string with spaces.
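For comparison, Example 1 written with the Space function might look like the sketch below. It assumes the same test.bin file and file number as Example 1; Space$ is the String-returning form of Space.

```vb
Dim strData As String                          'our string variable
Open "test.bin" For Binary Access Read As #1   'open a file
strData = Space$(LOF(1))                       'create a buffer filled with spaces
Get #1, , strData                              'read data into the buffer
Close #1                                       'close the file
```

Either way, every character of the buffer is written once by VB and then overwritten again by Get.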
Now, for the example above all we want is an empty storage space to fill with data. But VB doesn't do this. In both instances, VB fills the string with data, which can take a lot of time. This is where the API optimization comes into play.
The FAST way - the OLE Automation library
The OLE Automation library provides support, not only for the BSTR type but
also for all variable-related operations. To increase the speed of the string
creation, we want to tell the OLE Automation library to create a region of
memory that we can access - without filling it with data. To do this we will use
two functions, RtlMoveMemory in the Windows kernel and SysAllocStringByteLen in
the OLE Automation library. The declarations are below.
Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes&)
Declare Function SysAllocStringByteLen& Lib "oleaut32" (ByVal olestr&, ByVal BLen&)
The RtlMoveMemory function copies nBytes bytes from the src address to the dst address. The SysAllocStringByteLen function allocates BLen bytes of storage space for a BSTR, or in this case a Visual Basic String, and returns a pointer to it. In reality, the Visual Basic String is nothing more than a pointer, or a reference to an address in memory that can be used to store the data. With this in mind, we can create our own string allocation function, as shown below.
Public Function AllocString(ByVal lSize As Long) As String
    'allocate lSize characters (2 bytes each) and store the pointer in the return value
    RtlMoveMemory ByVal VarPtr(AllocString), _
        SysAllocStringByteLen(0&, lSize + lSize), 4&
End Function
This may look a bit complicated at first, but it is really relatively simple.
The function allocates the space and then copies the 4-byte pointer to that
space into the string returned by the function. If we were to expand the
function a little it would look like this:
Public Function AllocString(ByVal lSize As Long) As String
    Dim lPtr As Long        'the address of the allocated memory
    Dim lRetPtr As Long     'the pointer to the return variable
    Dim sBuffer As String   'the variable to return

    lRetPtr = VarPtr(sBuffer)                         'the pointer to the string buffer
    lPtr = SysAllocStringByteLen(0&, lSize + lSize)   'allocate the memory and get its pointer
    RtlMoveMemory ByVal lRetPtr, lPtr, 4&             'copy the pointer address
    AllocString = sBuffer                             'return the string with the modified pointer
End Function
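The idea that a String variable holds nothing but a 4-byte pointer can be checked directly. The following is a small sketch, assuming 32-bit VB 5/6; ShowStringPointer and lStored are hypothetical names used only here.

```vb
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)

Public Sub ShowStringPointer()
    Dim s As String
    Dim lStored As Long

    s = "abc"
    'read the 4 bytes stored in the variable itself
    RtlMoveMemory lStored, ByVal VarPtr(s), 4&

    'the stored value is the address of the character data
    Debug.Print (lStored = StrPtr(s))   'True
End Sub
```

VarPtr gives the address of the variable and StrPtr gives the address of the data it points to, which is why AllocString only has to write 4 bytes to hand a freshly allocated buffer to VB.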
As someone highlighted in the previous tutorial, when a value is returned it is
duplicated and added to the stack. When the function ends, this value is popped
off the stack and returned to the assigned variable. However, this is where some
more knowledge of how VB works is required. Visual Basic is not duplicating the
data. All that Visual Basic is doing is duplicating the pointer to the data. Why
move 30 MB when you can move 4 bytes? Still, returning a value does take time,
and if you are looking for a few more milliseconds you could try making the call
inline (removing the function altogether). For example, if your string is called
strBuffer you could use the code below.
Dim strBuffer As String
RtlMoveMemory ByVal VarPtr(strBuffer), _
    SysAllocStringByteLen(0&, 100 + 100), 4&   'allocate 100 characters (200 bytes)
This method will be slightly faster, but I don't think it's worth the trouble
(unless you only need to allocate the data once).
As most of you know, when dealing with the API it is very important to free all the memory you allocate, otherwise you can easily develop memory leaks. But the best part of using the above method is that we don't have to worry about freeing the memory block. When your Visual Basic program ends (or a function/sub containing the relevant variable ends), VB automatically checks each variable and frees the memory associated with it. But this is not a VB variable, you may say? Wrong. This is a normal VB string variable; we have just created it without VB. Any Visual Basic string function will still work on the data. VB just doesn't know how it was allocated - but VB doesn't care. If you really wanted, you could write a small function to delete a string. Just be careful about how you do it. Since we created the variable by just copying the pointer, some people may think that the code below would free the string.
Public Function DeallocString(sString As String)
    Dim lPtr As Long                   'the address of the string variable
    lPtr = VarPtr(sString)             'the pointer to the string variable
    RtlMoveMemory ByVal lPtr, 0&, 4&   'overwrite the stored pointer with nulls
End Function
When dealing with other API types (and VB types), erasing the pointer will tell
VB or the API that the variable hasn't been initialized. But in this case VB
will lose track of all the memory associated with the string, and that memory is
never freed. For the enclosed sample, that is 30 MB of RAM!!! The correct way to
remove the string is to use VB to do it. The easiest way is to assign its value
to "". But if you really MUST write a function, you could try the one below.
Public Function DeallocString(sString As String)
    sString = ""
End Function
Sometimes the simplest way is the best!
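To see that a string allocated this way behaves like any other VB string, a short check along these lines can be run from a standard module. It is only a sketch: DemoAllocatedString is a name invented for this example, and the declarations and AllocString are repeated so the snippet stands alone.

```vb
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)
Private Declare Function SysAllocStringByteLen Lib "oleaut32" (ByVal olestr As Long, ByVal BLen As Long) As Long

Private Function AllocString(ByVal lSize As Long) As String
    RtlMoveMemory ByVal VarPtr(AllocString), _
        SysAllocStringByteLen(0&, lSize + lSize), 4&
End Function

Public Sub DemoAllocatedString()
    Dim s As String

    s = AllocString(10)         'room for 10 characters, contents undefined
    Debug.Print Len(s)          '10
    Debug.Print LenB(s)         '20

    Mid$(s, 1, 5) = "Hello"     'ordinary string statements work on the buffer
    Debug.Print Left$(s, 5)     'Hello

    s = ""                      'VB frees the memory, no API call needed
End Sub
```

Only the first five characters are printed because the rest of the buffer was never initialized and may contain anything.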
Now that we know how to allocate strings
the fast way, we can re-write the sample in Example 1.
Dim strData As String                          'our string variable
Open "test.bin" For Binary Access Read As #1   'open a file
strData = AllocString(LOF(1))                  'create a buffer using our function
Get #1, , strData                              'read data into the buffer
Close #1                                       'close the file
It really is not that difficult, and it makes a HUGE speed increase. This
article comes with the above function and a benchmark to show the dramatic speed
difference.
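For readers without the archive, a rough timing comparison could be sketched as follows. This is not the benchmark shipped with the article: BenchmarkAlloc, BUFFER_CHARS and ROUNDS are assumptions made for illustration, GetTickCount only has a resolution of roughly 10 to 16 ms, and the numbers will vary from machine to machine.

```vb
Private Declare Function GetTickCount Lib "kernel32" () As Long
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes As Long)
Private Declare Function SysAllocStringByteLen Lib "oleaut32" (ByVal olestr As Long, ByVal BLen As Long) As Long

Private Function AllocString(ByVal lSize As Long) As String
    RtlMoveMemory ByVal VarPtr(AllocString), _
        SysAllocStringByteLen(0&, lSize + lSize), 4&
End Function

Public Sub BenchmarkAlloc()
    Const BUFFER_CHARS As Long = 15& * 1024& * 1024&   '15M characters = 30 MB
    Const ROUNDS As Long = 10
    Dim s As String
    Dim t As Long
    Dim i As Long

    'the slow way: VB fills every character
    t = GetTickCount
    For i = 1 To ROUNDS
        s = String$(BUFFER_CHARS, 0)
        s = ""
    Next i
    Debug.Print "String$:     " & (GetTickCount - t) & " ms"

    'the fast way: allocate without filling
    t = GetTickCount
    For i = 1 To ROUNDS
        s = AllocString(BUFFER_CHARS)
        s = ""
    Next i
    Debug.Print "AllocString: " & (GetTickCount - t) & " ms"
End Sub
```

Run it from the Immediate window with BenchmarkAlloc, ideally in a compiled EXE as well as in the IDE, since the two can give quite different timings.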
The next tutorial will talk about making string functions (compare, join etc) as fast as C and will show how to make a C string in Visual Basic. Please leave a comment or vote!