Pete Freitag Pete Freitag

Arrays VS Structures

Published on May 25, 2005
By Pete Freitag
coldfusion

Array's and Structures (structures in CF are called hashtables, or associative arrays) are two very different data structures. There is some confusion about how arrays work in CF, and from what I can understand they are based on native java arrays.

Once common misconception is that the following code will only allocate two slots of memory:

<cfset a = ArrayNew(1)>
<cfset a[1] = "One">
<cfset a[10] = "Ten">

What is really going on here is that cf is allocating an Array with 10 elements , elements 2-9 are null. The memory allocated is about 8 bytes for each element in the array (these are references the objects they hold). The memory is allocated in one contiguous block, so if you have a very large range in your Array, it can be fairly taxing on your memory manager.

You can see the size of the array, and the length of the array is 10 elements by using either ArrayLen or the .size() method (undocumented).

<cfoutput>
Len: #ArrayLen(a)# Size: #a.size()#
</cfoutput>

It is a common misconception that CF will only allocate two slots of memory, but from my understanding it will allocate all ten, so beware of this.

If your wasting a lot of space in the array, the other option is to use a structure:

<cfset a = StructNew()>
<cfset a[1] = "One">
<cfset a[10] = "Ten">
<cfoutput>Size: #a.size()#</cfoutput>

You will notice that with a structure the size of the data structure will be just 2, instead of 10. Also note that the index of the structure is treated as a string, so you can index that structure with any string, not just unsigned integers as in Arrays.

A performance tradeoff decision must be made. Array's are very fast for retrieving elements, and for looping over. They can be slow to add elements to if you don't know the size of the array ahead of time (because they need to allocate contiguous blocks of memory). If your using array's with non-consecutive indexes you will require a lot of memory, and possibly lots of dynamic resizing.

Structures or hashtables have an expensive lookup cost compared to arrays because the string values are converted to an integer using a hash function that is used to locate a hash bucket then (because hash functions can create identical hashes for two different input keys) if the bucket has more than one item each item is compared with the key value. However if you have lots of non-consecutive indexes you will not have wasted memory like you would with an array.

So bottom line is if you have somewhat consecutive non negative integer indexes arrays are a good bet. Otherwise use a structure.



cfml arrays structures

Arrays VS Structures was first published on May 25, 2005.


Fixinator

The Fixinator Code Security Scanner for ColdFusion & CFML is an easy to use security tool that every CF developer can use. It can also easily integrate into CI for automatic scanning on every commit.


Try Fixinator

CFBreak
The weekly newsletter for the CFML Community


Comments

I should point out that the 8 byte size I gave may be different based on the jvm and operating system (eg 32bits vs 64bit will make a difference).
by Pete Freitag on 05/25/2005 at 4:32:35 PM UTC
Mark Kruger also has some info about this: http://mkruger.cfwebtools.com/index.cfm?mode=entry&entry=15C620E6-F331-2471-D5D7C7891F375559
by Pete Freitag on 05/25/2005 at 5:53:31 PM UTC
CF Arrays are based on java.util.Vector, which is based on a native array. It doesn't really change anything (you'll still get 10 slots), but it's worth mentioning, since it has some implications when dealing with Java directly.
by Barneyb on 05/25/2005 at 5:54:24 PM UTC
Wayne,

Array's don't iterate through the entire array every time you look something up. Think of an Array as a chunk of memory with slots for each item in the array, if you know you want item 10, the exact memory address of the item can be calculated by taking the address of the first item, and adding 10 * size of each item (object references are all the same size). So an array item lookup is O(1) constant time.

And for structures your forgetting about looking up the hash item, when you generate the hash of the item it has to be put in a sorted sequence, then you have to do a binary search to find the item, which I think would give you a O(log(n)) - my big O is a bit rusty...

As for synchronization both java.util.Hashtable and java.util.Vector are both synchronized, so that would not have any effect on performance.
by Pete Freitag on 05/27/2005 at 11:56:01 AM UTC
require paticular differences b/n array &structure and also structures and punoins w.r.t C language
by deevi on 03/23/2007 at 11:04:57 PM UTC
Back in the U.S.S.R Doru Hendrikje.
by Doru Hendrikje on 08/12/2007 at 3:23:04 PM UTC
that's why it will never wor. Anandi Hristina.
by Anandi Hristina on 09/19/2007 at 6:01:50 PM UTC
please tell me why, i'm feeling . Jumana Fanni.
by Jumana Fanni on 10/21/2007 at 5:34:36 PM UTC
after the putting the value of array in main function ..i wanna change it udf function..
by sachin on 04/06/2009 at 8:51:20 AM UTC
please send the concept of structers using arrays clearly
by j.koteswararao on 12/10/2010 at 2:40:33 AM UTC