
2d array optimization

Name: Anonymous 2010-01-20 6:59

I'm writing a program (C++) which will depend heavily on bitmaps... and it needs to be ultrafast.
I thought of implementing:
- 2d arrays: bitmap[y][x]=0xdeadbeef;
- 1d arrays: bitmap[y_+x]=0xdeadbeef; where y_ = y*w, recalculated as rarely as possible
- vectors.
The point is, it needs to be fast. I will often need to traverse every cell of the bitmap in the usual order (y=0..h, x=0..w). With that in mind, which of the options above is best?
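The options from the list sketched side by side, assuming a fixed w and h (the names here are illustrative, not from any particular codebase):

```cpp
#include <cstdint>
#include <vector>

constexpr int w = 640, h = 480;

// Option 1: true 2D array -- one contiguous block, indexed with sugar.
static std::uint32_t bitmap2d[h][w];

// Option 2: flat 1D array -- same block, row offset computed by hand.
static std::uint32_t bitmap1d[h * w];

// Option 3: vector -- same flat layout, but heap-allocated.
std::vector<std::uint32_t> bitmap_vec(static_cast<std::size_t>(h) * w);

void fill_all(std::uint32_t value) {
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            bitmap2d[y][x] = value;

    for (int y = 0; y < h; ++y) {
        const int y_ = y * w;            // recomputed once per row
        for (int x = 0; x < w; ++x)
            bitmap1d[y_ + x] = value;
    }
}
```

With optimizations on, mainstream compilers tend to lower all three to the same kind of linear walk over one block of memory.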

Name: Anonymous 2010-01-20 7:05

Look at what your compiler does with various optimization settings. If it does poorly, use pointers.

Name: Anonymous 2010-01-20 7:50

Completely dependent on the compiler. If it needs speed, you could write a quick ASM implementation, though the compiler probably knows ASM far better than you do. The simplest solution is always to get a better CPU.

Name: Anonymous 2010-01-20 8:32

>>1
Vectors are superior because there is no dead beef in them.

Name: Anonymous 2010-01-20 9:02

Don't optimize until you've shown there's a problem.

Name: Anonymous 2010-01-20 10:30

>>1
How does C++ internally handle multidimensional arrays?  What I mean is, with a 1D array mimicking a 2D array you merely keep a row and a column index and do the following:

array[(row * rowlength) + column]

With a 2D array, you access the array by distributing the index ( array[row][column] ).  Does the structure of a 2D array in memory look different than the structure of a 1D array?

Name: Anonymous 2010-01-20 11:17

>Does the structure of a 2D array in memory look different than the structure of a 1D array?
No. The blocks are contiguous regardless. A two dimensional array is basically syntactic sugar.
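This is easy to check: for a built-in 2D array, the hand-rolled 1D formula names exactly the same cell, and sizeof confirms there is no padding between rows (a minimal sketch, illustrative names):

```cpp
#include <cstdint>

constexpr int h = 4, w = 8;
std::uint32_t arr[h][w] = {};

// A built-in 2D array is one contiguous block: arr[row][col] and
// flat[row * w + col] refer to the same element.
std::uint32_t* flat_view() { return &arr[0][0]; }

// No padding between rows: the whole array is exactly h*w elements.
static_assert(sizeof(arr) == sizeof(std::uint32_t) * h * w,
              "rows are packed back to back");
```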

Name: Anonymous 2010-01-20 13:53

mingw produces exactly the same code on win32 for 2d and 1d arrays.
Did some tests and implemented four methods of walking through 2d arrays:
1. foreach y { foreach x { arr[y][x] = x + y } }
2. foreach y, y2 += w each step { foreach x { arr[y2+x] = x + y } }
3. pointer = arr; foreach y { foreach x { *pointer = x + y; ++pointer } }
4. pointer = arr; x = 0; y = 0; l = w*h; for i = 0 to l { *pointer = x + y; ++pointer; ++x; if ( x == w ) { ++y; x = 0 } }
...and each of the above runs for about 1-2 sec (it varies each run), so apparently the style is not that important..
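A more repeatable way to run that comparison is to time each variant with &lt;chrono&gt; instead of eyeballing wall-clock seconds. A sketch of methods 2 and 3 plus a timing helper (names and sizes are arbitrary):

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Method 2: flat array, row offset recomputed once per row.
void fill_index(std::vector<std::uint32_t>& arr, int w, int h) {
    for (int y = 0; y < h; ++y) {
        const int y2 = y * w;
        for (int x = 0; x < w; ++x)
            arr[y2 + x] = x + y;
    }
}

// Method 3: single walking pointer, no index arithmetic in the loop.
void fill_pointer(std::vector<std::uint32_t>& arr, int w, int h) {
    std::uint32_t* p = arr.data();
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            *p++ = x + y;
}

// Time one run of a callable, in milliseconds.
template <class F>
double time_ms(F&& body) {
    auto t0 = std::chrono::steady_clock::now();
    body();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Run each variant several times and take the best; with -O2, both loops usually compile down to the same code.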

Name: Anonymous 2010-01-20 16:23

>>5
Good way to stagnate the state of the art.

Name: Anonymous 2010-01-20 16:59

Just thought of runtime code assembly: create a var holding opcodes with the store instructions repeated x*y times, without any comparison instructions... then point eip at &var.. heh. Useless, though.

Name: Anonymous 2010-01-25 3:21

Well, if you're going to loop through pixels, USE A 1D ARRAY.
'cause you can loop through it linearly, at the cost of a counter increment. For a 2d array, you're also paying for a multiply (if you are using 2d indexing, that is).

But if speed is really that important, you might want to check out SSE2 and MMX, or even interleave both (2*SSE, 1*MMX) to get a fuckton of speed out of your operations.
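For the SSE2 route, a minimal sketch of a vectorized fill, assuming an x86 target with SSE2 available (the function name is illustrative):

```cpp
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Fill n 32-bit pixels with one value, 4 pixels per 128-bit store.
void fill_sse2(std::uint32_t* dst, std::size_t n, std::uint32_t value) {
    const __m128i v = _mm_set1_epi32(static_cast<int>(value));
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)   // vector body: unaligned 16-byte stores
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), v);
    for (; i < n; ++i)           // scalar tail for the leftover pixels
        dst[i] = value;
}
```

Modern compilers will often auto-vectorize the plain scalar loop to something equivalent, so measure before hand-rolling intrinsics.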

Name: Anonymous 2010-01-25 4:18

>>11
One multiplication per row is not going to be a speed issue.

Name: Anonymous 2010-01-25 4:24

>>10
That's what I did for this obscure data mining project I was working on several months ago. Sped up a critical algorithm's inner loop by around 700%.

Name: Anonymous 2010-01-25 9:45

>>9
Stagnate the state of the art? Accessing elements of an array is state of the art?

Name: Anonymous 2010-01-25 10:15

>>14
You are not an artist, are you?

Name: Anonymous 2010-01-25 10:25

>>15
He was replaced

Name: Anonymous 2010-01-27 4:33

>>1
Option #1 is a filing cabinet.
Option #2 is a chessboard.

Go for #2.

>>7
Not necessarily. A built-in 2D array is one contiguous block, but if you allocate each row separately (an array of row pointers, or vector<vector<T>>), each row is contiguous in memory while the rows themselves need not be adjacent.
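The usual place this bites in C++: a vector<vector<T>> makes one heap allocation per row, while a built-in array (or one flat vector) is a single block. A small illustration (names are made up):

```cpp
#include <cstdint>
#include <vector>

constexpr int h = 3, w = 4;

// One contiguous block: row r+1 starts exactly w elements after row r.
std::uint32_t grid[h][w] = {};

// h separate heap allocations: each row is contiguous internally,
// but consecutive rows can land anywhere on the heap.
std::vector<std::vector<std::uint32_t>> rows(h, std::vector<std::uint32_t>(w));
```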

Name: Anonymous 2010-01-27 14:31

>>13
Which would, in real situations, be slower than the original solution.
Let's analyze your situation:
- Generating the assembly code takes quite a bit of time.
- The assembly code per cell is probably bigger than the original data, so it not only eats bandwidth but also costs a LOT of time due to cache thrashing etc.

Guess the good ol' for loop using SSE and MMX is faster.
