Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon. Entire thread

JS Optimization

Name: Anonymous 2011-09-12 5:05

After profiling, I see that roughly 25-30% of my program's time is spent inside this function.

Unfortunately, this isn't my algorithm, and I don't think I can improve it. The paper it came from is here: http://www.dgp.toronto.edu/people/stam/reality/Research/pdf/GDC03.pdf

My only option seems to be "performance tricks", but I don't see any openings. The only thing I can say for sure that I haven't tried is ``loop unrolling'', but I don't have much experience doing that myself, and I don't see how I could do it here while staying in bounds. If it helps, height, width, and iterations are all set by the user.

So: any ideas on how to improve the performance of this code?

function lin_solve(b, x, x0, a, c) {
        if (a === 0 && c === 1) {
            for (var j = 1; j <= height; j++) {
                var currentRow = j * rowSize;
                ++currentRow;
        var i = width - 1; do {
                    x[currentRow] = x0[currentRow];
                    ++currentRow;
                } while (i--);
            }
            set_bnd(b, x);
        } else {
            var invC = 1 / c;
            for (var k = 0; k < iterations; k++) {
                for (var j = 1; j <= height; j++) {
                    var lastRow = (j - 1) * rowSize;
                    var currentRow = j * rowSize;
                    var nextRow = (j + 1) * rowSize;
                    var lastX = x[currentRow];
                    ++currentRow;
                   
            var i = width; do {
                        lastX = x[currentRow] = (x0[currentRow] + a*(lastX+x[++currentRow]+x[++lastRow]+x[++nextRow])) * invC;
            } while (i-- > 1)
                }
                set_bnd(b, x);
            }
        }
    }

Name: Anonymous 2011-09-12 10:37

Your first loop can be "flattened" like this (I don't do JS so there might be syntax errors, and you might need to adjust the indices a little, but you get the point.) It's essentially a memcpy().

var i = 1, j = rowSize*height;
while(i<l)
 x[i] = x0[i++];


A good JIT JS compiler should turn that into a rep movsd which is about as fast as you can get (since it's an intrinsic loop in hardware.)

Apply the same principles to your second loop. Try to move the multiplies out of the loop. If you profile line-by-line you will find they are taking the bulk of the execution time.

I also found this idiot who contradicts his own benchmarks:
http://lemire.me/blog/archives/2010/07/19/is-multiplication-slower-than-addition/
(Scroll down to the table of results. Hardly surprising.)

Newer Posts
Don't change these.
Name: Email:
Entire Thread Thread List