====== AS3 Speed tests ====== This is a community maintained page of speed optimization techniques for use with ActionScript3. Originally started on [[http://www.rockonflash.com/blog/?p=63|RockOnFlash.com]], the post just grew with new revisions and tests and yet new techniques from the community! Only made sense to put it on a wiki for everyone to maintain ;) ==== General Notes ==== (**Morf**) Many of these enhancements are particular to the type of variable declaration you use. So far, in testing **Number** vs. **int**, the greatest improvements are gained when using **int**s. In general (with exceptions), only moderate if any improvements are seen when using **Number**-declared variables, due to the limited accurate operations. The following excellent generalizations come from **Andre Michelle**’s website (link below): - Do not overload condition and statements as in AS2 It astounds me that this once-excellent technique for optimizing for speed no produces faster code. Overloading *should* produce faster code by reducing the number of compiled instructions and generating faster ones. - Do not use Objects, if you know which properties will be finally involved. - Cast instances, while reading from an Array - Try to use Integers (int) for all possible cases - Bitoperators are lightning fast (Other things we what would like to see in AS3: - Typed, length-fixed Arrays. “…this would increase the speed…” - A faster solution to read/write pixels on a bitmap.) ==== Optimization Links ==== [[http://rozengain.com/?postid=35|Dennis Ippel - Some ActionScript 3.0 Optimizations and links]] [[http://blog.andre-michelle.com/2005/as3-optimations-suggestions|Andre Michelle - General Guidelines for AS3 Optimizations]] ===== Integer Addition & Increment/Decrement ===== Tests done to determine if "a += b" really is faster (as it **should** be) than "a = a + b" have been inconclusive so far. I'd still recommend using the shorthand if you're accustomed to it, mostly because I'm hoping a new compiler will grant faster speeds for using it. The good news is that Incrementing (adding 1) with the ++ operator is far faster than "a = a + 1" - in fast, using the ++ four times (in consecutive statements) is still about 50% faster than "a = a + 4"!! Unfortunately, //**the same is not true with the decrement (--) operator**//. So far, tests on that have not shown any consistent speed increase over "a = a - 1". (sigh) Here's the testing code: import flash.utils.getTimer; var time : Number = getTimer(); var sumStd : Number = 0; var sumOpt : Number = 0; var sumMpt : Number = 0; var loops : Number; var thresh : Number = 0.002; function runStdTest():void { time = getTimer(); for (var i:int = 0; i < 20000000; i++) { var test:int = i; test = test - 1; } sumStd += (getTimer() - time); } function runOptTest():void { time = getTimer(); for (var i:int = 0; i < 20000000; i++) { var test:int = i; test--; } sumOpt += (getTimer() - time); } function runEmptyTest():void { time = getTimer(); for (var i:int = 0; i < 20000000; i++) { var test:int = i; } sumMpt += (getTimer() - time); } loops = 10; for (var i:int = 0; i < loops; i++) { runStdTest(); runOptTest(); runEmptyTest(); } trace ("Standard vs. Optimization - Mean over", loops, "loops"); trace ("Standard Test: ", (sumStd /= loops)); trace(); ===== Division ===== ==== divide by 2 ==== Now, you've probably heard the tip about "use multiplication instead of division when dividing by 2", right? faster: var n:Number = value *.5; slower: var n:Number = value / 2; run it with a test: import flash.utils.getTimer; var time:Number = getTimer(); function runDivisionTest():void { time = getTimer(); for(var i:int=0;i<10000000;i++) { var test:Number = i/2; } trace("DivisionTest: ", (getTimer()-time)); } function runMultTest():void { time = getTimer(); for(var i:int=0;i<10000000;i++) { var test:Number = i*.5; } trace("MultTest: ", (getTimer()-time)); } runDivisionTest(); runMultTest(); traces out: DivisionTest: 162 MultTest: 110 Alright, so it's not miles faster, but in a 3D engine like [[http://blog.papervision3d.org|Papervision3D]], this becomes the difference between a fast engine and a slow engine real quickly. Well, there's one still that's even faster: Bitwise shift operations Divide by 2 with a right shift: trace(10 >> 1); // traces: 5 Multiply by 2 with a left shift: trace(10 << 1); // traces: 20 Now, run the test against the Division and Multiplication tests above: function runBitTest():void { time = getTimer(); for(var i:int=0;i<10000000;i++) { var test:int = i >> 1; } trace("BitTest: ", (getTimer()-time)); } runBitTest(); traces out: DivisionTest: 152 MultTest: 112 BitTest: 63 HOLY CRAP Batman - that's nearly 1/2 the speed?? or should I say, 1*.5 the speed ;) (**Morf**) Well, **kind of** - but **you may not want to use the bitshift**. When the BitTest code above uses the the same (slower) **Number** declaration for **test**, the results are slightly different (btw, the iterator **i** in each testing loop also now has the same declaration (**int**) in each code section =)): Division by 2 - Mean over 20 loops DivisionTest: 83.3 MultTest: 62.9 BitRightTest: 45.85 EmptyCodeTest: 29.1 I've also thrown in an empty loop timing for comparison - so a more accurate comparison can be made by subtracting that timing from each of the others. These gave: Net timing results: DivisionTest: 54.2 MultTest: 33.8 BitRightTest: 16.75 So both optimizations are still faster with **Number** declarations. **But don't use BitShifting for Number operations.** That's because the bit shift will round off the result to an integer - AS3 turns your original value into an integer before making the bit shift. Unless that's what you're after, **use "0.5 *" instead.** ===== Math.floor/Math.ceil ===== So, I was looking at some other stuff with uint and this seems to be a real gem when it comes to Math.floor and Math.ceil operations. I was reading through Flash's help (I have to say, I really enjoy the new help - I live in there) on uint and realized that, like int, when you pass a number, it strips everything passed the decimal (like Math.floor). So, I thought, what the hey, let's give er' a speed test. Sure enough, it was much faster to use uint(n) than Math.floor(n) - nearly 10x's as fast ** -- UPDATE -- ** After Paulius Uza's comments about using int, I went back and added tests for int with the floor/ceil tests, and sure enough, they're even faster than using uint. floor's test wasn't that drastic, but ceil's was 1/2 the time of uint's version ** -- /UPDATE -- ** fast: var test:int = int(1.5); //test equals 1 slow: var test:Number = Math.floor(1.5); //test equals 1 ** -- /UPDATE by lilive -- ** But take care, this solution is only for positive integer: var test1:int = int(-1.5); //test1 equals -1 var test2:Number = Math.floor(-1.5); //test2 equals -2 Now, I know what yer thinkin': what about Math.ceil? +1? yes ;) ** -- /UPDATE by matthewwithanm -- ** Note that this this is //not// a general replacement for ceil as it will give an incorrect result for integers. [ceil(1) == 1 but int(1) + 1 == 2] fast: var test:int = int(1.5)+1; //test equals 2 slow: var test:Number = Math.ceil(1.5); //test equals 2 Time for a test: function runFloorTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:Number = Math.floor(n); } trace("FloorTest: ", (getTimer()-time)); } function runUintFloorTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:uint = uint(n); } trace("UintFloorTest: ", (getTimer()-time)); } function runIntFloorTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:int = int(n); } trace("IntFloorTest: ", (getTimer()-time)); } function runCeilTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:Number = Math.ceil(n); } trace("CeilTest: ", (getTimer()-time)); } function runUintCeilTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:uint = n == uint(n) ? n : uint(n)+1; } trace("UintCeilTest: ", (getTimer()-time)); } function runIntCeilTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:int = n == int(n) ? n : int(n)+1; } trace("IntCeilTest: ", (getTimer()-time)); } function runBitShiftFloorTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = 1.5; var test:Number = n >> 0; } trace("BitShiftFloorTest: ", (getTimer()-time)); } runFloorTest(); runUintFloorTest(); runIntFloorTest(); runBitShiftFloorTest(); runUintCeilTest(); runIntCeilTest(); traces out: FloorTest: 1733 UintFloorTest: 176 *IntFloorTest: 157 BitShiftFloorTest: 178 UintCeilTest: 650 *IntCeilTest: 384 ===== Math.abs ===== I was thinking about another one that I use sometimes which was using *-1 instead of Math.abs on a number faster: var nn:Number = -23 var test:Number= nn < 0 ? nn * -1 : nn; or just : var nn:Number = -23 var test:Number= nn < 0 ? -nn : nn; slower: var nn:Number = -23 var test:Number = Math.abs(nn); Lets test! function runABSTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = -1.5; var test:Number = Math.abs(n); } trace("ABSTest: ", (getTimer()-time)); } function runABSMultTest():void { time = getTimer(); for(var i:uint=0;i<10000000;i++) { var n:Number = -1.5; var test:Number = n < 0 ? n * -1 : n; } trace("ABSMultTest: ", (getTimer()-time)); } runABSTest(); runABSMultTest(); traces out: ABSTest: 1615 *ABSMultTest: 153 (**Morf**) So... Taking this a bit further, instead of multiplying by -1, what about just assigning the inverse? The code is updated so **n** (instead of **test**) receives the new **Abs** value: function runAbsTest():void { time = getTimer(); for (var i:int = 0; i < 10000000; i++) { var n:Number = -1.5; n = Math.abs (n); } sumAbs += (getTimer() - time); } function runMm1Test():void { time = getTimer(); for (var i:int = 0; i < 10000000; i++) { var n:Number = -1.5; n = n < 0 ? n * -1 : n; } sumMm1 += (getTimer() - time); } function runChsTest():void { time = getTimer(); for (var i:int = 0; i < 10000000; i++) { var n:Number = -1.5; n = n < 0 ? -n : n; } sumChs += (getTimer() - time); } function runEmptyTest():void { time = getTimer(); for (var i:int = 0; i < 10000000; i++) { var n:Number = -1.5; n = n; } sumMpt += (getTimer() - time); } And what if we just tested to see if **n** was less than zero **before** doing anything? function runIfmTest():void { time = getTimer(); for (var i:int = 0; i < 10000000; i++) { var n:Number = -1.5; if (n < 0) n = -n; } sumIfm += (getTimer() - time); } Traced out: Abs - Mean over 25 loops AbsTest: 1572.4 MultByMinus1Test: 121.96 ChsTest: 95.76 IfMinusTest: 76 EmptyTest: 32.48 Net timing results: AbsTest: 1539.92 MultByMinus1Test: 89.48 ChsTest: 63.28 IfMinusTest: 43.52 Bear in mind the **IfMinusTest** code assumes the inverse operation must take place **every** time (n is always < 0) - so the actual implementation would probably run even faster. ===== Math.sqrt ===== In response to a question posted on how to determine distance, someone said "sqrt is an expensive operation". So I tried it out against the old Babylonian Method that was known (as recently as the late 20th century =)!) to be quite accurate once you'd run it through a few iterations: import flash.utils.getTimer; var time : Number = getTimer(); var sumSqrt : Number = 0; var sumAppr : Number = 0; var sumMpt : Number = 0; var loops : Number; var thresh : Number = 0.00001; var a : Number = 0, b : Number = 0, c : Number = 0, w : Number = 0; var count : int = 0; w = 147.29; trace ("Actual value =", Math.sqrt (w)); // This is what AS3 gives us. trace ("Testing threshold:", thresh); b = w * 0.25; do { count++; c = w / b; b = (b + c) * 0.5; a = b - c; if (a < 0) a = -a; } while (a > thresh); // This tests the algorithm. c = Math.sqrt (w); trace ("Number of iterations for approximation:", count); trace ("Approximation:", b); trace ("Error in approximation:", b - Math.sqrt (w)); trace (" - - - -"); function runSqrtTest():void { time = getTimer(); for (var i:int = 0; i < 1000000; i++) { var test:Number = Math.sqrt (w); } sumSqrt += (getTimer() - time); } function runApprTest():void { time = getTimer(); for (var i:int = 0; i < 1000000; i++) { var b :Number = w * 0.25, a :Number; c :Number; do { c = w / b; b = (b + c) * 0.5; a = b - c; if (a < 0) a = -a; } while (a > thresh); } sumAppr += (getTimer() - time); } function runEmptyTest():void { time = getTimer(); for (var i:int = 0; i < 1000000; i++) { var n:Number = w; } sumMpt += (getTimer() - time); } loops = 20; for (var i:int = 0; i < loops; i++) { runSqrtTest(); runApprTest(); runEmptyTest(); } trace ("Math.sqrt - Mean over", loops, "loops"); trace ("Math.sqrt() Test: ", (sumSqrt /= loops)); trace ("Approximation Test: ", (sumAppr /= loops)); trace ("Empty Test: ", (sumMpt /= loops)); trace ("Net timing results:"); trace ("Math.sqrt() Test:", (sumSqrt - sumMpt)); trace ("Approximation Test:", (sumAppr - sumMpt)); I could not believe the results: Actual value = 12.136309158883519 Testing threshold: 0.00001 Number of iterations for approximation: 6 Approximation: 12.13630915888352 Error in approximation: 1.7763568394002505e-15 - - - - Math.sqrt - Mean over 20 loops Math.sqrt() Test: 172.05 Approximation Test: 134.8 Empty Test: 4.05 Net timing results: Math.sqrt() Test: 168 Approximation Test: 130.75 Okay, what gives? Even my math co-processor on my old 286 (that I **still** do my taxes on) could do better than that! And when we boost the testing threshold up for one less iteration, we still get 10-digit accuracy, but with nearly a 1/3 speed boost: Actual value = 12.136309158883519 Testing threshold: 0.002 Number of iterations for approximation: 5 Approximation: 12.136309166280597 Error in approximation: 7.397078505277932e-9 - - - - Math.sqrt - Mean over 20 loops Math.sqrt() Test: 167.3 Approximation Test: 114.95 Empty Test: 3.95 Net timing results: Math.sqrt() Test: 163.35000000000002 Approximation Test: 111 Speed Increase: 32.047750229568415 % I can't help feeling like I'm missing something here :T If I'm not, then this algorithm will be faster than Math.sqrt. UPDATE by K2xL.com@gmail.com: Whoever posted this tip must've used an older version of Flash. I ran the Math.sqrt improvement test with Flash 9 v. 115 today on March 26th, 2008. The results were the opposite. I think Adobe optimized their Math.sqrt. Here are the results I got when running your code: Actual value = 12.136309158883519 Testing threshold: 0.00001 Number of iterations for approximation: 6 Approximation: 12.13630915888352 Error in approximation: 1.7763568394002505e-15 - - - - Math.sqrt - Mean over 20 loops Math.sqrt() Test: 36.65 Approximation Test: 60.1 Empty Test: 26.8 Net timing results: Math.sqrt() Test: 9.849999999999998 Approximation Test: 33.3 (**Morf**) UPDATE RESPONSE: That revision of Flash **did** speed a lot of things up, including Math.sqrt(); I suspect they streamlined calls in general (this process is by far the biggest culprit in this problem). I did go ahead and create faster code that is way way uglier. It is faster by about 20% under Flash 10... but only gives approx .0001 accuracy, which is plenty for Papervision3D applications. In any case, I don't recommend its use as it will not offer any increase in speed unless used inline (not as any sort of function or method). Here's the package used for testing: package { import flash.display.*; import flash.text.*; import flash.utils.getTimer; public class Main extends Sprite { public function Main() { var time : Number = getTimer(); var sumSqrt : Number = 0; var sumAppr : Number = 0; var sumMpt : Number = 0; var loops : Number; var resultMsg:TextField = new TextField(); addChild (resultMsg); var thresh : Number = 0.002; var a : Number = 0, b : Number = 0, c : Number = 0, ww : Number = 0, www : Number = 0; var count : int = 0; var j : int = 0; var jj : int = 0; function runApprTest():void { time = getTimer(); for (var ii:int = 0; ii < 20; ii++) { for (var i:int = 33000; i < 44000; i++) { var sqNum:Number = i + 0.208; //--Using the Babylonian Method for computing square roots efficiently requires making the initial fast // approximation as accurate as possible, insuring that the number of iterations can be minimized while // perserving a desired level of accuracy. // var apprxInit : int; var apprxRoot : Number; var initVal : Number; var invVal : Boolean; if (sqNum < 1) { initVal = 1 / sqNum; invVal = true; } else { initVal = sqNum; invVal = false; } apprxInit = initVal; //--Make first approximation of APPRXROOT: Set the starting approximation APPRXINIT to determine // its binary degree; in this case, the location of its most significant bit to determine its length // in bits. // In AS3, the most efficient method for accomplishing this is by comparing it to integers equal to odd // powers of 2, then shifting it right half the number of bits of its length. This unfortunately results // in rather unseemly code. The final treatment is one iteration of the Babylonian Method. "Hey, Morf // said it's fast and it works" // Well, yes... but this MUST BE USED AS INLINE CODE FOR ANY SPEED OPTIMIZATION TO BE REALIZED. // if (apprxInit > 7) { if (apprxInit < 32768) { if (apprxInit < 128) { if (apprxInit < 32) { apprxInit >>= 2; if (apprxInit < 4) apprxInit++; } else { apprxInit >>= 3; } } else { if (apprxInit < 2048) { if (apprxInit < 512) { apprxInit >>= 4; } else { apprxInit >>= 5; } } else { if (apprxInit < 8096) { apprxInit >>= 6; } else { apprxInit >>= 7; } } } } else { if (apprxInit < 8388608) { if (apprxInit < 524288) { if (apprxInit < 131072) { apprxInit >>= 8; } else { apprxInit >>= 9; } } else { if (apprxInit < 2097152) { apprxInit >>= 10; } else { apprxInit >>= 11; } } } else { if (apprxInit < 134217728) { if (apprxInit < 33554432) { apprxInit >>= 12; } else { apprxInit >>= 13; } } else { apprxInit >>= 14; //What are you doing with a number this big anyway? Not bothering with yet another test. } } } apprxRoot = (apprxInit + initVal / apprxInit) * 0.5; } else if (apprxInit < 2) apprxRoot = initVal * 0.5 + 0.5 else apprxRoot = initVal * 0.25 + 1; // //-- First approximation will be within about 1% at twice the speed and may be sufficient for collision detection computations. //-- Now perform as many additional approximations as desired; fourth iteration = same precision as Math.sqrt(). // apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5; // babylonian iteration 2 -- about 20% faster. // apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5; // babylonian iteration 3 -- seven-digit precision. // apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5; // babylonian iteration 4 -- full precision. if (invVal) apprxRoot = 1 / apprxRoot; } } sumAppr += (getTimer() - time); } function runSqrtTest():void { time = getTimer(); for (var ii:int = 0; ii < 20; ii++) { for (var i:int = 33000; i < 44000; i++) { var w:Number = i + 0.208; var test:Number = Math.sqrt (w); } } sumSqrt += (getTimer() - time); } function runEmptyTest():void { time = getTimer(); for (var ii:int = 0; ii < 20; ii++) { for (var i:int = 33000; i < 44000; i++) { var n:Number = i + 0.208; } } sumMpt += (getTimer() - time); } loops = 50; for (var i:int = 0; i < loops; i++) { runSqrtTest(); runApprTest(); runEmptyTest(); } resultMsg.text = "Math.sqrt - Mean over " + loops + " loops" + "\n" + "Math.sqrt() Test: " + (sumSqrt /= loops) + "\n" + "Approximation Test: " + (sumAppr /= loops) + "\n" + "Empty Test: " + (sumMpt /= loops) + "\n" + "Net timing results:" + "\n" + "Math.sqrt() Test: " + (sumSqrt - sumMpt) + "\n" + "Approximation Test: " + (sumAppr - sumMpt) + "\n" + "Speed Increase: " + 100 * (sumSqrt - sumAppr) / (sumSqrt - sumMpt) + "% (anecdotal)"; resultMsg.autoSize = TextFieldAutoSize.LEFT; } } } ===== Summary ===== ==== Number ==== **Operation** >>>> "**faster code**" Division by 2 »» "n * 0.5" Multiply by 2 »» "n + n" Math.floor »» "n >> 0" Math.ceil »» "n == int(n) ? n : int(n)+1" Absolute value »» "if (n < 0) n = -n" //slow var test:Number = n < 0 ? n * -1 : n; //fast var test:Number = n; if (n < 0) { n * -1; } else { n; }