Monday, November 14, 2016

Exceptions and Performance

First, let me put out this disclaimer - the tests and improvements shown here are a worst case scenario built from deliberately extreme "out of bounds" testing.  While the tests and solutions here show a 4,000x+ performance improvement, this is not a typical real world scenario.  The lessons learned here, though, can still yield significant performance improvements in typical applications.  Just don't expect the exact results shown here in a normal real world application, where many other factors usually affect performance beyond one or two pain points.

In a prior post, I discussed the issues with string equality and how to easily increase performance around those areas.  Recently I ran into another problem that seems to plague code - the exception.

Exceptions, in environments such as .NET and Java, are there to let the code know when something has happened that the code has not accounted for.  This could be an attempt to divide by 0, a missing file, or even a lost network/database connection.  Exceptions are good things - they allow the programmer to work around the problem.  For instance, when a database connection is lost and an exception is thrown, the program could attempt to reconnect when it catches that exception.  Of course, there's an exception to this: some exceptions can't easily be handled in user code, such as running out of stack space or out of memory - most runtimes simply abort the program at that point, as it's normally a sign something went seriously wrong.  But exceptions are generally meant to be handled and accounted for, or, if at all possible, avoided entirely, because they're extremely slow.
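
Just to illustrate what "handled and accounted for" looks like, here's a minimal retry sketch - it's not part of the timed tests below, and the retry count, delay, and helper name are arbitrary placeholders for the example - where the catch actually does something useful with the failure instead of just swallowing it:

    // Sketch only: retry an operation that may fail transiently (for example,
    // a dropped connection).  Action comes from System; Thread.Sleep comes
    // from System.Threading.
    static void RunWithRetry(System.Action work, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                work();
                return;                 // success - no more retries needed
            }
            catch (Exception)
            {
                if (attempt >= maxAttempts)
                {
                    throw;              // give up - let the caller see the failure
                }
                System.Threading.Thread.Sleep(1000);    // wait briefly, then retry
            }
        }
    }

You'd call it as something like RunWithRetry(() => OpenConnection());, where OpenConnection is whatever operation might fail transiently.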

How slow?  Here's a comparison of various loops against an exception.  Each test is executed 5 times with an inner loop that performs a simple calculation 250,000,000 times - yes, I did say 250 million.  I used 250 million as the loop count because I'm trying to get a baseline of approximately 1 second for the specific tests.

First, though, a walk-thru of the performance test code.

    int iterations = 5;
    int loopCount = 250000000;
    Stopwatch sw = new Stopwatch();
    int sum = 0;
    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount; j++)
        {
            //code to test goes here
        }
        sw.Stop();

        System.Console.Write(sw.ElapsedMilliseconds + " ");
        sw.Reset();
    }
    System.Console.WriteLine();

In general, there's nothing really special about this code.  It sets up the variables, creates a stopwatch timer (more on that in a second), and then runs the loops, starting the stopwatch before each inner loop and stopping, printing, and resetting it afterwards.  The only interesting thing here is the stopwatch timer.

Date/time calculations, such as those done with DateTime.Now.Millisecond, are notorious for not being very accurate - they can vary and are often off by as much as 100ms (on top of DateTime.Now being fairly slow itself).  The measurements here need to be more accurate than that.  The Stopwatch timer, which lives in the System.Diagnostics namespace, is normally very accurate - often down to 1ms or better - which is why it's used here.
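
If you're curious what the timer on your own machine can actually do, Stopwatch will tell you directly - a quick check (nothing here is specific to the tests below) looks like:

    // Both members are static on System.Diagnostics.Stopwatch.  On most
    // machines IsHighResolution is true and Frequency is in the millions of
    // ticks per second, i.e. the tick interval is well under 1ms.
    System.Console.WriteLine("High resolution: " + Stopwatch.IsHighResolution);
    System.Console.WriteLine("Ticks per second: " + Stopwatch.Frequency);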

Now, back to the timings.

The simplest case is an empty loop that does nothing.  The results, for an empty loop, are around 754ms to do the entire loop 250 million times.  The exact times I received were:

785 742 769 748 727

Next, let's see what effect a simple calculation has on this timing.  To help keep the compiler from optimizing the calculation away, we'll accumulate into the variable named sum (shown above, outside the loop) and print the running summation of j*k once all the loops are done.  Here's the code for this simple calculation.

    sum = 0;
    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount; j++)
        {
            sum = sum + j * k;
        }
        sw.Stop();

        System.Console.Write(sw.ElapsedMilliseconds + " ");
        sw.Reset();
    }
    System.Console.WriteLine();

The results for this one, as we'd expect, are slightly higher than the loop with no code - average of around 938ms (so we come very close to the 1-second test I intended).  The exact results were:

959 927 911 942 953

Ok, now let's see what effect a conditional has on this - we'll check to see if j (the inner loop counter) is 0; if so, just add k (the outer loop counter) to sum, otherwise do the j*k calculation.  The code:

    sum = 0;
    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount; j++)
        {
            if (j == 0)
            {
                sum = sum + k;
            }
            else
            {
                sum = sum + j * k;
            }
        }
        sw.Stop();

        System.Console.Write(sw.ElapsedMilliseconds + " ");
        sw.Reset();
    }
    System.Console.WriteLine();

For this one, we have roughly 1324ms of time used for the summation - about 1.75x the empty loop and nearly 400ms more than the prior example.  Still, 1.3s for 250 million calculations isn't bad.

Since we're on the topic of conditionals, can we get better speed out of the C# conditional operator (?:)?  Technically, the code should be roughly equivalent to the code above once it's compiled to IL.  So, for this code:

    sum = 0;
    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount; j++)
        {
            sum = sum + ((j == 0) ? k : j * k);
        }
        sw.Stop();

        System.Console.Write(sw.ElapsedMilliseconds + " ");
        sw.Reset();
    }
    System.Console.WriteLine();

The results are a bit surprising:

2308 2245 2286 2274 2317

Wait! What? Why? Why is the code above twice as slow as the "equivalent" if/else code? Good question, but that's one to investigate at a different time since the goal of this comparison is exceptions. The hint, though, is to look at the IL that's generated for the two conditional versions. [But, oh, you thought that using the newer, more condensed way of doing things makes it better? Uh, nope - can't assume anything like that - here it's almost twice as slow as the verbose code.]
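
If you want to dig into it yourself, one approach (just a sketch - the method names here are mine, not from the test program) is to isolate the two forms into their own methods and compare the IL emitted for each, for example with ildasm or a decompiler:

    // Two hypothetical methods that exist only so their IL can be compared.
    static int IfElseSum(int sum, int j, int k)
    {
        if (j == 0)
        {
            sum = sum + k;
        }
        else
        {
            sum = sum + j * k;
        }
        return sum;
    }

    static int TernarySum(int sum, int j, int k)
    {
        sum = sum + ((j == 0) ? k : j * k);
        return sum;
    }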

Ok, so now we have our baseline, some simple code, and two ways of doing conditionals. Now, to the meat of the matter - how bad is the exception? Let's try throwing an exception within a try/catch block. The catch block does nothing, mainly so the timing reflects the cost of the exception itself.

    sum = 0;
    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount/2000000; j++)
        {
            try
            {
                throw new Exception();
            }
            catch(Exception)
            {

            }
        }
        sw.Stop();

        System.Console.Write(sw.ElapsedMilliseconds + " ");
        sw.Reset();
    }
    System.Console.WriteLine();

The results average out to 1382ms - oh, so just slightly above the conditional test then, right?  So why am I even writing this blog to complain about something that's not a problem?  Well, look closely at the inner loop - look at the count - loopCount (250 million) DIVIDED BY 2,000,000.  Yes, a for loop that counts from 0 up to 250,000,000/2,000,000, which means it counts up to 125!  Yes, 125... not 100,000, just 125... so 1382ms average for just 125 exceptions!  That's roughly 10ms per exception!  Nah, exceptions aren't really a problem, they're a disaster!  The exact times I received for this run were:

1801 1269 1248 1245 1351

And, yes, 1801 seems a bit of an outlier, but really it's right in line with the others - roughly 14ms per exception vs 10ms.  Either way, it's over 1 second for just 125 exceptions.

Now, I'm doing these tests in debug mode with Any CPU as the configuration.  I'm also running these tests within the IDE and with IntelliTrace off (IntelliTrace only makes things worse, with the worst hit coming on the exceptions test - up to a further 3x performance drop).  So, can we get better results when running as x86 or x64, or in release mode?  Well, let's try:

Debug - x86
***Empty loop***
803 844 770 767 769
***Simple calculation***
901 900 899 903 905
***If conditional with simple calculation***
1248 1241 1241 1234 1234
***Conditional operator with simple calculation***
2215 2185 2196 2207 2206
***Exceptions (scaled down by 2,000,000)***
1100 1075 1070 1101 1086

Debug - x64
***Empty loop***
845 844 818 802 818
***Simple calculation***
1021 1021 1014 1023 1008
***If conditional with simple calculation***
1315 1320 1364 1371 1364
***Conditional operator with simple calculation***
2247 2204 2189 2211 2208
***Exceptions (scaled down by 2,000,000)***
2760 2922 2785 2807 2793

Release - Any CPU
***Empty loop***
136 145 138 131 132
***Simple calculation***
131 132 134 134 150
***If conditional with simple calculation***
277 266 270 271 271
***Conditional operator with simple calculation***
496 515 495 504 495
***Exceptions (scaled down by 2,000,000)***
758 793 762 816 872

Release - x86
***Empty loop***
134 142 146 134 150
***Simple calculation***
135 131 133 145 148
***If conditional with simple calculation***
267 272 258 269 260
***Conditional operator with simple calculation***
500 501 503 500 525
***Exceptions (scaled down by 2,000,000)***
794 781 816 811 763

Release - x64
***Empty loop***
240 224 208 204 203
***Simple calculation***
235 237 227 222 225
***If conditional with simple calculation***
336 332 331 334 339
***Conditional operator with simple calculation***
341 360 380 357 356
***Exceptions (scaled down by 2,000,000)***
1885 1883 1988 1973 2031

Ok, so, in general, in release mode as x86, we can get it down quite well - to roughly 6x slower than a simple calculation.  The worst case seems to be x64, with almost a 9x performance drop compared to a simple calculation.  Now, remember, the exceptions loop only executes 125 times compared to 250,000,000 for the others, so, per iteration, we're really talking 6x * 2,000,000, or about 12 million times worse, for the best case and 9x * 2,000,000, or about 18 million times worse, for the worst case.  Are you having a performance issue in your application, you're debugging it in the IDE, the app is throwing a lot of exceptions, and you're wondering why it's so slow?  Well, to quote a famous MythBuster - "There's your problem!"  

So, in the "real world," apps are not run in the IDE - they're started by the OS in various ways (ie command prompt, service/daemon, etc).  What affect does the IDE have on exceptions?  Well, let's see...same tests, same code:

Debug - x86
***Empty loop***
752 748 746 759 758
***Simple calculation***
887 890 895 890 883
***If conditional with simple calculation***
1213 1209 1202 1205 1186
***Conditional operator with simple calculation***
2162 2241 2164 2155 2162
***Exceptions (scaled down by 2,000,000)***
2 2 2 2 2

2ms for exceptions now?  The entire loop ran in 2ms?  The fun part is that's 2ms for all 125 executions of the loop.  Let's drop the scale factor down a bit and see how it does.  For the following runs, I've set the scale to 4,000, so the exceptions loop runs 62,500 times, and we get the following results for each of the various build configurations.  Remember, this is outside the IDE (or via Ctrl-F5 within the IDE).

Debug - x86
***Empty loop***
768 758 771 764 752
***Simple calculation***
888 901 891 889 890
***If conditional with simple calculation***
1216 1219 1254 1233 1227
***Conditional operator with simple calculation***
2162 2173 2157 2157 2170
***Exceptions (scaled down by 4,000)***
1287 1311 1352 1292 1479

Debug - x64
***Empty loop***
820 803 834 793 791
***Simple calculation***
1010 1018 1026 1014 1042
***If conditional with simple calculation***
1413 1290 1295 1315 1293
***Conditional operator with simple calculation***
2237 2235 2270 2314 2277
***Exceptions (scaled down by 4,000)***
948 949 1011 996 990

Release - x86
***Empty loop***
139 133 135 131 134
***Simple calculation***
131 131 130 135 132
***If conditional with simple calculation***
261 260 263 266 263
***Conditional operator with simple calculation***
504 519 503 514 513
***Exceptions (scaled down by 4,000)***
1407 1392 1446 1407 1358

Release - x64
***Empty loop***
133 141 134 133 152
***Simple calculation***
600 602 596 599 604
***If conditional with simple calculation***
597 603 607 612 592
***Conditional operator with simple calculation***
599 599 591 599 587
***Exceptions (scaled down by 4,000)***
885 901 883 855 931

So, exceptions are roughly 500x faster outside the IDE - the loop now runs 125 x 500 = 62,500 times in about the same time the 125-iteration version took inside the IDE.  Still, at best case we're talking roughly 4,000x slower than the simple calculation itself, which is a large enough difference to be a fairly major problem when exceptions are being thrown a lot.

Am I saying we should abandon exceptions?  No, not at all (sure, abandon them and go back to what - the "On Error GoTo" of the VB era?).  I am saying we should use them for what they're there for - they were designed to give details about a problem that the code did not account for.  They should not be used as control flow to break out of methods, nor should they be "allowed" to happen "on their own" without the code trying to find and avoid the problem first.  The simple answer is that exceptions are expensive - guard against them to keep them from happening in the first place, where it's possible and where it makes sense.  If the code expects a file to exist, verify the file exists before you try to do something with it - and don't assume that because you checked for the file 30 seconds prior it still exists.  Do the checks and guards where it makes sense in the code.  If an exception is extremely rare and should "never" happen, the guard still has a cost, so use judgment there, but for things that can realistically happen, definitely guard against them to avoid paying the cost of the exception.
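
As a rough sketch of what that guard looks like (the file name here is just a placeholder, not from the tests), check the cheap condition first and keep the catch only for the case you genuinely can't rule out - the file disappearing or being locked between the check and the read:

    // Sketch only - File, IOException come from System.IO; "data.txt" is a
    // placeholder path for the example.
    string path = "data.txt";
    string contents = null;

    if (File.Exists(path))              // cheap guard - handles the common failure
    {
        try
        {
            contents = File.ReadAllText(path);
        }
        catch (IOException)             // still possible: the file was removed or
        {                               // locked between the check and the read
            contents = null;
        }
    }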

Just as an example, let's try a test similar to the one above but with a guaranteed failure for a certain percentage of the iterations.  We'll do a 0-100% fail rate (for space, only the results for 0-10% will be shown) over a smaller loop count of 2,500,000.  The code that does the test while avoiding the divide-by-zero exception looks like:

int sum = 0;
System.Console.WriteLine("***J/K summation***");
for (int l = 0; l < 101; l+=1)
{
    double remainderValue = loopCount * (1.0 - l / 100.0);
    double avg = 0.0;
    sum = 0;
    
    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount; j++)
        {
            // only divide when (int)remainderValue - j will be at least 1
            if (remainderValue > 0 && (int)remainderValue > j)
            {
                sum = sum + k * j / Math.Max(0, 
                   (int)remainderValue - j);
            }
            else
            {
                sum = sum + k;
            }
        }
        sw.Stop();

        avg = avg + sw.ElapsedMilliseconds;
        sw.Reset();
    }
    System.Console.Write("  " + (int)(avg / iterations) +
        " (" + l + "%) ");
}
System.Console.WriteLine();

And the code that allows the exceptions to happen looks like:

System.Console.WriteLine("***J/K summation w/ exceptions***");
for (int l = 0; l < 101; l += 1)
{
    double remainderValue = loopCount * (1.0 - l / 100.0);
    double avg = 0.0;
    sum = 0;

    for (int k = 0; k < iterations; k++)
    {
        sw.Start();
        for (int j = 0; j < loopCount; j++)
        {
            try
            {
                sum = sum + k * j / Math.Max(0, 
                    (int)remainderValue - j);
            }
            catch (DivideByZeroException)
            {
                sum = sum + k;
            }
        }
        sw.Stop();

        avg = avg + sw.ElapsedMilliseconds;
        sw.Reset();

    }
    System.Console.Write("  " + (int)(avg / iterations) + 
        " (" + l + "%) ");
}
System.Console.WriteLine();

The results, not run in the IDE, for the first 10% without exceptions look like:

***J/K summation***
  26 (0%)   27 (1%)   25 (2%)   23 (3%)   25 (4%)   24 (5%)
  25 (6%)   23 (7%)   32 (8%)   29 (9%)   32 (10%)

And with exceptions, again not in the IDE, the timing looks like:

***J/K summation w/ exceptions***
    28 (0%)  617 (1%) 1227 (2%) 1826 (3%) 2292 (4%)  2708 (5%)
  3260 (6%) 3787 (7%) 4154 (8%) 5126 (9%) 5412 (10%)

So, at even a 1% failure rate (25,000 items out of 2,500,000), we're seeing roughly a 25x drop in performance, and at a 10% failure rate (250,000 items) we're seeing over a 150x drop - and that drop is measured against the "slower performing" guarded code that already pays for the extra checks that prevent the exceptions.

Hopefully, this makes the point fairly clear - exceptions are expensive, even compared to adding guard clauses to help prevent them.  Even a small frequency of exceptions can cause substantial performance problems in an application, so prevent them when you can.
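
One last everyday example of the same idea (not part of the timed tests above): parsing user input.  int.Parse throws a FormatException on bad input, while int.TryParse reports failure through its return value, so the common failure case never pays for an exception:

    string userInput = "not a number";  // placeholder input for the example

    // Exception-based: every bad input pays for a thrown/caught FormatException.
    int value;
    try
    {
        value = int.Parse(userInput);
    }
    catch (FormatException)
    {
        value = 0;
    }

    // Guarded: TryParse returns false on bad input - no exception is thrown.
    if (!int.TryParse(userInput, out value))
    {
        value = 0;
    }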