Understanding the performance characteristics of a C# program isn’t always as straightforward as you might think. In much the same way that the JIT takes IL code, transforms it and then optimises it, your C# code first goes through a phase where it is lowered before being transformed into IL.
Shortcuts
Rather than providing custom implementations for each syntax feature at the IL level, lowering allows the compiler authors to save a bit of time and convert that syntax into simpler constructs, which can then be transformed into IL.
For example, a switch statement can sometimes be lowered to a series of if-else statements instead.
public void Test(Things thing)
{
    switch (thing)
    {
        case Things.First:
            Console.WriteLine("First");
            break;
        case Things.Second:
            Console.WriteLine("Second");
            break;
    }
}
// This is the lowered form of the above procedure
public void Test(Things thing)
{
    if (thing != 0)
    {
        if (thing == Things.Second)
        {
            Console.WriteLine("Second");
        }
    }
    else
    {
        Console.WriteLine("First");
    }
}
As we can see, the switch statement is first converted to a series of if-else statements before being transformed to IL. If we take a look at the resulting IL code, we can see that this step has allowed the compiler to reduce the switch statement to just two branching operations:
IL_0000: ldarg.1
IL_0001: brfalse.s IL_0008 // First branch
IL_0003: ldarg.1
IL_0004: ldc.i4.1
IL_0005: beq.s IL_0013 // Second branch
IL_0007: ret
IL_0008: ldstr "First"
IL_000d: call void [System.Console]System.Console::WriteLine(string)
IL_0012: ret
IL_0013: ldstr "Second"
IL_0018: call void [System.Console]System.Console::WriteLine(string)
IL_001d: ret
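The snippets above leave the Things enum undefined, so they cannot be run as they stand. The following is a minimal self-contained sketch, assuming Things is a plain enum whose first member has the value 0 (which is what the `thing != 0` test in the lowered form suggests). The class and method names are hypothetical, and the methods return strings instead of writing to the console so that the two forms can be compared directly.

```csharp
// Assumed definition: First = 0, Second = 1.
public enum Things { First, Second }

public static class SwitchLoweringSketch
{
    // Source form: a two-case switch.
    public static string Describe(Things thing)
    {
        switch (thing)
        {
            case Things.First: return "First";
            case Things.Second: return "Second";
            default: return "";
        }
    }

    // Hand-written equivalent of the lowered form: two comparisons, no switch.
    public static string DescribeLowered(Things thing)
    {
        if (thing != 0) // the literal 0 converts implicitly to any enum type
        {
            if (thing == Things.Second)
            {
                return "Second";
            }
            return ""; // neither case matched
        }
        return "First";
    }
}
```

Both methods return the same result for every input, including values that match neither case, which is the property lowering has to preserve.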
Optimisations
Not all lowering is done for syntax conversion reasons though. If we add just one more case to our switch statement from before:
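public void TestThing(Things thing)
{
    switch (thing)
    {
        case Things.First:
            Console.WriteLine("First");
            break;
        case Things.Second:
            Console.WriteLine("Second");
            break;
        // Added case
        case Things.Third:
            Console.WriteLine("Third");
            break;
    }
}
// This is the lowered form of the above procedure
public void TestThing(Things thing)
{
    switch (thing)
    {
        case Things.First:
            Console.WriteLine("First");
            break;
        case Things.Second:
            Console.WriteLine("Second");
            break;
        case Things.Third:
            Console.WriteLine("Third");
            break;
    }
}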
It turns out a switch is something that IL supports after all: this time the lowered form keeps the switch statement intact. If we take a look at the resulting IL code we can confirm this:
IL_0000: ldarg.1
IL_0001: switch (IL_0013, IL_001e, IL_0029) // The switch statement
IL_0012: ret
IL_0013: ldstr "First"
IL_0018: call void [System.Console]System.Console::WriteLine(string)
IL_001d: ret
IL_001e: ldstr "Second"
IL_0023: call void [System.Console]System.Console::WriteLine(string)
IL_0028: ret
IL_0029: ldstr "Third"
IL_002e: call void [System.Console]System.Console::WriteLine(string)
IL_0033: ret
So if IL supports the switch natively, why wasn’t it used in the first example? The answer is performance.
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1889 (21H1/May2021Update)
AMD Ryzen 7 5800X, 1 CPU, 16 logical and 8 physical cores
[Host] : .NET Framework 4.8 (4.8.4515.0), X86 LegacyJIT
DefaultJob : .NET Framework 4.8 (4.8.4515.0), X86 LegacyJIT
| Method | i | Mean | Error | StdDev | Allocated |
|------------- |-- |----------:|----------:|----------:|----------:|
| IfSwitch | 1 | 0.1992 ns | 0.0020 ns | 0.0018 ns | - |
| SwitchSwitch | 1 | 0.2166 ns | 0.0011 ns | 0.0009 ns | - |
I added some small computation to the switch cases to avoid the compiler optimising out the instructions entirely.
With only two cases, the if-else form is quicker, but for anything more than two cases the compiler opts for the native switch instead.
The difference in performance for this example is negligible; however, this kind of information is good to know so that you can make informed choices about the code you write.
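The benchmark source itself is not shown here. As a rough sketch of the kind of methods that could be compared, assuming an int input and a small arithmetic result per case so that the bodies are not optimised away, something like the following would do. The class name, method names and case bodies are all hypothetical, and which IL shape the compiler picks for each is not guaranteed.

```csharp
public static class SwitchShapes
{
    // Two cases: the shape the article observes being lowered to if-else.
    public static int TwoCases(int i)
    {
        switch (i)
        {
            case 0: return i + 10;
            case 1: return i + 20;
            default: return -1;
        }
    }

    // Three contiguous cases: the shape the article observes keeping a switch.
    public static int ThreeCases(int i)
    {
        switch (i)
        {
            case 0: return i + 10;
            case 1: return i + 20;
            case 2: return i + 30;
            default: return -1;
        }
    }
}
```

In a BenchmarkDotNet project, methods like these would typically be instance methods marked with the [Benchmark] attribute, with the input supplied through a field or property marked [Params].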
Further Optimisations
One more switch example:
public void SwitchIfString(string i)
{
    switch (i)
    {
        case "0": break;
        case "1": break;
        case "2": break;
        case "3": break;
        case "4": break;
        case "5": break;
        // case "6": break;
    }
}
// This is the lowered form of the above procedure
public void SwitchIfString(string i)
{
    if (!(i == "0") && !(i == "1") && !(i == "2") && !(i == "3") && !(i == "4"))
    {
        bool flag = i == "5";
    }
}
We have some slightly different behaviour again. With the data type changed from an integer/enum to a string, the compiler opts for a construct that is likely to yield better performance: a chain of string equality comparisons.
For even more interesting results we can add in just one more case:
public void SwitchHashString(string i)
{
    switch (i)
    {
        case "0": break;
        case "1": break;
        case "2": break;
        case "3": break;
        case "4": break;
        case "5": break;
        case "6": break;
    }
}
// This is the lowered form of the above procedure
public void SwitchHashString(string i)
{
    uint num = <PrivateImplementationDetails>.ComputeStringHash(i);
    if (num <= 856466825)
    {
        if (num != 806133968)
        {
            if (num != 822911587)
            {
                if (num == 856466825)
                {
                    bool flag = i == "6";
                }
            }
            else
            {
                bool flag2 = i == "4";
            }
        }
        else
        {
            bool flag3 = i == "5";
        }
    }
    else if (num <= 890022063)
    {
        if (num != 873244444)
        {
            if (num == 890022063)
            {
                bool flag4 = i == "0";
            }
        }
        else
        {
            bool flag5 = i == "1";
        }
    }
    else if (num != 906799682)
    {
        if (num == 923577301)
        {
            bool flag6 = i == "2";
        }
    }
    else
    {
        bool flag7 = i == "3";
    }
}
A somewhat surprising turn of events perhaps, and a completely new construct. Instead of a switch or a simple if-else block, the comparisons have morphed into a tree of nested if-else statements that narrows down a hash of the string value, followed by a final string comparison to confirm the match.
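The body of ComputeStringHash is not shown in the lowered output. The constants that do appear there are consistent with a 32-bit FNV-1a hash over the characters of the string, so the sketch below assumes that algorithm. It is a hand-written approximation and not the compiler-generated helper itself; the class name is hypothetical.

```csharp
public static class StringHashSketch
{
    // 32-bit FNV-1a over the UTF-16 code units of the string (assumed algorithm).
    public static uint ComputeStringHash(string s)
    {
        uint hash = 2166136261u; // FNV offset basis
        if (s != null)
        {
            unchecked // the multiplication is meant to wrap modulo 2^32
            {
                for (int i = 0; i < s.Length; i++)
                {
                    hash = (s[i] ^ hash) * 16777619u; // xor first, then multiply by the FNV prime
                }
            }
        }
        return hash;
    }
}
```

With this definition, ComputeStringHash("0") gives 890022063 and ComputeStringHash("1") gives 873244444, matching the values compared against in the lowered code. Because different strings can share a hash, a hash match alone is not enough, which is why each leaf of the tree still performs a full string comparison.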
| Method | i | k | Mean | Error | StdDev | Allocated |
|----------------- |-- |-- |---------:|----------:|----------:|----------:|
| SwitchIfString | 1 | 1 | 4.591 ns | 0.0593 ns | 0.0555 ns | - |
| SwitchHashString | 1 | 1 | 2.961 ns | 0.0130 ns | 0.0121 ns | - |
Interestingly, the benchmarks show the if-else block being slower than the hash (4.591 ns against 2.961 ns). However, given the numbers we’re working with here I wouldn’t read too much into it. Perhaps different hardware and situations will change the results, or perhaps my benchmark isn’t quite right.
Microbenchmarks such as these only tell you so much, and they are not the purpose of this article. What is important is understanding that the code you write may not behave the way you expect, because the compiler may “lower” your syntax into another form. Knowing this gives you the perspective to dig deeper and fully understand problems when they arise.
As a further exercise I suggest taking a look at how async code is turned into a state machine when it is lowered:
using System.Threading.Tasks;
public class C
{
    public async Task TestAsync()
    {
        await Task.Delay(10);
    }
}
// This is the lowered form of the above code
using System;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Permissions;
using System.Threading.Tasks;
[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints)]
[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification = true)]
[assembly: AssemblyVersion("0.0.0.0")]
[module: UnverifiableCode]
public class C
{
    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <TestAsync>d__0 : IAsyncStateMachine
    {
        public int <>1__state;
        public AsyncTaskMethodBuilder <>t__builder;
        private TaskAwaiter <>u__1;
        private void MoveNext()
        {
            int num = <>1__state;
            try
            {
                TaskAwaiter awaiter;
                if (num != 0)
                {
                    awaiter = Task.Delay(10).GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        num = (<>1__state = 0);
                        <>u__1 = awaiter;
                        <>t__builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                else
                {
                    awaiter = <>u__1;
                    <>u__1 = default(TaskAwaiter);
                    num = (<>1__state = -1);
                }
                awaiter.GetResult();
            }
            catch (Exception exception)
            {
                <>1__state = -2;
                <>t__builder.SetException(exception);
                return;
            }
            <>1__state = -2;
            <>t__builder.SetResult();
        }
        void IAsyncStateMachine.MoveNext()
        {
            //ILSpy generated this explicit interface implementation from .override directive in MoveNext
            this.MoveNext();
        }
        [DebuggerHidden]
        private void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            <>t__builder.SetStateMachine(stateMachine);
        }
        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            //ILSpy generated this explicit interface implementation from .override directive in SetStateMachine
            this.SetStateMachine(stateMachine);
        }
    }
    [AsyncStateMachine(typeof(<TestAsync>d__0))]
    public Task TestAsync()
    {
        <TestAsync>d__0 stateMachine = default(<TestAsync>d__0);
        stateMachine.<>t__builder = AsyncTaskMethodBuilder.Create();
        stateMachine.<>1__state = -1;
        stateMachine.<>t__builder.Start(ref stateMachine);
        return stateMachine.<>t__builder.Task;
    }
}
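The generated MoveNext is easier to follow once the role of the state field is clear. The sketch below is a hand-written, heavily simplified illustration of that pattern and not compiler output. It drops the task builder, awaiter and exception handling entirely, assumes the awaited operation never completes synchronously, and uses hypothetical names throughout. The state values mirror the ones used above: -1 while running, 0 while suspended at the single await, -2 once finished.

```csharp
public sealed class TwoStepMachine
{
    // Mirrors the role of the <>1__state field in the generated code:
    // -1 = running / not suspended, 0 = suspended at the single await, -2 = finished.
    public int State { get; private set; } = -1;

    // Records which part of the method body ran, for illustration only.
    public string Log { get; private set; } = "";

    public void MoveNext()
    {
        if (State == -2)
        {
            return; // already finished (guard added for this sketch)
        }

        if (State != 0)
        {
            // First entry: run the code before the await, then suspend.
            Log += "before;";
            State = 0;
            return;
        }

        // Re-entry: resume after the await and run to completion.
        State = -1;
        Log += "after;";
        State = -2;
    }
}
```

Calling MoveNext once runs the first half and leaves State at 0. Calling it again resumes after the suspension point and leaves State at -2. In the real generated code the second call is made by the continuation registered through AwaitUnsafeOnCompleted.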
References
- Common Intermediate Language
- An excellent video by Nick Chapsas which covers some of the same ground as this article.
- sharplab.io used to convert C# code to its lowered form.
- BenchmarkDotNet used to create the benchmarks shown in this article.