Friday, November 1, 2013

Compile and execute a code snippet from your C# program

The problem

Every so often I find myself creating little command line programs that:
  1. Gather a list of items from somewhere
  2. Perform some user-controllable operation on each item
My problem for today is in that user-controllable operation. Somehow I often end up designing and implementing a little mini-programming language.
    > ChangeDocuments "UserName = Frank" "Set RetentionPolicy=30y"

    > ChangeDocuments "Today > ExpirationDate" "Add State=Expired"
The first parameter to this ChangeDocuments program is a quite regular search query that I can feed into the system I am programming against. The system is well designed, so I can feed nicely complex search queries into it:
    RetentionPolicy = 30y AND (ExpirationDate = Null OR AnyOtherField = AnyValue)
When I am testing my little ChangeDocuments program I can clearly see that the query language was designed by people who know that sort of stuff and parse search queries for a living.

Unfortunately that doesn't apply to me. Which means that the second parameter, the one that I have to implement myself, turns into a mini programming language that gets uglier with every step.

    Set RetentionPolicy=30y
    Add State=Expired
    Clear RetentionPolicy
    Set ExpirationDate=TodayPlus30Years()
Ouch! Did you see that last one? Not only is it ugly, but I'll have to find a way to parse that function call out of there and implement the function. And what if they want a different number of years? Sure, I could write a proper parser for that and implement function calls, call stacks, contexts and all that. But I also need it to be done today and to feel slightly more wieldy than the way I'll end up implementing it. Besides... aren't there enough people that implement programming languages for a living already?

So what I'd really prefer to do is execute a snippet of C# code, so that the above examples can turn into something more regular, like:

    Set("RetentionPolicy", "30y")
    Add("State", "Expired")
    Clear("RetentionPolicy")
    Set("ExpirationDate", DateTime.Now + TimeSpan.FromDays(30 * 365))
Note that this is still not an ideal language. I'd prefer to have symbols like RetentionPolicy and State to be available like named variables. But short of implementing my own domain-specific language, this is as close as I could get with C# in a reasonable time.

Walkthrough

When I first dynamically compiled code in .NET, I was shocked at how simple it was.

Let's start with this code snippet:

    DateTime.Now + TimeSpan.FromDays(30 * 365)
For the moment, we'll put it in a variable:
     var snippet = "DateTime.Now + TimeSpan.FromDays(30 * 365)";
We'll be compiling the code using .NET's built-in C# compiler. The C# compiler can only handle full "code files" as its input: the type of thing you'd normally find inside a .cs file.

So we'll wrap our snippet in a little template:

    using System;
    public class Snippet {
        public static void main(string[] args) {
            Console.WriteLine(CODE SNIPPET GOES HERE);
        }
    }
In code:
    var template = "using System;\npublic class Snippet {{\n\tpublic static void main(string[] args) {{\n\t\tConsole.WriteLine({0});\n\t}}\n}}";
    var code = string.Format(template, snippet);
Note that for now we simply write the result of the expression to the console. We'll see other ways to handle this later.

Next up is the bulk of our operation: compiling this code into an assembly.

    var provider = new Microsoft.CSharp.CSharpCodeProvider();

    var parameters = new System.CodeDom.Compiler.CompilerParameters{ GenerateExecutable = false, GenerateInMemory = true };

    var results = provider.CompileAssemblyFromSource(parameters, code);                                                

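    if (results.Errors.Count > 0) {
        // results.Errors is a non-generic collection, so the element type must be spelled out
        foreach (System.CodeDom.Compiler.CompilerError error in results.Errors) {
            Console.Error.WriteLine("Line number {0}, Error Number: {1}, '{2};\r\n\r\n", error.Line, error.ErrorNumber, error.ErrorText);
        }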
    } else {
        var type = results.CompiledAssembly.GetType("Snippet");
        var method = type.GetMethod("main" );
        method.Invoke(null, new object[] { new string[0] });
    }
That is really all there is to it. Running it prints something like:
    10/25/2043 1:21:04 PM
There are tons of additional options you can set on the CompilerParameters, but this minimal set tells the compiler to:
  • generate a library assembly, instead of an executable
  • keep the generated assembly in memory, instead of putting it on disk

Complete code

This is the complete code snippet that we constructed so far:
    var snippet = "DateTime.Now + TimeSpan.FromDays(30 * 365)";
    var template = "using System;\npublic class Snippet {{\n\tpublic static void main(string[] args) {{\n\t\tConsole.WriteLine({0});\n\t}}\n}}";

    var code = string.Format(template, snippet);

    var provider = new Microsoft.CSharp.CSharpCodeProvider();

    var parameters = new System.CodeDom.Compiler.CompilerParameters{ GenerateExecutable = false, GenerateInMemory = true };

    var results = provider.CompileAssemblyFromSource(parameters, code);                                                
    if (results.Errors.Count > 0) {
        foreach (System.CodeDom.Compiler.CompilerError error in results.Errors) {
            Console.Error.WriteLine("Line number {0}, Error Number: {1}, '{2};\r\n\r\n", error.Line, error.ErrorNumber, error.ErrorText);
        }
    } else {
        var type = results.CompiledAssembly.GetType("Snippet");
        var method = type.GetMethod("main" );
        method.Invoke(null, new object[] { new string[0] });
    }
I normally run this type of code snippet in LINQPad, which does something quite similar (albeit likely a lot more elaborate) internally. But if you just paste the above into the main of a command line program in your Visual Studio project it'll also work of course.

Possible changes and considerations

Use an instance method

In the above code we use a static main method. If you'd instead prefer to use a regular instance method, you'll need to instantiate an object and pass it to Invoke, like this:
    var type = results.CompiledAssembly.GetType("Snippet");
    var obj = Activator.CreateInstance(type);
    var method = type.GetMethod("main" );
    method.Invoke(obj, new object[] { new string[0] });
If you do this, I recommend that you name the method something other than main, since most people will associate main with a static void method.

Pass parameters to the snippet

The snippet operates in complete isolation so far. To make it a bit more useful, let's pass some parameters into it.

First we'll need to modify our template to do something with the new parameter:

	var template = "using System;\npublic class Snippet {{\n\tpublic static void main(string[] args) {{\n\t\tConsole.WriteLine(args[0], {0});\n\t}}\n}}";
So we just use the first argument as the format string in Console.WriteLine(args[0], {0}). Then we pass a value for this parameter when we invoke the method:
    method.Invoke(null, new object[] { new string[] { "Expiration date = {0}" } });
And now the snippet will print:
    Expiration date = 10/25/2043 1:21:04 PM
Make the script return a value

However interesting writing to the console is, it would probably be even more useful if our snippet returned its value instead.

To accomplish this, we'll change the template for the main method to this:

    public static string main(string[] args) {{\n\t\treturn string.Format(args[0], {0});\n\t}}
And the invocation now needs to handle the return value:
    var output = method.Invoke(null, new object[] { new string[] { "Expiration date = {0}" } });
    Console.WriteLine(output);
Note that the snippet itself has remained completely unchanged through all our modifications so far. This is a good design principle if you ever allow the users of your application to specify logic in this way: make sure that any changes you make to your code are backwards compatible. Whatever code the users wrote should keep working without changes.

The line numbers for errors are offset

If there is an error in the snippet, the script will write the error message(s) that it gets back from the C# compiler.

So when we change the snippet to:

    var snippet = "DateTime.No + TimeSpan.FromDays(30 * 365)";
We'll get this output:
    Line number 4, Error Number: CS0117, ''System.DateTime' does not contain a definition for 'No';
The error message itself is correct of course, but the snippet we provided is one line, so the line number is clearly wrong. The reason is that our snippet is merged into the template and becomes:
    using System;
    public class Snippet {
        public static string main(string[] args) {
            return string.Format(args[0], DateTime.No + TimeSpan.FromDays(30 * 365));
        }
    }
And indeed, this C# contains the problem on line 4.
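The offset can be derived from the template itself instead of being hard-coded. What follows is a minimal sketch, assuming the snippet is inserted exactly once through the {0} placeholder; the SnippetErrors class and its method names are hypothetical helpers of mine, not part of .NET.

```csharp
using System;

public static class SnippetErrors
{
    // Counts the line breaks in the template that come before the {0} placeholder.
    // Assumption: the first "{0}" in the template is where the snippet goes.
    public static int LineOffset(string template)
    {
        int placeholder = template.IndexOf("{0}", StringComparison.Ordinal);
        if (placeholder < 0)
            throw new ArgumentException("Template has no {0} placeholder.", "template");

        int offset = 0;
        for (int i = 0; i < placeholder; i++)
        {
            if (template[i] == '\n') offset++;
        }
        return offset;
    }

    // Translates a line number reported by the compiler into a line of the user's snippet.
    public static int ToSnippetLine(int compilerLine, string template)
    {
        return compilerLine - LineOffset(template);
    }
}
```

With the template used in this post there are three line breaks before the placeholder, so the compiler's line 4 maps back to line 1 of the snippet. You would call ToSnippetLine(error.Line, template) when printing each error.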

The solution for this line number offset is to either subtract the offset from the line number in the error message or simply not print the line number. In a simple case such as this the latter option is not as bad as it may sound: we only support short snippets of code, so the line numbers should be of limited value. But then again: never underestimate the ability of your users to do more with a feature than you ever thought possible.

Make the wrapper class more complete

Probably the most powerful way you can extend the abilities of your snippet-writing users is by providing them more "built-in primitives".

  • Any method you add to the Snippet class becomes like a built-in, global function to the snippet author. So the Set, Add and Clear methods of my original snippets could be implemented like that.
  • You can also make the Snippet class inherit from your own base class, where you implement these helper functions.
  • Any variables that you define inside the main method before you include the user's snippet will become like built-in, global variables to your snippet authors. I've had great success in the past with making utilities such as a log variable available like this.
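As a rough sketch of the first approach, the template below defines Set and Clear directly in the Snippet class and lets them operate on a dictionary that the host passes in. The dictionary standing in for a document, the BuiltInsDemo class and the /*SNIPPET*/ marker are all assumptions for illustration; I use a marker with string.Replace here instead of string.Format only to avoid doubling every brace in a longer template.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Collections.Generic;
using Microsoft.CSharp;

public static class BuiltInsDemo
{
    // Hypothetical template: Set and Clear look like global functions to the snippet author.
    const string Template = @"using System;
using System.Collections.Generic;
public class Snippet {
    static IDictionary<string, object> doc;
    static void Set(string field, object value) { doc[field] = value; }
    static void Clear(string field) { doc.Remove(field); }
    public static void main(IDictionary<string, object> document) {
        doc = document;
        /*SNIPPET*/;
    }
}";

    // Compiles the snippet into the template and runs it against the given document.
    public static void Run(string snippet, IDictionary<string, object> document)
    {
        var code = Template.Replace("/*SNIPPET*/", snippet);
        var provider = new CSharpCodeProvider();
        var parameters = new CompilerParameters { GenerateExecutable = false, GenerateInMemory = true };
        var results = provider.CompileAssemblyFromSource(parameters, code);
        if (results.Errors.HasErrors)
            throw new InvalidOperationException(results.Errors[0].ErrorText);

        var method = results.CompiledAssembly.GetType("Snippet").GetMethod("main");
        method.Invoke(null, new object[] { document });
    }
}
```

A call such as BuiltInsDemo.Run("Set(\"State\", \"Expired\"); Clear(\"RetentionPolicy\")", document) then changes the dictionary in place. In a real program the helpers would talk to the actual system instead of a dictionary.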
Allow importing more namespaces and referencing more assemblies

The template above imports just one namespace and only binds to the default system assemblies of .NET. To let your snippet authors use more functionality easily, you can expand the number of using statements in the template and add additional references to the ReferencedAssemblies collection of the CompilerParameters.
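For example, to let snippets use LINQ, the template needs a using for System.Linq and the compiler needs a reference to System.Core.dll. This is only a sketch: the ReferencesDemo class is a hypothetical name, and the template is a variation of the one above that returns the snippet's value as an object.

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

public static class ReferencesDemo
{
    // Variation of the template from this post, with an extra using statement.
    const string Template =
        "using System;\nusing System.Linq;\npublic class Snippet {{\n" +
        "\tpublic static object main() {{\n\t\treturn {0};\n\t}}\n}}";

    // Compiles the snippet with extra assembly references and returns its value.
    public static object Evaluate(string snippet)
    {
        var code = string.Format(Template, snippet);
        var provider = new CSharpCodeProvider();
        var parameters = new CompilerParameters { GenerateExecutable = false, GenerateInMemory = true };

        // The LINQ extension methods live in System.Core.dll, so reference it explicitly.
        parameters.ReferencedAssemblies.Add("System.dll");
        parameters.ReferencedAssemblies.Add("System.Core.dll");

        var results = provider.CompileAssemblyFromSource(parameters, code);
        if (results.Errors.HasErrors)
            throw new InvalidOperationException(results.Errors[0].ErrorText);

        return results.CompiledAssembly.GetType("Snippet").GetMethod("main").Invoke(null, null);
    }
}
```

A snippet like new[] { 1, 2, 3 }.Sum() can then be passed to Evaluate without the author having to know about assemblies at all.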

Alternatively you can give the users a syntax that allows them to specify their own namespaces to import and even assemblies to reference. In the past I got some pretty decent results with this syntax:

   <%@ Import Namespace="Path.Of.Namespace.To.Import" %>
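One way to support such a directive is to strip it from the snippet before compiling and turn each occurrence into a using statement for the template. The sketch below assumes the directive looks exactly like the line above; the ImportDirectives class and the regular expression are my own illustration, and referencing extra assemblies would need a second, similar directive that I leave out here.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ImportDirectives
{
    // Matches <%@ Import Namespace="Some.Namespace" %> and captures the namespace.
    static readonly Regex Import = new Regex(
        @"<%@\s*Import\s+Namespace\s*=\s*""(?<ns>[^""]+)""\s*%>",
        RegexOptions.IgnoreCase);

    // Returns the snippet without its directives; the using statements
    // generated from the directives come back through the out parameter.
    public static string Extract(string snippet, out string usings)
    {
        var sb = new StringBuilder();
        foreach (Match m in Import.Matches(snippet))
        {
            sb.Append("using ").Append(m.Groups["ns"].Value).Append(";\n");
        }
        usings = sb.ToString();
        return Import.Replace(snippet, "").Trim();
    }
}
```

The returned usings string can be placed at the top of the template, next to using System;. Keep in mind that every extra line at the top shifts the line numbers in error messages further.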
Use VB instead of C#

There is also a compiler for Visual Basic code. If you'd prefer to use that, you can find it here:
    var provider = new Microsoft.VisualBasic.VBCodeProvider();
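Switching the provider is not quite enough, because the template has to be written in Visual Basic as well. A minimal sketch of what that could look like, assuming the same Snippet class and main method names as before; the VbSnippet wrapper class is hypothetical.

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.VisualBasic;

public static class VbSnippet
{
    // Visual Basic version of the template; VB has no braces, so nothing needs escaping.
    const string Template =
        "Imports System\n" +
        "Public Class Snippet\n" +
        "\tPublic Shared Function main(args As String()) As String\n" +
        "\t\tReturn String.Format(args(0), {0})\n" +
        "\tEnd Function\n" +
        "End Class";

    // Compiles a VB expression into the template and formats its value.
    public static string Run(string snippet, string format)
    {
        var code = string.Format(Template, snippet);
        var provider = new VBCodeProvider();
        var parameters = new CompilerParameters { GenerateExecutable = false, GenerateInMemory = true };
        var results = provider.CompileAssemblyFromSource(parameters, code);
        if (results.Errors.HasErrors)
            throw new InvalidOperationException(results.Errors[0].ErrorText);

        var method = results.CompiledAssembly.GetType("Snippet").GetMethod("main");
        return (string)method.Invoke(null, new object[] { new[] { format } });
    }
}
```

The snippets themselves must then be VB expressions too, although simple ones such as DateTime.Now + TimeSpan.FromDays(30 * 365) read the same in both languages.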
