Referencing system assemblies in Roslyn compilations

With the recent stable release of the Roslyn packages (.NET Compiler Platform) I finally decided to give it a try. I’ve found it pretty simple to compile some code at runtime, load the resulting assembly and execute it. Overall it’s pretty impressive, given the amount of work that’s being done.

There are loads of resources on Roslyn online, namely Josh Varty’s Learn Roslyn Now series and Roslyn’s documentation on GitHub. In this post, however, I’ll focus on one aspect that wasn’t obvious to me at first: referencing shared framework assemblies.

When writing code it’s very likely that one ends up using types defined in reference assemblies such as System.Runtime and System.Runtime.Extensions, especially with .NET Standard and .NET Core’s pay-for-play model. These assemblies are just facades, and the types they expose are forwarded to implementation assemblies at runtime, depending on the target framework.

Going back to Roslyn, the idea is that we’ll be compiling/generating code at runtime, so the question arises: which assemblies should I reference during the compilation? The only satisfying explanation I’ve found is a bit lost in a GitHub issue, hence this post.

There are two main approaches we can use during compilation: 1) use runtime (implementation) assemblies; and 2) use reference (contract) assemblies.

The remainder of this post is based on the following sample program that uses Roslyn. Note that the code to compile (the code variable) uses DateTime (defined in the reference assembly System.Runtime) and Math (defined in the reference assembly System.Runtime.Extensions).

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.Loader;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

class Program
{
    static void Main(string[] args)
    {
        // Source code to compile at runtime; it uses DateTime and Math.
        var code = @"
        using System;
        public class Test
        {
            public double MagicNumber => Math.Round(42.42 * DateTime.Now.Second);
        }
        ";

        var syntaxTree = CSharpSyntaxTree.ParseText(code);

        // TODO: define references here! Left as null until one of the options below is plugged in.
        List<MetadataReference> references = null;
        var options = new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary, optimizationLevel: OptimizationLevel.Release, allowUnsafe: false);

        var compilation = CSharpCompilation.Create(Guid.NewGuid().ToString("N"), new[] { syntaxTree }, references, options);

        using (var ms = new MemoryStream())
        {
            // Emit the compiled assembly into memory.
            var compilationResult = compilation.Emit(ms);
            if (compilationResult.Success)
            {
                // Rewind the stream before loading the assembly from it.
                ms.Position = 0;
                var assembly = AssemblyLoadContext.Default.LoadFromStream(ms);
                var instance = Activator.CreateInstance(assembly.DefinedTypes.First().UnderlyingSystemType);
                // Do something with "instance"
            }
            else
            {
                foreach (var error in compilationResult.Diagnostics)
                {
                    Console.WriteLine(error.ToString());
                }
            }
        }
    }
}
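
The sample leaves the "Do something with instance" step open. Since the host program doesn't know the generated type at compile time, one way to use the instance is through reflection. The following is a minimal sketch, not part of the original sample: a locally defined Test class stands in for the Roslyn-generated one so that the snippet runs on its own, and InstanceReader.ReadMagicNumber is a hypothetical helper name.

```csharp
using System;
using System.Reflection;

// Stand-in for the type emitted by Roslyn in the sample above (assumption:
// in the real program this type would come from the compiled assembly).
public class Test
{
    public double MagicNumber => Math.Round(42.42 * DateTime.Now.Second);
}

public static class InstanceReader
{
    // Hypothetical helper: reads the MagicNumber property from an object
    // whose type is only known at runtime.
    public static double ReadMagicNumber(object instance)
    {
        PropertyInfo property = instance.GetType().GetRuntimeProperty("MagicNumber");
        if (property == null)
        {
            throw new InvalidOperationException("MagicNumber property not found.");
        }
        return (double)property.GetValue(instance);
    }
}

public static class ReflectionDemo
{
    public static void Main()
    {
        object instance = Activator.CreateInstance(typeof(Test));
        Console.WriteLine(InstanceReader.ReadMagicNumber(instance));
    }
}
```

The printed value depends on the current second, so it changes from run to run.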

Option 1 – Runtime assemblies

Since we’re invoking Roslyn, we’re already in the context of an executing application targeting a specific framework. This framework includes the runtime (implementation) assemblies that we need, so the first option for compilation is referencing those assemblies. But how can we know their location in a flexible way?

The .NET Core runtime defines a so-called “trusted platform assemblies” list, accessible via the AppContext class:

var trustedAssembliesPaths = ((string)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES")).Split(Path.PathSeparator);

This list contains the locations of all the assemblies loaded from trusted locations, namely the ones on shared framework installations and NuGet caches. Example:

"C:\Program Files\dotnet\shared\Microsoft.NETCore.App\1.1.1\System.Linq.dll"
"C:\Users\luis\.nuget\packages\system.xml.xpath\4.3.0\lib\netstandard1.3\System.Xml.XPath.dll"

The aforementioned list contains a lot of assemblies that we probably don’t need (and don’t want to make available!) in our code, so it’s probably a good idea to filter the list. After that, creating a MetadataReference is straightforward. The full code to obtain the references that my code needs is:

var trustedAssembliesPaths = ((string)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES")).Split(Path.PathSeparator);
var neededAssemblies = new[]
{
    "System.Runtime",
    "mscorlib",
};
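List<MetadataReference> references = trustedAssembliesPaths
    .Where(p => neededAssemblies.Contains(Path.GetFileNameWithoutExtension(p)))
    // CreateFromFile returns a PortableExecutableReference, so cast to the base type for the list.
    .Select(p => (MetadataReference)MetadataReference.CreateFromFile(p))
    .ToList();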
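
The snippet above assumes that the TRUSTED_PLATFORM_ASSEMBLIES entry is always present. A slightly more defensive variant is sketched below; it only deals with file paths, so it doesn't need the Roslyn packages. TrustedAssemblies.FindPaths is a hypothetical helper name, and returning an empty list when the host doesn't expose the entry is my own choice, not something the runtime prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TrustedAssemblies
{
    // Hypothetical helper: returns the paths of the trusted platform assemblies
    // whose simple name (file name without extension) is in "names".
    // Returns an empty list if TRUSTED_PLATFORM_ASSEMBLIES is not available.
    public static List<string> FindPaths(IEnumerable<string> names)
    {
        var wanted = new HashSet<string>(names, StringComparer.OrdinalIgnoreCase);
        var raw = AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES") as string;
        if (string.IsNullOrEmpty(raw))
        {
            return new List<string>();
        }

        return raw.Split(Path.PathSeparator)
            .Where(p => wanted.Contains(Path.GetFileNameWithoutExtension(p)))
            .ToList();
    }
}

public static class TrustedAssembliesDemo
{
    public static void Main()
    {
        foreach (var path in TrustedAssemblies.FindPaths(new[] { "System.Runtime", "mscorlib" }))
        {
            Console.WriteLine(path);
        }
    }
}
```

Each returned path could then be passed to MetadataReference.CreateFromFile, as in the code above.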

Pros:

  • All the needed assemblies are made available by the framework. No need to include additional resources on the application package.

Cons:

  • Need to know the actual runtime (implementation) assemblies. Note that my code depends on the System.Runtime.Extensions facade, but I actually add a reference to mscorlib.
  • If an implementation assembly is updated (e.g. shared framework updated) the code might break (it’s unlikely, but possible). Even though .NET APIs should be backward compatible, “compiler overload resolution might prefer an API added in the new version instead of the one that it used to pick before”.
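
To find out which implementation assembly actually hosts a given type on the current runtime, one can ask the type itself. The sketch below is an illustration under that assumption; TypeLocator and its method are hypothetical names, and the assembly names it prints differ between runtimes and versions.

```csharp
using System;
using System.Reflection;

public static class TypeLocator
{
    // Hypothetical helper: simple name of the assembly that implements "type"
    // on the runtime that is executing this code.
    public static string ImplementationAssemblyName(Type type)
    {
        return type.GetTypeInfo().Assembly.GetName().Name;
    }
}

public static class TypeLocatorDemo
{
    public static void Main()
    {
        foreach (var type in new[] { typeof(DateTime), typeof(Math) })
        {
            Assembly assembly = type.GetTypeInfo().Assembly;
            // Location is the file path of the implementation assembly.
            Console.WriteLine($"{type.FullName} -> {TypeLocator.ImplementationAssemblyName(type)} ({assembly.Location})");
        }
    }
}
```

The Location printed here is a file path that could be handed to MetadataReference.CreateFromFile, which ties the compilation to whatever implementation assembly the running framework happens to use.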

Option 2 – Reference assemblies

Instead of relying on runtime assemblies, we can compile the code against reference assemblies. The NuGet packages for System.Runtime and the like include both implementation and reference/contract assemblies (you can see this in your local NuGet cache).

System.Runtime/4.3.0/
                   \_ lib/net462/System.Runtime.dll
                   \_ ref/netstandard1.5/System.Runtime.dll

As these assemblies are in a well-known location, we can easily create MetadataReferences:

var nugetCache = Environment.GetEnvironmentVariable("UserProfile") + @"\.nuget\packages\";
List<MetadataReference> references = new List<MetadataReference>
{
    MetadataReference.CreateFromFile(nugetCache + @"System.Runtime\4.3.0\ref\netstandard1.5\System.Runtime.dll"),
    MetadataReference.CreateFromFile(nugetCache + @"System.Runtime.Extensions\4.3.0\ref\netstandard1.5\System.Runtime.Extensions.dll"),
};

Note that in the sample above I’m using the local NuGet cache. In a real-world scenario you’d include the reference assemblies for the chosen .NET Standard version in the application package (e.g. as resources).
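
If you do read from the local NuGet cache while experimenting, the hard-coded Windows path can be made a bit more portable. The sketch below assumes NuGet's documented defaults: the global packages folder lives under the user profile in .nuget/packages, can be overridden with the NUGET_PACKAGES environment variable, and stores package ids in lowercase. NuGetCache and its methods are hypothetical names.

```csharp
using System;
using System.IO;

public static class NuGetCache
{
    // Hypothetical helper: root of the NuGet global packages folder.
    // Assumption: NUGET_PACKAGES overrides the default <user profile>/.nuget/packages.
    public static string GetPackagesRoot()
    {
        var overridden = Environment.GetEnvironmentVariable("NUGET_PACKAGES");
        if (!string.IsNullOrEmpty(overridden))
        {
            return overridden;
        }

        var home = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
        return Path.Combine(home, ".nuget", "packages");
    }

    // Hypothetical helper: path of a reference assembly inside a cached package.
    // Assumption: package folders use the lowercase package id.
    public static string GetReferenceAssemblyPath(string packageId, string version, string targetFramework)
    {
        return Path.Combine(GetPackagesRoot(), packageId.ToLowerInvariant(), version, "ref", targetFramework, packageId + ".dll");
    }
}

public static class NuGetCacheDemo
{
    public static void Main()
    {
        Console.WriteLine(NuGetCache.GetReferenceAssemblyPath("System.Runtime", "4.3.0", "netstandard1.5"));
        Console.WriteLine(NuGetCache.GetReferenceAssemblyPath("System.Runtime.Extensions", "4.3.0", "netstandard1.5"));
    }
}
```

The helper only builds paths; it doesn't check that the package is actually present in the cache, so a File.Exists check before creating the MetadataReference would be a sensible addition.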

When the assembly generated using Roslyn is loaded, its assembly references are resolved into the corresponding runtime assemblies.
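
One way to observe this is to list the references recorded in an assembly's metadata and see what they load as. The following sketch inspects the assembly that contains the demo code itself, as a stand-in for the Roslyn-generated one; ReferenceInspector is a hypothetical name, and Assembly.GetReferencedAssemblies may not be available on older .NET Core versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ReferenceInspector
{
    // Hypothetical helper: simple names of the assemblies referenced by "assembly".
    public static List<string> ReferencedAssemblyNames(Assembly assembly)
    {
        return assembly.GetReferencedAssemblies().Select(n => n.Name).ToList();
    }
}

public static class ReferenceInspectorDemo
{
    public static void Main()
    {
        // Stand-in for the assembly loaded from the Roslyn output stream.
        Assembly assembly = typeof(ReferenceInspectorDemo).GetTypeInfo().Assembly;

        foreach (var name in ReferenceInspector.ReferencedAssemblyNames(assembly))
        {
            try
            {
                // Ask the runtime to resolve the reference, as it would on first use.
                Assembly resolved = Assembly.Load(new AssemblyName(name));
                Console.WriteLine($"{name} -> {resolved.Location}");
            }
            catch (Exception e)
            {
                Console.WriteLine($"{name} -> not resolved ({e.GetType().Name})");
            }
        }
    }
}
```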

Pros:

  • Assemblies are loaded from a well-known location.
  • Since the reference assemblies won’t change (unless you explicitly update them) there’s a guarantee that the code won’t break on future versions of the .NET Core runtime.

Cons:

  • Additional items on the application package.

Conclusion

Despite both approaches being valid and the first one being used by the C# Scripting API, I tend to prefer the second as it is more stable and less obscure. Hope this helps!