A look at a powerful development technique as used in the PetaPoco Micro ORM
In a previous post I talked about using PetaPoco on a recent project. I mentioned that it uses Dynamic Method Invocation to create a method on the fly to convert an IDataReader record into a POCO.
The technique itself is very powerful in the right situation, and is used to good effect in PetaPoco (and other Micro ORMs). If you had to write a method to convert a known IDataReader instance into a known POCO instance it would be straightforward enough. However, if you knew neither the format of the values in the IDataReader nor the type of the POCO, it would be more difficult. You would have to write quite a convoluted method to deal with it, and the conditional logic (and reflection) would make it perform poorly, especially as the method could be called many times to load a dataset. Instead you could examine the IDataReader and the type and use Dynamic Method Invocation to build a method that works for that exact situation.
There are lots of problems that can be solved by the generalised form of this approach, but there are a couple of reasons I think it is not more widely used. The technique is a little misunderstood and can sometimes feel a bit like magic. It can also look intimidating because the method body is built by emitting .Net Intermediate Language (which has gone through various guises, but is now known as Common Intermediate Language, or CIL for short), which can be outside some developers' comfort zones.
The basic idea is that PetaPoco will generate a method to hydrate a POCO at runtime by examining the IDataReader and creating a one-off factory method to perform the conversion, so let's dive in and have a look at what PetaPoco is doing. A useful place to start is by looking at how the dynamically created method will be used.
At this point it is worth mentioning that PetaPoco makes use of the Dynamic Language Runtime (DLR), introduced as part of .Net 4.0, and the ExpandoObject type. This means that it can return a dynamic ExpandoObject with properties matching the fields returned from the IDataReader. It also supports .Net 3.5, which lacks the DLR, by using #if directives. I am going to assume that we are using .Net 4.0 and will be looking at the code used to return an ExpandoObject.
This particular usage is from the Query method, which returns an IEnumerable<T>, where T is the type of the POCO you want to return. Inside you will find this:
var factory = pd.GetFactory(cmd.CommandText, _sharedConnection.ConnectionString,
    ForceDateTimesToUtc, 0, r.FieldCount, r) as Func<IDataReader, T>;
using (r)
{
    while (true)
    {
        T poco;
        try
        {
            if (!r.Read())
                yield break;
            poco = factory(r);
        }
        catch (Exception x)
        {
            OnException(x);
            throw;
        }
        yield return poco;
    }
}
Our first clue is that the Query method uses the GetFactory method to obtain a factory, which is then used to return the POCOs for the IEnumerable<T> one at a time using yield return. Notice that the return value from GetFactory is cast to a Func<IDataReader, T>, because GetFactory returns a Delegate.
public Delegate GetFactory(string sql,
    string connString, bool ForceDateTimesToUtc,
    int firstColumn, int countColumns, IDataReader r)
The first part of GetFactory does some caching to ensure that factories are reused as far as possible, then a new DynamicMethod is created:
var m = new DynamicMethod("petapoco_factory_" + PocoFactories.Count.ToString(),
    type, new Type[] { typeof(IDataReader) }, true);
This particular constructor creates an anonymously hosted dynamic method, which means the dynamic method is associated with an anonymous assembly rather than an existing type. This isolates the dynamic method from other code and makes it safer to use in partial trust environments. The constructor takes a name, a return type (in this case the POCO type), a Type[] containing the parameter types (just the one, the IDataReader) and a bool, restrictedSkipVisibility, which when true allows the method to access private, protected and internal members of existing types.
So now we have a new DynamicMethod we can start to generate the IL. First get the ILGenerator.
var il = m.GetILGenerator();
Now we need to start building up the method body with IL. Everything you can do with C# can be written in IL (after all, C# and the other .Net languages are compiled down to IL), but the IL can be more verbose. IL is also a stack-based language: operands are pushed onto the stack, then operators pop the operands from the stack, perform an operation and push the result onto the top of the stack.
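To make the stack model concrete, here is a minimal sketch that is not taken from PetaPoco. The class StackDemo and the method name "add_two_ints" are made up for illustration. It emits a method that pushes its two int arguments, adds them and returns whatever is left on the stack.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical demo class, not part of PetaPoco.
public static class StackDemo
{
    // Builds a delegate equivalent to (a, b) => a + b by emitting IL directly.
    public static Func<int, int, int> BuildAdd()
    {
        var m = new DynamicMethod("add_two_ints", typeof(int),
            new Type[] { typeof(int), typeof(int) }, true);
        var il = m.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0); // a
        il.Emit(OpCodes.Ldarg_1); // a, b
        il.Emit(OpCodes.Add);     // (a + b)
        il.Emit(OpCodes.Ret);     // return the value left on the stack

        return (Func<int, int, int>)m.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        var add = BuildAdd();
        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```

The comments use the same convention as the PetaPoco source: each one lists the contents of the stack after the instruction has run, with the top of the stack on the right.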
There are three different sections depending on the type to be returned. Although geared towards POCOs, PetaPoco can happily also return an ExpandoObject or a single scalar value.
First up is the section that returns an Expando with properties mirroring the IDataReader row. We will have a look at the highlights and hopefully learn some IL as we go. The first thing is to create the ExpandoObject and place it on the top of the stack (and also the bottom, as the stack is currently empty).
il.Emit(OpCodes.Newobj, typeof(System.Dynamic.ExpandoObject)
    .GetConstructor(Type.EmptyTypes));
We are going to need to call a method on the Expando at some point to add each property. ExpandoObject implements IDictionary<string, object>, so a MethodInfo is defined to hold the method metadata for that interface's Add method. It will be used later.
MethodInfo fnAdd = typeof(IDictionary<string, object>).GetMethod("Add");
Now we can loop through all of the columns in the IDataReader and add them to the Expando using IL. There is some additional logic for data type conversion to support the IDataReader implementations of various databases. For the sake of brevity I will leave the converter logic out and focus on the main logic. I have left in the comments from the PetaPoco source.
// Enumerate all fields generating a set assignment for the column
for (int i = firstColumn; i < firstColumn + countColumns; i++)
{
    var srcType = r.GetFieldType(i);
    il.Emit(OpCodes.Dup); // obj, obj
    il.Emit(OpCodes.Ldstr, r.GetName(i)); // obj, obj, fieldname
First get the field type of column i from the IDataReader. Recall that the IL stack already has the Expando we created earlier on it (or, to be more precise, a reference to the object). Dup duplicates that reference and Ldstr pushes a string reference holding the column name.
    // r[i]
    il.Emit(OpCodes.Ldarg_0); // obj, obj, fieldname, rdr
    il.Emit(OpCodes.Ldc_I4, i); // obj, obj, fieldname, rdr,i
    il.Emit(OpCodes.Callvirt, fnGetValue); // obj, obj, fieldname, value
The next three statements get the value from the IDataReader. Ldarg_0 pushes the argument at position 0, which is the IDataReader, onto the stack. Then Ldc_I4 pushes the int value of i onto the stack. Callvirt calls a virtual method on an object. Here the method is fnGetValue, a MethodInfo for IDataRecord.GetValue(i) that was defined previously. The object in question is the IDataReader and the argument is i. Both are popped, and the result of the method call is left on the top of the stack.
    // Convert DBNull to null
    il.Emit(OpCodes.Dup); // obj, obj, fieldname, value, value
    il.Emit(OpCodes.Isinst, typeof(DBNull)); // obj, obj, fieldname, value, (value or null)
Dup duplicates the value on the top of the stack. Isinst then pops that duplicate and checks whether it is an instance of a particular type, in this case DBNull. If it is, the reference is cast to DBNull and pushed back onto the stack. If it is not, a null reference is pushed instead. The top of the stack now contains either a reference to DBNull or a null reference.
    var lblNotNull = il.DefineLabel();
    il.Emit(OpCodes.Brfalse_S, lblNotNull); // obj, obj, fieldname, value
    il.Emit(OpCodes.Pop); // obj, obj, fieldname
    il.Emit(OpCodes.Ldnull); // obj, obj, fieldname, null
    il.MarkLabel(lblNotNull);
    il.Emit(OpCodes.Callvirt, fnAdd);
}
This section can be a bit confusing so I will step through it slowly. It is the IL equivalent of an if-statement, where execution branches depending on a condition. First a label is defined. This label will be used as the target of the branch. Recall that the top two items on the stack are the value from the IDataReader and either DBNull (if the value is DBNull) or null (if the value is not DBNull).
The Brfalse_S instruction pops the top item from the stack and checks it for a false-ish value: false, null or zero. If it is false-ish, execution jumps to the label lblNotNull. Here a null means the value is not DBNull, so the original value is left untouched on the top of the stack. Execution resumes at the point marked by MarkLabel and continues with the Callvirt on the fnAdd MethodInfo we defined earlier. Callvirt pops the value, the field name and the duplicated ExpandoObject reference from the stack, then calls Add on the ExpandoObject with the other two as arguments. This has the effect of adding a property named after column i with the value from IDataReader column i.
If the branch condition is not false-ish (meaning the value on the stack is DBNull) then execution continues without jumping. As DBNull is of no use to us in the Expando, it is popped from the stack and Ldnull is used to push null in its place. It is this null that is then passed to the Callvirt, adding a property with a null value to the Expando.
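The branch is easier to follow in isolation. The sketch below is not PetaPoco code. DbNullDemo and "dbnull_to_null" are hypothetical names. It wraps the same Dup, Isinst, Brfalse_S, Pop, Ldnull sequence in a dynamic method that takes a single object and returns it unchanged unless it is DBNull, in which case it returns null.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical demo class, not part of PetaPoco.
public static class DbNullDemo
{
    // Builds a delegate equivalent to v => v is DBNull ? null : v.
    public static Func<object, object> BuildDbNullToNull()
    {
        var m = new DynamicMethod("dbnull_to_null", typeof(object),
            new Type[] { typeof(object) }, true);
        var il = m.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);                // value
        il.Emit(OpCodes.Dup);                    // value, value
        il.Emit(OpCodes.Isinst, typeof(DBNull)); // value, (DBNull or null)
        var lblNotNull = il.DefineLabel();
        il.Emit(OpCodes.Brfalse_S, lblNotNull);  // value
        il.Emit(OpCodes.Pop);                    // (empty)
        il.Emit(OpCodes.Ldnull);                 // null
        il.MarkLabel(lblNotNull);                // either path leaves one item
        il.Emit(OpCodes.Ret);

        return (Func<object, object>)m.CreateDelegate(typeof(Func<object, object>));
    }

    public static void Main()
    {
        var convert = BuildDbNullToNull();
        Console.WriteLine(convert(DBNull.Value) == null); // prints True
        Console.WriteLine(convert("abc"));                // prints abc
    }
}
```

Whichever path is taken, exactly one item is on the stack when execution reaches the label, which is what the following instruction expects.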
This set of instructions is emitted once for each column in the IDataReader. After each one the stack contains just a reference to the ExpandoObject, which ends up with a property for each column. Finally the Expando object is returned.
il.Emit(OpCodes.Ret);
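Putting the fragments together, the following is a simplified, self-contained sketch of the Expando path. It is not PetaPoco's actual GetFactory: the caching and converter logic are left out, the firstColumn and countColumns parameters are replaced by a loop over every column, the names ExpandoFactoryDemo and BuildFactory are made up, and the definition of fnGetValue is my assumption of how that MethodInfo is obtained. A DataTable stands in for a real database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Dynamic;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical demo class, not part of PetaPoco.
public static class ExpandoFactoryDemo
{
    public static Func<IDataReader, object> BuildFactory(IDataReader r)
    {
        var m = new DynamicMethod("expando_factory_sketch", typeof(object),
            new Type[] { typeof(IDataReader) }, true);
        var il = m.GetILGenerator();

        // Assumed definitions of the two MethodInfos used by the emitted code.
        MethodInfo fnGetValue = typeof(IDataRecord).GetMethod("GetValue", new Type[] { typeof(int) });
        MethodInfo fnAdd = typeof(IDictionary<string, object>).GetMethod("Add");

        il.Emit(OpCodes.Newobj, typeof(ExpandoObject).GetConstructor(Type.EmptyTypes)); // obj

        for (int i = 0; i < r.FieldCount; i++)
        {
            il.Emit(OpCodes.Dup);                    // obj, obj
            il.Emit(OpCodes.Ldstr, r.GetName(i));    // obj, obj, fieldname
            il.Emit(OpCodes.Ldarg_0);                // obj, obj, fieldname, rdr
            il.Emit(OpCodes.Ldc_I4, i);              // obj, obj, fieldname, rdr, i
            il.Emit(OpCodes.Callvirt, fnGetValue);   // obj, obj, fieldname, value
            il.Emit(OpCodes.Dup);                    // obj, obj, fieldname, value, value
            il.Emit(OpCodes.Isinst, typeof(DBNull)); // obj, obj, fieldname, value, (DBNull or null)
            var lblNotNull = il.DefineLabel();
            il.Emit(OpCodes.Brfalse_S, lblNotNull);  // obj, obj, fieldname, value
            il.Emit(OpCodes.Pop);                    // obj, obj, fieldname
            il.Emit(OpCodes.Ldnull);                 // obj, obj, fieldname, null
            il.MarkLabel(lblNotNull);
            il.Emit(OpCodes.Callvirt, fnAdd);        // obj
        }

        il.Emit(OpCodes.Ret);
        return (Func<IDataReader, object>)m.CreateDelegate(typeof(Func<IDataReader, object>));
    }

    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, DBNull.Value);

        using (IDataReader r = table.CreateDataReader())
        {
            var factory = BuildFactory(r); // built once, reused for every row
            while (r.Read())
            {
                var row = (IDictionary<string, object>)factory(r);
                Console.WriteLine(row["Id"]);           // prints 1
                Console.WriteLine(row["Name"] == null); // prints True
            }
        }
    }
}
```

Note that the C# for loop runs while the method is being built, not when it is called. The generated method contains no loop at all, just one straight-line block of instructions per column.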
You may be thinking that the above steps could be done without resorting to building up a dynamic method in IL, and you would be correct. It is trivially easy to add named properties to an ExpandoObject. PetaPoco does it in IL because it is part of a larger area of code that can return an ExpandoObject, a scalar or a POCO depending on what has been asked for.
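For comparison, here is a sketch of what the non-IL version might look like. It is not PetaPoco code, and PlainExpandoDemo and ToExpando are hypothetical names. It relies on the same fact the IL does, that ExpandoObject implements IDictionary<string, object>.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Dynamic;

// Hypothetical demo class, not part of PetaPoco.
public static class PlainExpandoDemo
{
    // Copies every column of the current record into a new ExpandoObject,
    // converting DBNull to null along the way.
    public static object ToExpando(IDataRecord r)
    {
        IDictionary<string, object> obj = new ExpandoObject();
        for (int i = 0; i < r.FieldCount; i++)
        {
            object value = r.GetValue(i);
            obj.Add(r.GetName(i), value is DBNull ? null : value);
        }
        return obj;
    }

    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, DBNull.Value);

        using (IDataReader r = table.CreateDataReader())
        {
            while (r.Read())
            {
                var row = (IDictionary<string, object>)ToExpando(r);
                Console.WriteLine(row["Id"]);           // prints 1
                Console.WriteLine(row["Name"] == null); // prints True
            }
        }
    }
}
```

Unlike the generated method, this version loops over the columns and looks up each column name again for every row it converts.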
This covers the case when PetaPoco needs to return a dynamic ExpandoObject. Head over to part 2 to see how PetaPoco goes about returning a scalar value or a typed POCO.