Saturday, January 14, 2012

Comparing Transparent Lazy Loading between NHibernate and Entity Framework

Motivation

When I tried Entity Framework after many years with NHibernate, I assumed it had the same transparent lazy loading behavior as NHibernate. That assumption turned out to be not completely right: there are some minor behavioral differences in Entity Framework. In this article, I will explain what lazy loading is, how NHibernate and Entity Framework implement it, and where the two are the same and where they differ.

What is Lazy Loading

Quote from Wikipedia:
Lazy loading is a design pattern commonly used in computer programming to defer initialization of an object until the point at which it is needed. It can contribute to efficiency in the program’s operation if properly and appropriately used. The opposite of lazy loading is Eager Loading.

When an application has a domain model, each entity in the model represents a business entity in the problem domain, and associations between entities correspond to the relationships or interactions between those business entities. The beauty of a domain model is that it can mimic the problem domain as closely as we like. However, domain objects in memory are volatile, so we need database tables to persist them: domain objects must be saved to the database and loaded back into memory from it. The code for loading domain objects from a database is not trivial, and it sometimes consumes most of a developer's time on an application. On the other hand, this code is very similar from one domain entity to the next. ORM frameworks were introduced to provide that data access code in a unified way. An ORM framework relieves developers of the nitty-gritty database access code and lets them focus on the more important business logic. With an ORM framework, domain objects are loaded and saved through a developer-friendly API.
Domain objects are usually not independent. In most situations, multiple domain objects are linked to each other in one big object graph. When the application gets a particular domain object from the database, the related domain objects need to be loaded as well. Lazy loading was invented to load those related objects on demand, so that the developer can work with a domain object naturally without worrying about how its related objects are loaded. The easiest way to support lazy loading is to write custom code in properties and methods that initializes a related domain object when it is about to be used. Writing that code by hand is a lot of work, time-consuming, and error-prone. Fortunately, this kind of cross-cutting code can be injected by a run-time proxy (AOP) library, such as Castle DynamicProxy from the Castle Project. This is called transparent lazy loading. The precondition for transparent lazy loading is that every property or method involved is lazy-loading friendly, which in C# means marking it as virtual. The benefit of lazy loading is that query execution is deferred to the moment the application really needs the data, which also reduces the application's memory usage.
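To make the idea concrete, here is a minimal sketch of what an injected subclass does, written by hand. It is not the code that NHibernate or Entity Framework generates; the names EmployeeProxy, LazyLoadingSketch and the loader delegate are assumptions made up for illustration, and the delegate stands in for a real database query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical domain class; members are virtual so a subclass can intercept them.
public class Employee
{
    public virtual int EmployeeID { get; set; }
    public virtual IList<string> Tasks { get; set; }
}

// Hand-written stand-in for the subclass a proxy generator would emit at run time.
public class EmployeeProxy : Employee
{
    private readonly Func<int, IList<string>> _loadTasks; // stands in for a database query
    private bool _tasksLoaded;

    public EmployeeProxy(Func<int, IList<string>> loadTasks)
    {
        _loadTasks = loadTasks;
    }

    public override IList<string> Tasks
    {
        get
        {
            if (!_tasksLoaded)
            {
                // The "query" runs on first access only.
                base.Tasks = _loadTasks(EmployeeID);
                _tasksLoaded = true;
            }
            return base.Tasks;
        }
        set
        {
            base.Tasks = value;
            _tasksLoaded = true;
        }
    }
}

public static class LazyLoadingSketch
{
    // Reads Tasks twice and returns how many times the loader actually ran.
    public static int CountLoads()
    {
        int loads = 0;
        Employee employee = new EmployeeProxy(id =>
        {
            loads++;
            return new List<string> { "Task for employee " + id };
        }) { EmployeeID = 1 };

        int firstRead = employee.Tasks.Count;  // triggers the loader
        int secondRead = employee.Tasks.Count; // served from the already loaded list
        return loads;
    }
}
```

The calling code only ever sees an Employee, which is what makes the loading transparent: the override in the subclass decides when the data is fetched.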

How transparent lazy loading works in NHibernate

NHibernate uses a run-time proxy generator, such as Castle DynamicProxy, to implement lazy loading. Let's take a look at how to get lazy loading to work in NHibernate:
1. Make class properties and methods accessible and overridable
Lazy loading can only be added to public or protected virtual properties and methods. If your class properties and methods are private, you cannot use lazy loading on them. Lazy loading is also not available on class fields.
public class Employee
{
    public virtual int EmployeeID { get; set; }
    public virtual string Name { get; set; }
    public virtual Iesi.Collections.Generic.ISet<Task> Tasks { get; set; }
}

2. Turn on lazy loading in the HBM mapping file
The lazy attribute defaults to true, but you can still specify it explicitly in the HBM mapping file.

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" assembly="Demo" namespace="Demo">
  <class name="Employee">
    <id name="EmployeeID">
      <generator class="identity" />
    </id>
    <property name="Name" />
    <set name="Tasks" lazy="true" inverse="true" cascade="all-delete-orphan">
      <key column="EmployeeID"/>
      <one-to-many class="Demo.Task"/>
    </set>
  </class>
</hibernate-mapping>

3. Return the domain object from the session
The domain object must be returned from the session, for example by calling session.Get with the entity id or by calling session.CreateQuery with an HQL query:
var employee = session.Get<Employee>(1);
Or
var employee = session.CreateQuery("from Employee as e where e.EmployeeID = 1").List<Employee>().FirstOrDefault();
If you instead create the domain object yourself and attach it to the NHibernate session, it is a plain instance rather than a proxy, so you will not get lazy loading on it.
Internally, NHibernate relies on a run-time proxy generator such as Castle DynamicProxy to implement lazy loading. The generator subclasses the domain class and overrides all public and protected virtual properties and methods to add the logic that loads the related domain object on first access. The subclass is named <Class>Proxy.
Task Object in NHibernate
The session.Get method is where NHibernate creates an instance of the subclass to stand in for the real domain object. Code that uses the returned object afterwards is the same whether it holds the proxy or the real domain object. The lazy loading process is transparent to the developer. If you want to know whether an object has been loaded, you can use NHibernateUtil.IsInitialized to verify it.
Assert.IsFalse(NHibernateUtil.IsInitialized(task.Employee));
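The requirement from step 1 that members be virtual follows from how a proxy subclass works. The sketch below uses hypothetical Person and PersonProxy classes (they are not part of NHibernate) to show that a subclass can only intercept a member that is virtual; a non-virtual member can merely be hidden, and callers holding a reference of the base type never reach the hidden version.

```csharp
using System;

public class Person
{
    // Not virtual: a subclass cannot override it.
    public string Name { get { return "real name"; } }

    // Virtual: a subclass can intercept it.
    public virtual string Title { get { return "real title"; } }
}

// Hand-written stand-in for a generated proxy subclass.
public class PersonProxy : Person
{
    // 'new' only hides the base member; it is bound by the static type of the reference.
    public new string Name { get { return "intercepted name"; } }

    // 'override' is dispatched at run time, so it is used even through a Person reference.
    public override string Title { get { return "intercepted title"; } }
}

public static class VirtualSketch
{
    public static string Describe()
    {
        // Application code sees the proxy through the domain type, as it does with an ORM.
        Person person = new PersonProxy();
        return person.Name + " / " + person.Title;
    }
}
```

Calling VirtualSketch.Describe() returns "real name / intercepted title": only the virtual property went through the subclass, which is why lazy loading cannot be added to non-virtual members or fields.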

If you do not want lazy loading, either because of the N+1 performance issue or because you know the application will use all related objects anyway, you can turn it off by setting the lazy attribute to false.
<set name="Tasks" lazy="false" inverse="true" cascade="all-delete-orphan">
  <key column="EmployeeID"/>
  <one-to-many class="Demo.Task"/>
</set>
With the lazy attribute set to false, NHibernate does implicit eager loading for you.

How transparent lazy loading works in Entity Framework

Earlier versions of Entity Framework do not have a transparent lazy loading feature; it was introduced in version 4.0. Because transparent lazy loading was not part of the original design, the feature is designed differently in Entity Framework so that it stays backward compatible with older versions. If you have POCO domain classes in your application, you need to do the following to enable lazy loading in Entity Framework:
1. Make class properties and methods overridable and accessible
This step is similar to NHibernate: Entity Framework uses a subclass to inject the additional lazy loading code into the domain class.
Note: this only applies to POCO-style domain classes. If your domain classes are generated with the EntityModelCodeGenerator template, they already contain explicit lazy loading code, so no subclass is needed. Below is a code snippet generated with the EntityModelCodeGenerator template.
[XmlIgnoreAttribute()]
[SoapIgnoreAttribute()]
[DataMemberAttribute()]
[EdmRelationshipNavigationPropertyAttribute("LazyLoadingModel", "FK_Task_Employee", "Task")]
public EntityCollection<Task> Tasks
{
    get
    {
        return ((IEntityWithRelationships)this).RelationshipManager.GetRelatedCollection<Task>("LazyLoadingModel.FK_Task_Employee", "Task");
    }
    set
    {
        if ((value != null))
        {
            ((IEntityWithRelationships)this).RelationshipManager.InitializeRelatedCollection<Task>("LazyLoadingModel.FK_Task_Employee", "Task", value);
        }
    }
}
2. Specify true for “Lazy Loading Enabled” in the edmx properties
You can also set LazyLoadingEnabled directly in your context class, but the default place is the edmx file. If the context class is generated automatically, like LazyLoadingEntities in my example, the Lazy Loading Enabled setting ends up in it.
public partial class LazyLoadingEntities : ObjectContext
{
    public const string ConnectionString = "name=LazyLoadingEntities";
    public const string ContainerName = "LazyLoadingEntities";

    #region Constructors

    public LazyLoadingEntities()
        : base(ConnectionString, ContainerName)
    {
        this.ContextOptions.LazyLoadingEnabled = true;
    }

    public LazyLoadingEntities(string connectionString)
        : base(connectionString, ContainerName)
    {
        this.ContextOptions.LazyLoadingEnabled = true;
    }

    public LazyLoadingEntities(EntityConnection connection)
        : base(connection, ContainerName)
    {
        this.ContextOptions.LazyLoadingEnabled = true;
    }

    #endregion

    #region ObjectSet Properties

    public ObjectSet<Employee> Employees
    {
        get { return _employees ?? (_employees = CreateObjectSet<Employee>("Employees")); }
    }
    private ObjectSet<Employee> _employees;

    public ObjectSet<Task> Tasks
    {
        get { return _tasks ?? (_tasks = CreateObjectSet<Task>("Tasks")); }
    }
    private ObjectSet<Task> _tasks;

    #endregion
}
3. Return the domain object from the context
A domain object that needs lazy loading must be returned from the context, for example by querying one of the context's ObjectSet properties.
A lazy-loaded object looks like this in the Visual Studio 2010 watch window:
Task Object in Entity Framework
If you do not want lazy loading in Entity Framework, you can turn it off by setting LazyLoadingEnabled to false:
this.ContextOptions.LazyLoadingEnabled = false;
After lazy loading is turned off, Entity Framework no longer loads related objects for you. A reference navigation property that has not been loaded is null, so using it throws a NullReferenceException, and a collection navigation property that has not been loaded is an empty collection. This confused me when I first used Entity Framework after years with NHibernate.
Task Object in Entity Framework with Lazy Loading False

Caveat on Using Lazy Loading

In both NHibernate and Entity Framework, lazy loading is enabled by default (in Entity Framework, through the generated context shown above). This is the best practice for most situations. With lazy loading enabled, the developer does not need to worry about how related objects are loaded, and referenced objects or collections that are never touched do not occupy memory. However, we need to be careful about the N+1 problem that lazy loading causes in some situations. The N+1 problem arises when one query returns N domain objects and the code then touches a lazy-loaded association on each of them: one select retrieves the N objects, and then a separate select is executed for every one of them. In total, N+1 queries are executed to get data that a single query could have returned, which puts too many queries on the database. To avoid the N+1 problem, we need to monitor the running application with a profiler to find the spots that have this problem, and then use explicit eager loading or batch loading to avoid or mitigate it.
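The query count can be illustrated without a real database. The sketch below is only a simulation under stated assumptions: FakeDatabase and its methods are made up for this example, each method call counts as one query, and N is 3 tasks whose employee is read one by one. It does not use NHibernate or Entity Framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a database that counts the queries it serves.
public class FakeDatabase
{
    public int QueryCount { get; private set; }

    // One query that returns the ids of N = 3 tasks.
    public List<int> LoadTaskIds()
    {
        QueryCount++;
        return new List<int> { 1, 2, 3 };
    }

    // One query per call, as issued when a lazy reference is touched.
    public string LoadEmployeeNameForTask(int taskId)
    {
        QueryCount++;
        return "Employee of task " + taskId;
    }

    // One joined query that returns every task together with its employee.
    public Dictionary<int, string> LoadTasksWithEmployees()
    {
        QueryCount++;
        Dictionary<int, string> result = new Dictionary<int, string>();
        result.Add(1, "Employee of task 1");
        result.Add(2, "Employee of task 2");
        result.Add(3, "Employee of task 3");
        return result;
    }
}

public static class NPlusOneSketch
{
    // Lazy style: 1 query for the list plus 1 query per item.
    public static int LazyQueryCount()
    {
        FakeDatabase db = new FakeDatabase();
        foreach (int taskId in db.LoadTaskIds())
        {
            db.LoadEmployeeNameForTask(taskId);
        }
        return db.QueryCount;
    }

    // Eager style: everything comes back from a single query.
    public static int EagerQueryCount()
    {
        FakeDatabase db = new FakeDatabase();
        db.LoadTasksWithEmployees();
        return db.QueryCount;
    }
}
```

With three tasks, LazyQueryCount() returns 4 (N+1) while EagerQueryCount() returns 1. In a real application, the same difference shows up in the profiler as a burst of near-identical select statements.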

Using the Code

The code was developed in Visual Studio 2010. You need to create a database named LazyLoading in SQL Server, and then run the attached CreateTables.sql script to create the tables and data in it. Before running the code, remember to update the connection string in the configuration file to your local connection string.
