Entity Framework 4.1: Inheritance (7)


This is part of a series of blog posts about Entity Framework 4.1.

In this article, I’ll cover inheritance.

I must say that I did struggle a fair bit with this feature.  Like the rest of EF 4.1, it isn’t officially documented, and since the API seems to have changed substantially between the CTPs and the RC, the blog posts out there were a bit misleading for me.

A good source for me was Morteza Manavi’s blog, which I recommend.

In ORM literature, there are three ways to map tables to object hierarchies (i.e. classes related to each other through inheritance):

  • Table per Type (TPT):  for each class in the hierarchy, there is a table in the DB, and the tables are related to each other through foreign keys
  • Table per Hierarchy (TPH):  there is only one table for the entire hierarchy with all possible data in it
  • Table per Concrete Type (TPC):  a bit of a mix of the other two, there is one table for each concrete type, which flattens the abstract types as in TPH

Here I’m going to cover TPT & TPH.  The beauty of EF 4.1 is that you can mix those ways of dealing with the mapping within one hierarchy, as I’ll show.

Let’s start with table per type.  Here I define a simple hierarchy with one abstract base class and two derived classes:

using System.ComponentModel.DataAnnotations;    // for [Required]

public abstract class PersonBase
{
    public int PersonID { get; set; }
    [Required]
    public string FirstName { get; set; }
    [Required]
    public string LastName { get; set; }
    public int Age { get; set; }
}

public class Worker : PersonBase
{
    public decimal AnnualSalary { get; set; }
}

public class Retired : PersonBase
{
    public decimal MonthlyPension { get; set; }
}

We need to tell the model builder how to map those classes to tables:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);

    modelBuilder.Entity<PersonBase>().HasKey(x => x.PersonID);
    modelBuilder.Entity<PersonBase>().Property(x => x.PersonID)
        .HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
    //    TPT mapping
    modelBuilder.Entity<PersonBase>().ToTable("tpt.Person");
    modelBuilder.Entity<Worker>().ToTable("tpt.Worker");
    modelBuilder.Entity<Retired>().ToTable("tpt.Retired");
}

We use simple table name overrides, but with this information the model builder knows it has to build the database with TPT:

[Image: the generated database schema, with one table per type]

If we run some code against that model, we can appreciate how we can leverage the mapping.  Basically, we expose one and only one DbSet:  a collection of PersonBase.  EF takes care of managing which real type each member should be.

public static void ManageTPT()
{
    using (var context1 = new TptContext())
    {
        var worker = new Worker
        {
            AnnualSalary = 20000,
            Age = 25,
            FirstName = "Joe",
            LastName = "Plumber"
        };
        var retired = new Retired
        {
            MonthlyPension = 1500,
            Age = 22,
            FirstName = "Mike",
            LastName = "Smith"
        };
        //    Make sure the tables are empty…
        foreach (var entity in context1.Persons)
        {
            context1.Persons.Remove(entity);
        }
        context1.Persons.Add(worker);
        context1.Persons.Add(retired);

        context1.SaveChanges();
    }
    using (var context2 = new TptContext())
    {
        Console.WriteLine("Persons count:  " + context2.Persons.OfType<PersonBase>().Count());
        Console.WriteLine("Worker:  " + context2.Persons.OfType<Worker>().Count());
        Console.WriteLine("Retired:  " + context2.Persons.OfType<Retired>().Count());
    }
}

This is quite powerful since we can ask for the Workers only and EF takes care of querying only the tables that type needs:  the Worker table joined to the base Person table, leaving the Retired table untouched.
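
The TptContext class used in that sample isn’t shown in this post.  Here is a minimal sketch of what it presumably looks like.  Only the Persons name and the mapping calls come from the samples; the rest of the class shape is an assumption.

```csharp
using System.ComponentModel.DataAnnotations;    // DatabaseGeneratedOption (EF 4.1 namespace)
using System.Data.Entity;

// Assumed shape of the context used by ManageTPT; not taken from the original post.
public class TptContext : DbContext
{
    // One and only one set for the whole hierarchy.
    public DbSet<PersonBase> Persons { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        modelBuilder.Entity<PersonBase>().HasKey(x => x.PersonID);
        modelBuilder.Entity<PersonBase>().Property(x => x.PersonID)
            .HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
        //    TPT mapping, same calls as in the post
        modelBuilder.Entity<PersonBase>().ToTable("tpt.Person");
        modelBuilder.Entity<Worker>().ToTable("tpt.Worker");
        modelBuilder.Entity<Retired>().ToTable("tpt.Retired");
    }
}
```

Because the derived classes are only reached through the base set, changing the mapping strategy later does not change this public surface.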

TPH is actually the default for EF.  We could simply comment out lines from the previous example to fall back on the default and see the general mechanics:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);

    modelBuilder.Entity<PersonBase>().HasKey(x => x.PersonID);
    modelBuilder.Entity<PersonBase>().Property(x => x.PersonID)
        .HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
    //    TPT mapping
    //modelBuilder.Entity<PersonBase>().ToTable("tpt.Person");
    //modelBuilder.Entity<Worker>().ToTable("tpt.Worker");
    //modelBuilder.Entity<Retired>().ToTable("tpt.Retired");
}

This results in only one table (for the entire hierarchy):

[Image: the single generated table for the entire hierarchy]

We notice the entire hierarchy is flattened within one table.  The properties not found in the base class are automatically marked as nullable.  There is also an addition:  a discriminator column.  If we run the previous code sample, we’ll see how this discriminator column is used by default by looking at the content of the table afterwards:

[Image: content of the table, including the discriminator column]

The discriminator column is used by EF to know which class to instantiate when it reads a row, since all classes map to the same table.
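
To watch that materialization happen, one could enumerate the single set and test the runtime type of each instance.  This is a small sketch, assuming the TptContext and Persons names from the earlier sample; the PersonDump class and PrintPersons method are made up for illustration.

```csharp
using System;

// Hypothetical helper, not part of the original samples.
public static class PersonDump
{
    public static void PrintPersons()
    {
        using (var context = new TptContext())
        {
            foreach (PersonBase person in context.Persons)
            {
                // 'as' is used instead of comparing GetType(), so the test
                // would still hold if EF handed back a derived proxy type.
                var worker = person as Worker;
                var retired = person as Retired;

                if (worker != null)
                {
                    Console.WriteLine(person.FirstName + " is a Worker earning " + worker.AnnualSalary);
                }
                else if (retired != null)
                {
                    Console.WriteLine(person.FirstName + " is Retired with a pension of " + retired.MonthlyPension);
                }
            }
        }
    }
}
```

The same loop should work unchanged whether the model is mapped as TPT or TPH, since the choice of class is made by EF while reading the rows.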

We can override all that.  Let’s show that at the same time as showing the mix of TPH & TPT.  I’ll define two new subclasses of Worker and I want to map them both, and the Worker class, in one table:

public class Manager : Worker
{
    public int? ManagedEmployeesCount { get; set; }
}

public class FreeLancer : Worker
{
    [Required]
    public string IncCompanyName { get; set; }
}

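You’ll notice that ManagedEmployeesCount is declared as nullable.  This is a great inconvenience with TPH:  every column belonging only to a subclass must be nullable in the table, since the rows of the other types leave it empty.  Now we can instruct the model builder: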

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);

    modelBuilder.Entity<PersonBase>().HasKey(x => x.PersonID);
    modelBuilder.Entity<PersonBase>().Property(x => x.PersonID)
        .HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
    //    TPT mapping
    modelBuilder.Entity<PersonBase>().ToTable("tpt.Person");
    modelBuilder.Entity<Retired>().ToTable("tpt.Retired");
    //    TPH mapping
    modelBuilder.Entity<Worker>()
        .Map<FreeLancer>(m => m.Requires(f => f.IncCompanyName).HasValue())
        .Map<Manager>(m => m.Requires(ma => ma.ManagedEmployeesCount).HasValue())
        .ToTable("tph.Worker");
}

Here I use one way of discriminating:  I asked that a column belonging only to one class be not-null for a row to be mapped to that class.  This is different from what the default does.

The consumer code isn’t dissimilar to the TPT version (you could change your mapping without impacting the consumer code):

public static void ManageTPH()
{
    using (var context1 = new HierarchyContext())
    {
        var worker = new Worker
        {
            AnnualSalary = 20000,
            Age = 25,
            FirstName = "Joe",
            LastName = "Plumber"
        };
        var freeLancer = new FreeLancer
        {
            Age = 22,
            FirstName = "Mike",
            LastName = "Smith",
            IncCompanyName = "Mike & Mike Inc"
        };
        var manager = new Manager
        {
            Age = 43,
            FirstName = "George",
            LastName = "Costanza",
            ManagedEmployeesCount = 12
        };
        //    Make sure the tables are empty…
        foreach (var entity in context1.Persons)
        {
            context1.Persons.Remove(entity);
        }
        context1.Persons.Add(worker);
        context1.Persons.Add(freeLancer);
        context1.Persons.Add(manager);

        context1.SaveChanges();
    }
    using (var context2 = new HierarchyContext())
    {
        Console.WriteLine("Persons count:  " + context2.Persons.OfType<PersonBase>().Count());
        Console.WriteLine("Worker:  " + context2.Persons.OfType<Worker>().Count());
        Console.WriteLine("Retired:  " + context2.Persons.OfType<Retired>().Count());
        Console.WriteLine("FreeLancer:  " + context2.Persons.OfType<FreeLancer>().Count());
        Console.WriteLine("Manager:  " + context2.Persons.OfType<Manager>().Count());
    }
}

The SQL schema is, as planned, a hybrid of TPT & TPH:

[Image: the generated schema, mixing TPT and TPH tables]

I haven’t found a way to keep a discriminator column while overriding its name and values.  I do find this way of testing for nulls more interesting than having a discriminator column, which introduces yet more denormalization.
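
For what it’s worth, the fluent API does expose a string overload of Requires that names a discriminator column, with HasValue supplying the value stored for each type.  The sketch below rests on several assumptions:  the context name, the "PersonType" column, its values and the table name are all made up, the whole hierarchy is mapped as TPH to keep things simple, and it hasn’t been checked against the hybrid mapping above.

```csharp
using System.Data.Entity;

// Hypothetical context: whole hierarchy in one table, custom discriminator.
public class DiscriminatorContext : DbContext
{
    public DbSet<PersonBase> Persons { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        modelBuilder.Entity<PersonBase>().HasKey(x => x.PersonID);

        //    TPH mapping with an explicit discriminator column.
        //    "PersonType" and the letter codes are illustrative names only.
        modelBuilder.Entity<PersonBase>()
            .Map<Worker>(m => m.Requires("PersonType").HasValue("W"))
            .Map<Retired>(m => m.Requires("PersonType").HasValue("R"))
            .Map<FreeLancer>(m => m.Requires("PersonType").HasValue("F"))
            .Map<Manager>(m => m.Requires("PersonType").HasValue("M"))
            .ToTable("tph.Person");
    }
}
```

PersonBase is abstract, so it needs no discriminator value of its own; every concrete class gets one.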


9 thoughts on “Entity Framework 4.1: Inheritance (7)”

  1. Interesting, though I wonder; how does EF handle inherited properties from interfaces in the various scenarios?

    I guess TPT in this case would generate a massive key-explosion? And therefore sink performance for advanced queries.

    Would TPH use the same columns for subclasses inheriting the same interfaces? Or if implementing an inherited interface which a sibling class implements?

    Just making a guess that TPC would be a good trade-off between performance and database readability.

    I’m really just looking for a way to make the application level developer not to have to care about database design and being unable to seriously fuck stuff up at database level with massive use of interfaces and inheritance for application logic layer stuff.

  2. How would you convert an existing Worker into a Manager? I don’t think you can retrieve them as Manager in order to set the count

    1. I do not think you can. This is quite typical in OO that you can’t convert an object type by flicking data in it. A Worker will remain a Worker forever.

      Although you could convert it by altering the DB state directly.
