Managing parent and child collection relationships in Entity Framework – What is an identifying relationship anyway?

Some of the posts on this site are more as a reminder to myself about some problem that I have encountered and the solution I have found for that problem. This is one such post. The issue addressed below is fairly common and has been described and solved more comprehensively elsewhere but I have added this so I can find it again in the future should the need arise. If it proves helpful to anyone else then all the better. 

It is often desirable to have a parent entity that has a collection of child entities. For example a Order entity may have a collection of OrderLine entities, or a User entity may have a collection of Role entities. These example may look similar but they have a subtle difference that may cause some behavioural issues in some circumstances.

In the first example the order lines belong to an order and it doesn’t make sense for them to exist on their own. In the second example a Role can exist independently from the User.

This is the difference between an identifying and non-identifying relationship. Another example from Stack Overflow:

A book belongs to an owner, and an owner can own multiple books. But the book can exist also without the owner and it can change the owner. The relationship between a book and an owner is a non identifying relationship.

A book however is written by an author, and the author could have written multiple books. But the book needs to be written by an author it cannot exist without an author. Therefore the relationship between the book and the author is an identifying relationship.

Whatever the type of relationship it would be nice to be able to just add and remove child items to the parent collection and let EF worry about all the bothersome adding and removing from the Database.

Adding to the collection works correctly in both cases and if you have a non identifying relationship (your child foreign key is nullable) you will also be fine when removing items from the collection.

If you have only implied an identifying relationship (your child primary key is on the table Id field and your child foreign key is non nullable) you are likely to be in for a world of pain trying to remove items from the child collection.

When you remove an item from a child collection with the above kind of relationship Entity Framework tries to update the child entity foreign key to null. This causes problems as the foreign key is non-nullable and you get this System.InvalidOperationException when the context saves:

The operation failed: The relationship could not be changed because one or more of the foreign-key properties is non-nullable. When a change is made to a relationship, the related foreign-key property is set to a null value. If the foreign-key does not support null values, a new relationship must be defined, the foreign-key property must be assigned another non-null value, or the unrelated object must be deleted.

The way to stop this from happening is to make the identifying relationship explicit by specifying the foreign key field of the child to be part of its primary key. I know that in the ORM world composite keys are not really in vogue, but this make things explicit to Entity Framework that the child cannot exist without its parent and so should be deleted when it is removed from the parents collection.

How do I specify a composite key?

To set the composite keys simply annotate the entity (or use the fluent API as part of OnModelCreating in the Context). If you have an OrderLine entity that has an identifying relationship with an Order entity the OrderLine entity should be annotated like this:

public class OrderLine
    [KeyColumn(Order = 0), DatabaseGenerated(DatabaseGeneratedOption.Identity)] 
    public int Id { getset; }

    public string Description { getset; }

    public Decimal Cost { getset; }

    [KeyForeignKey("Order"), Column(Order = 1)]
    public int PupilId { getset; }

    public virtual Order Order { getset; }

Managing SpecFlow acceptance test data with Entity Framework

A look into how Entity Framework can be used to simplify the management of data for your acceptance test scenarios.

In a previous set of posts I looked at how it is possible to use SpecFlow and Selenium to create automated acceptance tests driven via the browser.

An important part of any acceptance or integration test is the test data. It is important that the data for each test is isolated from any other test. Often these tests require complicated data setup which can add additional complication and maintenance to to the test suite. This post assumes that the system under test is using a SQL database.

Traditional ways to manage test data

There are a few ways to manage the test data for acceptance tests. Probably the most widely used are:

  • Run some SQL scripts as part of the test setup and tear-down – Can be effective but means extra maintenance of the scripts is required. This type of script can be prone to errors as they are built outside of the type safety that the rest of the system benefits from.  This approach also requires a mechanism to run the scripts when the tests run.
  • Use a standard set of data for all tests – the main issue with this approach is that it may not allow for proper isolation of the test data. It is possible to use transactions or some similar mechanism to allow for this, but sometimes it is only the order of the running tests that ensures the correct data is available for the tests.

There are of course other ways, and many variations on the theme.

A better approach..?

It would be better if it were possible to add the data as part of the test setup in a type-safe way, using the language of the system without the need for external scripts.

If you are using an ORM in your application, it is very likely that you already have the tools needed to do just this. In this post I am going to use Entity Framework 5, but you could just as easily use nHibernate or something else to get the same result.

An example

Here is the SpecFlow scenario we are going to automate.

Scenario: Search for events in a region and display the results
	Given that the following regions exist
	| Id | Name   |
	| 1  | North  |
	| 2  | South  |
	| 3  | London |
	And the following events exist
	| Code    | Name                              | Description                                   | Region Id |
	| CR/1001 | Crochet for Beginners             | A gentle introduction into the art of crochet | 1         |
	| CH/3001 | Cat Herding                       | A starter session for the uninitiated         | 3         |
	| CH/3002 | Cat Herding - Advanced techniques | Taking it to the next level                   | 3         |
	When I navigate to the event search page
	And I select the "London" region 
	And perform a search
	Then The search results page is displayed
	And the following results are displayed
	| Code    | Name                              | Description                           | Region |
	| CH/3001 | Cat Herding                       | A starter session for the uninitiated | North  |
	| CH/3002 | Cat Herding - Advanced techniques | Taking it to the next level           | North  |

It is only the first two steps of the scenario that we are interested in for this post. Even before we start satisfying the first step we should make sure that the database is in the correct state to start adding the data for the test.

To do this we can use a BeforeScenario step. As the name suggests, this step is called before each scenario is run. You will notice that our scenario had a tag of @eventSearch. The same tag will be used by the BeforeScenario attribute to ensure that the BeforeScenario method is only triggered by scenarios that have the @eventSearch tag. This becomes important when you have multiple feature files in the same project.

public void Setup()
    using (var context = new CrazyEventsContext())
        context.Database.ExecuteSqlCommand("DELETE FROM Events DBCC CHECKIDENT ('CrazyEvents.dbo.Events',RESEED, 0)");
        context.Database.ExecuteSqlCommand("DELETE FROM Regions DBCC CHECKIDENT ('CrazyEvents.dbo.Events',RESEED, 0)");

We are using the ExecuteSqlCommand method to clear the contents of the tables and reseed the identity columns of the tables. If the tables are not used as part of a foreign key relationship it is possible to use TRUNCATE TABLE.

The next step is to use Entity Framework to populate the region and event details from the SpecFlow table. Although I have these steps as part of a scenario, you could also use a Background step to have the data populated at the start of each scenario.

[Given(@"that the following regions exist")]
public void GivenThatTheFollowingRegionsExist(Table table)
    var regions = table.CreateSet<Region>();
    using (var context = new CrazyEventsContext())
        foreach (var region in regions)
[Given(@"the following events exist")]
public void GivenTheFollowingEventsExist(Table table)
    var events = table.CreateSet<Event>();
    using (var context = new CrazyEventsContext())
        foreach (var crazyEvent in events)

SpecFlow has some table helper extension methods. I am using table.CreateSet(T) to create an IEnumerable of the specified type T from the data in the table. For this to work your table column names need to be the same as the properties of your entity (you can use spaces for readability if you like).

After I have the list of entities, I simply use the Entity Framework context to add each entity to the context in turn. Remember to call SaveChanges() to perform the update.

And that is all there is to it. The nice thing about this method of managing test data is that the data required for the test is specified as part of the test itself. Everyone can see what data is expected to be present when the test is run.