
All types are not compared equally (part 2)

In part 1 of this series I explained the difference between reference equality and value equality. In this article I am going to demonstrate how to compare two reference types using value equality semantics.

Override Object.Equals

By default, when a reference type defines no custom equality members, both the binary equality operator (==) and the Equals method compare references: Equals resolves to Object.Equals, and == checks whether the two operands refer to the same instance. If you wish to provide value equality, the most obvious thing to do is to override System.Object.Equals and use it to compare the fields of your two instances. Let us begin by revisiting our Person type, which I have refactored to make these examples a bit more interesting:

class Person
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }
}
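
As a quick check of where we are starting from, here is a minimal sketch showing that, before any override, two Person instances with identical state are not considered equal. The DefaultEqualityDemo class and the sample values are hypothetical, chosen only for illustration.

```csharp
using System;

class Person
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }
}

// Hypothetical driver class, for illustration only.
static class DefaultEqualityDemo
{
    static void Main()
    {
        Person p1 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };
        Person p2 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };
        Person alias = p1;

        // No custom equality members yet, so every check compares references.
        Console.WriteLine(p1.Equals(p2));    // False: same state, different instances
        Console.WriteLine(p1 == p2);         // False
        Console.WriteLine(p1.Equals(alias)); // True: both names refer to one instance
    }
}
```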

We have a class with two properties – let’s override the Equals method:

class Person
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    public override Boolean Equals(Object obj)
    {
        Person other = (Person)obj;
        return this.Name == other.Name && this.Birthday == other.Birthday;
    }
}

Now we have value equality for our reference type. While this is a simple solution, it is not ideal for the following reasons:

  1. This approach is not type safe. Since the Equals method accepts an argument of type Object we cannot guarantee that the instance that was passed to this method is actually a Person.
    somePerson.Equals("hello, world");  // throws an InvalidCastException
  2. This approach is not “null safe”. Any comparisons with null will throw a NullReferenceException.
    somePerson.Equals(null); // throws a NullReferenceException
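
Both failures are easy to reproduce. The sketch below repeats the naive override and catches each exception; the NaiveEqualsDemo class and the sample values are hypothetical. The compiler will also warn (CS0659) that GetHashCode has not been overridden, a topic this article comes back to later.

```csharp
using System;

class Person
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    // The naive override: an unconditional cast and no null check.
    public override Boolean Equals(Object obj)
    {
        Person other = (Person)obj;
        return this.Name == other.Name && this.Birthday == other.Birthday;
    }
}

// Hypothetical driver class, for illustration only.
static class NaiveEqualsDemo
{
    static void Main()
    {
        Person somePerson = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };

        try
        {
            // A String cannot be cast to Person.
            somePerson.Equals("hello, world");
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }

        try
        {
            // The cast of null succeeds, but reading other.Name then fails.
            somePerson.Equals(null);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }
    }
}
```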

The null safety issue is easy enough to fix:

public override Boolean Equals(Object obj)
{
    if (obj == null)
        return false;

    Person other = (Person)obj;
    return this.Name == other.Name && this.Birthday == other.Birthday;
}

But how do we preserve type safety? Meet the IEquatable<T> interface.

IEquatable<T>

This interface was designed specifically to help us tackle the type safety issue that we are facing. It declares a single member:

public interface IEquatable<T>
{
    Boolean Equals(T other);
}

As you can see, this interface lets us add a strongly-typed overload alongside our existing Equals override. Implement the interface like this:

class Person : IEquatable<Person>
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    public override Boolean Equals(Object obj)
    {
        return this.Equals(obj as Person);
    }

    public Boolean Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Name == other.Name && this.Birthday == other.Birthday;
    }
}

Now that we have a strongly-typed Equals method, equality comparisons between two instances of our type are type-safe and null-safe. Using the as operator in the overridden Equals(Object) means that an argument that is either null or not a Person becomes null, which our implementation of IEquatable<T>.Equals answers with false instead of throwing (for more information on the as operator, please read Two casts are not better than one).
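
To see the difference in practice, here is a small sketch that repeats the two calls that used to throw and then asks a List<Person> whether it contains an equal instance. List<T>.Contains uses the default equality comparer, which picks up IEquatable<T>.Equals when the element type implements it. The EquatableDemo class and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;

class Person : IEquatable<Person>
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    public override Boolean Equals(Object obj)
    {
        // "as" yields null for null and for anything that is not a Person.
        return this.Equals(obj as Person);
    }

    public Boolean Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Name == other.Name && this.Birthday == other.Birthday;
    }
}

// Hypothetical driver class, for illustration only.
static class EquatableDemo
{
    static void Main()
    {
        Person p1 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };
        Person p2 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };

        Console.WriteLine(p1.Equals("hello, world")); // False, no exception
        Console.WriteLine(p1.Equals(null));           // False, no exception
        Console.WriteLine(p1.Equals(p2));             // True: same Name and Birthday

        List<Person> people = new List<Person> { p1 };
        Console.WriteLine(people.Contains(p2));       // True: found by value
    }
}
```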

GetHashCode

I am not going to cover the GetHashCode method in depth, but it is an essential part of value equality checks: hash-based collections such as Dictionary<TKey, TValue> and HashSet<T> rely on it, and any type that overrides Equals should override GetHashCode as well so that equal instances produce equal hash codes. A hash code is an integral value derived from the state of the current instance. If two instances have the same hash code, they may be equal in terms of value; if their hash codes differ, they are definitely not equal. This gives calling code a performance boost, because Equals only needs to be called when the hash codes match.

As for the proper or best way to generate a hash code for an object instance, that is a discussion for another day. For now I will add a simple GetHashCode implementation to our example to complete the exercise. (Thank you to Jon Skeet for this particular implementation of GetHashCode.)

class Person : IEquatable<Person>
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }
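
    public override Boolean Equals(Object obj)
    {
        return this.Equals(obj as Person);
    }

    public Boolean Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Name == other.Name && this.Birthday == other.Birthday;
    }

    public override Int32 GetHashCode()
    {
        // unchecked lets the arithmetic wrap around on overflow, whatever
        // the compiler's overflow-checking setting is.
        unchecked
        {
            Int32 hash = 23;
            // Guard against a null Name so that hashing never throws.
            hash = hash * 37 + (this.Name != null ? this.Name.GetHashCode() : 0);
            hash = hash * 37 + this.Birthday.GetHashCode();
            return hash;
        }
    }
}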

All we are doing here is taking two coprime numbers (23 and 37) and using them to combine the hash codes of our instance’s state into a final integral value. Again, how the implementation works is not important at this point; what is important is that we provide some implementation, so that equal instances yield equal hash codes and we can reap the performance benefits that GetHashCode can provide. I will post an article in the future that discusses the different techniques of generating hash codes and why some implementations are better than others.
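
Here is a rough sketch of where that implementation pays off. The Person class mirrors the one above (with an unchecked block and a null guard on Name as small precautions), and the HashDemo class and the sample values are hypothetical. Hash-based collections first bucket by hash code and only then call Equals, so two equal instances must hash alike to be found.

```csharp
using System;
using System.Collections.Generic;

class Person : IEquatable<Person>
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    public override Boolean Equals(Object obj)
    {
        return this.Equals(obj as Person);
    }

    public Boolean Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Name == other.Name && this.Birthday == other.Birthday;
    }

    public override Int32 GetHashCode()
    {
        unchecked // wrap around on overflow instead of throwing
        {
            Int32 hash = 23;
            hash = hash * 37 + (this.Name != null ? this.Name.GetHashCode() : 0);
            hash = hash * 37 + this.Birthday.GetHashCode();
            return hash;
        }
    }
}

// Hypothetical driver class, for illustration only.
static class HashDemo
{
    static void Main()
    {
        Person p1 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };
        Person p2 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };

        // Equal state gives equal hash codes.
        Console.WriteLine(p1.GetHashCode() == p2.GetHashCode()); // True

        HashSet<Person> people = new HashSet<Person>();
        people.Add(p1);
        Console.WriteLine(people.Contains(p2)); // True: found via hash code, then Equals
        Console.WriteLine(people.Add(p2));      // False: an equal Person is already present

        Dictionary<Person, String> notes = new Dictionary<Person, String>();
        notes[p1] = "first";
        Console.WriteLine(notes[p2]);           // first
    }
}
```

One caveat with this sketch: because the hash code is computed from settable properties, changing Name or Birthday while an instance sits in a HashSet or is used as a Dictionary key will make it hard to find again.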

Conclusion

Now we have a class that properly provides value equality semantics. I hope I have shown not only how to implement this pattern in your code but also why it is important and necessary in the first place.

Edit: Bradley Grainger correctly points out in a comment below that I have neglected to provide equality operator overloads for my Person type! Here is a complete example that includes those operator overloads:

class Person : IEquatable<Person>
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    public override Boolean Equals(Object obj)
    {
        return this.Equals(obj as Person);
    }

    public Boolean Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Name == other.Name && this.Birthday == other.Birthday;
    }

    public override Int32 GetHashCode()
    {
        // unchecked lets the arithmetic wrap around on overflow, whatever
        // the compiler's overflow-checking setting is.
        unchecked
        {
            Int32 hash = 23;
            // Guard against a null Name so that hashing never throws.
            hash = hash * 37 + (this.Name != null ? this.Name.GetHashCode() : 0);
            hash = hash * 37 + this.Birthday.GetHashCode();
            return hash;
        }
    }

    public static Boolean operator ==(Person left, Person right)
    {
        // If the objects have the same reference then they are most
        // certainly equal - return true here.
        if (Object.ReferenceEquals(left, right))
            return true;

        // Check for null here as well - make sure you cast the references
        // to Object so that you don't accidentally invoke this same operator
        // again. Casting to Object allows you to invoke Object's == operator.
        if ((Object)left == null || (Object)right == null)
            return false;

        return left.Equals(right);
    }

    public static Boolean operator !=(Person left, Person right)
    {
        return !(left == right);
    }
}
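
A short sketch of how the overloaded operators behave, including the null cases and the effect of casting to Object. The Person class repeats the complete example (with an unchecked block and a null guard on Name as small precautions); the OperatorDemo class and the sample values are hypothetical.

```csharp
using System;

class Person : IEquatable<Person>
{
    public String Name { get; set; }
    public DateTime Birthday { get; set; }

    public override Boolean Equals(Object obj)
    {
        return this.Equals(obj as Person);
    }

    public Boolean Equals(Person other)
    {
        if (other == null)
            return false;

        return this.Name == other.Name && this.Birthday == other.Birthday;
    }

    public override Int32 GetHashCode()
    {
        unchecked // wrap around on overflow instead of throwing
        {
            Int32 hash = 23;
            hash = hash * 37 + (this.Name != null ? this.Name.GetHashCode() : 0);
            hash = hash * 37 + this.Birthday.GetHashCode();
            return hash;
        }
    }

    public static Boolean operator ==(Person left, Person right)
    {
        if (Object.ReferenceEquals(left, right))
            return true;

        // Cast to Object so that this operator is not invoked recursively.
        if ((Object)left == null || (Object)right == null)
            return false;

        return left.Equals(right);
    }

    public static Boolean operator !=(Person left, Person right)
    {
        return !(left == right);
    }
}

// Hypothetical driver class, for illustration only.
static class OperatorDemo
{
    static void Main()
    {
        Person p1 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };
        Person p2 = new Person { Name = "Ada", Birthday = new DateTime(2000, 1, 1) };
        Person nobody = null;

        Console.WriteLine(p1 == p2);                 // True: value equality
        Console.WriteLine(p1 != p2);                 // False
        Console.WriteLine(p1 == nobody);             // False, no exception
        Console.WriteLine(nobody == null);           // True
        Console.WriteLine((Object)p1 == (Object)p2); // False: reference comparison
    }
}
```

The last line is worth remembering: operators are resolved at compile time from the static types of the operands, so once both sides are typed as Object the comparison falls back to reference equality.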

4 Comments

  1. Sandro

    Assuming String.GetHashCode and DateTime.GetHashCode are properly written, why not just xor them for Person’s GetHashCode?

    Posted on 21-Mar-10 at 9:53 am | Permalink
  2. That is a great question! In this particular case you are correct to say that XOR’ing the two values would create a good distribution – the place where the XOR’ing technique fails is when you have many values of the same type that are XOR’d together. Let’s imagine that we have a class that looks like this:

    class Foo
    {
        public Int32 First { get; set; }
        public Int32 Second { get; set; }
    }

    Given two instances of this class that look like this:

    Foo one = new Foo { First=1, Second=2 };
    Foo two = new Foo { First=2, Second=1 };

    your GetHashCode implementation wouldn't be able to disambiguate between these two instances since XOR'ing them would produce the same hash code even though the instances themselves are not equal. I am working on an article that will go into this exact question in some more depth - check back soon!

    Posted on 21-Mar-10 at 10:54 am | Permalink
  3. The statement “Every time you use the binary equality operator (==) … you are invoking the Equals method for the instances in question” is not true; this is easily demonstrated with the following code:

    Person p1 = new Person { Name = "one", Birthday = new DateTime(2000, 1, 1) };
    Person p2 = new Person { Name = p1.Name, Birthday = p1.Birthday };
    Console.WriteLine("{0} {1}", p1.Equals(p2), p1 == p2); // prints True False

    Calling Equals prints “True”, but calling == prints “False”.

    operator== needs to be explicitly overloaded in order to provide value semantics for your class, otherwise you will get the default version that compares object identity.

    A recent blog post I wrote (http://code.logos.com/blog/2010/02/creating_equatable_objects.html) covers this, and also provides reference implementations of equality for classes and structs.

    Posted on 21-Mar-10 at 6:39 pm | Permalink
  4. Ah, I believe I misspoke there a bit. What I was trying to say was that in cases where no custom equality members have been introduced you are invoking Object.Equals for the instances in question, not the current type’s Equals method (I updated that line to make it clearer). You are correct that I neglected to override the equals operator in my example – nice catch!

    Posted on 22-Mar-10 at 7:45 am | Permalink
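
The XOR question raised in the comments can be checked directly. The sketch below hashes the two Foo instances from that reply both ways; the XorHash and MultiplyAddHash helpers and the HashCombineDemo class are hypothetical names used only for this comparison.

```csharp
using System;

class Foo
{
    public Int32 First { get; set; }
    public Int32 Second { get; set; }
}

// Hypothetical driver class and helpers, for illustration only.
static class HashCombineDemo
{
    // XOR is symmetric, so swapping First and Second gives the same result.
    static Int32 XorHash(Foo foo)
    {
        return foo.First.GetHashCode() ^ foo.Second.GetHashCode();
    }

    // Multiply-and-add depends on the order in which the fields are combined.
    static Int32 MultiplyAddHash(Foo foo)
    {
        unchecked // wrap around on overflow instead of throwing
        {
            Int32 hash = 23;
            hash = hash * 37 + foo.First.GetHashCode();
            hash = hash * 37 + foo.Second.GetHashCode();
            return hash;
        }
    }

    static void Main()
    {
        Foo one = new Foo { First = 1, Second = 2 };
        Foo two = new Foo { First = 2, Second = 1 };

        Console.WriteLine(XorHash(one) == XorHash(two));                 // True: collision
        Console.WriteLine(MultiplyAddHash(one) == MultiplyAddHash(two)); // False
    }
}
```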
