Are Equals and == equal?

Hello!

In C# all types can be split into reference and value types. This implies that there is a need for two different equality comparison mechanisms. Value and referential equality exist to allow us to compare value and reference types. By default, value types use value equality, and reference types rely on referential equality.

Before I proceed in comparing these equality types, let me try getting your attention with a simple example. Which of the two methods below would you rather have, if you were worried about run-time stability?

public static bool Equals1(double? a, double? b)
{
    if (a == b)
        return true;
    else
        return false;
}

public static bool Equals2(object a, object b)
{
    if (a.Equals(b))
        return true;
    else
        return false;
}

The acceptable answer is another question: it depends on what you are trying to achieve and the context of the task at hand. However, Equals2 is a small but potentially dangerous function. If object a is null, for example when a null double? variable dLeft is passed as the first argument, we get a NullReferenceException. The == comparison, on the other hand, will never fail so badly.

The reason for the difference in behavior lies in how == and object.Equals are implemented internally. The first is a static operator (or a function, if you like), and the second is a virtual method (remember you can override Equals?). Since dLeft is null, it cannot be dereferenced to call an Equals method on. The == operator, being a static function, is resolved at compilation from the declared type of dLeft. In our example, that type is double?, which is the Nullable<double> struct. It is boxed only when it is passed to Equals2 as an object (yes, nullable value types come at a cost), and a null double? boxes to a null reference. In Equals1 no boxing takes place at all, and we get a value type comparison.
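To make the failure concrete, here is a minimal sketch that calls both methods with a null first argument. The class name NullEqualityDemo, the Describe helper and the value 10.0 are made up for illustration; dLeft follows the naming used in the text, and dRight is assumed by analogy.

```csharp
using System;

public static class NullEqualityDemo
{
    // Same shape as Equals1 in the text: the lifted == operator on double?
    // compares values and never dereferences anything.
    public static bool Equals1(double? a, double? b)
    {
        return a == b;
    }

    // Same shape as the first Equals2: an instance call on a reference
    // that may be null.
    public static bool Equals2(object a, object b)
    {
        return a.Equals(b);
    }

    // Hypothetical helper: reports what Equals2 does instead of crashing.
    public static string Describe(double? dLeft, double? dRight)
    {
        try
        {
            // A null double? boxes to a null reference here.
            return "Equals2: " + Equals2(dLeft, dRight);
        }
        catch (NullReferenceException)
        {
            return "Equals2 threw NullReferenceException";
        }
    }

    public static void Main()
    {
        double? dLeft = null;
        double? dRight = 10.0;

        Console.WriteLine("Equals1: " + Equals1(dLeft, dRight)); // Equals1: False
        Console.WriteLine(Describe(dLeft, dRight));              // Equals2 threw NullReferenceException
    }
}
```

Swapping the arguments, so that the non-null value is the one Equals is called on, makes Equals2 return false instead of throwing, which is exactly why this kind of bug can go unnoticed for a long time.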

So, do we always have to check references for null before calling Equals on them? Not always, as we can use the static helper object.Equals(object,object) instead. This method accepts two arguments and spares us from calling Equals on a potentially null reference. Here is a much less dangerous Equals2 for you:

public static bool Equals2(object a, object b)
{
    if (object.Equals(a, b))
        return true;
    else
        return false;
}

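To continue comparing ‘==’ with Equals, note that the latter is reflexive, meaning that two NaNs are equal according to Equals, and are different according to ‘==’. Other than that and execution speed, with the built-in value types ‘==’ and Equals behave the same. With reference types, Equals is a virtual call that is dispatched to the override of the run-time type: for boxed value types this means value equality, while the default object.Equals implementation falls back to referential equality. The ‘==’ operator applied to two object references always performs referential equality. This can be demonstrated with a simple example: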
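The NaN remark is easy to check. Below is a small sketch; the class and method names are invented for this illustration. The comparison is wrapped in methods so that the compiler does not complain about comparing a variable to itself.

```csharp
using System;

public static class NaNDemo
{
    // Static operator: follows IEEE 754, where NaN is not equal to anything.
    public static bool ViaOperator(double x, double y)
    {
        return x == y;
    }

    // Instance method: double.Equals treats two NaNs as equal, keeping Equals reflexive.
    public static bool ViaEquals(double x, double y)
    {
        return x.Equals(y);
    }

    public static void Main()
    {
        double nan = double.NaN;

        Console.WriteLine(ViaOperator(nan, nan)); // False
        Console.WriteLine(ViaEquals(nan, nan));   // True

        // The static helper copes with null on either side.
        object a = null;
        object b = null;
        Console.WriteLine(object.Equals(a, b));   // True
        Console.WriteLine(object.Equals(a, 10));  // False
    }
}
```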

double left = 10;
double right = left;
double middle = 10;

if (right.Equals(middle))
    Console.WriteLine("Same doubles");
else
    Console.WriteLine("Diff doubles");

if (right==middle)
    Console.WriteLine("Same doubles");
else
    Console.WriteLine("Diff doubles");

object oLeft = 10;
object oRight = oLeft;
object oMiddle = 10;

if (oRight.Equals(oMiddle))
    Console.WriteLine("Same objects");
else
    Console.WriteLine("Diff objects");

if (oRight == oMiddle)
    Console.WriteLine("Same objects");
else
    Console.WriteLine("Diff objects");

The above will print that the two variables are the same in the first 3 cases. In the last case the message is "Diff objects", because ‘==’ for reference types performs referential equality comparison by default, and oRight and oMiddle refer to two different boxed integers. Finally, if you are into numerical comparisons, I would recommend sticking with ‘==’ since it is faster than calling Equals or even CompareTo. For comparing monetary amounts, the absolute difference between two doubles can be compared to some predefined numerical threshold (that is, still using the built-in comparison operators rather than Equals).
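A threshold comparison might look like the sketch below. The name AmountComparer and the tolerance of half a cent are assumptions made for this example; pick a threshold that suits your own data.

```csharp
using System;

public static class AmountComparer
{
    // Hypothetical tolerance: half a cent. Not a recommendation, just an example value.
    public const double Tolerance = 0.005;

    // Two amounts are treated as equal when their absolute difference
    // is below the predefined threshold.
    public static bool NearlyEqual(double left, double right)
    {
        return Math.Abs(left - right) < Tolerance;
    }

    public static void Main()
    {
        double sum = 0.1 + 0.2;
        double expected = 0.3;

        // Exact comparison is sensitive to rounding in the last bits.
        Console.WriteLine(sum == expected);

        // Threshold comparison ignores differences smaller than Tolerance.
        Console.WriteLine(NearlyEqual(sum, expected)); // True
    }
}
```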

Happy programming!

Generic Delegates

Greetings to my blog readers!

I have previously covered delegates and events, but I’ve decided to go over the concept of delegates in C# once more. In addition, this time I will mention generic delegates.

So, let’s refresh: what is a delegate? A delegate is a type. It is a type that is designed to be compatible with methods. You may have come across someone comparing C# delegates with C++ function pointers. It is a fair functional comparison. That is, only a certain part of how C++ pointers behave and what they are used for is comparable with the C# delegates. As far as the internal representation goes, there is really almost nothing in common. The reason I bring up the function pointer is that I too think of delegates as being “like function pointers”. This makes it easier to understand their purpose and syntax. Let me reinforce this functional comparison with an example. Take the below declaration of a delegate:

public delegate T AnyMethod<T>(T input_param);

This declares a delegate that “can represent” any method that takes one parameter and returns a value of the same type. In other words, the AnyMethod delegate is compatible with any method whose signature matches that of the delegate. Since AnyMethod is generic, this gives us flexibility in what methods we can assign. Take a look at the example code below, which was written simply to support this explanation. Given an appropriate type argument, a method that works with that parameter type can be assigned to AnyMethod.

using System;

namespace GenericDelegate
{
 
    public delegate T AnyMethod<T>(T input_param);
      

    class EntryLevel
    {
        static void Main(string[] args)
        {
            int i = 10;
            string s = "any string";
            AnyMethod<int> im = IntMethod;          // this is the same as AnyMethod<int> im = new AnyMethod<int>(IntMethod); 
            AnyMethod<string> sm = StringMethod;    // this is the same as AnyMethod<string> sm = new AnyMethod<string>(StringMethod);
            
            im(i);
            sm(s);
        }

        static int IntMethod(int int_param)
        {
            Console.WriteLine("Inside IntMethod with "+int_param.ToString());
            return 0;
        }

        static string StringMethod(string string_param)
        {
            Console.WriteLine("Inside StringMethod with " + string_param);
            return "void";
        }
    }
}

The AnyMethod delegate can work with generic methods as well. I can easily replace my two static methods with a single generic one.

class EntryLevel
    {
        static void Main(string[] args)
        {
            int i = 10;
            string s = "any string";
            AnyMethod<int> im = TMethod;          // same as AnyMethod<int> im = new AnyMethod<int>(TMethod);  
            AnyMethod<string> sm = TMethod;       // same as AnyMethod<string> sm = new AnyMethod<string>(TMethod); 
            
            im(i);
            sm(s);
        }

         static T TMethod<T> (T t_param)
         {
             Console.WriteLine("Inside TMethod with " + t_param.ToString());
             return t_param;
         }
    }

Remember I wrote that a delegate is a type. Note, I did not write that it is simply a method. Because it is not. Under the hood, for the AnyMethod delegate declaration above, the compiler will be busy creating a real class (albeit a sealed one) that extends the System.MulticastDelegate class and has a constructor and three methods! I prefer not to bother with these details because they tend to confuse me. Instead, I will stick with the comfortable “types compatible with methods” definition. In fact, I like this definition so much that I will remove that delegate declaration entirely… And replace it with Func:
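If you do want a peek under the hood, reflection is enough to confirm the claim about the generated class. This is only a sketch: the namespace GenericDelegateInspection and the DelegateInspector class are made up here, while the delegate declaration is the same as before.

```csharp
using System;

namespace GenericDelegateInspection
{
    public delegate T AnyMethod<T>(T input_param);

    public static class DelegateInspector
    {
        // True when the type is a class that cannot be inherited from.
        public static bool IsSealedClass(Type type)
        {
            return type.IsClass && type.IsSealed;
        }

        public static void Main()
        {
            Type delegateType = typeof(AnyMethod<int>);

            // The compiler-generated class derives from MulticastDelegate.
            Console.WriteLine(delegateType.BaseType == typeof(MulticastDelegate)); // True
            Console.WriteLine(IsSealedClass(delegateType));                        // True

            // Invoke is the method that runs when we write im(i).
            Console.WriteLine(delegateType.GetMethod("Invoke") != null);           // True
        }
    }
}
```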

using System;

namespace GenericDelegate
{
    class EntryLevel
    {
        static void Main(string[] args)
        {
            int i = 10;
            string s = "any string";
            Func<int, int> im = TMethod;             // Func<int,int> im = new Func<int,int>(TMethod);   
            Func<string, string> sm = TMethod;       //  Func<string,string> sm = new Func<string,string>(TMethod); 
            
            im(i);
            sm(s);
        }

         static T TMethod<T> (T t_param)
         {
             Console.WriteLine("Inside TMethod with " + t_param.ToString());
             return t_param;
         }
    }
}

Much better! If you have never come across Func before, take a look at the reference source. You will see that Func is just another delegate (with an upper bound on the number of its input parameters), but its definition has been hidden away. Func and most of the Action overloads were introduced in .NET Framework 3.5 (the single-parameter Action<T> dates back to 2.0). I believe they were added to support lambda expressions. However, both are very handy in covering up the unnecessary complexity and reinforcing my earlier analogy.
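Since lambda expressions came up, here is a short sketch of Func and Action used with them. The class FuncActionDemo and the Apply helper are invented for this example and are not part of the code above.

```csharp
using System;

public static class FuncActionDemo
{
    // Hypothetical helper: accepts any method or lambda compatible with Func<T, T>.
    public static T Apply<T>(Func<T, T> method, T input)
    {
        return method(input);
    }

    public static void Main()
    {
        // Func returns a value; the last type argument is the return type.
        Func<int, int> doubler = x => x * 2;

        // Action is the counterpart for methods that return nothing.
        Action<string> print = message => Console.WriteLine(message);

        print(Apply(doubler, 21).ToString());         // 42
        print(Apply(s => s.ToUpper(), "any string")); // ANY STRING
    }
}
```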

Introducing Dependency Injection into Interpolation Code

Greetings!

In one of my previous blog posts I touched upon the idea of programming to interfaces. I did not expect that post to get as many hits as it does. As far as writing maintainable, loosely coupled code goes, I have a long way to go before I become good at it. Thus I’ve decided to investigate the most useful design patterns to learn and introduce into my hobby and professional coding practices. And I am not going to start small…

One of the seriously misunderstood software design patterns is dependency injection. Many people like writing about it on their sites and blogs, but not many people do a decent job of describing it correctly. I am not claiming that I will now fix this for my blog readers. Instead, I will tell you that I am reading a fantastic book about it: Dependency Injection in .NET by Mark Seemann. So, if you are looking for good material on this topic, please check out this recommendation.

My last post was on linear and cubic spline interpolation. The implementation can be improved, since I’ve hard-coded at least one inflexible dependency: outputting the results to the console screen. Anyone who wishes to use my example code without using the console screen (e.g. writing to a file instead) would need to rewrite at least twelve lines of code! We can definitely improve upon this by decoupling the console writer and substituting a generic class that is ‘injected’ with the user-specified writer through a constructor (aka constructor injection).

To begin with, I made a small change to the Tridiagonal class and removed console-based output. The Solve method now throws ArgumentException whenever the passed matrix is not square or something else goes wrong. I then made a similar change to the extensions. After all, it is not a good idea to tie extensions to a specific output class like Console. The LinearInterpolation and CubicSplineInterpolation classes now have private constructors. The basic checks on user input are now done by the initializing methods, which could have been defined as properties. Also, the interpolating classes do not display any messages when something goes wrong. Instead, they throw appropriate exceptions, which the calling method will deal with. Finally, I have added a new UserOutput namespace which defines an interface and a very generic class that will work with any concrete implementation of the Write method declared by the interface:

namespace UserOutput
{
    #region GENERIC_OUTPUT
    internal interface IWriter
    {
        void Write(string message);
    }

    internal class Output
    {
        private readonly IWriter writer;

        internal Output(IWriter writer)
        {
            if (writer == null)
            {
                throw new ArgumentNullException("writer", "Null writer interface.");
            }

            this.writer = writer;
        }

        internal void Display(string message)
        {
            writer.Write(message);
        }
    }
    #endregion
}

The concrete implementation is another class defined as:

internal class ConsoleWriter : UserOutput.IWriter
{
    public void Write(string message)
    {
        Console.WriteLine(message);
    }
}

The Main method declares a variable of the IWriter interface type and assigns a ConsoleWriter instance to it. The linear and cubic spline interpolations are called through the IInterpolate interface, which they both implement. We no longer need two separate instances of each class. The full code can be downloaded here as a pdf file.
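The wiring in Main could look roughly like the sketch below. It repeats IWriter, Output and ConsoleWriter so that it compiles on its own, and it leaves out the interpolation classes. The namespace UserOutputSketch, the BufferWriter class and the message text are assumptions added for illustration and are not part of the downloadable code.

```csharp
using System;
using System.Text;

namespace UserOutputSketch
{
    internal interface IWriter
    {
        void Write(string message);
    }

    internal class Output
    {
        private readonly IWriter writer;

        // Constructor injection: the caller decides which writer is used.
        internal Output(IWriter writer)
        {
            if (writer == null)
            {
                throw new ArgumentNullException("writer");
            }

            this.writer = writer;
        }

        internal void Display(string message)
        {
            writer.Write(message);
        }
    }

    internal class ConsoleWriter : IWriter
    {
        public void Write(string message)
        {
            Console.WriteLine(message);
        }
    }

    // Hypothetical second writer: collects messages in memory instead of printing them.
    internal class BufferWriter : IWriter
    {
        private readonly StringBuilder buffer = new StringBuilder();

        public void Write(string message)
        {
            buffer.AppendLine(message);
        }

        public string Contents
        {
            get { return buffer.ToString(); }
        }
    }

    internal static class EntryPoint
    {
        private static void Main()
        {
            // Swapping the output target is now a one-line change.
            IWriter writer = new ConsoleWriter();
            Output output = new Output(writer);
            output.Display("Results would be written here.");

            BufferWriter inMemory = new BufferWriter();
            new Output(inMemory).Display("Same message, different writer.");
            Console.Write(inMemory.Contents);
        }
    }
}
```

The in-memory writer is also handy for unit tests, since the calling code can inspect what would have been shown to the user.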

Boxed Value Types and Interfaces

Greetings!

The first programming language I could claim to be proficient at was C. Then, it was C++. Thus I would say that I am a ‘low-level’ programmer at heart, and I like to think about efficient memory consumption. What I am getting at is this: I do spend time thinking about the data type to use for an entity I need to encode, and I have not sworn off structs yet. In C# a struct is a value type, and value types are meant to be immutable. This means that, by convention, they should not expose members that would allow the instance fields to be modified. Value types declared as local variables typically reside on the thread’s stack (a value type that is boxed or is a field of a class lives on the heap), and reference types reside on the heap managed by the garbage collector. The main advantage of using the stack for declaring variables is speed. Unlike with heap allocation, we only need to update the stack pointer with a push or pop instruction. Finally, the lifetime of stack variables is easily determined by the scope of the program.

Structs, being the most advanced of the value types in C#, are very useful. However, sometimes it is difficult to decide between a class and a struct. Here is concise guidance from Microsoft on when to prefer one type over another. The rule of thumb seems to be that if the type is small and should be immutable by design, then structs should be preferred over classes. We have now come to the main topic of this post, which is how an interface implemented on a value type can affect its immutability. The idea behind the following code samples is not originally mine. I have borrowed it (and modified it slightly) from J. Richter’s CLR via C#, 4th edition (part 2, Designing Types). Take a look at the short program below. Can you guess what its output is?

using System;
using System.Collections.Generic;

namespace BoxedValueTypesWithInterfaces
{
    internal struct Coordinate
    {
        public Coordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public void UpdateCoordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public override string ToString()
        {

            return String.Format("({0},{1},{2})", x.ToString(), y.ToString(), z.ToString());
        }

        private int x;
        private int y;
        private int z;
    }

    class EntryPoint
    {
        static void Main(string[] args)
        {
            Coordinate myCoordinate = new Coordinate(0, 0, 0);

            //no boxing because we overrode ToString()
            Console.WriteLine("Current position is " + myCoordinate.ToString());

            myCoordinate.UpdateCoordinate(1, 2, 3);
            Console.WriteLine("New position is " + myCoordinate.ToString());

            //myCoordinate is boxed here
            Object o = myCoordinate;
            Console.WriteLine("Current position is " + o.ToString());

            ((Coordinate)o).UpdateCoordinate(2, 3, 4);
            Console.WriteLine("New position is " + o.ToString());

            Console.ReadLine();
        }
    }
}

The most interesting output comes from the last Console.WriteLine call. It is not going to be (2,3,4), otherwise the question would be too simple. In the statement just before it, we cast the object o, which holds a boxed copy of myCoordinate, back to the struct, which unboxes it and saves the value to a temporary location on the stack. It is this temporary location that is then modified with the new coordinates (2,3,4). The boxed value referenced by o is never updated; it cannot be changed this way. So the last Console.WriteLine call displays (1,2,3).

We can change the behavior of boxed value types through interfaces. Consider now the following modified Coordinate struct:

using System;
using System.Collections.Generic;

namespace BoxedValueTypesWithInterfaces
{
    internal interface IUpdateCoordinate
    {
        void UpdateCoordinate(int _x, int _y, int _z);
    }

    internal struct Coordinate: IUpdateCoordinate
    {
        public Coordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public void UpdateCoordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public override string ToString()
        {

            return String.Format("({0},{1},{2})", x.ToString(), y.ToString(), z.ToString());
        }

        private int x;
        private int y;
        private int z;
    }

    class EntryPoint
    {
        static void Main(string[] args)
        {
            Coordinate myCoordinate = new Coordinate(0, 0, 0);

            //no boxing because we overrode ToString()
            Console.WriteLine("Current position is " + myCoordinate.ToString());

            myCoordinate.UpdateCoordinate(1, 2, 3);
            Console.WriteLine("New position is " + myCoordinate.ToString());

            //myCoordinate is boxed here
            Object o = myCoordinate;
            Console.WriteLine("Current position is " + o.ToString());

            //unboxing here and creating a temp location
            ((Coordinate)o).UpdateCoordinate(2, 3, 4);
            Console.WriteLine("Updating through cast: new position is " + o.ToString());

            //no boxing here
            ((IUpdateCoordinate)o).UpdateCoordinate(2, 3, 4);
            Console.WriteLine("Updating through interface: new position is " + o.ToString());

            Console.ReadLine();
        }
    }
}

Again, what does the last Console.WriteLine call print? It is as expected, (2,3,4). Why? Here is why. In the statement just before it, o, being an object that points to a boxed value type, is cast to IUpdateCoordinate. No boxing takes place, since o is already an object. Because the Coordinate struct implements this interface, the cast emits a castclass operation which successfully casts object o to the IUpdateCoordinate interface. When an entity (such as a class or a struct) implements an interface, it implicitly allows an instance of that interface to represent some part of itself. If you are now thoroughly confused, I suggest you read this post which should clarify what I mean. The UpdateCoordinate method is called virtually through the IUpdateCoordinate interface, and this updates the boxed value type referenced by object o. So, no temporary variable was created because no boxing/unboxing took place. Calling the method through an interface allows us to modify the actual boxed value.

I began this post by writing that structs should not be completely avoided as they can be handy when all you need is a simple structured object with value type semantics. If you find yourself making structs implement interfaces, perhaps this can be a good point to stop and reconsider your data type design…
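For comparison, one possible redesign is sketched below: an immutable variant of the struct that returns a new value instead of mutating itself. This is only an illustration of the immutable-by-design idea; the names ImmutableCoordinate and MoveTo are made up, and this is not code from the book.

```csharp
using System;

namespace BoxedValueTypesSketch
{
    internal struct ImmutableCoordinate
    {
        // readonly fields can only be assigned in the constructor.
        private readonly int x;
        private readonly int y;
        private readonly int z;

        public ImmutableCoordinate(int x, int y, int z)
        {
            this.x = x;
            this.y = y;
            this.z = z;
        }

        // Instead of updating in place, hand back a brand new value.
        public ImmutableCoordinate MoveTo(int newX, int newY, int newZ)
        {
            return new ImmutableCoordinate(newX, newY, newZ);
        }

        public override string ToString()
        {
            return String.Format("({0},{1},{2})", x, y, z);
        }
    }

    internal static class EntryPoint
    {
        private static void Main()
        {
            object o = new ImmutableCoordinate(1, 2, 3);

            // Calling MoveTo alone changes nothing; the result must be assigned,
            // which makes the copy explicit rather than surprising.
            ((ImmutableCoordinate)o).MoveTo(2, 3, 4);
            Console.WriteLine(o.ToString()); // (1,2,3)

            o = ((ImmutableCoordinate)o).MoveTo(2, 3, 4);
            Console.WriteLine(o.ToString()); // (2,3,4)
        }
    }
}
```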

Favorite Advice from “The Elements of C# Style”

Greetings! Long time no blogging!

The usual excuses apply: no time, too much work, too tired to think about anything but work…

This post will hopefully make my readers smile, as I am going to list my favorite bits of advice on programming I have taken from “The Elements of C# Style” by K. Baldwin, A. Gray and T. Misfeldt. This book makes for a short and wonderful read, and in my opinion should be re-read on a regular basis by all practitioners. It covers naming, design, documentation, the programming language, and packaging and release. The book is full of “do’s” and “don’ts”, and it is the “don’ts” that I smile at because, admittedly, I have done all of them at some point in the past…

Here are ten, in the “smiley” order:

  1. Do It Right the First Time: the authors advise applying this rule to any code one writes, not just what is designed for production. Probably the best advice I have ever come across, and the one I seldom practice. Quite often I would begin by simply trying a few things out, getting the code to work and then never coming back to clean it up and make it presentable.
  2. Adhere to the Principle of Least Astonishment: this advice is one of my favorites! The authors write that simplicity and clarity must be preferred over complexity and unpredictability.
  3. Keep Your Comments and Code Synchronized: this one is followed by a quote by Norm Schryer: “When the code and the comments disagree, both are probably wrong.”
  4. Recognize the Cost of Reuse: reuse is often held up as the holy grail; however, truly reusable code is difficult and time-consuming to write. In addition, it increases code dependency and, sometimes, complexity.
  5. Separate Distinct Programming Layers: as this creates a flexible and maintainable architecture. Personally, I should be remembering this one more often…
  6. Reuse Objects to Avoid Reallocation: this piece of advice is from the efficiency chapter. The idea is to cache and reuse frequently created objects with limited lifespan. Use accessor methods (get) rather than constructors.
  7. Avoid the Use of End-line Comments: according to the authors these interfere with the overall structure of the code. One-line comments should be placed on a separate line. I agree, and will remember this one in the future.
  8. Do Not Use try…throw…catch to Manage Control Flow: one should use the exception-handling mechanism to handle exceptional conditions only. Except for serious failures, there should be a way to continue execution (while notifying the users or logging errors).
  9. Organize Source Code into Regions: this one really aids in improving readability and I often forget it!
  10. Consider Using Code-checking Software to Enforce Coding Standards: it is true that tools like FxCop are very useful and one can learn more about the language by simply following its suggestions! However, sometimes I turn FxCop off because I find that I don’t like to see most of my code highlighted in red!
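To illustrate the advice about exceptions and control flow, here is a small sketch contrasting the two styles. It is my own example rather than one from the book, and the class and method names are made up.

```csharp
using System;

public static class ControlFlowDemo
{
    // Discouraged: a perfectly ordinary condition (bad input) is handled
    // by throwing and catching an exception.
    public static int ParseOrDefaultWithCatch(string text, int fallback)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException)
        {
            return fallback;
        }
    }

    // Preferred: int.TryParse reports failure through its return value,
    // so no exception is involved in the normal flow.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", 0));        // 42
        Console.WriteLine(ParseOrDefault("forty-two", 0)); // 0
    }
}
```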

So, how many times did you smile?