Boxed Value Types and Interfaces

Greetings!

The first programming language I could claim to be proficient in was C. Then it was C++. So I would say that I am a ‘low-level’ programmer at heart, and I like to think about efficient memory consumption. What I am getting at is this: I do spend time thinking about the data type to use for an entity I need to encode, and I have not sworn off structs yet. In C# a struct is a value type, and value types are meant to be immutable. This means that, by design, they should not expose members that allow the instance fields to be modified after construction. Local variables of value types usually reside on the thread’s stack, while instances of reference types reside on the heap managed by the garbage collector. (A value type that is a field of a class, or that has been boxed, lives on the heap as well.) The main advantage of using the stack is speed: unlike heap allocation, it only requires adjusting the stack pointer. Finally, the lifetime of a stack variable is easily determined by the scope in which it is declared.
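To make the difference between value and reference semantics concrete, here is a minimal sketch. The PointValue and PointReference types and the CopySemantics class are hypothetical names invented for this illustration only. It shows that assigning a struct copies the whole value, while assigning a class instance copies only the reference.

```csharp
using System;

namespace ValueVersusReferenceSketch
{
    // Hypothetical value type, used only for this illustration.
    internal struct PointValue
    {
        public int X;
    }

    // Hypothetical reference type with the same single field.
    internal class PointReference
    {
        public int X;
    }

    internal static class CopySemantics
    {
        // Returns the X of the original struct after its copy was modified.
        public static int ModifyStructCopy()
        {
            PointValue original = new PointValue();
            original.X = 1;

            // Assignment copies the whole value into a second variable.
            PointValue copy = original;
            copy.X = 2;

            return original.X;
        }

        // Returns the X of the original object after it was modified via an alias.
        public static int ModifyThroughSecondReference()
        {
            PointReference original = new PointReference();
            original.X = 1;

            // Assignment copies only the reference; both names point to one object.
            PointReference alias = original;
            alias.X = 2;

            return original.X;
        }

        private static void Main()
        {
            // prints "Struct after its copy was modified: 1"
            Console.WriteLine("Struct after its copy was modified: " + ModifyStructCopy());

            // prints "Class after its alias was modified: 2"
            Console.WriteLine("Class after its alias was modified: " + ModifyThroughSecondReference());
        }
    }
}
```

The same copy-on-assignment behavior is what makes mutable structs surprising once boxing enters the picture.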

Structs, being the most advanced of the value types in C#, are very useful. However, sometimes it is difficult to decide between a class and a struct. Microsoft offers concise guidance on when to prefer one type over the other. The rule of thumb seems to be that if the type is small and should be immutable by design, then a struct should be preferred over a class. We have now come to the main topic of this post: how an interface implemented by a value type can affect its immutability. The idea behind the following code samples is not originally mine. I have borrowed it (and modified it slightly) from J. Richter’s CLR via C#, 4th edition (part 2, Designing Types). Take a look at the short program below. Can you guess what its output is?

using System;
using System.Collections.Generic;

namespace BoxedValueTypesWithInterfaces
{
    internal struct Coordinate
    {
        public Coordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public void UpdateCoordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public override string ToString()
        {

            return String.Format("({0},{1},{2})", x.ToString(), y.ToString(), z.ToString());
        }

        private int x;
        private int y;
        private int z;
    }

    class EntryPoint
    {
        static void Main(string[] args)
        {
            Coordinate myCoordinate = new Coordinate(0, 0, 0);

            //no boxing because we overrode ToString()
            Console.WriteLine("Current position is " + myCoordinate.ToString());

            myCoordinate.UpdateCoordinate(1, 2, 3);
            Console.WriteLine("New position is " + myCoordinate.ToString());

            //myCoordinate is boxed here
            Object o = myCoordinate;
            Console.WriteLine("Current position is " + o.ToString());

            ((Coordinate)o).UpdateCoordinate(2, 3, 4);
            Console.WriteLine("New position is " + o.ToString());

            Console.ReadLine();
        }
    }
}

The most interesting output is on line 50 of the listing, counting from the first using directive (it is the last Console.WriteLine call). It is not going to be (2,3,4), otherwise the question would be too simple. On line 49 we cast the object o, which refers to a boxed copy of myCoordinate, back to Coordinate. The cast unboxes the value and copies it to a temporary location on the stack. It is this temporary location that is then modified with the new coordinates (2,3,4). The boxed value that o refers to is never updated; it cannot be changed through the cast. So the code on line 50 displays (1,2,3).
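The hidden temporary can be made visible by writing the unboxing step out by hand. The sketch below is only an illustration, not the book’s code: the Sketch namespace, the UnboxingSketch class and its method names are assumptions, and Coordinate is a trimmed copy of the struct from the listing.

```csharp
using System;

namespace BoxedValueTypesWithInterfaces.Sketch
{
    // Trimmed copy of the Coordinate struct from the listing.
    internal struct Coordinate
    {
        private int x;
        private int y;
        private int z;

        public Coordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public void UpdateCoordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public override string ToString()
        {
            return String.Format("({0},{1},{2})", x.ToString(), y.ToString(), z.ToString());
        }
    }

    // Hypothetical helper class, named for this illustration only.
    internal static class UnboxingSketch
    {
        // Spells out what ((Coordinate)o).UpdateCoordinate(2, 3, 4) does.
        public static string UpdateThroughCast()
        {
            object o = new Coordinate(1, 2, 3);   // boxed copy on the heap

            Coordinate temp = (Coordinate)o;      // unboxing copies the fields into temp
            temp.UpdateCoordinate(2, 3, 4);       // only temp changes

            return "boxed=" + o.ToString() + ", temp=" + temp.ToString();
        }

        // One way to get the new value behind o: box the modified copy again.
        public static string UpdateAndRebox()
        {
            object o = new Coordinate(1, 2, 3);

            Coordinate temp = (Coordinate)o;
            temp.UpdateCoordinate(2, 3, 4);
            o = temp;                             // o now refers to a new box

            return o.ToString();
        }

        private static void Main()
        {
            // prints "boxed=(1,2,3), temp=(2,3,4)"
            Console.WriteLine(UpdateThroughCast());

            // prints "(2,3,4)"
            Console.WriteLine(UpdateAndRebox());
        }
    }
}
```

Note that UpdateAndRebox does not change the original box either. It allocates a second box and points o at it, so any other reference to the first box would still see (1,2,3).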

We can change the behavior of boxed value types through interfaces. Consider now the following modified Coordinate struct:

using System;
using System.Collections.Generic;

namespace BoxedValueTypesWithInterfaces
{
    internal interface IUpdateCoordinate
    {
        void UpdateCoordinate(int _x, int _y, int _z);
    }

    internal struct Coordinate: IUpdateCoordinate
    {
        public Coordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public void UpdateCoordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public override string ToString()
        {

            return String.Format("({0},{1},{2})", x.ToString(), y.ToString(), z.ToString());
        }

        private int x;
        private int y;
        private int z;
    }

    class EntryPoint
    {
        static void Main(string[] args)
        {
            Coordinate myCoordinate = new Coordinate(0, 0, 0);

            //no boxing because we overrode ToString()
            Console.WriteLine("Current position is " + myCoordinate.ToString());

            myCoordinate.UpdateCoordinate(1, 2, 3);
            Console.WriteLine("New position is " + myCoordinate.ToString());

            //myCoordinate is boxed here
            Object o = myCoordinate;
            Console.WriteLine("Current position is " + o.ToString());

            //unboxing here and creating a temp location
            ((Coordinate)o).UpdateCoordinate(2, 3, 4);
            Console.WriteLine("Updating through cast: new position is " + o.ToString());

            //no boxing here
            ((IUpdateCoordinate)o).UpdateCoordinate(2, 3, 4);
            Console.WriteLine("Updating through interface: new position is " + o.ToString());

            Console.ReadLine();
        }
    }
}

Again, what is the output on line 60 of the listing (the last Console.WriteLine call)? It is, as expected, (2,3,4). Here is why. On line 59, o, a reference to a boxed value type, is cast to IUpdateCoordinate. No boxing takes place, since o already refers to an object on the heap. Because the Coordinate struct implements this interface, the cast emits a castclass instruction, which successfully casts o to the IUpdateCoordinate interface. When an entity (such as a class or a struct) implements an interface, it implicitly allows a reference of that interface type to represent some part of itself. If you are now thoroughly confused, I suggest you read this post, which should clarify what I mean. The UpdateCoordinate method is called virtually through the IUpdateCoordinate interface, and this updates the boxed value referenced by o. No temporary copy is created, because no boxing or unboxing takes place. Calling the method through an interface allows us to modify the actual boxed value.
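The interface route only changes the value when it starts from an existing box. If the cast is applied to an unboxed Coordinate instead, the cast itself boxes a copy, and the update lands in that short-lived box. The sketch below illustrates both cases; the InterfaceSketch namespace, the InterfaceBoxingSketch class and its method names are assumptions made for this example, and the struct and interface are trimmed copies of the ones in the listing.

```csharp
using System;

namespace BoxedValueTypesWithInterfaces.InterfaceSketch
{
    internal interface IUpdateCoordinate
    {
        void UpdateCoordinate(int _x, int _y, int _z);
    }

    // Trimmed copy of the Coordinate struct from the listing.
    internal struct Coordinate : IUpdateCoordinate
    {
        private int x;
        private int y;
        private int z;

        public Coordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public void UpdateCoordinate(int _x, int _y, int _z)
        {
            x = _x;
            y = _y;
            z = _z;
        }

        public override string ToString()
        {
            return String.Format("({0},{1},{2})", x.ToString(), y.ToString(), z.ToString());
        }
    }

    // Hypothetical helper class, named for this illustration only.
    internal static class InterfaceBoxingSketch
    {
        // Starting from an unboxed struct: the cast boxes a copy of c,
        // the copy is updated, and the box is then discarded.
        public static string UpdateUnboxedThroughInterface()
        {
            Coordinate c = new Coordinate(1, 2, 3);
            ((IUpdateCoordinate)c).UpdateCoordinate(2, 3, 4);
            return c.ToString();
        }

        // Starting from an existing box: the cast does not copy anything,
        // so the update changes the value that o refers to.
        public static string UpdateBoxedThroughInterface()
        {
            object o = new Coordinate(1, 2, 3);
            ((IUpdateCoordinate)o).UpdateCoordinate(2, 3, 4);
            return o.ToString();
        }

        private static void Main()
        {
            // prints "(1,2,3)"
            Console.WriteLine(UpdateUnboxedThroughInterface());

            // prints "(2,3,4)"
            Console.WriteLine(UpdateBoxedThroughInterface());
        }
    }
}
```

The two calls look almost identical in source code yet behave differently, which is a good argument for keeping structs immutable.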

I began this post by writing that structs should not be completely avoided, as they can be handy when all you need is a simple structured object with value type semantics. If you find yourself making structs implement interfaces, that may be a good point to stop and reconsider your data type design…

Favorite Advice from “The Elements of C# Style”

Greetings! Long time no blogging!

The usual excuses apply: no time, too much work, too tired to think about anything but work…

This post will hopefully make my readers smile, as I am going to list my favorite bits of programming advice taken from “The Elements of C# Style” by K. Baldwin, A. Gray and T. Misfeldt. This book makes for a short and wonderful read, and in my opinion it should be re-read on a regular basis by all practitioners. It covers naming, design, documentation, the programming language, and packaging and release. The book is full of “do’s” and “don’ts”, and it is the “don’ts” that I smile at because, admittedly, I have done all of them at some point in the past…

Here are ten, in “smiley” order:

  1. Do It Right the First Time: the authors advise applying this rule to any code one writes, not just code designed for production. Probably the best advice I have ever come across, and the one I seldom practice. Quite often I begin by simply trying a few things out, get the code to work and then never come back to clean it up and make it presentable.
  2. Adhere to the Principle of Least Astonishment: this advice is one of my favorites! The authors write that simplicity and clarity must be preferred over complexity and unpredictability.
  3. Keep Your Comments and Code Synchronized: this one is followed by a quote by Norm Schryer: “When the code and the comments disagree, both are probably wrong.”
  4. Recognize the Cost of Reuse: reuse is often held up as the holy grail; however, truly reusable code is difficult and time-consuming to write. In addition, reuse increases code dependency and, sometimes, complexity.
  5. Separate Distinct Programming Layers: as this creates a flexible and maintainable architecture. Personally, I should be remembering this one more often…
  6. Reuse Objects to Avoid Reallocation: this piece of advice is from the efficiency chapter. The idea is to cache and reuse frequently created objects with a limited lifespan. Obtain them through accessor (get) methods rather than constructors.
  7. Avoid the Use of End-line Comments: according to the authors, these interfere with the overall structure of the code. One-line comments should be placed on a separate line. I agree, and will remember this one in the future.
  8. Do Not Use try…throw…catch to Manage Control Flow: one should use the exception-handling mechanism only to handle exceptional conditions. Except for serious failures, there should be a way to continue execution (while notifying the users or logging errors).
  9. Organize Source Code into Regions: this one really aids in improving readability and I often forget it!
  10. Consider Using Code-checking Software to Enforce Coding Standards: it is true that tools like FxCop are very useful, and one can learn more about the language by simply following their suggestions! However, sometimes I turn FxCop off because I find that I don’t like to see most of my code highlighted in red!
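As a small illustration of item 8, here is a sketch of my own (it is not taken from the book). The ParsingSketch class and its method names are hypothetical; int.Parse and int.TryParse are the standard .NET methods. Malformed input is an expected condition rather than an exceptional one, so the second method handles it without a try/catch.

```csharp
using System;

namespace ElementsOfStyleSketch
{
    // Hypothetical helper class, named for this illustration only.
    internal static class ParsingSketch
    {
        // Discouraged: an expected condition (malformed input)
        // is routed through the exception-handling mechanism.
        public static int ParseOrDefaultWithExceptions(string text, int fallback)
        {
            try
            {
                return int.Parse(text);
            }
            catch (FormatException)
            {
                return fallback;
            }
        }

        // Preferred: TryParse reports failure through its return value,
        // so no exception is thrown for malformed input.
        public static int ParseOrDefault(string text, int fallback)
        {
            int value;
            return int.TryParse(text, out value) ? value : fallback;
        }

        private static void Main()
        {
            // prints "42"
            Console.WriteLine(ParseOrDefault("42", -1));

            // prints "-1"
            Console.WriteLine(ParseOrDefault("abc", -1));

            // prints "-1"
            Console.WriteLine(ParseOrDefaultWithExceptions("abc", -1));
        }
    }
}
```

Both methods return the same results for these inputs. The difference is that the first one pays for a thrown exception on every bad string and hides the ordinary "not a number" path inside a catch block.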

So, how many times did you smile?