Tuesday, May 28, 2013

Strongly Typed (and Named) Primitives

Quick note: this post can be classified under "language abuse". If you don't get it, don't worry, you're probably better off that way. :)

#include <iostream>

struct SpecialPrimitive
{
    enum class Value
    {
        One = 1,
        Two,
        Three
    };

    // Only the named values can be used to construct or assign.
    SpecialPrimitive(Value v) : _value(static_cast<int>(v)) {}
    SpecialPrimitive& operator=(Value v) { _value = static_cast<int>(v); return *this; }

    // Expose the underlying int, e.g. for printing or deserialization.
    operator int&() { return _value; }
    operator const int&() const { return _value; }

private:
    int _value;
};

int main()
{
    // SpecialPrimitive p = 42; will not compile
    SpecialPrimitive p = SpecialPrimitive::Value::Two;

    std::cout << p << std::endl;

    // again, p = 69; will not compile
    p = SpecialPrimitive::Value::Three;

    std::cout << p << std::endl;
}


This extendable kind of enum class lets you deserialize ints into it, while still getting the *compile-time* benefits of an enum class. Note that it might be a good idea to perform validation after deserialization, depending on your context.
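Here's a rough sketch of what "deserialize, then validate" could look like. The struct is a trimmed copy of the one above; `isValid` and `deserialize` are names I made up for this example, and the range check assumes the enumerators are contiguous.

```cpp
#include <sstream>
#include <stdexcept>

// Trimmed copy of the struct from the post, just enough for this example.
struct SpecialPrimitive
{
    enum class Value { One = 1, Two, Three };

    SpecialPrimitive(Value v) : _value(static_cast<int>(v)) {}
    operator int&() { return _value; }
    operator const int&() const { return _value; }

private:
    int _value;
};

// Hypothetical helper: true only if the stored int names one of the enumerators.
// Assumes the enumerators form a contiguous range from One to Three.
inline bool isValid(const SpecialPrimitive& p)
{
    const int raw = p;
    return raw >= static_cast<int>(SpecialPrimitive::Value::One)
        && raw <= static_cast<int>(SpecialPrimitive::Value::Three);
}

// Hypothetical helper: read an int from a stream and reject unknown values.
inline SpecialPrimitive deserialize(std::istream& in)
{
    SpecialPrimitive p = SpecialPrimitive::Value::One;
    int& raw = p;  // the int& conversion is what allows writing a raw int
    if (!(in >> raw) || !isValid(p))
        throw std::runtime_error("SpecialPrimitive: invalid serialized value");
    return p;
}
```

Feeding it a stream containing "2" gives you a SpecialPrimitive holding Two, while "42" throws instead of silently producing a value no enumerator names.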

Thursday, April 4, 2013

Custom Exceptions

Regarding local applications, and not public APIs, here's a quick tip: only define custom exceptions in your application if you already have a use case for catching them specifically, and a general runtime error (i.e. the basic Exception class in your domain) won't do.

And if at any point you realize you have custom exception classes defined, but not used, remove them and replace the thrown exceptions with standard runtime errors, with a proper message.

Reasons:
  1. They add code and complexity
  2. Coming up with the right exceptions to detect and when and how to throw them is often not an easy task, when designing bottom-up.
  3. You often don't need to distinguish between types of exceptions. You just need to know that one happened. If you need details for a retrospective, simply write a nice human-readable message.
  4. If you ever really need them, adding them is trivial. When it's your application, it's your code, and you should be able to do whatever you want with it, as long as the application stays correct, especially if it makes it better.
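
For illustration, here's a minimal sketch of what "a standard runtime error with a proper message" might look like in C++. `parsePort` is a made-up function and the port range is just an example rule; the point is only that no custom exception class is involved.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical application function: failures are reported with a plain
// std::runtime_error whose message says what went wrong and with which input.
inline int parsePort(const std::string& text)
{
    std::size_t consumed = 0;
    int port = 0;
    try {
        port = std::stoi(text, &consumed);
    } catch (const std::exception&) {
        // std::stoi throws std::invalid_argument or std::out_of_range.
        throw std::runtime_error("parsePort: '" + text + "' is not a number");
    }
    if (consumed != text.size() || port < 1 || port > 65535)
        throw std::runtime_error("parsePort: '" + text + "' is not a valid port (1-65535)");
    return port;
}
```

If a caller someday needs to tell these two failures apart, that's the moment to introduce a dedicated exception type, and not before.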

So now there's still an open issue: what happens if you want to associate data with the exception object that would help in debugging later on (for example, if you find it in a stack trace)?

My current solution is to let your application's basic exception contain a pointer/reference to an ExceptionData object, which has the following features:
  1. A vector of string messages (const char* in C++ to avoid constructors which may throw).
  2. A map between a string description and any standard type (int, double, char, bool, etc), to describe state.

Note that, in the second case, if a more complex object is detected as corrupted, one of two cases applies. Either it is possible to expose its values as well, and you can do just that; or it is not possible, in which case keeping a reference to it is useless anyway, because you cannot inspect it.

Another important fact to keep in mind is that any kind of data-keeping that's not primitive (type-wise, just like those vectors and maps I suggested, or any kind of string formatting) might result in another corruption or a thrown exception, which could cause the program to terminate immediately if you aren't careful. So if you choose to associate data with your exceptions, consider the context, and do it carefully.
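
To make this less abstract, here's one possible shape for such an ExceptionData in C++. Everything beyond the name ExceptionData is an assumption of mine: `AppException`, the `with` method and the tagged `Value` struct are hypothetical, and this is a sketch rather than a hardened implementation. Since attaching data allocates and can itself fail, the sketch simply drops the data in that case and keeps the plain message.

```cpp
#include <exception>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct ExceptionData
{
    // Hypothetical tagged value covering a few primitive types.
    struct Value
    {
        enum Kind { Int, Double, Char, Bool } kind;
        union { int i; double d; char c; bool b; };

        Value(int v) : kind(Int), i(v) {}
        Value(double v) : kind(Double), d(v) {}
        Value(char v) : kind(Char), c(v) {}
        Value(bool v) : kind(Bool), b(v) {}
    };

    std::vector<const char*> messages;   // string literals only; nothing is copied
    std::map<std::string, Value> state;  // description -> primitive snapshot
};

class AppException : public std::exception
{
public:
    explicit AppException(const char* what) noexcept : _what(what) {}

    // Attaches one message and one key/value pair. Allocation may fail here;
    // if it does, the data is dropped and the exception keeps its message.
    AppException& with(const char* message, const char* key, ExceptionData::Value value) noexcept
    {
        try {
            if (!_data) _data = std::make_shared<ExceptionData>();
            _data->messages.push_back(message);
            _data->state.emplace(key, value);
        } catch (...) {
            _data.reset();
        }
        return *this;
    }

    const char* what() const noexcept override { return _what; }
    const ExceptionData* data() const noexcept { return _data.get(); }

private:
    const char* _what;                     // expected to be a string literal
    std::shared_ptr<ExceptionData> _data;  // shared, so copying the exception cannot throw
};
```

A throw site could then read `throw AppException("index corrupted").with("while loading page", "page", 7);`, and whoever catches it can inspect `data()` if it is non-null.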

Happy error handling.

Wednesday, November 23, 2011

Global State

They say global state is harmful. Is it disadvantageous in every possible context? In this article I'll attempt to answer this question from a new and different perspective.

In the course of my "explaining simple ideas" series, I decided to start by explaining why global state is considered harmful among most programmers.

But try as I might (and I did), I couldn't find a single general reason for it. I searched all over the web and no reason quite satisfied me in a general way. That's when it hit me - there might not even be one. At least no one seemed to have found it yet.

I'll be explaining, from my point of view, the separate ups and downs of using globals, and the consequences. Along the way I'll be criticizing the way we programmers usually think and work.

Abstract

The use of globally accessible state has been preached and warned against for decades. There are many different reasons for that. But more surprisingly, I found that many of those reasons are almost injective in relation to the context in which global variables are used. In other words, we shouldn't defiantly avoid global variables, but understand where we're heading with our code, and whether using global variables can ultimately sabotage our productivity, in a way that doesn't justify its immediate value. We need to make calculated, rational decisions.

Unfortunately, many of us (programmers) are victims and often transitive sinners of the "don't use globals" evangelism.
In this article I'll try to refrain from talking about whether global variables are good or bad per se, but I'll present specific contexts in which it is clearly advisable not to use them.

I'll try to answer the following questions:

  • Are there any benefits to using globals?
  • Is it simply better never to use globally accessed variables, as a rule?

As well as review the effects of using global variables on:

  • Comprehensibility (Complexity mitigation)
  • Modularity (Reusability and Commutability)
  • Testing scenarios

What's a Global

Before we go into whether using globals is recommended or discouraged, it's important to make sure we understand what indeed is considered (by us) a global.

You could say that variables that are defined in global scope are called globals, but what about functions that give you access to a static resource, like the static class functions used in the Singleton design pattern? The following things fall under globals:

  • Variables defined in global scope
  • Free functions and static class functions
  • Classes (Types) and their constructors

In other words, anything that can be accessed from anywhere in our code.
Consequently, a global is anything that, once used in our code, we have no way to replace or control from the outside (a place where we can do that is called a seam); instead, we have to go back to the point of use and change it there.

If you're not sure what I mean, don't sweat; you'll get it in the examples throughout this article.

Alternatives

There is, of course, an alternative to using globals. Simply put, it's just passing the variables around to any method or class that needs to use them. Recently this has been labeled as Dependency Injection, and it means that instead of accessing your variables from a global scope, you explicitly depend on them.

So for example, if this is a case of using globals:

class MyClass {
    public void doWork() {
        // ...
        Database.getInstance().getNewest("Thing");
        // ...
    }
}

Then the alternative, using DI (Dependency Injection) is:

class MyClass {
    private IDatabase database;

    public MyClass(IDatabase database) {
        this.database = database;
    }
    public void doWork() {
        // ...
        database.getNewest("Thing");
        // ...
    }
}

The upsides of this method will be covered in a bit (if you can't see them by yourself already).

General Cases

Since I'm going to focus on specifics, I'll only address two simple general questions.

Are There Benefits to Using Globals

From an old-fashioned and procedural point of view, "depending" on services is not so trivial. It's easier to understand when a variable is simply global, like an omnipresent entity, and we just access it whenever we want. That in itself is an advantage.

If you're not aware of the consequences, you might even consider it good practice. Some might even consider it a more flexible solution, as it allows any class or method to access that variable, without any set up.

Of course, the flexibility claim is false, simply because that's not the kind of flexibility we want to achieve. As programmers we need to define another type of flexibility that in some aspects is the complete opposite of the first type. This will be covered in a bit.

In addition, if your program is not built in such a way that it allows you to pass every last class all the dependencies it needs at the right time (right is kind of subjective in this sense), then it's obviously easier to just hook up a global, rather than reorganizing your initialization code.

Yes, it would "smell bad". Yes, it could get back at you someday, and that's nothing to take lightly. No, it wouldn't be my first choice, and yes, you could go back and fix it later. But when all is said and done, it can save you time and effort just to get you to ship your product on time. So that in itself is worth considering as an option.

It gets dangerous as you start to lose control, or become lazy. Either way, if you choose to use a global, and a day comes when you need to use it again in a different place - that waves a red flag. When red flags wave, you should refactor mercilessly.

Is Total Avoidance the Way to Go

I can't honestly claim to have the answer to this question. I'll tell you this, though: if you see nothing fishy, nor any similarities between the examples (that I'll show you later on) and your situation, and you have reason to believe that using globals will actually be profitable in the short run and manageable in the long one, then I say just go for it, and if anything bad happens, make sure you take the time and learn from your mistakes, and do your best to fix them.

Also consider documenting the reason(s) you have for using them, just so future developers won't think you're an asshole, and so they'll be able to use their own judgment, instead of making unknowledgeable and possibly dogma-driven refactorings, which might somehow turn out harmful themselves.

Just in case, though; if you're in doubt about using or not using (hehe), try asking Google. Remember that you should try to get as many opinions as you can afford to, given your time constraints.

Recommended Avoidance

It doesn't take a lot of effort to understand the simplicity and elegance with which globals annihilate some of our aspirations. Namely: predictability, transparency, loose coupling, testable classes, etc. Let's have a look at an example:

class Database {
    private static Database instance = new Database();
    public static Database getInstance() { return instance; }
    private Database() { }
}

class PhoneBook {
    public string getPhoneNumberOf(string fullName) {
        return Database.getInstance()
            .Single(person => person.FullName == fullName).PhoneNumber;
    }
}

What can we conclude from this design? Well, PhoneBook can just pick out the Database globally without having to explicitly depend on it, and without forcing the creator of the PhoneBook to supply a Database. So that's a flexible solution, right? Wrong. That's not a flexible solution, that's a lazy and constrictive solution. Puzzled? Stay with me here...

As grown-up developers, we need to stop nurturing our despicable no-good laziness. "Do as little work as possible" is simply not what they meant at the university when they said "Good programmers are lazy". That was, apparently, an immensely harmful over-generalization. They just meant you shouldn't duplicate code, and added a touch of poisonous humor to it.

We by all means should never, ever be literally lazy. Indolence won't lead us anywhere worth going. In fact, we need to be extremely industrious and resourceful if we want to go somewhere. That's way, way on the other side of being lazy.

But yet, we naturally appreciate small, simple solutions - and that's ok. Coming up with small, simple solutions is a great bonus - but only if it doesn't stop you from reaching your goals. So yes, look for the simplest possible solution, but define your objectives well! Make sure you're not biased towards simplicity, just because you don't want to bother thinking a bit more, at the cost of missing out on some great advantages (which, ironically enough, help you maintain simplicity in the long run).

Armed with this insight: what kind of flexibility do we really want, and why? What do we even mean by flexibility?

If you've been developing software for some time now, you know that there's no such thing as unchanging requirements. Software requirements are an inherently dynamic entity. Customers never know what they're always going to want, because the possibilities just grow with time as the software becomes a reliable and productive tool. This is also true (from experience) if the only customer is the only developer. This is practically always true.

So based on that, we need to be able to respond to change. And that's what we mean by flexibility. We mean that if the software requirements change, we want to be able to reassemble the software parts (or create new ones) so that the new assembly meets the new requirements. We don't want to have to delete old work and replace it with new work. Once we've written something and tested it, we strive to keep it. We also don't want any obstructive disturbances where assembling two or more parts together creates an unexpected result.

Let's look at our example again and see if we find anything wrong with it now:

class Database {
    private static Database instance = new Database();
    public static Database getInstance() { return instance; }
    private Database() { }
}

class PhoneBook {
    public string getPhoneNumberOf(string fullName) {
        return Database.getInstance()
            .Single(person => person.FullName == fullName).PhoneNumber;
    }
}

A few things come to mind:

Transparency and Predictability

If we did something like this:

PhoneBook phoneBook = new PhoneBook();
Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));

then we'd have to deduce a couple of things.

  1. This code might throw an exception, since we're creating a new PhoneBook and trying to access an existing entry, but there's no code inserting entries prior to that. Therefore it's reasonable to assume that no number for "Yam Marcovic" exists and this would throw an exception.
  2. If it doesn't throw an exception, we might think that the PhoneBook maintains a pre-defined default list of names, which contains the number for "Yam Marcovic".

What we really wouldn't want to think is that PhoneBook uses some kind of a heavy database in the background, and this getPhoneNumberOf method called on a brand new object would actually be a performance bottleneck in a simple program such as this.

But that indeed is the case here. This code is opaque and unpredictable. The ambiguity would just fade away if we had to write it like this:

PhoneBook phoneBook = new PhoneBook(new Database("<CONNECTION STRING>"));
Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));

or even better:

PhoneBook phoneBook = PhoneBook.FromDatabase(new Database("<CONNECTION STRING>"));
Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));

Now this code is transparent and predictable. A fellow developer looking at it would instantly be able to figure out what it does and what to expect of it.

Loose Coupling

What if we wanted a different kind of database, some place else in the program? For example, a different part of the program could well use all the logic in PhoneBook, and the fact that it is a PhoneBook (an instance of the class), but we want that part to get its data from a file, not a database. We want that, while a different part of the program still uses that database.

We just can't do that right now, because we're tightly coupled to a global variable. We could do something horrendous such as putting conditions inside the PhoneBook about whether to get the data from a file or from a database. But that's the thing - we don't want to have to come back and change the PhoneBook code every time we just want it to handle data from a different source. We want to be loosely coupled to the source of data. The opposite of being tightly coupled to it.

Here's a solution:

class PhoneBook {
    private IDatabase database;

    public PhoneBook(IDatabase database) {
        this.database = database;
    }

    public string getPhoneNumberOf(string fullName) {
        return database.Single(person => person.FullName == fullName).PhoneNumber;
    }
}

Now we just need to implement the IDatabase interface, and the PhoneBook will automatically know how to retrieve phone numbers from our data source! Now we can respond to changes much more effectively.
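
Here's roughly how the "data from a file" case could look, sketched in C++ rather than the language of the snippets above. `IPhoneSource` stands in for IDatabase, `LineSource` is a made-up implementation, and the "Full Name,number" line format is invented for this example. Note that PhoneBook itself doesn't change when the source does.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical counterpart of IDatabase: the one operation PhoneBook needs.
struct IPhoneSource
{
    virtual ~IPhoneSource() = default;
    virtual std::string phoneNumberOf(const std::string& fullName) const = 0;
};

// Hypothetical source that reads "Full Name,number" lines from any stream,
// for example a std::ifstream opened on a file.
class LineSource : public IPhoneSource
{
public:
    explicit LineSource(std::istream& in)
    {
        std::string line;
        while (std::getline(in, line)) {
            const std::string::size_type comma = line.find(',');
            if (comma != std::string::npos)  // lines without a comma are skipped
                _entries[line.substr(0, comma)] = line.substr(comma + 1);
        }
    }

    std::string phoneNumberOf(const std::string& fullName) const override
    {
        const auto it = _entries.find(fullName);
        if (it == _entries.end())
            throw std::runtime_error("no phone number for " + fullName);
        return it->second;
    }

private:
    std::map<std::string, std::string> _entries;
};

// PhoneBook only knows the interface, not where the data comes from.
class PhoneBook
{
public:
    explicit PhoneBook(const IPhoneSource& source) : _source(source) {}

    std::string getPhoneNumberOf(const std::string& fullName) const
    {
        return _source.phoneNumberOf(fullName);
    }

private:
    const IPhoneSource& _source;  // the source must outlive the PhoneBook
};
```

One part of the program can build a PhoneBook over a LineSource, while another builds one over a database-backed implementation of the same interface.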

This is also called Dependency Inversion. The Iron Law of DI: If it could work without a dependency, it should work without a dependency.

Testing

There's not a whole lot to say here at this point. If you know what stubbing and mocking are, then just take what I said about transparency and loose coupling and see how it applies here. This way you can create cleaner, simpler, faster, more efficient, and longer-lasting tests, and keep your code clean of test-enabling behavior.
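
As a small illustration, here's what a stub-based test might look like, again sketched in C++. `IPhoneSource`, `StubSource` and the test function are hypothetical names, and the test uses a bare assert instead of any particular testing framework.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface the PhoneBook depends on (stands in for IDatabase).
struct IPhoneSource
{
    virtual ~IPhoneSource() = default;
    virtual std::string phoneNumberOf(const std::string& fullName) const = 0;
};

class PhoneBook
{
public:
    explicit PhoneBook(const IPhoneSource& source) : _source(source) {}

    std::string getPhoneNumberOf(const std::string& fullName) const
    {
        return _source.phoneNumberOf(fullName);
    }

private:
    const IPhoneSource& _source;  // the source must outlive the PhoneBook
};

// Stub: returns a canned answer and remembers what it was asked.
// No real database is created, so the test is fast and deterministic.
struct StubSource : IPhoneSource
{
    mutable std::string lastQuery;

    std::string phoneNumberOf(const std::string& fullName) const override
    {
        lastQuery = fullName;
        return "555-0100";
    }
};

inline void testPhoneBookAsksItsSource()
{
    StubSource stub;
    PhoneBook book(stub);

    assert(book.getPhoneNumberOf("Yam Marcovic") == "555-0100");
    assert(stub.lastQuery == "Yam Marcovic");
}
```

With the singleton version there is nowhere to plug the stub in, so the same test would have to run against the real database.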

Summary

While using globals often results in a bit less work during the initial stages, it can (and quite often does) get back at us later on. The simple solution is either not to use them at all, or alternatively, refactor them away when multiple components depend on them.

This article isn't a complete reference of when and when not to use globals, but it should have given you a good starting point for figuring things out on your own. Weigh your options carefully. Use globals sparingly.

Sunday, November 6, 2011

Small Update

Hey.
Just to keep everybody posted, I'm a bit tied up with my studies, but my first "Explaining Things" article is almost ready.

I think it's going to come out really cool. We'll see how you like it! It'll be out in about 2 weeks.

Sorry for the delays (although, I doubt anyone's anxiously waiting for it... :) ).

Saturday, October 8, 2011

Simple Ideas Explained

This is the preface of a series of articles I intend to post in the coming weeks about simple ideas, ubiquitous among all programmers, but rarely understood, and too often preached for without any justifiable reason.

Follow me in the coming weeks to understand, once and for all, what's so important in the following concepts, and why.

Do that, and you'll be able to explain, both to yourself, and to your friends and co-workers, why you, as a programmer, make all those (somewhat intuitive) design decisions that you make.

What I'm going to talk about is why and when the following concepts are considered a bad thing:

  • Global state
  • Global access
  • Tight coupling
  • Multiple responsibilities per class
  • Operator overloading
  • Bad or inconsistent naming conventions

But first, a few key notes you should be aware of:

Intuition

It turns out that, as programmers, we rely heavily on intuition. Why is that? Simply because we can never be bothered to learn and understand all the smallest peculiarities found in Computer Science. We often feel competent as soon as we think we get the general idea of something, without bothering to philosophize about it, which can be quite a time waster, cumulatively.

However, it is fundamental that we understand that that's exactly what makes up our intuition. Things which we kind of get, but not entirely.

This may often prove to still be valuable and useful, but we must never assume that it is unquestionably correct.

Dogmatism

One obvious problem resulting from reliance on intuition is dogmatism.

When we don't understand an idea to its core, but know that it is preached for and is widely considered a good idea, we tend to apply it even where it's redundant or even harmful.

One side effect of not knowing how to explain such ideas is transitive fanatic evangelism. Meaning, if you manage to use your charisma to persuade other people that an idea is good, without explaining it thoroughly, they will often listen to you, and evangelize others in exactly the same way.

This is how many common bad ideas are formed.

Context

We often find ourselves asking unnecessarily general questions, such as "Is this idea important?".

It is vital to understand, once and for all (get it in your head) that there is no such thing as "important things".

One must always define a context. When a context is not defined, or implicitly and unambiguously rendered, one should not bother to answer questions.

Self-rationalization is a disease. If you wish to open yourself to practicality, you must embrace objectivity. Nothing is good or bad without a context. Nothing!

While subjectivity may often lead to quite humorous catchy mnemonics, such as this quote from some guy at CodeReview.SE:

"Daddy, Daddy, he defined a global!"

"Now son, what have I told you about using language like that?"

It should also be explicitly taken into consideration that, for example, defining a global variable might not always be a bad idea, until it is proven otherwise.

A lack of practical examples justifies avoidance, but does not constitute groundless aversion.

First-Hand Experience

I would like to clarify that everything I say is based on personal first-hand experience.

If anything portrays me faithfully, it is the ability to never skip making impulsive mistakes, but, quite fortunately, to learn from some.

I hope that you find sympathy, understanding, and enlightenment in the upcoming posts.