They say global state is harmful. Is it disadvantageous in every possible context? In this article I'll attempt to answer this question from a new and different perspective.
In the course of my "explaining simple ideas" series, I decided to start by explaining why global state is considered harmful by most programmers.
But try as I might (and I did), I couldn't find a single general reason for it. I searched all over the web and no reason quite satisfied me in a general way. That's when it hit me - there might not even be one. At least no one seemed to have found it yet.
I'll be explaining, from my point of view, the separate ups and downs of using globals, and the consequences. Along the way I'll be criticizing the way we programmers usually think and work.
Abstract
Programmers have been warned against globally accessible state for decades, and for many different reasons. More surprisingly, I found that many of those reasons are tied to the specific context in which the global variables are used. In other words, we shouldn't avoid global variables by reflex; we should understand where we're heading with our code, and whether using global variables could ultimately sabotage our productivity to a degree that their immediate value doesn't justify. We need to make calculated, rational decisions.
Unfortunately, many of us (programmers) are victims and often transitive sinners of the "don't use globals" evangelism.
In this article I'll try to refrain from talking about whether global variables are good or bad per se, but I'll present specific contexts in which it is clearly advisable not to use them.
I'll try to answer the following questions:
- Are there any benefits to using globals?
- Is it simply better never to use globally accessed variables, as a rule?
I'll also review the effects of using global variables on:
- Comprehensibility (Complexity mitigation)
- Modularity (Reusability and Commutability)
- Testing scenarios
What's a Global
Before we go into whether using globals is recommended or discouraged, it's important to make sure we understand what indeed is considered (by us) a global.
You could say that variables defined in global scope are called globals, but what about functions that give you access to a static resource, such as the static class functions in the Singleton design pattern? The following things fall under globals:
- Variables defined in global scope
- Free functions and static class functions
- Classes (Types) and their constructors
In other words, anything that can be accessed from anywhere in our code.
Consequently, a global is anything that, once we use it in our code, we have no way to substitute or control from the outside; to change it, we have to go back to the point of use and edit it there. (A place where behavior can be changed without editing the code at that point is called a seam.)
If you're not sure what I mean, don't sweat; you'll get it in the examples throughout this article.
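To make the list above a bit more concrete, here's a minimal sketch of all three kinds of globals being used from a single method. The names (Settings, Clock, ConsoleLogger, Greeter) are hypothetical and exist only for this illustration.

```csharp
using System;

static class Settings {
    // 1. A variable in global scope (a public static field is the closest C# equivalent).
    public static string Greeting = "Hello";
}

static class Clock {
    // 2. A static function: callable from anywhere.
    public static int CurrentHour() { return DateTime.Now.Hour; }
}

class ConsoleLogger {
    // 3. A type and its constructor: "new ConsoleLogger()" works anywhere too.
    public void Log(string message) { Console.WriteLine(message); }
}

class Greeter {
    public string Greet(string name) {
        // All three are reached directly. The caller has no seam through which
        // to hand Greet a different greeting, clock, or logger for this call alone.
        var logger = new ConsoleLogger();
        string text = Settings.Greeting + ", " + name + " (hour " + Clock.CurrentHour() + ")";
        logger.Log(text);
        return text;
    }
}

class Program {
    static void Main() {
        new Greeter().Greet("world");
    }
}
```

Nothing in Greet's signature hints at what it touches; you only find out by reading its body.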
Alternatives
There is, of course, an alternative to using globals. Simply put, it's just passing the variables around to any method or class that needs to use them. Recently this has been labeled as Dependency Injection, and it means that instead of accessing your variables from a global scope, you explicitly depend on them.
So for example, if this is a case of using globals:
class MyClass {
    public void doWork() {
        // ...
        Database.getInstance().getNewest("Thing");
        // ...
    }
}
Then the alternative, using DI (Dependency Injection) is:
class MyClass {
    private IDatabase database;

    public MyClass(IDatabase database) {
        this.database = database;
    }

    public void doWork() {
        // ...
        database.getNewest("Thing");
        // ...
    }
}
The upsides of this method will be covered in a bit (if you can't see them by yourself already).
General Cases
Since I'm going to focus on specifics, I'll only address two simple general questions.
Are There Benefits to Using Globals
From an old-fashioned and procedural point of view, "depending" on services is not so trivial. It's easier to understand when a variable is simply global, like an omnipresent entity, and we just access it whenever we want. That in itself is an advantage.
If you're not aware of the consequences, you might even consider it good practice. Some might even consider it a more flexible solution, as it allows any class or method to access that variable, without any setup.
Of course, the flexibility claim is false, simply because that's not the kind of flexibility we want to achieve. As programmers we need to define another type of flexibility that in some aspects is the complete opposite of the first type. This will be covered in a bit.
In addition, if your program is not built in such a way that it allows you to pass every last class all the dependencies it needs at the right time (right is kind of subjective in this sense), then it's obviously easier to just hook up a global, rather than reorganizing your initialization code.
Yes, it would "smell bad". Yes, it could get back at you someday, and that's nothing to take lightly. No, it wouldn't be my first choice, and yes, you could go back and fix it later. But when all is said and done, it can save you time and effort just to get you to ship your product on time. So that in itself is worth considering as an option.
It gets dangerous as you start to lose control, or become lazy. Either way, if you choose to use a global, and a day comes when you need to use it again in a different place - that waves a red flag. When red flags wave, you should refactor mercilessly.
Is Total Avoidance the Way to Go
I can't honestly claim to have the answer for this question. I'll tell you this, though: if you see nothing fishy, or any similarities between the examples (that I'll show you later on) and your situation, and you have reason to believe that using globals will actually be profitable in the short run and manageable in the long one, then I say just go for it, and if anything bad happens, make sure you take the time and learn from your mistakes, and do your best to fix them.
Also consider documenting the reason(s) you have for using them, just so future developers won't think you're an asshole, and so they'll be able to use their own judgment, instead of making unknowledgeable and possibly dogma-driven refactorings, which might somehow turn out harmful themselves.
Just in case, though; if you're in doubt about using or not using (hehe), try asking Google. Remember that you should try to get as many opinions as you can afford to, given your time constraints.
Recommended Avoidance
It doesn't take a lot of effort to understand the simplicity and elegance with which globals annihilate some of our aspirations. Namely: predictability, transparency, loose coupling, testable classes, etc. Let's have a look at an example:
class Database {
    private static Database instance = new Database();
    public static Database getInstance() { return instance; }
    private Database() { }
}

class PhoneBook {
    public string getPhoneNumberOf(string fullName) {
        return Database.getInstance()
            .Single(person => person.FullName == fullName).PhoneNumber;
    }
}
What can we conclude from this design? Well, PhoneBook can just pick out the Database globally, without having to depend on it explicitly and without forcing the creator of the PhoneBook to supply a Database. So that's a flexible solution, right? Wrong. That's not a flexible solution, that's a lazy and constrictive solution. Puzzled? Stay with me here...
As grown-up developers, we need to stop nurturing our despicable no-good laziness. "Do as little work as possible" is simply not what they meant at the university when they said "Good programmers are lazy". That was, apparently, an immensely harmful over-generalization. They just meant you shouldn't duplicate code, and added a touch of poisonous humor to it.
We by all means should never, ever be literally lazy. Indolence won't lead us anywhere worth going. In fact, we need to be extremely industrious and resourceful if we want to go somewhere. That's way, way on the other side of being lazy.
And yet, we naturally appreciate small, simple solutions - and that's ok. Coming up with small, simple solutions is a great bonus - but only if it doesn't stop you from reaching your goals. So yes, look for the simplest possible solution, but define your objectives well! Make sure you're not biased towards simplicity just because you don't want to bother thinking a bit more, at the cost of missing out on some great advantages (which, ironically enough, help you maintain simplicity in the long run).
Armed with this insight: what kind of flexibility do we really want, and why? What do we even mean by flexibility?
If you've been developing software for some time now, you know that there's no such thing as unchanging requirements. Software requirements are an inherently dynamic entity. Customers never know what they're always going to want, because the possibilities just grow with time as the software becomes a reliable and productive tool. This is also true (from experience) if the only customer is the only developer. This is practically always true.
So based on that, we need to be able to respond to change. And that's what we mean by flexibility. We mean that if the software requirements change, we want to be able to reassemble the software parts (or create new ones) so that the new assembly meets the new requirements. We don't want to have to delete old work and replace it with new work. Once we've written something and tested it, we strive to keep it. We also don't want any obstructive disturbances where assembling two or more parts together creates an unexpected result.
Let's look at our example again and see if we find anything wrong with it now:
class Database {
    private static Database instance = new Database();
    public static Database getInstance() { return instance; }
    private Database() { }
}

class PhoneBook {
    public string getPhoneNumberOf(string fullName) {
        return Database.getInstance()
            .Single(person => person.FullName == fullName).PhoneNumber;
    }
}
A few things come to mind:
Transparency and Predictability
If we did something like this:
PhoneBook phoneBook = new PhoneBook();
Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));
then we'd have to deduce a couple of things.
- This code might throw an exception, since we're creating a new PhoneBook and trying to access an existing entry, but there's no code inserting entries prior to that. Therefore it's reasonable to assume that no number for "Yam Marcovic" exists and this would throw an exception.
- If it doesn't throw an exception, we might think that the PhoneBook maintains a pre-defined default list of names, which contains the number for "Yam Marcovic".
What we really wouldn't want to think is that PhoneBook uses some kind of a heavy database in the background, and this getPhoneNumberOf method called on a brand new object would actually be a performance bottleneck in a simple program such as this.
But that indeed is the case here. This code is opaque and unpredictable. The ambiguity would just fade away if we had to write it like this:
PhoneBook phoneBook = new PhoneBook(new Database("<CONNECTION STRING>"));
Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));
or even better:
PhoneBook phoneBook = PhoneBook.FromDatabase(new Database("<CONNECTION STRING>"));
Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));
Now this code is transparent and predictable. A fellow developer looking at it would instantly be able to figure out what it does and what to expect of it.
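FromDatabase isn't defined anywhere in the snippets above, so here's one way it might look. Treat it as a sketch under assumptions: the Person type, the People list, and the in-memory Database standing in for a real connection are all invented for this illustration, and the phone number is a placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the article never defines Person.
class Person {
    public string FullName { get; set; } = "";
    public string PhoneNumber { get; set; } = "";
}

// Assumption: a Database is built from a connection string. Here it only holds
// an in-memory list, so the sketch runs without a real server.
class Database {
    public string ConnectionString { get; }
    public List<Person> People { get; } = new List<Person>();

    public Database(string connectionString) {
        ConnectionString = connectionString;
    }
}

class PhoneBook {
    private readonly Database database;

    // Private constructor: the only way in is a factory that names the data source.
    private PhoneBook(Database database) {
        this.database = database;
    }

    public static PhoneBook FromDatabase(Database database) {
        return new PhoneBook(database);
    }

    public string getPhoneNumberOf(string fullName) {
        return database.People.Single(person => person.FullName == fullName).PhoneNumber;
    }
}

class Program {
    static void Main() {
        var database = new Database("<CONNECTION STRING>");
        database.People.Add(new Person { FullName = "Yam Marcovic", PhoneNumber = "555-0100" });

        PhoneBook phoneBook = PhoneBook.FromDatabase(database);
        Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic"));
    }
}
```

The named factory does nothing a public constructor couldn't do; its value is that the call site now says out loud where the data comes from.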
Loose Coupling
What if we wanted a different kind of database, some place else in the program? For example, a different part of the program could well use all the logic in PhoneBook, and the fact that it is a PhoneBook (an instance of the class), but we want that part to get its data from a file, not a database. We want that, while a different part of the program still uses that database.
We just can't do that right now, because we're tightly coupled to a global variable. We could do something horrendous such as putting conditions inside the PhoneBook about whether to get the data from a file or from a database. But that's the thing - we don't want to have to come back and change the PhoneBook code every time we just want it to handle data from a different source. We want to be loosely coupled to the source of data. The opposite of being tightly coupled to it.
Here's a solution:
class PhoneBook {
    private IDatabase database;

    public PhoneBook(IDatabase database) {
        this.database = database;
    }

    public string getPhoneNumberOf(string fullName) {
        return database.Single(person => person.FullName == fullName).PhoneNumber;
    }
}
Now we just need to implement the IDatabase interface, and the PhoneBook will automatically know how to retrieve phone numbers from our data source! Now we can respond to changes much more effectively.
This is also called Dependency Inversion. The Iron Law of DI: If it could work without a dependency, it should work without a dependency.
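As a rough illustration of the file scenario described earlier, here's what a second IDatabase implementation might look like. The article never shows IDatabase or Person, so their shapes here are assumptions (IDatabase is taken to be an enumerable of Person, which is what lets LINQ's Single work on it), and FileDatabase and its "name,number" line format are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the article never defines Person.
class Person {
    public string FullName { get; set; } = "";
    public string PhoneNumber { get; set; } = "";
}

// Assumption: IDatabase is an enumerable of Person, so LINQ's Single() applies.
interface IDatabase : IEnumerable<Person> { }

// Hypothetical file-backed source: one "Full Name,Phone Number" entry per line.
// It takes the lines themselves, so File.ReadLines(path) can be passed in directly.
class FileDatabase : IDatabase {
    private readonly List<Person> people;

    public FileDatabase(IEnumerable<string> lines) {
        people = lines
            .Where(line => line.Trim().Length > 0)
            .Select(line => line.Split(','))
            .Select(parts => new Person { FullName = parts[0].Trim(), PhoneNumber = parts[1].Trim() })
            .ToList();
    }

    public IEnumerator<Person> GetEnumerator() { return people.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Same PhoneBook as above: it never learns where the data came from.
class PhoneBook {
    private readonly IDatabase database;

    public PhoneBook(IDatabase database) {
        this.database = database;
    }

    public string getPhoneNumberOf(string fullName) {
        return database.Single(person => person.FullName == fullName).PhoneNumber;
    }
}

class Program {
    static void Main() {
        // Placeholder data standing in for the contents of a file.
        var lines = new[] { "Yam Marcovic,555-0100", "Jane Doe,555-0199" };
        var phoneBook = new PhoneBook(new FileDatabase(lines));
        Console.WriteLine("Number: " + phoneBook.getPhoneNumberOf("Yam Marcovic")); // prints "Number: 555-0100"
    }
}
```

One part of the program can hand PhoneBook a FileDatabase while another hands it the real database, and PhoneBook itself doesn't change.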
Testing
There's not a whole lot to say here at this point. If you know what stubbing and mocking are, then just take what I said about transparency and loose coupling and see how it applies here. This way you can create cleaner, simpler, faster, more efficient, and longer-lasting tests, and keep your code clean of test-enabling behavior.
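If stubbing is new to you, here's a minimal hand-rolled sketch of the idea, with no test framework involved. As before, the shapes of Person and IDatabase are assumptions, and StubDatabase and the phone number are made up for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the article never defines Person.
class Person {
    public string FullName { get; set; } = "";
    public string PhoneNumber { get; set; } = "";
}

// Assumption: IDatabase is an enumerable of Person, so LINQ's Single() applies.
interface IDatabase : IEnumerable<Person> { }

// Hand-rolled stub: an in-memory stand-in holding exactly the data a test needs.
class StubDatabase : IDatabase {
    private readonly List<Person> people;

    public StubDatabase(params Person[] people) {
        this.people = new List<Person>(people);
    }

    public IEnumerator<Person> GetEnumerator() { return people.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

class PhoneBook {
    private readonly IDatabase database;

    public PhoneBook(IDatabase database) {
        this.database = database;
    }

    public string getPhoneNumberOf(string fullName) {
        return database.Single(person => person.FullName == fullName).PhoneNumber;
    }
}

class PhoneBookTest {
    static void Main() {
        var stub = new StubDatabase(
            new Person { FullName = "Yam Marcovic", PhoneNumber = "555-0100" });
        var phoneBook = new PhoneBook(stub);

        // A known name returns the number held by the stub.
        string actual = phoneBook.getPhoneNumberOf("Yam Marcovic");
        if (actual != "555-0100") throw new Exception("expected 555-0100 but got " + actual);

        // Single() throws InvalidOperationException when no entry matches.
        bool threw = false;
        try { phoneBook.getPhoneNumberOf("Nobody"); }
        catch (InvalidOperationException) { threw = true; }
        if (!threw) throw new Exception("expected a failure for an unknown name");

        Console.WriteLine("All checks passed");
    }
}
```

With the singleton version there's no way to write this test without a live database behind Database.getInstance(); here the test controls all the data and PhoneBook contains nothing that exists only for testing.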
Summary
While using globals often results in a bit less work during the initial stages, it can (and quite often does) get back at us later on. The simple solution is either not to use them at all, or alternatively, refactor them away when multiple components depend on them.
This article isn't a complete reference of when and when not to use globals, but it should have given you a good starting point for figuring things out on your own. Weigh your options carefully. Use globals sparingly.