Introduction
Abstraction is a foundational aspect of programming. It allows us, as programmers, to take a potentially complex, cumbersome, or simply inconvenient system and expose only what we deem important, whilst hiding any unnecessary details. The hope, of course, is that adding this abstraction makes the system more convenient to work with when solving your actual problem.
The following is a simple example of an abstraction:
class Counter {
    value = 0
    getValue() => value
    increment() => value += 1
}
let counter = new Counter()
counter.increment()
counter.increment()
assert(counter.getValue() == 2)
Counter hides the internal value state and exposes an easy-to-understand interface that only allows the user to increment the value by one at a time. This is, of course, an artificial limitation. If you had direct access to the value, you could do all sorts of crazy things, such as multiplications, divisions, maybe even a square root!
Let’s for a moment pretend that this abstraction represents a far more complex system, and that the increment function invokes a number of highly expensive operations which can take a non-trivial amount of time to complete. If this were the case, you might be reluctant to do what we did in the example above and call increment multiple times in a row. Instead, you would prefer a way to call it just once, allowing the underlying function to potentially optimise the process.
counter.increment(2)
This obviously changes the definition of the increment function, which in turn requires changes to the places in which this function is called. Perhaps this is not an issue, or maybe the altered definition and behaviour could be problematic in certain scenarios. The complexity of the situation will obviously vary based on how and where it is being used.
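To make the trade-off concrete, here is a minimal TypeScript sketch of the batched variant. The amount parameter and its default of 1 are assumptions made for illustration. A default value is one way to keep existing zero-argument call sites compiling, although it does not remove the need to check whether each caller still behaves as intended.

```typescript
// Hypothetical sketch: a Counter whose increment accepts a batch size.
class Counter {
  private value = 0;

  getValue(): number {
    return this.value;
  }

  // `amount` defaults to 1 (an assumption), so older calls such as
  // counter.increment() still compile and add one as before.
  increment(amount = 1): void {
    // In the imagined expensive system, the costly work would run once
    // per call here, rather than once per unit added.
    this.value += amount;
  }
}

const counter = new Counter();
counter.increment(); // old-style call, adds 1
counter.increment(2); // batched call, adds 2
const total = counter.getValue();
```

Whether a default like this is acceptable depends on the callers: it hides the signature change rather than forcing each call site to be reviewed.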
This is a bit of an abstract example of abstraction, but what I am trying to get at is that an abstraction isn’t automatically the right option in every situation. Maybe we didn’t need a Counter class here at all, and simply writing the logically obvious code inline would have been the better choice.
value += 2
In this case it is certainly less code, and it doesn’t lock us out of making changes in the future that might expand beyond simply incrementing that number.
Levels of abstraction
Abstractions aren’t necessarily all or nothing though. Instead you can choose to add a simple layer of indirection, and slowly build up the layers as the needs of your project demand. This is the strategy that I would personally advocate for, rather than starting with all the bells and whistles from the outset.
Below is a simple series of examples which hopefully demonstrates how something like this could look in practice.
The original code:
route('/login') {
    user = getUser(username)
    login_withEmailPassword(user)
}
The code below is one of the most basic forms of abstraction, although perhaps in the loosest sense of the definition. It is a simple switch statement over user.type, allowing a single route to support a potentially growing number of login providers:
route('/login') {
    user = getUser(username)
    switch (user.type) {
        case UserLoginType.EmailPassword => login_withEmailPassword(user)
        case UserLoginType.GoogleAuth => login_withGoogleAuth(user)
    }
}
Example two is functionally the same as the first, but hides some of the details from the final point of invocation:
interface LoginProvider {}
class EmailPasswordLoginProvider implements LoginProvider {}
class GoogleAuthLoginProvider implements LoginProvider {}
class LoginProviders {
    get(user) {
        switch (user.type) {
            case UserLoginType.EmailPassword => return new EmailPasswordLoginProvider()
            case UserLoginType.GoogleAuth => return new GoogleAuthLoginProvider()
        }
    }
}
route('/login') {
    user = getUser(username)
    loginProvider = LoginProviders.get(user)
    loginProvider.login(user)
}
From this point onwards the abstractions start to make more demands of you.
In addition to adding a case to the switch statement, you now also need to create a new class, which in turn implements the LoginProvider interface. Given the class separation, perhaps this starts to dictate the project file and directory structure too? Each new class in a new file, perhaps a login_providers directory?
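As a rough illustration of what that extra ceremony looks like when written out in full, here is a minimal TypeScript sketch of example two. The User shape, the string returned by login, and the handleLogin function standing in for the route are all assumptions made for the sake of a runnable example; none of them appear in the original pseudocode.

```typescript
// Hypothetical types; the original does not define User or a login result.
type UserLoginType = "EmailPassword" | "GoogleAuth";

interface User {
  name: string;
  type: UserLoginType;
}

interface LoginProvider {
  login(user: User): string;
}

class EmailPasswordLoginProvider implements LoginProvider {
  login(user: User): string {
    return `email-password:${user.name}`;
  }
}

class GoogleAuthLoginProvider implements LoginProvider {
  login(user: User): string {
    return `google-auth:${user.name}`;
  }
}

// The switch still exists; it has only moved behind LoginProviders.get.
const LoginProviders = {
  get(user: User): LoginProvider {
    switch (user.type) {
      case "EmailPassword":
        return new EmailPasswordLoginProvider();
      case "GoogleAuth":
        return new GoogleAuthLoginProvider();
    }
  },
};

// Stand-in for the '/login' route handler.
function handleLogin(user: User): string {
  return LoginProviders.get(user).login(user);
}

const result = handleLogin({ name: "ada", type: "GoogleAuth" });
```

In this sketch, adding a third provider means touching three places: the UserLoginType union, a new class, and the switch inside LoginProviders.get.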
Example three has all the same features as the previous example, but includes support for adding additional login providers at runtime:
interface LoginProvider {}
class EmailPasswordLoginProvider implements LoginProvider {}
class GoogleAuthLoginProvider implements LoginProvider {}
class LoginProviders {
    register(providers) {}
    get(user) {}
}
function setup() {
    LoginProviders.register([
        new EmailPasswordLoginProvider(),
        new GoogleAuthLoginProvider()
    ])
}
route('/login') {
    user = getUser(username)
    loginProvider = LoginProviders.get(user)
    loginProvider.login(user)
}
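The empty register and get bodies could be filled in many ways. One possible sketch in TypeScript follows. It assumes that each provider carries a type property identifying the users it handles, and that lookups go through a Map. Neither detail comes from the original, and the names User and LoginProviderRegistry are hypothetical.

```typescript
// Hypothetical shapes; the original pseudocode does not define them.
interface User {
  name: string;
  type: string;
}

interface LoginProvider {
  // Assumption: each provider names the user type it is responsible for.
  readonly type: string;
  login(user: User): string;
}

class EmailPasswordLoginProvider implements LoginProvider {
  readonly type = "EmailPassword";
  login(user: User): string {
    return `email-password:${user.name}`;
  }
}

class GoogleAuthLoginProvider implements LoginProvider {
  readonly type = "GoogleAuth";
  login(user: User): string {
    return `google-auth:${user.name}`;
  }
}

class LoginProviderRegistry {
  private readonly providers = new Map<string, LoginProvider>();

  // Later registrations for the same type replace earlier ones.
  register(providers: LoginProvider[]): void {
    for (const provider of providers) {
      this.providers.set(provider.type, provider);
    }
  }

  // The switch is gone, but so is the compile-time list of supported types:
  // an unregistered type is only discovered when a user tries to log in.
  get(user: User): LoginProvider {
    const provider = this.providers.get(user.type);
    if (provider === undefined) {
      throw new Error(`No login provider registered for "${user.type}"`);
    }
    return provider;
  }
}

const LoginProviders = new LoginProviderRegistry();
LoginProviders.register([
  new EmailPasswordLoginProvider(),
  new GoogleAuthLoginProvider(),
]);

const user: User = { name: "ada", type: "GoogleAuth" };
const result = LoginProviders.get(user).login(user);
```

Note the new failure mode this introduces: a user whose type has no registered provider now produces a runtime error, where the switch version listed every supported type in one place.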
Example four takes this to the logical next step. With the ability to dynamically add login providers at runtime, we likely would also want to be able to specify the supported providers via some form of configuration.
function setup() {
    config = loadConfigFile()
    LoginProviders.register(config.providers)
}
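One detail the pseudocode leaves open is how the entries in the config become provider objects. The TypeScript sketch below is one hedged guess: it assumes the config lists provider names as strings, and it uses a hard-coded object in place of loadConfigFile. The knownProviders table, the Config shape, and the createProviders function are all hypothetical.

```typescript
interface LoginProvider {
  readonly type: string;
}

class EmailPasswordLoginProvider implements LoginProvider {
  readonly type = "EmailPassword";
}

class GoogleAuthLoginProvider implements LoginProvider {
  readonly type = "GoogleAuth";
}

// Even with configuration, something still has to map the names in the
// config file to real code. This table is that mapping (an assumption).
const knownProviders = new Map<string, () => LoginProvider>([
  ["EmailPassword", () => new EmailPasswordLoginProvider()],
  ["GoogleAuth", () => new GoogleAuthLoginProvider()],
]);

interface Config {
  providers: string[];
}

// Stand-in for loadConfigFile(); a real version would read and validate a file.
function loadConfig(): Config {
  return { providers: ["EmailPassword"] };
}

// Turns the configured names into provider instances and fails loudly
// when the config mentions a name that has no matching code.
function createProviders(config: Config): LoginProvider[] {
  return config.providers.map((name) => {
    const factory = knownProviders.get(name);
    if (factory === undefined) {
      throw new Error(`Unknown login provider in config: "${name}"`);
    }
    return factory();
  });
}

const enabled = createProviders(loadConfig());
```

Under these assumptions the list of supported providers has not disappeared. It now lives in two places, the lookup table and the config file, which have to be kept in agreement.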
We can obviously keep going, piling layers upon layers of abstraction until the original four-line switch statement from the first example is nothing but a distant memory.
Is this level of abstraction necessary?
If you are working on a publicly available library or framework, and the way in which it will be used varies greatly from one project to the next, then the higher levels of abstraction which offer greater degrees of flexibility may start to make some sense. If you have used Laravel, Django, or one of the many popular web frameworks, you may even recognise this particular pattern in the above examples. Importantly though, abstractions such as these require you to produce solutions in a form that fits their narrative. Deviating from these established patterns will usually result in confusion among programmers experienced with these frameworks, poor performance, or bugs.
Are you writing a framework?
If you are not working on a publicly available framework (and let’s be honest, most of us are not), and your hand isn’t being forced by a project dependency, then surely the first option, the simple switch statement, makes the most sense? What our original problem demanded was a way to support multiple login providers. Wouldn’t abstracting this fundamental problem needlessly overcomplicate our code for little to no gain?
If we needed to expand our login route and switch statement further in the future, it would be trivial to add another branch and another function. At what point would this method become untenable? Ten providers, a hundred, maybe a thousand? To be honest I don’t think there is actually a limit where a simple switch statement like this doesn’t solve this particular problem perfectly. Would the higher levels of abstraction fare as well? In my mind these methods potentially degrade even more as the provider count increases.
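One practical worry with an ever-growing switch is forgetting a branch. In a language with union types, that can be checked at compile time. The TypeScript sketch below assumes hypothetical handler functions and a third provider, GitHubAuth, neither of which is in the original.

```typescript
// Hypothetical: a third login type added to the two from the examples.
type UserLoginType = "EmailPassword" | "GoogleAuth" | "GitHubAuth";

interface User {
  name: string;
  type: UserLoginType;
}

// Hypothetical handlers standing in for the real login functions.
function loginWithEmailPassword(user: User): string {
  return `email-password:${user.name}`;
}

function loginWithGoogleAuth(user: User): string {
  return `google-auth:${user.name}`;
}

function loginWithGitHubAuth(user: User): string {
  return `github-auth:${user.name}`;
}

function login(user: User): string {
  switch (user.type) {
    case "EmailPassword":
      return loginWithEmailPassword(user);
    case "GoogleAuth":
      return loginWithGoogleAuth(user);
    case "GitHubAuth":
      return loginWithGitHubAuth(user);
    default: {
      // If a new UserLoginType is added without a matching case, this
      // assignment stops compiling, so the missing branch is caught early.
      const unhandled: never = user.type;
      throw new Error(`Unhandled login type: ${String(unhandled)}`);
    }
  }
}

const result = login({ name: "ada", type: "GitHubAuth" });
```

Adding a provider here is still just another union member, another function, and another branch, which is the property the paragraph above relies on.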
Another common justification for starting with higher levels of abstraction is to protect yourself from future changes, which may require this kind of flexibility. The only advice I can really offer on this point is “don’t”.
If the goal of the project is to eventually convert part of the code into a reusable publicly available library, you may be compelled to start down this abstraction route immediately, but what happens if the business pivots and the goals change? Now you have a library which is adding complexity to a project that no longer needs a library.
Conclusion
While this particular example was easy to illustrate, this doesn’t cover all the possible scenarios where abstractions may help or harm your project. The primary message I am trying to convey here is that your goal from the outset should be to focus on solving your actual problem first, while keeping your solution as simple as you can, and to only add abstractions and complexity when a specific need arises.
This isn’t normally how programming is taught though. Over the last 30-40 years it has been popular to demand that you try to model your solutions around the real world. This mindset naturally encourages you to heavily abstract the underlying technical details, as these seldom reflect “real world” ideas or objects.
Instead of simplicity, you are asked to embrace immeasurable principles such as SOLID, and to follow the nebulous ideals of OOP and Clean Code, which demand that you abstract and separate from the outset and favour a certain subset of programming patterns, all without asking why, or whether this improves your project in a way that can actually be measured.
It only requires a little effort on your part to ask these questions for yourself, rather than blindly following someone else’s guidelines and continuing to pile layers onto an already monumental stack of abstractions.
Here are some simple guidelines that I personally try to keep in mind when writing code, so perhaps they might be useful to you too:
- The goal should never be to find the perfect solution on your first go, but instead you should aim to produce simple solutions that solve your actual problem first, and to slowly expand on this over time
- The higher the degree of complexity, and the more layers involved, the more resistant the code is to being thrown away. Putting that much effort into a solution will make you very reluctant to just bin it, even if that is clearly the best option
- Each abstraction added takes something away from the underlying system. Think carefully before you limit your future solutions unnecessarily
- Consider where your project may go in the future, and write code that can be easily replaced or moulded to fit if necessary
- Favour simple pure functions that can be picked up and moved to a different file, module, or even entirely different codebase if necessary
- Create automated tests for important systems to increase your confidence for when you do need to refactor in the future
- Add comments to code where non-obvious business decisions have been made to give others, and the future you, the missing context for why you solved it this way
- Abstract only when the current implementation starts to become provably cumbersome to work with, not purely because someone told you this is how code should be written
I’m interested to hear your thoughts and comments on any of the above, especially if you disagree at all. I am always looking to learn.