Beyond Abstractions - A Theory of Interfaces

This article aims to express the mental model that I have built over the last few years for thinking about human-computer interfaces and software, and about how we might produce a step-function improvement in the way we build and use software. A new way to program, if you will.

Note that this is more of a mental-model/philosophy than a falsifiable scientific theory, and not fundamentally novel.

For many years I have been searching for an idea to work on within software that is deep, impactful, personally fulfilling, and one that enables a massive business to be built on top of it. I started exploring everything related to software, from the origins and history of computing, to experiments trying new ways of doing things, and even new hardware that unlocks different approaches to software design.

I am happy to say the search has been both interesting and fruitful in that it evolved into a funded startup that I am super excited about and have been doing full-time for almost a year now, but more on that later. It’s time to talk about the core of this article.

The word I will use for what I am trying to describe is: Interface.

Feel free to reach out:

X: @bloeys

Personal email: [email protected]

Work email: [email protected]

Interfaces

Interfaces in this context are more of an “I know it when I see it” kind of thing because their characteristics are difficult to define exactly, so I’ll be providing ample examples and analogies to hopefully make the idea very clear. With that being said, it’s worth noting that the core aspects of an interface include input, output, structure, and how they are all combined to produce some wanted functionality in a maximally efficient manner.

For example, a calendar (e.g., Google Calendar) can be represented by a single scrolling list or by the grid view we are all used to. Changing from list to grid does not offer any fundamentally new abilities. Both are equally powerful (you can see/modify appointments in both), but the grid is a much better interface.

A grid structure makes better use of modern output (i.e., screens), and with an input system that makes use of this structure (e.g., dragging an appointment diagonally to change its date and time) you get massive efficiency improvements: you can now understand an entire month’s worth of information in an instant and make complex modifications quickly via intuitive movements. So interfaces don’t, in theory, give new abilities, but good interfaces are features.
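As a minimal sketch of the “equally powerful” claim, the Python below renders one and the same set of appointments as a list and as a month grid. The appointment data and both function names are made up for illustration; only the standard `calendar` module is real.

```python
import calendar

# Hypothetical appointments for one month: day of month -> title.
APPOINTMENTS = {3: "Dentist", 14: "Demo", 27: "Flight"}


def list_view(appointments):
    """One appointment per line, in day order."""
    return [f"{day:2d}: {title}" for day, title in sorted(appointments.items())]


def grid_view(year, month, appointments):
    """Month grid, one row per week; days with an appointment are marked with '*'."""
    rows = []
    for week in calendar.monthcalendar(year, month):
        cells = []
        for day in week:
            if day == 0:  # padding day that belongs to another month
                cells.append("   ")
            else:
                mark = "*" if day in appointments else " "
                cells.append(f"{day:2d}{mark}")
        rows.append(" ".join(cells))
    return rows


print("\n".join(list_view(APPOINTMENTS)))
print("\n".join(grid_view(2024, 6, APPOINTMENTS)))
```

Both functions consume the same dictionary, so neither can express something the other cannot. The grid simply lays the days out in weekday columns, which makes the shape of the month visible at a glance.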

From this we see that our interface isn’t the UI, it’s the interactive system that results from the interplay of the different elements without losing power in the functionality of interest. A better interface can make new things possible without new fundamental abilities by broadening the scope of what’s practical.

To make this concrete, let’s discuss speed in software and how just by being faster you unlock new abilities. Imagine a paint program where you draw with a pen tool, but the lines you make don’t show up immediately. Instead, after you ‘draw’ some lines by clicking and dragging around, the program loads for 5 seconds, then shows you the new image (e.g., you see your blue line). Now say you made a mistake. In that case you click undo, which loads for 5s, then you draw the corrected line, then wait another 5s and pray to God it’s correct this time.

By simply being slower (without reduction in features) the scope of what’s realistically possible has been greatly reduced. Very simple drawings might be done with this, but not even a middle school level drawing will be created with this program, because while ‘possible’, it is not practical. Speed is one aspect of interfaces.

Here we need to make a distinction between possible, practical, and easy. Possible is painful or even borderline useless, practical is what we want, and easy is the ideal. To illustrate, let us look at how these might look for someone navigating to a shop:

Possible: Ask for verbal directions (e.g., go left, once you see the sign go right, etc.)
Practical: Old GPS devices like TomTom
Easy: Google Maps with voice input and auto-rerouting

Note that these three are a spectrum, and I would argue most things we have today are between possible and practical. For example, I wouldn’t put a 10s compilation time in something frequently changed as 100% practical, but it’s not merely possible either.

This discussion of interfaces extends beyond software. Take one of the most impactful advancements in human history, the invention of modern Mathematical notation. Here is how we used to do math before:

The above is written by Al Khwarizmi and taken from the talk Media for Thinking the Unthinkable by Bret Victor. Natural language was how Math used to be done. Here is what all the confusing writing above stands for:

$$x^2+10x=39$$

This new notation is just a new interface with which to do Math. It didn’t change the reality of what Math is, but from a practical perspective it is revolutionary.

For one, it made the ‘structure’ of Mathematics visual instead of purely logical: we can now see instead of just think. With the structure clear, new ideas flow naturally, like how some Mathematical operations (e.g., rules of Algebra) can now be done mechanically through this interface (e.g., moving a “+” term to the other side of the equation turns it into “-”), which makes work much easier.
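As a worked illustration (mine, not taken from the talk), here is the equation above solved by completing the square. Each line follows from the previous one by a purely mechanical rule, and only the positive root is shown:

$$
\begin{aligned}
x^2 + 10x &= 39 \\
x^2 + 10x + 25 &= 39 + 25 \\
(x + 5)^2 &= 64 \\
x + 5 &= 8 \\
x &= 3
\end{aligned}
$$

Checking the result: $3^2 + 10 \cdot 3 = 9 + 30 = 39$.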

As this new interface made Mathematical work generally easier, it made Math accessible to more people and raised the ceiling of what problems can be tackled. The interface made some things so easy we now teach them to children!

Another revolution was the move from Roman numerals to Hindu-Arabic numerals. While it might just seem like a different way of writing the same thing, Roman numerals were actually so cumbersome to work with that multiplication was considered an advanced subject, similar to how we view calculus today!

XV is 15, but in the Roman setup X is 10 and V is 5, so you have to add the different values to get the final number, and the symbols carry no place value. In ‘15’, the position of the 5 tells you it counts ones, while the position of the 1 tells you it counts tens, and this gives rise to very convenient reading and computation. The positions of X and V, on the other hand, do not encode a place value, so the structure doesn’t help in computation.
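To see the difference in code, here is a small sketch that reads a value out of each notation. The function names are my own, and the subtractive rule (IV for 4), which the text above does not mention, is included only so the Roman reader handles common numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Value of a Roman numeral: add symbol values, subtracting a symbol
    when a larger one follows it (as in IV = 4)."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < next_value else value
    return total


def positional_to_int(digits, base=10):
    """Value of a positional numeral: each step shifts one place to the left."""
    total = 0
    for digit in digits:
        total = total * base + int(digit)
    return total


print(roman_to_int("XV"), positional_to_int("15"))  # 15 15
```

The Roman reader needs a symbol table and a special case, while the positional reader applies one uniform rule per digit. That uniformity is what written arithmetic later builds on.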

Again we see that a change in interface that reduces mental load and increases intuitiveness expands the boundary of what’s practical and easy, allows deeper understanding, allows novices to do what once belonged to experts, and allows experts to explore uncharted territory.

I hope this makes it clear that work on, and improvements of, interfaces are some of the most important and impactful technical projects we can undertake. Now let’s talk about programming.

Programming as an Interface

Deep underneath, programming is just Mathematics. Computers, their powers, and their limitations are all encapsulated in a Turing Machine, which is a mathematical construct. Not only can a Turing machine do Math, but programs can be turned into pure Math equations.

Programming languages are equivalent in power to Turing machines (both are Turing complete), but good luck writing a program on a Turing machine. Programming languages and Turing machines are both interfaces to the computation represented by Turing machines, but modern programming languages are the better interface in every way.
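To get a feel for the gap, here is a rough sketch of a Turing machine that adds one to a binary number, next to the same computation in a programming language. The state names, the rule table, and the `run` helper are my own choices for illustration.

```python
BLANK = "_"

# Transition table for binary increment:
# (state, symbol read) -> (symbol to write, head move, next state)
RULES = {
    ("carry", "1"): ("0", -1, "carry"),  # 1 + carry = 0, keep carrying left
    ("carry", "0"): ("1", 0, "halt"),    # 0 + carry = 1, done
    ("carry", BLANK): ("1", 0, "halt"),  # ran off the left edge: new leading 1
}


def run(tape_text):
    """Run the machine on a binary string, head starting on the last bit."""
    tape = dict(enumerate(tape_text))  # sparse tape: position -> symbol
    head, state = len(tape_text) - 1, "carry"
    while state != "halt":
        write, move, state = RULES[(state, tape.get(head, BLANK))]
        tape[head] = write
        head += move
    return "".join(tape[i] for i in sorted(tape))


print(run("1011"))  # 1100
print(bin(0b1011 + 1)[2:])  # the same computation in a programming language
```

Both lines print the same result. The machine needs a hand-written rule for every situation the head can encounter, while the language expresses the whole thing as `+ 1`.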

Building and/or discovering a good interface is difficult, but revolutionary once achieved. We wouldn’t have the scale, ubiquity, or accessibility of software without the modern programming interface.

Similarly, Assembly is an interface to Machine language. Higher level languages (e.g., C, Go, Python, etc.) can also be said to be interfaces of Machine language, but that’s not exactly true as we will see later.

Abstractions

So far I have been careful not to use the word ‘abstraction’, but by this point some people are probably saying everything we have discussed so far is ‘just abstractions’. They aren’t.

In programming, abstraction is generally understood as the hiding of complexity or details to make the usage of some system easier and to help the programmer focus on the wanted result. The Wikipedia page on abstractions has a quote with the same meaning:

The essence of abstraction is preserving information that is relevant in a given context, and forgetting information that is irrelevant in that context.

– John V. Guttag

Abstractions remove details of a system and provide an ‘outer layer’ (aka API) through which to use a subset of the system. The fundamental distinction between interfaces and abstractions is that abstractions lose power compared to the original system.

Since the underlying system, by definition, has more details than is shown by the abstraction and since those details inevitably become needed, an abstraction always reduces the power the user has over the system and makes the system more opaque, for the tradeoff that the provided API is (hopefully) better for certain use cases.

This loss of power inherent in abstractions is why we have ‘leaky abstractions’. When an abstraction forces the programmer to learn about internal details that it was supposed to hide, the abstraction has ‘leaked’, and this leakage happens because some detail eventually finds its way out and forces the programmer to deal with it.

One common example is how any sufficiently complex system using a database ORM (a high level DB library) will be forced to do one or more of the following:

Adjust an ORM query for performance reasons because of how the ORM does queries (leaks details of how the ORM and DB are implemented)
Write a native query because the ORM doesn’t support some feature (leaks the DB query language)
Change the schema because the ORM doesn’t support some DB types (leaks ORM implementation)
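The second case can be sketched with a toy example. `TinyORM` below is a hypothetical, deliberately minimal ORM written for this illustration (it is not any real library), running on Python’s built-in `sqlite3` module.

```python
import sqlite3


class TinyORM:
    """Hypothetical, deliberately minimal ORM: equality filters only."""

    def __init__(self, conn, table):
        self.conn, self.table = conn, table

    def filter(self, **equals):
        # Column names are trusted here; a real ORM would validate them.
        where = " AND ".join(f"{column} = ?" for column in equals)
        sql = f"SELECT * FROM {self.table}" + (f" WHERE {where}" if where else "")
        return self.conn.execute(sql, tuple(equals.values())).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("ana", 10.0), ("ana", 25.0), ("bo", 40.0)],
)

orders = TinyORM(conn, "orders")
print(orders.filter(customer="ana"))  # the abstraction covers this case

# "Total spent per customer" needs GROUP BY, which TinyORM cannot express,
# so the caller drops down to the database's own query language.
totals = conn.execute(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The moment the aggregate query is needed, the caller has to know the table name, the column names, and SQL itself, which are exactly the details the ORM was supposed to hide.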

I hope it’s now clear that practically all current libraries/frameworks/APIs/etc. are abstractions, not interfaces, as they all suffer from loss of power (and thus leakage).

Interfaces are special because they represent the entire system. That’s why you might wish to use raw SQL instead of an ORM, but you never wish to use natural language instead of modern Mathematical notation, nor a Turing machine instead of a programming language: those are better interfaces, and are thus both complete and more ergonomic.

We can now summarize abstractions as having the following properties:

Represents only a subset of a system
Is less powerful than the underlying system
Guaranteed to leak under heavy usage (due to the above points)

Interfaces are Not Abstractions

To understand the properties of what we mean by an interface and how they compare to abstractions, I think it’s best to look at some of our examples again.

Let’s take modern Mathematical notation which looks like $x^2+y^2=z^2$. Compared to doing Mathematics in natural language, this notation does not hide details or complexity, does not provide a ‘narrower API’ for interaction, and does not have any loss of power. The modern notation is equivalent in terms of power to natural language when looking at the theoretical ability of representing Mathematical statements, but better suited to display the inherent properties of Mathematics and to manipulate them.

While abstractions are imperfect models of a system, interfaces are the system. In a sense, an abstraction is a failed interface.

One might ask: if the interface is equivalent to the original system, what is the point of it? While this has been answered in an ‘intuitive’ way via the previous examples on programming languages, numerals, Mathematical notation, and calendars, I’ll attempt to give a more ‘logical’ answer.

An interface represents the entirety of a system. A better interface represents the entirety of a system in a more practical way. ‘Entirety’ is important here because it implies no specialization of the system (abstractions specialize). The word ‘practical’ is intentionally vague because interfaces can provide value in multiple ways, like:

Make the system more understandable
- In Assembly, reading MOV is clearer than 10001001.
- In modern Mathematical notation, $x+1=2$ is clearer and more exact than a natural language sentence, and the notation removes almost all words of natural language, most of which had no purpose in expressing the Math itself like the/and/etc.
  - Note that this is not complexity hiding, rather this is getting rid of things that if removed, the system would still be complete.
Make working with the system easier
- In Assembly, MOV is easier both to type and to remember than 10001001.
- In modern Mathematical notation, algebraic manipulation is not only easy but is possible to do mechanically, while it’s impossible to mechanically operate on sentences of natural language.
- Multiplication is easy in Hindu-Arabic numerals but very hard in Roman numerals.
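The Assembly example can be sketched as a two-way lookup. The table below is a toy of my own: it uses a handful of single-byte x86 opcodes and ignores operand bytes, as well as the fact that in real x86 one mnemonic maps to several opcodes depending on its operands.

```python
# Toy table of single-byte x86 opcodes. Real instructions carry further
# operand bytes; only the opcode byte is modelled in this sketch.
MNEMONIC_TO_OPCODE = {
    "ADD": 0b00000001,
    "SUB": 0b00101001,
    "MOV": 0b10001001,
    "NOP": 0b10010000,
    "RET": 0b11000011,
}
OPCODE_TO_MNEMONIC = {opcode: name for name, opcode in MNEMONIC_TO_OPCODE.items()}


def assemble(mnemonics):
    """Translate a list of mnemonics into opcode bytes."""
    return bytes(MNEMONIC_TO_OPCODE[name] for name in mnemonics)


def disassemble(machine_code):
    """Translate opcode bytes back into mnemonics."""
    return [OPCODE_TO_MNEMONIC[byte] for byte in machine_code]


program = ["MOV", "ADD", "RET"]
machine_code = assemble(program)
print([format(byte, "08b") for byte in machine_code])
assert disassemble(machine_code) == program  # nothing was lost either way
```

Within this toy, the translation is reversible in both directions, which is the sense in which the mnemonics neither hide nor drop anything. They only make the same bytes readable.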

It should be noted that neither abstractions nor interfaces expand the fundamental capabilities of a system, as that would mean we now have a new system altogether.

Mechanical algebraic manipulation might seem like a new capability, but it isn’t. You can represent (i.e., describe) the manipulations that should happen in natural language just as you can do them in the modern notation, and as such it is not a new ability.

Rather, what happened (and one of the reasons the modern notation is better) is that the new interface mirrored an aspect of the underlying system in its structure, and thus interactions with and manipulations of the interface become interactions with and manipulations of the underlying system! Natural language, on the other hand, mirrored the system in meaning, but not in structure.

A similar thing happened in the transition from Roman to Hindu-Arabic numerals. Both systems can represent any number, but the Hindu-Arabic numerals encoded numbers as a series of multipliers of powers of ten (i.e., ones place, tens place, hundreds place, etc.) and made the physical order of digits signify which multiplier applied to which digit.

Just as with modern Mathematical notation, this setup of numerals mirrored an aspect of the system in its structure, and thus manipulations of the structure led to meaningful manipulations of the underlying system.
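A short sketch of what “manipulating the structure” means for positional numerals: schoolbook addition written so that it only ever looks at single digits and their positions, never at the numbers as wholes. The function name is my own.

```python
def add_digits(a, b):
    """Schoolbook addition on decimal strings, right to left with a carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # line the places up
    carry, result = 0, []
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(digit_a) + int(digit_b) + carry, 10)
        result.append(str(digit))
    if carry:
        result.append(str(carry))
    return "".join(reversed(result))


print(add_digits("478", "256"))  # 734
```

Every step is a lookup on two digits plus a carry, repeated once per place. Because the places line up with powers of ten, shuffling symbols this way is guaranteed to be a correct operation on the numbers themselves.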

When an interface mirrors the underlying system’s structure in an appropriate way, interactions with the system become easier because they become intuitive interactions with the interface, which in turn increases how much of the system we can practically use.

To summarize, the fundamental property of interfaces is: They represent the entirety of the underlying system. They are the system.

Additionally, good interfaces have the following extra properties:

Some or all of the underlying system is mirrored in their structure. The more the better.
The amount of the system that’s practically usable is increased by adapting input and output to enable efficient structure understanding and manipulation.

The Spectrum of Representation

One should be careful not to think in terms of black/white when it comes to abstractions and interfaces, because most things actually fall on a spectrum between the two.

Interfaces are extremely hard to achieve, because the representation of the system must be complete for one to have an interface. Anything that is not an interface is by definition an abstraction: not being an interface means some parts of the system aren’t represented, which incurs a power loss, and those are the hallmarks of abstraction.

The above graphic is obviously very high level, as for example different libraries will be closer or further away from being an interface, and programming languages are interfaces of Turing machines but not of modern CPUs (e.g., you might need to write assembly for some instructions that aren’t available to you in a high level language), but nevertheless I hope it helps clarify the discussion so far.

Looking at the example spectrum of representations, one might be tempted to think that interfaces are always harder to use and abstractions easier. While that is true sometimes, it is not a rule. It’s only true when our interface isn’t good (or perhaps the system is of extreme inherent complexity? I am not sure.).

Roman numerals and Hindu-Arabic numerals are both interfaces and occupy the same position on the spectrum, yet one is obviously better. The same goes for the transition from natural language Mathematics to modern notation, from Machine language to Assembly, and from list calendars to modern grid calendars.

As such, it is not the case that we should move to abstractions. Rather, we should be working to replace our abstractions and current interfaces with good interfaces; we should be moving between interfaces. The only reason we have so many abstractions and so few interfaces is that interfaces are extremely hard to build/discover. I am not aware of a methodology to create an interface for a given system.

Finding or creating new interfaces is hard because:

An alternative and complete representation of the system must be found.
The new representation must be more practical.
What things are part of the system, and what a system even is, is hard to define.
- This reminds me of the struggles of object recognition before deep neural networks. You just can’t write a program to detect a cat, because you intuitively understand but can’t specify what a cat is.

In a way, since an interface is the system, you must first understand the entirety of the system (or whatever we discovered from it so far), and then reproject it from your mind into an improved representation. Most probably though, it’s a combination of deep understanding, reprojection, and evolving into an interface, because when have we ever understood the entirety of something?

A Note on AI

I am not sure if AI itself is an interface nor what it is an interface of (an abstraction of a part of human consciousness, maybe?), but I wanted to note that interfaces affect AIs just as much as they affect humans!

The AI, same as a human, uses the interfaces it has available to it, and as such interface improvements would help both humans and AIs. For example, if I give an AI 100 code files of a mobile app and tell it to do something, well, good luck to it. On the other hand, if I give it those files along with information of, say, the screens of the app and which files are used by which screens and so on, then the AI can do a much better and faster job. Both are technically equivalent, but one is a better interface for the AI, and a human would fare similarly.
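As a rough sketch of what that extra information could look like, here is a screen-to-files map and a helper that narrows a task down to the files that matter. Every screen name, file name, and function name here is invented for illustration.

```python
# Hypothetical app layout: which source files each screen depends on.
SCREEN_FILES = {
    "login": ["login_view.dart", "auth_service.dart"],
    "cart": ["cart_view.dart", "cart_store.dart", "pricing.dart"],
    "checkout": ["checkout_view.dart", "cart_store.dart", "payment_api.dart"],
}


def files_for_task(screens):
    """Files relevant to a task that touches the given screens, without duplicates."""
    seen = []
    for screen in screens:
        for path in SCREEN_FILES[screen]:
            if path not in seen:
                seen.append(path)
    return seen


print(files_for_task(["cart", "checkout"]))
```

The code files are the same in both scenarios. The map only adds structure, so a task about the checkout flow starts from five files instead of the whole project.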

That is to say, a change in only the interface can significantly boost AI performance. Computers have generally been strong where humans are weak and vice versa, but LLM AIs seem to share both our strengths and weaknesses.

Rethinking Programming

At this point I hope it’s clear what I mean when I say “I want to find a new way to program”. What I am saying is that I want to find a new and better interface for programming!

Good interfaces can be revolutionary and are of such immense value that they are worth the hard effort. If programming were to get an improved interface, not only would programming open up to a much wider segment of the world, but the experts would have the ability to solve harder problems and to more easily create systems like what we have today, only better, faster, and in a more maintainable way.

With that being said, one shouldn’t think of it as an all-or-nothing bet. Even if you weren’t able to create an interface, if you were able to create an abstraction that’s extremely close to being an interface (i.e., represents most of the system) and that’s much more practical to use, then that also provides great value.

We don’t know how to approach building a new interface for all of programming (i.e., an alternative to programming languages). I certainly don’t (email me if you do!). But perhaps we can build interfaces for certain domains, like a better interface for backends.

To try and get you thinking, we can ask: why are we still programming by executing code in our head instead of seeing the effects of our code in real time? (tip: it’s not hot reloading) Why are DB schemas still so painful to modify? Why can’t a mobile developer visually rearrange the flows of an app while seamlessly adjusting their logic?

I believe part of the reason why interface creation in software stagnated is that we consider it best practice and good business to consider only small slices of a problem space (e.g., most or all modern SaaS). Interfaces require taking a holistic view of a problem space and building the solution all at once, or if not all at once then at least the end solution (after evolving from an abstraction) should cover all aspects of the problem space.

I don’t blame anyone, this is complex and requires tons of work and lots of time, but at least some people should always be working on new interfaces for their field, be it programming or biology.

While I don’t have a methodology to build interfaces, it’s safe to say that unlocking your mind is a necessity, and when I say unlock I mean really unlock it and let it be free to imagine the craziest things.

Say you think of a new, better interface for programming, but it will be too slow on modern von Neumann hardware. Should you drop the idea? No! Instead, be brave and say: if this is the better approach for software, then we should rethink the hardware to enable the improved approach.

Rethinking hardware might be surprising at first, but it’s a natural consequence of our discussion so far, because if your interface is a new structure, it might need input/output to adapt, which can naturally lead to new hardware being required. Also, this is not a new system, as the new hardware is just an interface of a Turing machine ;)

What I am Doing About It

Improved software interfaces are exciting to me. They are a deep software topic, satisfying to work on, and when achieved will lead to immense value for the people who created them and for their users.

Using the philosophy outlined in this article, and with my interest in this work and the opportunity I see, I founded Overlord Systems to build the next generation software interface for web development (and perhaps later, for all of programming!).

We are starting with a new interface for cloud infrastructure and backend, and will (insha Allah with time) expand to cover all domains of web development. We built a PoC that, among other things, allowed an AI agent to do certain tasks using 10x fewer tokens and 2x faster than Cursor (!), raised a pre-seed round, and are now working on a public beta 🚀.

If this article was at all interesting you will definitely want to follow our work, so put your email here: Overlord Systems.


