Leaky Abstractions
In any large system of many components, it's impossible to keep in view full implementation details of all components. To manage complexity better, it's helpful to abstract away the details of other components when working on a particular component. Each component talks to other components via interfaces without worrying about the implementation details of those components. It's for this reason that abstractions are used.
Sometimes an abstraction is not perfect. When an abstraction fails to hide some of the underlying implementation details, we call this a leaky abstraction. In this case, clients of that interface may see wrong or unsatisfactory behaviour. Clients can mitigate this by considering the implementation details behind the interface and changing the way they use the interface.
Discussion
-
Could you explain leaky abstractions with an example?
Let's take the example of hashing that takes plaintext and produces a hash out of it. This problem is so well defined that application programmers rarely need to write their own implementation. Many libraries are available for hashing and their interfaces nicely abstract away the implementations. Programmers rarely need to bother about how the hashing is implemented. They can simply treat hashing as a "black box". Hashing is an example of a good abstraction.
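As a minimal sketch of the "black box" view, the snippet below uses Python's standard hashlib module. The helper name fingerprint is an assumption made for illustration; the caller picks an algorithm by name and never touches its implementation.

```python
import hashlib


def fingerprint(plaintext: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of plaintext using the named algorithm."""
    # hashlib.new() selects the implementation; the caller never sees it.
    return hashlib.new(algorithm, plaintext.encode("utf-8")).hexdigest()


digest = fingerprint("hello world")
print(len(digest))                           # 64 hex characters for SHA-256
print(digest == fingerprint("hello world"))  # True: same input, same hash
```

Swapping "sha256" for another algorithm name changes the result but not the way the function is called, which is what a well-behaved abstraction should allow.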
An example of a leaky abstraction is Axios, a JavaScript HTTP client commonly used in browsers in place of the fetch API. When a response carries an HTTP error status, Axios by default rejects the request's promise, turning the response into a JavaScript error. This differs from fetch, which resolves its promise even for an HTTP 404 response. The Axios behaviour may work for many use cases but it's not the general case. Some applications may not want this behaviour.
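The difference can be sketched without a network. The code below is not Axios or fetch: low_level_get, high_level_get, Response, HttpError and the PAGES table are all hypothetical stand-ins that only mimic the two styles of error handling described above.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    body: str


class HttpError(Exception):
    def __init__(self, response: Response):
        super().__init__(f"HTTP {response.status}")
        self.response = response


# Hypothetical stand-in for a server: known paths return 200, others 404.
PAGES = {"/users/1": "alice"}


def low_level_get(path: str) -> Response:
    """fetch-like: always returns a Response, whatever the status."""
    if path in PAGES:
        return Response(200, PAGES[path])
    return Response(404, "not found")


def high_level_get(path: str) -> Response:
    """Axios-like: coerces any non-2xx status into an exception."""
    response = low_level_get(path)
    if not 200 <= response.status < 300:
        raise HttpError(response)
    return response


def user_exists(path: str) -> bool:
    # With the low-level call, a 404 is ordinary data.
    return low_level_get(path).status == 200


def user_exists_via_wrapper(path: str) -> bool:
    # With the wrapper, the caller must know that a 404 arrives as an exception.
    try:
        high_level_get(path)
        return True
    except HttpError as error:
        if error.response.status == 404:
            return False
        raise
```

Both functions answer the same question, but the second one only works if its author knows how the higher-level client treats status codes.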
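Consider a database search. A MySQL query containing LIKE 'abc%' is fast but one containing LIKE '%abc%' is slow. This is because MySQL indexes are typically B-trees, which keep keys in sorted order. A known prefix narrows the search to one range of the index, whereas a leading wildcard gives the index no starting point and the whole table or index has to be scanned. Thus, the implementation is exposed and clients have to be aware of this.
-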
Which are the different types of leaky abstractions?
In some examples of leaky abstractions, we find that performance is affected. A MySQL query may run a lot slower than expected. An array access may take a lot longer than expected.
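The slow LIKE query is a performance leak of this kind. The following is a rough sketch of why a sorted index helps prefix matches but not substring matches. A sorted Python list stands in for the index; this is an assumption made for illustration, not how MySQL stores its indexes, and the function names are hypothetical.

```python
from bisect import bisect_left

# A sorted list stands in for an index; real database indexes are more elaborate.
index = sorted(["abacus", "abc", "abcde", "abcz", "abd", "xabc", "zzabcx"])


def prefix_search(keys: list[str], prefix: str) -> list[str]:
    """Like LIKE 'abc%': binary-search to the first candidate, then read a short run."""
    start = bisect_left(keys, prefix)
    matches = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order guarantees no later key can match
        matches.append(key)
    return matches


def contains_search(keys: list[str], needle: str) -> list[str]:
    """Like LIKE '%abc%': sorted order gives no starting point, so check every key."""
    return [key for key in keys if needle in key]


print(prefix_search(index, "abc"))    # ['abc', 'abcde', 'abcz']
print(contains_search(index, "abc"))  # ['abc', 'abcde', 'abcz', 'xabc', 'zzabcx']
```

The two queries look almost identical to the caller, yet only the first can exploit the ordering of the underlying structure.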
In other examples, we find that the behaviour is not as expected of the abstraction. An HTTP 404 status code is coerced into a JavaScript error. A database orchestration layer promises support for transactions when in fact it can't achieve this when dealing with multiple SQL and NoSQL databases. A single call to the service layer results in six HTTP calls when in fact the caller expects only one.
Another variant, or perhaps a related leak, is called a technical leak. This can be stated as "it compiles, but doesn't work". For example, an interface would work only if its methods are called in a certain order. This is a technical leak that's called temporal coupling. Another example is having to initialize an object before using it, or having to close a database before destroying the object. Technical leaks require developers to learn something about the implementation even when it has nothing to do with business logic.
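Temporal coupling can be illustrated with a small sketch. ReportWriter is a hypothetical class invented for this example: its write() method only works between open() and close(), and nothing in the method signatures says so. Adding context-manager support is one possible way to build the required order into the interface.

```python
class ReportWriter:
    """Hypothetical interface whose methods only work in a fixed order."""

    def __init__(self):
        self.lines = None   # not usable until open() is called
        self.closed = False

    def open(self):
        self.lines = []

    def write(self, text):
        if self.lines is None or self.closed:
            raise RuntimeError("write() called before open() or after close()")
        self.lines.append(text)

    def close(self):
        self.closed = True

    # Context-manager support moves the ordering rule into the interface itself.
    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc, traceback):
        self.close()
        return False  # do not swallow exceptions


# Looks like valid use of the interface, yet fails at run time: the leak.
writer = ReportWriter()
try:
    writer.write("total: 42")
    failed = False
except RuntimeError:
    failed = True
print(failed)  # True

# The with-statement makes the correct order the only convenient one.
with ReportWriter() as report:
    report.write("total: 42")
print(report.lines)  # ['total: 42']
```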
-
What does the Law of Leaky Abstractions say?
This is a phrase coined by Joel Spolsky in 2002. It says,
All non-trivial abstractions, to some degree, are leaky.
Spolsky gives many examples where leaky abstractions arise. TCP provides higher layers reliable transport and delivery of packets. However, TCP can't do anything if a cable is cut or there's an overloaded hub along the way. The abstraction therefore leaks.
When iterating over a 2-D array, performance shouldn't differ if you iterate by rows versus by columns. Ideally, a programmer should not care about how the array is stored in memory. In reality, when virtual memory is involved and page faults happen, some memory accesses may take a lot longer.
The C++ string class is another example of a leaky abstraction. Strings are not first-class data types in C++. On a string instance s, we can do s + "bar" but when we do "foo" + "bar" we have to recognize that string literals are really char* underneath.
In conclusion, abstractions are good when writing code but we still have to learn what's underneath them. High-level languages with abstractions are paradoxically harder to work with since we have to learn what these abstractions are attempting to hide.
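Returning to the 2-D array example, the sketch below shows why traversal order can matter. It assumes the array is stored row by row in one flat buffer (row-major order, a common layout but not the only one) and computes how far apart consecutive accesses land. The function names are hypothetical, and no timing is measured.

```python
def flat_offsets(rows: int, cols: int, by_rows: bool) -> list[int]:
    """Offsets into a row-major buffer, in the order a nested loop visits them."""
    if by_rows:
        return [i * cols + j for i in range(rows) for j in range(cols)]
    return [i * cols + j for j in range(cols) for i in range(rows)]


def smallest_step(offsets: list[int]) -> int:
    """Smallest distance between two consecutive accesses."""
    return min(abs(b - a) for a, b in zip(offsets, offsets[1:]))


ROWS, COLS = 4, 1000
print(smallest_step(flat_offsets(ROWS, COLS, by_rows=True)))   # 1
print(smallest_step(flat_offsets(ROWS, COLS, by_rows=False)))  # 1000
```

Iterating by rows touches neighbouring elements, while iterating by columns jumps at least a full row on every step. On a large array such jumps are what can spread accesses across many memory pages.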
-
How can developers overcome leaky abstractions?
Abstractions reduce complexity but they're not perfect. If an abstraction leaks too much, remove it or create a better one. If you're writing an abstraction, document its limitations. Abstractions are good but having too many adds complexity, as noted by David J. Wheeler,
All problems in computer science can be solved by another level of indirection, except for the problem of too many layers of indirection.
Given that at least some abstractions will leak, developers could create a wrapper around the abstraction. The application is required to call this wrapper rather than the original abstraction. This wrapper would modify the behaviour into what the application expects.
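A minimal sketch of such a wrapper follows. StrictStore plays the role of a third-party abstraction whose behaviour (raising on a missing key) doesn't suit the application, and LenientStore is the wrapper the application calls instead. Both classes and the NotFound exception are hypothetical names used only for illustration.

```python
class NotFound(Exception):
    """Raised by the hypothetical library when a key is missing."""


class StrictStore:
    """Stand-in for a third-party abstraction that raises on missing keys."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        if key not in self._data:
            raise NotFound(key)
        return self._data[key]


class LenientStore:
    """Application-side wrapper: a missing key becomes a default value."""

    def __init__(self, store):
        self._store = store

    def get(self, key, default=None):
        try:
            return self._store.get(key)
        except NotFound:
            return default


settings = LenientStore(StrictStore({"theme": "dark"}))
print(settings.get("theme"))          # dark
print(settings.get("font", "serif"))  # serif
```

Knowledge of the leak now lives in one place. If the underlying library changes its behaviour, only the wrapper needs to change.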
In a more extreme case, the developer reimplements the functionality to suit the application. This is not a good practice since, with the loss of the abstraction, the application becomes more complex.
Another approach is to code between the lines. The developer understands the implementation behind the abstraction (such as how memory is allocated) and contorts the code to suit that implementation. Code becomes more complex, less readable and less portable to other platforms.
Milestones
The fact that abstractions leak is recognized in the design of high-level programming languages that tend to abstract low-level details. A quote by Niklaus Wirth is relevant here,
I found a large number of programs perform poorly because of the language’s tendency to hide “what is going on” with the misguided intention of “not bothering the programmer with details.”
In the context of programming languages, some decisions taken by language designers are seen to be pre-emptive, that is, they constrain developers to use the language in a specific way. For example, a developer needs two triangular arrays but is forced to use two rectangular arrays (more memory) or pack them into a single rectangular array (complex code). We may say that the abstraction provided by the language doesn't suit such specialized use cases.
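To give a sense of the "complex code" option, here is one possible packing scheme, offered as a sketch rather than the scheme any particular author had in mind. Two n-by-n lower-triangular arrays (diagonal included) are stored in a single n-by-(n+1) rectangle. The class TriangularPair and its method names are hypothetical.

```python
class TriangularPair:
    """Two n-by-n lower-triangular arrays packed into one n-by-(n+1) rectangle."""

    def __init__(self, n: int, fill: float = 0.0):
        self.n = n
        self.cells = [[fill] * (n + 1) for _ in range(n)]

    def _check(self, i: int, j: int) -> None:
        if not 0 <= j <= i < self.n:
            raise IndexError("only entries with 0 <= j <= i < n are stored")

    # Array A occupies the cells on or below the main diagonal.
    def get_a(self, i, j):
        self._check(i, j)
        return self.cells[i][j]

    def set_a(self, i, j, value):
        self._check(i, j)
        self.cells[i][j] = value

    # Array B is stored transposed and shifted one column to the right,
    # so it occupies the cells strictly above the main diagonal.
    def get_b(self, i, j):
        self._check(i, j)
        return self.cells[j][i + 1]

    def set_b(self, i, j, value):
        self._check(i, j)
        self.cells[j][i + 1] = value


pair = TriangularPair(3)
pair.set_a(2, 1, 5.0)
pair.set_b(2, 1, 7.0)
print(pair.get_a(2, 1), pair.get_b(2, 1))  # 5.0 7.0
```

No memory is wasted, but every access now goes through index arithmetic that has nothing to do with the problem being solved. That is the cost described above.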
Gregor Kiczales explains leaky abstractions at a workshop. He proposes to divide an abstraction into two parts: one that does abstraction in the traditional way and another that allows clients some control over the implementation. He calls this an open implementation supported by meta-level architectures and metaobject protocols. For designing meta-level interfaces he notes four design principles: scope control, conceptual separation, incrementality and robustness.
Ryan Bemrose of Microsoft states a corollary to the Law of Leaky Abstractions, "An abstraction should not hide or disable that which it abstracts". He gives an example of an IRC bot that could interface with third-party plugins. The IRC protocol itself is abstracted via an object-oriented interface. It's discovered that plugins that use custom modes can't function properly because such functionality was disabled by the abstraction. Abstractions are useful but we shouldn't attempt to abstract away everything.
2017
At the ContainerWorld 2017 conference, one speaker notes that containers are also leaky abstractions. Processes running inside containers sometimes have to know about I/O performance, versions, configurations, garbage collection of old images, etc. In one example, two containers contending for the same I/O are seen to affect each other.
References
- Bemrose, Ryan. 2006. "Interface Design and the Law of Leaky Abstractions." The Audio Fool, MSDN Blog, September 29. Accessed 2019-08-23.
- Brack, Fagner. 2018. "How To Hide A Leaky Abstraction In Plain Sight." Medium, June 30. Accessed 2019-08-23.
- Bräutigam, Róbert. 2017. "Transcending the Limitations of the Human Mind." The New Java Developer, November 02. Accessed 2019-08-23.
- Kiczales, Gregor. 1992. "Towards a New Model of Abstraction in Software Engineering." IMSA'92 Workshop on Reflection and Meta-level Architectures, Xerox Corporation. Accessed 2019-08-23.
- MPJ. 2016. "Leaky abstractions." MPJ's Musings #58, Fun Fun Function, on YouTube, November 14. Accessed 2019-08-23.
- Mansoor, Umer. 2016. "Good Abstractions Have Fewer Leaks." CodeAhoy, May 06. Accessed 2019-08-23.
- Nadel, Ben. 2016. "Follow-Up: Creating Leaky Abstractions With RxJS In Angular 2.1.1." Blog, November 16. Accessed 2019-08-23.
- Nadel, Ben. 2017. "Reflecting On Data Persistence, Transactions, And Leaky Abstractions." Blog, June 16. Accessed 2019-08-23.
- Nag, Dev. 2017. "Containers are Leaky Abstractions (and other truths I hide from my kids)." VMware Cloud Community, April 10. Accessed 2019-08-23.
- Principles Wiki. 2018. "Law Of Leaky Abstractions (LLA)." Principles Wiki, April 11. Accessed 2019-08-23.
- Shaw, Mary and Wm. A. Wulf. 1980. "Toward Relaxing Assumptions in Languages and Their Implementations." ACM SIGPLAN Notices vol. 15, no. 3, pp. 45-61, March. Accessed 2019-08-23.
- Spolsky, Joel. 2002. "The Law of Leaky Abstractions." Joel on Software, November 11. Accessed 2019-08-23.
- stereobooster. 2018. "Spot a leaky abstraction." Dev.to, January 12. Accessed 2019-08-23.
Further Reading
- Spolsky, Joel. 2002. "The Law of Leaky Abstractions." Joel on Software, November 11. Accessed 2019-08-23.
- Dietrich, Erik. 2016. "Plugging Leaky Abstractions." Blog, NDepend, August 18. Accessed 2019-08-23.
- Kluck, Timo and Lukas Vermeer. 2017. "Leaky Abstraction In Online Experimentation Platforms: A Conceptual Framework To Categorize Common Challenges." arXiv, October 01. Accessed 2019-08-23.
- Bräutigam, Róbert. 2017. "Transcending the Limitations of the Human Mind." The New Java Developer, November 02. Accessed 2019-08-23.
- Atwood, Jeff. 2009. "All Abstractions Are Failed Abstractions." Blog, Coding Horror, June 30. Accessed 2019-08-23.
See Also
- SOLID Design Principles
- Design Patterns
- Design by Contract
- Metaprogramming
- Application Programming Interface
- Dependency Injection