TLA+

Formal methods don't have to be that formal

Published on 2024/05/13

This week at MongoDB is Hackathon week! All meetings are canceled and we get to work on anything that inspires us or anything you're curious about. Some products were born out of prototypes developed during one of our hackathons, and App Services was one of them (originally created as MongoDB Stitch). As a manager, it's incredibly rare to be free of meetings like this. I take the chance to do some coding without interruptions. The last time I integrated ChatGPT in Atlas functions, I realized that I was not streaming the response back and was wondering why it was taking so long to get a response (:facepalm:).

This time around some folks donated their time to offer a one-day-long workshop on TLA+ (Temporal Logic of Actions). I read "Specifying Systems" by Leslie Lamport the first time I heard about TLA, that was over 5 years ago. While I didn't have any specific system or protocol to verify, I always wanted to keep that knowledge in my back pocket. The syntax might be challenging to digest the first time around, If you're familiar with Latex or mathematical syntax then you might be fine. And yes L. Lamport is the same brilliant mind behind Latex, which is most likely why some of the syntax overlaps.

Like most things, getting your hands dirty is the best way to learn. Books and manuals can be used as a reference later as needed. An intuitive way to describe what TLA+ and all of its tooling help with is by allowing you to describe how a system or a protocol works in a more formal way than plain English.

Why does it matter?

The formal nature of the way we describe such systems is done so the tooling can examine each step (e.g. Action A takes the system from state S to S') and verify that they don't violate a desired invariance. In other words, your system has properties and can transition from one state to another. The possible combination of states can be quite large and when implementing this system, it could require an extensive amount of testing before you feel good enough about its correctness. TLA+ can do that for you with the model checker. It will explore each step and all the various possible states and let you know if any of the properties are not valid anymore.

You define the properties you want to stay valid (e.g. a clock system could state that every transition either increments the seconds, minutes, or hours within their appropriate bounds), you define the possible states, and you also define the possible transitions from one state to the other. This is nothing more than writing down how your system should behave and what properties should stay valid after any transition to a different state (this is roughly what an invariance is).

Thoughts

This workshop reinforced the idea of how powerful TLA+ can be. Don't be intimidated by its syntax, you can use PlusCal which is closer to C to describe your system (it's a pseudocode-like language to be precise). The tooling will do the transpilation for you to generate valid TLA+. The reason why I also encourage you to get your hands dirty is because you might use this rarely. It's not every day that you find yourself with a problem that could use a formal specification. But when you do, it doesn't hurt to be familiar with how to navigate the tooling. It saves you a lot of headaches and it's appreciated for its ability to let you discover bugs before you start the implementation phase. It's also a good exercise to get out of your comfort zone. My question was, would I be able to use this to verify properties of a system I am already familiar with? The answer was yes! You can define any invariance and the model checker will let you know if your understanding is correct.

Can it be used to verify any system that can be represented with a state machine? That I don't know yet but I might try and find out.

← Go Back