What is a Process and Why Would I Care?

Scratching the surface of the importance of process isolation
Published on 2024/02/24

My introduction to Assembly Language was rough. It was one of my first classes in college, and I came in with little to no background in programming. I was eager to learn, and the lessons were pretty fun. As I complained in other thoughts though, I look back at how some of these topics were taught and hold my head in despair. I'm afraid I had a lot of professors who lacked the ability to really hook you into the topic. We were given back-to-back theoretical lessons and then evaluated on a practical exam covering things we never got remotely close to during the class. What I mean is that we never did an exercise to go from all this register talk to a real-life example. Then I got asked to design a Braille printer.

After college, and to this day, it makes me really sad to think about it. I still love all of it and I'm still curious about the lower levels of computing, but I was burned so badly that if you talked to me about registers or assembly, I would just walk away. It has taken me a few years to get over it, and I'm slowly going back to pick up the topics I wish I had been taught better. I thought about this as I was trying to dig deeper into modern problems with process-level isolation and all the work that's been done throughout the industry.

So I want to spend some time on basic topics in the hope that it makes things click for some, or even just re-ignites an old flame.

What is a Process?

It's just a program that is being executed. A process is not just the program code; it's the way we abstract the execution of that code, which of course includes the code itself. When you run a program it's going to need memory, an identifier, a call stack for function calls, and so on. We don't need all the details but hopefully you get the idea: it's the ensemble of things your program needs to run. To keep it simple, just think about the code, some memory and CPU usage.
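To make the "identifier" part concrete, here is a minimal sketch in Python (my choice for illustration, nothing in this post depends on it). It asks the OS for the current process ID, then starts a second interpreter as a child process and asks it for its own. The variable names are made up for the example.

```python
import os
import subprocess
import sys

# Every running program gets its own process ID (PID) from the OS.
parent_pid = os.getpid()

# Start a second Python interpreter as a separate process and have it
# print its own PID to stdout.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True,
    text=True,
    check=True,
)
child_pid = int(result.stdout.strip())

print(f"parent process: {parent_pid}")
print(f"child process:  {child_pid}")
print(f"same process?   {parent_pid == child_pid}")
```

Same program (the Python interpreter), two different executions, two different identifiers. That's the difference between a program and a process.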

As you can imagine, your computer runs a good number of processes. Right now I have 25 tabs open in my browser, and modern browsers typically run each tab (or site) in its own process. I am running VSCode, also a separate process. I am running a Next.js server locally, also a process. Now remember that each process comes with some memory and CPU usage. For now let's focus on memory.

Your machine has a limited amount of memory. Each process needs some of it to run, and your OS makes sure each one has enough available to function properly. Process A gets memory slice A reserved for it, process B gets memory slice B, and so on. Reserving memory this way is a good idea: you wouldn't want process A to accidentally write to slice B, and you wouldn't want process A to READ slice B either.

Things are getting juicy. Let's keep it simple and say that we have 10 memory units. Process A gets 0-4, so when you start process B, the OS knows it should not hand out 0-4 again and reserves 5-9 for it. Now your two processes are isolated from one another, and the interaction between the OS and the hardware is what guarantees it (this is not really visible to you as an end user). Process A doesn't even need to know that slots 5-9 exist, and in fact it doesn't have access to them either (supposedly..cough..cough...Spectre).
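The 10-unit example can be sketched as a toy model. To be clear, this is not how a real OS does it (real systems rely on virtual memory and hardware support); `ToyMemory` and its methods are names I invented just to mirror the paragraph above.

```python
class ToyMemory:
    """Toy model of the 10-unit example. Not how a real OS manages memory."""

    def __init__(self, size):
        self.cells = [0] * size
        self.owner = [None] * size  # which process each unit is reserved for

    def reserve(self, process, count):
        # Hand out the first `count` units nobody owns yet.
        free = [i for i, owner in enumerate(self.owner) if owner is None]
        if len(free) < count:
            raise MemoryError(f"not enough free units for process {process}")
        granted = free[:count]
        for i in granted:
            self.owner[i] = process
        return granted

    def _check(self, process, index):
        # The "boundary": a process may only touch units reserved for it.
        if self.owner[index] != process:
            raise PermissionError(f"process {process} may not touch unit {index}")

    def write(self, process, index, value):
        self._check(process, index)
        self.cells[index] = value

    def read(self, process, index):
        self._check(process, index)
        return self.cells[index]


memory = ToyMemory(10)
slice_a = memory.reserve("A", 5)  # units 0-4
slice_b = memory.reserve("B", 5)  # units 5-9

memory.write("B", 5, 42)  # B writes inside its own slice: allowed

try:
    memory.read("A", 5)  # A reaches into B's slice: refused
    blocked = False
except PermissionError as error:
    blocked = True
    print(error)
```

The check in `_check` plays the role of the OS and hardware working together. Take it away and the reservation becomes a polite suggestion.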

Why Would I Care?

Isolation. This is important because it determines how secure your program can be. If there were no boundaries, and process A had 0-4 reserved but could still access all of the memory units, then it could read any data belonging to process B. You would have no guarantees about what process A might do with data read from memory reserved for process B. What if process B was managing your email, or your passwords? Not ideal.

Isolation is important in cloud environments, especially in FaaS (Function as a Service). These types of services allow the execution of untrusted code, which means it could be malicious. Some solutions run the code of multiple functions within the same process. I will dig deeper into this in a more comprehensive post, but once they run in the same process you increase the attack surface. Why is that? Before, the isolation was managed by the interaction of the OS with the hardware; now that guarantee is gone. So you, the person offering FaaS, will have to take software-level measures to provide a certain level of isolation (and hey, if you're as good as the folks at Cloudflare maybe that's good enough!).
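Here is a rough sketch of why sharing a process matters. It is an illustration under heavy assumptions, not a description of how any real FaaS platform works: the "host" is a plain Python function holding a made-up secret, the "untrusted code" is a string, and the trick it uses (`sys._getframe`) is a CPython-specific way to peek at the caller's variables. All names are hypothetical.

```python
import subprocess
import sys

# Hypothetical guest code: it tries to reach into whoever is running it.
UNTRUSTED_CODE = """
import sys
try:
    caller = sys._getframe(1)            # the frame that invoked this code, if any
    leaked = caller.f_locals["api_key"]  # peek at the host's local variable
except (ValueError, KeyError):
    leaked = None                        # nothing to peek at
print(leaked)
"""


def run_in_same_process(code):
    api_key = "host-secret-123"  # made-up secret living in the host's memory
    guest_namespace = {}         # the guest gets its own (empty) namespace...
    exec(code, guest_namespace)  # ...but still runs inside our process
    return guest_namespace["leaked"]


def run_in_separate_process(code):
    api_key = "host-secret-123"  # same secret, but the guest runs elsewhere
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


in_process_leak = run_in_same_process(UNTRUSTED_CODE)
child_output = run_in_separate_process(UNTRUSTED_CODE)

print(f"same process saw:     {in_process_leak}")
print(f"separate process saw: {child_output}")
```

Giving the guest an empty namespace looks like a boundary, but it's only a software convention inside one process, and the guest walks right past it. In the separate process the secret simply isn't in the guest's memory at all. Real in-process sandboxes put far more work into those software-level measures than an empty dictionary.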

Thoughts

All of this started from Sandboxing JavaScript Code, and I couldn't get enough of following the conversations happening in the industry. I kept going lower and lower level and realized that I might finally be over my bias. It was fun exploring things I hadn't touched before, taking a look at what the Spectre vulnerability is about and, unavoidably, what speculative execution is. These thoughts barely scratch the surface (virtual addresses, MMUs, microVMs, containers, hypervisors, ...) and at a very elementary level too (but today that's the best I can do for a 5min max thought).
