Threads vs Processes in Linux: Which is Best?

Creating a process

In Linux, starting a new program involves two system calls: fork, which creates a new process, and execve, which loads a program into it.

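First, fork creates a copy of the process that is requesting the new process. On Linux, the C library's fork function does this through the clone system call, which is why clone rather than fork appears in the trace below. Then, the execve system call replaces the program running in the newly cloned process with a different executable.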

To see this in action, run a command under strace, which traces the system calls a program makes. The -f option follows child processes, and the trace filter limits the output to the execve and clone system calls.


strace -f -etrace=execve,clone bash -c '{ date; }'

The output should show only the clone and execve system calls, along with the output of the date command itself, as shown below.


execve("/usr/bin/bash", ["bash", "-c", "{ date; }"], 0x7ffe2658f290 /* 63 vars */) = 0 
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7c2f3c6a10) = 1132
strace: Process 1132 attached 
[pid  1132] execve("/usr/bin/date", ["date"], 0x561c3127c3b0 /* 62 vars */) = 0 
Sat 22 Jan 2022 06:35:05 AM UTC 
[pid  1132] +++ exited with 0 +++ 
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1132, si_uid=1000, si_status=0, si_utime=0, si_stime=0} --- 
+++ exited with 0 +++

First, clone creates a child process with PID 1132. Then, the child calls execve to replace its program with /usr/bin/date.
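The same fork-then-exec sequence can be reproduced from a program. The following is a minimal sketch in Python, whose os.fork and os.execvp functions wrap the corresponding C library calls. The spawn helper and its exit code of 127 for a failed exec are illustrative choices, not part of any standard API.

```python
import os
import sys


def spawn(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    sys.stdout.flush()              # avoid carrying buffered output into the child
    pid = os.fork()                 # duplicate the calling process
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into parent code.
            os._exit(127)
    # Parent: wait for the child and decode how it ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)     # negative value: killed by a signal


if __name__ == "__main__":
    print("exit status:", spawn(["date"]))
```

Running this script under the strace command shown earlier should reveal a similar clone and execve pair, although the exact system calls depend on the C library version.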

Creating a thread

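The same clone system call can be used to spawn threads. The C library wrapper for it has the following signature, and returns the ID of the new process or thread to the caller:


int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...);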

Passing different flags to the clone function makes it possible to create either a process or a thread. Two of these flags are described in the table below, and a complete list can be found in the clone(2) manual page.

Flag	When the flag is set	When the flag is not set
CLONE_VM	The child shares the memory space of the calling process, as threads do.	The child gets its own copy of the caller's memory space, as processes do.
CLONE_PARENT	The parent of the child is the same as the parent of the calling process.	The parent of the child is the calling process.

Once a process starts, it can spawn threads using the clone system call. However, it’s best to use the pthread_create library function from POSIX threads, which hides the implementation details and avoids portability issues. It’s also worth noting that on Linux, pthread_create internally uses the clone system call to create threads.
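The practical effect of sharing memory, or not sharing it, can be observed without calling clone directly. The sketch below assumes CPython on Linux, where threading.Thread is backed by POSIX threads and os.fork creates a separate process. The counter dictionary and the helper names are hypothetical and exist only for this demonstration.

```python
import os
import threading

counter = {"value": 0}      # hypothetical state used only for this demo


def bump():
    counter["value"] += 1


def bump_in_thread():
    """Run bump() in a thread, which shares this process's memory."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def bump_in_child_process():
    """Run bump() in a forked child, which works on its own copy of memory."""
    pid = os.fork()
    if pid == 0:
        try:
            bump()          # changes the child's private copy only
        finally:
            os._exit(0)     # leave the child without running parent code
    os.waitpid(pid, 0)
    return counter["value"]


if __name__ == "__main__":
    print("after thread:", bump_in_thread())
    print("after child process:", bump_in_child_process())
```

The thread's increment is visible to the caller, while the child process's increment is lost when the child exits.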

Internal organization of processes and threads

Both processes and threads are represented in the kernel by a data structure called task_struct. This data structure holds the information the kernel needs about a task, such as its scheduling parameters, saved machine registers, system call state, and open file descriptors. The kernel links these data structures together in a task list, which covers all the processes and threads in the system, not only the ones currently running on a CPU.

Running processes and threads

Now, to monitor the running processes and threads in the Linux system, use the ps command to get the list:

ps -eLF

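This command generates a list of all the running processes and their associated threads. The -e option selects every process, -L adds one row per thread, and -F selects the extra-full output format.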

Fig. 1. List of running processes and threads

In this output, note the PID, PPID, LWP, and NLWP.

PID: Process ID
PPID: Parent process ID
LWP: Lightweight process (thread) ID
NLWP: Number of lightweight processes

NLWP gives you the number of threads each process has, and LWP gives you the ID for the thread. If NLWP is 1, it's a single-threaded process and you’ll note that the PID is equal to LWP (thread ID). In a single-threaded process, thread and process are functionally the same thing.

For multi-threaded processes, one thread always has an LWP equal to the PID. This is normally the main thread, the one the process started with.
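The relationship between PID and LWP can also be checked from inside a process. The sketch below assumes Linux with Python 3.8 or later: it starts one worker thread and reads /proc/self/task, which contains one entry per thread of the current process. The snapshot_threads function and the keys of its result are hypothetical names.

```python
import os
import threading


def snapshot_threads():
    """Start one worker thread and report the thread IDs the kernel sees."""
    started = threading.Event()
    release = threading.Event()
    info = {}

    def worker():
        info["worker_tid"] = threading.get_native_id()   # this thread's LWP
        started.set()
        release.wait()          # stay alive until the snapshot is taken

    t = threading.Thread(target=worker)
    t.start()
    started.wait()
    # Each entry in /proc/self/task is one thread (LWP) of this process.
    info["lwps"] = sorted(int(name) for name in os.listdir("/proc/self/task"))
    info["pid"] = os.getpid()
    release.set()
    t.join()
    return info


if __name__ == "__main__":
    result = snapshot_threads()
    print("PID:", result["pid"])
    print("LWPs:", result["lwps"], "NLWP:", len(result["lwps"]))
    print("worker LWP:", result["worker_tid"])
```

The PID appears among the LWPs, and the worker thread has a different LWP while belonging to the same PID.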

The next section will cover how resource sharing works differently in processes and threads.

Resource model

Processes in Linux are isolated from each other: each one has its own memory space. Threads of the same process share that memory space and other resources, such as open file descriptors.

Communication

Sharing memory makes inter-thread communication easy, because threads can read and write the same variables directly (with appropriate synchronization). Processes have no memory in common by default, so inter-process communication has to go through kernel-provided mechanisms such as pipes or sockets, which means making system calls and copying data. As a result, inter-thread communication is generally much faster than inter-process communication.
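To illustrate the extra machinery that inter-process communication needs, here is a minimal sketch of a parent and child process exchanging data over a pipe. The ask_child function is a hypothetical helper, and the upper-casing is only a stand-in for real work done in the child.

```python
import os


def ask_child(message: bytes) -> bytes:
    """Fork a child that sends an upper-cased copy of message through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute a result and write it into the pipe.
        try:
            os.close(read_fd)
            os.write(write_fd, message.upper())
        finally:
            os._exit(0)
    # Parent: close the unused write end, then read until end of file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:               # EOF: the child closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(ask_child(b"hello from the parent"))
```

A thread could have handed the same result back by assigning to a shared variable, with no pipe and no copying through the kernel.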

CPU sharing

In a modern processor (with multiple CPUs/cores), many processes or threads can run in parallel. However, on a single core/CPU, only one process or thread can run at a time. In that case, the operating system uses a scheduling algorithm to share the CPU among the running threads or processes.

Each process or thread executes for a brief period called a time slice before being swapped out (context switched) to make room for another process or thread. The context switching is so fast that they appear to execute in parallel, even though they’re only taking turns on the CPU, that is, executing concurrently. Such operating systems are called time-sharing operating systems.
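Context switches can be observed from user space. The sketch below uses Python's resource.getrusage, which reports how many times the calling process was switched out voluntarily (for example, while sleeping) and involuntarily (for example, when its time slice ran out). The helper names and the number of sleeps are arbitrary choices, and the counts vary from run to run.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(sleeps=50):
    """Count the context switches that occur during a series of short sleeps."""
    before = context_switches()
    for _ in range(sleeps):
        time.sleep(0.001)       # blocking gives up the CPU voluntarily
    after = context_switches()
    return after[0] - before[0], after[1] - before[1]


if __name__ == "__main__":
    voluntary, involuntary = measure()
    print("voluntary:", voluntary, "involuntary:", involuntary)
```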

Although context switching is fast, there are some aspects that differentiate process context switching from thread context switching.

When the operating system switches between two processes, it has to save the register values of the outgoing process, restore those of the incoming process, and switch to a different memory space, which typically costs the CPU its cached address translations. Each thread also has its own registers and stack, so those are still saved and restored on a thread switch. However, threads of the same process share one memory space, so in the case of thread switching the operating system doesn’t need to switch the memory space. This is why thread switching is more lightweight and one reason why multi-threaded processes are so popular.

Processes that need to run multiple computations concurrently often use a multi-threaded architecture. Although a time-sharing system doesn't guarantee that the threads will run in parallel, all the threads of a process can make progress.

Which is best?

There is no one-size-fits-all formula to decide between multi-process and multi-thread implementation. It is a matter of choosing the right tradeoffs. If you have to implement a program with concurrent processing needs, you can choose either processes or threads to implement it.

To keep parts of a program isolated from one another while they run concurrently, put each part in its own process. Within each isolated process, multi-threading then provides lightweight context switching and easy data sharing, and therefore fast inter-thread communication.

On the other hand, if an entire program is implemented as multiple threads in one process, a fatal error in one thread can tank the entire process. Separate processes are also easier to manage under memory pressure: an inactive or low-priority process can be suspended or terminated as a unit, and its memory saved out to disk or reclaimed, to make room for other running processes. No single thread of a multi-threaded process can be set aside in the same way, because all of its threads share one memory space.
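The fault-isolation side of this tradeoff can be sketched as follows. The run_isolated helper is hypothetical: it runs a task in a forked child and reports how the child ended, so a fatal error in the task cannot take the parent down. The crash function simulates a fatal error by sending SIGKILL to its own process.

```python
import os
import signal


def run_isolated(task):
    """Run task() in a forked child and report how the child ended."""
    pid = os.fork()
    if pid == 0:
        # Child: run the task, then exit without returning to parent code.
        try:
            task()
        except BaseException:
            os._exit(1)
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))
    return ("exited", os.WEXITSTATUS(status))


def crash():
    # Simulate a fatal error: the calling process is killed immediately.
    os.kill(os.getpid(), signal.SIGKILL)


if __name__ == "__main__":
    print(run_isolated(crash))          # the parent survives and sees the signal
    print(run_isolated(lambda: None))   # a well-behaved task exits normally
```

If crash were called in a thread instead, the signal would terminate the whole process, including every other thread in it.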

Google Chrome provides a useful example. Chrome runs tabs in separate processes, and within each of those processes it uses multiple threads so that the activities related to a tab (rendering, network calls, etc.) run concurrently. With this approach, Chrome can isolate problems in one tab so they don’t affect other tabs. It also helps keep overall memory utilization in check, as the memory of inactive tabs can be swapped out or reclaimed to make room for active tabs.

In short, a good mix of multi-process and multi-thread methods is the way to achieve the best concurrency and good resource utilization.
