- Related Processes in Linux
- Multiple Processes in Python
Related Processes in Linux
In 2007, when I was taking an undergraduate operating systems course, the textbook [1] mentioned that a child process could possibly be terminated when its parent exits. Years later, I came to the world of Linux and found that this design is OS-dependent: in Linux, the child process keeps running even though its parent is gone. Rather than being left without a parent, it is adopted by process 1. Let’s use a small program [2] to validate the behaviour.
In Linux, let’s start two shell sessions. In the first one, run the compiled program, `a.out`.
In the other one, type `ps -ef | grep a.out` to check the status of the two processes. About 10s later, you should find that the parent is gone and the child is still alive, but the parent of the child has become process 1.
Based on what we have observed, it can be concluded that:
- The child continues to run even though its parent is dead, and it is adopted by the `init` process, whose pid is 1.
- The parent does not wait for the child’s termination if only the `fork` primitive is used.
Now let’s think about another case. If you press `ctrl + c` in the shell immediately after running `a.out`, you will find that both the parent and the child are terminated. This seems to contradict what we have just said. Actually, when `ctrl + c` is typed, `SIGINT` is sent to the whole foreground process group instead of a single process [3]. When a child is forked by the parent, the two share the same process group id, so both the parent and the child receive the `SIGINT` signal and exit. You can change the child’s process group id to see what happens.
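One way to try that experiment is sketched below in Python, whose `os` module wraps the same system calls. The child calls `os.setpgid(0, 0)` to move itself into a new process group whose id equals its own pid, then reports that id back through a pipe. The helper name `fork_in_new_group` is made up for illustration, and the sketch only shows that the group ids differ; it does not send any signal.

```python
import os


def fork_in_new_group():
    """Fork a child that moves itself into its own process group.

    Hypothetical helper: returns (parent_pgid, child_pid, child_pgid).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: leave the parent's group, then report the new group id.
        os.close(read_fd)
        os.setpgid(0, 0)
        os.write(write_fd, str(os.getpgrp()).encode())
        os._exit(0)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pgid = int(reader.read())
    os.waitpid(pid, 0)
    return os.getpgrp(), pid, child_pgid


if __name__ == "__main__":
    parent_pgid, child_pid, child_pgid = fork_in_new_group()
    print(f"parent group: {parent_pgid}, child pid: {child_pid}, child group: {child_pgid}")
```

A child in its own group is no longer part of the shell's foreground process group, so a `ctrl + c` typed in the terminal should reach only the parent.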
Multiple Processes in Python
There are multiple ways in the standard library to run multiple processes in Python: one is the `fork` primitive provided by the `os` package, and another is the `multiprocessing` module.

The `os.fork` function is a wrapper around the `fork` function in C, so its behaviour is similar. I won’t go into much detail about it. Here is a sample program; you will find that it behaves like the C version.
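A minimal sketch of such a program is given below, under a few assumptions: the function name `orphan_demo` is made up, the "parent" is itself forked from the script so that its exit can be observed, and the child polls `os.getppid()` instead of sleeping for a fixed time. The new parent is often process 1, but on systems with a subreaper (for example a per-user service manager) it may be a different pid, so the sketch only checks that the parent pid changed.

```python
import os
import time


def orphan_demo():
    """Fork a parent that forks a child and exits at once.

    Hypothetical helper: returns (parent_pid, ppid_after_parent_exit).
    """
    read_fd, write_fd = os.pipe()
    parent_pid = os.fork()
    if parent_pid == 0:
        # This process plays the role of the parent.
        os.close(read_fd)
        my_pid = os.getpid()
        if os.fork() == 0:
            # Child: wait (up to 5s) until it has been re-parented.
            deadline = time.time() + 5
            while os.getppid() == my_pid and time.time() < deadline:
                time.sleep(0.05)
            os.write(write_fd, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)  # the parent exits without waiting for the child
    # Observer: reap the parent, then read the child's report.
    os.close(write_fd)
    os.waitpid(parent_pid, 0)
    with os.fdopen(read_fd) as reader:
        new_ppid = int(reader.read())
    return parent_pid, new_ppid


if __name__ == "__main__":
    parent_pid, new_ppid = orphan_demo()
    print(f"parent was {parent_pid}; after its exit the child's parent is {new_ppid}")
```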
`multiprocessing` is a high-level module built upon primitives such as `fork`, so it is much more versatile. Besides forking child processes, it also provides features like daemonic processes and shared variables. In this post, I will focus only on the process relation. Let’s start with a simple piece of code.
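A minimal sketch of such an experiment follows. It makes some assumptions: the demo script is run in a separate interpreter so that the parent's exit can be observed, the output goes to a temporary file rather than a pipe, and the `fork` start method is requested explicitly. The names `DEMO` and `run_demo` are made up for illustration.

```python
import subprocess
import sys
import tempfile
import textwrap

# Script run in a separate interpreter; {daemon} is filled in by run_demo().
DEMO = textwrap.dedent("""
    import multiprocessing as mp
    import time

    def worker():
        time.sleep(0.5)
        print("child: finished", flush=True)

    if __name__ == "__main__":
        ctx = mp.get_context("fork")
        ctx.Process(target=worker, daemon={daemon}).start()  # no join()
        print("parent: reached end of script", flush=True)
""")


def run_demo(daemon):
    """Run DEMO and return the lines written before the parent exited."""
    with tempfile.TemporaryFile(mode="w+") as out:
        # run() returns as soon as the parent process itself has exited.
        subprocess.run(
            [sys.executable, "-c", DEMO.format(daemon=daemon)],
            stdout=out, check=True, timeout=30,
        )
        out.seek(0)
        return out.read().splitlines()


if __name__ == "__main__":
    print("non-daemonic:", run_demo(False))
    print("daemonic:    ", run_demo(True))
```

With `daemon=False` the child's line should be present even though the script never calls `join()`. With `daemon=True` it should be missing, because the child is terminated when the parent exits.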
You will find that the parent always waits for its children to terminate, even though no `wait` is called. Besides, if the child process is set to be daemonic, then instead of waiting, the parent kills the child before it exits. This behaviour is described in the module’s documentation. So how does the module achieve that? Let’s navigate to the source code of the Process class and check the implementation.
- When the `start` method of the child is invoked, it creates a Popen object, which forks a new process and runs the child by calling back the child’s `_bootstrap` method.
- Meanwhile, the parent maintains a list of all its children.
- The `_bootstrap` method imports the `util` module, in which an `_exit_function` function is registered as a cleanup handler. The function checks all the process’s children, terminates the daemonic ones, and waits for those that are not.
`daemonic` is quite different from the daemon process in Linux system programming. In the Python multiprocessing module, it is just a flag indicating how the child processes should be handled when the parent exits.
Due to the GIL, multithreading is very limited in high-concurrency scenarios. Even in gevent’s thread pool implementation, the issue is a big pain. Thus multiprocessing is encouraged in (C)Python. This post cannot cover all aspects of multiprocessing in Python, but I hope it helps you understand concurrency in Python.