As we know, Linux supports virtual memory, as do almost all modern general-purpose operating systems such as Solaris, Windows, and Mac OS X. Every user-space process in Linux has its own virtual address space, and every virtual address must eventually be translated to a physical address. Most CPU architectures, such as x86 and ARM, provide hardware support for this translation in the form of an MMU, so the translation is done cooperatively by the operating system (software) and the CPU (hardware). In this post, I explain the pagemap interface, which lets you explore how virtual pages are mapped to physical memory.
I assume you are familiar with basic memory management principles. At a minimum, you should know:
- Terminology such as virtual page, page frame, and page size
- How to locate an address within a page
- Basic Linux administration, such as the proc file system
- How to find the virtual addresses of a process that interest you, using the proc file system or tools such as GDB and readelf
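As a refresher on locating an address within a page, here is a minimal sketch of how an address splits into a virtual page number and an offset. The 4 KiB page size, the sample address, and the name `split_address` are assumptions for illustration only; your system may use a different page size.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, common on x86 but not universal


def split_address(vaddr, page_size=PAGE_SIZE):
    """Return (virtual page number, offset within that page)."""
    return vaddr // page_size, vaddr % page_size


# Hypothetical address, chosen only to make the arithmetic easy to follow.
vpn, offset = split_address(0x7FFD1234ABCD)
print(hex(vpn), hex(offset))  # 0x7ffd1234a 0xbcd
```

The virtual page number is what indexes the per-process pagemap file described later in this post.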
Pagemap Interface Explained
Managing hardware is one of the operating system’s main responsibilities. The OS kernel needs data structures to manage physical memory. Since those data structures live in kernel space, they can’t be accessed from user space directly. The pagemap interface, introduced in kernel 2.6.25, allows page tables and related information to be examined from user space. The information is exposed as virtual files in the proc file system:
- /proc/${pid}/pagemap
- /proc/kpagecount
- /proc/kpageflags
Let’s start to explore these files now.
/proc/${pid}/pagemap
This file contains the mapping between the virtual pages and the physical pages of a process. Each mapping is represented as a 64-bit entry. The file is virtual, which means it does not exist on disk. Even so, you can think of it as a sequence of 64-bit records, each of which holds the physical page frame number of one virtual page along with some other attributes. Records 0, 1, and 2, for example, describe the first, second, and third virtual pages respectively; in other words, the records are indexed by virtual page number. Note that an entry carries different information depending on whether the page is present in memory or swapped out.
Fig-1 is the format of pagemap entries. Please refer to the design doc of the pagemap interface to see the detail of the format.
/proc/kpagecount
In Linux, it is possible (and likely!) that a physical page is mapped to virtual pages of several different processes. The kpagecount file contains the reference count of each physical page. It is also a virtual file, in which each physical frame’s reference count is represented as a 64-bit integer indexed by physical frame number. For example, to find the reference count of the second physical frame (PFN 1), you simply read bytes 8 through 15 of the kpagecount file.
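The indexing rule above can be sketched in a few lines of Python. This is only an illustration: the helper name `read_kpagecount` is mine, the records are assumed to be in the machine's native byte order, and reading /proc/kpagecount normally requires root.

```python
import struct

ENTRY_SIZE = 8  # each record is one 64-bit integer


def read_kpagecount(pfn, path="/proc/kpagecount"):
    """Return the reference count stored for physical frame number `pfn`."""
    # Unbuffered, so exactly the 8 bytes of this record are requested.
    with open(path, "rb", buffering=0) as f:
        f.seek(pfn * ENTRY_SIZE)  # PFN 1 starts at byte 8
        data = f.read(ENTRY_SIZE)
    if len(data) != ENTRY_SIZE:
        raise ValueError("no record for PFN %d in %s" % (pfn, path))
    return struct.unpack("=Q", data)[0]  # native-endian unsigned 64-bit
```

Calling `read_kpagecount(1)` as root reads bytes 8 through 15, which is the record of the second physical frame.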
/proc/kpageflags
This virtual file contains the flags of physical frames. The flags of each page frame are stored in a 64-bit entry, which is also indexed by physical frame number, so the file is accessed in the same way as kpagecount. Each bit in the entry represents one flag. I won’t go through every flag here because they are explained clearly in the design doc. Even though the entry is 64 bits long, only 23 bits are used so far.
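To make the bit-per-flag layout concrete, here is a hedged sketch that turns one kpageflags entry into a list of flag names. The names for bits 0 to 22 are listed as I recall them from the kernel's pagemap documentation, so check them against the design doc for your kernel version; `KPF_NAMES` and `decode_flags` are names I made up for this example.

```python
# Flag names for bits 0-22, in bit order. Assumption: taken from memory of the
# kernel's pagemap documentation; newer kernels define additional bits.
KPF_NAMES = [
    "LOCKED", "ERROR", "REFERENCED", "UPTODATE", "DIRTY", "LRU", "ACTIVE",
    "SLAB", "WRITEBACK", "RECLAIM", "BUDDY", "MMAP", "ANON", "SWAPCACHE",
    "SWAPBACKED", "COMPOUND_HEAD", "COMPOUND_TAIL", "HUGE", "UNEVICTABLE",
    "HWPOISON", "NOPAGE", "KSM", "THP",
]


def decode_flags(entry):
    """Return the names of all known flags set in a 64-bit kpageflags entry."""
    return [name for bit, name in enumerate(KPF_NAMES) if entry & (1 << bit)]


# Hypothetical entry with bits 5 and 12 set.
print(decode_flags((1 << 5) | (1 << 12)))  # ['LRU', 'ANON']
```

The entry itself is read from /proc/kpageflags exactly as a kpagecount record is read: seek to the PFN times 8 and read 8 bytes.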
Parse the Pagemap Interface
The following Python script parses the files of the pagemap interface. It accepts two arguments: the pid of the process and the virtual address. Remember to add the 0x prefix if you pass a hexadecimal address.
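A minimal sketch of such a script is shown here; it is not necessarily identical to the original listing. It assumes the entry layout from the kernel's pagemap documentation (bit 63 = present, bit 62 = swapped, bits 0-54 = PFN when present) and native byte order. The helper names `read_entry` and `decode_pagemap_entry` are my own. On recent kernels the PFN field reads as zero unless the script runs as root, and the two kpage files are readable by root only.

```python
#!/usr/bin/env python3
"""v2pfn.py: look up the physical frame behind a virtual address (sketch)."""
import os
import struct
import sys

ENTRY_SIZE = 8                          # every record in the three files is 64 bits
PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")  # page size of the running system

PFN_MASK = (1 << 55) - 1                # bits 0-54: PFN when the page is present
SWAPPED_BIT = 1 << 62
PRESENT_BIT = 1 << 63


def read_entry(path, index):
    """Return the index-th 64-bit record of a pagemap-style file."""
    with open(path, "rb", buffering=0) as f:
        f.seek(index * ENTRY_SIZE)
        data = f.read(ENTRY_SIZE)
    if len(data) != ENTRY_SIZE:
        raise ValueError("no entry %d in %s" % (index, path))
    return struct.unpack("=Q", data)[0]


def decode_pagemap_entry(entry):
    """Split a pagemap entry into (present, swapped, pfn or None)."""
    present = bool(entry & PRESENT_BIT)
    swapped = bool(entry & SWAPPED_BIT)
    pfn = entry & PFN_MASK if present else None
    return present, swapped, pfn


def main(argv):
    if len(argv) != 3:
        print("usage: %s <pid> <virtual address>" % argv[0])
        return 1
    pid = int(argv[1])
    vaddr = int(argv[2], 0)             # base 0 accepts decimal and 0x-prefixed hex

    # pagemap is indexed by virtual page number.
    entry = read_entry("/proc/%d/pagemap" % pid, vaddr // PAGE_SIZE)
    present, swapped, pfn = decode_pagemap_entry(entry)
    print("pagemap entry: 0x%016x" % entry)
    if not present:
        print("page is swapped out" if swapped else "page is not present")
        return 0
    print("PFN: 0x%x" % pfn)
    if pfn == 0:
        print("PFN is zero; try running as root")
        return 0

    # kpagecount and kpageflags are indexed by physical frame number.
    print("reference count: %d" % read_entry("/proc/kpagecount", pfn))
    print("flags: 0x%016x" % read_entry("/proc/kpageflags", pfn))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Reading exactly 8 bytes with an unbuffered file keeps every access aligned to one record, which is how all three files expect to be read.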
Save the script as v2pfn.py and run it with the pid and the virtual address as its two arguments.
Of course, this script does not cover everything in the pagemap interface. What I want to show here is how to read the information about a physical frame from the interface. It is very easy to extend if you want to parse other flags.
Conclusion
The pagemap interface is a simple tool for learning more about how Linux manages physical memory, and I hope this post helps you understand it. However, the pagemap interface does NOT expose the original PTE. That is a pity, because the PTE is very helpful for understanding how the CPU translates virtual addresses to physical addresses. In the next post, I will show you how to inspect the raw PTE with SystemTap. Stay tuned.
Further Reading
- design doc: The design doc of the pagemap interface. Besides explaining the interface, it also gives some examples of how to use it.
- page-type tool: A tool for exploring the pagemap interface that ships with the Linux kernel. It needs to be compiled before you can use it.