**Introduction**
Linux’s memory management mechanism is a critical part of its operating system design, ensuring programs can use memory resources efficiently and securely. This report will explore various aspects of Linux memory management in detail, including the role of the MMU, the organization of address spaces, memory allocation strategies (such as the buddy system and slab allocator), and the specific implementation of kernel memory management.
**1. Memory Management Unit (MMU) and Address Translation**
The MMU is part of the CPU responsible for translating virtual addresses used by programs into actual physical addresses. This process is implemented via segmentation and paging mechanisms on the x86 architecture.
* **Segmentation Mechanism:** The CPU generates logical addresses (a segment selector plus an offset), which the segmentation unit converts into linear addresses. Because each segment has its own base address, the segments of a process need not be adjacent in memory, which gives the memory manager some flexibility. A segment is defined by a base address, a limit (length), and a type, which determines attributes such as readability and writability.
* **Paging Mechanism:** Linear addresses are further mapped to physical addresses by the paging unit. A key advantage of paging is avoiding external fragmentation (i.e., gaps between memory blocks) by allowing memory to be allocated in fixed-size pages (typically 4KB).
* **Paging Models:** On x86, the MMU supports multi-level paging:
    * 32-bit systems: 2-level paging (Page Directory and Page Table). Each Page Directory Entry (PDE) points to a Page Table, and each Page Table Entry (PTE) points to a 4KB physical page.
    * 32-bit systems with Physical Address Extension (PAE): 3-level paging.
    * 64-bit systems: 4-level paging.
* **Control Registers:** Paging is enabled by the PG bit in the CR0 register (PG=1 enabled; with PG=0, linear addresses are used directly as physical addresses). The CR3 register holds the physical address of the Page Directory. Because the directory is always aligned on a 4KB boundary, the low 12 bits of its address are zero, so CR3 only needs its upper 20 bits to store the address. The CR2 register holds the linear address that caused the most recent page fault. The CR4 register enables architectural extensions such as PAE, page size extensions (PSE), and virtual-8086 mode extensions (VME).
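
The 2-level translation described above can be modeled in a few lines. This is a minimal sketch rather than how the hardware or the kernel represents page tables: the nested dictionaries, the function names `split_linear` and `translate`, and the sample mapping are all assumptions made for illustration.

```python
PAGE_SHIFT = 12  # 4 KB pages: the low 12 bits are the offset within a page


def split_linear(addr):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    pd_index = (addr >> 22) & 0x3FF          # top 10 bits select the PDE
    pt_index = (addr >> PAGE_SHIFT) & 0x3FF  # next 10 bits select the PTE
    offset = addr & 0xFFF                    # low 12 bits are the byte offset
    return pd_index, pt_index, offset


def translate(page_directory, addr):
    """Walk a toy 2-level table; LookupError stands in for a page fault."""
    pd_index, pt_index, offset = split_linear(addr)
    page_table = page_directory.get(pd_index)
    if page_table is None or pt_index not in page_table:
        raise LookupError(f"page fault at {addr:#010x}")
    frame_base = page_table[pt_index]        # physical address of the 4 KB frame
    return frame_base + offset


# Hypothetical mapping: linear page 0x08048000 -> physical frame 0x00123000.
page_directory = {0x20: {0x48: 0x00123000}}
physical = translate(page_directory, 0x08048ABC)
```

In this model the outer dictionary plays the role of the Page Directory that CR3 points to, and a missing key corresponds to a not-present entry.
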
**2. Segmentation and Paging in Linux**
In Linux, the segmentation mechanism is not heavily utilized. The Global Descriptor Table (GDT) defines several segments (e.g., kernel code, kernel data, user segments), but their base addresses are all 0 and their limits are 4GB. This means Linux effectively uses a single flat segment, and virtual addresses in programs are directly equal to linear addresses.
* Linux primarily relies on the paging mechanism for memory management. Its architecture-independent code is written against a 4-level paging model; on hardware with fewer levels (2-level on 32-bit, 3-level with PAE), the unused levels are folded away, so the same code serves 32-bit, PAE, and 64-bit systems. For example, on 32-bit systems, the virtual address space is divided into user space (0-3GB) and kernel space (3GB-4GB), with paging providing flexible address mapping.
* A process’s address space is described by the `mm_struct` structure, and each process has only one `mm_struct`. Kernel space is shared by all processes, and a kernel thread has no user address space of its own and never accesses user memory; therefore, the `task_struct->mm` of a kernel thread is NULL.
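
To make the flat-segment point concrete, the sketch below models the segmentation step and shows that it becomes an identity mapping when the base is 0 and the limit covers 4GB. The `Segment` class, `logical_to_linear`, and `is_kernel_address` are hypothetical names for this illustration; the 3GB boundary is the 32-bit split described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int   # linear address where the segment starts
    limit: int  # highest valid offset inside the segment


def logical_to_linear(segment, offset):
    """Segmentation step: check the limit, then add the segment base."""
    if not 0 <= offset <= segment.limit:
        raise ValueError("offset exceeds segment limit")
    return (segment.base + offset) & 0xFFFFFFFF


# Flat segment in the style the text describes: base 0, limit 4 GB - 1.
FLAT = Segment(base=0, limit=0xFFFFFFFF)

KERNEL_START = 0xC0000000  # 3 GB user/kernel boundary on 32-bit x86


def is_kernel_address(linear):
    """True if the linear address lies in the 3 GB - 4 GB kernel range."""
    return linear >= KERNEL_START
```

With `FLAT`, `logical_to_linear(FLAT, x)` returns `x` for every valid 32-bit offset, which is why protection and mapping are left almost entirely to paging.
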
**3. Memory Allocation and Management in Linux**
Linux allocates a virtual address space to each process, the size and organization of which depend on the architecture. On 32-bit x86 systems, the virtual address space is 0-4GB, divided as follows:
| Address Range | Purpose |
| :--- | :--- |
| 0 – 3GB | User Space: For user programs. |
| 3GB – 3GB + 896MB | Kernel Space: Directly maps physical memory 0-896MB. |
| 3GB + 896MB – 4GB | Dynamic kernel mappings: the `vmalloc` area and related regions, used for non-contiguous allocations and for mapping high memory. |
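
Because the 3GB to 3GB+896MB range is a fixed-offset direct mapping, converting between a low physical address and its kernel virtual address is plain arithmetic. The sketch below assumes the 32-bit layout in the table; `phys_to_virt` and `virt_to_phys` are illustrative names here, similar in spirit to the kernel's `__va` and `__pa` helpers.

```python
PAGE_OFFSET = 0xC0000000          # start of kernel space (3 GB)
LOWMEM_SIZE = 896 * 1024 * 1024   # size of the directly mapped region


def phys_to_virt(phys):
    """Kernel virtual address of a directly mapped physical address."""
    if not 0 <= phys < LOWMEM_SIZE:
        raise ValueError("physical address is not in the direct map")
    return phys + PAGE_OFFSET


def virt_to_phys(virt):
    """Physical address behind a direct-map kernel virtual address."""
    if not PAGE_OFFSET <= virt < PAGE_OFFSET + LOWMEM_SIZE:
        raise ValueError("virtual address is not in the direct map")
    return virt - PAGE_OFFSET
```

Addresses at or above 3GB+896MB are rejected because they belong to the dynamically mapped region, where no fixed offset applies.
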
* **Physical Memory Zones:**
    * **ZONE_DMA (0-16MB):** Reserved for legacy DMA (Direct Memory Access) devices that can only address the low 16MB of physical memory. DMA bypasses the MMU, so such devices also need physically contiguous buffers.
    * **ZONE_NORMAL (16MB-896MB):** Normal memory, permanently mapped into kernel space and directly accessible by the kernel.
    * **ZONE_HIGHMEM (Above 896MB):** High memory, which has no permanent kernel mapping; the kernel must map it temporarily (e.g., via `kmap`) or through `vmalloc` before accessing it.
* **Page Table Allocation:**
    * Kernel page tables are initialized during system boot via the `paging_init` function, directly mapping the physical pages of ZONE_DMA and ZONE_NORMAL to the virtual addresses 3GB to 3GB+896MB.
    * User space and the upper kernel addresses (above 3GB+896MB) are mapped on demand: the kernel updates the relevant page table entries and flushes stale entries from the TLB (Translation Lookaside Buffer) so that the MMU sees the new virtual-to-physical mapping.
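
The zone boundaries listed above amount to a simple range lookup on the physical address. The following sketch assumes the 32-bit x86 boundaries given in this section; the `ZONES` table and the `zone_of` function are hypothetical names, not kernel interfaces.

```python
MB = 1024 * 1024

# (name, first byte, one past the last byte); None means "no upper bound".
ZONES = [
    ("ZONE_DMA", 0, 16 * MB),
    ("ZONE_NORMAL", 16 * MB, 896 * MB),
    ("ZONE_HIGHMEM", 896 * MB, None),
]


def zone_of(phys):
    """Return the name of the zone that contains a physical address."""
    if phys < 0:
        raise ValueError("physical addresses are non-negative")
    for name, start, end in ZONES:
        if phys >= start and (end is None or phys < end):
            return name
    raise ValueError("address not covered by any zone")


def is_permanently_mapped(phys):
    """Only ZONE_DMA and ZONE_NORMAL are covered by the boot-time direct map."""
    return zone_of(phys) != "ZONE_HIGHMEM"
```
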
**4. Buddy System**
The buddy system is the mechanism Linux uses to manage physical memory, designed to solve the problem of external fragmentation caused by frequent allocation and deallocation. External fragmentation refers to scattered small free blocks of memory that cannot satisfy large requests.
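* **Principle:** The buddy system divides physical memory into blocks whose sizes are powers of two pages. Two blocks are “buddies,” and can be merged into one block of twice the size, when:
    * They are the same size.
    * They are physically contiguous.
    * The merged block is aligned to its own size, i.e., the lower block starts at a multiple of twice the original block size.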
* **Implementation:** Linux maintains 11 free lists, one per order from 0 to 10, holding blocks of 2^0 to 2^10 pages (4KB to 4MB with 4KB pages). The largest single request is therefore 1024 pages (4MB of contiguous RAM). The starting address of a block is aligned to a multiple of its size (e.g., a 16-page block starts at an address that is a multiple of 16×4KB).
* **Initialization:** At system boot, all free physical memory is released into the buddy system. Each zone also maintains per-CPU page caches to speed up single-page allocations. Allocation and freeing are handled by the buddy algorithm: on every free, a block is merged with its buddy whenever the buddy is also free, and the merge repeats at the next order up to keep fragmentation low.
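
The split-and-merge behavior can be sketched with page frame numbers and one free set per order. This is a toy model under stated assumptions, not the kernel's implementation: the class `BuddyAllocator`, the helper `buddy_of`, and the use of Python sets are illustrative, and the model does no double-free checking. It relies on the alignment rule above, which lets a block's buddy be found by flipping a single bit of its starting page number.

```python
MAX_ORDER = 11  # orders 0..10, i.e. blocks of 1 to 1024 pages


def buddy_of(pfn, order):
    """Page frame number of the buddy of the block starting at pfn."""
    return pfn ^ (1 << order)


class BuddyAllocator:
    def __init__(self, total_pages):
        largest = 1 << (MAX_ORDER - 1)
        # Simplifying assumption: memory is a whole number of 1024-page blocks.
        if total_pages <= 0 or total_pages % largest:
            raise ValueError("total_pages must be a positive multiple of 1024")
        self.free_lists = [set() for _ in range(MAX_ORDER)]
        self.free_lists[MAX_ORDER - 1].update(range(0, total_pages, largest))

    def alloc(self, order):
        """Return the first page of a free block of 2**order pages."""
        for current in range(order, MAX_ORDER):
            if not self.free_lists[current]:
                continue
            start = min(self.free_lists[current])
            self.free_lists[current].remove(start)
            # Split down to the requested order, freeing each upper half.
            while current > order:
                current -= 1
                self.free_lists[current].add(start + (1 << current))
            return start
        raise MemoryError("no free block large enough")

    def free(self, start, order):
        """Free a block and merge it with its buddy as far as possible."""
        while order < MAX_ORDER - 1:
            buddy = buddy_of(start, order)
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)  # merged block begins at the lower half
            order += 1
        self.free_lists[order].add(start)
```

Allocating a single page from a fresh 1024-page pool leaves exactly one free block at each order from 0 to 9; freeing everything again merges them back into one order-10 block.
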
**5. Slab Allocator**
The slab allocator is the mechanism Linux uses to manage small memory objects, aiming to reduce internal fragmentation (i.e., underutilized space within pages). It pre-allocates memory, sacrificing some space for speed, assuming allocated blocks are smaller than a page.
* **Design Concept:** Group several pages into a *slab*. Each slab stores objects of only one data type (e.g., a specific kernel object). Allocation within the slab happens at the granularity of the object size, reducing page-internal fragmentation. Multiple slabs storing the same type of object form a *cache* (unrelated to hardware cache).
* **Slab States:** A slab can be in one of three states:
    * **slabs_full:** Fully allocated.
    * **slabs_free:** Completely free, available for allocation.
    * **slabs_partial:** Partially allocated.
* **Advantages:**
    * Reduces fragmentation, especially efficient for frequently allocated small objects (e.g., kernel data structures).
    * Supports object initialization, avoiding repeated initialization overhead.
    * Supports hardware cache alignment and coloring, improving cache utilization and performance.
* **Disadvantages:**
    * Complex management: several queues must be maintained (e.g., per-CPU cache queues, slab free lists).
    * Significant storage overhead: each slab requires a `struct slab` and a `kmem_bufctl_t` array to manage objects. For very small objects (e.g., 32 bytes), this array alone can consume up to 1/8 of the space (one 4-byte entry per 32-byte object).
    * Buffer reclamation and performance tuning are complex.
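
How slabs move between the three lists can be illustrated with a small model. This is a sketch under simplifying assumptions: one slab is one 4KB page, objects are identified by slot index, and the `Slab` and `Cache` classes are hypothetical rather than the kernel's `kmem_cache` machinery. Per-CPU queues, coloring, and constructors are left out.

```python
class Slab:
    def __init__(self, capacity):
        self.capacity = capacity
        self.free_slots = list(range(capacity))  # indices of unused objects

    @property
    def in_use(self):
        return self.capacity - len(self.free_slots)


class Cache:
    """All slabs holding one object type, grouped by slab state."""

    def __init__(self, name, object_size, slab_bytes=4096):
        if not 0 < object_size <= slab_bytes:
            raise ValueError("object must fit inside one slab")
        self.name = name
        self.objects_per_slab = slab_bytes // object_size
        self.slabs_full, self.slabs_partial, self.slabs_free = [], [], []

    def alloc(self):
        # Prefer a partial slab, then a free one, and only then grow the cache.
        if self.slabs_partial:
            slab = self.slabs_partial[0]
        else:
            if self.slabs_free:
                slab = self.slabs_free.pop()
            else:
                slab = Slab(self.objects_per_slab)  # would come from the page allocator
            self.slabs_partial.append(slab)
        slot = slab.free_slots.pop()
        if not slab.free_slots:                     # last object taken: now full
            self.slabs_partial.remove(slab)
            self.slabs_full.append(slab)
        return slab, slot

    def free(self, slab, slot):
        if not slab.free_slots:                     # was full: becomes partial
            self.slabs_full.remove(slab)
            self.slabs_partial.append(slab)
        slab.free_slots.append(slot)
        if slab.in_use == 0:                        # last object returned: now free
            self.slabs_partial.remove(slab)
            self.slabs_free.append(slab)
```

Serving requests from partial slabs first keeps objects packed together, so completely free slabs stay available to be handed back to the page allocator.
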
**6. Kernel Memory Management**
The Linux kernel manages memory allocation through the `kmalloc` and `vmalloc` functions, built upon the buddy system and slab allocator.
* **`kmalloc`:**
    * Used to allocate physically contiguous memory, suitable for small blocks.
    * Implemented via the slab allocator. The exact call chain depends on the kernel version; in the classic slab implementation it is `kmalloc -> __kmalloc -> __do_kmalloc`. Key steps:
        1. Find the appropriate `kmem_cache` for the requested size via `kmalloc_slab`.
        2. Request object memory from the slab allocator via `slab_alloc`.
* **`vmalloc`:**
    * Used to allocate virtually contiguous memory; the underlying physical pages need not be contiguous.
    * The call chain is also version-dependent; one form is `vmalloc -> __vmalloc_node_flags -> __vmalloc_node -> __vmalloc_node_range`. Key steps:
        1. Allocate virtual address space via `__get_vm_area_node`.
        2. Request physical pages one by one via `alloc_pages` and establish page table mappings for the virtual addresses.
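
The difference between the two allocators can be sketched as follows. This is an illustrative model, not kernel code: `SIZE_CLASSES` is an assumed set of power-of-two caches (real kernels define their own set), and `pick_size_class` and `toy_vmalloc` are hypothetical stand-ins for the cache lookup and the page-by-page mapping steps listed above.

```python
PAGE_SIZE = 4096

# Assumed power-of-two size classes from 32 bytes to 4096 bytes.
SIZE_CLASSES = [32 << i for i in range(8)]


def pick_size_class(size):
    """kmalloc-style step 1: smallest cache whose objects fit the request."""
    if size <= 0:
        raise ValueError("size must be positive")
    for object_size in SIZE_CLASSES:
        if size <= object_size:
            return object_size
    raise ValueError("request too large for the small-object caches")


def toy_vmalloc(size, virt_base, free_frames):
    """vmalloc-style steps: reserve contiguous virtual pages, then back each
    one with whatever physical frame is available next."""
    if size <= 0:
        raise ValueError("size must be positive")
    pages = -(-size // PAGE_SIZE)          # round up to whole pages
    if pages > len(free_frames):
        raise MemoryError("not enough free physical frames")
    mapping = {}
    for i in range(pages):
        # Virtual addresses advance contiguously; frames may be scattered.
        mapping[virt_base + i * PAGE_SIZE] = free_frames.pop(0)
    return mapping
```

A 100-byte request is rounded up to the 128-byte class, which is the internal fragmentation cost of size classes, while a 10000-byte `toy_vmalloc` call maps three consecutive virtual pages onto three unrelated frames.
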
**Conclusion**
Linux’s memory management mechanism, through components like the MMU, paging, the buddy system, and the slab allocator, achieves efficient memory allocation and management. The MMU handles address translation, paging provides flexible memory mapping, the buddy system solves external fragmentation, and the slab allocator mitigates internal fragmentation. The kernel manages its own memory space via `kmalloc` and `vmalloc`, ensuring system stability and performance.