Page Size Extension
In computing, Page Size Extension (PSE) refers to a feature of x86 processors that allows for pages larger than the traditional 4 KiB size. It was introduced in the original Pentium processor, but it was only publicly documented by Intel with the release of the Pentium Pro.[1] The CPUID instruction can be used to identify the availability of PSE on x86 CPUs.[2]
Motivation
    


Imagine the following scenario. An application program requests a 1 MiB memory block. In order to fulfill this request, an operating system that supports paging and that is running on older x86 CPUs will have to allocate 256 pages of 4 KiB each. An overhead of 1 KiB of memory is required for maintaining page directories and page tables.
When accessing this 1 MiB memory, each of the 256 page entries would be cached in the translation lookaside buffer (TLB; a cache that remembers virtual address to physical address translations for faster lookup on subsequent memory requests). Cluttering the TLB is possibly one of the largest disadvantages of having several page entries for what could have been allocated in one single memory block. If the TLB gets filled, then a TLB entry would have to be freed, the page directory and page tables would have to be “walked” in memory, and finally, the memory would be accessed and the new entry would be brought into the TLB. This is a severe performance penalty and was possibly the largest motivation for augmenting the x86 architecture with larger page sizes.
The PSE allows for page sizes of 4 MiB to exist along with 4 KiB pages. The 1 MiB request described previously would easily be fulfilled with a single 4 MiB page, and it would require only one TLB entry. However, the disadvantage of using larger page sizes is internal fragmentation.
Operation
    
In traditional 32-bit protected mode, x86 processors use a two-level page translation scheme, where the control register CR3 points to a single 4 KiB-long page directory, which is divided into 1024 × 4-byte entries that point to 4 KiB-long page tables, similarly consisting of 1024 × 4-byte entries pointing to 4 KiB-long pages.
Enabling PSE (by setting bit 4, PSE, of the system register CR4) changes this scheme. The entries in the page directory have an additional flag, in bit 7, named PS (for page size). This flag was ignored without PSE, but now, the page-directory entry with PS set to 1 does not point to a page table, but to a single large 4 MiB page. The page-directory entry with PS set to 0 behaves as without PSE.
If newer PSE-36 capability is available on the CPU, as checked using the CPUID instruction, then 4 more bits, in addition to normal 10 bits, are used inside a page-directory entry pointing to a large page. This allows a large page to be located in 36-bit address space.
If Physical Address Extension (PAE) is used, the size of large pages is reduced from 4 MiB down to 2 MiB, and PSE is always enabled, regardless of the PSE bit in CR4.
References
    
- T. Shanley (1998). Pentium Pro and Pentium II System Architecture. Addison-Wesley Professional. p. 439. ISBN 978-0-201-30973-7.
- Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A. Intel Corporation. August 2007. pp. 3-26 to 3-28.