Paging Extensions for the
Pentium Pro Processor

by

Robert R. Collins


Background

As early as June 1991, Intel circulated confidential documents describing the new features of their P5 processor, eventually known as the "Pentium Processor." Most of those new features would be shrouded in controversy, their details kept secret by Intel. But one of the most advanced features - Physical Address Extensions (PAE) - was entirely removed. PAE gave the processor the ability to address up to 64 GB of physical memory (36-bit address bus), and to use page sizes of 2 MB. The larger physical-address space and the new 2-MB paging features were interrelated, as both were enabled by the same control bit. In all other operating modes, the normal 32-bit address space was in operation.

PAE would have been enabled with CR4, bit 5. When CR4.PAE=1 (CR4[5]=1), PAE (36-bit addressing) and large 2-MB pages would be accessible. When CR4.PAE=0, A[35..32] would be forced to 0, regardless of what addresses could be generated in protected mode (when a descriptor pointing near 4 GB is combined with an offset that results in an address above 4 GB). Even when CR4.PAE=1, addresses above 4 GB would not be generated unless they were the result of a paging translation. The only means to access memory above 4 GB was through these extensions to page mode.
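The two rules above can be modeled in a few lines. This is only an illustrative sketch, not processor code: the function names are hypothetical, and the model assumes that the sum of a segment base and an offset is truncated to 32 bits before any paging translation takes place.

```python
CR4_PAE = 1 << 5  # CR4 bit 5, the PAE enable bit described in the text


def pae_enabled(cr4):
    """Return True when CR4.PAE (bit 5) is set in a CR4 value."""
    return bool(cr4 & CR4_PAE)


def linear_address(segment_base, offset):
    """Model segmentation: the sum wraps at 4 GB, so A[35..32] stay 0."""
    return (segment_base + offset) & 0xFFFFFFFF


# A descriptor base just below 4 GB plus an offset that crosses the boundary.
print(hex(linear_address(0xFFFFF000, 0x2000)))  # 0x1000
print(pae_enabled(0x20), pae_enabled(0x10))     # True False
```

In this model only the paging translation, never segmentation alone, can produce a physical address with any of A[35..32] set.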

Before the Pentium went into production, PAE was removed. Even so, documented references to this feature still appear in the various Pentium manuals, and in other sources. In the Pentium® Processor User's Manual Volume 1 (1993 edition), "Chapter 2: Overview," paragraph 8 mentions "extensions to the architecture which allow 2 Mbyte and 4 Mbyte page sizes." This reference was removed in the next edition of this manual. In the Pentium® Processor User's Manual Volume 1 (all editions), Figures 3 and 4 show that the linear address is composed of four fields, one of which is called the "directory pointers." The directory pointers are unique to PAE. The Pentium® Processor at iCOMP Index 735\90 MHz (Intel part number 241997 all revisions), Section 1.1, paragraph 5, also mentions 2-MB paging extensions. This same reference is present in other versions of this data sheet, using different part numbers and speed ratings (242323 all revisions). Ironically, this reference is absent in at least one analogous data sheet (241595). It is also worth mentioning that the Intel-designed in-circuit test probe included software support for PAE.

The most significant and interesting details on PAE come from the Pentium® Processor Family Developer's Manual, Volume 3, and from an Intel Architecture Labs (IAL) CD-ROM (Special Edition: P6 Processor Software Developer CD April 1995). In my previous column, I discussed page-size extensions (PSE) on the Pentium and mentioned that setting some reserved bits in various page tables would cause a page fault when an access was made to that structure (see "Understanding Pentium's 4-MB Page Size Extensions," DDJ, May 1996). This reference was quoted from the Pentium® Pro Processor Reference Manual, section 23.2.14.1:

I wrote that it might be tempting to associate the page-directory pointer as CR3 and that any such association would be incorrect. In reality, this is another reference to PAE that hasn't been removed from the Pentium documentation. As amusing as these references might be, the most expository ones are in a place that you might not have thought to look - the glossary. These glossary references virtually give away PAE detail:

Finding the glossary, however, may prove to be more difficult. The 1993 edition of Volume 3 (241430-001) is the only hard-copy edition that contains the glossary. The early-1994 edition (241430-002) mentions the glossary in the "Table of Contents," but it is missing from the manual. Ironically, the late-1994 electronic edition (241430-003, available on CD-ROM from Intel) contains the glossary, but its hard-copy counterpart does not. Finally, neither the latest electronic version nor the latest hard copy of Volume 3 (241430-004) contains the glossary.

Searching the IAL P6 CD-ROM turned up a near-perfect description of the PAE paging-translation mechanism, the only error being the association with 32-bit addressing and 2-MB page sizes. Embedded in a small, obscure file on this CD is the following information:

We can deduce from this reference that PAE, like PSE, supports two page sizes - 4-KB pages and 2-MB pages. Bits 0 - 11 of the linear address point to an offset in the page frame. This indicates 12 bits of addressability in this mode (2^12), or a 4-KB page size. When the PS bit in the page directory equals 1, bits 0 - 11 of the linear address are combined with bits 12 - 20 to allow 21 bits of addressability (2^21), or a 2-MB page size.
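The arithmetic behind those two page sizes can be checked with a short sketch. The helper name page_offset is hypothetical; the bit widths are the ones deduced above.

```python
def page_offset(linear, large_page):
    """Return the offset within the page frame for a 32-bit linear address.

    4-KB pages use bits 0-11 (12 bits); 2-MB pages use bits 0-20 (21 bits).
    """
    bits = 21 if large_page else 12
    return linear & ((1 << bits) - 1)


print(1 << 12, 1 << 21)                      # 4096 2097152
print(hex(page_offset(0x12345678, False)))   # 0x678
print(hex(page_offset(0x12345678, True)))    # 0x145678
```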

Other evidence of PAE may be found by examining the Pentium architecture itself. CR4[5] is marked reserved, and was to enable PAE; CPUID.flags[6] is marked reserved and was to indicate the existence of PAE; the model-specific register (MSR) TR8 is marked reserved and was to contain the upper-four address bits used for TLB testability.

PAE and 36-bit addressing are now implemented in the Pentium Pro Processor (P6). I managed to get a copy of the official Intel documentation just as the article was being finished. As a result, the description of PAE was deduced from these previously documented sources and from reverse engineering on a P6-based computer, not the P6 manuals.

How PAE Works

To support 36-bit addressing, it's necessary to make substantial changes to the paging mechanism. Thirty-two-bit linear addresses are still used, but they are translated to 36-bit physical addresses. Intel chose to use a three-tier paging mechanism to support PAE for 4-KB pages, and a two-tier mechanism for 2-MB pages. With PAE enabled, CR3 points to a small Page Directory Pointers Table (PDPT). Each PDPT entry references a separate page directory. Each page directory points to a page table (for 4-KB pages), or directly to the page frame (for 2-MB pages). Table 1 provides a detailed description of all of the CPU structures associated with page translations while PAE is enabled. For comparative purposes, Table 2 gives a detailed description of all of the CPU structures associated with page translations while PSE is enabled (4-MB pages). The description of these fields may be found in the appropriate Pentium documentation. Table 3 describes the fields associated with Table 1 and Table 2, which may need further clarification.

For the most part, the paging-translation process works the same as it always has. Linear addresses are converted to physical addresses through a series of table lookups. The most significant changes in the PAE implementation are the extra table lookup (the PDPT lookup, an extra level of indirection) and the changes in the paging structures themselves. CR3 points to a 4-entry PDPT, with entries that point to the base of separate page directories. This behavior is different from all previous implementations, where CR3 pointed to the base of a single page directory. The size of the paging structures has doubled to accommodate the extra 4 bits in the base address field, but otherwise, they are virtually identical to their predecessors. The significance of these changes is twofold:

The paging translation process is virtually identical to the prior implementation. The translation process is the same for 2-MB and 4-KB pages, except that the page-table reference is omitted for 2-MB pages. When a 32-bit linear address is presented to the paging unit, it is broken down into various fields:

In all cases, the base addresses contained in these tables (PDPT, PDE, and PTE) are physical addresses and are not subject to any paging translation. Figure 1 shows the page-address translation for 2-MB and 4-KB pages when PAE is enabled.
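The translation described above can be sketched as a small software model. This is an illustration under stated assumptions, not the author's test code. It assumes 8-byte entries, a present flag in bit 0 of every entry, the PS flag in bit 7 of a page-directory entry, a 2-bit PDPT index in linear bits 30 - 31, and 9-bit directory and table indexes in bits 21 - 29 and 12 - 20. These values follow the publicly documented PAE format and should be checked against Table 1. Physical memory is modeled as a Python dictionary, and all names are hypothetical.

```python
PRESENT = 1 << 0                # assumed: present flag, bit 0 of every entry
PS = 1 << 7                     # assumed: page-size flag, bit 7 of a PDE
BASE_4K = 0xFFFFFF000           # physical address bits 12-35
BASE_2M = 0xFFFE00000           # physical address bits 21-35


def read_entry(memory, table_base, index):
    """Fetch one 8-byte entry from the simulated physical memory."""
    return memory.get(table_base + index * 8, 0)


def translate(memory, cr3, linear):
    """Translate a 32-bit linear address to a 36-bit physical address."""
    pdpt_index = (linear >> 30) & 0x3     # selects 1 of 4 PDPT entries
    pd_index = (linear >> 21) & 0x1FF     # selects 1 of 512 PDEs
    pt_index = (linear >> 12) & 0x1FF     # selects 1 of 512 PTEs

    pdpte = read_entry(memory, cr3 & 0xFFFFFFE0, pdpt_index)
    if not pdpte & PRESENT:
        raise LookupError("PDPT entry not present")

    pde = read_entry(memory, pdpte & BASE_4K, pd_index)
    if not pde & PRESENT:
        raise LookupError("page-directory entry not present")
    if pde & PS:
        # 2-MB page: the page-table lookup is omitted.
        return (pde & BASE_2M) | (linear & 0x1FFFFF)

    pte = read_entry(memory, pde & BASE_4K, pt_index)
    if not pte & PRESENT:
        raise LookupError("page-table entry not present")
    return (pte & BASE_4K) | (linear & 0xFFF)


# Example tables: one 2-MB page and one 4-KB page, both above 4 GB.
EXAMPLE_CR3 = 0x1000
EXAMPLE_MEMORY = {
    0x1008: 0x2000 | PRESENT,              # PDPT[1] -> page directory
    0x2000: 0x800200000 | PS | PRESENT,    # PD[0]   -> 2-MB page frame
    0x2008: 0x3000 | PRESENT,              # PD[1]   -> page table
    0x3028: 0x900004000 | PRESENT,         # PT[5]   -> 4-KB page frame
}

print(hex(translate(EXAMPLE_MEMORY, EXAMPLE_CR3, 0x40012345)))  # 0x800212345
print(hex(translate(EXAMPLE_MEMORY, EXAMPLE_CR3, 0x40205678)))  # 0x900004678
```

Every table base in the model is used directly as a physical address, matching the statement that the base addresses in the PDPT, PDE, and PTE are not themselves subject to paging translation.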

Table 1 -- Paging Structures for PAE


Table 2 -- Paging Structures for PSE


Table 3 -- Description of Paging Structure Fields



Caveats of PAE

New features are subject to caveats and bugs, and PAE is no exception. In this case, however, the caveats are understandable, and in some cases desirable:

Table 4 -- PAE/PSE page size precedence

CR4.PAE   CR4.PSE   PDE.PS   Page Size   # Address Bits
0         0         0        4 KB        32
0         0         1        4 KB        32
0         1         0        4 KB        32
0         1         1        4 MB        32
1         0         0        4 KB        36
1         0         1        2 MB        36
1         1         0        4 KB        36
1         1         1        2 MB        36
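The precedence in Table 4 reduces to a short decision function. The sketch below simply restates the table; the function name and return format are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def page_size(cr4_pae, cr4_pse, pde_ps):
    """Return (page size in bytes, physical address bits) per Table 4."""
    if cr4_pae:
        # With PAE enabled, CR4.PSE is ignored; PDE.PS selects 2-MB pages.
        return (2 * MB if pde_ps else 4 * KB, 36)
    if cr4_pse and pde_ps:
        # Without PAE, 4-MB pages need both CR4.PSE and PDE.PS.
        return (4 * MB, 32)
    return (4 * KB, 32)


for pae in (0, 1):
    for pse in (0, 1):
        for ps in (0, 1):
            print(pae, pse, ps, page_size(pae, pse, ps))
```

The first branch reflects the last four rows of the table, where setting CR4.PSE makes no difference once CR4.PAE is set.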


Conclusions

PAE may have been a feature that came before its time. In 1991, when it was secretly introduced to developers, very few computers needed anything even approaching 4 GB of physical memory. For whatever reason, Intel decided to remove this feature before the Pentium went into production. As with the Appendix H features from the Pentium Processor, not all documentation references to this feature were removed. Tracking down all of these references allowed me to write source code to test this feature months before Intel released Pentium Pro documentation; see source code examples.

In many ways, this new paging mode is more desirable than 4-MB pages. By allowing page sizes of 4 KB and 2 MB, the operating system has better control over paging large data structures. Page sizes of 4 KB are obviously too small for these large structures, and 4-MB pages are often too large. Therefore, 2-MB pages are a good compromise for efficiently paging large data structures. Unfortunately, since PAE wasn't introduced five years ago, its data structures aren't backward compatible. It will be much harder for operating-system developers to justify using this new feature, since it is one generation removed from other new paging features on the Pentium.


Source code examples

The following example demonstrates how to initialize and use 2-MB pages. This program will detect all of the reserved bits in the 2-MB paging structures.

View source code for PAE_RSVD.ASM:
http://www.rcollins.org/ftp/source/2mpages/pae_rsvd.asm
http://www.rcollins.org/ftp/source/2mpages/pagefns.asm
http://www.rcollins.org/ftp/source/2mpages/macros.inc
http://www.rcollins.org/ftp/source/2mpages/struct.inc
http://www.rcollins.org/ftp/source/2mpages/makefile

Download source code and executable archive:
http://www.rcollins.org/ftp/dloads/2mpages.zip

