This is an abbreviated version of an extensive article I prepared on 4MB page size extensions on the Pentium processor. At this time, I'm not prepared to release the full version, as I'm still negotiating with a magazine to publish the article. I apologize for the inconvenience and delay. However, I promised to release this information, and I am keeping that promise by releasing these excerpts instead of the full article.
For the past two, possibly three, years (depending on the release date of the i860 XP manual), Intel has required anybody needing information on 4MB pages to sign a 15-year NDA. During this time, the complete details of 4MB paging have been documented publicly in the i860 XP manual. This is most likely the result of one branch of Intel not knowing what the other branch is doing. In addition to the details in the i860 XP manual, many details regarding 4MB paging can be found in the Pentium manuals themselves. This serves to demonstrate how ludicrous and frivolous Appendix H really is. If you called Intel today and demanded details on 4MB paging, I'm sure they would still ask you to sign the 15-year NDA, regardless of the public disclosure in the i860 XP manual.
In the Pentium manuals, there are at least 9 references to 4MB pages.
The Pentium Family User's Manual, Volume 1 (P/N 241428) mentions 4MB pages
in sections 2.0, 3.7.2, and 3.7.4. Volume 3 refers to 4MB pages in sections
10.1.3, 11.3.3, 11.3.4, 16.5.3, 23.2.10.2, and 23.2.18.1. The Intel 860
XP processor documentation claims the i860 XP is page-level compatible
with the Intel386, Intel486, and Pentium processors. This compatibility
is noteworthy, as the i860 XP also supports 4MB pages, and its documentation
provides a complete description of the 4MB paging mechanism(1).
All that's needed to obtain an Appendix H description of 4MB pages is
a few references from the Pentium manuals and the description of 4MB pages
from the i860 XP manual.
With an understanding of the 4KB paging mechanism, it's not difficult to deduce the 4MB paging mechanism. Recall that each page directory entry controls 4MB of memory. Now imagine how Figure 111 would change if the page table lookup were eliminated. The page frame index would increase from 12 bits to 22 bits, thus allowing direct control of a 4MB page. The 20-bit pointer in the page directory would be reduced to a 10-bit pointer that points directly to the 4MB page frame of memory. This describes how 4MB pages are implemented in the i860 XP(1). But the question remains: are i860 XP 4MB pages compatible with Pentium 4MB pages? To answer that question, we need to compare the i860 XP and Pentium manuals.
The Pentium manual, volume 3, states that CR4.PSE enables page-size extensions and 4MB pages, but refers the reader to Appendix H for more information(4,5). Later in the Pentium manual, Intel shows that bit 7 of the page directory entry is the Page Size (PS) bit(3). Without CR4.PSE=1, the Pentium always uses Intel486-compatible (4KB) paging, regardless of the setting of the PDE.PS bit. Similarly, when CR4.PSE=1 and PDE.PS=0, the Pentium still uses 4KB pages. But when CR4.PSE=1 and PDE.PS=1, the Pentium uses an i860 XP-compatible 4MB page translation mechanism.
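The three cases above reduce to a single test on two bits. The following is a minimal sketch in C, not code from the article: the function name `page_size_for` is hypothetical, and the position of PSE as bit 4 of CR4 comes from general knowledge of the architecture rather than from the text here (the text only gives bit 7 of the page directory entry for PS).

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_PSE (1u << 4)   /* assumption: PSE is bit 4 of CR4 (not stated in the text) */
#define PDE_PS  (1u << 7)   /* Page Size bit: bit 7 of the page directory entry */

/* Hypothetical helper: returns the page size, in bytes, that the
 * processor would use for a page directory entry under the rules
 * described above. 4MB pages require BOTH CR4.PSE=1 and PDE.PS=1;
 * every other combination falls back to 4KB pages. */
static uint32_t page_size_for(uint32_t cr4, uint32_t pde)
{
    bool pse = (cr4 & CR4_PSE) != 0;
    bool ps  = (pde & PDE_PS) != 0;
    return (pse && ps) ? (4u << 20) : (4u << 10);
}
```

Calling `page_size_for(CR4_PSE, PDE_PS)` gives the 4MB size, while clearing either bit gives the 4KB size.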
The linear address for a 4MB page is converted to a physical address in much the same manner as for a 4KB page. In this case, however, the access to the page table is omitted. The high-order 10 bits of the linear address form an index into the page directory. The selected page directory entry no longer contains a 20-bit pointer to a page table, but instead contains a 10-bit pointer to the 4MB page frame of memory. This convention mandates that all 4MB pages reside on 4MB boundaries. The 10-bit pointer in the page directory entry is then combined with the low-order 22 bits of the linear address to form the 32-bit physical address.
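The bit arithmetic of this translation can be written out directly. This is only an illustrative sketch in C; the helper names (`pde_index`, `offset_4mb`, `phys_4mb`) are made up for this example and do not come from any Intel document or from the article's source code.

```c
#include <stdint.h>

/* High-order 10 bits of the linear address: index into the page directory. */
static uint32_t pde_index(uint32_t linear)
{
    return linear >> 22;
}

/* Low-order 22 bits of the linear address: offset within the 4MB page. */
static uint32_t offset_4mb(uint32_t linear)
{
    return linear & 0x003FFFFFu;
}

/* Combine the 10-bit page frame pointer held in the top bits of the
 * page directory entry with the 22-bit offset. Because the low 22 bits
 * of the frame address are always zero, every 4MB page sits on a 4MB
 * boundary. */
static uint32_t phys_4mb(uint32_t pde, uint32_t linear)
{
    return (pde & 0xFFC00000u) | offset_4mb(linear);
}
```

For example, with a page directory entry of 0x00800083 (frame at 8MB, PS set) and linear address 0x12345678, the directory index is 0x48 and the resulting physical address is 0x00B45678.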
Figure 1 shows a pictorial description of the 4MB
and 4KB page translation mechanisms. Given all of the officially documented
references to 4MB pages in the Pentium manuals, all one needs to complete
an understanding of 4MB pages is to study and understand this picture.
Ironically, the 1993 edition of the Pentium manual, volume 3, contained
a virtually identical picture(6). Intel obviously recognized
the significance of this pictorial representation of 4MB pages, and substantially
modified it in subsequent editions of the Pentium manual to remove the
visual representation of the 4MB paging mechanism.
Figure 1-- Page Translation for 4MB and 4KB Page Sizes
(Their existence is worth mentioning here. However, the details will be reserved for the magazine article.)
After formulating our understanding of 4MB paging, it should be quite straightforward to write characterization code which would confirm our hypothesis. To detect whether or not 4MB pages are implemented in Pentium as they are in the i860 XP, we could do the following:
The key to this technique is to read from one location in memory if 4MB pages work, but from another location if they don't (so we don't page fault). This approach is demonstrated in the source code listing 4MPAGES.ASM, which shows that 4MB pages work as described herein.
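The idea can be illustrated without touching real page tables. The sketch below is a software simulation in C and is not the 4MPAGES.ASM listing: the memory layout, the marker values, and the names `translate` and `probe` are all assumptions made for illustration. A single page directory entry is built so that it is valid under both interpretations, and the same linear address lands on a different marker depending on whether the PS bit is honored. Faults and present-bit checks are deliberately not modeled.

```c
#include <stdint.h>

#define PDE_PS   (1u << 7)      /* Page Size bit in the page directory entry */
#define P_RW     0x3u           /* present + writable */
#define PT_BASE  0x0000u        /* simulated page table (also 4MB frame 0) */
#define PD_BASE  0x1000u        /* simulated page directory */
#define MARK_4MB 0x4D4D4D4Du    /* stored where the 4MB mapping lands */
#define MARK_4KB 0x4B4B4B4Bu    /* stored where the 4KB mapping lands */

static uint32_t mem[0x4000 / 4];    /* 16KB of simulated physical memory */

static uint32_t rd(uint32_t phys)          { return mem[phys / 4]; }
static void wr(uint32_t phys, uint32_t v)  { mem[phys / 4] = v; }

/* Simulated page walk. pse_works stands in for "CR4.PSE=1 and the
 * processor implements 4MB pages as described". */
static uint32_t translate(uint32_t cr3, int pse_works, uint32_t linear)
{
    uint32_t pde = rd((cr3 & 0xFFFFF000u) + (linear >> 22) * 4);
    if (pse_works && (pde & PDE_PS))
        return (pde & 0xFFC00000u) | (linear & 0x003FFFFFu);
    uint32_t pte = rd((pde & 0xFFFFF000u) + ((linear >> 12) & 0x3FFu) * 4);
    return (pte & 0xFFFFF000u) | (linear & 0x00000FFFu);
}

/* Build the dual-purpose mapping and read linear address 0x2000.
 * As a 4MB entry, PDE 0 maps linear 0x2000 to physical 0x2000.
 * As a 4KB entry, PDE 0 points at the page table, whose entry 2
 * maps linear 0x2000 to physical 0x3000. Either way the read is valid. */
static uint32_t probe(int pse_works)
{
    wr(PT_BASE + 2 * 4, 0x3000u | P_RW);
    wr(PD_BASE, PT_BASE | PDE_PS | P_RW);
    wr(0x2000u, MARK_4MB);
    wr(0x3000u, MARK_4KB);
    return rd(translate(PD_BASE, pse_works, 0x2000u));
}
```

In this simulation, `probe(1)` returns `MARK_4MB` and `probe(0)` returns `MARK_4KB`. A real test would make the same comparison after loading CR3 and setting CR4.PSE, and neither outcome would require handling a page fault.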
Now that we have demonstrated that 4MB pages work as expected, we could
write more characterization code to prove other behavioral characteristics
of enabling CR4.PSE. Distributed with this article is source code to demonstrate
the page faulting behavior of PSE. Another program is included to detect
the TLB size and associativity. Finally, another program will demonstrate
that writing any values to CR4.PSE will not invalidate the TLB.
Since the Pentium was introduced, Intel has withheld the architectural
details of 4MB pages. Only by signing a 15-year NDA would you be given
access to the documents that describe their implementation and use. The
earliest Pentium manuals documented enough details of 4MB pages to allow
anyone to reverse-engineer the details. As newer Pentium manuals were introduced,
Intel removed the most expository details. Unknown to most people outside
of Intel, the complete implementation details are documented in the i860
XP data sheet, which is readily available with no NDA required.
The following examples are available for viewing and download.
View source code for 4MPAGES.ASM:
http://www.rcollins.org/ftp/source/4mpages/4mpages.asm
Download source code and executable archive:
http://www.rcollins.org/ftp/dloads/4mpages.zip