One and a half years ago, I was contacted by Intels book publisher to see if I would consider providing technical reviews of Intels upcoming programmers reference manual. For the preceding 12 months, I had publicly complained that the technical quality of Intels microprocessor manuals had deteriorated to the point that they couldnt be relied upon. Judging by the quality of the content, it seemed obvious that nobody of technical significance inside of Intel was reviewing the manuals (which were being written by college interns) before dumping them on the engineering community. Consequently, when Intels publishing Arm contacted me, I viewed the request as a positive indication that Intel wanted to rehabilitate their image. I gladly accepted the offer, feeling I was being allowed to participate in helping to fix Intels problem, instead of being an antagonist. It was akin to voting in an election. Unless you participate when asked, you dont have a right to complain about the outcome.
Upon receiving the manuals, my two primary concerns were whether or not they were accurate, and whether or not Intel was still trying to hide information. As I reviewed the manuscripts, it became clear that they were written pretty much like previous Intel manuals. They contained some inaccuracies and Intel omitted some important information. Some of the faulty areas concerned information that would benefit Intels competitors. Therefore, it's hard to believe those errors were not intentional. My most important observation didnt concern what was in the manuals it concerned what was missing.
After my (very) public disclosure of Intel Virtual Mode Extensions (VME) in November 1995 (see my January 1998 "Undocumented Corner" column), I assumed Intel would publish all of the missing details. Intel had already published some of the VME details in its Pentium Pro manuals. However, some of the most important details were still omitted.
Unfortunately, the missing information isnt only important to Intels competitors, but to every engineer who needs to write kernel-level code for Intel processors. Intels competitors can easily reverse engineer this information; but Joe Engineer doesnt have the time or means to do so and may not know these features exist.
Intel still hasnt released the manuals I reviewed, nor have they released the remainder of the VME details. Furthermore, the programmers reference manual due since July 1997 still hasnt been released.
In my last column (DDJ, January 1998), I began a discussion of VME, mentioning that even though Intel had shrouded details of VME in a 15-year nondisclosure agreement (NDA), publicly available Intel manuals included at least 27 references to VME. I explained that the original v86 mode was crippled by poor performance when executing legacy DOS applications. To increase the performance of DOS boxes running in a protected-mode operating system (like Windows), Intel created VME. Finally, I discussed how VME was designed to fix these performance problems, and give DOS boxes approximately 20 percent better performance than their non-VME counterparts.
In this column, Ill continue this discussion of VME , starting with a description of the various components of VME and how they work together.
VME support is enabled and disabled by setting and clearing the VME bit in Control Register 4 (CR4, bit 0). The EFLAGS register has been enhanced with two new bits: a Virtual Interrupt Flag (VIF), and a Virtual Interrupt Pending flag (VIP). When CR4.VME=1, the microprocessor uses new microcode to process the Interrupt-Flag-sensitive (IF-sensitive) instructions CLI, STI, PUSHF, POPF, INT-n, and IRET.
When enabled and running at I/O Permission Level three (IOPL=3), all INT-n instructions are controlled by a new structure that resides inside of the Task State Segment (TSS). The new structure is called the interrupt redirection bitmap. Interrupt redirection always occurs when CR4.VME=1. This behavior does not change when EFLAGS. IOPL does not equal three (IOPL<3). When running at IOPL<3, in addition to the enhanced INT-n behavior, IF-sensitive instruction are allowed to execute without faulting to the Ev86 monitor.
The TSS has been extended to include a 32-byte interrupt redirection bitmap. 32 bytes are exactly 256 bits (one bit for each software interrupt), which can be invoked via the INT-n instruction. This bitmap resides immediately below the I/O permission bitmap (see Figure 1). The definition of the I/O base field in the TSS is therefore extended and now has dual meaning. Not only does the I/O base field point to the base of the I/O permission bitmap, but also to the end (tail) of the interrupt redirection bitmap. This structure behaves exactly like the I/O permission bitmap, except that it controls software interrupts. Hardware interrupt (IRQs) and CPU-generated exceptions are not influenced by the interrupt redirection bitmap. When an INT-n instruction occurs inside of an Enhanced v86 mode task (Ev86 task using VME), and its corresponding bit is set in the interrupt redirection bitmap, an interrupt will fault to the Ev86 monitor just as a standard v86 mode task does in the 80386 and 80486. When the interrupts corresponding bit is clear, the Ev86 task will service the interrupt without ever leaving Ev86 mode as if the interrupt occurred in native real mode.
Figure 1 - Interrupt redirection bit map in the TSS
Two new flags EFLAGS.VIF and EFLAGS.VIP were added to the EFLAGS register. These flags are intended for use when the IOPL of the Ev86 task is less than three (IOPL<3). These flags can only be purposely modified by the CPL-0 (kernel mode), Ev86 monitor, or an interrupt service routine running in kernel mode.
EFLAGS.VIF is a virtualized version of the standard interrupt flag (EFLAGS.IF). While the Ev86 task is running, CLI and STI instructions will not modify the actual EFLAGS.IF; instead, these instructions modify EFLAGS.VIF. The only exception to this rule occurs when EFLAGS.VIP=1. When EFLAGS.VIP=1, STI still faults to the Ev86 monitor. The actual value of EFLAGS.IF is completely hidden from the Ev86 task, since PUSHF, POPF, INT-n, and IRET have also been modified to help hide the behavior.
The EFLAGS.VIP flag is a Virtual Interrupt Pending flag. EFLAGS.VIP can assist the Multitasking Operating System (MTOS) in sending virtual interrupts to the Ev86 task. The easiest way to understand the VIP flag is to explain its use in the context of a program running on an 8086. When the 8086 is in an uninterruptible state (FLAGS.IF=0), external interrupts remain pending but dont get serviced. After FLAGS.IF is set (because of an STI, POPF, or IRET instruction), the pending interrupt is serviced by the CPU. EFLAGS.VIF and EFLAGS.VIP are intended to serve this same purpose to the MTOS running an Ev86 task. Lets assume your Ev86 task was executing at the same uninterruptible point as the previous 8086 example. A timer-tick interrupt occurs, and the MTOS services the interrupt. During the interrupt service routine, the MTOS decides that the Ev86 task needs to service this timer tick, and sets EFLAGS.VIP=1. After returning, the Ev86 task is still in an uninterruptible state (EFLAGS.VIF=0). At some later time, the Ev86 task attempts to set EFLAGS.IF (via STI, POPF, or IRET). When this happens, the Ev86 task becomes interruptible, and a general protection fault to the monitor immediately occurs (#GP(0)). (There are many caveats associated with this behavior. Those caveats will be discussed in my next column.)
To support the new EFLAGS.VIF and EFLAGS.VIP flags, changes were needed to the instructions that read and write the interrupt flag of the EFLAGS register. CLI, STI, PUSHF, POPF, INT-n, and IRET all had to be changed to support Ev86 mode.
When an Ev86 task is running at IOPL<3, CLI and STI clear and set EFLAGS.VIF, instead of faulting to the Ev86 monitor or affecting EFLAGS.IF. As mentioned earlier, the exception to this rule occurs when EFLAGS.VIP=1.
PUSHF copies the contents of the EFLAGS.VIF flag to the FLAGS.IF position as it pushes the FLAGS image onto the stack. This gives the appearance to the Ev86 task that STI and CLI are really setting and clearing EFLAGS.IF. This appearance is necessary in case the software attempts to check for this condition. Listing One is a code sequence that tests for the interrupt flag. In addition to moving EFLAGS.VIF to the FLAGS.IF location on the stack image, PUSHF always pushes an IOPL image of three onto the stack. It is important to remember that the Pentiums IF-sensitive instructions behave identically to the Inte1486 when IOPL=3, even when CR4.VME=1. Therefore, PUSHF simulates an IOPL=3 to any software wishing to read the stack image to determine its IOPL. The actual IOPL of the Ev86 task never changes during this process.
Listing 1 -- Code demonstrating how software tests for the IF flag
STI ; Enable interrupts PUSHF ; Store FLAGS on stack POP AX ; Restore flags into register TEST AX,200h ; Interrupt flag set? Jcc label ; Jump on condition
POPF works in a similar manner: It copies the bit in the FLAGS.IF position to EFLAGS.VIF position as it pops the FLAGS image from the stack. The Pentium is careful to make sure that the faked IF and IOPL arent accidentally copied into the real IOPL during the POPF operation. Before the FLAGS image is merged into the EFLAGS register, the IF image is copied to the EFLAGS.VIF slot, and the FLAGS.IF and FLAGS.IOPL images are cleared. (All of the FLAGS register bits are cleared, except the actual IF and IOPL). Finally, the filtered FLAGS image is merged with the actual EFLAGS register. A side effect of POPF is its handling of the Trap Flag (TF) in the FLAGS stack image. If the FLAGS.TF bit on the stack image is set, then POPF causes a general protection fault before any FLAGS values are modified (#GP(0)).
The IRET instruction behaves exactly as the POPF instruction does with respect to EFLAGS.IF, EFLAGS.VIF, and EFLAGS.IOPL. IRET and POPF differ in how they handle the trap flag from the stack image. If FLAGS.TF is set in the FLAGS stack image during POPF, a #GP(0) occurs; yet for IRET, the #GP does not occur.
The INT-n instruction is the most complicated of the IOPL-sensitive instructions. INT-n behaves exactly like PUSHF in the way it handles EFIAGS.IF, EFLAGS.VIF, and EFLAGS.IOPL (provided the interrupt instruction has been redirected to the Ev86 task via the interrupt redirection bitmap). However, one of the enhancements to Ev86 mode is the ability of the Ev86 task to execute software interrupts without leaving Ev86 mode. This enhancement has been accomplished with the aid of the interrupt redirection bitmap in the TSS. When the corresponding IR bit is set, the interrupt will be invoked in exactly the same manner as a normal v86 task, by faulting to the monitor. When the corresponding bit is clear, the interrupt handler is invoked as if it were executing on an 8086 processor. In other words, a fault to the monitor is never generated, nor is a transition to the protected mode interrupt handler. The interrupt transition and return are done entirely within the Ev86 task. The influence of the IR bitmap is best described by the pseudocode in Listing Two.
Listing 2 - Interrupt handling description in Ev86 mode
N = INTERRUPT_NUMBER; INTERRUPT_BIT_MAP_PTR = TSS_BASE->IO_PERMISSION_BASE - 32; IF INTERRUPT_BIT_MAP_PTR->BIT_NUMBER[N] IF (IOPL<3) #GP(0); ELSE GOTO INT-FROM-V86-MODE; ELSE INVOKE_REAL_MODE_STYLE_INTERRUPT_FROM_Ev86_TASK(N);
To demonstrate how to implement and use VME, Ive included some source-code examples that demonstrate how interrupt redirection works with and without VME. Table 1 describes the code, which demonstrates the following characteristics of VME:
The virtual mode extensions are useful to memory managers and multitasking operating systems. Memory managers can primarily benefit by the use of the interrupt redirection bitmap to reduce the number of switches to and from protected mode. This has the added benefit of reducing the complexity of the interrupt service routines, as they no longer need to reflect software interrupts back to the v86 task.
Multitasking operating systems benefit from interrupt redirection and from the virtual interrupt support. The MTOS would run with virtual mode extensions enabled, and the Ev86 tasks running at IOPL<3. This gives an MTOS full benefit of the virtualization of interrupts. When the MTOS wishes to send a virtual interrupt (like a virtual timer tick) to an uninterruptible Ev86 task, it will do so by setting EFLAGS.VIP=1. When the task becomes interruptible, a general protection fault occurs, and the MTOS will send the virtual interrupt to the Ev86 task. This would give programs that are timer dependent (such as games) a significant performance advantage. As an added benefit of using the vitalization features of the CPU, more complexity can be removed from the Ev86 monitor. The result of using these new features is an Ev86 monitor that is simpler to implement and maintain than its nonEv86 counterpart.
As I mentioned, Intel has kept many of these details secret. Some parts of VME are still undocumented, and will likely remain so. All of the CPU microcode algorithms describing CLI, STI, PUSHF, POPF, INT-n, and IRET are not documented in any of Intels manuals. Even though these manuals describe the microcode algorithms used by all other instructions, Intel has asserted its right as the intellectual property owner by withholding this information. Intels competitors no longer need this information theyve already cloned it. Instead, withholding this information hurts you and me, and every engineer who can benefit from this knowledge.
In my next column, Ill disclose these algorithms and discuss the many caveats of Intels VME implementation. Ill show how microcode has been added to implement VME. During that discussion, Ill show an interesting bug in Intels microcode implementation. Accompanying the discussion, Ill demonstrate how much information about VME was haphazardly left in the manuals after Intel tried to remove it when creating the infamous Appendix-H (see my November 1997 column for a description of Appendix-H). By extracting these quotes and rearranging them, I'll let Intel their own story about why VME was needed and how it was implemented. All this comes free for the reading, without an NDA.
The following examples are available for viewing and download.
The first eight examples demonstrate how to reflect an interrupt back to the (E)v86 task in various processor environments.
DPL INTR VME IOPL IGATE BITMAP VME_1.ASM 0 3 0 X VME_2.ASM 0 2 X X VME_3.ASM 0 3 3 X VME_4.ASM 1 3 0 1 VME_5.ASM 1 2 X 1 VME_6.ASM 1 3 3 1 VME_7.ASM 1 3 X 0 VME_8.ASM 1 2 X 0
View source code for VME_1.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_1.asm
http://www.rcollins.org/ftp/source/vme1/struct.inc
http://www.rcollins.org/ftp/source/vme1/macros.inc
http://www.rcollins.org/ftp/source/vme1/sysseg.inc
http://www.rcollins.org/ftp/source/vme1/dataseg.inc
View source code for VME_2.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_2.asm
View source code for VME_3.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_3.asm
View source code for VME_4.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_4.asm
View source code for VME_5.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_5.asm
View source code for VME_6.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_6.asm
View source code for VME_7.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_7.asm
View source code for VME_8.ASM:
http://www.rcollins.org/ftp/source/vme1/vme_8.asm
Download all eight examples and
executables:
http://www.rcollins.org/ftp/dloads/vme1.zip