In the January 1997 "Undocumented Corner," I presented a brief prehistory and overview of System Management Mode (SMM), and compared 80386 ICE mode with the Pentium's SMM. As that column demonstrated, ICE mode and SMM have many similarities and many differences. Even though Intel did document SMM in its Pentium manuals, it skipped a few things: the secrets of System Management Mode. This column discloses some of those secrets. Specifically, I will discuss the state save map, show how the AutoHALT feature works, explain the I/O Restart feature (and how it can restart a string I/O operation from the beginning), and discuss interrupt servicing within SMM.

Keep in mind that the information presented in this column is highly implementation dependent. Intel offers no guarantee that this behavior will exist in any future processors, or in future steppings of the same processor. Therefore, it would be inappropriate to use any of these secrets in production code.
The Secrets of the State Save Map

As mentioned in my previous column, SMM's state save map is nearly identical to the memory image used by the undocumented LOADALL instruction (see http://www.rcollins.org/articles/loadall for a description of the LOADALL instruction). In the Pentium Processor Family Developer's Manual, Volume 3 (Intel part number 241430), Table 20-2 presents the SMRAM State Save Map. Many parts of this map are designated "Reserved." You are warned not to modify any of these locations, lest unpredictable microprocessor behavior result. Oftentimes, those words are Intel technobabble that means "the microprocessor behavior is fully defined; we just don't want to tell you what it is." This case appears to be no different.

Table 1 shows the entire SMM State Save Map, including all of the undocumented locations. The undocumented locations can be subdivided into four categories: undocumented registers, descriptor cache, I/O restart, and unwritten. The unwritten category is self-explanatory: these locations are never written by the Pentium processor.

Undocumented Registers

The only undocumented registers are CR4 in location 7F28, RSM Control in 7F26, and the Alternate DR6 register in location 7F24. The CR4 register contains control bits that enable many Pentium features, such as Virtual Mode Extensions, Protected Virtual Interrupts, and 4-MB pages (see http://www.rcollins.org/articles/vme1, http://www.rcollins.org/articles/pvi1, and DDJ May 1996, or http://www.rcollins.org/ddj/May96 for a description of these features). If CR4 were not stored in the state save map, it would be impossible to restore the Pentium processor to all of its operating environments.

The Alternate DR6 register is controlled by the RSM Control register. When RSM_CTL[bit0] = 1, the least significant word of DR6 (the lower 16 bits) is loaded from the ALT_DR6 slot instead of the normal DR6 slot at 7FCC. The remaining 15 bits in RSM Control appear to serve no other measurable purpose.
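As a rough illustration, the DR6 selection just described can be modeled in C. This is only a sketch of the observed behavior, not Intel's microcode: the structure and function names are hypothetical, the slot offsets in the comments are the ones given above, and the sketch assumes that the upper 16 bits of DR6 always come from the normal slot.

```c
#include <stdint.h>

/* Hypothetical model of the three state save map slots involved.
   The offsets in the comments are the ones quoted in the text. */
struct smm_dr6_slots {
    uint32_t dr6;      /* normal DR6 slot, 7FCC */
    uint16_t alt_dr6;  /* Alternate DR6 slot, 7F24 */
    uint16_t rsm_ctl;  /* RSM Control slot, 7F26 */
};

/* Returns the DR6 value that RSM would load under this model.
   Assumption: only the low word is replaced when RSM_CTL[bit0] = 1;
   the high word is taken from the normal slot in both cases. */
uint32_t rsm_select_dr6(const struct smm_dr6_slots *s)
{
    if (s->rsm_ctl & 0x0001u) {
        return (s->dr6 & 0xFFFF0000u) | (uint32_t)s->alt_dr6;
    }
    return s->dr6;
}
```

With the normal slot holding FFFF0FF0 and ALT_DR6 holding 4001, the model yields FFFF0FF0 when RSM_CTL[bit0] is clear and FFFF4001 when it is set.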
I don't know why this alternate version of DR6 exists, and I strongly recommend that nobody rely on it existing in future versions of SMM, or even in different steppings of the Pentium processor.

Descriptor Cache Slots

The descriptor caches contain the microprocessor's internal form of each segment register (DS, CS, and so on) and the system registers (GDT, IDT, LDT, and TR). The registers that we normally call segment registers (CS, DS, and so on) are merely user-visible registers that don't have any real effect on the internal operations of the microprocessor. Whether in real mode, protected mode, or virtual 8086 mode, all microprocessor operations that use segments are controlled by the values in the descriptor cache registers, not by the user-visible segment registers. Each time a segment register or system register is loaded, the microprocessor loads the appropriate descriptor cache register.

Each descriptor cache is composed of three fields: a base address, a limit, and segment access rights. (Example 1 shows the format of the descriptor cache registers.) Some of these fields are read from descriptor tables (for example, when loading a segment register in protected mode), some are calculated (for instance, the segment register base address in real mode), some are ignored (for example, the segment access rights are not changed in real mode), and others are given hard-coded values (for example, the access rights when loading GDT and IDT).

Whenever a field in the descriptor cache register is modified, it has an immediate effect on microprocessor operations. However, there are only two ways to modify an individual field in the descriptor cache registers. The first method is to modify any of the descriptor cache slots in SMM's state save map. Upon execution of the RSM instruction, the new values take effect immediately. The second method requires using an in-circuit emulator (ICE) to modify the fields.
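To make the three-field layout concrete, here is a minimal C sketch of a descriptor cache entry and of a real-mode segment load. The field order follows Example 1; the C names and the helper functions are hypothetical, and the limit check is simplified to expand-up segments. The sketch leaves the limit untouched on a real-mode load, along with the access rights.

```c
#include <stdint.h>

/* Hypothetical C view of one descriptor cache entry.
   Field order follows Example 1 (_Limit, _Addr, _Type). */
struct desc_cache {
    uint32_t limit;  /* segment limit */
    uint32_t base;   /* segment base address */
    uint32_t type;   /* segment access rights */
};

/* Real-mode segment load: the base is calculated from the selector,
   while the access rights (and, in this sketch, the limit) are left alone. */
void real_mode_load(struct desc_cache *dc, uint16_t selector)
{
    dc->base = (uint32_t)selector << 4;
}

/* Memory references use the cached base, not the visible selector. */
uint32_t linear_address(const struct desc_cache *dc, uint32_t offset)
{
    return dc->base + offset;
}

/* Simplified limit check (expand-up segments only): returns 1 when an
   access of `size` bytes at `offset` lies within the cached limit. */
int offset_in_limit(const struct desc_cache *dc, uint32_t offset, uint32_t size)
{
    return size != 0 && offset <= dc->limit && size - 1 <= dc->limit - offset;
}
```

Loading selector B800 into an entry whose limit is FFFF gives a base of B8000 and leaves the limit and access rights as they were, so a two-byte access at offset FFFF fails the check while a one-byte access passes.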
Beware of modifying the descriptor cache contents to illegal values, such as values that would be impossible to achieve through any programmatic means. (See the article at http://www.rcollins.org/Productivity/DescriptorCache.html for a detailed description of the descriptor cache contents and source code examples of changing the various fields to illegal values.) For example, you can modify the segment limit to 0xFFFEFF, a value that can't be programmed by any other method. Likewise, the CS access rights can be changed to readable and writable in protected mode.

AutoHALT

The AutoHALT Restart feature of SMM is intended to give the systems designer the choice of whether or not to return to a HALT state after the return from SMM. When the microprocessor is in the HALT state upon entrance to SMM, a flag is set in the AutoHALT field of the state save map (offset 0x7F02). When AutoHALT[bit0]=1, SMM was entered from the HALT state. If this bit is cleared upon exit (AutoHALT[bit0]=0), the microprocessor will continue execution at the instruction following the HLT instruction. (See Table 2 for a list of possible entry and exit values for the AutoHALT field.)

In general, the AutoHALT field directs the microprocessor whether or not to restart the HLT instruction upon exit from SMM. This is accomplished by decrementing EIP and executing whatever instruction resides at that position. AutoHALT restart behavior is consistent, regardless of whether or not EIP-1 contains a HLT instruction. If the SMM handler sets AutoHALT[bit0]=1 when the interrupted instruction was not a HLT instruction (AutoHALT[bit0]=0 upon entrance), it runs the risk of resuming execution at an undesired location. The RSM microcode doesn't know the length of the interrupted instruction. Therefore, when AutoHALT[bit0]=1 upon exit, the RSM microcode blindly decrements the EIP register by 1 and resumes execution.
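The hazard can be sketched in a few lines of C. This is a model of the behavior described above, not the actual microcode: the function names are hypothetical, and the sketch tests only bit 0 of the AutoHALT field, as the prose describes (Listing One tests the whole 16-bit field).

```c
#include <stdint.h>

#define OPCODE_HLT 0xF4u  /* single-byte HLT opcode */

/* Model of the AutoHALT step at RSM: when bit 0 of the AutoHALT field
   is set on exit, EIP is decremented by exactly one byte. */
uint32_t rsm_autohalt_eip(uint32_t saved_eip, uint16_t autohalt)
{
    if (autohalt & 0x0001u) {
        return saved_eip - 1;
    }
    return saved_eip;
}

/* Returns 1 if the byte at the resume address is a HLT opcode.
   `code` is a hypothetical image of the interrupted code segment. */
int resumes_on_hlt(const uint8_t *code, uint32_t resume_eip)
{
    return code[resume_eip] == OPCODE_HLT;
}
```

For the byte sequence 90 F4 90 with a saved EIP of 2 (the instruction following HLT), setting the bit resumes at offset 1, which is the HLT itself. For the sequence 66 EF 90 (a prefixed OUT) with the same saved EIP, setting the bit resumes at the EF byte in the middle of the instruction.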
This explains why Intel warns that unpredictable behavior may result from setting this field to restart a HLT instruction when the microprocessor wasn't in a HALT state upon entrance. Listing One presents an algorithm that describes the AutoHALT Restart feature.

I/O Restart

There are a few reserved fields in the state save map that provide support for the I/O Restart feature. The I/O Restart feature is intended to restart an I/O operation, such as an OUT or IN instruction. But this task isn't as easy as it sounds. In their DX-register forms, the OUT and IN instructions are single-byte instructions. However, when the source or destination register is a 32-bit register in 16-bit code, an operand-size override prefix is prepended to the normal opcode to create a two-byte instruction. Next, consider a string operation like REP OUTS BYTE PTR CS:[SI]. In this case, there is a repeat prefix (REP) and a CS override, thus adding two bytes to the single-byte opcode. The extra byte(s) substantially complicate the restart process: the instruction pointer can't be decremented by a fixed value, as in the case of the AutoHALT Restart feature.

To overcome the variable-length opcode problem, Intel added four fields to the state save map to aid in restarting I/O operations. Whenever an I/O instruction is executed, the Pentium stores the values of ECX, ESI, EDI, and EIP in temporary (internal) registers. These temporary registers seem to retain their contents even when hundreds or even thousands of other instructions precede the entrance to SMM. Once an SMI# is triggered, the Pentium stores the contents of these temporary registers to slots reserved for their use. ECX, ESI, EDI, and EIP are stored in the state save map slots at locations 7F08, 7F0C, 7F04, and 7F10, respectively. After completion of the SMM handler, the RSM instruction doesn't know whether or not to restart an I/O operation without being told to do so. This is the purpose of the I/O Restart field in the state save map.
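The save-and-restore flow for these four slots can be sketched in C as follows. This is an illustrative model only: the type and function names are hypothetical, the slot offsets in the comments are the ones given above, and the restore condition (a TR12 bit plus a nonzero I/O Restart field) is taken from Listing Two rather than from any Intel documentation.

```c
#include <stdint.h>

/* The registers that take part in I/O restart. */
struct io_regs {
    uint32_t ecx, esi, edi, eip;
};

/* Hypothetical view of the four undocumented slots. */
struct io_restart_slots {
    uint32_t edi;  /* 7F04 */
    uint32_t ecx;  /* 7F08 */
    uint32_t esi;  /* 7F0C */
    uint32_t eip;  /* 7F10 */
};

/* When an I/O instruction begins, the values are latched internally. */
void latch_io_state(struct io_regs *temps, const struct io_regs *cpu)
{
    *temps = *cpu;
}

/* On SMI#, the latched values are written to the state save map. */
void smi_save_io_state(struct io_restart_slots *s, const struct io_regs *temps)
{
    s->edi = temps->edi;
    s->ecx = temps->ecx;
    s->esi = temps->esi;
    s->eip = temps->eip;
}

/* On RSM, the slots replace the live registers only when restart is
   requested (condition as in Listing Two). Returns 1 if restarted. */
int rsm_io_restart(struct io_regs *cpu, const struct io_restart_slots *s,
                   uint32_t tr12, uint16_t io_restart)
{
    if ((tr12 & 0x200u) && (io_restart & 0xFFu)) {
        cpu->edi = s->edi;
        cpu->ecx = s->ecx;
        cpu->esi = s->esi;
        cpu->eip = s->eip;
        return 1;
    }
    return 0;
}
```

Because EIP is saved whole rather than adjusted by a fixed amount, the model restarts correctly no matter how many prefix bytes the I/O instruction carries.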
When any bit in the IORestart field is set, the RSM microcode uses these undocumented fields as the restoration values for ECX, ESI, EDI, and EIP. For string operations that use the REP prefix, the operation is restarted from the very beginning using the initial values of ECX, ESI, and EDI. Listing Two shows how the I/O Restart operation behaves.

NMI or INIT from within SMM

Upon entrance to SMM, interrupts are disabled (EFLAGS.IF=0) and both NMI and INIT are disabled. The IDT register has not been changed, and retains whatever value it had before SMM entrance. Before servicing any interrupts, it is necessary to load your own interrupt vectors, and most likely to reload the IDT register with a new value. Once you issue the STI instruction, you're ready to begin servicing interrupts. However, two asynchronous interrupt pins remain disabled: NMI and INIT.

On one occasion, I needed to write an SMM handler that was capable of servicing non-maskable interrupts (NMIs). I read the appropriate Intel manuals and wrote my SMM and NMI handlers in accordance with their (ambiguous) recommendations. After writing the NMI handler, I decided to test it by generating an NMI from within SMM. Much to my surprise, the NMI handler was never called until I returned from SMM. Either the Pentium manuals were wrong, or my interpretation of them was wrong. The Pentium Processor Family Developer's Manual, Volume 3 describes NMI recognition within SMM in the following manner:
This statement is highly ambiguous, and is open to at least three interpretations. However, the Pentium Processor Specification Update (Intel part number 242480), P54C erratum #14, is much more specific:
This is exactly what I had done, but it didn't work. Therefore, I decided to set up some tests to determine the exact circumstances in which NMI is unmasked within SMM. After collecting my results, I found that the Pentium documentation is completely wrong, as NMI isn't unmasked under any of the circumstances described therein. Therefore, I contacted Intel's technical support department for further clarification (I didn't tell them that I already knew the answer). I asked for a specific example describing how to unmask NMI from within SMM, and got the following response:
The problem, as I saw it, was that everybody was wrong. The Pentium documentation was wrong; the Pentium errata (P54C erratum #14), which normally provide very accurate workarounds for specific anomalies, were wrong; and now Intel's tech support was wrong. All sources were giving consistent solutions to this problem, but the solution didn't even remotely match the behavior I had observed.

My discoveries showed that most interrupt conditions don't unmask NMI and INIT, but I found a few cases that do. Table 3 is a list of conditions that do not unmask NMI and INIT during SMM. Table 4 is a list of conditions that do unmask NMI and INIT. As you can see from these tables, unmasking NMI and INIT from within SMM doesn't behave as documented, nor does it appear to follow any consistent methodology. For example, if BOUND (exception taken) unmasked NMI/INIT, and BOUND (exception not taken) didn't, why didn't INTO (exception taken) unmask NMI/INIT also? Most disappointingly, the most obvious examples of a "dummy interrupt routine" failed to provide the documented behavior.

Conclusion

If you're planning to write your own SMM handler, I hope you've learned something in this column that will give you insight while writing your own code. I wouldn't rely on any Pentium-specific behavior. On the contrary, I would stay clear of any implementation-specific usage. Instead, I would learn from the undocumented behavior and apply that knowledge to debugging efforts. A good grasp of Intel processor internals can, if nothing else, substantially increase productivity.

In my next column, I'll continue my SMM discussion by covering the many caveats of SMM. These caveats are things that every SMM programmer should know before beginning to code; knowing them might save you many hours of debugging code that appears to be written perfectly.
Listing 3 - Logic Analyzer Trace of SMM/GDT Shutdown

8 Oct 1996 06:26    DAS 92A96SD-1 Disasm    GDT Shutdown after RSM    Page 1

Sequence  Address   Data      Mnemonic                        Timestamp
----------------------------------------------------------------------------
0         00038048  665A66EF  OUT DX,AX            SMM(16)
          00038049  665A66EF  POP EDX              SMM(16)
          0003804B  665A66EF  POP EAX              SMM(16)
          0003804D  00AA0F58  RSM                  SMM(16)
[ ]
; GDT Descriptor Cache from the SMM State Save Map
32        0003FF88  00015BA0  ( MEM READ )         SMM        240 ns
33        0003FF8C  00000002  ( MEM READ )         SMM        230 ns
34        0003FF84  0000001F  ( MEM READ )         SMM        230 ns
[ ]
60        00016240  100135EA  JMPL 0010:0135       (16)       500 ns
[ ]
64        00015C48  0010017F  ( SEGMENT OVERRUN )  (13)       720 ns
          00015C4C  00008600  ( SEGMENT OVERRUN )  (13)
65        00015C20  0010017A  ( DOUBLE FAULT )     (8)        740 ns
          00015C24  00008600  ( DOUBLE FAULT )     (8)
66        00000000  ------4F  ( SHUTDOWN )                    490 ns
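Example 1 - Descriptor Cache Structure

Desc_cache  STRUC
    _Limit  dd  ?       ; segment limit
    _Addr   dd  ?       ; segment base address
    _Type   dd  ?       ; segment access rights
Desc_cache  ENDS

Listing 1 - AutoHALT Restart Operation

if (AutoHALT & 0xFFFF) {
    EIP = EIP - 1;      /* back up one byte to re-execute HLT */
    return;
}

Listing 2 - I/O Restart Operation

if ((TR12 & 0x200) && (IORestart & 0xFF)) {
    EDI = REP_EDI;      /* restore the values saved when the */
    ECX = REP_ECX;      /* I/O instruction first executed */
    ESI = REP_ESI;
    EIP = IORestart_EIP;
    return;
}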