x64dbg/bin/mnemdb.json


{
"__github_x86-64": "https://github.com/nologic/idaref/blob/master/x86-64.sql",
"__license_x86-64": "GPLv2",
"_github_x86-64-brief": "https://github.com/radareorg/radare2/blob/c4d416c7b96d2735c24a2f9e2787df3fdb764c71/libr/asm/d/x86.sdb.txt",
"_license_x86-64-brief": "GPLv3",
"x86-64": [
{
"description": "AAA-ASCII Adjust After Addition\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n37 AAA NP Invalid Valid ASCII adjust AL after addition.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nAdjusts the sum of two unpacked BCD values to create an unpacked BCD result. The AL register is the implied\r\nsource and destination operand for this instruction. The AAA instruction is only useful when it follows an ADD\r\ninstruction that adds (binary addition) two unpacked BCD values and stores a byte result in the AL register. The\r\nAAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result.\r\nIf the addition produces a decimal carry, the AH register increments by 1, and the CF and AF flags are set. If there\r\nwas no decimal carry, the CF and AF flags are cleared and the AH register is unchanged. In either case, bits 4\r\nthrough 7 of the AL register are set to 0.\r\nThis instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-Bit Mode\r\n THEN\r\n #UD;\r\n ELSE\r\n IF ((AL AND 0FH) > 9) or (AF = 1)\r\n THEN\r\n AX <- AX + 106H;\r\n AF <- 1;\r\n CF <- 1;\r\n ELSE\r\n AF <- 0;\r\n CF <- 0;\r\n FI;\r\n AL <- AL AND 0FH;\r\nFI;\r\n\r\nFlags Affected\r\nThe AF and CF flags are set to 1 if the adjustment results in a decimal carry; otherwise they are set to 0. The OF,\r\nSF, ZF, and PF flags are undefined.\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n\r\n\r\n\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AAA"
},
{
"description": "AAD-ASCII Adjust AX Before Division\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\nD5 0A AAD NP Invalid Valid ASCII adjust AX before division.\r\nD5 ib AAD imm8 NP Invalid Valid Adjust AX before division to number base\r\n imm8.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nAdjusts two unpacked BCD digits (the least-significant digit in the AL register and the most-significant digit in the\r\nAH register) so that a division operation performed on the result will yield a correct unpacked BCD value. The AAD\r\ninstruction is only useful when it precedes a DIV instruction that divides (binary division) the adjusted value in the\r\nAX register by an unpacked BCD value.\r\nThe AAD instruction sets the value in the AL register to (AL + (10 * AH)), and then clears the AH register to 00H.\r\nThe value in the AX register is then equal to the binary equivalent of the original unpacked two-digit (base 10)\r\nnumber in registers AH and AL.\r\nThe generalized version of this instruction allows adjustment of two unpacked digits of any number base (see the\r\n\"Operation\" section below), by setting the imm8 byte to the selected number base (for example, 08H for octal, 0AH\r\nfor decimal, or 0CH for base 12 numbers). The AAD mnemonic is interpreted by all assemblers to mean adjust\r\nASCII (base 10) values. To adjust values in another number base, the instruction must be hand coded in machine\r\ncode (D5 imm8).\r\nThis instruction executes as described in compatibility mode and legacy mode. 
It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-Bit Mode\r\n THEN\r\n #UD;\r\n ELSE\r\n tempAL <- AL;\r\n tempAH <- AH;\r\n AL <- (tempAL + (tempAH * imm8)) AND FFH;\r\n (* imm8 is set to 0AH for the AAD mnemonic.*)\r\n AH <- 0;\r\nFI;\r\nThe immediate value (imm8) is taken from the second byte of the instruction.\r\n\r\nFlags Affected\r\nThe SF, ZF, and PF flags are set according to the resulting binary value in the AL register; the OF, AF, and CF flags\r\nare undefined.\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n\r\n\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AAD"
},
{
"description": "AAM-ASCII Adjust AX After Multiply\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\nD4 0A AAM NP Invalid Valid ASCII adjust AX after multiply.\r\nD4 ib AAM imm8 NP Invalid Valid Adjust AX after multiply to number base\r\n imm8.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nAdjusts the result of the multiplication of two unpacked BCD values to create a pair of unpacked (base 10) BCD\r\nvalues. The AX register is the implied source and destination operand for this instruction. The AAM instruction is\r\nonly useful when it follows an MUL instruction that multiplies (binary multiplication) two unpacked BCD values and\r\nstores a word result in the AX register. The AAM instruction then adjusts the contents of the AX register to contain\r\nthe correct 2-digit unpacked (base 10) BCD result.\r\nThe generalized version of this instruction allows adjustment of the contents of the AX to create two unpacked\r\ndigits of any number base (see the \"Operation\" section below). Here, the imm8 byte is set to the selected number\r\nbase (for example, 08H for octal, 0AH for decimal, or 0CH for base 12 numbers). The AAM mnemonic is interpreted\r\nby all assemblers to mean adjust to ASCII (base 10) values. To adjust to values in another number base, the\r\ninstruction must be hand coded in machine code (D4 imm8).\r\nThis instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-Bit Mode\r\n THEN\r\n #UD;\r\n ELSE\r\n tempAL <- AL;\r\n AH <- tempAL / imm8; (* imm8 is set to 0AH for the AAM mnemonic *)\r\n AL <- tempAL MOD imm8;\r\nFI;\r\nThe immediate value (imm8) is taken from the second byte of the instruction.\r\n\r\nFlags Affected\r\nThe SF, ZF, and PF flags are set according to the resulting binary value in the AL register. 
The OF, AF, and CF flags\r\nare undefined.\r\n\r\nProtected Mode Exceptions\r\n#DE If an immediate value of 0 is used.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n\r\n\r\n\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AAM"
},
{
"description": "AAS-ASCII Adjust AL After Subtraction\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n3F AAS NP Invalid Valid ASCII adjust AL after subtraction.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nAdjusts the result of the subtraction of two unpacked BCD values to create a unpacked BCD result. The AL register\r\nis the implied source and destination operand for this instruction. The AAS instruction is only useful when it follows\r\na SUB instruction that subtracts (binary subtraction) one unpacked BCD value from another and stores a byte\r\nresult in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-\r\ndigit unpacked BCD result.\r\nIf the subtraction produced a decimal carry, the AH register decrements by 1, and the CF and AF flags are set. If no\r\ndecimal carry occurred, the CF and AF flags are cleared, and the AH register is unchanged. In either case, the AL\r\nregister is left with its top four bits set to 0.\r\nThis instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-bit mode\r\n THEN\r\n #UD;\r\n ELSE\r\n IF ((AL AND 0FH) > 9) or (AF = 1)\r\n THEN\r\n AX <- AX - 6;\r\n AH <- AH - 1;\r\n AF <- 1;\r\n CF <- 1;\r\n AL <- AL AND 0FH;\r\n ELSE\r\n CF <- 0;\r\n AF <- 0;\r\n AL <- AL AND 0FH;\r\n FI;\r\nFI;\r\n\r\nFlags Affected\r\nThe AF and CF flags are set to 1 if there is a decimal borrow; otherwise, they are cleared to 0. 
The OF, SF, ZF, and\r\nPF flags are undefined.\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n\r\n\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AAS"
},
{
"description": "ADC-Add with Carry\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n14 ib ADC AL, imm8 I Valid Valid Add with carry imm8 to AL.\r\n15 iw ADC AX, imm16 I Valid Valid Add with carry imm16 to AX.\r\n15 id ADC EAX, imm32 I Valid Valid Add with carry imm32 to EAX.\r\nREX.W + 15 id ADC RAX, imm32 I Valid N.E. Add with carry imm32 sign extended to 64-\r\n bits to RAX.\r\n80 /2 ib ADC r/m8, imm8 MI Valid Valid Add with carry imm8 to r/m8.\r\n *\r\nREX + 80 /2 ib ADC r/m8 , imm8 MI Valid N.E. Add with carry imm8 to r/m8.\r\n81 /2 iw ADC r/m16, imm16 MI Valid Valid Add with carry imm16 to r/m16.\r\n81 /2 id ADC r/m32, imm32 MI Valid Valid Add with CF imm32 to r/m32.\r\nREX.W + 81 /2 id ADC r/m64, imm32 MI Valid N.E. Add with CF imm32 sign extended to 64-bits\r\n to r/m64.\r\n83 /2 ib ADC r/m16, imm8 MI Valid Valid Add with CF sign-extended imm8 to r/m16.\r\n83 /2 ib ADC r/m32, imm8 MI Valid Valid Add with CF sign-extended imm8 into r/m32.\r\nREX.W + 83 /2 ib ADC r/m64, imm8 MI Valid N.E. Add with CF sign-extended imm8 into r/m64.\r\n10 /r ADC r/m8, r8 MR Valid Valid Add with carry byte register to r/m8.\r\nREX + 10 /r ADC r/m8*, r8* MR Valid N.E. Add with carry byte register to r/m64.\r\n11 /r ADC r/m16, r16 MR Valid Valid Add with carry r16 to r/m16.\r\n11 /r ADC r/m32, r32 MR Valid Valid Add with CF r32 to r/m32.\r\nREX.W + 11 /r ADC r/m64, r64 MR Valid N.E. Add with CF r64 to r/m64.\r\n12 /r ADC r8, r/m8 RM Valid Valid Add with carry r/m8 to byte register.\r\nREX + 12 /r ADC r8*, r/m8* RM Valid N.E. Add with carry r/m64 to byte register.\r\n13 /r ADC r16, r/m16 RM Valid Valid Add with carry r/m16 to r16.\r\n13 /r ADC r32, r/m32 RM Valid Valid Add with CF r/m32 to r32.\r\nREX.W + 13 /r ADC r64, r/m64 RM Valid N.E. 
Add with CF r/m64 to r64.\r\nNOTES:\r\n*In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r, w) imm8 NA NA\r\n I AL/AX/EAX/RAX imm8 NA NA\r\n\r\nDescription\r\nAdds the destination operand (first operand), the source operand (second operand), and the carry (CF) flag and\r\nstores the result in the destination operand. The destination operand can be a register or a memory location; the\r\nsource operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be\r\nused in one instruction.) The state of the CF flag represents a carry from a previous addition. When an immediate\r\nvalue is used as an operand, it is sign-extended to the length of the destination operand format.\r\n\r\n\r\n\r\n\r\n\r\nThe ADC instruction does not distinguish between signed or unsigned operands. Instead, the processor evaluates\r\nthe result for both data types and sets the OF and CF flags to indicate a carry in the signed or unsigned result,\r\nrespectively. The SF flag indicates the sign of the signed result.\r\nThe ADC instruction is usually executed as part of a multibyte or multiword addition in which an ADD instruction is\r\nfollowed by an ADC instruction.\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. 
See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nDEST <- DEST + SRC + CF;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nADC: extern unsigned char _addcarry_u8(unsigned char c_in, unsigned char src1, unsigned char src2, unsigned char *sum_out);\r\n\r\nADC: extern unsigned char _addcarry_u16(unsigned char c_in, unsigned short src1, unsigned short src2, unsigned short\r\n*sum_out);\r\n\r\nADC: extern unsigned char _addcarry_u32(unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *sum_out);\r\n\r\nADC: extern unsigned char _addcarry_u64(unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64\r\n*sum_out);\r\n\r\nFlags Affected\r\nThe OF, SF, ZF, AF, CF, and PF flags are set according to the result.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment\r\n selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault 
occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADC"
},
{
"description": "ADCX - Unsigned Integer Addition of Two Operands with Carry Flag\r\n Opcode/ Op/ 64/32bit CPUID Description\r\n Instruction En Mode Feature\r\n Support Flag\r\n 66 0F 38 F6 /r RM V/V ADX Unsigned addition of r32 with CF, r/m32 to r32, writes CF.\r\n ADCX r32, r/m32\r\n 66 REX.w 0F 38 F6 /r RM V/NE ADX Unsigned addition of r64 with CF, r/m64 to r64, writes CF.\r\n ADCX r64, r/m64\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nPerforms an unsigned addition of the destination operand (first operand), the source operand (second operand)\r\nand the carry-flag (CF) and stores the result in the destination operand. The destination operand is a general-\r\npurpose register, whereas the source operand can be a general-purpose register or memory location. The state of\r\nCF can represent a carry from a previous addition. The instruction sets the CF flag with the carry generated by the\r\nunsigned addition of the operands.\r\nThe ADCX instruction is executed in the context of multi-precision addition, where we add a series of operands with\r\na carry-chain. At the beginning of a chain of additions, we need to make sure the CF is in a desired initial state.\r\nOften, this initial state needs to be 0, which can be achieved with an instruction to zero the CF (e.g. XOR).\r\nThis instruction is supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-\r\nbit mode.\r\nIn 64-bit mode, the default operation size is 32 bits. Using a REX Prefix in the form of REX.R permits access to addi-\r\ntional registers (R8-15). 
Using REX Prefix in the form of REX.W promotes operation to 64 bits.\r\nADCX executes normally either inside or outside a transaction region.\r\nNote: ADCX defines the OF flag differently than the ADD/ADC instructions as defined in Intel 64 and IA-32 Archi-\r\ntectures Software Developer's Manual, Volume 2A.\r\n\r\nOperation\r\nIF OperandSize is 64-bit\r\n THEN CF:DEST[63:0] <- DEST[63:0] + SRC[63:0] + CF;\r\n ELSE CF:DEST[31:0] <- DEST[31:0] + SRC[31:0] + CF;\r\nFI;\r\n\r\nFlags Affected\r\nCF is updated based on result. OF, SF, ZF, AF and PF flags are unmodified.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nunsigned char _addcarryx_u32 (unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *sum_out);\r\nunsigned char _addcarryx_u64 (unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *sum_out);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) For an illegal address in the SS segment.\r\n\r\n\r\n\r\n#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a null segment\r\n selector.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) For an illegal address in the SS segment.\r\n#GP(0) If any part of the operand lies outside the effective address space from 0 to FFFFH.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) For an illegal address in the SS segment.\r\n#GP(0) If any part of the operand lies outside the effective address space from 0 to 
FFFFH.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADCX"
},
{
"description": "ADD-Add\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n04 ib ADD AL, imm8 I Valid Valid Add imm8 to AL.\r\n05 iw ADD AX, imm16 I Valid Valid Add imm16 to AX.\r\n05 id ADD EAX, imm32 I Valid Valid Add imm32 to EAX.\r\nREX.W + 05 id ADD RAX, imm32 I Valid N.E. Add imm32 sign-extended to 64-bits to RAX.\r\n80 /0 ib ADD r/m8, imm8 MI Valid Valid Add imm8 to r/m8.\r\nREX + 80 /0 ib ADD r/m8*, imm8 MI Valid N.E. Add sign-extended imm8 to r/m64.\r\n81 /0 iw ADD r/m16, imm16 MI Valid Valid Add imm16 to r/m16.\r\n81 /0 id ADD r/m32, imm32 MI Valid Valid Add imm32 to r/m32.\r\nREX.W + 81 /0 id ADD r/m64, imm32 MI Valid N.E. Add imm32 sign-extended to 64-bits to\r\n r/m64.\r\n83 /0 ib ADD r/m16, imm8 MI Valid Valid Add sign-extended imm8 to r/m16.\r\n83 /0 ib ADD r/m32, imm8 MI Valid Valid Add sign-extended imm8 to r/m32.\r\nREX.W + 83 /0 ib ADD r/m64, imm8 MI Valid N.E. Add sign-extended imm8 to r/m64.\r\n00 /r ADD r/m8, r8 MR Valid Valid Add r8 to r/m8.\r\n * *\r\nREX + 00 /r ADD r/m8 , r8 MR Valid N.E. Add r8 to r/m8.\r\n01 /r ADD r/m16, r16 MR Valid Valid Add r16 to r/m16.\r\n01 /r ADD r/m32, r32 MR Valid Valid Add r32 to r/m32.\r\nREX.W + 01 /r ADD r/m64, r64 MR Valid N.E. Add r64 to r/m64.\r\n02 /r ADD r8, r/m8 RM Valid Valid Add r/m8 to r8.\r\n * *\r\nREX + 02 /r ADD r8 , r/m8 RM Valid N.E. Add r/m8 to r8.\r\n03 /r ADD r16, r/m16 RM Valid Valid Add r/m16 to r16.\r\n03 /r ADD r32, r/m32 RM Valid Valid Add r/m32 to r32.\r\nREX.W + 03 /r ADD r64, r/m64 RM Valid N.E. 
Add r/m64 to r64.\r\nNOTES:\r\n*In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r, w) imm8 NA NA\r\n I AL/AX/EAX/RAX imm8 NA NA\r\n\r\nDescription\r\nAdds the destination operand (first operand) and the source operand (second operand) and then stores the result\r\nin the destination operand. The destination operand can be a register or a memory location; the source operand\r\ncan be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one\r\ninstruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination\r\noperand format.\r\nThe ADD instruction performs integer addition. It evaluates the result for both signed and unsigned integer\r\noperands and sets the OF and CF flags to indicate a carry (overflow) in the signed or unsigned result, respectively. The\r\nSF flag indicates the sign of the signed result.\r\n\r\n\r\n\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. 
See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nDEST <- DEST + SRC;\r\n\r\nFlags Affected\r\nThe OF, SF, ZF, AF, CF, and PF flags are set according to the result.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment\r\n selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 
3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADD"
},
{
"description": "ADDPD-Add Packed Double-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 58 /r RM V/V SSE2 Add packed double-precision floating-point values from\r\n ADDPD xmm1, xmm2/m128 xmm2/mem to xmm1 and store result in xmm1.\r\n VEX.NDS.128.66.0F.WIG 58 /r RVM V/V AVX Add packed double-precision floating-point values from\r\n VADDPD xmm1,xmm2, xmm3/mem to xmm2 and store result in xmm1.\r\n xmm3/m128\r\n VEX.NDS.256.66.0F.WIG 58 /r RVM V/V AVX Add packed double-precision floating-point values from\r\n VADDPD ymm1, ymm2, ymm3/mem to ymm2 and store result in ymm1.\r\n ymm3/m256\r\n EVEX.NDS.128.66.0F.W1 58 /r FV V/V AVX512VL Add packed double-precision floating-point values from\r\n VADDPD xmm1 {k1}{z}, xmm2, AVX512F xmm3/m128/m64bcst to xmm2 and store result in xmm1\r\n xmm3/m128/m64bcst with writemask k1.\r\n EVEX.NDS.256.66.0F.W1 58 /r FV V/V AVX512VL Add packed double-precision floating-point values from\r\n VADDPD ymm1 {k1}{z}, ymm2, AVX512F ymm3/m256/m64bcst to ymm2 and store result in ymm1\r\n ymm3/m256/m64bcst with writemask k1.\r\n EVEX.NDS.512.66.0F.W1 58 /r FV V/V AVX512F Add packed double-precision floating-point values from\r\n VADDPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst to zmm2 and store result in zmm1\r\n zmm3/m512/m64bcst{er} with writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV-RVM ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nAdd two, four or eight packed double-precision floating-point values from the first source operand to the second\r\nsource operand, and stores the packed double-precision floating-point results in the destination operand.\r\nEVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. 
The second source operand can be\r\na ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a\r\n64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAX_VL-1:256)\r\nof the corresponding ZMM register destination are zeroed.\r\nVEX.128 encoded version: the first source operand is a XMM register. The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper Bits (MAX_VL-1:128) of the corresponding\r\nZMM register destination are unmodified.\r\n\r\nOperation\r\nVADDPD (EVEX encoded versions) when src2 operand is a vector register\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nIF (VL = 512) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n\r\n\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+63:i] <- SRC1[i+63:i] + SRC2[i+63:i]\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVADDPD (EVEX encoded versions) when src2 operand is a memory source\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+63:i] <- SRC1[i+63:i] + SRC2[63:0]\r\n ELSE\r\n DEST[i+63:i] <- SRC1[i+63:i] + 
SRC2[i+63:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVADDPD (VEX.256 encoded version)\r\nDEST[63:0] <- SRC1[63:0] + SRC2[63:0]\r\nDEST[127:64] <- SRC1[127:64] + SRC2[127:64]\r\nDEST[191:128] <- SRC1[191:128] + SRC2[191:128]\r\nDEST[255:192] <- SRC1[255:192] + SRC2[255:192]\r\nDEST[MAX_VL-1:256] <- 0\r\n.\r\n\r\n\r\n\r\n\r\n\r\nVADDPD (VEX.128 encoded version)\r\nDEST[63:0] <- SRC1[63:0] + SRC2[63:0]\r\nDEST[127:64] <- SRC1[127:64] + SRC2[127:64]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nADDPD (128-bit Legacy SSE version)\r\nDEST[63:0] <- DEST[63:0] + SRC[63:0]\r\nDEST[127:64] <- DEST[127:64] + SRC[127:64]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVADDPD __m512d _mm512_add_pd (__m512d a, __m512d b);\r\nVADDPD __m512d _mm512_mask_add_pd (__m512d s, __mmask8 k, __m512d a, __m512d b);\r\nVADDPD __m512d _mm512_maskz_add_pd (__mmask8 k, __m512d a, __m512d b);\r\nVADDPD __m256d _mm256_mask_add_pd (__m256d s, __mmask8 k, __m256d a, __m256d b);\r\nVADDPD __m256d _mm256_maskz_add_pd (__mmask8 k, __m256d a, __m256d b);\r\nVADDPD __m128d _mm_mask_add_pd (__m128d s, __mmask8 k, __m128d a, __m128d b);\r\nVADDPD __m128d _mm_maskz_add_pd (__mmask8 k, __m128d a, __m128d b);\r\nVADDPD __m512d _mm512_add_round_pd (__m512d a, __m512d b, int);\r\nVADDPD __m512d _mm512_mask_add_round_pd (__m512d s, __mmask8 k, __m512d a, __m512d b, int);\r\nVADDPD __m512d _mm512_maskz_add_round_pd (__mmask8 k, __m512d a, __m512d b, int);\r\nADDPD __m256d _mm256_add_pd (__m256d a, __m256d b);\r\nADDPD __m128d _mm_add_pd (__m128d a, __m128d b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 2.\r\nEVEX-encoded instruction, see Exceptions Type E2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADDPD"
},
{
"description": "ADDPS-Add Packed Single-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 58 /r RM V/V SSE Add packed single-precision floating-point values from\r\n ADDPS xmm1, xmm2/m128 xmm2/m128 to xmm1 and store result in xmm1.\r\n VEX.NDS.128.0F.WIG 58 /r RVM V/V AVX Add packed single-precision floating-point values from\r\n VADDPS xmm1,xmm2, xmm3/m128 xmm3/m128 to xmm2 and store result in xmm1.\r\n VEX.NDS.256.0F.WIG 58 /r RVM V/V AVX Add packed single-precision floating-point values from\r\n VADDPS ymm1, ymm2, ymm3/m256 ymm3/m256 to ymm2 and store result in ymm1.\r\n EVEX.NDS.128.0F.W0 58 /r FV V/V AVX512VL Add packed single-precision floating-point values from\r\n VADDPS xmm1 {k1}{z}, xmm2, AVX512F xmm3/m128/m32bcst to xmm2 and store result in\r\n xmm3/m128/m32bcst xmm1 with writemask k1.\r\n EVEX.NDS.256.0F.W0 58 /r FV V/V AVX512VL Add packed single-precision floating-point values from\r\n VADDPS ymm1 {k1}{z}, ymm2, AVX512F ymm3/m256/m32bcst to ymm2 and store result in\r\n ymm3/m256/m32bcst ymm1 with writemask k1.\r\n EVEX.NDS.512.0F.W0 58 /r FV V/V AVX512F Add packed single-precision floating-point values from\r\n VADDPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst to zmm2 and store result in\r\n zmm3/m512/m32bcst {er} zmm1 with writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV-RVM ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nAdd four, eight or sixteen packed single-precision floating-point values from the first source operand with the\r\nsecond source operand, and stores the packed single-precision floating-point results in the destination operand.\r\nEVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. 
The second source operand can be\r\na ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a\r\n32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAX_VL-1:256)\r\nof the corresponding ZMM register destination are zeroed.\r\nVEX.128 encoded version: the first source operand is a XMM register. The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper Bits (MAX_VL-1:128) of the corresponding\r\nZMM register destination are unmodified.\r\n\r\nOperation\r\nVADDPS (EVEX encoded versions) when src2 operand is a register\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nIF (VL = 512) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\n\r\n\r\n\r\n\r\nFI;\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+31:i] <- SRC1[i+31:i] + SRC2[i+31:i]\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR;\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVADDPS (EVEX encoded versions) when src2 operand is a memory source\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <-j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+31:i] <- SRC1[i+31:i] + SRC2[31:0]\r\n ELSE\r\n DEST[i+31:i] <- SRC1[i+31:i] + 
SRC2[i+31:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR;\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVADDPS (VEX.256 encoded version)\r\nDEST[31:0] <- SRC1[31:0] + SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] + SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] + SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] + SRC2[127:96]\r\nDEST[159:128] <- SRC1[159:128] + SRC2[159:128]\r\nDEST[191:160] <- SRC1[191:160] + SRC2[191:160]\r\nDEST[223:192] <- SRC1[223:192] + SRC2[223:192]\r\nDEST[255:224] <- SRC1[255:224] + SRC2[255:224]\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVADDPS (VEX.128 encoded version)\r\nDEST[31:0] <- SRC1[31:0] + SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] + SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] + SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] + SRC2[127:96]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nADDPS (128-bit Legacy SSE version)\r\nDEST[31:0] <- SRC1[31:0] + SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] + SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] + SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] + SRC2[127:96]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVADDPS __m512 _mm512_add_ps (__m512 a, __m512 b);\r\nVADDPS __m512 _mm512_mask_add_ps (__m512 s, __mmask16 k, __m512 a, __m512 b);\r\nVADDPS __m512 _mm512_maskz_add_ps (__mmask16 k, __m512 a, __m512 b);\r\nVADDPS __m256 _mm256_mask_add_ps (__m256 s, __mmask8 k, __m256 a, __m256 b);\r\nVADDPS __m256 _mm256_maskz_add_ps (__mmask8 k, __m256 a, __m256 b);\r\nVADDPS __m128 _mm_mask_add_ps (__m128 s, __mmask8 k, __m128 a, __m128 b);\r\nVADDPS __m128 _mm_maskz_add_ps (__mmask8 k, __m128 a, __m128 b);\r\nVADDPS __m512 _mm512_add_round_ps (__m512 a, __m512 b, int);\r\nVADDPS __m512 _mm512_mask_add_round_ps (__m512 s, __mmask16 k, __m512 a, __m512 b, int);\r\nVADDPS __m512 _mm512_maskz_add_round_ps (__mmask16 k, __m512 a, __m512 b, int);\r\nADDPS __m256 _mm256_add_ps (__m256 
a, __m256 b);\r\nADDPS __m128 _mm_add_ps (__m128 a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 2.\r\nEVEX-encoded instruction, see Exceptions Type E2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADDPS"
},
{
"description": "ADDSD-Add Scalar Double-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F 58 /r RM V/V SSE2 Add the low double-precision floating-point value from\r\n ADDSD xmm1, xmm2/m64 xmm2/mem to xmm1 and store the result in xmm1.\r\n VEX.NDS.128.F2.0F.WIG 58 /r RVM V/V AVX Add the low double-precision floating-point value from\r\n VADDSD xmm1, xmm2, xmm3/mem to xmm2 and store the result in xmm1.\r\n xmm3/m64\r\n EVEX.NDS.LIG.F2.0F.W1 58 /r T1S V/V AVX512F Add the low double-precision floating-point value from\r\n VADDSD xmm1 {k1}{z}, xmm3/m64 to xmm2 and store the result in xmm1 with\r\n xmm2, xmm3/m64{er} writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S-RVM ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nAdds the low double-precision floating-point values from the second source operand and the first source operand\r\nand stores the double-precision floating-point result in the destination operand.\r\nThe second source operand can be an XMM register or a 64-bit memory location. The first source and destination\r\noperands are XMM registers.\r\n128-bit Legacy SSE version: The first source and destination operands are the same. Bits (MAX_VL-1:64) of the\r\ncorresponding destination register remain unchanged.\r\nEVEX and VEX.128 encoded version: The first source operand is encoded by EVEX.vvvv/VEX.vvvv. Bits (127:64) of\r\nthe XMM register destination are copied from corresponding bits in the first source operand. Bits (MAX_VL-1:128)\r\nof the destination register are zeroed.\r\nEVEX version: The low quadword element of the destination is updated according to the writemask.\r\nSoftware should ensure VADDSD is encoded with VEX.L=0. 
Encoding VADDSD with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\nOperation\r\nVADDSD (EVEX encoded version)\r\nIF (EVEX.b = 1) AND SRC2 *is a register*\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF k1[0] or *no writemask*\r\n THEN DEST[63:0] <- SRC1[63:0] + SRC2[63:0]\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[63:0] remains unchanged*\r\n ELSE ; zeroing-masking\r\n THEN DEST[63:0] <- 0\r\n FI;\r\nFI;\r\nDEST[127:64] <- SRC1[127:64]\r\n\r\n\r\n\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVADDSD (VEX.128 encoded version)\r\nDEST[63:0] <-SRC1[63:0] + SRC2[63:0]\r\nDEST[127:64] <-SRC1[127:64]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nADDSD (128-bit Legacy SSE version)\r\nDEST[63:0] <-DEST[63:0] + SRC[63:0]\r\nDEST[MAX_VL-1:64] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVADDSD __m128d _mm_mask_add_sd (__m128d s, __mmask8 k, __m128d a, __m128d b);\r\nVADDSD __m128d _mm_maskz_add_sd (__mmask8 k, __m128d a, __m128d b);\r\nVADDSD __m128d _mm_add_round_sd (__m128d a, __m128d b, int);\r\nVADDSD __m128d _mm_mask_add_round_sd (__m128d s, __mmask8 k, __m128d a, __m128d b, int);\r\nVADDSD __m128d _mm_maskz_add_round_sd (__mmask8 k, __m128d a, __m128d b, int);\r\nADDSD __m128d _mm_add_sd (__m128d a, __m128d b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 3.\r\nEVEX-encoded instruction, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADDSD"
},
{
"description": "ADDSS-Add Scalar Single-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 58 /r RM V/V SSE Add the low single-precision floating-point value from\r\n ADDSS xmm1, xmm2/m32 xmm2/mem to xmm1 and store the result in xmm1.\r\n VEX.NDS.128.F3.0F.WIG 58 /r RVM V/V AVX Add the low single-precision floating-point value from\r\n VADDSS xmm1,xmm2, xmm3/mem to xmm2 and store the result in xmm1.\r\n xmm3/m32\r\n EVEX.NDS.LIG.F3.0F.W0 58 /r T1S V/V AVX512F Add the low single-precision floating-point value from\r\n VADDSS xmm1{k1}{z}, xmm2, xmm3/m32 to xmm2 and store the result in xmm1with\r\n xmm3/m32{er} writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nAdds the low single-precision floating-point values from the second source operand and the first source operand,\r\nand stores the double-precision floating-point result in the destination operand.\r\nThe second source operand can be an XMM register or a 64-bit memory location. The first source and destination\r\noperands are XMM registers.\r\n128-bit Legacy SSE version: The first source and destination operands are the same. Bits (MAX_VL-1:32) of the\r\ncorresponding the destination register remain unchanged.\r\nEVEX and VEX.128 encoded version: The first source operand is encoded by EVEX.vvvv/VEX.vvvv. Bits (127:32) of\r\nthe XMM register destination are copied from corresponding bits in the first source operand. Bits (MAX_VL-1:128)\r\nof the destination register are zeroed.\r\nEVEX version: The low doubleword element of the destination is updated according to the writemask.\r\nSoftware should ensure VADDSS is encoded with VEX.L=0. 
Encoding VADDSS with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\nOperation\r\nVADDSS (EVEX encoded versions)\r\nIF (EVEX.b = 1) AND SRC2 *is a register*\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF k1[0] or *no writemask*\r\n THEN DEST[31:0] <- SRC1[31:0] + SRC2[31:0]\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[31:0] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[31:0] <- 0\r\n FI;\r\nFI;\r\nDEST[127:32] <- SRC1[127:32]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVADDSS DEST, SRC1, SRC2 (VEX.128 encoded version)\r\nDEST[31:0] <- SRC1[31:0] + SRC2[31:0]\r\nDEST[127:32] <- SRC1[127:32]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nADDSS DEST, SRC (128-bit Legacy SSE version)\r\nDEST[31:0] <- DEST[31:0] + SRC[31:0]\r\nDEST[MAX_VL-1:32] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVADDSS __m128 _mm_mask_add_ss (__m128 s, __mmask8 k, __m128 a, __m128 b);\r\nVADDSS __m128 _mm_maskz_add_ss (__mmask8 k, __m128 a, __m128 b);\r\nVADDSS __m128 _mm_add_round_ss (__m128 a, __m128 b, int);\r\nVADDSS __m128 _mm_mask_add_round_ss (__m128 s, __mmask8 k, __m128 a, __m128 b, int);\r\nVADDSS __m128 _mm_maskz_add_round_ss (__mmask8 k, __m128 a, __m128 b, int);\r\nADDSS __m128 _mm_add_ss (__m128 a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 3.\r\nEVEX-encoded instruction, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADDSS"
},
{
"description": "ADDSUBPD-Packed Double-FP Add/Subtract\r\nOpcode/ Op/ 64/32-bit CPUID Description\r\nInstruction En Mode Feature\r\n Flag\r\n66 0F D0 /r RM V/V SSE3 Add/subtract double-precision floating-point\r\nADDSUBPD xmm1, xmm2/m128 values from xmm2/m128 to xmm1.\r\n\r\nVEX.NDS.128.66.0F.WIG D0 /r RVM V/V AVX Add/subtract packed double-precision\r\nVADDSUBPD xmm1, xmm2, xmm3/m128 floating-point values from xmm3/mem to\r\n xmm2 and stores result in xmm1.\r\nVEX.NDS.256.66.0F.WIG D0 /r RVM V/V AVX Add / subtract packed double-precision\r\nVADDSUBPD ymm1, ymm2, ymm3/m256 floating-point values from ymm3/mem to\r\n ymm2 and stores result in ymm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nAdds odd-numbered double-precision floating-point values of the first source operand (second operand) with the\r\ncorresponding double-precision floating-point values from the second source operand (third operand); stores the\r\nresult in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered\r\ndouble-precision floating-point values from the second source operand from the corresponding double-precision\r\nfloating values in the first source operand; stores the result into the even-numbered values of the destination\r\noperand.\r\nIn 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers\r\n(XMM8-XMM15).\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding\r\nYMM register destination are unmodified. See Figure 3-3.\r\nVEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. 
The destination\r\noperand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are\r\nzeroed.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register.\r\n\r\n\r\n\r\n\r\n\r\n ADDSUBPD xmm1, xmm2/m128\r\n\r\n\r\n [127:64] [63:0] xmm2/m128\r\n\r\n\r\n\r\n\r\n RESULT:\r\n xmm1[127:64] + xmm2/m128[127:64] xmm1[63:0] - xmm2/m128[63:0]\r\n xmm1\r\n\r\n [127:64] [63:0]\r\n\r\n\r\n\r\n Figure 3-3. ADDSUBPD-Packed Double-FP Add/Subtract\r\n\r\n\r\nOperation\r\nADDSUBPD (128-bit Legacy SSE version)\r\nDEST[63:0] <- DEST[63:0] - SRC[63:0]\r\nDEST[127:64] <- DEST[127:64] + SRC[127:64]\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVADDSUBPD (VEX.128 encoded version)\r\nDEST[63:0] <- SRC1[63:0] - SRC2[63:0]\r\nDEST[127:64] <- SRC1[127:64] + SRC2[127:64]\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nVADDSUBPD (VEX.256 encoded version)\r\nDEST[63:0] <- SRC1[63:0] - SRC2[63:0]\r\nDEST[127:64] <- SRC1[127:64] + SRC2[127:64]\r\nDEST[191:128] <- SRC1[191:128] - SRC2[191:128]\r\nDEST[255:192] <- SRC1[255:192] + SRC2[255:192]\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nADDSUBPD: __m128d _mm_addsub_pd(__m128d a, __m128d b)\r\n\r\nVADDSUBPD: __m256d _mm256_addsub_pd (__m256d a, __m256d b)\r\n\r\nExceptions\r\nWhen the source operand is a memory operand, it must be aligned on a 16-byte boundary or a general-protection\r\nexception (#GP) will be generated.\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal.\r\n\r\nOther Exceptions\r\nSee Exceptions Type 2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADDSUBPD"
},
{
"description": "ADDSUBPS-Packed Single-FP Add/Subtract\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n F2 0F D0 /r RM V/V SSE3 Add/subtract single-precision floating-point\r\n ADDSUBPS xmm1, xmm2/m128 values from xmm2/m128 to xmm1.\r\n\r\n VEX.NDS.128.F2.0F.WIG D0 /r RVM V/V AVX Add/subtract single-precision floating-point\r\n VADDSUBPS xmm1, xmm2, xmm3/m128 values from xmm3/mem to xmm2 and stores\r\n result in xmm1.\r\n VEX.NDS.256.F2.0F.WIG D0 /r RVM V/V AVX Add / subtract single-precision floating-point\r\n VADDSUBPS ymm1, ymm2, ymm3/m256 values from ymm3/mem to ymm2 and stores\r\n result in ymm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nAdds odd-numbered single-precision floating-point values of the first source operand (second operand) with the\r\ncorresponding single-precision floating-point values from the second source operand (third operand); stores the\r\nresult in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered\r\nsingle-precision floating-point values from the second source operand from the corresponding single-precision\r\nfloating values in the first source operand; stores the result into the even-numbered values of the destination\r\noperand.\r\nIn 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers\r\n(XMM8-XMM15).\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding\r\nYMM register destination are unmodified. See Figure 3-4.\r\nVEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. 
The destination\r\noperand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are\r\nzeroed.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register.\r\n\r\n\r\n\r\n\r\n\r\n ADDSUBPS xmm1, xmm2/m128\r\n\r\n xmm2/\r\n [127:96] [95:64] [63:32] [31:0]\r\n m128\r\n\r\n\r\n\r\n\r\n xmm1[127:96] + xmm1[95:64] - xmm2/ xmm1[63:32] + xmm1[31:0] - RESULT:\r\n xmm2/m128[127:96] m128[95:64] xmm2/m128[63:32] xmm2/m128[31:0] xmm1\r\n\r\n [127:96] [95:64] [63:32] [31:0]\r\n\r\n\r\n OM15992\r\n\r\n\r\n\r\n Figure 3-4. ADDSUBPS-Packed Single-FP Add/Subtract\r\n\r\n\r\nOperation\r\nADDSUBPS (128-bit Legacy SSE version)\r\nDEST[31:0] <- DEST[31:0] - SRC[31:0]\r\nDEST[63:32] <- DEST[63:32] + SRC[63:32]\r\nDEST[95:64] <- DEST[95:64] - SRC[95:64]\r\nDEST[127:96] <- DEST[127:96] + SRC[127:96]\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVADDSUBPS (VEX.128 encoded version)\r\nDEST[31:0] <- SRC1[31:0] - SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] + SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] - SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] + SRC2[127:96]\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nVADDSUBPS (VEX.256 encoded version)\r\nDEST[31:0] <- SRC1[31:0] - SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] + SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] - SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] + SRC2[127:96]\r\nDEST[159:128] <- SRC1[159:128] - SRC2[159:128]\r\nDEST[191:160]<- SRC1[191:160] + SRC2[191:160]\r\nDEST[223:192] <- SRC1[223:192] - SRC2[223:192]\r\nDEST[255:224] <- SRC1[255:224] + SRC2[255:224].\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nADDSUBPS: __m128 _mm_addsub_ps(__m128 a, __m128 b)\r\n\r\nVADDSUBPS: __m256 _mm256_addsub_ps (__m256 a, __m256 b)\r\n\r\nExceptions\r\nWhen the source operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-\r\nprotection exception (#GP) will be 
generated.\r\n\r\n\r\n\r\n\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal.\r\n\r\nOther Exceptions\r\nSee Exceptions Type 2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADDSUBPS"
},
{
"description": "ADOX - Unsigned Integer Addition of Two Operands with Overflow Flag\r\n Opcode/ Op/ 64/32bit CPUID Description\r\n Instruction En Mode Feature\r\n Support Flag\r\n F3 0F 38 F6 /r RM V/V ADX Unsigned addition of r32 with OF, r/m32 to r32, writes OF.\r\n ADOX r32, r/m32\r\n F3 REX.w 0F 38 F6 /r RM V/NE ADX Unsigned addition of r64 with OF, r/m64 to r64, writes OF.\r\n ADOX r64, r/m64\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nPerforms an unsigned addition of the destination operand (first operand), the source operand (second operand)\r\nand the overflow-flag (OF) and stores the result in the destination operand. The destination operand is a general-\r\npurpose register, whereas the source operand can be a general-purpose register or memory location. The state of\r\nOF represents a carry from a previous addition. The instruction sets the OF flag with the carry generated by the\r\nunsigned addition of the operands.\r\nThe ADOX instruction is executed in the context of multi-precision addition, where we add a series of operands with\r\na carry-chain. At the beginning of a chain of additions, we execute an instruction to zero the OF (e.g. XOR).\r\nThis instruction is supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit\r\nmode.\r\nIn 64-bit mode, the default operation size is 32 bits. Using a REX Prefix in the form of REX.R permits access to addi-\r\ntional registers (R8-15). 
Using REX Prefix in the form of REX.W promotes operation to 64-bits.\r\nADOX executes normally either inside or outside a transaction region.\r\nNote: ADOX defines the CF and OF flags differently than the ADD/ADC instructions as defined in Intel 64 and\r\nIA-32 Architectures Software Developer's Manual, Volume 2A.\r\n\r\nOperation\r\nIF OperandSize is 64-bit\r\n THEN OF:DEST[63:0] <- DEST[63:0] + SRC[63:0] + OF;\r\n ELSE OF:DEST[31:0] <- DEST[31:0] + SRC[31:0] + OF;\r\nFI;\r\n\r\nFlags Affected\r\nOF is updated based on result. CF, SF, ZF, AF and PF flags are unmodified.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nunsigned char _addcarryx_u32 (unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *sum_out);\r\nunsigned char _addcarryx_u64 (unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *sum_out);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) For an illegal address in the SS segment.\r\n#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a null segment\r\n selector.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) For an illegal address in the SS segment.\r\n#GP(0) If any part of the operand lies outside the effective address space from 0 to FFFFH.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) For an illegal address in the SS segment.\r\n#GP(0) If any part of the operand lies outside the effective address space 
from 0 to FFFFH.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ADOX"
},
{
"description": "AESDEC-Perform One Round of an AES Decryption Flow\r\nOpcode/ Op/ 64/32-bit CPUID Description\r\nInstruction En Mode Feature\r\n Flag\r\n66 0F 38 DE /r RM V/V AES Perform one round of an AES decryption flow,\r\nAESDEC xmm1, xmm2/m128 using the Equivalent Inverse Cipher, operating\r\n on a 128-bit data (state) from xmm1 with a\r\n 128-bit round key from xmm2/m128.\r\nVEX.NDS.128.66.0F38.WIG DE /r RVM V/V Both AES Perform one round of an AES decryption flow,\r\nVAESDEC xmm1, xmm2, xmm3/m128 and using the Equivalent Inverse Cipher, operating\r\n AVX flags on a 128-bit data (state) from xmm2 with a\r\n 128-bit round key from xmm3/m128; store\r\n the result in xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand2 Operand3 Operand4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nThis instruction performs a single round of the AES decryption flow using the Equivalent Inverse Cipher, with the\r\nround key from the second source operand, operating on a 128-bit data (state) from the first source operand, and\r\nstore the result in the destination operand.\r\nUse the AESDEC instruction for all but the last decryption round. For the last decryption round, use the AESDE-\r\nCLAST instruction.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an\r\nXMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-\r\n1:128) of the corresponding YMM destination register remain unchanged.\r\nVEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second\r\nsource operand can be an XMM register or a 128-bit memory location. 
Bits (VLMAX-1:128) of the destination YMM\r\nregister are zeroed.\r\n\r\nOperation\r\nAESDEC\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- InvShiftRows( STATE );\r\nSTATE <- InvSubBytes( STATE );\r\nSTATE <- InvMixColumns( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVAESDEC\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- InvShiftRows( STATE );\r\nSTATE <- InvSubBytes( STATE );\r\nSTATE <- InvMixColumns( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] <- 0\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)AESDEC: __m128i _mm_aesdec (__m128i, __m128i)\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AESDEC"
},
{
"description": "AESDECLAST-Perform Last Round of an AES Decryption Flow\r\nOpcode/ Op/ 64/32-bit CPUID Description\r\nInstruction En Mode Feature\r\n Flag\r\n66 0F 38 DF /r RM V/V AES Perform the last round of an AES decryption\r\nAESDECLAST xmm1, xmm2/m128 flow, using the Equivalent Inverse Cipher,\r\n operating on a 128-bit data (state) from\r\n xmm1 with a 128-bit round key from\r\n xmm2/m128.\r\nVEX.NDS.128.66.0F38.WIG DF /r RVM V/V Both AES Perform the last round of an AES decryption\r\nVAESDECLAST xmm1, xmm2, xmm3/m128 and flow, using the Equivalent Inverse Cipher,\r\n AVX flags operating on a 128-bit data (state) from\r\n xmm2 with a 128-bit round key from\r\n xmm3/m128; store the result in xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand2 Operand3 Operand4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nThis instruction performs the last round of the AES decryption flow using the Equivalent Inverse Cipher, with the\r\nround key from the second source operand, operating on a 128-bit data (state) from the first source operand, and\r\nstore the result in the destination operand.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an\r\nXMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-\r\n1:128) of the corresponding YMM destination register remain unchanged.\r\nVEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second\r\nsource operand can be an XMM register or a 128-bit memory location. 
Bits (VLMAX-1:128) of the destination YMM\r\nregister are zeroed.\r\n\r\nOperation\r\nAESDECLAST\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- InvShiftRows( STATE );\r\nSTATE <- InvSubBytes( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVAESDECLAST\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- InvShiftRows( STATE );\r\nSTATE <- InvSubBytes( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)AESDECLAST: __m128i _mm_aesdeclast (__m128i, __m128i)\r\n\r\n\r\n\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AESDECLAST"
},
{
"description": "AESENC-Perform One Round of an AES Encryption Flow\r\nOpcode/ Op/ 64/32-bit CPUID Description\r\nInstruction En Mode Feature\r\n Flag\r\n66 0F 38 DC /r RM V/V AES Perform one round of an AES encryption flow,\r\nAESENC xmm1, xmm2/m128 operating on a 128-bit data (state) from\r\n xmm1 with a 128-bit round key from\r\n xmm2/m128.\r\nVEX.NDS.128.66.0F38.WIG DC /r RVM V/V Both AES Perform one round of an AES encryption flow,\r\nVAESENC xmm1, xmm2, xmm3/m128 and operating on a 128-bit data (state) from\r\n AVX flags xmm2 with a 128-bit round key from the\r\n xmm3/m128; store the result in xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand2 Operand3 Operand4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nThis instruction performs a single round of an AES encryption flow using a round key from the second source\r\noperand, operating on 128-bit data (state) from the first source operand, and store the result in the destination\r\noperand.\r\nUse the AESENC instruction for all but the last encryption rounds. For the last encryption round, use the AESENC-\r\nCLAST instruction.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an\r\nXMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-\r\n1:128) of the corresponding YMM destination register remain unchanged.\r\nVEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second\r\nsource operand can be an XMM register or a 128-bit memory location. 
Bits (VLMAX-1:128) of the destination YMM\r\nregister are zeroed.\r\n\r\nOperation\r\nAESENC\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- ShiftRows( STATE );\r\nSTATE <- SubBytes( STATE );\r\nSTATE <- MixColumns( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVAESENC\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- ShiftRows( STATE );\r\nSTATE <- SubBytes( STATE );\r\nSTATE <- MixColumns( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] <- 0\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)AESENC: __m128i _mm_aesenc (__m128i, __m128i)\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AESENC"
},
{
"description": "AESENCLAST-Perform Last Round of an AES Encryption Flow\r\nOpcode/ Op/ 64/32-bit CPUID Description\r\nInstruction En Mode Feature\r\n Flag\r\n66 0F 38 DD /r RM V/V AES Perform the last round of an AES encryption\r\nAESENCLAST xmm1, xmm2/m128 flow, operating on a 128-bit data (state) from\r\n xmm1 with a 128-bit round key from\r\n xmm2/m128.\r\nVEX.NDS.128.66.0F38.WIG DD /r RVM V/V Both AES Perform the last round of an AES encryption\r\nVAESENCLAST xmm1, xmm2, xmm3/m128 and flow, operating on a 128-bit data (state) from\r\n AVX flags xmm2 with a 128 bit round key from\r\n xmm3/m128; store the result in xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand2 Operand3 Operand4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nThis instruction performs the last round of an AES encryption flow using a round key from the second source\r\noperand, operating on 128-bit data (state) from the first source operand, and store the result in the destination\r\noperand.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an\r\nXMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (VLMAX-\r\n1:128) of the corresponding YMM destination register remain unchanged.\r\nVEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second\r\nsource operand can be an XMM register or a 128-bit memory location. 
Bits (VLMAX-1:128) of the destination YMM\r\nregister are zeroed.\r\n\r\nOperation\r\nAESENCLAST\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- ShiftRows( STATE );\r\nSTATE <- SubBytes( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVAESENCLAST\r\nSTATE <- SRC1;\r\nRoundKey <- SRC2;\r\nSTATE <- ShiftRows( STATE );\r\nSTATE <- SubBytes( STATE );\r\nDEST[127:0] <- STATE XOR RoundKey;\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)AESENCLAST: __m128i _mm_aesenclast (__m128i, __m128i)\r\n\r\n\r\n\r\n\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AESENCLAST"
},
{
"description": "AESIMC-Perform the AES InvMixColumn Transformation\r\nOpcode/ Op/ 64/32-bit CPUID Description\r\nInstruction En Mode Feature\r\n Flag\r\n66 0F 38 DB /r RM V/V AES Perform the InvMixColumn transformation on\r\nAESIMC xmm1, xmm2/m128 a 128-bit round key from xmm2/m128 and\r\n store the result in xmm1.\r\nVEX.128.66.0F38.WIG DB /r RM V/V Both AES Perform the InvMixColumn transformation on\r\nVAESIMC xmm1, xmm2/m128 and a 128-bit round key from xmm2/m128 and\r\n AVX flags store the result in xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand2 Operand3 Operand4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nPerform the InvMixColumns transformation on the source operand and store the result in the destination operand.\r\nThe destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory loca-\r\ntion.\r\nNote: the AESIMC instruction should be applied to the expanded AES round keys (except for the first and last round\r\nkey) in order to prepare them for decryption using the \"Equivalent Inverse Cipher\" (defined in FIPS 197).\r\n128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain\r\nunchanged.\r\nVEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.\r\nNote: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\nOperation\r\nAESIMC\r\nDEST[127:0] <- InvMixColumns( SRC );\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVAESIMC\r\nDEST[127:0] <- InvMixColumns( SRC );\r\nDEST[VLMAX-1:128] <- 0;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)AESIMC: __m128i _mm_aesimc (__m128i)\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4; additionally\r\n#UD If VEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AESIMC"
},
{
"description": "AESKEYGENASSIST-AES Round Key Generation Assist\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 3A DF /r ib RMI V/V AES Assist in AES round key generation using an 8\r\n AESKEYGENASSIST xmm1, xmm2/m128, imm8 bits Round Constant (RCON) specified in the\r\n immediate byte, operating on 128 bits of data\r\n specified in xmm2/m128 and stores the\r\n result in xmm1.\r\n VEX.128.66.0F3A.WIG DF /r ib RMI V/V Both AES Assist in AES round key generation using 8\r\n VAESKEYGENASSIST xmm1, xmm2/m128, imm8 and bits Round Constant (RCON) specified in the\r\n AVX flags immediate byte, operating on 128 bits of data\r\n specified in xmm2/m128 and stores the\r\n result in xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand2 Operand3 Operand4\r\n RMI ModRM:reg (w) ModRM:r/m (r) imm8 NA\r\n\r\nDescription\r\nAssist in expanding the AES cipher key, by computing steps towards generating a round key for encryption, using\r\n128-bit data specified in the source operand and an 8-bit round constant specified as an immediate, store the\r\nresult in the destination operand.\r\nThe destination operand is an XMM register. 
The source operand can be an XMM register or a 128-bit memory loca-\r\ntion.\r\n128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain\r\nunchanged.\r\nVEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed.\r\nNote: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\nOperation\r\nAESKEYGENASSIST\r\nX3[31:0] <- SRC [127: 96];\r\nX2[31:0] <- SRC [95: 64];\r\nX1[31:0] <- SRC [63: 32];\r\nX0[31:0] <- SRC [31: 0];\r\nRCON[31:0] <- ZeroExtend(Imm8[7:0]);\r\nDEST[31:0] <- SubWord(X1);\r\nDEST[63:32 ] <- RotWord( SubWord(X1) ) XOR RCON;\r\nDEST[95:64] <- SubWord(X3);\r\nDEST[127:96] <- RotWord( SubWord(X3) ) XOR RCON;\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\n\r\n\r\n\r\n\r\nVAESKEYGENASSIST\r\nX3[31:0] <- SRC [127: 96];\r\nX2[31:0] <- SRC [95: 64];\r\nX1[31:0] <- SRC [63: 32];\r\nX0[31:0] <- SRC [31: 0];\r\nRCON[31:0] <- ZeroExtend(Imm8[7:0]);\r\nDEST[31:0] <- SubWord(X1);\r\nDEST[63:32 ] <- RotWord( SubWord(X1) ) XOR RCON;\r\nDEST[95:64] <- SubWord(X3);\r\nDEST[127:96] <- RotWord( SubWord(X3) ) XOR RCON;\r\nDEST[VLMAX-1:128] <- 0;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)AESKEYGENASSIST: __m128i _mm_aeskeygenassist (__m128i, const int)\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4; additionally\r\n#UD If VEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AESKEYGENASSIST"
},
{
"description": "AND-Logical AND\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 24 ib AND AL, imm8 I Valid Valid AL AND imm8.\r\n 25 iw AND AX, imm16 I Valid Valid AX AND imm16.\r\n 25 id AND EAX, imm32 I Valid Valid EAX AND imm32.\r\n REX.W + 25 id AND RAX, imm32 I Valid N.E. RAX AND imm32 sign-extended to 64-bits.\r\n 80 /4 ib AND r/m8, imm8 MI Valid Valid r/m8 AND imm8.\r\n REX + 80 /4 ib AND r/m8*, imm8 MI Valid N.E. r/m8 AND imm8.\r\n 81 /4 iw AND r/m16, imm16 MI Valid Valid r/m16 AND imm16.\r\n 81 /4 id AND r/m32, imm32 MI Valid Valid r/m32 AND imm32.\r\n REX.W + 81 /4 id AND r/m64, imm32 MI Valid N.E. r/m64 AND imm32 sign extended to 64-bits.\r\n 83 /4 ib AND r/m16, imm8 MI Valid Valid r/m16 AND imm8 (sign-extended).\r\n 83 /4 ib AND r/m32, imm8 MI Valid Valid r/m32 AND imm8 (sign-extended).\r\n REX.W + 83 /4 ib AND r/m64, imm8 MI Valid N.E. r/m64 AND imm8 (sign-extended).\r\n 20 /r AND r/m8, r8 MR Valid Valid r/m8 AND r8.\r\n REX + 20 /r AND r/m8*, r8* MR Valid N.E. r/m64 AND r8 (sign-extended).\r\n 21 /r AND r/m16, r16 MR Valid Valid r/m16 AND r16.\r\n 21 /r AND r/m32, r32 MR Valid Valid r/m32 AND r32.\r\n REX.W + 21 /r AND r/m64, r64 MR Valid N.E. r/m64 AND r32.\r\n 22 /r AND r8, r/m8 RM Valid Valid r8 AND r/m8.\r\n REX + 22 /r AND r8*, r/m8* RM Valid N.E. r/m64 AND r8 (sign-extended).\r\n 23 /r AND r16, r/m16 RM Valid Valid r16 AND r/m16.\r\n 23 /r AND r32, r/m32 RM Valid Valid r32 AND r/m32.\r\n REX.W + 23 /r AND r64, r/m64 RM Valid N.E. 
r64 AND r/m64.\r\n NOTES:\r\n *In 64-bit mode, r/m8 cannot be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r, w) imm8 NA NA\r\n I AL/AX/EAX/RAX imm8 NA NA\r\n\r\nDescription\r\nPerforms a bitwise AND operation on the destination (first) and source (second) operands and stores the result in\r\nthe destination operand location. The source operand can be an immediate, a register, or a memory location; the\r\ndestination operand can be a register or a memory location. (However, two memory operands cannot be used in\r\none instruction.) Each bit of the result is set to 1 if both corresponding bits of the first and second operands are 1;\r\notherwise, it is set to 0.\r\nThis instruction can be used with a LOCK prefix to allow it to be executed atomically.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\n\r\n\r\nOperation\r\nDEST <- DEST AND SRC;\r\n\r\nFlags Affected\r\nThe OF and CF flags are cleared; the SF, ZF, and PF flags are set according to the result. 
The state of the AF flag is\r\nundefined.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination operand points to a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "AND"
},
{
"description": "ANDN - Logical AND NOT\r\n Opcode/Instruction Op/ 64/32 CPUID Description\r\n En -bit Feature\r\n Mode Flag\r\n VEX.NDS.LZ.0F38.W0 F2 /r RVM V/V BMI1 Bitwise AND of inverted r32b with r/m32, store result in r32a.\r\n ANDN r32a, r32b, r/m32\r\n VEX.NDS.LZ. 0F38.W1 F2 /r RVM V/NE BMI1 Bitwise AND of inverted r64b with r/m64, store result in r64a.\r\n ANDN r64a, r64b, r/m64\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RVM ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a bitwise logical AND of inverted second operand (the first source operand) with the third operand (the\r\nsecond source operand). The result is stored in the first operand (destination operand).\r\nThis instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in\r\n64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An\r\nattempt to execute this instruction with VEX.L not equal to 0 will cause #UD.\r\n\r\nOperation\r\nDEST <- (NOT SRC1) bitwiseAND SRC2;\r\nSF <- DEST[OperandSize -1];\r\nZF <- (DEST = 0);\r\n\r\nFlags Affected\r\nSF and ZF are updated based on result. OF and CF flags are cleared. AF and PF flags are undefined.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nAuto-generated from high-level language.\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Section 2.5.1, \"Exception Conditions for VEX-Encoded GPR Instructions\", Table 2-29; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ANDN"
},
{
"description": "ANDNPD-Bitwise Logical AND NOT of Packed Double Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 55 /r RM V/V SSE2 Return the bitwise logical AND NOT of packed double-\r\n ANDNPD xmm1, xmm2/m128 precision floating-point values in xmm1 and xmm2/mem.\r\n VEX.NDS.128.66.0F 55 /r RVM V/V AVX Return the bitwise logical AND NOT of packed double-\r\n VANDNPD xmm1, xmm2, precision floating-point values in xmm2 and xmm3/mem.\r\n xmm3/m128\r\n VEX.NDS.256.66.0F 55/r RVM V/V AVX Return the bitwise logical AND NOT of packed double-\r\n VANDNPD ymm1, ymm2, precision floating-point values in ymm2 and ymm3/mem.\r\n ymm3/m256\r\n EVEX.NDS.128.66.0F.W1 55 /r FV V/V AVX512VL Return the bitwise logical AND NOT of packed double-\r\n VANDNPD xmm1 {k1}{z}, xmm2, AVX512DQ precision floating-point values in xmm2 and\r\n xmm3/m128/m64bcst xmm3/m128/m64bcst subject to writemask k1.\r\n EVEX.NDS.256.66.0F.W1 55 /r FV V/V AVX512VL Return the bitwise logical AND NOT of packed double-\r\n VANDNPD ymm1 {k1}{z}, ymm2, AVX512DQ precision floating-point values in ymm2 and\r\n ymm3/m256/m64bcst ymm3/m256/m64bcst subject to writemask k1.\r\n EVEX.NDS.512.66.0F.W1 55 /r FV V/V AVX512DQ Return the bitwise logical AND NOT of packed double-\r\n VANDNPD zmm1 {k1}{z}, zmm2, precision floating-point values in zmm2 and\r\n zmm3/m512/m64bcst zmm3/m512/m64bcst subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a bitwise logical AND NOT of the two, four or eight packed double-precision floating-point values from the\r\nfirst source operand and the second source operand, and stores the result in the destination operand.\r\nEVEX encoded versions: The first source operand is a 
ZMM/YMM/XMM register. The second source operand can be\r\na ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a\r\n64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register\r\nor a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAX_VL-1:256) of the\r\ncorresponding ZMM register destination are zeroed.\r\nVEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (MAX_VL-1:128) of the corresponding\r\nregister destination are unmodified.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVANDNPD (EVEX encoded versions)\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n IF k1[j] OR *no writemask*\r\n IF (EVEX.b == 1) AND (SRC2 *is memory*)\r\n THEN\r\n DEST[i+63:i] <- (NOT(SRC1[i+63:i])) BITWISE AND SRC2[63:0]\r\n ELSE\r\n DEST[i+63:i] <- (NOT(SRC1[i+63:i])) BITWISE AND SRC2[i+63:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] = 0\r\n FI;\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVANDNPD (VEX.256 encoded version)\r\nDEST[63:0] <- (NOT(SRC1[63:0])) BITWISE AND SRC2[63:0]\r\nDEST[127:64] <- (NOT(SRC1[127:64])) BITWISE AND SRC2[127:64]\r\nDEST[191:128] <- (NOT(SRC1[191:128])) BITWISE AND SRC2[191:128]\r\nDEST[255:192] <- (NOT(SRC1[255:192])) BITWISE AND 
SRC2[255:192]\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVANDNPD (VEX.128 encoded version)\r\nDEST[63:0] <- (NOT(SRC1[63:0])) BITWISE AND SRC2[63:0]\r\nDEST[127:64] <- (NOT(SRC1[127:64])) BITWISE AND SRC2[127:64]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nANDNPD (128-bit Legacy SSE version)\r\nDEST[63:0] <- (NOT(DEST[63:0])) BITWISE AND SRC[63:0]\r\nDEST[127:64] <- (NOT(DEST[127:64])) BITWISE AND SRC[127:64]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVANDNPD __m512d _mm512_andnot_pd (__m512d a, __m512d b);\r\nVANDNPD __m512d _mm512_mask_andnot_pd (__m512d s, __mmask8 k, __m512d a, __m512d b);\r\nVANDNPD __m512d _mm512_maskz_andnot_pd (__mmask8 k, __m512d a, __m512d b);\r\nVANDNPD __m256d _mm256_mask_andnot_pd (__m256d s, __mmask8 k, __m256d a, __m256d b);\r\nVANDNPD __m256d _mm256_maskz_andnot_pd (__mmask8 k, __m256d a, __m256d b);\r\nVANDNPD __m128d _mm_mask_andnot_pd (__m128d s, __mmask8 k, __m128d a, __m128d b);\r\nVANDNPD __m128d _mm_maskz_andnot_pd (__mmask8 k, __m128d a, __m128d b);\r\nVANDNPD __m256d _mm256_andnot_pd (__m256d a, __m256d b);\r\nANDNPD __m128d _mm_andnot_pd (__m128d a, __m128d b);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\n\r\n\r\n\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 4.\r\nEVEX-encoded instruction, see Exceptions Type E4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ANDNPD"
},
{
"description": "ANDNPS-Bitwise Logical AND NOT of Packed Single Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 55 /r RM V/V SSE Return the bitwise logical AND NOT of packed single-precision\r\n ANDNPS xmm1, xmm2/m128 floating-point values in xmm1 and xmm2/mem.\r\n VEX.NDS.128.0F 55 /r RVM V/V AVX Return the bitwise logical AND NOT of packed single-precision\r\n VANDNPS xmm1, xmm2, floating-point values in xmm2 and xmm3/mem.\r\n xmm3/m128\r\n VEX.NDS.256.0F 55 /r RVM V/V AVX Return the bitwise logical AND NOT of packed single-precision\r\n VANDNPS ymm1, ymm2, floating-point values in ymm2 and ymm3/mem.\r\n ymm3/m256\r\n EVEX.NDS.128.0F.W0 55 /r FV V/V AVX512VL Return the bitwise logical AND of packed single-precision\r\n VANDNPS xmm1 {k1}{z}, AVX512DQ floating-point values in xmm2 and xmm3/m128/m32bcst\r\n xmm2, xmm3/m128/m32bcst subject to writemask k1.\r\n EVEX.NDS.256.0F.W0 55 /r FV V/V AVX512VL Return the bitwise logical AND of packed single-precision\r\n VANDNPS ymm1 {k1}{z}, AVX512DQ floating-point values in ymm2 and ymm3/m256/m32bcst\r\n ymm2, ymm3/m256/m32bcst subject to writemask k1.\r\n EVEX.NDS.512.0F.W0 55 /r FV V/V AVX512DQ Return the bitwise logical AND of packed single-precision\r\n VANDNPS zmm1 {k1}{z}, floating-point values in zmm2 and zmm3/m512/m32bcst\r\n zmm2, zmm3/m512/m32bcst subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a bitwise logical AND NOT of the four, eight or sixteen packed single-precision floating-point values from\r\nthe first source operand and the second source operand, and stores the result in the destination operand.\r\nEVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. 
The second source operand can be\r\na ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a\r\n32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register\r\nor a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAX_VL-1:256) of the\r\ncorresponding ZMM register destination are zeroed.\r\nVEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (MAX_VL-1:128) of the corresponding\r\nZMM register destination are unmodified.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVANDNPS (EVEX encoded versions)\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n IF (EVEX.b == 1) AND (SRC2 *is memory*)\r\n THEN\r\n DEST[i+31:i] <- (NOT(SRC1[i+31:i])) BITWISE AND SRC2[31:0]\r\n ELSE\r\n DEST[i+31:i] <- (NOT(SRC1[i+31:i])) BITWISE AND SRC2[i+31:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] = 0\r\n FI;\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVANDNPS (VEX.256 encoded version)\r\nDEST[31:0] <- (NOT(SRC1[31:0])) BITWISE AND SRC2[31:0]\r\nDEST[63:32] <- (NOT(SRC1[63:32])) BITWISE AND SRC2[63:32]\r\nDEST[95:64] <- (NOT(SRC1[95:64])) BITWISE AND SRC2[95:64]\r\nDEST[127:96] <- (NOT(SRC1[127:96])) BITWISE AND SRC2[127:96]\r\nDEST[159:128] <- 
(NOT(SRC1[159:128])) BITWISE AND SRC2[159:128]\r\nDEST[191:160] <- (NOT(SRC1[191:160])) BITWISE AND SRC2[191:160]\r\nDEST[223:192] <- (NOT(SRC1[223:192])) BITWISE AND SRC2[223:192]\r\nDEST[255:224] <- (NOT(SRC1[255:224])) BITWISE AND SRC2[255:224]\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVANDNPS (VEX.128 encoded version)\r\nDEST[31:0] <- (NOT(SRC1[31:0])) BITWISE AND SRC2[31:0]\r\nDEST[63:32] <- (NOT(SRC1[63:32])) BITWISE AND SRC2[63:32]\r\nDEST[95:64] <- (NOT(SRC1[95:64])) BITWISE AND SRC2[95:64]\r\nDEST[127:96] <- (NOT(SRC1[127:96])) BITWISE AND SRC2[127:96]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nANDNPS (128-bit Legacy SSE version)\r\nDEST[31:0] <- (NOT(DEST[31:0])) BITWISE AND SRC[31:0]\r\nDEST[63:32] <- (NOT(DEST[63:32])) BITWISE AND SRC[63:32]\r\nDEST[95:64] <- (NOT(DEST[95:64])) BITWISE AND SRC[95:64]\r\nDEST[127:96] <- (NOT(DEST[127:96])) BITWISE AND SRC[127:96]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVANDNPS __m512 _mm512_andnot_ps (__m512 a, __m512 b);\r\nVANDNPS __m512 _mm512_mask_andnot_ps (__m512 s, __mmask16 k, __m512 a, __m512 b);\r\nVANDNPS __m512 _mm512_maskz_andnot_ps (__mmask16 k, __m512 a, __m512 b);\r\nVANDNPS __m256 _mm256_mask_andnot_ps (__m256 s, __mmask8 k, __m256 a, __m256 b);\r\nVANDNPS __m256 _mm256_maskz_andnot_ps (__mmask8 k, __m256 a, __m256 b);\r\nVANDNPS __m128 _mm_mask_andnot_ps (__m128 s, __mmask8 k, __m128 a, __m128 b);\r\nVANDNPS __m128 _mm_maskz_andnot_ps (__mmask8 k, __m128 a, __m128 b);\r\nVANDNPS __m256 _mm256_andnot_ps (__m256 a, __m256 b);\r\nANDNPS __m128 _mm_andnot_ps (__m128 a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 4.\r\nEVEX-encoded instruction, see Exceptions Type E4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ANDNPS"
},
{
"description": "ANDPD-Bitwise Logical AND of Packed Double Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 54 /r RM V/V SSE2 Return the bitwise logical AND of packed double-\r\n ANDPD xmm1, xmm2/m128 precision floating-point values in xmm1 and xmm2/mem.\r\n VEX.NDS.128.66.0F 54 /r RVM V/V AVX Return the bitwise logical AND of packed double-\r\n VANDPD xmm1, xmm2, precision floating-point values in xmm2 and xmm3/mem.\r\n xmm3/m128\r\n VEX.NDS.256.66.0F 54 /r RVM V/V AVX Return the bitwise logical AND of packed double-\r\n VANDPD ymm1, ymm2, precision floating-point values in ymm2 and ymm3/mem.\r\n ymm3/m256\r\n EVEX.NDS.128.66.0F.W1 54 /r FV V/V AVX512VL Return the bitwise logical AND of packed double-\r\n VANDPD xmm1 {k1}{z}, xmm2, AVX512DQ precision floating-point values in xmm2 and\r\n xmm3/m128/m64bcst xmm3/m128/m64bcst subject to writemask k1.\r\n EVEX.NDS.256.66.0F.W1 54 /r FV V/V AVX512VL Return the bitwise logical AND of packed double-\r\n VANDPD ymm1 {k1}{z}, ymm2, AVX512DQ precision floating-point values in ymm2 and\r\n ymm3/m256/m64bcst ymm3/m256/m64bcst subject to writemask k1.\r\n EVEX.NDS.512.66.0F.W1 54 /r FV V/V AVX512DQ Return the bitwise logical AND of packed double-\r\n VANDPD zmm1 {k1}{z}, zmm2, precision floating-point values in zmm2 and\r\n zmm3/m512/m64bcst zmm3/m512/m64bcst subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a bitwise logical AND of the two, four or eight packed double-precision floating-point values from the first\r\nsource operand and the second source operand, and stores the result in the destination operand.\r\nEVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. 
The second source operand can be\r\na ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a\r\n64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register\r\nor a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAX_VL-1:256) of the\r\ncorresponding ZMM register destination are zeroed.\r\nVEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (MAX_VL-1:128) of the corresponding\r\nregister destination are unmodified.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVANDPD (EVEX encoded versions)\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b == 1) AND (SRC2 *is memory*)\r\n THEN\r\n DEST[i+63:i] <- SRC1[i+63:i] BITWISE AND SRC2[63:0]\r\n ELSE\r\n DEST[i+63:i] <- SRC1[i+63:i] BITWISE AND SRC2[i+63:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] = 0\r\n FI;\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVANDPD (VEX.256 encoded version)\r\nDEST[63:0] <- SRC1[63:0] BITWISE AND SRC2[63:0]\r\nDEST[127:64] <- SRC1[127:64] BITWISE AND SRC2[127:64]\r\nDEST[191:128] <- SRC1[191:128] BITWISE AND SRC2[191:128]\r\nDEST[255:192] <- SRC1[255:192] BITWISE AND SRC2[255:192]\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVANDPD (VEX.128 encoded 
version)\r\nDEST[63:0] <- SRC1[63:0] BITWISE AND SRC2[63:0]\r\nDEST[127:64] <- SRC1[127:64] BITWISE AND SRC2[127:64]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nANDPD (128-bit Legacy SSE version)\r\nDEST[63:0] <- DEST[63:0] BITWISE AND SRC[63:0]\r\nDEST[127:64] <- DEST[127:64] BITWISE AND SRC[127:64]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVANDPD __m512d _mm512_and_pd (__m512d a, __m512d b);\r\nVANDPD __m512d _mm512_mask_and_pd (__m512d s, __mmask8 k, __m512d a, __m512d b);\r\nVANDPD __m512d _mm512_maskz_and_pd (__mmask8 k, __m512d a, __m512d b);\r\nVANDPD __m256d _mm256_mask_and_pd (__m256d s, __mmask8 k, __m256d a, __m256d b);\r\nVANDPD __m256d _mm256_maskz_and_pd (__mmask8 k, __m256d a, __m256d b);\r\nVANDPD __m128d _mm_mask_and_pd (__m128d s, __mmask8 k, __m128d a, __m128d b);\r\nVANDPD __m128d _mm_maskz_and_pd (__mmask8 k, __m128d a, __m128d b);\r\nVANDPD __m256d _mm256_and_pd (__m256d a, __m256d b);\r\nANDPD __m128d _mm_and_pd (__m128d a, __m128d b);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\n\r\n\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 4.\r\nEVEX-encoded instruction, see Exceptions Type E4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ANDPD"
},
{
"description": "ANDPS-Bitwise Logical AND of Packed Single Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 54 /r RM V/V SSE Return the bitwise logical AND of packed single-precision\r\n ANDPS xmm1, xmm2/m128 floating-point values in xmm1 and xmm2/mem.\r\n VEX.NDS.128.0F 54 /r RVM V/V AVX Return the bitwise logical AND of packed single-precision\r\n VANDPS xmm1,xmm2, floating-point values in xmm2 and xmm3/mem.\r\n xmm3/m128\r\n VEX.NDS.256.0F 54 /r RVM V/V AVX Return the bitwise logical AND of packed single-precision\r\n VANDPS ymm1, ymm2, floating-point values in ymm2 and ymm3/mem.\r\n ymm3/m256\r\n EVEX.NDS.128.0F.W0 54 /r FV V/V AVX512VL Return the bitwise logical AND of packed single-precision\r\n VANDPS xmm1 {k1}{z}, xmm2, AVX512DQ floating-point values in xmm2 and xmm3/m128/m32bcst\r\n xmm3/m128/m32bcst subject to writemask k1.\r\n EVEX.NDS.256.0F.W0 54 /r FV V/V AVX512VL Return the bitwise logical AND of packed single-precision\r\n VANDPS ymm1 {k1}{z}, ymm2, AVX512DQ floating-point values in ymm2 and ymm3/m256/m32bcst\r\n ymm3/m256/m32bcst subject to writemask k1.\r\n EVEX.NDS.512.0F.W0 54 /r FV V/V AVX512DQ Return the bitwise logical AND of packed single-precision\r\n VANDPS zmm1 {k1}{z}, zmm2, floating-point values in zmm2 and zmm3/m512/m32bcst\r\n zmm3/m512/m32bcst subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a bitwise logical AND of the four, eight or sixteen packed single-precision floating-point values from the\r\nfirst source operand and the second source operand, and stores the result in the destination operand.\r\nEVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. 
The second source operand can be\r\na ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a\r\n32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register\r\nor a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAX_VL-1:256) of the\r\ncorresponding ZMM register destination are zeroed.\r\nVEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (MAX_VL-1:128) of the corresponding\r\nZMM register destination are unmodified.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVANDPS (EVEX encoded versions)\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n IF (EVEX.b == 1) AND (SRC2 *is memory*)\r\n THEN\r\n DEST[i+31:i] <- SRC1[i+31:i] BITWISE AND SRC2[31:0]\r\n ELSE\r\n DEST[i+31:i] <- SRC1[i+31:i] BITWISE AND SRC2[i+31:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI;\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0;\r\n\r\nVANDPS (VEX.256 encoded version)\r\nDEST[31:0] <- SRC1[31:0] BITWISE AND SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] BITWISE AND SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] BITWISE AND SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] BITWISE AND SRC2[127:96]\r\nDEST[159:128] <- SRC1[159:128] BITWISE AND SRC2[159:128]\r\nDEST[191:160] <- 
SRC1[191:160] BITWISE AND SRC2[191:160]\r\nDEST[223:192] <- SRC1[223:192] BITWISE AND SRC2[223:192]\r\nDEST[255:224] <- SRC1[255:224] BITWISE AND SRC2[255:224]\r\nDEST[MAX_VL-1:256] <- 0;\r\n\r\nVANDPS (VEX.128 encoded version)\r\nDEST[31:0] <- SRC1[31:0] BITWISE AND SRC2[31:0]\r\nDEST[63:32] <- SRC1[63:32] BITWISE AND SRC2[63:32]\r\nDEST[95:64] <- SRC1[95:64] BITWISE AND SRC2[95:64]\r\nDEST[127:96] <- SRC1[127:96] BITWISE AND SRC2[127:96]\r\nDEST[MAX_VL-1:128] <- 0;\r\n\r\nANDPS (128-bit Legacy SSE version)\r\nDEST[31:0] <- DEST[31:0] BITWISE AND SRC[31:0]\r\nDEST[63:32] <- DEST[63:32] BITWISE AND SRC[63:32]\r\nDEST[95:64] <- DEST[95:64] BITWISE AND SRC[95:64]\r\nDEST[127:96] <- DEST[127:96] BITWISE AND SRC[127:96]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVANDPS __m512 _mm512_and_ps (__m512 a, __m512 b);\r\nVANDPS __m512 _mm512_mask_and_ps (__m512 s, __mmask16 k, __m512 a, __m512 b);\r\nVANDPS __m512 _mm512_maskz_and_ps (__mmask16 k, __m512 a, __m512 b);\r\nVANDPS __m256 _mm256_mask_and_ps (__m256 s, __mmask8 k, __m256 a, __m256 b);\r\nVANDPS __m256 _mm256_maskz_and_ps (__mmask8 k, __m256 a, __m256 b);\r\nVANDPS __m128 _mm_mask_and_ps (__m128 s, __mmask8 k, __m128 a, __m128 b);\r\nVANDPS __m128 _mm_maskz_and_ps (__mmask8 k, __m128 a, __m128 b);\r\nVANDPS __m256 _mm256_and_ps (__m256 a, __m256 b);\r\nANDPS __m128 _mm_and_ps (__m128 a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nVEX-encoded instruction, see Exceptions Type 4.\r\nEVEX-encoded instruction, see Exceptions Type E4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ANDPS"
},
{
"description": "ARPL-Adjust RPL Field of Segment Selector\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n63 /r ARPL r/m16, r16 NP N. E. Valid Adjust RPL of r/m16 to not less than RPL of\r\n r16.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP ModRM:r/m (w) ModRM:reg (r) NA NA\r\n\r\nDescription\r\nCompares the RPL fields of two segment selectors. The first operand (the destination operand) contains one\r\nsegment selector and the second operand (source operand) contains the other. (The RPL field is located in bits 0\r\nand 1 of each operand.) If the RPL field of the destination operand is less than the RPL field of the source operand,\r\nthe ZF flag is set and the RPL field of the destination operand is increased to match that of the source operand.\r\nOtherwise, the ZF flag is cleared and no change is made to the destination operand. (The destination operand can\r\nbe a word register or a memory location; the source operand must be a word register.)\r\nThe ARPL instruction is provided for use by operating-system procedures (however, it can also be used by applica-\r\ntions). It is generally used to adjust the RPL of a segment selector that has been passed to the operating system\r\nby an application program to match the privilege level of the application program. Here the segment selector\r\npassed to the operating system is placed in the destination operand and segment selector for the application\r\nprogram's code segment is placed in the source operand. (The RPL field in the source operand represents the priv-\r\nilege level of the application program.) 
Execution of the ARPL instruction then ensures that the RPL of the segment\r\nselector received by the operating system is no lower (does not have a higher privilege) than the privilege level of\r\nthe application program (the segment selector for the application program's code segment can be read from the\r\nstack following a procedure call).\r\nThis instruction executes as described in compatibility mode and legacy mode. It is not encodable in 64-bit mode.\r\nSee \"Checking Caller Access Privileges\" in Chapter 3, \"Protected-Mode Memory Management,\" of the Intel 64 and\r\nIA-32 Architectures Software Developer's Manual, Volume 3A, for more information about the use of this instruc-\r\ntion.\r\n\r\nOperation\r\nIF 64-BIT MODE\r\n THEN\r\n See MOVSXD;\r\n ELSE\r\n IF DEST[RPL] < SRC[RPL]\r\n THEN\r\n ZF <- 1;\r\n DEST[RPL] <- SRC[RPL];\r\n ELSE\r\n ZF <- 0;\r\n FI;\r\nFI;\r\n\r\nFlags Affected\r\nThe ZF flag is set to 1 if the RPL field of the destination operand is less than that of the source operand; otherwise,\r\nit is set to 0.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment\r\n selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#UD The ARPL instruction is not recognized in real-address mode.\r\n If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD The ARPL instruction is not recognized in virtual-8086 mode.\r\n If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in 
protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nNot applicable.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ARPL"
},
{
"description": "BEXTR - Bit Field Extract\r\n Opcode/Instruction Op/ 64/32 CPUID Description\r\n En -bit Feature\r\n Mode Flag\r\n VEX.NDS.LZ.0F38.W0 F7 /r RMV V/V BMI1 Contiguous bitwise extract from r/m32 using r32b as control; store\r\n BEXTR r32a, r/m32, r32b result in r32a.\r\n VEX.NDS.LZ.0F38.W1 F7 /r RMV V/N.E. BMI1 Contiguous bitwise extract from r/m64 using r64b as control; store\r\n BEXTR r64a, r/m64, r64b result in r64a\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMV ModRM:reg (w) ModRM:r/m (r) VEX.vvvv (r) NA\r\n\r\nDescription\r\nExtracts contiguous bits from the first source operand (the second operand) using an index value and length value\r\nspecified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the\r\nstarting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the\r\nsecond source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH)\r\nbeginning at the START position to extract. Only bit positions up to (OperandSize -1) of the first source operand are\r\nextracted. The extracted bits are written to the destination register, starting from the least significant bit. All higher\r\norder bits in the destination operand (starting at bit position LENGTH) are zeroed. The destination register is\r\ncleared if no bits are extracted.\r\nThis instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in\r\n64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An\r\nattempt to execute this instruction with VEX.L not equal to 0 will cause #UD.\r\n\r\nOperation\r\nSTART <- SRC2[7:0];\r\nLEN <- SRC2[15:8];\r\nTEMP <- ZERO_EXTEND_TO_512 (SRC1 );\r\nDEST <- ZERO_EXTEND(TEMP[START+LEN -1: START]);\r\nZF <- (DEST = 0);\r\n\r\nFlags Affected\r\nZF is updated based on the result. 
AF, SF, and PF are undefined. All other flags are cleared.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBEXTR: unsigned __int32 _bextr_u32(unsigned __int32 src, unsigned __int32 start, unsigned __int32 len);\r\n\r\nBEXTR: unsigned __int64 _bextr_u64(unsigned __int64 src, unsigned __int32 start, unsigned __int32 len);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Section 2.5.1, \"Exception Conditions for VEX-Encoded GPR Instructions\", Table 2-29; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BEXTR"
},
{
"description": "BLENDPD - Blend Packed Double Precision Floating-Point Values\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 3A 0D /r ib RMI V/V SSE4_1 Select packed DP-FP values from xmm1 and\r\n BLENDPD xmm1, xmm2/m128, imm8 xmm2/m128 from mask specified in imm8\r\n and store the values into xmm1.\r\n VEX.NDS.128.66.0F3A.WIG 0D /r ib RVMI V/V AVX Select packed double-precision floating-point\r\n VBLENDPD xmm1, xmm2, xmm3/m128, imm8 values from xmm2 and xmm3/m128 from\r\n mask in imm8 and store the values in xmm1.\r\n VEX.NDS.256.66.0F3A.WIG 0D /r ib RVMI V/V AVX Select packed double-precision floating-point\r\n VBLENDPD ymm1, ymm2, ymm3/m256, imm8 values from ymm2 and ymm3/m256 from\r\n mask in imm8 and store the values in ymm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) imm8[3:0]\r\n\r\nDescription\r\nDouble-precision floating-point values from the second source operand (third operand) are conditionally merged\r\nwith values from the first source operand (second operand) and written to the destination operand (first operand).\r\nThe immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the desti-\r\nnation is copied from the second source or first source. If a bit in the mask, corresponding to a quadword, is \"1\", then\r\nthe double-precision floating-point value in the second source operand is copied, else the value in the first source\r\noperand is copied.\r\n128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding\r\nYMM register destination are unmodified.\r\nVEX.128 encoded version: The first source operand is an XMM register. 
The second source operand is an XMM\r\nregister or 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of\r\nthe corresponding YMM register destination are zeroed.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register.\r\n\r\nOperation\r\nBLENDPD (128-bit Legacy SSE version)\r\nIF (IMM8[0] = 0)THEN DEST[63:0] <- DEST[63:0]\r\n ELSE DEST [63:0] <- SRC[63:0] FI\r\nIF (IMM8[1] = 0) THEN DEST[127:64] <- DEST[127:64]\r\n ELSE DEST [127:64] <- SRC[127:64] FI\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVBLENDPD (VEX.128 encoded version)\r\nIF (IMM8[0] = 0)THEN DEST[63:0] <- SRC1[63:0]\r\n ELSE DEST [63:0] <- SRC2[63:0] FI\r\nIF (IMM8[1] = 0) THEN DEST[127:64] <- SRC1[127:64]\r\n ELSE DEST [127:64] <- SRC2[127:64] FI\r\nDEST[VLMAX-1:128] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVBLENDPD (VEX.256 encoded version)\r\nIF (IMM8[0] = 0)THEN DEST[63:0] <- SRC1[63:0]\r\n ELSE DEST [63:0] <- SRC2[63:0] FI\r\nIF (IMM8[1] = 0) THEN DEST[127:64] <- SRC1[127:64]\r\n ELSE DEST [127:64] <- SRC2[127:64] FI\r\nIF (IMM8[2] = 0) THEN DEST[191:128] <- SRC1[191:128]\r\n ELSE DEST [191:128] <- SRC2[191:128] FI\r\nIF (IMM8[3] = 0) THEN DEST[255:192] <- SRC1[255:192]\r\n ELSE DEST [255:192] <- SRC2[255:192] FI\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLENDPD: __m128d _mm_blend_pd (__m128d v1, __m128d v2, const int mask);\r\n\r\nVBLENDPD: __m256d _mm256_blend_pd (__m256d a, __m256d b, const int mask);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLENDPD"
},
{
"description": "BLENDPS - Blend Packed Single Precision Floating-Point Values\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 3A 0C /r ib RMI V/V SSE4_1 Select packed single precision floating-point\r\n BLENDPS xmm1, xmm2/m128, imm8 values from xmm1 and xmm2/m128 from\r\n mask specified in imm8 and store the values\r\n into xmm1.\r\n VEX.NDS.128.66.0F3A.WIG 0C /r ib RVMI V/V AVX Select packed single-precision floating-point\r\n VBLENDPS xmm1, xmm2, xmm3/m128, imm8 values from xmm2 and xmm3/m128 from\r\n mask in imm8 and store the values in xmm1.\r\n VEX.NDS.256.66.0F3A.WIG 0C /r ib RVMI V/V AVX Select packed single-precision floating-point\r\n VBLENDPS ymm1, ymm2, ymm3/m256, imm8 values from ymm2 and ymm3/m256 from\r\n mask in imm8 and store the values in ymm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) imm8\r\n\r\nDescription\r\nPacked single-precision floating-point values from the second source operand (third operand) are conditionally\r\nmerged with values from the first source operand (second operand) and written to the destination operand (first\r\noperand). The immediate bits [7:0] determine whether the corresponding single precision floating-point value in\r\nthe destination is copied from the second source or first source. If a bit in the mask, corresponding to a dword, is\r\n\"1\", then the single-precision floating-point value in the second source operand is copied, else the value in the first\r\nsource operand is copied.\r\n128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding\r\nYMM register destination are unmodified.\r\nVEX.128 encoded version: The first source operand is an XMM register. 
The second source operand is an XMM register\r\nor 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the\r\ncorresponding YMM register destination are zeroed.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register.\r\n\r\nOperation\r\nBLENDPS (128-bit Legacy SSE version)\r\nIF (IMM8[0] = 0) THEN DEST[31:0] <-DEST[31:0]\r\n ELSE DEST [31:0] <- SRC[31:0] FI\r\nIF (IMM8[1] = 0) THEN DEST[63:32] <- DEST[63:32]\r\n ELSE DEST [63:32] <- SRC[63:32] FI\r\nIF (IMM8[2] = 0) THEN DEST[95:64] <- DEST[95:64]\r\n ELSE DEST [95:64] <- SRC[95:64] FI\r\nIF (IMM8[3] = 0) THEN DEST[127:96] <- DEST[127:96]\r\n ELSE DEST [127:96] <- SRC[127:96] FI\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\n\r\n\r\n\r\n\r\nVBLENDPS (VEX.128 encoded version)\r\nIF (IMM8[0] = 0) THEN DEST[31:0] <-SRC1[31:0]\r\n ELSE DEST [31:0] <- SRC2[31:0] FI\r\nIF (IMM8[1] = 0) THEN DEST[63:32] <- SRC1[63:32]\r\n ELSE DEST [63:32] <- SRC2[63:32] FI\r\nIF (IMM8[2] = 0) THEN DEST[95:64] <- SRC1[95:64]\r\n ELSE DEST [95:64] <- SRC2[95:64] FI\r\nIF (IMM8[3] = 0) THEN DEST[127:96] <- SRC1[127:96]\r\n ELSE DEST [127:96] <- SRC2[127:96] FI\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nVBLENDPS (VEX.256 encoded version)\r\nIF (IMM8[0] = 0) THEN DEST[31:0] <-SRC1[31:0]\r\n ELSE DEST [31:0] <- SRC2[31:0] FI\r\nIF (IMM8[1] = 0) THEN DEST[63:32] <- SRC1[63:32]\r\n ELSE DEST [63:32] <- SRC2[63:32] FI\r\nIF (IMM8[2] = 0) THEN DEST[95:64] <- SRC1[95:64]\r\n ELSE DEST [95:64] <- SRC2[95:64] FI\r\nIF (IMM8[3] = 0) THEN DEST[127:96] <- SRC1[127:96]\r\n ELSE DEST [127:96] <- SRC2[127:96] FI\r\nIF (IMM8[4] = 0) THEN DEST[159:128] <- SRC1[159:128]\r\n ELSE DEST [159:128] <- SRC2[159:128] FI\r\nIF (IMM8[5] = 0) THEN DEST[191:160] <- SRC1[191:160]\r\n ELSE DEST [191:160] <- SRC2[191:160] FI\r\nIF (IMM8[6] = 0) THEN DEST[223:192] <- SRC1[223:192]\r\n ELSE DEST [223:192] 
<- SRC2[223:192] FI\r\nIF (IMM8[7] = 0) THEN DEST[255:224] <- SRC1[255:224]\r\n ELSE DEST [255:224] <- SRC2[255:224] FI\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLENDPS: __m128 _mm_blend_ps (__m128 v1, __m128 v2, const int mask);\r\n\r\nVBLENDPS: __m256 _mm256_blend_ps (__m256 a, __m256 b, const int mask);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLENDPS"
},
{
"description": "BLENDVPD - Variable Blend Packed Double Precision Floating-Point Values\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 38 15 /r RM0 V/V SSE4_1 Select packed DP FP values from xmm1 and\r\n BLENDVPD xmm1, xmm2/m128, <XMM0> xmm2 from mask specified in XMM0 and\r\n store the values in xmm1.\r\n VEX.NDS.128.66.0F3A.W0 4B /r /is4 RVMR V/V AVX Conditionally copy double-precision floating-\r\n VBLENDVPD xmm1, xmm2, xmm3/m128, xmm4 point values from xmm2 or xmm3/m128 to\r\n xmm1, based on mask bits in the mask\r\n operand, xmm4.\r\n VEX.NDS.256.66.0F3A.W0 4B /r /is4 RVMR V/V AVX Conditionally copy double-precision floating-\r\n VBLENDVPD ymm1, ymm2, ymm3/m256, ymm4 point values from ymm2 or ymm3/m256 to\r\n ymm1, based on mask bits in the mask\r\n operand, ymm4.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM0 ModRM:reg (r, w) ModRM:r/m (r) implicit XMM0 NA\r\n RVMR ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) imm8[7:4]\r\n\r\nDescription\r\nConditionally copy each quadword data element of double-precision floating-point value from the second source\r\noperand and the first source operand depending on mask bits defined in the mask register operand. The mask bits\r\nare the most significant bit in each quadword element of the mask register.\r\nEach quadword element of the destination operand is copied from:\r\n. the corresponding quadword element in the second source operand, if a mask bit is \"1\"; or\r\n. the corresponding quadword element in the first source operand, if a mask bit is \"0\"\r\nThe register assignment of the implicit mask operand for BLENDVPD is defined to be the architectural register\r\nXMM0.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:128)\r\nof the corresponding YMM destination register remain unchanged. 
The mask register operand is implicitly defined\r\nto be the architectural register XMM0. An attempt to execute BLENDVPD with a VEX prefix will cause #UD.\r\nVEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second\r\nsource operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and\r\nencoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is\r\nignored. The upper bits (VLMAX-1:128) of the corresponding YMM register (destination register) are zeroed.\r\nVEX.W must be 0, otherwise, the instruction will #UD.\r\nVEX.256 encoded version: The first source operand and destination operand are YMM registers. The second source\r\noperand can be a YMM register or a 256-bit memory location. The mask operand is the third source register, and\r\nencoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is\r\nignored. VEX.W must be 0, otherwise, the instruction will #UD.\r\nVBLENDVPD permits the mask to be any XMM or YMM register. 
In contrast, BLENDVPD treats XMM0 implicitly as the\r\nmask and does not support non-destructive destination operation.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nBLENDVPD (128-bit Legacy SSE version)\r\nMASK <- XMM0\r\nIF (MASK[63] = 0) THEN DEST[63:0] <- DEST[63:0]\r\n ELSE DEST [63:0] <- SRC[63:0] FI\r\nIF (MASK[127] = 0) THEN DEST[127:64] <- DEST[127:64]\r\n ELSE DEST [127:64] <- SRC[127:64] FI\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVBLENDVPD (VEX.128 encoded version)\r\nMASK <- SRC3\r\nIF (MASK[63] = 0) THEN DEST[63:0] <- SRC1[63:0]\r\n ELSE DEST [63:0] <- SRC2[63:0] FI\r\nIF (MASK[127] = 0) THEN DEST[127:64] <- SRC1[127:64]\r\n ELSE DEST [127:64] <- SRC2[127:64] FI\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nVBLENDVPD (VEX.256 encoded version)\r\nMASK <- SRC3\r\nIF (MASK[63] = 0) THEN DEST[63:0] <- SRC1[63:0]\r\n ELSE DEST [63:0] <- SRC2[63:0] FI\r\nIF (MASK[127] = 0) THEN DEST[127:64] <- SRC1[127:64]\r\n ELSE DEST [127:64] <- SRC2[127:64] FI\r\nIF (MASK[191] = 0) THEN DEST[191:128] <- SRC1[191:128]\r\n ELSE DEST [191:128] <- SRC2[191:128] FI\r\nIF (MASK[255] = 0) THEN DEST[255:192] <- SRC1[255:192]\r\n ELSE DEST [255:192] <- SRC2[255:192] FI\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLENDVPD: __m128d _mm_blendv_pd(__m128d v1, __m128d v2, __m128d v3);\r\n\r\nVBLENDVPD: __m128d _mm_blendv_pd (__m128d a, __m128d b, __m128d mask);\r\n\r\nVBLENDVPD: __m256d _mm256_blendv_pd (__m256d a, __m256d b, __m256d mask);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLENDVPD"
},
{
"description": "BLENDVPS - Variable Blend Packed Single Precision Floating-Point Values\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 38 14 /r RM0 V/V SSE4_1 Select packed single precision floating-point\r\n BLENDVPS xmm1, xmm2/m128, <XMM0> values from xmm1 and xmm2/m128 from\r\n mask specified in XMM0 and store the values\r\n into xmm1.\r\n VEX.NDS.128.66.0F3A.W0 4A /r /is4 RVMR V/V AVX Conditionally copy single-precision floating-\r\n VBLENDVPS xmm1, xmm2, xmm3/m128, xmm4 point values from xmm2 or xmm3/m128 to\r\n xmm1, based on mask bits in the specified\r\n mask operand, xmm4.\r\n VEX.NDS.256.66.0F3A.W0 4A /r /is4 RVMR V/V AVX Conditionally copy single-precision floating-\r\n VBLENDVPS ymm1, ymm2, ymm3/m256, ymm4 point values from ymm2 or ymm3/m256 to\r\n ymm1, based on mask bits in the specified\r\n mask register, ymm4.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM0 ModRM:reg (r, w) ModRM:r/m (r) implicit XMM0 NA\r\n RVMR ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) imm8[7:4]\r\n\r\nDescription\r\nConditionally copy each dword data element of single-precision floating-point value from the second source\r\noperand and the first source operand depending on mask bits defined in the mask register operand. The mask bits\r\nare the most significant bit in each dword element of the mask register.\r\nEach dword element of the destination operand is copied from:\r\n. the corresponding dword element in the second source operand, if a mask bit is \"1\"; or\r\n. the corresponding dword element in the first source operand, if a mask bit is \"0\"\r\nThe register assignment of the implicit mask operand for BLENDVPS is defined to be the architectural register\r\nXMM0.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (VLMAX-1:128)\r\nof the corresponding YMM destination register remain unchanged. 
The mask register operand is implicitly defined\r\nto be the architectural register XMM0. An attempt to execute BLENDVPS with a VEX prefix will cause #UD.\r\nVEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second\r\nsource operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and\r\nencoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is\r\nignored. The upper bits (VLMAX-1:128) of the corresponding YMM register (destination register) are zeroed.\r\nVEX.W must be 0, otherwise, the instruction will #UD.\r\nVEX.256 encoded version: The first source operand and destination operand are YMM registers. The second source\r\noperand can be a YMM register or a 256-bit memory location. The mask operand is the third source register, and\r\nencoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is\r\nignored. VEX.W must be 0, otherwise, the instruction will #UD.\r\nVBLENDVPS permits the mask to be any XMM or YMM register. 
In contrast, BLENDVPS treats XMM0 implicitly as the\r\nmask and does not support non-destructive destination operation.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nBLENDVPS (128-bit Legacy SSE version)\r\nMASK <- XMM0\r\nIF (MASK[31] = 0) THEN DEST[31:0] <- DEST[31:0]\r\n ELSE DEST [31:0] <- SRC[31:0] FI\r\nIF (MASK[63] = 0) THEN DEST[63:32] <- DEST[63:32]\r\n ELSE DEST [63:32] <- SRC[63:32] FI\r\nIF (MASK[95] = 0) THEN DEST[95:64] <- DEST[95:64]\r\n ELSE DEST [95:64] <- SRC[95:64] FI\r\nIF (MASK[127] = 0) THEN DEST[127:96] <- DEST[127:96]\r\n ELSE DEST [127:96] <- SRC[127:96] FI\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVBLENDVPS (VEX.128 encoded version)\r\nMASK <- SRC3\r\nIF (MASK[31] = 0) THEN DEST[31:0] <- SRC1[31:0]\r\n ELSE DEST [31:0] <- SRC2[31:0] FI\r\nIF (MASK[63] = 0) THEN DEST[63:32] <- SRC1[63:32]\r\n ELSE DEST [63:32] <- SRC2[63:32] FI\r\nIF (MASK[95] = 0) THEN DEST[95:64] <- SRC1[95:64]\r\n ELSE DEST [95:64] <- SRC2[95:64] FI\r\nIF (MASK[127] = 0) THEN DEST[127:96] <- SRC1[127:96]\r\n ELSE DEST [127:96] <- SRC2[127:96] FI\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nVBLENDVPS (VEX.256 encoded version)\r\nMASK <- SRC3\r\nIF (MASK[31] = 0) THEN DEST[31:0] <- SRC1[31:0]\r\n ELSE DEST [31:0] <- SRC2[31:0] FI\r\nIF (MASK[63] = 0) THEN DEST[63:32] <- SRC1[63:32]\r\n ELSE DEST [63:32] <- SRC2[63:32] FI\r\nIF (MASK[95] = 0) THEN DEST[95:64] <- SRC1[95:64]\r\n ELSE DEST [95:64] <- SRC2[95:64] FI\r\nIF (MASK[127] = 0) THEN DEST[127:96] <- SRC1[127:96]\r\n ELSE DEST [127:96] <- SRC2[127:96] FI\r\nIF (MASK[159] = 0) THEN DEST[159:128] <- SRC1[159:128]\r\n ELSE DEST [159:128] <- SRC2[159:128] FI\r\nIF (MASK[191] = 0) THEN DEST[191:160] <- SRC1[191:160]\r\n ELSE DEST [191:160] <- SRC2[191:160] FI\r\nIF (MASK[223] = 0) THEN DEST[223:192] <- SRC1[223:192]\r\n ELSE DEST [223:192] <- SRC2[223:192] FI\r\nIF (MASK[255] = 0) THEN DEST[255:224] <- SRC1[255:224]\r\n ELSE DEST [255:224] <- SRC2[255:224] FI\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLENDVPS: __m128 _mm_blendv_ps(__m128 
v1, __m128 v2, __m128 v3);\r\n\r\nVBLENDVPS: __m128 _mm_blendv_ps (__m128 a, __m128 b, __m128 mask);\r\n\r\nVBLENDVPS: __m256 _mm256_blendv_ps (__m256 a, __m256 b, __m256 mask);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\n\r\n\r\n\r\nOther Exceptions\r\nSee Exceptions Type 4; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLENDVPS"
},
{
"description": "BLSI - Extract Lowest Set Isolated Bit\r\n Opcode/Instruction Op/ 64/32 CPUID Description\r\n En -bit Feature\r\n Mode Flag\r\n VEX.NDD.LZ.0F38.W0 F3 /3 VM V/V BMI1 Extract lowest set bit from r/m32 and set that bit in r32.\r\n BLSI r32, r/m32\r\n VEX.NDD.LZ.0F38.W1 F3 /3 VM V/N.E. BMI1 Extract lowest set bit from r/m64, and set that bit in r64.\r\n BLSI r64, r/m64\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n VM VEX.vvvv (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nExtracts the lowest set bit from the source operand and sets the corresponding bit in the destination register. All\r\nother bits in the destination operand are zeroed. If no bits are set in the source operand, BLSI sets all the bits in\r\nthe destination to 0, sets ZF, and clears CF.\r\nThis instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in\r\n64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An\r\nattempt to execute this instruction with VEX.L not equal to 0 will cause #UD.\r\n\r\nOperation\r\ntemp <- (-SRC) bitwiseAND (SRC);\r\nSF <- temp[OperandSize -1];\r\nZF <- (temp = 0);\r\nIF SRC = 0\r\n CF <- 0;\r\nELSE\r\n CF <- 1;\r\nFI\r\nDEST <- temp;\r\n\r\nFlags Affected\r\nZF and SF are updated based on the result. CF is set if the source is not zero. The OF flag is cleared. AF and PF\r\nflags are undefined.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLSI: unsigned __int32 _blsi_u32(unsigned __int32 src);\r\n\r\nBLSI: unsigned __int64 _blsi_u64(unsigned __int64 src);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Section 2.5.1, \"Exception Conditions for VEX-Encoded GPR Instructions\", Table 2-29; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLSI"
},
{
"description": "BLSMSK - Get Mask Up to Lowest Set Bit\r\n Opcode/Instruction Op/ 64/32 CPUID Description\r\n En -bit Feature\r\n Mode Flag\r\n VEX.NDD.LZ.0F38.W0 F3 /2 VM V/V BMI1 Set all lower bits in r32 to \"1\" starting from bit 0 to lowest set bit in\r\n BLSMSK r32, r/m32 r/m32.\r\n VEX.NDD.LZ.0F38.W1 F3 /2 VM V/N.E. BMI1 Set all lower bits in r64 to \"1\" starting from bit 0 to lowest set bit in\r\n BLSMSK r64, r/m64 r/m64.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n VM VEX.vvvv (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nSets all the lower bits of the destination operand to \"1\" up to and including the lowest set bit (=1) in the source\r\noperand. If the source operand is zero, BLSMSK sets all bits of the destination operand to 1 and also sets CF to 1.\r\nThis instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in\r\n64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An\r\nattempt to execute this instruction with VEX.L not equal to 0 will cause #UD.\r\n\r\nOperation\r\ntemp <- (SRC-1) XOR (SRC) ;\r\nSF <- temp[OperandSize -1];\r\nZF <- 0;\r\nIF SRC = 0\r\n CF <- 1;\r\nELSE\r\n CF <- 0;\r\nFI\r\nDEST <- temp;\r\n\r\nFlags Affected\r\nSF is updated based on the result. CF is set if the source is zero. ZF and OF flags are cleared. AF and PF flags are\r\nundefined.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLSMSK: unsigned __int32 _blsmsk_u32(unsigned __int32 src);\r\n\r\nBLSMSK: unsigned __int64 _blsmsk_u64(unsigned __int64 src);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Section 2.5.1, \"Exception Conditions for VEX-Encoded GPR Instructions\", Table 2-29; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLSMSK"
},
{
"description": "BLSR - Reset Lowest Set Bit\r\n Opcode/Instruction Op/ 64/32 CPUID Description\r\n En -bit Feature\r\n Mode Flag\r\n VEX.NDD.LZ.0F38.W0 F3 /1 VM V/V BMI1 Reset lowest set bit of r/m32, keep all other bits of r/m32 and write\r\n BLSR r32, r/m32 result to r32.\r\n VEX.NDD.LZ.0F38.W1 F3 /1 VM V/N.E. BMI1 Reset lowest set bit of r/m64, keep all other bits of r/m64 and write\r\n BLSR r64, r/m64 result to r64.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n VM VEX.vvvv (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nCopies all bits from the source operand to the destination operand and resets (=0) the bit position in the destina-\r\ntion operand that corresponds to the lowest set bit of the source operand. If the source operand is zero BLSR sets\r\nCF.\r\nThis instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in\r\n64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An\r\nattempt to execute this instruction with VEX.L not equal to 0 will cause #UD.\r\n\r\nOperation\r\ntemp <- (SRC-1) bitwiseAND ( SRC );\r\nSF <- temp[OperandSize -1];\r\nZF <- (temp = 0);\r\nIF SRC = 0\r\n CF <- 1;\r\nELSE\r\n CF <- 0;\r\nFI\r\nDEST <- temp;\r\n\r\nFlags Affected\r\nZF and SF flags are updated based on the result. CF is set if the source is zero. OF flag is cleared. AF and PF flags\r\nare undefined.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBLSR: unsigned __int32 _blsr_u32(unsigned __int32 src);\r\n\r\nBLSR: unsigned __int64 _blsr_u64(unsigned __int64 src);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Section 2.5.1, \"Exception Conditions for VEX-Encoded GPR Instructions\", Table 2-29; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BLSR"
},
{
"description": "BNDCL-Check Lower Bound\r\n Opcode/ Op/En 64/32 CPUID Description\r\n Instruction bit Mode Feature\r\n Support Flag\r\n F3 0F 1A /r RM NE/V MPX Generate a #BR if the address in r/m32 is lower than the lower\r\n BNDCL bnd, r/m32 bound in bnd.LB.\r\n F3 0F 1A /r RM V/NE MPX Generate a #BR if the address in r/m64 is lower than the lower\r\n BNDCL bnd, r/m64 bound in bnd.LB.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3\r\n RM ModRM:reg (w) ModRM:r/m (r) NA\r\n\r\nDescription\r\nCompare the address in the second operand with the lower bound in bnd. The second operand can be either a\r\nregister or memory operand. If the address is lower than the lower bound in bnd.LB, it will set BNDSTATUS to 01H\r\nand signal a #BR exception.\r\nThis instruction does not cause any memory access, and does not read or write any flags.\r\n\r\nOperation\r\nBNDCL BND, reg\r\nIF reg < BND.LB Then\r\n BNDSTATUS <- 01H;\r\n #BR;\r\nFI;\r\n\r\nBNDCL BND, mem\r\nTEMP <- LEA(mem);\r\nIF TEMP < BND.LB Then\r\n BNDSTATUS <- 01H;\r\n #BR;\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBNDCL void _bnd_chk_ptr_lbounds(const void *q)\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#BR If lower bound check fails.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 67H prefix is not used and CS.D=0.\r\n If 67H prefix is used and CS.D=1.\r\n\r\n\r\n\r\n\r\n\r\nReal-Address Mode Exceptions\r\n#BR If lower bound check fails.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#BR If lower bound check fails.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If 
ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BNDCL"
},
{
"description": "-R:BNDCU",
"mnem": "BNDCN"
},
{
"description": "BNDCU/BNDCN-Check Upper Bound\r\n Opcode/ Op/En 64/32 CPUID Description\r\n Instruction bit Mode Feature\r\n Support Flag\r\n F2 0F 1A /r RM NE/V MPX Generate a #BR if the address in r/m32 is higher than the upper\r\n BNDCU bnd, r/m32 bound in bnd.UB (bnb.UB in 1's complement form).\r\n F2 0F 1A /r RM V/NE MPX Generate a #BR if the address in r/m64 is higher than the upper\r\n BNDCU bnd, r/m64 bound in bnd.UB (bnb.UB in 1's complement form).\r\n F2 0F 1B /r RM NE/V MPX Generate a #BR if the address in r/m32 is higher than the upper\r\n BNDCN bnd, r/m32 bound in bnd.UB (bnb.UB not in 1's complement form).\r\n F2 0F 1B /r RM V/NE MPX Generate a #BR if the address in r/m64 is higher than the upper\r\n BNDCN bnd, r/m64 bound in bnd.UB (bnb.UB not in 1's complement form).\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3\r\n RM ModRM:reg (w) ModRM:r/m (r) NA\r\n\r\nDescription\r\nCompare the address in the second operand with the upper bound in bnd. The second operand can be either a\r\nregister or a memory operand. If the address is higher than the upper bound in bnd.UB, it will set BNDSTATUS to\r\n01H and signal a #BR exception.\r\nBNDCU perform 1's complement operation on the upper bound of bnd first before proceeding with address compar-\r\nison. 
BNDCN performs address comparison directly using the upper bound in bnd that is already reverted out of 1's\r\ncomplement form.\r\nThis instruction does not cause any memory access, and does not read or write any flags.\r\nEffective address computation of m32/64 has identical behavior to LEA.\r\n\r\nOperation\r\nBNDCU BND, reg\r\nIF reg > NOT(BND.UB) Then\r\n BNDSTATUS <- 01H;\r\n #BR;\r\nFI;\r\n\r\nBNDCU BND, mem\r\nTEMP <- LEA(mem);\r\nIF TEMP > NOT(BND.UB) Then\r\n BNDSTATUS <- 01H;\r\n #BR;\r\nFI;\r\n\r\nBNDCN BND, reg\r\nIF reg > BND.UB Then\r\n BNDSTATUS <- 01H;\r\n #BR;\r\nFI;\r\n\r\nBNDCN BND, mem\r\nTEMP <- LEA(mem);\r\nIF TEMP > BND.UB Then\r\n BNDSTATUS <- 01H;\r\n #BR;\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBNDCU void _bnd_chk_ptr_ubounds(const void *q)\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#BR If upper bound check fails.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 67H prefix is not used and CS.D=0.\r\n If 67H prefix is used and CS.D=1.\r\n\r\nReal-Address Mode Exceptions\r\n#BR If upper bound check fails.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#BR If upper bound check fails.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BNDCU"
},
{
"description": "BNDLDX-Load Extended Bounds Using Address Translation\r\n Opcode/ Op/En 64/32 CPUID Description\r\n Instruction bit Mode Feature\r\n Support Flag\r\n 0F 1A /r RM V/V MPX Load the bounds stored in a bound table entry (BTE) into bnd with\r\n BNDLDX bnd, mib address translation using the base of mib and conditional on the\r\n index of mib matching the pointer value in the BTE.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3\r\n SIB.base (r): Address of pointer\r\n RM ModRM:reg (w) NA\r\n SIB.index(r)\r\n\r\nDescription\r\nBNDLDX uses the linear address constructed from the base register and displacement of the SIB-addressing form\r\nof the memory operand (mib) to perform address translation to access a bound table entry and conditionally load\r\nthe bounds in the BTE to the destination. The destination register is updated with the bounds in the BTE, if the\r\ncontent of the index register of mib matches the pointer value stored in the BTE.\r\nIf the pointer value comparison fails, the destination is updated with INIT bounds (lb = 0x0, ub = 0x0) (note: as\r\narticulated earlier, the upper bound is represented using 1's complement, therefore, the 0x0 value of upper bound\r\nallows for access to full memory).\r\nThis instruction does not cause memory access to the linear address of mib nor the effective address referenced by\r\nthe base, and does not read or write any flags.\r\nSegment overrides apply to the linear address computation with the base of mib, and are used during address\r\ntranslation to generate the address of the bound table entry. By default, the address of the BTE is assumed to be\r\nlinear address. There are no segmentation checks performed on the base of mib.\r\nThe base of mib will not be checked for canonical address violation as it does not access memory.\r\nAny encoding of this instruction that does not specify base or index register will treat those registers as zero\r\n(constant). 
The reg-reg form of this instruction will remain a NOP.\r\nThe scale field of the SIB byte has no effect on these instructions and is ignored.\r\nThe bound register may be partially updated on memory faults. The order in which memory operands are loaded is\r\nimplementation specific.\r\n\r\nOperation\r\nbase <- mib.SIB.base ? mib.SIB.base + Disp: 0;\r\nptr_value <- mib.SIB.index ? mib.SIB.index : 0;\r\n\r\nOutside 64-bit mode\r\nA_BDE[31:0] <- (Zero_extend32(base[31:12] << 2) + (BNDCFG[31:12] <<12 );\r\nA_BT[31:0] <- LoadFrom(A_BDE );\r\nIF A_BT[0] equal 0 Then\r\n BNDSTATUS <- A_BDE | 02H;\r\n #BR;\r\nFI;\r\nA_BTE[31:0] <- (Zero_extend32(base[11:2] << 4) + (A_BT[31:2] << 2 );\r\nTemp_lb[31:0] <- LoadFrom(A_BTE);\r\nTemp_ub[31:0] <- LoadFrom(A_BTE + 4);\r\nTemp_ptr[31:0] <- LoadFrom(A_BTE + 8);\r\nIF Temp_ptr equal ptr_value Then\r\n BND.LB <- Temp_lb;\r\n BND.UB <- Temp_ub;\r\n\r\n\r\n\r\nELSE\r\n BND.LB <- 0;\r\n BND.UB <- 0;\r\nFI;\r\n\r\nIn 64-bit mode\r\nA_BDE[63:0] <- (Zero_extend64(base[47+MAWA:20] << 3) + (BNDCFG[63:20] <<12 );1\r\nA_BT[63:0] <- LoadFrom(A_BDE);\r\nIF A_BT[0] equal 0 Then\r\n BNDSTATUS <- A_BDE | 02H;\r\n #BR;\r\nFI;\r\nA_BTE[63:0] <- (Zero_extend64(base[19:3] << 5) + (A_BT[63:3] << 3 );\r\nTemp_lb[63:0] <- LoadFrom(A_BTE);\r\nTemp_ub[63:0] <- LoadFrom(A_BTE + 8);\r\nTemp_ptr[63:0] <- LoadFrom(A_BTE + 16);\r\nIF Temp_ptr equal ptr_value Then\r\n BND.LB <- Temp_lb;\r\n BND.UB <- Temp_ub;\r\nELSE\r\n BND.LB <- 0;\r\n BND.UB <- 0;\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBNDLDX: Generated by compiler as needed.\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#BR If the bound directory entry is invalid.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 67H prefix is not used and CS.D=0.\r\n If 67H prefix is used and CS.D=1.\r\n#GP(0) If a destination effective address of the Bound Table entry is outside the DS segment limit.\r\n If DS register contains a 
NULL segment selector.\r\n#PF(fault code) If a page fault occurs.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n#GP(0) If a destination effective address of the Bound Table entry is outside the DS segment limit.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n#GP(0) If a destination effective address of the Bound Table entry is outside the DS segment limit.\r\n#PF(fault code) If a page fault occurs.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#BR If the bound directory entry is invalid.\r\n#UD If ModRM is RIP relative.\r\n If the LOCK prefix is used.\r\n If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.\r\n#GP(0) If the memory address (A_BDE or A_BTE) is in a non-canonical form.\r\n#PF(fault code) If a page fault occurs.\r\n\r\n1. If CPL < 3, the supervisor MAWA (MAWAS) is used; this value is 0. If CPL = 3, the user MAWA (MAWAU) is used; this value is\r\n enumerated in CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17]. See Section 17.3.1 of Intel 64 and IA-32 Architectures Software\r\n Developer's Manual, Volume 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BNDLDX"
},
{
"description": "BNDMK-Make Bounds\r\n Opcode/ Op/En 64/32 CPUID Description\r\n Instruction bit Mode Feature\r\n Support Flag\r\n F3 0F 1B /r RM NE/V MPX Make lower and upper bounds from m32 and store them in bnd.\r\n BNDMK bnd, m32\r\n F3 0F 1B /r RM V/NE MPX Make lower and upper bounds from m64 and store them in bnd.\r\n BNDMK bnd, m64\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3\r\n RM ModRM:reg (w) ModRM:r/m (r) NA\r\n\r\nDescription\r\nMakes bounds from the second operand and stores the lower and upper bounds in the bound register bnd. The\r\nsecond operand must be a memory operand. The content of the base register from the memory operand is stored\r\nin the lower bound bnd.LB. The 1's complement of the effective address of m32/m64 is stored in the upper bound\r\nb.UB. Computation of m32/m64 has identical behavior to LEA.\r\nThis instruction does not cause any memory access, and does not read or write any flags.\r\nIf the instruction did not specify base register, the lower bound will be zero. 
The reg-reg form of this instruction\r\nretains legacy behavior (NOP).\r\nA RIP-relative encoding in 64-bit mode causes #UD.\r\n\r\nOperation\r\nBND.LB <- SRCMEM.base;\r\nIF 64-bit mode Then\r\n BND.UB <- NOT(LEA.64_bits(SRCMEM));\r\nELSE\r\n BND.UB <- Zero_Extend.64_bits(NOT(LEA.32_bits(SRCMEM)));\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBNDMK: void * _bnd_set_ptr_bounds(const void * q, size_t size);\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#UD If ModRM is RIP relative.\r\n If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 67H prefix is not used and CS.D=0.\r\n If 67H prefix is used and CS.D=1.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If ModRM is RIP relative.\r\n If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If ModRM is RIP relative.\r\n If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.\r\n#SS(0) If the memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BNDMK"
},
{
"description": "BNDMOV-Move Bounds\r\n Opcode/ Op/En 64/32 CPUID Description\r\n Instruction bit Mode Feature\r\n Support Flag\r\n 66 0F 1A /r RM NE/V MPX Move lower and upper bound from bnd2/m64 to bound register\r\n BNDMOV bnd1, bnd2/m64 bnd1.\r\n 66 0F 1A /r RM V/NE MPX Move lower and upper bound from bnd2/m128 to bound register\r\n BNDMOV bnd1, bnd2/m128 bnd1.\r\n 66 0F 1B /r MR NE/V MPX Move lower and upper bound from bnd2 to bnd1/m64.\r\n BNDMOV bnd1/m64, bnd2\r\n 66 0F 1B /r MR V/NE MPX Move lower and upper bound from bnd2 to bound register\r\n BNDMOV bnd1/m128, bnd2 bnd1/m128.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3\r\n RM ModRM:reg (w) ModRM:r/m (r) NA\r\n MR ModRM:r/m (w) ModRM:reg (r) NA\r\n\r\nDescription\r\nBNDMOV moves a pair of lower and upper bound values from the source operand (the second operand) to the\r\ndestination (the first operand). Each operation is 128-bit move. The exceptions are same as the MOV instruction.\r\nThe memory format for loading/store bounds in 64-bit mode is shown in Figure 3-5.\r\n\r\n\r\n\r\n\r\n BNDMOV to memory in 64-bit mode\r\n Upper Bound (UB) Lower Bound (LB)\r\n\r\n\r\n 16 8 0 Byte offset\r\n\r\n\r\n\r\n\r\n BNDMOV to memory in 32-bit mode\r\n Upper Bound (UB) Lower Bound (LB)\r\n\r\n\r\n 16 8 4 0 Byte offset\r\n\r\n\r\n\r\n\r\n Figure 3-5. 
Memory Layout of BNDMOV to/from Memory\r\n\r\n\r\nThis instruction does not change flags.\r\n\r\nOperation\r\nBNDMOV register to register\r\nDEST.LB <- SRC.LB;\r\nDEST.UB <- SRC.UB;\r\n\r\n\r\n\r\n\r\n\r\nBNDMOV from memory\r\nIF 64-bit mode THEN\r\n DEST.LB <- LOAD_QWORD(SRC);\r\n DEST.UB <- LOAD_QWORD(SRC+8);\r\n ELSE\r\n DEST.LB <- LOAD_DWORD_ZERO_EXT(SRC);\r\n DEST.UB <- LOAD_DWORD_ZERO_EXT(SRC+4);\r\nFI;\r\n\r\nBNDMOV to memory\r\nIF 64-bit mode THEN\r\n DEST[63:0] <- SRC.LB;\r\n DEST[127:64] <- SRC.UB;\r\n ELSE\r\n DEST[31:0] <- SRC.LB;\r\n DEST[63:32] <- SRC.UB;\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBNDMOV void * _bnd_copy_ptr_bounds(const void *q, const void *r)\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 67H prefix is not used and CS.D=0.\r\n If 67H prefix is used and CS.D=1.\r\n#SS(0) If the memory operand effective address is outside the SS segment limit.\r\n#GP(0) If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the destination operand points to a non-writable segment\r\n If the DS, ES, FS, or GS segment register contains a NULL segment selector.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while CPL is 3.\r\n#PF(fault code) If a page fault occurs.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n#GP(0) If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If the memory operand effective address is outside the SS segment limit.\r\n\r\n\r\n\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n If ModRM.r/m 
encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n#GP(0) If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If the memory operand effective address is outside the SS segment limit.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while CPL is 3.\r\n#PF(fault code) If a page fault occurs.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.\r\n#SS(0) If the memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while CPL is 3.\r\n#PF(fault code) If a page fault occurs.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BNDMOV"
},
{
"description": "BNDSTX-Store Extended Bounds Using Address Translation\r\n Opcode/ Op/En 64/32 CPUID Description\r\n Instruction bit Mode Feature\r\n Support Flag\r\n 0F 1B /r MR V/V MPX Store the bounds in bnd and the pointer value in the index regis-\r\n BNDSTX mib, bnd ter of mib to a bound table entry (BTE) with address translation\r\n using the base of mib.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3\r\n SIB.base (r): Address of pointer\r\n MR ModRM:reg (r) NA\r\n SIB.index(r)\r\n\r\nDescription\r\nBNDSTX uses the linear address constructed from the displacement and base register of the SIB-addressing form\r\nof the memory operand (mib) to perform address translation to store to a bound table entry. The bounds in the\r\nsource operand bnd are written to the lower and upper bounds in the BTE. The content of the index register of mib\r\nis written to the pointer value field in the BTE.\r\nThis instruction does not cause memory access to the linear address of mib nor the effective address referenced by\r\nthe base, and does not read or write any flags.\r\nSegment overrides apply to the linear address computation with the base of mib, and are used during address\r\ntranslation to generate the address of the bound table entry. By default, the address of the BTE is assumed to be\r\nlinear address. There are no segmentation checks performed on the base of mib.\r\nThe base of mib will not be checked for canonical address violation as it does not access memory.\r\nAny encoding of this instruction that does not specify base or index register will treat those registers as zero\r\n(constant). The reg-reg form of this instruction will remain a NOP.\r\nThe scale field of the SIB byte has no effect on these instructions and is ignored.\r\nThe bound register may be partially updated on memory faults. The order in which memory operands are loaded is\r\nimplementation specific.\r\n\r\nOperation\r\nbase <- mib.SIB.base ? 
mib.SIB.base + Disp: 0;\r\nptr_value <- mib.SIB.index ? mib.SIB.index : 0;\r\n\r\nOutside 64-bit mode\r\nA_BDE[31:0] <- (Zero_extend32(base[31:12] << 2) + (BNDCFG[31:12] <<12 );\r\nA_BT[31:0] <- LoadFrom(A_BDE);\r\nIF A_BT[0] equal 0 Then\r\n BNDSTATUS <- A_BDE | 02H;\r\n #BR;\r\nFI;\r\nA_DEST[31:0] <- (Zero_extend32(base[11:2] << 4) + (A_BT[31:2] << 2 ); // address of Bound table entry\r\nA_DEST[8][31:0] <- ptr_value;\r\nA_DEST[0][31:0] <- BND.LB;\r\nA_DEST[4][31:0] <- BND.UB;\r\n\r\n\r\n\r\n\r\n\r\nIn 64-bit mode\r\nA_BDE[63:0] <- (Zero_extend64(base[47+MAWA:20] << 3) + (BNDCFG[63:20] <<12 );1\r\nA_BT[63:0] <- LoadFrom(A_BDE);\r\nIF A_BT[0] equal 0 Then\r\n BNDSTATUS <- A_BDE | 02H;\r\n #BR;\r\nFI;\r\nA_DEST[63:0] <- (Zero_extend64(base[19:3] << 5) + (A_BT[63:3] << 3 ); // address of Bound table entry\r\nA_DEST[16][63:0] <- ptr_value;\r\nA_DEST[0][63:0] <- BND.LB;\r\nA_DEST[8][63:0] <- BND.UB;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBNDSTX: _bnd_store_ptr_bounds(const void **ptr_addr, const void *ptr_val);\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#BR If the bound directory entry is invalid.\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 67H prefix is not used and CS.D=0.\r\n If 67H prefix is used and CS.D=1.\r\n#GP(0) If a destination effective address of the Bound Table entry is outside the DS segment limit.\r\n If DS register contains a NULL segment selector.\r\n If the destination operand points to a non-writable segment\r\n#PF(fault code) If a page fault occurs.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 16-bit addressing is used.\r\n#GP(0) If a destination effective address of the Bound Table entry is outside the DS segment limit.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.\r\n If 
16-bit addressing is used.\r\n#GP(0) If a destination effective address of the Bound Table entry is outside the DS segment limit.\r\n#PF(fault code) If a page fault occurs.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#BR If the bound directory entry is invalid.\r\n#UD If ModRM is RIP relative.\r\n If the LOCK prefix is used.\r\n If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.\r\n#GP(0) If the memory address (A_BDE or A_BTE) is in a non-canonical form.\r\n If the destination operand points to a non-writable segment.\r\n#PF(fault code) If a page fault occurs.\r\n\r\n1. If CPL < 3, the supervisor MAWA (MAWAS) is used; this value is 0. If CPL = 3, the user MAWA (MAWAU) is used; this value is\r\n enumerated in CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17]. See Section 17.3.1 of Intel 64 and IA-32 Architectures Software\r\n Developer's Manual, Volume 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BNDSTX"
},
{
"description": "BOUND-Check Array Index Against Bounds\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n62 /r BOUND r16, m16&16 RM Invalid Valid Check if r16 (array index) is within bounds\r\n specified by m16&16.\r\n62 /r BOUND r32, m32&32 RM Invalid Valid Check if r32 (array index) is within bounds\r\n specified by m32&32.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nBOUND determines if the first operand (array index) is within the bounds of an array specified the second operand\r\n(bounds operand). The array index is a signed integer located in a register. The bounds operand is a memory loca-\r\ntion that contains a pair of signed doubleword-integers (when the operand-size attribute is 32) or a pair of signed\r\nword-integers (when the operand-size attribute is 16). The first doubleword (or word) is the lower bound of the\r\narray and the second doubleword (or word) is the upper bound of the array. The array index must be greater than\r\nor equal to the lower bound and less than or equal to the upper bound plus the operand size in bytes. If the index\r\nis not within bounds, a BOUND range exceeded exception (#BR) is signaled. When this exception is generated, the\r\nsaved return instruction pointer points to the BOUND instruction.\r\nThe bounds limit data structure (two words or doublewords containing the lower and upper limits of the array) is\r\nusually placed just before the array itself, making the limits addressable via a constant offset from the beginning of\r\nthe array. Because the address of the array already will be present in a register, this practice avoids extra bus cycles\r\nto obtain the effective address of the array bounds.\r\nThis instruction executes as described in compatibility mode and legacy mode. 
It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-Bit Mode\r\n THEN\r\n #UD;\r\n ELSE\r\n IF (ArrayIndex < LowerBound OR ArrayIndex > UpperBound)\r\n (* Below lower bound or above upper bound *)\r\n THEN #BR; FI;\r\nFI;\r\n\r\nFlags Affected\r\nNone.\r\n\r\nProtected Mode Exceptions\r\n#BR If the bounds test fails.\r\n#UD If second operand is not a memory location.\r\n If the LOCK prefix is used.\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\nReal-Address Mode Exceptions\r\n#BR If the bounds test fails.\r\n#UD If second operand is not a memory location.\r\n If the LOCK prefix is used.\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#BR If the bounds test fails.\r\n#UD If second operand is not a memory location.\r\n If the LOCK prefix is used.\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BOUND"
},
{
"description": "BSF-Bit Scan Forward\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F BC /r BSF r16, r/m16 RM Valid Valid Bit scan forward on r/m16.\r\n 0F BC /r BSF r32, r/m32 RM Valid Valid Bit scan forward on r/m32.\r\n REX.W + 0F BC /r BSF r64, r/m64 RM Valid N.E. Bit scan forward on r/m64.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nSearches the source operand (second operand) for the least significant set bit (1 bit). If a least significant 1 bit is\r\nfound, its bit index is stored in the destination operand (first operand). The source operand can be a register or a\r\nmemory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source\r\noperand. If the content of the source operand is 0, the content of the destination operand is undefined.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nIF SRC = 0\r\n THEN\r\n ZF <- 1;\r\n DEST is undefined;\r\n ELSE\r\n ZF <- 0;\r\n temp <- 0;\r\n WHILE Bit(SRC, temp) = 0\r\n DO\r\n temp <- temp + 1;\r\n OD;\r\n DEST <- temp;\r\nFI;\r\n\r\nFlags Affected\r\nThe ZF flag is set to 1 if all the source operand is 0; otherwise, the ZF flag is cleared. 
The CF, OF, SF, AF, and PF flags\r\nare undefined.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BSF"
},
{
"description": "BSR-Bit Scan Reverse\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n0F BD /r BSR r16, r/m16 RM Valid Valid Bit scan reverse on r/m16.\r\n0F BD /r BSR r32, r/m32 RM Valid Valid Bit scan reverse on r/m32.\r\nREX.W + 0F BD /r BSR r64, r/m64 RM Valid N.E. Bit scan reverse on r/m64.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nSearches the source operand (second operand) for the most significant set bit (1 bit). If a most significant 1 bit is\r\nfound, its bit index is stored in the destination operand (first operand). The source operand can be a register or a\r\nmemory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source\r\noperand. If the content source operand is 0, the content of the destination operand is undefined.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nIF SRC = 0\r\n THEN\r\n ZF <- 1;\r\n DEST is undefined;\r\n ELSE\r\n ZF <- 0;\r\n temp <- OperandSize - 1;\r\n WHILE Bit(SRC, temp) = 0\r\n DO\r\n temp <- temp - 1;\r\n OD;\r\n DEST <- temp;\r\nFI;\r\n\r\nFlags Affected\r\nThe ZF flag is set to 1 if all the source operand is 0; otherwise, the ZF flag is cleared. 
The CF, OF, SF, AF, and PF flags\r\nare undefined.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BSR"
},
{
"description": "BSWAP-Byte Swap\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n0F C8+rd BSWAP r32 O Valid* Valid Reverses the byte order of a 32-bit register.\r\nREX.W + 0F C8+rd BSWAP r64 O Valid N.E. Reverses the byte order of a 64-bit register.\r\nNOTES:\r\n* See IA-32 Architecture Compatibility section below.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n O opcode + rd (r, w) NA NA NA\r\n\r\nDescription\r\nReverses the byte order of a 32-bit or 64-bit (destination) register. This instruction is provided for converting little-\r\nendian values to big-endian format and vice versa. To swap bytes in a word value (16-bit register), use the XCHG\r\ninstruction. When the BSWAP instruction references a 16-bit register, the result is undefined.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nIA-32 Architecture Legacy Compatibility\r\nThe BSWAP instruction is not supported on IA-32 processors earlier than the Intel486 processor family. 
For\r\ncompatibility with this instruction, software should include functionally equivalent code for execution on Intel\r\nprocessors earlier than the Intel486 processor family.\r\n\r\nOperation\r\nTEMP <- DEST\r\nIF 64-bit mode AND OperandSize = 64\r\n THEN\r\n DEST[7:0] <- TEMP[63:56];\r\n DEST[15:8] <- TEMP[55:48];\r\n DEST[23:16] <- TEMP[47:40];\r\n DEST[31:24] <- TEMP[39:32];\r\n DEST[39:32] <- TEMP[31:24];\r\n DEST[47:40] <- TEMP[23:16];\r\n DEST[55:48] <- TEMP[15:8];\r\n DEST[63:56] <- TEMP[7:0];\r\n ELSE\r\n DEST[7:0] <- TEMP[31:24];\r\n DEST[15:8] <- TEMP[23:16];\r\n DEST[23:16] <- TEMP[15:8];\r\n DEST[31:24] <- TEMP[7:0];\r\nFI;\r\n\r\nFlags Affected\r\nNone.\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n",
"mnem": "BSWAP"
},
{
"description": "BT-Bit Test\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F A3 /r BT r/m16, r16 MR Valid Valid Store selected bit in CF flag.\r\n 0F A3 /r BT r/m32, r32 MR Valid Valid Store selected bit in CF flag.\r\n REX.W + 0F A3 /r BT r/m64, r64 MR Valid N.E. Store selected bit in CF flag.\r\n 0F BA /4 ib BT r/m16, imm8 MI Valid Valid Store selected bit in CF flag.\r\n 0F BA /4 ib BT r/m32, imm8 MI Valid Valid Store selected bit in CF flag.\r\n REX.W + 0F BA /4 ib BT r/m64, imm8 MI Valid N.E. Store selected bit in CF flag.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n MR ModRM:r/m (r) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r) imm8 NA NA\r\n\r\nDescription\r\nSelects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by\r\nthe bit offset (specified by the second operand) and stores the value of the bit in the CF flag. The bit base operand\r\ncan be a register or a memory location; the bit offset operand can be a register or an immediate value:\r\n. If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset\r\n operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit\r\n mode).\r\n. If the bit base operand specifies a memory location, the operand represents the address of the byte in memory\r\n that contains the bit base (bit 0 of the specified byte) of the bit string. The range of the bit position that can be\r\n referenced by the offset operand depends on the operand size.\r\nSee also: Bit(BitBase, BitOffset) on page 3-11.\r\nSome assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combina-\r\ntion with the displacement field of the memory operand. 
In this case, the low-order 3 or 5 bits (3 for 16-bit oper-\r\nands, 5 for 32-bit operands) of the immediate bit offset are stored in the immediate bit offset field, and the high-\r\norder bits are shifted and combined with the byte displacement in the addressing mode by the assembler. The\r\nprocessor will ignore the high order bits if they are not zero.\r\nWhen accessing a bit in memory, the processor may access 4 bytes starting from the memory address for a 32-bit\r\noperand size, using the following relationship:\r\n\r\n Effective Address + (4 * (BitOffset DIV 32))\r\nOr, it may access 2 bytes starting from the memory address for a 16-bit operand, using this relationship:\r\n\r\n Effective Address + (2 * (BitOffset DIV 16))\r\nIt may do so even when only a single byte needs to be accessed to reach the given bit. When using this bit\r\naddressing mechanism, software should avoid referencing areas of memory close to address space holes. In partic-\r\nular, it should avoid references to memory-mapped I/O registers. Instead, software should use the MOV instruc-\r\ntions to load from or store to these addresses, and use the register form of these instructions to manipulate the\r\ndata.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bit oper-\r\nands. See the summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nCF <- Bit(BitBase, BitOffset);\r\n\r\n\r\n\r\nFlags Affected\r\nThe CF flag contains the value of the selected bit. The ZF flag is unaffected. 
The OF, SF, AF, and PF flags are\r\nundefined.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BT"
},
{
"description": "BTC-Bit Test and Complement\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F BB /r BTC r/m16, r16 MR Valid Valid Store selected bit in CF flag and complement.\r\n 0F BB /r BTC r/m32, r32 MR Valid Valid Store selected bit in CF flag and complement.\r\n REX.W + 0F BB /r BTC r/m64, r64 MR Valid N.E. Store selected bit in CF flag and complement.\r\n 0F BA /7 ib BTC r/m16, imm8 MI Valid Valid Store selected bit in CF flag and complement.\r\n 0F BA /7 ib BTC r/m32, imm8 MI Valid Valid Store selected bit in CF flag and complement.\r\n REX.W + 0F BA /7 ib BTC r/m64, imm8 MI Valid N.E. Store selected bit in CF flag and complement.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r, w) imm8 NA NA\r\n\r\nDescription\r\nSelects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by\r\nthe bit offset operand (second operand), stores the value of the bit in the CF flag, and complements the selected\r\nbit in the bit string. The bit base operand can be a register or a memory location; the bit offset operand can be a\r\nregister or an immediate value:\r\n. If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset\r\n operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit\r\n mode). This allows any bit position to be selected.\r\n. If the bit base operand specifies a memory location, the operand represents the address of the byte in memory\r\n that contains the bit base (bit 0 of the specified byte) of the bit string. 
The range of the bit position that can be\r\n referenced by the offset operand depends on the operand size.\r\nSee also: Bit(BitBase, BitOffset) on page 3-11.\r\nSome assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combina-\r\ntion with the displacement field of the memory operand. See \"BT-Bit Test\" in this chapter for more information on\r\nthis addressing mechanism.\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nCF <- Bit(BitBase, BitOffset);\r\nBit(BitBase, BitOffset) <- NOT Bit(BitBase, BitOffset);\r\n\r\nFlags Affected\r\nThe CF flag contains the value of the selected bit before it is complemented. The ZF flag is unaffected. 
The OF, SF,\r\nAF, and PF flags are undefined.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination operand points to a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BTC"
},
{
"description": "BTR-Bit Test and Reset\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F B3 /r BTR r/m16, r16 MR Valid Valid Store selected bit in CF flag and clear.\r\n 0F B3 /r BTR r/m32, r32 MR Valid Valid Store selected bit in CF flag and clear.\r\n REX.W + 0F B3 /r BTR r/m64, r64 MR Valid N.E. Store selected bit in CF flag and clear.\r\n 0F BA /6 ib BTR r/m16, imm8 MI Valid Valid Store selected bit in CF flag and clear.\r\n 0F BA /6 ib BTR r/m32, imm8 MI Valid Valid Store selected bit in CF flag and clear.\r\n REX.W + 0F BA /6 ib BTR r/m64, imm8 MI Valid N.E. Store selected bit in CF flag and clear.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r, w) imm8 NA NA\r\n\r\nDescription\r\nSelects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by\r\nthe bit offset operand (second operand), stores the value of the bit in the CF flag, and clears the selected bit in the\r\nbit string to 0. The bit base operand can be a register or a memory location; the bit offset operand can be a register\r\nor an immediate value:\r\n. If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset\r\n operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit\r\n mode). This allows any bit position to be selected.\r\n. If the bit base operand specifies a memory location, the operand represents the address of the byte in memory\r\n that contains the bit base (bit 0 of the specified byte) of the bit string. 
The range of the bit position that can be\r\n referenced by the offset operand depends on the operand size.\r\nSee also: Bit(BitBase, BitOffset) on page 3-11.\r\nSome assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combina-\r\ntion with the displacement field of the memory operand. See \"BT-Bit Test\" in this chapter for more information on\r\nthis addressing mechanism.\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nCF <- Bit(BitBase, BitOffset);\r\nBit(BitBase, BitOffset) <- 0;\r\n\r\nFlags Affected\r\nThe CF flag contains the value of the selected bit before it is cleared. The ZF flag is unaffected. 
The OF, SF, AF, and\r\nPF flags are undefined.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination operand points to a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BTR"
},
{
"description": "BTS-Bit Test and Set\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F AB /r BTS r/m16, r16 MR Valid Valid Store selected bit in CF flag and set.\r\n 0F AB /r BTS r/m32, r32 MR Valid Valid Store selected bit in CF flag and set.\r\n REX.W + 0F AB /r BTS r/m64, r64 MR Valid N.E. Store selected bit in CF flag and set.\r\n 0F BA /5 ib BTS r/m16, imm8 MI Valid Valid Store selected bit in CF flag and set.\r\n 0F BA /5 ib BTS r/m32, imm8 MI Valid Valid Store selected bit in CF flag and set.\r\n REX.W + 0F BA /5 ib BTS r/m64, imm8 MI Valid N.E. Store selected bit in CF flag and set.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r, w) imm8 NA NA\r\n\r\nDescription\r\nSelects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by\r\nthe bit offset operand (second operand), stores the value of the bit in the CF flag, and sets the selected bit in the\r\nbit string to 1. The bit base operand can be a register or a memory location; the bit offset operand can be a register\r\nor an immediate value:\r\n. If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset\r\n operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit\r\n mode). This allows any bit position to be selected.\r\n. If the bit base operand specifies a memory location, the operand represents the address of the byte in memory\r\n that contains the bit base (bit 0 of the specified byte) of the bit string. 
The range of the bit position that can be\r\n referenced by the offset operand depends on the operand size.\r\nSee also: Bit(BitBase, BitOffset) on page 3-11.\r\nSome assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combina-\r\ntion with the displacement field of the memory operand. See \"BT-Bit Test\" in this chapter for more information on\r\nthis addressing mechanism.\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits\r\naccess to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nCF <- Bit(BitBase, BitOffset);\r\nBit(BitBase, BitOffset) <- 1;\r\n\r\nFlags Affected\r\nThe CF flag contains the value of the selected bit before it is set. The ZF flag is unaffected. 
The OF, SF, AF, and PF\r\nflags are undefined.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination operand points to a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BTS"
},
{
"description": "BZHI - Zero High Bits Starting with Specified Bit Position\r\n Opcode/Instruction Op/ 64/32 CPUID Description\r\n En -bit Feature\r\n Mode Flag\r\n VEX.NDS.LZ.0F38.W0 F5 /r RMV V/V BMI2 Zero bits in r/m32 starting with the position in r32b, write result to\r\n BZHI r32a, r/m32, r32b r32a.\r\n VEX.NDS.LZ.0F38.W1 F5 /r RMV V/N.E. BMI2 Zero bits in r/m64 starting with the position in r64b, write result to\r\n BZHI r64a, r/m64, r64b r64a.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMV ModRM:reg (w) ModRM:r/m (r) VEX.vvvv (r) NA\r\n\r\nDescription\r\nBZHI copies the bits of the first source operand (the second operand) into the destination operand (the first\r\noperand) and clears the higher bits in the destination according to the INDEX value specified by the second source\r\noperand (the third operand). The INDEX is specified by bits 7:0 of the second source operand. The INDEX value is\r\nsaturated at the value of OperandSize -1. CF is set, if the number contained in the 8 low bits of the third operand\r\nis greater than OperandSize -1.\r\nThis instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in\r\n64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An\r\nattempt to execute this instruction with VEX.L not equal to 0 will cause #UD.\r\n\r\nOperation\r\nN <- SRC2[7:0]\r\nDEST <- SRC1\r\nIF (N < OperandSize)\r\n DEST[OperandSize-1:N] <- 0\r\nFI\r\nIF (N > OperandSize - 1)\r\n CF <- 1\r\nELSE\r\n CF <- 0\r\nFI\r\n\r\nFlags Affected\r\nZF, CF and SF flags are updated based on the result. OF flag is cleared. 
AF and PF flags are undefined.\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nBZHI: unsigned __int32 _bzhi_u32(unsigned __int32 src, unsigned __int32 index);\r\n\r\nBZHI: unsigned __int64 _bzhi_u64(unsigned __int64 src, unsigned __int32 index);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Section 2.5.1, \"Exception Conditions for VEX-Encoded GPR Instructions\", Table 2-29; additionally\r\n#UD If VEX.W = 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "BZHI"
},
{
"description": "CALL-Call Procedure\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n E8 cw CALL rel16 M N.S. Valid Call near, relative, displacement relative to next\r\n instruction.\r\n E8 cd CALL rel32 M Valid Valid Call near, relative, displacement relative to next\r\n instruction. 32-bit displacement sign extended to\r\n 64-bits in 64-bit mode.\r\n FF /2 CALL r/m16 M N.E. Valid Call near, absolute indirect, address given in r/m16.\r\n FF /2 CALL r/m32 M N.E. Valid Call near, absolute indirect, address given in r/m32.\r\n FF /2 CALL r/m64 M Valid N.E. Call near, absolute indirect, address given in r/m64.\r\n 9A cd CALL ptr16:16 D Invalid Valid Call far, absolute, address given in operand.\r\n 9A cp CALL ptr16:32 D Invalid Valid Call far, absolute, address given in operand.\r\n FF /3 CALL m16:16 M Valid Valid Call far, absolute indirect address given in m16:16.\r\n In 32-bit mode: if selector points to a gate, then RIP\r\n = 32-bit zero extended displacement taken from\r\n gate; else RIP = zero extended 16-bit offset from\r\n far pointer referenced in the instruction.\r\n FF /3 CALL m16:32 M Valid Valid In 64-bit mode: If selector points to a gate, then RIP\r\n = 64-bit displacement taken from gate; else RIP =\r\n zero extended 32-bit offset from far pointer\r\n referenced in the instruction.\r\n REX.W + FF /3 CALL m16:64 M Valid N.E. In 64-bit mode: If selector points to a gate, then RIP\r\n = 64-bit displacement taken from gate; else RIP =\r\n 64-bit offset from far pointer referenced in the\r\n instruction.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n D Offset NA NA NA\r\n M ModRM:r/m (r) NA NA NA\r\n\r\nDescription\r\nSaves procedure linking information on the stack and branches to the called procedure specified using the target\r\noperand. The target operand specifies the address of the first instruction in the called procedure. 
The operand can\r\nbe an immediate value, a general-purpose register, or a memory location.\r\nThis instruction can be used to execute four types of calls:\r\n. Near Call - A call to a procedure in the current code segment (the segment currently pointed to by the CS\r\n register), sometimes referred to as an intra-segment call.\r\n. Far Call - A call to a procedure located in a different segment than the current code segment, sometimes\r\n referred to as an inter-segment call.\r\n. Inter-privilege-level far call - A far call to a procedure in a segment at a different privilege level than that\r\n of the currently executing program or procedure.\r\n. Task switch - A call to a procedure located in a different task.\r\nThe latter two call types (inter-privilege-level call and task switch) can only be executed in protected mode. See\r\n\"Calling Procedures Using Call and RET\" in Chapter 6 of the Intel 64 and IA-32 Architectures Software Devel-\r\noper's Manual, Volume 1, for additional information on near, far, and inter-privilege-level calls. See Chapter 7,\r\n\"Task Management,\" in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, for infor-\r\nmation on performing task switches with the CALL instruction.\r\n\r\n\r\n\r\nNear Call. When executing a near call, the processor pushes the value of the EIP register (which contains the offset\r\nof the instruction following the CALL instruction) on the stack (for use later as a return-instruction pointer). The\r\nprocessor then branches to the address in the current code segment specified by the target operand. The target\r\noperand specifies either an absolute offset in the code segment (an offset from the base of the code segment) or a\r\nrelative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register; this\r\nvalue points to the instruction following the CALL instruction). 
The CS register is not changed on near calls.\r\nFor a near call absolute, an absolute offset is specified indirectly in a general-purpose register or a memory location\r\n(r/m16, r/m32, or r/m64). The operand-size attribute determines the size of the target operand (16, 32 or 64\r\nbits). When in 64-bit mode, the operand size for near call (and all near branches) is forced to 64-bits. Absolute\r\noffsets are loaded directly into the EIP(RIP) register. If the operand size attribute is 16, the upper two bytes of the\r\nEIP register are cleared, resulting in a maximum instruction pointer size of 16 bits. When accessing an absolute\r\noffset indirectly using the stack pointer [ESP] as the base register, the base value used is the value of the ESP\r\nbefore the instruction executes.\r\nA relative offset (rel16 or rel32) is generally specified as a label in assembly code. But at the machine code level, it\r\nis encoded as a signed, 16- or 32-bit immediate value. This value is added to the value in the EIP(RIP) register. In\r\n64-bit mode the relative offset is always a 32-bit immediate value which is sign extended to 64-bits before it is\r\nadded to the value in the RIP register for the target calculation. As with absolute offsets, the operand-size attribute\r\ndetermines the size of the target operand (16, 32, or 64 bits). In 64-bit mode the target operand will always be 64-\r\nbits because the operand size is forced to 64-bits for near branches.\r\nFar Calls in Real-Address or Virtual-8086 Mode. When executing a far call in real-address or virtual-8086 mode, the\r\nprocessor pushes the current value of both the CS and EIP registers on the stack for use as a return-instruction\r\npointer. The processor then performs a \"far branch\" to the code segment and offset specified with the target\r\noperand for the called procedure. 
The target operand specifies an absolute far address either directly with a pointer\r\n(ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). With the pointer method, the\r\nsegment and offset of the called procedure is encoded in the instruction using a 4-byte (16-bit operand size) or 6-\r\nbyte (32-bit operand size) far address immediate. With the indirect method, the target operand specifies a memory\r\nlocation that contains a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address. The operand-size\r\nattribute determines the size of the offset (16 or 32 bits) in the far address. The far address is loaded directly into\r\nthe CS and EIP registers. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared.\r\nFar Calls in Protected Mode. When the processor is operating in protected mode, the CALL instruction can be used to\r\nperform the following types of far calls:\r\n. Far call to the same privilege level\r\n. Far call to a different privilege level (inter-privilege level call)\r\n. Task switch (far call to another task)\r\nIn protected mode, the processor always uses the segment selector part of the far address to access the corre-\r\nsponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate, task gate, or TSS) and access\r\nrights determine the type of call operation to be performed.\r\nIf the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is\r\nperformed. (If the selected code segment is at a different privilege level and the code segment is non-conforming,\r\na general-protection exception is generated.) A far call to the same privilege level in protected mode is very similar\r\nto one carried out in real-address or virtual-8086 mode. The target operand specifies an absolute far address either\r\ndirectly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). 
The\r\noperand-size attribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment\r\nselector and its descriptor are loaded into CS register; the offset from the instruction is loaded into the EIP register.\r\nA call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same\r\nprivilege level. Using this mechanism provides an extra level of indirection and is the preferred method of making\r\ncalls between 16-bit and 32-bit code segments.\r\nWhen executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed\r\nthrough a call gate. The segment selector specified by the target operand identifies the call gate. The target\r\noperand can specify the call gate segment selector either directly with a pointer (ptr16:16 or ptr16:32) or indirectly\r\nwith a memory location (m16:16 or m16:32). The processor obtains the segment selector for the new code\r\nsegment and the new instruction pointer (offset) from the call gate descriptor. (The offset from the target operand\r\nis ignored when a call gate is used.)\r\n\r\n\r\n\r\n\r\nOn inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The\r\nsegment selector for the new stack segment is specified in the TSS for the currently running task. The branch to\r\nthe new code segment occurs after the stack switch. (Note that when using a call gate to perform a far call to a\r\nsegment at the same privilege level, no stack switch occurs.) On the new stack, the processor pushes the segment\r\nselector and stack pointer for the calling procedure's stack, an optional set of parameters from the calling proce-\r\ndure's stack, and the segment selector and instruction pointer for the calling procedure's code segment. (A value in\r\nthe call gate descriptor determines how many parameters to copy to the new stack.) 
Finally, the processor\r\nbranches to the address of the procedure being called within the new code segment.\r\nExecuting a task switch with the CALL instruction is similar to executing a call through a call gate. The target\r\noperand specifies the segment selector of the task gate for the new task activated by the switch (the offset in the\r\ntarget operand is ignored). The task gate in turn points to the TSS for the new task, which contains the segment\r\nselectors for the task's code and stack segments. Note that the TSS also contains the EIP value for the next instruc-\r\ntion that was to be executed before the calling task was suspended. This instruction pointer value is loaded into the\r\nEIP register to re-start the calling task.\r\nThe CALL instruction can also specify the segment selector of the TSS directly, which eliminates the indirection of\r\nthe task gate. See Chapter 7, \"Task Management,\" in the Intel 64 and IA-32 Architectures Software Developer's\r\nManual, Volume 3A, for information on the mechanics of a task switch.\r\nWhen you execute a task switch with a CALL instruction, the nested task flag (NT) is set in the EFLAGS register and\r\nthe new TSS's previous task link field is loaded with the old task's TSS selector. Code is expected to suspend this\r\nnested task by executing an IRET instruction which, because the NT flag is set, automatically uses the previous\r\ntask link to return to the calling task. (See \"Task Linking\" in Chapter 7 of the Intel 64 and IA-32 Architectures\r\nSoftware Developer's Manual, Volume 3A, for information on nested tasks.) Switching tasks with the CALL instruc-\r\ntion differs in this regard from the JMP instruction. JMP does not set the NT flag and therefore does not expect an IRET\r\ninstruction to suspend the task.\r\nMixing 16-Bit and 32-Bit Calls. When making far calls between 16-bit and 32-bit code segments, use a call gate. 
If\r\nthe far call is from a 32-bit code segment to a 16-bit code segment, the call should be made from the first 64\r\nKBytes of the 32-bit code segment. This is because the operand-size attribute of the instruction is set to 16, so only\r\na 16-bit return address offset can be saved. Also, the call should be made using a 16-bit call gate so that 16-bit\r\nvalues can be pushed on the stack. See Chapter 21, \"Mixing 16-Bit and 32-Bit Code,\" in the Intel 64 and IA-32\r\nArchitectures Software Developer's Manual, Volume 3B, for more information.\r\nFar Calls in Compatibility Mode. When the processor is operating in compatibility mode, the CALL instruction can be\r\nused to perform the following types of far calls:\r\n. Far call to the same privilege level, remaining in compatibility mode\r\n. Far call to the same privilege level, transitioning to 64-bit mode\r\n. Far call to a different privilege level (inter-privilege level call), transitioning to 64-bit mode\r\nNote that a CALL instruction can not be used to cause a task switch in compatibility mode since task switches are\r\nnot supported in IA-32e mode.\r\nIn compatibility mode, the processor always uses the segment selector part of the far address to access the corre-\r\nsponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate) and access rights determine\r\nthe type of call operation to be performed.\r\nIf the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is\r\nperformed. (If the selected code segment is at a different privilege level and the code segment is non-conforming,\r\na general-protection exception is generated.) A far call to the same privilege level in compatibility mode is very\r\nsimilar to one carried out in protected mode. The target operand specifies an absolute far address either directly\r\nwith a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). 
The operand-size\r\nattribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment selector and its\r\ndescriptor are loaded into the CS register and the offset from the instruction is loaded into the EIP register. The differ-\r\nence is that 64-bit mode may be entered. This is specified by the L bit in the new code segment descriptor.\r\nNote that a 64-bit call gate (described in the next paragraph) can also be used to perform a far call to a code\r\nsegment at the same privilege level. However, using this mechanism requires that the target code segment\r\ndescriptor have the L bit set, causing an entry to 64-bit mode.\r\nWhen executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed\r\nthrough a 64-bit call gate. The segment selector specified by the target operand identifies the call gate. The target\r\noperand can specify the call gate segment selector either directly with a pointer (ptr16:16 or ptr16:32) or indirectly\r\nwith a memory location (m16:16 or m16:32). The processor obtains the segment selector for the new code\r\nsegment and the new instruction pointer (offset) from the 16-byte call gate descriptor. (The offset from the target\r\noperand is ignored when a call gate is used.)\r\nOn inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The\r\nsegment selector for the new stack segment is set to NULL. The new stack pointer is specified in the TSS for the\r\ncurrently running task. The branch to the new code segment occurs after the stack switch. (Note that when using\r\na call gate to perform a far call to a segment at the same privilege level, an implicit stack switch occurs as a result\r\nof entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use a segment base of 0x0,\r\nthe limit is ignored, and the default stack size is 64-bits. 
The full value of RSP is used for the offset, of which the\r\nupper 32-bits are undefined.) On the new stack, the processor pushes the segment selector and stack pointer for\r\nthe calling procedure's stack and the segment selector and instruction pointer for the calling procedure's code\r\nsegment. (Parameter copy is not supported in IA-32e mode.) Finally, the processor branches to the address of the\r\nprocedure being called within the new code segment.\r\nNear/(Far) Calls in 64-bit Mode. When the processor is operating in 64-bit mode, the CALL instruction can be used to\r\nperform the following types of far calls:\r\n. Far call to the same privilege level, transitioning to compatibility mode\r\n. Far call to the same privilege level, remaining in 64-bit mode\r\n. Far call to a different privilege level (inter-privilege level call), remaining in 64-bit mode\r\nNote that in this mode the CALL instruction can not be used to cause a task switch in 64-bit mode since task\r\nswitches are not supported in IA-32e mode.\r\nIn 64-bit mode, the processor always uses the segment selector part of the far address to access the corresponding\r\ndescriptor in the GDT or LDT. The descriptor type (code segment, call gate) and access rights determine the type\r\nof call operation to be performed.\r\nIf the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is\r\nperformed. (If the selected code segment is at a different privilege level and the code segment is non-conforming,\r\na general-protection exception is generated.) A far call to the same privilege level in 64-bit mode is very similar to\r\none carried out in compatibility mode. The target operand specifies an absolute far address indirectly with a\r\nmemory location (m16:16, m16:32 or m16:64). The form of CALL with a direct specification of absolute far\r\naddress is not defined in 64-bit mode. 
The operand-size attribute determines the size of the offset (16, 32, or 64\r\nbits) in the far address. The new code segment selector and its descriptor are loaded into the CS register; the offset\r\nfrom the instruction is loaded into the EIP register. The new code segment may specify entry either into compati-\r\nbility or 64-bit mode, based on the L bit value.\r\nA 64-bit call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the\r\nsame privilege level. However, using this mechanism requires that the target code segment descriptor have the L\r\nbit set.\r\nWhen executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed\r\nthrough a 64-bit call gate. The segment selector specified by the target operand identifies the call gate. The target\r\noperand can only specify the call gate segment selector indirectly with a memory location (m16:16, m16:32 or\r\nm16:64). The processor obtains the segment selector for the new code segment and the new instruction pointer\r\n(offset) from the 16-byte call gate descriptor. (The offset from the target operand is ignored when a call gate is\r\nused.)\r\nOn inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The\r\nsegment selector for the new stack segment is set to NULL. The new stack pointer is specified in the TSS for the\r\ncurrently running task. The branch to the new code segment occurs after the stack switch.\r\nNote that when using a call gate to perform a far call to a segment at the same privilege level, an implicit stack\r\nswitch occurs as a result of entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use\r\na segment base of 0x0, the limit is ignored, and the default stack size is 64-bits. (The full value of RSP is used for\r\nthe offset.) 
On the new stack, the processor pushes the segment selector and stack pointer for the calling proce-\r\ndure's stack and the segment selector and instruction pointer for the calling procedure's code segment. (Parameter\r\ncopy is not supported in IA-32e mode.) Finally, the processor branches to the address of the procedure being called\r\nwithin the new code segment.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nIF near call\r\n THEN IF near relative call\r\n THEN\r\n IF OperandSize = 64\r\n THEN\r\n tempDEST <- SignExtend(DEST); (* DEST is rel32 *)\r\n tempRIP <- RIP + tempDEST;\r\n IF stack not large enough for a 8-byte return address\r\n THEN #SS(0); FI;\r\n Push(RIP);\r\n RIP <- tempRIP;\r\n FI;\r\n IF OperandSize = 32\r\n THEN\r\n tempEIP <- EIP + DEST; (* DEST is rel32 *)\r\n IF tempEIP is not within code segment limit THEN #GP(0); FI;\r\n IF stack not large enough for a 4-byte return address\r\n THEN #SS(0); FI;\r\n Push(EIP);\r\n EIP <- tempEIP;\r\n FI;\r\n IF OperandSize = 16\r\n THEN\r\n tempEIP <- (EIP + DEST) AND 0000FFFFH; (* DEST is rel16 *)\r\n IF tempEIP is not within code segment limit THEN #GP(0); FI;\r\n IF stack not large enough for a 2-byte return address\r\n THEN #SS(0); FI;\r\n Push(IP);\r\n EIP <- tempEIP;\r\n FI;\r\n ELSE (* Near absolute call *)\r\n IF OperandSize = 64\r\n THEN\r\n tempRIP <- DEST; (* DEST is r/m64 *)\r\n IF stack not large enough for a 8-byte return address\r\n THEN #SS(0); FI;\r\n Push(RIP);\r\n RIP <- tempRIP;\r\n FI;\r\n IF OperandSize = 32\r\n THEN\r\n tempEIP <- DEST; (* DEST is r/m32 *)\r\n IF tempEIP is not within code segment limit THEN #GP(0); FI;\r\n IF stack not large enough for a 4-byte return address\r\n THEN #SS(0); FI;\r\n Push(EIP);\r\n EIP <- tempEIP;\r\n FI;\r\n IF OperandSize = 16\r\n THEN\r\n tempEIP <- DEST AND 0000FFFFH; (* DEST is r/m16 *)\r\n IF tempEIP is not within code segment limit THEN #GP(0); FI;\r\n\r\n\r\n\r\n IF stack not large enough for a 2-byte return address\r\n THEN #SS(0); FI;\r\n 
Push(IP);\r\n EIP <- tempEIP;\r\n FI;\r\n FI; (* rel/abs *)\r\nFI; (* near *)\r\n\r\nIF far call and (PE = 0 or (PE = 1 and VM = 1)) (* Real-address or virtual-8086 mode *)\r\n THEN\r\n IF OperandSize = 32\r\n THEN\r\n IF stack not large enough for a 6-byte return address\r\n THEN #SS(0); FI;\r\n IF DEST[31:16] is not zero THEN #GP(0); FI;\r\n Push(CS); (* Padded with 16 high-order bits *)\r\n Push(EIP);\r\n CS <- DEST[47:32]; (* DEST is ptr16:32 or [m16:32] *)\r\n EIP <- DEST[31:0]; (* DEST is ptr16:32 or [m16:32] *)\r\n ELSE (* OperandSize = 16 *)\r\n IF stack not large enough for a 4-byte return address\r\n THEN #SS(0); FI;\r\n Push(CS);\r\n Push(IP);\r\n CS <- DEST[31:16]; (* DEST is ptr16:16 or [m16:16] *)\r\n EIP <- DEST[15:0]; (* DEST is ptr16:16 or [m16:16]; clear upper 16 bits *)\r\n FI;\r\nFI;\r\n\r\nIF far call and (PE = 1 and VM = 0) (* Protected mode or IA-32e Mode, not virtual-8086 mode *)\r\n THEN\r\n IF segment selector in target operand NULL\r\n THEN #GP(0); FI;\r\n IF segment selector index not within descriptor table limits\r\n THEN #GP(new code segment selector); FI;\r\n Read type and access rights of selected segment descriptor;\r\n IF IA32_EFER.LMA = 0\r\n THEN\r\n IF segment type is not a conforming or nonconforming code segment, call\r\n gate, task gate, or TSS\r\n THEN #GP(segment selector); FI;\r\n ELSE\r\n IF segment type is not a conforming or nonconforming code segment or\r\n 64-bit call gate,\r\n THEN #GP(segment selector); FI;\r\n FI;\r\n Depending on type and access rights:\r\n GO TO CONFORMING-CODE-SEGMENT;\r\n GO TO NONCONFORMING-CODE-SEGMENT;\r\n GO TO CALL-GATE;\r\n GO TO TASK-GATE;\r\n GO TO TASK-STATE-SEGMENT;\r\nFI;\r\n\r\nCONFORMING-CODE-SEGMENT:\r\n IF L bit = 1 and D bit = 1 and IA32_EFER.LMA = 1\r\n THEN #GP(new code segment selector); FI;\r\n IF DPL > CPL\r\n THEN #GP(new code segment selector); FI;\r\n IF segment not present\r\n THEN #NP(new code segment selector); FI;\r\n IF stack not large enough for return address\r\n 
THEN #SS(0); FI;\r\n tempEIP <- DEST(Offset);\r\n IF OperandSize = 16\r\n THEN\r\n tempEIP <- tempEIP AND 0000FFFFH; FI; (* Clear upper 16 bits *)\r\n IF (EFER.LMA = 0 or target mode = Compatibility mode) and (tempEIP outside new code\r\n segment limit)\r\n THEN #GP(0); FI;\r\n IF tempEIP is non-canonical\r\n THEN #GP(0); FI;\r\n IF OperandSize = 32\r\n THEN\r\n Push(CS); (* Padded with 16 high-order bits *)\r\n Push(EIP);\r\n CS <- DEST(CodeSegmentSelector);\r\n (* Segment descriptor information also loaded *)\r\n CS(RPL) <- CPL;\r\n EIP <- tempEIP;\r\n ELSE\r\n IF OperandSize = 16\r\n THEN\r\n Push(CS);\r\n Push(IP);\r\n CS <- DEST(CodeSegmentSelector);\r\n (* Segment descriptor information also loaded *)\r\n CS(RPL) <- CPL;\r\n EIP <- tempEIP;\r\n ELSE (* OperandSize = 64 *)\r\n Push(CS); (* Padded with 48 high-order bits *)\r\n Push(RIP);\r\n CS <- DEST(CodeSegmentSelector);\r\n (* Segment descriptor information also loaded *)\r\n CS(RPL) <- CPL;\r\n RIP <- tempEIP;\r\n FI;\r\n FI;\r\nEND;\r\n\r\nNONCONFORMING-CODE-SEGMENT:\r\n IF L bit = 1 and D bit = 1 and IA32_EFER.LMA = 1\r\n THEN #GP(new code segment selector); FI;\r\n IF (RPL > CPL) or (DPL != CPL)\r\n THEN #GP(new code segment selector); FI;\r\n IF segment not present\r\n THEN #NP(new code segment selector); FI;\r\n IF stack not large enough for return address\r\n THEN #SS(0); FI;\r\n tempEIP <- DEST(Offset);\r\n IF OperandSize = 16\r\n THEN tempEIP <- tempEIP AND 0000FFFFH; FI; (* Clear upper 16 bits *)\r\n IF (EFER.LMA = 0 or target mode = Compatibility mode) and (tempEIP outside new code\r\n segment limit)\r\n THEN #GP(0); FI;\r\n IF tempEIP is non-canonical\r\n THEN #GP(0); FI;\r\n IF OperandSize = 32\r\n THEN\r\n Push(CS); (* Padded with 16 high-order bits *)\r\n Push(EIP);\r\n CS <- DEST(CodeSegmentSelector);\r\n (* Segment descriptor information also loaded *)\r\n CS(RPL) <- CPL;\r\n EIP <- tempEIP;\r\n ELSE\r\n IF OperandSize = 16\r\n THEN\r\n Push(CS);\r\n Push(IP);\r\n CS <- 
DEST(CodeSegmentSelector);\r\n (* Segment descriptor information also loaded *)\r\n CS(RPL) <- CPL;\r\n EIP <- tempEIP;\r\n ELSE (* OperandSize = 64 *)\r\n Push(CS); (* Padded with 48 high-order bits *)\r\n Push(RIP);\r\n CS <- DEST(CodeSegmentSelector);\r\n (* Segment descriptor information also loaded *)\r\n CS(RPL) <- CPL;\r\n RIP <- tempEIP;\r\n FI;\r\n FI;\r\nEND;\r\n\r\nCALL-GATE:\r\n IF call gate (DPL < CPL) or (RPL > DPL)\r\n THEN #GP(call-gate selector); FI;\r\n IF call gate not present\r\n THEN #NP(call-gate selector); FI;\r\n IF call-gate code-segment selector is NULL\r\n THEN #GP(0); FI;\r\n IF call-gate code-segment selector index is outside descriptor table limits\r\n THEN #GP(call-gate code-segment selector); FI;\r\n Read call-gate code-segment descriptor;\r\n IF call-gate code-segment descriptor does not indicate a code segment\r\n or call-gate code-segment descriptor DPL > CPL\r\n THEN #GP(call-gate code-segment selector); FI;\r\n IF IA32_EFER.LMA = 1 AND (call-gate code-segment descriptor is\r\n not a 64-bit code segment or call-gate code-segment descriptor has both L-bit and D-bit set)\r\n THEN #GP(call-gate code-segment selector); FI;\r\n IF call-gate code segment not present\r\n\r\n\r\n\r\n THEN #NP(call-gate code-segment selector); FI;\r\n IF call-gate code segment is non-conforming and DPL < CPL\r\n THEN go to MORE-PRIVILEGE;\r\n ELSE go to SAME-PRIVILEGE;\r\n FI;\r\nEND;\r\n\r\nMORE-PRIVILEGE:\r\n IF current TSS is 32-bit\r\n THEN\r\n TSSstackAddress <- (new code-segment DPL * 8) + 4;\r\n IF (TSSstackAddress + 5) > current TSS limit\r\n THEN #TS(current TSS selector); FI;\r\n NewSS <- 2 bytes loaded from (TSS base + TSSstackAddress + 4);\r\n NewESP <- 4 bytes loaded from (TSS base + TSSstackAddress);\r\n ELSE\r\n IF current TSS is 16-bit\r\n THEN\r\n TSSstackAddress <- (new code-segment DPL * 4) + 2\r\n IF (TSSstackAddress + 3) > current TSS limit\r\n THEN #TS(current TSS selector); FI;\r\n NewSS <- 2 bytes loaded from (TSS base + 
TSSstackAddress + 2);\r\n NewESP <- 2 bytes loaded from (TSS base + TSSstackAddress);\r\n ELSE (* current TSS is 64-bit *)\r\n TSSstackAddress <- (new code-segment DPL * 8) + 4;\r\n IF (TSSstackAddress + 7) > current TSS limit\r\n THEN #TS(current TSS selector); FI;\r\n NewSS <- new code-segment DPL; (* NULL selector with RPL = new CPL *)\r\n NewRSP <- 8 bytes loaded from (current TSS base + TSSstackAddress);\r\n FI;\r\n FI;\r\n IF IA32_EFER.LMA = 0 and NewSS is NULL\r\n THEN #TS(NewSS); FI;\r\n Read new code-segment descriptor and new stack-segment descriptor;\r\n IF IA32_EFER.LMA = 0 and (NewSS RPL != new code-segment DPL\r\n or new stack-segment DPL != new code-segment DPL or new stack segment is not a\r\n writable data segment)\r\n THEN #TS(NewSS); FI\r\n IF IA32_EFER.LMA = 0 and new stack segment not present\r\n THEN #SS(NewSS); FI;\r\n IF CallGateSize = 32\r\n THEN\r\n IF new stack does not have room for parameters plus 16 bytes\r\n THEN #SS(NewSS); FI;\r\n IF CallGate(InstructionPointer) not within new code-segment limit\r\n THEN #GP(0); FI;\r\n SS <- newSS; (* Segment descriptor information also loaded *)\r\n ESP <- newESP;\r\n CS:EIP <- CallGate(CS:InstructionPointer);\r\n (* Segment descriptor information also loaded *)\r\n Push(oldSS:oldESP); (* From calling procedure *)\r\n temp <- parameter count from call gate, masked to 5 bits;\r\n Push(parameters from calling procedure's stack, temp)\r\n Push(oldCS:oldEIP); (* Return address to calling procedure *)\r\n\r\n\r\n\r\n ELSE\r\n IF CallGateSize = 16\r\n THEN\r\n IF new stack does not have room for parameters plus 8 bytes\r\n THEN #SS(NewSS); FI;\r\n IF (CallGate(InstructionPointer) AND FFFFH) not in new code-segment limit\r\n THEN #GP(0); FI;\r\n SS <- newSS; (* Segment descriptor information also loaded *)\r\n ESP <- newESP;\r\n CS:IP <- CallGate(CS:InstructionPointer);\r\n (* Segment descriptor information also loaded *)\r\n Push(oldSS:oldESP); (* From calling procedure *)\r\n temp <- parameter count 
from call gate, masked to 5 bits;\r\n Push(parameters from calling procedure's stack, temp)\r\n Push(oldCS:oldEIP); (* Return address to calling procedure *)\r\n ELSE (* CallGateSize = 64 *)\r\n IF pushing 32 bytes on the stack would use a non-canonical address\r\n THEN #SS(NewSS); FI;\r\n IF (CallGate(InstructionPointer) is non-canonical)\r\n THEN #GP(0); FI;\r\n SS <- NewSS; (* NewSS is NULL)\r\n RSP <- NewESP;\r\n CS:IP <- CallGate(CS:InstructionPointer);\r\n (* Segment descriptor information also loaded *)\r\n Push(oldSS:oldESP); (* From calling procedure *)\r\n Push(oldCS:oldEIP); (* Return address to calling procedure *)\r\n FI;\r\n FI;\r\n CPL <- CodeSegment(DPL)\r\n CS(RPL) <- CPL\r\nEND;\r\n\r\nSAME-PRIVILEGE:\r\n IF CallGateSize = 32\r\n THEN\r\n IF stack does not have room for 8 bytes\r\n THEN #SS(0); FI;\r\n IF CallGate(InstructionPointer) not within code segment limit\r\n THEN #GP(0); FI;\r\n CS:EIP <- CallGate(CS:EIP) (* Segment descriptor information also loaded *)\r\n Push(oldCS:oldEIP); (* Return address to calling procedure *)\r\n ELSE\r\n If CallGateSize = 16\r\n THEN\r\n IF stack does not have room for 4 bytes\r\n THEN #SS(0); FI;\r\n IF CallGate(InstructionPointer) not within code segment limit\r\n THEN #GP(0); FI;\r\n CS:IP <- CallGate(CS:instruction pointer);\r\n (* Segment descriptor information also loaded *)\r\n Push(oldCS:oldIP); (* Return address to calling procedure *)\r\n ELSE (* CallGateSize = 64)\r\n IF pushing 16 bytes on the stack touches non-canonical addresses\r\n THEN #SS(0); FI;\r\n\r\n\r\n\r\n IF RIP non-canonical\r\n THEN #GP(0); FI;\r\n CS:IP <- CallGate(CS:instruction pointer);\r\n (* Segment descriptor information also loaded *)\r\n Push(oldCS:oldIP); (* Return address to calling procedure *)\r\n FI;\r\n FI;\r\n CS(RPL) <- CPL\r\nEND;\r\n\r\nTASK-GATE:\r\n IF task gate DPL < CPL or RPL\r\n THEN #GP(task gate selector); FI;\r\n IF task gate not present\r\n THEN #NP(task gate selector); FI;\r\n Read the TSS segment selector 
in the task-gate descriptor;\r\n IF TSS segment selector local/global bit is set to local\r\n or index not within GDT limits\r\n THEN #GP(TSS selector); FI;\r\n Access TSS descriptor in GDT;\r\n IF TSS descriptor specifies that the TSS is busy (low-order 5 bits set to 00001)\r\n THEN #GP(TSS selector); FI;\r\n IF TSS not present\r\n THEN #NP(TSS selector); FI;\r\n SWITCH-TASKS (with nesting) to TSS;\r\n IF EIP not within code segment limit\r\n THEN #GP(0); FI;\r\nEND;\r\n\r\nTASK-STATE-SEGMENT:\r\n IF TSS DPL < CPL or RPL\r\n or TSS descriptor indicates TSS not available\r\n THEN #GP(TSS selector); FI;\r\n IF TSS is not present\r\n THEN #NP(TSS selector); FI;\r\n SWITCH-TASKS (with nesting) to TSS;\r\n IF EIP not within code segment limit\r\n THEN #GP(0); FI;\r\nEND;\r\n\r\n\r\nFlags Affected\r\nAll flags are affected if a task switch occurs; no flags are affected if a task switch does not occur.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the target offset in destination operand is beyond the new code segment limit.\r\n If the segment selector in the destination operand is NULL.\r\n If the code segment selector in the gate is NULL.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment\r\n selector.\r\n#GP(selector) If a code segment or gate or TSS selector index is outside descriptor table limits.\r\n If the segment descriptor pointed to by the segment selector in the destination operand is not\r\n for a conforming-code segment, nonconforming-code segment, call gate, task gate, or task\r\n state segment.\r\n If the DPL for a nonconforming-code segment is not equal to the CPL or the RPL for the\r\n segment's segment selector is greater than the CPL.\r\n If the DPL for a conforming-code segment is greater than the CPL.\r\n If the DPL from a call-gate, task-gate, or TSS segment descriptor is less than the CPL or 
than\r\n the RPL of the call-gate, task-gate, or TSS's segment selector.\r\n If the segment descriptor for a segment selector from a call gate does not indicate it is a code\r\n segment.\r\n If the segment selector from a call gate is beyond the descriptor table limits.\r\n If the DPL for a code-segment obtained from a call gate is greater than the CPL.\r\n If the segment selector for a TSS has its local/global bit set for local.\r\n If a TSS segment descriptor specifies that the TSS is busy or not available.\r\n#SS(0) If pushing the return address, parameters, or stack segment pointer onto the stack exceeds\r\n the bounds of the stack segment, when no stack switch occurs.\r\n If a memory operand effective address is outside the SS segment limit.\r\n#SS(selector) If pushing the return address, parameters, or stack segment pointer onto the stack exceeds\r\n the bounds of the stack segment, when a stack switch occurs.\r\n If the SS register is being loaded as part of a stack switch and the segment pointed to is\r\n marked not present.\r\n If stack segment does not have room for the return address, parameters, or stack segment\r\n pointer, when stack switch occurs.\r\n#NP(selector) If a code segment, data segment, stack segment, call gate, task gate, or TSS is not present.\r\n#TS(selector) If the new stack segment selector and ESP are beyond the end of the TSS.\r\n If the new stack segment selector is NULL.\r\n If the RPL of the new stack segment selector in the TSS is not equal to the DPL of the code\r\n segment being accessed.\r\n If DPL of the stack segment descriptor for the new stack segment is not equal to the DPL of the\r\n code segment descriptor.\r\n If the new stack segment is not a writable data segment.\r\n If segment-selector index for stack segment is outside descriptor table limits.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 
3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the target offset is beyond the code segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the target offset is beyond the code segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n#GP(selector) If a memory address accessed by the selector is in non-canonical space.\r\n#GP(0) If the target offset in the destination operand is non-canonical.\r\n\r\n64-Bit Mode Exceptions\r\n#GP(0) If a memory address is non-canonical.\r\n If target offset in destination operand is non-canonical.\r\n If the segment selector in the destination operand is NULL.\r\n If the code segment selector in the 64-bit gate is NULL.\r\n#GP(selector) If code segment or 64-bit call gate is outside descriptor table limits.\r\n If code segment or 64-bit call gate overlaps non-canonical space.\r\n If the segment descriptor pointed to by the segment selector in the destination operand is not\r\n for a conforming-code segment, nonconforming-code segment, or 64-bit call gate.\r\n If the segment descriptor pointed to by the segment selector in the destination operand is a\r\n code segment and has both the D-bit and the L- bit set.\r\n If the DPL for a nonconforming-code segment is not equal to the CPL, or the RPL for the\r\n segment's segment selector is greater than the CPL.\r\n If the DPL for a conforming-code segment is greater than the CPL.\r\n If the DPL from a 64-bit call-gate is less than the CPL or than the RPL of the 64-bit call-gate.\r\n If the upper type field of a 
64-bit call gate is not 0x0.\r\n If the segment selector from a 64-bit call gate is beyond the descriptor table limits.\r\n If the DPL for a code-segment obtained from a 64-bit call gate is greater than the CPL.\r\n If the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the L-\r\n bit set and the D-bit clear.\r\n If the segment descriptor for a segment selector from the 64-bit call gate does not indicate it\r\n is a code segment.\r\n#SS(0) If pushing the return offset or CS selector onto the stack exceeds the bounds of the stack\r\n segment when no stack switch occurs.\r\n If a memory operand effective address is outside the SS segment limit.\r\n If the stack address is in a non-canonical form.\r\n#SS(selector) If pushing the old values of SS selector, stack pointer, EFLAGS, CS selector, offset, or error\r\n code onto the stack violates the canonical boundary when a stack switch occurs.\r\n#NP(selector) If a code segment or 64-bit call gate is not present.\r\n#TS(selector) If the load of the new RSP exceeds the limit of the TSS.\r\n#UD (64-bit mode only) If a far call is direct to an absolute address in memory.\r\n If the LOCK prefix is used.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CALL"
},
{
"description": "CBW/CWDE/CDQE-Convert Byte to Word/Convert Word to Doubleword/Convert Doubleword to Quadword\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n98 CBW NP Valid Valid AX <- sign-extend of AL.\r\n98 CWDE NP Valid Valid EAX <- sign-extend of AX.\r\nREX.W + 98 CDQE NP Valid N.E. RAX <- sign-extend of EAX.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nDouble the size of the source operand by means of sign extension. The CBW (convert byte to word) instruction\r\ncopies the sign (bit 7) in the source operand into every bit in the AH register. The CWDE (convert word to double-\r\nword) instruction copies the sign (bit 15) of the word in the AX register into the high 16 bits of the EAX register.\r\nCBW and CWDE reference the same opcode. The CBW instruction is intended for use when the operand-size attri-\r\nbute is 16; CWDE is intended for use when the operand-size attribute is 32. Some assemblers may force the\r\noperand size. Others may treat these two mnemonics as synonyms (CBW/CWDE) and use the setting of the\r\noperand-size attribute to determine the size of values to be converted.\r\nIn 64-bit mode, the default operation size is the size of the destination register. Use of the REX.W prefix promotes\r\nthis instruction (CDQE when promoted) to operate on 64-bit operands. In which case, CDQE copies the sign (bit\r\n31) of the doubleword in the EAX register into the high 32 bits of RAX.\r\n\r\nOperation\r\nIF OperandSize = 16 (* Instruction = CBW *)\r\n THEN\r\n AX <- SignExtend(AL);\r\n ELSE IF (OperandSize = 32, Instruction = CWDE)\r\n EAX <- SignExtend(AX); FI;\r\n ELSE (* 64-Bit Mode, OperandSize = 64, Instruction = CDQE*)\r\n RAX <- SignExtend(EAX);\r\nFI;\r\n\r\nFlags Affected\r\nNone.\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CBW"
},
{
"description": "-R:CWD",
"mnem": "CDQ"
},
{
"description": "-R:CBW",
"mnem": "CDQE"
},
{
"description": "CLAC-Clear AC Flag in EFLAGS Register\r\nOpcode/ Op / 64/32 bit CPUID Description\r\nInstruction En Mode Feature\r\n Support Flag\r\n0F 01 CA NP V/V SMAP Clear the AC flag in the EFLAGS register.\r\nCLAC\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nClears the AC flag bit in EFLAGS register. This disables any alignment checking of user-mode data accesses. If the\r\nSMAP bit is set in the CR4 register, this disallows explicit supervisor-mode data accesses to user-mode pages.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode. Attempts to execute CLAC when\r\nCPL > 0 cause #UD.\r\n\r\nOperation\r\nEFLAGS.AC <- 0;\r\n\r\nFlags Affected\r\nAC cleared. Other flags are unaffected.\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If the CPL > 0.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD The CLAC instruction is not recognized in virtual-8086 mode.\r\n\r\nCompatibility Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If the CPL > 0.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If the CPL > 0.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLAC"
},
{
"description": "CLC-Clear Carry Flag\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n F8 CLC NP Valid Valid Clear CF flag.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nClears the CF flag in the EFLAGS register. Operation is the same in all modes.\r\n\r\nOperation\r\nCF <- 0;\r\n\r\nFlags Affected\r\nThe CF flag is set to 0. The OF, ZF, SF, AF, and PF flags are unaffected.\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLC"
},
{
"description": "CLD-Clear Direction Flag\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\nFC CLD NP Valid Valid Clear DF flag.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nClears the DF flag in the EFLAGS register. When the DF flag is set to 0, string operations increment the index regis-\r\nters (ESI and/or EDI). Operation is the same in all modes.\r\n\r\nOperation\r\nDF <- 0;\r\n\r\nFlags Affected\r\nThe DF flag is set to 0. The CF, OF, ZF, SF, AF, and PF flags are unaffected.\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLD"
},
{
"description": "CLFLUSH-Flush Cache Line\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F AE /7 CLFLUSH m8 M Valid Valid Flushes cache line containing m8.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n M ModRM:r/m (w) NA NA NA\r\n\r\nDescription\r\nInvalidates from every level of the cache hierarchy in the cache coherence domain the cache line that contains the\r\nlinear address specified with the memory operand. If that cache line contains modified data at any level of the\r\ncache hierarchy, that data is written back to memory. The source operand is a byte memory location.\r\nThe availability of CLFLUSH is indicated by the presence of the CPUID feature flag CLFSH\r\n(CPUID.01H:EDX[bit 19]). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8\r\nthrough 15 of the EBX register when the initial value in the EAX register is 1).\r\nThe memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It\r\nshould be noted that processors are free to speculatively fetch and cache data from system memory regions\r\nassigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types). PREFETCHh\r\ninstructions can be used to provide the processor with hints for this speculative behavior. 
Because this speculative\r\nfetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with\r\nrespect to PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can be specula-\r\ntively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references\r\nthe cache line).\r\nExecutions of the CLFLUSH instruction are ordered with respect to each other and with respect to writes, locked\r\nread-modify-write instructions, fence instructions, and executions of CLFLUSHOPT to the same cache line.1 They\r\nare not ordered with respect to executions of CLFLUSHOPT to different cache lines.\r\nThe CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults asso-\r\nciated with a byte load (and in addition, a CLFLUSH instruction is allowed to flush a linear address in an execute-\r\nonly segment). Like a load, the CLFLUSH instruction sets the A bit but not the D bit in the page tables.\r\nIn some implementations, the CLFLUSH instruction may always cause transactional abort with Transactional\r\nSynchronization Extensions (TSX). The CLFLUSH instruction is not expected to be commonly used inside typical\r\ntransactional regions. However, programmers must not rely on CLFLUSH instruction to force a transactional abort,\r\nsince whether they cause transactional abort is implementation dependent.\r\nThe CLFLUSH instruction was introduced with the SSE2 extensions; however, because it has its own CPUID feature\r\nflag, it can be implemented in IA-32 processors that do not include the SSE2 extensions. 
Also, detecting the pres-\r\nence of the SSE2 extensions with the CPUID instruction does not guarantee that the CLFLUSH instruction is imple-\r\nmented in the processor.\r\nCLFLUSH operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nFlush_Cache_Line(SRC);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalents\r\nCLFLUSH: void _mm_clflush(void const *p)\r\n\r\n\r\n\r\n\r\n1. Earlier versions of this manual specified that executions of the CLFLUSH instruction were ordered only by the MFENCE instruction.\r\n All processors implementing the CLFLUSH instruction also order it relative to the other operations enumerated above.\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.\r\n#SS(0) For an illegal address in the SS segment.\r\n#PF(fault-code) For a page fault.\r\n#UD If CPUID.01H:EDX.CLFSH[bit 19] = 0.\r\n If the LOCK prefix is used.\r\n If an instruction prefix F2H or F3H is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If any part of the operand lies outside the effective address space from 0 to FFFFH.\r\n#UD If CPUID.01H:EDX.CLFSH[bit 19] = 0.\r\n If the LOCK prefix is used.\r\n If an instruction prefix F2H or F3H is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in real address mode.\r\n#PF(fault-code) For a page fault.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) For a page fault.\r\n#UD If CPUID.01H:EDX.CLFSH[bit 19] = 0.\r\n If the LOCK prefix is used.\r\n If an instruction prefix F2H or F3H is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLFLUSH"
},
{
"description": "CLFLUSHOPT-Flush Cache Line Optimized\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 66 0F AE /7 CLFLUSHOPT m8 M Valid Valid Flushes cache line containing m8.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n M ModRM:r/m (w) NA NA NA\r\n\r\nDescription\r\nInvalidates from every level of the cache hierarchy in the cache coherence domain the cache line that contains the\r\nlinear address specified with the memory operand. If that cache line contains modified data at any level of the\r\ncache hierarchy, that data is written back to memory. The source operand is a byte memory location.\r\nThe availability of CLFLUSHOPT is indicated by the presence of the CPUID feature flag CLFLUSHOPT\r\n(CPUID.(EAX=7,ECX=0):EBX[bit 23]). The aligned cache line size affected is also indicated with the CPUID instruc-\r\ntion (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).\r\nThe memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It\r\nshould be noted that processors are free to speculatively fetch and cache data from system memory regions\r\nassigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types). PREFETCHh\r\ninstructions can be used to provide the processor with hints for this speculative behavior. 
Because this speculative\r\nfetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with\r\nrespect to PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can be specula-\r\ntively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references\r\nthe cache line).\r\nExecutions of the CLFLUSHOPT instruction are ordered with respect to fence instructions and to locked read-\r\nmodify-write instructions; they are also ordered with respect to the following accesses to the cache line being\r\ninvalidated: writes, executions of CLFLUSH, and executions of CLFLUSHOPT. They are not ordered with respect to\r\nwrites, executions of CLFLUSH, or executions of CLFLUSHOPT that access other cache lines; to enforce ordering\r\nwith such an operation, software can insert an SFENCE instruction between CLFLUSHOPT and that operation.\r\nThe CLFLUSHOPT instruction can be used at all privilege levels and is subject to all permission checking and faults\r\nassociated with a byte load (and in addition, a CLFLUSHOPT instruction is allowed to flush a linear address in an\r\nexecute-only segment). Like a load, the CLFLUSHOPT instruction sets the A bit but not the D bit in the page tables.\r\nIn some implementations, the CLFLUSHOPT instruction may always cause transactional abort with Transactional\r\nSynchronization Extensions (TSX). The CLFLUSHOPT instruction is not expected to be commonly used inside\r\ntypical transactional regions. 
However, programmers must not rely on CLFLUSHOPT instruction to force a transac-\r\ntional abort, since whether they cause transactional abort is implementation dependent.\r\nCLFLUSHOPT operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nFlush_Cache_Line_Optimized(SRC);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalents\r\nCLFLUSHOPT:void _mm_clflushopt(void const *p)\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.\r\n#SS(0) For an illegal address in the SS segment.\r\n#PF(fault-code) For a page fault.\r\n#UD If CPUID.(EAX=7,ECX=0):EBX.CLFLUSHOPT[bit 23] = 0.\r\n If the LOCK prefix is used.\r\n If an instruction prefix F2H or F3H is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If any part of the operand lies outside the effective address space from 0 to FFFFH.\r\n#UD If CPUID.(EAX=7,ECX=0):EBX.CLFLUSHOPT[bit 23] = 0.\r\n If the LOCK prefix is used.\r\n If an instruction prefix F2H or F3H is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in real address mode.\r\n#PF(fault-code) For a page fault.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) For a page fault.\r\n#UD If CPUID.(EAX=7,ECX=0):EBX.CLFLUSHOPT[bit 23] = 0.\r\n If the LOCK prefix is used.\r\n If an instruction prefix F2H or F3H is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLFLUSHOPT"
},
{
"description": "CLI - Clear Interrupt Flag\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n FA CLI NP Valid Valid Clear interrupt flag; interrupts disabled when\r\n interrupt flag cleared.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nIf protected-mode virtual interrupts are not enabled, CLI clears the IF flag in the EFLAGS register. No other flags\r\nare affected. Clearing the IF flag causes the processor to ignore maskable external interrupts. The IF flag and the\r\nCLI and STI instructions have no effect on the generation of exceptions and NMI interrupts.\r\nWhen protected-mode virtual interrupts are enabled, CPL is 3, and IOPL is less than 3; CLI clears the VIF flag in the\r\nEFLAGS register, leaving IF unaffected. Table 3-7 indicates the action of the CLI instruction depending on the\r\nprocessor operating mode and the CPL/IOPL of the running program or procedure.\r\nOperation is the same in all modes.\r\n\r\n\r\n Table 3-7. 
Decision Table for CLI Results\r\n PE VM IOPL CPL PVI VIP VME CLI Result\r\n 0 X X X X X X IF = 0\r\n 1 0 >= CPL X X X X IF = 0\r\n 1 0 < CPL 3 1 X X VIF = 0\r\n 1 0 < CPL <3 X X X GP Fault\r\n 1 0 < CPL X 0 X X GP Fault\r\n 1 1 3 X X X X IF = 0\r\n 1 1 <3 X X X 1 VIF = 0\r\n 1 1 <3 X X X 0 GP Fault\r\n NOTES:\r\n * X = This setting has no impact.\r\n\r\nOperation\r\nIF PE = 0\r\n THEN\r\n IF <- 0; (* Reset Interrupt Flag *)\r\n ELSE\r\n IF VM = 0;\r\n THEN\r\n IF IOPL >= CPL\r\n THEN\r\n IF <- 0; (* Reset Interrupt Flag *)\r\n ELSE\r\n IF ((IOPL < CPL) and (CPL = 3) and (PVI = 1))\r\n THEN\r\n VIF <- 0; (* Reset Virtual Interrupt Flag *)\r\n ELSE\r\n #GP(0);\r\n\r\n\r\n FI;\r\n FI;\r\n ELSE (* VM = 1 *)\r\n IF IOPL = 3\r\n THEN\r\n IF <- 0; (* Reset Interrupt Flag *)\r\n ELSE\r\n IF (IOPL < 3) AND (VME = 1)\r\n THEN\r\n VIF <- 0; (* Reset Virtual Interrupt Flag *)\r\n ELSE\r\n #GP(0);\r\n FI;\r\n FI;\r\n FI;\r\nFI;\r\n\r\nFlags Affected\r\nIf protected-mode virtual interrupts are not enabled, IF is set to 0 if the CPL is equal to or less than the IOPL; other-\r\nwise, it is not affected. Other flags are unaffected.\r\nWhen protected-mode virtual interrupts are enabled, CPL is 3, and IOPL is less than 3; CLI clears the VIF flag in the\r\nEFLAGS register, leaving IF unaffected. 
Other flags are unaffected.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the CPL is greater (has less privilege) than the IOPL of the current program or procedure.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If the CPL is greater (has less privilege) than the IOPL of the current program or procedure.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#GP(0) If the CPL is greater (has less privilege) than the IOPL of the current program or procedure.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLI"
},
{
"description": "CLTS-Clear Task-Switched Flag in CR0\r\n Opcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F 06 CLTS NP Valid Valid Clears TS flag in CR0.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nClears the task-switched (TS) flag in the CR0 register. This instruction is intended for use in operating-system\r\nprocedures. It is a privileged instruction that can only be executed at a CPL of 0. It is allowed to be executed in real-\r\naddress mode to allow initialization for protected mode.\r\nThe processor sets the TS flag every time a task switch occurs. The flag is used to synchronize the saving of FPU\r\ncontext in multitasking applications. See the description of the TS flag in the section titled \"Control Registers\" in\r\nChapter 2 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, for more information\r\nabout this flag.\r\nCLTS operation is the same in non-64-bit modes and 64-bit mode.\r\nSee Chapter 25, \"VMX Non-Root Operation,\" of the Intel 64 and IA-32 Architectures Software Developer's\r\nManual, Volume 3C, for more information about the behavior of this instruction in VMX non-root operation.\r\n\r\nOperation\r\nCR0.TS[bit 3] <- 0;\r\n\r\nFlags Affected\r\nThe TS flag in CR0 register is cleared.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the current privilege level is not 0.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) CLTS is not recognized in virtual-8086 mode.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#GP(0) If the CPL is greater than 0.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLTS"
},
{
"description": "CLWB-Cache Line Write Back\r\n Opcode/ Op/ 64/32 bit CPUID Description\r\n Instruction En Mode Feature Flag\r\n Support\r\n 66 0F AE /6 M V/V CLWB Writes back modified cache line containing m8, and may\r\n CLWB m8 retain the line in cache hierarchy in non-modified state.\r\n\r\n\r\n Instruction Operand Encoding1\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n M ModRM:r/m (w) NA NA NA\r\n\r\nDescription\r\nWrites back to memory the cache line (if modified) that contains the linear address specified with the memory\r\noperand from any level of the cache hierarchy in the cache coherence domain. The line may be retained in the\r\ncache hierarchy in non-modified state. Retaining the line in the cache hierarchy is a performance optimization\r\n(treated as a hint by hardware) to reduce the possibility of cache miss on a subsequent access. Hardware may\r\nchoose to retain the line at any of the levels in the cache hierarchy, and in some cases, may invalidate the line from\r\nthe cache hierarchy. The source operand is a byte memory location.\r\nThe availability of CLWB instruction is indicated by the presence of the CPUID feature flag CLWB (bit 24 of the EBX\r\nregister, see \"CPUID - CPU Identification\" in this chapter). The aligned cache line size affected is also indicated\r\nwith the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).\r\nThe memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It\r\nshould be noted that processors are free to speculatively fetch and cache data from system memory regions that\r\nare assigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types).\r\nPREFETCHh instructions can be used to provide the processor with hints for this speculative behavior. 
Because this\r\nspeculative fetching can occur at any time and is not tied to instruction execution, the CLWB instruction is not\r\nordered with respect to PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can\r\nbe speculatively loaded into a cache line just before, during, or after the execution of a CLWB instruction that refer-\r\nences the cache line).\r\nCLWB instruction is ordered only by store-fencing operations. For example, software can use an SFENCE, MFENCE,\r\nXCHG, or LOCK-prefixed instructions to ensure that previous stores are included in the write-back. CLWB instruc-\r\ntion need not be ordered by another CLWB or CLFLUSHOPT instruction. CLWB is implicitly ordered with older stores\r\nexecuted by the logical processor to the same address.\r\nFor usages that require only writing back modified data from cache lines to memory (do not require the line to be\r\ninvalidated), and expect to subsequently access the data, software is recommended to use CLWB (with appropriate\r\nfencing) instead of CLFLUSH or CLFLUSHOPT for improved performance.\r\nThe CLWB instruction can be used at all privilege levels and is subject to all permission checking and faults associ-\r\nated with a byte load. Like a load, the CLWB instruction sets the accessed flag but not the dirty flag in the page\r\ntables.\r\nIn some implementations, the CLWB instruction may always cause transactional abort with Transactional Synchro-\r\nnization Extensions (TSX). CLWB instruction is not expected to be commonly used inside typical transactional\r\nregions. However, programmers must not rely on CLWB instruction to force a transactional abort, since whether\r\nthey cause transactional abort is implementation dependent.\r\n\r\nOperation\r\nCache_Line_Write_Back(m8);\r\n\r\nFlags Affected\r\nNone.\r\n\r\n\r\n1. 
ModRM.MOD != 011B\r\n\r\n\r\n\r\nC/C++ Compiler Intrinsic Equivalent\r\nCLWB void _mm_clwb(void const *p);\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.CLWB[bit 24] = 0.\r\n#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.\r\n#SS(0) For an illegal address in the SS segment.\r\n#PF(fault-code) For a page fault.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.CLWB[bit 24] = 0.\r\n#GP If any part of the operand lies outside the effective address space from 0 to FFFFH.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in real address mode.\r\n#PF(fault-code) For a page fault.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n If CPUID.(EAX=07H, ECX=0H):EBX.CLWB[bit 24] = 0.\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) For a page fault.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CLWB"
},
{
"description": "CMC-Complement Carry Flag\r\nOpcode Instruction Op/ 64-bit Compat/ Description\r\n En Mode Leg Mode\r\nF5 CMC NP Valid Valid Complement CF flag.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nComplements the CF flag in the EFLAGS register. CMC operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nEFLAGS.CF[bit 0]<- NOT EFLAGS.CF[bit 0];\r\n\r\nFlags Affected\r\nThe CF flag contains the complement of its original value. The OF, ZF, SF, AF, and PF flags are unaffected.\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMC"
},
{
"description": "CMOVcc-Conditional Move\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F 47 /r CMOVA r16, r/m16 RM Valid Valid Move if above (CF=0 and ZF=0).\r\n 0F 47 /r CMOVA r32, r/m32 RM Valid Valid Move if above (CF=0 and ZF=0).\r\n REX.W + 0F 47 /r CMOVA r64, r/m64 RM Valid N.E. Move if above (CF=0 and ZF=0).\r\n 0F 43 /r CMOVAE r16, r/m16 RM Valid Valid Move if above or equal (CF=0).\r\n 0F 43 /r CMOVAE r32, r/m32 RM Valid Valid Move if above or equal (CF=0).\r\n REX.W + 0F 43 /r CMOVAE r64, r/m64 RM Valid N.E. Move if above or equal (CF=0).\r\n 0F 42 /r CMOVB r16, r/m16 RM Valid Valid Move if below (CF=1).\r\n 0F 42 /r CMOVB r32, r/m32 RM Valid Valid Move if below (CF=1).\r\n REX.W + 0F 42 /r CMOVB r64, r/m64 RM Valid N.E. Move if below (CF=1).\r\n 0F 46 /r CMOVBE r16, r/m16 RM Valid Valid Move if below or equal (CF=1 or ZF=1).\r\n 0F 46 /r CMOVBE r32, r/m32 RM Valid Valid Move if below or equal (CF=1 or ZF=1).\r\n REX.W + 0F 46 /r CMOVBE r64, r/m64 RM Valid N.E. Move if below or equal (CF=1 or ZF=1).\r\n 0F 42 /r CMOVC r16, r/m16 RM Valid Valid Move if carry (CF=1).\r\n 0F 42 /r CMOVC r32, r/m32 RM Valid Valid Move if carry (CF=1).\r\n REX.W + 0F 42 /r CMOVC r64, r/m64 RM Valid N.E. Move if carry (CF=1).\r\n 0F 44 /r CMOVE r16, r/m16 RM Valid Valid Move if equal (ZF=1).\r\n 0F 44 /r CMOVE r32, r/m32 RM Valid Valid Move if equal (ZF=1).\r\n REX.W + 0F 44 /r CMOVE r64, r/m64 RM Valid N.E. Move if equal (ZF=1).\r\n 0F 4F /r CMOVG r16, r/m16 RM Valid Valid Move if greater (ZF=0 and SF=OF).\r\n 0F 4F /r CMOVG r32, r/m32 RM Valid Valid Move if greater (ZF=0 and SF=OF).\r\n REX.W + 0F 4F /r CMOVG r64, r/m64 RM Valid N.E. Move if greater (ZF=0 and SF=OF).\r\n 0F 4D /r CMOVGE r16, r/m16 RM Valid Valid Move if greater or equal (SF=OF).\r\n 0F 4D /r CMOVGE r32, r/m32 RM Valid Valid Move if greater or equal (SF=OF).\r\n REX.W + 0F 4D /r CMOVGE r64, r/m64 RM Valid N.E. 
Move if greater or equal (SF=OF).\r\n 0F 4C /r CMOVL r16, r/m16 RM Valid Valid Move if less (SF!= OF).\r\n 0F 4C /r CMOVL r32, r/m32 RM Valid Valid Move if less (SF!= OF).\r\n REX.W + 0F 4C /r CMOVL r64, r/m64 RM Valid N.E. Move if less (SF!= OF).\r\n 0F 4E /r CMOVLE r16, r/m16 RM Valid Valid Move if less or equal (ZF=1 or SF!= OF).\r\n 0F 4E /r CMOVLE r32, r/m32 RM Valid Valid Move if less or equal (ZF=1 or SF!= OF).\r\n REX.W + 0F 4E /r CMOVLE r64, r/m64 RM Valid N.E. Move if less or equal (ZF=1 or SF!= OF).\r\n 0F 46 /r CMOVNA r16, r/m16 RM Valid Valid Move if not above (CF=1 or ZF=1).\r\n 0F 46 /r CMOVNA r32, r/m32 RM Valid Valid Move if not above (CF=1 or ZF=1).\r\n REX.W + 0F 46 /r CMOVNA r64, r/m64 RM Valid N.E. Move if not above (CF=1 or ZF=1).\r\n 0F 42 /r CMOVNAE r16, r/m16 RM Valid Valid Move if not above or equal (CF=1).\r\n 0F 42 /r CMOVNAE r32, r/m32 RM Valid Valid Move if not above or equal (CF=1).\r\n REX.W + 0F 42 /r CMOVNAE r64, r/m64 RM Valid N.E. Move if not above or equal (CF=1).\r\n 0F 43 /r CMOVNB r16, r/m16 RM Valid Valid Move if not below (CF=0).\r\n 0F 43 /r CMOVNB r32, r/m32 RM Valid Valid Move if not below (CF=0).\r\n REX.W + 0F 43 /r CMOVNB r64, r/m64 RM Valid N.E. Move if not below (CF=0).\r\n 0F 47 /r CMOVNBE r16, r/m16 RM Valid Valid Move if not below or equal (CF=0 and ZF=0).\r\n\r\n\r\n\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F 47 /r CMOVNBE r32, r/m32 RM Valid Valid Move if not below or equal (CF=0 and ZF=0).\r\n REX.W + 0F 47 /r CMOVNBE r64, r/m64 RM Valid N.E. Move if not below or equal (CF=0 and ZF=0).\r\n 0F 43 /r CMOVNC r16, r/m16 RM Valid Valid Move if not carry (CF=0).\r\n 0F 43 /r CMOVNC r32, r/m32 RM Valid Valid Move if not carry (CF=0).\r\n REX.W + 0F 43 /r CMOVNC r64, r/m64 RM Valid N.E. 
Move if not carry (CF=0).\r\n 0F 45 /r CMOVNE r16, r/m16 RM Valid Valid Move if not equal (ZF=0).\r\n 0F 45 /r CMOVNE r32, r/m32 RM Valid Valid Move if not equal (ZF=0).\r\n REX.W + 0F 45 /r CMOVNE r64, r/m64 RM Valid N.E. Move if not equal (ZF=0).\r\n 0F 4E /r CMOVNG r16, r/m16 RM Valid Valid Move if not greater (ZF=1 or SF!= OF).\r\n 0F 4E /r CMOVNG r32, r/m32 RM Valid Valid Move if not greater (ZF=1 or SF!= OF).\r\n REX.W + 0F 4E /r CMOVNG r64, r/m64 RM Valid N.E. Move if not greater (ZF=1 or SF!= OF).\r\n 0F 4C /r CMOVNGE r16, r/m16 RM Valid Valid Move if not greater or equal (SF!= OF).\r\n 0F 4C /r CMOVNGE r32, r/m32 RM Valid Valid Move if not greater or equal (SF!= OF).\r\n REX.W + 0F 4C /r CMOVNGE r64, r/m64 RM Valid N.E. Move if not greater or equal (SF!= OF).\r\n 0F 4D /r CMOVNL r16, r/m16 RM Valid Valid Move if not less (SF=OF).\r\n 0F 4D /r CMOVNL r32, r/m32 RM Valid Valid Move if not less (SF=OF).\r\n REX.W + 0F 4D /r CMOVNL r64, r/m64 RM Valid N.E. Move if not less (SF=OF).\r\n 0F 4F /r CMOVNLE r16, r/m16 RM Valid Valid Move if not less or equal (ZF=0 and SF=OF).\r\n 0F 4F /r CMOVNLE r32, r/m32 RM Valid Valid Move if not less or equal (ZF=0 and SF=OF).\r\n REX.W + 0F 4F /r CMOVNLE r64, r/m64 RM Valid N.E. Move if not less or equal (ZF=0 and SF=OF).\r\n 0F 41 /r CMOVNO r16, r/m16 RM Valid Valid Move if not overflow (OF=0).\r\n 0F 41 /r CMOVNO r32, r/m32 RM Valid Valid Move if not overflow (OF=0).\r\n REX.W + 0F 41 /r CMOVNO r64, r/m64 RM Valid N.E. Move if not overflow (OF=0).\r\n 0F 4B /r CMOVNP r16, r/m16 RM Valid Valid Move if not parity (PF=0).\r\n 0F 4B /r CMOVNP r32, r/m32 RM Valid Valid Move if not parity (PF=0).\r\n REX.W + 0F 4B /r CMOVNP r64, r/m64 RM Valid N.E. Move if not parity (PF=0).\r\n 0F 49 /r CMOVNS r16, r/m16 RM Valid Valid Move if not sign (SF=0).\r\n 0F 49 /r CMOVNS r32, r/m32 RM Valid Valid Move if not sign (SF=0).\r\n REX.W + 0F 49 /r CMOVNS r64, r/m64 RM Valid N.E. 
Move if not sign (SF=0).\r\n 0F 45 /r CMOVNZ r16, r/m16 RM Valid Valid Move if not zero (ZF=0).\r\n 0F 45 /r CMOVNZ r32, r/m32 RM Valid Valid Move if not zero (ZF=0).\r\n REX.W + 0F 45 /r CMOVNZ r64, r/m64 RM Valid N.E. Move if not zero (ZF=0).\r\n 0F 40 /r CMOVO r16, r/m16 RM Valid Valid Move if overflow (OF=1).\r\n 0F 40 /r CMOVO r32, r/m32 RM Valid Valid Move if overflow (OF=1).\r\n REX.W + 0F 40 /r CMOVO r64, r/m64 RM Valid N.E. Move if overflow (OF=1).\r\n 0F 4A /r CMOVP r16, r/m16 RM Valid Valid Move if parity (PF=1).\r\n 0F 4A /r CMOVP r32, r/m32 RM Valid Valid Move if parity (PF=1).\r\n REX.W + 0F 4A /r CMOVP r64, r/m64 RM Valid N.E. Move if parity (PF=1).\r\n 0F 4A /r CMOVPE r16, r/m16 RM Valid Valid Move if parity even (PF=1).\r\n 0F 4A /r CMOVPE r32, r/m32 RM Valid Valid Move if parity even (PF=1).\r\n REX.W + 0F 4A /r CMOVPE r64, r/m64 RM Valid N.E. Move if parity even (PF=1).\r\n\r\n\r\n\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n 0F 4B /r CMOVPO r16, r/m16 RM Valid Valid Move if parity odd (PF=0).\r\n 0F 4B /r CMOVPO r32, r/m32 RM Valid Valid Move if parity odd (PF=0).\r\n REX.W + 0F 4B /r CMOVPO r64, r/m64 RM Valid N.E. Move if parity odd (PF=0).\r\n 0F 48 /r CMOVS r16, r/m16 RM Valid Valid Move if sign (SF=1).\r\n 0F 48 /r CMOVS r32, r/m32 RM Valid Valid Move if sign (SF=1).\r\n REX.W + 0F 48 /r CMOVS r64, r/m64 RM Valid N.E. Move if sign (SF=1).\r\n 0F 44 /r CMOVZ r16, r/m16 RM Valid Valid Move if zero (ZF=1).\r\n 0F 44 /r CMOVZ r32, r/m32 RM Valid Valid Move if zero (ZF=1).\r\n REX.W + 0F 44 /r CMOVZ r64, r/m64 RM Valid N.E. Move if zero (ZF=1).\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nThe CMOVcc instructions check the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF,\r\nand ZF) and perform a move operation if the flags are in a specified state (or condition). 
A condition code (cc) is\r\nassociated with each instruction to indicate the condition being tested for. If the condition is not satisfied, a move\r\nis not performed and execution continues with the instruction following the CMOVcc instruction.\r\nThese instructions can move 16-bit, 32-bit or 64-bit values from memory to a general-purpose register or from one\r\ngeneral-purpose register to another. Conditional moves of 8-bit register operands are not supported.\r\nThe condition for each CMOVcc mnemonic is given in the description column of the above table. The terms \"less\"\r\nand \"greater\" are used for comparisons of signed integers and the terms \"above\" and \"below\" are used for\r\nunsigned integers.\r\nBecause a particular state of the status flags can sometimes be interpreted in two ways, two mnemonics are\r\ndefined for some opcodes. For example, the CMOVA (conditional move if above) instruction and the CMOVNBE\r\n(conditional move if not below or equal) instruction are alternate mnemonics for the opcode 0F 47H.\r\nThe CMOVcc instructions were introduced in P6 family processors; however, these instructions may not be\r\nsupported by all IA-32 processors. Software can determine if the CMOVcc instructions are supported by checking\r\nthe processor's feature information with the CPUID instruction (see \"CPUID-CPU Identification\" in this chapter).\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to addi-\r\ntional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. 
See the summary chart at the\r\nbeginning of this section for encoding data and limits.\r\n\r\nOperation\r\ntemp <- SRC;\r\n\r\nIF condition TRUE\r\n THEN\r\n DEST <- temp;\r\nELSE\r\n IF (OperandSize = 32 and IA-32e mode active)\r\n THEN\r\n DEST[63:32] <- 0;\r\n FI;\r\nFI;\r\n\r\n\r\n\r\nFlags Affected\r\nNone.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMOVcc"
},
{
"description": "CMP-Compare Two Operands\r\nOpcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n3C ib CMP AL, imm8 I Valid Valid Compare imm8 with AL.\r\n3D iw CMP AX, imm16 I Valid Valid Compare imm16 with AX.\r\n3D id CMP EAX, imm32 I Valid Valid Compare imm32 with EAX.\r\nREX.W + 3D id CMP RAX, imm32 I Valid N.E. Compare imm32 sign-extended to 64-bits\r\n with RAX.\r\n80 /7 ib CMP r/m8, imm8 MI Valid Valid Compare imm8 with r/m8.\r\nREX + 80 /7 ib CMP r/m8*, imm8 MI Valid N.E. Compare imm8 with r/m8.\r\n81 /7 iw CMP r/m16, imm16 MI Valid Valid Compare imm16 with r/m16.\r\n81 /7 id CMP r/m32, imm32 MI Valid Valid Compare imm32 with r/m32.\r\nREX.W + 81 /7 id CMP r/m64, imm32 MI Valid N.E. Compare imm32 sign-extended to 64-bits\r\n with r/m64.\r\n83 /7 ib CMP r/m16, imm8 MI Valid Valid Compare imm8 with r/m16.\r\n83 /7 ib CMP r/m32, imm8 MI Valid Valid Compare imm8 with r/m32.\r\nREX.W + 83 /7 ib CMP r/m64, imm8 MI Valid N.E. Compare imm8 with r/m64.\r\n38 /r CMP r/m8, r8 MR Valid Valid Compare r8 with r/m8.\r\n * *\r\nREX + 38 /r CMP r/m8 , r8 MR Valid N.E. Compare r8 with r/m8.\r\n39 /r CMP r/m16, r16 MR Valid Valid Compare r16 with r/m16.\r\n39 /r CMP r/m32, r32 MR Valid Valid Compare r32 with r/m32.\r\nREX.W + 39 /r CMP r/m64,r64 MR Valid N.E. Compare r64 with r/m64.\r\n3A /r CMP r8, r/m8 RM Valid Valid Compare r/m8 with r8.\r\n * *\r\nREX + 3A /r CMP r8 , r/m8 RM Valid N.E. Compare r/m8 with r8.\r\n3B /r CMP r16, r/m16 RM Valid Valid Compare r/m16 with r16.\r\n3B /r CMP r32, r/m32 RM Valid Valid Compare r/m32 with r32.\r\nREX.W + 3B /r CMP r64, r/m64 RM Valid N.E. 
Compare r/m64 with r64.\r\nNOTES:\r\n* In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r) ModRM:r/m (r) NA NA\r\n MR ModRM:r/m (r) ModRM:reg (r) NA NA\r\n MI ModRM:r/m (r) imm8 NA NA\r\n I AL/AX/EAX/RAX (r) imm8 NA NA\r\n\r\nDescription\r\nCompares the first source operand with the second source operand and sets the status flags in the EFLAGS register\r\naccording to the results. The comparison is performed by subtracting the second operand from the first operand\r\nand then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as\r\nan operand, it is sign-extended to the length of the first operand.\r\nThe condition codes used by the Jcc, CMOVcc, and SETcc instructions are based on the results of a CMP instruction.\r\nAppendix B, \"EFLAGS Condition Codes,\" in the Intel 64 and IA-32 Architectures Software Developer's Manual,\r\nVolume 1, shows the relationship of the status flags and the condition codes.\r\n\r\n\r\n\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to addi-\r\ntional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. 
See the summary chart at the\r\nbeginning of this section for encoding data and limits.\r\n\r\nOperation\r\ntemp <- SRC1 - SignExtend(SRC2);\r\nModifyStatusFlags; (* Modify status flags in the same manner as the SUB instruction*)\r\n\r\nFlags Affected\r\nThe CF, OF, SF, ZF, AF, and PF flags are set according to the result.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMP"
},
{
"description": "CMPPD-Compare Packed Double-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F C2 /r ib RMI V/V SSE2 Compare packed double-precision floating-point values\r\n CMPPD xmm1, xmm2/m128, imm8 in xmm2/m128 and xmm1 using bits 2:0 of imm8 as a\r\n comparison predicate.\r\n VEX.NDS.128.66.0F.WIG C2 /r ib RVMI V/V AVX Compare packed double-precision floating-point values\r\n VCMPPD xmm1, xmm2, xmm3/m128, in xmm3/m128 and xmm2 using bits 4:0 of imm8 as a\r\n imm8 comparison predicate.\r\n VEX.NDS.256.66.0F.WIG C2 /r ib RVMI V/V AVX Compare packed double-precision floating-point values\r\n VCMPPD ymm1, ymm2, ymm3/m256, in ymm3/m256 and ymm2 using bits 4:0 of imm8 as a\r\n imm8 comparison predicate.\r\n EVEX.NDS.128.66.0F.W1 C2 /r ib FV V/V AVX512VL Compare packed double-precision floating-point values\r\n VCMPPD k1 {k2}, xmm2, AVX512F in xmm3/m128/m64bcst and xmm2 using bits 4:0 of\r\n xmm3/m128/m64bcst, imm8 imm8 as a comparison predicate with writemask k2\r\n and leave the result in mask register k1.\r\n EVEX.NDS.256.66.0F.W1 C2 /r ib FV V/V AVX512VL Compare packed double-precision floating-point values\r\n VCMPPD k1 {k2}, ymm2, AVX512F in ymm3/m256/m64bcst and ymm2 using bits 4:0 of\r\n ymm3/m256/m64bcst, imm8 imm8 as a comparison predicate with writemask k2\r\n and leave the result in mask register k1.\r\n EVEX.NDS.512.66.0F.W1 C2 /r ib FV V/V AVX512F Compare packed double-precision floating-point values\r\n VCMPPD k1 {k2}, zmm2, in zmm3/m512/m64bcst and zmm2 using bits 4:0 of\r\n zmm3/m512/m64bcst{sae}, imm8 imm8 as a comparison predicate with writemask k2\r\n and leave the result in mask register k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) Imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv ModRM:r/m (r) Imm8\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) Imm8\r\n\r\nDescription\r\nPerforms a 
SIMD compare of the packed double-precision floating-point values in the second source operand and\r\nthe first source operand and returns the results of the comparison to the destination operand. The comparison\r\npredicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in\r\nthe two source operands.\r\nEVEX encoded versions: The first source operand (second operand) is a ZMM/YMM/XMM register. The second\r\nsource operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector\r\nbroadcasted from a 64-bit memory location. The destination operand (first operand) is an opmask register.\r\nComparison results are written to the destination operand under the writemask k2. Each comparison result is a\r\nsingle mask bit of 1 (comparison true) or 0 (comparison false).\r\nVEX.256 encoded version: The first source operand (second operand) is a YMM register. The second source\r\noperand (third operand) can be a YMM register or a 256-bit memory location. The destination operand (first\r\noperand) is a YMM register. Four comparisons are performed with results written to the destination operand. The\r\nresult of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).\r\n128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The\r\nsecond source operand (second operand) can be an XMM register or 128-bit memory location. Bits (MAX_VL-\r\n1:128) of the corresponding ZMM destination register remain unchanged. Two comparisons are performed with\r\nresults written to bits 127:0 of the destination operand. The result of each comparison is a quadword mask of all\r\n1s (comparison true) or all 0s (comparison false).\r\n\r\n\r\n\r\n\r\n\r\nVEX.128 encoded version: The first source operand (second operand) is an XMM register. 
The second source\r\noperand (third operand) can be an XMM register or a 128-bit memory location. Bits (MAX_VL-1:128) of the desti-\r\nnation ZMM register are zeroed. Two comparisons are performed with results written to bits 127:0 of the destina-\r\ntion operand.\r\nThe comparison predicate operand is an 8-bit immediate:\r\n. For instructions encoded using the VEX or EVEX prefix, bits 4:0 define the type of comparison to be performed\r\n (see Table 3-1). Bits 5 through 7 of the immediate are reserved.\r\n. For instruction encodings that do not use VEX prefix, bits 2:0 define the type of comparison to be made (see the\r\n first 8 rows of Table 3-1). Bits 3 through 7 of the immediate are reserved.\r\n\r\n\r\n Table 3-1. Comparison Predicate for CMPPD and CMPPS Instructions\r\nPredicate imm8 Description Result: A Is 1st Operand, B Is 2nd Operand Signals\r\n Value #IA on\r\n A >B A<B A=B Unordered1 QNAN\r\n\r\nEQ_OQ (EQ) 0H Equal (ordered, non-signaling) False False True False No\r\nLT_OS (LT) 1H Less-than (ordered, signaling) False True False False Yes\r\nLE_OS (LE) 2H Less-than-or-equal (ordered, signaling) False True True False Yes\r\nUNORD_Q (UNORD) 3H Unordered (non-signaling) False False False True No\r\nNEQ_UQ (NEQ) 4H Not-equal (unordered, non-signaling) True True False True No\r\nNLT_US (NLT) 5H Not-less-than (unordered, signaling) True False True True Yes\r\nNLE_US (NLE) 6H Not-less-than-or-equal (unordered, signaling) True False False True Yes\r\nORD_Q (ORD) 7H Ordered (non-signaling) True True True False No\r\nEQ_UQ 8H Equal (unordered, non-signaling) False False True True No\r\nNGE_US (NGE) 9H Not-greater-than-or-equal (unordered, False True False True Yes\r\n signaling)\r\nNGT_US (NGT) AH Not-greater-than (unordered, signaling) False True True True Yes\r\nFALSE_OQ(FALSE) BH False (ordered, non-signaling) False False False False No\r\nNEQ_OQ CH Not-equal (ordered, non-signaling) True True False False No\r\nGE_OS (GE) DH Greater-than-or-equal (ordered, 
signaling) True False True False Yes\r\nGT_OS (GT) EH Greater-than (ordered, signaling) True False False False Yes\r\nTRUE_UQ(TRUE) FH True (unordered, non-signaling) True True True True No\r\nEQ_OS 10H Equal (ordered, signaling) False False True False Yes\r\nLT_OQ 11H Less-than (ordered, nonsignaling) False True False False No\r\nLE_OQ 12H Less-than-or-equal (ordered, nonsignaling) False True True False No\r\nUNORD_S 13H Unordered (signaling) False False False True Yes\r\nNEQ_US 14H Not-equal (unordered, signaling) True True False True Yes\r\nNLT_UQ 15H Not-less-than (unordered, nonsignaling) True False True True No\r\nNLE_UQ 16H Not-less-than-or-equal (unordered, nonsig- True False False True No\r\n naling)\r\nORD_S 17H Ordered (signaling) True True True False Yes\r\n\r\n\r\nEQ_US 18H Equal (unordered, signaling) False False True True Yes\r\nNGE_UQ 19H Not-greater-than-or-equal (unordered, non- False True False True No\r\n signaling)\r\n\r\n\r\n\r\n\r\n Table 3-1. Comparison Predicate for CMPPD and CMPPS Instructions (Contd.)\r\n Predicate imm8 Description Result: A Is 1st Operand, B Is 2nd Operand Signals\r\n Value #IA on\r\n A >B A<B A=B Unordered1 QNAN\r\n\r\n NGT_UQ 1AH Not-greater-than (unordered, nonsignaling) False True True True No\r\n FALSE_OS 1BH False (ordered, signaling) False False False False Yes\r\n NEQ_OS 1CH Not-equal (ordered, signaling) True True False False Yes\r\n GE_OQ 1DH Greater-than-or-equal (ordered, nonsignal- True False True False No\r\n ing)\r\n GT_OQ 1EH Greater-than (ordered, nonsignaling) True False False False No\r\n TRUE_US 1FH True (unordered, signaling) True True True True Yes\r\n\r\n\r\nNOTES:\r\n1. 
If either operand A or B is a NAN.\r\n\r\nThe unordered relationship is true when at least one of the two source operands being compared is a NaN; the\r\nordered relationship is true when neither source operand is a NaN.\r\nA subsequent computational instruction that uses the mask result in the destination operand as an input operand\r\nwill not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask\r\nof all 1s corresponds to a QNaN.\r\nNote that processors with \"CPUID.1H:ECX.AVX =0\" do not implement the \"greater-than\", \"greater-than-or-equal\",\r\n\"not-greater than\", and \"not-greater-than-or-equal relations\" predicates. These comparisons can be made either\r\nby using the inverse relationship (that is, use the \"not-less-than-or-equal\" to make a \"greater-than\" comparison)\r\nor by using software emulation. When using software emulation, the program must swap the operands (copying\r\nregisters when necessary to protect the data that will now be in the destination), and then perform the compare\r\nusing a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of Table 3-7\r\n(Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A) under the heading Emulation.\r\nCompilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand\r\nCMPPD instruction, for processors with \"CPUID.1H:ECX.AVX =0\". See Table 3-2. Compiler should treat reserved\r\nImm8 values as illegal syntax.\r\n Table 3-2. 
Pseudo-Op and CMPPD Implementation\r\n Pseudo-Op CMPPD Implementation\r\n CMPEQPD xmm1, xmm2 CMPPD xmm1, xmm2, 0\r\n CMPLTPD xmm1, xmm2 CMPPD xmm1, xmm2, 1\r\n CMPLEPD xmm1, xmm2 CMPPD xmm1, xmm2, 2\r\n CMPUNORDPD xmm1, xmm2 CMPPD xmm1, xmm2, 3\r\n CMPNEQPD xmm1, xmm2 CMPPD xmm1, xmm2, 4\r\n CMPNLTPD xmm1, xmm2 CMPPD xmm1, xmm2, 5\r\n CMPNLEPD xmm1, xmm2 CMPPD xmm1, xmm2, 6\r\n CMPORDPD xmm1, xmm2 CMPPD xmm1, xmm2, 7\r\n\r\nThe greater-than relations that the processor does not implement require more than one instruction to emulate in\r\nsoftware and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the\r\noperands of the corresponding less-than relations and use move instructions to ensure that the mask is moved to\r\nthe correct destination register and that the source operand is left intact.)\r\nProcessors with \"CPUID.1H:ECX.AVX =1\" implement the full complement of 32 predicates shown in Table 3-3, so soft-\r\nware emulation is no longer needed. Compilers and assemblers may implement the following three-operand\r\npseudo-ops in addition to the four-operand VCMPPD instruction. See Table 3-3, where the notations reg1, reg2,\r\nand reg3 represent either XMM registers or YMM registers. Compilers should treat reserved Imm8 values as illegal\r\nsyntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic inter-\r\nface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPPD instructions\r\nin a similar fashion by extending the syntax listed in Table 3-3.\r\n Table 3-3. 
Pseudo-Op and VCMPPD Implementation\r\n Pseudo-Op VCMPPD Implementation\r\n VCMPEQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0\r\n VCMPLTPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1\r\n VCMPLEPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 2\r\n VCMPUNORDPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 3\r\n VCMPNEQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 4\r\n VCMPNLTPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 5\r\n VCMPNLEPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 6\r\n VCMPORDPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 7\r\n VCMPEQ_UQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 8\r\n VCMPNGEPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 9\r\n VCMPNGTPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0AH\r\n VCMPFALSEPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0BH\r\n VCMPNEQ_OQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0CH\r\n VCMPGEPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0DH\r\n VCMPGTPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0EH\r\n VCMPTRUEPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 0FH\r\n VCMPEQ_OSPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 10H\r\n VCMPLT_OQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 11H\r\n VCMPLE_OQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 12H\r\n VCMPUNORD_SPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 13H\r\n VCMPNEQ_USPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 14H\r\n VCMPNLT_UQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 15H\r\n VCMPNLE_UQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 16H\r\n VCMPORD_SPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 17H\r\n VCMPEQ_USPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 18H\r\n VCMPNGE_UQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 19H\r\n VCMPNGT_UQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1AH\r\n VCMPFALSE_OSPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1BH\r\n VCMPNEQ_OSPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1CH\r\n VCMPGE_OQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1DH\r\n VCMPGT_OQPD reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1EH\r\n VCMPTRUE_USPD 
reg1, reg2, reg3 VCMPPD reg1, reg2, reg3, 1FH\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nCASE (COMPARISON PREDICATE) OF\r\n0: OP3 <- EQ_OQ; OP5 <- EQ_OQ;\r\n 1: OP3 <- LT_OS; OP5 <- LT_OS;\r\n 2: OP3 <- LE_OS; OP5 <- LE_OS;\r\n 3: OP3 <- UNORD_Q; OP5 <- UNORD_Q;\r\n 4: OP3 <- NEQ_UQ; OP5 <- NEQ_UQ;\r\n 5: OP3 <- NLT_US; OP5 <- NLT_US;\r\n 6: OP3 <- NLE_US; OP5 <- NLE_US;\r\n 7: OP3 <- ORD_Q; OP5 <- ORD_Q;\r\n 8: OP5 <- EQ_UQ;\r\n 9: OP5 <- NGE_US;\r\n 10: OP5 <- NGT_US;\r\n 11: OP5 <- FALSE_OQ;\r\n 12: OP5 <- NEQ_OQ;\r\n 13: OP5 <- GE_OS;\r\n 14: OP5 <- GT_OS;\r\n 15: OP5 <- TRUE_UQ;\r\n 16: OP5 <- EQ_OS;\r\n 17: OP5 <- LT_OQ;\r\n 18: OP5 <- LE_OQ;\r\n 19: OP5 <- UNORD_S;\r\n 20: OP5 <- NEQ_US;\r\n 21: OP5 <- NLT_UQ;\r\n 22: OP5 <- NLE_UQ;\r\n 23: OP5 <- ORD_S;\r\n 24: OP5 <- EQ_US;\r\n 25: OP5 <- NGE_UQ;\r\n 26: OP5 <- NGT_UQ;\r\n 27: OP5 <- FALSE_OS;\r\n 28: OP5 <- NEQ_OS;\r\n 29: OP5 <- GE_OQ;\r\n 30: OP5 <- GT_OQ;\r\n 31: OP5 <- TRUE_US;\r\n DEFAULT: Reserved;\r\nESAC;\r\n\r\n\r\n\r\n\r\n\r\nVCMPPD (EVEX encoded versions)\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n IF k2[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1) AND (SRC2 *is memory*)\r\n THEN\r\n CMP <- SRC1[i+63:i] OP5 SRC2[63:0]\r\n ELSE\r\n CMP <- SRC1[i+63:i] OP5 SRC2[i+63:i]\r\n FI;\r\n IF CMP = TRUE\r\n THEN DEST[j] <- 1;\r\n ELSE DEST[j] <- 0; FI;\r\n ELSE DEST[j] <- 0 ; zeroing-masking only\r\n FI;\r\nENDFOR\r\nDEST[MAX_KL-1:KL] <- 0\r\n\r\nVCMPPD (VEX.256 encoded version)\r\nCMP0 <- SRC1[63:0] OP5 SRC2[63:0];\r\nCMP1 <- SRC1[127:64] OP5 SRC2[127:64];\r\nCMP2 <- SRC1[191:128] OP5 SRC2[191:128];\r\nCMP3 <- SRC1[255:192] OP5 SRC2[255:192];\r\nIF CMP0 = TRUE\r\n THEN DEST[63:0] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[63:0] <- 0000000000000000H; FI;\r\nIF CMP1 = TRUE\r\n THEN DEST[127:64] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[127:64] <- 0000000000000000H; FI;\r\nIF CMP2 = TRUE\r\n THEN DEST[191:128] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[191:128] <- 0000000000000000H; 
FI;\r\nIF CMP3 = TRUE\r\n THEN DEST[255:192] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[255:192] <- 0000000000000000H; FI;\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVCMPPD (VEX.128 encoded version)\r\nCMP0 <- SRC1[63:0] OP5 SRC2[63:0];\r\nCMP1 <- SRC1[127:64] OP5 SRC2[127:64];\r\nIF CMP0 = TRUE\r\n THEN DEST[63:0] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[63:0] <- 0000000000000000H; FI;\r\nIF CMP1 = TRUE\r\n THEN DEST[127:64] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[127:64] <- 0000000000000000H; FI;\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\n\r\n\r\n\r\n\r\nCMPPD (128-bit Legacy SSE version)\r\nCMP0 <- SRC1[63:0] OP3 SRC2[63:0];\r\nCMP1 <- SRC1[127:64] OP3 SRC2[127:64];\r\nIF CMP0 = TRUE\r\n THEN DEST[63:0] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[63:0] <- 0000000000000000H; FI;\r\nIF CMP1 = TRUE\r\n THEN DEST[127:64] <- FFFFFFFFFFFFFFFFH;\r\n ELSE DEST[127:64] <- 0000000000000000H; FI;\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCMPPD __mmask8 _mm512_cmp_pd_mask( __m512d a, __m512d b, int imm);\r\nVCMPPD __mmask8 _mm512_cmp_round_pd_mask( __m512d a, __m512d b, int imm, int sae);\r\nVCMPPD __mmask8 _mm512_mask_cmp_pd_mask( __mmask8 k1, __m512d a, __m512d b, int imm);\r\nVCMPPD __mmask8 _mm512_mask_cmp_round_pd_mask( __mmask8 k1, __m512d a, __m512d b, int imm, int sae);\r\nVCMPPD __mmask8 _mm256_cmp_pd_mask( __m256d a, __m256d b, int imm);\r\nVCMPPD __mmask8 _mm256_mask_cmp_pd_mask( __mmask8 k1, __m256d a, __m256d b, int imm);\r\nVCMPPD __mmask8 _mm_cmp_pd_mask( __m128d a, __m128d b, int imm);\r\nVCMPPD __mmask8 _mm_mask_cmp_pd_mask( __mmask8 k1, __m128d a, __m128d b, int imm);\r\nVCMPPD __m256d _mm256_cmp_pd(__m256d a, __m256d b, int imm);\r\n(V)CMPPD __m128d _mm_cmp_pd(__m128d a, __m128d b, int imm);\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid if SNaN operand and invalid if QNaN and predicate as listed in Table 3-1.\r\nDenormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2.\r\nEVEX-encoded instructions, see Exceptions Type 
E2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPPD"
},
{
"description": "CMPPS-Compare Packed Single-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F C2 /r ib RMI V/V SSE Compare packed single-precision floating-point values in\r\n CMPPS xmm1, xmm2/m128, xmm2/m128 and xmm1 using bits 2:0 of imm8 as a\r\n imm8 comparison predicate.\r\n VEX.NDS.128.0F.WIG C2 /r ib RVMI V/V AVX Compare packed single-precision floating-point values in\r\n VCMPPS xmm1, xmm2, xmm3/m128 and xmm2 using bits 4:0 of imm8 as a\r\n xmm3/m128, imm8 comparison predicate.\r\n VEX.NDS.256.0F.WIG C2 /r ib RVMI V/V AVX Compare packed single-precision floating-point values in\r\n VCMPPS ymm1, ymm2, ymm3/m256 and ymm2 using bits 4:0 of imm8 as a\r\n ymm3/m256, imm8 comparison predicate.\r\n EVEX.NDS.128.0F.W0 C2 /r ib FV V/V AVX512VL Compare packed single-precision floating-point values in\r\n VCMPPS k1 {k2}, xmm2, AVX512F xmm3/m128/m32bcst and xmm2 using bits 4:0 of imm8 as\r\n xmm3/m128/m32bcst, imm8 a comparison predicate with writemask k2 and leave the\r\n result in mask register k1.\r\n EVEX.NDS.256.0F.W0 C2 /r ib FV V/V AVX512VL Compare packed single-precision floating-point values in\r\n VCMPPS k1 {k2}, ymm2, AVX512F ymm3/m256/m32bcst and ymm2 using bits 4:0 of imm8 as\r\n ymm3/m256/m32bcst, imm8 a comparison predicate with writemask k2 and leave the\r\n result in mask register k1.\r\n EVEX.NDS.512.0F.W0 C2 /r ib FV V/V AVX512F Compare packed single-precision floating-point values in\r\n VCMPPS k1 {k2}, zmm2, zmm3/m512/m32bcst and zmm2 using bits 4:0 of imm8 as\r\n zmm3/m512/m32bcst{sae}, imm8 a comparison predicate with writemask k2 and leave the\r\n result in mask register k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) Imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv ModRM:r/m (r) Imm8\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) Imm8\r\n\r\nDescription\r\nPerforms a SIMD compare of 
the packed single-precision floating-point values in the second source operand and\r\nthe first source operand and returns the results of the comparison to the destination operand. The comparison\r\npredicate operand (immediate byte) specifies the type of comparison performed on each of the pairs of packed\r\nvalues.\r\nEVEX encoded versions: The first source operand (second operand) is a ZMM/YMM/XMM register. The second\r\nsource operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector\r\nbroadcasted from a 32-bit memory location. The destination operand (first operand) is an opmask register.\r\nComparison results are written to the destination operand under the writemask k2. Each comparison result is a\r\nsingle mask bit of 1 (comparison true) or 0 (comparison false).\r\nVEX.256 encoded version: The first source operand (second operand) is a YMM register. The second source operand\r\n(third operand) can be a YMM register or a 256-bit memory location. The destination operand (first operand) is a\r\nYMM register. Eight comparisons are performed with results written to the destination operand. The result of each\r\ncomparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).\r\n128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The\r\nsecond source operand (second operand) can be an XMM register or 128-bit memory location. Bits (MAX_VL-\r\n1:128) of the corresponding ZMM destination register remain unchanged. Four comparisons are performed with\r\nresults written to bits 127:0 of the destination operand. The result of each comparison is a doubleword mask of all\r\n1s (comparison true) or all 0s (comparison false).\r\n\r\n\r\n\r\n\r\n\r\nVEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source\r\noperand (third operand) can be an XMM register or a 128-bit memory location. 
Bits (MAX_VL-1:128) of the desti-\r\nnation ZMM register are zeroed. Four comparisons are performed with results written to bits 127:0 of the destina-\r\ntion operand.\r\nThe comparison predicate operand is an 8-bit immediate:\r\n. For instructions encoded using the VEX prefix and EVEX prefix, bits 4:0 define the type of comparison to be\r\n performed (see Table 3-1). Bits 5 through 7 of the immediate are reserved.\r\n. For instruction encodings that do not use VEX prefix, bits 2:0 define the type of comparison to be made (see\r\n the first 8 rows of Table 3-1). Bits 3 through 7 of the immediate are reserved.\r\nThe unordered relationship is true when at least one of the two source operands being compared is a NaN; the\r\nordered relationship is true when neither source operand is a NaN.\r\nA subsequent computational instruction that uses the mask result in the destination operand as an input operand\r\nwill not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask\r\nof all 1s corresponds to a QNaN.\r\nNote that processors with \"CPUID.1H:ECX.AVX =0\" do not implement the \"greater-than\", \"greater-than-or-equal\",\r\n\"not-greater than\", and \"not-greater-than-or-equal relations\" predicates. These comparisons can be made either\r\nby using the inverse relationship (that is, use the \"not-less-than-or-equal\" to make a \"greater-than\" comparison)\r\nor by using software emulation. When using software emulation, the program must swap the operands (copying\r\nregisters when necessary to protect the data that will now be in the destination), and then perform the compare\r\nusing a different predicate. 
The predicate to be used for these emulations is listed in the first 8 rows of Table 3-7\r\n(Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A) under the heading Emulation.\r\nCompilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand\r\nCMPPS instruction, for processors with \"CPUID.1H:ECX.AVX =0\". See Table 3-4. Compilers should treat reserved\r\nImm8 values as illegal syntax.\r\n Table 3-4. Pseudo-Op and CMPPS Implementation\r\n Pseudo-Op CMPPS Implementation\r\n CMPEQPS xmm1, xmm2 CMPPS xmm1, xmm2, 0\r\n CMPLTPS xmm1, xmm2 CMPPS xmm1, xmm2, 1\r\n CMPLEPS xmm1, xmm2 CMPPS xmm1, xmm2, 2\r\n CMPUNORDPS xmm1, xmm2 CMPPS xmm1, xmm2, 3\r\n CMPNEQPS xmm1, xmm2 CMPPS xmm1, xmm2, 4\r\n CMPNLTPS xmm1, xmm2 CMPPS xmm1, xmm2, 5\r\n CMPNLEPS xmm1, xmm2 CMPPS xmm1, xmm2, 6\r\n CMPORDPS xmm1, xmm2 CMPPS xmm1, xmm2, 7\r\n\r\nThe greater-than relations that the processor does not implement require more than one instruction to emulate in\r\nsoftware and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the\r\noperands of the corresponding less-than relations and use move instructions to ensure that the mask is moved to\r\nthe correct destination register and that the source operand is left intact.)\r\nProcessors with \"CPUID.1H:ECX.AVX =1\" implement the full complement of 32 predicates shown in Table 3-5, so soft-\r\nware emulation is no longer needed. Compilers and assemblers may implement the following three-operand\r\npseudo-ops in addition to the four-operand VCMPPS instruction. See Table 3-5, where the notations reg1, reg2, and\r\nreg3 represent either XMM registers or YMM registers. Compilers should treat reserved Imm8 values as illegal\r\nsyntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic inter-\r\nface. 
Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPPS instructions\r\nin a similar fashion by extending the syntax listed in Table 3-5.\r\n\r\n Table 3-5. Pseudo-Op and VCMPPS Implementation\r\n Pseudo-Op VCMPPS Implementation\r\n VCMPEQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0\r\n VCMPLTPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1\r\n VCMPLEPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 2\r\n VCMPUNORDPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 3\r\n VCMPNEQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 4\r\n VCMPNLTPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 5\r\n VCMPNLEPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 6\r\n VCMPORDPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 7\r\n VCMPEQ_UQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 8\r\n VCMPNGEPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 9\r\n VCMPNGTPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0AH\r\n VCMPFALSEPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0BH\r\n VCMPNEQ_OQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0CH\r\n VCMPGEPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0DH\r\n VCMPGTPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0EH\r\n VCMPTRUEPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 0FH\r\n VCMPEQ_OSPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 10H\r\n VCMPLT_OQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 11H\r\n VCMPLE_OQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 12H\r\n VCMPUNORD_SPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 13H\r\n VCMPNEQ_USPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 14H\r\n VCMPNLT_UQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 15H\r\n VCMPNLE_UQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 16H\r\n VCMPORD_SPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 17H\r\n VCMPEQ_USPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 18H\r\n VCMPNGE_UQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 19H\r\n VCMPNGT_UQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1AH\r\n VCMPFALSE_OSPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1BH\r\n VCMPNEQ_OSPS 
reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1CH\r\n VCMPGE_OQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1DH\r\n VCMPGT_OQPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1EH\r\n VCMPTRUE_USPS reg1, reg2, reg3 VCMPPS reg1, reg2, reg3, 1FH\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nCASE (COMPARISON PREDICATE) OF\r\n 0: OP3 <- EQ_OQ; OP5 <- EQ_OQ;\r\n 1: OP3 <- LT_OS; OP5 <- LT_OS;\r\n 2: OP3 <- LE_OS; OP5 <- LE_OS;\r\n 3: OP3 <- UNORD_Q; OP5 <- UNORD_Q;\r\n 4: OP3 <- NEQ_UQ; OP5 <- NEQ_UQ;\r\n 5: OP3 <- NLT_US; OP5 <- NLT_US;\r\n 6: OP3 <- NLE_US; OP5 <- NLE_US;\r\n 7: OP3 <- ORD_Q; OP5 <- ORD_Q;\r\n 8: OP5 <- EQ_UQ;\r\n 9: OP5 <- NGE_US;\r\n 10: OP5 <- NGT_US;\r\n 11: OP5 <- FALSE_OQ;\r\n 12: OP5 <- NEQ_OQ;\r\n 13: OP5 <- GE_OS;\r\n 14: OP5 <- GT_OS;\r\n 15: OP5 <- TRUE_UQ;\r\n 16: OP5 <- EQ_OS;\r\n 17: OP5 <- LT_OQ;\r\n 18: OP5 <- LE_OQ;\r\n 19: OP5 <- UNORD_S;\r\n 20: OP5 <- NEQ_US;\r\n 21: OP5 <- NLT_UQ;\r\n 22: OP5 <- NLE_UQ;\r\n 23: OP5 <- ORD_S;\r\n 24: OP5 <- EQ_US;\r\n 25: OP5 <- NGE_UQ;\r\n 26: OP5 <- NGT_UQ;\r\n 27: OP5 <- FALSE_OS;\r\n 28: OP5 <- NEQ_OS;\r\n 29: OP5 <- GE_OQ;\r\n 30: OP5 <- GT_OQ;\r\n 31: OP5 <- TRUE_US;\r\n DEFAULT: Reserved;\r\nESAC;\r\n\r\n\r\n\r\n\r\n\r\nVCMPPS (EVEX encoded versions)\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k2[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1) AND (SRC2 *is memory*)\r\n THEN\r\n CMP <- SRC1[i+31:i] OP5 SRC2[31:0]\r\n ELSE\r\n CMP <- SRC1[i+31:i] OP5 SRC2[i+31:i]\r\n FI;\r\n IF CMP = TRUE\r\n THEN DEST[j] <- 1;\r\n ELSE DEST[j] <- 0; FI;\r\n ELSE DEST[j] <- 0 ; zeroing-masking only\r\n FI;\r\nENDFOR\r\nDEST[MAX_KL-1:KL] <- 0\r\n\r\nVCMPPS (VEX.256 encoded version)\r\nCMP0 <- SRC1[31:0] OP5 SRC2[31:0];\r\nCMP1 <- SRC1[63:32] OP5 SRC2[63:32];\r\nCMP2 <- SRC1[95:64] OP5 SRC2[95:64];\r\nCMP3 <- SRC1[127:96] OP5 SRC2[127:96];\r\nCMP4 <- SRC1[159:128] OP5 SRC2[159:128];\r\nCMP5 <- SRC1[191:160] OP5 SRC2[191:160];\r\nCMP6 <- SRC1[223:192] OP5 SRC2[223:192];\r\nCMP7 <- 
SRC1[255:224] OP5 SRC2[255:224];\r\nIF CMP0 = TRUE\r\n THEN DEST[31:0] <-FFFFFFFFH;\r\n ELSE DEST[31:0] <- 000000000H; FI;\r\nIF CMP1 = TRUE\r\n THEN DEST[63:32] <- FFFFFFFFH;\r\n ELSE DEST[63:32] <-000000000H; FI;\r\nIF CMP2 = TRUE\r\n THEN DEST[95:64] <- FFFFFFFFH;\r\n ELSE DEST[95:64] <- 000000000H; FI;\r\nIF CMP3 = TRUE\r\n THEN DEST[127:96] <- FFFFFFFFH;\r\n ELSE DEST[127:96] <- 000000000H; FI;\r\nIF CMP4 = TRUE\r\n THEN DEST[159:128] <- FFFFFFFFH;\r\n ELSE DEST[159:128] <- 000000000H; FI;\r\nIF CMP5 = TRUE\r\n THEN DEST[191:160] <- FFFFFFFFH;\r\n ELSE DEST[191:160] <- 000000000H; FI;\r\nIF CMP6 = TRUE\r\n THEN DEST[223:192] <- FFFFFFFFH;\r\n ELSE DEST[223:192] <-000000000H; FI;\r\nIF CMP7 = TRUE\r\n THEN DEST[255:224] <- FFFFFFFFH;\r\n ELSE DEST[255:224] <- 000000000H; FI;\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\n\r\n\r\nVCMPPS (VEX.128 encoded version)\r\nCMP0 <- SRC1[31:0] OP5 SRC2[31:0];\r\nCMP1 <- SRC1[63:32] OP5 SRC2[63:32];\r\nCMP2 <- SRC1[95:64] OP5 SRC2[95:64];\r\nCMP3 <- SRC1[127:96] OP5 SRC2[127:96];\r\nIF CMP0 = TRUE\r\n THEN DEST[31:0] <-FFFFFFFFH;\r\n ELSE DEST[31:0] <- 000000000H; FI;\r\nIF CMP1 = TRUE\r\n THEN DEST[63:32] <- FFFFFFFFH;\r\n ELSE DEST[63:32] <- 000000000H; FI;\r\nIF CMP2 = TRUE\r\n THEN DEST[95:64] <- FFFFFFFFH;\r\n ELSE DEST[95:64] <- 000000000H; FI;\r\nIF CMP3 = TRUE\r\n THEN DEST[127:96] <- FFFFFFFFH;\r\n ELSE DEST[127:96] <-000000000H; FI;\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nCMPPS (128-bit Legacy SSE version)\r\nCMP0 <- SRC1[31:0] OP3 SRC2[31:0];\r\nCMP1 <- SRC1[63:32] OP3 SRC2[63:32];\r\nCMP2 <- SRC1[95:64] OP3 SRC2[95:64];\r\nCMP3 <- SRC1[127:96] OP3 SRC2[127:96];\r\nIF CMP0 = TRUE\r\n THEN DEST[31:0] <-FFFFFFFFH;\r\n ELSE DEST[31:0] <- 000000000H; FI;\r\nIF CMP1 = TRUE\r\n THEN DEST[63:32] <- FFFFFFFFH;\r\n ELSE DEST[63:32] <- 000000000H; FI;\r\nIF CMP2 = TRUE\r\n THEN DEST[95:64] <- FFFFFFFFH;\r\n ELSE DEST[95:64] <- 000000000H; FI;\r\nIF CMP3 = TRUE\r\n THEN DEST[127:96] <- FFFFFFFFH;\r\n ELSE DEST[127:96] <-000000000H; 
FI;\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCMPPS __mmask16 _mm512_cmp_ps_mask( __m512 a, __m512 b, int imm);\r\nVCMPPS __mmask16 _mm512_cmp_round_ps_mask( __m512 a, __m512 b, int imm, int sae);\r\nVCMPPS __mmask16 _mm512_mask_cmp_ps_mask( __mmask16 k1, __m512 a, __m512 b, int imm);\r\nVCMPPS __mmask16 _mm512_mask_cmp_round_ps_mask( __mmask16 k1, __m512 a, __m512 b, int imm, int sae);\r\nVCMPPS __mmask8 _mm256_cmp_ps_mask( __m256 a, __m256 b, int imm);\r\nVCMPPS __mmask8 _mm256_mask_cmp_ps_mask( __mmask8 k1, __m256 a, __m256 b, int imm);\r\nVCMPPS __mmask8 _mm_cmp_ps_mask( __m128 a, __m128 b, int imm);\r\nVCMPPS __mmask8 _mm_mask_cmp_ps_mask( __mmask8 k1, __m128 a, __m128 b, int imm);\r\nVCMPPS __m256 _mm256_cmp_ps(__m256 a, __m256 b, int imm)\r\nCMPPS __m128 _mm_cmp_ps(__m128 a, __m128 b, int imm)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid if SNaN operand, Invalid if QNaN and predicate as listed in Table 3-1, Denormal.\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2.\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPPS"
},
{
"description": "CMPS/CMPSB/CMPSW/CMPSD/CMPSQ-Compare String Operands\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n A6 CMPS m8, m8 NP Valid Valid For legacy mode, compare byte at address DS:(E)SI with\r\n byte at address ES:(E)DI; For 64-bit mode compare byte\r\n at address (R|E)SI to byte at address (R|E)DI. The status\r\n flags are set accordingly.\r\n A7 CMPS m16, m16 NP Valid Valid For legacy mode, compare word at address DS:(E)SI\r\n with word at address ES:(E)DI; For 64-bit mode\r\n compare word at address (R|E)SI with word at address\r\n (R|E)DI. The status flags are set accordingly.\r\n A7 CMPS m32, m32 NP Valid Valid For legacy mode, compare dword at address DS:(E)SI at\r\n dword at address ES:(E)DI; For 64-bit mode compare\r\n dword at address (R|E)SI at dword at address (R|E)DI.\r\n The status flags are set accordingly.\r\n REX.W + A7 CMPS m64, m64 NP Valid N.E. Compares quadword at address (R|E)SI with quadword\r\n at address (R|E)DI and sets the status flags accordingly.\r\n A6 CMPSB NP Valid Valid For legacy mode, compare byte at address DS:(E)SI with\r\n byte at address ES:(E)DI; For 64-bit mode compare byte\r\n at address (R|E)SI with byte at address (R|E)DI. The\r\n status flags are set accordingly.\r\n A7 CMPSW NP Valid Valid For legacy mode, compare word at address DS:(E)SI\r\n with word at address ES:(E)DI; For 64-bit mode\r\n compare word at address (R|E)SI with word at address\r\n (R|E)DI. The status flags are set accordingly.\r\n A7 CMPSD NP Valid Valid For legacy mode, compare dword at address DS:(E)SI\r\n with dword at address ES:(E)DI; For 64-bit mode\r\n compare dword at address (R|E)SI with dword at\r\n address (R|E)DI. The status flags are set accordingly.\r\n REX.W + A7 CMPSQ NP Valid N.E. 
Compares quadword at address (R|E)SI with quadword\r\n at address (R|E)DI and sets the status flags accordingly.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nCompares the byte, word, doubleword, or quadword specified with the first source operand with the byte, word,\r\ndoubleword, or quadword specified with the second source operand and sets the status flags in the EFLAGS register\r\naccording to the results.\r\nBoth source operands are located in memory. The address of the first source operand is read from DS:SI, DS:ESI\r\nor RSI (depending on whether the address-size attribute of the instruction is 16, 32, or 64, respectively). The address of the\r\nsecond source operand is read from ES:DI, ES:EDI or RDI (again depending on whether the address-size attribute of the\r\ninstruction is 16, 32, or 64). The DS segment may be overridden with a segment override prefix, but the ES\r\nsegment cannot be overridden.\r\nAt the assembly-code level, two forms of this instruction are allowed: the \"explicit-operands\" form and the \"no-\r\noperands\" form. The explicit-operands form (specified with the CMPS mnemonic) allows the two source operands\r\nto be specified explicitly. Here, the source operands should be symbols that indicate the size and location of the\r\nsource values. This explicit-operand form is provided to allow documentation. However, note that the documenta-\r\ntion provided by this form can be misleading. That is, the source operand symbols must specify the correct type\r\n(size) of the operands (bytes, words, doublewords, or quadwords), but they do not have to specify the correct loca-\r\ntion. 
Locations of the source operands are always specified by the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) regis-\r\nters, which must be loaded correctly before the compare string instruction is executed.\r\nThe no-operands form provides \"short forms\" of the byte, word, and doubleword versions of the CMPS instructions.\r\nHere also the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) registers are assumed by the processor to specify the loca-\r\ntion of the source operands. The size of the source operands is selected with the mnemonic: CMPSB (byte compar-\r\nison), CMPSW (word comparison), CMPSD (doubleword comparison), or CMPSQ (quadword comparison using\r\nREX.W).\r\nAfter the comparison, the (E/R)SI and (E/R)DI registers increment or decrement automatically according to the\r\nsetting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E/R)SI and (E/R)DI registers increment; if the\r\nDF flag is 1, the registers decrement.) The registers increment or decrement by 1 for byte operations, by 2 for word\r\noperations, and by 4 for doubleword operations. If the operand size is 64, the RSI and RDI registers increment or decrement by 8 for quadword\r\noperations.\r\nThe CMPS, CMPSB, CMPSW, CMPSD, and CMPSQ instructions can be preceded by the REP prefix for block compar-\r\nisons. More often, however, these instructions will be used in a LOOP construct that takes some action based on the\r\nsetting of the status flags before the next comparison is made. See \"REP/REPE/REPZ/REPNE/REPNZ-Repeat\r\nString Operation Prefix\" in Chapter 4 of the Intel 64 and IA-32 Architectures Software Developer's Manual,\r\nVolume 2B, for a description of the REP prefix.\r\nIn 64-bit mode, the instruction's default address size is 64 bits; a 32-bit address size is supported using the prefix\r\n67H. Use of the REX.W prefix promotes doubleword operation to 64 bits (see CMPSQ). 
See the summary chart at\r\nthe beginning of this section for encoding data and limits.\r\n\r\nOperation\r\ntemp <- SRC1 - SRC2;\r\nSetStatusFlags(temp);\r\n\r\nIF (64-Bit Mode)\r\n THEN\r\n IF (Byte comparison)\r\n THEN IF DF = 0\r\n THEN\r\n (R|E)SI <- (R|E)SI + 1;\r\n (R|E)DI <- (R|E)DI + 1;\r\n ELSE\r\n (R|E)SI <- (R|E)SI - 1;\r\n (R|E)DI <- (R|E)DI - 1;\r\n FI;\r\n ELSE IF (Word comparison)\r\n THEN IF DF = 0\r\n THEN\r\n (R|E)SI <- (R|E)SI + 2;\r\n (R|E)DI <- (R|E)DI + 2;\r\n ELSE\r\n (R|E)SI <- (R|E)SI - 2;\r\n (R|E)DI <- (R|E)DI - 2;\r\n FI;\r\n ELSE IF (Doubleword comparison)\r\n THEN IF DF = 0\r\n THEN\r\n (R|E)SI <- (R|E)SI + 4;\r\n (R|E)DI <- (R|E)DI + 4;\r\n ELSE\r\n (R|E)SI <- (R|E)SI - 4;\r\n (R|E)DI <- (R|E)DI - 4;\r\n FI;\r\n ELSE (* Quadword comparison *)\r\n IF DF = 0\r\n THEN\r\n (R|E)SI <- (R|E)SI + 8;\r\n (R|E)DI <- (R|E)DI + 8;\r\n ELSE\r\n (R|E)SI <- (R|E)SI - 8;\r\n (R|E)DI <- (R|E)DI - 8;\r\n FI;\r\n FI;\r\n ELSE (* Non-64-bit Mode *)\r\n IF (Byte comparison)\r\n THEN IF DF = 0\r\n THEN\r\n (E)SI <- (E)SI + 1;\r\n (E)DI <- (E)DI + 1;\r\n ELSE\r\n (E)SI <- (E)SI - 1;\r\n (E)DI <- (E)DI - 1;\r\n FI;\r\n ELSE IF (Word comparison)\r\n THEN IF DF = 0\r\n THEN\r\n (E)SI <- (E)SI + 2;\r\n (E)DI <- (E)DI + 2;\r\n ELSE\r\n (E)SI <- (E)SI - 2;\r\n (E)DI <- (E)DI - 2;\r\n FI;\r\n ELSE (* Doubleword comparison *)\r\n IF DF = 0\r\n THEN\r\n (E)SI <- (E)SI + 4;\r\n (E)DI <- (E)DI + 4;\r\n ELSE\r\n (E)SI <- (E)SI - 4;\r\n (E)DI <- (E)DI - 4;\r\n FI;\r\n FI;\r\nFI;\r\n\r\nFlags Affected\r\nThe CF, OF, SF, ZF, AF, and PF flags are set according to the temporary result of the comparison.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an 
unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPS"
},
{
"description": "-R:CMPS",
"mnem": "CMPSB"
},
{
"description": "-R:CMPS",
"mnem": "CMPSD"
},
{
"description": "CMPSD-Compare Scalar Double-Precision Floating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F C2 /r ib RMI V/V SSE2 Compare low double-precision floating-point value in\r\n CMPSD xmm1, xmm2/m64, imm8 xmm2/m64 and xmm1 using bits 2:0 of imm8 as comparison\r\n predicate.\r\n VEX.NDS.128.F2.0F.WIG C2 /r ib RVMI V/V AVX Compare low double-precision floating-point value in\r\n VCMPSD xmm1, xmm2, xmm3/m64 and xmm2 using bits 4:0 of imm8 as comparison\r\n xmm3/m64, imm8 predicate.\r\n EVEX.NDS.LIG.F2.0F.W1 C2 /r ib T1S V/V AVX512F Compare low double-precision floating-point value in\r\n VCMPSD k1 {k2}, xmm2, xmm3/m64 and xmm2 using bits 4:0 of imm8 as comparison\r\n xmm3/m64{sae}, imm8 predicate with writemask k2 and leave the result in mask\r\n register k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) Imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv ModRM:r/m (r) Imm8\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) Imm8\r\n\r\nDescription\r\nCompares the low double-precision floating-point values in the second source operand and the first source operand\r\nand returns the results in of the comparison to the destination operand. The comparison predicate operand (imme-\r\ndiate operand) specifies the type of comparison performed.\r\n128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The\r\nsecond source operand (second operand) can be an XMM register or 64-bit memory location. Bits (MAX_VL-1:64)\r\nof the corresponding YMM destination register remain unchanged. The comparison result is a quadword mask of all\r\n1s (comparison true) or all 0s (comparison false).\r\nVEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source\r\noperand (third operand) can be an XMM register or a 64-bit memory location. 
The result is stored in the low quad-\r\nword of the destination operand; the high quadword is filled with the contents of the high quadword of the first\r\nsource operand. Bits (MAX_VL-1:128) of the destination ZMM register are zeroed. The comparison result is a quad-\r\nword mask of all 1s (comparison true) or all 0s (comparison false).\r\nEVEX encoded version: The first source operand (second operand) is an XMM register. The second source operand\r\ncan be an XMM register or a 64-bit memory location. The destination operand (first operand) is an opmask register.\r\nThe comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false), written to the destination\r\nstarting from the LSB according to the writemask k2. Bits (MAX_KL-1:1) of the destination register are cleared.\r\nThe comparison predicate operand is an 8-bit immediate:\r\n. For instructions encoded using the VEX prefix, bits 4:0 define the type of comparison to be performed (see\r\n Table 3-1). Bits 5 through 7 of the immediate are reserved.\r\n. For instruction encodings that do not use the VEX prefix, bits 2:0 define the type of comparison to be made (see\r\n the first 8 rows of Table 3-1). Bits 3 through 7 of the immediate are reserved.\r\nThe unordered relationship is true when at least one of the two source operands being compared is a NaN; the\r\nordered relationship is true when neither source operand is a NaN.\r\nA subsequent computational instruction that uses the mask result in the destination operand as an input operand\r\nwill not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask\r\nof all 1s corresponds to a QNaN.\r\nNote that processors with \"CPUID.1H:ECX.AVX =0\" do not implement the \"greater-than\", \"greater-than-or-equal\",\r\n\"not-greater-than\", and \"not-greater-than-or-equal\" relation predicates. 
These comparisons can be made either\r\nby using the inverse relationship (that is, use the \"not-less-than-or-equal\" to make a \"greater-than\" comparison)\r\n\r\n\r\n\r\nor by using software emulation. When using software emulation, the program must swap the operands (copying\r\nregisters when necessary to protect the data that will now be in the destination), and then perform the compare\r\nusing a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of Table 3-7\r\n(Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A) under the heading Emulation.\r\nCompilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand\r\nCMPSD instruction, for processors with \"CPUID.1H:ECX.AVX =0\". See Table 3-6. Compiler should treat reserved\r\nImm8 values as illegal syntax.\r\n Table 3-6. Pseudo-Op and CMPSD Implementation\r\n:\r\n\r\n\r\n\r\n\r\n Pseudo-Op CMPSD Implementation\r\n CMPEQSD xmm1, xmm2 CMPSD xmm1, xmm2, 0\r\n CMPLTSD xmm1, xmm2 CMPSD xmm1, xmm2, 1\r\n CMPLESD xmm1, xmm2 CMPSD xmm1, xmm2, 2\r\n CMPUNORDSD xmm1, xmm2 CMPSD xmm1, xmm2, 3\r\n CMPNEQSD xmm1, xmm2 CMPSD xmm1, xmm2, 4\r\n CMPNLTSD xmm1, xmm2 CMPSD xmm1, xmm2, 5\r\n CMPNLESD xmm1, xmm2 CMPSD xmm1, xmm2, 6\r\n CMPORDSD xmm1, xmm2 CMPSD xmm1, xmm2, 7\r\n\r\nThe greater-than relations that the processor does not implement require more than one instruction to emulate in\r\nsoftware and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the\r\noperands of the corresponding less than relations and use move instructions to ensure that the mask is moved to\r\nthe correct destination register and that the source operand is left intact.)\r\nProcessors with \"CPUID.1H:ECX.AVX =1\" implement the full complement of 32 predicates shown in Table 3-7, soft-\r\nware emulation is no longer needed. 
Compilers and assemblers may implement the following three-operand\r\npseudo-ops in addition to the four-operand VCMPSD instruction. See Table 3-7, where the notations of reg1 reg2,\r\nand reg3 represent either XMM registers or YMM registers. Compiler should treat reserved Imm8 values as illegal\r\nsyntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic inter-\r\nface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPSD instructions\r\nin a similar fashion by extending the syntax listed in Table 3-7.\r\n Table 3-7. Pseudo-Op and VCMPSD Implementation\r\n:\r\n\r\n\r\n\r\n\r\n Pseudo-Op CMPSD Implementation\r\n VCMPEQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0\r\n VCMPLTSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1\r\n VCMPLESD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 2\r\n VCMPUNORDSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 3\r\n VCMPNEQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 4\r\n VCMPNLTSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 5\r\n VCMPNLESD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 6\r\n VCMPORDSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 7\r\n VCMPEQ_UQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 8\r\n VCMPNGESD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 9\r\n VCMPNGTSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0AH\r\n VCMPFALSESD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0BH\r\n VCMPNEQ_OQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0CH\r\n VCMPGESD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0DH\r\n\r\n\r\n\r\n\r\n Table 3-7. 
Pseudo-Op and VCMPSD Implementation\r\n Pseudo-Op CMPSD Implementation\r\n VCMPGTSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0EH\r\n VCMPTRUESD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 0FH\r\n VCMPEQ_OSSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 10H\r\n VCMPLT_OQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 11H\r\n VCMPLE_OQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 12H\r\n VCMPUNORD_SSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 13H\r\n VCMPNEQ_USSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 14H\r\n VCMPNLT_UQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 15H\r\n VCMPNLE_UQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 16H\r\n VCMPORD_SSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 17H\r\n VCMPEQ_USSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 18H\r\n VCMPNGE_UQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 19H\r\n VCMPNGT_UQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1AH\r\n VCMPFALSE_OSSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1BH\r\n VCMPNEQ_OSSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1CH\r\n VCMPGE_OQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1DH\r\n VCMPGT_OQSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1EH\r\n VCMPTRUE_USSD reg1, reg2, reg3 VCMPSD reg1, reg2, reg3, 1FH\r\n\r\nSoftware should ensure VCMPSD is encoded with VEX.L=0. 
Encoding VCMPSD with VEX.L=1 may encounter unpre-\r\ndictable behavior across different processor generations.\r\n\r\nOperation\r\nCASE (COMPARISON PREDICATE) OF\r\n 0: OP3 <-EQ_OQ; OP5 <-EQ_OQ;\r\n 1: OP3 <-LT_OS; OP5 <-LT_OS;\r\n 2: OP3 <-LE_OS; OP5 <-LE_OS;\r\n 3: OP3 <-UNORD_Q; OP5 <-UNORD_Q;\r\n 4: OP3 <-NEQ_UQ; OP5 <-NEQ_UQ;\r\n 5: OP3 <-NLT_US; OP5 <-NLT_US;\r\n 6: OP3 <-NLE_US; OP5 <-NLE_US;\r\n 7: OP3 <-ORD_Q; OP5 <-ORD_Q;\r\n 8: OP5 <-EQ_UQ;\r\n 9: OP5 <-NGE_US;\r\n 10: OP5 <-NGT_US;\r\n 11: OP5 <-FALSE_OQ;\r\n 12: OP5 <-NEQ_OQ;\r\n 13: OP5 <-GE_OS;\r\n 14: OP5 <-GT_OS;\r\n 15: OP5 <-TRUE_UQ;\r\n 16: OP5 <-EQ_OS;\r\n 17: OP5 <-LT_OQ;\r\n 18: OP5 <-LE_OQ;\r\n 19: OP5 <-UNORD_S;\r\n 20: OP5 <-NEQ_US;\r\n 21: OP5 <-NLT_UQ;\r\n\r\n\r\n\r\n 22: OP5 <-NLE_UQ;\r\n 23: OP5 <-ORD_S;\r\n 24: OP5 <-EQ_US;\r\n 25: OP5 <-NGE_UQ;\r\n 26: OP5 <-NGT_UQ;\r\n 27: OP5 <-FALSE_OS;\r\n 28: OP5 <-NEQ_OS;\r\n 29: OP5 <-GE_OQ;\r\n 30: OP5 <-GT_OQ;\r\n 31: OP5 <-TRUE_US;\r\n DEFAULT: Reserved\r\nESAC;\r\n\r\nVCMPSD (EVEX encoded version)\r\nCMP0 <- SRC1[63:0] OP5 SRC2[63:0];\r\n\r\nIF k2[0] or *no writemask*\r\n THEN IF CMP0 = TRUE\r\n THEN DEST[0] <- 1;\r\n ELSE DEST[0] <- 0; FI;\r\n ELSE DEST[0] <- 0 ; zeroing-masking only\r\nFI;\r\nDEST[MAX_KL-1:1] <- 0\r\n\r\nCMPSD (128-bit Legacy SSE version)\r\nCMP0 <-DEST[63:0] OP3 SRC[63:0];\r\nIF CMP0 = TRUE\r\nTHEN DEST[63:0] <-FFFFFFFFFFFFFFFFH;\r\nELSE DEST[63:0] <-0000000000000000H; FI;\r\nDEST[MAX_VL-1:64] (Unmodified)\r\n\r\nVCMPSD (VEX.128 encoded version)\r\nCMP0 <-SRC1[63:0] OP5 SRC2[63:0];\r\nIF CMP0 = TRUE\r\nTHEN DEST[63:0] <-FFFFFFFFFFFFFFFFH;\r\nELSE DEST[63:0] <-0000000000000000H; FI;\r\nDEST[127:64] <-SRC1[127:64]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCMPSD __mmask8 _mm_cmp_sd_mask( __m128d a, __m128d b, int imm);\r\nVCMPSD __mmask8 _mm_cmp_round_sd_mask( __m128d a, __m128d b, int imm, int sae);\r\nVCMPSD __mmask8 _mm_mask_cmp_sd_mask( __mmask8 k1, __m128d a, __m128d b, int 
imm);\r\nVCMPSD __mmask8 _mm_mask_cmp_round_sd_mask( __mmask8 k1, __m128d a, __m128d b, int imm, int sae);\r\n(V)CMPSD __m128d _mm_cmp_sd(__m128d a, __m128d b, const int imm)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid if SNaN operand, Invalid if QNaN and predicate as listed in Table 3-1, Denormal.\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPSD"
},
{
"description": "-R:CMPS",
"mnem": "CMPSQ"
},
{
"description": "CMPSS-Compare Scalar Single-Precision Floating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F C2 /r ib RMI V/V SSE Compare low single-precision floating-point value in\r\n CMPSS xmm1, xmm2/m32, imm8 xmm2/m32 and xmm1 using bits 2:0 of imm8 as\r\n comparison predicate.\r\n VEX.NDS.128.F3.0F.WIG C2 /r ib RVMI V/V AVX Compare low single-precision floating-point value in\r\n VCMPSS xmm1, xmm2, xmm3/m32, xmm3/m32 and xmm2 using bits 4:0 of imm8 as\r\n imm8 comparison predicate.\r\n EVEX.NDS.LIG.F3.0F.W0 C2 /r ib T1S V/V AVX512F Compare low single-precision floating-point value in\r\n VCMPSS k1 {k2}, xmm2, xmm3/m32 and xmm2 using bits 4:0 of imm8 as\r\n xmm3/m32{sae}, imm8 comparison predicate with writemask k2 and leave the\r\n result in mask register k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) Imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv ModRM:r/m (r) Imm8\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) Imm8\r\n\r\nDescription\r\nCompares the low single-precision floating-point values in the second source operand and the first source operand\r\nand returns the results of the comparison to the destination operand. The comparison predicate operand (imme-\r\ndiate operand) specifies the type of comparison performed.\r\n128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The\r\nsecond source operand (second operand) can be an XMM register or 32-bit memory location. Bits (MAX_VL-1:32)\r\nof the corresponding YMM destination register remain unchanged. The comparison result is a doubleword mask of\r\nall 1s (comparison true) or all 0s (comparison false).\r\nVEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source\r\noperand (third operand) can be an XMM register or a 32-bit memory location. 
The result is stored in the low 32 bits\r\nof the destination operand; bits 127:32 of the destination operand are copied from the first source operand. Bits\r\n(MAX_VL-1:128) of the destination ZMM register are zeroed. The comparison result is a doubleword mask of all 1s\r\n(comparison true) or all 0s (comparison false).\r\nEVEX encoded version: The first source operand (second operand) is an XMM register. The second source operand\r\ncan be an XMM register or a 32-bit memory location. The destination operand (first operand) is an opmask register.\r\nThe comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false), written to the destination\r\nstarting from the LSB according to the writemask k2. Bits (MAX_KL-1:1) of the destination register are cleared.\r\nThe comparison predicate operand is an 8-bit immediate:\r\n. For instructions encoded using the VEX prefix, bits 4:0 define the type of comparison to be performed (see\r\n Table 3-1). Bits 5 through 7 of the immediate are reserved.\r\n. For instruction encodings that do not use the VEX prefix, bits 2:0 define the type of comparison to be made (see\r\n the first 8 rows of Table 3-1). Bits 3 through 7 of the immediate are reserved.\r\n\r\n\r\nThe unordered relationship is true when at least one of the two source operands being compared is a NaN; the\r\nordered relationship is true when neither source operand is a NaN.\r\nA subsequent computational instruction that uses the mask result in the destination operand as an input operand\r\nwill not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask\r\nof all 1s corresponds to a QNaN.\r\nNote that processors with \"CPUID.1H:ECX.AVX =0\" do not implement the \"greater-than\", \"greater-than-or-equal\",\r\n\"not-greater-than\", and \"not-greater-than-or-equal\" relation predicates. 
These comparisons can be made either\r\n\r\n\r\n\r\nby using the inverse relationship (that is, use the \"not-less-than-or-equal\" to make a \"greater-than\" comparison)\r\nor by using software emulation. When using software emulation, the program must swap the operands (copying\r\nregisters when necessary to protect the data that will now be in the destination), and then perform the compare\r\nusing a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of Table 3-7\r\n(Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A) under the heading Emulation.\r\nCompilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand\r\nCMPSS instruction, for processors with \"CPUID.1H:ECX.AVX =0\". See Table 3-8. Compiler should treat reserved\r\nImm8 values as illegal syntax.\r\n Table 3-8. Pseudo-Op and CMPSS Implementation\r\n:\r\n\r\n\r\n\r\n\r\n Pseudo-Op CMPSS Implementation\r\n CMPEQSS xmm1, xmm2 CMPSS xmm1, xmm2, 0\r\n CMPLTSS xmm1, xmm2 CMPSS xmm1, xmm2, 1\r\n CMPLESS xmm1, xmm2 CMPSS xmm1, xmm2, 2\r\n CMPUNORDSS xmm1, xmm2 CMPSS xmm1, xmm2, 3\r\n CMPNEQSS xmm1, xmm2 CMPSS xmm1, xmm2, 4\r\n CMPNLTSS xmm1, xmm2 CMPSS xmm1, xmm2, 5\r\n CMPNLESS xmm1, xmm2 CMPSS xmm1, xmm2, 6\r\n CMPORDSS xmm1, xmm2 CMPSS xmm1, xmm2, 7\r\n\r\nThe greater-than relations that the processor does not implement require more than one instruction to emulate in\r\nsoftware and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the\r\noperands of the corresponding less than relations and use move instructions to ensure that the mask is moved to\r\nthe correct destination register and that the source operand is left intact.)\r\nProcessors with \"CPUID.1H:ECX.AVX =1\" implement the full complement of 32 predicates shown in Table 3-7, soft-\r\nware emulation is no longer needed. 
Compilers and assemblers may implement the following three-operand\r\npseudo-ops in addition to the four-operand VCMPSS instruction. See Table 3-9, where the notations of reg1 reg2,\r\nand reg3 represent either XMM registers or YMM registers. Compiler should treat reserved Imm8 values as illegal\r\nsyntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic inter-\r\nface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPSS instructions\r\nin a similar fashion by extending the syntax listed in Table 3-9.\r\n Table 3-9. Pseudo-Op and VCMPSS Implementation\r\n:\r\n\r\n\r\n\r\n\r\n Pseudo-Op CMPSS Implementation\r\n VCMPEQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0\r\n VCMPLTSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1\r\n VCMPLESS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 2\r\n VCMPUNORDSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 3\r\n VCMPNEQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 4\r\n VCMPNLTSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 5\r\n VCMPNLESS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 6\r\n VCMPORDSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 7\r\n VCMPEQ_UQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 8\r\n VCMPNGESS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 9\r\n VCMPNGTSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0AH\r\n VCMPFALSESS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0BH\r\n VCMPNEQ_OQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0CH\r\n VCMPGESS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0DH\r\n\r\n\r\n\r\n Table 3-9. 
Pseudo-Op and VCMPSS Implementation\r\n Pseudo-Op CMPSS Implementation\r\n VCMPGTSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0EH\r\n VCMPTRUESS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 0FH\r\n VCMPEQ_OSSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 10H\r\n VCMPLT_OQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 11H\r\n VCMPLE_OQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 12H\r\n VCMPUNORD_SSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 13H\r\n VCMPNEQ_USSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 14H\r\n VCMPNLT_UQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 15H\r\n VCMPNLE_UQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 16H\r\n VCMPORD_SSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 17H\r\n VCMPEQ_USSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 18H\r\n VCMPNGE_UQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 19H\r\n VCMPNGT_UQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1AH\r\n VCMPFALSE_OSSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1BH\r\n VCMPNEQ_OSSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1CH\r\n VCMPGE_OQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1DH\r\n VCMPGT_OQSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1EH\r\n VCMPTRUE_USSS reg1, reg2, reg3 VCMPSS reg1, reg2, reg3, 1FH\r\n\r\nSoftware should ensure VCMPSS is encoded with VEX.L=0. 
Encoding VCMPSS with VEX.L=1 may encounter unpre-\r\ndictable behavior across different processor generations.\r\n\r\nOperation\r\nCASE (COMPARISON PREDICATE) OF\r\n 0: OP3 <-EQ_OQ; OP5 <-EQ_OQ;\r\n 1: OP3 <-LT_OS; OP5 <-LT_OS;\r\n 2: OP3 <-LE_OS; OP5 <-LE_OS;\r\n 3: OP3 <-UNORD_Q; OP5 <-UNORD_Q;\r\n 4: OP3 <-NEQ_UQ; OP5 <-NEQ_UQ;\r\n 5: OP3 <-NLT_US; OP5 <-NLT_US;\r\n 6: OP3 <-NLE_US; OP5 <-NLE_US;\r\n 7: OP3 <-ORD_Q; OP5 <-ORD_Q;\r\n 8: OP5 <-EQ_UQ;\r\n 9: OP5 <-NGE_US;\r\n 10: OP5 <-NGT_US;\r\n 11: OP5 <-FALSE_OQ;\r\n 12: OP5 <-NEQ_OQ;\r\n 13: OP5 <-GE_OS;\r\n 14: OP5 <-GT_OS;\r\n 15: OP5 <-TRUE_UQ;\r\n 16: OP5 <-EQ_OS;\r\n 17: OP5 <-LT_OQ;\r\n 18: OP5 <-LE_OQ;\r\n 19: OP5 <-UNORD_S;\r\n 20: OP5 <-NEQ_US;\r\n 21: OP5 <-NLT_UQ;\r\n\r\n\r\n\r\n 22: OP5 <-NLE_UQ;\r\n 23: OP5 <-ORD_S;\r\n 24: OP5 <-EQ_US;\r\n 25: OP5 <-NGE_UQ;\r\n 26: OP5 <-NGT_UQ;\r\n 27: OP5 <-FALSE_OS;\r\n 28: OP5 <-NEQ_OS;\r\n 29: OP5 <-GE_OQ;\r\n 30: OP5 <-GT_OQ;\r\n 31: OP5 <-TRUE_US;\r\n DEFAULT: Reserved\r\nESAC;\r\n\r\nVCMPSS (EVEX encoded version)\r\nCMP0 <- SRC1[31:0] OP5 SRC2[31:0];\r\n\r\nIF k2[0] or *no writemask*\r\n THEN IF CMP0 = TRUE\r\n THEN DEST[0] <- 1;\r\n ELSE DEST[0] <- 0; FI;\r\n ELSE DEST[0] <- 0 ; zeroing-masking only\r\nFI;\r\nDEST[MAX_KL-1:1] <- 0\r\n\r\nCMPSS (128-bit Legacy SSE version)\r\nCMP0 <-DEST[31:0] OP3 SRC[31:0];\r\nIF CMP0 = TRUE\r\nTHEN DEST[31:0] <-FFFFFFFFH;\r\nELSE DEST[31:0] <-00000000H; FI;\r\nDEST[MAX_VL-1:32] (Unmodified)\r\n\r\nVCMPSS (VEX.128 encoded version)\r\nCMP0 <-SRC1[31:0] OP5 SRC2[31:0];\r\nIF CMP0 = TRUE\r\nTHEN DEST[31:0] <-FFFFFFFFH;\r\nELSE DEST[31:0] <-00000000H; FI;\r\nDEST[127:32] <-SRC1[127:32]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCMPSS __mmask8 _mm_cmp_ss_mask( __m128 a, __m128 b, int imm);\r\nVCMPSS __mmask8 _mm_cmp_round_ss_mask( __m128 a, __m128 b, int imm, int sae);\r\nVCMPSS __mmask8 _mm_mask_cmp_ss_mask( __mmask8 k1, __m128 a, __m128 b, int imm);\r\nVCMPSS __mmask8 
_mm_mask_cmp_round_ss_mask( __mmask8 k1, __m128 a, __m128 b, int imm, int sae);\r\n(V)CMPSS __m128 _mm_cmp_ss(__m128 a, __m128 b, const int imm)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid if SNaN operand, Invalid if QNaN and predicate as listed in Table 3-1, Denormal.\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPSS"
},
{
"description": "-R:CMPS",
"mnem": "CMPSW"
},
{
"description": "CMPXCHG-Compare and Exchange\r\nOpcode/ Op/ 64-Bit Compat/ Description\r\nInstruction En Mode Leg Mode\r\n0F B0/r MR Valid Valid* Compare AL with r/m8. If equal, ZF is set and r8 is loaded into\r\nCMPXCHG r/m8, r8 r/m8. Else, clear ZF and load r/m8 into AL.\r\n\r\nREX + 0F B0/r MR Valid N.E. Compare AL with r/m8. If equal, ZF is set and r8 is loaded into\r\nCMPXCHG r/m8**,r8 r/m8. Else, clear ZF and load r/m8 into AL.\r\n\r\n0F B1/r MR Valid Valid* Compare AX with r/m16. If equal, ZF is set and r16 is loaded\r\nCMPXCHG r/m16, r16 into r/m16. Else, clear ZF and load r/m16 into AX.\r\n0F B1/r MR Valid Valid* Compare EAX with r/m32. If equal, ZF is set and r32 is loaded\r\nCMPXCHG r/m32, r32 into r/m32. Else, clear ZF and load r/m32 into EAX.\r\n\r\nREX.W + 0F B1/r MR Valid N.E. Compare RAX with r/m64. If equal, ZF is set and r64 is loaded\r\nCMPXCHG r/m64, r64 into r/m64. Else, clear ZF and load r/m64 into RAX.\r\n\r\nNOTES:\r\n* See the IA-32 Architecture Compatibility section below.\r\n** In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n MR ModRM:r/m (r, w) ModRM:reg (r) NA NA\r\n\r\nDescription\r\nCompares the value in the AL, AX, EAX, or RAX register with the first operand (destination operand). If the two\r\nvalues are equal, the second operand (source operand) is loaded into the destination operand. Otherwise, the\r\ndestination operand is loaded into the AL, AX, EAX or RAX register. RAX register is available only in 64-bit mode.\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the\r\ninterface to the processor's bus, the destination operand receives a write cycle without regard to the result of the\r\ncomparison. 
The destination operand is written back if the comparison fails; otherwise, the source operand is\r\nwritten into the destination. (The processor never produces a locked read without also producing a locked write.)\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to additional\r\nregisters (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the\r\nbeginning of this section for encoding data and limits.\r\n\r\nIA-32 Architecture Compatibility\r\nThis instruction is not supported on Intel processors earlier than the Intel486 processors.\r\n\r\nOperation\r\n(* Accumulator = AL, AX, EAX, or RAX depending on whether a byte, word, doubleword, or quadword comparison is being performed *)\r\nTEMP <- DEST\r\nIF accumulator = TEMP\r\n THEN\r\n ZF <- 1;\r\n DEST <- SRC;\r\n ELSE\r\n ZF <- 0;\r\n accumulator <- TEMP;\r\n DEST <- TEMP;\r\nFI;\r\n\r\n\r\n\r\nFlags Affected\r\nThe ZF flag is set if the values in the destination operand and register AL, AX, EAX, or RAX are equal; otherwise it is\r\ncleared. 
The CF, PF, AF, SF, and OF flags are set according to the results of the comparison operation.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPXCHG"
},
{
"description": "-R:CMPXCHG8B",
"mnem": "CMPXCHG16B"
},
{
"description": "CMPXCHG8B/CMPXCHG16B-Compare and Exchange Bytes\r\nOpcode/ Op/ 64-Bit Compat/ Description\r\nInstruction En Mode Leg Mode\r\n0F C7 /1 m64 M Valid Valid* Compare EDX:EAX with m64. If equal, set ZF and load\r\nCMPXCHG8B m64 ECX:EBX into m64. Else, clear ZF and load m64 into EDX:EAX.\r\n\r\nREX.W + 0F C7 /1 m128 M Valid N.E. Compare RDX:RAX with m128. If equal, set ZF and load\r\nCMPXCHG16B m128 RCX:RBX into m128. Else, clear ZF and load m128 into\r\n RDX:RAX.\r\nNOTES:\r\n*See IA-32 Architecture Compatibility section below.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n M ModRM:r/m (r, w) NA NA NA\r\n\r\nDescription\r\nCompares the 64-bit value in EDX:EAX (or 128-bit value in RDX:RAX if operand size is 128 bits) with the operand\r\n(destination operand). If the values are equal, the 64-bit value in ECX:EBX (or 128-bit value in RCX:RBX) is stored\r\nin the destination operand. Otherwise, the value in the destination operand is loaded into EDX:EAX (or RDX:RAX).\r\nThe destination operand is an 8-byte memory location (or 16-byte memory location if operand size is 128 bits). For\r\nthe EDX:EAX and ECX:EBX register pairs, EDX and ECX contain the high-order 32 bits and EAX and EBX contain the\r\nlow-order 32 bits of a 64-bit value. For the RDX:RAX and RCX:RBX register pairs, RDX and RCX contain the high-\r\norder 64 bits and RAX and RBX contain the low-order 64bits of a 128-bit value.\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the\r\ninterface to the processor's bus, the destination operand receives a write cycle without regard to the result of the\r\ncomparison. The destination operand is written back if the comparison fails; otherwise, the source operand is\r\nwritten into the destination. (The processor never produces a locked read without also producing a locked write.)\r\nIn 64-bit mode, default operation size is 64 bits. 
Use of the REX.W prefix promotes operation to 128 bits. Note that\r\nCMPXCHG16B requires that the destination (memory) operand be 16-byte aligned. See the summary chart at the\r\nbeginning of this section for encoding data and limits. For information on the CPUID flag that indicates\r\nCMPXCHG16B, see page 3-206.\r\n\r\nIA-32 Architecture Compatibility\r\nThis instruction encoding is not supported on Intel processors earlier than the Pentium processors.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nIF (64-Bit Mode and OperandSize = 64)\r\n THEN\r\n TEMP128 <- DEST;\r\n IF (RDX:RAX = TEMP128)\r\n THEN\r\n ZF <- 1;\r\n DEST <- RCX:RBX;\r\n ELSE\r\n ZF <- 0;\r\n RDX:RAX <- TEMP128;\r\n DEST <- TEMP128;\r\n FI;\r\n ELSE\r\n TEMP64 <- DEST;\r\n IF (EDX:EAX = TEMP64)\r\n THEN\r\n ZF <- 1;\r\n DEST <- ECX:EBX;\r\n ELSE\r\n ZF <- 0;\r\n EDX:EAX <- TEMP64;\r\n DEST <- TEMP64;\r\n FI;\r\nFI;\r\n\r\nFlags Affected\r\nThe ZF flag is set if the destination operand and EDX:EAX (or RDX:RAX for CMPXCHG16B) are equal; otherwise it is cleared. 
The CF, PF, AF, SF, and\r\nOF flags are unaffected.\r\n\r\nProtected Mode Exceptions\r\n#UD If the destination is not a memory operand.\r\n#GP(0) If the destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the destination operand is not a memory location.\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n\r\n\r\n\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the destination operand is not a memory location.\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n If memory operand for CMPXCHG16B is not aligned on a 16-byte boundary.\r\n If CPUID.01H:ECX.CMPXCHG16B[bit 13] = 0.\r\n#UD If the destination operand is not a memory location.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CMPXCHG8B"
},
{
"description": "COMISD-Compare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGS\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 2F /r RM V/V SSE2 Compare low double-precision floating-point values in\r\n COMISD xmm1, xmm2/m64 xmm1 and xmm2/mem64 and set the EFLAGS flags\r\n accordingly.\r\n VEX.128.66.0F.WIG 2F /r RM V/V AVX Compare low double-precision floating-point values in\r\n VCOMISD xmm1, xmm2/m64 xmm1 and xmm2/mem64 and set the EFLAGS flags\r\n accordingly.\r\n EVEX.LIG.66.0F.W1 2F /r T1S V/V AVX512F Compare low double-precision floating-point values in\r\n VCOMISD xmm1, xmm2/m64{sae} xmm1 and xmm2/mem64 and set the EFLAGS flags\r\n accordingly.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n T1S ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nCompares the double-precision floating-point values in the low quadwords of operand 1 (first operand) and\r\noperand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unor-\r\ndered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unor-\r\ndered result is returned if either source operand is a NaN (QNaN or SNaN).\r\nOperand 1 is an XMM register; operand 2 can be an XMM register or a 64 bit memory\r\nlocation. The COMISD instruction differs from the UCOMISD instruction in that it signals a SIMD floating-point\r\ninvalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISD instruction signals\r\nan invalid numeric exception only if a source operand is an SNaN.\r\nThe EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\nSoftware should ensure VCOMISD is encoded with VEX.L=0. 
Encoding VCOMISD with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\nOperation\r\nCOMISD (all versions)\r\nRESULT <- OrderedCompare(DEST[63:0] <> SRC[63:0]) {\r\n(* Set EFLAGS *) CASE (RESULT) OF\r\n UNORDERED: ZF,PF,CF <- 111;\r\n GREATER_THAN: ZF,PF,CF <- 000;\r\n LESS_THAN: ZF,PF,CF <- 001;\r\n EQUAL: ZF,PF,CF <- 100;\r\nESAC;\r\nOF, AF, SF <-0; }\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCOMISD int _mm_comi_round_sd(__m128d a, __m128d b, int imm, int sae);\r\nVCOMISD int _mm_comieq_sd (__m128d a, __m128d b)\r\nVCOMISD int _mm_comilt_sd (__m128d a, __m128d b)\r\nVCOMISD int _mm_comile_sd (__m128d a, __m128d b)\r\nVCOMISD int _mm_comigt_sd (__m128d a, __m128d b)\r\nVCOMISD int _mm_comige_sd (__m128d a, __m128d b)\r\nVCOMISD int _mm_comineq_sd (__m128d a, __m128d b)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid (if SNaN or QNaN operands), Denormal.\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3;\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "COMISD"
},
{
"description": "COMISS-Compare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGS\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 2F /r RM V/V SSE Compare low single-precision floating-point values in\r\n COMISS xmm1, xmm2/m32 xmm1 and xmm2/mem32 and set the EFLAGS flags\r\n accordingly.\r\n VEX.128.0F.WIG 2F /r RM V/V AVX Compare low single-precision floating-point values in\r\n VCOMISS xmm1, xmm2/m32 xmm1 and xmm2/mem32 and set the EFLAGS flags\r\n accordingly.\r\n EVEX.LIG.0F.W0 2F /r T1S V/V AVX512F Compare low single-precision floating-point values in\r\n VCOMISS xmm1, xmm2/m32{sae} xmm1 and xmm2/mem32 and set the EFLAGS flags\r\n accordingly.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n T1S ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nCompares the single-precision floating-point values in the low quadwords of operand 1 (first operand) and operand\r\n2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered,\r\ngreater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result\r\nis returned if either source operand is a NaN (QNaN or SNaN).\r\nOperand 1 is an XMM register; operand 2 can be an XMM register or a 32 bit memory location.\r\nThe COMISS instruction differs from the UCOMISS instruction in that it signals a SIMD floating-point invalid opera-\r\ntion exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISS instruction signals an invalid\r\nnumeric exception only if a source operand is an SNaN.\r\nThe EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\nSoftware should ensure VCOMISS is encoded with VEX.L=0. 
Encoding VCOMISS with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\nOperation\r\nCOMISS (all versions)\r\nRESULT <- OrderedCompare(DEST[31:0] <> SRC[31:0]) {\r\n(* Set EFLAGS *) CASE (RESULT) OF\r\n UNORDERED: ZF,PF,CF <- 111;\r\n GREATER_THAN: ZF,PF,CF <- 000;\r\n LESS_THAN: ZF,PF,CF <- 001;\r\n EQUAL: ZF,PF,CF <- 100;\r\nESAC;\r\nOF, AF, SF <- 0; }\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCOMISS int _mm_comi_round_ss(__m128 a, __m128 b, int imm, int sae);\r\nVCOMISS int _mm_comieq_ss (__m128 a, __m128 b)\r\nVCOMISS int _mm_comilt_ss (__m128 a, __m128 b)\r\nVCOMISS int _mm_comile_ss (__m128 a, __m128 b)\r\nVCOMISS int _mm_comigt_ss (__m128 a, __m128 b)\r\nVCOMISS int _mm_comige_ss (__m128 a, __m128 b)\r\nVCOMISS int _mm_comineq_ss (__m128 a, __m128 b)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid (if SNaN or QNaN operands), Denormal.\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3;\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "COMISS"
},
{
"description": "CPUID-CPU Identification\r\nOpcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n0F A2 CPUID NP Valid Valid Returns processor identification and feature\r\n information to the EAX, EBX, ECX, and EDX\r\n registers, as determined by input entered in\r\n EAX (in some cases, ECX as well).\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nThe ID flag (bit 21) in the EFLAGS register indicates support for the CPUID instruction. If a software procedure can\r\nset and clear this flag, the processor executing the procedure supports the CPUID instruction. This instruction oper-\r\nates the same in non-64-bit modes and 64-bit mode.\r\nCPUID returns processor identification and feature information in the EAX, EBX, ECX, and EDX registers.1 The\r\ninstruction's output is dependent on the contents of the EAX register upon execution (in some cases, ECX as well).\r\nFor example, the following pseudocode loads EAX with 00H and causes CPUID to return a Maximum Return Value\r\nand the Vendor Identification String in the appropriate registers:\r\n\r\n MOV EAX, 00H\r\n CPUID\r\nTable 3-8 shows information returned, depending on the initial value loaded into the EAX register.\r\nTwo types of information are returned: basic and extended function information. If a value entered for CPUID.EAX\r\nis higher than the maximum input value for basic or extended function for that processor then the data for the\r\nhighest basic information leaf is returned. For example, using the Intel Core i7 processor, the following is true:\r\n CPUID.EAX = 05H (* Returns MONITOR/MWAIT leaf. *)\r\n CPUID.EAX = 0AH (* Returns Architectural Performance Monitoring leaf. *)\r\n CPUID.EAX = 0BH (* Returns Extended Topology Enumeration leaf. *)\r\n CPUID.EAX = 0CH (* INVALID: Returns the same information as CPUID.EAX = 0BH. *)\r\n CPUID.EAX = 80000008H (* Returns linear/physical address size data. 
*)\r\n CPUID.EAX = 8000000AH (* INVALID: Returns same information as CPUID.EAX = 0BH. *)\r\nIf a value entered for CPUID.EAX is less than or equal to the maximum input value and the leaf is not supported on\r\nthat processor then 0 is returned in all the registers.\r\nWhen CPUID returns the highest basic leaf information as a result of an invalid input EAX value, any dependence\r\non input ECX value in the basic leaf is honored.\r\nCPUID can be executed at any privilege level to serialize instruction execution. Serializing instruction execution\r\nguarantees that any modifications to flags, registers, and memory for previous instructions are completed before\r\nthe next instruction is fetched and executed.\r\nSee also:\r\n\"Serializing Instructions\" in Chapter 8, \"Multiple-Processor Management,\" in the Intel 64 and IA-32 Architectures\r\nSoftware Developer's Manual, Volume 3A.\r\n\"Caching Translation Information\" in Chapter 4, \"Paging,\" in the Intel 64 and IA-32 Architectures Software Devel-\r\noper's Manual, Volume 3A.\r\n\r\n\r\n\r\n\r\n1. On Intel 64 processors, CPUID clears the high 32 bits of the RAX/RBX/RCX/RDX registers in all modes.\r\n\r\n\r\n\r\n Table 3-8. 
Information Returned by CPUID Instruction\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Basic CPUID Information\r\n 0H EAX Maximum Input Value for Basic CPUID Information.\r\n EBX \"Genu\"\r\n ECX \"ntel\"\r\n EDX \"ineI\"\r\n 01H EAX Version Information: Type, Family, Model, and Stepping ID (see Figure 3-6).\r\n EBX Bits 07 - 00: Brand Index.\r\n Bits 15 - 08: CLFLUSH line size (Value * 8 = cache line size in bytes; used also by CLFLUSHOPT).\r\n Bits 23 - 16: Maximum number of addressable IDs for logical processors in this physical package*.\r\n Bits 31 - 24: Initial APIC ID.\r\n ECX Feature Information (see Figure 3-7 and Table 3-10).\r\n EDX Feature Information (see Figure 3-8 and Table 3-11).\r\n NOTES:\r\n * The nearest power-of-2 integer that is not smaller than EBX[23:16] is the number of unique initial APIC\r\n IDs reserved for addressing different logical processors in a physical package. This field is only valid if\r\n CPUID.1.EDX.HTT[bit 28]= 1.\r\n 02H EAX Cache and TLB Information (see Table 3-12).\r\n EBX Cache and TLB Information.\r\n ECX Cache and TLB Information.\r\n EDX Cache and TLB Information.\r\n 03H EAX Reserved.\r\n EBX Reserved.\r\n ECX Bits 00 - 31 of 96 bit processor serial number. (Available in Pentium III processor only; otherwise, the\r\n value in this register is reserved.)\r\n EDX Bits 32 - 63 of 96 bit processor serial number. (Available in Pentium III processor only; otherwise, the\r\n value in this register is reserved.)\r\n NOTES:\r\n Processor serial number (PSN) is not supported in the Pentium 4 processor or later. 
On all models, use\r\n the PSN flag (returned using CPUID) to check for PSN support before accessing the feature.\r\n CPUID leaves above 2 and below 80000000H are visible only when IA32_MISC_ENABLE[bit 22] has its default value of 0.\r\n Deterministic Cache Parameters Leaf\r\n 04H NOTES:\r\n Leaf 04H output depends on the initial value in ECX.*\r\n See also: \"INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level\" on page 214.\r\n\r\n EAX Bits 04 - 00: Cache Type Field.\r\n 0 = Null - No more caches.\r\n 1 = Data Cache.\r\n 2 = Instruction Cache.\r\n 3 = Unified Cache.\r\n 4-31 = Reserved.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Bits 07 - 05: Cache Level (starts at 1).\r\n Bit 08: Self Initializing cache level (does not need SW initialization).\r\n Bit 09: Fully Associative cache.\r\n Bits 13 - 10: Reserved.\r\n Bits 25 - 14: Maximum number of addressable IDs for logical processors sharing this cache**, ***.\r\n Bits 31 - 26: Maximum number of addressable IDs for processor cores in the physical\r\n package**, ****, *****.\r\n EBX Bits 11 - 00: L = System Coherency Line Size**.\r\n Bits 21 - 12: P = Physical Line partitions**.\r\n Bits 31 - 22: W = Ways of associativity**.\r\n ECX Bits 31-00: S = Number of Sets**.\r\n EDX Bit 00: Write-Back Invalidate/Invalidate.\r\n 0 = WBINVD/INVD from threads sharing this cache acts upon lower level caches for threads sharing this\r\n cache.\r\n 1 = WBINVD/INVD is not guaranteed to act upon lower level caches of non-originating threads sharing\r\n this cache.\r\n Bit 01: Cache Inclusiveness.\r\n 0 = Cache is not inclusive of lower cache levels.\r\n 1 = Cache is inclusive of lower cache levels.\r\n Bit 02: Complex Cache Indexing.\r\n 0 = Direct mapped cache.\r\n 1 = A complex function is used to index the cache, potentially using all address bits.\r\n Bits 31 - 03: Reserved = 0.\r\n NOTES:\r\n * If 
ECX contains an invalid sub leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n+1 is invalid if sub-\r\n leaf n returns EAX[4:0] as 0.\r\n ** Add one to the return value to get the result.\r\n ***The nearest power-of-2 integer that is not smaller than (1 + EAX[25:14]) is the number of unique ini-\r\n tial APIC IDs reserved for addressing different logical processors sharing this cache.\r\n **** The nearest power-of-2 integer that is not smaller than (1 + EAX[31:26]) is the number of unique\r\n Core_IDs reserved for addressing different processor cores in a physical package. Core ID is a subset of\r\n bits of the initial APIC ID.\r\n ***** The returned value is constant for valid initial values in ECX. Valid ECX values start from 0.\r\n MONITOR/MWAIT Leaf\r\n 05H EAX Bits 15 - 00: Smallest monitor-line size in bytes (default is processor's monitor granularity).\r\n Bits 31 - 16: Reserved = 0.\r\n EBX Bits 15 - 00: Largest monitor-line size in bytes (default is processor's monitor granularity).\r\n Bits 31 - 16: Reserved = 0.\r\n ECX Bit 00: Enumeration of Monitor-Mwait extensions (beyond EAX and EBX registers) supported.\r\n Bit 01: Supports treating interrupts as break-event for MWAIT, even when interrupts disabled.\r\n Bits 31 - 02: Reserved.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. 
Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n EDX Bits 03 - 00: Number of C0* sub C-states supported using MWAIT.\r\n Bits 07 - 04: Number of C1* sub C-states supported using MWAIT.\r\n Bits 11 - 08: Number of C2* sub C-states supported using MWAIT.\r\n Bits 15 - 12: Number of C3* sub C-states supported using MWAIT.\r\n Bits 19 - 16: Number of C4* sub C-states supported using MWAIT.\r\n Bits 23 - 20: Number of C5* sub C-states supported using MWAIT.\r\n Bits 27 - 24: Number of C6* sub C-states supported using MWAIT.\r\n Bits 31 - 28: Number of C7* sub C-states supported using MWAIT.\r\n NOTE:\r\n * The definition of C0 through C7 states for MWAIT extension are processor-specific C-states, not ACPI C-\r\n states.\r\n Thermal and Power Management Leaf\r\n 06H EAX Bit 00: Digital temperature sensor is supported if set.\r\n Bit 01: Intel Turbo Boost Technology Available (see description of IA32_MISC_ENABLE[38]).\r\n Bit 02: ARAT. APIC-Timer-always-running feature is supported if set.\r\n Bit 03: Reserved.\r\n Bit 04: PLN. Power limit notification controls are supported if set.\r\n Bit 05: ECMD. Clock modulation duty cycle extension is supported if set.\r\n Bit 06: PTM. Package thermal management is supported if set.\r\n Bit 07: HWP. HWP base registers (IA32_PM_ENABLE[bit 0], IA32_HWP_CAPABILITIES,\r\n IA32_HWP_REQUEST, IA32_HWP_STATUS) are supported if set.\r\n Bit 08: HWP_Notification. IA32_HWP_INTERRUPT MSR is supported if set.\r\n Bit 09: HWP_Activity_Window. IA32_HWP_REQUEST[bits 41:32] is supported if set.\r\n Bit 10: HWP_Energy_Performance_Preference. IA32_HWP_REQUEST[bits 31:24] is supported if set.\r\n Bit 11: HWP_Package_Level_Request. IA32_HWP_REQUEST_PKG MSR is supported if set.\r\n Bit 12: Reserved.\r\n Bit 13: HDC. 
HDC base registers IA32_PKG_HDC_CTL, IA32_PM_CTL1, IA32_THREAD_STALL MSRs are\r\n supported if set.\r\n Bits 31 - 15: Reserved.\r\n EBX Bits 03 - 00: Number of Interrupt Thresholds in Digital Thermal Sensor.\r\n Bits 31 - 04: Reserved.\r\n ECX Bit 00: Hardware Coordination Feedback Capability (Presence of IA32_MPERF and IA32_APERF). The\r\n capability to provide a measure of delivered processor performance (since last reset of the counters), as\r\n a percentage of the expected processor performance when running at the TSC frequency.\r\n Bits 02 - 01: Reserved = 0.\r\n Bit 03: The processor supports performance-energy bias preference if CPUID.06H:ECX.SETBH[bit 3] is set\r\n and it also implies the presence of a new architectural MSR called IA32_ENERGY_PERF_BIAS (1B0H).\r\n Bits 31 - 04: Reserved = 0.\r\n EDX Reserved = 0.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Structured Extended Feature Flags Enumeration Leaf (Output depends on ECX input value)\r\n 07H Sub-leaf 0 (Input ECX = 0). *\r\n\r\n\r\n EAX Bits 31 - 00: Reports the maximum input value for supported leaf 7 sub-leaves.\r\n EBX Bit 00: FSGSBASE. Supports RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE if 1.\r\n Bit 01: IA32_TSC_ADJUST MSR is supported if 1.\r\n Bit 02: SGX. Supports Intel Software Guard Extensions (Intel SGX Extensions) if 1.\r\n Bit 03: BMI1.\r\n Bit 04: HLE.\r\n Bit 05: AVX2.\r\n Bit 06: FDP_EXCPTN_ONLY. x87 FPU Data Pointer updated only on x87 exceptions if 1.\r\n Bit 07: SMEP. Supports Supervisor-Mode Execution Prevention if 1.\r\n Bit 08: BMI2.\r\n Bit 09: Supports Enhanced REP MOVSB/STOSB if 1.\r\n Bit 10: INVPCID. If 1, supports INVPCID instruction for system software that manages process-context\r\n identifiers.\r\n Bit 11: RTM.\r\n Bit 12: RDT-M. 
Supports Intel Resource Director Technology (Intel RDT) Monitoring capability if 1.\r\n Bit 13: Deprecates FPU CS and FPU DS values if 1.\r\n Bit 14: MPX. Supports Intel Memory Protection Extensions if 1.\r\n Bit 15: RDT-A. Supports Intel Resource Director Technology (Intel RDT) Allocation capability if 1.\r\n Bits 17:16: Reserved.\r\n Bit 18: RDSEED.\r\n Bit 19: ADX.\r\n Bit 20: SMAP. Supports Supervisor-Mode Access Prevention (and the CLAC/STAC instructions) if 1.\r\n Bits 22 - 21: Reserved.\r\n Bit 23: CLFLUSHOPT.\r\n Bit 24: CLWB.\r\n Bit 25: Intel Processor Trace.\r\n Bits 28 - 26: Reserved.\r\n Bit 29: SHA. supports Intel Secure Hash Algorithm Extensions (Intel SHA Extensions) if 1.\r\n Bits 31 - 30: Reserved.\r\n ECX Bit 00: PREFETCHWT1.\r\n Bit 01: Reserved.\r\n Bit 02: UMIP. Supports user-mode instruction prevention if 1.\r\n Bit 03: PKU. Supports protection keys for user-mode pages if 1.\r\n Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable protection keys (and the RDPKRU/WRPKRU instruc-\r\n tions).\r\n Bits 16 - 5: Reserved.\r\n Bits 21 - 17: The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode.\r\n Bit 22: RDPID. Supports Read Processor ID if 1.\r\n Bits 29 - 23: Reserved.\r\n Bit 30: SGX_LC. Supports SGX Launch Configuration if 1.\r\n Bit 31: Reserved.\r\n EDX Reserved.\r\n\r\n NOTE:\r\n * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n\r\n exceeds the value that sub-leaf 0 returns in EAX.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. 
Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Direct Cache Access Information Leaf\r\n 09H EAX Value of bits [31:0] of IA32_PLATFORM_DCA_CAP MSR (address 1F8H).\r\n EBX Reserved.\r\n ECX Reserved.\r\n EDX Reserved.\r\n Architectural Performance Monitoring Leaf\r\n 0AH EAX Bits 07 - 00: Version ID of architectural performance monitoring.\r\n Bits 15 - 08: Number of general-purpose performance monitoring counter per logical processor.\r\n Bits 23 - 16: Bit width of general-purpose, performance monitoring counter.\r\n Bits 31 - 24: Length of EBX bit vector to enumerate architectural performance monitoring events.\r\n EBX Bit 00: Core cycle event not available if 1.\r\n Bit 01: Instruction retired event not available if 1.\r\n Bit 02: Reference cycles event not available if 1.\r\n Bit 03: Last-level cache reference event not available if 1.\r\n Bit 04: Last-level cache misses event not available if 1.\r\n Bit 05: Branch instruction retired event not available if 1.\r\n Bit 06: Branch mispredict retired event not available if 1.\r\n Bits 31 - 07: Reserved = 0.\r\n ECX Reserved = 0.\r\n EDX Bits 04 - 00: Number of fixed-function performance counters (if Version ID > 1).\r\n Bits 12 - 05: Bit width of fixed-function performance counters (if Version ID > 1).\r\n Reserved = 0.\r\n Extended Topology Enumeration Leaf\r\n 0BH NOTES:\r\n Most of Leaf 0BH output depends on the initial value in ECX.\r\n The EDX output of leaf 0BH is always valid and does not vary with input value in ECX.\r\n Output value in ECX[7:0] always equals input value in ECX[7:0].\r\n For sub-leaves that return an invalid level-type of 0 in ECX[15:8]; EAX and EBX will return 0.\r\n If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8], other input values with ECX >\r\n n also return 0 in ECX[15:8].\r\n\r\n\r\n EAX Bits 04 - 00: Number of bits to shift right on x2APIC ID to get a unique topology ID of the next 
level type*.\r\n All logical processors with the same next level ID share the current level.\r\n Bits 31 - 05: Reserved.\r\n EBX Bits 15 - 00: Number of logical processors at this level type. The number reflects configuration as shipped\r\n by Intel**.\r\n Bits 31 - 16: Reserved.\r\n ECX Bits 07 - 00: Level number. Same value as the ECX input.\r\n Bits 15 - 08: Level type***.\r\n Bits 31 - 16: Reserved.\r\n EDX Bits 31 - 00: x2APIC ID of the current logical processor.\r\n NOTES:\r\n * Software should use this field (EAX[4:0]) to enumerate processor topology of the system.\r\n ** Software must not use EBX[15:0] to enumerate processor topology of the system. The value in this\r\n field (EBX[15:0]) is only intended for display/diagnostic purposes. The actual number of logical processors\r\n available to BIOS/OS/Applications may be different from the value of EBX[15:0], depending on software\r\n and platform hardware configurations.\r\n\r\n *** The value of the \"level type\" field is not related to level numbers in any way; higher \"level type\"\r\n values do not mean higher levels. Level type field has the following encoding:\r\n 0: Invalid.\r\n 1: SMT.\r\n 2: Core.\r\n 3-255: Reserved.\r\n Processor Extended State Enumeration Main Leaf (EAX = 0DH, ECX = 0)\r\n 0DH NOTES:\r\n Leaf 0DH main leaf (ECX = 0).\r\n EAX Bits 31 - 00: Reports the supported bits of the lower 32 bits of XCR0. XCR0[n] can be set to 1 only if\r\n EAX[n] is 1.\r\n Bit 00: x87 state.\r\n Bit 01: SSE state.\r\n Bit 02: AVX state.\r\n Bits 04 - 03: MPX state.\r\n Bits 07 - 05: AVX-512 state.\r\n Bit 08: Used for IA32_XSS.\r\n Bit 09: PKRU state.\r\n Bits 31 - 10: Reserved.\r\n EBX Bits 31 - 00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) required by\r\n enabled features in XCR0.
May be different than ECX if some features at the end of the XSAVE save area\r\n are not enabled.\r\n ECX Bits 31 - 00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) of the\r\n XSAVE/XRSTOR save area required by all supported features in the processor, i.e., all the valid bit fields in\r\n XCR0.\r\n EDX Bits 31 - 00: Reports the supported bits of the upper 32 bits of XCR0. XCR0[n+32] can be set to 1 only if\r\n EDX[n] is 1.\r\n Bits 31 - 00: Reserved.\r\n Processor Extended State Enumeration Sub-leaf (EAX = 0DH, ECX = 1)\r\n 0DH EAX Bit 00: XSAVEOPT is available.\r\n Bit 01: Supports XSAVEC and the compacted form of XRSTOR if set.\r\n Bit 02: Supports XGETBV with ECX = 1 if set.\r\n Bit 03: Supports XSAVES/XRSTORS and IA32_XSS if set.\r\n Bits 31 - 04: Reserved.\r\n EBX Bits 31 - 00: The size in bytes of the XSAVE area containing all states enabled by XCR0 | IA32_XSS.\r\n ECX Bits 31 - 00: Reports the supported bits of the lower 32 bits of the IA32_XSS MSR. IA32_XSS[n] can be\r\n set to 1 only if ECX[n] is 1.\r\n Bits 07 - 00: Used for XCR0.\r\n Bit 08: PT state.\r\n Bit 09: Used for XCR0.\r\n Bits 31 - 10: Reserved.\r\n EDX Bits 31 - 00: Reports the supported bits of the upper 32 bits of the IA32_XSS MSR. IA32_XSS[n+32] can\r\n be set to 1 only if EDX[n] is 1.\r\n Bits 31 - 00: Reserved.\r\n Processor Extended State Enumeration Sub-leaves (EAX = 0DH, ECX = n, n > 1)\r\n 0DH NOTES:\r\n Leaf 0DH output depends on the initial value in ECX.\r\n Each sub-leaf index (starting at position 2) is supported if it corresponds to a supported bit in either the\r\n XCR0 register or the IA32_XSS MSR.\r\n * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf n (0 <= n <= 31) is invalid\r\n if sub-leaf 0 returns 0 in EAX[n] and sub-leaf 1 returns 0 in ECX[n].
Sub-leaf n (32 <= n <= 63) is invalid if\r\n sub-leaf 0 returns 0 in EDX[n-32] and sub-leaf 1 returns 0 in EDX[n-32].\r\n EAX Bits 31 - 00: The size in bytes (from the offset specified in EBX) of the save area for an extended state\r\n feature associated with a valid sub-leaf index, n.\r\n EBX Bits 31 - 00: The offset in bytes of this extended state component's save area from the beginning of the\r\n XSAVE/XRSTOR area.\r\n This field reports 0 if the sub-leaf index, n, does not map to a valid bit in the XCR0 register*.\r\n ECX Bit 00 is set if bit n (corresponding to the sub-leaf index) is supported in the IA32_XSS MSR; it is clear\r\n if bit n is instead supported in XCR0.\r\n Bit 01 is set if, when the compacted format of an XSAVE area is used, this extended state component is\r\n located on the next 64-byte boundary following the preceding state component (otherwise, it is located\r\n immediately following the preceding state component).\r\n Bits 31 - 02 are reserved.\r\n This field reports 0 if the sub-leaf index, n, is invalid*.\r\n EDX This field reports 0 if the sub-leaf index, n, is invalid*; otherwise it is reserved.\r\n Intel Resource Director Technology (Intel RDT) Monitoring Enumeration Sub-leaf (EAX = 0FH, ECX = 0)\r\n 0FH NOTES:\r\n Leaf 0FH output depends on the initial value in ECX.\r\n Sub-leaf index 0 reports valid resource type starting at bit position 1 of EDX.\r\n EAX Reserved.\r\n EBX Bits 31 - 00: Maximum range (zero-based) of RMID within this physical processor of all types.\r\n ECX Reserved.\r\n EDX Bit 00: Reserved.\r\n Bit 01: Supports L3 Cache Intel RDT Monitoring if 1.\r\n Bits 31 - 02: Reserved.\r\n L3 Cache Intel RDT Monitoring Capability Enumeration Sub-leaf (EAX = 0FH, ECX = 1)\r\n 0FH NOTES:\r\n Leaf 0FH output depends on the initial value in ECX.\r\n EAX Reserved.\r\n EBX Bits 31 - 00: Conversion factor from reported IA32_QM_CTR value to occupancy metric (bytes).\r\n ECX Maximum range (zero-based) of RMID of this resource type.\r\n
EDX Bit 00: Supports L3 occupancy monitoring if 1.\r\n Bit 01: Supports L3 Total Bandwidth monitoring if 1.\r\n Bit 02: Supports L3 Local Bandwidth monitoring if 1.\r\n Bits 31 - 03: Reserved.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Intel Resource Director Technology (Intel RDT) Allocation Enumeration Sub-leaf (EAX = 10H, ECX = 0)\r\n 10H NOTES:\r\n Leaf 10H output depends on the initial value in ECX.\r\n Sub-leaf index 0 reports valid resource identification (ResID) starting at bit position 1 of EBX.\r\n EAX Reserved.\r\n EBX Bit 00: Reserved.\r\n Bit 01: Supports L3 Cache Allocation Technology if 1.\r\n Bit 02: Supports L2 Cache Allocation Technology if 1.\r\n Bits 31 - 03: Reserved.\r\n ECX Reserved.\r\n EDX Reserved.\r\n L3 Cache Allocation Technology Enumeration Sub-leaf (EAX = 10H, ECX = ResID =1)\r\n 10H NOTES:\r\n Leaf 10H output depends on the initial value in ECX.\r\n EAX Bits 4 - 00: Length of the capacity bit mask for the corresponding ResID using minus-one notation.\r\n Bits 31 - 05: Reserved.\r\n EBX Bits 31 - 00: Bit-granular map of isolation/contention of allocation units.\r\n ECX Bit 00: Reserved.\r\n Bit 01: Updates of COS should be infrequent if 1.\r\n Bit 02: Code and Data Prioritization Technology supported if 1.\r\n Bits 31 - 03: Reserved.\r\n EDX Bits 15 - 00: Highest COS number supported for this ResID.\r\n Bits 31 - 16: Reserved.\r\n L2 Cache Allocation Technology Enumeration Sub-leaf (EAX = 10H, ECX = ResID =2)\r\n 10H NOTES:\r\n Leaf 10H output depends on the initial value in ECX.\r\n EAX Bits 4 - 00: Length of the capacity bit mask for the corresponding ResID using minus-one notation.\r\n Bits 31 - 05: Reserved.\r\n EBX Bits 31 - 00: Bit-granular map of isolation/contention of allocation units.\r\n ECX Bits 31 - 00: Reserved.\r\n EDX Bits 15 - 00: Highest COS number supported for this ResID.\r\n Bits 31 - 16: Reserved.\r\n 
Intel SGX Capability Enumeration Leaf, sub-leaf 0 (EAX = 12H, ECX = 0)\r\n 12H NOTES:\r\n Leaf 12H sub-leaf 0 (ECX = 0) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1.\r\n EAX Bit 00: SGX1. If 1, Indicates Intel SGX supports the collection of SGX1 leaf functions.\r\n Bit 01: SGX2. If 1, Indicates Intel SGX supports the collection of SGX2 leaf functions.\r\n Bit 31 - 02: Reserved.\r\n EBX Bit 31 - 00: MISCSELECT. Bit vector of supported extended SGX features.\r\n ECX Bit 31 - 00: Reserved.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n EDX Bit 07 - 00: MaxEnclaveSize_Not64. The maximum supported enclave size in non-64-bit mode is\r\n 2^(EDX[7:0]).\r\n Bit 15 - 08: MaxEnclaveSize_64. The maximum supported enclave size in 64-bit mode is 2^(EDX[15:8]).\r\n Bits 31 - 16: Reserved.\r\n Intel SGX Attributes Enumeration Leaf, sub-leaf 1 (EAX = 12H, ECX = 1)\r\n 12H NOTES:\r\n Leaf 12H sub-leaf 1 (ECX = 1) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1.\r\n EAX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[31:0] that software can set with ECREATE.\r\n EBX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[63:32] that software can set with ECREATE.\r\n ECX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[95:64] that software can set with ECREATE.\r\n EDX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[127:96] that software can set with ECREATE.\r\n Intel SGX EPC Enumeration Leaf, sub-leaves (EAX = 12H, ECX = 2 or higher)\r\n 12H NOTES:\r\n Leaf 12H sub-leaf 2 or higher (ECX >= 2) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1.\r\n For sub-leaves (ECX = 2 or higher), definition of EDX,ECX,EBX,EAX[31:4] depends on the sub-leaf type\r\n listed below.\r\n EAX Bit 03 - 00: Sub-leaf Type\r\n 0000b: Indicates this sub-leaf is invalid.\r\n 0001b: This sub-leaf enumerates an EPC section. 
EBX:EAX and EDX:ECX provide information on the\r\n Enclave Page Cache (EPC) section.\r\n All other type encodings are reserved.\r\n Type 0000b. This sub-leaf is invalid.\r\n EDX:ECX:EBX:EAX return 0.\r\n Type 0001b. This sub-leaf enumerates an EPC sections with EDX:ECX, EBX:EAX defined as follows.\r\n EAX[11:04]: Reserved (enumerate 0).\r\n EAX[31:12]: Bits 31:12 of the physical address of the base of the EPC section.\r\n\r\n EBX[19:00]: Bits 51:32 of the physical address of the base of the EPC section.\r\n EBX[31:20]: Reserved.\r\n\r\n ECX[03:00]: EPC section property encoding defined as follows:\r\n If EAX[3:0] 0000b, then all bits of the EDX:ECX pair are enumerated as 0.\r\n If EAX[3:0] 0001b, then this section has confidentiality and integrity protection.\r\n All other encodings are reserved.\r\n ECX[11:04]: Reserved (enumerate 0).\r\n ECX[31:12]: Bits 31:12 of the size of the corresponding EPC section within the Processor Reserved\r\n Memory.\r\n\r\n EDX[19:00]: Bits 51:32 of the size of the corresponding EPC section within the Processor Reserved\r\n Memory.\r\n EDX[31:20]: Reserved.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Intel Processor Trace Enumeration Main Leaf (EAX = 14H, ECX = 0)\r\n 14H NOTES:\r\n Leaf 14H main leaf (ECX = 0).\r\n EAX Bits 31 - 00: Reports the maximum sub-leaf supported in leaf 14H.\r\n EBX Bit 00: If 1, indicates that IA32_RTIT_CTL.CR3Filter can be set to 1, and that IA32_RTIT_CR3_MATCH\r\n MSR can be accessed.\r\n Bit 01: If 1, indicates support of Configurable PSB and Cycle-Accurate Mode.\r\n Bit 02: If 1, indicates support of IP Filtering, TraceStop filtering, and preservation of Intel PT MSRs across\r\n warm reset.\r\n Bit 03: If 1, indicates support of MTC timing packet and suppression of COFI-based packets.\r\n Bit 04: If 1, indicates support of PTWRITE. 
Writes can set IA32_RTIT_CTL[12] (PTWEn) and\r\n IA32_RTIT_CTL[5] (FUPonPTW), and PTWRITE can generate packets.\r\n Bit 05: If 1, indicates support of Power Event Trace. Writes can set IA32_RTIT_CTL[4] (PwrEvtEn),\r\n enabling Power Event Trace packet generation.\r\n Bits 31 - 06: Reserved.\r\n ECX Bit 00: If 1, Tracing can be enabled with IA32_RTIT_CTL.ToPA = 1, hence utilizing the ToPA output\r\n scheme; IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS MSRs can be accessed.\r\n Bit 01: If 1, ToPA tables can hold any number of output entries, up to the maximum allowed by the\r\n MaskOrTableOffset field of IA32_RTIT_OUTPUT_MASK_PTRS.\r\n Bit 02: If 1, indicates support of Single-Range Output scheme.\r\n Bit 03: If 1, indicates support of output to Trace Transport subsystem.\r\n Bits 30 - 04: Reserved.\r\n Bit 31: If 1, generated packets which contain IP payloads have LIP values, which include the CS base\r\n component.\r\n EDX Bits 31 - 00: Reserved.\r\n Intel Processor Trace Enumeration Sub-leaf (EAX = 14H, ECX = 1)\r\n 14H EAX Bits 02 - 00: Number of configurable Address Ranges for filtering.\r\n Bits 15 - 03: Reserved.\r\n Bits 31 - 16: Bitmap of supported MTC period encodings.\r\n EBX Bits 15 - 00: Bitmap of supported Cycle Threshold value encodings.\r\n Bits 31 - 16: Bitmap of supported Configurable PSB frequency encodings.\r\n ECX Bits 31 - 00: Reserved.\r\n EDX Bits 31 - 00: Reserved.\r\n Time Stamp Counter and Nominal Core Crystal Clock Information Leaf\r\n 15H NOTES:\r\n If EBX[31:0] is 0, the TSC/\"core crystal clock\" ratio is not enumerated.\r\n EBX[31:0]/EAX[31:0] indicates the ratio of the TSC frequency and the core crystal clock frequency.\r\n If ECX is 0, the nominal core crystal clock frequency is not enumerated.\r\n \"TSC frequency\" = \"core crystal clock frequency\" * EBX/EAX.\r\n The core crystal clock may differ from the reference clock, bus clock, or core clock frequencies.\r\n EAX Bits 31 - 00: An unsigned integer which is the
denominator of the TSC/\"core crystal clock\" ratio.\r\n EBX Bits 31 - 00: An unsigned integer which is the numerator of the TSC/\"core crystal clock\" ratio.\r\n ECX Bits 31 - 00: An unsigned integer which is the nominal frequency of the core crystal clock in Hz.\r\n EDX Bits 31 - 00: Reserved = 0.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n Processor Frequency Information Leaf\r\n 16H EAX Bits 15 - 00: Processor Base Frequency (in MHz).\r\n Bits 31 - 16: Reserved =0.\r\n EBX Bits 15 - 00: Maximum Frequency (in MHz).\r\n Bits 31 - 16: Reserved = 0.\r\n ECX Bits 15 - 00: Bus (Reference) Frequency (in MHz).\r\n Bits 31 - 16: Reserved = 0.\r\n EDX Reserved.\r\n NOTES:\r\n * Data is returned from this interface in accordance with the processor's specification and does not reflect\r\n actual values. Suitable use of this data includes the display of processor information in like manner to the\r\n processor brand string and for determining the appropriate range to use when displaying processor\r\n information e.g. frequency history graphs. The returned information should not be used for any other\r\n purpose as the returned information does not accurately correlate to information / counters returned by\r\n other processor interfaces.\r\n\r\n While a processor may support the Processor Frequency Information leaf, fields that return a value of\r\n zero are not supported.\r\n System-On-Chip Vendor Attribute Enumeration Main Leaf (EAX = 17H, ECX = 0)\r\n 17H NOTES:\r\n Leaf 17H main leaf (ECX = 0).\r\n Leaf 17H output depends on the initial value in ECX.\r\n Leaf 17H sub-leaves 1 through 3 reports SOC Vendor Brand String.\r\n Leaf 17H is valid if MaxSOCID_Index >= 3.\r\n Leaf 17H sub-leaves 4 and above are reserved.\r\n\r\n\r\n EAX Bits 31 - 00: MaxSOCID_Index. 
Reports the maximum input value of supported sub-leaf in leaf 17H.\r\n EBX Bits 15 - 00: SOC Vendor ID.\r\n Bit 16: IsVendorScheme. If 1, the SOC Vendor ID field is assigned via an industry standard enumeration\r\n scheme. Otherwise, the SOC Vendor ID field is assigned by Intel.\r\n Bits 31 - 17: Reserved = 0.\r\n ECX Bits 31 - 00: Project ID. A unique number an SOC vendor assigns to its SOC projects.\r\n EDX Bits 31 - 00: Stepping ID. A unique number within an SOC project that an SOC vendor assigns.\r\n System-On-Chip Vendor Attribute Enumeration Sub-leaf (EAX = 17H, ECX = 1..3)\r\n 17H EAX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string.\r\n EBX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string.\r\n ECX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string.\r\n EDX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string.\r\n NOTES:\r\n Leaf 17H output depends on the initial value in ECX.\r\n SOC Vendor Brand String is a UTF-8 encoded string padded with trailing bytes of 00H.\r\n The complete SOC Vendor Brand String is constructed by concatenating in ascending order of\r\n EAX:EBX:ECX:EDX and from the sub-leaf 1 fragment towards sub-leaf 3.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n System-On-Chip Vendor Attribute Enumeration Sub-leaves (EAX = 17H, ECX > MaxSOCID_Index)\r\n 17H NOTES:\r\n Leaf 17H output depends on the initial value in ECX.\r\n\r\n\r\n EAX Bits 31 - 00: Reserved = 0.\r\n EBX Bits 31 - 00: Reserved = 0.\r\n ECX Bits 31 - 00: Reserved = 0.\r\n EDX Bits 31 - 00: Reserved = 0.\r\n Unimplemented CPUID Leaf Functions\r\n 40000000H Invalid. 
No existing or future CPU will return processor identification or feature information if the initial\r\n - EAX value is in the range 40000000H to 4FFFFFFFH.\r\n 4FFFFFFFH\r\n Extended Function CPUID Information\r\n 80000000H EAX Maximum Input Value for Extended Function CPUID Information.\r\n EBX Reserved.\r\n ECX Reserved.\r\n EDX Reserved.\r\n 80000001H EAX Extended Processor Signature and Feature Bits.\r\n EBX Reserved.\r\n ECX Bit 00: LAHF/SAHF available in 64-bit mode.\r\n Bits 04 - 01: Reserved.\r\n Bit 05: LZCNT.\r\n Bits 07 - 06: Reserved.\r\n Bit 08: PREFETCHW.\r\n Bits 31 - 09: Reserved.\r\n EDX Bits 10 - 00: Reserved.\r\n Bit 11: SYSCALL/SYSRET available in 64-bit mode.\r\n Bits 19 - 12: Reserved = 0.\r\n Bit 20: Execute Disable Bit available.\r\n Bits 25 - 21: Reserved = 0.\r\n Bit 26: 1-GByte pages are available if 1.\r\n Bit 27: RDTSCP and IA32_TSC_AUX are available if 1.\r\n Bit 28: Reserved = 0.\r\n Bit 29: Intel 64 Architecture available if 1.\r\n Bits 31 - 30: Reserved = 0.\r\n 80000002H EAX Processor Brand String.\r\n EBX Processor Brand String Continued.\r\n ECX Processor Brand String Continued.\r\n EDX Processor Brand String Continued.\r\n 80000003H EAX Processor Brand String Continued.\r\n EBX Processor Brand String Continued.\r\n ECX Processor Brand String Continued.\r\n EDX Processor Brand String Continued.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-8. 
Information Returned by CPUID Instruction (Contd.)\r\n Initial EAX\r\n Value Information Provided about the Processor\r\n 80000004H EAX Processor Brand String Continued.\r\n EBX Processor Brand String Continued.\r\n ECX Processor Brand String Continued.\r\n EDX Processor Brand String Continued.\r\n 80000005H EAX Reserved = 0.\r\n EBX Reserved = 0.\r\n ECX Reserved = 0.\r\n EDX Reserved = 0.\r\n 80000006H EAX Reserved = 0.\r\n EBX Reserved = 0.\r\n ECX Bits 07 - 00: Cache Line size in bytes.\r\n Bits 11 - 08: Reserved.\r\n Bits 15 - 12: L2 Associativity field *.\r\n Bits 31 - 16: Cache size in 1K units.\r\n EDX Reserved = 0.\r\n NOTES:\r\n * L2 associativity field encodings:\r\n 00H - Disabled.\r\n 01H - Direct mapped.\r\n 02H - 2-way.\r\n 04H - 4-way.\r\n 06H - 8-way.\r\n 08H - 16-way.\r\n 0FH - Fully associative.\r\n 80000007H EAX Reserved = 0.\r\n EBX Reserved = 0.\r\n ECX Reserved = 0.\r\n EDX Bits 07 - 00: Reserved = 0.\r\n Bit 08: Invariant TSC available if 1.\r\n Bits 31 - 09: Reserved = 0.\r\n 80000008H EAX Linear/Physical Address size.\r\n Bits 07 - 00: #Physical Address Bits*.\r\n Bits 15 - 08: #Linear Address Bits.\r\n Bits 31 - 16: Reserved = 0.\r\n EBX Reserved = 0.\r\n ECX Reserved = 0.\r\n EDX Reserved = 0.\r\n\r\n NOTES:\r\n * If CPUID.80000008H:EAX[7:0] is supported, the maximum physical address number supported should\r\n come from this field.\r\n\r\n\r\nINPUT EAX = 0: Returns CPUID's Highest Value for Basic Processor Information and the Vendor Identification String\r\nWhen CPUID executes with EAX set to 0, the processor returns the highest value the CPUID recognizes for\r\nreturning basic processor information. The value is returned in the EAX register and is processor specific.\r\n\r\n\r\n\r\n\r\n\r\nA vendor identification string is also returned in EBX, EDX, and ECX. 
For Intel processors, the string is \"GenuineIntel\"\r\nand is expressed:\r\n EBX <- 756e6547h (* \"Genu\", with G in the low eight bits of BL *)\r\n EDX <- 49656e69h (* \"ineI\", with i in the low eight bits of DL *)\r\n ECX <- 6c65746eh (* \"ntel\", with n in the low eight bits of CL *)\r\n\r\nINPUT EAX = 80000000H: Returns CPUID's Highest Value for Extended Processor Information\r\nWhen CPUID executes with EAX set to 80000000H, the processor returns the highest value the processor\r\nrecognizes for returning extended processor information. The value is returned in the EAX register and is processor\r\nspecific.\r\n\r\nIA32_BIOS_SIGN_ID Returns Microcode Update Signature\r\nFor processors that support the microcode update facility, the IA32_BIOS_SIGN_ID MSR is loaded with the update\r\nsignature whenever CPUID executes. The signature is returned in the upper DWORD. For details, see Chapter 9 in\r\nthe Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A.\r\n\r\nINPUT EAX = 01H: Returns Model, Family, Stepping Information\r\nWhen CPUID executes with EAX set to 01H, version information is returned in EAX (see Figure 3-6). For example,\r\nthe model, family, and processor type for the Intel Xeon processor 5100 series are as follows:\r\n. Model - 1111B\r\n. Family - 0101B\r\n. Processor Type - 00B\r\nSee Table 3-9 for available processor type values. Stepping IDs are provided as needed.\r\n\r\n EAX Bits 31 - 28: Reserved.\r\n Bits 27 - 20: Extended Family ID.\r\n Bits 19 - 16: Extended Model ID.\r\n Bits 15 - 14: Reserved.\r\n Bits 13 - 12: Processor Type.\r\n Bits 11 - 08: Family ID (0FH for the Pentium 4 Processor Family).\r\n Bits 07 - 04: Model.\r\n Bits 03 - 00: Stepping ID.\r\n\r\n Figure 3-6. Version Information Returned by CPUID in EAX\r\n\r\n\r\n Table 3-9.
Processor Type Field\r\n Type Encoding\r\n Original OEM Processor 00B\r\n \r\n Intel OverDrive Processor 01B\r\n Dual processor (not applicable to Intel486 processors) 10B\r\n Intel reserved 11B\r\n\r\n NOTE\r\n See Chapter 19 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1,\r\n for information on identifying earlier IA-32 processors.\r\n\r\n\r\nThe Extended Family ID needs to be examined only when the Family ID is 0FH. Integrate the fields into a display\r\nusing the following rule:\r\n\r\n IF Family_ID != 0FH\r\n THEN DisplayFamily = Family_ID;\r\n ELSE DisplayFamily = Extended_Family_ID + Family_ID;\r\n (* Right justify and zero-extend 4-bit field. *)\r\n FI;\r\n (* Show DisplayFamily as HEX field. *)\r\nThe Extended Model ID needs to be examined only when the Family ID is 06H or 0FH. Integrate the field into a\r\ndisplay using the following rule:\r\n\r\n IF (Family_ID = 06H or Family_ID = 0FH)\r\n THEN DisplayModel = (Extended_Model_ID << 4) + Model_ID;\r\n (* Right justify and zero-extend 4-bit field; display Model_ID as HEX field.*)\r\n ELSE DisplayModel = Model_ID;\r\n FI;\r\n (* Show DisplayModel as HEX field. *)\r\n\r\nINPUT EAX = 01H: Returns Additional Information in EBX\r\nWhen CPUID executes with EAX set to 01H, additional information is returned to the EBX register:\r\n. Brand index (low byte of EBX) - this number provides an entry into a brand string table that contains brand\r\n strings for IA-32 processors. More information about this field is provided later in this section.\r\n. CLFLUSH instruction cache line size (second byte of EBX) - this number indicates the size of the cache line\r\n flushed by the CLFLUSH and CLFLUSHOPT instructions in 8-byte increments. This field was introduced in the\r\n Pentium 4 processor.\r\n. Local APIC ID (high byte of EBX) - this number is the 8-bit ID that is assigned to the local APIC on the\r\n processor during power up. 
This field was introduced in the Pentium 4 processor.\r\n\r\nINPUT EAX = 01H: Returns Feature Information in ECX and EDX\r\nWhen CPUID executes with EAX set to 01H, feature information is returned in ECX and EDX.\r\n. Figure 3-7 and Table 3-10 show encodings for ECX.\r\n. Figure 3-8 and Table 3-11 show encodings for EDX.\r\nFor all feature flags, a 1 indicates that the feature is supported. Use Intel to properly interpret feature flags.\r\n\r\n NOTE\r\n Software must confirm that a processor feature is present using feature flags returned by CPUID\r\n prior to using the feature. Software should not depend on future offerings retaining all features.\r\n\r\n\r\n\r\n\r\n\r\n 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0\r\n\r\n\r\n\r\n ECX\r\n 0\r\n\r\n\r\n RDRAND\r\n F16C\r\n AVX\r\n OSXSAVE\r\n XSAVE\r\n AES\r\n TSC-Deadline\r\n POPCNT\r\n MOVBE\r\n x2APIC\r\n SSE4_2 - SSE4.2\r\n SSE4_1 - SSE4.1\r\n DCA - Direct Cache Access\r\n PCID - Process-context Identifiers\r\n PDCM - Perf/Debug Capability MSR\r\n xTPR Update Control\r\n CMPXCHG16B\r\n FMA - Fused Multiply Add\r\n SDBG\r\n CNXT-ID - L1 Context ID\r\n SSSE3 - SSSE3 Extensions\r\n TM2 - Thermal Monitor 2\r\n EIST - Enhanced Intel SpeedStep Technology\r\n SMX - Safer Mode Extensions\r\n VMX - Virtual Machine Extensions\r\n DS-CPL - CPL Qualified Debug Store\r\n MONITOR - MONITOR/MWAIT\r\n DTES64 - 64-bit DS Area\r\n PCLMULQDQ - Carryless Multiplication\r\n SSE3 - SSE3 Extensions\r\n OM16524b\r\n Reserved\r\n\r\n Figure 3-7. Feature Information Returned in the ECX Register\r\n\r\n\r\n Table 3-10. Feature Information Returned in the ECX Register\r\n Bit # Mnemonic Description\r\n 0 SSE3 Streaming SIMD Extensions 3 (SSE3). A value of 1 indicates the processor supports this\r\n technology.\r\n 1 PCLMULQDQ PCLMULQDQ. A value of 1 indicates the processor supports the PCLMULQDQ instruction.\r\n 2 DTES64 64-bit DS Area. 
A value of 1 indicates the processor supports DS area using 64-bit layout.\r\n 3 MONITOR MONITOR/MWAIT. A value of 1 indicates the processor supports this feature.\r\n 4 DS-CPL CPL Qualified Debug Store. A value of 1 indicates the processor supports the extensions to the\r\n Debug Store feature to allow for branch message storage qualified by CPL.\r\n 5 VMX Virtual Machine Extensions. A value of 1 indicates that the processor supports this technology.\r\n 6 SMX Safer Mode Extensions. A value of 1 indicates that the processor supports this technology. See\r\n Chapter 6, \"Safer Mode Extensions Reference\".\r\n 7 EIST Enhanced Intel SpeedStep technology. A value of 1 indicates that the processor supports this\r\n technology.\r\n 8 TM2 Thermal Monitor 2. A value of 1 indicates that the processor supports this technology.\r\n 9 SSSE3 A value of 1 indicates the presence of the Supplemental Streaming SIMD Extensions 3 (SSSE3). A\r\n value of 0 indicates the instruction extensions are not present in the processor.\r\n 10 CNXT-ID L1 Context ID. A value of 1 indicates the L1 data cache mode can be set to either adaptive mode\r\n or shared mode. A value of 0 indicates this feature is not supported. See definition of the\r\n IA32_MISC_ENABLE MSR Bit 24 (L1 Data Cache Context Mode) for details.\r\n 11 SDBG A value of 1 indicates the processor supports IA32_DEBUG_INTERFACE MSR for silicon debug.\r\n 12 FMA A value of 1 indicates the processor supports FMA extensions using YMM state.\r\n 13 CMPXCHG16B CMPXCHG16B Available. A value of 1 indicates that the feature is available. See the\r\n \"CMPXCHG8B/CMPXCHG16B-Compare and Exchange Bytes\" section in this chapter for a\r\n description.\r\n 14 xTPR Update xTPR Update Control.
A value of 1 indicates that the processor supports changing\r\n Control IA32_MISC_ENABLE[bit 23].\r\n 15 PDCM Perfmon and Debug Capability: A value of 1 indicates the processor supports the performance\r\n and debug feature indication MSR IA32_PERF_CAPABILITIES.\r\n 16 Reserved Reserved\r\n 17 PCID Process-context identifiers. A value of 1 indicates that the processor supports PCIDs and that\r\n software may set CR4.PCIDE to 1.\r\n 18 DCA A value of 1 indicates the processor supports the ability to prefetch data from a memory mapped\r\n device.\r\n 19 SSE4.1 A value of 1 indicates that the processor supports SSE4.1.\r\n 20 SSE4.2 A value of 1 indicates that the processor supports SSE4.2.\r\n 21 x2APIC A value of 1 indicates that the processor supports x2APIC feature.\r\n 22 MOVBE A value of 1 indicates that the processor supports MOVBE instruction.\r\n 23 POPCNT A value of 1 indicates that the processor supports the POPCNT instruction.\r\n 24 TSC-Deadline A value of 1 indicates that the processor's local APIC timer supports one-shot operation using a\r\n TSC deadline value.\r\n 25 AESNI A value of 1 indicates that the processor supports the AESNI instruction extensions.\r\n 26 XSAVE A value of 1 indicates that the processor supports the XSAVE/XRSTOR processor extended states\r\n feature, the XSETBV/XGETBV instructions, and XCR0.\r\n 27 OSXSAVE A value of 1 indicates that the OS has set CR4.OSXSAVE[bit 18] to enable XSETBV/XGETBV\r\n instructions to access XCR0 and to support processor extended state management using\r\n XSAVE/XRSTOR.\r\n 28 AVX A value of 1 indicates the processor supports the AVX instruction extensions.\r\n 29 F16C A value of 1 indicates that processor supports 16-bit floating-point conversion instructions.\r\n 30 RDRAND A value of 1 indicates that processor supports RDRAND instruction.\r\n 31 Not Used Always returns 0.\r\n\r\n\r\n\r\n\r\n\r\n 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0\r\n\r\n\r\n 
EDX\r\n\r\n\r\n PBE-Pend. Brk. EN.\r\n TM-Therm. Monitor\r\n HTT-Multi-threading\r\n SS-Self Snoop\r\n SSE2-SSE2 Extensions\r\n SSE-SSE Extensions\r\n FXSR-FXSAVE/FXRSTOR\r\n MMX-MMX Technology\r\n ACPI-Thermal Monitor and Clock Ctrl\r\n DS-Debug Store\r\n CLFSH-CLFLUSH instruction\r\n PSN-Processor Serial Number\r\n PSE-36 - Page Size Extension\r\n PAT-Page Attribute Table\r\n CMOV-Conditional Move/Compare Instruction\r\n MCA-Machine Check Architecture\r\n PGE-PTE Global Bit\r\n MTRR-Memory Type Range Registers\r\n SEP-SYSENTER and SYSEXIT\r\n APIC-APIC on Chip\r\n CX8-CMPXCHG8B Inst.\r\n MCE-Machine Check Exception\r\n PAE-Physical Address Extensions\r\n MSR-RDMSR and WRMSR Support\r\n TSC-Time Stamp Counter\r\n PSE-Page Size Extensions\r\n DE-Debugging Extensions\r\n VME-Virtual-8086 Mode Enhancement\r\n FPU-x87 FPU on Chip\r\n\r\n Reserved\r\n OM16523\r\n\r\n\r\n\r\n Figure 3-8. Feature Information Returned in the EDX Register\r\n\r\n\r\n\r\n\r\n\r\n Table 3-11. More on Feature Information Returned in the EDX Register\r\n Bit # Mnemonic Description\r\n 0 FPU Floating Point Unit On-Chip. The processor contains an x87 FPU.\r\n 1 VME Virtual 8086 Mode Enhancements. Virtual 8086 mode enhancements, including CR4.VME for controlling the\r\n feature, CR4.PVI for protected mode virtual interrupts, software interrupt indirection, expansion of the TSS\r\n with the software indirection bitmap, and EFLAGS.VIF and EFLAGS.VIP flags.\r\n 2 DE Debugging Extensions. Support for I/O breakpoints, including CR4.DE for controlling the feature, and optional\r\n trapping of accesses to DR4 and DR5.\r\n 3 PSE Page Size Extension. Large pages of size 4 MByte are supported, including CR4.PSE for controlling the\r\n feature, the defined dirty bit in PDE (Page Directory Entries), optional reserved bit trapping in CR3, PDEs, and\r\n PTEs.\r\n 4 TSC Time Stamp Counter. 
The RDTSC instruction is supported, including CR4.TSD for controlling privilege.\r\n 5 MSR Model Specific Registers RDMSR and WRMSR Instructions. The RDMSR and WRMSR instructions are\r\n supported. Some of the MSRs are implementation dependent.\r\n 6 PAE Physical Address Extension. Physical addresses greater than 32 bits are supported: extended page table\r\n entry formats, an extra level in the page translation tables is defined, 2-MByte pages are supported instead of\r\n 4 Mbyte pages if PAE bit is 1.\r\n 7 MCE Machine Check Exception. Exception 18 is defined for Machine Checks, including CR4.MCE for controlling the\r\n feature. This feature does not define the model-specific implementations of machine-check error logging,\r\n reporting, and processor shutdowns. Machine Check exception handlers may have to depend on processor\r\n version to do model specific processing of the exception, or test for the presence of the Machine Check feature.\r\n 8 CX8 CMPXCHG8B Instruction. The compare-and-exchange 8 bytes (64 bits) instruction is supported (implicitly\r\n locked and atomic).\r\n 9 APIC APIC On-Chip. The processor contains an Advanced Programmable Interrupt Controller (APIC), responding to\r\n memory mapped commands in the physical address range FFFE0000H to FFFE0FFFH (by default - some\r\n processors permit the APIC to be relocated).\r\n 10 Reserved Reserved\r\n 11 SEP SYSENTER and SYSEXIT Instructions. The SYSENTER and SYSEXIT and associated MSRs are supported.\r\n 12 MTRR Memory Type Range Registers. MTRRs are supported. The MTRRcap MSR contains feature bits that describe\r\n what memory types are supported, how many variable MTRRs are supported, and whether fixed MTRRs are\r\n supported.\r\n 13 PGE Page Global Bit. The global bit is supported in paging-structure entries that map a page, indicating TLB entries\r\n that are common to different processes and need not be flushed. The CR4.PGE bit controls this feature.\r\n 14 MCA Machine Check Architecture. 
A value of 1 indicates the Machine Check Architecture of reporting machine\r\n errors is supported. The MCG_CAP MSR contains feature bits describing how many banks of error reporting\r\n MSRs are supported.\r\n 15 CMOV Conditional Move Instructions. The conditional move instruction CMOV is supported. In addition, if x87 FPU is\r\n present as indicated by the CPUID.FPU feature bit, then the FCOMI and FCMOV instructions are supported\r\n 16 PAT Page Attribute Table. Page Attribute Table is supported. This feature augments the Memory Type Range\r\n Registers (MTRRs), allowing an operating system to specify attributes of memory accessed through a linear\r\n address on a 4KB granularity.\r\n 17 PSE-36 36-Bit Page Size Extension. 4-MByte pages addressing physical memory beyond 4 GBytes are supported with\r\n 32-bit paging. This feature indicates that upper bits of the physical address of a 4-MByte page are encoded in\r\n bits 20:13 of the page-directory entry. Such physical addresses are limited by MAXPHYADDR and may be up to\r\n 40 bits in size.\r\n 18 PSN Processor Serial Number. The processor supports the 96-bit processor identification number feature and the\r\n feature is enabled.\r\n 19 CLFSH CLFLUSH Instruction. CLFLUSH Instruction is supported.\r\n 20 Reserved Reserved\r\n\r\n\r\n\r\n Table 3-11. More on Feature Information Returned in the EDX Register (Contd.)\r\n Bit # Mnemonic Description\r\n 21 DS Debug Store. The processor supports the ability to write debug information into a memory resident buffer.\r\n This feature is used by the branch trace store (BTS) and precise event-based sampling (PEBS) facilities (see\r\n Chapter 23, \"Introduction to Virtual-Machine Extensions,\" in the Intel 64 and IA-32 Architectures Software\r\n Developer's Manual, Volume 3C).\r\n 22 ACPI Thermal Monitor and Software Controlled Clock Facilities. 
The processor implements internal MSRs that\r\n allow processor temperature to be monitored and processor performance to be modulated in predefined duty\r\n cycles under software control.\r\n 23 MMX Intel MMX Technology. The processor supports the Intel MMX technology.\r\n 24 FXSR FXSAVE and FXRSTOR Instructions. The FXSAVE and FXRSTOR instructions are supported for fast save and\r\n restore of the floating point context. Presence of this bit also indicates that CR4.OSFXSR is available for an\r\n operating system to indicate that it supports the FXSAVE and FXRSTOR instructions.\r\n 25 SSE SSE. The processor supports the SSE extensions.\r\n 26 SSE2 SSE2. The processor supports the SSE2 extensions.\r\n 27 SS Self Snoop. The processor supports the management of conflicting memory types by performing a snoop of its\r\n own cache structure for transactions issued to the bus.\r\n 28 HTT Max APIC IDs reserved field is Valid. A value of 0 for HTT indicates there is only a single logical processor in\r\n the package and software should assume only a single APIC ID is reserved. A value of 1 for HTT indicates the\r\n value in CPUID.1.EBX[23:16] (the Maximum number of addressable IDs for logical processors in this package) is\r\n valid for the package.\r\n 29 TM Thermal Monitor. The processor implements the thermal monitor automatic thermal control circuitry (TCC).\r\n 30 Reserved Reserved\r\n 31 PBE Pending Break Enable. The processor supports the use of the FERR#/PBE# pin when the processor is in the\r\n stop-clock state (STPCLK# is asserted) to signal the processor that an interrupt is pending and that the\r\n processor should return to normal operation to handle the interrupt. 
Bit 10 (PBE enable) in the\r\n IA32_MISC_ENABLE MSR enables this capability.\r\n\r\n\r\n\r\nINPUT EAX = 02H: TLB/Cache/Prefetch Information Returned in EAX, EBX, ECX, EDX\r\nWhen CPUID executes with EAX set to 02H, the processor returns information about the processor's internal TLBs,\r\ncache and prefetch hardware in the EAX, EBX, ECX, and EDX registers. The information is reported in encoded form\r\nand falls into the following categories:\r\n. The least-significant byte in register EAX (register AL) will always return 01H. Software should ignore this value\r\n and not interpret it as an informational descriptor.\r\n. The most significant bit (bit 31) of each register indicates whether the register contains valid information (set\r\n to 0) or is reserved (set to 1).\r\n. If a register contains valid information, the information is contained in 1-byte descriptors. There are four types\r\n of encoding values for the byte descriptor; the encoding type is noted in the second column of Table 3-12. Table\r\n 3-12 lists the encoding of these descriptors. Note that the order of descriptors in the EAX, EBX, ECX, and EDX\r\n registers is not defined; that is, specific bytes are not designated to contain descriptors for specific cache,\r\n prefetch, or TLB types. The descriptors may appear in any order. Note also that a processor may report a general\r\n descriptor type (FFH) and not report any byte descriptor of \"cache type\" via CPUID leaf 2.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-12. 
Encoding of CPUID Leaf 2 Descriptors\r\n Value Type Description\r\n 00H General Null descriptor, this byte contains no information\r\n 01H TLB Instruction TLB: 4 KByte pages, 4-way set associative, 32 entries\r\n 02H TLB Instruction TLB: 4 MByte pages, fully associative, 2 entries\r\n 03H TLB Data TLB: 4 KByte pages, 4-way set associative, 64 entries\r\n 04H TLB Data TLB: 4 MByte pages, 4-way set associative, 8 entries\r\n 05H TLB Data TLB1: 4 MByte pages, 4-way set associative, 32 entries\r\n 06H Cache 1st-level instruction cache: 8 KBytes, 4-way set associative, 32 byte line size\r\n 08H Cache 1st-level instruction cache: 16 KBytes, 4-way set associative, 32 byte line size\r\n 09H Cache 1st-level instruction cache: 32KBytes, 4-way set associative, 64 byte line size\r\n 0AH Cache 1st-level data cache: 8 KBytes, 2-way set associative, 32 byte line size\r\n 0BH TLB Instruction TLB: 4 MByte pages, 4-way set associative, 4 entries\r\n 0CH Cache 1st-level data cache: 16 KBytes, 4-way set associative, 32 byte line size\r\n 0DH Cache 1st-level data cache: 16 KBytes, 4-way set associative, 64 byte line size\r\n 0EH Cache 1st-level data cache: 24 KBytes, 6-way set associative, 64 byte line size\r\n 1DH Cache 2nd-level cache: 128 KBytes, 2-way set associative, 64 byte line size\r\n 21H Cache 2nd-level cache: 256 KBytes, 8-way set associative, 64 byte line size\r\n 22H Cache 3rd-level cache: 512 KBytes, 4-way set associative, 64 byte line size, 2 lines per sector\r\n 23H Cache 3rd-level cache: 1 MBytes, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 24H Cache 2nd-level cache: 1 MBytes, 16-way set associative, 64 byte line size\r\n 25H Cache 3rd-level cache: 2 MBytes, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 29H Cache 3rd-level cache: 4 MBytes, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 2CH Cache 1st-level data cache: 32 KBytes, 8-way set associative, 64 byte line size\r\n 30H Cache 1st-level instruction cache: 
32 KBytes, 8-way set associative, 64 byte line size\r\n 40H Cache No 2nd-level cache or, if processor contains a valid 2nd-level cache, no 3rd-level cache\r\n 41H Cache 2nd-level cache: 128 KBytes, 4-way set associative, 32 byte line size\r\n 42H Cache 2nd-level cache: 256 KBytes, 4-way set associative, 32 byte line size\r\n 43H Cache 2nd-level cache: 512 KBytes, 4-way set associative, 32 byte line size\r\n 44H Cache 2nd-level cache: 1 MByte, 4-way set associative, 32 byte line size\r\n 45H Cache 2nd-level cache: 2 MByte, 4-way set associative, 32 byte line size\r\n 46H Cache 3rd-level cache: 4 MByte, 4-way set associative, 64 byte line size\r\n 47H Cache 3rd-level cache: 8 MByte, 8-way set associative, 64 byte line size\r\n 48H Cache 2nd-level cache: 3MByte, 12-way set associative, 64 byte line size\r\n 49H Cache 3rd-level cache: 4MB, 16-way set associative, 64-byte line size (Intel Xeon processor MP, Family 0FH, Model\r\n 06H);\r\n 2nd-level cache: 4 MByte, 16-way set associative, 64 byte line size\r\n 4AH Cache 3rd-level cache: 6MByte, 12-way set associative, 64 byte line size\r\n 4BH Cache 3rd-level cache: 8MByte, 16-way set associative, 64 byte line size\r\n 4CH Cache 3rd-level cache: 12MByte, 12-way set associative, 64 byte line size\r\n 4DH Cache 3rd-level cache: 16MByte, 16-way set associative, 64 byte line size\r\n 4EH Cache 2nd-level cache: 6MByte, 24-way set associative, 64 byte line size\r\n 4FH TLB Instruction TLB: 4 KByte pages, 32 entries\r\n\r\n\r\n\r\n\r\n Table 3-12. 
Encoding of CPUID Leaf 2 Descriptors (Contd.)\r\n Value Type Description\r\n 50H TLB Instruction TLB: 4 KByte and 2-MByte or 4-MByte pages, 64 entries\r\n 51H TLB Instruction TLB: 4 KByte and 2-MByte or 4-MByte pages, 128 entries\r\n 52H TLB Instruction TLB: 4 KByte and 2-MByte or 4-MByte pages, 256 entries\r\n 55H TLB Instruction TLB: 2-MByte or 4-MByte pages, fully associative, 7 entries\r\n 56H TLB Data TLB0: 4 MByte pages, 4-way set associative, 16 entries\r\n 57H TLB Data TLB0: 4 KByte pages, 4-way associative, 16 entries\r\n 59H TLB Data TLB0: 4 KByte pages, fully associative, 16 entries\r\n 5AH TLB Data TLB0: 2 MByte or 4 MByte pages, 4-way set associative, 32 entries\r\n 5BH TLB Data TLB: 4 KByte and 4 MByte pages, 64 entries\r\n 5CH TLB Data TLB: 4 KByte and 4 MByte pages,128 entries\r\n 5DH TLB Data TLB: 4 KByte and 4 MByte pages,256 entries\r\n 60H Cache 1st-level data cache: 16 KByte, 8-way set associative, 64 byte line size\r\n 61H TLB Instruction TLB: 4 KByte pages, fully associative, 48 entries\r\n 63H TLB Data TLB: 2 MByte or 4 MByte pages, 4-way set associative, 32 entries and a separate array with 1 GByte\r\n pages, 4-way set associative, 4 entries\r\n 64H TLB Data TLB: 4 KByte pages, 4-way set associative, 512 entries\r\n 66H Cache 1st-level data cache: 8 KByte, 4-way set associative, 64 byte line size\r\n 67H Cache 1st-level data cache: 16 KByte, 4-way set associative, 64 byte line size\r\n 68H Cache 1st-level data cache: 32 KByte, 4-way set associative, 64 byte line size\r\n 6AH Cache uTLB: 4 KByte pages, 8-way set associative, 64 entries\r\n 6BH Cache DTLB: 4 KByte pages, 8-way set associative, 256 entries\r\n 6CH Cache DTLB: 2M/4M pages, 8-way set associative, 128 entries\r\n 6DH Cache DTLB: 1 GByte pages, fully associative, 16 entries\r\n 70H Cache Trace cache: 12 K-muop, 8-way set associative\r\n 71H Cache Trace cache: 16 K-muop, 8-way set associative\r\n 72H Cache Trace cache: 32 K-muop, 8-way set associative\r\n 76H TLB Instruction TLB: 
2M/4M pages, fully associative, 8 entries\r\n 78H Cache 2nd-level cache: 1 MByte, 4-way set associative, 64byte line size\r\n 79H Cache 2nd-level cache: 128 KByte, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 7AH Cache 2nd-level cache: 256 KByte, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 7BH Cache 2nd-level cache: 512 KByte, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 7CH Cache 2nd-level cache: 1 MByte, 8-way set associative, 64 byte line size, 2 lines per sector\r\n 7DH Cache 2nd-level cache: 2 MByte, 8-way set associative, 64byte line size\r\n 7FH Cache 2nd-level cache: 512 KByte, 2-way set associative, 64-byte line size\r\n 80H Cache 2nd-level cache: 512 KByte, 8-way set associative, 64-byte line size\r\n 82H Cache 2nd-level cache: 256 KByte, 8-way set associative, 32 byte line size\r\n 83H Cache 2nd-level cache: 512 KByte, 8-way set associative, 32 byte line size\r\n 84H Cache 2nd-level cache: 1 MByte, 8-way set associative, 32 byte line size\r\n 85H Cache 2nd-level cache: 2 MByte, 8-way set associative, 32 byte line size\r\n 86H Cache 2nd-level cache: 512 KByte, 4-way set associative, 64 byte line size\r\n 87H Cache 2nd-level cache: 1 MByte, 8-way set associative, 64 byte line size\r\n\r\n\r\n\r\n\r\n Table 3-12. 
Encoding of CPUID Leaf 2 Descriptors (Contd.)\r\n Value Type Description\r\n A0H DTLB DTLB: 4k pages, fully associative, 32 entries\r\n B0H TLB Instruction TLB: 4 KByte pages, 4-way set associative, 128 entries\r\n B1H TLB Instruction TLB: 2M pages, 4-way, 8 entries or 4M pages, 4-way, 4 entries\r\n B2H TLB Instruction TLB: 4KByte pages, 4-way set associative, 64 entries\r\n B3H TLB Data TLB: 4 KByte pages, 4-way set associative, 128 entries\r\n B4H TLB Data TLB1: 4 KByte pages, 4-way associative, 256 entries\r\n B5H TLB Instruction TLB: 4KByte pages, 8-way set associative, 64 entries\r\n B6H TLB Instruction TLB: 4KByte pages, 8-way set associative, 128 entries\r\n BAH TLB Data TLB1: 4 KByte pages, 4-way associative, 64 entries\r\n C0H TLB Data TLB: 4 KByte and 4 MByte pages, 4-way associative, 8 entries\r\n C1H STLB Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries\r\n C2H DTLB DTLB: 4 KByte/2 MByte pages, 4-way associative, 16 entries\r\n C3H STLB Shared 2nd-Level TLB: 4 KByte /2 MByte pages, 6-way associative, 1536 entries. 
Also 1 GByte pages, 4-way,\r\n 16 entries.\r\n C4H DTLB DTLB: 2M/4M Byte pages, 4-way associative, 32 entries\r\n CAH STLB Shared 2nd-Level TLB: 4 KByte pages, 4-way associative, 512 entries\r\n D0H Cache 3rd-level cache: 512 KByte, 4-way set associative, 64 byte line size\r\n D1H Cache 3rd-level cache: 1 MByte, 4-way set associative, 64 byte line size\r\n D2H Cache 3rd-level cache: 2 MByte, 4-way set associative, 64 byte line size\r\n D6H Cache 3rd-level cache: 1 MByte, 8-way set associative, 64 byte line size\r\n D7H Cache 3rd-level cache: 2 MByte, 8-way set associative, 64 byte line size\r\n D8H Cache 3rd-level cache: 4 MByte, 8-way set associative, 64 byte line size\r\n DCH Cache 3rd-level cache: 1.5 MByte, 12-way set associative, 64 byte line size\r\n DDH Cache 3rd-level cache: 3 MByte, 12-way set associative, 64 byte line size\r\n DEH Cache 3rd-level cache: 6 MByte, 12-way set associative, 64 byte line size\r\n E2H Cache 3rd-level cache: 2 MByte, 16-way set associative, 64 byte line size\r\n E3H Cache 3rd-level cache: 4 MByte, 16-way set associative, 64 byte line size\r\n E4H Cache 3rd-level cache: 8 MByte, 16-way set associative, 64 byte line size\r\n EAH Cache 3rd-level cache: 12MByte, 24-way set associative, 64 byte line size\r\n EBH Cache 3rd-level cache: 18MByte, 24-way set associative, 64 byte line size\r\n ECH Cache 3rd-level cache: 24MByte, 24-way set associative, 64 byte line size\r\n F0H Prefetch 64-Byte prefetching\r\n F1H Prefetch 128-Byte prefetching\r\n FFH General CPUID leaf 2 does not report cache descriptor information, use CPUID leaf 4 to query cache parameters\r\n\r\n\r\nExample 3-1. Example of Cache and TLB Interpretation\r\nThe first member of the family of Pentium 4 processors returns the following information about caches and TLBs\r\nwhen CPUID executes with an input value of 2:\r\n\r\n EAX 66 5B 50 01H\r\n EBX 0H\r\n ECX 0H\r\n EDX 00 7A 70 00H\r\n\r\n\r\n\r\n\r\nWhich means:\r\n. 
The least-significant byte (byte 0) of register EAX is set to 01H. This value should be ignored.\r\n. The most-significant bit of all four registers (EAX, EBX, ECX, and EDX) is set to 0, indicating that each register\r\n contains valid 1-byte descriptors.\r\n. Bytes 1, 2, and 3 of register EAX indicate that the processor has:\r\n - 50H - a 64-entry instruction TLB, for mapping 4-KByte and 2-MByte or 4-MByte pages.\r\n - 5BH - a 64-entry data TLB, for mapping 4-KByte and 4-MByte pages.\r\n - 66H - an 8-KByte 1st level data cache, 4-way set associative, with a 64-Byte cache line size.\r\n. The descriptors in registers EBX and ECX are valid, but contain NULL descriptors.\r\n. Bytes 0, 1, 2, and 3 of register EDX indicate that the processor has:\r\n - 00H - NULL descriptor.\r\n - 70H - Trace cache: 12 K-muop, 8-way set associative.\r\n - 7AH - a 256-KByte 2nd level cache, 8-way set associative, with a sectored, 64-byte cache line size.\r\n - 00H - NULL descriptor.\r\n\r\nINPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level\r\nWhen CPUID executes with EAX set to 04H and ECX contains an index value, the processor returns encoded data\r\nthat describe a set of deterministic cache parameters (for the cache level associated with the input in ECX). Valid\r\nindex values start from 0.\r\nSoftware can enumerate the deterministic cache parameters for each level of the cache hierarchy starting with an\r\nindex value of 0, until the parameters report the value associated with the cache type field is 0. The architecturally\r\ndefined fields reported by deterministic cache parameters are documented in Table 3-8.\r\nThis Cache Size in Bytes\r\n= (Ways + 1) * (Partitions + 1) * (Line_Size + 1) * (Sets + 1)\r\n= (EBX[31:22] + 1) * (EBX[21:12] + 1) * (EBX[11:0] + 1) * (ECX + 1)\r\n\r\n\r\nThe CPUID leaf 04H also reports data that can be used to derive the topology of processor cores in a physical\r\npackage. This information is constant for all valid index values. 
Software can query the raw data reported by\r\nexecuting CPUID with EAX=04H and ECX=0 and use it as part of the topology enumeration algorithm described in\r\nChapter 8, \"Multiple-Processor Management,\" in the Intel 64 and IA-32 Architectures Software Developer's\r\nManual, Volume 3A.\r\n\r\nINPUT EAX = 05H: Returns MONITOR and MWAIT Features\r\nWhen CPUID executes with EAX set to 05H, the processor returns information about features available to\r\nMONITOR/MWAIT instructions. The MONITOR instruction is used for address-range monitoring in conjunction with\r\nMWAIT instruction. The MWAIT instruction optionally provides additional extensions for advanced power manage-\r\nment. See Table 3-8.\r\n\r\nINPUT EAX = 06H: Returns Thermal and Power Management Features\r\nWhen CPUID executes with EAX set to 06H, the processor returns information about thermal and power manage-\r\nment features. See Table 3-8.\r\n\r\nINPUT EAX = 07H: Returns Structured Extended Feature Enumeration Information\r\nWhen CPUID executes with EAX set to 07H and ECX = 0, the processor returns information about the maximum\r\ninput value for sub-leaves that contain extended feature flags. See Table 3-8.\r\nWhen CPUID executes with EAX set to 07H and the input value of ECX is invalid (see leaf 07H entry in Table 3-8),\r\nthe processor returns 0 in EAX/EBX/ECX/EDX. In subleaf 0, EAX returns the maximum input value of the highest\r\nleaf 7 sub-leaf, and EBX, ECX & EDX contain information of extended feature flags.\r\n\r\n\r\n\r\n\r\nINPUT EAX = 09H: Returns Direct Cache Access Information\r\nWhen CPUID executes with EAX set to 09H, the processor returns information about Direct Cache Access capabili-\r\nties. See Table 3-8.\r\n\r\nINPUT EAX = 0AH: Returns Architectural Performance Monitoring Features\r\nWhen CPUID executes with EAX set to 0AH, the processor returns information about support for architectural\r\nperformance monitoring capabilities. 
Architectural performance monitoring is supported if the version ID (see\r\nTable 3-8) is greater than 0. See Table 3-8.\r\nFor each version of architectural performance monitoring capability, software must enumerate this leaf to discover\r\nthe programming facilities and the architectural performance events available in the processor. The details are\r\ndescribed in Chapter 23, \"Introduction to Virtual-Machine Extensions,\" in the Intel 64 and IA-32 Architectures\r\nSoftware Developer's Manual, Volume 3C.\r\n\r\nINPUT EAX = 0BH: Returns Extended Topology Information\r\nWhen CPUID executes with EAX set to 0BH, the processor returns information about extended topology\r\nenumeration data. Software must detect the presence of CPUID leaf 0BH by verifying (a) the highest leaf index supported\r\nby CPUID is >= 0BH, and (b) CPUID.0BH:EBX[15:0] reports a non-zero value. See Table 3-8.\r\n\r\nINPUT EAX = 0DH: Returns Processor Extended States Enumeration Information\r\nWhen CPUID executes with EAX set to 0DH and ECX = 0, the processor returns information about the bit-vector\r\nrepresentation of all processor state extensions that are supported in the processor and storage size requirements\r\nof the XSAVE/XRSTOR area. See Table 3-8.\r\nWhen CPUID executes with EAX set to 0DH and ECX = n (n > 1, and is a valid sub-leaf index), the processor returns\r\ninformation about the size and offset of each processor extended state save area within the XSAVE/XRSTOR area.\r\nSee Table 3-8. 
Software can use the forward-extendable technique depicted below to query the valid sub-leaves\r\nand obtain size and offset information for each processor extended state save area:\r\n\r\nFor i = 2 to 62 // sub-leaf 1 is reserved\r\n IF (CPUID.(EAX=0DH, ECX=0):VECTOR[i] = 1 ) // VECTOR is the 64-bit value of EDX:EAX\r\n Execute CPUID.(EAX=0DH, ECX = i) to examine size and offset for sub-leaf i;\r\n FI;\r\n\r\nINPUT EAX = 0FH: Returns Intel Resource Director Technology (Intel RDT) Monitoring Enumeration Information\r\nWhen CPUID executes with EAX set to 0FH and ECX = 0, the processor returns information about the bit-vector\r\nrepresentation of QoS monitoring resource types that are supported in the processor and the maximum range of RMID\r\nvalues the processor can use to monitor any supported resource type. Each bit, starting from bit 1, corresponds\r\nto a specific resource type if the bit is set. The bit position corresponds to the sub-leaf index (or ResID) that\r\nsoftware must use to query QoS monitoring capability available for that type. See Table 3-8.\r\nWhen CPUID executes with EAX set to 0FH and ECX = n (n >= 1, and is a valid ResID), the processor returns\r\ninformation software can use to program IA32_PQR_ASSOC, IA32_QM_EVTSEL MSRs before reading QoS data from the\r\nIA32_QM_CTR MSR.\r\n\r\nINPUT EAX = 10H: Returns Intel Resource Director Technology (Intel RDT) Allocation Enumeration Information\r\nWhen CPUID executes with EAX set to 10H and ECX = 0, the processor returns information about the bit-vector\r\nrepresentation of QoS Enforcement resource types that are supported in the processor. Each bit, starting from bit\r\n1, corresponds to a specific resource type if the bit is set. The bit position corresponds to the sub-leaf index (or\r\nResID) that software must use to query QoS enforcement capability available for that type. 
See Table 3-8.\r\nWhen CPUID executes with EAX set to 10H and ECX = n (n >= 1, and is a valid ResID), the processor returns infor-\r\nmation about available classes of service and range of QoS mask MSRs that software can use to configure each\r\nclass of services using capability bit masks in the QoS Mask registers, IA32_resourceType_Mask_n.\r\n\r\n\r\n\r\n\r\n\r\nINPUT EAX = 12H: Returns Intel SGX Enumeration Information\r\nWhen CPUID executes with EAX set to 12H and ECX = 0H, the processor returns information about Intel SGX capa-\r\nbilities. See Table 3-8.\r\nWhen CPUID executes with EAX set to 12H and ECX = 1H, the processor returns information about Intel SGX attri-\r\nbutes. See Table 3-8.\r\nWhen CPUID executes with EAX set to 12H and ECX = n (n > 1), the processor returns information about Intel SGX\r\nEnclave Page Cache. See Table 3-8.\r\n\r\nINPUT EAX = 14H: Returns Intel Processor Trace Enumeration Information\r\nWhen CPUID executes with EAX set to 14H and ECX = 0H, the processor returns information about Intel Processor\r\nTrace extensions. See Table 3-8.\r\nWhen CPUID executes with EAX set to 14H and ECX = n (n > 0 and less than the number of non-zero bits in\r\nCPUID.(EAX=14H, ECX= 0H).EAX), the processor returns information about packet generation in Intel Processor\r\nTrace. See Table 3-8.\r\n\r\nINPUT EAX = 15H: Returns Time Stamp Counter and Nominal Core Crystal Clock Information\r\nWhen CPUID executes with EAX set to 15H and ECX = 0H, the processor returns information about Time Stamp\r\nCounter and Core Crystal Clock. See Table 3-8.\r\n\r\nINPUT EAX = 16H: Returns Processor Frequency Information\r\nWhen CPUID executes with EAX set to 16H, the processor returns information about Processor Frequency Informa-\r\ntion. See Table 3-8.\r\n\r\nINPUT EAX = 17H: Returns System-On-Chip Information\r\nWhen CPUID executes with EAX set to 17H, the processor returns information about the System-On-Chip Vendor\r\nAttribute Enumeration. 
See Table 3-8.\r\n\r\nMETHODS FOR RETURNING BRANDING INFORMATION\r\nUse the following techniques to access branding information:\r\n1. Processor brand string method.\r\n2. Processor brand index; this method uses a software supplied brand string table.\r\nThese two methods are discussed in the following sections. For methods that are available in early processors, see\r\nSection: \"Identification of Earlier IA-32 Processors\" in Chapter 19 of the Intel 64 and IA-32 Architectures\r\nSoftware Developer's Manual, Volume 1.\r\n\r\nThe Processor Brand String Method\r\nFigure 3-9 describes the algorithm used for detection of the brand string. Processor brand identification software\r\nshould execute this algorithm on all Intel 64 and IA-32 processors.\r\nThis method (introduced with Pentium 4 processors) returns an ASCII brand identification string and the Processor\r\nBase frequency of the processor to the EAX, EBX, ECX, and EDX registers.\r\n\r\n Input: EAX = 0x80000000; execute CPUID.\r\n IF (EAX & 0x80000000) is False: Processor Brand String Not Supported.\r\n IF (EAX & 0x80000000) is True: Extended CPUID Function Supported; EAX Return Value = Max. Extended CPUID Function Index.\r\n IF (EAX Return Value >= 0x80000004) is True: Processor Brand String Supported.\r\n\r\n Figure 3-9. Determination of Support for the Processor Brand String\r\n\r\n\r\nHow Brand Strings Work\r\nTo use the brand string method, execute CPUID with EAX input of 80000002H through 80000004H. For each input\r\nvalue, CPUID returns 16 ASCII characters using EAX, EBX, ECX, and EDX. The returned string will be\r\nNULL-terminated.\r\n\r\n\r\n\r\n\r\n\r\nTable 3-13 shows the brand string that is returned by the first processor in the Pentium 4 processor family.\r\n\r\n Table 3-13. 
Processor Brand String Returned with Pentium 4 Processor\r\n EAX Input Value Return Values ASCII Equivalent\r\n 80000002H EAX = 20202020H \" \"\r\n EBX = 20202020H \" \"\r\n ECX = 20202020H \" \"\r\n EDX = 6E492020H \"nI \"\r\n 80000003H EAX = 286C6574H \"(let\"\r\n EBX = 50202952H \"P )R\"\r\n ECX = 69746E65H \"itne\"\r\n EDX = 52286D75H \"R(mu\"\r\n 80000004H EAX = 20342029H \" 4 )\"\r\n EBX = 20555043H \" UPC\"\r\n ECX = 30303531H \"0051\"\r\n EDX = 007A484DH \"\\0zHM\"\r\n\r\n\r\n\r\nExtracting the Processor Frequency from Brand Strings\r\nFigure 3-10 provides an algorithm which software can use to extract the Processor Base frequency from the\r\nprocessor brand string.\r\n\r\n Scan \"Brand String\" in Reverse Byte Order.\r\n Match Substring \"zHM\", or \"zHG\", or \"zHT\".\r\n IF Substring Matched is False: Report Error.\r\n IF Substring Matched is True: Determine \"Freq\" and \"Multiplier\".\r\n Determine \"Multiplier\": If \"zHM\", Multiplier = 1 x 10^6; If \"zHG\", Multiplier = 1 x 10^9; If \"zHT\", Multiplier = 1 x 10^12.\r\n Determine \"Freq\": Scan Digits Until Blank In Reverse Order; Reverse Digits To Decimal Value (\"Freq\" = X.YZ if Digits = \"ZY.X\").\r\n Processor Base Frequency = \"Freq\" x \"Multiplier\".\r\n\r\n Figure 3-10. Algorithm for Extracting Processor Frequency\r\n\r\n\r\n\r\n\r\nThe Processor Brand Index Method\r\nThe brand index method (introduced with Pentium III Xeon processors) provides an entry point into a brand\r\nidentification table that is maintained in memory by system software and is accessible from system- and user-level\r\ncode. In this table, each brand index is associated with an ASCII brand identification string that identifies the official\r\nIntel family and model number of a processor.\r\nWhen CPUID executes with EAX set to 1, the processor returns a brand index to the low byte in EBX. 
Software can\r\nthen use this index to locate the brand identification string for the processor in the brand identification table. The\r\nfirst entry (brand index 0) in this table is reserved, allowing for backward compatibility with processors that do not\r\nsupport the brand identification feature. Starting with processor signature family ID = 0FH, model = 03H, brand\r\nindex method is no longer supported. Use brand string method instead.\r\nTable 3-14 shows brand indices that have identification strings associated with them.\r\n Table 3-14. Mapping of Brand Indices; and Intel 64 and IA-32 Processor Brand Strings\r\n Brand Index Brand String\r\n 00H This processor does not support the brand identification feature\r\n 01H Intel(R) Celeron(R) processor1\r\n 02H Intel(R) Pentium(R) III processor1\r\n 03H Intel(R) Pentium(R) III Xeon(R) processor; If processor signature = 000006B1h, then Intel(R) Celeron(R)\r\n processor\r\n 04H Intel(R) Pentium(R) III processor\r\n 06H Mobile Intel(R) Pentium(R) III processor-M\r\n 07H Mobile Intel(R) Celeron(R) processor1\r\n 08H Intel(R) Pentium(R) 4 processor\r\n 09H Intel(R) Pentium(R) 4 processor\r\n 0AH Intel(R) Celeron(R) processor1\r\n 0BH Intel(R) Xeon(R) processor; If processor signature = 00000F13h, then Intel(R) Xeon(R) processor MP\r\n 0CH Intel(R) Xeon(R) processor MP\r\n 0EH Mobile Intel(R) Pentium(R) 4 processor-M; If processor signature = 00000F13h, then Intel(R) Xeon(R) processor\r\n 0FH Mobile Intel(R) Celeron(R) processor1\r\n 11H Mobile Genuine Intel(R) processor\r\n 12H Intel(R) Celeron(R) M processor\r\n 13H Mobile Intel(R) Celeron(R) processor1\r\n 14H Intel(R) Celeron(R) processor\r\n 15H Mobile Genuine Intel(R) processor\r\n 16H Intel(R) Pentium(R) M processor\r\n 17H Mobile Intel(R) Celeron(R) processor1\r\n 18H - 0FFH RESERVED\r\n NOTES:\r\n 1. 
Indicates versions of these processors that were introduced after the Pentium III\r\n\r\nIA-32 Architecture Compatibility\r\nCPUID is not supported in early models of the Intel486 processor or in any IA-32 processor earlier than the\r\nIntel486 processor.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nIA32_BIOS_SIGN_ID MSR <- Update with installed microcode revision number;\r\n\r\nCASE (EAX) OF\r\n EAX = 0:\r\n EAX <- Highest basic function input value understood by CPUID;\r\n EBX <- Vendor identification string;\r\n EDX <- Vendor identification string;\r\n ECX <- Vendor identification string;\r\n BREAK;\r\n EAX = 1H:\r\n EAX[3:0] <- Stepping ID;\r\n EAX[7:4] <- Model;\r\n EAX[11:8] <- Family;\r\n EAX[13:12] <- Processor type;\r\n EAX[15:14] <- Reserved;\r\n EAX[19:16] <- Extended Model;\r\n EAX[27:20] <- Extended Family;\r\n EAX[31:28] <- Reserved;\r\n EBX[7:0] <- Brand Index; (* Reserved if the value is zero. *)\r\n EBX[15:8] <- CLFLUSH Line Size;\r\n EBX[23:16] <- Reserved; (* Number of threads enabled = 2 if MT enable fuse set. *)\r\n EBX[31:24] <- Initial APIC ID;\r\n ECX <- Feature flags; (* See Figure 3-7. *)\r\n EDX <- Feature flags; (* See Figure 3-8. *)\r\n BREAK;\r\n EAX = 2H:\r\n EAX <- Cache and TLB information;\r\n EBX <- Cache and TLB information;\r\n ECX <- Cache and TLB information;\r\n EDX <- Cache and TLB information;\r\n BREAK;\r\n EAX = 3H:\r\n EAX <- Reserved;\r\n EBX <- Reserved;\r\n ECX <- ProcessorSerialNumber[31:0];\r\n (* Pentium III processors only, otherwise reserved. *)\r\n EDX <- ProcessorSerialNumber[63:32];\r\n (* Pentium III processors only, otherwise reserved. *)\r\n BREAK;\r\n EAX = 4H:\r\n EAX <- Deterministic Cache Parameters Leaf; (* See Table 3-8. *)\r\n EBX <- Deterministic Cache Parameters Leaf;\r\n ECX <- Deterministic Cache Parameters Leaf;\r\n EDX <- Deterministic Cache Parameters Leaf;\r\n BREAK;\r\n EAX = 5H:\r\n EAX <- MONITOR/MWAIT Leaf; (* See Table 3-8. 
*)\r\n EBX <- MONITOR/MWAIT Leaf;\r\n ECX <- MONITOR/MWAIT Leaf;\r\n EDX <- MONITOR/MWAIT Leaf;\r\n BREAK;\r\n\r\n\r\n\r\n\r\n\r\n EAX = 6H:\r\n EAX <- Thermal and Power Management Leaf; (* See Table 3-8. *)\r\n EBX <- Thermal and Power Management Leaf;\r\n ECX <- Thermal and Power Management Leaf;\r\n EDX <- Thermal and Power Management Leaf;\r\n BREAK;\r\n EAX = 7H:\r\n EAX <- Structured Extended Feature Flags Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- Structured Extended Feature Flags Enumeration Leaf;\r\n ECX <- Structured Extended Feature Flags Enumeration Leaf;\r\n EDX <- Structured Extended Feature Flags Enumeration Leaf;\r\n BREAK;\r\n EAX = 8H:\r\n EAX <- Reserved = 0;\r\n EBX <- Reserved = 0;\r\n ECX <- Reserved = 0;\r\n EDX <- Reserved = 0;\r\n BREAK;\r\n EAX = 9H:\r\n EAX <- Direct Cache Access Information Leaf; (* See Table 3-8. *)\r\n EBX <- Direct Cache Access Information Leaf;\r\n ECX <- Direct Cache Access Information Leaf;\r\n EDX <- Direct Cache Access Information Leaf;\r\n BREAK;\r\n EAX = AH:\r\n EAX <- Architectural Performance Monitoring Leaf; (* See Table 3-8. *)\r\n EBX <- Architectural Performance Monitoring Leaf;\r\n ECX <- Architectural Performance Monitoring Leaf;\r\n EDX <- Architectural Performance Monitoring Leaf;\r\n BREAK;\r\n EAX = BH:\r\n EAX <- Extended Topology Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- Extended Topology Enumeration Leaf;\r\n ECX <- Extended Topology Enumeration Leaf;\r\n EDX <- Extended Topology Enumeration Leaf;\r\n BREAK;\r\n EAX = CH:\r\n EAX <- Reserved = 0;\r\n EBX <- Reserved = 0;\r\n ECX <- Reserved = 0;\r\n EDX <- Reserved = 0;\r\n BREAK;\r\n EAX = DH:\r\n EAX <- Processor Extended State Enumeration Leaf; (* See Table 3-8. 
*)\r\n EBX <- Processor Extended State Enumeration Leaf;\r\n ECX <- Processor Extended State Enumeration Leaf;\r\n EDX <- Processor Extended State Enumeration Leaf;\r\n BREAK;\r\n EAX = EH:\r\n EAX <- Reserved = 0;\r\n EBX <- Reserved = 0;\r\n ECX <- Reserved = 0;\r\n EDX <- Reserved = 0;\r\n BREAK;\r\n\r\n\r\n\r\n EAX = FH:\r\n EAX <- Intel Resource Director Technology Monitoring Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- Intel Resource Director Technology Monitoring Enumeration Leaf;\r\n ECX <- Intel Resource Director Technology Monitoring Enumeration Leaf;\r\n EDX <- Intel Resource Director Technology Monitoring Enumeration Leaf;\r\n BREAK;\r\n EAX = 10H:\r\n EAX <- Intel Resource Director Technology Allocation Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- Intel Resource Director Technology Allocation Enumeration Leaf;\r\n ECX <- Intel Resource Director Technology Allocation Enumeration Leaf;\r\n EDX <- Intel Resource Director Technology Allocation Enumeration Leaf;\r\n BREAK;\r\n EAX = 12H:\r\n EAX <- Intel SGX Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- Intel SGX Enumeration Leaf;\r\n ECX <- Intel SGX Enumeration Leaf;\r\n EDX <- Intel SGX Enumeration Leaf;\r\n BREAK;\r\n EAX = 14H:\r\n EAX <- Intel Processor Trace Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- Intel Processor Trace Enumeration Leaf;\r\n ECX <- Intel Processor Trace Enumeration Leaf;\r\n EDX <- Intel Processor Trace Enumeration Leaf;\r\n BREAK;\r\n EAX = 15H:\r\n EAX <- Time Stamp Counter and Nominal Core Crystal Clock Information Leaf; (* See Table 3-8. *)\r\n EBX <- Time Stamp Counter and Nominal Core Crystal Clock Information Leaf;\r\n ECX <- Time Stamp Counter and Nominal Core Crystal Clock Information Leaf;\r\n EDX <- Time Stamp Counter and Nominal Core Crystal Clock Information Leaf;\r\n BREAK;\r\n EAX = 16H:\r\n EAX <- Processor Frequency Information Enumeration Leaf; (* See Table 3-8. 
*)\r\n EBX <- Processor Frequency Information Enumeration Leaf;\r\n ECX <- Processor Frequency Information Enumeration Leaf;\r\n EDX <- Processor Frequency Information Enumeration Leaf;\r\n BREAK;\r\n EAX = 17H:\r\n EAX <- System-On-Chip Vendor Attribute Enumeration Leaf; (* See Table 3-8. *)\r\n EBX <- System-On-Chip Vendor Attribute Enumeration Leaf;\r\n ECX <- System-On-Chip Vendor Attribute Enumeration Leaf;\r\n EDX <- System-On-Chip Vendor Attribute Enumeration Leaf;\r\n BREAK;\r\n EAX = 80000000H:\r\n EAX <- Highest extended function input value understood by CPUID;\r\n EBX <- Reserved;\r\n ECX <- Reserved;\r\n EDX <- Reserved;\r\n BREAK;\r\n EAX = 80000001H:\r\n EAX <- Reserved;\r\n EBX <- Reserved;\r\n ECX <- Extended Feature Bits (* See Table 3-8.*);\r\n EDX <- Extended Feature Bits (* See Table 3-8. *);\r\n BREAK;\r\n\r\n\r\n\r\n EAX = 80000002H:\r\n EAX <- Processor Brand String;\r\n EBX <- Processor Brand String, continued;\r\n ECX <- Processor Brand String, continued;\r\n EDX <- Processor Brand String, continued;\r\n BREAK;\r\n EAX = 80000003H:\r\n EAX <- Processor Brand String, continued;\r\n EBX <- Processor Brand String, continued;\r\n ECX <- Processor Brand String, continued;\r\n EDX <- Processor Brand String, continued;\r\n BREAK;\r\n EAX = 80000004H:\r\n EAX <- Processor Brand String, continued;\r\n EBX <- Processor Brand String, continued;\r\n ECX <- Processor Brand String, continued;\r\n EDX <- Processor Brand String, continued;\r\n BREAK;\r\n EAX = 80000005H:\r\n EAX <- Reserved = 0;\r\n EBX <- Reserved = 0;\r\n ECX <- Reserved = 0;\r\n EDX <- Reserved = 0;\r\n BREAK;\r\n EAX = 80000006H:\r\n EAX <- Reserved = 0;\r\n EBX <- Reserved = 0;\r\n ECX <- Cache information;\r\n EDX <- Reserved = 0;\r\n BREAK;\r\n EAX = 80000007H:\r\n EAX <- Reserved = 0;\r\n EBX <- Reserved = 0;\r\n ECX <- Reserved = 0;\r\n EDX <- Reserved = Misc Feature Flags;\r\n BREAK;\r\n EAX = 80000008H:\r\n EAX <- Reserved = Physical Address Size Information;\r\n EBX <- 
Reserved = Virtual Address Size Information;\r\n ECX <- Reserved = 0;\r\n EDX <- Reserved = 0;\r\n BREAK;\r\n EAX >= 40000000H and EAX <= 4FFFFFFFH:\r\n DEFAULT: (* EAX = Value outside of recognized range for CPUID. *)\r\n (* If the highest basic information leaf data depend on ECX input value, ECX is honored.*)\r\n EAX <- Reserved; (* Information returned for highest basic information leaf. *)\r\n EBX <- Reserved; (* Information returned for highest basic information leaf. *)\r\n ECX <- Reserved; (* Information returned for highest basic information leaf. *)\r\n EDX <- Reserved; (* Information returned for highest basic information leaf. *)\r\n BREAK;\r\nESAC;\r\n\r\nFlags Affected\r\nNone.\r\n\r\n\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n In earlier IA-32 processors that do not support the CPUID instruction, execution of the instruc-\r\n tion results in an invalid opcode (#UD) exception being generated.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CPUID"
},
{
"description": "-R:CWD",
"mnem": "CQO"
},
{
"description": "CRC32 - Accumulate CRC32 Value\r\nOpcode/ Op/ 64-Bit Compat/ Description\r\nInstruction En Mode Leg Mode\r\nF2 0F 38 F0 /r RM Valid Valid Accumulate CRC32 on r/m8.\r\nCRC32 r32, r/m8\r\nF2 REX 0F 38 F0 /r RM Valid N.E. Accumulate CRC32 on r/m8.\r\nCRC32 r32, r/m8*\r\nF2 0F 38 F1 /r RM Valid Valid Accumulate CRC32 on r/m16.\r\nCRC32 r32, r/m16\r\nF2 0F 38 F1 /r RM Valid Valid Accumulate CRC32 on r/m32.\r\nCRC32 r32, r/m32\r\nF2 REX.W 0F 38 F0 /r RM Valid N.E. Accumulate CRC32 on r/m8.\r\nCRC32 r64, r/m8\r\nF2 REX.W 0F 38 F1 /r RM Valid N.E. Accumulate CRC32 on r/m64.\r\nCRC32 r64, r/m64\r\nNOTES:\r\n*In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nStarting with an initial value in the first operand (destination operand), accumulates a CRC32 (polynomial\r\n11EDC6F41H) value for the second operand (source operand) and stores the result in the destination operand. The\r\nsource operand can be a register or a memory location. The destination operand must be an r32 or r64 register. If\r\nthe destination is an r64 register, then the 32-bit result is stored in the least significant double word and\r\n00000000H is stored in the most significant double word of the r64 register.\r\nThe initial value supplied in the destination operand is a double word integer stored in the r32 register or the least\r\nsignificant double word of the r64 register. To incrementally accumulate a CRC32 value, software retains the result\r\nof the previous CRC32 operation in the destination operand, then executes the CRC32 instruction again with new\r\ninput data in the source operand. Data contained in the source operand is processed in reflected bit order. 
This\r\nmeans that the most significant bit of the source operand is treated as the least significant bit of the quotient, and\r\nso on, for all the bits of the source operand. Likewise, the result of the CRC operation is stored in the destination\r\noperand in reflected bit order. This means that the most significant bit of the resulting CRC (bit 31) is stored in the\r\nleast significant bit of the destination operand (bit 0), and so on, for all the bits of the CRC.\r\n\r\nOperation\r\n\r\nNotes:\r\n BIT_REFLECT64: DST[63-0] = SRC[0-63]\r\n BIT_REFLECT32: DST[31-0] = SRC[0-31]\r\n BIT_REFLECT16: DST[15-0] = SRC[0-15]\r\n BIT_REFLECT8: DST[7-0] = SRC[0-7]\r\n MOD2: Remainder from Polynomial division modulus 2\r\n\r\n\r\n\r\n\r\n\r\nCRC32 instruction for 64-bit source operand and 64-bit destination operand:\r\n\r\n TEMP1[63-0] <- BIT_REFLECT64 (SRC[63-0])\r\n TEMP2[31-0] <- BIT_REFLECT32 (DEST[31-0])\r\n TEMP3[95-0] <- TEMP1[63-0] << 32\r\n TEMP4[95-0] <- TEMP2[31-0] << 64\r\n TEMP5[95-0] <- TEMP3[95-0] XOR TEMP4[95-0]\r\n TEMP6[31-0] <- TEMP5[95-0] MOD2 11EDC6F41H\r\n DEST[31-0] <- BIT_REFLECT (TEMP6[31-0])\r\n DEST[63-32] <- 00000000H\r\nCRC32 instruction for 32-bit source operand and 32-bit destination operand:\r\n\r\n TEMP1[31-0] <- BIT_REFLECT32 (SRC[31-0])\r\n TEMP2[31-0] <- BIT_REFLECT32 (DEST[31-0])\r\n TEMP3[63-0] <- TEMP1[31-0] << 32\r\n TEMP4[63-0] <- TEMP2[31-0] << 32\r\n TEMP5[63-0] <- TEMP3[63-0] XOR TEMP4[63-0]\r\n TEMP6[31-0] <- TEMP5[63-0] MOD2 11EDC6F41H\r\n DEST[31-0] <- BIT_REFLECT (TEMP6[31-0])\r\nCRC32 instruction for 16-bit source operand and 32-bit destination operand:\r\n\r\n TEMP1[15-0] <- BIT_REFLECT16 (SRC[15-0])\r\n TEMP2[31-0] <- BIT_REFLECT32 (DEST[31-0])\r\n TEMP3[47-0] <- TEMP1[15-0] << 32\r\n TEMP4[47-0] <- TEMP2[31-0] << 16\r\n TEMP5[47-0] <- TEMP3[47-0] XOR TEMP4[47-0]\r\n TEMP6[31-0] <- TEMP5[47-0] MOD2 11EDC6F41H\r\n DEST[31-0] <- BIT_REFLECT (TEMP6[31-0])\r\nCRC32 instruction for 8-bit source operand and 64-bit destination 
operand:\r\n\r\n TEMP1[7-0] <- BIT_REFLECT8(SRC[7-0])\r\n TEMP2[31-0] <- BIT_REFLECT32 (DEST[31-0])\r\n TEMP3[39-0] <- TEMP1[7-0] << 32\r\n TEMP4[39-0] <- TEMP2[31-0] << 8\r\n TEMP5[39-0] <- TEMP3[39-0] XOR TEMP4[39-0]\r\n TEMP6[31-0] <- TEMP5[39-0] MOD2 11EDC6F41H\r\n DEST[31-0] <- BIT_REFLECT (TEMP6[31-0])\r\n DEST[63-32] <- 00000000H\r\nCRC32 instruction for 8-bit source operand and 32-bit destination operand:\r\n\r\n TEMP1[7-0] <- BIT_REFLECT8(SRC[7-0])\r\n TEMP2[31-0] <- BIT_REFLECT32 (DEST[31-0])\r\n TEMP3[39-0] <- TEMP1[7-0] << 32\r\n TEMP4[39-0] <- TEMP2[31-0] << 8\r\n TEMP5[39-0] <- TEMP3[39-0] XOR TEMP4[39-0]\r\n TEMP6[31-0] <- TEMP5[39-0] MOD2 11EDC6F41H\r\n DEST[31-0] <- BIT_REFLECT (TEMP6[31-0])\r\n\r\nFlags Affected\r\nNone\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nunsigned int _mm_crc32_u8( unsigned int crc, unsigned char data )\r\nunsigned int _mm_crc32_u16( unsigned int crc, unsigned short data )\r\nunsigned int _mm_crc32_u32( unsigned int crc, unsigned int data )\r\nunsigned __int64 _mm_crc32_u64( unsigned __int64 crc, unsigned __int64 data )\r\n\r\nSIMD Floating Point Exceptions\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS or GS segments.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF (fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.\r\n If LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP(0) If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#UD If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.\r\n If LOCK prefix is used.\r\n\r\nVirtual 8086 Mode Exceptions\r\n#GP(0) If any part of the operand lies outside of the effective address 
space from 0 to 0FFFFH.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF (fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.\r\n If LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in Protected Mode.\r\n\r\n64-Bit Mode Exceptions\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#PF (fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.\r\n If LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CRC32"
},
{
"description": "CVTDQ2PD-Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point\r\nValues\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F E6 /r RM V/V SSE2 Convert two packed signed doubleword integers from\r\n CVTDQ2PD xmm1, xmm2/m64 xmm2/mem to two packed double-precision floating-\r\n point values in xmm1.\r\n VEX.128.F3.0F.WIG E6 /r RM V/V AVX Convert two packed signed doubleword integers from\r\n VCVTDQ2PD xmm1, xmm2/m64 xmm2/mem to two packed double-precision floating-\r\n point values in xmm1.\r\n VEX.256.F3.0F.WIG E6 /r RM V/V AVX Convert four packed signed doubleword integers from\r\n VCVTDQ2PD ymm1, xmm2/m128 xmm2/mem to four packed double-precision floating-\r\n point values in ymm1.\r\n EVEX.128.F3.0F.W0 E6 /r HV V/V AVX512VL Convert 2 packed signed doubleword integers from\r\n VCVTDQ2PD xmm1 {k1}{z}, AVX512F xmm2/m128/m32bcst to 2 packed double-precision\r\n xmm2/m128/m32bcst floating-point values in xmm1 with writemask k1.\r\n EVEX.256.F3.0F.W0 E6 /r HV V/V AVX512VL Convert 4 packed signed doubleword integers from\r\n VCVTDQ2PD ymm1 {k1}{z}, AVX512F xmm2/m128/m32bcst to 4 packed double-precision\r\n xmm2/m128/m32bcst floating-point values in ymm1 with writemask k1.\r\n EVEX.512.F3.0F.W0 E6 /r HV V/V AVX512F Convert eight packed signed doubleword integers from\r\n VCVTDQ2PD zmm1 {k1}{z}, ymm2/m256/m32bcst to eight packed double-precision\r\n ymm2/m256/m32bcst floating-point values in zmm1 with writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n HV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two, four or eight packed signed doubleword integers in the source operand (the second operand) to two,\r\nfour or eight packed double-precision floating-point values in the destination operand (the first operand).\r\nEVEX encoded versions: The source operand can 
be a YMM/XMM/XMM (low 64 bits) register, a 256/128/64-bit\r\nmemory location or a 256/128/64-bit vector broadcasted from a 32-bit memory location. The destination operand\r\nis a ZMM/YMM/XMM register conditionally updated with writemask k1. Attempt to encode this instruction with EVEX\r\nembedded rounding is ignored.\r\nVEX.256 encoded version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is a YMM register.\r\nVEX.128 encoded version: The source operand is an XMM register or 64- bit memory location. The destination\r\noperand is a XMM register. The upper Bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 64- bit memory location. The destination\r\noperand is an XMM register. The upper Bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nunmodified.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\n\r\n SRC X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n DEST X3 X2 X1 X0\r\n\r\n\r\n\r\n Figure 3-11. 
CVTDQ2PD (VEX.256 encoded version)\r\n\r\n\r\nOperation\r\nVCVTDQ2PD (EVEX encoded versions) when src operand is a register\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n k <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+63:i] <-\r\n Convert_Integer_To_Double_Precision_Floating_Point(SRC[k+31:k])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTDQ2PD (EVEX encoded versions) when src operand is a memory source\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n k <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+63:i] <-\r\n Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])\r\n ELSE\r\n DEST[i+63:i] <-\r\n Convert_Integer_To_Double_Precision_Floating_Point(SRC[k+31:k])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTDQ2PD (VEX.256 encoded version)\r\nDEST[63:0] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])\r\nDEST[127:64] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32])\r\nDEST[191:128] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[95:64])\r\nDEST[255:192] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[127:96])\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVCVTDQ2PD (VEX.128 encoded version)\r\nDEST[63:0] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])\r\nDEST[127:64] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32])\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nCVTDQ2PD (128-bit Legacy SSE version)\r\nDEST[63:0] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])\r\nDEST[127:64] <- 
Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32])\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTDQ2PD __m512d _mm512_cvtepi32_pd( __m256i a);\r\nVCVTDQ2PD __m512d _mm512_mask_cvtepi32_pd( __m512d s, __mmask8 k, __m256i a);\r\nVCVTDQ2PD __m512d _mm512_maskz_cvtepi32_pd( __mmask8 k, __m256i a);\r\nVCVTDQ2PD __m256d _mm256_mask_cvtepi32_pd( __m256d s, __mmask8 k, __m256i a);\r\nVCVTDQ2PD __m256d _mm256_maskz_cvtepi32_pd( __mmask8 k, __m256i a);\r\nVCVTDQ2PD __m128d _mm_mask_cvtepi32_pd( __m128d s, __mmask8 k, __m128i a);\r\nVCVTDQ2PD __m128d _mm_maskz_cvtepi32_pd( __mmask8 k, __m128i a);\r\nCVTDQ2PD __m256d _mm256_cvtepi32_pd (__m128i src)\r\nCVTDQ2PD __m128d _mm_cvtepi32_pd (__m128i src)\r\n\r\n\r\n\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 5;\r\nEVEX-encoded instructions, see Exceptions Type E5.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTDQ2PD"
},
{
"description": "CVTDQ2PS-Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point\r\nValues\r\n Opcode Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 5B /r RM V/V SSE2 Convert four packed signed doubleword integers from\r\n CVTDQ2PS xmm1, xmm2/m128 xmm2/mem to four packed single-precision floating-\r\n point values in xmm1.\r\n VEX.128.0F.WIG 5B /r RM V/V AVX Convert four packed signed doubleword integers from\r\n VCVTDQ2PS xmm1, xmm2/m128 xmm2/mem to four packed single-precision floating-\r\n point values in xmm1.\r\n VEX.256.0F.WIG 5B /r RM V/V AVX Convert eight packed signed doubleword integers from\r\n VCVTDQ2PS ymm1, ymm2/m256 ymm2/mem to eight packed single-precision floating-\r\n point values in ymm1.\r\n EVEX.128.0F.W0 5B /r FV V/V AVX512VL Convert four packed signed doubleword integers from\r\n VCVTDQ2PS xmm1 {k1}{z}, AVX512F xmm2/m128/m32bcst to four packed single-precision\r\n xmm2/m128/m32bcst floating-point values in xmm1with writemask k1.\r\n EVEX.256.0F.W0 5B /r FV V/V AVX512VL Convert eight packed signed doubleword integers from\r\n VCVTDQ2PS ymm1 {k1}{z}, AVX512F ymm2/m256/m32bcst to eight packed single-precision\r\n ymm2/m256/m32bcst floating-point values in ymm1with writemask k1.\r\n EVEX.512.0F.W0 5B /r FV V/V AVX512F Convert sixteen packed signed doubleword integers\r\n VCVTDQ2PS zmm1 {k1}{z}, from zmm2/m512/m32bcst to sixteen packed single-\r\n zmm2/m512/m32bcst{er} precision floating-point values in zmm1with writemask\r\n k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n FV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts four, eight or sixteen packed signed doubleword integers in the source operand to four, eight or sixteen\r\npacked single-precision floating-point values in the destination operand.\r\nEVEX encoded versions: The source operand can be a ZMM/YMM/XMM 
register, a 512/256/128-bit memory loca-\r\ntion or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a\r\nZMM/YMM/XMM register conditionally updated with writemask k1.\r\nVEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination\r\noperand is a YMM register. Bits (MAX_VL-1:256) of the corresponding register destination are zeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is a XMM register. The upper bits (MAX_VL-1:128) of the corresponding register destination are zeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is an XMM register. The upper Bits (MAX_VL-1:128) of the corresponding register destination are unmod-\r\nified.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTDQ2PS (EVEX encoded versions) when SRC operand is a register\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nIF (VL = 512) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC); ; refer to Table 2-4 in the Intel Architecture Instruction Set Extensions Programming Reference\r\n ELSE\r\n SET_RM(MXCSR.RM); ; refer to Table 2-4 in the Intel Architecture Instruction Set Extensions Programming Reference\r\nFI;\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+31:i] <-\r\n Convert_Integer_To_Single_Precision_Floating_Point(SRC[i+31:i])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTDQ2PS (EVEX encoded versions) when SRC operand is a memory source\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <-j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 
1)\r\n THEN\r\n DEST[i+31:i] <-\r\n Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])\r\n ELSE\r\n DEST[i+31:i] <-\r\n Convert_Integer_To_Single_Precision_Floating_Point(SRC[i+31:i])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTDQ2PS (VEX.256 encoded version)\r\nDEST[31:0] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])\r\nDEST[63:32] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32])\r\nDEST[95:64] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[95:64])\r\nDEST[127:96] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[127:96])\r\nDEST[159:128] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[159:128])\r\nDEST[191:160] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[191:160])\r\nDEST[223:192] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[223:192])\r\nDEST[255:224] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[255:224])\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVCVTDQ2PS (VEX.128 encoded version)\r\nDEST[31:0] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])\r\nDEST[63:32] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32])\r\nDEST[95:64] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[95:64])\r\nDEST[127:96] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[127:96])\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nCVTDQ2PS (128-bit Legacy SSE version)\r\nDEST[31:0] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])\r\nDEST[63:32] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32])\r\nDEST[95:64] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[95:64])\r\nDEST[127:96] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[127:96])\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic 
Equivalent\r\nVCVTDQ2PS __m512 _mm512_cvtepi32_ps( __m512i a);\r\nVCVTDQ2PS __m512 _mm512_mask_cvtepi32_ps( __m512 s, __mmask16 k, __m512i a);\r\nVCVTDQ2PS __m512 _mm512_maskz_cvtepi32_ps( __mmask16 k, __m512i a);\r\nVCVTDQ2PS __m512 _mm512_cvt_roundepi32_ps( __m512i a, int r);\r\nVCVTDQ2PS __m512 _mm512_mask_cvt_roundepi32_ps( __m512 s, __mmask16 k, __m512i a, int r);\r\nVCVTDQ2PS __m512 _mm512_maskz_cvt_roundepi32_ps( __mmask16 k, __m512i a, int r);\r\nVCVTDQ2PS __m256 _mm256_mask_cvtepi32_ps( __m256 s, __mmask8 k, __m256i a);\r\nVCVTDQ2PS __m256 _mm256_maskz_cvtepi32_ps( __mmask8 k, __m256i a);\r\nVCVTDQ2PS __m128 _mm_mask_cvtepi32_ps( __m128 s, __mmask8 k, __m128i a);\r\nVCVTDQ2PS __m128 _mm_maskz_cvtepi32_ps( __mmask8 k, __m128i a);\r\nCVTDQ2PS __m256 _mm256_cvtepi32_ps (__m256i src)\r\nCVTDQ2PS __m128 _mm_cvtepi32_ps (__m128i src)\r\n\r\nSIMD Floating-Point Exceptions\r\nPrecision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2;\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTDQ2PS"
},
{
"description": "CVTPD2DQ-Convert Packed Double-Precision Floating-Point Values to Packed Doubleword\r\nIntegers\r\n Opcode Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F E6 /r RM V/V SSE2 Convert two packed double-precision floating-point\r\n CVTPD2DQ xmm1, xmm2/m128 values in xmm2/mem to two signed doubleword\r\n integers in xmm1.\r\n VEX.128.F2.0F.WIG E6 /r RM V/V AVX Convert two packed double-precision floating-point\r\n VCVTPD2DQ xmm1, xmm2/m128 values in xmm2/mem to two signed doubleword\r\n integers in xmm1.\r\n VEX.256.F2.0F.WIG E6 /r RM V/V AVX Convert four packed double-precision floating-point\r\n VCVTPD2DQ xmm1, ymm2/m256 values in ymm2/mem to four signed doubleword\r\n integers in xmm1.\r\n EVEX.128.F2.0F.W1 E6 /r FV V/V AVX512VL Convert two packed double-precision floating-point\r\n VCVTPD2DQ xmm1 {k1}{z}, AVX512F values in xmm2/m128/m64bcst to two signed\r\n xmm2/m128/m64bcst doubleword integers in xmm1 subject to writemask k1.\r\n EVEX.256.F2.0F.W1 E6 /r FV V/V AVX512VL Convert four packed double-precision floating-point\r\n VCVTPD2DQ xmm1 {k1}{z}, AVX512F values in ymm2/m256/m64bcst to four signed\r\n ymm2/m256/m64bcst doubleword integers in xmm1 subject to writemask k1.\r\n EVEX.512.F2.0F.W1 E6 /r FV V/V AVX512F Convert eight packed double-precision floating-point\r\n VCVTPD2DQ ymm1 {k1}{z}, values in zmm2/m512/m64bcst to eight signed\r\n zmm2/m512/m64bcst{er} doubleword integers in ymm1 subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n FV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts packed double-precision floating-point values in the source operand (second operand) to packed signed\r\ndoubleword integers in the destination operand (first operand).\r\nWhen a conversion is inexact, the value returned is rounded according to the rounding control bits in the 
MXCSR\r\nregister or the embedded rounding control bits. If a converted result cannot be represented in the destination\r\nformat, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value\r\n(2^(w-1), where w represents the number of bits in the destination format) is returned.\r\nEVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512-bit memory location, or a 512-bit\r\nvector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register condi-\r\ntionally updated with writemask k1. The upper bits (MAX_VL-1:256/128/64) of the corresponding destination are\r\nzeroed.\r\nVEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is a XMM register. The upper bits (MAX_VL-1:64) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is an XMM register. Bits[127:64] of the destination XMM register are zeroed. However, the upper bits\r\n(MAX_VL-1:128) of the corresponding ZMM register destination are unmodified.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\n\r\n SRC X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n DEST 0 X3 X2 X1 X0\r\n\r\n\r\n\r\n Figure 3-12. 
VCVTPD2DQ (VEX.256 encoded version)\r\n\r\n\r\nOperation\r\nVCVTPD2DQ (EVEX encoded versions) when src operand is a register\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nIF (VL = 512) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n k <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+31:i] <-\r\n Convert_Double_Precision_Floating_Point_To_Integer(SRC[k+63:k])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL/2] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTPD2DQ (EVEX encoded versions) when src operand is a memory source\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n k <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+31:i] <-\r\n Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])\r\n ELSE\r\n DEST[i+31:i] <-\r\n Convert_Double_Precision_Floating_Point_To_Integer(SRC[k+63:k])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL/2] <- 0\r\n\r\nVCVTPD2DQ (VEX.256 encoded version)\r\nDEST[31:0] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])\r\nDEST[63:32] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[127:64])\r\nDEST[95:64] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[191:128])\r\nDEST[127:96] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[255:192])\r\nDEST[MAX_VL-1:128]<-0\r\n\r\nVCVTPD2DQ (VEX.128 encoded version)\r\nDEST[31:0] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])\r\nDEST[63:32] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[127:64])\r\nDEST[MAX_VL-1:64]<-0\r\n\r\nCVTPD2DQ (128-bit Legacy SSE version)\r\nDEST[31:0] 
<-Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])\r\nDEST[63:32] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[127:64])\r\nDEST[127:64] <-0\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTPD2DQ __m256i _mm512_cvtpd_epi32( __m512d a);\r\nVCVTPD2DQ __m256i _mm512_mask_cvtpd_epi32( __m256i s, __mmask8 k, __m512d a);\r\nVCVTPD2DQ __m256i _mm512_maskz_cvtpd_epi32( __mmask8 k, __m512d a);\r\nVCVTPD2DQ __m256i _mm512_cvt_roundpd_epi32( __m512d a, int r);\r\nVCVTPD2DQ __m256i _mm512_mask_cvt_roundpd_epi32( __m256i s, __mmask8 k, __m512d a, int r);\r\nVCVTPD2DQ __m256i _mm512_maskz_cvt_roundpd_epi32( __mmask8 k, __m512d a, int r);\r\nVCVTPD2DQ __m128i _mm256_mask_cvtpd_epi32( __m128i s, __mmask8 k, __m256d a);\r\nVCVTPD2DQ __m128i _mm256_maskz_cvtpd_epi32( __mmask8 k, __m256d a);\r\nVCVTPD2DQ __m128i _mm_mask_cvtpd_epi32( __m128i s, __mmask8 k, __m128d a);\r\nVCVTPD2DQ __m128i _mm_maskz_cvtpd_epi32( __mmask8 k, __m128d a);\r\nVCVTPD2DQ __m128i _mm256_cvtpd_epi32 (__m256d src)\r\nCVTPD2DQ __m128i _mm_cvtpd_epi32 (__m128d src)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nSee Exceptions Type 2; additionally\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPD2DQ"
},
{
"description": "CVTPD2PI-Convert Packed Double-Precision FP Values to Packed Dword Integers\r\nOpcode/ Op/ 64-Bit Compat/ Description\r\nInstruction En Mode Leg Mode\r\n66 0F 2D /r RM Valid Valid Convert two packed double-precision floating-\r\nCVTPD2PI mm, xmm/m128 point values from xmm/m128 to two packed\r\n signed doubleword integers in mm.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two packed double-precision floating-point values in the source operand (second operand) to two packed\r\nsigned doubleword integers in the destination operand (first operand).\r\nThe source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX tech-\r\nnology register.\r\nWhen a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR\r\nregister. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid\r\nexception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.\r\nThis instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack\r\npointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). 
If this instruction is executed while an x87 FPU\r\nfloating-point exception is pending, the exception is handled before the CVTPD2PI instruction is executed.\r\nIn 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).\r\n\r\nOperation\r\nDEST[31:0] <- Convert_Double_Precision_Floating_Point_To_Integer32(SRC[63:0]);\r\nDEST[63:32] <- Convert_Double_Precision_Floating_Point_To_Integer32(SRC[127:64]);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nCVTPD2PI: __m64 _mm_cvtpd_pi32(__m128d a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision.\r\n\r\nOther Exceptions\r\nSee Table 22-4, \"Exception Conditions for Legacy SIMD/MMX Instructions with FP Exception and 16-Byte Align-\r\nment,\" in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPD2PI"
},
{
"description": "CVTPD2PS-Convert Packed Double-Precision Floating-Point Values to Packed Single-Precision\r\nFloating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 5A /r RM V/V SSE2 Convert two packed double-precision floating-point\r\n CVTPD2PS xmm1, xmm2/m128 values in xmm2/mem to two single-precision\r\n floating-point values in xmm1.\r\n VEX.128.66.0F.WIG 5A /r RM V/V AVX Convert two packed double-precision floating-point\r\n VCVTPD2PS xmm1, xmm2/m128 values in xmm2/mem to two single-precision\r\n floating-point values in xmm1.\r\n VEX.256.66.0F.WIG 5A /r RM V/V AVX Convert four packed double-precision floating-point\r\n VCVTPD2PS xmm1, ymm2/m256 values in ymm2/mem to four single-precision\r\n floating-point values in xmm1.\r\n EVEX.128.66.0F.W1 5A /r FV V/V AVX512VL Convert two packed double-precision floating-point\r\n VCVTPD2PS xmm1 {k1}{z}, AVX512F values in xmm2/m128/m64bcst to two single-\r\n xmm2/m128/m64bcst precision floating-point values in xmm1with\r\n writemask k1.\r\n EVEX.256.66.0F.W1 5A /r FV V/V AVX512VL Convert four packed double-precision floating-point\r\n VCVTPD2PS xmm1 {k1}{z}, AVX512F values in ymm2/m256/m64bcst to four single-\r\n ymm2/m256/m64bcst precision floating-point values in xmm1with\r\n writemask k1.\r\n EVEX.512.66.0F.W1 5A /r FV V/V AVX512F Convert eight packed double-precision floating-point\r\n VCVTPD2PS ymm1 {k1}{z}, values in zmm2/m512/m64bcst to eight single-\r\n zmm2/m512/m64bcst{er} precision floating-point values in ymm1with\r\n writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n FV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two, four or eight packed double-precision floating-point values in the source operand (second operand)\r\nto two, four or eight packed single-precision floating-point values in the destination operand 
(first operand).\r\nWhen a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR\r\nregister or the embedded rounding control bits.\r\nEVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or\r\na 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a\r\nYMM/XMM/XMM (low 64-bits) register conditionally updated with writemask k1. The upper bits\r\n(MAX_VL-1:256/128/64) of the corresponding destination are zeroed.\r\nVEX.256 encoded version: The source operand is a YMM register or 256-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:64) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. Bits[127:64] of the destination XMM register are zeroed. However, the upper bits\r\n(MAX_VL-1:128) of the corresponding ZMM register destination are unmodified.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\n SRC X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n DEST 0 X3 X2 X1 X0\r\n\r\n\r\n\r\n Figure 3-13. 
VCVTPD2PS (VEX.256 encoded version)\r\n\r\n\r\nOperation\r\nVCVTPD2PS (EVEX encoded version) when src operand is a register\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nIF (VL = 512) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n k <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n DEST[i+31:i] <- Convert_Double_Precision_Floating_Point_To_Single_Precision_Floating_Point(SRC[k+63:k])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL/2] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTPD2PS (EVEX encoded version) when src operand is a memory source\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n k <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+31:i] <-Convert_Double_Precision_Floating_Point_To_Single_Precision_Floating_Point(SRC[63:0])\r\n ELSE\r\n DEST[i+31:i] <- Convert_Double_Precision_Floating_Point_To_Single_Precision_Floating_Point(SRC[k+63:k])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL/2] <- 0\r\n\r\nVCVTPD2PS (VEX.256 encoded version)\r\nDEST[31:0] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0])\r\nDEST[63:32] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[127:64])\r\nDEST[95:64] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[191:128])\r\nDEST[127:96] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[255:192])\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVCVTPD2PS (VEX.128 encoded version)\r\nDEST[31:0] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0])\r\nDEST[63:32] <- 
Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[127:64])\r\nDEST[MAX_VL-1:64] <- 0\r\n\r\nCVTPD2PS (128-bit Legacy SSE version)\r\nDEST[31:0] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0])\r\nDEST[63:32] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[127:64])\r\nDEST[127:64] <- 0\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTPD2PS __m256 _mm512_cvtpd_ps( __m512d a);\r\nVCVTPD2PS __m256 _mm512_mask_cvtpd_ps( __m256 s, __mmask8 k, __m512d a);\r\nVCVTPD2PS __m256 _mm512_maskz_cvtpd_ps( __mmask8 k, __m512d a);\r\nVCVTPD2PS __m256 _mm512_cvt_roundpd_ps( __m512d a, int r);\r\nVCVTPD2PS __m256 _mm512_mask_cvt_roundpd_ps( __m256 s, __mmask8 k, __m512d a, int r);\r\nVCVTPD2PS __m256 _mm512_maskz_cvt_roundpd_ps( __mmask8 k, __m512d a, int r);\r\nVCVTPD2PS __m128 _mm256_mask_cvtpd_ps( __m128 s, __mmask8 k, __m256d a);\r\nVCVTPD2PS __m128 _mm256_maskz_cvtpd_ps( __mmask8 k, __m256d a);\r\nVCVTPD2PS __m128 _mm_mask_cvtpd_ps( __m128 s, __mmask8 k, __m128d a);\r\nVCVTPD2PS __m128 _mm_maskz_cvtpd_ps( __mmask8 k, __m128d a);\r\nVCVTPD2PS __m128 _mm256_cvtpd_ps (__m256d a)\r\nCVTPD2PS __m128 _mm_cvtpd_ps (__m128d a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision, Underflow, Overflow, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2;\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPD2PS"
},
{
"description": "CVTPI2PD-Convert Packed Dword Integers to Packed Double-Precision FP Values\r\nOpcode/ Op/ 64-Bit Compat/ Description\r\nInstruction En Mode Leg Mode\r\n66 0F 2A /r RM Valid Valid Convert two packed signed doubleword\r\nCVTPI2PD xmm, mm/m64* integers from mm/mem64 to two packed\r\n double-precision floating-point values in xmm.\r\nNOTES:\r\n*Operation is different for different operand sets; see the Description section.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two packed signed doubleword integers in the source operand (second operand) to two packed double-\r\nprecision floating-point values in the destination operand (first operand).\r\nThe source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an\r\nXMM register. In addition, depending on the operand configuration:\r\n. For operands xmm, mm: the instruction causes a transition from x87 FPU to MMX technology operation (that\r\n is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this\r\n instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before\r\n the CVTPI2PD instruction is executed.\r\n. 
For operands xmm, m64: the instruction does not cause a transition to MMX technology and does not take\r\n x87 FPU exceptions.\r\nIn 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).\r\n\r\nOperation\r\nDEST[63:0] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0]);\r\nDEST[127:64] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32]);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nCVTPI2PD: __m128d _mm_cvtpi32_pd(__m64 a)\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nSee Table 22-6, \"Exception Conditions for Legacy SIMD/MMX Instructions with XMM and without FP Exception,\" in\r\nthe Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPI2PD"
},
{
"description": "CVTPI2PS-Convert Packed Dword Integers to Packed Single-Precision FP Values\r\nOpcode/ Op/ 64-Bit Compat/ Description\r\nInstruction En Mode Leg Mode\r\n0F 2A /r RM Valid Valid Convert two signed doubleword integers\r\nCVTPI2PS xmm, mm/m64 from mm/m64 to two single-precision\r\n floating-point values in xmm.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two packed signed doubleword integers in the source operand (second operand) to two packed single-\r\nprecision floating-point values in the destination operand (first operand).\r\nThe source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an\r\nXMM register. The results are stored in the low quadword of the destination operand, and the high quadword\r\nremains unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control\r\nbits in the MXCSR register.\r\nThis instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack\r\npointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). 
If this instruction is executed while an x87 FPU\r\nfloating-point exception is pending, the exception is handled before the CVTPI2PS instruction is executed.\r\nIn 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).\r\n\r\nOperation\r\nDEST[31:0] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0]);\r\nDEST[63:32] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32]);\r\n(* High quadword of destination unchanged *)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nCVTPI2PS: __m128 _mm_cvtpi32_ps(__m128 a, __m64 b)\r\n\r\nSIMD Floating-Point Exceptions\r\nPrecision\r\n\r\nOther Exceptions\r\nSee Table 22-5, \"Exception Conditions for Legacy SIMD/MMX Instructions with XMM and FP Exception,\" in the\r\nIntel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPI2PS"
},
{
"description": "CVTPS2DQ-Convert Packed Single-Precision Floating-Point Values to Packed Signed\r\nDoubleword Integer Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 5B /r RM V/V SSE2 Convert four packed single-precision floating-point values\r\n CVTPS2DQ xmm1, xmm2/m128 from xmm2/mem to four packed signed doubleword\r\n values in xmm1.\r\n VEX.128.66.0F.WIG 5B /r RM V/V AVX Convert four packed single-precision floating-point values\r\n VCVTPS2DQ xmm1, xmm2/m128 from xmm2/mem to four packed signed doubleword\r\n values in xmm1.\r\n VEX.256.66.0F.WIG 5B /r RM V/V AVX Convert eight packed single-precision floating-point values\r\n VCVTPS2DQ ymm1, ymm2/m256 from ymm2/mem to eight packed signed doubleword\r\n values in ymm1.\r\n EVEX.128.66.0F.W0 5B /r FV V/V AVX512VL Convert four packed single precision floating-point values\r\n VCVTPS2DQ xmm1 {k1}{z}, AVX512F from xmm2/m128/m32bcst to four packed signed\r\n xmm2/m128/m32bcst doubleword values in xmm1 subject to writemask k1.\r\n EVEX.256.66.0F.W0 5B /r FV V/V AVX512VL Convert eight packed single precision floating-point values\r\n VCVTPS2DQ ymm1 {k1}{z}, AVX512F from ymm2/m256/m32bcst to eight packed signed\r\n ymm2/m256/m32bcst doubleword values in ymm1 subject to writemask k1.\r\n EVEX.512.66.0F.W0 5B /r FV V/V AVX512F Convert sixteen packed single-precision floating-point\r\n VCVTPS2DQ zmm1 {k1}{z}, values from zmm2/m512/m32bcst to sixteen packed\r\n zmm2/m512/m32bcst{er} signed doubleword values in zmm1 subject to writemask\r\n k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n FV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts four, eight or sixteen packed single-precision floating-point values in the source operand to four, eight or\r\nsixteen signed doubleword integers in the destination operand.\r\nWhen a conversion is inexact, the value 
returned is rounded according to the rounding control bits in the MXCSR\r\nregister or the embedded rounding control bits. If a converted result cannot be represented in the destination\r\nformat, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value\r\n(2^(w-1), where w represents the number of bits in the destination format) is returned.\r\nEVEX encoded versions: The source operand is a ZMM register, a 512-bit memory location or a 512-bit vector\r\nbroadcasted from a 32-bit memory location. The destination operand is a ZMM register conditionally updated with\r\nwritemask k1.\r\nVEX.256 encoded version: The source operand is a YMM register or 256-bit memory location. The destination\r\noperand is a YMM register. The upper bits (MAX_VL-1:256) of the corresponding ZMM register destination are\r\nzeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. 
The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nunmodified.\r\nVEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTPS2DQ (EVEX encoded versions) when src operand is a register\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nIF (VL = 512) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+31:i] <-\r\n Convert_Single_Precision_Floating_Point_To_Integer(SRC[i+31:i])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTPS2DQ (EVEX encoded versions) when src operand is a memory source\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+31:i] <-\r\n Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])\r\n ELSE\r\n DEST[i+31:i] <-\r\n Convert_Single_Precision_Floating_Point_To_Integer(SRC[i+31:i])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTPS2DQ (VEX.256 encoded version)\r\nDEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])\r\nDEST[63:32] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32])\r\nDEST[95:64] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[95:64])\r\nDEST[127:96] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[127:96])\r\nDEST[159:128] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[159:128])\r\nDEST[191:160] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[191:160])\r\nDEST[223:192] 
<-Convert_Single_Precision_Floating_Point_To_Integer(SRC[223:192])\r\nDEST[255:224] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[255:224])\r\n\r\nVCVTPS2DQ (VEX.128 encoded version)\r\nDEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])\r\nDEST[63:32] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32])\r\nDEST[95:64] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[95:64])\r\nDEST[127:96] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[127:96])\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nCVTPS2DQ (128-bit Legacy SSE version)\r\nDEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])\r\nDEST[63:32] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32])\r\nDEST[95:64] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[95:64])\r\nDEST[127:96] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[127:96])\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTPS2DQ __m512i _mm512_cvtps_epi32( __m512 a);\r\nVCVTPS2DQ __m512i _mm512_mask_cvtps_epi32( __m512i s, __mmask16 k, __m512 a);\r\nVCVTPS2DQ __m512i _mm512_maskz_cvtps_epi32( __mmask16 k, __m512 a);\r\nVCVTPS2DQ __m512i _mm512_cvt_roundps_epi32( __m512 a, int r);\r\nVCVTPS2DQ __m512i _mm512_mask_cvt_roundps_epi32( __m512i s, __mmask16 k, __m512 a, int r);\r\nVCVTPS2DQ __m512i _mm512_maskz_cvt_roundps_epi32( __mmask16 k, __m512 a, int r);\r\nVCVTPS2DQ __m256i _mm256_mask_cvtps_epi32( __m256i s, __mmask8 k, __m256 a);\r\nVCVTPS2DQ __m256i _mm256_maskz_cvtps_epi32( __mmask8 k, __m256 a);\r\nVCVTPS2DQ __m128i _mm_mask_cvtps_epi32( __m128i s, __mmask8 k, __m128 a);\r\nVCVTPS2DQ __m128i _mm_maskz_cvtps_epi32( __mmask8 k, __m128 a);\r\nVCVTPS2DQ __m256i _mm256_cvtps_epi32 (__m256 a)\r\nCVTPS2DQ __m128i _mm_cvtps_epi32 (__m128 a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2;\r\nEVEX-encoded instructions, see 
Exceptions Type E2.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPS2DQ"
},
{
"description": "CVTPS2PD-Convert Packed Single-Precision Floating-Point Values to Packed Double-Precision\r\nFloating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 5A /r RM V/V SSE2 Convert two packed single-precision floating-point values in\r\n CVTPS2PD xmm1, xmm2/m64 xmm2/m64 to two packed double-precision floating-point\r\n values in xmm1.\r\n VEX.128.0F.WIG 5A /r RM V/V AVX Convert two packed single-precision floating-point values in\r\n VCVTPS2PD xmm1, xmm2/m64 xmm2/m64 to two packed double-precision floating-point\r\n values in xmm1.\r\n VEX.256.0F.WIG 5A /r RM V/V AVX Convert four packed single-precision floating-point values\r\n VCVTPS2PD ymm1, xmm2/m128 in xmm2/m128 to four packed double-precision floating-\r\n point values in ymm1.\r\n EVEX.128.0F.W0 5A /r HV V/V AVX512VL Convert two packed single-precision floating-point values in\r\n VCVTPS2PD xmm1 {k1}{z}, AVX512F xmm2/m64/m32bcst to packed double-precision floating-\r\n xmm2/m64/m32bcst point values in xmm1 with writemask k1.\r\n EVEX.256.0F.W0 5A /r HV V/V AVX512VL Convert four packed single-precision floating-point values\r\n VCVTPS2PD ymm1 {k1}{z}, in xmm2/m128/m32bcst to packed double-precision\r\n xmm2/m128/m32bcst floating-point values in ymm1 with writemask k1.\r\n EVEX.512.0F.W0 5A /r HV V/V AVX512F Convert eight packed single-precision floating-point values\r\n VCVTPS2PD zmm1 {k1}{z}, in ymm2/m256/b32bcst to eight packed double-precision\r\n ymm2/m256/m32bcst{sae} floating-point values in zmm1 with writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n HV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two, four or eight packed single-precision floating-point values in the source operand (second operand)\r\nto two, four or eight packed double-precision floating-point values in the destination operand (first 
operand).\r\nEVEX encoded versions: The source operand is a YMM/XMM/XMM (low 64-bits) register, a 256/128/64-bit memory\r\nlocation or a 256/128/64-bit vector broadcasted from a 32-bit memory location. The destination operand is a\r\nZMM/YMM/XMM register conditionally updated with writemask k1.\r\nVEX.256 encoded version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is a YMM register. Bits (MAX_VL-1:256) of the corresponding destination ZMM register are zeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 64-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 64-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nunmodified.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\n\r\n SRC X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n DEST X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n Figure 3-14. 
CVTPS2PD (VEX.256 encoded version)\r\n\r\n\r\nOperation\r\nVCVTPS2PD (EVEX encoded versions) when src operand is a register\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n k <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+63:i] <-\r\n Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[k+31:k])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTPS2PD (EVEX encoded versions) when src operand is a memory source\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n k <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+63:i] <-\r\n Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[31:0])\r\n ELSE\r\n DEST[i+63:i] <-\r\n Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[k+31:k])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTPS2PD (VEX.256 encoded version)\r\nDEST[63:0] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[31:0])\r\nDEST[127:64] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[63:32])\r\nDEST[191:128] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[95:64])\r\nDEST[255:192] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[127:96])\r\nDEST[MAX_VL-1:256] <- 0\r\n\r\nVCVTPS2PD (VEX.128 encoded version)\r\nDEST[63:0] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[31:0])\r\nDEST[127:64] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[63:32])\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nCVTPS2PD (128-bit Legacy SSE version)\r\nDEST[63:0] <- 
Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[31:0])\r\nDEST[127:64] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[63:32])\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTPS2PD __m512d _mm512_cvtps_pd( __m256 a);\r\nVCVTPS2PD __m512d _mm512_mask_cvtps_pd( __m512d s, __mmask8 k, __m256 a);\r\nVCVTPS2PD __m512d _mm512_maskz_cvtps_pd( __mmask8 k, __m256 a);\r\nVCVTPS2PD __m512d _mm512_cvt_roundps_pd( __m256 a, int sae);\r\nVCVTPS2PD __m512d _mm512_mask_cvt_roundps_pd( __m512d s, __mmask8 k, __m256 a, int sae);\r\nVCVTPS2PD __m512d _mm512_maskz_cvt_roundps_pd( __mmask8 k, __m256 a, int sae);\r\nVCVTPS2PD __m256d _mm256_mask_cvtps_pd( __m256d s, __mmask8 k, __m128 a);\r\nVCVTPS2PD __m256d _mm256_maskz_cvtps_pd( __mmask8 k, __m128 a);\r\nVCVTPS2PD __m128d _mm_mask_cvtps_pd( __m128d s, __mmask8 k, __m128 a);\r\nVCVTPS2PD __m128d _mm_maskz_cvtps_pd( __mmask8 k, __m128 a);\r\nVCVTPS2PD __m256d _mm256_cvtps_pd (__m128 a)\r\nCVTPS2PD __m128d _mm_cvtps_pd (__m128 a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3;\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPS2PD"
},
{
"description": "CVTPS2PI-Convert Packed Single-Precision FP Values to Packed Dword Integers\r\n Opcode/ Op/ 64-Bit Compat/ Description\r\n Instruction En Mode Leg Mode\r\n 0F 2D /r RM Valid Valid Convert two packed single-precision floating-\r\n CVTPS2PI mm, xmm/m64 point values from xmm/m64 to two packed\r\n signed doubleword integers in mm.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two packed single-precision floating-point values in the source operand (second operand) to two packed\r\nsigned doubleword integers in the destination operand (first operand).\r\nThe source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX tech-\r\nnology register. When the source operand is an XMM register, the two single-precision floating-point values are\r\ncontained in the low quadword of the register. When a conversion is inexact, the value returned is rounded\r\naccording to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum\r\nsigned doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indef-\r\ninite integer value (80000000H) is returned.\r\nCVTPS2PI causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer\r\nis set to 0 and the x87 FPU tag word is set to all 0s [valid]). 
If this instruction is executed while an x87 FPU floating-\r\npoint exception is pending, the exception is handled before the CVTPS2PI instruction is executed.\r\nIn 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).\r\n\r\nOperation\r\nDEST[31:0] <- Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0]);\r\nDEST[63:32] <- Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32]);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nCVTPS2PI: __m64 _mm_cvtps_pi32(__m128 a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nSee Table 22-5, \"Exception Conditions for Legacy SIMD/MMX Instructions with XMM and FP Exception,\" in the\r\nIntel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTPS2PI"
},
{
"description": "CVTSD2SI-Convert Scalar Double-Precision Floating-Point Value to Doubleword Integer\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F 2D /r RM V/V SSE2 Convert one double-precision floating-point value from\r\n CVTSD2SI r32, xmm1/m64 xmm1/m64 to one signed doubleword integer r32.\r\n F2 REX.W 0F 2D /r RM V/N.E. SSE2 Convert one double-precision floating-point value from\r\n CVTSD2SI r64, xmm1/m64 xmm1/m64 to one signed quadword integer sign-\r\n extended into r64.\r\n VEX.128.F2.0F.W0 2D /r RM V/V AVX Convert one double-precision floating-point value from\r\n VCVTSD2SI r32, xmm1/m64 xmm1/m64 to one signed doubleword integer r32.\r\n VEX.128.F2.0F.W1 2D /r RM V/N.E.1 AVX Convert one double-precision floating-point value from\r\n VCVTSD2SI r64, xmm1/m64 xmm1/m64 to one signed quadword integer sign-\r\n extended into r64.\r\n EVEX.LIG.F2.0F.W0 2D /r T1F V/V AVX512F Convert one double-precision floating-point value from\r\n VCVTSD2SI r32, xmm1/m64{er} xmm1/m64 to one signed doubleword integer r32.\r\n EVEX.LIG.F2.0F.W1 2D /r T1F V/N.E.1 AVX512F Convert one double-precision floating-point value from\r\n VCVTSD2SI r64, xmm1/m64{er} xmm1/m64 to one signed quadword integer sign-\r\n extended into r64.\r\nNOTES:\r\n1. VEX.W1/EVEX.W1 in non-64 bit is ignored; the instructions behaves as if the W0 version is used.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n T1F ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts a double-precision floating-point value in the source operand (the second operand) to a signed double-\r\nword integer in the destination operand (first operand). The source operand can be an XMM register or a 64-bit\r\nmemory location. The destination operand is a general-purpose register. 
When the source operand is an XMM\r\nregister, the double-precision floating-point value is contained in the low quadword of the register.\r\nWhen a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR\r\nregister.\r\nIf a converted result exceeds the range limits of signed doubleword integer (in non-64-bit modes or 64-bit mode\r\nwith REX.W/VEX.W/EVEX.W=0), the floating-point invalid exception is raised, and if this exception is masked, the\r\nindefinite integer value (80000000H) is returned.\r\nIf a converted result exceeds the range limits of signed quadword integer (in 64-bit mode and\r\nREX.W/VEX.W/EVEX.W = 1), the floating-point invalid exception is raised, and if this exception is masked, the\r\nindefinite integer value (80000000_00000000H) is returned.\r\nLegacy SSE instruction: Use of the REX.W prefix promotes the instruction to produce 64-bit data in 64-bit mode.\r\nSee the summary chart at the beginning of this section for encoding data and limits.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\nSoftware should ensure VCVTSD2SI is encoded with VEX.L=0. 
Encoding VCVTSD2SI with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTSD2SI (EVEX encoded version)\r\nIF SRC *is register* AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF 64-Bit Mode and OperandSize = 64\r\n THEN DEST[63:0] <- Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0]);\r\n ELSE DEST[31:0] <- Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0]);\r\nFI\r\n\r\n(V)CVTSD2SI\r\nIF 64-Bit Mode and OperandSize = 64\r\nTHEN\r\n DEST[63:0] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0]);\r\nELSE\r\n DEST[31:0] <-Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0]);\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTSD2SI int _mm_cvtsd_i32(__m128d);\r\nVCVTSD2SI int _mm_cvt_roundsd_i32(__m128d, int r);\r\nVCVTSD2SI __int64 _mm_cvtsd_i64(__m128d);\r\nVCVTSD2SI __int64 _mm_cvt_roundsd_i64(__m128d, int r);\r\nCVTSD2SI __int64 _mm_cvtsd_si64(__m128d);\r\nCVTSD2SI int _mm_cvtsd_si32(__m128d a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3;\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTSD2SI"
},
{
"description": "CVTSD2SS-Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision\r\nFloating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F 5A /r RM V/V SSE2 Convert one double-precision floating-point value in\r\n CVTSD2SS xmm1, xmm2/m64 xmm2/m64 to one single-precision floating-point value\r\n in xmm1.\r\n VEX.NDS.128.F2.0F.WIG 5A /r RVM V/V AVX Convert one double-precision floating-point value in\r\n VCVTSD2SS xmm1,xmm2, xmm3/m64 to one single-precision floating-point value\r\n xmm3/m64 and merge with high bits in xmm2.\r\n EVEX.NDS.LIG.F2.0F.W1 5A /r T1S V/V AVX512F Convert one double-precision floating-point value in\r\n VCVTSD2SS xmm1 {k1}{z}, xmm2, xmm3/m64 to one single-precision floating-point value\r\n xmm3/m64{er} and merge with high bits in xmm2 under writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nConverts a double-precision floating-point value in the \"convert-from\" source operand (the second operand in\r\nSSE2 version, otherwise the third operand) to a single-precision floating-point value in the destination operand.\r\nWhen the \"convert-from\" operand is an XMM register, the double-precision floating-point value is contained in the\r\nlow quadword of the register. The result is stored in the low doubleword of the destination operand. When the\r\nconversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.\r\n128-bit Legacy SSE version: The \"convert-from\" source operand (the second operand) is an XMM register or\r\nmemory location. Bits (MAX_VL-1:32) of the corresponding destination register remain unchanged. 
The destination\r\noperand is an XMM register.\r\nVEX.128 and EVEX encoded versions: The \"convert-from\" source operand (the third operand) can be an XMM\r\nregister or a 64-bit memory location. The first source and destination operands are XMM registers. Bits (127:32) of\r\nthe XMM register destination are copied from the corresponding bits in the first source operand. Bits (MAX_VL-\r\n1:128) of the destination register are zeroed.\r\nEVEX encoded version: the converted result is written to the low doubleword element of the destination under the\r\nwritemask.\r\nSoftware should ensure VCVTSD2SS is encoded with VEX.L=0. Encoding VCVTSD2SS with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTSD2SS (EVEX encoded version)\r\nIF (SRC2 *is register*) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF k1[0] or *no writemask*\r\n THEN DEST[31:0] <- Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC2[63:0]);\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[31:0] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[31:0] <- 0\r\n FI;\r\nFI;\r\nDEST[127:32] <- SRC1[127:32]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVCVTSD2SS (VEX.128 encoded version)\r\nDEST[31:0] <-Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC2[63:0]);\r\nDEST[127:32] <-SRC1[127:32]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nCVTSD2SS (128-bit Legacy SSE version)\r\nDEST[31:0] <-Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0]);\r\n(* DEST[MAX_VL-1:32] Unmodified *)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTSD2SS __m128 _mm_mask_cvtsd_ss(__m128 s, __mmask8 k, __m128 a, __m128d b);\r\nVCVTSD2SS __m128 _mm_maskz_cvtsd_ss( __mmask8 k, __m128 a,__m128d b);\r\nVCVTSD2SS __m128 _mm_cvt_roundsd_ss(__m128 a, __m128d b, int r);\r\nVCVTSD2SS __m128 _mm_mask_cvt_roundsd_ss(__m128 s, __mmask8 k, __m128 a, __m128d b, int 
r);\r\nVCVTSD2SS __m128 _mm_maskz_cvt_roundsd_ss( __mmask8 k, __m128 a,__m128d b, int r);\r\nCVTSD2SS __m128 _mm_cvtsd_ss(__m128 a, __m128d b)\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTSD2SS"
},
{
"description": "CVTSI2SD-Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F 2A /r RM V/V SSE2 Convert one signed doubleword integer from\r\n CVTSI2SD xmm1, r32/m32 r32/m32 to one double-precision floating-point\r\n value in xmm1.\r\n F2 REX.W 0F 2A /r RM V/N.E. SSE2 Convert one signed quadword integer from r/m64\r\n CVTSI2SD xmm1, r/m64 to one double-precision floating-point value in\r\n xmm1.\r\n VEX.NDS.128.F2.0F.W0 2A /r RVM V/V AVX Convert one signed doubleword integer from\r\n VCVTSI2SD xmm1, xmm2, r/m32 r/m32 to one double-precision floating-point\r\n value in xmm1.\r\n VEX.NDS.128.F2.0F.W1 2A /r RVM V/N.E.1 AVX Convert one signed quadword integer from r/m64\r\n VCVTSI2SD xmm1, xmm2, r/m64 to one double-precision floating-point value in\r\n xmm1.\r\n EVEX.NDS.LIG.F2.0F.W0 2A /r T1S V/V AVX512F Convert one signed doubleword integer from\r\n VCVTSI2SD xmm1, xmm2, r/m32 r/m32 to one double-precision floating-point\r\n value in xmm1.\r\n EVEX.NDS.LIG.F2.0F.W1 2A /r T1S V/N.E.1 AVX512F Convert one signed quadword integer from r/m64\r\n VCVTSI2SD xmm1, xmm2, r/m64{er} to one double-precision floating-point value in\r\n xmm1.\r\nNOTES:\r\n1. VEX.W1/EVEX.W1 in non-64 bit is ignored; the instructions behaves as if the W0 version is used.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nConverts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the \"convert-from\"\r\nsource operand to a double-precision floating-point value in the destination operand. The result is stored in the low\r\nquadword of the destination operand, and the high quadword left unchanged. 
When conversion is inexact, the\r\nvalue returned is rounded according to the rounding control bits in the MXCSR register.\r\nThe second source operand can be a general-purpose register or a 32/64-bit memory location. The first source and\r\ndestination operands are XMM registers.\r\n128-bit Legacy SSE version: Use of the REX.W prefix promotes the instruction to 64-bit operands. The \"convert-\r\nfrom\" source operand (the second operand) is a general-purpose register or memory location. The destination is\r\nan XMM register. Bits (MAX_VL-1:64) of the corresponding destination register remain unchanged.\r\nVEX.128 and EVEX encoded versions: The \"convert-from\" source operand (the third operand) can be a general-\r\npurpose register or a memory location. The first source and destination operands are XMM registers. Bits (127:64)\r\nof the XMM register destination are copied from the corresponding bits in the first source operand. Bits (MAX_VL-\r\n1:128) of the destination register are zeroed.\r\nEVEX.W0 version: an attempt to encode this instruction with EVEX embedded rounding is ignored.\r\nVEX.W1 and EVEX.W1 versions: promote the instruction to use a 64-bit input value in 64-bit mode.\r\nSoftware should ensure VCVTSI2SD is encoded with VEX.L=0. 
Encoding VCVTSI2SD with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTSI2SD (EVEX encoded version)\r\nIF (SRC2 *is register*) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF 64-Bit Mode And OperandSize = 64\r\nTHEN\r\n DEST[63:0] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC2[63:0]);\r\nELSE\r\n DEST[63:0] <- Convert_Integer_To_Double_Precision_Floating_Point(SRC2[31:0]);\r\nFI;\r\nDEST[127:64] <- SRC1[127:64]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVCVTSI2SD (VEX.128 encoded version)\r\nIF 64-Bit Mode And OperandSize = 64\r\nTHEN\r\n DEST[63:0] <-Convert_Integer_To_Double_Precision_Floating_Point(SRC2[63:0]);\r\nELSE\r\n DEST[63:0] <-Convert_Integer_To_Double_Precision_Floating_Point(SRC2[31:0]);\r\nFI;\r\nDEST[127:64] <-SRC1[127:64]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nCVTSI2SD\r\nIF 64-Bit Mode And OperandSize = 64\r\nTHEN\r\n DEST[63:0] <-Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:0]);\r\nELSE\r\n DEST[63:0] <-Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0]);\r\nFI;\r\nDEST[MAX_VL-1:64] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTSI2SD __m128d _mm_cvti32_sd(__m128d s, int a);\r\nVCVTSI2SD __m128d _mm_cvt_roundi32_sd(__m128d s, int a, int r);\r\nVCVTSI2SD __m128d _mm_cvti64_sd(__m128d s, __int64 a);\r\nVCVTSI2SD __m128d _mm_cvt_roundi64_sd(__m128d s, __int64 a, int r);\r\nCVTSI2SD __m128d _mm_cvtsi64_sd(__m128d s, __int64 a);\r\nCVTSI2SD __m128d _mm_cvtsi32_sd(__m128d a, int b)\r\n\r\nSIMD Floating-Point Exceptions\r\nPrecision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3 if W1, else Type 5.\r\nEVEX-encoded instructions, see Exceptions Type E3NF if W1, else Type E10NF.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTSI2SD"
},
{
"description": "CVTSI2SS-Convert Doubleword Integer to Scalar Single-Precision Floating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 2A /r RM V/V SSE Convert one signed doubleword integer from r/m32\r\n CVTSI2SS xmm1, r/m32 to one single-precision floating-point value in xmm1.\r\n F3 REX.W 0F 2A /r RM V/N.E. SSE Convert one signed quadword integer from r/m64\r\n CVTSI2SS xmm1, r/m64 to one single-precision floating-point value in xmm1.\r\n VEX.NDS.128.F3.0F.W0 2A /r RVM V/V AVX Convert one signed doubleword integer from r/m32\r\n VCVTSI2SS xmm1, xmm2, r/m32 to one single-precision floating-point value in xmm1.\r\n VEX.NDS.128.F3.0F.W1 2A /r RVM V/N.E.1 AVX Convert one signed quadword integer from r/m64\r\n VCVTSI2SS xmm1, xmm2, r/m64 to one single-precision floating-point value in xmm1.\r\n EVEX.NDS.LIG.F3.0F.W0 2A /r T1S V/V AVX512F Convert one signed doubleword integer from r/m32\r\n VCVTSI2SS xmm1, xmm2, r/m32{er} to one single-precision floating-point value in xmm1.\r\n EVEX.NDS.LIG.F3.0F.W1 2A /r T1S V/N.E.1 AVX512F Convert one signed quadword integer from r/m64\r\n VCVTSI2SS xmm1, xmm2, r/m64{er} to one single-precision floating-point value in xmm1.\r\nNOTES:\r\n1. VEX.W1/EVEX.W1 in non-64 bit is ignored; the instructions behaves as if the W0 version is used.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nConverts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the \"convert-from\"\r\nsource operand to a single-precision floating-point value in the destination operand (first operand). The \"convert-\r\nfrom\" source operand can be a general-purpose register or a memory location. The destination operand is an XMM\r\nregister. 
The result is stored in the low doubleword of the destination operand, and the upper three doublewords\r\nare left unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control\r\nbits in the MXCSR register or the embedded rounding control bits.\r\n128-bit Legacy SSE version: In 64-bit mode, use of the REX.W prefix promotes the instruction to use a 64-bit input\r\nvalue. The \"convert-from\" source operand (the second operand) is a general-purpose register or memory location.\r\nBits (MAX_VL-1:32) of the corresponding destination register remain unchanged.\r\nVEX.128 and EVEX encoded versions: The \"convert-from\" source operand (the third operand) can be a general-\r\npurpose register or a memory location. The first source and destination operands are XMM registers. Bits (127:32)\r\nof the XMM register destination are copied from corresponding bits in the first source operand. Bits (MAX_VL-\r\n1:128) of the destination register are zeroed.\r\nEVEX encoded version: the converted result is written to the low doubleword element of the destination under the\r\nwritemask.\r\nSoftware should ensure VCVTSI2SS is encoded with VEX.L=0. 
Encoding VCVTSI2SS with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTSI2SS (EVEX encoded version)\r\nIF (SRC2 *is register*) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF 64-Bit Mode And OperandSize = 64\r\nTHEN\r\n DEST[31:0] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:0]);\r\nELSE\r\n DEST[31:0] <- Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0]);\r\nFI;\r\nDEST[127:32] <- SRC1[127:32]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVCVTSI2SS (VEX.128 encoded version)\r\nIF 64-Bit Mode And OperandSize = 64\r\nTHEN\r\n DEST[31:0] <-Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:0]);\r\nELSE\r\n DEST[31:0] <-Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0]);\r\nFI;\r\nDEST[127:32] <-SRC1[127:32]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nCVTSI2SS (128-bit Legacy SSE version)\r\nIF 64-Bit Mode And OperandSize = 64\r\nTHEN\r\n DEST[31:0] <-Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:0]);\r\nELSE\r\n DEST[31:0] <-Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0]);\r\nFI;\r\nDEST[MAX_VL-1:32] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTSI2SS __m128 _mm_cvti32_ss(__m128 s, int a);\r\nVCVTSI2SS __m128 _mm_cvt_roundi32_ss(__m128 s, int a, int r);\r\nVCVTSI2SS __m128 _mm_cvti64_ss(__m128 s, __int64 a);\r\nVCVTSI2SS __m128 _mm_cvt_roundi64_ss(__m128 s, __int64 a, int r);\r\nCVTSI2SS __m128 _mm_cvtsi64_ss(__m128 s, __int64 a);\r\nCVTSI2SS __m128 _mm_cvtsi32_ss(__m128 a, int b);\r\n\r\nSIMD Floating-Point Exceptions\r\nPrecision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTSI2SS"
},
{
"description": "CVTSS2SD-Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision\r\nFloating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 5A /r RM V/V SSE2 Convert one single-precision floating-point value in\r\n CVTSS2SD xmm1, xmm2/m32 xmm2/m32 to one double-precision floating-point value\r\n in xmm1.\r\n VEX.NDS.128.F3.0F.WIG 5A /r RVM V/V AVX Convert one single-precision floating-point value in\r\n VCVTSS2SD xmm1, xmm2, xmm3/m32 to one double-precision floating-point value\r\n xmm3/m32 and merge with high bits of xmm2.\r\n EVEX.NDS.LIG.F3.0F.W0 5A /r T1S V/V AVX512F Convert one single-precision floating-point value in\r\n VCVTSS2SD xmm1 {k1}{z}, xmm2, xmm3/m32 to one double-precision floating-point value\r\n xmm3/m32{sae} and merge with high bits of xmm2 under writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nConverts a single-precision floating-point value in the \"convert-from\" source operand to a double-precision\r\nfloating-point value in the destination operand. When the \"convert-from\" source operand is an XMM register, the\r\nsingle-precision floating-point value is contained in the low doubleword of the register. The result is stored in the\r\nlow quadword of the destination operand.\r\n128-bit Legacy SSE version: The \"convert-from\" source operand (the second operand) is an XMM register or\r\nmemory location. Bits (MAX_VL-1:64) of the corresponding destination register remain unchanged. The destina-\r\ntion operand is an XMM register.\r\nVEX.128 and EVEX encoded versions: The \"convert-from\" source operand (the third operand) can be an XMM\r\nregister or a 32-bit memory location. The first source and destination operands are XMM registers. 
Bits (127:64) of\r\nthe XMM register destination are copied from the corresponding bits in the first source operand. Bits (MAX_VL-\r\n1:128) of the destination register are zeroed.\r\nSoftware should ensure VCVTSS2SD is encoded with VEX.L=0. Encoding VCVTSS2SD with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\nOperation\r\nVCVTSS2SD (EVEX encoded version)\r\nIF k1[0] or *no writemask*\r\n THEN DEST[63:0] <- Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC2[31:0]);\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[63:0] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[63:0] <- 0\r\n FI;\r\nFI;\r\nDEST[127:64] <- SRC1[127:64]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTSS2SD (VEX.128 encoded version)\r\nDEST[63:0] <-Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC2[31:0])\r\nDEST[127:64] <-SRC1[127:64]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nCVTSS2SD (128-bit Legacy SSE version)\r\nDEST[63:0] <-Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[31:0]);\r\nDEST[MAX_VL-1:64] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTSS2SD __m128d _mm_cvt_roundss_sd(__m128d a, __m128 b, int r);\r\nVCVTSS2SD __m128d _mm_mask_cvt_roundss_sd(__m128d s, __mmask8 m, __m128d a,__m128 b, int r);\r\nVCVTSS2SD __m128d _mm_maskz_cvt_roundss_sd(__mmask8 k, __m128d a, __m128 b, int r);\r\nVCVTSS2SD __m128d _mm_mask_cvtss_sd(__m128d s, __mmask8 m, __m128d a,__m128 b);\r\nVCVTSS2SD __m128d _mm_maskz_cvtss_sd(__mmask8 m, __m128d a,__m128 b);\r\nCVTSS2SD __m128d _mm_cvtss_sd(__m128d a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTSS2SD"
},
{
"description": "CVTSS2SI-Convert Scalar Single-Precision Floating-Point Value to Doubleword Integer\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 2D /r RM V/V SSE Convert one single-precision floating-point value from\r\n CVTSS2SI r32, xmm1/m32 xmm1/m32 to one signed doubleword integer in r32.\r\n F3 REX.W 0F 2D /r RM V/N.E. SSE Convert one single-precision floating-point value from\r\n CVTSS2SI r64, xmm1/m32 xmm1/m32 to one signed quadword integer in r64.\r\n VEX.128.F3.0F.W0 2D /r RM V/V AVX Convert one single-precision floating-point value from\r\n VCVTSS2SI r32, xmm1/m32 xmm1/m32 to one signed doubleword integer in r32.\r\n VEX.128.F3.0F.W1 2D /r RM V/N.E.1 AVX Convert one single-precision floating-point value from\r\n VCVTSS2SI r64, xmm1/m32 xmm1/m32 to one signed quadword integer in r64.\r\n EVEX.LIG.F3.0F.W0 2D /r T1F V/V AVX512F Convert one single-precision floating-point value from\r\n VCVTSS2SI r32, xmm1/m32{er} xmm1/m32 to one signed doubleword integer in r32.\r\n EVEX.LIG.F3.0F.W1 2D /r T1F V/N.E.1 AVX512F Convert one single-precision floating-point value from\r\n VCVTSS2SI r64, xmm1/m32{er} xmm1/m32 to one signed quadword integer in r64.\r\nNOTES:\r\n1. VEX.W1/EVEX.W1 in non-64 bit is ignored; the instructions behaves as if the W0 version is used.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n T1F ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts a single-precision floating-point value in the source operand (the second operand) to a signed double-\r\nword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand).\r\nThe source operand can be an XMM register or a memory location. The destination operand is a general-purpose\r\nregister. 
When the source operand is an XMM register, the single-precision floating-point value is contained in the\r\nlow doubleword of the register.\r\nWhen a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR\r\nregister or the embedded rounding control bits. If a converted result cannot be represented in the destination\r\nformat, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value\r\n(2^(w-1), where w represents the number of bits in the destination format) is returned.\r\nLegacy SSE instructions: In 64-bit mode, use of the REX.W prefix promotes the instruction to produce 64-bit data.\r\nSee the summary chart at the beginning of this section for encoding data and limits.\r\nVEX.W1 and EVEX.W1 versions: promote the instruction to produce 64-bit data in 64-bit mode.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\nSoftware should ensure VCVTSS2SI is encoded with VEX.L=0. 
Encoding VCVTSS2SI with VEX.L=1 may encounter\r\nunpredictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTSS2SI (EVEX encoded version)\r\nIF (SRC *is register*) AND (EVEX.b = 1)\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF 64-bit Mode and OperandSize = 64\r\nTHEN\r\n DEST[63:0] <- Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0]);\r\nELSE\r\n DEST[31:0] <- Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0]);\r\nFI;\r\n\r\n(V)CVTSS2SI (Legacy and VEX.128 encoded version)\r\nIF 64-bit Mode and OperandSize = 64\r\nTHEN\r\n DEST[63:0] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0]);\r\nELSE\r\n DEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0]);\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTSS2SI int _mm_cvtss_i32( __m128 a);\r\nVCVTSS2SI int _mm_cvt_roundss_i32( __m128 a, int r);\r\nVCVTSS2SI __int64 _mm_cvtss_i64( __m128 a);\r\nVCVTSS2SI __int64 _mm_cvt_roundss_i64( __m128 a, int r);\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3; additionally\r\n#UD If VEX.vvvv != 1111B.\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTSS2SI"
},
{
"description": "CVTTPD2DQ-Convert with Truncation Packed Double-Precision Floating-Point Values to\r\nPacked Doubleword Integers\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F E6 /r RM V/V SSE2 Convert two packed double-precision floating-point\r\n CVTTPD2DQ xmm1, xmm2/m128 values in xmm2/mem to two signed doubleword\r\n integers in xmm1 using truncation.\r\n VEX.128.66.0F.WIG E6 /r RM V/V AVX Convert two packed double-precision floating-point\r\n VCVTTPD2DQ xmm1, xmm2/m128 values in xmm2/mem to two signed doubleword\r\n integers in xmm1 using truncation.\r\n VEX.256.66.0F.WIG E6 /r RM V/V AVX Convert four packed double-precision floating-point\r\n VCVTTPD2DQ xmm1, ymm2/m256 values in ymm2/mem to four signed doubleword\r\n integers in xmm1 using truncation.\r\n EVEX.128.66.0F.W1 E6 /r FV V/V AVX512VL Convert two packed double-precision floating-point\r\n VCVTTPD2DQ xmm1 {k1}{z}, AVX512F values in xmm2/m128/m64bcst to two signed\r\n xmm2/m128/m64bcst doubleword integers in xmm1 using truncation subject\r\n to writemask k1.\r\n EVEX.256.66.0F.W1 E6 /r FV V/V AVX512VL Convert four packed double-precision floating-point\r\n VCVTTPD2DQ xmm1 {k1}{z}, AVX512F values in ymm2/m256/m64bcst to four signed\r\n ymm2/m256/m64bcst doubleword integers in xmm1 using truncation subject\r\n to writemask k1.\r\n EVEX.512.66.0F.W1 E6 /r FV V/V AVX512F Convert eight packed double-precision floating-point\r\n VCVTTPD2DQ ymm1 {k1}{z}, values in zmm2/m512/m64bcst to eight signed\r\n zmm2/m512/m64bcst{sae} doubleword integers in ymm1 using truncation subject\r\n to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n FV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two, four or eight packed double-precision floating-point values in the source operand (second operand)\r\nto two, four or eight packed signed doubleword 
integers in the destination operand (first operand).\r\nWhen a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than\r\nthe maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is\r\nmasked, the indefinite integer value (80000000H) is returned.\r\nEVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or\r\na 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a\r\nYMM/XMM/XMM (low 64 bits) register conditionally updated with writemask k1. The upper bits (MAX_VL-1:256) of\r\nthe corresponding destination are zeroed.\r\nVEX.256 encoded version: The source operand is a YMM register or 256-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:64) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nunmodified.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n SRC X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n DEST 0 X3 X2 X1 X0\r\n\r\n\r\n\r\n\r\n Figure 3-15. 
VCVTTPD2DQ (VEX.256 encoded version)\r\n\r\n\r\nOperation\r\nVCVTTPD2DQ (EVEX encoded versions) when src operand is a register\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n k <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+31:i] <-\r\n Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[k+63:k])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL/2] <- 0\r\n\r\n\r\n\r\n\r\n\r\nVCVTTPD2DQ (EVEX encoded versions) when src operand is a memory source\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n k <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+31:i] <-\r\n Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[63:0])\r\n ELSE\r\n DEST[i+31:i] <-\r\n Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[k+63:k])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL/2] <- 0\r\n\r\nVCVTTPD2DQ (VEX.256 encoded version)\r\nDEST[31:0] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[63:0])\r\nDEST[63:32] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[127:64])\r\nDEST[95:64] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[191:128])\r\nDEST[127:96] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[255:192])\r\nDEST[MAX_VL-1:128]<-0\r\n\r\nVCVTTPD2DQ (VEX.128 encoded version)\r\nDEST[31:0] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[63:0])\r\nDEST[63:32] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[127:64])\r\nDEST[MAX_VL-1:64]<-0\r\n\r\nCVTTPD2DQ (128-bit Legacy SSE version)\r\nDEST[31:0] 
<-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[63:0])\r\nDEST[63:32] <-Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[127:64])\r\nDEST[127:64] <-0\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTTPD2DQ __m256i _mm512_cvttpd_epi32( __m512d a);\r\nVCVTTPD2DQ __m256i _mm512_mask_cvttpd_epi32( __m256i s, __mmask8 k, __m512d a);\r\nVCVTTPD2DQ __m256i _mm512_maskz_cvttpd_epi32( __mmask8 k, __m512d a);\r\nVCVTTPD2DQ __m256i _mm512_cvtt_roundpd_epi32( __m512d a, int sae);\r\nVCVTTPD2DQ __m256i _mm512_mask_cvtt_roundpd_epi32( __m256i s, __mmask8 k, __m512d a, int sae);\r\nVCVTTPD2DQ __m256i _mm512_maskz_cvtt_roundpd_epi32( __mmask8 k, __m512d a, int sae);\r\nVCVTTPD2DQ __m128i _mm256_mask_cvttpd_epi32( __m128i s, __mmask8 k, __m256d a);\r\nVCVTTPD2DQ __m128i _mm256_maskz_cvttpd_epi32( __mmask8 k, __m256d a);\r\nVCVTTPD2DQ __m128i _mm_mask_cvttpd_epi32( __m128i s, __mmask8 k, __m128d a);\r\nVCVTTPD2DQ __m128i _mm_maskz_cvttpd_epi32( __mmask8 k, __m128d a);\r\nVCVTTPD2DQ __m128i _mm256_cvttpd_epi32 (__m256d src);\r\nCVTTPD2DQ __m128i _mm_cvttpd_epi32 (__m128d src);\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2;\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTTPD2DQ"
},
{
"description": "CVTTPD2PI-Convert with Truncation Packed Double-Precision FP Values to Packed Dword\r\nIntegers\r\n Opcode/ Op/ 64-Bit Compat/ Description\r\n Instruction En Mode Leg Mode\r\n 66 0F 2C /r RM Valid Valid Convert two packed double-precision floating-\r\n CVTTPD2PI mm, xmm/m128 point values from xmm/m128 to two packed\r\n signed doubleword integers in mm using\r\n truncation.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two packed double-precision floating-point values in the source operand (second operand) to two packed\r\nsigned doubleword integers in the destination operand (first operand). The source operand can be an XMM register\r\nor a 128-bit memory location. The destination operand is an MMX technology register.\r\nWhen a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger\r\nthan the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is\r\nmasked, the indefinite integer value (80000000H) is returned.\r\nThis instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack\r\npointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). 
If this instruction is executed while an x87 FPU\r\nfloating-point exception is pending, the exception is handled before the CVTTPD2PI instruction is executed.\r\nIn 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).\r\n\r\nOperation\r\nDEST[31:0] <- Convert_Double_Precision_Floating_Point_To_Integer32_Truncate(SRC[63:0]);\r\nDEST[63:32] <- Convert_Double_Precision_Floating_Point_To_Integer32_Truncate(SRC[127:64]);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nCVTTPD2PI: __m64 _mm_cvttpd_pi32(__m128d a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Mode Exceptions\r\nSee Table 22-4, \"Exception Conditions for Legacy SIMD/MMX Instructions with FP Exception and 16-Byte\r\nAlignment,\" in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTTPD2PI"
},
{
"description": "CVTTPS2DQ-Convert with Truncation Packed Single-Precision Floating-Point Values to Packed\r\nSigned Doubleword Integer Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 5B /r RM V/V SSE2 Convert four packed single-precision floating-point\r\n CVTTPS2DQ xmm1, xmm2/m128 values from xmm2/mem to four packed signed\r\n doubleword values in xmm1 using truncation.\r\n VEX.128.F3.0F.WIG 5B /r RM V/V AVX Convert four packed single-precision floating-point\r\n VCVTTPS2DQ xmm1, xmm2/m128 values from xmm2/mem to four packed signed\r\n doubleword values in xmm1 using truncation.\r\n VEX.256.F3.0F.WIG 5B /r RM V/V AVX Convert eight packed single-precision floating-point\r\n VCVTTPS2DQ ymm1, ymm2/m256 values from ymm2/mem to eight packed signed\r\n doubleword values in ymm1 using truncation.\r\n EVEX.128.F3.0F.W0 5B /r FV V/V AVX512VL Convert four packed single precision floating-point\r\n VCVTTPS2DQ xmm1 {k1}{z}, AVX512F values from xmm2/m128/m32bcst to four packed\r\n xmm2/m128/m32bcst signed doubleword values in xmm1 using truncation\r\n subject to writemask k1.\r\n EVEX.256.F3.0F.W0 5B /r FV V/V AVX512VL Convert eight packed single precision floating-point\r\n VCVTTPS2DQ ymm1 {k1}{z}, AVX512F values from ymm2/m256/m32bcst to eight packed\r\n ymm2/m256/m32bcst signed doubleword values in ymm1 using truncation\r\n subject to writemask k1.\r\n EVEX.512.F3.0F.W0 5B /r FV V/V AVX512F Convert sixteen packed single-precision floating-point\r\n VCVTTPS2DQ zmm1 {k1}{z}, values from zmm2/m512/m32bcst to sixteen packed\r\n zmm2/m512/m32bcst {sae} signed doubleword values in zmm1 using truncation\r\n subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n FV ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts four, eight or sixteen packed single-precision floating-point values in the source 
operand to four, eight or\r\nsixteen signed doubleword integers in the destination operand.\r\nWhen a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than\r\nthe maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is\r\nmasked, the indefinite integer value (80000000H) is returned.\r\nEVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or\r\na 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a\r\nZMM/YMM/XMM register conditionally updated with writemask k1.\r\nVEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination\r\noperand is a YMM register. The upper bits (MAX_VL-1:256) of the corresponding ZMM register destination are\r\nzeroed.\r\nVEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is a XMM register. The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nzeroed.\r\n128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination\r\noperand is an XMM register. 
The upper bits (MAX_VL-1:128) of the corresponding ZMM register destination are\r\nunmodified.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\nOperation\r\nVCVTTPS2DQ (EVEX encoded versions) when src operand is a register\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN DEST[i+31:i] <-\r\n Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[i+31:i])\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTTPS2DQ (EVEX encoded versions) when src operand is a memory source\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\n\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1)\r\n THEN\r\n DEST[i+31:i] <-\r\n Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0])\r\n ELSE\r\n DEST[i+31:i] <-\r\n Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[i+31:i])\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVCVTTPS2DQ (VEX.256 encoded version)\r\nDEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0])\r\nDEST[63:32] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[63:32])\r\nDEST[95:64] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[95:64])\r\nDEST[127:96] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[127:96])\r\nDEST[159:128] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[159:128])\r\nDEST[191:160] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[191:160])\r\nDEST[223:192] 
<-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[223:192])\r\nDEST[255:224] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[255:224])\r\n\r\n\r\n\r\nVCVTTPS2DQ (VEX.128 encoded version)\r\nDEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0])\r\nDEST[63:32] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[63:32])\r\nDEST[95:64] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[95:64])\r\nDEST[127:96] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[127:96])\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nCVTTPS2DQ (128-bit Legacy SSE version)\r\nDEST[31:0] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0])\r\nDEST[63:32] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[63:32])\r\nDEST[95:64] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[95:64])\r\nDEST[127:96] <-Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[127:96])\r\nDEST[MAX_VL-1:128] (unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTTPS2DQ __m512i _mm512_cvttps_epi32( __m512 a);\r\nVCVTTPS2DQ __m512i _mm512_mask_cvttps_epi32( __m512i s, __mmask16 k, __m512 a);\r\nVCVTTPS2DQ __m512i _mm512_maskz_cvttps_epi32( __mmask16 k, __m512 a);\r\nVCVTTPS2DQ __m512i _mm512_cvtt_roundps_epi32( __m512 a, int sae);\r\nVCVTTPS2DQ __m512i _mm512_mask_cvtt_roundps_epi32( __m512i s, __mmask16 k, __m512 a, int sae);\r\nVCVTTPS2DQ __m512i _mm512_maskz_cvtt_roundps_epi32( __mmask16 k, __m512 a, int sae);\r\nVCVTTPS2DQ __m256i _mm256_mask_cvttps_epi32( __m256i s, __mmask8 k, __m256 a);\r\nVCVTTPS2DQ __m256i _mm256_maskz_cvttps_epi32( __mmask8 k, __m256 a);\r\nVCVTTPS2DQ __m128i _mm_mask_cvttps_epi32( __m128i s, __mmask8 k, __m128 a);\r\nVCVTTPS2DQ __m128i _mm_maskz_cvttps_epi32( __mmask8 k, __m128 a);\r\nVCVTTPS2DQ __m256i _mm256_cvttps_epi32 (__m256 a)\r\nCVTTPS2DQ __m128i _mm_cvttps_epi32 (__m128 a)\r\n\r\nSIMD Floating-Point 
Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2; additionally\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTTPS2DQ"
},
{
"description": "CVTTPS2PI-Convert with Truncation Packed Single-Precision FP Values to Packed Dword\r\nIntegers\r\n Opcode/ Op/ 64-Bit Compat/ Description\r\n Instruction En Mode Leg Mode\r\n 0F 2C /r RM Valid Valid Convert two single-precision floating-point\r\n CVTTPS2PI mm, xmm/m64 values from xmm/m64 to two signed\r\n doubleword signed integers in mm using\r\n truncation.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts two packed single-precision floating-point values in the source operand (second operand) to two packed\r\nsigned doubleword integers in the destination operand (first operand). The source operand can be an XMM register\r\nor a 64-bit memory location. The destination operand is an MMX technology register. When the source operand is\r\nan XMM register, the two single-precision floating-point values are contained in the low quadword of the register.\r\nWhen a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger\r\nthan the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is\r\nmasked, the indefinite integer value (80000000H) is returned.\r\nThis instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack\r\npointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). 
If this instruction is executed while an x87 FPU\r\nfloating-point exception is pending, the exception is handled before the CVTTPS2PI instruction is executed.\r\nIn 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).\r\n\r\nOperation\r\nDEST[31:0] <- Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0]);\r\nDEST[63:32] <- Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[63:32]);\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nCVTTPS2PI: __m64 _mm_cvttps_pi32(__m128 a)\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nSee Table 22-5, \"Exception Conditions for Legacy SIMD/MMX Instructions with XMM and FP Exception,\" in the\r\nIntel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTTPS2PI"
},
{
"description": "CVTTSD2SI-Convert with Truncation Scalar Double-Precision Floating-Point Value to Signed\r\nInteger\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F 2C /r RM V/V SSE2 Convert one double-precision floating-point value from\r\n CVTTSD2SI r32, xmm1/m64 xmm1/m64 to one signed doubleword integer in r32\r\n using truncation.\r\n F2 REX.W 0F 2C /r RM V/N.E. SSE2 Convert one double-precision floating-point value from\r\n CVTTSD2SI r64, xmm1/m64 xmm1/m64 to one signed quadword integer in r64\r\n using truncation.\r\n VEX.128.F2.0F.W0 2C /r RM V/V AVX Convert one double-precision floating-point value from\r\n VCVTTSD2SI r32, xmm1/m64 xmm1/m64 to one signed doubleword integer in r32\r\n using truncation.\r\n VEX.128.F2.0F.W1 2C /r T1F V/N.E.1 AVX Convert one double-precision floating-point value from\r\n VCVTTSD2SI r64, xmm1/m64 xmm1/m64 to one signed quadword integer in r64\r\n using truncation.\r\n EVEX.LIG.F2.0F.W0 2C /r T1F V/V AVX512F Convert one double-precision floating-point value from\r\n VCVTTSD2SI r32, xmm1/m64{sae} xmm1/m64 to one signed doubleword integer in r32\r\n using truncation.\r\n EVEX.LIG.F2.0F.W1 2C /r T1F V/N.E.1 AVX512F Convert one double-precision floating-point value from\r\n VCVTTSD2SI r64, xmm1/m64{sae} xmm1/m64 to one signed quadword integer in r64\r\n using truncation.\r\nNOTES:\r\n1. 
For this specific instruction, VEX.W/EVEX.W in non-64 bit is ignored; the instruction behaves as if the W0\r\n version is used.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n T1F ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts a double-precision floating-point value in the source operand (the second operand) to a signed double-\r\nword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand).\r\nThe source operand can be an XMM register or a 64-bit memory location. The destination operand is a general\r\npurpose register. When the source operand is an XMM register, the double-precision floating-point value is\r\ncontained in the low quadword of the register.\r\nWhen a conversion is inexact, a truncated (round toward zero) result is returned.\r\nIf a converted result exceeds the range limits of signed doubleword integer (in non-64-bit modes or 64-bit mode\r\nwith REX.W/VEX.W/EVEX.W=0), the floating-point invalid exception is raised, and if this exception is masked, the\r\nindefinite integer value (80000000H) is returned.\r\nIf a converted result exceeds the range limits of signed quadword integer (in 64-bit mode and\r\nREX.W/VEX.W/EVEX.W = 1), the floating-point invalid exception is raised, and if this exception is masked, the\r\nindefinite integer value (80000000_00000000H) is returned.\r\nLegacy SSE instructions: In 64-bit mode, use of the REX.W prefix promotes the instruction to 64-bit operation. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\nVEX.W1 and EVEX.W1 versions: promotes the instruction to produce 64-bit data in 64-bit mode.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\n\r\n\r\n\r\n\r\nSoftware should ensure VCVTTSD2SI is encoded with VEX.L=0. 
Encoding VCVTTSD2SI with VEX.L=1 may\r\nencounter unpredictable behavior across different processor generations.\r\n\r\nOperation\r\n(V)CVTTSD2SI (All versions)\r\nIF 64-Bit Mode and OperandSize = 64\r\nTHEN\r\n DEST[63:0] <- Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[63:0]);\r\nELSE\r\n DEST[31:0] <- Convert_Double_Precision_Floating_Point_To_Integer_Truncate(SRC[63:0]);\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTTSD2SI int _mm_cvttsd_i32( __m128d a);\r\nVCVTTSD2SI int _mm_cvtt_roundsd_i32( __m128d a, int sae);\r\nVCVTTSD2SI __int64 _mm_cvttsd_i64( __m128d a);\r\nVCVTTSD2SI __int64 _mm_cvtt_roundsd_i64( __m128d a, int sae);\r\nCVTTSD2SI int _mm_cvttsd_si32( __m128d a);\r\nCVTTSD2SI __int64 _mm_cvttsd_si64( __m128d a);\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3; additionally\r\n#UD If VEX.vvvv != 1111B.\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTTSD2SI"
},
{
"description": "CVTTSS2SI-Convert with Truncation Scalar Single-Precision Floating-Point Value to Integer\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 2C /r RM V/V SSE Convert one single-precision floating-point value from\r\n CVTTSS2SI r32, xmm1/m32 xmm1/m32 to one signed doubleword integer in r32\r\n using truncation.\r\n F3 REX.W 0F 2C /r RM V/N.E. SSE Convert one single-precision floating-point value from\r\n CVTTSS2SI r64, xmm1/m32 xmm1/m32 to one signed quadword integer in r64\r\n using truncation.\r\n VEX.128.F3.0F.W0 2C /r RM V/V AVX Convert one single-precision floating-point value from\r\n VCVTTSS2SI r32, xmm1/m32 xmm1/m32 to one signed doubleword integer in r32\r\n using truncation.\r\n VEX.128.F3.0F.W1 2C /r RM V/N.E.1 AVX Convert one single-precision floating-point value from\r\n VCVTTSS2SI r64, xmm1/m32 xmm1/m32 to one signed quadword integer in r64\r\n using truncation.\r\n EVEX.LIG.F3.0F.W0 2C /r T1F V/V AVX512F Convert one single-precision floating-point value from\r\n VCVTTSS2SI r32, xmm1/m32{sae} xmm1/m32 to one signed doubleword integer in r32\r\n using truncation.\r\n EVEX.LIG.F3.0F.W1 2C /r T1F V/N.E.1 AVX512F Convert one single-precision floating-point value from\r\n VCVTTSS2SI r64, xmm1/m32{sae} xmm1/m32 to one signed quadword integer in r64\r\n using truncation.\r\nNOTES:\r\n1. For this specific instruction, VEX.W/EVEX.W in non-64 bit is ignored; the instruction behaves as if the W0\r\n version is used.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (w) ModRM:r/m (r) NA NA\r\n T1F ModRM:reg (w) ModRM:r/m (r) NA NA\r\n\r\nDescription\r\nConverts a single-precision floating-point value in the source operand (the second operand) to a signed doubleword\r\ninteger (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand). 
The\r\nsource operand can be an XMM register or a 32-bit memory location. The destination operand is a general purpose\r\nregister. When the source operand is an XMM register, the single-precision floating-point value is contained in the\r\nlow doubleword of the register.\r\nWhen a conversion is inexact, a truncated (round toward zero) result is returned. If a converted result is larger than\r\nthe maximum signed doubleword integer, the floating-point invalid exception is raised. If this exception is masked,\r\nthe indefinite integer value (80000000H or 80000000_00000000H if operand size is 64 bits) is returned.\r\nLegacy SSE instructions: In 64-bit mode, Use of the REX.W prefix promotes the instruction to 64-bit operation. See\r\nthe summary chart at the beginning of this section for encoding data and limits.\r\nVEX.W1 and EVEX.W1 versions: promotes the instruction to produce 64-bit data in 64-bit mode.\r\nNote: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.\r\nSoftware should ensure VCVTTSS2SI is encoded with VEX.L=0. 
Encoding VCVTTSS2SI with VEX.L=1 may\r\nencounter unpredictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\n(V)CVTTSS2SI (All versions)\r\nIF 64-Bit Mode and OperandSize = 64\r\nTHEN\r\n DEST[63:0] <- Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0]);\r\nELSE\r\n DEST[31:0] <- Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC[31:0]);\r\nFI;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVCVTTSS2SI int _mm_cvttss_i32( __m128 a);\r\nVCVTTSS2SI int _mm_cvtt_roundss_i32( __m128 a, int sae);\r\nVCVTTSS2SI __int64 _mm_cvttss_i64( __m128 a);\r\nVCVTTSS2SI __int64 _mm_cvtt_roundss_i64( __m128 a, int sae);\r\nCVTTSS2SI int _mm_cvttss_si32( __m128 a);\r\nCVTTSS2SI __int64 _mm_cvttss_si64( __m128 a);\r\n\r\nSIMD Floating-Point Exceptions\r\nInvalid, Precision\r\n\r\nOther Exceptions\r\nSee Exceptions Type 3; additionally\r\n#UD If VEX.vvvv != 1111B.\r\nEVEX-encoded instructions, see Exceptions Type E3NF.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CVTTSS2SI"
},
{
"description": "CWD/CDQ/CQO-Convert Word to Doubleword/Convert Doubleword to Quadword\r\nOpcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n99 CWD NP Valid Valid DX:AX <- sign-extend of AX.\r\n99 CDQ NP Valid Valid EDX:EAX <- sign-extend of EAX.\r\nREX.W + 99 CQO NP Valid N.E. RDX:RAX<- sign-extend of RAX.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nDoubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign\r\nextension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction\r\ncopies the sign (bit 15) of the value in the AX register into every bit position in the DX register. The CDQ instruction\r\ncopies the sign (bit 31) of the value in the EAX register into every bit position in the EDX register. The CQO instruc-\r\ntion (available in 64-bit mode only) copies the sign (bit 63) of the value in the RAX register into every bit position\r\nin the RDX register.\r\nThe CWD instruction can be used to produce a doubleword dividend from a word before word division. The CDQ\r\ninstruction can be used to produce a quadword dividend from a doubleword before doubleword division. The CQO\r\ninstruction can be used to produce a double quadword dividend from a quadword before a quadword division.\r\nThe CWD and CDQ mnemonics reference the same opcode. The CWD instruction is intended for use when the\r\noperand-size attribute is 16 and the CDQ instruction for when the operand-size attribute is 32. Some assemblers\r\nmay force the operand size to 16 when CWD is used and to 32 when CDQ is used. Others may treat these\r\nmnemonics as synonyms (CWD/CDQ) and use the current setting of the operand-size attribute to determine the\r\nsize of values to be converted, regardless of the mnemonic used.\r\nIn 64-bit mode, use of the REX.W prefix promotes operation to 64 bits. 
The CQO mnemonic references the same\r\nopcode as CWD/CDQ. See the summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\n\r\nIF OperandSize = 16 (* CWD instruction *)\r\n THEN\r\n DX <- SignExtend(AX);\r\n ELSE IF OperandSize = 32 (* CDQ instruction *)\r\n EDX <- SignExtend(EAX); FI;\r\n ELSE IF 64-Bit Mode and OperandSize = 64 (* CQO instruction *)\r\n RDX <- SignExtend(RAX); FI;\r\nFI;\r\n\r\nFlags Affected\r\nNone\r\n\r\nExceptions (All Operating Modes)\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "CWD"
},
{
"description": "-R:CBW",
"mnem": "CWDE"
},
{
"description": "DAA-Decimal Adjust AL after Addition\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n 27 DAA NP Invalid Valid Decimal adjust AL after addition.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nAdjusts the sum of two packed BCD values to create a packed BCD result. The AL register is the implied source and\r\ndestination operand. The DAA instruction is only useful when it follows an ADD instruction that adds (binary addi-\r\ntion) two 2-digit, packed BCD values and stores a byte result in the AL register. The DAA instruction then adjusts\r\nthe contents of the AL register to contain the correct 2-digit, packed BCD result. If a decimal carry is detected, the\r\nCF and AF flags are set accordingly.\r\nThis instruction executes as described above in compatibility mode and legacy mode. It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-Bit Mode\r\n THEN\r\n #UD;\r\n ELSE\r\n old_AL <- AL;\r\n old_CF <- CF;\r\n CF <- 0;\r\n IF (((AL AND 0FH) > 9) or AF = 1)\r\n THEN\r\n AL <- AL + 6;\r\n CF <- old_CF or (Carry from AL <- AL + 6);\r\n AF <- 1;\r\n ELSE\r\n AF <- 0;\r\n FI;\r\n IF ((old_AL > 99H) or (old_CF = 1))\r\n THEN\r\n AL <- AL + 60H;\r\n CF <- 1;\r\n ELSE\r\n CF <- 0;\r\n FI;\r\nFI;\r\n\r\nExample\r\nADD AL, BL Before: AL=79H BL=35H EFLAGS(OSZAPC)=XXXXXX\r\n After: AL=AEH BL=35H EFLAGS(OSZAPC)=110000\r\nDAA Before: AL=AEH BL=35H EFLAGS(OSZAPC)=110000\r\n After: AL=14H BL=35H EFLAGS(OSZAPC)=X00111\r\nDAA Before: AL=2EH BL=35H EFLAGS(OSZAPC)=110000\r\n After: AL=34H BL=35H EFLAGS(OSZAPC)=X00100\r\n\r\n\r\n\r\n\r\n\r\nFlags Affected\r\nThe CF and AF flags are set if the adjustment of the value results in a decimal carry in either digit of the result (see\r\nthe \"Operation\" section above). The SF, ZF, and PF flags are set according to the result. 
The OF flag is undefined.\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DAA"
},
{
"description": "DAS-Decimal Adjust AL after Subtraction\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n 2F DAS NP Invalid Valid Decimal adjust AL after subtraction.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nAdjusts the result of the subtraction of two packed BCD values to create a packed BCD result. The AL register is the\r\nimplied source and destination operand. The DAS instruction is only useful when it follows a SUB instruction that\r\nsubtracts (binary subtraction) one 2-digit, packed BCD value from another and stores a byte result in the AL\r\nregister. The DAS instruction then adjusts the contents of the AL register to contain the correct 2-digit, packed BCD\r\nresult. If a decimal borrow is detected, the CF and AF flags are set accordingly.\r\nThis instruction executes as described above in compatibility mode and legacy mode. It is not valid in 64-bit mode.\r\n\r\nOperation\r\nIF 64-Bit Mode\r\n THEN\r\n #UD;\r\n ELSE\r\n old_AL <- AL;\r\n old_CF <- CF;\r\n CF <- 0;\r\n IF (((AL AND 0FH) > 9) or AF = 1)\r\n THEN\r\n AL <- AL - 6;\r\n CF <- old_CF or (Borrow from AL <- AL - 6);\r\n AF <- 1;\r\n ELSE\r\n AF <- 0;\r\n FI;\r\n IF ((old_AL > 99H) or (old_CF = 1))\r\n THEN\r\n AL <- AL - 60H;\r\n CF <- 1;\r\n FI;\r\nFI;\r\n\r\nExample\r\nSUB AL, BL Before: AL = 35H, BL = 47H, EFLAGS(OSZAPC) = XXXXXX\r\n After: AL = EEH, BL = 47H, EFLAGS(OSZAPC) = 010111\r\nDAS Before: AL = EEH, BL = 47H, EFLAGS(OSZAPC) = 010111\r\n After: AL = 88H, BL = 47H, EFLAGS(OSZAPC) = X10111\r\n\r\nFlags Affected\r\nThe CF and AF flags are set if the adjustment of the value results in a decimal borrow in either digit of the result\r\n(see the \"Operation\" section above). The SF, ZF, and PF flags are set according to the result. 
The OF flag is unde-\r\nfined.\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\n#UD If the LOCK prefix is used.\r\n\r\n64-Bit Mode Exceptions\r\n#UD If in 64-bit mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DAS"
},
{
"description": "DEC-Decrement by 1\r\n Opcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n FE /1 DEC r/m8 M Valid Valid Decrement r/m8 by 1.\r\n *\r\n REX + FE /1 DEC r/m8 M Valid N.E. Decrement r/m8 by 1.\r\n FF /1 DEC r/m16 M Valid Valid Decrement r/m16 by 1.\r\n FF /1 DEC r/m32 M Valid Valid Decrement r/m32 by 1.\r\n REX.W + FF /1 DEC r/m64 M Valid N.E. Decrement r/m64 by 1.\r\n 48+rw DEC r16 O N.E. Valid Decrement r16 by 1.\r\n 48+rd DEC r32 O N.E. Valid Decrement r32 by 1.\r\n NOTES:\r\n * In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n M ModRM:r/m (r, w) NA NA NA\r\n O opcode + rd (r, w) NA NA NA\r\n\r\nDescription\r\nSubtracts 1 from the destination operand, while preserving the state of the CF flag. The destination operand can be\r\na register or a memory location. This instruction allows a loop counter to be updated without disturbing the CF flag.\r\n(To perform a decrement operation that updates the CF flag, use a SUB instruction with an immediate operand of\r\n1.)\r\nThis instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.\r\nIn 64-bit mode, DEC r16 and DEC r32 are not encodable (because opcodes 48H through 4FH are REX prefixes).\r\nOtherwise, the instruction's 64-bit mode default operation size is 32 bits. Use of the REX.R prefix permits access to\r\nadditional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits.\r\nSee the summary chart at the beginning of this section for encoding data and limits.\r\n\r\nOperation\r\nDEST <- DEST - 1;\r\n\r\nFlags Affected\r\nThe CF flag is not affected. 
The OF, SF, ZF, AF, and PF flags are set according to the result.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination operand is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used but the destination is not a memory operand.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DEC"
},
{
"description": "DIV-Unsigned Divide\r\nOpcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\nF6 /6 DIV r/m8 M Valid Valid Unsigned divide AX by r/m8, with result\r\n stored in AL <- Quotient, AH <- Remainder.\r\nREX + F6 /6 DIV r/m8* M Valid N.E. Unsigned divide AX by r/m8, with result\r\n stored in AL <- Quotient, AH <- Remainder.\r\nF7 /6 DIV r/m16 M Valid Valid Unsigned divide DX:AX by r/m16, with result\r\n stored in AX <- Quotient, DX <- Remainder.\r\nF7 /6 DIV r/m32 M Valid Valid Unsigned divide EDX:EAX by r/m32, with\r\n result stored in EAX <- Quotient, EDX <-\r\n Remainder.\r\nREX.W + F7 /6 DIV r/m64 M Valid N.E. Unsigned divide RDX:RAX by r/m64, with\r\n result stored in RAX <- Quotient, RDX <-\r\n Remainder.\r\nNOTES:\r\n* In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n M ModRM:r/m (w) NA NA NA\r\n\r\nDescription\r\nDivides unsigned the value in the AX, DX:AX, EDX:EAX, or RDX:RAX registers (dividend) by the source operand\r\n(divisor) and stores the result in the AX (AH:AL), DX:AX, EDX:EAX, or RDX:RAX registers. The source operand can\r\nbe a general-purpose register or a memory location. The action of this instruction depends on the operand size\r\n(dividend/divisor). Division using 64-bit operand is available only in 64-bit mode.\r\nNon-integral results are truncated (chopped) towards 0. The remainder is always less than the divisor in magni-\r\ntude. Overflow is indicated with the #DE (divide error) exception rather than with the CF flag.\r\nIn 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to addi-\r\ntional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. 
In 64-bit mode when REX.W is\r\napplied, the instruction divides the unsigned value in RDX:RAX by the source operand and stores the quotient in\r\nRAX, the remainder in RDX.\r\nSee the summary chart at the beginning of this section for encoding data and limits. See Table 3-15.\r\n Table 3-15. DIV Action\r\n Maximum\r\n Operand Size Dividend Divisor Quotient Remainder Quotient\r\n Word/byte AX r/m8 AL AH 255\r\n Doubleword/word DX:AX r/m16 AX DX 65,535\r\n Quadword/doubleword EDX:EAX r/m32 EAX EDX 2^32 - 1\r\n Doublequadword/ RDX:RAX r/m64 RAX RDX 2^64 - 1\r\n quadword\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nIF SRC = 0\r\n THEN #DE; FI; (* Divide Error *)\r\nIF OperandSize = 8 (* Word/Byte Operation *)\r\n THEN\r\n temp <- AX / SRC;\r\n IF temp > FFH\r\n THEN #DE; (* Divide error *)\r\n ELSE\r\n AL <- temp;\r\n AH <- AX MOD SRC;\r\n FI;\r\n ELSE IF OperandSize = 16 (* Doubleword/word operation *)\r\n THEN\r\n temp <- DX:AX / SRC;\r\n IF temp > FFFFH\r\n THEN #DE; (* Divide error *)\r\n ELSE\r\n AX <- temp;\r\n DX <- DX:AX MOD SRC;\r\n FI;\r\n FI;\r\n ELSE IF Operandsize = 32 (* Quadword/doubleword operation *)\r\n THEN\r\n temp <- EDX:EAX / SRC;\r\n IF temp > FFFFFFFFH\r\n THEN #DE; (* Divide error *)\r\n ELSE\r\n EAX <- temp;\r\n EDX <- EDX:EAX MOD SRC;\r\n FI;\r\n FI;\r\n ELSE IF 64-Bit Mode and Operandsize = 64 (* Doublequadword/quadword operation *)\r\n THEN\r\n temp <- RDX:RAX / SRC;\r\n IF temp > FFFFFFFFFFFFFFFFH\r\n THEN #DE; (* Divide error *)\r\n ELSE\r\n RAX <- temp;\r\n RDX <- RDX:RAX MOD SRC;\r\n FI;\r\n FI;\r\nFI;\r\n\r\nFlags Affected\r\nThe CF, OF, SF, ZF, AF, and PF flags are undefined.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#DE If the source operand (divisor) is 0\r\n If the quotient is too large for the designated register.\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address 
is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#DE If the source operand (divisor) is 0.\r\n If the quotient is too large for the designated register.\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#DE If the source operand (divisor) is 0.\r\n If the quotient is too large for the designated register.\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#DE If the source operand (divisor) is 0\r\n If the quotient is too large for the designated register.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DIV"
},
{
"description": "DIVPD-Divide Packed Double-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 5E /r RM V/V SSE2 Divide packed double-precision floating-point values\r\n DIVPD xmm1, xmm2/m128 in xmm1 by packed double-precision floating-point\r\n values in xmm2/mem.\r\n VEX.NDS.128.66.0F.WIG 5E /r RVM V/V AVX Divide packed double-precision floating-point values\r\n VDIVPD xmm1, xmm2, xmm3/m128 in xmm2 by packed double-precision floating-point\r\n values in xmm3/mem.\r\n VEX.NDS.256.66.0F.WIG 5E /r RVM V/V AVX Divide packed double-precision floating-point values\r\n VDIVPD ymm1, ymm2, ymm3/m256 in ymm2 by packed double-precision floating-point\r\n values in ymm3/mem.\r\n EVEX.NDS.128.66.0F.W1 5E /r FV V/V AVX512VL Divide packed double-precision floating-point values\r\n VDIVPD xmm1 {k1}{z}, xmm2, AVX512F in xmm2 by packed double-precision floating-point\r\n xmm3/m128/m64bcst values in xmm3/m128/m64bcst and write results to\r\n xmm1 subject to writemask k1.\r\n EVEX.NDS.256.66.0F.W1 5E /r FV V/V AVX512VL Divide packed double-precision floating-point values\r\n VDIVPD ymm1 {k1}{z}, ymm2, AVX512F in ymm2 by packed double-precision floating-point\r\n ymm3/m256/m64bcst values in ymm3/m256/m64bcst and write results to\r\n ymm1 subject to writemask k1.\r\n EVEX.NDS.512.66.0F.W1 5E /r FV V/V AVX512F Divide packed double-precision floating-point values\r\n VDIVPD zmm1 {k1}{z}, zmm2, in zmm2 by packed double-precision FP values in\r\n zmm3/m512/m64bcst{er} zmm3/m512/m64bcst and write results to zmm1\r\n subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a SIMD divide of the double-precision floating-point values in the first source operand by the 
floating-\r\npoint values in the second source operand (the third operand). Results are written to the destination operand (the\r\nfirst operand).\r\nEVEX encoded versions: The first source operand (the second operand) is a ZMM/YMM/XMM register. The second\r\nsource operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector\r\nbroadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally\r\nupdated with writemask k1.\r\nVEX.256 encoded version: The first source operand (the second operand) is a YMM register. The second source\r\noperand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper\r\nbits (MAX_VL-1:256) of the corresponding destination are zeroed.\r\nVEX.128 encoded version: The first source operand (the second operand) is a XMM register. The second source\r\noperand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper\r\nbits (MAX_VL-1:128) of the corresponding destination are zeroed.\r\n128-bit Legacy SSE version: The second source operand (the second operand) can be an XMM register or an 128-\r\nbit memory location. The destination is the same as the first source operand. 
The upper bits (MAX_VL-1:128) of the\r\ncorresponding destination are unmodified.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVDIVPD (EVEX encoded versions)\r\n(KL, VL) = (2, 128), (4, 256), (8, 512)\r\nIF (VL = 512) AND (EVEX.b = 1) AND SRC2 *is a register*\r\n THEN\r\n SET_RM(EVEX.RC); ; refer to Table 2-4 in the Intel Architecture Instruction Set Extensions Programming Reference\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nFOR j <- 0 TO KL-1\r\n i <- j * 64\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1) AND (SRC2 *is memory*)\r\n THEN\r\n DEST[i+63:i] <- SRC1[i+63:i] / SRC2[63:0]\r\n ELSE\r\n DEST[i+63:i] <- SRC1[i+63:i] / SRC2[i+63:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+63:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+63:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVDIVPD (VEX.256 encoded version)\r\nDEST[63:0] <-SRC1[63:0] / SRC2[63:0]\r\nDEST[127:64] <-SRC1[127:64] / SRC2[127:64]\r\nDEST[191:128] <-SRC1[191:128] / SRC2[191:128]\r\nDEST[255:192] <-SRC1[255:192] / SRC2[255:192]\r\nDEST[MAX_VL-1:256] <-0;\r\n\r\nVDIVPD (VEX.128 encoded version)\r\nDEST[63:0] <-SRC1[63:0] / SRC2[63:0]\r\nDEST[127:64] <-SRC1[127:64] / SRC2[127:64]\r\nDEST[MAX_VL-1:128] <-0;\r\n\r\nDIVPD (128-bit Legacy SSE version)\r\nDEST[63:0] <-SRC1[63:0] / SRC2[63:0]\r\nDEST[127:64] <-SRC1[127:64] / SRC2[127:64]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVDIVPD __m512d _mm512_div_pd( __m512d a, __m512d b);\r\nVDIVPD __m512d _mm512_mask_div_pd(__m512d s, __mmask8 k, __m512d a, __m512d b);\r\nVDIVPD __m512d _mm512_maskz_div_pd( __mmask8 k, __m512d a, __m512d b);\r\nVDIVPD __m256d _mm256_mask_div_pd(__m256d s, __mmask8 k, __m256d a, __m256d b);\r\nVDIVPD __m256d _mm256_maskz_div_pd( __mmask8 k, __m256d a, __m256d b);\r\nVDIVPD __m128d _mm_mask_div_pd(__m128d s, __mmask8 k, __m128d a, __m128d b);\r\nVDIVPD __m128d _mm_maskz_div_pd( __mmask8 k, __m128d a, __m128d 
b);\r\nVDIVPD __m512d _mm512_div_round_pd( __m512d a, __m512d b, int);\r\nVDIVPD __m512d _mm512_mask_div_round_pd(__m512d s, __mmask8 k, __m512d a, __m512d b, int);\r\nVDIVPD __m512d _mm512_maskz_div_round_pd( __mmask8 k, __m512d a, __m512d b, int);\r\nVDIVPD __m256d _mm256_div_pd (__m256d a, __m256d b);\r\nDIVPD __m128d _mm_div_pd (__m128d a, __m128d b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Divide-by-Zero, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2.\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DIVPD"
},
{
"description": "DIVPS-Divide Packed Single-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 0F 5E /r RM V/V SSE Divide packed single-precision floating-point values\r\n DIVPS xmm1, xmm2/m128 in xmm1 by packed single-precision floating-point\r\n values in xmm2/mem.\r\n VEX.NDS.128.0F.WIG 5E /r RVM V/V AVX Divide packed single-precision floating-point values\r\n VDIVPS xmm1, xmm2, xmm3/m128 in xmm2 by packed single-precision floating-point\r\n values in xmm3/mem.\r\n VEX.NDS.256.0F.WIG 5E /r RVM V/V AVX Divide packed single-precision floating-point values\r\n VDIVPS ymm1, ymm2, ymm3/m256 in ymm2 by packed single-precision floating-point\r\n values in ymm3/mem.\r\n EVEX.NDS.128.0F.W0 5E /r FV V/V AVX512VL Divide packed single-precision floating-point values\r\n VDIVPS xmm1 {k1}{z}, xmm2, AVX512F in xmm2 by packed single-precision floating-point\r\n xmm3/m128/m32bcst values in xmm3/m128/m32bcst and write results to\r\n xmm1 subject to writemask k1.\r\n EVEX.NDS.256.0F.W0 5E /r FV V/V AVX512VL Divide packed single-precision floating-point values\r\n VDIVPS ymm1 {k1}{z}, ymm2, AVX512F in ymm2 by packed single-precision floating-point\r\n ymm3/m256/m32bcst values in ymm3/m256/m32bcst and write results to\r\n ymm1 subject to writemask k1.\r\n EVEX.NDS.512.0F.W0 5E /r FV V/V AVX512F Divide packed single-precision floating-point values\r\n VDIVPS zmm1 {k1}{z}, zmm2, in zmm2 by packed single-precision floating-point\r\n zmm3/m512/m32bcst{er} values in zmm3/m512/m32bcst and write results to\r\n zmm1 subject to writemask k1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n FV ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nPerforms a SIMD divide of the four, eight or sixteen packed single-precision floating-point values in the first 
source\r\noperand (the second operand) by the four, eight or sixteen packed single-precision floating-point values in the\r\nsecond source operand (the third operand). Results are written to the destination operand (the first operand).\r\nEVEX encoded versions: The first source operand (the second operand) is a ZMM/YMM/XMM register. The second\r\nsource operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector\r\nbroadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally\r\nupdated with writemask k1.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. The destination operand is a YMM register.\r\nVEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM\r\nregister or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAX_VL-1:128)\r\nof the corresponding ZMM register destination are zeroed.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. 
The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (MAX_VL-1:128) of the corresponding\r\nZMM register destination are unmodified.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVDIVPS (EVEX encoded versions)\r\n(KL, VL) = (4, 128), (8, 256), (16, 512)\r\nIF (VL = 512) AND (EVEX.b = 1) AND SRC2 *is a register*\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nFOR j <- 0 TO KL-1\r\n i <- j * 32\r\n IF k1[j] OR *no writemask*\r\n THEN\r\n IF (EVEX.b = 1) AND (SRC2 *is memory*)\r\n THEN\r\n DEST[i+31:i] <- SRC1[i+31:i] / SRC2[31:0]\r\n ELSE\r\n DEST[i+31:i] <- SRC1[i+31:i] / SRC2[i+31:i]\r\n FI;\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[i+31:i] remains unchanged*\r\n ELSE ; zeroing-masking\r\n DEST[i+31:i] <- 0\r\n FI\r\n FI;\r\nENDFOR\r\nDEST[MAX_VL-1:VL] <- 0\r\n\r\nVDIVPS (VEX.256 encoded version)\r\nDEST[31:0] <-SRC1[31:0] / SRC2[31:0]\r\nDEST[63:32] <-SRC1[63:32] / SRC2[63:32]\r\nDEST[95:64] <-SRC1[95:64] / SRC2[95:64]\r\nDEST[127:96] <-SRC1[127:96] / SRC2[127:96]\r\nDEST[159:128] <-SRC1[159:128] / SRC2[159:128]\r\nDEST[191:160]<-SRC1[191:160] / SRC2[191:160]\r\nDEST[223:192] <-SRC1[223:192] / SRC2[223:192]\r\nDEST[255:224] <-SRC1[255:224] / SRC2[255:224].\r\nDEST[MAX_VL-1:256] <-0;\r\n\r\nVDIVPS (VEX.128 encoded version)\r\nDEST[31:0] <-SRC1[31:0] / SRC2[31:0]\r\nDEST[63:32] <-SRC1[63:32] / SRC2[63:32]\r\nDEST[95:64] <-SRC1[95:64] / SRC2[95:64]\r\nDEST[127:96] <-SRC1[127:96] / SRC2[127:96]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\n\r\n\r\n\r\n\r\nDIVPS (128-bit Legacy SSE version)\r\nDEST[31:0] <-SRC1[31:0] / SRC2[31:0]\r\nDEST[63:32] <-SRC1[63:32] / SRC2[63:32]\r\nDEST[95:64] <-SRC1[95:64] / SRC2[95:64]\r\nDEST[127:96] <-SRC1[127:96] / SRC2[127:96]\r\nDEST[MAX_VL-1:128] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVDIVPS __m512 _mm512_div_ps( __m512 a, __m512 b);\r\nVDIVPS __m512 _mm512_mask_div_ps(__m512 s, __mmask16 k, __m512 a, __m512 b);\r\nVDIVPS __m512 
_mm512_maskz_div_ps(__mmask16 k, __m512 a, __m512 b);\r\nVDIVPS __m256 _mm256_mask_div_ps(__m256 s, __mmask8 k, __m256 a, __m256 b);\r\nVDIVPS __m256 _mm256_maskz_div_ps( __mmask8 k, __m256 a, __m256 b);\r\nVDIVPS __m128 _mm_mask_div_ps(__m128 s, __mmask8 k, __m128 a, __m128 b);\r\nVDIVPS __m128 _mm_maskz_div_ps( __mmask8 k, __m128 a, __m128 b);\r\nVDIVPS __m512 _mm512_div_round_ps( __m512 a, __m512 b, int);\r\nVDIVPS __m512 _mm512_mask_div_round_ps(__m512 s, __mmask16 k, __m512 a, __m512 b, int);\r\nVDIVPS __m512 _mm512_maskz_div_round_ps(__mmask16 k, __m512 a, __m512 b, int);\r\nVDIVPS __m256 _mm256_div_ps (__m256 a, __m256 b);\r\nDIVPS __m128 _mm_div_ps (__m128 a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Divide-by-Zero, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 2.\r\nEVEX-encoded instructions, see Exceptions Type E2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DIVPS"
},
{
"description": "DIVSD-Divide Scalar Double-Precision Floating-Point Value\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F2 0F 5E /r RM V/V SSE2 Divide low double-precision floating-point value in\r\n DIVSD xmm1, xmm2/m64 xmm1 by low double-precision floating-point value\r\n in xmm2/m64.\r\n VEX.NDS.128.F2.0F.WIG 5E /r RVM V/V AVX Divide low double-precision floating-point value in\r\n VDIVSD xmm1, xmm2, xmm3/m64 xmm2 by low double-precision floating-point value\r\n in xmm3/m64.\r\n EVEX.NDS.LIG.F2.0F.W1 5E /r T1S V/V AVX512F Divide low double-precision floating-point value in\r\n VDIVSD xmm1 {k1}{z}, xmm2, xmm2 by low double-precision floating-point value\r\n xmm3/m64{er} in xmm3/m64.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nDivides the low double-precision floating-point value in the first source operand by the low double-precision\r\nfloating-point value in the second source operand, and stores the double-precision floating-point result in the desti-\r\nnation operand. The second source operand can be an XMM register or a 64-bit memory location. The first source\r\nand destination are XMM registers.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (MAX_VL-\r\n1:64) of the corresponding ZMM destination register remain unchanged.\r\nVEX.128 encoded version: The first source operand is an xmm register encoded by VEX.vvvv. The quadword at bits\r\n127:64 of the destination operand is copied from the corresponding quadword of the first source operand. Bits\r\n(MAX_VL-1:128) of the destination register are zeroed.\r\nEVEX.128 encoded version: The first source operand is an xmm register encoded by EVEX.vvvv. 
The quadword\r\nelement of the destination operand at bits 127:64 are copied from the first source operand. Bits (MAX_VL-1:128)\r\nof the destination register are zeroed.\r\nEVEX version: The low quadword element of the destination is updated according to the writemask.\r\nSoftware should ensure VDIVSD is encoded with VEX.L=0. Encoding VDIVSD with VEX.L=1 may encounter unpre-\r\ndictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVDIVSD (EVEX encoded version)\r\nIF (EVEX.b = 1) AND SRC2 *is a register*\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF k1[0] or *no writemask*\r\n THEN DEST[63:0] <- SRC1[63:0] / SRC2[63:0]\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[63:0] remains unchanged*\r\n ELSE ; zeroing-masking\r\n THEN DEST[63:0] <- 0\r\n FI;\r\nFI;\r\nDEST[127:64] <- SRC1[127:64]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVDIVSD (VEX.128 encoded version)\r\nDEST[63:0] <-SRC1[63:0] / SRC2[63:0]\r\nDEST[127:64] <-SRC1[127:64]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nDIVSD (128-bit Legacy SSE version)\r\nDEST[63:0] <-DEST[63:0] / SRC[63:0]\r\nDEST[MAX_VL-1:64] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVDIVSD __m128d _mm_mask_div_sd(__m128d s, __mmask8 k, __m128d a, __m128d b);\r\nVDIVSD __m128d _mm_maskz_div_sd( __mmask8 k, __m128d a, __m128d b);\r\nVDIVSD __m128d _mm_div_round_sd( __m128d a, __m128d b, int);\r\nVDIVSD __m128d _mm_mask_div_round_sd(__m128d s, __mmask8 k, __m128d a, __m128d b, int);\r\nVDIVSD __m128d _mm_maskz_div_round_sd( __mmask8 k, __m128d a, __m128d b, int);\r\nDIVSD __m128d _mm_div_sd (__m128d a, __m128d b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Divide-by-Zero, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DIVSD"
},
{
"description": "DIVSS-Divide Scalar Single-Precision Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n F3 0F 5E /r RM V/V SSE Divide low single-precision floating-point value in\r\n DIVSS xmm1, xmm2/m32 xmm1 by low single-precision floating-point value in\r\n xmm2/m32.\r\n VEX.NDS.128.F3.0F.WIG 5E /r RVM V/V AVX Divide low single-precision floating-point value in\r\n VDIVSS xmm1, xmm2, xmm3/m32 xmm2 by low single-precision floating-point value in\r\n xmm3/m32.\r\n EVEX.NDS.LIG.F3.0F.W0 5E /r T1S V/V AVX512F Divide low single-precision floating-point value in\r\n VDIVSS xmm1 {k1}{z}, xmm2, xmm2 by low single-precision floating-point value in\r\n xmm3/m32{er} xmm3/m32.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RM ModRM:reg (r, w) ModRM:r/m (r) NA NA\r\n RVM ModRM:reg (w) VEX.vvvv ModRM:r/m (r) NA\r\n T1S ModRM:reg (w) EVEX.vvvv ModRM:r/m (r) NA\r\n\r\nDescription\r\nDivides the low single-precision floating-point value in the first source operand by the low single-precision floating-\r\npoint value in the second source operand, and stores the single-precision floating-point result in the destination\r\noperand. The second source operand can be an XMM register or a 32-bit memory location.\r\n128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (MAX_VL-\r\n1:32) of the corresponding YMM destination register remain unchanged.\r\nVEX.128 encoded version: The first source operand is an xmm register encoded by VEX.vvvv. The three high-order\r\ndoublewords of the destination operand are copied from the first source operand. Bits (MAX_VL-1:128) of the\r\ndestination register are zeroed.\r\nEVEX.128 encoded version: The first source operand is an xmm register encoded by EVEX.vvvv. The doubleword\r\nelements of the destination operand at bits 127:32 are copied from the first source operand. 
Bits (MAX_VL-1:128)\r\nof the destination register are zeroed.\r\nEVEX version: The low doubleword element of the destination is updated according to the writemask.\r\nSoftware should ensure VDIVSS is encoded with VEX.L=0. Encoding VDIVSS with VEX.L=1 may encounter unpre-\r\ndictable behavior across different processor generations.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nVDIVSS (EVEX encoded version)\r\nIF (EVEX.b = 1) AND SRC2 *is a register*\r\n THEN\r\n SET_RM(EVEX.RC);\r\n ELSE\r\n SET_RM(MXCSR.RM);\r\nFI;\r\nIF k1[0] or *no writemask*\r\n THEN DEST[31:0] <- SRC1[31:0] / SRC2[31:0]\r\n ELSE\r\n IF *merging-masking* ; merging-masking\r\n THEN *DEST[31:0] remains unchanged*\r\n ELSE ; zeroing-masking\r\n THEN DEST[31:0] <- 0\r\n FI;\r\nFI;\r\nDEST[127:32] <- SRC1[127:32]\r\nDEST[MAX_VL-1:128] <- 0\r\n\r\nVDIVSS (VEX.128 encoded version)\r\nDEST[31:0] <-SRC1[31:0] / SRC2[31:0]\r\nDEST[127:32] <-SRC1[127:32]\r\nDEST[MAX_VL-1:128] <-0\r\n\r\nDIVSS (128-bit Legacy SSE version)\r\nDEST[31:0] <-DEST[31:0] / SRC[31:0]\r\nDEST[MAX_VL-1:32] (Unmodified)\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nVDIVSS __m128 _mm_mask_div_ss(__m128 s, __mmask8 k, __m128 a, __m128 b);\r\nVDIVSS __m128 _mm_maskz_div_ss( __mmask8 k, __m128 a, __m128 b);\r\nVDIVSS __m128 _mm_div_round_ss( __m128 a, __m128 b, int);\r\nVDIVSS __m128 _mm_mask_div_round_ss(__m128 s, __mmask8 k, __m128 a, __m128 b, int);\r\nVDIVSS __m128 _mm_maskz_div_round_ss( __mmask8 k, __m128 a, __m128 b, int);\r\nDIVSS __m128 _mm_div_ss(__m128 a, __m128 b);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Divide-by-Zero, Precision, Denormal\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 3.\r\nEVEX-encoded instructions, see Exceptions Type E3.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DIVSS"
},
{
"description": "DPPD - Dot Product of Packed Double Precision Floating-Point Values\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 3A 41 /r ib RMI V/V SSE4_1 Selectively multiply packed DP floating-point\r\n DPPD xmm1, xmm2/m128, imm8 values from xmm1 with packed DP floating-\r\n point values from xmm2, add and selectively\r\n store the packed DP floating-point values to\r\n xmm1.\r\n VEX.NDS.128.66.0F3A.WIG 41 /r ib RVMI V/V AVX Selectively multiply packed DP floating-point\r\n VDPPD xmm1,xmm2, xmm3/m128, imm8 values from xmm2 with packed DP floating-\r\n point values from xmm3, add and selectively\r\n store the packed DP floating-point values to\r\n xmm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) imm8\r\n\r\nDescription\r\nConditionally multiplies the packed double-precision floating-point values in the destination operand (first operand)\r\nwith the packed double-precision floating-point values in the source (second operand) depending on a mask\r\nextracted from bits [5:4] of the immediate operand (third operand). If a condition mask bit is zero, the corre-\r\nsponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of Intel 64 and\r\nIA-32 Architectures Software Developer's Manual, Volume 1.\r\nThe two resulting double-precision values are summed into an intermediate result. The intermediate result is\r\nconditionally broadcasted to the destination using a broadcast mask specified by bits [1:0] of the immediate byte.\r\nIf a broadcast mask bit is \"1\", the intermediate result is copied to the corresponding qword element in the destina-\r\ntion operand. 
If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.\r\nDPPD follows the NaN forwarding rules stated in the Software Developer's Manual, vol. 1, table 4.7. These rules do\r\nnot cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of\r\nthose NaNs in the destination is implementation dependent. NaNs on the input sources or computationally gener-\r\nated NaNs will have at least one NaN propagated to the destination.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding\r\nYMM register destination are unmodified.\r\nVEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are\r\nzeroed.\r\nIf VDPPD is encoded with VEX.L= 1, an attempt to execute the instruction encoded with VEX.L= 1 will cause an\r\n#UD exception.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nDP_primitive (SRC1, SRC2)\r\nIF (imm8[4] = 1)\r\n THEN Temp1[63:0] <- DEST[63:0] * SRC[63:0]; // update SIMD exception flags\r\n ELSE Temp1[63:0] <- +0.0; FI;\r\nIF (imm8[5] = 1)\r\n THEN Temp1[127:64] <- DEST[127:64] * SRC[127:64]; // update SIMD exception flags\r\n ELSE Temp1[127:64] <- +0.0; FI;\r\n/* if unmasked exception reported, execute exception handler*/\r\n\r\nTemp2[63:0] <- Temp1[63:0] + Temp1[127:64]; // update SIMD exception flags\r\n/* if unmasked exception reported, execute exception handler*/\r\n\r\nIF (imm8[0] = 1)\r\n THEN DEST[63:0] <- Temp2[63:0];\r\n ELSE DEST[63:0] <- +0.0; FI;\r\nIF (imm8[1] = 1)\r\n THEN DEST[127:64] <- Temp2[63:0];\r\n ELSE DEST[127:64] <- +0.0; FI;\r\n\r\nDPPD (128-bit Legacy SSE version)\r\nDEST[127:0]<-DP_Primitive(SRC1[127:0], 
SRC2[127:0]);\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVDPPD (VEX.128 encoded version)\r\nDEST[127:0]<-DP_Primitive(SRC1[127:0], SRC2[127:0]);\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nFlags Affected\r\nNone\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nDPPD: __m128d _mm_dp_pd ( __m128d a, __m128d b, const int mask);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\nExceptions are determined separately for each add and multiply operation. Unmasked exceptions will leave the\r\ndestination untouched.\r\n\r\nOther Exceptions\r\nSee Exceptions Type 2; additionally\r\n#UD If VEX.L= 1.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DPPD"
},
{
"description": "DPPS - Dot Product of Packed Single Precision Floating-Point Values\r\n Opcode/ Op/ 64/32-bit CPUID Description\r\n Instruction En Mode Feature\r\n Flag\r\n 66 0F 3A 40 /r ib RMI V/V SSE4_1 Selectively multiply packed SP floating-point\r\n DPPS xmm1, xmm2/m128, imm8 values from xmm1 with packed SP floating-\r\n point values from xmm2, add and selectively\r\n store the packed SP floating-point values or\r\n zero values to xmm1.\r\n VEX.NDS.128.66.0F3A.WIG 40 /r ib RVMI V/V AVX Multiply packed SP floating point values from\r\n VDPPS xmm1,xmm2, xmm3/m128, imm8 xmm1 with packed SP floating point values\r\n from xmm2/mem selectively add and store to\r\n xmm1.\r\n VEX.NDS.256.66.0F3A.WIG 40 /r ib RVMI V/V AVX Multiply packed single-precision floating-point\r\n VDPPS ymm1, ymm2, ymm3/m256, imm8 values from ymm2 with packed SP floating\r\n point values from ymm3/mem, selectively add\r\n pairs of elements and store to ymm1.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:reg (r, w) ModRM:r/m (r) imm8 NA\r\n RVMI ModRM:reg (w) VEX.vvvv (r) ModRM:r/m (r) imm8\r\n\r\nDescription\r\nConditionally multiplies the packed single precision floating-point values in the destination operand (first operand)\r\nwith the packed single-precision floats in the source (second operand) depending on a mask extracted from the\r\nhigh 4 bits of the immediate byte (third operand). If a condition mask bit in Imm8[7:4] is zero, the corresponding\r\nmultiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of Intel 64 and IA-32 Archi-\r\ntectures Software Developer's Manual, Volume 1.\r\nThe four resulting single-precision values are summed into an intermediate result. 
The intermediate result is condi-\r\ntionally broadcasted to the destination using a broadcast mask specified by bits [3:0] of the immediate byte.\r\nIf a broadcast mask bit is \"1\", the intermediate result is copied to the corresponding dword element in the destina-\r\ntion operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.\r\nDPPS follows the NaN forwarding rules stated in the Software Developer's Manual, vol. 1, table 4.7. These rules do\r\nnot cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of\r\nthose NaNs in the destination is implementation dependent. NaNs on the input sources or computationally gener-\r\nated NaNs will have at least one NaN propagated to the destination.\r\n128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The desti-\r\nnation is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding\r\nYMM register destination are unmodified.\r\nVEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination\r\noperand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are\r\nzeroed.\r\nVEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM\r\nregister or a 256-bit memory location. 
The destination operand is a YMM register.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nDP_primitive (SRC1, SRC2)\r\nIF (imm8[4] = 1)\r\n THEN Temp1[31:0] <- DEST[31:0] * SRC[31:0]; // update SIMD exception flags\r\n ELSE Temp1[31:0] <- +0.0; FI;\r\nIF (imm8[5] = 1)\r\n THEN Temp1[63:32] <- DEST[63:32] * SRC[63:32]; // update SIMD exception flags\r\n ELSE Temp1[63:32] <- +0.0; FI;\r\nIF (imm8[6] = 1)\r\n THEN Temp1[95:64] <- DEST[95:64] * SRC[95:64]; // update SIMD exception flags\r\n ELSE Temp1[95:64] <- +0.0; FI;\r\nIF (imm8[7] = 1)\r\n THEN Temp1[127:96] <- DEST[127:96] * SRC[127:96]; // update SIMD exception flags\r\n ELSE Temp1[127:96] <- +0.0; FI;\r\n\r\nTemp2[31:0] <- Temp1[31:0] + Temp1[63:32]; // update SIMD exception flags\r\n/* if unmasked exception reported, execute exception handler*/\r\nTemp3[31:0] <- Temp1[95:64] + Temp1[127:96]; // update SIMD exception flags\r\n/* if unmasked exception reported, execute exception handler*/\r\nTemp4[31:0] <- Temp2[31:0] + Temp3[31:0]; // update SIMD exception flags\r\n/* if unmasked exception reported, execute exception handler*/\r\n\r\nIF (imm8[0] = 1)\r\n THEN DEST[31:0] <- Temp4[31:0];\r\n ELSE DEST[31:0] <- +0.0; FI;\r\nIF (imm8[1] = 1)\r\n THEN DEST[63:32] <- Temp4[31:0];\r\n ELSE DEST[63:32] <- +0.0; FI;\r\nIF (imm8[2] = 1)\r\n THEN DEST[95:64] <- Temp4[31:0];\r\n ELSE DEST[95:64] <- +0.0; FI;\r\nIF (imm8[3] = 1)\r\n THEN DEST[127:96] <- Temp4[31:0];\r\n ELSE DEST[127:96] <- +0.0; FI;\r\n\r\nDPPS (128-bit Legacy SSE version)\r\nDEST[127:0]<-DP_Primitive(SRC1[127:0], SRC2[127:0]);\r\nDEST[VLMAX-1:128] (Unmodified)\r\n\r\nVDPPS (VEX.128 encoded version)\r\nDEST[127:0]<-DP_Primitive(SRC1[127:0], SRC2[127:0]);\r\nDEST[VLMAX-1:128] <- 0\r\n\r\nVDPPS (VEX.256 encoded version)\r\nDEST[127:0]<-DP_Primitive(SRC1[127:0], SRC2[127:0]);\r\nDEST[255:128]<-DP_Primitive(SRC1[255:128], SRC2[255:128]);\r\n\r\nFlags Affected\r\nNone\r\n\r\n\r\n\r\n\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\n(V)DPPS: __m128 _mm_dp_ps ( __m128 a, 
__m128 b, const int mask);\r\n\r\nVDPPS: __m256 _mm256_dp_ps ( __m256 a, __m256 b, const int mask);\r\n\r\nSIMD Floating-Point Exceptions\r\nOverflow, Underflow, Invalid, Precision, Denormal\r\nExceptions are determined separately for each add and multiply operation, in the order of their execution.\r\nUnmasked exceptions will leave the destination operands unchanged.\r\n\r\nOther Exceptions\r\nSee Exceptions Type 2.\r\n\r\n\r\n\r\n\r\n",
"mnem": "DPPS"
},
{
"description": "EMMS-Empty MMX Technology State\r\nOpcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\n0F 77 EMMS NP Valid Valid Set the x87 FPU tag word to empty.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n NP NA NA NA NA\r\n\r\nDescription\r\nSets the values of all the tags in the x87 FPU tag word to empty (all 1s). This operation marks the x87 FPU data\r\nregisters (which are aliased to the MMX technology registers) as available for use by x87 FPU floating-point instruc-\r\ntions. (See Figure 8-7 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1, for the\r\nformat of the x87 FPU tag word.) All other MMX instructions (other than the EMMS instruction) set all the tags in\r\nx87 FPU tag word to valid (all 0s).\r\nThe EMMS instruction must be used to clear the MMX technology state at the end of all MMX technology procedures\r\nor subroutines and before calling other procedures or subroutines that may execute x87 floating-point instructions.\r\nIf a floating-point instruction loads one of the registers in the x87 FPU data register stack before the x87 FPU tag\r\nword has been reset by the EMMS instruction, an x87 floating-point register stack overflow can occur that will\r\nresult in an x87 floating-point exception or incorrect result.\r\nEMMS operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nx87FPUTagWord <- FFFFH;\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nvoid _mm_empty()\r\n\r\nFlags Affected\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#UD If CR0.EM[bit 2] = 1.\r\n#NM If CR0.TS[bit 3] = 1.\r\n#MF If there is a pending FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode 
Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n",
"mnem": "EMMS"
},
{
"description": "ENTER-Make Stack Frame for Procedure Parameters\r\nOpcode Instruction Op/ 64-Bit Compat/ Description\r\n En Mode Leg Mode\r\nC8 iw 00 ENTER imm16, 0 II Valid Valid Create a stack frame for a procedure.\r\nC8 iw 01 ENTER imm16,1 II Valid Valid Create a stack frame with a nested pointer for\r\n a procedure.\r\nC8 iw ib ENTER imm16, imm8 II Valid Valid Create a stack frame with nested pointers for\r\n a procedure.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n II iw imm8 NA NA\r\n\r\nDescription\r\nCreates a stack frame (comprising of space for dynamic storage and 1-32 frame pointer storage) for a procedure.\r\nThe first operand (imm16) specifies the size of the dynamic storage in the stack frame (that is, the number of bytes\r\nof dynamically allocated on the stack for the procedure). The second operand (imm8) gives the lexical nesting level\r\n(0 to 31) of the procedure. The nesting level (imm8 mod 32) and the OperandSize attribute determine the size in\r\nbytes of the storage space for frame pointers.\r\nThe nesting level determines the number of frame pointers that are copied into the \"display area\" of the new stack\r\nframe from the preceding frame. The default size of the frame pointer is the StackAddrSize attribute, but can be\r\noverridden using the 66H prefix. Thus, the OperandSize attribute determines the size of each frame pointer that\r\nwill be copied into the stack frame and the data being transferred from SP/ESP/RSP register into the BP/EBP/RBP\r\nregister.\r\nThe ENTER and companion LEAVE instructions are provided to support block structured languages. The ENTER\r\ninstruction (when used) is typically the first instruction in a procedure and is used to set up a new stack frame for\r\na procedure. 
The LEAVE instruction is then used at the end of the procedure (just before the RET instruction) to\r\nrelease the stack frame.\r\nIf the nesting level is 0, the processor pushes the frame pointer from the BP/EBP/RBP register onto the stack,\r\ncopies the current stack pointer from the SP/ESP/RSP register into the BP/EBP/RBP register, and loads the\r\nSP/ESP/RSP register with the current stack-pointer value minus the value in the size operand. For nesting levels of\r\n1 or greater, the processor pushes additional frame pointers on the stack before adjusting the stack pointer. These\r\nadditional frame pointers provide the called procedure with access points to other nested frames on the stack. See\r\n\"Procedure Calls for Block-Structured Languages\" in Chapter 6 of the Intel 64 and IA-32 Architectures Software\r\nDeveloper's Manual, Volume 1, for more information about the actions of the ENTER instruction.\r\nThe ENTER instruction causes a page fault whenever a write using the final value of the stack pointer (within the\r\ncurrent stack segment) would do so.\r\nIn 64-bit mode, the default operation size is 64 bits; a 32-bit operation size cannot be encoded. Use of the 66H prefix\r\nchanges the frame pointer operand size to 16 bits.\r\nWhen the 66H prefix is used, causing the OperandSize attribute to be less than the StackAddrSize, software is\r\nresponsible for the following:\r\n- The companion LEAVE instruction must also use the 66H prefix.\r\n- The value in the RBP/EBP register prior to executing \"66H ENTER\" must be within the same 16KByte region of\r\n the current stack pointer (RSP/ESP), such that the value of RBP/EBP after \"66H ENTER\" remains a valid address\r\n in the stack. 
This ensures \"66H LEAVE\" can restore 16 bits of data from the stack.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nAllocSize <- imm16;\r\nNestingLevel <- imm8 MOD 32;\r\nIF (OperandSize = 64)\r\n THEN\r\n Push(RBP); (* RSP decrements by 8 *)\r\n FrameTemp <- RSP;\r\n ELSE IF OperandSize = 32\r\n THEN\r\n Push(EBP); (* (E)SP decrements by 4 *)\r\n FrameTemp <- ESP; FI;\r\n ELSE (* OperandSize = 16 *)\r\n Push(BP); (* RSP or (E)SP decrements by 2 *)\r\n FrameTemp <- SP;\r\nFI;\r\n\r\nIF NestingLevel = 0\r\n THEN GOTO CONTINUE;\r\nFI;\r\n\r\nIF (NestingLevel > 1)\r\n THEN FOR i <- 1 to (NestingLevel - 1)\r\n DO\r\n IF (OperandSize = 64)\r\n THEN\r\n RBP <- RBP - 8;\r\n Push([RBP]); (* Quadword push *)\r\n ELSE IF OperandSize = 32\r\n THEN\r\n IF StackSize = 32\r\n THEN\r\n EBP <- EBP - 4;\r\n Push([EBP]); (* Doubleword push *)\r\n ELSE (* StackSize = 16 *)\r\n BP <- BP - 4;\r\n Push([BP]); (* Doubleword push *)\r\n FI;\r\n FI;\r\n ELSE (* OperandSize = 16 *)\r\n IF StackSize = 32\r\n THEN\r\n EBP <- EBP - 2;\r\n Push([EBP]); (* Word push *)\r\n ELSE (* StackSize = 16 *)\r\n BP <- BP - 2;\r\n Push([BP]); (* Word push *)\r\n FI;\r\n FI;\r\n OD;\r\nFI;\r\n\r\nIF (OperandSize = 64) (* nestinglevel 1 *)\r\n THEN\r\n Push(FrameTemp); (* Quadword push and RSP decrements by 8 *)\r\n ELSE IF OperandSize = 32\r\n THEN\r\n Push(FrameTemp); FI; (* Doubleword push and (E)SP decrements by 4 *)\r\n ELSE (* OperandSize = 16 *)\r\n Push(FrameTemp); (* Word push and RSP|ESP|SP decrements by 2 *)\r\nFI;\r\n\r\nCONTINUE:\r\nIF 64-Bit Mode (StackSize = 64)\r\n THEN\r\n RBP <- FrameTemp;\r\n RSP <- RSP - AllocSize;\r\n ELSE IF OperandSize = 32\r\n THEN\r\n EBP <- FrameTemp;\r\n ESP <- ESP - AllocSize; FI;\r\n ELSE (* OperandSize = 16 *)\r\n BP <- FrameTemp[15:0]; (* Bits 16 and above of applicable RBP/EBP are unmodified *)\r\n SP <- SP - AllocSize;\r\nFI;\r\n\r\nEND;\r\n\r\nFlags Affected\r\nNone.\r\n\r\nProtected Mode Exceptions\r\n#SS(0) If the new value of the SP or ESP register is 
outside the stack segment limit.\r\n#PF(fault-code) If a page fault occurs or if a write using the final value of the stack pointer (within the current\r\n stack segment) would cause a page fault.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#SS If the new value of the SP or ESP register is outside the stack segment limit.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#SS(0) If the new value of the SP or ESP register is outside the stack segment limit.\r\n#PF(fault-code) If a page fault occurs or if a write using the final value of the stack pointer (within the current\r\n stack segment) would cause a page fault.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If the stack address is in a non-canonical form.\r\n#PF(fault-code) If a page fault occurs or if a write using the final value of the stack pointer (within the current\r\n stack segment) would cause a page fault.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "ENTER"
},
{
"description": "EXTRACTPS-Extract Packed Floating-Point Values\r\n Opcode/ Op / 64/32 CPUID Description\r\n Instruction En bit Mode Feature\r\n Support Flag\r\n 66 0F 3A 17 /r ib RMI VV SSE4_1 Extract one single-precision floating-point value\r\n EXTRACTPS reg/m32, xmm1, imm8 from xmm1 at the offset specified by imm8 and\r\n store the result in reg or m32. Zero extend the\r\n results in 64-bit register if applicable.\r\n VEX.128.66.0F3A.WIG 17 /r ib RMI V/V AVX Extract one single-precision floating-point value\r\n VEXTRACTPS reg/m32, xmm1, imm8 from xmm1 at the offset specified by imm8 and\r\n store the result in reg or m32. Zero extend the\r\n results in 64-bit register if applicable.\r\n EVEX.128.66.0F3A.WIG 17 /r ib T1S V/V AVX512F Extract one single-precision floating-point value\r\n VEXTRACTPS reg/m32, xmm1, imm8 from xmm1 at the offset specified by imm8 and\r\n store the result in reg or m32. Zero extend the\r\n results in 64-bit register if applicable.\r\n\r\n\r\n\r\n Instruction Operand Encoding\r\n Op/En Operand 1 Operand 2 Operand 3 Operand 4\r\n RMI ModRM:r/m (w) ModRM:reg (r) Imm8 NA\r\n T1S ModRM:r/m (w) ModRM:reg (r) Imm8 NA\r\n\r\nDescription\r\nExtracts a single-precision floating-point value from the source operand (second operand) at the 32-bit offset spec-\r\nified from imm8. Immediate bits higher than the most significant offset for the vector length are ignored.\r\nThe extracted single-precision floating-point value is stored in the low 32-bits of the destination operand\r\nIn 64-bit mode, destination register operand has default operand size of 64 bits. The upper 32-bits of the register\r\nare filled with zero. 
REX.W is ignored.\r\nVEX.128 and EVEX encoded version: When VEX.W1 or EVEX.W1 form is used in 64-bit mode with a general\r\npurpose register (GPR) as a destination operand, the packed single quantity is zero extended to 64 bits.\r\nVEX.vvvv/EVEX.vvvv is reserved and must be 1111b; otherwise the instruction will #UD.\r\n128-bit Legacy SSE version: When a REX.W prefix is used in 64-bit mode with a general purpose register (GPR) as\r\na destination operand, the packed single quantity is zero extended to 64 bits.\r\nThe source register is an XMM register. Imm8[1:0] determine the starting DWORD offset from which to extract the\r\n32-bit floating-point value.\r\nAn attempt to execute VEXTRACTPS encoded with VEX.L = 1 will cause\r\na #UD exception.\r\n\r\nOperation\r\nVEXTRACTPS (EVEX and VEX.128 encoded version)\r\nSRC_OFFSET <- IMM8[1:0]\r\nIF (64-Bit Mode and DEST is register)\r\n DEST[31:0] <- (SRC[127:0] >> (SRC_OFFSET*32)) AND 0FFFFFFFFh\r\n DEST[63:32] <- 0\r\nELSE\r\n DEST[31:0] <- (SRC[127:0] >> (SRC_OFFSET*32)) AND 0FFFFFFFFh\r\nFI\r\n\r\nEXTRACTPS (128-bit Legacy SSE version)\r\nSRC_OFFSET <- IMM8[1:0]\r\nIF (64-Bit Mode and DEST is register)\r\n DEST[31:0] <- (SRC[127:0] >> (SRC_OFFSET*32)) AND 0FFFFFFFFh\r\n DEST[63:32] <- 0\r\nELSE\r\n DEST[31:0] <- (SRC[127:0] >> (SRC_OFFSET*32)) AND 0FFFFFFFFh\r\nFI\r\n\r\nIntel C/C++ Compiler Intrinsic Equivalent\r\nEXTRACTPS int _mm_extract_ps (__m128 a, const int nidx);\r\n\r\nSIMD Floating-Point Exceptions\r\nNone\r\n\r\nOther Exceptions\r\nVEX-encoded instructions, see Exceptions Type 5.\r\nEVEX-encoded instructions, see Exceptions Type E9NF.\r\nAdditionally:\r\n#UD If VEX.L = 1.\r\n#UD If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.\r\n\r\n\r\n\r\n\r\n",
"mnem": "EXTRACTPS"
},
{
"description": "F2XM1-Compute 2x-1\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD9 F0 F2XM1 Valid Valid Replace ST(0) with (2ST(0) - 1).\r\n\r\n\r\n\r\nDescription\r\nComputes the exponential value of 2 to the power of the source operand minus 1. The source operand is located in\r\nregister ST(0) and the result is also stored in ST(0). The value of the source operand must lie in the range -1.0 to\r\n+1.0. If the source value is outside this range, the result is undefined.\r\nThe following table shows the results obtained when computing the exponential value of various classes of\r\nnumbers, assuming that neither overflow nor underflow occurs.\r\n Table 3-16. Results Obtained from F2XM1\r\n ST(0) SRC ST(0) DEST\r\n - 1.0 to -0 - 0.5 to - 0\r\n -0 -0\r\n +0 +0\r\n + 0 to +1.0 + 0 to 1.0\r\n\r\nValues other than 2 can be exponentiated using the following formula:\r\n\r\n xy <- 2(y * log2x)\r\n\r\n\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nST(0) <- (2ST(0) - 1);\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Set if result was rounded up; cleared otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Source operand is an SNaN value or unsupported format.\r\n#D Source is a denormal value.\r\n#U Result is too small for destination format.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "F2XM1"
},
{
"description": "FABS-Absolute Value\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n D9 E1 FABS Valid Valid Replace ST with its absolute value.\r\n\r\n\r\n\r\nDescription\r\nClears the sign bit of ST(0) to create the absolute value of the operand. The following table shows the results\r\nobtained when creating the absolute value of various classes of numbers.\r\n\r\n Table 3-17. Results Obtained from FABS\r\n ST(0) SRC ST(0) DEST\r\n -inf +inf\r\n -F +F\r\n -0 +0\r\n +0 +0\r\n +F +F\r\n +inf +inf\r\n NaN NaN\r\n NOTES:\r\n F Means finite floating-point value.\r\n\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nST(0) <- |ST(0)|;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n",
"mnem": "FABS"
},
{
"description": "FADD/FADDP/FIADD-Add\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD8 /0 FADD m32fp Valid Valid Add m32fp to ST(0) and store result in ST(0).\r\nDC /0 FADD m64fp Valid Valid Add m64fp to ST(0) and store result in ST(0).\r\nD8 C0+i FADD ST(0), ST(i) Valid Valid Add ST(0) to ST(i) and store result in ST(0).\r\nDC C0+i FADD ST(i), ST(0) Valid Valid Add ST(i) to ST(0) and store result in ST(i).\r\nDE C0+i FADDP ST(i), ST(0) Valid Valid Add ST(0) to ST(i), store result in ST(i), and pop the\r\n register stack.\r\nDE C1 FADDP Valid Valid Add ST(0) to ST(1), store result in ST(1), and pop the\r\n register stack.\r\nDA /0 FIADD m32int Valid Valid Add m32int to ST(0) and store result in ST(0).\r\nDE /0 FIADD m16int Valid Valid Add m16int to ST(0) and store result in ST(0).\r\n\r\n\r\n\r\nDescription\r\nAdds the destination and source operands and stores the sum in the destination location. The destination operand\r\nis always an FPU register; the source operand can be a register or a memory location. Source operands in memory\r\ncan be in single-precision or double-precision floating-point format or in word or doubleword integer format.\r\nThe no-operand version of the instruction adds the contents of the ST(0) register to the ST(1) register. The one-\r\noperand version adds the contents of a memory location (either a floating-point or an integer value) to the contents\r\nof the ST(0) register. The two-operand version, adds the contents of the ST(0) register to the ST(i) register or vice\r\nversa. The value in ST(0) can be doubled by coding:\r\n\r\n FADD ST(0), ST(0);\r\nThe FADDP instructions perform the additional operation of popping the FPU register stack after storing the result.\r\nTo pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP)\r\nby 1. (The no-operand version of the floating-point add instructions always results in the register stack being\r\npopped. 
In some assemblers, the mnemonic for this instruction is FADD rather than FADDP.)\r\nThe FIADD instructions convert an integer source operand to double extended-precision floating-point format\r\nbefore performing the addition.\r\nTable 3-18 shows the results obtained when adding various classes of numbers, assuming that\r\nneither overflow nor underflow occurs.\r\nWhen the sum of two operands with opposite signs is 0, the result is +0, except for the round toward -inf mode, in\r\nwhich case the result is -0. When the source operand is an integer 0, it is treated as a +0.\r\nWhen both operands are infinities of the same sign, the result is inf of the expected sign. If both operands are\r\ninfinities of opposite signs, an invalid-operation exception is generated. See Table 3-18.\r\n\r\n Table 3-18. FADD/FADDP/FIADD Results\r\n                                        DEST\r\n              -inf        -F          -0          +0          +F          +inf        NaN\r\n     -inf     -inf        -inf        -inf        -inf        -inf        *           NaN\r\n     -F or -I -inf        -F          SRC         SRC         +-F or +-0  +inf        NaN\r\nSRC  -0       -inf        DEST        -0          +-0         DEST        +inf        NaN\r\n     +0       -inf        DEST        +-0         +0          DEST        +inf        NaN\r\n     +F or +I -inf        +-F or +-0  SRC         SRC         +F          +inf        NaN\r\n     +inf     *           +inf        +inf        +inf        +inf        +inf        NaN\r\n     NaN      NaN         NaN         NaN         NaN         NaN         NaN         NaN\r\n NOTES:\r\n F Means finite floating-point value.\r\n I Means integer.\r\n * Indicates floating-point invalid-arithmetic-operand (#IA) exception.\r\n\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nIF Instruction = FIADD\r\n THEN\r\n DEST <- DEST + ConvertToDoubleExtendedPrecisionFP(SRC);\r\n ELSE (* Source operand is floating-point value *)\r\n DEST <- DEST + SRC;\r\nFI;\r\n\r\nIF Instruction = FADDP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Set if result was rounded up; cleared otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Operand is an SNaN value or unsupported format.\r\n Operands are 
infinities of unlike sign.\r\n#D Source operand is a denormal value.\r\n#U Result is too small for destination format.\r\n#O Result is too large for destination format.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current 
privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FADD"
},
{
"description": "-R:FADD",
"mnem": "FADDP"
},
{
"description": "FBLD-Load Binary Coded Decimal\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n DF /4 FBLD m80dec Valid Valid Convert BCD value to floating-point and push onto the\r\n FPU stack.\r\n\r\n\r\n\r\nDescription\r\nConverts the BCD source operand into double extended-precision floating-point format and pushes the value onto\r\nthe FPU stack. The source operand is loaded without rounding errors. The sign of the source operand is preserved,\r\nincluding that of -0.\r\nThe packed BCD digits are assumed to be in the range 0 through 9; the instruction does not check for invalid digits\r\n(AH through FH). Attempting to load an invalid encoding produces an undefined result.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nTOP <- TOP - 1;\r\nST(0) <- ConvertToDoubleExtendedPrecisionFP(SRC);\r\n\r\nFPU Flags Affected\r\nC1 Set to 1 if stack overflow occurred; otherwise, set to 0.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack overflow occurred.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, 
or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FBLD"
},
{
"description": "FBSTP-Store BCD Integer and Pop\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nDF /6 FBSTP m80bcd Valid Valid Store ST(0) in m80bcd and pop ST(0).\r\n\r\n\r\n\r\nDescription\r\nConverts the value in the ST(0) register to an 18-digit packed BCD integer, stores the result in the destination\r\noperand, and pops the register stack. If the source value is a non-integral value, it is rounded to an integer value,\r\naccording to rounding mode specified by the RC field of the FPU control word. To pop the register stack, the\r\nprocessor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.\r\nThe destination operand specifies the address where the first byte destination value is to be stored. The BCD value\r\n(including its sign bit) requires 10 bytes of space in memory.\r\nThe following table shows the results obtained when storing various classes of numbers in packed BCD format.\r\n Table 3-19. FBSTP Results\r\n ST(0) DEST\r\n - inf or Value Too Large for DEST Format *\r\n F<=-1 -D\r\n -1 < F < -0 **\r\n -0 -0\r\n +0 +0\r\n + 0 < F < +1 **\r\n F >= +1 +D\r\n + inf or Value Too Large for DEST Format *\r\n NaN *\r\n NOTES:\r\n F Means finite floating-point value.\r\n D Means packed-BCD number.\r\n * Indicates floating-point invalid-operation (#IA) exception.\r\n ** +-0 or +-1, depending on the rounding mode.\r\n\r\nIf the converted value is too large for the destination format, or if the source operand is an inf, SNaN, QNAN, or is in\r\nan unsupported format, an invalid-arithmetic-operand condition is signaled. If the invalid-operation exception is\r\nnot masked, an invalid-arithmetic-operand exception (#IA) is generated and no value is stored in the destination\r\noperand. 
If the invalid-operation exception is masked, the packed BCD indefinite value is stored in memory.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nDEST <- BCD(ST(0));\r\nPopRegisterStack;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Set if result was rounded up; cleared otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\n\r\n\r\n\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Converted value that exceeds 18 BCD digits in length.\r\n Source operand is an SNaN, QNaN, +-inf, or in an unsupported format.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a segment register is being loaded with a segment selector that points to a non-writable\r\n segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD 
If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FBSTP"
},
{
"description": "FCHS-Change Sign\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD9 E0 FCHS Valid Valid Complements sign of ST(0).\r\n\r\n\r\n\r\nDescription\r\nComplements the sign bit of ST(0). This operation changes a positive value into a negative value of equal magni-\r\ntude or vice versa. The following table shows the results obtained when changing the sign of various classes of\r\nnumbers.\r\n Table 3-20. FCHS Results\r\n ST(0) SRC ST(0) DEST\r\n -inf +inf\r\n -F +F\r\n -0 +0\r\n +0 -0\r\n +F -F\r\n +inf -inf\r\n NaN NaN\r\n NOTES:\r\n * F means finite floating-point value.\r\n\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nSignBit(ST(0)) <- NOT (SignBit(ST(0)));\r\n\r\nFPU Flags Affected\r\nC1 Set to 0.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FCHS"
},
{
"description": "FCLEX/FNCLEX - Clear Exceptions\r\n Opcode* Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n 9B DB E2 FCLEX Valid Valid Clear floating-point exception flags after checking for\r\n pending unmasked floating-point exceptions.\r\n DB E2 FNCLEX* Valid Valid Clear floating-point exception flags without checking for\r\n pending unmasked floating-point exceptions.\r\n NOTES:\r\n * See IA-32 Architecture Compatibility section below.\r\n\r\n\r\n\r\nDescription\r\nClears the floating-point exception flags (PE, UE, OE, ZE, DE, and IE), the exception summary status flag (ES), the\r\nstack fault flag (SF), and the busy flag (B) in the FPU status word. The FCLEX instruction checks for and handles\r\nany pending unmasked floating-point exceptions before clearing the exception flags; the FNCLEX instruction does\r\nnot.\r\nThe assembler issues two instructions for the FCLEX instruction (an FWAIT instruction followed by an FNCLEX\r\ninstruction), and the processor executes each of these instructions separately. If an exception is generated for\r\neither of these instructions, the save EIP points to the instruction that caused the exception.\r\n\r\nIA-32 Architecture Compatibility\r\nWhen operating a Pentium or Intel486 processor in MS-DOS* compatibility mode, it is possible (under unusual\r\ncircumstances) for an FNCLEX instruction to be interrupted prior to being executed to handle a pending FPU excep-\r\ntion. See the section titled \"No-Wait FPU Instructions Can Get FPU Interrupt in Window\" in Appendix D of the Intel\r\n64 and IA-32 Architectures Software Developer's Manual, Volume 1, for a description of these circumstances. An\r\nFNCLEX instruction cannot be interrupted in this way on later Intel processors, except for the Intel QuarkTM X1000\r\nprocessor.\r\nThis instruction affects only the x87 FPU floating-point exception flags. 
It does not affect the SIMD floating-point\r\nexception flags in the MXCSR register.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nFPUStatusWord[0:7] <- 0;\r\nFPUStatusWord[15] <- 0;\r\n\r\nFPU Flags Affected\r\nThe PE, UE, OE, ZE, DE, IE, ES, SF, and B flags in the FPU status word are cleared. The C0, C1, C2, and C3 flags are\r\nundefined.\r\n\r\nFloating-Point Exceptions\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FCLEX"
},
{
"description": "FCMOVcc-Floating-Point Conditional Move\r\nOpcode* Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode*\r\nDA C0+i FCMOVB ST(0), ST(i) Valid Valid Move if below (CF=1).\r\nDA C8+i FCMOVE ST(0), ST(i) Valid Valid Move if equal (ZF=1).\r\nDA D0+i FCMOVBE ST(0), ST(i) Valid Valid Move if below or equal (CF=1 or ZF=1).\r\nDA D8+i FCMOVU ST(0), ST(i) Valid Valid Move if unordered (PF=1).\r\nDB C0+i FCMOVNB ST(0), ST(i) Valid Valid Move if not below (CF=0).\r\nDB C8+i FCMOVNE ST(0), ST(i) Valid Valid Move if not equal (ZF=0).\r\nDB D0+i FCMOVNBE ST(0), ST(i) Valid Valid Move if not below or equal (CF=0 and ZF=0).\r\nDB D8+i FCMOVNU ST(0), ST(i) Valid Valid Move if not unordered (PF=0).\r\nNOTES:\r\n* See IA-32 Architecture Compatibility section below.\r\n\r\n\r\n\r\nDescription\r\nTests the status flags in the EFLAGS register and moves the source operand (second operand) to the destination\r\noperand (first operand) if the given test condition is true. The condition for each mnemonic os given in the Descrip-\r\ntion column above and in Chapter 8 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume\r\n1. The source operand is always in the ST(i) register and the destination operand is always ST(0).\r\nThe FCMOVcc instructions are useful for optimizing small IF constructions. They also help eliminate branching\r\noverhead for IF operations and the possibility of branch mispredictions by the processor.\r\nA processor may not support the FCMOVcc instructions. Software can check if the FCMOVcc instructions are\r\nsupported by checking the processor's feature information with the CPUID instruction (see \"COMISS-Compare\r\nScalar Ordered Single-Precision Floating-Point Values and Set EFLAGS\" in this chapter). 
If both the CMOV and FPU\r\nfeature bits are set, the FCMOVcc instructions are supported.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nIA-32 Architecture Compatibility\r\nThe FCMOVcc instructions were introduced to the IA-32 Architecture in the P6 family processors and are not avail-\r\nable in earlier IA-32 processors.\r\n\r\nOperation\r\nIF condition TRUE\r\n THEN ST(0) <- ST(i);\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n\r\nInteger Flags Affected\r\nNone.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FCMOVcc"
},
{
"description": "FCOM/FCOMP/FCOMPP-Compare Floating Point Values\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n D8 /2 FCOM m32fp Valid Valid Compare ST(0) with m32fp.\r\n DC /2 FCOM m64fp Valid Valid Compare ST(0) with m64fp.\r\n D8 D0+i FCOM ST(i) Valid Valid Compare ST(0) with ST(i).\r\n D8 D1 FCOM Valid Valid Compare ST(0) with ST(1).\r\n D8 /3 FCOMP m32fp Valid Valid Compare ST(0) with m32fp and pop register stack.\r\n DC /3 FCOMP m64fp Valid Valid Compare ST(0) with m64fp and pop register stack.\r\n D8 D8+i FCOMP ST(i) Valid Valid Compare ST(0) with ST(i) and pop register stack.\r\n D8 D9 FCOMP Valid Valid Compare ST(0) with ST(1) and pop register stack.\r\n DE D9 FCOMPP Valid Valid Compare ST(0) with ST(1) and pop register stack\r\n twice.\r\n\r\n\r\n\r\nDescription\r\nCompares the contents of register ST(0) and source value and sets condition code flags C0, C2, and C3 in the FPU\r\nstatus word according to the results (see the table below). The source operand can be a data register or a memory\r\nlocation. If no source operand is given, the value in ST(0) is compared with the value in ST(1). The sign of zero is\r\nignored, so that -0.0 is equal to +0.0.\r\n Table 3-21. FCOM/FCOMP/FCOMPP Results\r\n Condition C3 C2 C0\r\n ST(0) > SRC 0 0 0\r\n ST(0) < SRC 0 0 1\r\n ST(0) = SRC 1 0 0\r\n Unordered* 1 1 1\r\n NOTES:\r\n * Flags not set if unmasked invalid-arithmetic-operand (#IA) exception is generated.\r\n\r\nThis instruction checks the class of the numbers being compared (see \"FXAM-Examine Floating-Point\" in this\r\nchapter). 
If either operand is a NaN or is in an unsupported format, an invalid-arithmetic-operand exception (#IA)\r\nis raised and, if the exception is masked, the condition flags are set to \"unordered.\" If the invalid-arithmetic-\r\noperand exception is unmasked, the condition code flags are not set.\r\nThe FCOMP instruction pops the register stack following the comparison operation and the FCOMPP instruction\r\npops the register stack twice following the comparison operation. To pop the register stack, the processor marks\r\nthe ST(0) register as empty and increments the stack pointer (TOP) by 1.\r\nThe FCOM instructions perform the same operation as the FUCOM instructions. The only difference is how they\r\nhandle QNaN operands. The FCOM instructions raise an invalid-arithmetic-operand exception (#IA) when either or\r\nboth of the operands is a NaN value or is in an unsupported format. The FUCOM instructions perform the same\r\noperation as the FCOM instructions, except that they do not generate an invalid-arithmetic-operand exception for\r\nQNaNs.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nCASE (relation of operands) OF\r\n ST > SRC: C3, C2, C0 <- 000;\r\n ST < SRC: C3, C2, C0 <- 001;\r\n ST = SRC: C3, C2, C0 <- 100;\r\nESAC;\r\n\r\nIF ST(0) or SRC = NaN or unsupported format\r\n THEN\r\n #IA\r\n IF FPUControlWord.IM = 1\r\n THEN\r\n C3, C2, C0 <- 111;\r\n FI;\r\nFI;\r\n\r\nIF Instruction = FCOMP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nIF Instruction = FCOMPP\r\n THEN\r\n PopRegisterStack;\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0.\r\nC0, C2, C3 See table on previous page.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA One or both operands are NaN values or have unsupported formats.\r\n Register is marked empty.\r\n#D One or both operands are denormal values.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective 
address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FCOM"
},
{
"description": "FCOMI/FCOMIP/ FUCOMI/FUCOMIP-Compare Floating Point Values and Set EFLAGS\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n DB F0+i FCOMI ST, ST(i) Valid Valid Compare ST(0) with ST(i) and set status flags accordingly.\r\n DF F0+i FCOMIP ST, ST(i) Valid Valid Compare ST(0) with ST(i), set status flags accordingly, and\r\n pop register stack.\r\n DB E8+i FUCOMI ST, ST(i) Valid Valid Compare ST(0) with ST(i), check for ordered values, and set\r\n status flags accordingly.\r\n DF E8+i FUCOMIP ST, ST(i) Valid Valid Compare ST(0) with ST(i), check for ordered values, set\r\n status flags accordingly, and pop register stack.\r\n\r\n\r\n\r\nDescription\r\nPerforms an unordered comparison of the contents of registers ST(0) and ST(i) and sets the status flags ZF, PF, and\r\nCF in the EFLAGS register according to the results (see the table below). The sign of zero is ignored for compari-\r\nsons, so that -0.0 is equal to +0.0.\r\n Table 3-22. FCOMI/FCOMIP/ FUCOMI/FUCOMIP Results\r\n Comparison Results* ZF PF CF\r\n ST0 > ST(i) 0 0 0\r\n ST0 < ST(i) 0 0 1\r\n ST0 = ST(i) 1 0 0\r\n Unordered** 1 1 1\r\n NOTES:\r\n * See the IA-32 Architecture Compatibility section below.\r\n ** Flags not set if unmasked invalid-arithmetic-operand (#IA) exception is generated.\r\n\r\nAn unordered comparison checks the class of the numbers being compared (see \"FXAM-Examine Floating-Point\"\r\nin this chapter). The FUCOMI/FUCOMIP instructions perform the same operations as the FCOMI/FCOMIP instruc-\r\ntions. The only difference is that the FUCOMI/FUCOMIP instructions raise the invalid-arithmetic-operand exception\r\n(#IA) only when either or both operands are an SNaN or are in an unsupported format; QNaNs cause the condition\r\ncode flags to be set to unordered, but do not cause an exception to be generated. 
The FCOMI/FCOMIP instructions\r\nraise an invalid-operation exception when either or both of the operands are a NaN value of any kind or are in an\r\nunsupported format.\r\nIf the operation results in an invalid-arithmetic-operand exception being raised, the status flags in the EFLAGS\r\nregister are set only if the exception is masked.\r\nThe FCOMI/FCOMIP and FUCOMI/FUCOMIP instructions set the OF, SF and AF flags to zero in the EFLAGS register\r\n(regardless of whether an invalid-operation exception is detected).\r\nThe FCOMIP and FUCOMIP instructions also pop the register stack following the comparison operation. To pop the\r\nregister stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nIA-32 Architecture Compatibility\r\nThe FCOMI/FCOMIP/FUCOMI/FUCOMIP instructions were introduced to the IA-32 Architecture in the P6 family\r\nprocessors and are not available in earlier IA-32 processors.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nCASE (relation of operands) OF\r\n ST(0) > ST(i): ZF, PF, CF <- 000;\r\n ST(0) < ST(i): ZF, PF, CF <- 001;\r\n ST(0) = ST(i): ZF, PF, CF <- 100;\r\nESAC;\r\n\r\nIF Instruction is FCOMI or FCOMIP\r\n THEN\r\n IF ST(0) or ST(i) = NaN or unsupported format\r\n THEN\r\n #IA\r\n IF FPUControlWord.IM = 1\r\n THEN\r\n ZF, PF, CF <- 111;\r\n FI;\r\n FI;\r\nFI;\r\n\r\nIF Instruction is FUCOMI or FUCOMIP\r\n THEN\r\n IF ST(0) or ST(i) = QNaN, but not SNaN or unsupported format\r\n THEN\r\n ZF, PF, CF <- 111;\r\n ELSE (* ST(0) or ST(i) is SNaN or unsupported format *)\r\n #IA;\r\n IF FPUControlWord.IM = 1\r\n THEN\r\n ZF, PF, CF <- 111;\r\n FI;\r\n FI;\r\nFI;\r\n\r\nIF Instruction is FCOMIP or FUCOMIP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0.\r\nC0, C2, C3 Not affected.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA (FCOMI or FCOMIP instruction) One or both 
operands are NaN values or have unsupported\r\n formats.\r\n (FUCOMI or FUCOMIP instruction) One or both operands are SNaN values (but not QNaNs) or\r\n have undefined formats. Detection of a QNaN value does not raise an invalid-operand excep-\r\n tion.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FCOMI"
},
{
"description": "-R:FCOMI",
"mnem": "FCOMIP"
},
{
"description": "-R:FCOM",
"mnem": "FCOMP"
},
{
"description": "-R:FCOM",
"mnem": "FCOMPP"
},
{
"description": "FCOS- Cosine\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD9 FF FCOS Valid Valid Replace ST(0) with its approximate cosine.\r\n\r\n\r\n\r\nDescription\r\nComputes the approximate cosine of the source operand in register ST(0) and stores the result in ST(0). The\r\nsource operand must be given in radians and must be within the range -263 to +263. The following table shows the\r\nresults obtained when taking the cosine of various classes of numbers.\r\n Table 3-23. FCOS Results\r\n ST(0) SRC ST(0) DEST\r\n -inf *\r\n -F -1 to +1\r\n -0 +1\r\n +0 +1\r\n +F - 1 to + 1\r\n +inf *\r\n NaN NaN\r\n NOTES:\r\n F Means finite floating-point value.\r\n * Indicates floating-point invalid-arithmetic-operand (#IA) exception.\r\n\r\nIf the source operand is outside the acceptable range, the C2 flag in the FPU status word is set, and the value in\r\nregister ST(0) remains unchanged. The instruction does not raise an exception when the source operand is out of\r\nrange. It is up to the program to check the C2 flag for out-of-range conditions. Source values outside the range -\r\n263 to +263 can be reduced to the range of the instruction by subtracting an appropriate integer multiple of 2pi.\r\nHowever, even within the range -263 to +263, inaccurate results can occur because the finite approximation of pi\r\nused internally for argument reduction is not sufficient in all cases. Therefore, for accurate results it is safe to apply\r\nFCOS only to arguments reduced accurately in software, to a value smaller in absolute value than 3pi/8. 
See the\r\nsections titled \"Approximation of Pi\" and \"Transcendental Instruction Accuracy\" in Chapter 8 of the Intel 64 and\r\nIA-32 Architectures Software Developer's Manual, Volume 1, for a discussion of the proper value to use for pi in\r\nperforming such reductions.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nIF |ST(0)| < 2^63\r\nTHEN\r\n C2 <- 0;\r\n ST(0) <- FCOS(ST(0)); // approximation of cosine\r\nELSE (* Source operand is out-of-range *)\r\n C2 <- 1;\r\nFI;\r\n\r\n\r\n\r\n\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Set if result was rounded up; cleared otherwise.\r\n Undefined if C2 is 1.\r\nC2 Set to 1 if outside range (-2^63 < source operand < +2^63); otherwise, set to 0.\r\nC0, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Source operand is an SNaN value, inf, or unsupported format.\r\n#D Source is a denormal value.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FCOS"
},
{
"description": "FDECSTP-Decrement Stack-Top Pointer\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n D9 F6 FDECSTP Valid Valid Decrement TOP field in FPU status word.\r\n\r\n\r\n\r\nDescription\r\nSubtracts one from the TOP field of the FPU status word (decrements the top-of-stack pointer). If the TOP field\r\ncontains a 0, it is set to 7. The effect of this instruction is to rotate the stack by one position. The contents of the\r\nFPU data registers and tag register are not affected.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nIF TOP = 0\r\n THEN TOP <- 7;\r\n ELSE TOP <- TOP - 1;\r\nFI;\r\n\r\nFPU Flags Affected\r\nThe C1 flag is set to 0. The C0, C2, and C3 flags are undefined.\r\n\r\nFloating-Point Exceptions\r\nNone.\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FDECSTP"
},
{
"description": "FDIV/FDIVP/FIDIV-Divide\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD8 /6 FDIV m32fp Valid Valid Divide ST(0) by m32fp and store result in ST(0).\r\nDC /6 FDIV m64fp Valid Valid Divide ST(0) by m64fp and store result in ST(0).\r\nD8 F0+i FDIV ST(0), ST(i) Valid Valid Divide ST(0) by ST(i) and store result in ST(0).\r\nDC F8+i FDIV ST(i), ST(0) Valid Valid Divide ST(i) by ST(0) and store result in ST(i).\r\nDE F8+i FDIVP ST(i), ST(0) Valid Valid Divide ST(i) by ST(0), store result in ST(i), and pop the\r\n register stack.\r\nDE F9 FDIVP Valid Valid Divide ST(1) by ST(0), store result in ST(1), and pop\r\n the register stack.\r\nDA /6 FIDIV m32int Valid Valid Divide ST(0) by m32int and store result in ST(0).\r\nDE /6 FIDIV m16int Valid Valid Divide ST(0) by m16int and store result in ST(0).\r\n\r\n\r\n\r\nDescription\r\nDivides the destination operand by the source operand and stores the result in the destination location. The desti-\r\nnation operand (dividend) is always in an FPU register; the source operand (divisor) can be a register or a memory\r\nlocation. Source operands in memory can be in single-precision or double-precision floating-point format, word or\r\ndoubleword integer format.\r\nThe no-operand version of the instruction divides the contents of the ST(1) register by the contents of the ST(0)\r\nregister. The one-operand version divides the contents of the ST(0) register by the contents of a memory location\r\n(either a floating-point or an integer value). The two-operand version, divides the contents of the ST(0) register by\r\nthe contents of the ST(i) register or vice versa.\r\nThe FDIVP instructions perform the additional operation of popping the FPU register stack after storing the result.\r\nTo pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP)\r\nby 1. 
The no-operand version of the floating-point divide instructions always results in the register stack being\r\npopped. In some assemblers, the mnemonic for this instruction is FDIV rather than FDIVP.\r\nThe FIDIV instructions convert an integer source operand to double extended-precision floating-point format\r\nbefore performing the division. When the source operand is an integer 0, it is treated as a +0.\r\nIf an unmasked divide-by-zero exception (#Z) is generated, no result is stored; if the exception is masked, an inf of\r\nthe appropriate sign is stored in the destination operand.\r\nThe following table shows the results obtained when dividing various classes of numbers, assuming that neither\r\noverflow nor underflow occurs.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-24. FDIV/FDIVP/FIDIV Results\r\n DEST\r\n -inf -F -0 +0 +F +inf NaN\r\n -inf * +0 +0 -0 -0 * NaN\r\n -F +inf +F +0 -0 -F -inf NaN\r\n -I +inf +F +0 -0 -F -inf NaN\r\n SRC -0 +inf ** * * ** -inf NaN\r\n +0 -inf ** * * ** +inf NaN\r\n +I -inf -F -0 +0 +F +inf NaN\r\n +F -inf -F -0 +0 +F +inf NaN\r\n +inf * -0 -0 +0 +0 * NaN\r\n NaN NaN NaN NaN NaN NaN NaN NaN\r\n NOTES:\r\n F Means finite floating-point value.\r\n I Means integer.\r\n * Indicates floating-point invalid-arithmetic-operand (#IA) exception.\r\n ** Indicates floating-point zero-divide (#Z) exception.\r\n\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nIF SRC = 0\r\n THEN\r\n #Z;\r\n ELSE\r\n IF Instruction is FIDIV\r\n THEN\r\n DEST <- DEST / ConvertToDoubleExtendedPrecisionFP(SRC);\r\n ELSE (* Source operand is floating-point value *)\r\n DEST <- DEST / SRC;\r\n FI;\r\nFI;\r\n\r\nIF Instruction = FDIVP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Set if result was rounded up; cleared otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\n\r\n\r\n\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Operand is an SNaN 
value or unsupported format.\r\n +-inf / +-inf; +-0 / +-0\r\n#D Source is a denormal value.\r\n#Z DEST / +-0, where DEST is not equal to +-0.\r\n#U Result is too small for destination format.\r\n#O Result is too large for destination format.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an 
unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FDIV"
},
{
"description": "-R:FDIV",
"mnem": "FDIVP"
},
{
"description": "FDIVR/FDIVRP/FIDIVR-Reverse Divide\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD8 /7 FDIVR m32fp Valid Valid Divide m32fp by ST(0) and store result in ST(0).\r\nDC /7 FDIVR m64fp Valid Valid Divide m64fp by ST(0) and store result in ST(0).\r\nD8 F8+i FDIVR ST(0), ST(i) Valid Valid Divide ST(i) by ST(0) and store result in ST(0).\r\nDC F0+i FDIVR ST(i), ST(0) Valid Valid Divide ST(0) by ST(i) and store result in ST(i).\r\nDE F0+i FDIVRP ST(i), ST(0) Valid Valid Divide ST(0) by ST(i), store result in ST(i), and pop the\r\n register stack.\r\nDE F1 FDIVRP Valid Valid Divide ST(0) by ST(1), store result in ST(1), and pop the\r\n register stack.\r\nDA /7 FIDIVR m32int Valid Valid Divide m32int by ST(0) and store result in ST(0).\r\nDE /7 FIDIVR m16int Valid Valid Divide m16int by ST(0) and store result in ST(0).\r\n\r\n\r\n\r\nDescription\r\nDivides the source operand by the destination operand and stores the result in the destination location. The desti-\r\nnation operand (divisor) is always in an FPU register; the source operand (dividend) can be a register or a memory\r\nlocation. Source operands in memory can be in single-precision or double-precision floating-point format, word or\r\ndoubleword integer format.\r\nThese instructions perform the reverse operations of the FDIV, FDIVP, and FIDIV instructions. They are provided to\r\nsupport more efficient coding.\r\nThe no-operand version of the instruction divides the contents of the ST(0) register by the contents of the ST(1)\r\nregister. The one-operand version divides the contents of a memory location (either a floating-point or an integer\r\nvalue) by the contents of the ST(0) register. 
The two-operand version divides the contents of the ST(i) register by\r\nthe contents of the ST(0) register or vice versa.\r\nThe FDIVRP instructions perform the additional operation of popping the FPU register stack after storing the result.\r\nTo pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP)\r\nby 1. The no-operand version of the floating-point divide instructions always results in the register stack being\r\npopped. In some assemblers, the mnemonic for this instruction is FDIVR rather than FDIVRP.\r\nThe FIDIVR instructions convert an integer source operand to double extended-precision floating-point format\r\nbefore performing the division.\r\nIf an unmasked divide-by-zero exception (#Z) is generated, no result is stored; if the exception is masked, an inf of\r\nthe appropriate sign is stored in the destination operand.\r\nThe following table shows the results obtained when dividing various classes of numbers, assuming that neither\r\noverflow nor underflow occurs.\r\n\r\n\r\n\r\n\r\n\r\n Table 3-25. FDIVR/FDIVRP/FIDIVR Results\r\n DEST\r\n -inf -F -0 +0 +F +inf NaN\r\n -inf * +inf +inf -inf -inf * NaN\r\n SRC -F +0 +F ** ** -F -0 NaN\r\n -I +0 +F ** ** -F -0 NaN\r\n -0 +0 +0 * * -0 -0 NaN\r\n +0 -0 -0 * * +0 +0 NaN\r\n +I -0 -F ** ** +F +0 NaN\r\n +F -0 -F ** ** +F +0 NaN\r\n +inf * -inf -inf +inf +inf * NaN\r\n NaN NaN NaN NaN NaN NaN NaN NaN\r\n NOTES:\r\n F Means finite floating-point value.\r\n I Means integer.\r\n * Indicates floating-point invalid-arithmetic-operand (#IA) exception.\r\n ** Indicates floating-point zero-divide (#Z) exception.\r\n\r\nWhen the source operand is an integer 0, it is treated as a +0. 
This instruction's operation is the same in non-64-bit\r\nmodes and 64-bit mode.\r\n\r\nOperation\r\nIF DEST = 0\r\n THEN\r\n #Z;\r\n ELSE\r\n IF Instruction = FIDIVR\r\n THEN\r\n DEST <- ConvertToDoubleExtendedPrecisionFP(SRC) / DEST;\r\n ELSE (* Source operand is floating-point value *)\r\n DEST <- SRC / DEST;\r\n FI;\r\nFI;\r\n\r\nIF Instruction = FDIVRP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Set if result was rounded up; cleared otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\n\r\n\r\n\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Operand is an SNaN value or unsupported format.\r\n +-inf / +-inf; +-0 / +-0\r\n#D Source is a denormal value.\r\n#Z SRC / +-0, where SRC is not equal to +-0.\r\n#U Result is too small for destination format.\r\n#O Result is too large for destination format.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment 
limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FDIVR"
},
{
"description": "-R:FDIVR",
"mnem": "FDIVRP"
},
{
"description": "FFREE-Free Floating-Point Register\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nDD C0+i FFREE ST(i) Valid Valid Sets tag for ST(i) to empty.\r\n\r\n\r\n\r\nDescription\r\nSets the tag in the FPU tag register associated with register ST(i) to empty (11B). The contents of ST(i) and the FPU\r\nstack-top pointer (TOP) are not affected.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nTAG(i) <- 11B;\r\n\r\nFPU Flags Affected\r\nC0, C1, C2, C3 undefined.\r\n\r\nFloating-Point Exceptions\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FFREE"
},
{
"description": "-R:FADD",
"mnem": "FIADD"
},
{
"description": "FICOM/FICOMP-Compare Integer\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n DE /2 FICOM m16int Valid Valid Compare ST(0) with m16int.\r\n DA /2 FICOM m32int Valid Valid Compare ST(0) with m32int.\r\n DE /3 FICOMP m16int Valid Valid Compare ST(0) with m16int and pop stack register.\r\n DA /3 FICOMP m32int Valid Valid Compare ST(0) with m32int and pop stack register.\r\n\r\n\r\n\r\nDescription\r\nCompares the value in ST(0) with an integer source operand and sets the condition code flags C0, C2, and C3 in\r\nthe FPU status word according to the results (see table below). The integer value is converted to double extended-\r\nprecision floating-point format before the comparison is made.\r\n Table 3-26. FICOM/FICOMP Results\r\n Condition C3 C2 C0\r\n ST(0) > SRC 0 0 0\r\n ST(0) < SRC 0 0 1\r\n ST(0) = SRC 1 0 0\r\n Unordered 1 1 1\r\n\r\nThese instructions perform an \"unordered comparison.\" An unordered comparison also checks the class of the\r\nnumbers being compared (see \"FXAM-Examine Floating-Point\" in this chapter). If either operand is a NaN or is in\r\nan undefined format, the condition flags are set to \"unordered.\"\r\nThe sign of zero is ignored, so that -0.0 <- +0.0.\r\nThe FICOMP instructions pop the register stack following the comparison. 
To pop the register stack, the processor\r\nmarks the ST(0) register empty and increments the stack pointer (TOP) by 1.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nCASE (relation of operands) OF\r\n ST(0) > SRC: C3, C2, C0 <- 000;\r\n ST(0) < SRC: C3, C2, C0 <- 001;\r\n ST(0) = SRC: C3, C2, C0 <- 100;\r\n Unordered: C3, C2, C0 <- 111;\r\nESAC;\r\n\r\nIF Instruction = FICOMP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0.\r\nC0, C2, C3 See Table 3-26.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA One or both operands are NaN values or have unsupported formats.\r\n#D One or both operands are denormal values.\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is 
used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FICOM"
},
{
"description": "-R:FICOM",
"mnem": "FICOMP"
},
{
"description": "-R:FDIV",
"mnem": "FIDIV"
},
{
"description": "-R:FDIVR",
"mnem": "FIDIVR"
},
{
"description": "FILD-Load Integer\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n DF /0 FILD m16int Valid Valid Push m16int onto the FPU register stack.\r\n DB /0 FILD m32int Valid Valid Push m32int onto the FPU register stack.\r\n DF /5 FILD m64int Valid Valid Push m64int onto the FPU register stack.\r\n\r\n\r\n\r\nDescription\r\nConverts the signed-integer source operand into double extended-precision floating-point format and pushes the\r\nvalue onto the FPU register stack. The source operand can be a word, doubleword, or quadword integer. It is loaded\r\nwithout rounding errors. The sign of the source operand is preserved.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nTOP <- TOP - 1;\r\nST(0) <- ConvertToDoubleExtendedPrecisionFP(SRC);\r\n\r\nFPU Flags Affected\r\nC1 Set to 1 if stack overflow occurred; set to 0 otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack overflow occurred.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register contains a NULL segment selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand 
effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FILD"
},
{
"description": "-R:FMUL",
"mnem": "FIMUL"
},
{
"description": "FINCSTP-Increment Stack-Top Pointer\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n D9 F7 FINCSTP Valid Valid Increment the TOP field in the FPU status register.\r\n\r\n\r\n\r\nDescription\r\nAdds one to the TOP field of the FPU status word (increments the top-of-stack pointer). If the TOP field contains a\r\n7, it is set to 0. The effect of this instruction is to rotate the stack by one position. The contents of the FPU data\r\nregisters and tag register are not affected. This operation is not equivalent to popping the stack, because the tag\r\nfor the previous top-of-stack register is not marked empty.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nIF TOP = 7\r\n THEN TOP <- 0;\r\n ELSE TOP <- TOP + 1;\r\nFI;\r\n\r\nFPU Flags Affected\r\nThe C1 flag is set to 0. The C0, C2, and C3 flags are undefined.\r\n\r\nFloating-Point Exceptions\r\nNone\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FINCSTP"
},
{
"description": "FINIT/FNINIT-Initialize Floating-Point Unit\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n9B DB E3 FINIT Valid Valid Initialize FPU after checking for pending unmasked\r\n floating-point exceptions.\r\nDB E3 FNINIT* Valid Valid Initialize FPU without checking for pending unmasked\r\n floating-point exceptions.\r\nNOTES:\r\n* See IA-32 Architecture Compatibility section below.\r\n\r\n\r\n\r\nDescription\r\nSets the FPU control, status, tag, instruction pointer, and data pointer registers to their default states. The FPU\r\ncontrol word is set to 037FH (round to nearest, all exceptions masked, 64-bit precision). The status word is cleared\r\n(no exception flags set, TOP is set to 0). The data registers in the register stack are left unchanged, but they are all\r\ntagged as empty (11B). Both the instruction and data pointers are cleared.\r\nThe FINIT instruction checks for and handles any pending unmasked floating-point exceptions before performing\r\nthe initialization; the FNINIT instruction does not.\r\nThe assembler issues two instructions for the FINIT instruction (an FWAIT instruction followed by an FNINIT\r\ninstruction), and the processor executes each of these instructions in separately. If an exception is generated for\r\neither of these instructions, the save EIP points to the instruction that caused the exception.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nIA-32 Architecture Compatibility\r\nWhen operating a Pentium or Intel486 processor in MS-DOS compatibility mode, it is possible (under unusual\r\ncircumstances) for an FNINIT instruction to be interrupted prior to being executed to handle a pending FPU excep-\r\ntion. See the section titled \"No-Wait FPU Instructions Can Get FPU Interrupt in Window\" in Appendix D of the Intel\r\n64 and IA-32 Architectures Software Developer's Manual, Volume 1, for a description of these circumstances. 
An\r\nFNINIT instruction cannot be interrupted in this way on later Intel processors, except for the Intel Quark X1000\r\nprocessor.\r\nIn the Intel387 math coprocessor, the FINIT/FNINIT instruction does not clear the instruction and data pointers.\r\nThis instruction affects only the x87 FPU. It does not affect the XMM and MXCSR registers.\r\n\r\nOperation\r\nFPUControlWord <- 037FH;\r\nFPUStatusWord <- 0;\r\nFPUTagWord <- FFFFH;\r\nFPUDataPointer <- 0;\r\nFPUInstructionPointer <- 0;\r\nFPULastInstructionOpcode <- 0;\r\n\r\nFPU Flags Affected\r\nC0, C1, C2, C3 set to 0.\r\n\r\nFloating-Point Exceptions\r\nNone\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FINIT"
},
{
"description": "FIST/FISTP-Store Integer\r\nOpcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nDF /2 FIST m16int Valid Valid Store ST(0) in m16int.\r\nDB /2 FIST m32int Valid Valid Store ST(0) in m32int.\r\nDF /3 FISTP m16int Valid Valid Store ST(0) in m16int and pop register stack.\r\nDB /3 FISTP m32int Valid Valid Store ST(0) in m32int and pop register stack.\r\nDF /7 FISTP m64int Valid Valid Store ST(0) in m64int and pop register stack.\r\n\r\n\r\n\r\nDescription\r\nThe FIST instruction converts the value in the ST(0) register to a signed integer and stores the result in the desti-\r\nnation operand. Values can be stored in word or doubleword integer format. The destination operand specifies the\r\naddress where the first byte of the destination value is to be stored.\r\nThe FISTP instruction performs the same operation as the FIST instruction and then pops the register stack. To pop\r\nthe register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.\r\nThe FISTP instruction also stores values in quadword integer format.\r\nThe following table shows the results obtained when storing various classes of numbers in integer format.\r\n Table 3-27. 
FIST/FISTP Results\r\n ST(0) DEST\r\n - inf or Value Too Large for DEST Format *\r\n F <= -1 -I\r\n -1 < F < -0 **\r\n -0 0\r\n +0 0\r\n +0<F<+1 **\r\n F>=+1 +I\r\n + inf or Value Too Large for DEST Format *\r\n NaN *\r\nNOTES:\r\nF Means finite floating-point value.\r\nI Means integer.\r\n* Indicates floating-point invalid-operation (#IA) exception.\r\n** 0 or +-1, depending on the rounding mode.\r\n\r\n\r\nIf the source value is a non-integral value, it is rounded to an integer value, according to the rounding mode\r\nspecified by the RC field of the FPU control word.\r\nIf the converted value is too large for the destination format, or if the source operand is an inf, SNaN, QNaN, or is in\r\nan unsupported format, an invalid-arithmetic-operand condition is signaled. If the invalid-operation exception is\r\nnot masked, an invalid-arithmetic-operand exception (#IA) is generated and no value is stored in the destination\r\noperand. If the invalid-operation exception is masked, the integer indefinite value is stored in memory.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\n\r\n\r\n\r\n\r\nOperation\r\nDEST <- Integer(ST(0));\r\n\r\nIF Instruction = FISTP\r\n THEN\r\n PopRegisterStack;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 0 if stack underflow occurred.\r\n Indicates rounding direction if the inexact exception (#P) is generated: 0 = not roundup;\r\n 1 = roundup.\r\n Set to 0 otherwise.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow occurred.\r\n#IA Converted value is too large for the destination format.\r\n Source operand is an SNaN, QNaN, +-inf, or unsupported format.\r\n#P Value cannot be represented exactly in destination format.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register is used to access 
memory and it contains a NULL segment\r\n selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FIST"
},
{
"description": "-R:FIST",
"mnem": "FISTP"
},
{
"description": "FISTTP-Store Integer with Truncation\r\n Opcode Instruction 64-Bit Mode Compat/ Description\r\n Leg Mode\r\n DF /1 FISTTP m16int Valid Valid Store ST(0) in m16int with truncation.\r\n DB /1 FISTTP m32int Valid Valid Store ST(0) in m32int with truncation.\r\n DD /1 FISTTP m64int Valid Valid Store ST(0) in m64int with truncation.\r\n\r\n\r\n\r\nDescription\r\nFISTTP converts the value in ST into a signed integer using truncation (chop) as rounding mode, transfers the\r\nresult to the destination, and pop ST. FISTTP accepts word, short integer, and long integer destinations.\r\nThe following table shows the results obtained when storing various classes of numbers in integer format.\r\n Table 3-28. FISTTP Results\r\n ST(0) DEST\r\n- inf or Value Too Large for DEST Format *\r\nF<= -1 -I\r\n-1<F<+1 0\r\nFS+1 +I\r\n+ inf or Value Too Large for DEST Format *\r\nNaN *\r\nNOTES:\r\nF Means finite floating-point value.\r\nI Means integer.\r\n* Indicates floating-point invalid-operation (#IA) exception.\r\n\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nDEST <- ST;\r\npop ST;\r\n\r\nFlags Affected\r\nC1 is cleared; C0, C2, C3 undefined.\r\n\r\nNumeric Exceptions\r\nInvalid, Stack Invalid (stack underflow), Precision.\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If the destination is in a nonwritable segment.\r\n For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.\r\n#SS(0) For an illegal address in the SS segment.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#NM If CR0.EM[bit 2] = 1.\r\n If CR0.TS[bit 3] = 1.\r\n#UD If CPUID.01H:ECX.SSE3[bit 0] = 0.\r\n If the LOCK prefix is used.\r\n\r\n\r\n\r\nReal Address Mode Exceptions\r\nGP(0) If any part of the operand would lie outside of the effective address space from 0 to 0FFFFH.\r\n#NM If CR0.EM[bit 2] = 
1.\r\n If CR0.TS[bit 3] = 1.\r\n#UD If CPUID.01H:ECX.SSE3[bit 0] = 0.\r\n If the LOCK prefix is used.\r\n\r\nVirtual 8086 Mode Exceptions\r\n#GP(0) If any part of the operand would lie outside of the effective address space from 0 to 0FFFFH.\r\n#NM If CR0.EM[bit 2] = 1.\r\n If CR0.TS[bit 3] = 1.\r\n#UD If CPUID.01H:ECX.SSE3[bit 0] = 0.\r\n If the LOCK prefix is used.\r\n#PF(fault-code) For a page fault.\r\n#AC(0) For unaligned memory reference if the current privilege is 3.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FISTTP"
},
{
"description": "-R:FSUB",
"mnem": "FISUB"
},
{
"description": "-R:FSUBR",
"mnem": "FISUBR"
},
{
"description": "FLD-Load Floating Point Value\r\n Opcode Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\n D9 /0 FLD m32fp Valid Valid Push m32fp onto the FPU register stack.\r\n DD /0 FLD m64fp Valid Valid Push m64fp onto the FPU register stack.\r\n DB /5 FLD m80fp Valid Valid Push m80fp onto the FPU register stack.\r\n D9 C0+i FLD ST(i) Valid Valid Push ST(i) onto the FPU register stack.\r\n\r\n\r\n\r\nDescription\r\nPushes the source operand onto the FPU register stack. The source operand can be in single-precision, double-\r\nprecision, or double extended-precision floating-point format. If the source operand is in single-precision or\r\ndouble-precision floating-point format, it is automatically converted to the double extended-precision floating-\r\npoint format before being pushed on the stack.\r\nThe FLD instruction can also push the value in a selected FPU register [ST(i)] onto the stack. Here, pushing register\r\nST(0) duplicates the stack top.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nOperation\r\nIF SRC is ST(i)\r\n THEN\r\n temp <- ST(i);\r\nFI;\r\n\r\nTOP <- TOP - 1;\r\nIF SRC is memory-operand\r\n THEN\r\n ST(0) <- ConvertToDoubleExtendedPrecisionFP(SRC);\r\n ELSE (* SRC is ST(i) *)\r\n ST(0) <- temp;\r\nFI;\r\n\r\nFPU Flags Affected\r\nC1 Set to 1 if stack overflow occurred; otherwise, set to 0.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack underflow or overflow occurred.\r\n#IA Source operand is an SNaN. Does not occur if the source operand is in double extended-preci-\r\n sion floating-point format (FLD m80fp or FLD ST(i)).\r\n#D Source operand is a denormal value. 
Does not occur if the source operand is in double\r\n extended-precision floating-point format.\r\n\r\n\r\n\r\n\r\n\r\nProtected Mode Exceptions\r\n#GP(0) If destination is located in a non-writable segment.\r\n If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment\r\n selector.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\n#GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#UD If the LOCK prefix is used.\r\n\r\nVirtual-8086 Mode Exceptions\r\n#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.\r\n#SS(0) If a memory operand effective address is outside the SS segment limit.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made.\r\n#UD If the LOCK prefix is used.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\n#SS(0) If a memory address referencing the SS segment is in a non-canonical form.\r\n#GP(0) If the memory address is in a non-canonical form.\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#PF(fault-code) If a page fault occurs.\r\n#AC(0) If alignment checking is enabled and an unaligned memory reference is made while the\r\n current privilege level is 3.\r\n#UD If the LOCK prefix is 
used.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FLD"
},
{
"description": "FLD1/FLDL2T/FLDL2E/FLDPI/FLDLG2/FLDLN2/FLDZ-Load Constant\r\nOpcode* Instruction 64-Bit Compat/ Description\r\n Mode Leg Mode\r\nD9 E8 FLD1 Valid Valid Push +1.0 onto the FPU register stack.\r\nD9 E9 FLDL2T Valid Valid Push log210 onto the FPU register stack.\r\nD9 EA FLDL2E Valid Valid Push log2e onto the FPU register stack.\r\nD9 EB FLDPI Valid Valid Push pi onto the FPU register stack.\r\nD9 EC FLDLG2 Valid Valid Push log102 onto the FPU register stack.\r\nD9 ED FLDLN2 Valid Valid Push loge2 onto the FPU register stack.\r\nD9 EE FLDZ Valid Valid Push +0.0 onto the FPU register stack.\r\nNOTES:\r\n* See IA-32 Architecture Compatibility section below.\r\n\r\n\r\n\r\nDescription\r\nPush one of seven commonly used constants (in double extended-precision floating-point format) onto the FPU\r\nregister stack. The constants that can be loaded with these instructions include +1.0, +0.0, log210, log2e, pi, log102,\r\nand loge2. For each constant, an internal 66-bit constant is rounded (as specified by the RC field in the FPU control\r\nword) to double extended-precision floating-point format. 
The inexact-result exception (#P) is not generated as a\r\nresult of the rounding, nor is the C1 flag set in the x87 FPU status word if the value is rounded up.\r\nSee the section titled \"Approximation of Pi\" in Chapter 8 of the Intel 64 and IA-32 Architectures Software\r\nDeveloper's Manual, Volume 1, for a description of the pi constant.\r\nThis instruction's operation is the same in non-64-bit modes and 64-bit mode.\r\n\r\nIA-32 Architecture Compatibility\r\nWhen the RC field is set to round-to-nearest, the FPU produces the same constants that are produced by the Intel\r\n8087 and Intel 287 math coprocessors.\r\n\r\nOperation\r\nTOP <- TOP - 1;\r\nST(0) <- CONSTANT;\r\n\r\nFPU Flags Affected\r\nC1 Set to 1 if stack overflow occurred; otherwise, set to 0.\r\nC0, C2, C3 Undefined.\r\n\r\nFloating-Point Exceptions\r\n#IS Stack overflow occurred.\r\n\r\nProtected Mode Exceptions\r\n#NM CR0.EM[bit 2] or CR0.TS[bit 3] = 1.\r\n#MF If there is a pending x87 FPU exception.\r\n#UD If the LOCK prefix is used.\r\n\r\nReal-Address Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nVirtual-8086 Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\nCompatibility Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n64-Bit Mode Exceptions\r\nSame exceptions as in protected mode.\r\n\r\n\r\n\r\n\r\n",
"mnem": "FLD1"
},
{