|
Table of Content | Chapter Fourteen (Part 5) |
The 80387 (and later) FPU adds over 80 new instructions to the 80x86 instruction set. We can classify these instructions as data movement instructions, conversions, arithmetic instructions, comparisons, constant instructions, transcendental instructions, and miscellaneous instructions. The following sections describe each of the instructions in these categories.
14.4.4 FPU Data Movement Instructions
The data movement instructions transfer data between the internal FPU registers and memory. The instructions in this category are fld, fst, fstp, and fxch. The fld instruction always pushes its operand onto the floating point stack. The fstp instruction always pops the top of stack (tos) after storing it into its operand. The remaining instructions do not affect the number of items on the stack.
14.4.4.1 The FLD Instruction
The fld instruction loads a 32 bit, 64 bit, or 80 bit floating point value onto the stack. This instruction converts 32 and 64 bit operands to an 80 bit extended precision value before pushing the value onto the floating point stack.
The fld instruction first decrements the tos pointer (bits 11-13 of the status register) and then stores the 80 bit value in the physical register specified by the new tos pointer. If the source operand of the fld instruction is a floating point data register, ST(i), then the actual register the 80x87 uses for the load operation is the register number before decrementing the tos pointer. Therefore, fld st or fld st(0) duplicates the value on the top of the stack.
The fld instruction sets the stack fault bit if stack overflow occurs. It sets the denormalized exception bit if you load an 80 bit denormalized value. It sets the invalid operation bit if you attempt to load an empty floating point register onto the top of stack (or perform some other invalid operation).
Examples:
                fld     st(1)
                fld     mem_32
                fld     MyRealVar
                fld     mem_64[bx]
14.4.4.2 The FST and FSTP Instructions
The fst and fstp instructions copy the value on the top of the floating point register stack to another floating point register or to a 32, 64, or 80 bit memory variable (only fstp accepts an 80 bit memory operand). When copying data to a 32 or 64 bit memory variable, the 80 bit extended precision value on the top of stack is rounded to the smaller format as specified by the rounding control bits in the FPU control register.
The fstp instruction pops the value off the top of stack when moving it to the destination location. It does this by incrementing the top of stack pointer in the status register after accessing the data in st(0). If the destination operand is a floating point register, the FPU stores the value at the specified register number before popping the data off the top of the stack.
Executing an fstp st(0) instruction effectively pops the data off the top of stack with no data transfer. Examples:
                fst     mem_32
                fstp    mem_64
                fstp    mem_64[ebx*8]
                fstp    mem_80          ;80 bit stores require fstp.
                fst     st(2)
                fstp    st(1)
The last example above effectively pops st(1)
while leaving st(0) on the top of the stack.
The fst and fstp instructions will set the stack exception bit if a stack underflow occurs (attempting to store a value from an empty register stack). They will set the precision bit if there is a loss of precision during the store operation (this will occur, for example, when storing an 80 bit extended precision value into a 32 or 64 bit memory variable and there are some bits lost during conversion). They will set the underflow exception bit when the 80 bit value being stored into a 32 or 64 bit memory variable is too small to fit into the destination operand. Likewise, these instructions will set the overflow exception bit if the value on the top of stack is too big to fit into a 32 or 64 bit memory variable. The fst and fstp instructions set the denormalized flag when you try to store a denormalized value into an 80 bit register or variable[7]. They set the invalid operation flag if an invalid operation (such as storing into an empty register) occurs. Finally, these instructions set the C1 condition bit if rounding occurs during the store operation (this only occurs when storing into a 32 or 64 bit memory variable and you have to round the mantissa to fit into the destination).
14.4.4.3 The FXCH Instruction
The fxch instruction exchanges the value on the top of stack with one of the other FPU registers. This instruction takes two forms: one with a single FPU register as an operand, the second without any operands. The first form exchanges the top of stack with the specified register. The second form of fxch swaps the top of stack with st(1).
Many FPU instructions, e.g., fsqrt, operate only on the top of the register stack. If you want to perform such an operation on a value that is not on the top of stack, you can use the fxch instruction to swap that register with tos, perform the desired operation, and then use the fxch instruction again to swap the tos with the original register. The following example takes the square root of st(2):
                fxch    st(2)
                fsqrt
                fxch    st(2)
The fxch instruction sets the stack exception
bit if the stack is empty. It sets the invalid operation bit if you specify an empty
register as the operand. This instruction always clears the C1 condition code
bit.
14.4.5 Conversions
The 80x87 chip performs all arithmetic operations on 80 bit real quantities. In a sense, the fld and fst/fstp instructions are conversion instructions as well as data movement instructions because they automatically convert between the internal 80 bit real format and the 32 and 64 bit memory formats. Nonetheless, we'll simply classify them as data movement operations, rather than conversions, because they are moving real values to and from memory. The 80x87 FPU provides five instructions which convert to or from integer or binary coded decimal (BCD) format when moving data. These instructions are fild, fist, fistp, fbld, and fbstp.
14.4.5.1 The FILD Instruction
The fild (integer load) instruction converts a 16, 32, or 64 bit two's complement integer to the 80 bit extended precision format and pushes the result onto the stack. This instruction always expects a single operand. This operand must be the address of a word, double word, or quad word integer variable. Although the instruction format for fild uses the familiar mod/rm fields, the operand must be a memory variable, even for 16 and 32 bit integers. You cannot specify one of the 80386's 16 or 32 bit general purpose registers. If you want to push an 80x86 general purpose register onto the FPU stack, you must first store it into a memory variable and then use fild to push the value of that memory variable.
The fild instruction sets the stack exception
bit and C1 (accordingly) if stack overflow occurs while pushing the converted
value. Examples:
                fild    mem_16
                fild    mem_32[ecx*4]
                fild    mem_64[ebx+ecx*8]
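Because fild accepts only memory operands, pushing the contents of a general purpose register takes two instructions. The following is a minimal sketch of that idea; TempInt is a hypothetical scratch variable introduced only for this illustration:

```asm
; Push the 16 bit signed integer in AX onto the FPU stack.
; TempInt is a hypothetical word variable, not defined elsewhere
; in this text.

                mov     TempInt, ax     ;Copy AX to memory first.
                fild    TempInt         ;Convert to real and push.
                 .
                 .
                 .
TempInt         word    ?
```

The same pattern should work for a 32 bit register if you declare the scratch variable as a double word.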
14.4.5.2 The FIST and FISTP Instructions
The fist and fistp instructions convert the 80 bit extended precision value on the top of stack to a 16, 32, or 64 bit integer and store the result away into the memory variable specified by the single operand (only fistp accepts a 64 bit memory operand). These instructions convert the value on tos to an integer according to the rounding setting in the FPU control register (bits 10 and 11). As with the fild instruction, the fist and fistp instructions will not let you specify one of the 80x86's general purpose 16 or 32 bit registers as the destination operand.
The fist instruction converts the value on the
top of stack to an integer and then stores the result; it does not otherwise affect the
floating point register stack. The fistp instruction pops the value off the
floating point register stack after storing the converted value.
These instructions set the stack exception bit if the floating point register stack is empty (this will also clear C1). They set the precision (imprecise operation) and C1 bits if rounding occurs (that is, if there is any fractional component to the value in st(0)). These instructions set the underflow exception bit if the result is too small (i.e., less than one but greater than zero, or less than zero but greater than -1). Examples:
                fist    mem_16[bx]
                fistp   mem_64          ;64 bit stores require fistp.
                fistp   mem_32
Don't forget that these instructions use the rounding control settings to determine how they will convert the floating point data to an integer during the store operation. By default, the rounding control is usually set to "round to nearest" mode; yet most programmers expect fist/fistp to truncate the fractional portion during conversion. If you want fist/fistp to truncate floating point values when converting them to an integer, you will need to set the rounding control bits appropriately in the floating point control register.
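One way to get truncation is to save the control word, set both rounding control bits (bits 10 and 11, which selects truncation toward zero), do the store, and then restore the original setting. The following is only a sketch under those assumptions; OldCW, TruncCW, and IntResult are hypothetical variables introduced for this illustration. On an 8087 you would probably also need an fwait after fstcw before reading OldCW.

```asm
; Store st(0) into IntResult, truncating toward zero.
; OldCW, TruncCW, and IntResult are hypothetical variables.

                fstcw   OldCW           ;Save current control word.
                mov     ax, OldCW
                or      ax, 0C00h       ;Set bits 10 and 11 (truncate).
                mov     TruncCW, ax
                fldcw   TruncCW         ;Switch to truncation.
                fistp   IntResult       ;Convert, store, and pop.
                fldcw   OldCW           ;Restore original rounding mode.
                 .
                 .
                 .
OldCW           word    ?
TruncCW         word    ?
IntResult       dword   ?
```

Restoring the control word afterwards keeps the rest of your floating point code running in the rounding mode it expects.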
14.4.5.3 The FBLD and FBSTP Instructions
The fbld and fbstp instructions load and store 80 bit BCD values. The fbld instruction converts a BCD value to its 80 bit extended precision equivalent and pushes the result onto the stack. The fbstp instruction pops the extended precision real value on tos, converts it to an 80 bit BCD value (rounding according to the bits in the floating point control register), and stores the converted result at the address specified by the destination memory operand. Note that there is no fbst instruction which stores the value on tos without popping it.
The fbld instruction sets the stack exception
bit and C1 if stack overflow occurs. It sets the invalid operation bit if you
attempt to load an invalid BCD value. The fbstp instruction sets the stack
exception bit and clears C1 if stack underflow occurs (the stack is empty).
It sets the underflow flag under the same conditions as fist and fistp.
Examples:
; Assuming fewer than eight items on the stack, the following
; code sequence is equivalent to an fbst instruction:

                fld     st(0)           ;Duplicate value on TOS.
                fbstp   mem_80

; The following example easily converts an 80 bit BCD value to
; a 64 bit integer:

                fbld    bcd_80          ;Get BCD value to convert.
                fistp   mem_64          ;Store as an integer (64 bit
                                        ; stores require fistp).
14.4.6 Arithmetic Instructions
The arithmetic instructions make up a small but important subset of the 80x87's instruction set. These instructions fall into two general categories - those which operate on real values and those which operate on a real and an integer value.
14.4.6.1 The FADD and FADDP Instructions
These two instructions take the following forms:
                fadd
                faddp
                fadd    st(i), st(0)
                fadd    st(0), st(i)
                faddp   st(i), st(0)
                fadd    mem
The first two forms are equivalent. They pop the two values on the top of stack, add them, and push their sum back onto the stack.
The next two forms of the fadd instruction, those with two FPU register operands, behave like the 80x86's add instruction. They add the value in the second register operand to the value in the first register operand. Note that one of the register operands must be st(0)[8].
The faddp instruction with two operands adds st(0)
(which must always be the second operand) to the destination (first) operand and then pops
st(0). The destination operand must be one of the other FPU registers.
The last form above, fadd with a memory operand, adds a 32 or 64 bit floating point variable to the value in st(0). This instruction will convert the 32 or 64 bit operands to an 80 bit extended precision value before performing the addition. Note that this instruction does not allow an 80 bit memory operand.
These instructions can raise the stack, precision, underflow, overflow, denormalized, and illegal operation exceptions, as appropriate. If a stack fault exception occurs, C1 denotes stack overflow or underflow.
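As a brief illustration of the memory and register forms described above, the following sketch computes a sum two different ways. The variables x, y, and z are hypothetical 32 or 64 bit real variables assumed to be declared elsewhere:

```asm
; Compute z := x + y using the memory form of fadd.
; x, y, and z are hypothetical real variables.

                fld     x               ;st(0) := x.
                fadd    y               ;st(0) := st(0) + y.
                fstp    z               ;Store sum and pop.

; The same computation using the two operand faddp form:

                fld     x               ;st(0) = x.
                fld     y               ;st(0) = y, st(1) = x.
                faddp   st(1), st(0)    ;st(1) := x + y, then pop.
                fstp    z               ;Sum is now on tos.
```

The memory form is usually the shorter choice when one of the operands is not already on the stack.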
14.4.6.2 The FSUB, FSUBP, FSUBR, and FSUBRP Instructions
These four instructions take the following forms:
                fsub
                fsubp
                fsubr
                fsubrp
                fsub    st(i), st(0)
                fsub    st(0), st(i)
                fsubp   st(i), st(0)
                fsub    mem
                fsubr   st(i), st(0)
                fsubr   st(0), st(i)
                fsubrp  st(i), st(0)
                fsubr   mem
With no operands, the fsub and fsubp instructions operate identically; both behave like fsubp st(1), st(0). They pop st(0) and st(1) from the register stack, compute st(1)-st(0), and then push the difference back onto the stack. The fsubr and fsubrp instructions (reverse subtraction) operate in an almost identical fashion except they compute st(0)-st(1) and push that difference.
With two register operands (destination, source), the fsub instruction computes destination := destination - source. One of the two registers must be st(0). With two registers as operands, the fsubp instruction also computes destination := destination - source and then pops st(0) off the stack after computing the difference. For the fsubp instruction, the source operand must be st(0).
With two register operands, the fsubr and fsubrp instructions work in a similar fashion to fsub and fsubp, except they compute destination := source - destination.
The fsub mem and fsubr mem
instructions accept a 32 or 64 bit memory operand. They convert the memory operand to an
80 bit extended precision value and subtract this from st(0) (fsub)
or subtract st(0) from this value (fsubr) and store the result
back into st(0).
These instructions can raise the stack, precision, underflow, overflow, denormalized, and illegal operation exceptions, as appropriate. If a stack fault exception occurs, C1 denotes stack overflow or underflow.
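The difference between the forward and reverse memory forms is easy to get backwards, so a short sketch may help. The variables x, y, Diff1, and Diff2 are hypothetical real variables introduced only for this illustration:

```asm
; Diff1 := x - y using fsub; Diff2 := y - x using fsubr.
; x, y, Diff1, and Diff2 are hypothetical real variables.

                fld     x               ;st(0) := x.
                fsub    y               ;st(0) := st(0) - y = x - y.
                fstp    Diff1

                fld     x               ;st(0) := x.
                fsubr   y               ;st(0) := y - st(0) = y - x.
                fstp    Diff2
```

The reverse form saves an fxch or an extra load when the value already on the stack happens to be the subtrahend.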
14.4.6.3 The FMUL and FMULP Instructions
The fmul and fmulp instructions
multiply two floating point values. These instructions allow the following forms:
                fmul
                fmulp
                fmul    st(0), st(i)
                fmul    st(i), st(0)
                fmul    mem
                fmulp   st(i), st(0)
With no operands, fmul and fmulp both do the same thing - they pop st(0) and st(1), multiply these values, and push their product back onto the stack. The fmul instructions with two register operands compute destination := destination * source. One of the registers (source or destination) must be st(0).
The fmulp st(i), st(0) instruction computes st(i) := st(i) * st(0) and then pops st(0). This instruction uses the value for i before popping st(0). The fmul mem instruction requires a 32 or 64 bit memory operand. It converts the specified memory variable to an 80 bit extended precision value and then multiplies st(0) by this value.
These instructions can raise the stack, precision, underflow, overflow, denormalized, and illegal operation exceptions, as appropriate. If rounding occurs during the computation, these instructions set the C1 condition code bit. If a stack fault exception occurs, C1 denotes stack overflow or underflow.
14.4.6.4 The FDIV, FDIVP, FDIVR, and FDIVRP Instructions
These four instructions allow the following forms:
                fdiv
                fdivp
                fdivr
                fdivrp
                fdiv    st(0), st(i)
                fdiv    st(i), st(0)
                fdivp   st(i), st(0)
                fdivr   st(0), st(i)
                fdivr   st(i), st(0)
                fdivrp  st(i), st(0)
                fdiv    mem
                fdivr   mem
With zero operands, the fdiv and fdivp instructions behave like fdivp st(1), st(0): they pop st(0) and st(1), compute st(1)/st(0), and push the result back onto the stack. The fdivr and fdivrp instructions also pop st(0) and st(1) but compute st(0)/st(1) before pushing the quotient onto the stack.
With two register operands these instructions compute the following quotients:
                fdiv    st(0), st(i)    ;st(0) := st(0)/st(i)
                fdiv    st(i), st(0)    ;st(i) := st(i)/st(0)
                fdivp   st(i), st(0)    ;st(i) := st(i)/st(0)
                fdivr   st(0), st(i)    ;st(0) := st(i)/st(0)
                fdivr   st(i), st(0)    ;st(i) := st(0)/st(i)
                fdivrp  st(i), st(0)    ;st(i) := st(0)/st(i)
The fdivp and fdivrp instructions also pop st(0) after performing the division operation. The value for i in these two instructions is computed before popping st(0).
These instructions can raise the stack, precision, underflow, overflow, denormalized, zero divide, and illegal operation exceptions, as appropriate. If rounding occurs during the computation, these instructions set the C1 condition code bit. If a stack fault exception occurs, C1 denotes stack overflow or underflow.
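Writing the operands out explicitly makes it easier to see which value ends up as the dividend. The following sketch divides a by b using the two operand fdivp form; a, b, and Quot are hypothetical real variables introduced only for this illustration:

```asm
; Compute Quot := a / b.
; a, b, and Quot are hypothetical real variables.

                fld     a               ;st(0) = a.
                fld     b               ;st(0) = b, st(1) = a.
                fdivp   st(1), st(0)    ;st(1) := a/b, then pop.
                fstp    Quot            ;Quotient is now on tos.
```

If the operands had been loaded in the opposite order, fdivrp st(1), st(0) would produce the same quotient without an fxch.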
14.4.6.5 The FSQRT Instruction
The fsqrt instruction does not allow any operands. It computes the square root of the value on tos and replaces st(0) with this result. The value on tos must be zero or positive; otherwise, fsqrt will generate an invalid operation exception.
This instruction can raise the stack, precision, denormalized, and invalid operation exceptions, as appropriate. If rounding occurs during the computation, fsqrt sets the C1 condition code bit. If a stack fault exception occurs, C1 denotes stack overflow or underflow.
Example:
; Compute Z := sqrt(x**2 + y**2);

                fld     x               ;Load X.
                fld     st(0)           ;Duplicate X on TOS.
                fmul                    ;Compute X**2.
                fld     y               ;Load Y.
                fld     st(0)           ;Duplicate Y on TOS.
                fmul                    ;Compute Y**2.
                fadd                    ;Compute X**2 + Y**2.
                fsqrt                   ;Compute sqrt(x**2 + y**2).
                fst     Z               ;Store away result in Z.
14.4.6.6 The FSCALE Instruction
The fscale instruction multiplies st(0) by 2**st(1) and leaves the result in st(0). It does not pop either operand; st(1) is left unchanged. If the value in st(1) is not an integer, fscale truncates it towards zero before performing the operation.
This instruction raises the stack exception if there are not two items currently on the stack (this will also clear C1 since stack underflow occurs). It raises the precision exception if there is a loss of precision due to this operation (this occurs when st(1) contains a large, negative, value). Likewise, this instruction sets the underflow or overflow exception bits if you multiply st(0) by a very large positive or negative power of two. If the result of the multiplication is very small, fscale could set the denormalized bit. Also, this instruction could set the invalid operation bit if you attempt to fscale illegal values. Fscale sets C1 if rounding occurs in an otherwise correct computation. Example:
                fild    Sixteen         ;Push sixteen onto the stack.
                fld     x               ;Compute x * (2**16).
                fscale
                 .
                 .
                 .
Sixteen         word    16
14.4.6.7 The FPREM and FPREM1 Instructions
The fprem and fprem1 instructions compute a partial remainder. Intel designed the fprem instruction before the IEEE finalized their floating point standard. In the final draft of the IEEE floating point standard, the definition of fprem was a little different than Intel's original design. Unfortunately, Intel needed to maintain compatibility with the existing software that used the fprem instruction, so they designed a new version to handle the IEEE partial remainder operation, fprem1. You should always use fprem1 in new software you write; therefore, we will only discuss fprem1 here, although you use fprem in an identical fashion.
Fprem1 computes the partial remainder of st(0)/st(1). If the difference between the exponents of st(0) and st(1) is less than 64, fprem1 can compute the exact remainder in one operation. Otherwise you will have to execute the fprem1 instruction two or more times to get the correct remainder value. The C2 condition code bit determines when the computation is complete. Note that fprem1 does not pop the two operands off the stack; it leaves the partial remainder in st(0) and the original divisor in st(1) in case you need to compute another partial remainder to complete the result.
The fprem1 instruction sets the stack exception flag if there aren't two values on the top of stack. It sets the underflow and denormal exception bits if the result is too small. It sets the invalid operation bit if the values on tos are inappropriate for this operation. It sets the C2 condition code bit if the partial remainder operation is not complete. Finally, it loads C1, C3, and C0 with bits zero, one, and two of the quotient, respectively.
Example:
; Compute Z := X mod Y

                fld     y
                fld     x
PartialLp:      fprem1
                fstsw   ax              ;Get condition bits in AX.
                test    ah, 100b        ;See if C2 is set.
                jnz     PartialLp       ;Repeat if not done yet.
                fstp    Z               ;Store remainder away.
                fstp    st(0)           ;Pop old y value.
14.4.6.8 The FRNDINT Instruction
The frndint instruction rounds the value on tos to the nearest integer using the rounding algorithm specified in the control register.
This instruction sets the stack exception flag if there is no value on the tos (it will also clear C1 in this case). It sets the precision and denormal exception bits if there was a loss of precision. It sets the invalid operation flag if the value on the tos is not a valid number.
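A minimal usage sketch follows; x and y are hypothetical real variables introduced only for this illustration. The result stays in floating point format, so you store it with fst or fstp rather than fist or fistp:

```asm
; y := x rounded to an integer value, using whatever rounding
; mode the control register currently selects.
; x and y are hypothetical real variables.

                fld     x               ;st(0) := x.
                frndint                 ;st(0) := round(st(0)).
                fstp    y               ;Store as a real and pop.
```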
14.4.6.9 The FXTRACT Instruction
The fxtract instruction is the complement to the fscale instruction. It pops the value off the top of the stack and pushes a value which is the integer equivalent of the exponent (in 80 bit real form), and then pushes the mantissa with an exponent of zero (3fffh in biased form).
This instruction raises the stack exception if there is a stack underflow when popping the original value or a stack overflow when pushing the two results (C1 determines whether stack overflow or underflow occurs). If the original top of stack was zero, fxtract sets the zero division exception flag. The denormalized flag is set if the result warrants it; and the invalid operation flag is set if there are illegal input values when you execute fxtract.
Example:
; The following example extracts the binary exponent of X and
; stores this into the 16 bit integer variable Xponent.

                fld     x
                fxtract
                fstp    st(0)
                fistp   Xponent
14.4.6.10 The FABS Instruction
Fabs computes the absolute value of st(0) by clearing the sign bit of st(0). It sets the stack exception bit and invalid operation bits if the stack is empty.
Example:
; Compute X := sqrt(abs(x));

                fld     x
                fabs
                fsqrt
                fstp    x
14.4.6.11 The FCHS Instruction
Fchs changes the sign of st(0)'s value by inverting its sign bit. It sets the stack exception bit and invalid operation bits if the stack is empty. Example:
; Compute X := -X if X is positive, X := X if X is negative.

                fld     x
                fabs
                fchs
                fstp    x
[7] Storing a denormalized value into a 32 or 64 bit memory variable will always set the underflow exception bit.
[8] Because you will use st(0) quite a bit when programming the 80x87, MASM allows you to use the abbreviation st for st(0). However, this text will explicitly state st(0) so there will be no confusion.
Chapter Fourteen: Floating Point
Arithmetics (Part 4)
28 SEP 1996