Floating-point arithmetic

In computing, floating-point arithmetic (FP) is arithmetic using a formulaic representation of real numbers as an approximation, supporting a trade-off between range and precision.

Significand

A number is, in general, represented approximately to a fixed number of significant digits (the significand) and scaled using an exponent in some fixed base; the base for the scaling is normally two, ten, or sixteen.
The significand (also mantissa or coefficient, sometimes also argument or fraction) is part of a number in scientific notation or a floating-point number, consisting of its significant digits.
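Python's standard library exposes this decomposition directly; a base-2 sketch using `math.frexp`, which splits a float into a significand and an exponent:

```python
import math

# Decompose a float into significand and exponent, base 2:
# x = m * 2**e, with 0.5 <= |m| < 1 (Python's frexp convention).
x = 6.75
m, e = math.frexp(x)
print(m, e)                      # 0.84375 3, since 0.84375 * 2**3 == 6.75
assert math.ldexp(m, e) == x     # ldexp reassembles the value exactly
```

Note that frexp normalizes the significand into [0.5, 1), whereas IEEE 754 formats conventionally normalize it into [1, 2); the two conventions differ only by an exponent shift of one.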

FLOPS

The speed of floating-point operations, commonly measured in terms of FLOPS, is an important characteristic of a computer system, especially for applications that involve intensive mathematical calculations.
In computing, floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations.

Floating-point unit

A floating-point unit (FPU, colloquially a math coprocessor) is a part of a computer system specially designed to carry out operations on floating-point numbers.

Coprocessor

A floating-point unit (FPU, colloquially a math coprocessor) is a part of a computer system specially designed to carry out operations on floating-point numbers.
Operations performed by the coprocessor may be floating point arithmetic, graphics, signal processing, string processing, cryptography or I/O interfacing with peripheral devices.

Fixed-point arithmetic

In fixed-point systems, a position in the string is specified for the radix point.
Fixed-point number representation can be compared to the more complicated (and more computationally demanding) floating-point number representation.
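A minimal sketch of the fixed-point idea, using Python integers with an assumed scale factor of 1/100 (the names `SCALE` and `to_fixed` are illustrative):

```python
# Fixed-point: an integer with an implicit, fixed scale factor.
# Here the radix point sits two decimal digits from the right (scale 1/100),
# so values are uniformly spaced 0.01 apart across the whole range.
SCALE = 100

def to_fixed(x: float) -> int:
    return round(x * SCALE)

a, b = to_fixed(19.99), to_fixed(0.01)
total = a + b            # exact integer addition
print(total / SCALE)     # 20.0
```

Because all arithmetic is plain integer arithmetic, no rounding occurs within range; the cost is that the representable range is far narrower than a floating-point format of the same width.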

Decimal floating point

Historically, several number bases have been used for representing floating-point numbers, with base two (binary) being the most common, followed by base ten (decimal floating point), and other less common varieties, such as base sixteen (hexadecimal floating point), base eight (octal floating point), base four (quaternary floating point), base three (balanced ternary floating point), and even base 256 and base 65536.
Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers.

Computer

The speed of floating-point operations, commonly measured in terms of FLOPS, is an important characteristic of a computer system, especially for applications that involve intensive mathematical calculations.
It was quite similar to modern machines in some respects, pioneering numerous advances such as floating-point numbers.

Tapered floating point

In computing, tapered floating point (TFP) is a format similar to floating point, but with variable-sized entries for the significand and exponent instead of the fixed-length entries found in normal floating-point formats.

Rounding

For example, a 33-bit intermediate approximation may be rounded to the nearest 24-bit number, with specific tie-breaking rules applied to halfway values.
Rounding is almost unavoidable when reporting many computations – especially when dividing two numbers in integer or fixed-point arithmetic; when computing mathematical functions such as square roots, logarithms, and sines; or when using a floating-point representation with a fixed number of significant digits.
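The effect of rounding to a representation with fewer significant digits can be observed by round-tripping a double through IEEE 754 single precision, e.g. in Python:

```python
import struct

# Round a value to the nearest number representable with a 24-bit
# significand: pack as IEEE 754 single precision and unpack again.
# The pack step rounds to nearest, ties to even.
x = 0.1                                   # not exactly representable in binary
x32 = struct.unpack('<f', struct.pack('<f', x))[0]
print(x == x32)                           # False: single precision kept fewer bits
print(x32)                                # 0.10000000149011612
```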

Symmetric level-index arithmetic

The level-index (LI) representation of numbers, and its algorithms for arithmetic operations, were introduced by Charles Clenshaw and Frank Olver in 1984.

Arbitrary-precision arithmetic

Several modern programming languages have built-in support for bignums, and others have libraries available for arbitrary-precision integer and floating-point math.
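Python is one such language: its integers are arbitrary-precision by default, and the standard `decimal` module provides floating-point arithmetic with user-selectable precision:

```python
from decimal import Decimal, getcontext

# Integers are bignums: this is computed exactly, all 61 digits.
print(2**200)

# Floating-point with user-chosen precision (50 significant digits here).
getcontext().prec = 50
print(Decimal(1) / Decimal(7))   # 0.14285714285714...
```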

Dynamic range

The result of this dynamic range is that the numbers that can be represented are not uniformly spaced; the difference between two consecutive representable numbers grows with the chosen scale.
Most digital audio workstations process audio with a 32-bit floating-point representation, which affords a much higher dynamic range, so loss of dynamic range is no longer a concern in digital audio processing.
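The non-uniform spacing of representable numbers can be observed directly; for example, Python 3.9+ exposes the unit in the last place via `math.ulp`:

```python
import math

# The gap between a double and the next representable double grows
# in proportion to the magnitude of the value.
for x in (1.0, 1e6, 1e12):
    print(x, math.ulp(x))
```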

Logarithmic number system

An LNS can be considered as a floating-point number with the significand being always equal to 1 and a non-integer exponent.
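A toy sketch of the idea (the helper names `to_lns` and `from_lns` are illustrative, not a real LNS implementation): each number is stored as a sign and a real-valued base-2 logarithm, and multiplication reduces to addition of the stored logarithms.

```python
import math

# Logarithmic number system sketch: store (sign, log2(|x|)).
# The "significand" is effectively always 1; the exponent is non-integer.
def to_lns(x):
    return (x < 0, math.log2(abs(x)))

def from_lns(sign, e):
    return (-1.0 if sign else 1.0) * 2.0**e

a, b = to_lns(6.0), to_lns(7.0)
# Multiplication: XOR the signs, add the logarithms.
prod = from_lns(a[0] ^ b[0], a[1] + b[1])
print(prod)   # ≈ 42.0, up to rounding in the logarithms
```

Addition and subtraction, by contrast, require a nonlinear correction function in an LNS, which is the main implementation difficulty of the format.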

IBM 704

The mass-produced IBM 704 followed in 1954; it introduced the use of a biased exponent.
The IBM 704, introduced by IBM in 1954, is the first mass-produced computer with floating-point arithmetic hardware.

IBM System/360

Indeed, in 1964, IBM introduced hexadecimal floating-point representations in its System/360 mainframes; these same representations are still available for use in modern z/Architecture systems.
All models except the only partially compatible Model 44 and the most expensive systems implement the instruction set in microcode; the architecture features 8-bit byte addressing and binary, decimal, and hexadecimal floating-point calculations.

Z1 (computer)

In 1938, Konrad Zuse of Berlin completed the Z1, the first binary, programmable mechanical computer; it uses a 24-bit binary floating-point number representation with a 7-bit signed exponent, a 17-bit significand (including one implicit bit), and a sign bit.
The Z1 was the first freely programmable computer in the world to use Boolean logic and binary floating-point numbers; however, it was unreliable in operation.

Z4 (computer)

The first commercial computer with floating-point hardware was Zuse's Z4 computer, designed in 1942–1945.
The memory consisted of 32-bit rather than 22-bit floating-point words.

Intel 8087

This standard was significantly based on a proposal from Intel, which was designing the i8087 numerical coprocessor; Motorola, which was designing the 68000 around the same time, gave significant input as well.
The Intel 8087, announced in 1980, was the first x87 floating-point coprocessor for the 8086 line of microprocessors.

IBM hexadecimal floating point

Indeed, in 1964, IBM introduced hexadecimal floating-point representations in its System/360 mainframes; these same representations are still available for use in modern z/Architecture systems.
IBM System/360 computers, and subsequent machines based on that architecture (mainframes), support a hexadecimal floating-point format (HFP).
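The HFP single format packs a sign bit, a 7-bit excess-64 base-16 exponent, and a 24-bit fraction into one 32-bit word; a decoding sketch in Python (`decode_hfp` is an illustrative helper name):

```python
# Decode an IBM System/360 hexadecimal floating-point (HFP) single:
# value = (-1)^sign * (fraction / 2**24) * 16**(exponent - 64).
def decode_hfp(word: int) -> float:
    sign = -1.0 if word >> 31 else 1.0
    exp  = ((word >> 24) & 0x7F) - 64      # 7-bit exponent, excess-64
    frac = (word & 0xFFFFFF) / 2**24       # 24-bit fraction in [0, 1)
    return sign * frac * 16.0**exp

print(decode_hfp(0x41100000))   # 1.0, i.e. 0.0625 * 16**1
```

Because the exponent scales by 16 rather than 2, normalization only guarantees a nonzero leading hexadecimal digit, so up to three leading significand bits may be zero; effective precision therefore varies, a phenomenon known as wobbling precision.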

Exponent bias

The mass-produced IBM 704 followed in 1954; it introduced the use of a biased exponent.
In IEEE 754 floating point numbers, the exponent is biased in the engineering sense of the word – the value stored is offset from the actual value by the exponent bias.
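For IEEE 754 double precision the bias is 1023, so the stored field equals the actual exponent plus 1023; it can be extracted from the bit pattern, e.g. in Python:

```python
import struct

# Extract the 11-bit stored (biased) exponent field of a double.
def stored_exponent(x: float) -> int:
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    return (bits >> 52) & 0x7FF

print(stored_exponent(1.0))   # 1023  (actual exponent 0 + bias 1023)
print(stored_exponent(8.0))   # 1026  (actual exponent 3 + bias 1023)
```

Biasing makes the exponent field an unsigned integer that orders consistently with magnitude, which simplifies hardware comparison of floating-point values.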

IEEE 754

In 1985, the IEEE 754 Standard for Floating-Point Arithmetic was established, and since the 1990s, the most commonly encountered representations are those defined by the IEEE. Floating-point compatibility across multiple computing systems was in desperate need of standardization by the early 1980s, leading to the creation of the IEEE 754 standard once the 32-bit (or 64-bit) word had become commonplace.
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE).

Binary number

Historically, several number bases have been used for representing floating-point numbers, with base two (binary) being the most common, followed by base ten (decimal floating point), and other less common varieties, such as base sixteen (hexadecimal floating point), base eight (octal floating point), base four (quaternary floating point), base three (balanced ternary floating point), and even base 256 and base 65536.
The Z1 computer, which was designed and built by Konrad Zuse between 1935 and 1938, used Boolean logic and binary floating-point numbers.

Konrad Zuse

In 1938, Konrad Zuse of Berlin completed the Z1, the first binary, programmable mechanical computer; it uses a 24-bit binary floating-point number representation with a 7-bit signed exponent, a 17-bit significand (including one implicit bit), and a sign bit.
Working in his parents' apartment in 1936, he produced his first attempt, the Z1, a floating-point binary mechanical calculator with limited programmability, reading instructions from perforated 35 mm film.

Word (computer architecture)

Floating-point compatibility across multiple computing systems was in desperate need of standardization by the early 1980s, leading to the creation of the IEEE 754 standard once the 32-bit (or 64-bit) word had become commonplace.