A detailed explanation of the Python 3 rounding problem

“Floating point computation is by nature inexact, and programmers can easily misuse it so that the computed answers consist almost entirely of ‘noise.’”

– Donald Knuth (The Art of Computer Programming, Volume 2: Seminumerical Algorithms)

One. Rounding confusion

In Python 2, the round function uses the "round to nearest, ties away from 0" strategy (ROUND_HALF_UP), which is the familiar schoolbook rounding.

(Values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done away from 0.)

In Python 3, the round function instead uses the "round to nearest, ties to even" strategy (ROUND_HALF_EVEN).

(Values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice.)
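The difference between the two strategies is easiest to see on ties that binary can store exactly. The following is a minimal sketch, assuming Python 3; the decimal module is used only to emulate the half-up rule, and the names `ties`, `half_even` and `half_up` are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# 0.5, 1.5, 2.5 and 3.5 are exactly representable in binary,
# so only the tie-breaking rule decides the result.
ties = [0.5, 1.5, 2.5, 3.5]

# Python 3 built-in round: ties go to the even neighbor.
half_even = [round(x) for x in ties]

# Emulated half-up rule: ties go away from 0.
half_up = [int(Decimal(x).quantize(Decimal('1'), rounding=ROUND_HALF_UP))
           for x in ties]

print(half_even)  # [0, 2, 2, 4]
print(half_up)    # [1, 2, 3, 4]
```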

In addition, the float type is stored in double-precision binary (see the IEEE 754 standard). round works on the stored binary value, which affects the result, and round itself does not follow the schoolbook rounding rule. Together, these two effects cause some confusion.

Rounding results that are hard to understand:

 >>> round(1.205, 2)
1.21
......
>>> round(1.245, 2)
1.25
>>> round(1.255, 2)
1.25
>>> round(1.265, 2)
1.26
......
>>> round(1.295, 2)
1.29

NumPy's round has the same problem:

 >>> numpy.round(1.255, 2)
1.25
>>> numpy.round(1.265, 2)
1.26

Most floating-point numbers show either "trailing noise digits" or a "999..." tail:

 >>> format(1.215, '.52f')
'1.2150000000000000799360577730112709105014801025390625'
>>> round(1.215, 2)
1.22
>>> format(1.275, '.52f')
'1.2749999999999999111821580299874767661094665527343750'
>>> round(1.275, 2)
1.27

But that does not seem to be the whole story:

 >>> format(1.125, '.52f')
'1.1250000000000000000000000000000000000000000000000000'
>>> round(1.125, 2)
1.12
>>> format(0.5, '.52f')
'0.5000000000000000000000000000000000000000000000000000'
>>> round(0.5, 0)
0.0

When you need precise control over computed results, round() can be a headache.

This ought to be a simple matter. Why is it so hard to understand? Is something wrong with Python's round strategy?

As a professional language platform, Python has obviously not overlooked this issue.

In fact, floating-point numbers are not as simple as we think.

Two. Problem analysis

On the precision of floating-point numbers, as the computing master Knuth said, the problem is more complicated than we imagine. The results are often frustrating, to the point that we stop trusting the computer. Mastering the rules takes patience and careful study. We hold many illusions about floating-point arithmetic: we tend to confuse the computer's finite computation with computation in algebra, and so many results puzzle us. A serious programmer who wants to know how accurate a floating-point result is needs to know how the computer represents and processes floating-point numbers. For a more thorough answer, study the IEEE 754 standard in detail, or Knuth's The Art of Computer Programming.

To understand the rounding of Python's float, we need to understand two things: the binary representation and the decimal rounding strategy.

$\underline{\text{The design of the float type and round}}$

float uses the IEEE 754 format and stores floating-point numbers in binary encoding, much as other programming languages do. The float type is a double-precision floating-point number.
round uses "round to nearest, ties to even". This strategy follows the usual error-minimizing principle of large-scale computation and is not the schoolbook rounding rule.
For high-precision arithmetic and exact representation of decimal fractions, Python provides the dedicated decimal module, which offers selectable rounding strategies, including schoolbook rounding (ROUND_HALF_UP).

$\underline{\text{Binary representation of the float type}}$

float describes floating-point numbers in binary encoding. Most finite decimal fractions cannot be expressed exactly in binary. In other words, a decimal fraction with finitely many digits often becomes a binary fraction with infinitely many digits. In fact, a fraction whose denominator (in lowest terms) contains a prime factor other than 2 cannot be represented as a finite binary fraction. The denominator of a decimal fraction contains the prime factor 5; if a factor of 5 is still left in the denominator after reduction, the number becomes an infinite binary fraction.
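The prime-factor criterion above can be checked directly. This is a minimal sketch using the standard fractions module; the helper name `is_finite_binary_fraction` is hypothetical, and it only tests whether the binary expansion is finite, not whether it also fits into a float's limited number of bits.

```python
from fractions import Fraction

def is_finite_binary_fraction(literal):
    """Return True if the decimal literal has a finite binary expansion."""
    # Fraction reduces to lowest terms, e.g. '1.275' becomes 51/40.
    denominator = Fraction(literal).denominator
    # A positive integer is a power of two iff it has a single set bit.
    return (denominator & (denominator - 1)) == 0

for literal in ('0.5', '0.125', '1.125', '0.1', '1.215', '1.275'):
    print(literal, is_finite_binary_fraction(literal))
```

Only 0.5, 0.125 and 1.125 pass the test; the other literals keep a factor of 5 in the reduced denominator.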
For a finite decimal fraction that cannot be represented as a finite binary fraction, the system stores an approximation. The approximation is obtained either by rounding up or by rounding down (truncating). For values between 1 and 2 the error is at most about $1.1 \times 10^{-16}$, typically on the order of ${10}^{-17}$. An approximation that was rounded up is larger than the original value, and one that was rounded down is smaller. So for values with few decimal places (for example, 1.215 is rounded up and 1.275 is rounded down), rounding up produces extra trailing digits (the "trailing noise"), while rounding down produces the "999..." pattern.
Once the value is stored as a binary approximation, Python's round works on the approximation, not on the original value.
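Whether a given literal was stored slightly too large or slightly too small can be inspected with the decimal module, because Decimal(float) converts the stored binary value exactly. This is a minimal sketch; the helper name `storage_error` is hypothetical.

```python
from decimal import Decimal

def storage_error(literal):
    """Stored float value minus the decimal literal (sign shows the direction)."""
    # Decimal(float) is exact; Decimal(str) is the value as written.
    return Decimal(float(literal)) - Decimal(literal)

for literal in ('1.215', '1.275', '1.125'):
    print(literal, storage_error(literal))
```

A positive difference means the stored value was rounded up (1.215), a negative one means it was rounded down (1.275), and zero means the literal is stored exactly (1.125).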

【Example 1】 The decimal number 0.1 cannot be expressed with a finite number of binary digits. Different literals that share the same stored approximation compare equal.

From decimal to binary: $(0.1)_{10} = (0.0001100110011001100110011001100110011001100110011001\ldots)_{2}$

Internally, the original value is converted to an approximation:

 >>> a = 1.2355000000000000426325641456060111522674560546875
>>> b = 1.2355
>>> a == b
True

【Example 2】 After conversion to binary, the original value no longer exists, but what the system displays can be misleading.

When we give Python a value, it follows the floating-point standard and converts the value to an approximation stored in binary. The value actually held by the system is therefore not our original value, yet the system treats the two as the same and uses the approximation in every computation. What is easy to misread is that, for brevity, Python displays the shortest decimal string that converts back to the same stored value, which is often the original literal.

 >>> x = 0.1
>>> print(x)
0.1
>>> format(x, '.20f')
'0.10000000000000000555'
>>> format(x, '.50f')
'0.10000000000000000555111512312578270211815834045410'
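The gap between what is displayed and what is stored can be made explicit. This is a small sketch: repr gives the short display form, and Decimal(x) gives the exact stored value (the variable names are illustrative).

```python
from decimal import Decimal

x = 0.1
shortest = repr(x)   # short display form that still round-trips
exact = Decimal(x)   # exact value of the stored binary approximation

print(shortest)                    # 0.1
print(exact)
print(float(shortest) == x)        # True: the display form maps back to x
print(Decimal(shortest) == exact)  # False: but it is not the stored value
```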

$\underline{\text{Rounding analysis of the built-in function round}}$

round(number, ndigits)

【values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice (so, for example, both round(0.5) and round(-0.5) are 0, and round(1.5) is 2). 】

For a given floating-point number $v$, we may as well consider only a fraction less than 1, that is, the significand of the floating-point number, ignoring the sign and the exponent.

Its decimal representation is:

$v_{10} = d_{1} \cdots d_{m} = d_{1} \cdot 10^{-1}+\cdots+d_{m} \cdot 10^{-m}$

Python handles it as a float and stores it as the closest binary value:

$v_{2} = b_{1} \cdots b_{n} = b_{1} \cdot 2^{-1}+\cdots+b_{n} \cdot 2^{-n}$

When this binary approximation $v_{2}$ is converted back to base 10, it generally does not equal the original value, unless the original value can be expressed as a finite sum of negative powers of 2.

For example: $0.125 = 0 \cdot \frac{1}{2} + 0 \cdot \frac{1}{4} + 1 \cdot \frac{1}{8} = (0.001)_{2}$

$\underline{\text{When rounding to decimal places, round actually works on the decimal value of } v_{2}}$

When round is used to round the decimal number $v_{10}$ at the k-th digit, it actually processes the decimal value of $v_{2}$. Python 3's current rule is:

(1) If the (k+1)-th digit of $v_{2}$ is 0-4 or 6-9, the rounding is unambiguous: digits 0-4 round down and digits 6-9 round up.

(2) If the (k+1)-th digit of $v_{2}$ is 5, compare the distances from $v_{2}$ to $d_{1} \ldots d_{k}$ and to $d_{1} \ldots (d_{k}+1)$:

-【1】 If the two distances are not equal, the result is the nearer value.
-【2】 If the two distances are equal, the ties-to-even rule applies: if $d_{k}$ is even, take $d_{1} \ldots d_{k}$, otherwise take $d_{1} \ldots (d_{k}+1)$.
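The rule above can be reproduced with the decimal module: convert the stored value exactly, then round it half-even at the requested position. This is a sketch for illustration only, not how CPython implements round; the helper name `round_via_exact_value` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_via_exact_value(x, ndigits):
    """Round the exact stored value of x, ties to even."""
    exact = Decimal(x)                     # exact decimal value of v_2
    quantum = Decimal(1).scaleb(-ndigits)  # e.g. 0.01 for ndigits=2
    return float(exact.quantize(quantum, rounding=ROUND_HALF_EVEN))

for x in (1.215, 1.275, 1.125):
    print(x, round_via_exact_value(x, 2), round(x, 2))
```

For these inputs the sketch agrees with the built-in round: 1.22, 1.27 and 1.12.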

Consider rounding the following two floating-point numbers to two decimal places:

Binary approximation of 1.275: $v_{2}(1.275)$ = 1.274999999999999911182158029987476766109466552734375. Comparing which of 1.27 and 1.28 is closer to $v_{2}(1.275)$, 1.27 is clearly closer, so round(1.275, 2) = 1.27.
Binary approximation of 1.215: $v_{2}(1.215)$ = 1.2150000000000000799360577730112709105014801025390625. Because $v_{2}(1.215)$ is closer to 1.22 than to 1.21, round(1.215, 2) = 1.22.

When both sides are equally close (the original value is a finite sum of powers of 2), the result goes to the even choice. For example, 1.125 = $1 + 0 \cdot \frac{1}{2} + 0 \cdot \frac{1}{4} + 1 \cdot \frac{1}{8}$ = $1 + 1 \cdot 2^{-3}$, and the result is:

>>> round(1.125, 2)
1.12
>>> round(1.5, 0)
2.0
>>> round(0.5, 0)
0.0

It seems that round alone cannot deliver schoolbook rounding, because that is not its design goal, and simple adjustments cannot meet that requirement.

Rounding ties to even benefits the accuracy of data analysis; it is an error-minimizing strategy.

Also, from the user's point of view, round is affected by the binary representation. Within a certain precision range, the outcome depends only on the rounding rule. But the value supplied by the user is first converted to its double-precision approximation, and round's rule is applied to that approximation, so the precision range has to be taken into account. For float, this range is the 52-bit binary storage precision of the significand (53 bits with the implicit leading bit), which corresponds to about 17 significant decimal digits. The significant digits include the digits of the integer part.
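The precision limits mentioned above can be read from the interpreter. This is a small sketch using sys.float_info; the last line reuses the value 1.2355 from Example 1 to show two literals that differ only at the 18th significant digit.

```python
import sys

print(sys.float_info.mant_dig)               # 53 significand bits (52 stored)
print(sys.float_info.dig)                    # 15 decimal digits always preserved
print(sys.float_info.epsilon == 2.0 ** -52)  # True: spacing of floats just above 1.0

# Literals that differ only beyond the float precision share one stored value.
same_value = (1.2355 == 1.23550000000000004)
print(same_value)                            # True
```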

Three. Solutions

String conversion method (negative ndigits is not supported; other issues remain to be tested ...):

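def round45s(v, d=0):
    # Split str(v) into the integer and fraction digit strings
    vl = str(v).split('.')
    sign = -1 if v < 0 else 1
    if len(vl) == 2:
        if len(vl[1]) > d:
            if vl[1][d] >= '5':
                # First dropped digit is 5 or more: round away from 0
                if d > 0:
                    # int() also parses digit strings with leading zeros such as '05'
                    vp = (int(vl[1][:d]) + 1) / 10**d * sign
                    return int(v) + vp
                else:
                    return int(v) + sign
            else:
                # First dropped digit is below 5: truncate
                if d > 0:
                    return int(v) + int(vl[1][:d]) / 10**d * sign
                else:
                    return int(v)
        else:
            # The value already has d or fewer decimal places
            return v
    return int(v)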

An improved method:

This method nudges the original value to a slightly larger magnitude, so that the finite decimal digits of the input do not change (they are followed by zeros up to the 15th digit), which avoids the "999..." effect. Because of the double-precision binary approximation, it only works within 15 significant decimal digits (digits < 15). Note that the digits of the integer part also count as significant digits.

def round45r(number, digits=0):
    # Number of digits in the integer part
    int_len = len(str(int(abs(number))))
    signal_ = 1 if number >= 0 else -1
    # Decimal place at which the small offset is added
    err_place = 16 - int_len - 1
    if err_place > 0:
        err_ = 10**-err_place
        # Push the value slightly away from 0, then use the built-in round
        return round(number + err_ * signal_, digits)
    else:
        raise NotImplementedError  # limited by the precision of float

round45r() also works for negative numbers and for rounding to integer positions, that is, both v and d may be negative:

>>> round45r(-1.205, 2)
-1.21 # The offset is added at the 15th significant digit, so the earlier digits do not change
>>> round45r(123.205, -2)
100.0
>>> round45r(153.205, -2)
200.0
>>> round45r(-153.205, -2)
-200.0

If the running time is acceptable, prefer the high-precision decimal module (both precision and rounding strategy can be controlled):

from decimal import Decimal, Context, ROUND_HALF_UP

def round45d(v, d):
    # Context.prec counts significant digits, which include the integer part
    if int(v) > 0:
        d = d + str(v).find('.')
    return float(Context(prec=d, rounding=ROUND_HALF_UP).create_decimal(str(v)))

>>> round45d(1.205, 2)
1.21
>>> round45d(1.255, 2)
1.26
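As an alternative to counting significant digits, decimal's quantize can round at a fixed decimal position, which also covers negative values and negative digit counts. This is a hedged sketch, not the method used above; the name `round_half_up` is hypothetical, and it relies on str(value) returning the short display form of the float.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, ndigits=0):
    """Schoolbook rounding of the displayed decimal form of value."""
    quantum = Decimal(1).scaleb(-ndigits)  # 0.01 for ndigits=2, 1E+2 for ndigits=-2
    as_written = Decimal(str(value))       # e.g. Decimal('1.275'), not the binary value
    return float(as_written.quantize(quantum, rounding=ROUND_HALF_UP))

print(round_half_up(1.275, 2))     # 1.28
print(round_half_up(-1.205, 2))    # -1.21
print(round_half_up(153.205, -2))  # 200.0
```

The result is converted back to float, so it is again stored as the nearest binary approximation.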

Efficiency test results ：

>>> test_round45s(number=1000000)
6.26826286315918e-06
>>> test_round45r(number=1000000)
1.287583589553833e-06
>>> test_round45d(number=1000000)
1.7323946952819824e-06

round45s is about 5 times slower than round45r.

round45d and round45r are roughly at the same level.

Four. Further reflection

For a more efficient implementation, consider writing round45 as a C module.

 >>> x = round45(1.275, 2); print(x)
1.28
>>> format(x, '.30f')
'1.280000000000000026645352591004'

If the result is still represented as a float, its value is still the binary double-precision approximation stored by Python.
To perform decimal arithmetic exactly, use the decimal module, assign values from strings, and set the precision and rounding strategy as the computation requires.

>>> from decimal import Decimal, getcontext, ROUND_HALF_UP
>>> Decimal('1.675')
Decimal('1.675') # Use strings to get exact values
>>> getcontext().prec = 52 # Set the precision to 52 digits (decimal digits here)
>>> Decimal('1') / Decimal(str(2**52))
Decimal('2.220446049250313080847263336181640625E-16') # Exact, with as many significant digits as needed (37 here)
>>> Decimal('0.1') + Decimal('0.2')
Decimal('0.3') # Exact within the precision range
>>> getcontext().prec = 3
>>> getcontext().rounding = ROUND_HALF_UP
>>> +Decimal('1.275') # Unary plus applies the context; the constructor alone does not round
Decimal('1.28')
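Changing getcontext() affects every later decimal operation in the thread. When the precision and rounding strategy are needed only for one computation, a local context keeps the change contained. This is a minimal sketch; the helper name `to_significant` is hypothetical.

```python
from decimal import Decimal, localcontext, ROUND_HALF_UP

def to_significant(text, digits):
    """Round a decimal string to `digits` significant digits, half-up."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.rounding = ROUND_HALF_UP
        # Unary plus applies the active context to the value.
        return +Decimal(text)

print(Decimal('1.275'))            # 1.275 (the constructor does not round)
print(to_significant('1.275', 3))  # 1.28
```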

Getting to know decimal's eight rounding strategies

The decimal module offers eight rounding strategies that can be selected for a computation.

These are widely used in scientific computing and data processing and have essentially become a standard.

Some articles also discuss the rounding strategies of Java's BigDecimal, which are basically similar to those of Python's decimal, except that decimal has one more: ROUND_05UP.

(https://www.jianshu.com/p/87627d53f77b?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation)

Use decimal's Context to manage the computing environment and to set the precision and rounding strategy:

>>> import decimal
>>> from decimal import getcontext
>>> tc = getcontext() # Get the computing environment
>>> tc.prec
28 # Default precision
>>> tc.rounding
'ROUND_HALF_EVEN' # The default policy is ROUND_HALF_EVEN
>>> tc.prec = 5 # Set the precision to 5 significant digits
>>> tc.rounding = decimal.ROUND_HALF_UP # Switch to a new rounding policy

The eight rounding strategies in decimal operations:

1) ROUND_CEILING: round toward positive infinity (Infinity)

>>> tc.rounding = decimal.ROUND_CEILING
>>> tc.create_decimal('1.12345')
Decimal('1.1235') # Positive: moves toward the larger value
>>> tc.create_decimal('-1.12345')
Decimal('-1.1234') # Negative: moves toward the smaller absolute value

2) ROUND_DOWN: round toward 0

>>> tc.rounding = decimal.ROUND_DOWN
>>> tc.create_decimal('1.12345')
Decimal('1.1234') # Positive: rounded down
>>> tc.create_decimal('-1.12345')
Decimal('-1.1234') # Negative: moves toward the smaller absolute value

3) ROUND_FLOOR: round toward negative infinity (-Infinity)

>>> tc.rounding = decimal.ROUND_FLOOR
>>> tc.create_decimal('1.12345')
Decimal('1.1234') # Positive: rounded down
>>> tc.create_decimal('-1.12345')
Decimal('-1.1235') # Negative: moves toward the larger absolute value

4) ROUND_HALF_DOWN: round to nearest; on a tie, round toward 0

>>> tc.rounding = decimal.ROUND_HALF_DOWN
>>> tc.create_decimal('1.12346')
Decimal('1.1235') # Not a tie: ordinary round to nearest
>>> tc.create_decimal('-1.12346')
Decimal('-1.1235') # Not a tie: ordinary round to nearest
>>> tc.create_decimal('1.12345')
Decimal('1.1234') # Tie: positive value is rounded down
>>> tc.create_decimal('-1.12345')
Decimal('-1.1234') # Tie: negative value moves toward the smaller absolute value

5) ROUND_HALF_EVEN: round to nearest; on a tie, round up if the preceding digit is odd and leave it unchanged if it is even

>>> tc.rounding = decimal.ROUND_HALF_EVEN
>>> tc.create_decimal('1.12346')
Decimal('1.1235') # Not a tie: ordinary round to nearest
>>> tc.create_decimal('-1.12346')
Decimal('-1.1235') # Not a tie: ordinary round to nearest
>>> tc.create_decimal('1.12345')
Decimal('1.1234') # Tie: goes to the even digit
>>> tc.create_decimal('-1.12335')
Decimal('-1.1234') # Tie: goes to the even digit

6) ROUND_HALF_UP: round to nearest; on a tie, round away from 0

>>> tc.rounding = decimal.ROUND_HALF_UP
>>> tc.create_decimal('1.12346')
Decimal('1.1235') # Not a tie: ordinary round to nearest
>>> tc.create_decimal('-1.12346')
Decimal('-1.1235') # Not a tie: ordinary round to nearest
>>> tc.create_decimal('1.12345')
Decimal('1.1235') # Tie: positive value moves toward Infinity
>>> tc.create_decimal('-1.12345')
Decimal('-1.1235') # Tie: negative value moves toward -Infinity

7) ROUND_UP: round away from 0

>>> tc.rounding = decimal.ROUND_UP
>>> tc.create_decimal('1.12345')
Decimal('1.1235') # Positive: rounded up
>>> tc.create_decimal('-1.12345')
Decimal('-1.1235') # Negative: rounded away from 0 (toward -Infinity)

8) ROUND_05UP: if the last digit after rounding toward 0 would be 0 or 5, round away from 0; otherwise round toward 0.

>>> tc.rounding = decimal.ROUND_05UP
>>> tc.create_decimal('1.12343')
Decimal('1.1234') # Last digit after rounding toward 0 is neither 0 nor 5
>>> tc.create_decimal('1.12351')
Decimal('1.1236') # Last digit after rounding toward 0 is 5: round away from 0
>>> tc.create_decimal('-1.12355')
Decimal('-1.1236') # Last digit after rounding toward 0 is 5: round away from 0
>>> tc.create_decimal('-1.12305')
Decimal('-1.1231') # Last digit after rounding toward 0 is 0: round away from 0
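To compare the eight strategies side by side, the same value can be rounded under each mode. This is a minimal sketch using quantize at a fixed position instead of a context precision; the names `MODES` and `round_all` are illustrative.

```python
import decimal
from decimal import Decimal

MODES = [
    decimal.ROUND_CEILING, decimal.ROUND_DOWN, decimal.ROUND_FLOOR,
    decimal.ROUND_HALF_DOWN, decimal.ROUND_HALF_EVEN, decimal.ROUND_HALF_UP,
    decimal.ROUND_UP, decimal.ROUND_05UP,
]

def round_all(text, quantum='0.0001'):
    """Round a decimal string to the quantum under every rounding mode."""
    value = Decimal(text)
    # The mode constants are plain strings such as 'ROUND_HALF_UP'.
    return {mode: str(value.quantize(Decimal(quantum), rounding=mode))
            for mode in MODES}

for text in ('1.12345', '-1.12345'):
    for mode, result in round_all(text).items():
        print(f'{text:>9}  {mode:<16} {result}')
```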