“Floating point computation is by nature inexact, and programmers can easily misuse it so that the computed answers consist almost entirely of noise.”
Donald Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms
1. Rounding confusion
In Python 2, the round function uses the round-half-away-from-zero strategy (ROUND_HALF_UP): round to the nearest value and, when two candidates are equidistant, round away from 0. This is the usual schoolbook rounding.
(Values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done away from 0.)
In Python 3, the round function uses the round-half-to-even strategy (ROUND_HALF_EVEN): round to the nearest value and, when two candidates are equidistant, round to the even one.
(values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice)
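The tie rule is easiest to see on values that binary can represent exactly. Halves are finite sums of powers of 2, so the stored value equals the literal and the tie is real. A minimal sketch, assuming Python 3:

```python
# Halves are exactly representable in binary, so these are true ties.
ties = [0.5, 1.5, 2.5, 3.5, -0.5, -1.5]
rounded = [round(x) for x in ties]
print(rounded)  # [0, 2, 2, 4, 0, -2]
```

Every tie lands on the even neighbour, regardless of sign.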
At the same time, float is stored in double-precision binary (see the IEEE 754 standard), and round works on that stored binary value rather than on the decimal literal you typed. The tie-breaking rule is therefore not applied to the number as written, which causes some confusing results:
    >>> round(1.205, 2)
    1.21
    ......
    >>> round(1.245, 2)
    1.25
    >>> round(1.255, 2)
    1.25
    >>> round(1.265, 2)
    1.26
    ......
    >>> round(1.295, 2)
    1.29
NumPy's round has the same problem:

    >>> numpy.round(1.255, 2)
    1.25
    >>> numpy.round(1.265, 2)
    1.26
    >>> format(1.215, '.52f')   # stored slightly above 1.215
    '1.2150000000000000799360577730112709105014801025390625'
    >>> round(1.215, 2)
    1.22
    >>> format(1.275, '.52f')   # stored slightly below 1.275
    '1.2749999999999999111821580299874767661094665527343750'
    >>> round(1.275, 2)
    1.27

    >>> format(1.125, '.52f')   # exactly representable, so the tie goes to even
    '1.1250000000000000000000000000000000000000000000000000'
    >>> round(1.125, 2)
    1.12
    >>> format(0.5, '.52f')
    '0.5000000000000000000000000000000000000000000000000000'
    >>> round(0.5, 0)
    0.0
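Instead of reading 52 printed digits, one can ask on which side of the typed literal the stored double lies. The sketch below uses `Decimal(float)`, which converts the stored value exactly; the helper name `stored_side` is made up for this illustration.

```python
from decimal import Decimal

def stored_side(literal: str) -> str:
    """Say whether the float made from `literal` lies below, above, or on it."""
    exact = Decimal(float(literal))   # exact value of the stored double
    written = Decimal(literal)        # the decimal number as typed
    if exact > written:
        return "above"
    if exact < written:
        return "below"
    return "exact"

for lit in ("1.215", "1.275", "1.125"):
    print(lit, stored_side(lit), round(float(lit), 2))
```

A literal stored "above" its written value rounds up at the 5, one stored "below" rounds down, and only an "exact" one reaches the tie rule.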
When you need exactly predictable results, round() can be a headache.
This looks like a simple question, so why is it so hard to understand? Is something wrong with Python's rounding strategy?
As a mature language platform, Python would obviously not have overlooked such a problem.
In fact, floating-point numbers are not as simple as we think.
As Knuth points out, floating-point precision is a more complicated problem than we imagine. The results are often frustrating, to the point that we may stop trusting the computer. Mastering the rules takes patience and careful study. We hold many illusions about floating-point calculation: we tend to confuse the computer's finite arithmetic with the exact arithmetic of algebra, and so many results puzzle us. A serious programmer who wants to know how accurate a floating-point result is needs to know how the computer represents and processes floating-point numbers. For a more thorough answer, study the IEEE 754 standard in detail, or Knuth's The Art of Computer Programming.
To understand the rounding of Python float values, we need two things: the binary representation and the decimal rounding policy.
$\underline{\text{Design of the floating-point type float and of round}}$
$\underline{\text{Binary representation of the float type}}$
【Example 1】 The decimal number 0.1 cannot be expressed with a finite number of binary digits, and all original values that share the same approximation compare equal.
From decimal to binary: $(0.1)_{10} = (0.0001100110011001100110011001100110011001100110011001\ldots)_{2}$
Inside the system, the original value is converted to an approximation:

    >>> a = 1.2355000000000000426325641456060111522674560546875
    >>> b = 1.2355
    >>> a == b
    True
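The binary expansion in Example 1 can be reproduced with exact rational arithmetic: repeatedly double the fraction and peel off the integer part. This is a small sketch; `binary_fraction_digits` is a hypothetical helper, not a library function.

```python
from fractions import Fraction

def binary_fraction_digits(value: Fraction, n: int) -> str:
    """Return the first n binary digits of a fraction in [0, 1)."""
    digits = []
    for _ in range(n):
        value *= 2
        bit = int(value >= 1)      # next binary digit
        digits.append(str(bit))
        value -= bit               # keep only the fractional part
    return "0." + "".join(digits)

print(binary_fraction_digits(Fraction(1, 10), 12))  # 0.000110011001
print(binary_fraction_digits(Fraction(1, 8), 3))    # 0.001
```

For 1/10 the remainder cycles through 1/5, 2/5, 4/5, 3/5 and never reaches zero, so the pattern 0011 repeats forever. For 1/8 the remainder is zero after three digits.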
【Example 2】 After conversion to binary, the original value no longer exists, yet the display can sometimes suggest otherwise.
When we assign a value in Python, it is converted, following the floating-point standard, to an approximation stored in binary. The value actually held by the system is therefore not our original value, although the system treats the two as the same, and every calculation uses the approximation. What is easily confusing is that, for simplicity, Python sometimes displays the original value to the user.
    >>> x = 0.1
    >>> print(x)
    0.1
    >>> format(x, '.20f')
    '0.10000000000000000555'
    >>> format(x, '.50f')
    '0.10000000000000000555111512312578270211815834045410'
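The reason `print(x)` shows `0.1` is that Python 3 displays the shortest decimal string that converts back to the same stored double, not the stored value itself. The following sketch shows the stored value as an exact fraction next to the displayed string:

```python
x = 0.1
numerator, denominator = x.as_integer_ratio()   # exact stored value as a fraction
print(numerator, denominator)                   # denominator is a power of two
print(repr(x))                                  # shortest string that maps back to x
print(float(repr(x)) == x)                      # the display round-trips
```

Since the denominator is a power of two, the fraction cannot equal 1/10, even though the display says `0.1`.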
$\underline{\text{Rounding analysis of the built-in function round}}$
round(number, ndigits)
【values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice (so, for example, both round(0.5) and round(-0.5) are 0, and round(1.5) is 2). 】
For a given floating-point number $v$, we may as well consider only fractions less than 1, that is, the significant digits of the number, ignoring the sign and the exponent.
Its decimal representation is:
$v_{10} = d_{1} \cdots d_{m} = d_{1}10^{-1}+\cdots+d_{m}10^{-m}$
Python handles it as a float and stores the closest binary value:
$v_{2} = b_{1} \cdots b_{n} = b_{1}2^{-1}+\cdots+b_{n}2^{-n}$
When this binary approximation $v_{2}$ is converted back to base 10, it generally does not equal the original value, unless the original value can be expressed as a finite sum of negative powers of 2.
For example: $0.125 = 0\cdot\frac{1}{2}+0\cdot\frac{1}{4}+1\cdot\frac{1}{8} = (0.001)_{2}$
$\underline{\text{When round rounds to decimal places, it actually works on the decimal value of } v_{2}}$
When round is asked to round the decimal number $v_{10}$ at its k-th digit, it actually processes the decimal value of $v_{2}$. Python 3 currently applies the following rules:
(1) When digit k+1 of the decimal value of $v_{2}$ is 0-4 or 6-9, the rounding is unambiguous: 4 and below round down, 6 and above round up.
(2) When digit k+1 of the decimal value of $v_{2}$ is 5, compare the distances from $v_{2}$ to $d_{1} \ldots d_{k}$ and to $d_{1} \ldots (d_{k}+1)$: round to the closer one, and only when the two distances are exactly equal round to the even one.
Look at how the following floating-point numbers are rounded:

    >>> round(1.125, 2)
    1.12
    >>> round(1.5, 0)
    2.0
    >>> round(0.5, 0)
    0.0
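The two rules can be modelled outside of round: take the exact decimal value of the stored double and round that with ties to even. The sketch below is a model for checking understanding, not how CPython implements round internally; `model_round` is a hypothetical name.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def model_round(x: float, k: int) -> float:
    """Round the exact stored value of x to k decimal places, ties to even."""
    exact = Decimal(x)                      # decimal value of v_2, no loss
    quantum = Decimal(1).scaleb(-k)         # 10**-k, e.g. Decimal('0.01') for k=2
    return float(exact.quantize(quantum, rounding=ROUND_HALF_EVEN))

for x, k in [(1.205, 2), (1.255, 2), (1.125, 2), (0.5, 0), (1.5, 0)]:
    print(x, k, model_round(x, k), round(x, k))
```

For these inputs the model and the built-in agree, which supports the reading that the tie rule only fires when the stored value sits exactly on the midpoint.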
It seems that round alone cannot provide schoolbook round-half-up, because that is not its design goal, and simple adjustments cannot make it do so.
Round-half-to-even benefits the accuracy of data analysis: it is an error-minimization strategy.
At the same time, from the user's point of view, round is also affected by the binary representation. If only rounding were involved, the result would depend only on the rounding rule. But the value given by the user is first converted to its double-precision approximation, and the round rule is applied to that approximation, so the precision range has to be considered. For float this range is the 52-bit stored mantissa (53 bits of precision), which corresponds to roughly 15 to 17 significant decimal digits. The significant digits include the digits of the integer part.
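These limits can be read from the interpreter. The sketch assumes the usual case where Python's float is an IEEE 754 double; on such platforms the values are as commented.

```python
import sys

info = sys.float_info
print(info.mant_dig)                 # 53 bits of precision (52 stored, 1 implicit)
print(info.dig)                      # 15 decimal digits always survive a round trip
print(info.epsilon == 2.0 ** -52)    # spacing of doubles just above 1.0
```

`dig` is the conservative bound: any decimal number with up to 15 significant digits can be stored and printed back unchanged, while up to 17 digits may be needed to identify a particular double.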
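String conversion method (negative ndigits are not supported; other issues remain to be tested...):

    def round45s(v, d=0):
        # Split the shortest decimal repr into integer and fraction digit strings.
        vl = str(v).split('.')
        sign = -1 if v < 0 else 1
        if len(vl) == 2:
            if len(vl[1]) > d:
                if vl[1][d] >= '5':          # digit after the kept ones rounds up
                    if d > 0:
                        # int() rather than eval(): eval('05') is a syntax error
                        vp = (int(vl[1][:d]) + 1) / 10**d * sign
                        return int(v) + vp
                    else:
                        return int(v) + sign
                else:                        # digit is 0-4: truncate
                    if d > 0:
                        return int(v) + int(vl[1][:d]) / 10**d * sign
                    else:
                        return int(v)
            else:
                return v                     # already has d or fewer decimals
        return int(v)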
An improved approach:
This method nudges the original value to a slightly larger magnitude so that the finite decimal digits of the input do not change (the digits after them are zeros up to the 15th place), which avoids the "...999" artifact. Because of the double-precision binary approximation, it can only be used within 15 significant decimal digits (digits < 15). Note that the digits of the integer part also count toward the significant digits.
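    def round45r(number, digits=0):
        int_len = len(str(int(abs(number))))      # digits in the integer part
        signal_ = 1 if number >= 0 else -1
        err_place = 16 - int_len - 1              # decimal place that receives the nudge
        if err_place > 0:
            err_ = 10**-err_place
            # Push the value slightly away from zero, then let round() decide.
            return round(number + err_ * signal_, digits)
        else:
            # Limited by the precision of float.
            raise NotImplementedError('number exceeds float precision')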
round45r() also works for negative numbers and for negative digits, that is, both v and d may be negative:

    >>> round45r(-1.205, 2)     # the error is added near the 16th significant digit, so the earlier digits do not change
    -1.21
    >>> round45r(123.205, -2)
    100.0
    >>> round45r(153.205, -2)
    200.0
    >>> round45r(-153.205, -2)
    -200.0
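The effect of the nudge, and its limit, can be seen without the helper function. This sketch uses 1e-14, the correction size that applies to a value with a one-digit integer part; the second value `y` is a made-up input with 16 significant digits, chosen to show where the trick breaks down.

```python
x = 1.255                   # stored slightly below 1.255
nudge = 1e-14               # correction size for a one-digit integer part
print(round(x, 2))          # 1.25
print(round(x + nudge, 2))  # 1.26

# Hypothetical counter-example with 16 significant digits:
y = 1.254999999999999
print(round(y + nudge, 2))  # 1.26, although half-up on y itself would give 1.25
```

The nudge repairs inputs that were typed with few digits, but it also pushes genuinely smaller values over the midpoint once the input uses the digits where the correction lives.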
If the running time is acceptable, prefer the high-precision decimal module (both precision and rounding strategy can be controlled):

    from decimal import Decimal, Context, ROUND_HALF_UP

    def round45d(v, d):
        if int(v) > 0:
            d = d + str(v).find('.')   # significant digits include the integer part
        return float(Context(prec=d, rounding=ROUND_HALF_UP).create_decimal(str(v)))

    >>> round45d(1.205, 2)
    1.21
    >>> round45d(1.255, 2)
    1.26
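A different way to use decimal for the same goal is `Decimal.quantize`, which rounds to a fixed number of decimal places instead of a number of significant digits, so the integer part needs no special handling. This is an alternative sketch, not the function above; `round_half_up` is a hypothetical name, and it relies on `str(value)` giving the digits the user typed.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, ndigits=0):
    """Schoolbook rounding of the *displayed* decimal value of `value`."""
    quantum = Decimal(1).scaleb(-ndigits)   # 10**-ndigits; ndigits may be negative
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))

print(round_half_up(1.205, 2))     # 1.21
print(round_half_up(1.255, 2))     # 1.26
print(round_half_up(-1.205, 2))    # -1.21
print(round_half_up(153.205, -2))  # 200.0
```

Going through `str` means the rounding sees the short decimal form of the float, which is what makes 1.255 round up here.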
Efficiency test results:

    >>> test_round45s(number=1000000)
    6.26826286315918e-06
    >>> test_round45r(number=1000000)
    1.287583589553833e-06
    >>> test_round45d(number=1000000)
    1.7323946952819824e-06

round45s is about 5 times slower than round45r.
round45d and round45r are basically at the same level.
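The timing functions themselves are not shown. A measurement of this kind could be set up with the standard `timeit` module, as in the sketch below. Both `seconds_per_call` and `decimal_half_up` are hypothetical stand-ins, and the numbers will differ by machine and Python version.

```python
import timeit
from decimal import Decimal, ROUND_HALF_UP

def decimal_half_up(v, d):
    # Hypothetical stand-in for a decimal-based rounding function.
    return float(Decimal(str(v)).quantize(Decimal(1).scaleb(-d), rounding=ROUND_HALF_UP))

def seconds_per_call(func, *args, number=100_000):
    """Average wall-clock seconds for one call of func(*args)."""
    return timeit.timeit(lambda: func(*args), number=number) / number

print(seconds_per_call(round, 1.255, 2))
print(seconds_per_call(decimal_half_up, 1.255, 2))
```

Dividing the total by `number` gives a per-call figure in seconds, the same order of magnitude as the values reported above.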
    >>> from decimal import Decimal, getcontext, ROUND_HALF_UP
    >>> Decimal('1.675')                        # use a string to get the exact value
    Decimal('1.675')
    >>> getcontext().prec = 52                  # set the precision to 52 decimal digits
    >>> Decimal('1') / Decimal(str(2**52))      # exact here: the result has 37 significant digits
    Decimal('2.220446049250313080847263336181640625E-16')
    >>> Decimal('0.1') + Decimal('0.2')         # exact within the precision range
    Decimal('0.3')
    >>> getcontext().prec = 3
    >>> getcontext().rounding = ROUND_HALF_UP
    >>> getcontext().create_decimal('1.275')    # the Decimal() constructor alone does not round
    Decimal('1.28')
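One detail is easy to miss: the context precision and rounding mode apply to arithmetic and to `create_decimal`, not to the `Decimal(...)` constructor. The sketch below shows the three cases side by side, using `localcontext` so the global context is left untouched.

```python
from decimal import Decimal, localcontext, ROUND_HALF_UP

with localcontext() as ctx:
    ctx.prec = 3
    ctx.rounding = ROUND_HALF_UP
    constructed = Decimal('1.275')            # constructor: kept exactly
    after_plus = +constructed                 # unary plus: rounded by the context
    created = ctx.create_decimal('1.275')     # create_decimal: rounded by the context

print(constructed, after_plus, created)       # 1.275 1.28 1.28
```

If a value must be rounded to the context precision on creation, `create_decimal` or a unary plus does it; the plain constructor keeps every digit.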
decimal offers eight rounding strategies to choose from when calculating.
They are widely used in scientific computing and data processing and have essentially become a standard.
Some articles also discuss the rounding strategies of Java's BigDecimal, which are basically similar to those of Python's decimal; decimal just adds ROUND_05UP.
(https://www.jianshu.com/p/87627d53f77b?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation)
    >>> import decimal
    >>> from decimal import getcontext
    >>> tc = getcontext()                       # get the current context
    >>> tc.prec                                 # default precision
    28
    >>> tc.rounding                             # the default policy is ROUND_HALF_EVEN
    'ROUND_HALF_EVEN'
    >>> tc.prec = 5                             # set 5 significant digits of precision
    >>> tc.rounding = decimal.ROUND_HALF_UP     # switch to a new rounding policy
1) ROUND_CEILING: round towards positive infinity (Infinity)

    >>> tc.rounding = decimal.ROUND_CEILING
    >>> tc.create_decimal('1.12345')     # positive numbers move to the larger value
    Decimal('1.1235')
    >>> tc.create_decimal('-1.12345')    # negative numbers move to the smaller absolute value
    Decimal('-1.1234')

2) ROUND_DOWN: round towards 0

    >>> tc.rounding = decimal.ROUND_DOWN
    >>> tc.create_decimal('1.12345')     # positive numbers are truncated
    Decimal('1.1234')
    >>> tc.create_decimal('-1.12345')    # negative numbers move to the smaller absolute value
    Decimal('-1.1234')

3) ROUND_FLOOR: round towards negative infinity (-Infinity)

    >>> tc.rounding = decimal.ROUND_FLOOR
    >>> tc.create_decimal('1.12345')     # positive numbers are rounded down
    Decimal('1.1234')
    >>> tc.create_decimal('-1.12345')    # negative numbers move to the larger absolute value
    Decimal('-1.1235')
4) ROUND_HALF_DOWN: round to the nearest value; on a tie, round towards 0

    >>> tc.rounding = decimal.ROUND_HALF_DOWN
    >>> tc.create_decimal('1.12346')     # not a tie: 4 rounds down, 6 rounds up
    Decimal('1.1235')
    >>> tc.create_decimal('-1.12346')    # not a tie: 4 rounds down, 6 rounds up
    Decimal('-1.1235')
    >>> tc.create_decimal('1.12345')     # tie: positive numbers are rounded down
    Decimal('1.1234')
    >>> tc.create_decimal('-1.12345')    # tie: negative numbers move to the smaller absolute value
    Decimal('-1.1234')

5) ROUND_HALF_EVEN: round to the nearest value; on a tie, round up if the preceding digit is odd and leave it if it is even

    >>> tc.rounding = decimal.ROUND_HALF_EVEN
    >>> tc.create_decimal('1.12346')     # not a tie: 4 rounds down, 6 rounds up
    Decimal('1.1235')
    >>> tc.create_decimal('-1.12346')    # not a tie: 4 rounds down, 6 rounds up
    Decimal('-1.1235')
    >>> tc.create_decimal('1.12345')     # tie: move to the even digit
    Decimal('1.1234')
    >>> tc.create_decimal('-1.12335')    # tie: move to the even digit
    Decimal('-1.1234')

6) ROUND_HALF_UP: round to the nearest value; on a tie, round away from 0

    >>> tc.rounding = decimal.ROUND_HALF_UP
    >>> tc.create_decimal('1.12346')     # not a tie: 4 rounds down, 6 rounds up
    Decimal('1.1235')
    >>> tc.create_decimal('-1.12346')    # not a tie: 4 rounds down, 6 rounds up
    Decimal('-1.1235')
    >>> tc.create_decimal('1.12345')     # tie: positive numbers move towards Infinity
    Decimal('1.1235')
    >>> tc.create_decimal('-1.12345')    # tie: negative numbers move towards -Infinity
    Decimal('-1.1235')
7) ROUND_UP: round away from 0

    >>> tc.rounding = decimal.ROUND_UP
    >>> tc.create_decimal('1.12345')     # positive numbers are rounded up
    Decimal('1.1235')
    >>> tc.create_decimal('-1.12345')    # negative numbers move to the larger absolute value
    Decimal('-1.1235')

8) ROUND_05UP: if the last digit after rounding towards 0 would be 0 or 5, round away from 0; otherwise round towards 0.

    >>> tc.rounding = decimal.ROUND_05UP
    >>> tc.create_decimal('1.12343')     # last digit after rounding towards 0 is neither 0 nor 5
    Decimal('1.1234')
    >>> tc.create_decimal('1.12351')     # last digit after rounding towards 0 is 5: round away from 0
    Decimal('1.1236')
    >>> tc.create_decimal('-1.12355')    # last digit after rounding towards 0 is 5: round away from 0
    Decimal('-1.1236')
    >>> tc.create_decimal('-1.12305')    # last digit after rounding towards 0 is 0: round away from 0
    Decimal('-1.1231')
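To compare all eight strategies on the same input, they can be applied in a loop. The sketch below uses `quantize` to four decimal places rather than a context with `prec = 5`; for these inputs both keep the same digits. `round_all` is a hypothetical helper name.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP,
                     ROUND_UP, ROUND_05UP)

MODES = [ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR, ROUND_HALF_DOWN,
         ROUND_HALF_EVEN, ROUND_HALF_UP, ROUND_UP, ROUND_05UP]

def round_all(text: str, quantum: str = "0.0001") -> dict:
    """Round one decimal string under every mode; returns {mode: result string}."""
    value = Decimal(text)
    q = Decimal(quantum)
    return {mode: str(value.quantize(q, rounding=mode)) for mode in MODES}

for text in ("1.12345", "-1.12345"):
    for mode, result in round_all(text).items():
        print(f"{text:>9}  {mode:<16} {result}")
```

Printing a positive and a negative tie side by side makes the direction of each strategy visible at a glance.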