54 comments
  • If you're working with floating point, you should be aware it's just an approximation.

    • People think floats are too magical. Calling it an approximation is sort of leaning into this. Floats have limited precision like every other fixed size representation of a number.

      This is sort of like saying that integers are an approximation because round(1.6) + round(2.6) = 5. What do you mean‽ Clearly 1.6 + 2.6 = 4.2 ~= 4!
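
      To make the integer analogy concrete, here's a tiny Python sketch (Python is just my pick for illustration). It assumes "converting to an integer" means rounding to nearest, which is what round() does; int() would simply chop off the fraction.

```python
# Round each input to a whole number first, then add.
rounded_then_added = round(1.6) + round(2.6)

# Add first, then round the result once.
added_then_rounded = round(1.6 + 2.6)

print(rounded_then_added)   # 5
print(added_then_rounded)   # 4
```

      Nothing random happened in either line. The inputs were rounded before the addition, so the answer differs from rounding the true sum.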

      Floating point can't perfectly represent 0.1 or 0.2, much like integers can't represent 1.6. So each one is rounded to the nearest representable value. The addition is then performed perfectly accurately using these values, and the result is then rounded to the nearest representable value (much like integer division is "rounded"). That result happens to not be equal to the nearest representable value to 0.3.

      It is definitely a bit surprising. And yes, because of rounding it is technically approximate. But the way people talk about floating point makes it sound like it is some nebulous thing that gives a different result every time or just sort of does the correct thing.
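
      If you want to check the "round the inputs, add exactly, round the result" story yourself, here's a minimal Python sketch. It assumes the usual 64-bit floats; the variable names are mine.

```python
from fractions import Fraction

a, b = 0.1, 0.2

# Fraction(float) gives the exact value the float actually stores.
exact_sum = Fraction(a) + Fraction(b)   # addition with no rounding at all

# The float result is that exact sum rounded to the nearest float.
print(float(exact_sum) == a + b)        # True

# That rounded sum is a different float from the one nearest to 0.3.
print(a + b == 0.3)                     # False
print(a + b)                            # 0.30000000000000004
```

      Same inputs, same output, every time. It is deterministic rounding, not fuzziness.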

      • I think specifically, they have amazing precision. But the boundaries just don't fall perfectly on round numbers we humans would expect. That's what gets people confused.

        Rounding can resolve these problems, or don't use float if you don't need to.
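
        A rough Python sketch of both options, assuming you care about roughly 10 decimal places (that number is an arbitrary pick for the example, not a recommendation):

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

# Direct comparison trips over the last bit.
print(total == 0.3)                         # False

# Option 1: round both sides before comparing, or compare with a tolerance.
print(round(total, 10) == round(0.3, 10))   # True
print(math.isclose(total, 0.3))             # True

# Option 2: don't use float at all; Decimal works in base 10.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```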

      • The difference is that when you input a specific, precise floating point number, the number that's stored isn't what you entered.

        When you enter integers and store them in ints, as long as the number is small enough, what's stored is exactly what you entered.

        If you tell your program that the radius of the circle is 0.2 units exactly, it says OK and stores 0.200000000000000011102230246251565404236316680908203125.

        Of course everybody knows that there's a limit to how many digits get stored. If you tried to store Pi, there's obviously some point where it would have to be cut off. But, in life we're used to cutting things off at a certain power of 10. When we say Pi is 3.14 the numbers after the 4 are all zero. If we choose 3.14159 instead, it's the numbers after the 9 that are zero. If we represent these as fractions one is 314/100 the other is 314159/100000. The denominator is always a power of 10.

        Since computers are base 2, their denominator is always a power of 2, so there's a mismatch between the rounded-off numbers we use and the rounded-off numbers the computer uses.
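
        You can ask Python what fraction it really stored. A small sketch, assuming standard 64-bit floats:

```python
from decimal import Decimal
from fractions import Fraction

# The float 0.2 as an exact fraction: integer / integer.
numerator, denominator = (0.2).as_integer_ratio()

# The denominator is a power of 2 (exactly one bit set).
print(denominator & (denominator - 1) == 0)   # True

# So it cannot be exactly 1/5, whose denominator has a factor of 5.
print(Fraction(numerator, denominator) == Fraction(1, 5))   # False

# The exact decimal expansion of what was stored.
print(Decimal(0.2))
```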

  • Can someone ELI5 why 0.1 + 0.2 fails, but 1+2 doesn't? There's probably a reason you don't represent the decimal portion like you do integers, but I'm tired and not very mathy on a good day.

    • Sure -> I'm not smart enough to explain it like you're five, but maybe 12 or so would work?


      The problem

      The problem here is that you're not adding 1 + 2, or 0.1 + 0.2. You're converting those to binary (because computers talk binary), then you're adding binary numbers, and converting the result back. And the error happens at this conversion step. Let's take it slow, one thing at a time.


      decimal vs binary

      See, if you are looking at decimal numbers, it's kinda like this:

      357 => 7 * 1 + 5 * 10 + 3 * 100. That sequence, from right to left, would be 1, 10, 100, ... as you go from right to left, you keep multiplying that by 10.

      Binary is similar, except it's not 1, 10, 100, 1000 but rather 1, 2, 4, 8, 16 -> multiply by 2 instead of 10. So for example:

      00101101 => right to left => 1 * 1 + 0 * 2 + 1 * 4 + 1 * 8 + 0 * 16 + 1 * 32 + 0 * 64 + 0 * 128 => 45

      The numbers 0, 1, 2, 3..9 we call digits (since we can represent each of them with one digit). And the binary "numbers" 0 and 1 we call bits.
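
      The right-to-left sum above, as a small Python sketch. The helper name from_binary is made up for this example; Python's built-in int(text, 2) does the same job.

```python
def from_binary(bits: str) -> int:
    """Add up bit * place value, walking from right to left."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position   # place values 1, 2, 4, 8, ...
    return total

print(from_binary("00101101"))   # 45
print(int("00101101", 2))        # 45, using the built-in conversion
```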

      You can look up more at simple wikipedia links above probably.


      bits and bytes

      We usually "align" these so that we fill with zeroes on the left until some sane width, which we don't do in decimal.

      132 is 132, right? But what if someone told you to write the number 132 with 5 digits? We can just add zeroes on the left. That's so-called "padding".

      00132 -> it's the same as 132.

      In computers, we often "align" things to 8 bits - or 8 places. Let's say you have 5 -> 101 in binary. To align it to 8 bits, we would add zeroes on the left, and write:

      00000101 -> 101 -> decimal 5.

      Likewise, instead of, say, 100110, you would pad it to 8 bits by adding two zeroes on the left: 00100110.

      Think of it as a thousands separator - we would not write down a million dollars like this: $1000000. We would more frequently write it down like this: $1,000,000, right? (Europe and America do things differently with thousands- and fractions- separators, so 1,000.00 vs 1.000,00. Don't ask me why.)

      So we usually group digits in threes, to make large numbers easier to read.

      E.g. 8487173209478 is hard to read, but 8 487 173 209 478 is simpler to see, it's eight and a half trillion, right?

      With binary, we group things into 8 bits - we call that a "byte". So we would often write this:

      01000101010001001010101010001101

      like this:

      01000101 01000100 10101010 10001101

      I will try to be using either 4 or 8 bits from now on, for binary.
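
      Padding and grouping are easy to play with in Python. A small sketch using the built-in format(); the variable names are mine:

```python
# Pad decimal 132 to 5 digits, and binary 5 to 8 bits.
print(format(132, "05d"))   # 00132
print(format(5, "08b"))     # 00000101

# Split a 32-bit pattern into four bytes of 8 bits each.
bits = "01000101010001001010101010001101"
grouped = " ".join(bits[i:i + 8] for i in range(0, len(bits), 8))
print(grouped)              # 01000101 01000100 10101010 10001101
```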


      which system are we in?

      As a tangential side note, we sometimes add "b" or "d" in front of numbers, that way we know if it's decimal or binary. E.g. is 100 binary or decimal?

      b100 vs d100 makes it easier. Although, we almost never use the d, but we do mark other systems that we use: b for binary, o for octal (system with 8 digits), h for hexadecimal (16 digits).
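
      Programming languages have their own spelling for these markers. In Python, for example, the prefixes are 0b, 0o and 0x, and plain digits mean decimal:

```python
print(0b100)   # 4   (binary)
print(0o100)   # 64  (octal)
print(0x100)   # 256 (hexadecimal)
print(100)     # 100 (decimal, no prefix)

# And back again: the built-ins add the prefix for you.
print(bin(4), oct(64), hex(256))   # 0b100 0o100 0x100
```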

      Anyway.


      Conversion

      To convert numbers to binary, we'd take chunks out of it, write down the bit. Example:

      13 -> ?

      What we want to do is take chunks out of that 13 that we can write down in binary until nothing's left.

      We go from the biggest binary value that fits and subtract it, then go to the next and the next until we get that 13 down to zero. Binary values are 1, 2, 4, 8, 16, 32, ... (and we write them down as b0001, b0010, b0100, b1000, ... with more zeroes on the left.)

      • the biggest of those that fit into 13 seems to be 8, or 1000. So let's start there. Our binary numbers so far: 1000 And we have 13 - 8 = 5 left to deal with.
      • The biggest binary to fit into 5 is 4 (b0100). Our binary so far: b1000 + b0100 And our decimal leftover: 5 - 4 = 1.
      • The biggest binary to fit into 1 is 1 (b0001). So binary: b1000 + b0100 + b0001 And decimal: 1 - 1 = 0.

      So in the end, we have to add these binary numbers:

        1000
        0100
       +0001
       -----
       b1101

      So decimal 13 we write as 1101 in binary.
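
      The "take the biggest chunk that fits" steps above can be sketched in Python like this. The function name to_binary and the width parameter are made up for the example; it only handles whole numbers that fit in the given number of bits.

```python
def to_binary(number: int, width: int = 4) -> str:
    """Greedy conversion: try each place value, biggest first."""
    bits = ""
    for position in range(width - 1, -1, -1):
        chunk = 2 ** position        # 8, 4, 2, 1 for width 4
        if chunk <= number:
            bits += "1"              # this chunk fits, take it out
            number -= chunk
        else:
            bits += "0"              # too big, skip it
    if number != 0:
        raise ValueError("number does not fit in the given width")
    return bits

print(to_binary(13))       # 1101
print(to_binary(45, 8))    # 00101101
```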


      Fractions

      So far, so good, right? Let's go to fractions now. It's very similar, but we split parts before and after the dot.

      E.g. 43.976 =>

      • the part before the dot (whole numbers part) -> 1 * 3 + 10 * 4 => 43
      • the part after it (fractional part) -> 0.1 * 9 + 0.01 * 7 + 0.001 * 6
        Or, we could write it as: 9 / 10 + 7 / 100 + 6 / 1000.

      Just note that we started already with 10 on the fractional part, not with 1 (so it's 1/10, 1/100, 1/1000...)

      The fractional part is similar, except instead of multiplying by 10 at each step, you divide by 10. It would be similar with binary: 1/2, 1/4, 1/8. Let's try something:

      b0101.0110 ->

      • whole number part: 1 * 1 + 2 * 0 + 4 * 1 + 8 * 0 (5)
      • fractional part -> 0 / 2 + 1 / 4 + 1 / 8 + 0 / 16 -> 0.375.

      So b0101.0110 (in binary) would be 5.375 in decimal.
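
      Here is the same calculation as a Python sketch. The helper binary_to_decimal is a hypothetical name for this example; it expects a string of 0s and 1s with one dot in it.

```python
def binary_to_decimal(text: str) -> float:
    """Whole part uses 1, 2, 4, ...; fractional part uses 1/2, 1/4, 1/8, ..."""
    whole, _, fraction = text.partition(".")
    value = float(int(whole, 2)) if whole else 0.0
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) / 2 ** position
    return value

print(binary_to_decimal("0101.0110"))   # 5.375
print(binary_to_decimal("0010.1000"))   # 2.5
```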


      Converting with fractions

      Now, let's convert 2.5 into binary, shall we?

      First we take the whole part: 2. The biggest binary that fits is 2 (b0010). Now the fractional part, 0.5. What's the biggest fraction we can write down? What are all of them?

      If you remember, it's 1/2, 1/4, 1/8, 1/16... or in other words, 0.5, 0.25, 0.125, 0.0625...

      So 0.5 would be binary 1/2, or b0.1000

      And finally, 2.5 in decimal => b0010.1000

      Let's try another one:

      13.625

      • Whole number part is 13 -> we already have it above, it's b1101.
      • Fractional part: 0.625. The biggest fraction that fits is 0.5, or 1/2, or b0.1000. We have then 0.625 - 0.5 = 0.125 left. The next fraction that fits is 1/8 (0.125), written as b0.0010.

      Together with b0.1000 above, it's b0.1010. So the final number is:

      b1101.1010

      Get it? Try a few more:

      4.125, 9.0625, 13.75.
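
      If you want to check your answers, here's a rough Python sketch of the same greedy method with 4 bits on each side of the dot. The name to_binary_fixed is invented for this example, and it uses Fraction so the decimal input itself is exact.

```python
from fractions import Fraction

def to_binary_fixed(value: str, whole_bits: int = 4, frac_bits: int = 4):
    """Return (binary text, leftover that did not fit in the bits we have)."""
    remaining = Fraction(value)
    digits = []
    for position in range(whole_bits - 1, -frac_bits - 1, -1):
        if position == -1:
            digits.append(".")               # crossing from 1s to 1/2s
        chunk = Fraction(2) ** position      # 8, 4, 2, 1, 1/2, 1/4, ...
        if chunk <= remaining:
            digits.append("1")
            remaining -= chunk
        else:
            digits.append("0")
    return "".join(digits), remaining

print(to_binary_fixed("2.5"))      # ('0010.1000', Fraction(0, 1))
print(to_binary_fixed("13.625"))   # ('1101.1010', Fraction(0, 1))
```

      A leftover of zero means the number fit exactly. Try it on the three numbers above, and then on 0.1.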

      Now, all these conversions so far align very nicely. But what happens when they do not?


      Finally, our problem.

      1 + 2 = 3. In binary, let's pad it to 4 bits: 1 -> the biggest binary that fits is b0001. 2 -> the biggest binary that fits is b0010.

      b0001 + b0010 = b0011.

      If we convert the result back: b0011 -> to decimal, we get 3.

      Okay? Good.
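
      The same sum in Python, using binary literals (variable names are mine):

```python
a = 0b0001   # decimal 1
b = 0b0010   # decimal 2
total = a + b

print(format(total, "04b"))   # 0011
print(total)                  # 3
```

      Whole numbers convert to binary with nothing left over, so nothing is lost on the way in or on the way back.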


      Now let's try 0.1 + 0.2.

      • decimal 0.1 => 1 / 10.

      How do we get it in binary? Let's find the biggest fraction that fits: 1/16, or 0.0625, or b0.0001. What's left is 0.1 - 0.0625 = 0.0375. Next binary that fits: 1/32 or 0.03125 or b0.00001. We're left with 0.00625. Next binary that fits is 1/256
      ... etc etc. The leftover never reaches zero (the 0011 pattern keeps repeating forever), so we have to cut it off somewhere. Cutting after 10 places, we get:

      decimal 0.1 ≈ b0.0001100110

      We can do the same with 0.2 -> b0.0011001100.

      Now, let's add those two:

        b0.0001 1001 10
       +b0.0011 0011 00
       ----------------
        b0.0100 1100 10

      Right? So far so good. Now, if we go back to decimal, it should come out to 0.3.

      So let's try it: 0/2+1/4+0/8+0/16+1/32+1/64+0/128+0/256+1/512+0/1024 => 0.298828125

      WHAAAT?
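
      You can replay this with a short Python sketch. It copies the simplification used above (chop everything after 10 binary places); the helper name truncate_to_bits is made up. Real 64-bit floats keep 53 significant bits and round to nearest instead of chopping, so their error is far smaller, but it comes from the same place.

```python
import math
from fractions import Fraction

def truncate_to_bits(value: str, frac_bits: int) -> Fraction:
    """Keep only frac_bits binary places after the dot, dropping the rest."""
    scale = 2 ** frac_bits
    return Fraction(math.floor(Fraction(value) * scale), scale)

a = truncate_to_bits("0.1", 10)
b = truncate_to_bits("0.2", 10)
total = a + b

print(float(a))       # 0.099609375
print(float(b))       # 0.19921875
print(float(total))   # 0.298828125

# With real 64-bit floats the same effect is much smaller:
print(0.1 + 0.2)      # 0.30000000000000004
```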

    • I just recently worked on fixed point 8.8 and basically the way fractional values work, you take an integer and say that integer is then divided by another one. So you represent the number in question with two numbers, not one. 0.3 can be represented in a number of ways, like 3 / 10, or 6 / 20.

      The problem is that the way 0.1 is represented and the way 0.2 is represented don't jibe when you add them, so the compiler makes a fractional representation of 0.3 based on how 0.1 and 0.2 were expressed that just comes out weird.

      That's also why 0.3 + 0.3 is fine. When you wrote 0.3, the compiler/runtime already knew how to express 0.3 without rounding errors.
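
      A minimal sketch of the 8.8 idea in Python, assuming "8.8" means 8 fractional bits, so every value is a stored integer divided by 256. The names SCALE, to_fixed and from_fixed are hypothetical, and rounding to nearest on input is my assumption.

```python
SCALE = 256  # 8 fractional bits: stored integer / 256

def to_fixed(x: float) -> int:
    """Nearest stored integer for x (assumed round-to-nearest)."""
    return round(x * SCALE)

def from_fixed(n: int) -> float:
    """The value a stored integer stands for."""
    return n / SCALE

tenth, fifth, three_tenths = to_fixed(0.1), to_fixed(0.2), to_fixed(0.3)

print(tenth, fifth, three_tenths)        # 26 51 77
print(from_fixed(three_tenths))          # 0.30078125
print(tenth + fifth == three_tenths)     # True
print(tenth * 3 == three_tenths)         # False (78 vs 77)
```

      In this particular format 0.1 + 0.2 happens to land on the same stored integer as 0.3, while 0.1 + 0.1 + 0.1 does not. Whether a sum "comes out right" depends on how each input got rounded.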

    • We use a system for representing decimal numbers that trades off precision for speed and less storage space. Integer numbers don't need to use that system.
