Uncertain<T>

(nshipster.com)
444 points by samtheprogram | 21 comments
1. j2kun ◴[] No.45056827[source]
This concept has been done many times in the past, under the name "interval arithmetic." Boost has it [1], as does FLINT [2].

What is really curious is why, after being reinvented so many times, it is not more mainstream. I would love to talk to people who have tried using it in production and then decided it was a bad idea (if they exist).

[1]: https://www.boost.org/doc/libs/1_89_0/libs/numeric/interval/... [2]: https://arblib.org/

replies(8): >>45056929 #>>45057194 #>>45057366 #>>45057865 #>>45058239 #>>45058375 #>>45058723 #>>45062336 #
2. Tarean ◴[] No.45056929[source]
Interval arithmetic is only a constant factor slower, but it may lose precision at every step. For every operation over numbers there is a unique most precise equivalent operation over intervals, because there's a Galois connection. But just because there is a most precise way to represent a set of numbers as an interval doesn't mean that representation is precise.

A computation graph that gets sampled, as in the article, is much slower but can be accurate: you don't need an abstract domain that loses precision at every step.
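
A small C++ illustration of that gap (my sketch, not the article's Uncertain<T>): the most precise interval subtraction still has to treat its operands as unrelated, so x - x over [0, 1] widens to [-1, 1], while sampling the computation graph reuses the same draw for both occurrences of x and gets exactly 0.

    // Illustrative sketch only: the classic dependency problem that
    // sampling a computation graph avoids but intervals cannot.
    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <random>

    struct Interval { double lo, hi; };

    // The most precise interval subtraction must assume the operands are
    // unrelated, so x - x over [0, 1] widens to [-1, 1].
    Interval sub(Interval a, Interval b) { return {a.lo - b.hi, a.hi - b.lo}; }

    int main() {
        Interval x{0.0, 1.0};
        Interval d = sub(x, x);
        std::printf("interval x - x = [%g, %g]\n", d.lo, d.hi);  // [-1, 1]

        // Sampling the graph reuses the same draw for both occurrences of x,
        // so x - x is exactly 0 on every sample.
        std::mt19937 rng(42);
        std::uniform_real_distribution<double> u(0.0, 1.0);
        double worst = 0.0;
        for (int i = 0; i < 10000; ++i) {
            double xs = u(rng);
            worst = std::max(worst, std::abs(xs - xs));
        }
        std::printf("sampled |x - x| is at most %g\n", worst);  // 0
    }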

replies(1): >>45057275 #
3. pklausler ◴[] No.45057194[source]
Interval arithmetic makes good intuitive sense when the endpoints of the intervals can be represented exactly. Figuring out how to do that, however, is not obvious.
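
One standard workaround (an illustrative sketch of mine, not something the parent proposes) is outward rounding: after each operation, nudge the computed endpoints one ulp outward so the enclosure still holds even when the exact result isn't representable. Real libraries typically switch the FPU rounding mode instead.

    #include <cmath>
    #include <cstdio>

    struct Interval { double lo, hi; };

    // Outward rounding: widen each computed endpoint by one ulp so the
    // interval still encloses the exact real-number sum of the endpoints,
    // even when double addition rounds past it.
    Interval add(Interval a, Interval b) {
        return {std::nextafter(a.lo + b.lo, -INFINITY),
                std::nextafter(a.hi + b.hi, +INFINITY)};
    }

    int main() {
        Interval x{0.1, 0.1};   // endpoints become the nearest doubles to 0.1
        Interval y{0.2, 0.2};
        Interval s = add(x, y); // encloses the exact sum of those endpoints
        std::printf("[%.17g, %.17g]\n", s.lo, s.hi);
    }
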
replies(1): >>45059131 #
4. bee_rider ◴[] No.45057275[source]
It would have been sort of interesting if we’d gone down the road of often using interval arithmetic. Constant factor slower, but also the operations are independent. So if it was the conventional way of handling non-integer numbers, I guess we’d have hardware acceleration by now to do it in parallel “for free.”
replies(1): >>45059122 #
5. kccqzy ◴[] No.45057366[source]
The article says,

> Under the hood, Uncertain<T> models GPS uncertainty using a Rayleigh distribution.

And the Rayleigh distribution is clearly not just an interval with a uniform distribution in between. Normal interval arithmetic isn't useful because that uniform distribution isn't at all a good model of the real world.

Take for example the Boost library you linked. Ask it to compute (-2,2)*(-2,2) and it will give (-4,4). A more sensible result might be something like (-2.35, 2.35). The -4 lower bound is only attainable when the multiplicands are -2 and 2, the extremes of the interval; probabilistically, if we assume these are independent random variables, then both reaching their extremes simultaneously should have an even lower probability.
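
A quick Monte Carlo sketch (mine, not Boost's API) puts numbers on that: the product of two independent Uniform(-2, 2) draws stays inside roughly (-2.35, 2.35) about 90% of the time, even though the worst case is (-4, 4).

    #include <algorithm>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        std::mt19937 rng(1);
        std::uniform_real_distribution<double> u(-2.0, 2.0);

        const int n = 1000000;
        std::vector<double> prod(n);
        for (double& p : prod) p = u(rng) * u(rng);

        std::sort(prod.begin(), prod.end());
        // Central 90% interval: the 5th and 95th percentiles of the samples.
        double lo = prod[static_cast<size_t>(0.05 * n)];
        double hi = prod[static_cast<size_t>(0.95 * n)];
        std::printf("interval arithmetic bound: [-4, 4]\n");
        std::printf("central 90%% of samples:   [%.2f, %.2f]\n", lo, hi);
    }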

replies(1): >>45060585 #
6. anal_reactor ◴[] No.45057865[source]
Because reasoning about uncertain values / random variables / intervals / fuzzy logic / whatever is difficult, while the model where things are certain is much easier to work with and still models reality well enough.
7. PaulDavisThe1st ◴[] No.45058239[source]
Several years ago, when I discovered some of the historical work on interval arithmetic, I was astounded to find that there was a notable contingent in the 60s urging hardware developers to make interval arithmetic the basic numeric design of new CPUs, and saying quite forcefully that if we simply went with "normal" integers and floating point, we'd be unable to correctly model the world.
replies(1): >>45058826 #
8. woah ◴[] No.45058375[source]
Simple types (booleans etc.) are easy to reason about, and any shortcomings are obvious. Trying to model physical uncertainty is difficult and requires different models for different domains. Once you have committed to doing that, it would be much better to use a purpose-built model than a library that puts some bell curves behind a pretty API.
replies(1): >>45059106 #
9. orlp ◴[] No.45058723[source]
Not sure why this is being upvoted, as the article is not describing interval arithmetic. It supports all kinds of uncertainty distributions.
10. skissane ◴[] No.45058826[source]
I think, as another commenter pointed out, interval arithmetic's problem is that while it acknowledges the reality of uncertainty, its model of uncertainty is so simplistic that in many applications it is unusable. Making it the standard primitive could mean that apps which don't need to model uncertainty at all are forced to pay the price of doing so, while apps which need a more realistic model of uncertainty are hamstrung by its interactions with another overly simple one. It is one of those ideas which sounds great in theory, but there are good reasons it never succeeded in practice: the space of use cases where explicitly modelling uncertainty is desirable, yet where the simplistic model of interval arithmetic is entirely adequate, is rather small. A standard primitive which only addresses the needs of a narrow subset of use cases is not good architecture.
11. eru ◴[] No.45059106[source]
I agree that, strictly speaking, different applications need different models of uncertainty.

But I'm not so sure about your conclusion: a good enough model could be universally useful. See how everyone uses IEEE 754 floats, despite them giving effectively one very specific model of uncertainty. Most of the time this just works, and sometimes people have to explicitly work around floats' weirdnesses (whether because they carefully planned ahead and knew what they were doing, or because they got a nasty surprise first). But overall they are still useful enough to be used almost universally.

12. eru ◴[] No.45059122{3}[source]
You can probably get the parallelism for interval arithmetic today? Though it would probably require a bit of effort and not be completely free.

On the CPU you probably get implicit parallel execution with pipelines and re-ordering etc, and on the GPU you can set up something similar.

13. eru ◴[] No.45059131[source]
Also not all uncertainties are modeled well by uniform distributions over an interval.
14. rendaw ◴[] No.45060585[source]
While it does sound like GP missed a distinction, I don't see how (-2.35, 2.35) would be sensible. The extremes can happen (or else they wouldn't be part of the input intervals) and the code has to sensibly deal with that event in order to be correct.
replies(3): >>45062972 #>>45064017 #>>45080705 #
15. jjcob ◴[] No.45062336[source]
In physics, you typically learn about error propagation quite early in your studies.

If you make some assumptions about your error (a popular one is to assume a Gaussian distribution), then you can calculate the error of the result quite elegantly.

It's a nice exercise to write some custom C++ types that hold (value, error) and automatically propagate them as you perform mathematical operations on them.
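
For instance, a minimal sketch of such a type (mine, assuming independent Gaussian errors and first-order propagation; the values are made up):

    #include <cmath>
    #include <cstdio>

    struct Measured {
        double value;
        double error;  // one standard deviation
    };

    // Sum: absolute errors add in quadrature.
    Measured operator+(Measured a, Measured b) {
        return {a.value + b.value, std::hypot(a.error, b.error)};
    }

    // Product: relative errors add in quadrature.
    Measured operator*(Measured a, Measured b) {
        double v = a.value * b.value;
        double rel = std::hypot(a.error / a.value, b.error / b.value);
        return {v, std::abs(v) * rel};
    }

    int main() {
        Measured length{2.00, 0.05};
        Measured width{3.00, 0.10};
        Measured area = length * width;
        std::printf("area = %.2f +/- %.2f\n", area.value, area.error);
    }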

Unfortunately, in the real world only very few measurements have a Gaussian error distribution, and the real problem is systematic (non-random) errors, which are a lot harder to reason about.

So this means that automatically handling error propagation is in most cases pointless, since you need to manually analyze the situation anyway.

16. esrauch ◴[] No.45062972{3}[source]
The reason is that the uniform distribution is very rare. In almost no real-world scenario is something equally likely to be 2, 0, and -2, yet literally impossible to be -2.01. Such cases exist, but they're not the normal case.

In the noisy-sensor case there's some arbitrarily low probability of the sensors being wildly wrong; if you go by true 10^-10 outlier bounds, they will be useless for any practical purpose, while the 99% confidence range is relatively tight.

More often you want some other distribution, where (-2, 2) is the 90th-percentile interval rather than the absolute bounds: 0 is more likely than -2, and -3 is possible but rare. These aren't hard bounds; you can ask your model for the 99th or 99.9th percentile value, or whatever tolerance you want, and get something outside of (-2, 2).
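
For instance, a sketch of that (mine, using Boost.Math rather than the article's Uncertain<T>): a zero-mean normal whose central 90% interval is roughly (-2, 2) still reports values well outside it at higher tolerances.

    #include <boost/math/distributions/normal.hpp>
    #include <cstdio>

    int main() {
        // Zero-mean normal whose 95th percentile sits at +2, so the
        // central 90% interval is roughly (-2, 2).
        boost::math::normal_distribution<> noise(0.0, 2.0 / 1.6449);

        const double ps[] = {0.95, 0.99, 0.999};
        for (double p : ps) {
            double q = boost::math::quantile(noise, p);
            std::printf("%.1f%% of values fall below %+.2f\n", 100 * p, q);
        }
    }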

17. kccqzy ◴[] No.45064017{3}[source]
Interval arithmetic isn't useful because it only tells you the extreme values, but not how likely these values are. So you have to interpret them as uniform random. Operations like multiplications change the shape of these distributions, so then uniform random isn't applicable any more. Therefore interval arithmetic basically has an undefined underlying distribution that can change easily without being tracked.
replies(1): >>45065842 #
18. mcphage ◴[] No.45065842{4}[source]
> Operations like multiplications change the shape of these distributions, so then uniform random isn't applicable any more.

Doesn't addition as well? Like if you roll d6+d6, the output range is 2-12, but it's not nearly the same as if you rolled d11+1.
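
A tiny enumeration (my sketch) makes the difference concrete: both cover 2 through 12, but d6+d6 peaks at 7 with probability 6/36, while d11+1 gives every total the same 1/11.

    #include <cstdio>

    int main() {
        int counts[13] = {0};
        for (int a = 1; a <= 6; ++a)
            for (int b = 1; b <= 6; ++b)
                ++counts[a + b];

        // Triangular distribution for d6+d6 vs a flat 1/11 for d11+1.
        for (int total = 2; total <= 12; ++total)
            std::printf("P(d6+d6 = %2d) = %2d/36   vs   P(d11+1 = %2d) = 1/11\n",
                        total, counts[total], total);
    }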

replies(1): >>45066031 #
19. kccqzy ◴[] No.45066031{5}[source]
Yes that's true! I used multiplication because that was my original example.
replies(1): >>45073853 #
20. mcphage ◴[] No.45073853{6}[source]
Okay, thanks :-). I was just trying to make sure I was understanding what I was reading.
21. Dylan16807 ◴[] No.45080705{3}[source]
-2 and 2 were not the extremes to begin with.