Certainly a univariate model in the type system could be useful, but it would be extra powerful (and more correct) if it could handle covariance.
Bayes is mentioned on page 46.
> And why does it need to be part of the type system? It could be just a library.
It is a library that defines a type.
It is not a new type system, or an extension to any particularly complicated type system.
> Am I missing something?
Did you read it?
https://www.microsoft.com/en-us/research/wp-content/uploads/...
If you go down this road far enough you eventually end up reinventing particle filters and similar.
Bayes isn't mentioned in the linked article. But thanks for the links.
e.g. 10cm +8mm/-3mm for what the acceptable range is, both bigger and smaller.
I'd expect something like "are we there yet?" referencing GPS to understand the direction of the error and which directions of uncertainty are better or worse.
It usually takes some "finesse" to get your data / measurements into territory where the errors are even small in the first place. So I think it is probably better to save things like this Uncertain<T> for the kinds of long/fat/heavy-tailed and oddly shaped distributions that occur in real-world data (if the expense doesn't get in your way some other way, that is, as per the Senior Engineer in the article).
It’s so cool!
I always start my introductory Haskell course with a demo of the Monty Hall problem, using the probability monad and rationals to get the exact probability of winning under the two strategies as a fraction.
What is really curious is why, after being reinvented so many times, it is not more mainstream. I would love to talk to people who have tried using it in production and then decided it was a bad idea (if they exist).
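For anyone curious what that looks like, here's a rough sketch of the same idea in Python (a toy "probability monad" as a list of weighted outcomes with exact fractions; not the actual Haskell code from the course):

```python
from fractions import Fraction

# A distribution is a list of (outcome, probability) pairs with exact Fractions.
def uniform(xs):
    p = Fraction(1, len(xs))
    return [(x, p) for x in xs]

def bind(dist, f):
    # Monadic bind: expand each outcome into a weighted sub-distribution.
    return [(y, p * q) for x, p in dist for y, q in f(x)]

def prob(dist, pred):
    return sum(p for x, p in dist if pred(x))

doors = [1, 2, 3]
# Place the car, let the contestant pick, then the host opens a goat door
# that isn't the contestant's pick.
game = bind(uniform(doors), lambda car:
       bind(uniform(doors), lambda pick:
       bind(uniform([d for d in doors if d not in (car, pick)]), lambda opened:
            [((car, pick, opened), Fraction(1))])))

stay   = prob(game, lambda o: o[1] == o[0])
switch = prob(game, lambda o: next(d for d in doors if d not in (o[1], o[2])) == o[0])
print(stay, switch)   # 1/3 2/3
```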
[1]: https://www.boost.org/doc/libs/1_89_0/libs/numeric/interval/... [2]: https://arblib.org/
A computation graph that gets sampled, like the one here, is much slower but can be accurate. You don't need an abstract domain that loses precision at every step.
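To make that concrete, here's a minimal sketch of the sampled-computation-graph idea (toy code, not the real Uncertain<T> implementation): every value is just a sampling function, the operators compose samplers, and you only look at quantiles of the composite at the end.

```python
import random

class Uncertain:
    """A value represented only by a sampling function; arithmetic composes samplers."""
    def __init__(self, sample):
        self.sample = sample

    def __add__(self, other):
        return Uncertain(lambda: self.sample() + other.sample())

    def __mul__(self, other):
        return Uncertain(lambda: self.sample() * other.sample())

    def summary(self, n=100_000):
        xs = sorted(self.sample() for _ in range(n))
        # median plus a rough 95% interval, read straight off the samples
        return xs[n // 2], (xs[int(0.025 * n)], xs[int(0.975 * n)])

speed    = Uncertain(lambda: random.gauss(10.0, 0.5))      # e.g. m/s from a noisy sensor
duration = Uncertain(lambda: random.uniform(58.0, 62.0))   # e.g. seconds

distance = speed * duration
print(distance.summary())
# Caveat: reusing the same variable twice in one expression resamples it
# independently here; a real Uncertain<T>-style graph has to preserve that
# correlation, which this toy does not.
```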
Or they specify they're paying X per day, but want hourly itemized billing... but it should definitely come out to X per day. (This was one employer, which meant I invoiced them with something like 8 digits of precision due to how it divided, and they refused to accept a line item for mathematical uncertainty aggregates.)
I wonder what it'd look like to propagate this kind of uncertainty around. You might want to check the user's input against a representative distribution to see if it's unusual and, depending on the cost of an error vs the friction of asking, double-check the input.
Floats try to keep the relative error at bay, so their absolute precision varies greatly. You need to sum them starting with the smallest magnitude, and do many other subtle tricks, to limit rounding errors.
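A quick illustration of how much the summation order matters (Python here, but the effect is the same in any IEEE 754 implementation):

```python
import math

big = 1.0e16
parts = [big] + [1.0] * 1000

print(sum(parts))          # the 1.0s fall below big's rounding step and vanish one by one
print(sum(sorted(parts)))  # smallest-first: the 1.0s accumulate to 1000.0 before meeting big
print(math.fsum(parts))    # fsum tracks the lost low-order bits and rounds correctly
```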
> Under the hood, Uncertain<T> models GPS uncertainty using a Rayleigh distribution.
And the Rayleigh distribution is clearly not just an interval with a uniformly random distribution in between. Normal interval arithmetic isn't useful because that uniform random distribution isn't at all a good model for the real world.
Take for example that Boost library you linked. Ask it to compute (-2,2)*(-2,2). It will give (-4,4). A more sensible result might be something like (-2.35, 2.35). The -4 lower bound is only attainable when the multiplicands are exactly -2 and 2, the extremes of the interval; probabilistically, if we assume these are independent random variables, then both achieving their extreme values simultaneously should have an even lower probability.
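You can see the effect with a quick Monte Carlo (toy Python, assuming two independent Uniform(-2, 2) factors):

```python
import random

n = 1_000_000
products = sorted(random.uniform(-2, 2) * random.uniform(-2, 2) for _ in range(n))
# The hard bounds are (-4, 4), but hitting them needs both factors near an
# endpoint at the same time, so the tails of the product are very thin.
for q in (0.001, 0.01, 0.99, 0.999):
    print(f"{q:.1%} quantile ≈ {products[int(q * n)]:.2f}")
```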
When you see y = m * x + b, your recollections of math class may note that you can easily solve for "m" or find a regression for "m" and "b" given various data points. But from a programming perspective, if these are all literal values, all this is is a "render" function. How can you reverse an arbitrary render function?
There are various approaches, depending on how Bayesian you want to be, but they boil down to: if your language supports redefining operators based on the types of the variables, and you have your variables contain a full specification of the subgraphs of computations that lead to them... you can create systems that can simultaneously do "forward passes" by rendering the relationships, and "backward passes" where the system can automatically calculate a gradient/derivative and thus allow a training system to "nudge" the likeliest values of variables in the right direction. By sampling these outputs, in a mathematically sound way, you get the weights that form a model.
Every layer in a deep neural network is specified in this way. Because of the composability of these operations, systems like PyTorch can compile incredibly optimal instructions for any combination of layers you can think of, just by specifying the forward-pass relationships.
So Uncertain<T> is just the tip of the iceberg. I'd recommend that everyone experiment with the idea that a numeric variable might be defined by metadata about its potential values at any given time, and that you can manipulate that metadata as easily as adding `a + b` in your favorite programming language.
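Here's a minimal sketch of what that looks like (a toy scalar autodiff, nothing like a production engine such as PyTorch): the operators record the subgraph on the forward pass, and the backward pass pushes derivatives back to the leaves.

```python
class Var:
    """A scalar that records its computation subgraph so gradients can flow backwards."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def _wrap(self, x):
        return x if isinstance(x, Var) else Var(x)

    def __add__(self, other):
        other = self._wrap(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = self._wrap(other)
        return Var(self.value * other.value, ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        # Fine for a simple expression tree; a real engine sorts the graph
        # topologically so shared nodes aren't visited twice.
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)

m, b, x = Var(2.0), Var(1.0), Var(3.0)
y = m * x + b        # forward pass: just the rendered relationship
y.backward()         # backward pass: dy/dm and dy/db appear on the leaves
print(y.value, m.grad, b.grad)   # 7.0 3.0 1.0
```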
But I'm not so sure about your conclusion: a good enough model could be universally useful. See how everyone uses IEEE 754 floats, despite them giving effectively one very specific model of uncertainty. Most of the time this just works, and sometimes people have to explicitly work around floats' weirdnesses (whether because they carefully planned ahead knowing what they are doing, or because they got a nasty surprise first). But overall they are still useful enough to be used almost universally.
On the CPU you probably get implicit parallel execution with pipelines and re-ordering etc, and on the GPU you can set up something similar.
Aside from the original research paper needing to be included in the repo, it definitely does not need anything more than what's already there. It all builds and compiles without errors, with only 2 warnings for the library proper and 6 warnings for the test project. Oh, and it comes with a unit testing project: 59 tests written that cover about 73% of the library code. Only 2 tests failed.
Even having a unit test project means it beats out like 50% of all repos you see on GitHub.
https://news.ycombinator.com/item?id=28941145 has some discussion here as well, though it’s a few years old.
Pyro and NumPyro seem to be popular at the moment!
If you have an HD map you can solve for it using building shapes or by looking at the street with cameras. WiFi seems like it would help, but the locations of the WiFi terminals are themselves based on crowdsourced GPS.
That's one way to look at it.
Another is that Money is certain only at the point of exchange.
> It appears that a similar approach is implemented in some modern Fortran libraries.
I'd be curious about that. Do you have a link?
With the eventual goal of running various simulations over different randomly generated outcomes based on those probability distributions.
Otherwise, it feels to me that it'd be consistently wrong to model the variables as independent. And any program of notable size is gonna be far too big to consider correlations between all the variables.
As for how one might do the learning, I don't know yet!
Sounds like a classic case of programmers ignoring corner cases: Towing, ferries, car trains, pushing the car because it broke down...
It's when you find messages in the log like "this should never happen".
If you make some assumptions about your error (a popular one is to assume a Gaussian distribution) then you can calculate the error of the result quite elegantly.
It's a nice exercise to write some custom C++ types that hold (value, error) and automatically propagate them as you perform mathematical operations on them.
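Roughly the same exercise sketched in Python rather than C++ (quadrature propagation under the independent-Gaussian assumption; toy code only, and it ignores values near zero for the relative-error step):

```python
import math

class Measured:
    """value ± error, propagated assuming independent Gaussian errors."""
    def __init__(self, value, error):
        self.value, self.error = value, error

    def __add__(self, other):
        # absolute errors add in quadrature
        return Measured(self.value + other.value, math.hypot(self.error, other.error))

    def __mul__(self, other):
        # relative errors add in quadrature (assumes neither value is near zero)
        value = self.value * other.value
        rel = math.hypot(self.error / self.value, other.error / other.value)
        return Measured(value, abs(value) * rel)

    def __repr__(self):
        return f"{self.value:g} ± {self.error:.2g}"

length = Measured(10.0, 0.1)
width  = Measured(4.0, 0.2)
print(length + width)   # 14 ± 0.22
print(length * width)   # 40 ± 2
```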
Unfortunately, in the real world only very few measurements have a Gaussian error distribution, and the bigger problem is systematic (non-random) errors, which are a lot harder to reason about.
So this means that automatically handling error propagation is in most cases pointless, since you need to manually analyze the situation anyway.
A spin-off, Signaloid, is taking this technology to market. I'm also researching using this in state estimation (e.g., particle filters).
In the noisy-sensors case there's some arbitrarily low probability of them being actually super wrong; if you go by true 10^-10 outlier bounds they will be useless for any practical purpose, while the 99% confidence range stays relatively small.
More often you want some other distribution, where (-2, 2) is the 90% interval rather than the absolute bounds: 0 is more likely than -2, and -3 is possible but rare. They aren't bounds; you can ask your model for the 99th or 99.9th percentile value, or whatever tolerance you want, and get something outside of (-2, 2).
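For example, with a plain normal model (Python stdlib; sigma picked here just so the central 90% lands near (-2, 2)):

```python
from statistics import NormalDist

# Choose sigma so the central 90% interval is roughly (-2, 2).
sigma = 2 / NormalDist().inv_cdf(0.95)
model = NormalDist(mu=0.0, sigma=sigma)

print(model.inv_cdf(0.05), model.inv_cdf(0.95))  # ~ -2 and 2: the "90% bounds"
print(model.inv_cdf(0.999))                      # outside (-2, 2), rare but allowed
```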
It's not statistical. If the machinist makes a part that's not within the +/- bounds, they throw it away and start again. If you tried to fit multiple parts, all with only statistical respect for tolerances, you would run into trouble almost 100% of the time with just a few pieces.
The fact that one can play with complicated nested probability distributions that unify these concepts, as one would play with dolls in a dollhouse, is the point!
Doesn't addition as well? Like if you roll d6+d6, the output range is 2-12, but it's not nearly the same as if you rolled d11+1.
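A quick enumeration makes the difference obvious (exact probabilities, nothing clever):

```python
from collections import Counter
from fractions import Fraction

two_d6 = Counter(a + b for a in range(1, 7) for b in range(1, 7))
d11_plus_1 = Counter(r + 1 for r in range(1, 12))

def pmf(counts):
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}

print(pmf(two_d6))      # triangular: P(7) = 1/6, P(2) = P(12) = 1/36
print(pmf(d11_plus_1))  # flat: every result from 2 to 12 has probability 1/11
```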
Another place where I think this would be neat would be in CAD. Imagine if you are trying to create a model of an existing workpiece or of a room, and your measurements don't exactly add up. It's really frustrating and you have to go back and measure again, and you usually end up idealizing the model and putting in rounder numbers to make it fit, but it is less true to reality. It would be cool if you could put in uncertainties for all lengths and angles, and it would run a solver to minimize the total error.
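As a toy of what that solver might look like (made-up numbers, equal uncertainties, plain weighted least squares):

```python
# Two segments a and b measured separately, plus the overall length, and the
# numbers don't add up. Minimize the weighted squared error, weighting each
# measurement by 1/sigma^2.
a_meas, sigma_a = 61.2, 0.3   # cm
b_meas, sigma_b = 38.9, 0.3
t_meas, sigma_t = 99.6, 0.3   # measured total; note 61.2 + 38.9 = 100.1

wa, wb, wt = 1 / sigma_a**2, 1 / sigma_b**2, 1 / sigma_t**2

# Minimizing wa*(a-a_meas)^2 + wb*(b-b_meas)^2 + wt*(a+b-t_meas)^2 is linear
# least squares: set the gradient to zero and solve the resulting 2x2 system.
A11, A12, A22 = wa + wt, wt, wb + wt
r1 = wa * a_meas + wt * t_meas
r2 = wb * b_meas + wt * t_meas
det = A11 * A22 - A12 * A12
a = (r1 * A22 - A12 * r2) / det
b = (A11 * r2 - A12 * r1) / det
print(a, b, a + b)   # the 0.5 cm discrepancy gets spread evenly over the three measurements
```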
If I have a library, though, that lets me add and multiply not just floats but entire computation subgraphs with the same exact + and * operators, I can have the library reverse that function automatically, and say: “optimize the parameters to minimize the difference between the curve and the data points.”
LLMs and other ML systems, to paint with a very broad stroke, solve that problem with billions of parameters in a million-dimensional space. Developing intuition for those high dimensions is hard! But the code is simple because once you’ve done the math for the forward pass, you can go straight from chalkboard to Python code, and the libraries largely assist with reversing and building a GPU-accelerated training process automatically!
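A minimal sketch of that workflow with PyTorch (made-up data; the point is that only the forward pass is written by hand):

```python
import torch

# Made-up noisy data roughly following y = 3x + 2.
xs = torch.linspace(0, 10, 50)
ys = 3 * xs + 2 + 0.5 * torch.randn(50)

m = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([m, b], lr=0.01)

for _ in range(2000):
    opt.zero_grad()
    loss = ((m * xs + b - ys) ** 2).mean()  # forward pass: the relationship, written once
    loss.backward()                         # backward pass: gradients from the recorded graph
    opt.step()

print(m.item(), b.item())   # should land near 3 and 2
```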