A visual introduction to big O notation

(samwho.dev)

688 points samwho | 2 comments | 24 Aug 25 07:39 UTC | HN request time: 0s | source

Show context

IceDane ◴[25 Aug 25 17:59 UTC] No.45016742[source]▶

Every time I see an introduction to Big-O notation, I immediately go look for a section about the worst case to find the most common, fundamental misconception that is present in nearly every such text.

Lo and behold:

> Because it's common for an algorithm's performance to depend not just on the size of the input, but also its arrangement, big O notation always describes the worst-case scenario.

This is not true. I'm on my phone, so I can't be bothered to explain this in more detail, but there is plenty of information about this online.

replies(4): >>45016873 #>>45016884 #>>45016889 #>>45017375 #

bonoboTP ◴[25 Aug 25 18:08 UTC] No.45016884[source]▶

>>45016742 #

Yes. The notation can be used to describe how fast any function grows. That function may be a worst-case runtime wrt. input length, the average-case runtimve wrt. input length, or anything else, really.

But the most common usage is indeed worst-case analysis, especially in intro courses.

This is also wrong in the article:

> You may sometimes see the Greek letter "theta", Θ, instead of the letter O in big O notation. This is used when the best and worst case have the same order of growth. We can't use it for bubble sort because the best case is O(n) and the worst case is O(n^2).

It conflates two different, orthogonal concepts: upper vs lower bounds on the one hand, and best vs worst case analysis on the other.

The phrase "the best case is O(n)" already contradicts "big O notation always describes the worst-case scenario". They clearly use it for best case right there in the article.

replies(2): >>45016968 #>>45016973 #

samwho ◴[25 Aug 25 18:13 UTC] No.45016968[source]▶

>>45016884 #

I’d love to hear how those two things differ, and how I could include it in the post in a way that won’t be too scary for a beginner to the topic.

That section in particular has already seen several revisions based on other peoples’ comments.

replies(1): >>45017154 #

bonoboTP ◴[25 Aug 25 18:25 UTC] No.45017154{3}[source]▶

>>45016968 #

Basically, for each input length, you can have different runtimes depending on the contents of that input. If you always pick the highest runtime for each length, that gives you a specific function. You can then derive either a lower or an upper bound for that function. Same for the best case.

You can state four combinations in their common sense intuition as

- even in the worst case, the runtime grows only at most as fast as n^2 (modulo multiplication and addition)

- in the worst case, the runtime can grow at least as fast as n^2

- in the best case, the runtime can grow only at most as fast as n^2

- even in the best case, the runtime grows at least as fast as n^2

One can imagine it as not just a single curve on a function plot, but a band, a range. Then the top line of this range can be bounded from above or below, and the bottom can also be bounded from above or below. The common case is when you give a bound for the top curve, from above.

How to work this into the article, I don't know, sorry.

replies(1): >>45017216 #

samwho ◴[25 Aug 25 18:30 UTC] No.45017216{4}[source]▶

>>45017154 #

I appreciate you taking the time to explain this to me, thank you.

I feel like the way I have it is a reasonable, if strictly incorrect, simplification for someone new to the material.

When I write these, I have the newly-hired bootcamp grad as my target audience. Someone who has been taught the basics for how to write code for the web and work as a junior within a company, but has no computer science background. I want them to understand enough to make sense of what they see.

replies(1): >>45017465 #

bonoboTP ◴[25 Aug 25 18:47 UTC] No.45017465{5}[source]▶

>>45017216 #

Maybe like this:

"It's common for an algorithm's performance to depend not just on the size of the input, but also its arrangement. In such cases we typically use big O notation to describe the worst-case scenario. However, note that we may perform similar analysis on the best case or the average case as well."

But here:

This is plain false, the worst-case runtime of bubble sort is indeed Θ(n^2). Either delete it, or write about the upper bound / lower bound thing.

replies(1): >>45017565 #

1. samwho ◴[25 Aug 25 18:56 UTC] No.45017565{6}[source]▶

>>45017465 #

I made some wording changes based on what you wrote, I'd appreciate your feedback.

I'm struggling to understand how the part about big theta is false. Is the best case for bubble sort not O(n)? I've deleted the section for the time being.

replies(1): >>45017728 #

2. umanwizard ◴[25 Aug 25 19:11 UTC] No.45017728[source]▶

>>45017565 (TP) #

The difference between big theta and big O has nothing to do with worst vs. average case. They are two completely orthogonal axes.

You can talk about the big-O of the average case, the big-theta of the average case, the big-O of the worst case, or the big-theta of the worst case.

↑