coding === function(){
  return _joy;
}

Big O - Part 2: Asymptotic Notation? What?

This is a planned seven part series taking a journey into Big O.

For this discussion, we will be presenting ideas we learned about Big O. These are our interpretations and attempt at humanizing Big O.

We encourage you to correct us. Let us know where we’ve gone wrong. Letting us know what we got right would be an added bonus.

In the last post, we described our take on Big 0 which essentially boiled down to a way of describing the efficiency of a function as the size of the data increases.

In this post, we'll say that Asymptotic Notation allows us to analyze the running time of an algorithm by identifying its behavior as the input size of the algorithm increases, aka, an algorithm's growth rate.

We bet you're saying to yourself slowly: Yeeeeeaaaaahhhhh… that sure sounds like the same thing as Big O!?!?

Big O vs Asymptotic Notation

On this journey, we've discovered that often times the term Big O is loosely used as a synonym of Asymptotic Notation. We suspect this is because “Big O” just rolls off the tongue and is an oft used subset of Asymptotic Notation to describe the worst-case scenario of an algorithm's growth rate.

Even Wikipedia redirects Asymptotic Notation to Big O, sort of; the redirect is actually to “Big O notation”. This just reaffirms to us that our developer community has generally accepted using the term Big O as a synonym for Asymptotic Notation.

Does Asymptotic Notation Provide any Answers?

Generally, Asymptotic Notation allows us to answer the following question:

As the size of an algorithm's input increases, at what rate does the runtime begin to slow?

Before we can begin to answer the question, there are some basic concepts around Asymptotic Notation that need to be understood.

Basic Concepts of Asymptotic Notation

The most basic of concepts/notations when using Asymptotic Notation:

When the function and the input size are taken together, they will produce a graph where:

Mostly commonly, f(n) is the generic notation for describing:

Common Notations of Asymptotic Notation

The more common Notations for describing the growth rate of f(n):

The notations described above are listed from the Best Case to the Worst Case. When the notations are spoken, the “O” is included, i.e.,

Big O is the Worst of Asymptotic Notation

As developers, we tend to be pessimistic when it comes to measuring the effectiveness of our code. Whether it be for input validation, data sanitizing, or build times, we typically look at the unhappy path, the sad path, the worst case.

In Asymptotic Notation, Big O is the Worst Case scenario for the growth rate of a given function. Recall that the Worst Case is “slowest execution, fastest rate of runtime growth”. For example, function A's worst case may be O(1); while function B's worst case may be O(n); and function Z's worst case may be O(n!).

In fancier Asymptotic Notation words:

Big O is the Asymptotic Upper Bound for the growth rate of a function.

This means that the growth rate may be slower, but it will never be greater than the Upper Bound.

The concepts of Upper Bound and Lower Bound become much more relevant the deeper we dive into Asymptotic Notation, but those are reserved for the upcoming posts in this series.

The Irrelevancy of Constants and Insignificant Terms

One of the interesting aspects of Asymptotic Notation is that insignificant terms and constant values are discarded when determining the growth rate of a function. As n increases, the contributing terms to the growth rate will become insignificant due to their contribution to the overall rate of growth being overshadowed by the more dominant terms. The concept behind this is fairly simple once the larger picture is observed.

If the formula derived for calculating the grown rate is: f(n) = 5n^2 + 100n + 300

Then we can describe the terms as:

Everything but the dominant term is insignificant. Why? This is best explained with a table demonstrating the growth of the input as a ratio of the values. The ratio is calculated by dividing the “full term” column's value by the “coefficient term” column's value.

n coefficient term
5n^2
full term
5n^2 + 100n + 300
ratio
full term / coefficient term
1 5 405 81.00000000000
10 500 1,800 3.60000000000
100 50,000 60,300 1.20600000000
1,000 5,000,000 5,100,300 1.02006000000
10,000 500,000,000 501,000,300 1.00200060000
100,000 50,000,000,000 50,010,000,300 1.00020000600
1,000,000 5,000,000,000,000 5,000,100,000,300 1.00002000006
10,000,000 500,000,000,000,000 500,001,000,000,300 1.00000200000

As the ratio converges towards 1.0:

Finding the Big O of a Method

Now let's look at a basic search method and determine its asymptotic upper bound, or Big O:

public Item BasicSearch(int id)
{
    foreach( Item itm in Items )
    {
        if( itm.Id == id )
        {
            return itm;
        }
    }
    
    return null;
}

So what's the worst case here? We don't find anything in the list and return null. This also means that we are going to run the inner code block of the foreach loop Items.Count times. So n can describe Items.Count. Now let's say running that inner code block once takes 5 milliseconds; running through the entire Items list would take 5 milliseconds each time for n times, or 5n milliseconds. Next, since we are running worst case, we will hit return null. Let's say that takes 2 milliseconds, and since it will run only once we can simply add the constant value 2 to our formula.

So now we are looking at the formula: f(n) = 5n + 2

In the above section The Irrelevancy of Constants and Insignificant Terms, we pointed out that constants and coefficients are irrelevant when talking about the asymptotic growth of a function. Therefore we can simply remove these values, and we are left with O(n).

Asymptotic Noise

We've purposely avoided going in-depth into calculating and plotting growth rates. It's just noise. Asymptotic Noise.

For now, we're more interested in understanding what it means when someone says: This function sucks because it's clearly “Ohhh Factorial”. Now we know that this person is stating: This function sucks because it will be the fastest growing with the slowest execution time.

As opposed to someone saying: This function is pretty good because it's clearly “Ohhh log n”. Now we know that the person is saying the only thing that could make this function better would be if it could magically transform into “Ohhh one”.

With Joy!