Why is your code slow (or fast)?
Imagine you need to sort a billion names for a search engine. There are dozens of ways to do it. Each one works. But some finish in seconds and others take hours. Algorithm analysis helps you compare them.
We care most about running time. Memory matters too, but time is what makes or breaks a program. An algorithm that takes too long is useless, no matter how much memory you have.
But it runs fast on my machine!
You could run two algorithms on your laptop and compare. But that result means nothing outside your machine. A fast algorithm on a high-end server might crawl on a budget PC. Hardware varies. Programming languages vary. Even coding style affects execution time.
We need a measure that is tied to the algorithm itself, not the machine running it.
That measure is input size: n. It is the number of elements you are working with. The length of an array. The nodes in a graph. The bits in a number. Bigger n means more work. If you express running time as a function of n, you get a machine-independent measure. Call it T(n).
What is in T(n)?
T(n) counts operations. Basic steps like adding numbers, comparing values, or reading memory.
Suppose you are summing a list of n numbers. You loop through each one and add it. That is roughly n operations. So T(n) = n.
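A minimal sketch of this count in Python (the function name and explicit counter are illustrative, not part of any library):

```python
def sum_with_count(numbers):
    """Sum a list while counting the additions performed."""
    total = 0
    additions = 0
    for x in numbers:
        total += x       # one basic operation per element
        additions += 1
    return total, additions

# For a list of n elements there are n additions: T(n) = n.
total, ops = sum_with_count([3, 1, 4, 1, 5])
print(total, ops)  # 5 elements -> 5 additions
```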
Now imagine sorting that list with Bubble Sort. You compare pairs of numbers in nested loops. That leads to roughly n² comparisons, plus some setup work. So T(n) = n² + n + 1.
Each term reflects a part of the algorithm. The n² comes from the nested loops. The n might be a single loop for swaps. The 1 is fixed setup work.
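The same counting idea, sketched for Bubble Sort (an instrumented toy version, not a library routine): the nested loops make n(n − 1)/2 ≈ n²/2 comparisons, which is where the quadratic term comes from.

```python
def bubble_sort_with_count(items):
    """Bubble Sort that also counts comparisons made."""
    a = list(items)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):            # outer loop
        for j in range(n - 1 - i):    # inner loop: the nested pair
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a, comparisons

# n = 10 elements gives 10 * 9 / 2 = 45 comparisons.
result, count = bubble_sort_with_count([5, 2, 9, 1, 7, 3, 8, 4, 6, 0])
print(count)  # prints 45
```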
We count these operations on an idealized machine where each step takes the same time. This removes real-world noise like slow CPUs or cache delays. The focus stays on the algorithm's logic.
Inputs are not always the same
The same algorithm can behave very differently depending on the input.
A sorted list might let a sorting algorithm finish early. A reverse-sorted list forces it to do every step. That is why we analyze three cases:
- Best Case: The luckiest input. Linear search finds the item first. That is 1 step.
- Worst Case: The hardest input. Linear search checks every element. That is n steps.
- Average Case: A typical random input. Linear search checks about half the list. That is roughly n/2 steps.
The worst case is the most useful. It guarantees performance no matter what input arrives. The average case predicts real-world behavior but requires assumptions about input patterns. The best case is rarely the one you plan for.
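The three cases can be observed directly by instrumenting linear search (a sketch; the names are mine):

```python
def linear_search_steps(items, target):
    """Return (index or -1, number of comparisons made)."""
    steps = 0
    for i, x in enumerate(items):
        steps += 1
        if x == target:
            return i, steps
    return -1, steps

data = list(range(100))
print(linear_search_steps(data, 0))    # best case: first element, 1 step
print(linear_search_steps(data, 99))   # worst case: last element, n steps
print(linear_search_steps(data, 500))  # worst case: missing, n steps
```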
Simplifying
For large inputs, not all terms in T(n) matter equally.
Take T(n) = n² + 4n + 6. At n = 1,000,000, the n² term is a trillion. The 4n term is four million. The constant 6 is negligible. The n² term dominates everything else.
This is the idea of rate of growth. It describes how running time scales as input grows. Computer scientists call this time complexity.
For large n: T(n) = n² + 4n + 6 ≈ n².
We focus on the dominant term because it determines the algorithm's behavior at scale.
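A quick numeric check of the dominance claim, taking T(n) = n² + 4n + 6 as the concrete running example:

```python
# Share of T(n) = n^2 + 4n + 6 contributed by the n^2 term as n grows.
for n in (10, 1_000, 1_000_000):
    quadratic, linear, constant = n * n, 4 * n, 6
    share = quadratic / (quadratic + linear + constant)
    print(f"n = {n:>9,}: n^2 term is {share:.4%} of T(n)")
```

Already at n = 1,000 the quadratic term accounts for over 99% of the total, which is why the smaller terms can be dropped.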
Asymptotic notation
We use asymptotic notation to describe rate of growth cleanly, without carrying all the terms.
A function f(n) is asymptotic to g(n), written f(n) ~ g(n), if:
lim (n → ∞) f(n) / g(n) = 1
This means f(n) and g(n) grow at the same rate for very large n. We use g(n) as a simpler approximation of f(n).
For T(n) = n² + 4n + 6, a natural choice is g(n) = n². But we could also pick n³, which grows faster and gives an upper estimate. Or n, which grows slower and gives a lower estimate.
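These candidates can be compared by watching the ratio T(n)/g(n) as n grows (a sketch, assuming T(n) = n² + 4n + 6 as the running example):

```python
def T(n):
    return n * n + 4 * n + 6  # assumed running example

for n in (10, 1_000, 1_000_000):
    # ratio -> 1 for g = n^2, -> 0 for g = n^3, grows without bound for g = n
    print(n, T(n) / n**2, T(n) / n**3, T(n) / n)
```

Only g(n) = n² drives the ratio toward 1; n³ overshoots and n undershoots.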
These choices lead to three notations: Big-O, Big-Ω, and Big-Θ.
Big-O: the upper bound
Big-O sets a ceiling. It says the function will never grow faster than this.
Formally, f(n) = O(g(n)) if there exist constants c > 0 and n₀ ≥ 1 such that:
f(n) ≤ c · g(n) for all n ≥ n₀
There are infinitely many valid choices for g(n). If f(n) = O(n²), then f(n) = O(n³) is valid. So is O(n⁴) or O(n⁵). All are correct upper bounds. In practice, we pick the tightest one.
For T(n) = n² + 4n + 6, the tightest choice is O(n²). The smaller terms are dropped. Big-O answers the question: how bad can it get?
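Concretely, one valid witness pair (my choice; many others work) is c = 2 and n₀ = 6, since n² ≥ 4n + 6 once n ≥ 6:

```python
def T(n):
    return n * n + 4 * n + 6  # assumed running example

c, n0 = 2, 6  # one valid (c, n0) pair among infinitely many
assert all(T(n) <= c * n * n for n in range(n0, 10_000))
print(T(5), c * 5 * 5)  # prints 51 50: the bound only kicks in at n0
```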
Big-Ω: the lower bound
Big-Ω sets a floor. It says the function will never grow slower than this.
Formally, f(n) = Ω(g(n)) if there exist constants c > 0 and n₀ ≥ 1 such that:
f(n) ≥ c · g(n) for all n ≥ n₀
Again, there are infinitely many valid choices. If f(n) = Ω(n²), then f(n) = Ω(n) is valid. So is Ω(log n) or Ω(1). In practice, we pick the largest g(n) that still stays below f(n).
For T(n) = n² + 4n + 6, the tightest choice is Ω(n²). Big-Ω answers the question: how fast can it get?
Big-Θ: the tight bound
Big-O gives a ceiling. Big-Ω gives a floor. Big-Θ gives both at once.
Formally, f(n) = Θ(g(n)) if there exist constants c₁ > 0, c₂ > 0, and n₀ ≥ 1 such that:
c₁ · g(n) ≤ f(n) ≤ c₂ · g(n) for all n ≥ n₀
f(n) is squeezed between two multiples of g(n). It grows exactly like g(n), up to constant factors.
For T(n) = n² + 4n + 6:
- Upper bound: T(n) = O(n²)
- Lower bound: T(n) = Ω(n²)
- Together: T(n) = Θ(n²)
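The squeeze can be checked numerically with one witness triple, c₁ = 1, c₂ = 2, n₀ = 6 (my choice), assuming T(n) = n² + 4n + 6 as the running example:

```python
def T(n):
    return n * n + 4 * n + 6  # assumed running example

c1, c2, n0 = 1, 2, 6  # witness constants for the Theta bound
for n in range(n0, 10_000):
    assert c1 * n * n <= T(n) <= c2 * n * n
print("squeeze holds for all tested n >= n0")
```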
Putting it all together
Here is how best, worst, and average cases look for linear search.
Worst case (T(n) = n, item is last or missing):
- O(n): at most linear time.
- Ω(n): at least linear time.
- Θ(n): exactly linear time.
Best case (T(n) = 1, item is first):
- O(1): at most constant time.
- Ω(1): at least constant time.
- Θ(1): exactly constant time.
Average case (T(n) = n/2, random input):
- O(n): at most linear time.
- Ω(n): at least linear time.
- Θ(n): exactly linear time.
Does O(n) mean worst case?
Not automatically. O(n) says the running time grows no faster than linear. It says nothing about which case you are analyzing.
For linear search:
- Worst case: T(n) = n, so O(n).
- Best case: T(n) = 1, so O(1).
Both are Big-O notation. They describe different cases. Whenever you see O(...), ask: best, worst, or average?
So,
- Big-O gives an upper bound. It answers: how bad can it get?
- Big-Ω gives a lower bound. It answers: how fast can it get?
- Big-Θ gives a tight bound. It answers: what is the exact growth rate?
- Case analysis and asymptotic notation are independent. You can apply any notation to any case.
- In practice, we focus on worst-case Big-O. It guarantees performance under the hardest conditions.