Glossary
- Time Complexity - A measure of the running time of an algorithm
- Space Complexity - A measure of the storage taken up by an algorithm
Now that we know how to generally define an algorithm, let's talk about what it means to have a good algorithm. An important thing to keep in mind is that there's no one single way to measure whether an algorithm is the right solution, because it is all about context. Earlier we touched on two concepts, correctness and efficiency. Let's define correctness more clearly, because before we can evaluate an algorithm on efficiency, we need to ensure its correctness.

Before we define our algorithms, we start by defining our problem. In the definition of that problem, we have a clearly defined input, satisfying any preconditions, and a clearly defined output. An algorithm is deemed correct if, on every run of the algorithm against all possible values in the input data, we always get the output we expect. Part of correctness also means that for any possible input, the algorithm should always terminate, or end. If either of these is not true, then our algorithm isn't correct.
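To make that concrete, here is a minimal sketch, assuming a small, fully enumerable input space (the function and values are ours, not from the course): we run the algorithm against every possible input and assert that it produces the expected output, and because the loop always finishes, it always terminates.

```python
def index_of(items, target):
    """Return the index of target in items, or None if it is absent."""
    for i, item in enumerate(items):
        if item == target:
            return i       # clearly defined output when the target exists
    return None            # clearly defined output when it does not

# Brute-force correctness check: every possible input yields the expected
# output, and the search always terminates.
values = [10, 20, 30, 40]
for position, value in enumerate(values):
    assert index_of(values, value) == position
assert index_of(values, 99) is None   # absent targets are handled too
```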
If you were to pick up an algorithms textbook and look up correctness, you would run into a bunch of mathematical theory. This is because, traditionally, algorithm correctness is proved by mathematical induction, which is a form of reasoning used in mathematics to verify that a statement is correct. This approach involves writing what is called a specification and a correctness proof. We won't be going into that in this course. Proof through induction is an important part of designing algorithms, but we're confident that you can understand algorithms, both in terms of how and when to use them, without getting into the math. So if you pick up a textbook and feel daunted, don't worry, I do too. But we can still figure things out without it.
All right, so once we have a correct algorithm, we can start to talk about how efficient that algorithm is. Remember that this efficiency ultimately matters because it helps us solve problems faster and deliver a better end-user experience in a variety of fields. For example, algorithms are used in the sequencing of DNA, and more efficient sequencing algorithms allow us to research and understand diseases better and faster. But let's not get ahead of ourselves. We'll start simple, by evaluating John's linear search algorithm in terms of its efficiency.

First, what do we mean by efficiency? There are two measures of efficiency when it comes to algorithms: time and space. Sounds really cool and very sci-fi, right? Efficiency measured by time, something you'll hear called time complexity, is a measure of how long it takes the algorithm to run. Time complexity can be understood generally, outside the context of coding and computers, because how long it takes to complete a job is a universal measure of efficiency: the less time you take, the more efficient you are.

The second measure of efficiency is called space complexity, and this is pretty computer-specific. It deals with the amount of memory taken up on the computer. Good algorithms need to balance between these two measures to be useful. For example, you can have a blazingly fast algorithm, but it might not matter if the algorithm consumes more memory than you have available.
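As a rough illustration of that balance (this example is ours, not from the course): a plain Python list stores a million values compactly but answers membership questions by scanning, while building an extra hash-based set spends more memory to answer the same questions far faster.

```python
import sys
import timeit

values = list(range(1_000_000))
index = set(values)   # extra structure bought with extra memory

# Space: the set's hash table takes more bytes than the list object.
print("list bytes:", sys.getsizeof(values))
print("set bytes: ", sys.getsizeof(index))

# Time: worst-case membership test (the last value) on each structure.
print("list search:", timeit.timeit(lambda: 999_999 in values, number=10))
print("set search: ", timeit.timeit(lambda: 999_999 in index, number=10))
```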
Both of these concepts, time and space complexity, are measured using the same metric. But it is a very technical sounding metric, so let's build up to it slowly and start simple.

A few videos ago, I played a game with Britney and John where they tried to guess the number I was thinking of. Effectively, they were searching for a value. So how do we figure out how efficient each algorithm is, and which algorithm was more suited to our purposes? If we consider the number of tries they took to guess, or search for, the value as an indicator of the time they take to run through the exercise, this is a good indicator of how long the algorithm runs for a given set of values. This measurement is called the running time of an algorithm, and we'll use it to define time complexity.

In the game, we played four rounds. Let's recap those here, focusing on John's performance. In round 1, we had 10 values, the target was 3, and John took 3 turns. In round 2, we had 10 values, the target was 10, and John took 10 turns. In round 3, we had 100 values, the target was 5, and John took 5 tries. And finally, in round 4, when the target was 100, given 100 values, John took 100 tries.
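Those four rounds are easy to re-create in code. This hypothetical sketch counts how many guesses John's strategy, guess 1, then 2, then 3, and so on (a linear search), takes to land on each target; the names are illustrative.

```python
def linear_guesses(n, target):
    """Count the guesses a 1-through-n linear search needs to hit target."""
    for tries, guess in enumerate(range(1, n + 1), start=1):
        if guess == target:
            return tries

# The four rounds recapped above: (number of values, target).
for n, target in [(10, 3), (10, 10), (100, 5), (100, 100)]:
    print(f"n = {n:>3}, target = {target:>3}: {linear_guesses(n, target)} tries")
# Prints 3, 10, 5, and 100 tries, matching the recap.
```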
On paper, it's hard to gauge anything about this performance. When it comes to anything with numbers, though, I like to put it up on a graph and compare visually. On the vertical, or y, axis, let's measure the number of tries it took John to guess the answer, or the running time of the algorithm. On the horizontal, or x, axis, what do we put? For each turn, we have a number of values as well as a target value. We could plot the target value on the horizontal axis, but that leaves some context and meaning behind. It's far more impressive that John took 5 tries when the range went up to 100 than when he took 3 tries for a maximum of 10 values. We could plot the maximum range of values, but then we're leaving out the other half of the picture.

There are data points, however, that satisfy both requirements. If we only plot the values where the target, the number John was looking for, was the same as the maximum range of values, we have a data point that includes both the size of the dataset as well as his effort.

There's an additional benefit to this approach as well. There are three ways we can measure how well John does or, in general, how well any algorithm does. First, we can check how well John does in the best case, or good scenarios from the perspective of his strategy. In the range of 100 values, the answer being a low number, like 3 at the start of the range, is a good scenario. He can guess it fairly quickly. One is his best-case scenario.

Or we could check how well he does on average. We could run this game a bunch of times and average out the running time. This would give us a much better picture of John's performance over time, but our estimates would be too high if the value he was searching for was at the start of the range, or far too low if it was at the end of the range. Let's imagine a scenario where Facebook naively implements linear search when finding friends. They looked at the latest US census, saw that 50% of names start with the letters A through J, which is the first 40% of the alphabet, and thought, okay, on average, linear search serves us well. But what about the rest of those whose names start with a letter after J in the alphabet? Searching for my name would take longer than the average, and much longer for someone whose name starts with the letter Z. So while measuring the running time of an algorithm on average might seem like a good strategy, it won't necessarily provide an accurate picture.

By picking the maximum in the range, we're measuring how our algorithm does in the worst-case scenario. Analyzing the worst-case scenario is quite useful because it indicates that the algorithm will never perform worse than we expect. There's no room for surprises.
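Here is a hedged sketch of all three measures for linear search over a range of 100 values, computed by trying every possible target once (the helper name is ours):

```python
def count_tries(n, target):
    """Guesses a left-to-right linear search needs to find target in 1..n."""
    for tries, guess in enumerate(range(1, n + 1), start=1):
        if guess == target:
            return tries

n = 100
all_tries = [count_tries(n, target) for target in range(1, n + 1)]
print("best case:   ", min(all_tries))                    # 1 try
print("average case:", sum(all_tries) / len(all_tries))   # 50.5 tries
print("worst case:  ", max(all_tries))                    # 100 tries
```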
Back to our graph: we're going to plot the number of tries, a proxy for the running time of the algorithm, against the number of values in the range, which we'll shorten to n. n here also represents John's worst-case scenario. When n is 10, he takes 10 turns. When n is 100, he takes 100 turns. But these two values alone are insufficient to really get any sort of visual understanding. Moreover, it's not realistic. John may take a long time to work through 100 numbers, but a computer can do that in no time. To evaluate the performance of linear search in the context of a computer, we should probably throw some harder and larger ranges of values at it.

The nice thing is, by evaluating a worst-case scenario, we don't actually have to do that work. We know what the result will be: for a given value of n, using linear search, it will take n tries to find the value in the worst-case scenario.
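If you do want to throw larger ranges at a computer, a rough sketch like this one (ours, with illustrative sizes) shows the worst-case running time growing in proportion to n; the exact timings will vary by machine.

```python
import time

def linear_search(items, target):
    """Return the index of target in items, or None if it isn't there."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return None

for n in (1_000_000, 2_000_000, 4_000_000):
    data = list(range(n))
    start = time.perf_counter()
    linear_search(data, n - 1)          # worst case: target is the last item
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9,}: {elapsed:.4f} s")
```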
So let's add a few values in here to build out this graph. Okay, so we have a good picture of what this is starting to look like. As the values get really large, the running time of the algorithm gets large as well. Well, we sort of already knew that. Before we dig into this running time any deeper, let's switch tracks and evaluate Britney's work. By having something to compare against, it should become easier to build a mental model around time complexity.