Heads up! To view this whole video, sign in with your Courses account or enroll in your free 7-day trial. Sign In Enroll
Well done!
You have completed Preparing Data for Analysis!
You have completed Preparing Data for Analysis!
Preview
Clean the rest of the dataset using tools in Google Sheets.
Update
- "Trim Whitespace" in Google Sheets can now be found under
Data > Data Cleanup > Trim Whitespace
Resource
- Contains Text Custom Formula
=istext(B:B)
Related Discussions
Have questions about this video? Start a discussion with the community and Treehouse staff.
Sign upRelated Discussions
Have questions about this video? Start a discussion with the community and Treehouse staff.
Sign up
Welcome back.
0:00
Next, let's deal with formatting issues.
0:01
I'm going to handle the whitespace first,
0:04
since Google Sheets has a helpful
built in feature for this, too.
0:06
Let's select all of the data and
go to Data > Trim whitespace.
0:11
And you can keep an eye
on this cell right here.
0:16
We have some whitespace at the beginning.
0:19
And if I click it,
you can see that whitespace is gone.
0:22
And it also tells you how many
cells it found whitespace for,
0:25
just in case you need that information,
but that's it, super easy.
0:29
Next, let's handle spelling mistakes.
0:34
We can use Google's built in feature for
this, too, Tools > Spelling > Spellcheck.
0:37
And you see we have some Posions
[LAUGH] instead of Poison, so Change.
0:45
Change, And that's it.
0:53
But if I scroll up to the top,
0:58
you can see I still have Ie instead
of Ice, and Gras instead of Grass.
1:00
So what I'm going to do is I'm going
to use Data > Create a filter,
1:05
and I'm gonna use filters to help
us find these spelling mistakes.
1:10
So you can see it gives us these little
icons here at the top of each column now.
1:15
Okay, I'm gonna click on this one for
type.
1:19
And you can see it just has all
of these as an aggregated list of
1:21
all of the different types.
1:26
So there could be ten different
Pokemon that are just Bug type,
1:28
it's only gonna show it once.
1:32
So it kind of groups them all together for
us.
1:34
So I'm gonna scroll through,
And yep, there's one.
1:37
That's definitely supposed to be Fire and
not Fie.
1:43
And we keep scrolling through, And
1:46
see if we find any other
spelling mistakes.
1:50
And then we have to scroll, there we go.
1:58
On the big side, hit OK.
2:01
And it finds that cell for us,
2:03
Fire, and there we go.
2:08
Now staring at it, that's making me
think that's not how you spell fire, but
2:12
that is correct.
2:15
That is fire.
2:16
It's one of those times where
when you spell something,
2:17
So many times when you see it misspelled.
2:25
We actually need to Select all.
2:29
There we go, now they're all back.
2:30
Okay, Weaknesses,
we're gonna do the same thing.
2:33
So I'm gonna hit Clear.
2:35
Got one here.
2:38
And one here, right away, you can see they
should all be under this same category.
2:39
And let's see what else we find.
2:46
There's a Gras one There's a Flying one.
2:57
I feel like there's one more.
3:12
I think I missed it, so
I'm gonna scroll back through.
3:15
Since there are a few more in
this one than in the other one.
3:21
We have a Ie instead of an Ice.
3:26
Just a quick pull through, and
I believe that's all of them.
3:33
So then scroll on the outer scroll bar,
3:36
cuz I know sometimes you
kind of get stuck in here.
3:39
But make sure you scroll on
the outer scroll bar to hit OK.
3:42
And now we have all our errors,
we can fix them.
3:45
Ice, Grass,
3:48
Ice, Ice, and,
3:56
Come on, click in, there we go, Flying.
4:02
Perfect, all fixed, Select all.
4:06
Okay, Our last formatting
error is the dashes
4:10
instead of commas in
the Weaknesses column.
4:15
Conditional formatting
to the rescue again.
4:18
Format > Conditional formatting,
add another rule, and
4:21
then we can do Text contains,
and put in our dash.
4:26
And then I'm gonna do
this a different color.
4:30
Let's do it like a purple.
4:31
Done, close that out.
4:33
And now we can scroll through and
find our dashes.
4:36
Here's one.
4:48
This is what makes
conditional formatting nice,
4:52
it's super easy to scroll through and
find them really quick.
4:54
Make sure I don't accidentally add in any
spelling mistakes while we're doing this.
5:00
Perfect, and I think that's all of them,
but I'm gonna scroll through just in case.
5:15
Okay, awesome,
let's tackle the type issues next.
5:20
We're gonna select these two number
columns, and I'm gonna do Cmd, or
5:25
Ctrl + Up Arrow,
to get us back to the top.
5:29
Select our two number categories,
or columns,
5:31
Conditional formatting,
we're gonna add another rule.
5:34
And this is going to be
another custom function.
5:38
So scroll all the way to the bottom,
or Custom formula,
5:41
Custom formula, and
this is going to be =istext(B:B),
5:46
because that's the start.
5:52
That's our column start, is B.
5:55
And that's gonna find any
item that contains text.
5:57
So I'm gonna give this like an orange.
6:02
Let's do Done.
6:04
Okay, so if I scroll through,
you can see it found pounds, inches.
6:07
This one's numbers, so
6:14
anything that's gonna force these to
turn into text instead of numbers or
6:15
integers, this is going to pop them out
for you with, conditional formatting.
6:20
So we can go through and just remove that,
6:24
remove that, that should be 51.
6:29
Gonna keep scrolling,
6:32
39, 66.1,
6:39
59, And
6:49
perfect, that's all of our type issues.
6:53
Lastly, there's some nonsensical
data in this data set as well.
6:58
We come back up to the top, reselect these
two columns, we're gonna add another rule.
7:03
We're gonna do, Less than or equal to,
7:08
because height and weight,
the two of these and
7:12
things like age should never be 0 or less.
7:17
So we want to make sure that
we catch all those errors.
7:22
So I'm gonna make this
like a maroon color.
7:26
Done, now we can scroll through and
see if we find any.
7:30
Here's one, and for speediness,
if you look up Alakazam and
7:38
you look up the weight,
this weight is correct.
7:43
It's just accidentally negative
just to save us some time.
7:46
So there's one, and
then there's another one.
7:50
Lickitung, same thing,
actually looked it up earlier and
7:54
you can see it's 144.4 pounds, 144.4,
so it's just accidentally negative.
7:58
Finish scrolling just to make sure we
don't have any more, and that's it.
8:05
There you go, the data set is all
clean and ready for analysis.
8:11
You need to sign up for Treehouse in order to download course files.
Sign upYou need to sign up for Treehouse in order to set up Workspace
Sign up