Jason Drummond

Apr 30, 2021

test post

test post

1 min read


Apr 22, 2021

Categorical Data

One of the biggest and most important steps in almost any Data Science project is the data preparation phase, which includes loading and cleaning our data. When we begin to clean our data, we will most likely encounter two different types of data. The first of these data types is…

6 min read


Apr 20, 2021

SQL Series: Subqueries

A subquery in SQL is a query that is nested inside another query. This simply means that we will have an additional SELECT statement, contained inside parentheses, surrounded by another complete SQL statement. Many times, in order to retrieve the information that we want, we will have to perform some…

5 min read
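
A minimal sketch of the nested SELECT described in the teaser above, run through Python's built-in sqlite3 module. The employees table, its columns, and its rows are hypothetical, made up purely for illustration; they are not taken from the article.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 50000), ("Ben", 70000), ("Cal", 90000)],
)

# The inner SELECT (inside parentheses) computes the average salary;
# the outer SELECT keeps only the employees paid more than that average.
rows = conn.execute(
    """
    SELECT name
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
    """
).fetchall()

# Each row is a one-element tuple, so unpack it into a plain list of names.
above_average = [name for (name,) in rows]
print(above_average)
conn.close()
```

The subquery runs first and returns a single value, which the outer query then uses in its WHERE clause as if it were a constant.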


Apr 13, 2021

Data Collection

We are generating huge amounts of data on a daily basis simply by using our phones, playing a YouTube video, or paying for a meal. The companies that provide these services to us collect this data internally and use it to help them make data-driven decisions…

5 min read


Apr 6, 2021

Python: for loops

In Python we use a for loop to iterate over a sequence or other iterable; generally this is a list, dictionary, tuple, set, or string. Traditionally it is used when we have a block of code which we want to repeat a certain number of times. For loops are extremely valuable to…

5 min read
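
As a rough sketch of the loops described in the teaser above, the snippet below iterates over a list, a dictionary, and a string, then repeats a block a fixed number of times with range(). All variable names and sample values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, for illustration only.
fruits = ["apple", "banana", "cherry"]      # a list
prices = {"apple": 0.5, "banana": 0.25}     # a dictionary

# Loop over a list: the body runs once per element.
upper = []
for fruit in fruits:
    upper.append(fruit.upper())

# Loop over a dictionary's key/value pairs with .items().
labels = []
for name, price in prices.items():
    labels.append(f"{name}: {price}")

# Loop over a string, one character at a time.
vowel_count = 0
for ch in "banana":
    if ch in "aeiou":
        vowel_count += 1

# Repeat a block a fixed number of times with range().
total = 0
for i in range(5):  # i takes the values 0, 1, 2, 3, 4
    total += i

print(upper, labels, vowel_count, total)
```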


Mar 30, 2021

Confusion Matrix NOT Confusing Matrix

Confusion matrices are a useful tool for evaluating binary classification models. They can even be extended to multiclass classification problems, but we will stick to binary cases for this article. The matrix is a way for us to see our model’s predicted classes vs. the actual outcomes. It’s called a…

4 min read
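
To make the predicted-versus-actual comparison in the teaser above concrete, here is a small sketch that tallies a binary confusion matrix in plain Python. The label lists are hypothetical, and the layout (rows for actual classes, columns for predicted classes, negative class first) is one common convention assumed here, not necessarily the one the article uses.

```python
# Hypothetical labels: 1 = positive class, 0 = negative class.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each of the four possible (actual, predicted) outcomes.
tp = fp = fn = tn = 0
for a, p in zip(actual, predicted):
    if a == 1 and p == 1:
        tp += 1  # true positive
    elif a == 0 and p == 1:
        fp += 1  # false positive
    elif a == 1 and p == 0:
        fn += 1  # false negative
    else:
        tn += 1  # true negative

# Assumed layout: rows are actual classes, columns are predicted classes.
matrix = [
    [tn, fp],  # actual negative
    [fn, tp],  # actual positive
]

# Accuracy is the share of predictions on the main diagonal.
accuracy = (tp + tn) / len(actual)
print(matrix, accuracy)
```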


Mar 23, 2021

Ventures into Visualizations: Seaborn

Today we will be continuing our series on visualization tools, specifically focusing on the Seaborn library. As a reminder, we went over the basics of the Matplotlib library in our previous article; a link to it will be provided at the bottom of the page. …

4 min read


Mar 15, 2021

SQL Series: CASE

This is the fifth part in my series teaching the basics of SQL. As a reminder, in parts 1–4 we went over a few of the most common keywords used to query a database. We will be utilizing many of these keywords that we have gone over…

4 min read


Mar 8, 2021

Aggregating Data with Pandas

We have seen the importance of the pandas library and its impact on data manipulation and visualization for Data Scientists. We can also use pandas to explore our dataframes by grouping them by certain variables and then computing summary statistics on each group. In this blog we will go over aggregating…

5 min read


Mar 2, 2021

Data Transformations with Pandas

Pandas is an immensely useful Python package used mainly for data manipulation, though it can also be used for data visualization. As we have talked about in a previous article, “A Pandas Primer”, Pandas is built upon two essential packages, NumPy and Matplotlib. NumPy allows for the easy and quick manipulation…

5 min read
