Intermediate SQL Commands: CASE, Null Handling, and Copying Tables
Enhancing Your SQL Skills: Exploring Intermediate Commands and Null Handling
In the previous articles (start the series from here) of this series on Structured Query Language (SQL), we have successfully learned how to create and populate database, along with how we can do simple analysis by grouping the data into different clusters to how we can join different tables. Before moving on to the advanced stuff, we need to cover some intermediate commands used in SQL quite often. This article will focus on these commands which include CASE
, SELECT INTO
etc.
Note: All the code for this article series is available on my github here
Recap (Table Structure):
Our database from previous articles contain two tables YoutubeChannel
and ChannelStats
. The YoutubeChannel
table includes columns for ChannelID
, ChannelName
, Category
and DateCreated
, while the ChannelStats
table features columns for ChannelID
, SubscribersCount
, TotalViews
, and Date
.
CASE Statement
The CASE
statement in SQL operates like a series of IF...THEN...ELSE
statements in programming languages. It allows us to create different outputs based on certain conditions. For example, we can use CASE
to categorize channels based on their SubscribersCount
.
For instance, let's say we want to categorize our channels into 'Small', 'Medium', and 'Large' based on the number of subscribers. We could use the CASE
statement like this:
SELECT ChannelName,
CASE
WHEN SubscribersCount < 10000 THEN 'Small'
WHEN SubscribersCount BETWEEN 10000 AND 100000 THEN 'Medium'
ELSE 'Large'
END as ChannelSize
FROM YoutubeChannel;
In the above query, we first select the ChannelName
from our YoutubeChannel
table. Then we set up our CASE
statement. If the SubscribersCount
is less than 10,000, we label the channel as 'Small'. If it's between 10,000 and 100,000, we label it 'Medium'. Any channel with a SubscribersCount
over 100,000 we label as 'Large'. The result of this CASE
statement is then stored in a new column called ChannelSize
.
INSERT INTO SELECT
The INSERT INTO SELECT
statement is utilized when we want to insert data from one table into another existing table. Note that in this instance, we need to create the new table first and then use this command. For instance, if we need a separate table for 'Large' channels and we want to add more channels to it from our YoutubeChannel
table, we could use INSERT INTO SELECT
like so:
CREATE TABLE LargeChannels (
ChannelName varchar(255),
SubscribersCount int,
TotalViews int
)
INSERT INTO LargeChannels (ChannelName, SubscribersCount, TotalViews)
SELECT ChannelName, SubscribersCount, TotalViews
FROM YoutubeChannel
JOIN ChannelStats
ON YoutubeChannel.ChannelID = ChannelStats.ChannelID
WHERE SubscribersCount > 100000;
In this query, we are selecting channels from the YoutubeChannel
table where the SubscribersCount
is more than 100,000 and inserting this data into the LargeChannels
table.
Selecting the data from LargeChannels
results in the following:
SELECT INTO Statement
The SELECT INTO
statement copies data from one table into a new table i.e., it combines the CREATE TABLE
and INSERT INTO SELECT
into one statement and doesn’t need to know the exact structure to create the new table either. This can be useful when we want to create a new table with specific data from an existing table. For example, we may want to create a new table that only contains 'Large' channels. Here's how we could use SELECT INTO
to achieve this:
SELECT ChannelName, SubscribersCount, TotalViews
INTO SmallChannels
FROM YoutubeChannel
JOIN ChannelStats
ON YoutubeChannel.ChannelID = ChannelStats.ChannelID
WHERE SubscribersCount < 10000;
In the above query, we select the required columns from the join of YoutubeChannel
and ChannelStats
table where the SubscribersCount
is less than 10,000. This data is then copied into a new table called SmallChannels
.
Checking for NULL values
We can check for NULL
values inside the SELECT
statement as well and replace it with something else as well (for instance, a default or more human understandable value)
ISNULL Function
The ISNULL
function is a SQL Server function used to replace NULL
with a specified replacement value. The syntax is ISNULL(expression, replaceWith)
. Note here that the data type of replaceWith
should be the same as the data type of the selected column (or expression)
For example, if we want to replace NULL
values in the DateCreated
column with a default date, we can use:
SELECT ChannelName, ISNULL(DateCreated, '2010-01-01') as DateCreated
FROM YoutubeChannel;
Note: In MySQL, ISNULL
is a function that takes only one argument and returns 1
if the argument is NULL
, and 0
otherwise. For same functionality as ISNULL
, we have to use IFNULL
in MySQL.
COALESCE Function
The COALESCE
function is a SQL standard function that returns the first non-NULL
value in a list. This function can be used in all SQL databases. The syntax is COALESCE(value1, value2, ..., valueN)
.
For example, if we want to replace NULL
values in the DateCreated
column by first checking if there is at least a recording date in ChannelStats
table, and if not, set it to a default date, we can use:
SELECT ChannelName, COALESCE(DateCreated, Date, '2010-01-01') as DateCreated
FROM YoutubeChannel
JOIN ChannelStats
ON YoutubeChannel.ChannelID = ChannelStats.ChannelID;
In this example, if DateCreated
and Date
both are NULL
, the function will return '2010-01-01'.
Comments in SQL
Comments in SQL are used to explain sections of code, and they can be very helpful for future reference. They can be created using two hyphens (--) for single line comments, or a /* and */ for multi-line comments. For example:d
-- This is a single-line comment
SELECT ChannelName FROM YoutubeChannel;
/*
This is a
multi-line comment
*/
SELECT ChannelName, SubscribersCount FROM YoutubeChannel;
Conclusion
The article covers intermediate SQL commands including the CASE
statement, INSERT INTO SELECT
, SELECT INTO
, and methods to handle NULL
values with ISNULL
and COALESCE
functions. The CASE
statement allows for creating different outputs based on conditions. INSERT INTO SELECT
is used to insert data from one table into another. SELECT INTO
copies data from one table into a new one. ISNULL
and COALESCE
functions are used to replace NULL
values. The article also explains how to use comments in SQL for better code understanding.
Next, we will delve into advanced SQL commands that allow for more complex data manipulation and analysis. Stay tuned for detailed explanations and examples of using these commands effectively in your SQL practice. In the meanwhile, feel free to download the code for all the articles from here