Email : dsciencelabs@outlook.com
Instagram : https://www.instagram.com/dsciencelabs
RPubs
: https://rpubs.com/dsciencelabs/
Github :
https://github.com/dsciencelabs/
Telegram :
@dsciencelabs
Department : Business
Statistics
Address : ARA Center, Matana University
Tower
Jl. CBD Barat Kav, RT.1, Curug
Sangereng, Kelapa Dua, Tangerang, Banten 15810.
The real power of a relational database lies in its ability to quickly retrieve and analyze your data by running a query. Queries allow you to pull information from one or more tables based on a set of search conditions you define. In this section, you will learn how to create a simple one-table query.
First, we need to connect to our database. Please type the following code in your R console:
# set up the connection and save it into the workspace
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
library(RMySQL)
library(DBI)
<- dbConnect(RMySQL::MySQL(),
bakti dbname='factory_db',
username='root',
password='',
host='localhost',
port=3306)
::opts_chunk$set(connection = "bakti") # to set up the connection in your Rmarkdown chunk knitr
The SQL SELECT statement is used to fetch the data from a database table which returns this data in the form of a result table. These result tables are called result-sets. The basic syntax of the SELECT statement is as follows:
SELECT column1, column2, columnN
FROM table_name;
Assume, column1, column2… are the fields of a table whose values you want to fetch. If you want to fetch some of the fields available in the field, then you can use the following syntax.
SELECT CustomerName, Address, City, Country
FROM CUSTOMERS;
CustomerName | Address | City | Country |
---|---|---|---|
Alfreds Futterkiste | Obere Str. 57 | Berlin | Germany |
Ana Trujillo Emparedados y helados | Avda. de la Constituci |
M |
Mexico |
Antonio Moreno Taquer |
Mataderos 2312 | M |
Mexico |
Around the Horn | 120 Hanover Sq. | London | UK |
Berglunds snabbk |
Berguvsv |
Lule |
Sweden |
Blauer See Delikatessen | Forsterstr. 57 | Mannheim | Germany |
Blondel p |
24, place Kl |
Strasbourg | France |
B |
C/ Araquil, 67 | Madrid | Spain |
Bon app’ | 12, rue des Bouchers | Marseille | France |
Bottom-Dollar Marketse | 23 Tsawassen Blvd. | Tsawassen | Canada |
If you want to fetch all the fields of the CUSTOMERS table, then you should use the following query.
SELECT *
FROM CUSTOMERS;
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
2 | Ana Trujillo Emparedados y helados | Ana Trujillo | Avda. de la Constituci |
M |
5021 | Mexico |
3 | Antonio Moreno Taquer |
Antonio Moreno | Mataderos 2312 | M |
5023 | Mexico |
4 | Around the Horn | Thomas Hardy | 120 Hanover Sq. | London | WA1 1DP | UK |
5 | Berglunds snabbk |
Christina Berglund | Berguvsv |
Lule |
S-958 22 | Sweden |
6 | Blauer See Delikatessen | Hanna Moos | Forsterstr. 57 | Mannheim | 68306 | Germany |
7 | Blondel p |
Fr |
24, place Kl |
Strasbourg | 67000 | France |
8 | B |
Mart |
C/ Araquil, 67 | Madrid | 28023 | Spain |
9 | Bon app’ | Laurence Lebihans | 12, rue des Bouchers | Marseille | 13008 | France |
10 | Bottom-Dollar Marketse | Elizabeth Lincoln | 23 Tsawassen Blvd. | Tsawassen | T2F 8M4 | Canada |
The SQL DISTINCT keyword is used in conjunction with the SELECT statement to eliminate all the duplicate records and fetching only unique records. There may be a situation when you have multiple duplicate records in a table. While fetching such records, it makes more sense to fetch only those unique records instead of fetching duplicate records.
The basic syntax of DISTINCT keyword to eliminate the duplicate records is as follows:
SELECT DISTINCT column_name
FROM table_name
Now, let us use the DISTINCT keyword with the above SELECT query and then see the result.
SELECT DISTINCT Country
FROM customers;
Country |
---|
Germany |
Mexico |
UK |
Sweden |
France |
Spain |
Canada |
Argentina |
Switzerland |
Brazil |
The WHERE clause is used to filter records. The WHERE clause is used to extract only those records that fulfill a specified condition.
SELECT column1, column2, ...
FROM table_name
WHERE [condition];
Note: The WHERE clause is not only used in SELECT statement, it is also used in UPDATE, DELETE statement, etc.!
The following SQL statement selects all the customers from the country “Mexico”, in the “Customers” table:
SELECT *
FROM Customers
WHERE Country='Mexico';
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
2 | Ana Trujillo Emparedados y helados | Ana Trujillo | Avda. de la Constituci |
M |
5021 | Mexico |
3 | Antonio Moreno Taquer |
Antonio Moreno | Mataderos 2312 | M |
5023 | Mexico |
13 | Centro comercial Moctezuma | Francisco Chang | Sierras de Granada 9993 | M |
5022 | Mexico |
58 | Pericles Comidas cl |
Guillermo Fern |
Calle Dr. Jorge Cash 321 | M |
5033 | Mexico |
80 | Tortuga Restaurante | Miguel Angel Paolino | Avda. Azteca 123 | M |
5033 | Mexico |
The following operators can be used in the WHERE clause, please try it by your self!
Operator | Description |
---|---|
= | Equal |
> | Greater than |
< | Less than |
>= | Greater than or equal |
<= | Less than or equal |
<> | Not equal. Note: In some versions of SQL this operator may be written as != |
IS NULL or IS NOT NUL | A field with a NULL value is a field with no value. |
BETWEEN | Between a certain range |
LIKE | Search for a pattern |
IN | To specify multiple possible values for a column |
The BETWEEN operator selects values within a given range. The values can be numbers, text, or dates. The BETWEEN operator is inclusive: begin and end values are included.
SELECT column_name(s)
FROM table_name
WHERE column_name
BETWEEN value1 AND value2;
The following SQL statement selects all products with a price BETWEEN 10 and 20:
SELECT *
FROM Products
WHERE Price
BETWEEN 10 AND 20;
ProductID | ProductName | SupplierID | CategoryID | Unit | Price |
---|---|---|---|---|---|
1 | Chais | 1 | 1 | 10 boxes x 20 bags | 18.00 |
2 | Chang | 1 | 1 | 24 - 12 oz bottles | 19.00 |
3 | Aniseed Syrup | 1 | 2 | 12 - 550 ml bottles | 10.00 |
15 | Genen Shouyu | 6 | 2 | 24 - 250 ml bottles | 15.50 |
16 | Pavlova | 7 | 3 | 32 - 500 g boxes | 17.45 |
21 | Sir Rodney’s Scones | 8 | 3 | 24 pkgs. x 4 pieces | 10.00 |
25 | NuNuCa Nu |
11 | 3 | 20 - 450 g glasses | 14.00 |
31 | Gorgonzola Telino | 14 | 4 | 12 - 100 g pkgs | 12.50 |
34 | Sasquatch Ale | 16 | 1 | 24 - 12 oz bottles | 14.00 |
35 | Steeleye Stout | 16 | 1 | 24 - 12 oz bottles | 18.00 |
The following SQL statement selects all orders with an OrderDate BETWEEN ‘01-July-1996’ and ‘31-July-1996’:
SELECT *
FROM Orders
WHERE OrderDate
BETWEEN '1996-07-01' AND '1996-07-31';
OrderID | CustomerID | EmployeeID | OrderDate | ShipperID |
---|---|---|---|---|
10248 | 90 | 5 | 1996-07-04 | 3 |
10249 | 81 | 6 | 1996-07-05 | 1 |
10250 | 34 | 4 | 1996-07-08 | 2 |
10251 | 84 | 3 | 1996-07-08 | 1 |
10252 | 76 | 4 | 1996-07-09 | 2 |
10253 | 34 | 3 | 1996-07-10 | 2 |
10254 | 14 | 5 | 1996-07-11 | 2 |
10255 | 68 | 9 | 1996-07-12 | 3 |
10256 | 88 | 3 | 1996-07-15 | 2 |
10257 | 35 | 4 | 1996-07-16 | 3 |
The IN operator allows you to specify multiple values in a WHERE clause. The IN operator is a shorthand for multiple OR conditions.
SELECT column_name(s)
FROM table_name
WHERE column_name
IN (SELECT STATEMENT);
The following SQL statement selects all customers that are located in “Germany”, “France” or “UK”:
SELECT *
FROM Customers
WHERE Country
IN ('Germany', 'France', 'UK');
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
4 | Around the Horn | Thomas Hardy | 120 Hanover Sq. | London | WA1 1DP | UK |
6 | Blauer See Delikatessen | Hanna Moos | Forsterstr. 57 | Mannheim | 68306 | Germany |
7 | Blondel p |
Fr |
24, place Kl |
Strasbourg | 67000 | France |
9 | Bon app’ | Laurence Lebihans | 12, rue des Bouchers | Marseille | 13008 | France |
11 | B’s Beverages | Victoria Ashworth | Fauntleroy Circus | London | EC2 5NT | UK |
16 | Consolidated Holdings | Elizabeth Brown | Berkeley Gardens 12 Brewery | London | WX1 6LT | UK |
17 | Drachenblut Delikatessend | Sven Ottlieb | Walserweg 21 | Aachen | 52066 | Germany |
18 | Du monde entier | Janine Labrune | 67, rue des Cinquante Otages | Nantes | 44000 | France |
19 | Eastern Connection | Ann Devon | 35 King George | London | WX3 6FW | UK |
The following SQL statement selects all customers that are from the same countries as the suppliers:
SELECT *
FROM Customers
WHERE Country
IN (SELECT Country FROM Suppliers);
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
4 | Around the Horn | Thomas Hardy | 120 Hanover Sq. | London | WA1 1DP | UK |
5 | Berglunds snabbk |
Christina Berglund | Berguvsv |
Lule |
S-958 22 | Sweden |
6 | Blauer See Delikatessen | Hanna Moos | Forsterstr. 57 | Mannheim | 68306 | Germany |
7 | Blondel p |
Fr |
24, place Kl |
Strasbourg | 67000 | France |
8 | B |
Mart |
C/ Araquil, 67 | Madrid | 28023 | Spain |
9 | Bon app’ | Laurence Lebihans | 12, rue des Bouchers | Marseille | 13008 | France |
10 | Bottom-Dollar Marketse | Elizabeth Lincoln | 23 Tsawassen Blvd. | Tsawassen | T2F 8M4 | Canada |
11 | B’s Beverages | Victoria Ashworth | Fauntleroy Circus | London | EC2 5NT | UK |
15 | Com |
Pedro Afonso | Av. dos Lus |
S |
05432-043 | Brazil |
The LIKE operator is used in a WHERE clause to search for a specified pattern in a column. There are two wildcards often used in conjunction with the LIKE operator:
SELECT column1, column2, ...
FROM table_name
WHERE columnN LIKE pattern;
The following SQL statement selects all customers with a CustomerName starting with “a”:
SELECT *
FROM Customers
WHERE CustomerName
LIKE 'a%';
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
2 | Ana Trujillo Emparedados y helados | Ana Trujillo | Avda. de la Constituci |
M |
5021 | Mexico |
3 | Antonio Moreno Taquer |
Antonio Moreno | Mataderos 2312 | M |
5023 | Mexico |
4 | Around the Horn | Thomas Hardy | 120 Hanover Sq. | London | WA1 1DP | UK |
Here are some examples showing different LIKE operators with ‘%’ and ’_’ wildcards:
LIKE Operator | Description |
---|---|
WHERE CustomerName LIKE ‘a%’ | Finds any values that start with “a” |
WHERE CustomerName LIKE ‘%a’ | Finds any values that end with “a” |
WHERE CustomerName LIKE ‘%or%’ | Finds any values that have “or” in any position |
WHERE CustomerName LIKE ’_r%’ | Finds any values that have “r” in the second position |
WHERE CustomerName LIKE ‘a_%’ | Finds any values that start with “a” and are at least 2 characters in length |
WHERE CustomerName LIKE ’a__%’ | Finds any values that start with “a” and are at least 3 characters in length |
WHERE ContactName LIKE ‘a%o’ |
The WHERE clause can be combined with AND, OR, and NOT operators. The AND and OR operators are used to filter records based on more than one condition:
SELECT column1, column2, ...
FROM table_name
WHERE condition1 AND condition2 OR condition3 NOT condition4;
The following SQL statement selects all fields from “Customers” where country is “Germany” AND city must be “Berlin” OR “München” (use parenthesis to form complex expressions):
SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München');
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
Let see one more example, the following SQL statement selects all fields from “Customers” where country is NOT “Germany” and NOT “USA”:
SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München');
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
The SQL ORDER BY clause is used to sort the data in ascending or descending order, based on one or more columns. Some databases sort the query results in an ascending order by default. The basic syntax of the ORDER BY clause is as follows:
SELECT column-list
FROM table_name
[WHERE condition]
[ORDER BY column1, column2, .. columnN] [ASC | DESC];
You can use more than one column in the ORDER BY clause. Make sure whatever column you are using to sort that column should be in the column-list.The following code block has an example, which would sort the result in an ascending order by the City and the Country:
SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München')
ORDER BY Country, City;
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
If there are a large number of tuples satisfying the query conditions, it might be resourceful to view only a handful of them at a time.
SELECT column-list
FROM table_name
[WHERE condition]
[ORDER BY column1, column2, .. columnN] [ASC | DESC]
LIMIT rows_to_skip, next_rows_to_skip;
The following illustrates the LIMIT clauses to collect TOP 3 rows:
SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München')
ORDER BY Country, City
LIMIT 3;
CustomerID | CustomerName | ContactName | Address | City | PostalCode | Country |
---|---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Obere Str. 57 | Berlin | 12209 | Germany |
Next, the following illustrates the LIMIT clauses to collect TOP 5 rows after TOP 3 rows:
SELECT CustomerName, Address, City, Country
FROM customers
ORDER BY City, Country DESC
LIMIT 3, 5;
CustomerName | Address | City | Country |
---|---|---|---|
Old World Delicatessen | 2743 Bering St. | Anchorage | USA |
Galer |
Rambla de Catalu |
Barcelona | Spain |
LILA-Supermercado | Carrera 52 con Ave. Bol |
Barquisimeto | Venezuela |
Magazzini Alimentari Riuniti | Via Ludovico il Moro 22 | Bergamo | Italy |
Alfreds Futterkiste | Obere Str. 57 | Berlin | Germany |
The MIN() function returns the smallest value of the selected column. The MAX() function returns the largest value of the selected column.
SELECT MIN/MAX(column_name)
FROM table_name
WHERE condition;
The following SQL statement finds the price of the cheapest product:
SELECT MIN(Price) AS SmallestPrice
FROM Products;
SmallestPrice |
---|
2.5 |
The following SQL statement finds the price of the most expensive product:
SELECT MAX(Price) AS LargestPrice
FROM Products;
LargestPrice |
---|
263.5 |
The COUNT() function returns the number of rows that matches a specified criterion. The AVG() function returns the average value of a numeric column. The SUM() function returns the total sum of a numeric column.
SELECT COUNT/SUM/AVG(column_name)
FROM table_name
WHERE condition;
The following SQL statement finds the average price of all products:
SELECT AVG(Price)
FROM Products;
AVG(Price) |
---|
28.86636 |
Note: Please try other functions, to get more convenient with SQL!
The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);
The following SQL statement lists the number of customers in each country, sorted high to low (Only include countries with more than 5 customers):
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 5
ORDER BY COUNT(CustomerID) DESC;
COUNT(CustomerID) | Country |
---|---|
13 | USA |
11 | Germany |
11 | France |
9 | Brazil |
7 | UK |
The CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.
If there is no ELSE part and no conditions are true, it returns NULL.
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
WHEN conditionN THEN resultN
ELSE result
END;
The following SQL goes through conditions and returns a value when the first condition is met:
SELECT OrderID, Quantity,
CASE
WHEN Quantity > 30 THEN 'The quantity is greater than 30'
WHEN Quantity = 30 THEN 'The quantity is 30'
ELSE 'The quantity is under 30'
END AS QuantityText
FROM OrderDetails;
OrderID | Quantity | QuantityText |
---|---|---|
10248 | 12 | The quantity is under 30 |
10248 | 10 | The quantity is under 30 |
10248 | 5 | The quantity is under 30 |
10249 | 9 | The quantity is under 30 |
10249 | 40 | The quantity is greater than 30 |
10250 | 10 | The quantity is under 30 |
10250 | 35 | The quantity is greater than 30 |
10250 | 15 | The quantity is under 30 |
10251 | 6 | The quantity is under 30 |
10251 | 15 | The quantity is under 30 |
The following SQL will order the customers by City. However, if City is NULL, then order by Country:
SELECT CustomerName, City, Country
FROM Customers
ORDER BY
CASE
(WHEN City IS NULL THEN Country
ELSE City
END);
CustomerName | City | Country |
---|---|---|
Vaffeljernet | Denmark | |
Drachenblut Delikatessend | Aachen | Germany |
Rattlesnake Canyon Grocery | Albuquerque | USA |
Old World Delicatessen | Anchorage | USA |
Galer |
Barcelona | Spain |
LILA-Supermercado | Barquisimeto | Venezuela |
Magazzini Alimentari Riuniti | Bergamo | Italy |
Alfreds Futterkiste | Berlin | Germany |
Chop-suey Chinese | Bern | Switzerland |
Save-a-lot Markets | Boise | USA |