We do realise that most of the softwares around are designed to solve
a problem!. To better understand what databases are ,we need to first
understand the problems that necessitated the advent of databases. The
answer to this is quite obvious ! you have some data
or
large amounts of information
that you need to store, this
data could be about:
But why worry? , we could store the data in text files or even spreadsheets or if they are documents we can just organise them in folders…. is it! . Just having data is not a good enough reason to have a database , having data is not the problem but the problem is what comes next. Below are some of the problems that a database can solve
so what`s a database?
A structured storage space where the data is kept in many
tables
and organized so that the necessary information can
be easily fetched, manipulated, and summarized.
A table
is an organized set of related data stored in a
tabular form, i.e., in rows and columns. A field
is another
term for a column
of a table and record
is the
other name for a row
Relational data are related by unique identifiers
To better understand this, let’s consider a scenario were Bongani Has
a business. say I have a number of different restaurants. In one table I
might have information about these restaurants including, where they are
located and what type of food they serve. I may then have a second table
where information about health and safety inspections is stored. Each
inspection is a different row and the date of the inspection, the
inspector, and the safety rating are stored in this table. Finally, I
might have a third table. This third table contains information pulled
from an API, regarding the number of stars given to each restaurant, as
rated by people online. Each table contains different bits of
information; however, there is a common column id
in each
of the tables. This allows the information to be linked between the
tables. The restaurant with the id
“JJ29JJ” in the
restaurant table would refer to the same restaurant with the
id
“JJ29JJ” in the health inspections table, and so on. The
values in this id
column are known as
unique identifiers
because they uniquely identify each
restaurant. No two restaurants will have the same id
, and
the same restaurant will always have the same id
, no matter
what table you’re looking at. The fact that these tables have unique
identifiers connecting each table to all the other tables makes this
example what we call relational data.
Unique identifiers help link entries across tables
Storing data in this way has a number of advantages; however, the three most important are:
DDMS
stands for Database Management System
,
a software package used to perform various operations on the data stored
in a database, such as accessing, updating, wrangling, inserting, and
removing data. There are various types of DBMS, such as relational,
hierarchical, network, graph, or object-oriented. These types are based
on the way the data is organized, structured, and stored in the
system.
RDBMS
stands for
Relational Database Management System
. It’s the most common
type of DBMS used for working with data stored in multiple tables
related to each other by means of shared keys. The SQL programming
language is particularly designed to interact with RDBMS. Some examples
of RDBMS are MySQL, PostgreSQL, Oracle, MariaDB, etc.
constraints | Description |
---|---|
DEFAULT | provides a default value for a column |
UNIQUE | allows only unique values |
NOT NULL | allows only non-null values. |
PRIMARY KEY | allows only unique and strictly non-null values (NOT NULL and UNIQUE). |
FOREIGN KEY | provides shared keys between two and more tables |
A column (or multiple columns) of a table to which the UNIQUE constraint was imposed to ensure unique values in that column, including a possible NULL value (the only one).
A column (or multiple columns) of a table to which the FOREIGN KEY constraint was imposed to link this column to the primary key in another table (or several tables). The purpose of foreign keys is to keep connected various tables of a database.
A query is a piece of code written in SQL to access the data from a database or to modify the data. Correspondingly, there are two types of SQL queries: select and action queries. The first ones are used to retrieve the necessary data (this also includes limiting, grouping, ordering the data, extracting the data from multiple tables, etc.), while the second ones are used to create, add, delete, update, rename the data, etc.
A human-readable clarification on what a particular piece of code does. SQL code comments can be single-line (preceded by a double dash –) or span over multiple lines (as follows: /comment_text/). When the SQL engine runs, it ignores code comments. The purpose of adding SQL code comments is to make the code more comprehensive for those people who will read it in the future.
any computer system that deals with storing data needs to have the four fundamental functions that have the ability to:
C
-REATER
-EADU
-UPDATED
-ELETEcommonly pronounced as CRUD
we use the
CREATE TABLE
statement.
table_name (col_1 datatype,
CREATE TABLE
col_2 datatype, col_3 datatype);
for instance if run the following query !
my_table (age INTERGER,
CREATE TABLE sex TEXT);
we add the values to the table using
INSERT INTO .... VALUES (....)
VALUES (34,'male');
INSERT INTO my_table VALUES (47,'female');
INSERT INTO my_table VALUES (32,'male'); INSERT INTO my_table
then call the database using SELECT
* FROM my_table; SELECT
Employee attrition database
SELECT
statement
select all with *
SELECT name FROM people;
*
rather than typing all the columns* FROM dat_new SELECT
selecting a subset
SELECT col_1,col_2,col_3 FROM my_table;
using the database we get
SELECT Attrition,Department,Education FROM dat_new;
select distinct
items in SQL
Bongani
has been recorded twice in the
dataset
SELECT DISTINCT name,age,sex FROM database_table
filtering using WHERE
in SQL
WHERE is a filtering clause
In SQL, the WHERE keyword allows you to filter based on both text and numeric values in a table. There are a few different comparison operators you can use:
= equal
<> not equal
< less than
> greater than
<= less than or equal to
>= greater than or equal to
You can build up your WHERE queries by combining multiple conditions with the AND keyword.
SELECT DISTINCT name,age,sex
FROM df='male'; WHERE sex
COUNT
and AS
alias in
SQL
AS
to rename default nameCOUNT(*)
tells you how many rows are in a tableCOUNT(*) AS pple_count
SELECT FROM database_table
5 rows
we can use
COUNT(DISTINCT column)
to get the number of unique rows
COUNT(DISTINCT name) AS unique_count
SELECT FROM database_table
the query above will calculate rows with distinct names and finds them to be 4
We can use COUNT
with
WHERE
COUNT(*) AS num_male
SELECT
FROM database_table='male'; WHERE sex
here we see that we have 3 males although the other one has been duplicated