Please make sure to read the previous SQL document before continuing with this one.
Each table has rows and columns, and each row represents a "record" in the table. Records in different tables can be related to each other using keys, which means that we can define relationships between tables. For example, we can have a table of users and a table of posts, and each post can be related to a user. If we also have a table of comments, each comment can be related to a post. We can define what happens when a user is deleted, for example, and the database will automatically delete all the posts and comments related to that user (without us having to write any code for that logic). Alternatively, we can define that the posts and comments will remain in the database, but the user ID will be set to NULL
(which means that the user is unknown or deleted).
In this diagram we can outline the relationships between records in different tables. For example, employees must have an associated department, and each department can have multiple employees. We shouldn't be able to insert an employee without associating them with a valid department.
Similarly, each employee can have multiple projects, and each project can have multiple employees. If we delete or update a project, we can choose what happens in the table that links employees to projects, and that would depend on out business logic / use case / requirements.
Primary keys are unique identifiers for each row in a table. They are used to uniquely identify a row in a table. Primary keys are used to enforce entity integrity. A primary key is a column or a set of columns that uniquely identifies each row in a table. A primary key must contain unique values.
Foreign keys are used to link two tables together. A foreign key is a column or a set of columns in a table that is used to establish a link between the data in two tables. A foreign key is a field in a relational table that matches the primary key column of another table. The foreign key can be used to cross-reference tables.
In SQLite, foreign keys are not enforced by default. To enable foreign key constraints, you need to run the following command (once per connection):
PRAGMA foreign_keys = ON;
Also in SQLite, foreign keys in one table, can only be associated with primary keys of other tables. In other words, a foreign key must reference a primary key. This is not necessarily the case in all databases.
This is one of many reasons to prefer more advanced databases like PostgreSQL for larger projects, or in general if you plan on using more advanced features.
While SQLite is a great database for small projects, PostgreSQL is a more powerful database that is used in many large projects. PostgreSQL is a relational database management system that is open-source and has many advanced features.
Additionally, PostgreSQL is the most popular database in the world, with great documentation and a large community. It is also very fast and reliable, and it has many advanced features that are not available in SQLite (such as advanced data types, built-in functions, and more).
See the chapter about installing PostgreSQL for more information on using it locally for development. And see the official documentation for more information.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
username TEXT
);
In PostgreSQL, we have several options for creating a primary key. The first one is the most common one, but we'll include the other examples for reference.
CREATE TABLE users (
id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
-- id INTEGER PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
-- id INTEGER SERIAL PRIMARY KEY,
username TEXT
);
CREATE TABLE posts (
id INTEGER PRIMARY KEY,
user_id INTEGER,
content TEXT,
FOREIGN KEY (user_id) REFERENCES users(id)
);
Because these two tables are related, there are some actions that are not allowed (which is good, because it prevents us from making mistakes).
For example, if we try to insert a post with a user ID that doesn't exist in the users
table, we will get an error.
A similar error will occur if we try to delete a user that has posts associated with them. We can define what happens in this case, for example, we can delete the posts as well, or we can set the user ID in the posts table to NULL
.
When creating a foreign key, we can define what happens when the primary key is deleted or updated. The default is NO ACTION
, which means that the operation will fail if there are related records. Other options are:
CASCADE
: Delete or update the related records as well.SET NULL
: Set the foreign key toNULL
(column must allowNULL
values).SET DEFAULT
: Set the foreign key to the default value (column must have a default value, otherwise this is the same asSET NULL
).
We can define these actions like this:
CREATE TABLE posts (
id INTEGER PRIMARY KEY,
user_id INTEGER,
content TEXT,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE SET NULL ON UPDATE CASCADE
);
In this example, if we update the user ID in the users
table, the user ID in the posts
table will be updated as well (the update will "cascade" to the posts table). If we delete a user, the user ID in the posts
table will be set to NULL
.
A JOIN
clause is used to combine rows from two or more tables, based on a related column between them. There are several types of joins:
INNER JOIN
(orJOIN
): Returns records that have matching values in both tables.LEFT JOIN
(orLEFT OUTER JOIN
): Returns all records from the left table, and the matched records from the right table.RIGHT JOIN
(orRIGHT OUTER JOIN
): Returns all records from the right table, and the matched records from the left table.
By far the most useful and common join is the INNER JOIN
. This join returns only the rows that have matching values in both tables. For example, if we have a table of users and a table of posts, we can join them to get a list of all the posts with the usernames of the users who posted them.
SELECT users.id AS user_id,
users.username,
posts.id AS post_id,
posts.content
FROM users
INNER JOIN posts ON users.id = posts.user_id;
-- JOIN posts ON users.id = posts.user_id; -- This is the same
What we will get is this data, even though this does not exist in this form in any one table:
user_id | username | post_id | content
--------+----------+---------+--------
1 | Alice | 1 | Hello!
1 | Alice | 2 | How are you?
2 | Bob | 3 | I'm good, thanks!
2 | Bob | 4 | How about you?
We won't discuss LEFT JOIN
and RIGHT JOIN
in this document, but you can read more about them in the official documentation.