Skip to content

Latest commit

 

History

History
137 lines (100 loc) · 5.33 KB

Database Normalization.md

File metadata and controls

137 lines (100 loc) · 5.33 KB

Database Normalization

Database normalization is a process that aims to organize a relational database efficiently by minimizing data redundancy and improving data integrity. It involves decomposing a database's tables into smaller, more manageable tables and defining relationships between them. This is achieved by following a series of well-defined normal forms (1NF, 2NF, 3NF, BCNF, 4NF, 5NF). Each normal form has specific requirements that must be met to achieve a higher degree of normalization.

Here's an in-depth explanation of the first three normal forms, along with examples:

First Normal Form (1NF)

A table is in 1NF if it meets the following requirements:

  • Each column must contain atomic values (i.e., each value must be indivisible)
  • Each column must have a unique name
  • The order in which data is stored does not matter

Consider this example of a table that is not in 1NF:

OrderID Customer Products
1 John Doe Shoes, T-shirt
2 Jane Smith Laptop, Mouse, Keyboard
3 Alice Johnson Desk, Chair

The 'Products' column violates 1NF because it contains non-atomic values (multiple products in a single cell). To convert this table to 1NF, you would create a separate row for each product:

OrderID Customer Product
1 John Doe Shoes
1 John Doe T-shirt
2 Jane Smith Laptop
2 Jane Smith Mouse
2 Jane Smith Keyboard
3 Alice Johnson Desk
3 Alice Johnson Chair

Second Normal Form (2NF)

A table is in 2NF if it is already in 1NF and meets the following additional requirements:

  • All non-primary key columns must be fully functionally dependent on the primary key (i.e., they must depend on the entire primary key, not just a part of it)

Consider the previous example, which is now in 1NF:

OrderID Customer Product
1 John Doe Shoes
1 John Doe T-shirt
2 Jane Smith Laptop
2 Jane Smith Mouse
2 Jane Smith Keyboard
3 Alice Johnson Desk
3 Alice Johnson Chair

In this table, the primary key is the combination of 'OrderID' and 'Product'. The 'Customer' column is dependent on only part of the primary key ('OrderID'), which violates 2NF. To convert this table to 2NF, you would create separate tables for orders and order items:

Orders:

OrderID Customer
1 John Doe
2 Jane Smith
3 Alice Johnson

OrderItems:

OrderID Product
1 Shoes
1 T-shirt
2 Laptop
2 Mouse
2 Keyboard
3 Desk
3 Chair

Third Normal Form (Third Normal Form (3NF)

A table is in 3NF if it is already in 2NF and meets the following additional requirements:

  • All non-primary key columns must be directly dependent on the primary key (i.e., there should be no transitive dependencies)

Consider the following example, where the table is already in 2NF:

Orders:

OrderID CustomerID Customer CustomerCity
1 1 John Doe New York
2 2 Jane Smith Los Angeles
3 3 Alice Johnson San Francisco

OrderItems:

OrderID Product
1 Shoes
1 T-shirt
2 Laptop
2 Mouse
2 Keyboard
3 Desk
3 Chair

In the 'Orders' table, the primary key is 'OrderID', but the 'CustomerCity' column depends on the 'CustomerID' column, not the primary key. This is a transitive dependency and violates 3NF. To convert this table to 3NF, you would create a separate table for customers:

Customers:

CustomerID Customer CustomerCity
1 John Doe New York
2 Jane Smith Los Angeles
3 Alice Johnson San Francisco

Orders:

OrderID CustomerID
1 1
2 2
3 3

OrderItems:

OrderID Product
1 Shoes
1 T-shirt
2 Laptop
2 Mouse
2 Keyboard
3 Desk
3 Chair

Now, all non-primary key columns in each table are directly dependent on the primary key, and the tables are in 3NF.


There are more advanced normal forms like BCNF, 4NF, and 5NF that deal with specific cases of redundancy and data dependency, but achieving 3NF is generally considered sufficient for most database designs. By following these normal forms, you can create a well-structured and efficient database that minimizes redundancy and maintains data integrity.