Normalization

Summary

Things to Remember

MCQs

Subjective Questions

Videos

Normalization

Normalization of Data

First Normal Form

Second Normal Form

Third Normal Form

Structured Query Language (SQL)

Functional dependence:

Keys

Lesson

Subject

Grade

Recent Notes

Related Notes

This note is about the Normalization of Data which include the three-phase of data normalization like First normalization form, Second Normalization form, Third Normalization, and BoyCodd form. Normalization is the processes of identifying the existence of potential problem called update anomalies (irregularities) in the design of the relational database.

Normalization is the processes of identifying the existence of potential problem called update anomalies (irregularities) in the design of the relational database.
First Normal Form is said to be the first normal form (1NF) if it does not contain the repeating groups. It includes the tables without repeating groups are in the first normal form.
A relation is in second normal form if all the non-key attributes are dependent on all the key. It includes the modification anomalies could be eliminated if the non-key attribute were dependent on all the key not just a part of it.
A relation is in third normal form if it is in second normal form and has no transitive dependencies. It includes the eliminate the anomalies from the relation is second normal form, the transitive dependency must be removed which leads us to the definition of the third normal form.

No MCQs found.

No subjective questions found.

No videos found.

Normalization is a process which has a setof steps that enables to identify the existence of potential problems called update anomalies (irregularities) in the design of a relation database. This process supplies the methods for correcting these anomalies. In this process, it involves converting tables into various types of normal forms. A table in particular normal form processes a certain desirable collection of properties.

_{Fig: Database Normalization}

_{^{Source :www.w3schools.in}}

The normal forms are 1NF, 2NF,3NF, which were later extended by Boyce-Codd. The normalization process allows a table or collection of tables and produces a new collection of tables that represent the same information but free of anomalies.

In the normalization process, two terms are basically used. They are functional dependence and keys.

Functional Dependence is a formal name for what is basically a simple idea. Example: A sales representative pay class determine his or her commission rate. In other words, a sales representative’s pay class functionally determines his commission rate or a sales commission rate functionally depends on his pay class.

File name: SaleRep

[Note: Each blank row can hold data]

Filename: Customer

Customer Number	Last Name	First Name	Place	City	Balance	Credit Limit	Sales rep number

[Note: Each blank row can hold data]

A column (attribute) B is functionally dependent on another column (or possibly a collection of columns), if a value for A determines a single value for B at any one time. The functional dependence of A is written as (AàB). If B is functionally dependent on A. This can express as A functionally determines B.

Functional dependence is determined on the basis of the unique field. For example, last name, address, ID code, etc.

It is the second underlying concept of the normalization process. A primary key is very important. A column (attribute) C (or collection of columns) is the primary key for table (relation) T.

If;

All column in T is functionally dependent on C.
No sub-collection of the columns in C (assuming C is a collection of columns and not just a single column) also has property (a).

Example, given the name, you cannot determine the address, city etc. In this case, you will choose candidate key. The candidate key is a column or collection of columns on which all columns in the table are functionally dependent. The primary key defines the candidate key as well.

The candidate keys that are not chosen to be the primary key are often referred to as an alternate key. The primary key is frequently called simply, the key in other studies on database management and the relational model.

A table (relation) is said to be the first normal form ( 1NF) if it does not contain the repeating groups. A table (relation) that contains a repeating group (or multiple entries for as single record) is called a non normalized relation. Removal of repeating groups is the starting points in the quest for tables that are free of problems as possible.

Tables without repeating groups are in the first normal form. In the given table, if a student (SIDà 100) is deleted, the information about the fee of Golf is lost. This has deletion anomaly. Similarly, if a student wants to take part in cricket, its fee is unknown. It also possesses the insertion anomaly.

Consider the third relation in the above figure. Here, we have assumed that the student is allowed to enroll for more than that one activity at a time. We observe that it has modification anomalies. If we delete the tuple for student 175 (SIDà175), we lose the fact that the monthly fee of squash is Rs. 50. Also, we cannot enter an activity until a student signs up for it. Thus, the relation suffers from both insertion and deletion anomalies.

The problem with this relation is that it has a dependency involving only part of the key. The key here is the combination (SID, Activity) but the relation contains dependency, Activity a Fee (Fee depends on the type of Activity). The determinant of this dependency (Activity) is only part of the key (SID, Activity). The modification anomalies could be eliminated if the non-key attribute (Fee) were dependent on all the key, not just a part of it.

To resolve the situation, we must separate the relation into small ones. The situation now leads to the definition of the second normal form:

A relation is in second normal form if all the non-key attributes are dependent on the entire key. Observe that this definition pertains to relations that have composite keys. If the key is a single attribute, then the relation is automatically in second normal form.

Filename: ACTIVITY (SID, Activity, Fee)

Key: (SID, Activity)

Activities can be split to form two relations in second normal form. The relations are the same as those in fig.

STU-ACT

SID	Activity
100	Skiing
150	Swimming
175	Squash
200	Swimming

ACT-COST

Activity	Fee
Skiing	200
Swimming	50
Squash	50

The relation in second normal form also has anomalies. Consider the relation HOUSING in given fig. Here, the students stay in different houses and their cost prices per quarter are different. The key here is SID and the functional dependencies are SID à Building and Building à Fee (SID à Activity and Activity à Fee, as in second normal form). The dependencies arise because a given student lives in only one building and each building has its own fixed fee structure, for e.g. everyone living in Yellow House pays Rs. 1200 per quarter.

Since, SID determines Building and since, Building determines Fee, then, transitively SID à Fee. Hence, SID depends on fee which is incorrect. Thus, SID, a single attribute, is the key and the relation must, therefore, be in second normal form. The relation, however, does still have anomalies. These anomalies can be resolved by splitting the table.

HOUSING ( SID, Building, Fee)

Functional dependencies: Building à Fee

SID à Building à Fee SID à Housing (SID,Building) BLDG-FEE (Building, Fee)

SID	Building	Fee
100	Yellow	1200
150	Blue	1100
200	Red	1200
250	White	1100
300	Green	1200
Table 1

SID	Building
100	Yellow
150	Blue
200	Red
250	White
300	Yellow
Table 2

Building	Fee
Yellow	1200
Blue	1100
White	1100
Table 3

If we delete the second row of the Housing relation (table 1), we lose not only the fact that student 150 lives in Blue house, but also that it costs Rs. 1100 to live there. This is a deletion anomaly. Further, how can one record the fact that the fee for XY House (imagine) is R.s 1500? We cannot do that until a student moves into the XY House. This is an insertion anomaly. To eliminate the anomalies from the relation in second normal, the transitive dependency must be removed. This leads is to the definition of the third normal form:

A relation is in third normal form if it is in second normal form and has no transitive dependencies.

Housing relation in above fig can be split into two relations in third normal form. The relation STU-Housing (SID, Building) and BLDG-Fee (Building, Fee) are examples.

SQL was developed by IBM in the, 1970s as a way to get information into and out of relational database management systems. It was first standardized in 1986 ANSI. It is declarative in nature. That is its commands are accurate and declared so that they perform on particular databases. SQL commands are categorized as:

DDL such as create a table, alter table, drop table, etc.
DML such as select, insert into, update, delete from, etc.

SQL is a declarative language. It is used to find the results of the database. SQL queries are the most common in use. The SQL sublanguage DML and DDL are very common in server-based database management system. For using SQL, you need to create the table, insert data records into the table and then, you can manipulate the data records or make the queries. A typical command could be:

SELECT CarModel FROM CarSales WHERE CarSoldDate Between ‘May 1, 2005’ AND May 31, 2005’;

For creating database you would like to enter the SQL command as:

CREATE TABLE MyFirstTable

(Stud_name varchar (40),

Stud_Address varchar (40),

Stud_DateOfBirth Date, )

For listing all the records from the table, you would enter the command as:

Select * from MyFirstTable

Similarly, for various activities such as deleting records, indexing records, filtering records, joining tables, etc., you have commands of SQL. These commands are beyond context at this level.

References:

Khanal, R.C. Khanal, R.C. Computer Concept for XII. Pashupatigriha Marga, Thapathali, Kathmandu, Nepal: Ekta Books Distributors Pvt. Ltd., 2010. 9-13.

Adhikari, Deepak Kumar.,et.al., Computer Science XII,Asia Publication Pvt.Ltd

Database

Computer Science

Grade 12

No recent notes.

First Name

Commission

No related notes.