After Part 1 Of DBMS here we are with the second part ER Diagrams, Relationships, Normalizations etc etc. Most of the theory based interview questions are directly asked from these sections. Even these topics hold some good marks considering preparation for higher studies also.
What are ER Models ?
An Entity Relationship Diagram (ERD) is a visual representation of different data using
conventions that describe how these data are related to each other.
For example, the elements writer, novel, and consumer may be described using ER
diagrams this way:
Attribute
Entities are represented by means of their properties, called attributes. All attributes
have values. For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes.
For example, a student's name cannot be a numeric value. It has to be alphabetic. A
a student's age cannot be negative, etc.
Simple attributes:
Simple attributes are atomic values, which cannot be divided further. For example, a
a student's phone number is an atomic value of 10 digits.
Composite attribute :
If the attributes are composite, they are further divided in a tree like structure. Every
node is then connected to its attribute. That is composite attributes are represented by
eclipses that are connected with an eclipse.Composite attributes are made of more than
one simple attribute.
For example, a student's complete name may have first_name and last_name.
Derived attribute :
Derived attributes are the attributes that do not exist in the physical database, but their
values are derived from other attributes present in the database.
For example, average_salary in a department should not be saved directly in the
database, instead it can be derived.
For another example, age can be derived from data_of_birth.
For example for a circle the area can be derived from the radius.
Multi-valued attribute :
Multi-valued attributes may contain more than one value.
For example, a person can have more than one phone number, email_address, etc.
For example a teacher entity can have multiple subject values.
One-to-one :
One entity from entity set A can be associated with at most one entity of entity set B
and vice versa.
One to many:
One entity from entity set A can be associated with more than one entity of entity set B
however an entity from entity set B, can be associated with at most one entity.
Many-to-one:
More than one entity from entity set A can be associated with at most one entity of
entity set B, however an entity from entity set B can be associated with more than one
entity from entity set A.
Many-to-many :
One entity from A can be associated with more than one entity from B and vice versa.
RDBMS stands for Relational Database Management System.
RDBMS data is structured in database tables, fields and records.
Each RDBMS table consists of database table rows.Each database table row consists of
one or more database table fields.
RDBMS stores the data into a collection of tables, which might be related by common
fields (database table columns).
RDBMS also provides relational operators to manipulate the data stored into the
database tables.
Degree Of Relationship
The degree of relationship refers to the number of participating entities in a
relationship.
Relationship sets that involve two entity sets are binary (or degree two). Generally, most
relationship sets in a database system are binary.
Relationships between more than two entity sets are rare. Most relationships are
Binary.
Now, we will be discussing regarding the Integrity Constraints (e.g - keys),
Keys
For example, ID is used as a key in the Student table because it is unique for each student. In the PERSON table, passport_number, license_number, SSN are keys since they are unique for each person.
Types of keys
1.Primary Key:
It is the first key used to identify one and only one instance of an entity uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The key which is most suitable from those lists becomes a primary key.
In the PERSON table, ID can be the primary key since it is unique for each employee. In the PERSON table, For each entity, the primary key selection is based on requirements and developers.
2.Candidate Key:
A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
Except for the primary key, the remaining attributes are considered a candidate key. The candidate keys are as strong as the primary key.
For example: In the PERSON table, id is best suited for the primary key. The rest of the attributes, like SSN, Passport_Number etc., are considered a candidate key.
3.Super Key:
Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a candidate key.
In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME), the name of two employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this combination can also be a key.
The super key would be EMPLOYEE-ID (EMPLOYEE_ID, EMPLOYEE-NAME), etc.
4.Composite Key:
Whenever a primary key consists of more than one attribute, it is known as a composite key. This key is also known as Concatenated Key.
5.Foreign Key :
Foreign keys are the column of the table used to point to the primary key of another table.
Every employee works in a specific department in a company, and employee and department are two different entities. So we can't store the department's information in the employee table. That's why we link these two tables through the primary key of one table.
We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table.
In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
A process of organizing the data in the database to avoid data redundancy, insertion
anomaly, update anomaly & deletion anomaly.
A process of organizing data into tables in such a way that the results of using the
database are always unambiguous and as intended. Such normalization is intrinsic to
relational database theory. It may have the effect of duplicating data within the
database and often results in the creation of additional tables.
Advantages of Normalization
❖ Elimination of data redundancy makes the database to be compact reducing the
overall amount of space a database consumes.
❖ Enforcement of referential integrity on data ensuring data to be consistent
across all tables.
❖ Maintenance becomes easier and faster since the data are organized logically in
a normalized database in a flexible way.
❖ Searching and sorting of records is easier and faster because data will appear in
a separate, smaller table when a database is normalized allowing us to easily find
them.
❖ Normalization and denormalization are two processes that are completely
opposite.
❖ Normalization is the process of dividing larger tables into smaller ones reducing
the redundant data, while denormalization is the process of adding redundant
data to optimize performance.
❖ Normalization is carried out to prevent database anomalies.
❖ Denormalization is usually carried out to improve the read performance of the
database, but due to the additional constraints used for denormalization, writes
(i.e. insert, update and delete operations) can become slower. Therefore, a
denormalized database can offer worse write performance than a normalized
database.
❖ It is often recommended that you should “normalize until it hurts, denormalize
until it works”.
❖ Normalizing data means eliminating redundant information from a table and
organizing the data so that future changes to the table are easier.
Types of Normalization:
Normalization works through a series of stages called Normal forms. The normal forms apply to individual relations. The relation is said to be in particular normal form if it satisfies constraints.
Following are the various types of Normal forms:
Normal Form: Description
1NF : A relation is in 1NF if it contains an atomic value.
2NF : A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functional dependent on the primary key.
3NF : A relation will be in 3NF if it is in 2NF and no transition dependency exists.
BCNF : A stronger definition of 3NF is known as Boyce Codd's normal form.
Advantages of Normalization
Disadvantages of Normalization
WHAT IS WEB3?Web3 is the catch-all term for the vision of an upgraded version of the internet aimed at giving power back to the users. It is truly the......
Operating Systems is one of the core subjects for a computer science student. As such a lot of important questions are asked on this subject in interv......
Getting started with cloud computing ?The words "cloud" and "web" will be interchangeable. What kind of business operates today without the internet? ...
SOFTWARE MAINTENANCE:Software Maintenance can be defined as the process of modifying a software product after its delivery. The main purpose of softwa...
A quick summary upto what we did this far from this series:Software EngineeringSoftware engineering is a discipline of engineering concerned wit...
Introduction:Understanding time complexity of a function is imperative for every programmer to execute a program in the most efficient manner. Time co...
Software MethodologyA software methodology (which is also known as software process) is a set of related development Phases that leads to the pr...
Software Engineering is a subject that requires when you are a software developer as well as when you are in your college preparing for a semester or ...
Are you getting ready for your SQL developer job interview?You've come to the correct place.This tutorial will assist you in brushing up on your SQL s...
Introduction: What is SQL?To understand SQL, we must first understand databases and database management systems (DBMS).Data is essentially a collectio...
Networking: Importance in real life!Using a computer for work, as well as for personal use, ha...
FORMULA LIST:ALGEBRA :1.Sum of first n natural numbers = n(n+1)/22.Sum of the squares of first n nat...
What is actually DBMS?A Database Management System (DBMS) is system software used to manage the orga...
C CHEATSHEETINTRODUCTION:C programming language was developed by Dennis Ritchie of Bell Labs in 1972...
C is a general purpose high-level language most popular amongst coders, it is the most compatible, e...
C++ programming language has templates which basically allows functions and classes to work with gen...
C++ was conceived in the late 1970s to overcome the limitations of C. It is an extension of th...
Interview Questions are a great source to test your knowledge about any specific field. But remember...
The most popular high-level, multipurpose programming language right now is Python.Python supports p...
IntroductionPython is a high-level programming language with dynamic binding and high-level inherent...
Interview Questions are a great source to test your knowledge about any specific field. But remember...
Object-Oriented Programming or better known as OOPs is one of the major pillars of Java that has lev...
Java is a high-level programming language known for its robustness, object-oriented nature, enhanced...