RDBMS Exam Preparation Synthesis

RDBMS Exam Preparation Synthesis

This material focuses heavily on foundational RDBMS theory, SQL language elements, integrity constraints, and database design principles (Normalization and ER Modeling).

1) Simplified, Exam-Oriented Notes

I. Core DBMS and RDBMS Concepts

Database Management System (DBMS): A software system that manages, stores, and retrieves data efficiently in a structured format. It provides security, integrity, and consistency across multiple users.

Relational Database Management System (RDBMS): Stores data in the form of related tables (relations).

  • Founders: First DBMS (Integrated Data Store/IDS) was created by **Charles Bachman**. The father of the relational database is **Dr. E. F. Codd** (IBM, 1970).
  • Popular Examples: RDBMS examples include MySQL (most popular open source DBMS), Oracle, Microsoft SQL Server, PostgreSQL. NoSQL examples include MongoDB, Cassandra, and DynamoDB.

Data Hierarchy (Smallest to Largest)

  1. Bit: Smallest unit (0 or 1).
  2. Byte/Character: 8 Bits.
  3. Field/Attribute/Column: Group of Bytes.
  4. Record/Tuple/Row: Group of Fields.
  5. File/Table/Relation: Group of Records/Tuples.
  6. Database: Group of Files.

Relational Terminology (Relation/Table)

  • Cardinality: The number of tuples (rows) in a relation.
  • Degree: The number of attributes (columns) in a relation.
  • Relation: A two-dimensional table of data.
  • Schema: The overall design or structure (skeleton/blueprint) of the database, defining tables, fields, and relationships. It does not contain the data itself.
  • Instance: A snapshot of the actual data stored in the database at a specific point in time.

II. Database Architecture & Data Abstraction

The goal of data abstraction is to hide complexity from users.

ANSI/SPARC Three-Schema Architecture

  1. External Level (View Level): Closest to the user. Shows only the relevant content to the user (subschema). Hides data structure complexity.
  2. Conceptual Level (Logical Level): Describes the structure of the whole database (global view). Where constraints and security are implemented.
  3. Internal Level (Physical Level): Lowest level of abstraction, describing exactly how data is physically stored and organized on the storage medium (e.g., files, indices, complex low-level data structures).

III. Integrity Constraints and Keys

Constraints are rules added to columns to control the data stored, ensuring accuracy and reliability.

Primary Key: Uniquely identifies a row/record. A table can have **only one** Primary Key. It cannot contain duplicate or NULL values.

Foreign Key: A column in one table that references (points to) the primary key or candidate key of another table. Used to link two tables and maintain **Referential Integrity**.

Hierarchy of Keys

  • Super Key: A group of single or multiple attributes that uniquely identifies rows in a table (e.g., {Emp\_Id}, {Emp\_Id, Name}).
  • Candidate Key: A **minimal** Super Key. It can uniquely identify a tuple, and the Primary Key is selected from this set.
  • Alternate Key (AK): Candidate keys that were not selected as the Primary Key.
  • Composite Key: A primary key consisting of two or more attributes working together to uniquely identify a row.

IV. SQL & Database Languages

SQL is a **4th Generation Language**, developed at IBM by Raymond Boyce and Donald Chamberlin. It is the standard language for relational databases.

SQL Language Classification and Commands (High-Yield Area)

Language (Full Form) Purpose/Focus Core Commands
DDL (Data Definition Language) Defines database schema/structure. CREATE (table, view, index), ALTER (modifies structure), DROP (deletes entire object/structure), TRUNCATE (removes all rows/data quickly, structure remains), RENAME.
DML (Data Manipulation Language) Manipulates data stored in the tables. INSERT, UPDATE, DELETE (removes specific rows; uses WHERE clause; slower than TRUNCATE).
DQL (Data Query Language) Subset of DML, retrieves data. SELECT (The primary command).
TCL (Transaction Control Language) Manages transactions to maintain consistency/reliability. COMMIT (saves changes permanently), ROLLBACK (undoes changes to last commit/savepoint), SAVEPOINT.
DCL (Data Control Language) Manages access permissions (security/authorization). GRANT (permits user privileges), REVOKE (removes privileges).

Key SQL Clauses and Functions

  • SQL Query Order: SELECT -> FROM -> WHERE -> GROUP BY -> HAVING -> ORDER BY.
  • WHERE: Used to filter rows based on specified conditions.
  • GROUP BY: Groups data based on a particular column or row.
  • HAVING: Used to filter the results based on aggregate functions (cannot be used without GROUP BY; WHERE cannot use aggregate functions).
  • Aggregate Functions: Calculate aggregate values (e.g., MAX, MIN, SUM, AVG, **COUNT()**).
  • Scalar Functions: Perform operations on individual values (e.g., UCASE(), LCASE(), ROUND()).
  • CRUD: Stands for Create, Read, Update, Delete.

V. Transaction Properties (ACID)

ACID properties ensure accuracy, completeness, and data integrity during database transactions.

  1. Atomicity (All or Nothing): The entire transaction must take place at once, or it doesn't happen at all. All commands are treated as a single unit. Maintained by the Transaction Manager.
  2. Consistency: The database must be consistent before and after the transaction. Integrity constraints (like foreign keys) must be maintained.
  3. Isolation: Multiple concurrent transactions occur without interfering with each other (independently). Maintained by the Concurrency Control Manager.
  4. Durability: Once a transaction successfully completes (commits), the changes are permanent and persist in the system, even if system failures (crashes, power outages) occur. Maintained by the Recovery Manager.

VI. Normalization (Data Design)

Normalization is the process of minimizing redundancy and unwanted data from a relation or set of relations.

Anomalies (Problems Normalization Solves):

  1. Insertion Anomaly: Cannot insert a record without complete information for dependent fields.
  2. Deletion Anomaly: Deleting a record unintentionally causes the loss of critical, related data.
  3. Updation Anomaly: Updating a single piece of data requires updating multiple redundant entries.

Normal Forms (Exam Focus: Definition and Dependencies)

  • 1NF (First Normal Form): The most basic level. Requires that each attribute (field) must contain only **atomic values** (single value) and no repeating groups.
  • 2NF (Second Normal Form): Must be in 1NF. Requires that all non-key attributes are **fully functionally dependent on the Primary Key** (no partial dependencies on a composite key).
  • 3NF (Third Normal Form): Must be in 2NF. Requires that there is **no transitive dependency** (non-key attributes cannot depend on other non-key attributes). This is considered adequate for usual database design.
  • BCNF (Boyce-Codd Normal Form): Stricter/Advanced version of 3NF (3.5NF). Ensures that every determinant is a **Candidate Key**. Eliminates all remaining redundancy.
  • 4NF (Fourth Normal Form): Must be in BCNF. Eliminates **multi-valued dependencies**.
  • 5NF (Fifth Normal Form): Highest level, concerned with **Join Dependency**.

VII. ER Model and Data Structures

The Entity-Relationship (ER) model (introduced by Peter Chen in 1976) is used to represent real-world objects (entities) and their relationships.

Entities and Attributes

  • Entity: A definable object (person, place, concept) about which data is stored. Represented by a rectangle.
  • Attribute: A property or characteristic of an entity. Represented by an ellipse.
  • Strong Entity: Has a primary key and is independent.
  • Weak Entity: Cannot be uniquely identified by its own attributes; its existence depends on a strong entity. Represented by a double rectangle.
  • Composite Attribute: An attribute composed of several smaller sub-attributes (e.g., Address = house no, street name).
  • Multivalued Attribute: An attribute that can have more than one value for a single entity (e.g., multiple phone numbers).
  • Derived Attribute: An attribute computed from other attributes (e.g., Age derived from Date of Birth). Represented by a dashed oval.

Cardinality Relationships

Defines how many times an entity of one set participates in a relationship set.

  1. One-to-One: One entity of A relates to at most one entity of B, and vice-versa.
  2. One-to-Many: One entity of A relates to one or more entities of B.
  3. Many-to-Many: Any number of entities of A relates to any number of entities of B, and vice-versa.

Database Joins (SQL)

Used to combine rows from two or more tables based on related columns, promoting normalization. ANSI standard SQL defines five types of JOINs.

  1. Inner Join (Equi Join / Simple Join): Returns only the rows that have matching values in both tables. The default join type.
  2. Left Outer Join: Returns all rows from the first (left) table, and the matched rows from the second (right) table.
  3. Right Outer Join: Returns all rows from the second (right) table, and the matched rows from the first (left) table.
  4. Full Outer Join: Returns all rows from both tables, combining matched results and including unmatched rows with NULLs for the missing side.
  5. Cross Join (Cartesian Join): Returns the Cartesian product of the tables' rows (every row from the first table combined with every row from the second).
  6. Self Join: Used to join a table to itself.