SQL (Structured Query Language) is a standard programming language used to manage and manipulate relational databases. SQL is used for tasks like querying data, inserting records, updating existing records, and deleting records.
-- INNER JOIN
SELECT students.name, courses.name
FROM students
INNER JOIN courses ON students.course_id = courses.id;
-- LEFT JOIN
SELECT students.name, courses.name
FROM students
LEFT JOIN courses ON students.course_id = courses.id;
-- RIGHT JOIN
SELECT students.name, courses.name
FROM students
RIGHT JOIN courses ON students.course_id = courses.id;
-- FULL JOIN
SELECT students.name, courses.name
FROM students
FULL OUTER JOIN courses ON students.course_id = courses.id;
Normalization is the process of organizing the data in a database to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, manageable ones while ensuring relationships between the tables are maintained.
1NF:
| StudentID | Name | Courses |
|-----------|----------|--------------|
| 1 | John | Math, Science|
| 2 | Alice | English, Art |
After normalization into 2NF:
-- Students Table
| StudentID | Name |
|-----------|----------|
| 1 | John |
| 2 | Alice |
-- Courses Table
| StudentID | Course |
|-----------|----------|
| 1 | Math |
| 1 | Science |
| 2 | English |
| 2 | Art |
-- Using WHERE
SELECT name, age
FROM employees
WHERE age > 30;
-- Using HAVING
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;
A subquery is a query within another query. It can be used in SELECT, INSERT, UPDATE, or DELETE statements.
-- Single-row subquery
SELECT name
FROM employees
WHERE department_id = (SELECT department_id FROM departments WHERE name = 'HR');
-- Correlated subquery
SELECT name, salary
FROM employees e
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);
ACID stands for the properties of a transaction in a database:
-- DELETE
DELETE FROM employees WHERE id = 5;
-- TRUNCATE
TRUNCATE TABLE employees;
-- DROP
DROP TABLE employees;
-- Primary key example
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50)
);
-- Foreign key example
CREATE TABLE orders (
order_id INT PRIMARY KEY,
employee_id INT,
FOREIGN KEY (employee_id) REFERENCES employees(id)
);
-- Using UNION
SELECT name FROM employees
UNION
SELECT name FROM customers;
-- Using UNION ALL
SELECT name FROM employees
UNION ALL
SELECT name FROM customers;
The GROUP BY
clause is used to group rows that have the same values into summary rows, typically used with aggregate functions like COUNT()
, SUM()
, AVG()
, MAX()
, and MIN()
.
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
Aggregate functions are functions that perform a calculation on a set of values and return a single value. Common aggregate functions include:
COUNT()
: Returns the number of rows.SUM()
: Returns the sum of a numeric column.AVG()
: Returns the average value of a numeric column.MIN()
: Returns the minimum value.MAX()
: Returns the maximum value.
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
The DISTINCT
keyword is used to remove duplicate rows from the result set. It is typically used in a SELECT
statement to return only unique values.
SELECT DISTINCT department FROM employees;
A VIEW
is a virtual table that consists of a stored query. It doesn't physically store data but presents data derived from one or more tables. Views simplify complex queries and can provide a level of abstraction.
CREATE VIEW employee_view AS
SELECT id, name, department
FROM employees
WHERE salary > 50000;
-- Querying the view
SELECT * FROM employee_view;
A trigger
is a database object that automatically executes a specified action when certain events occur on a table or view. The events could be INSERT
, UPDATE
, or DELETE
.
CREATE TRIGGER after_employee_insert
AFTER INSERT ON employees
FOR EACH ROW
BEGIN
INSERT INTO audit_log (action, table_name, action_time)
VALUES ('INSERT', 'employees', NOW());
END;
CREATE TABLE users (
username VARCHAR(50),
password CHAR(20)
);
The BETWEEN
operator is used to filter the result set within a certain range of values. The range can be numeric, date-based, or alphabetic.
SELECT name, salary
FROM employees
WHERE salary BETWEEN 40000 AND 60000;
A NULL
value represents missing or undefined data. It is different from an empty string or zero. You must use IS NULL
or IS NOT NULL
to check for NULL
values, as NULL
is not equal to any value, even itself.
SELECT name
FROM employees
WHERE salary IS NULL;
transaction
controls in SQL?Transactions in SQL ensure that a series of operations are completed successfully and consistently. The common transaction control commands are:
BEGIN TRANSACTION;
UPDATE employees
SET salary = salary * 1.1
WHERE department = 'HR';
COMMIT;
The LIKE
operator is used in a WHERE
clause to search for a specified pattern in a column. It is often used with wildcard characters:
%
: Represents zero or more characters._
: Represents a single character.
SELECT name
FROM employees
WHERE name LIKE 'A%'; -- Names starting with 'A'
SELECT name
FROM employees
WHERE name LIKE '_n'; -- Names ending with 'n'
The IN
operator is used to check if a value matches any value in a list or subquery. It simplifies the use of multiple OR
conditions.
SELECT name
FROM employees
WHERE department IN ('HR', 'IT', 'Finance');
The CASE
statement is used to create conditional logic in SQL queries. It works like an IF-THEN-ELSE
statement.
SELECT name, salary,
CASE
WHEN salary > 60000 THEN 'High'
WHEN salary BETWEEN 40000 AND 60000 THEN 'Medium'
ELSE 'Low'
END AS salary_range
FROM employees;
-- Using WHERE
SELECT department, COUNT(*) AS employee_count
FROM employees
WHERE salary > 50000
GROUP BY department;
-- Using HAVING
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
-- Stored Procedure
CREATE PROCEDURE GetEmployeeById (IN emp_id INT)
BEGIN
SELECT * FROM employees WHERE id = emp_id;
END;
-- Function
CREATE FUNCTION GetEmployeeSalary (emp_id INT) RETURNS DECIMAL(10, 2)
BEGIN
DECLARE salary DECIMAL(10, 2);
SELECT salary INTO salary FROM employees WHERE id = emp_id;
RETURN salary;
END;
A self-join is a join where a table is joined with itself. It is useful for finding relationships within the same table, such as hierarchical or comparative data.
-- Find employees who have the same manager
SELECT A.name AS Employee, B.name AS Manager
FROM employees A
JOIN employees B ON A.manager_id = B.id;
A composite key is a combination of two or more columns that together uniquely identify a row in a table. It is used when no single column can uniquely identify the row.
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- Clustered Index
CREATE CLUSTERED INDEX idx_employee_id ON employees(id);
-- Non-Clustered Index
CREATE NONCLUSTERED INDEX idx_employee_name ON employees(name);
A cursor is a database object used to retrieve, manipulate, and iterate over rows in a result set one at a time. It is commonly used for row-by-row processing.
DECLARE cursor_employee CURSOR FOR
SELECT name, salary FROM employees;
OPEN cursor_employee;
FETCH NEXT FROM cursor_employee INTO @name, @salary;
-- Process rows here
CLOSE cursor_employee;
DEALLOCATE cursor_employee;
SELECT name, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_number
FROM employees;
TRUE
if the subquery returns any rows.
-- Using EXISTS
SELECT name FROM employees
WHERE EXISTS (SELECT 1 FROM departments WHERE departments.id = employees.department_id);
-- Using IN
SELECT name FROM employees
WHERE department_id IN (SELECT id FROM departments);
Both are used to convert data types, but they have some differences:
-- Using CAST
SELECT CAST(123.45 AS INT);
-- Using CONVERT
SELECT CONVERT(INT, 123.45);
The COALESCE()
function returns the first non-NULL value in a list of arguments.
SELECT COALESCE(NULL, NULL, 'Default Value') AS result;
DELETE FROM employees WHERE department = 'HR';
TRUNCATE TABLE employees;
Partitioning divides a large table into smaller, more manageable pieces, improving performance. It can be applied based on range, list, or hash.
CREATE TABLE sales (
id INT,
sale_date DATE,
amount DECIMAL(10, 2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
PARTITION p_2023 VALUES LESS THAN (2024),
PARTITION p_2024 VALUES LESS THAN (2025)
);
The MERGE
statement allows you to perform INSERT
, UPDATE
, or DELETE
operations in a single statement based on specific conditions.
MERGE INTO employees AS target
USING new_employees AS source
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET target.salary = source.salary
WHEN NOT MATCHED THEN
INSERT (id, name, salary) VALUES (source.id, source.name, source.salary);
An index improves the speed of data retrieval by providing quick access to rows in a table. However, indexes can slow down INSERT
, UPDATE
, and DELETE
operations due to the overhead of updating the index.
CREATE INDEX idx_salary ON employees(salary);
Sharding is a method of partitioning where data is distributed across multiple databases or servers to improve scalability and performance. Each shard holds a subset of the data.
To identify duplicates, use the GROUP BY
and HAVING
clauses.
SELECT name, COUNT(*) AS duplicate_count
FROM employees
GROUP BY name
HAVING COUNT(*) > 1;
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
age INT CHECK (age > 18),
department_id INT UNIQUE
);
An index is a database object used to improve the speed of data retrieval operations on a table. It is created on columns that are frequently used in queries, such as in the WHERE clause or in JOIN conditions.
CREATE INDEX idx_employee_name ON employees(name);