1. Overview
A semi-join in Oracle is a type of query where rows from one table are returned if they satisfy a condition with at least one row from another table. Unlike regular joins, a semi-join only returns rows from the first table, even if multiple matching rows exist in the second table. It is commonly implemented using the EXISTS
or IN
clauses.
Semi-joins are particularly useful for improving query performance in cases where you only need to check the existence of related data in another table.
2. Characteristics of a Semi-Join
- Focus on the First Table: Returns rows only from the first table.
- Existence Check: Used to determine whether related rows exist in another table.
- Efficient Execution: Can be optimized using indexes and query execution plans.
3. Syntax
3.1 Using EXISTS
SELECT column_list FROM table1 t1 WHERE EXISTS ( SELECT 1 FROM table2 t2 WHERE t1.column_name = t2.column_name );
3.2 Using IN
SELECT column_list FROM table1 t1 WHERE t1.column_name IN ( SELECT t2.column_name FROM table2 t2 );
4. Example
Example 1: Semi-Join Using EXISTS
We’ll use the following tables for this example.
Customer
+-------------+---------+
| CUSTOMER_ID | NAME |
+-------------+---------+
| 1 | Alice |
+-------------+---------+
| 2 | Bob |
+-------------+---------+
| 3 | Charlie |
+-------------+---------+
| 4 | David |
+-------------+---------+
Order
+----------+-------------+
| ORDER_ID | CUSTOMER_ID |
+----------+-------------+
| 101 | 1 |
+----------+-------------+
| 102 | 2 |
+----------+-------------+
| 103 | 1 |
+----------+-------------+
| 104 | 3 |
+----------+-------------+
Query
Return all customers who have placed at least one order:
SELECT c.customer_id, c.name FROM customer c WHERE EXISTS ( SELECT 1 FROM order o WHERE c.customer_id = o.customer_id );
Output
+-------------+---------+
| CUSTOMER_ID | NAME |
+-------------+---------+
| 1 | Alice |
+-------------+---------+
| 2 | Bob |
+-------------+---------+
| 3 | Charlie |
+-------------+---------+
Example 2: Semi-Join Using IN
Return all customers who have placed at least one order:
SELECT c.CustomerID, c.Name FROM Customers c WHERE c.CustomerID IN ( SELECT o.CustomerID FROM Orders o );
Output is still the same.
5. Semi-Join vs Inner Join
Feature | Semi-Join | Inner Join |
Rows Returned | Only rows from the first table. | Combines rows from both tables. |
Duplicates | Does not produce duplicates from the second table. | May return duplicates depending on the match. |
Use Case | Existence check. | Retrieving matching rows from both tables. |
6. Performance Considerations
- Indexing: Ensure columns used in the join condition are indexed to improve query performance.
- Use
EXISTS
for Scalability:EXISTS
is often more efficient thanIN
when the subquery returns a large result set. - Execution Plans: Analyze the execution plan using
EXPLAIN PLAN
to identify and resolve performance bottlenecks.
7. When to Use Semi-Joins
- Existence Checks: When you only need to confirm that related data exists in another table.
- Filtering Data: To filter rows from the first table based on the presence of related rows in the second table.
- Improving Performance: Semi-joins can avoid unnecessary data retrieval and improve query efficiency.
8. Conclusion
Semi-joins are a powerful tool for efficiently querying related data in Oracle. By understanding their syntax and characteristics, you can use them to write optimized queries for scenarios involving existence checks or filtering data based on relationships.