In ClickHouse, Data Definition Language does not only apply to creating or dropping entire tables and databases, but also to evolving the structure of existing tables over time, and this is an important capability because real-world analytical schemas rarely remain static as requirements change, new attributes are needed, or old fields become irrelevant. Column-level DDL operations allow you to modify the structure of a table without recreating it from scratch, and these operations are especially relevant for tables that belong to the MergeTree family of engines, which are the backbone of most production ClickHouse deployments.
When working with column manipulation in ClickHouse, it is essential to remember that ClickHouse is a columnar database designed for analytical performance, and therefore schema changes are handled in a way that preserves read efficiency and storage organization. Although these operations are powerful, they should still be used thoughtfully in production systems because structural changes on large tables can have background processing costs.
The most common column-level DDL operations include:
- Adding a column
- Renaming a column
- Clearing a column’s data
- Dropping a column
All of these are performed using the ALTER TABLE statement.
1. Foundation Example Table
To understand these operations clearly, let us begin with a simple MergeTree-based table.
Example:
CREATE TABLE ddl_demo.sample_table
(
column1 String,
column2 String
)
ENGINE = MergeTree
ORDER BY column1;
This table contains two string columns and uses the MergeTree engine, which supports schema evolution and mutations.
Let us insert a few rows:
INSERT INTO ddl_demo.sample_table VALUES
('A', 'Apple'),
('B', 'Banana');
At this stage, the table contains two rows and two columns, which gives us a good baseline for column manipulation.
2. Adding a Column
Adding a column is a common requirement when new data attributes need to be stored. In ClickHouse, adding a column does not rewrite existing data immediately; instead, the new column is logically added and existing rows receive default or NULL values until new data is written.
Example:
ALTER TABLE ddl_demo.sample_table ADD COLUMN column3 Nullable(String);
Key points:
ADD COLUMNdefines the new columnNullable(String)allows NULL values- Existing rows will show NULL for this column
If you query the table after this operation, you will see that column3 exists but contains NULLs for previously inserted rows.
Now, if you insert new data:
INSERT INTO ddl_demo.sample_table (column1, column2, column3)
VALUES ('C', 'Cherry', 'Red');
The new row will have a value in column3, while older rows remain NULL.
This behavior is efficient because ClickHouse avoids rewriting large volumes of historical data when adding columns.
3. Renaming a Column
Renaming a column is useful when improving naming conventions or correcting schema design mistakes. This is a metadata-level change and does not rewrite the stored data.
Example:
ALTER TABLE ddl_demo.sample_table RENAME COLUMN column3 TO column4;
After this operation:
- The column name changes
- The data remains intact
- Queries must use the new name
If you select from the table, you will see column4 instead of column3.
This operation is fast because it primarily updates metadata.
4. Clearing a Column
Clearing a column means resetting its values to default or NULL without removing the column itself. This is useful when a column’s data is no longer valid but the column structure should remain.
Example:
ALTER TABLE ddl_demo.sample_table CLEAR COLUMN column4;
Effect:
- All non-null values in
column4become NULL - The column remains in the schema
- Future inserts can still populate it
This can be seen as a bulk reset operation for a column.
Because ClickHouse is columnar, clearing a column is more efficient than row-by-row updates in traditional databases.
5. Dropping a Column
Dropping a column removes it entirely from the table schema. This should be done carefully because the data in that column is permanently removed.
Example:
ALTER TABLE ddl_demo.sample_table DROP COLUMN column4;
After this operation:
- The column no longer exists
- Its data is removed
- Queries referencing it will fail
Dropping unused columns can reduce storage usage and improve clarity in schemas, but it should be planned carefully in production systems.
