Introduction
In PostgreSQL 17, the introduction of the built-in background worker worker_spi brings an interesting feature that enhances the capability of handling background tasks without interrupting main database operations. Background workers are a powerful feature in PostgreSQL that allows custom background processes to be executed within the database system itself, making tasks like maintenance, data aggregation, or periodic operations more efficient.
In this blog post, we will dive into the concept of background workers in PostgreSQL, explain their importance, and distinguish them from extensions. Additionally, we'll walk through the process of installing and using the worker_spi extension, and then explain its functionality and workflow with a hands-on example.
What Are Background Workers in PostgreSQL?
The Importance of Background Workers
Background workers in PostgreSQL are processes that run independently of client sessions and can perform various tasks asynchronously. These tasks could include:
* Periodic updates
* Data aggregation
* Housekeeping operations like cleanup
* Data collection or reporting
The key advantage of background workers is that they don't block regular database operations. Since they run in the background, database users or applications can continue using the database without being interrupted. This makes them a great choice for tasks that need to run without constant user interaction.
Background Workers vs Extensions
* Extensions: These are modules that can add functionality to the database, such as adding new data types, functions, or operators. While extensions extend PostgreSQL's capabilities, they are generally loaded at the start of a session and provide added functionalities or enhancements to PostgreSQL, like new data types or methods.
* Background Workers: These are long-running processes designed to operate in the background. They can perform tasks like background data aggregation, cleaning up expired data, or similar operations without needing constant user input.
While both can extend PostgreSQL's capabilities, background workers are specifically designed for tasks that require continuous, asynchronous operation, while extensions typically offer immediate features.
Installing and Using the worker_spi Extension
To start using worker_spi in PostgreSQL 17, you'll need to install the extension and create it in your database. Here’s a quick guide to do that:
1. Make and Install the Extension
First, you need to compile and install the extension. To do this, ensure you have access to the PostgreSQL source code and the necessary build tools. Run the following commands:
cd /path/to/postgresql-source/src/test/modules/worker_spi
make
sudo make install
cd /path/to/postgresql-source
make
sudo make install
2. Create the worker_spi Extension
After successfully installing the extension, you need to enable it in your database:
3. Launch the Worker
To launch the worker, you use the worker_spi_launch function. Let’s launch it with the ID 4:
SELECT worker_spi_launch(4);
This command creates a schema named schema4 and sets up a table named counted in that schema.
Table Structure of schema4.counted
The table schema4.counted has two columns: type and value. Here's the structure:
\d schema4.counted;
Table "schema4.counted"
Column | Type | Collation | Nullable | Default
-----------+---------+-----------+----------+---------
type | text | | |
value | integer | | |
Indexes:
"counted_unique_total" UNIQUE, btree (type) WHERE type = 'total'::text
Check constraints:
"counted_type_check" CHECK (type = ANY (ARRAY['total'::text, 'delta'::text]))
Key Points:
* Column type: This column holds the type of the row, either 'total' or 'delta'.
* Column value: This holds the numeric value associated with the type.
The table also includes:
* A unique constraint (counted_unique_total) on the type column when the type is total.
* A check constraint (counted_type_check) to ensure that the type is either total or delta.
The Workflow of the worker_spi Background Worker
The primary job of the worker_spi background worker is to aggregate the delta rows into the total row and then delete them. Here's how it works:
1. Initial Insert
First, you insert a row with the type set to total and a value (e.g., 100). You can view this with:
INSERT INTO schema4.counted (type, value) VALUES ('total', 100);
SELECT * FROM schema4.counted;
Output:
type | value
-------+-------
total | 100
(1 row)
2. Inserting Delta Rows
Next, you insert rows with the type set to delta and corresponding values (e.g., 10, 20, 30). These rows will be marked for deletion, and their values will be aggregated into the total value:
INSERT INTO schema4.counted (type, value)
VALUES
('delta', 10),
('delta', 20),
('delta', 30);
SELECT * FROM schema4.counted;
Output:
type | value
-------+-------
total | 100
delta | 10
delta | 20
delta | 30
(4 rows)
3. Automatic Restructuring
After a short period, the worker_spi background worker will automatically delete the delta rows and update the total value by summing up the delta values. This results in:
SELECT * FROM schema4.counted;
Output:
type | value
-------+-------
total | 160
(1 row)
The total value now reflects the sum of the initial 100 plus the delta values (10 + 20 + 30).
How Does the Background Worker Work?
A background worker in PostgreSQL like worker_spi runs asynchronously. It operates independently of the main database operations, performing its work based on the programmed logic. The worker_spi background worker does the following:
* Listens for delta rows: It periodically checks for rows with the type set to delta.
* Deletes delta rows: After processing the delta rows, it deletes them.
* Aggregates the values: It sums the delta values and updates the total row with the new aggregated value.
* Operates in the background: All of this happens without user intervention and without blocking regular database operations.
This workflow enables PostgreSQL to handle complex, long-running background tasks efficiently without affecting the performance of the system for regular users.
Clarification on the Role of the worker_spi Extension
The worker_spi extension essentially allows you to run background worker processes in PostgreSQL. It acts as a helper module that creates background workers (in this case, for aggregating delta values into total) without requiring the user to manually write and manage background worker code. Instead, the background worker logic is part of the extension, and once installed, it can be used seamlessly.
The extension in PostgreSQL is what makes it easy to integrate background workers like worker_spi into your system. By installing the worker_spi extension and enabling it using CREATE EXTENSION, you are effectively setting up a background worker to run in your database.
Key Points
* Extension: The worker_spi extension encapsulates the logic for running background workers in PostgreSQL. It is a built-in extension that provides an API to launch and manage background workers.
* Background Worker: After installing the extension, you can invoke background worker functionality using commands like worker_spi_launch(4) to create and run background worker processes in the database.
In conclusion, the worker_spi extension is a tool for implementing background workers easily, and through this extension, PostgreSQL 17 allows us to execute background tasks like aggregating values and deleting old rows without requiring extra manual work for background worker management.
By understanding how background workers like worker_spi operate, you can leverage them for various use cases—from periodic updates to complex data aggregation—all while maintaining high performance in your PostgreSQL database.
Learn the essentials of creating and managing materialized views in PostgreSQL. Optimize your data retrieval and boost performance with our expert insights.