PostgreSQL Vacuuming: An Introduction for Busy Devs
If you have interacted with PostgreSQL at any point in your developer career, you have met it: The autovacuum daemon. It fires up every now and then, consumes resources, and disappears again, without telling you what it did, and why it ran in the first place. In this post, I would like to give an idea of what vacuuming is, what the autovacuum daemon does, and how you can become friends with it.
What is vacuuming?
The concept of vacuuming has to do with the way PostgreSQL implements certain RDBMS features. A modern RDBMS has to offer concurrency control for transactions: different transactions have to be able to see different views of the data, depending on which statements they have already executed. This concept is called transaction isolation, and constitutes the I in ACID. One transaction might update certain fields of a row, or delete a row that other transactions can still see. Furthermore, each transaction can be rolled back, undoing all the changes it made. Managing data state is complicated by the fact that an RDBMS has to keep the storage of a table and its indexes intact while managing data visibility. It cannot simply go ahead and modify data in the primary data structures; this would lead to an invalid state.
The solution implemented by PostgreSQL is called Multi-Version Concurrency Control (MVCC). The basic idea is to mark rows according to the transactions that affect them, and manage visibility accordingly. Each transaction gets an ID from a simple 32-bit integer sequence. Rows are then marked with the IDs of the transactions that last modified or deleted them. These marks are stored in the xmin and xmax columns, which are normally hidden, but visible if explicitly queried for. Using the sample tables from the previous post on PostgreSQL performance, we can insert some data and then see the transaction IDs:
BEGIN TRANSACTION;
SELECT txid_current(); -- prints the current transaction id
INSERT INTO person (first_name, last_name) VALUES ('Hercule', 'Poirot');
COMMIT TRANSACTION;
On my computer, the SELECT txid_current(); statement prints out 156078. When we query the columns xmin and xmax, we can see the following values:
test=# SELECT xmin, xmax, first_name, last_name FROM person;
xmin | xmax | first_name | last_name
--------+------+------------+-----------
156078 | 0 | Hercule | Poirot
(1 row)
As you can see, the xmin column of the relevant row is set to the ID of the transaction in which it was committed. xmin can be interpreted as the lowest transaction ID that can see this row. Any transactions that were started beforehand, and thus have a lower ID, cannot see this row. The meaning of the xmax column is the exact opposite: it is set to the ID of the transaction that deletes the row, and any transactions that come after it cannot see the row. Essentially, for a transaction to see a row, the relationship xmin < current_txid < xmax should hold. There are two more columns (cmin and cmax) that track visibility per command within a transaction, but the details are not relevant here. For details of the algorithm, have a look at these slides.
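To see xmax in action, we can delete the row in one transaction and, before committing, inspect it from a second psql session. This is a sketch assuming the person table from above; the transaction IDs shown are just illustrative values:

```sql
-- Session 1: delete the row, but keep the transaction open
BEGIN;
SELECT txid_current();   -- e.g. 156079
DELETE FROM person WHERE last_name = 'Poirot';

-- Session 2: the row is still visible here, but xmax is already set
-- to the ID of the (uncommitted) deleting transaction:
SELECT xmin, xmax, first_name, last_name FROM person;
--   xmin  |  xmax  | first_name | last_name
-- --------+--------+------------+-----------
--  156078 | 156079 | Hercule    | Poirot

-- Session 1: ROLLBACK; would undo the deletion, COMMIT; makes it final
ROLLBACK;
```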
MVCC is not the only method for RDBMS concurrency control; other databases use other mechanisms, such as the rollback segments of Oracle and MySQL. These are blocks of work that are undone when a transaction is rolled back, either immediately or on the next read that refers to them. The advantage of MVCC compared to such methods is that rolling back a transaction has minimal cost. There is no cleanup to be done when a rollback happens; the memory and processor load is the same as in the commit case.
Enter VACUUM
The disadvantage of MVCC is the topic of this post: the necessity of vacuuming. The primary purpose of vacuuming is garbage collection. Since PostgreSQL does not remove any rows from physical storage when they are updated or deleted, after some time (depending on the frequency of update and delete activity in the database), the database will be occupying a lot of essentially unused disk space.
Garbage collection is not the only purpose of vacuuming, though. Two related concerns are the visibility map and transaction ID wraparound. The visibility map is PostgreSQL's way of avoiding unnecessary trips to the heap, where the actual row data is stored. When a query finds rows in an index, PostgreSQL has to check whether these rows are visible, i.e. not already deleted and awaiting vacuuming, by fetching the data from the heap. This IO trip can be avoided using the visibility map, which records which heap pages contain only visible data. If a page is on this map, PostgreSQL does not have to visit it to ensure visibility. As a side note, the visibility map is the reason it took PostgreSQL longer to implement index-only scans, which are possible only since version 9.2.
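A rough way to see how much of a table the visibility map covers is to compare relallvisible (the number of pages known to contain only visible tuples) with relpages in pg_class. Both values are approximations maintained by VACUUM and ANALYZE; this assumes the person table from above:

```sql
-- Pages marked all-visible vs. total pages of the table
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname = 'person';
```

After a fresh VACUUM on an otherwise idle table, relallvisible should be close to relpages; heavy update activity will drive it back down.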
Transaction ID wraparound is the name given to the fact that since these IDs are 32-bit integers, they cannot be greater than 2^32. When a database has processed more transactions than that, the transaction ID overflows and starts from the beginning again. If no further action were taken, nearly all rows would suddenly become invisible, because their xmin values would appear to lie in the future. The solution implemented by PostgreSQL is to set the xmin of rows with sufficiently old xmin values to a special value, FrozenTransactionId, which is always considered to be lower (ergo older) than any transaction ID. This happens as part of vacuuming, so if you do not vacuum your database for a long time, there is a real possibility that old data suddenly becomes invisible.
Edit: As Peter pointed out in the comments, the transaction ID comparison is presented in a simplified manner here. The real comparison of IDs involves modulo arithmetic, so that the space of IDs wraps around. That is to say, for any ID x, there are 2^31 IDs smaller than x, and just as many greater. See the documentation for details.
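You can keep an eye on how close each database is to wraparound by checking the age of its oldest unfrozen transaction ID; autovacuum forces a freeze long before this value approaches 2^31:

```sql
-- How many transactions old is the oldest unfrozen XID per database?
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```

If xid_age climbs toward the autovacuum_freeze_max_age setting (200 million by default), autovacuum will launch aggressive freeze runs on its own.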
Manual vacuuming is as simple as running VACUUM; in psql, or rather VACUUM VERBOSE; if you want to actually see what is happening. These commands also accept the name of a table as an optional argument; if it is omitted, VACUUM is executed on the whole database. Running plain VACUUM is what one could call the first level of vacuuming: it takes care of deleted rows and updates the visibility map. Contrary to what I said above, however, it does not return the storage space to the operating system. It instead updates what's called the free space map (FSM) to mark the pages that have free space due to deleted or updated rows. The next time a new row has to be written, PostgreSQL can consult this map and use the free space in those pages, instead of demanding more storage space from the OS.
If you want to reclaim all free space, you need to run VACUUM FULL;, which might be necessary if you e.g. manually delete a lot of rows. Full vacuuming reprocesses the table data and writes a brand new version that is compacted and consumes exactly the space it needs. However, think twice before you run it: it locks the tables it is processing and will block both read and write queries.
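If you want to verify the difference between the two variants yourself, you can wrap the commands with size measurements. This is a sketch using the person table as an example; only the FULL variant should actually shrink the files on disk:

```sql
SELECT pg_size_pretty(pg_total_relation_size('person'));

VACUUM person;        -- marks dead space reusable; size usually unchanged
SELECT pg_size_pretty(pg_total_relation_size('person'));

VACUUM FULL person;   -- rewrites the table; blocks readers and writers
SELECT pg_size_pretty(pg_total_relation_size('person'));
```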
Vacuum ≠ Analyze
As I mentioned in my previous post, PostgreSQL relies on statistics of column value distributions to generate efficient query plans. Updating these statistics is not the job of VACUUM; it requires a separate command, namely ANALYZE. You can run ANALYZE; standalone in psql (or ANALYZE VERBOSE; for more output), or run both maintenance commands together with VACUUM ANALYZE;. As with VACUUM, you can pass [VACUUM] ANALYZE the name of a single table. Fun note: both ANALYZE and ANALYSE work, so go ahead and spell it the British way if you are keen to do so.
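To check when a table was last vacuumed or analyzed, whether manually or by the daemon, pg_stat_all_tables keeps a separate timestamp for each case:

```sql
-- Manual runs vs. autovacuum daemon runs, per table
SELECT last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'person';
```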
The autovacuum daemon
In order to make the jobs of database users worldwide easier, PostgreSQL comes, since version 8.1, with a daemon that runs both VACUUM and ANALYZE at certain intervals: the famous autovacuum daemon. It runs as a separate process, the presence of which you can check with a simple ps aux | grep autovacuum. If you don't have any running vacuum processes, you should only see a "launcher" process; otherwise you might also see workers. The autovacuum daemon checks each database at regular intervals to see whether it needs vacuuming and/or analyzing. If the number of rows that were updated or deleted in a table is above a certain threshold, these processes are executed. The number of deleted and updated rows is read from the statistics views; we can see an approximation for the person table with the following query:
test=# SELECT n_tup_del, n_tup_upd FROM pg_stat_all_tables WHERE relname = 'person';
n_tup_del | n_tup_upd
-----------+-----------
0 | 0
(1 row)
The threshold is calculated according to the following formula:
autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor * pg_class.reltuples)
The settings starting with autovacuum in the above formula can be queried in psql from the pg_settings view. The last value can be obtained with SELECT reltuples FROM pg_class WHERE relname='person';. Bringing these together, we can write the following query as an approximation of what the autovacuum daemon does to decide whether to vacuum a table:
SELECT
(pt.n_tup_del + pt.n_tup_upd) > pgs_threshold.setting::int + (pgs_scale.setting::float * pc.reltuples)
AS should_vacuum
FROM pg_class pc JOIN pg_stat_all_tables pt ON pc.relname = pt.relname
CROSS JOIN pg_settings pgs_threshold
CROSS JOIN pg_settings pgs_scale
WHERE pt.relname='person'
AND pgs_threshold.name = 'autovacuum_vacuum_threshold'
AND pgs_scale.name = 'autovacuum_vacuum_scale_factor';
You have to keep in mind that the statistics in pg_stat_all_tables are accumulated since the last statistics reset (see the stats_reset column of pg_stat_database). The documentation does not spell out exactly which statistics the autovacuum daemon uses, but I'm pretty certain that only the tuples updated and deleted since the last vacuum run are counted; otherwise, the autovacuum daemon would eventually have to vacuum every table in every run. The autovacuum daemon does a similar calculation to decide whether to run analyze; details can be found in the PostgreSQL documentation.
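When the daemon is actually at work, its workers show up as regular backends, so you can also watch them from within psql instead of grepping the process list:

```sql
-- Currently running autovacuum workers and what they are doing
SELECT pid, query, xact_start
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';
```

The query column shows entries such as "autovacuum: VACUUM public.person", so you can tell which table is being processed and for how long.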
Improving Vacuuming
A frequent issue with the autovacuum daemon is that it gets to work at unexpected times of the day, maybe in the middle of a high-load period, and causes deteriorated performance. Another symptom of an improper vacuuming regime is queries that are executed with suboptimal query plans. The primary reason for this is incorrect table statistics, which can be alleviated by the ANALYZE statements that run as a part of vacuuming. As you can see above, it's difficult to imitate the behavior of autovacuum. The ideal approach would be to find out whether the statistics are out of sync, but that's difficult to determine, and not even autovacuum does that:
The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated; it has no knowledge of whether that will lead to meaningful statistical changes.
You are also strongly advised never to turn off autovacuuming, because of the risks involved. Even if you do frequent manual vacuuming, there might be unexpected bouts of high activity that affect many rows. Also, autovacuum will have little work to do if it runs shortly after a manual vacuum, so it makes sense to just leave it running. The most sensible thing to do is to adjust the settings so that large and active tables are vacuumed more frequently. Here is a query to find out which tables have the highest number of rows:
SELECT reltuples,relname FROM pg_class WHERE relkind='r' ORDER BY reltuples DESC;
You can either schedule a cron job to vacuum the largest tables regularly if you have periods of low load, or as is in our case, if the load on your application is continuous, you can adjust the parameters for these tables to run vacuuming more frequently. The vacuum parameters can be set separately for individual tables with the following query:
ALTER TABLE person SET (autovacuum_vacuum_scale_factor = 0.0);
ALTER TABLE person SET (autovacuum_vacuum_threshold = 4000);
As per the formula above, these settings would cause autovacuum to vacuum this table after every 4000 updated or deleted rows, no matter how many rows are already in the table. This leads to more frequent vacuuming of such tables, and to shorter vacuum times in the runs where all tables are vacuumed, which in turn leads to better performance. The settings for individual tables can be queried from the pg_class table as follows:
SELECT relname, reloptions FROM pg_class WHERE relname='person';
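If you later want to undo such per-table overrides, ALTER TABLE ... RESET drops the table back to the global values from postgresql.conf:

```sql
-- Remove the per-table overrides; global autovacuum settings apply again
ALTER TABLE person RESET (autovacuum_vacuum_scale_factor, autovacuum_vacuum_threshold);
```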
A simple test using pg_dump and pg_restore has revealed that settings changed with the ALTER statements above are also preserved in the dump-and-restore process, so you don't have to rerun them for every new instance of your database if you're reading in dumps.