<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0"><channel><title>Tom Swartz | CrunchyData Blog</title>
<atom:link href="https://www.crunchydata.com/blog/author/tom-swartz/rss.xml" rel="self" type="application/rss+xml" />
<link>https://www.crunchydata.com/blog/author/tom-swartz</link>
<image><url>https://www.crunchydata.com/build/_assets/default.png-W4XGD4DB.webp</url>
<title>Tom Swartz | CrunchyData Blog</title>
<link>https://www.crunchydata.com/blog/author/tom-swartz</link>
<width>256</width>
<height>256</height></image>
<description>PostgreSQL experts from Crunchy Data share advice, performance tips, and guides on successfully running PostgreSQL and Kubernetes solutions</description>
<language>en-us</language>
<pubDate>Wed, 14 Oct 2020 05:00:00 EDT</pubDate>
<dc:date>2020-10-14T09:00:00.000Z</dc:date>
<dc:language>en-us</dc:language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item><title><![CDATA[ Tuning Your Postgres Database for High Write Loads ]]></title>
<link>https://www.crunchydata.com/blog/tuning-your-postgres-database-for-high-write-loads</link>
<description><![CDATA[ As a database grows and scales up from a proof of concept to a full-fledged production instance, there are always a variety of growing pains that database administrators and systems administrators will run into. ]]></description>
<content:encoded><![CDATA[ <p>As a database grows and scales up from a proof of concept to a full-fledged production instance, there are always a variety of growing pains that database administrators and systems administrators will run into.<p>Very often, the <a href=https://www.crunchydata.com/solutions/enterprise-postgresql-support>engineers on the Crunchy Data support team</a> help support enterprise projects which start out as small, proof of concept systems, and are then promoted to large scale production uses. As these systems receive increased traffic load beyond their original proof-of-concept sizes, one issue may be observed in the Postgres logs as the following:<pre><code class=language-txt>LOG:  checkpoints are occurring too frequently (9 seconds apart)
HINT:  Consider increasing the configuration parameter "max_wal_size".
LOG:  checkpoints are occurring too frequently (2 seconds apart)
HINT:  Consider increasing the configuration parameter "max_wal_size".
</code></pre><p>This is a classic example of a database which has not been properly tuned for a high write load. In this post, we'll discuss what this means, some possible causes for this error, and some relatively simple ways to resolve the issue.<h2 id=systems-settings><a href=#systems-settings>Systems Settings</a></h2><p>First, a look at the system settings and a brief discussion about what this error means.<p>The Postgres logs mentioned two specific things, checkpoints and max_wal_size. Investigating the Postgres instance to observe any settings related to these two items, we see the following:<pre><code class=language-pgsql>select name, setting from pg_settings where name like '%wal_size%' or name like '%checkpoint%' order by name;
            name             |  setting
------------------------------+-----------
 checkpoint_completion_target | 0.9
 checkpoint_flush_after       | 32
 checkpoint_timeout           | 300
 checkpoint_warning           | 30
 log_checkpoints              | off
 max_wal_size                 | 1024
 min_wal_size                 | 80
(7 rows)
</code></pre><p><a href=https://www.postgresql.org/docs/13/runtime-config-wal.html#GUC-MAX-WAL-SIZE>max_wal_size</a> sets the maximum size that the <dfn>Write-Ahead Log</dfn> (<abbr>WAL</abbr>) is allowed to grow to between automatic checkpoints. This is a soft limit; WAL size can exceed max_wal_size under special circumstances, such as heavy load, a failing <a href=https://www.postgresql.org/docs/13/runtime-config-wal.html#GUC-ARCHIVE-COMMAND>archive_command</a>, or a high <a href=https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-WAL-KEEP-SEGMENTS>wal_keep_segments</a> setting.<p>It should also be noted that increasing this parameter can increase the amount of time needed for crash recovery. The default value is 1GB (1024 MB).<p>As discussed in <a href=/blog/optimize-postgresql-server-performance>previous posts</a>, the default configuration values for PostgreSQL are typically conservative, so that they work as well on a large server as on a small, resource-constrained development machine. Because of this, it's likely that the default value observed here for max_wal_size is too low for the system generating the error messages we've seen.<h2 id=identifying-the-issue><a href=#identifying-the-issue>Identifying the Issue</a></h2><p>Next, let's look at why this low value for max_wal_size might be related to the cause of the issue.<p>The exact cause will vary from one situation to another, but generally speaking, when max_wal_size is low and the database has a high number of updates or inserts happening quickly, it will tend to generate WAL faster than it can be archived, and faster than standard checkpoint processes can keep up.<p>As a result, if you have disk usage monitoring on your Postgres instance (you should!) 
you may also observe that the pg_wal directory increases in size dramatically as these WAL files are retained.<hr><p>A brief aside:<p>There's a partner parameter for max_wal_size, which is its opposite: <a href=https://www.postgresql.org/docs/13/runtime-config-wal.html#GUC-MIN-WAL-SIZE>min_wal_size</a>. This parameter defines the minimum size to which the WAL is allowed to shrink. As long as WAL disk usage stays below this setting, old WAL files are always recycled for future use at a checkpoint, rather than removed. This is useful to ensure that enough WAL space is reserved to handle spikes in WAL usage, for example when running large batch jobs. The default value for this is 80 MB.<hr><h2 id=how-to-resolve><a href=#how-to-resolve>How to Resolve</a></h2><p>PostgreSQL helpfully informs us in the log file specifically what should be done: increase max_wal_size.<p>So, as suggested, edit the instance configuration files to increase the max_wal_size value to match the system's workload.<p>For most use cases, the ideal value for max_wal_size is one large enough to hold at least one hour's worth of WAL. The caveat here, however, is that you do not want to set this value extremely high, as it can increase the amount of time needed for crash recovery. If desired, min_wal_size can also be increased, so that the system can handle spikes in WAL usage during batch jobs and other unusual circumstances. After making the appropriate configuration changes and reloading Postgres, we can validate that the new settings have been applied as expected:<pre><code class=language-pgsql>            name             |  setting
------------------------------+-----------
 checkpoint_completion_target | 0.9
 checkpoint_flush_after       | 32
 checkpoint_timeout           | 300
 checkpoint_warning           | 30
 log_checkpoints              | off
 max_wal_size                 | 16384
 min_wal_size                 | 4096
(7 rows)
</code></pre><p>With these new settings in place, and with careful monitoring of the log files and system usage, the growing pains of scaling a system such as this up from a development device to a full-fledged production instance will be all but a distant memory.<p>For more information, and some interactive workshops on configuring PostgreSQL settings, please visit the <a href=https://learn.crunchydata.com/pg-administration/courses/basic-postgresql-for-dbas-11/setting-up-pg/>Crunchy Postgres Developer Portal</a>. ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Tom.Swartz@crunchydata.com (Tom Swartz) ]]></author>
<dc:creator><![CDATA[ Tom Swartz ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/tuning-your-postgres-database-for-high-write-loads</guid>
<pubDate>Wed, 14 Oct 2020 05:00:00 EDT</pubDate>
<dc:date>2020-10-14T09:00:00.000Z</dc:date>
<atom:updated>2020-10-14T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Optimize PostgreSQL Server Performance Through Configuration ]]></title>
<link>https://www.crunchydata.com/blog/optimize-postgresql-server-performance</link>
<description><![CDATA[ Learn the top 5 settings to tune once an install of PostgreSQL is completed to optimize performance. ]]></description>
<content:encoded><![CDATA[ <p>By design, the out of the box configuration for <a href=https://www.postgresql.org/>PostgreSQL</a> is defined to be a "Jack of All Trades, Master of None". The default configuration for PostgreSQL is fairly painstakingly chosen to ensure that it will run in every environment where it is installed, meeting the lowest common denominator resources across most platforms.<p>Because of this, one of the first actions recommended after installing PostgreSQL is to tune and configure some high-level settings.<p>There are five high-level settings which will be discussed here: <code>shared_buffers</code>, <code>wal_buffers</code>, <code>effective_cache_size</code>, <code>work_mem</code>, and <code>maintenance_work_mem</code>.<p>Let's begin with <code>shared_buffers</code>.<h2 id=shared_buffers><a href=#shared_buffers><code>shared_buffers</code></a></h2><p>PostgreSQL uses 'double buffering', meaning that PostgreSQL uses its own internal buffer as well as kernel buffered IO. In short, this means that data is stored in memory twice.<p>The PostgreSQL buffer is named <a href=https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-SHARED-BUFFERS><code>shared_buffers</code></a> and it defines how much dedicated system memory PostgreSQL will use for cache.<p>Because of PostgreSQL's design choice to ensure compatibility on all supported machines and operating systems, this value is set conservatively low by default. As such, updating <code>shared_buffers</code> is one of the most effective ways to improve overall performance on most modern operating systems.<p>There is not one specific recommended value for <code>shared_buffers</code>, but the calculation to determine the value for a particular system is not especially difficult.<p>Generally speaking, the value for <code>shared_buffers</code> should be roughly 25% of the total system RAM for a dedicated DB server. 
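<p>As a rough illustration of the 25% guideline, the sketch below assumes a hypothetical dedicated server with 16 GB of RAM. The figure and the variable names are made up for this example, and the result is only a starting point to benchmark against:<pre><code class=language-shell># Hypothetical figure: total RAM on a dedicated database server, in MB.
total_ram_mb=16384
# Apply the 25% rule of thumb using integer arithmetic.
shared_buffers_mb=$(( total_ram_mb / 4 ))
# Print a line in postgresql.conf form.
echo "shared_buffers = ${shared_buffers_mb}MB"
</code></pre><p>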
The value for <code>shared_buffers</code> should never be set to reserve all of the system RAM for PostgreSQL. A value over 25% of the system RAM can be useful if, for example, it is set such that the entire database working set of data can fit in cache, as this would greatly reduce the amount of time reading from disk.<p>Alternately, while a larger <code>shared_buffers</code> value can increase performance in 'read heavy' use cases, having a large <code>shared_buffers</code> value can be detrimental for 'write heavy' use cases, as the entire contents of <code>shared_buffers</code> must be processed during writes.<h2 id=wal_buffers><a href=#wal_buffers><code>wal_buffers</code></a></h2><p><dfn>Write-Ahead Logging</dfn> (<abbr>WAL</abbr>) is a standard method for ensuring integrity of data. Much like in the <code>shared_buffers</code> setting, PostgreSQL writes WAL records into buffers and then these buffers are flushed to disk.<p>The size of this buffer is set by the <a href=https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-BUFFERS><code>wal_buffers</code></a> setting. By default it is sized automatically at 1/32 of <code>shared_buffers</code>, up to a maximum of 16MB. 
If the system being tuned has a large number of concurrent connections, then a higher value for <code>wal_buffers</code> can provide better performance.<h2 id=effective_cache_size><a href=#effective_cache_size><code>effective_cache_size</code></a></h2><p><a href=https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE><code>effective_cache_size</code></a> has the reputation of being a confusing PostgreSQL setting, and as such, it is often left at its default value.<p>The <code>effective_cache_size</code> value provides a 'rough estimate' of how much memory is available for disk caching by the operating system and within the database itself, after taking into account what's used by the OS itself and other applications.<p>This value is used only by the PostgreSQL query planner to figure out whether plans it's considering would be expected to fit in RAM or not. As such, it's a bit of a fuzzy number to define for general use cases.<p>A conservative value for <code>effective_cache_size</code> would be 1/2 of the total memory available on the system. 
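<p>To make these proportions concrete, here is a small sketch for a hypothetical dedicated server with 16 GB of RAM. It compares the conservative half-of-memory figure with the 75% figure often used on dedicated servers. The figure and the variable names are illustrative only:<pre><code class=language-shell># Hypothetical figure: total RAM on the server, in MB.
total_ram_mb=16384
# Conservative estimate: half of total memory.
conservative_mb=$(( total_ram_mb / 2 ))
# Estimate often used on a dedicated database server: 75% of total memory.
typical_mb=$(( total_ram_mb * 3 / 4 ))
echo "effective_cache_size (conservative) = ${conservative_mb}MB"
echo "effective_cache_size (dedicated server) = ${typical_mb}MB"
</code></pre><p>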
Most commonly, the value is set to 75% of the total system memory on a dedicated DB server, but it can vary depending on the specific needs of a particular server workload.<p>If the value for <code>effective_cache_size</code> is too low, then the query planner may decide not to use some indexes, even if they would greatly increase query speed.<h2 id=work_mem><a href=#work_mem><code>work_mem</code></a></h2><p>The value of <a href=https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-WORK-MEM><code>work_mem</code></a> is used for complex sort operations, and defines the maximum amount of memory to be used for intermediate results, such as hash tables, and for sorting.<p>When the value for <code>work_mem</code> is properly tuned, the majority of sort actions are performed in much faster memory, rather than being written to and read from disk.<p>However, it's important to ensure that the <code>work_mem</code> value is not set too high, as it can 'bottleneck' the available memory on the system as the application performs sort operations. The system may allocate up to <code>work_mem</code> for each sort or hash operation, so a single complex query, or many concurrent sessions, can use several multiples of this value.<p>Because of this important caveat, it's ideal to set the global value for <code>work_mem</code> relatively low, and then raise it only for the specific queries that need it. Note that <code>SET LOCAL</code> only takes effect inside a transaction and lasts until that transaction ends:<pre><code class=language-pgsql>BEGIN;
SET LOCAL work_mem = '256MB';
SELECT * FROM db ORDER BY LOWER(name);
COMMIT;
</code></pre><h2 id=maintenance_work_mem><a href=#maintenance_work_mem><code>maintenance_work_mem</code></a></h2><p>While <code>work_mem</code> specifies how much memory is used for complex sort operations, <a href=https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM><code>maintenance_work_mem</code></a> specifies how much memory is used for routine maintenance tasks, such as VACUUM, CREATE INDEX, and similar.<p>Unlike <code>work_mem</code>, however, only one of these maintenance operations can be executed at a time by a database session. As a result, most systems do not have many of these processes running concurrently, so it's typically safe to set this value much larger than <code>work_mem</code>, as the larger amounts of available memory could improve the performance of vacuuming and database dump restores.<p>The default value for <code>maintenance_work_mem</code> is 64MB.<h2 id=wrapping-up><a href=#wrapping-up>Wrapping Up</a></h2><p>Using a tool such as <a href=https://pgtune.leopard.in.ua/#/>https://pgtune.leopard.in.ua/#/</a> to craft an initial configuration is worthwhile, but the key to getting the absolute best performance is benchmarking your workload and comparing against a known baseline.<p>It's also important to remember that even the most well-tuned database cannot salvage poorly formed queries. Developers creating applications which interface with the database need to be mindful of how queries are written.<p>If a query performs heavy joins or other expensive aggregate operations, or if a query is performing a full table scan where an index could be used, it will nearly always perform poorly, no matter how well the database settings are tuned.<p>We hope that the brief explanations above provide enough insight to allow you to go forth and tune your PostgreSQL installs! 
We're also here to help with <a href=https://www.crunchydata.com/solutions/enterprise-postgresql-support>PostgreSQL support</a> and to help troubleshoot any PostgreSQL performance issues you may come across. ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Tom.Swartz@crunchydata.com (Tom Swartz) ]]></author>
<dc:creator><![CDATA[ Tom Swartz ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/optimize-postgresql-server-performance</guid>
<pubDate>Tue, 07 Apr 2020 05:00:00 EDT</pubDate>
<dc:date>2020-04-07T09:00:00.000Z</dc:date>
<atom:updated>2020-04-07T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ How To Get Started with pgBackRest and PostgreSQL 12 ]]></title>
<link>https://www.crunchydata.com/blog/how-to-get-started-with-pgbackrest-and-postgresql-12</link>
<description><![CDATA[ With the recent release of PostgreSQL 12, pgBackRest also received a number of updates and changes to take advantage of the latest features of Postgres. ]]></description>
<content:encoded><![CDATA[ <p>pgBackRest is a reliable, simple-to-configure backup and restore solution for PostgreSQL that suits any PostgreSQL database, be it a small project or an enterprise-level deployment.<p>Many powerful features are included in pgBackRest, including parallel backup and restore, local or remote operation, full, incremental, and differential backup types, backup rotation, archive expiration, backup integrity, page checksums, backup resume, streaming compression and checksums, delta restore, and much more.<p>With the recent release of PostgreSQL 12 (and more recently 12.1), pgBackRest also received a number of updates and changes to take advantage of the latest features of Postgres.<p>On October 1st 2019, pgBackRest released version 2.18, which is the first release of pgBackRest to support PostgreSQL 12. As such, <strong>any deployment using PostgreSQL 12 where pgBackRest will be used requires version 2.18 or greater.</strong> At the time of this post, the latest version of pgBackRest is version 2.19.<p>In the following guide, we will explore the steps involved in configuring pgBackRest on a PostgreSQL 12 database, followed by simulating a disaster where the database files have been destroyed, and restoring a backup to regain the database.<h2 id=setting-up-the-demo><a href=#setting-up-the-demo>Setting up the Demo</a></h2><p>If following this guide for tutorial purposes, it is useful to have a similar working environment to ensure that the same settings, commands, and processes are used while performing the exercise.<p>For this example, we will be performing the install and configuration of PostgreSQL 12.1 and pgBackRest 2.19 on CentOS 7.<p>For simplicity, we recommend using <a href=https://vagrantup.com/>Hashicorp's Vagrant</a> to start and manage the official CentOS 7 <a href=https://app.vagrantup.com/centos/boxes/7>Vagrant Box</a>.<p>If using Vagrant, simply run:<pre><code class=language-shell>cd temp_work_dir
vagrant init centos/7
vagrant up
vagrant ssh
</code></pre><p>Alternately, you can set up and configure a virtual machine manually, using the following<br>settings:<ul><li>12 GB single partition ext4 disk<li>2 GB RAM<li>2 CPUs</ul><h2 id=install-postgresql-and-pgbackrest><a href=#install-postgresql-and-pgbackrest>Install PostgreSQL and pgBackRest</a></h2><p>There are several methods for installing PostgreSQL on a CentOS 7 server, which<br>are detailed on the <a href=https://wiki.postgresql.org/wiki/YUM_Installation>PostgreSQL wiki</a>.<p>For the purposes of this guide, we will install both PostgreSQL 12 and the latest<br>version of pgBackRest using the PGDG Yum repository.<p>At the time of this writing, the latest version of PostgreSQL was 12.1 and<br>the latest version of pgBackRest was 2.19.<pre><code class=language-shell>sudo yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum -y install postgresql12-server postgresql12-contrib pgbackrest
</code></pre><p>Next, initialize the PostgreSQL instance with the following commands:<pre><code class=language-shell>sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
sudo systemctl enable postgresql-12.service
sudo systemctl start postgresql-12.service
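# Optional check, not part of the original steps: report whether the
# service is currently running (prints "active" when it is).
sudo systemctl is-active postgresql-12.service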
</code></pre><p>The first command only needs to be run once, and is responsible for<br>initializing the database in the <code>$PGDATA</code> directory.<br>The second command ensures PostgreSQL will start automatically when the<br>operating system is started, and the third command starts the database<br>immediately.<p>To verify that the expected version of PostgreSQL was installed, switch to the <code>postgres</code> user and check the client version:<pre><code class=language-shell>sudo -iu postgres
psql --version
psql (PostgreSQL) 12.1
</code></pre><p>Next, verify pgBackRest was installed correctly by running the following command<br>either as the default user, or as the <code>postgres</code> user:<pre><code class=language-shell>$ pgbackrest
pgBackRest 2.19 - General help
Usage:
    pgbackrest [options] [command]

Commands:
     archive-get     Get a WAL segment from the archive.
     archive-push    Push a WAL segment to the archive.
     backup          Backup a database cluster.
     check           Check the configuration.
     expire          Expire backups that exceed retention.
     help            Get help.
     info            Retrieve information about backups.
     restore         Restore a database cluster.
     stanza-create   Create the required stanza data.
     stanza-delete   Delete a stanza.
     stanza-upgrade  Upgrade a stanza.
     start           Allow pgBackRest processes to run.
     stop            Stop pgBackRest processes from running.
     version         Get version.

 Use 'pgbackrest help [command]' for more information.
</code></pre><p>Ensuring that the basic commands return valid responses and expected versions<br>is generally a good practice to follow, as it ensures that the software is functioning<br>properly and that the versions are compatible with one another.<h2 id=configure-postgresql><a href=#configure-postgresql>Configure PostgreSQL</a></h2><p>Depending on your specific use case, you may need to configure PostgreSQL's<br>options to meet your specific environment needs.<p>Possible considerations would be:<ul><li>Using replication or configuring a cluster<li>How your backups will be stored (using the <a href=https://www.backblaze.com/blog/the-3-2-1-backup-strategy/>3-2-1 Method?</a>)<li>Security concerns for your database</ul><p>For the purposes of this exercise, we will be following a very simplistic model<br>which will do best in demonstrating the process, but is not ideal for production<br>environments.<br>If you wish to configure your environment to have replica PostgreSQL instances,<br>for example, further information may be found in <a href=https://www.crunchydata.com/blog/pgbackrest-performing-backups-on-a-standby-cluster>these Crunchy Blog Posts</a>.<p>For the simple purposes of this demonstration, we will configure PostgreSQL<br>with the least number of changes to make a minimum working model.<p>If you’re unaware of where your configuration files are on the PostgreSQL host, you can run the following:<pre><code class=language-shell>$ sudo -iu postgres psql -U postgres -c 'SHOW config_file'
config_file
----------------------------------------
/var/lib/pgsql/12/data/postgresql.conf
(1 row)
</code></pre><p>Edit the<br><a href=https://www.postgresql.org/docs/current/runtime-config.html>postgresql.conf</a><br>file with root privileges in your preferred text editor.<br>The following parameters will need to be defined:<pre><code class=language-ini>listen_addresses = '*'
# Optionally, define the address as the host IP instead:
# listen_addresses = '10.0.1.1'
password_encryption = 'scram-sha-256'
archive_mode = on
</code></pre><hr><h3 id=a-brief-aside-about-configured-postgresql-settings><a href=#a-brief-aside-about-configured-postgresql-settings>A brief aside about configured PostgreSQL settings</a></h3><p>It is always best practice to have an understanding of the configuration changes made to a database.<p>For the purposes of this demonstration, three options were changed from their defaults, and it is important<br>to know why.<p><strong><code>listen_addresses</code>:</strong><br>While PostgreSQL's <code>pg_hba.conf</code> is the file responsible for restricting<br>connections, when <code>listen_addresses</code> is set to <code>*</code> (wildcard), it is possible<br>to discover the open port <code>5432</code> using <code>nmap</code> and learn the database<br>exists, thereby possibly opening the server up for an exploit. Setting it<br>to a specific IP address prevents PostgreSQL from listening on an unintended<br>interface, preventing this potential exploit. More information on this<br>specific attack vector and how to avoid it can be found in this <a href=https://postgresql.verite.pro/blog/2019/02/08/open-instances.html>blog post</a>.<p><strong><code>password_encryption</code>:</strong><br>Starting with the release of PostgreSQL 10, <code>SCRAM-SHA-256 authentication</code> has been available for use, and<br>is used in this example for the explicit purpose of encouraging secure connections to the database.<br>Specifically, this method of authentication prevents password sniffing on<br>untrusted connections and offers support for cryptographically hashing<br>passwords on the server in a secure manner.<br>More detailed information on this authentication method can be found<br><a href=https://www.postgresql.org/docs/current/auth-password.html>here</a>.<p><strong><code>archive_mode</code>:</strong><br>Another addition to recent versions of PostgreSQL, starting with the<br>major release 
for PostgreSQL 10, a change was introduced to reduce the<br>number of configuration edits that were necessary to perform streaming<br>backup and replication (specifically affecting the parameters <code>wal_level</code>,<br><code>max_wal_senders</code>, <code>max_replication_slots</code>, and <code>hot_standby</code>; these are now<br>all set by default). As a result, only <code>archive_mode</code> needs to be enabled explicitly at this stage. The release notes regarding this change can be found<br><a href=https://www.postgresql.org/docs/10/release-10.html>here</a>.<hr><p>Once the changes have been made to the PostgreSQL configuration file, restart<br>the service to allow the changes to take effect.<br>A restart is necessary in this specific case, as <code>listen_addresses</code> and <code>archive_mode</code><br>can only be changed when the PostgreSQL service starts.<pre><code class=language-shell>sudo systemctl restart postgresql-12.service
</code></pre><p>It is possible to check if the configuration values have been correctly<br>applied to the database by running the following command:<pre><code class=language-shell>sudo -iu postgres psql
</code></pre><pre><code class=language-pgsql>SELECT name,setting,context,source FROM pg_settings WHERE NAME IN ('listen_addresses','archive_mode','password_encryption');
        name         |    setting    |  context   |       source
---------------------+---------------+------------+--------------------
 archive_mode        | on            | postmaster | configuration file
 listen_addresses    | *             | postmaster | configuration file
 password_encryption | scram-sha-256 | user       | configuration file
(3 rows)
</code></pre><p>Use <code>\q</code> to exit from the psql prompt.<h2 id=configure-pgbackrest><a href=#configure-pgbackrest>Configure pgBackRest</a></h2><p>Configure a location for the pgBackRest backup repository:<pre><code class=language-shell>sudo mkdir -p /var/lib/pgbackrest
sudo chmod 0750 /var/lib/pgbackrest
sudo chown -R postgres:postgres /var/lib/pgbackrest
</code></pre><p>Configure the location and permissions on the pgbackrest log location:<pre><code class=language-shell>sudo chown -R postgres:postgres /var/log/pgbackrest
</code></pre><p>Next, modify pgBackRest's configuration files to meet the needs of the environment.<p>As best practice, first create a backup of any existing <code>pgbackrest.conf</code> file:<pre><code class=language-shell>sudo cp /etc/pgbackrest.conf /etc/pgbackrest.conf.backup
</code></pre><p>Generate a secure, long, and random passphrase to encrypt the repository:<pre><code class=language-shell>openssl rand -base64 48
</code></pre><p>This generated value will be used as the <code>repo1-cipher-pass</code> option.<br><strong>NOTE:</strong> Once the repository has been configured and the stanza created and<br>checked, the repository encryption settings cannot be changed.<p>Next, edit the <code>pgbackrest.conf</code> file as root, entering the following parameters:<pre><code class=language-ini>[global]
repo1-cipher-pass=uUQsaa7+CCFaqXVagFzNUix3XuLe9e2uqVskqfI6wcKf8BX8y5b+8bL3oimRpV1N
repo1-cipher-type=aes-256-cbc
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
log-level-console=info
log-level-file=debug

[demo]
pg1-path=/var/lib/pgsql/12/data
</code></pre><p>The <code>[global]</code> section defines the location of backups, logging settings, and encryption settings.<br>The <code>[demo]</code> section defines a stanza for the <code>demo</code> backup repository, which we will configure.<p>As with the PostgreSQL settings, best practices encourage an understanding of the configuration options.<br>More information can be found about these configuration options within the <a href=https://pgbackrest.org/configuration.html>pgBackRest Configuration Guide</a>.<p>Finally, initialize the pgBackRest stanza, which contains the definitions for the<br>location, archiving options, backup settings, and other similar configurations<br>for the PostgreSQL database cluster.<br>There is generally one stanza defined for each database cluster that needs to have backups.<br>The <code>stanza-create</code> command must be run on the primary host after <code>pgbackrest.conf</code> has been configured.<pre><code class=language-shell>$ sudo -u postgres pgbackrest --stanza=demo stanza-create

2019-11-15 18:08:57.158 P00      INFO: stanza-create command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=&#60redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --stanza=demo
2019-11-15 18:08:57.609 P00      INFO: stanza-create command end: completed successfully (455ms)
</code></pre><h2 id=pulling-it-all-together-performing-first-backup><a href=#pulling-it-all-together-performing-first-backup>Pulling It All Together, Performing First Backup</a></h2><p>Now that PostgreSQL and pgBackRest have been configured individually, a few final steps must be<br>performed to tie them together and perform the backup process.<p>First, edit the <code>postgresql.conf</code> file once more, and configure the <code>archive_command</code>:<pre><code class=language-ini>archive_command = 'pgbackrest --stanza=demo archive-push %p'
</code></pre><p>This configuration option informs PostgreSQL to use pgBackRest to handle the WAL<br>segments, pushing them immediately to the archive.<p>Following this change to the configuration file, reload the PostgreSQL service:<pre><code class=language-shell>sudo systemctl reload postgresql-12.service
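# Optional check, not part of the original steps: confirm that the
# reloaded server now reports the new archive_command value.
sudo -iu postgres psql -c 'SHOW archive_command'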
</code></pre><p>Next, we will check the cluster with pgBackRest. This validates that pgBackRest and the <code>archive_command</code> setting are both correctly configured and working as expected.<pre><code class=language-shell>$ sudo -iu postgres pgbackrest --stanza=demo check
2019-11-15 18:10:03.637 P00      INFO: check command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=&lt;redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --stanza=demo
2019-11-15 18:10:04.757 P00      INFO: WAL segment 000000010000000000000001 successfully archived to '/var/lib/pgbackrest/archive/demo/12-1/0000000100000000/000000010000000000000001-bddaecf52ba8c3dd83e6157fea6a4dbeb6476010.gz'
2019-11-15 18:10:04.757 P00      INFO: check command end: completed successfully (1120ms)
</code></pre><p>If this command produces any errors, read the output carefully for recommendations on how to resolve the specific issue.<p>Now, after much ado, perform a full backup:<pre><code class=language-shell>$ sudo -u postgres pgbackrest --stanza=demo --type=full backup

2019-11-15 18:10:32.421 P00      INFO: backup command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=&lt;redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --repo1-retention-full=2 --stanza=demo --type=full
2019-11-15 18:10:33.555 P00      INFO: execute non-exclusive pg_start_backup() with label "pgBackRest backup started at 2019-11-15 18:10:32": backup begins after the next regular checkpoint completes
2019-11-15 18:10:33.758 P00      INFO: backup start archive = 000000010000000000000003, lsn = 0/3000028
2019-11-15 18:10:35.930 P01      INFO: backup file /var/lib/pgsql/12/data/base/14187/1255 (632KB, 2%) checksum 60325e5cd07379af0ffe91eea27cfd4f2f07af69
[...]
2019-11-15 18:10:38.818 P00      INFO: full backup size = 24.2MB
2019-11-15 18:10:38.818 P00      INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive
2019-11-15 18:10:38.920 P00      INFO: backup stop archive = 000000010000000000000003, lsn = 0/3000138
2019-11-15 18:10:39.235 P00      INFO: new backup label = 20191115-181032F
2019-11-15 18:10:39.286 P00      INFO: backup command end: completed successfully (6866ms)
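</code></pre><p>The backup label reported in this output encodes when the backup started and what kind of backup it is: a timestamp in <code>YYYYMMDD-HHMMSS</code> form followed by <code>F</code> for a full backup. As a small illustration (a sketch in plain shell; the variable names are hypothetical and the label is simply copied from the output above), the label can be split back into a readable timestamp:<pre><code class=language-shell># Label copied from the "new backup label" line above.
label=20191115-181032F

# Rearrange the fixed-width date and time fields into a readable timestamp.
started=$(printf '%s\n' "$label" | sed 's/^\(....\)\(..\)\(..\)-\(..\)\(..\)\(..\).*$/\1-\2-\3 \4:\5:\6/')

# The 16th character marks the backup type (F = full).
backup_type=$(printf '%s\n' "$label" | cut -c16)

echo "started=$started type=$backup_type"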
</code></pre><p>And finally, confirm the backup is working:<pre><code class=language-shell>$ sudo -u postgres pgbackrest info

stanza: demo
    status: ok
    cipher: aes-256-cbc

    db (current)
        wal archive min/max (12-1): 000000010000000000000003/000000010000000000000003

        full backup: 20191115-181032F
            timestamp start/stop: 2019-11-15 18:10:32 / 2019-11-15 18:10:39
            wal start/stop: 000000010000000000000003 / 000000010000000000000003
            database size: 24.2MB, backup size: 24.2MB
            repository size: 2.9MB, repository backup size: 2.9MB
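</code></pre><p>In this output, <code>database size</code> is the uncompressed size of the cluster, while <code>repository size</code> is the space the backup occupies in the repository after compression. As a quick worked example (a sketch using <code>awk</code>, with illustrative variable names and the figures copied from the output above), the resulting ratio is roughly 8 to 1:<pre><code class=language-shell># Sizes in MB, copied from the "pgbackrest info" output above.
database_mb=24.2
repository_mb=2.9

# Divide the database size by the repository size, rounded to one decimal place.
ratio=$(awk -v db="$database_mb" -v repo="$repository_mb" 'BEGIN { printf "%.1f", db / repo }')

echo "compression ratio: ${ratio}x"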
</code></pre><h2 id=restore-a-backup><a href=#restore-a-backup>Restore a Backup</a></h2><p>Now that a full backup has been taken of a fresh database, it is worth testing a restore from that backup.<p>To do this, stop the PostgreSQL instance and delete its data files, simulating a system administration disaster.<pre><code class=language-shell>sudo systemctl stop postgresql-12.service
sudo find /var/lib/pgsql/12/data -mindepth 1 -delete
</code></pre><p>At this point, trying to start the database will result in a failure:<pre><code class=language-shell>$ sudo systemctl start postgresql-12.service
## THIS WILL FAIL

Job for postgresql-12.service failed because the control process exited with error code. See "systemctl status postgresql-12.service" and "journalctl -xe" for details.
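</code></pre><p>Perform a restore on the database. The <code>--delta</code> option lets pgBackRest restore into an existing data directory, using checksums to replace only the files that differ from the backup:<pre><code class=language-shell>sudo -iu postgres pgbackrest --stanza=demo --delta restore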
</code></pre><p>Once the restore has completed, the database will start as expected:<pre><code class=language-shell>sudo systemctl start postgresql-12.service
</code></pre><p>You can verify that pgBackRest is still working:<pre><code class=language-shell>$ sudo -u postgres pgbackrest --stanza=demo check

2019-11-15 18:13:56.707 P00      INFO: check command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=&lt;redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --stanza=demo
2019-11-15 18:13:57.594 P00      INFO: WAL segment 000000020000000000000005 successfully archived to '/var/lib/pgbackrest/archive/demo/12-1/0000000200000000/000000020000000000000005-bd01dc079338748cd9772a7c324eed0d68d45a9c.gz'
2019-11-15 18:13:57.594 P00      INFO: check command end: completed successfully (887ms)
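</code></pre><p>Note that the WAL segment archived by this check begins with <code>00000002</code>, while the segments archived before the restore began with <code>00000001</code>. PostgreSQL starts a new timeline whenever archive recovery completes, and the first eight hexadecimal digits of a WAL segment file name are the timeline ID; the remaining sixteen digits locate the segment within the WAL stream. As a rough sketch (plain shell; the variable names are illustrative only), the segment name from the output above can be decoded like this:<pre><code class=language-shell># WAL segment name copied from the check output above.
wal=000000020000000000000005

# Each field is 8 hexadecimal digits; convert each one to decimal.
timeline=$((0x$(printf '%s\n' "$wal" | cut -c1-8)))
log=$((0x$(printf '%s\n' "$wal" | cut -c9-16)))
segment=$((0x$(printf '%s\n' "$wal" | cut -c17-24)))

echo "timeline=$timeline log=$log segment=$segment"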
</code></pre><p>After any disaster, it is best practice to follow the restore with a fresh backup:<pre><code class=language-shell>sudo -u postgres pgbackrest --stanza=demo --type=full backup
</code></pre><h2 id=in-conclusion><a href=#in-conclusion>In Conclusion</a></h2><p>pgBackRest offers a large number of possibilities and use cases.<br>It is simple to install, configure, and use, and it simplifies point-in-time recovery through WAL archiving.<p>Knowing that backups are working and valid provides peace of mind should disaster strike. ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ Tom.Swartz@crunchydata.com (Tom Swartz) ]]></author>
<dc:creator><![CDATA[ Tom Swartz ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/how-to-get-started-with-pgbackrest-and-postgresql-12</guid>
<pubDate>Mon, 25 Nov 2019 04:00:00 EST</pubDate>
<dc:date>2019-11-25T09:00:00.000Z</dc:date>
<atom:updated>2019-11-25T09:00:00.000Z</atom:updated></item></channel></rss>