<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0"><channel><title>David Thomas | CrunchyData Blog</title>
<atom:link href="https://www.crunchydata.com/blog/author/david-thomas/rss.xml" rel="self" type="application/rss+xml" />
<link>https://www.crunchydata.com/blog/author/david-thomas</link>
<image><url>https://www.crunchydata.com/build/_assets/default.png-W4XGD4DB.webp</url>
<title>David Thomas | CrunchyData Blog</title>
<link>https://www.crunchydata.com/blog/author/david-thomas</link>
<width>256</width>
<height>256</height></image>
<description>PostgreSQL experts from Crunchy Data share advice, performance tips, and guides on successfully running PostgreSQL and Kubernetes solutions</description>
<language>en-us</language>
<pubDate>Fri, 02 Aug 2019 05:00:00 EDT</pubDate>
<dc:date>2019-08-02T09:00:00.000Z</dc:date>
<dc:language>en-us</dc:language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item><title><![CDATA[ How To Correct and Identify Indexes Affected by the GNU C 2.28 Update ]]></title>
<link>https://www.crunchydata.com/blog/glibc-collations-and-data-corruption</link>
<description><![CDATA[ Version 2.28
(release notes) of the
GNU C library introduces many changes to the collations it provides. Collations
determine how strings are compared and, by default, PostgreSQL uses the operating
system’s collations, which on Linux means glibc. When your operating system
updates to this version of glibc and you aren't using the “C” or “POSIX”
collation, you may encounter some differently ordered indexes. This unexpected
change in the order of indexes will lead to incorrectly ordered query results
and possible data corruption. Currently, the following distributions are
affected: ]]></description>
<content:encoded><![CDATA[ <p>Version 2.28 (<a href="https://savannah.gnu.org/forum/forum.php?forum_id=9205">release notes</a>) of the GNU C library introduces many changes to the collations it provides. Collations determine how strings are compared and, by default, PostgreSQL uses the operating system’s collations, which on Linux means glibc. When your operating system updates to this version of glibc and you aren't using the “C” or “POSIX” collation, you may encounter some differently ordered indexes. This unexpected change in the order of indexes will lead to incorrectly ordered query results and possible data corruption. Currently, the following distributions are affected:<ul><li>Ubuntu 18.10 (cosmic)<li>RHEL/CentOS 8<li>Debian 10 (buster)</ul><h2 id=how-can-a-system-update-cause-index-corruption><a href=#how-can-a-system-update-cause-index-corruption>How can a system update cause index corruption?</a></h2><p>By default, PostgreSQL uses the operating system’s collations (as provided by the GNU C library). Since collations determine how strings are compared, changing them can affect the result of the ORDER BY clause in SELECT statements and the order of the keys in B-tree indexes. Changes to the behavior of ORDER BY can be annoying, but changing the order of an index can lead to incorrect query results and duplicate entries.<p>You may also see rows being stored in the wrong partition if using columns of type text, varchar, char, or citext in the partition key on range-partitioned tables.<h2 id=what-indexes-are-affected><a href=#what-indexes-are-affected>What indexes are affected?</a></h2><p>Only indexes involving columns of type text, varchar, char, or citext are affected. 
Databases or table columns using the “C” or “POSIX” locales and table columns using collations with the ICU provider are not affected.<p>You can identify affected indexes by running the following query in each database:<pre><code class=language-pgsql>SELECT indrelid::regclass::text, indexrelid::regclass::text, collname, pg_get_indexdef(indexrelid)
FROM (SELECT indexrelid, indrelid, indcollation[i] coll FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s
 JOIN pg_collation c ON coll=c.oid
WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX');
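
-- A possible follow-up, offered as a sketch rather than a definitive procedure:
-- assuming every index reported above should be rebuilt, this generates one
-- REINDEX statement per affected index in the current database.
-- The column alias reindex_command is only an illustrative name.
-- Review the generated statements before running them.
SELECT DISTINCT format('REINDEX INDEX %s;', indexrelid::regclass) AS reindex_command
FROM (SELECT indexrelid, indcollation[i] coll FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s
 JOIN pg_collation c ON coll=c.oid
WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX');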
</code></pre><h2 id=how-can-i-correct-this><a href=#how-can-i-correct-this>How can I correct this?</a></h2><p>The simplest method is to rebuild the indexes by running <code>REINDEX INDEX index_name</code> for each affected index. However, this locks out writes to the table and blocks reads that use the index for the duration of the rebuild. Starting with PostgreSQL version 12 (to be released this fall), you can use REINDEX CONCURRENTLY to avoid these limitations. For older versions of PostgreSQL, you can work around these limitations by creating a copy of the index using <code>CREATE INDEX CONCURRENTLY</code>, then dropping the original index as shown below:<pre><code class=language-pgsql>CREATE INDEX CONCURRENTLY myindex2 ON mytable (textcol);
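-- Optional check before dropping the original (a sketch using the example name myindex2):
-- a CREATE INDEX CONCURRENTLY that fails partway leaves an invalid index behind,
-- so it may be worth confirming that indisvalid is true before continuing.
SELECT indisvalid FROM pg_index WHERE indexrelid = 'myindex2'::regclass;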
DROP INDEX myindex;
ALTER INDEX myindex2 RENAME TO myindex;
</code></pre><p>Note that the above will not work for primary keys. Additionally, sufficient storage for both indexes is needed for the duration of the process.<p>You can find more information about the locale data changes here: <a href=https://wiki.postgresql.org/wiki/Locale_data_changes>https://wiki.postgresql.org/wiki/Locale_data_changes</a><p>and the Postgres documentation on managing collations here: <a href=https://www.postgresql.org/docs/10/collation.html#COLLATION-MANAGING>https://www.postgresql.org/docs/10/collation.html#COLLATION-MANAGING</a> ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ David.Thomas@crunchydata.com (David Thomas) ]]></author>
<dc:creator><![CDATA[ David Thomas ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/glibc-collations-and-data-corruption</guid>
<pubDate>Fri, 02 Aug 2019 05:00:00 EDT</pubDate>
<dc:date>2019-08-02T09:00:00.000Z</dc:date>
<atom:updated>2019-08-02T09:00:00.000Z</atom:updated></item>
<item><title><![CDATA[ Performing a Major PostgreSQL Upgrade with pg_dumpall ]]></title>
<link>https://www.crunchydata.com/blog/performing-a-major-postgresql-upgrade-with-pg_dumpall</link>
<description><![CDATA[ For most major upgrades, using a utility such as pg_upgrade or a replication tool such as pglogical will be the best solution. However, if these options are not available, pg_dumpall can be used to safely perform a major upgrade of your PostgreSQL database. ]]></description>
<content:encoded><![CDATA[ <p>For most major upgrades, using a utility such as <a href=https://www.postgresql.org/docs/current/pgupgrade.html>pg_upgrade</a> or a replication tool such as <a href=/blog/upgrading-postgresql-from-9.4-to-10.3-with-pglogical>pglogical</a> will be the best solution. However, if these options are not available, <a href=https://www.postgresql.org/docs/current/app-pg-dumpall.html>pg_dumpall</a> can be used to perform a major upgrade. What follows is a guide on how you can safely upgrade your database to a newer version of PostgreSQL with pg_dumpall.<h2 id=install-and-initialize-new-system><a href=#install-and-initialize-new-system>Install and Initialize New System</a></h2><p>You will first need to install the latest PostgreSQL binaries on the new system (hostname new in this example). Once the binaries are installed, you will need to initialize the new instance with the <a href=https://www.postgresql.org/docs/current/app-initdb.html><code>initdb</code></a> command:<pre><code class=language-shell>initdb -D /path/to/pgdata/
</code></pre><p>The new instance can now be started:<pre><code class=language-shell>pg_ctl -D /path/to/pgdata/ -l logfile start
</code></pre><h2 id=confirm-connectivity><a href=#confirm-connectivity>Confirm Connectivity</a></h2><p>You can confirm the new instance is started and is accessible by running the following on the new instance:<pre><code class=language-shell>psql -U postgres
</code></pre><p>Test connectivity to the old system (hostname old in this example) by running the following on the new instance:<pre><code class=language-shell>psql -h old -U postgres
</code></pre><h2 id=migrate-data><a href=#migrate-data>Migrate Data</a></h2><p>Once connectivity is confirmed, the data can be migrated from old to new with the following command run on the new instance:<pre><code class=language-shell>pg_dumpall -h old -U postgres | psql --single-transaction --no-psqlrc -h new -U postgres
</code></pre><p>By running the restore in a single transaction, if any one command fails the entire migration will be rolled back. This helps maintain data consistency. Check out the documentation for more info on <a href=https://www.postgresql.org/docs/current/app-psql.html>psql</a>.<p>After the command completes, all data has been migrated; however, the configuration from the old system will need to be migrated manually. You will also need to update any application settings to point to the new instance. The old system should be kept until sufficient testing is performed on the new instance. ]]></content:encoded>
<category><![CDATA[ Production Postgres ]]></category>
<author><![CDATA[ David.Thomas@crunchydata.com (David Thomas) ]]></author>
<dc:creator><![CDATA[ David Thomas ]]></dc:creator>
<guid isPermalink="false">https://blog.crunchydata.com/blog/performing-a-major-postgresql-upgrade-with-pg_dumpall</guid>
<pubDate>Mon, 26 Nov 2018 04:00:00 EST</pubDate>
<dc:date>2018-11-26T09:00:00.000Z</dc:date>
<atom:updated>2018-11-26T09:00:00.000Z</atom:updated></item></channel></rss>