<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0"><channel><title>Chris Bandy | CrunchyData Blog</title>
<atom:link href="https://www.crunchydata.com/blog/author/chris-bandy/rss.xml" rel="self" type="application/rss+xml" />
<link>https://www.crunchydata.com/blog/author/chris-bandy</link>
<image><url>https://www.crunchydata.com/build/_assets/chris-bandy.png-CR2OSJBH.webp</url>
<title>Chris Bandy | CrunchyData Blog</title>
<link>https://www.crunchydata.com/blog/author/chris-bandy</link>
<width>800</width>
<height>800</height></image>
<description>PostgreSQL experts from Crunchy Data share advice, performance tips, and guides on successfully running PostgreSQL and Kubernetes solutions</description>
<language>en-us</language>
<pubDate>Tue, 03 Oct 2023 09:00:00 EDT</pubDate>
<dc:date>2023-10-03T13:00:00.000Z</dc:date>
<dc:language>en-us</dc:language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item><title><![CDATA[ Huge Pages and Postgres in Containers ]]></title>
<link>https://www.crunchydata.com/blog/huge-pages-and-postgres-in-containers</link>
<description><![CDATA[ We love it when we get to work with the open source community! We recently helped with patches to OCI and Postgres 16 to make huge pages work with Postgres in containers and Kubernetes. ]]></description>
<content:encoded><![CDATA[ <p>We recently participated in a community solution for using huge pages when you’re running Postgres in containers or with <a href=https://www.crunchydata.com/products/crunchy-postgresql-for-kubernetes>Crunchy Postgres for Kubernetes</a>. We worked on a patch to the underlying OCI (Open Container Initiative) runtime specification with our partner Red Hat and also worked on a patch for Postgres 16. For those of you using huge pages or running in containers, we have some additional notes on our solution in this write-up. We’re really proud of the improvements we’ve made because they help Postgres, Kubernetes, <em>and</em> every container runtime!<h3 id=background-on-huge-pages><a href=#background-on-huge-pages>Background on Huge Pages</a></h3><p>CPUs translate virtual memory addresses to physical addresses in chunks called “pages.” Pages are typically 4 KB each, but nearly all CPU architectures provide a way to use larger sizes, often 2 MB or 1 GB. Those larger pages are called “huge pages” in Linux and are more efficient when using lots of memory because the CPU has fewer translations to track. Huge pages can improve Postgres performance and protect Postgres background processes from the <a href=https://www.crunchydata.com/blog/deep-postgresql-thoughts-the-linux-assassin><dfn>Out Of Memory</dfn> (<abbr>OOM</abbr>) manager</a>. Anyone who adjusts Postgres <code>shared_buffers</code> should consider tuning their system’s huge pages.<p>Because huge pages are so great, Crunchy Postgres for Kubernetes makes them super easy to use in the <code>resources</code> portion of the PostgresCluster YAML. The following example starts Postgres with 10 gigs of memory, 2 of which are huge pages. Kubernetes finds a machine the right size, and Postgres uses what’s available.<pre><code class=language-yaml>apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: pg
spec:
  postgresVersion: 14
  backups:
    pgbackrest:
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes: [ReadWriteOnce]
              resources: { requests: { storage: 1Gi } }
  instances:
    - dataVolumeClaimSpec:
        accessModes: [ReadWriteOnce]
        resources: { requests: { storage: 1Gi } }
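      resources:
        requests:
          cpu: 2
          memory: 8Gi # regular memory; with the huge pages below, 10Gi in total
        limits:
          # 2Gi of 2 MB huge pages; Kubernetes defaults the matching request to this limit
          hugepages-2Mi: 2Gi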
</code></pre><h3 id=huge-pages-missing-in-container-runtimes><a href=#huge-pages-missing-in-container-runtimes>Huge pages missing in container runtimes</a></h3><p>Crunchy Postgres for Kubernetes initially released this feature in 2021. Every once in a while, we would get a report that Postgres could not initialize due to <code>Bus error</code>, indicating that huge pages are to blame. The reporter would change their environment or set <code>huge_pages = off</code> and be satisfied. Earlier this year, we decided to dig in and really identify what was going on.<p>We reproduced the issue and saw that the container’s <code>hugetlb.2MB.limit_in_bytes</code> <a href=https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/hugetlb.html>cgroup</a> matched the Kubernetes <code>hugepages-2Mi</code> limit, but <code>hugetlb.2MB.rsvd.limit_in_bytes</code> did not. That difference explained the error, and setting the latter field made everything work again. (That made sense because the <code>rsvd</code> field was added to the kernel <em>because</em> no one liked the bus errors caused by the other field.)<p>Surely, we thought, something in the tall stack between Kubernetes and Linux was supposed to configure this field. Which one was misbehaving and deserved a bug report? None of them, it turns out! Container runtimes didn’t set this field because v1.0 of the OCI Runtime Specification didn’t mention it at all.<p>Naturally, we were not the first to notice this. We found issues related to this error and these cgroup fields scattered across Postgres, Kubernetes, operators, containers, and container runtimes for years. None, however, were making any progress toward a solution.<p>Odin Ugedal reported the problem to OCI and <a href=https://github.com/opencontainers/runtime-spec/issues/1050>suggested a solution in 2020</a>. Kailun Qin <a href=https://github.com/opencontainers/runtime-spec/pull/1116>submitted a patch in 2021</a>. 
We reviewed this patch and enlisted the help of our partner, Red Hat, to get it merged. It is now part of <a href=https://opencontainers.org/posts/blog/2023-07-21-oci-runtime-spec-v1-1/>OCI Runtime Specification v1.1</a>, released in July 2023. Look for it to be in container runtimes soon! 🎉<h3 id=postgres-16-allows-huge-pages-during-initialization><a href=#postgres-16-allows-huge-pages-during-initialization>Postgres 16 allows huge pages during initialization</a></h3><p>While we were working on OCI, David Angel, a PGO user, <a href=https://www.postgresql.org/message-id/flat/17757-dbdfc1f1c954a6db@postgresql.org>engaged with the pgsql-bugs mailing list</a> after struggling to avoid the issue on a system with huge pages. As a result, our own Tom Lane added a feature to <a href=https://www.postgresql.org/docs/16/release-16.html>Postgres 16</a> allowing server variables to be set using initdb. With that, running initdb with <code>--set huge_pages=off</code> works on any system where huge pages are broken for any reason.<h3 id=better-huge-pages-in-the-future-for-everyone><a href=#better-huge-pages-in-the-future-for-everyone>Better huge pages in the future for everyone</a></h3><p>The two changes above are proper long-term solutions, but they’ll take time to make their way into your environments. In the meantime, Crunchy Postgres for Kubernetes has implemented <a href=https://access.crunchydata.com/documentation/postgres-operator/latest/guides/huge-pages/>workarounds</a> that will keep you running smoothly with huge pages. ]]></content:encoded>
<category><![CDATA[ Kubernetes ]]></category>
<author><![CDATA[ Chris.Bandy@crunchydata.com (Chris Bandy) ]]></author>
<dc:creator><![CDATA[ Chris Bandy ]]></dc:creator>
<guid isPermaLink="false">23463f827f2ee3122be587e05d247946d2a1fb81ae0ccbbd85f94f127ca5903b</guid>
<pubDate>Tue, 03 Oct 2023 09:00:00 EDT</pubDate>
<dc:date>2023-10-03T13:00:00.000Z</dc:date>
<atom:updated>2023-10-03T13:00:00.000Z</atom:updated></item></channel></rss>