Decoded Node: PHP & Perl Web Development, and the Craft of Programming

<h1>vimrc tips (2024-03-03)</h1>
<p>On Debian-family systems, <code>vim.tiny</code> may be providing the <code>vim</code> command, through the <a href="https://wiki.debian.org/DebianAlternatives">alternatives system</a>. If I bring in my dotfiles and haven’t installed a full vim package yet, such as <code>vim-gtk3</code>, then dozens of errors might show up. <code>vim.tiny</code> really <em>does not</em> support many features.</p>
<p>Other times, I run <code>gvim -ZR</code> for quickly checking some code, to get read-only restricted mode. In that case, anything that wants to run a shell command will fail. Restricted mode is also a signal that I don’t trust the files I’m viewing, so I don’t want to process their modelines at all.</p>
<p>To deal with these scenarios, my vimrc is shaped like this (line count heavily reduced for illustration):</p>
<pre><code>set nocompatible ruler laststatus=2 nomodeline modelines=2
if has('eval')
call plug#begin('~/.vim/plugged')
try
call system('true')
Plug 'dense-analysis/ale'
Plug 'mhinz/vim-signify' | set updatetime=150
Plug 'pskpatil/vim-securemodelines'
catch /E145/
endtry
Plug 'editorconfig/editorconfig-vim'
Plug 'luochen1990/rainbow'
Plug 'tpope/vim-sensible'
Plug 'sapphirecat/garden-vim'
Plug 'ekalinin/Dockerfile.vim', { 'for': 'Dockerfile' }
Plug 'rhysd/vim-gfm-syntax', { 'for': 'md' }
Plug 'wgwoods/vim-systemd-syntax', { 'for': 'service' }
call plug#end()
if !has('gui_running') && exists('&termguicolors')
set termguicolors
endif
let g:rainbow_active=1
colorscheme garden
endif</code></pre>
<p>We start off with the universally-supported settings. Although I use the abbreviated forms in the editor, my vimrc has the full spelling, for self-documentation.</p>
<p>Next is the feature detection of <code>if has('eval') … endif</code>. This ensures that <code>vim.tiny</code> doesn’t process the block. Sadly, inverting the test and using the <code>finish</code> command inside didn’t work.</p>
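<p>For reference, the inverted shape would have looked like the sketch below. My guess as to why it fails: <code>vim.tiny</code> skips the whole <code>if</code> block, <code>finish</code> included, and then still chokes on the rest of the file.</p>
<pre><code>" inverted form: did NOT work for me under vim.tiny
if !has('eval')
  finish
endif</code></pre>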
<p>If we have full vim, we start loading plugins, with a try-catch for restricted mode. If we can’t run the <code>true</code> shell command, due to E145, we cancel the error and proceed without the plugins that need an unrestricted shell. Otherwise, ALE and signify would <em>load</em> in restricted mode, but throw errors as soon as we opened files.</p>
<p>After that, it’s pretty straightforward; we’re running in a full vim, loading things that can run in restricted mode. When the plugins are over, we finish by configuring and activating the ones that need it.</p>
<h1>My Issues with Libvirt / Why I Kept VirtualBox (2024-02-02)</h1>
<p>At work, we use <a href="https://www.virtualbox.org/">VirtualBox</a> to distribute and run development machines. The primary reasons for this are:</p>
<ol>
<li>It is free (gratis), at least the portions we require</li>
<li>It has import/export</li>
</ol>
<p>However, it isn’t developed in the open, and it has a worrying tendency to print sanitizer warnings on the console when I shut down my laptop.</p>
<p>Can I replace it with kvm/libvirt/virt-manager? Let’s try!</p>
<h2>Import?</h2>
<p>There’s no import function. There’s not even a <em>vendor-specific</em> way to import/export for other <code>qemu-system-x86</code> instances. As far as I know, the best we can do is dump the libvirt XML defining the machine, and the disk image… and those bake in absolute paths.</p>
<p>For instance, an unprivileged guest will store in <code>$HOME</code>, so an image produced by user foo will have a path like <code>/home/foo/.local/share/libvirt/images/vm-1.qcow2</code> stored into it. Anyone who wants to import it, who is not named “foo”, will have to edit the file.</p>
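<p>The manual fix is not hard, just tedious. A rough sketch, where the VM name, user paths, and session URI are placeholders rather than anything from a real setup:</p>
<pre><code># dump the definition, rewrite the baked-in path, and re-define it
virsh -c qemu:///session dumpxml vm-1 > vm-1.xml
sed -i 's|/home/foo/|/home/bar/|g' vm-1.xml
virsh -c qemu:///session define vm-1.xml
# (the qcow2 itself still has to be copied to the new location)</code></pre>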
<p>In VirtualBox, on the other hand, I can export an appliance as an OVA file that my colleagues can import without changes. The import process takes care of unpacking the disk inside to VirtualBox’s storage path, so we don’t have to know the details of that, either.</p>
<p>To truly replace VirtualBox, libvirt/kvm/qemu would <em>need</em> that level of frictionless exchange.</p>
<h2>Static IP?</h2>
<p>This particular VirtualBox guest has a second NIC on a host-only network, where we can access the Web server by a predetermined IP using standard ports, without any NAT rules. We can then forward traffic there with <a href="https://github.com/sapphirecat/devproxy2">devproxy</a>. (With Let’s Encrypt providing a TLS certificate, and a proxy manager in the browser, we can use the site’s <em>real</em> domain name to access the VM instead. This is much less error-prone than having host-specific configuration settings.)</p>
<p>It looks like I should create an isolated network for the equivalent in kvm, but limited users can’t do that. <code>virbr1</code> can’t be created. I was also unsuccessful at finding a way to control the IP address assignment done by usermode networking. Even if I could, I don’t think I could directly access listening ports on the guest.</p>
<p>I want to avoid giving my limited user full rights to fully reconfigure the entire network just to get a guest to have a static IP, but I don’t think it’s possible.</p>
<h2>virtio-fs?</h2>
<p>To avoid having to run <a href="https://github.com/bcpierce00/unison">file sync</a> all the time, and to have access to the files on the host without needing the VM running, this guest mounts the code via NFS. (To have the correct Unix permissions, it does not use VirtualBox shared folders.)</p>
<p>kvm supports virtio-fs as a replacement for NFS, 9p, etc. It is specifically meant to improve the performance over any networked file system by removing the network stack traversals.</p>
<p>Unfortunately, for limited users, it’s not usable yet. A patch to allow unprivileged use of virtio-fs was landed upstream in mid-December 2023, but it’s unclear whether this is soon enough that it will be included in Ubuntu 24.04 LTS. If not, I may not have this feature available until 2026.</p>
<h2>Just be privileged?</h2>
<p>I want my account for daily usage to be as isolated as possible from root. There is no ssh daemon running; what <code>sudo</code> permissions exist allow running bounce scripts in /usr/local/sbin, that carefully limit the real operations.</p>
<p>Meanwhile, the libvirt documentation states, “A read-write connection to daemons in system mode typically implies privileges equivalent to having a root shell.” Using privileged mode clearly means the account is no longer isolated.</p>
<p>A good compromise would be a setup where launching <code>virt-manager.desktop</code> prompts for my (or a custom) password, to gain access to libvirtd and <code>qemu:///system</code> <strong>only</strong> for its process tree. That would at least be better than me and my <em>entire desktop session</em> having the group ambiently available.</p>
<p>I am aware that libvirtd <em>may</em> support authentication, but I haven’t worked on that angle too hard. It’s looking like a lot of weeds over there. There are compile-time options for what’s supported, but no trivial way to find out which of those are used by the binary on the system. The <code>--version</code> switch that often holds such optional information for other programs is of no use with <code>libvirtd</code>.</p>
<h2>Where does this leave me?</h2>
<p>The <strong>most critical</strong> problem is the static IP. If I can’t give the guest a fixed IP that the host can reach it on, I can’t use the web server for testing. I’ve invested quite a bit into avoiding host-specific configuration or having the web app ‘know’ its address; I am not backing down now.</p>
<p>(The problem with using a port-forward and accessing localhost:8443 is that the app still sees itself running on port 443. When it builds a self-redirect, the port is wrong. Moreover, I’d need a TLS certificate for localhost, and everyone would need to trust it.)</p>
<p>I’m not sure how to order the next two issues. Both of them should be solved before I could give an enthusiastic “Yes, I will use this,” but neither of them makes libvirt unusable the way the static IP issue is a complete blocker.</p>
<p>One, import/export should exist. Machine description and disk image(s) in one file to transfer, and no XML editing required to import such a file. This would reach parity with the VirtualBox feature. I don’t think it must be OVA; I would anticipate only using it for libvirt-to-libvirt transfers.</p>
<p>Two, the libvirtd security model should be revised. Either I would like to understand the path to escalate privileges and why it is necessary, or libvirtd should be audited and improved to avoid handing out root implicitly. This would be less relevant if everything else (static IP) worked without privilege. And, noted above, it would be less relevant if the <code>.desktop</code> file could elevate itself instead of every process in my session having <code>libvirt</code> group rights.</p>
<p>The final problem is the state of the documentation. Scattered across several projects (kernel, qemu, libvirt) as well as Red Hat and the Stack Overflow network of sites… I feel like I spent a lot of time on research, and gained much less knowledge than might be hoped for.</p>
<p>Even if upstream makes changes to any of this <em>right now,</em> the results may not be available on LTS distros for years.</p>
<h2>The ancillary problem</h2>
<p>To get started quickly, I converted the VirtualBox disk image, and imported it to a new guest in virt-manager. When I booted, <strong>systemd hung forever</strong> due to the change in bus path for the primary NIC. The guest (Ubuntu 22.04) uses netplan, which baked the VirtualBox device name into the configuration, and that (DHCP/NAT connection) was mandatory.</p>
<p>So, systemd said it was waiting for the network (no timeout), and things sure hadn’t improved after 90 seconds. I used recovery mode to change the file. (Of course, then NFS failed, since it didn’t have the static IP. But at least that had a 90-second timeout.)</p>
<p>This isn’t libvirt’s fault, but it does show the importance of retaining the bus layout through an export/import cycle.</p>
<h2>But what is the performance? We must know!</h2>
<p>Best-of-three numbers for <code>npm run build</code> on a React site I had handy:</p>
<pre><code>Setup          : Real/Wall Time : Normalized
---------------:----------------:-----------
distrobox      : 16.468 sec.    : 1.000 x
KVM virtiofs   : 22.810 sec.    : 1.385 x
VirtualBox NFS : 39.300 sec.    : 2.386 x</code></pre>
<p>I didn’t go to the trouble of installing Node on the bare metal when I already had a distrobox container ready, but I expect <code>distrobox</code> to be <em>extremely close</em> to native performance. There’s no hypervisor and no other filesystem interposed. (The problem with distrobox <strong>is</strong> the lack of isolation from the host. Its main purpose is to make the container seamless, with access to the host’s Wayland/X11, etc.)</p>
<p>virtio-fs <em>would</em> be nice to have. Alas!</p>
<p>I also found that the <code>libvirt</code> group membership doesn’t help assign a static IP. I still didn’t have permission to create the virtual bridge. Alas…</p>
<h1>No More Realtek WiFi (2023-12-31)</h1>
<p>The current Debian kernel (based on 6.1.66, after the ext4 corruption mess) seems to be locking up with <a href="https://github.com/morrownr/88x2bu-20210702">the Realtek USB wireless drivers</a> I use. Anything that wants the IP address (like <code>agetty</code> or <code>ip addr</code>) hangs, as does shutdown. It all works fine on the "old" kernel, which is the last version prior to the ext4 issue.</p>
<p>Meanwhile in Ubuntu 23.10, the in-kernel RTW drivers were flaky and bouncing the connection, so I had returned to morrownr’s driver there, as well. But now that I don’t trust any version of this driver? Forget this company. In the future, I will be using any other option:</p>
<ol>
<li>A Fenvi PCIe WiFi card with an Intel chip on board, or the like</li>
<li>Using an extra router as a wireless client/media bridge, with its Ethernet connected to the PC</li>
<li>If USB <em>were</em> truly necessary, as opposed to simply “convenient,” <a href="https://github.com/morrownr/USB-WiFi/blob/main/home/USB_WiFi_Adapters_that_are_supported_with_Linux_in-kernel_drivers.md">a Mediatek adapter</a></li>
</ol>
<p>Remember that speed testing and studying <code>dmesg</code> output led me to the conclusion that this chipset comes up in USB 2.0 mode, and even the Windows drivers just use it that way. While morrownr’s driver offers the ability to switch it to USB 3.0 mode under Linux, doing so prevents the adapter from connecting properly. I never researched hard enough to find out whether there is a way to make that work, short of warm rebooting <em>again</em> so that it comes up already in USB 3.0 mode.</p>
<p>It’s clearly deficient by design, and adding injury to insult, the drivers aren’t even stable. Awful experience, one star ★☆☆☆☆, would not recommend. Intel or Mediatek are much better choices.</p>
<p><strong>Addendum, 2024-01-13:</strong> I purchased an AX200-based Fenvi card, the FV-AXE3000Pro. It seemed not to work at all. In Windows it would fail to start with error code 10, and in Linux it would fail to load RT ucode with error -110. And then, Linux would report hangs for thermald, and systemd would wait forever for it to shut down. When the timer ran out at 1m30s, it would just kick up to 3m.</p>
<p>Embarrassingly enough, all problems were solved by plugging it into the correct PCIe slot. Apparently, despite being physically compatible, graphics card slots (which already had the punch-outs on my case, um, punched out) are for graphics cards only. (My desktop is sufficiently vintage that it has two PCIe 3.0 x16 slots, one with 16 lanes and one with 4 lanes, and two classic PCI slots between them.)</p>
<p>Result: my WiFi is 93% faster, matching the WAN rate as seen on the Ethernet side of the router. Good riddance, Realtek!</p>

<h1>Diving too deeply into DH Moduli and OpenSSH (2023-12-26)</h1>
<p>tl;dr:</p>
<ul>
<li>Debian/Ubuntu use the <code>/etc/ssh/moduli</code> file as distributed by the OpenSSH project at the time of the distribution’s release</li>
<li>This file is only used for <code>diffie-hellman-group-*</code> KexAlgorithms</li>
<li>The default KEX algorithm on my setup is the post-quantum <code>sntrup761x25519-sha512@openssh.com</code> instead</li>
<li>Therefore, you can generate your own moduli, but it is increasingly irrelevant</li>
<li>Having more moduli listed means that <code>sshd</code> will do more processing during every connection attempt that uses the file</li>
</ul>
<p>There is also a “fallback” behavior if the server can’t read the <code>moduli</code> file or find a match, which I don’t fully understand.</p>
<h3>Intro</h3>
<p>There’s a lot on the internet about Diffie-Hellman, primes, and groups. I will be skipping over most of this, and looking more closely at how OpenSSH in particular handles the Diffie-Hellman group exchange (all <code>diffie-hellman-group-exchange-*</code> KEX algorithms.)</p>
<p>References to the source code and “current behavior” of OpenSSH specifically refer to Portable OpenSSH version 9.3p1. This upstream version was chosen for inclusion in Ubuntu 23.10, Mantic Minotaur, which happens to be my current desktop.</p>
<h3>The Fallback</h3>
<p>My curiosity was initially stoked by <a href="https://github.com/jtesta/ssh-audit/">ssh-audit</a> mentioning the “OpenSSH DH GEX fallback mechanism.” ⚠️ It turns out that if OpenSSH cannot read <code>/etc/ssh/moduli</code> or find an acceptable group in there (between min and max preferences from the client), then it falls back to one of groups 14/16/18 (2048/4096/8192 bits), defined in <a href="https://www.rfc-editor.org/rfc/rfc3526">RFC 3526</a>.</p>
<p>Curiously, the fallback function is passed the client’s <strong>max</strong> preference, but rounds that (possibly <strong>upwards</strong>) to the <em>closest eligible</em> group. Thus, if the mechanism is activated by a client that accepted "2048–7168" bits, OpenSSH would choose the 8192-bit fallback. This happens even though the server is otherwise willing to use the 4096-bit group.</p>
<p>(This code appears in <code>dh.c</code>, functions <code>choose_dh()</code> and <code>dh_new_group_fallback()</code>.)</p>
<p>Either I don’t understand correctly, or it is only a theoretical problem, and not a practical one.</p>
<h3>The Client’s Preferred Size</h3>
<p>There is—probably wisely—no configuration for the length of moduli in the request sent by the client. OpenSSH chooses the lengths to send by hard-coded settings for min and max (currently 2048 and 8192, respectively), and determines its preferred setting by calling a <code>dh_estimate()</code> function.</p>
<p>That function, in turn, converts a “symmetric bit length” to a DH length. This appears to take into account the cipher key size, cipher block length, cipher IV length (except those are mostly defined to 0), and MAC length (except that only UMAC has a non-zero value.)</p>
<p>This yields 3072 for 128-bit, 7680 for 192-bit, and 8192 for 256-bit. The first two numbers correspond to the list of equivalent strengths listed in NIST SP 800-57 Part 1 (currently in Rev. 5, published in 2020.) The last is obviously limited by the maximum length that OpenSSH is willing to use, as the publication suggests a 15360-bit length instead.</p>
<p>It may be possible to get a preferred size of 2048 when using <code>3des-cbc</code>🛑, which is only 112 bits of security. But don’t do that!</p>
<h3>Rereading Moduli</h3>
<p>OpenSSH appears to read through the <code>/etc/ssh/moduli</code> file twice on every request: once to find candidates, and once more to rediscover the candidate that was chosen. Internally, it parses every line (first pass) and every line up to the chosen candidate (second pass), <strong>including</strong> allocating the generator and prime, and converting them from hex to binary. If the candidate disappears from the file between passes, then the fallback mechanism is activated.</p>
<p>I know, worrying about the performance of a once-per-connection operation is a bit much; still, it appears that the moduli file should be <em>as short as is practical,</em> while providing enough moduli/lengths to offer suitable coverage. Likewise, it should contain only lengths that clients are actually willing to use.</p>
<h3>Moduli File Contents</h3>
<p>OpenSSH regularly distributes updated versions of their <code>moduli</code> file. Debian and Ubuntu pull that version of the file into their own release process. This limits the usefulness of precomputing anything based on the moduli; after some time, those moduli are phased out by new OS releases.</p>
<p>OpenSSH provides moduli with lengths of 2048, 3072, 4096, 6144, 7680, and 8192. But from what we learned earlier, it will only really request 3072, 7680, and 8192 itself. The other three sizes are for the benefit of <em>other</em> clients. But I wanted to wrap up my research, so I did not study any of them.</p>
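<p>Since the tl;dr mentions generating your own moduli, the process on a recent OpenSSH looks roughly like this (older releases used the <code>-G</code>/<code>-T</code> options instead of <code>-M</code>; check your <code>ssh-keygen</code> man page):</p>
<pre><code># generate candidate primes, then screen them for safe ones
ssh-keygen -M generate -O bits=3072 moduli-3072.candidates
ssh-keygen -M screen -f moduli-3072.candidates moduli-3072
# the screened output is in the same format as /etc/ssh/moduli</code></pre>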
<h3>The 2048-bit Moduli</h3>
<p><a href="https://www.sshaudit.com/hardening_guides.html">SSH hardening guides</a> recommend removing entries from <code>/etc/ssh/moduli</code> that are less than 3072 bits with commands like:</p>
<pre><code>awk '$5 >= 3071' /etc/ssh/moduli > /etc/ssh/moduli.safe
mv /etc/ssh/moduli.safe /etc/ssh/moduli</code></pre>
<p>(This file lists N-bit primes as N-1 bits, because <em>technically</em> the first bit is always 1. IDK that it really makes a difference, but to keep the 3072-bit primes, the command has to match <code>3071</code> instead.)</p>
<p>Aside from the fallback mechanism, this guarantees that the server won’t be able to find any smaller primes to use, even if the client sends a request like “min 1024, prefer 1024” that would <em>normally</em> select smaller primes.</p>
<h3>The Anticlimactic Conclusion</h3>
<p>After all that, I realized that this was theoretical in my case. Since OpenSSH 8.5 (March 2021), the package has supported the post-quantum <code>sntrup761x25519-sha512@openssh.com</code> KEX algorithm, which does not use classic Diffie-Hellman… and therefore, doesn’t use the <code>moduli</code> file. This appears to have been made the default in OpenSSH 8.9 (February 2022), which was included with Ubuntu 22.04 LTS.</p>
<p>I ended up limiting my SSH client via adding to <code>~/.ssh/config</code>:</p>
<pre><code>Host *
KexAlgorithms sntrup761x25519-sha512@openssh.com
Ciphers aes256-gcm@openssh.com</code></pre>
<p>I have AES-NI on both ends. I’ll add chacha20-poly1305 and/or per-host exceptions, if circumstances change.</p>

<h1>Viewing the Expiration Date of an SSH Certificate (2023-12-13)</h1>
<p>A while ago, I set up my server with SSH certificates (following <a href="https://dev.to/gvelrajan/how-to-configure-and-setup-ssh-certificates-for-ssh-authentication-b52">a guide like this one</a>), and today, I began to wonder: “When do these things expire?”</p>
<h2>Host certificate (option 1)</h2>
<p>Among the output of <code>ssh -v</code> (my client is <code>OpenSSH_9.3p1 Ubuntu-1ubuntu3</code>) is this line about the host certificate:</p>
<p><code>debug1: Server host certificate: ssh-ed25519-cert-v01@openssh.com [...] valid from 2023-04-07T19:58:00 to 2024-04-05T19:59:44</code></p>
<p>That tells us the <em>server host certificate</em> expiration date, where it says “valid … to 2024-04-05.” For our local host to continue trusting the server, without using <code>~/.ssh/known_hosts</code> and the trust-on-first-use (<abbr>TOFU</abbr>) model, we must re-sign the server key and install the new signature before that date.</p>
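<p>Re-signing the host key is a single <code>ssh-keygen</code> call; the names and the validity interval below are illustrative, not copied from my actual scripts:</p>
<pre><code>$ ssh-keygen -s host_ca -I demo.example.org -h \
    -n demo.example.org -V +52w \
    /etc/ssh/ssh_host_ed25519_key.pub
# then install the resulting -cert.pub as HostCertificate in sshd_config</code></pre>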
<h2>User certificate</h2>
<p>I eventually learned that <code>ssh-keygen -L -f id_25519-cert.pub</code> will produce some lovely human-readable output, which includes a line:</p>
<p><code>Valid: from 2023-04-07T20:14:00 to 2023-05-12T20:15:56</code></p>
<p>Aha! I seem to have signed the <em>user</em> for an additional month-ish beyond the host key’s signature. I will be able to log into the server without my key listed in <code>~/.ssh/authorized_keys</code> (on the server) until 2023-05-12.</p>
<p>This looks like a clever protection mechanism left by my past self. As long as I log into my server at least once a month, I'll see an untrusted-host warning <em>before</em> my regular authentication system goes down. (If that happened, I would probably have to use a recovery image and/or the VPS web console to restore service.)</p>
<h2>Host certificate (option 2)</h2>
<p>There’s an <code>ssh-keyscan</code> command, which offers a <code>-c</code> option to print certificates instead of keys. It turns out that we can paste its output to get the certificate validity again. (Lines shown with <code>$</code> or <code>></code> are input, after that prompt; the other lines, including <code>#</code>, are output.)</p>
<pre><code>$ ssh-keyscan -c demo.example.org
# demo.example.org:22 SSH-2.0-OpenSSH_8.9p1
# demo.example.org:22 SSH-2.0-OpenSSH_8.9p1
ssh-ed25519-cert-v01@openssh.com AAAA[.....]mcwo=
# demo.example.org:22 SSH-2.0-OpenSSH_8.9p1</code></pre>
<p>The <code>ssh-ed25519-cert</code> line is the one we need. We can pass it to <code>ssh-keygen</code> with a filename of <code>-</code> to read standard input, then use the shell’s “heredoc” mechanism to provide the standard input:</p>
<pre><code>$ ssh-keygen -L -f - <<EOF
> ssh-ed25519-cert-v01@openssh.com AAAA[.....]mcwo=
> EOF</code></pre>
<p>Now we have the same information as before, but from the host certificate. This includes the <code>Valid: from 2023-04-07T19:58:00 to 2024-04-05T19:59:44</code> line again.</p>
<h2>Tips for setting up certificates</h2>
<p>Document what you did, and <em>remember the passphrases for the CA keys!</em> This is my second setup, and now I have scripts to do the commands with all of my previous choices. They’re merely one-line shell scripts with the ssh-keygen command. But they still effectively record everything like the server name list, identity, validity period, and so forth.</p>
<p>To sign keys for multiple users/servers, it may be convenient to add the CA key to an SSH agent. Start a separate one to keep things extra clean, then specify the signing key slightly differently:</p>
<pre><code>$ ssh-agent $SHELL
$ ssh-add user-ca
$ ssh-keygen -Us user-ca.pub ...
(repeat to sign other keys)
$ exit</code></pre>
<p>Note the addition of <code>-U</code> (specifying the CA key is in an agent) and the use of the <code>.pub</code> suffix (the public half of the key) in the signing process.</p>
<h1>The Logging Tarpit (2023-10-07)</h1>
<p>Back in August, Chris Siebenmann had some thoughts on logging:</p>
<ul>
<li><a href="https://utcc.utoronto.ca/~cks/space/blog/sysadmin/LogMonitoringTarpit">Monitoring your logs is mostly a tarpit</a></li>
<li><a href="https://utcc.utoronto.ca/~cks/space/blog/programming/LogMessagesNoPromises">Programs shouldn't commit to fixed and predictable log messages</a></li>
<li><a href="https://utcc.utoronto.ca/~cks/space/blog/programming/LogMessagesManySources">A program's (effective) log messages can have many sources</a> – not as relevant, but linked for completeness</li>
</ul>
<p>A popular response in the comments was “error numbers solve everything,” possibly along with a list (provided by the vendor) detailing all error numbers.</p>
<p>The first problem is, what if the error number changes? MySQL changed from key <em>index</em> to <em>name</em> in their duplicate-key error, and consequently changed from error code 1062 to 1586. Code or tools that were monitoring for 1062 would never hit a match again. Conversely, if “unknown errors” were being monitored for emergency alerts, the appearance of 1586 might get more attention than it deserves.</p>
<p>In other cases, the error numbers may not capture enough information to provide a useful diagnostic. MySQL code 1586 may tell us that there was a duplicate value for a unique key, but we need the <em>log message</em> to tell us <strong>which</strong> value and key. Unfortunately, that is still missing the schema and table!</p>
<p>Likewise, one day, my Windows 10 PC hit a blue screen, and the only information logged was an error code for <code>machine check 0x3E</code>. The message “clarified” that this was a machine check exception with code <code>3e</code>. No running thread/function, no stack trace, <em>no context.</em></p>
<p>Finally, logging doesn’t always fully capture intent. If a log message is generated, is it <em>because of</em> a problem, or is it operationally irrelevant? Deciding this is the <strong>real</strong> tar pit of log monitoring, and the presence of an error number doesn’t really make a difference to it. There’s no avoiding the decisions.</p>
<p>In the wake of Chris’ posts, I changed one of our hacky workaround services to log a message if it <em>decides to take action</em> to fix the problem. Now we have the opportunity to find out if the service <strong>is</strong> taking action, not simply being started regularly. Would allocating an error number (for all time) help with that?</p>
<p>All of this ends up guiding my log monitoring philosophy: <strong>look at the logs sometimes, and find ways to clear the highest-frequency messages.</strong> I don't want dozens of lines of <em>known uninteresting</em> messages clogging the log during incident response. For example, “can’t load font A, using fallback B.” We’d either install font A properly, or mute the message for font A, specifically. But, I want to avoid trying to categorize every single message, because that way lies madness.</p>
<h1>AWS: Requesting gp3 Volumes in SSM Automation Documents (2023-09-29)</h1>
<p>I updated our EC2 instance-building pipeline to use the <code>gp3</code> volume type, which offers more IOPS at lower costs.</p>
<p>Our initial build runs as an SSM [Systems Manager] Automation Document. The first-stage build instance is launched from an <a href="https://cloud-images.ubuntu.com/locator/ec2">Ubuntu AMI</a> (with <code>gp2</code> storage), and produces a “core” image with our standard platform installed. This includes things like monitoring tools, our language runtime, and so forth. The core image is then used to build final AMIs that are customized to specific applications. That is, the IVR system, Drupal, internal accounting platform, and antique monolith all have separate instances and AMIs underlying them.</p>
<p>Our specific SSM document uses the <code>aws:runInstances</code> action, and one of the optional inputs to it is <strong>BlockDeviceMappings</strong>. Through some trial and error, I found that the value it requires is the same structure as the AWS CLI uses:</p>
<pre><code>- DeviceName: "/dev/sda1"
  Ebs:
    VolumeType: gp3
    Encrypted: true
    DeleteOnTermination: true</code></pre>
<p><strong>Note 1:</strong> this is in YAML format, which requires spaces for indentation. Be sure “Ebs” is indented two spaces, and the subsequent lines four spaces. The structure above is a 1-element array, containing a dictionary with two keys, and the “Ebs” key is another dictionary (with 3 items.)</p>
<p><strong>Note 2:</strong> the DeviceName I am using comes from the Ubuntu AMI that I am using to start the instance. DeviceName <strong>may vary</strong> with different distributions. Check the AMI you are using for its root device setting.</p>
<p>The last two lines (Encrypted and DeleteOnTermination) may be unnecessary, but I don’t like leaving things to chance.</p>
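<p>For context, here is roughly how that mapping sits inside the step in our automation document; the step name, instance type, and parameter reference are illustrative stand-ins:</p>
<pre><code>mainSteps:
  - name: launchBuildInstance
    action: aws:runInstances
    inputs:
      ImageId: "{{ SourceAmiId }}"
      InstanceType: t3.medium
      BlockDeviceMappings:
        - DeviceName: "/dev/sda1"
          Ebs:
            VolumeType: gp3
            Encrypted: true
            DeleteOnTermination: true</code></pre>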
<p>Doing this in a launch template remains a mystery. The best I have been able to do when trying to use a launch template is get Amazon to warn me that it’s planning to ignore the entire volume as described in the template. It appears as if it will replace the volume with the one from the AMI, rather than merging the configurations.</p>
<p>I know I have complained about Amazon in the past for not providing a “launch from template” operation in SSM, but in this case, it appears to have worked out in my favor.</p>
<h1>Using Cloudflare DNS over TLS system-wide on Ubuntu (2023-09-28)</h1>
<p>My current Linux distributions (Ubuntu 23.04 and the Ubuntu-derived Pop!_OS 22.04) use NetworkManager for managing connections, and <code>systemd-resolved</code> for resolving DNS queries. I’ve set up Cloudflare’s public DNS service with DoT (DNS over TLS) support twice… and I don’t really have a solid conclusion. Is one “better?” 🤷🏻</p>
<h2>Contents</h2>
<ul>
<li>Per-connection mode with NetworkManager only</li>
<li>Globally with systemd-resolved / NetworkManager</li>
<li>Useful background info</li>
</ul>
<h2>Per-Connection Mode with NetworkManager</h2>
<p>First, I set the IP addresses <a href="https://developers.cloudflare.com/1.1.1.1/setup/">according to instructions</a> via the NetworkManager GUI, although I could have used the command line as well. The latter would have been like:</p>
<pre><code>$ nmcli c modify id "AirTubes WiFi" \
ipv4.dns 1.1.1.1,1.0.0.1</code></pre>
<p>And similar for the IPv6 addresses with the <code>ipv6.dns</code> setting. The “AirTubes WiFi” is the connection name, visible as the “NAME” column of <code>nmcli c show</code>. Using <code>nmcli c</code> is short for <code>nmcli connection</code>. For brevity, this post will always use the abbreviation.</p>
<p>The laptop also has an outbound firewall, so I added a rule to allow port 853 out:</p>
<pre><code>$ sudo ufw allow out \
proto tcp to any port 853 \
comment "DNS over TLS"</code></pre>
<p>Finally, activating DNS over TLS required the NetworkManager command line (this setting is not available in other interfaces):</p>
<pre><code>$ nmcli c modify id "AirTubes WiFi" \
connection.dns-over-tls opportunistic</code></pre>
<p>I chose “opportunistic” mode for DoT instead of “yes”, reasoning that for the most part, I’m at home. I’ll forget about this entire thing if I’m out, and that’s when I need the internet to just work, even if the <em>network</em> is blocking port 853.</p>
<p>Only then did I realize the flaw in per-connection settings: <strong>these would not apply if I were actually out.</strong> There’s no way a public network would be called “AirTubes WiFi,” so NetworkManager would consider it a different connection.</p>
<p>I put the nmcli settings back to their defaults:</p>
<pre><code>$ nmcli c modify id "AirTubes WiFi" \
connection.dns-over-tls ""
$ nmcli c modify id "AirTubes WiFi" \
ipv4.dns ""</code></pre>
<p>And one more time, for <code>ipv6.dns</code>.</p>
<p>Then, I went for global mode.</p>
<h2>Globally with systemd-resolved / NetworkManager</h2>
<p>The firewall settings to allow port 853 out, above, remained in place.</p>
<p>This approach uses a drop-in file for <code>systemd-resolved</code>, configuring it to do DNS over TLS with CloudFlare by default. The file looks like this, without comments, and with the DNS line abbreviated:</p>
<pre><code>[Resolve]
DNS=1.1.1.1#cloudflare-dns.com ... ...
DNSOverTLS=yes</code></pre>
<p>In fact, the DNS line is from the example value for CloudFlare, given in <code>/etc/systemd/resolved.conf</code>. It was abbreviated here for the blog to display better on phones.</p>
<p>Also note the capitalization here. <code>DNSoverTLS</code>, with a lowercase <code>o</code>, is wrong, and will be <em>ignored.</em></p>
<p>Once it’s ready, the file gets copied into place:</p>
<pre><code>$ sudo mkdir /etc/systemd/resolved.conf.d
$ sudo cp local-resolve.conf \
/etc/systemd/resolved.conf.d</code></pre>
<p>I changed NetworkManager from fully “Automatic” mode to “Automatic (addresses only)”, then left the DNS Servers configuration blank. It appears that the command-line method to do this (if necessary) is:</p>
<pre><code>$ nmcli c modify id "AirTubes WiFi" \
ipv4.ignore-auto-dns yes
$ nmcli c modify id "AirTubes WiFi" \
ipv6.ignore-auto-dns yes</code></pre>
<p>Then, it was a matter of reloading the configurations:</p>
<pre><code>$ sudo systemctl restart systemd-resolved.service
$ nmcli c down id "AirTubes WiFi"
$ nmcli c up id "AirTubes WiFi"</code></pre>
<p>After that, using tcpdump to show traffic to port 853 started reporting packets captured! I was in business.</p>
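<p>The check was something along these lines:</p>
<pre><code>$ sudo tcpdump -ni any 'tcp port 853'</code></pre>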
<p>The limitation of <em>this</em> approach is that I have to remember to set up any new network connections this way: addresses only, no DNS. Otherwise, NetworkManager will tell <code>systemd-resolved</code> what DNS settings it wants to use, and they will be applied to the connection.</p>
<h2>Useful Background Info</h2>
<p>As mentioned in passing above, a NetworkManager setting can be reverted to its default value by setting the empty string:</p>
<pre><code>$ nmcli c modify id "AirTubes WiFi" \
connection.dns-over-tls ""</code></pre>
<p>A detailed listing of all of a connection’s settings can be seen with:</p>
<pre><code>$ nmcli c show id "AirTubes WiFi" | less</code></pre>
<p>More information about the settings are also available via man page.</p>
<pre><code>$ man nm-settings-nmcli</code></pre>
<p>Getting somewhat unrelated, there is a command to search man pages for keywords. This is how I discovered that there is an <code>nmtui</code> (text user interface) command, which gives a terminal-based menu similar to the GUI. Anyway, to search the man pages for NetworkManager:</p>
<pre><code>$ man -k NetworkManager</code></pre>
<p>Firewall rules can be restricted by IP. It was straightforward enough to do IPv4, but for IPv6, the address needs to be “complete” (note the double colon):</p>
<pre><code>$ sudo ufw allow out proto tcp \
to 1.0.0.0/15 port 853 \
comment 'CloudFlare DoT'
$ sudo ufw allow out proto tcp \
to 2606:4700:4700::/112 port 853 \
comment 'CloudFlare DoT v6'</code></pre>
<p>I chose the network masks so that one rule would cover both respective addresses, without allowing the port to the entire Internet. (Discussion of network masks and <a href="https://www.digitalocean.com/community/tutorials/understanding-ip-addresses-subnets-and-cidr-notation-for-networking">CIDR notation</a> for them—the /15 and /112—are out of scope for this blog post.)</p>
<p>When using <code>systemd-resolved</code> for name resolution, the status can be inspected with its own command:</p>
<pre><code>$ resolvectl status</code></pre>
<p>Here, the output should say things like <code>+DNSOverTLS</code>, and there shouldn’t be any <em>per-link</em> overrides of the “Global” section. It should be more like (reformatted to reduce width):</p>
<pre><code>Global
Protocols: -LLMNR -mDNS +DNSOverTLS
DNSSEC=no/unsupported
resolv.conf mode: stub
DNS Servers 1.1.1.1#cloudflare-dns.com
1.0.0.1#cloudflare-dns.com
2606:4700:4700::1111#cloudflare-dns.com
2606:4700:4700::1001#cloudflare-dns.com
Link 2 (enp2s0)
Current Scopes: none
Protocols: -DefaultRoute +LLMNR -mDNS
+DNSOverTLS DNSSEC=no/unsupported
Link 3 (wlx........)
Current Scopes: none
Protocols: -DefaultRoute +LLMNR -mDNS
+DNSOverTLS DNSSEC=no/unsupported</code></pre>
<p>I hope that covers it!</p>
<h1>Update on earlyoom (2023-09-20)</h1>
<p>Back in <a href="https://www.decodednode.com/2022/12/linux-behavior-without-swap.html">Linux Behavior Without Swap</a>, I noted that the modern Linux kernel will still let the system thrash. The OOM killer does not come out until it is <em>extremely</em> desperate, long after responsiveness is near zero.</p>
<p>It has been long enough since installing <a href="https://github.com/rfjakob/earlyoom">earlyoom</a> on our clouds that I did another stupid thing. earlyoom was able to terminate the script, and the instance stayed responsive.</p>
<p>I also mentioned “swap on zram” in that post. It turns out, the ideal use case for <code>zram</code> is when <strong>there is no other swap device.</strong> When there’s a disk-based swap area (file or partition), one should <a href="https://www.addictivetips.com/ubuntu-linux-tips/enable-zswap-on-linux/">activate <code>zswap</code> instead</a>. zswap acts as a front-end buffer to swap, keeping compressible pages in a compressed RAM pool and letting the rest go to the swap device.</p>
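<p>Enabling zswap is quick; the sysfs path and kernel parameter below are the standard ones, but check them against your kernel’s documentation:</p>
<pre><code># try it immediately, without a reboot
echo 1 | sudo tee /sys/module/zswap/parameters/enabled
# to persist it, add zswap.enabled=1 to GRUB_CMDLINE_LINUX_DEFAULT
# in /etc/default/grub, then run: sudo update-grub</code></pre>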
<p>One other note, <code>zswap</code> is compiled into the default Ubuntu kernels, but <code>zram</code> is part of the rather large <code>linux-modules-extra</code> package set. If there’s no other need for the extra modules, uninstalling them saves a good amount of disk space.</p>
<h1>Upgrading a debootstrap'ped Debian Installation (2023-09-09)</h1>
<p>For context, last year, I <a href="https://www.decodednode.com/2022/12/debootstraping-recovery-partition.html">created a Debian 11 recovery partition using debootstrap</a> for recovering from issues with the <code>88x2bu</code> <a href="https://github.com/morrownr/88x2bu-20210702">driver</a> I was using.</p>
<p>This year, I realized <a href="https://mjg59.dreamwidth.org/67126.html">while reading</a> that I have never used anything but Ubuntu’s <code>do-release-upgrade</code> tool to upgrade a non-rolling-release system. <a href="https://www.decodednode.com/2023/06/installing-debian-12-and-morrownrs.html">My Debian 12 desktop</a> felt far less polished than Ubuntu Studio, so I reinstalled the latter, and that means I once again don’t need the <code>88x2bu</code> driver.</p>
<p>Therefore, if I trashed the Debian partition, it wouldn’t be a <em>major</em> loss. It was time to experiment!</p>
<h2>Doing the upgrade</h2>
<p>The update process was straightforward, if more low-level than <code>do-release-upgrade</code>. There are a couple of different procedures online that vary in their details, so I ended up <del>winging it</del> combining them:</p>
<ol>
<li>apt update</li>
<li>apt upgrade</li>
<li>apt full-upgrade</li>
<li>apt autoremove --purge</li>
<li>[edit my sources from <code>bullseye</code> to <code>bookworm</code>]</li>
<li>apt clean</li>
<li>apt update</li>
<li>apt upgrade --without-new-pkgs</li>
<li>apt full-upgrade</li>
<li>reboot</li>
<li>apt autoremove</li>
</ol>
<p>DKMS built the <code>88x2bu</code> driver for the new kernel, and userspace appeared to be fine.</p>
<h2>Fixing the Network</h2>
<p>The link came up with an IP, but the internet didn’t work: there was no DNS. I didn’t have systemd-resolved, named, dnsmasq, nor nscd. Now, to rescue the rescue partition, I rebooted into Ubuntu, chroot’ed to Debian, and installed <code>systemd-resolved</code>.</p>
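<p>The chroot dance was roughly the usual one; the device node below is a placeholder for wherever the Debian root partition lives:</p>
<pre><code>sudo mount /dev/sdXN /mnt
for d in dev proc sys; do sudo mount --bind /$d /mnt/$d; done
sudo chroot /mnt apt install systemd-resolved
# unmount everything in reverse order when finished</code></pre>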
<h2>Fixing Half of the Races</h2>
<p>One of the Debian boots left me confused. Output to the console appeared to have stopped after some USB devices were initialized. I thought it had crashed. I unplugged a keyboard and plugged it in, generating more USB messages on screen, so I experimentally pressed <code>Enter</code>. What I got was an <code>(initramfs)</code> prompt! The previous one had been lost in the USB messages printed after it had appeared.</p>
<p>It seems that the kernel had done something different in probing the SATA bus vs. USB this time, and <code>/dev/sdb3</code> didn’t have the root partition on it. I ended up rebooting (I don’t know how to get the boot to resume properly if I had managed to mount <code>/root</code> by hand.)</p>
<p>When that worked, I updated the Ubuntu partition’s <code>/boot/grub/custom.cfg</code> to use the UUID instead of the device path for Debian.</p>
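<p>The entry in <code>custom.cfg</code> ends up shaped like this; the UUID and file names below are placeholders (get the real UUID from <code>lsblk -f</code> or <code>blkid</code>):</p>
<pre><code>menuentry "Debian recovery" {
    search --no-floppy --fs-uuid --set=root 1234abcd-0000-4000-8000-56789abcdef0
    linux /vmlinuz root=UUID=1234abcd-0000-4000-8000-56789abcdef0 ro
    initrd /initrd.img
}</code></pre>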
<p>It seems that the kernel itself only supports <em>partition</em> UUIDs, but Debian and Ubuntu use initrds (initial RAM disks) that contain the code needed to find the <em>filesystem</em> UUID. That’s why <code>root=UUID={fs-uuid}</code> has always worked for me! Including this time.</p>
<p><code>os-prober</code> (the original source of this entry) has to be more conservative, though, so it put <code>root=/dev/sdb3</code> on the kernel command line instead.</p>
<h2>The Unfixed Race</h2>
<p>Sometimes, the <code>wlan0</code> interface can’t be renamed to <code>wlx{MAC_ADDRESS}</code> because the device is busy. I tried letting <code>wlan0</code> be an alias in the configuration for the interface (using <code>systemd-networkd</code>) but it doesn’t seem to take.</p>
<p>I resort to simply rebooting if the login prompt doesn’t reset itself and show a DHCP IP address in the banner within a few seconds.</p>
<p>You have to admire the kernel and systemd teams’ dedication to taking a stable, functional process and replacing it with a complex and fragile mess.</p>
<h2>A Brief Flirtation with X11</h2>
<p>I installed Cinnamon. It <em>ran out of space;</em> I ran <code>apt clean</code> and then continued, successfully. This is my fault; I had “only” given the partition 8 GiB, because I expected it to be a CLI-only environment.</p>
<p>Cinnamon, however, is <em>insistent</em> on NetworkManager, and I was already using systemd-networkd. It’s very weird to have the desktop showing that there is “no connection” while the internet is actually working fine.</p>
<p>Due to the space issue, I decided to uninstall everything and go back to a minimal CLI. I would definitely not be able to perform another upgrade to Debian 13, for instance, and it was unclear if I would even be able to do <em>normal</em> updates.</p>
<h2>In Conclusion</h2>
<p>The Debian upgrade went surprisingly well, considering it was initially installed with <code>debootstrap</code>, and is therefore an unusual configuration.</p>
<p>Losing DNS might have been recoverable by editing <code>/etc/resolv.conf</code> instead, but I wasn’t really in a “fixing this from here is my only option” space. Actually, one might wonder what happened to the DHCP-provided DNS server? I don’t know, either.</p>
<p>Trying to add X11 to a partition never designed for it did not work out, but it was largely a whim anyway.</p>
<h1>Sound for Firefox Flatpak on Ubuntu 23.04 with PipeWire (2023-08-06)</h1>
<p>I reinstalled Ubuntu Studio recently, <a href="https://www.baeldung.com/linux/snap-remove-disable">excised all Snaps</a>, and installed Firefox from Flatpak. Afterward, I didn’t have any audio output in Firefox. Videos would just freeze.</p>
<p>I don’t know how much of this is <em>fully necessary,</em> but I quit Firefox, installed more PipeWire pieces, and maybe signed out and/or rebooted.</p>
<pre><code>sudo apt install pipewire-audio pipewire-alsa</code></pre>
<p>The <code>pipewire-jack</code> and <code>pipewire-pulse</code> packages were already installed.</p>
<p>AIUI, this means that “PipeWire exclusively owns the audio hardware” and provides ALSA, JACK, Pulse, and PipeWire interfaces <em>into</em> it.</p>
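<p>A quick way to confirm that the Pulse interface really is PipeWire underneath (assuming <code>pactl</code> is installed) is to check the server name it reports:</p>
<pre><code>$ pactl info | grep 'Server Name'
# expect something mentioning PipeWire rather than plain PulseAudio</code></pre>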
<p>It’s not perfect. Thunderbird (also flatpak; I’d rather have “some” sandbox than “none”) makes a bit of cacophony when emails come in, but at least there’s sound for Firefox.</p>

<h1>Boring Code Survives (2023-07-04)</h1>
<p>Over on <a href="https://utcc.utoronto.ca/~cks/space/blog/">Wandering Thoughts</a>, Chris writes about <a href="https://utcc.utoronto.ca/~cks/space/blog/python/SantoolsCodeDurability">some fileserver management tools</a> being fairly unchanged over time by changes to the environment. There is a Python 2 to 3 conversion, and some changes when the disks being managed are no longer on iSCSI, “but in practice a lot of code really has carried on basically as-is.”</p>
<p>This is completely different than my experience <strong>with async/await</strong> in Python. Async was new, so the library I used with it was in 0.x, and in 1.0, the authors <em>inverted the entire control structure.</em> Instead of being able to create an AWS client deep in the stack and return it upwards, clients could only be used as context managers. It was quite a nasty surprise.</p>
<p>To allow testing for free, my code dynamically instantiated a module to “manage storage,” and whether that was AWS or in-memory was an implementation detail. Suddenly, one of the clients couldn’t write <code>self.client = c; return</code> anymore. The top-level had to know about the change. <em>Other storage clients</em> would have to know about the change, to become context managers themselves, for no reason.</p>
<p>I held onto the 0.x version for a while, until the Python core team felt like “explicit event loop” was a mistake big enough that <strong>everyone’s</strong> code had to be broken.</p>
<p>Async had been hard to write in the first place, because so much example code out there was for the <code>asyncio</code> module’s decorators, which had preceded the actual async/await syntax. What the difference between tasks and coroutines even was, and why one should choose one over the other, was never clear. Why an explicit <code>loop</code> parameter should exist was especially unclear, but it was “best practice” to include it everywhere, so everyone did. Then Python set it on fire.</p>
<p>(I never liked the Python packaging story, and <a href="https://www.decodednode.com/2020/04/pipenvs-surprise.html">pipenv didn’t solve it.</a> To pipenv, every Python minor version is an incompatible version?)</p>
<p>I had a rewrite on my hands either way, so I went <a href="https://www.decodednode.com/2021/12/the-best-tool-for-job.html">looking for something else</a> to rewrite in, and v3 is in Go. The <a href="https://github.com/sapphirecat/cloud-maker/">other Python I was using</a> in my VM build pipeline was replaced with a half-dozen lines of shell script. It’s much less flexible, perhaps, but it’s clear and concise now.</p>
<p>In the end, it seems that <strong>boring</strong> code survives the changing seasons. If you’re just making function calls and doing some regular expression work… there’s little that’s likely to change in that space. If you’re <a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">coloring functions</a> and people are <a href="https://trio.readthedocs.io/en/stable/">inventing brand-new libraries</a> in the space you’re working in, your code will find its environment altered much sooner. The newer, fancier stuff is inherently closer to the fault-line of future shifts in the language semantics.</p>

<h1>Installing Debian 12 and morrownr's Realtek driver (2023-06-25; 2023-07-03 edit)</h1>
<p>Due to deepening dissatisfaction with Canonical, I replaced my Ubuntu Studio installation with Debian 12 “bookworm” recently.</p>
<p>tl;dr:</p>
<ol>
<li>My backups, including the driver source, were compressed with lzip <a href="https://www.nongnu.org/lzip/safety_of_the_lzip_format.html">for reasons</a>, but I fell back on a <strong><a href="https://www.decodednode.com/2022/12/debootstraping-recovery-partition.html">previously-built</a> rescue partition</strong> to get the system online.</li>
<li>I ended up with an improper <code>grub</code> installation, that couldn’t find <code>/boot/grub/i386-pc/normal.mod</code>. I rebooted the install media in rescue mode, got a shell in the environment, verified the disk structure with <code>fdisk -l</code>, and then ran <code>grub-install /dev/___</code> to fix it. Replace the blank with your device, but beware: using the wrong device may make the OS on it unbootable.</li>
<li>The USB doesn’t work directly with apt-cdrom to install more packages offline. I “got the ISO back” from <code>dd if=/dev/sdd of=./bookworm.iso bs=1M count=4096 status=progress conv=fsync</code> (1M * 4096 = 4G total, which is big enough for the 3.7G bookworm image; you may need to adjust to suit), then made it available with <code>mount -o loop,ro ~+/bookworm.iso /media/cdrom0</code> (the mount point is the target of <code>/media/cdrom</code>.)</li>
<li>Once finished, I found out the DVD had <code>plzip</code>, and if I’d <em>searched</em> for it (lzip), I could have used it (plzip). I didn’t actually need the rescue partition.</li>
<li>Once finished, I realized I hadn’t needed to <code>dd</code> the ISO back from the USB stick. The downloaded ISO was on my external drive all along, and I could have loop-mounted that.</li>
<li>[Added 2023-07-02]: Letting the swap partition get formatted gave it a new UUID. Ultimately, I would need to update the recovery partition’s <code>/etc/fstab</code> with the new UUID, and follow up with <code>update-initramfs -u</code> to get the recovery partition working smoothly again.</li>
</ol>
<p>Full, detailed rambling with too much context (as usual) below.</p>
<h3>The Wi-Fi driver</h3>
<p>My Wi-Fi adapter is an Archer T4U Plus (realtek / RTL 8822BU) wi-fi adapter. To get the system online prior to kernel 6.2, I need to build <a href="https://github.com/morrownr/88x2bu">morrownr’s driver</a> (generic link, not the specific version.) I knew from DistroWatch that Debian 12 has kernel 6.1, so I downloaded the DVD-1 image to be reasonably sure it had all of the development tools necessary to do this. I did not expect Debian to provide this driver on the DVD.</p>
<p>I had a working, up-to-date copy of the driver code on the Ubuntu partition I would be replacing. That code was included into my backups, and copied onto an external hard disk. The plan was to install Debian, install <code>build-essential</code>, <code>iw</code>, and the like from the install media, unpack my backups, and build the driver.</p>
<h3>On backups</h3>
<p>I ran into this issue last, but it is <em>incredibly important</em> to know before beginning a process like this, so I’m putting it first: my backups were compressed with lzip. It wasn’t pre-installed, there was no <code>lzip</code> package, and I didn’t know about <code>plzip</code> or do any searches that would find it. I may have tried harder in different circumstances, but I already had another plan: I also had a rescue partition I built (Debian 11 / bullseye) for when I broke the Wi-Fi driver in Ubuntu.</p>
<p>To get at the backups, then, I jumped into the rescue partition (with networking), chroot’ed to the new install, and pulled <code>lzip</code> from the online repositories.</p>
<h3>The missing normal.mod</h3>
<p>The installation appeared to go smoothly, but on rebooting, I met my first problem: grub couldn’t find <code>/boot/grub/i386-pc/normal.mod</code>. I <em>thought</em> “do not install on my primary hard drive” would have meant “don’t write it onto /dev/sda”, especially since I could pick “sdb” from the next menu. This did indeed skip installing to sda, but also clearly didn’t finish installing to sdb and sdb1.</p>
<p>(When I first installed it, Ubuntu managed to irretrievably corrupt Windows booting when I wrote the bootloader to the primary disk. Windows won’t install itself over grub, either. I found something else that was supposed to be a Windows-flavor bootblock; it didn’t allow Windows 10 to boot, but it <em>did</em> allow Windows to reinstall to that drive/partition afterward. It even kindly hid all my user/app data in C:\Windows.old without telling me. yay.)</p>
<p>To fix the issue with grub on Debian 12, I rebooted into the installation media in rescue mode, had a worrying number of identical prompts to the initial installation, and finally got into a shell in the new root partition. From there, I figured out the necessary commands:</p>
<p><strong>CAUTION: your device nodes will differ. You can break things doing this. Be sure you have the right device! I have changed the drive letter to something extremely unlikely to ruin your OS if you copy and paste this, but you have been warned anyway.</strong></p>
<pre><code># fdisk -l /dev/sdz
[made sure it’s my 256 GB Linux SSD,
not my 1 TB SSD with Windows]
# grub-install /dev/sdz
# ls /boot/grub/i386-pc
[there should be a lot of files there now]</code></pre>
<p>After rebooting, I was blessed with the Grub menu, instead of the error and the <code>grub rescue></code> prompt. The system was then able to boot into Debian 12 just fine.</p>
<p>Preview: the <em>other</em> problem with the installation was that I let it reformat the swap partition, which gave it a new UUID. I should have changed it to “use as swap, do not format.”</p>
<h3>The USB stick is not /media/cdrom</h3>
<p>The next order of business was to get <code>build-essential</code> and friends installed. Unfortunately, trying to run apt commands would only endlessly ask for the disc to be put into /media/cdrom.</p>
<p>The internet suggested copying the image of the USB stick back into a file, then loop-mounting that.</p>
<p>(I would realize later that I had saved the ISO on an external/portable drive, and I could have loop-mounted that directly, instead of using the <code>dd</code> command below. But since that’s not what I did…)</p>
<p>First, I had to log out and log back in to unmount the stick. I had tried reading some of the offline documentation, which left a ‘zygote’ browser process running in the background with its working directory on the stick, preventing unmount.</p>
<p>Anyway, we had to learn about the mount point.</p>
<pre><code># ls -ld /media/cdrom
[…] -> cdrom0
# ls -ld /media/cdrom0
drwxr-xr-x […] /media/cdrom0</code></pre>
<p>Okay. apt wants to use /media/cdrom, which is a symlink to a normal directory. We’ll get the image, then we’ll mount it over /media/cdrom0, and finally we’ll hope it works even though we never saw a Packages.gz or anything on the USB file system.</p>
<p>The commands for getting the image back off the USB (attached at <code>/dev/sdd</code>) and using it are:</p>
<pre><code># dd if=/dev/sdd of=bookworm.iso bs=1M count=4096 \
status=progress conv=fsync
# mount -o loop,ro ~+/bookworm.iso /media/cdrom0
# apt install build-essential make</code></pre>
<p><strong>Note for the future:</strong> I knew the image was 3.7G from putting it on the USB stick in the first place, so the command above dumps the first 4.0G of the stick (4096 blocks × 1 MiB) back to a file. Your image may be larger. (I actually ran it without a <code>count</code> at all, and it started dumping my entire 30G stick, so I interrupted it with Ctrl+C when I noticed around the 10G mark.)</p>
<p>Hey, good news: it worked. I got a bunch of packages installed, without setting up a network connection.</p>
<h3>Unpacking the backups</h3>
<p>I finally learned I couldn’t unpack my backups. They were compressed with <a href="https://www.nongnu.org/lzip/">lzip</a>, but the DVD only contained gzip, bzip2, xz, and lzma (from xz-utils).</p>
<p>I would find out later that the DVD contains <code>plzip</code>, a parallel version of lzip that is fully compatible with it, but I guess I didn’t <em>search</em> for lzip so much as expected to find “the package <em>named</em> lzip.”</p>
<p>Fortunately, I had run into problems rebuilding the Wi-Fi driver in the past, compounded by its recommendation to reboot and my own misunderstanding of how to do an upgrade. (tl;dr on that: <strong>never</strong> run <code>remove-driver.sh</code> and reboot if you have dkms. Installing the new version removes the old one from dkms anyway, which is effectively the same thing, but it does so <em>after</em> checking the installation requirements, which is helpful when a new requirement has been added.)</p>
<p>So… I had used <code>resize2fs</code> to make some space, modified my partition table (adding swap space while I was in there, <a href="https://www.decodednode.com/2022/12/linux-behavior-without-swap.html">because having no swap hurts</a>), and made room for a Debian 11 install (<a href="https://www.decodednode.com/2022/12/debootstraping-recovery-partition.html">notes on that</a>). The Debian 11 install also needed the driver, but the basic theory was that I should have a working driver in one OS or the other at all times. I’d upgrade the driver on the Ubuntu partition periodically, and if it worked, go upgrade the driver in Debian. Ubuntu could fix Debian, or vice versa, depending on what happened.</p>
<h3>Aside: swap and UUID</h3>
<p>During the Debian 12 installation, I had done a custom installation, formatting my Ubuntu partition as ext4 for <code>/</code>, and letting it format the swap partition for swap, which it wanted to do already.</p>
<p>When I booted Debian 11, I had to wait a minute and a half for the swap mount to time out. It had been listed by UUID in <code>/etc/fstab</code>, so systemd waited for the UUID to appear, but it had been overwritten.</p>
<p>I used <code>lsblk -f</code> to get the updated UUID, copied it, and updated the fstab file. (Pro tip from times long past: if you’ve installed Linux text-only, also install <code>gpm</code> to copy and middle-click-paste with the mouse on the console.)</p>
<p>[<strong>Updated 2023-07-03:</strong> I would later discover I should have run <code>update-initramfs -u -k all</code> afterward, to put the new swap UUID into initramfs. Otherwise, <em>the initrd</em> waits around a bit for the original UUID to appear. This is what it’s doing when it repeatedly prints something like <code>Begin: Running /scripts/local-bottom ... done.</code> It’s trying to wait for local filesystems before continuing the init process.]</p>
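<p>Pieced together, the whole fix looks something like this (the usual warning applies: your device names and UUIDs will differ):</p>
<pre><code># lsblk -f
[copy the new UUID of the swap partition]
# nano /etc/fstab
[replace the old swap UUID with the new one]
# update-initramfs -u -k all</code></pre>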
<h3>Using my recovery partition</h3>
<p>Back to the present. With Debian 11 booted, I had to do a few things to get everything working. Properly entering Debian 12 as a chroot required:</p>
<pre><code># mount /dev/sdb1 /mnt
# for i in dev dev/pts sys proc
do mount --bind "/$i" "/mnt/$i"
done
# cp /etc/resolv.conf /mnt/etc/resolv.conf
# cp /etc/apt/sources.list \
/mnt/etc/apt/sources.list.d/network.list
# chroot /mnt /bin/bash
# nano /etc/apt/sources.list.d/network.list
[change ‘bullseye’ to ‘bookworm’]
[change ‘main’ to ‘main non-free-firmware’]
# apt update</code></pre>
<p>If <code>/dev/pts</code> isn’t available in the chroot, then <code>sudo</code> will fail with “Could not allocate pty.” The shaggy dog here is that <code>sudo</code> wasn’t actually important.</p>
<p>If <code>/etc/resolv.conf</code> isn’t available, DNS doesn’t work, and apt can’t really get anything from the network.</p>
<p>If the copied sources.list file has the wrong distribution… Bad Things May Happen. Fortunately, I hadn’t copied the resolv.conf the first time, so it just failed to update, and I spotted the wrong distro in the output rather quickly.</p>
<p>The final change, adding <code>non-free-firmware</code>, quiets a message from apt about that component, and it was one fix I didn’t have to look up on the internet. Quite a bit of time had been spent already.</p>
<p>Anyway, that let me install <code>lzip</code> and <code>rfkill</code> from the tubes. I probably could have gotten rfkill locally, but since I had the network temporarily working, this was the path of least resistance.</p>
<p>I unpacked the driver from my backups and tried to run <code>install-driver.sh</code>, but it needs headers for the kernel <em>currently running,</em> which of course is not from the right distribution, so the headers aren’t there. I had to reboot back into Debian 12 to run the install.</p>
<p>Don’t forget to clean up safely when you’re leaving the chroot:</p>
<pre><code># exit
# umount /mnt/{dev/pts,dev,sys,proc} /mnt
# reboot</code></pre>
<h3>Driver installation</h3>
<p>With all that, the driver installed smoothly in Debian 12.</p>
<p>[<strong>Updated 2023-07-03:</strong> The rest of this section was basically rewritten to be useful, instead of a bland “It worked!”]</p>
<p>It was a matter of changing to the directory with the driver code, <code>88x2bu-20210702</code>, and running <code>sudo ./install-driver.sh</code> from the (still offline at this point) Debian 12 environment. After rebooting, I was able to configure the network in System Settings, and it all worked.</p>
<p>Arcana for anyone who needs it: I installed KDE, and the underlying network configuration tool is NetworkManager. systemd-networkd is not installed.</p>
<p>Per previous experience, I also decided to store the Wi-Fi password “for everyone (unencrypted)”. Last time, with the password in the wallet and the wallet configured to lock along with the screen, I had to type my password twice when unlocking after suspend: once to log in, and once more to unlock the Wi-Fi. That’s too much typing.</p>
<h3>A note about driver uninstallation</h3>
<p>When upgrading to Ubuntu 23.04, I found out that the <code>remove-driver.sh</code> script <strong>may not remove</strong> a file that prevents loading the in-tree driver. It seems that in the <em>current</em> driver, the relevant line is part of <code>/etc/modprobe.d/88x2bu.conf</code>, but I could swear it was somewhere else in older versions.</p>
<p>Regardless, if you upgrade to kernel 6.2+, remove morrownr’s 88x2bu driver, and you still don’t have Wi-Fi, try <code>grep -r 'blacklist.*rtw' /etc/modprobe.d</code> to see if a file shows up. I can’t give any general advice on whether to delete the file or comment out the line, but getting <code>rtw88_8822bu</code> off the list of blocked modules should make it work again. (At least, for my Archer T4U Plus, which is an 8822BU chip.)</p>
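<p>If a file does show up, the fix is probably along these lines (I’m assuming the line lives in <code>88x2bu.conf</code>, as it does in the current driver; adjust to whatever the grep actually finds):</p>
<pre><code># nano /etc/modprobe.d/88x2bu.conf
[comment out or delete the "blacklist rtw88_8822bu" line]
# modprobe rtw88_8822bu</code></pre>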
<h3>The end?</h3>
<p>I want to set up mainline kernels in Debian 12, preferably in a way that I don’t have to track my own from <a href="https://www.kernel.org/">kernel.org</a>, so that I can have 6.2+ and not worry about this driver anymore. It’s such a pain. But, whatever happens there will be a different post.</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-70015819797822050182023-03-31T21:03:00.002-04:002023-03-31T21:03:55.576-04:00Passing data from AWS EventBridge Scheduler to Lambda<p>The <a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-run-lambda-schedule.html#eb-schedule-create-rule" target="_blank">documentation</a> was lacking images, or even descriptions of some screens ("Choose Next. Choose Next.") So, I ran a little experiment to test things out.</p><p>When creating a new scheduled event in AWS EventBridge Scheduler, then choosing AWS Lambda: Invoke, a field called "<b>Input</b>" will be available. It's pre-filled with the value of <code>{}</code>, that is, an empty JSON object. This is the value that is passed to the <strong><code>event</code></strong> argument of the Lambda handler:</p><pre>export async function handler(event, context) {
// handle the event
}</pre><p>With an event JSON of <code>{"example":{"one":1,"two":2}}</code>, the handler could read <code>event.example.two</code> to get its value, 2.</p><p>It appears that EventBridge Scheduler allows one complete control over this data, and the <code>context</code> argument is only filled with Lambda-related information. Therefore, AWS provides the ability to include the <code><aws.scheduler.*></code> values in this JSON data, to be passed to Lambda (or ignored) as one sees fit, rather than imposing <i>any</i> constraints of its own on the data format. (Sorry, no examples; I was only testing the basic features.)<br /></p>
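<p>For anyone scripting this instead of clicking through the console, the equivalent CLI call should look roughly like the following. This is an untested sketch: the names and ARNs are placeholders, and the role has to be allowed to invoke the function.</p>
<pre><code>aws scheduler create-schedule \
  --name my-test-schedule \
  --schedule-expression 'rate(5 minutes)' \
  --flexible-time-window Mode=OFF \
  --target '{
    "Arn": "arn:aws:lambda:us-east-1:111111111111:function:my-function",
    "RoleArn": "arn:aws:iam::111111111111:role/my-scheduler-role",
    "Input": "{\"example\":{\"one\":1,\"two\":2}}"
  }'</code></pre>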
<p>Note that the <code>handler</code> example above is written with ES Modules. This requires the Node 18.x runtime in Lambda, along with a filename of "index.mjs".</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-39974628159713522712023-03-27T18:53:00.006-04:002023-04-26T20:46:57.742-04:00DejaDup and PikaBackup, early impressions (Update 2)<p>I tried a couple of backup programs:</p>
<ul>
<li><a href="https://flathub.org/apps/details/org.gnome.DejaDup">DejaDup</a> (the one I had heard of), a front-end for <a href="https://duplicity.gitlab.io/">Duplicity</a></li>
<li><a href="https://flathub.org/apps/details/org.gnome.World.PikaBackup">PikaBackup</a>, a front-end for <a href="https://www.borgbackup.org/">BorgBackup</a></li>
</ul>
<p>I installed both of them as Flatpaks, although deja-dup also has a version in the Pop!_OS 22.04 repository. I have been using DejaDup for four months, and PikaBackup for one month. This has been long enough for DejaDup to make a second full backup, but not so long for Pika to do anything special.</p>
<p><strong>Speed:</strong></p>
<p>For a weekly incremental backup of my data set…</p>
<ul>
<li>DejaDup: about 5 minutes, lots of fan speed changes</li>
<li>PikaBackup: about 1 minute, fans up the whole time</li>
</ul>
<p>Part of Pika’s speed is probably the better exclusion rules; I can use patterns of <code>**/node_modules</code> and <code>**/vendor</code>, to exclude those folders, wherever they are in the tree. With DejaDup, I would apparently have to add each one individually, and I did not want to bother, nor keep the list up-to-date over time.</p>
<p>Part of DejaDup’s slowness might be that it executes thousands of <code>gpg</code> calls as it works. Watching with <code>top</code>, DejaDup is frequently running, and sometimes there’s a <code>gpg</code> process running with it. Often, DejaDup is credited with much less than 100% of a single CPU core.</p>
<p><strong>Features:</strong></p>
<p>PikaBackup offers multiple backup configurations. I keep my main backup as a weekly backup, on an external drive that’s only plugged in for the occasion. I was able to configure an additional hourly backup of my most-often-changed files in Pika. (This goes into <code>~/.borg-fast</code>, which I excluded from the weekly backups.) The hourly backups, covering about 2 GB of files, aren’t noticeable at all when using the system.</p>
<p>Noted under “speed,” PikaBackup offers better control of exclusions. It tracks how long operations took, so I know that it has been <em>exactly</em> 53–57 seconds to make the incremental weekly backups.</p>
<p>On the other hand, Pika appears to <em>always</em> save the backup password. DejaDup gives the user the option of whether it should be remembered.</p>
<p>There is a DejaDup plugin for Caja (the MATE file manager) in the OS repo, which may be interesting to MATE users.</p>
<p><strong>Space Usage:</strong></p>
<p>PikaBackup did the weekly backup on 2023-04-24 in 46 seconds; it reports a total backup size of 28 GB and 982 MB (0.959 GB = 3.4%) written out.</p>
<p>With scheduled backups, Pika offers control of the number of copies kept. One can choose from a couple of presets, or provide custom settings. Of note, these are <em>count-based</em> rather than <em>time-based;</em> if a laptop is only running for 8-9 hours a day, then 24 hourly backups will be able to provide up to 3 days back in time.</p>
<p>For unscheduled backups, it’s not clear that Pika offers any ‘cleanup’ options, because the cleanup is tied to the schedule in the UI.</p>
<p>I do not remember being given many options to control space usage in DejaDup.</p>
<p><strong>Disaster Simulation:</strong></p>
<p>To ensure backups were really encrypted, I rebooted into the OS Recovery environment and tried to access them. Both programs’ CLI tools (<code>duplicity</code> and <code>borgbackup</code>) from the OS repository were able to verify the data sets. I don’t know what the stability guarantees are, but it’s nice that this worked in practice.</p>
<ul>
<li><code>duplicity</code> verified the DejaDup backup in about 9m40s</li>
<li><code>borgbackup</code> verified the PikaBackup backup in 3m23s</li>
</ul>
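<p>For reference, the verification commands were along these lines (the paths are placeholders for wherever each backup lives; the <code>borgbackup</code> package provides the <code>borg</code> command):</p>
<pre><code># borg check /media/backups/pika-repo
# duplicity verify file:///media/backups/dejadup /home/me</code></pre>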
<p>This isn’t a benchmark at all; after a while, I got bored of duplicity being credited with 30% of 1 core CPU usage, and started the borgbackup task in parallel.</p>
<p>Both programs required the password to unlock the backup, because my login keychain isn’t available in this environment.</p>
<p>Curiously, <code>borgbackup</code> changed the ownership of a couple of files in the repository during the verification: the <code>config</code> and <code>index</code> files became owned by root. This made it impossible to access the backups as my normal user, including to take a new one. I needed to return to my admin user and set the ownership back to my limited account. The error message made it clear an unexpected exception had occurred, but wasn’t very useful beyond that.</p>
<p><strong>Major limitations of this post:</strong></p>
<p>My data set is a few GB, consisting mainly of git repos and related office documents. The performance of other data sets is likely to vary.</p>
<p>I started running Pika about the same time that DejaDup wanted to make a second backup, so the full-backup date and number of incremental snapshots since <em>should</em> be fairly close to each other. I expect this to make the verification times comparable.</p>
<p>I haven’t actually done any restores yet.</p>
<p><strong>Final words:</strong></p>
<p>Pika has become my primary backup method. Together, its speed and its support for multiple configurations made hourly backups reasonable, without compromising the offline weekly backup.</p>
<p><strong>Update History:</strong></p>
<p>This post was updated on 2023-03-31, to add information about multiple backups to “Features,” and about BorgBackup’s file permission change during the verification test. Links were added to the list above, and a new “Final Words” concluding section was written.</p>
<p>It was updated again on 2023-04-26, to add the “Space Usage” section, and to reduce “I will probably…” statements to reflect the final decisions made.</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-10769055739921586442023-03-16T20:40:00.003-04:002023-03-16T20:40:36.201-04:00Using sshuttle with ufw outbound filtering on Linux (Pop!_OS 22.04)<p>I am using <a href="https://github.com/sshuttle/sshuttle">sshuttle</a> and <a href="https://en.wikipedia.org/wiki/Uncomplicated_Firewall">UFW</a> on my Linux system, and I recently set up outbound traffic filtering (instead of default-allow) in ufw. Immediately, I noticed I couldn’t make connections via sshuttle anymore.</p>
<p>The solution was to add another rule to ufw:</p>
<pre><code>allow out from anywhere to IP 127.0.0.1, TCP port 12300
</code></pre>
<p>Note that this is “all interfaces,” <strong>not</strong> tied to the loopback interface, <code>lo</code>.</p>
<p>Now… why does this work? Why <em>doesn’t</em> this traffic already match one of the “accept all on loopback” rules?</p>
<p>To receive the traffic it is responsible for, <code>sshuttle</code> listens at 127.0.0.1:12300 (by default) and <strong>creates some NAT rules</strong> to redirect traffic for its subnets to that IP and port. That is, running <code>sshuttle -r example.com 192.168.99.0/24</code> creates a NAT rule to catch traffic to any host within <code>192.168.99.0/24</code>. This is done in <a href="https://www.netfilter.org/">netfilter’s</a> <code>nat</code> tables.</p>
<p>UFW has its rules in the <code>filter</code> tables, and <strong>the <code>nat</code> tables run first.</strong> Therefore, UFW sees a packet that has <em>already been redirected,</em> and this redirection changes the packet’s <strong>destination</strong> while its <strong>interface and source</strong> remain the same!</p>
<p>That’s the key to answering the second question: the “allow traffic on loopback” rules are written to allow traffic on <strong>interface <code>lo</code></strong>, and these redirected packets have a different interface (Ethernet or Wi-Fi.) The public interfaces are not expected to have traffic for local addresses on them… but if they do, they don’t get to take a shortcut through the firewall.</p>
<p>With this understanding, we can also see what’s going wrong in the filtering rules. Without a specific rule to allow port 12300 outbound, the packet reaches the default policy, and if that’s “reject” or “deny,” then the traffic is blocked. sshuttle never receives it.</p>
<p>Now we can construct the proper match rule: we need to allow traffic to IP 127.0.0.1 on TCP port 12300, and use either “all interfaces” or our public (Ethernet/Wi-Fi) interface. I left mine at “all interfaces,” in case I should ever plug in the Ethernet.</p>
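<p>In ufw’s syntax, that comes out to something like this (12300 being sshuttle’s default listen port):</p>
<pre><code>sudo ufw allow out to 127.0.0.1 port 12300 proto tcp</code></pre>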
<p>(I admit to a couple of dead-ends along the way. One, allowing port 3306 out didn’t help. Due to the NAT redirection, the firewall never sees a packet with port 3306 itself. This also means that traffic being forwarded by <code>sshuttle</code> can’t be usefully firewalled on the client side. The other problem was that I accidentally created the rule to allow <em>UDP</em> instead of TCP the first time. Haha, oops.)</p>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-39038757394328811912023-02-02T19:30:00.000-05:002023-02-02T19:30:00.224-05:00On Handling Web Forms [2012]<p><em>Editor’s Note: I found this in my drafts from 2012. The login form works as described, but few of the other forms do. However, the login form has not had glitches, even with 2FA being added to it recently. The site as a whole retains its multi-page architecture. Without further ado, the original post follows…</em></p>
<p>I’ve been working on some fresher development at work, which is generally an opportunity for a lot of design and reflection on that design.</p>
<p>Back in 2006 or so, I did some testing and tediously developed the standard “302 redirect on POST responses” technique for preventing pages that handled inbound form data from showing up in the browser history. Thus, sites would be Back- and Reload-friendly, as they’d never show that “About to re-submit a form, do you really want to?” box. (I would bet it was a well-known technique at the time, but <acronym title="Not Invented Here">NIH</acronym>.)</p>
<p>That’s pretty much how I’ve written my sites since, but a necessary consequence of the design is that on submission failure, data for the form must be stored “somewhere,” so it can be retrieved for pre-filling the form after the redirection completes.</p>
<p>My recent app built in this style spends a lot of effort on all that, and then throws in extra complexity: when it detects that you’re logged out, it decides to minimize the session storage size, so it cleans up all the keys. Except for the flash, the saved form data, and the redirection URL.</p>
<p>The latter gets stored because I don’t trust Referer as a rule, and if login fails and redirects back to the login page for a retry, the Referer won’t be accurate by the time a later attempt succeeds. So every page that checks login also stores the form URL to return to after a login.</p>
<p>There’s even an extra layer of keys in the form area, so that each form’s data is stored independently in the session, although I don’t think people actually browse multiple tabs concurrently. All <em>that</em> is serving to do is bloat up the session when a form gets abandoned somehow.</p>
<p>Even then, it <strong>still doesn't work</strong> if the form was inside a jQueryUI dialog, because I punted on that one. The backend and page don’t know the dialog was open, and end up losing the user’s data.</p>
<h3>Simplify, Young One</h3>
<p>That's a whole lot of complexity just to handle a form submission. Since the advent of GMail, YouTube, and many other sites which are <em>only</em> useful with JavaScript enabled, and since this latest project is a strictly internal app, I've decided to throw all that away and try again.</p>
<p>Now, a form submits as an AJAX POST and the response comes back. If there was an error, the client-side code can inform the user, and no state needs to be saved or restored because <em>the page never reloaded.</em> All the state it had is still there.</p>
<p>But while liberating, that much is old news. “Everyone” builds single-page apps that way, right?</p>
<p>But here’s the thing: if I build out separate pages for each form, then I’m effectively building private sections of code and state from the point of view of “the app” or “the whole site.” No page is visible from any other, so each one only has to worry about a couple of things going on in the global navbar.</p>
<p>This means changes roll out more quickly, as well, since users do reload when moving from area to area of the app.</p>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-65515463307041040172023-01-31T19:16:00.003-05:002023-01-31T19:16:43.587-05:00Argon2id Parameters<p><strong>There are no specific recommendations in this post.</strong> You will need to choose parameters for your specific situation. However, it is my hope that this post will add deeper understanding to the recommendations that others make. With that said…</p>
<p>The <em>primary</em> parameter for Argon2 is <strong>memory.</strong> Increasing memory also increases processing time.</p>
<p>The <strong>time cost</strong> parameter is intended to make the running time longer when memory usage can’t be increased further.</p>
<p>The <strong>threads</strong> (or parallelism, or “lanes” when reading the RFC) parameter sub-divides the memory usage. When the memory is specified as 64 MiB, that is the total amount used, whether threads are 1 or 32. However, the synchronization overhead causes a <strong>sub-linear speedup,</strong> and this is more pronounced with smaller memory sizes. SMT cores offer even less speed improvement than the same number of non-SMT cores, as expected.</p>
<p>I did some tests on my laptop, which has 4 P-cores and 8 E-cores (16 threads / 12 physical cores.) The 256 MiB tests could only push actual CPU usage to about 600% (compared to the 1260% we might expect); it took 1 GiB or more to reach 1000% CPU. More threads than cores didn’t achieve anything.</p>
<p>Overall then, <strong>higher threads allow for using more memory, if enough CPU is available</strong> to support the thread count. If memory and threads are both in limited supply, then time cost is the last resort for extending the operation time until it takes long enough.</p>
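<p>If you want a feel for these trade-offs before touching application code, the reference <code>argon2</code> CLI (packaged as “argon2” on Debian and Ubuntu) prints its own timing; a rough sketch, with a throwaway password and salt:</p>
<pre><code># 64 MiB (2^16 KiB), 3 passes, 4 lanes
echo -n 'not-my-password' | argon2 somesalt -id -m 16 -t 3 -p 4
# same memory and passes, 8 lanes
echo -n 'not-my-password' | argon2 somesalt -id -m 16 -t 3 -p 8</code></pre>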
<p>Bonus discovery: in PHP, the argon2id memory is separate from the memory limit. <code>memory_get_peak_usage()</code> reported the same number at the beginning and end of my test script, even for the 1+ GiB tests.</p>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-75964803590583424352023-01-28T06:13:00.000-05:002023-01-28T06:13:20.082-05:00Experiences with AWS<p>Our core infrastructure is still EC2, RDS, and S3, but we interact with a much larger number of AWS services than we used to. Following are quick reviews and ratings of them.</p>
<p><strong>CodeDeploy</strong> has been mainly a source of irritation. It works <em>wonderfully</em> to do all the steps involved in a blue/green deployment, but it is <em>never</em> ready for the next Ubuntu LTS after it launches. As I write, AWS said they planned to get the necessary update out in May, June, September, and September 2022; it is now January 2023 and Ubuntu 22.04 support has not officially been released. <a href="https://github.com/aws/aws-codedeploy-agent/issues/301#issuecomment-1124810261">Ahem.</a> 0/10 am thinking about writing a Go daemon to manage these deployments instead. I am more bitter than a Switch game card.</p>
<p><strong>CodeBuild</strong> has ‘environments’ thrown over the wall periodically. We added our scripts to install PHP from <a href="https://launchpad.net/~ondrej/+archive/ubuntu/php">Ondřej Surý’s PPA</a> instead of having the environment do it, allowing us to test PHP 8.1 separately from the Ubuntu 22.04 update. (Both went fine, but it would be easier to find the root cause with the updates separated, if anything had failed.) “Build our own container to route around the damage” is on the list of stuff to do eventually. Once, the CodeBuild environment had included a buggy version of git that segfaulted unless a config option was set, but AWS did fix that after a while. 9/10 solid service that runs well, complaints are minor.</p>
<p><strong>CodeCommit</strong> definitely had some growing pains. It’s not as bad now, but it remains obviously slower than GitHub. After a long pause with 0 objects counted, all objects finish counting at once, and then things proceed pretty well. The other thing of note is that it only accepts RSA keys for SSH access. 6/10 not bad but has clearly needed improvement for a long time. We are still using it for all of our code, so it’s not <em>terrible.</em></p>
<p><strong>CodePipeline</strong> is great for what it does, but it has limited built-in integrations. It can use AWS Code services… or Lambda or SNS. 8/10 conceptually sound and easy to use as intended, although I would rather implement my own webhook on an EC2 instance for custom steps.</p>
<p><strong>Lambda</strong> has been quarantined to “only used for stuff that has no alternative,” like running code in response to CodeCommit pushes. It appears that we are charged for the <em>wall time</em> to execute, which is okay, but means that we are literally paying for the latency of every AWS or webhook request that Lambda code needs to make. 3/10 all “serverless” stuff like Lambda and Fargate are significantly more expensive than being server’d. Would rather implement my own webhook on an EC2 instance.</p>
<p><strong>SNS</strong> [Simple Notification Service] once had a habit of dropping subscriptions, so our ALB health checks (formerly ELB health checks) embed a subscription-monitor component that automatically resubscribes if the instance is dropped. One time, I had a topic deliver to email before the actual app was finished, and the partner ran a load test without warning. I ended up with 10,000 emails the next day, 0 drops and 0 duplicates. 9/10 has not caused any trouble in a long time, with plenty of usage.</p>
<p><strong>SQS</strong> [Simple Queue Service] has been 100% perfectly reliable and intuitive. 10/10 exactly how an AWS service should run.</p>
<p><strong>Secrets Manager</strong> has a lot of caching in front of it these days, because it seems to be subject to global limits. We have observed throttling at rates that are 1% or possibly even less of our account’s stated quota. The caching also helps with latency, because they are either overloaded (see previous) or doing Serious Crypto that takes time to run (in the vein of bcrypt or argon2i). 8/10 we have made it work, but we might actually want AWS KMS instead.</p>
<p><strong>API Gateway</strong> has ended up as a fancy proxy service. Our older APIs still have an ‘API Definition’ loaded in, complete with <a href="https://www.decodednode.com/2016/09/aws-api-gateway-returning-404-errors.html">stub paths to return 404</a> instead of the default 403 (which had confused partners quite a bit.) Newer ones are all simple proxies. We don’t gzip-encode responses to API Gateway because <a href="https://www.decodednode.com/2016/12/api-gateway-vs-gzip-errcontentdecodingf.html">it failed badly in the past.</a> 7/10 not entirely sure what value this provides to us at this point. We didn’t end up integrating IAM Authentication or anything.</p>
<p><strong>ACM</strong> [AWS Certificate Manager] provides all of our certificates in production. The whole point of the service is to hide private keys, so the development systems (not behind the load balancer) use Let’s Encrypt certificates instead. 10/10 works perfectly and adds security (vs. having a certificate on-instance.)</p>
<p><strong>Route53 Domains</strong> is somewhat expensive, as far as registering domains goes, but the API accessibility and integration with plain Route53 are nice. It is one of the top-3 services on our AWS bill because we have a “vanity domain per client” architecture. 9/10 wish there was a bulk discount.</p>
<p><strong>DynamoDB</strong> is perfect for workloads that suit <em>non-queryable</em> data, which is to say, <a href="https://www.decodednode.com/2018/05/why-we-have-memcached-dynamo-proxy.html">we use it for sessions,</a> and not much else. It has become usable in far more scenarios with the additions of TTL (expiration times) and secondary indexes, but still remains niche in our architecture. 9/10 fills a clear need, just doesn’t match very closely to our needs.</p>
<p><strong>CloudSearch</strong> has been quietly powering “search by name” for months now, without complaints from users. 10/10 this is just what the doctor ordered, plain search with no extra complexity like “you will use this to parse logs, so here are extra tools to manage!”</p>
<p>That’s it for today. Tune in next time!</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-87973086044108498572023-01-26T18:30:00.000-05:002023-01-26T18:30:00.240-05:00FastCGI in Perl, but PHP Style [2012]<p><i>Editor's note: I found this in my drafts from 2012. By now, everything that can be reasonably
converted to FastCGI has been, and a Perl-PHP bridge has been built to allow new code to be written
for the site in PHP instead. However, the conclusion still seems relevant to designers working on
frameworks, so without further ado, the original post follows...</i></p>
<p>The first conversions of CGI scripts to FastCGI have been launched into production. I have both the
main login flow and six of the most popular pages converted, and nothing has run away with the CPU
or memory in the first 50 hours. It’s been completely worry-free on the memory front, and I owe it
to the PHP philosophy.</p>
<p>In PHP, users generally don’t have the option of persistence. Unless something has been carefully
allocated in persistent storage in the PHP kernel (the C level code), everything gets cleaned up at
the end of the request. Database connections are the famous example.</p>
<p>Perl is obviously different, since data can be trivially kept by using package level variables to
stash data, but my handler-modules (e.g. <code>Site::Entry::login</code>) don’t use them. Such
handler-modules define one well-known function, which returns an object instance that carries all
the necessary state for the dispatch and optional post-dispatch phases. When this object is
destroyed in the FastCGI request loop, so too are all its dependencies.</p>
<p>Furthermore, dispatching <em>returns</em> its response, WSGI style, so that if dispatch
<code>die</code>s, the FastCGI loop can return a generic error for the browser. Dispatch isn’t
allowed to write anything to the output stream directly, including headers, which guarantees a blank
slate for the main loop’s error page. (I once wrote a one-pass renderer, then had to grapple with
questions like “How do I know whether HTML has been sent?”, “How do I close half-sent HTML?”, and
“What if it’s not HTML?” in the error handler.)</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-30791411219167129882023-01-22T10:28:00.001-05:002023-01-28T06:16:40.104-05:00PHP’s PDO, Single-Process Testing, and 'Too Many Connections'<p>Quite some time ago now, I ran into a problem with running a test suite: at some point, it would fail to connect to the database, due to too many connections in use.</p>
<p>Architecturally, each test sent a PSR-7 Request through the HTTP layer, which caused the back-end code under test to connect to the database in order to fulfill the request. All of these resources (statement handles and the database handle itself) <em>should</em> have been out of scope by the end of the request.</p>
<p>But every PDOStatement has a reference to its parent PDO object, and apparently each PDO keeps a reference to all of its PDOStatements. There was no memory pressure (virtually all <em>other</em> allocations were being cleaned up between tests), so PHP wasn’t trying to collect cycles, and the PDO objects were keeping connections open the whole duration of the test suite.</p>
<p>Lowering the connection limit in the database engine (a local, anonymized copy of production data) caused the failure to occur much sooner in testing, proving that it was an environmental factor and not simply “unlucky test ordering” that caused the failure.</p>
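<p>For what it’s worth, on a MySQL-family engine (an assumption on my part; the point doesn’t depend on which engine it is), lowering the ceiling for an experiment like that is a one-liner, and the exact value just needs to be smaller than the test suite’s appetite:</p>
<pre><code>mysql -u root -p -e "SET GLOBAL max_connections = 40"</code></pre>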
<p>Using phpunit’s <code>--process-isolation</code> cured the problem entirely, but at the cost of <em>a lot</em> of time overhead. This was also expected: with the PHP engine shut down entirely between tests, all of its resources (including open database connections) were cleaned up by the OS.</p>
<p>Fortunately, I already had a database connection helper for other reasons: loading credentials securely, setting up exceptions as the error mode, choosing the character set, and retrying on failure if AWS was in the middle of a failover event (“Host not found”, “connection refused”, etc.) It was a relatively small matter to detect “Too many connections” and, if it was the first such error, issue <code>gc_collect_cycles()</code> before trying again.</p>
<p>(Despite the “phpunit” name, there are functional and integration test suites for the project which are also built on phpunit. Then, the actual tests to run are chosen using <code>phpunit --testsuite functional</code>, or left at the default for the unit test suite.)</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-46432362899932490702022-12-28T16:07:00.004-05:002024-02-02T19:15:42.641-05:00Linux Behavior Without Swap<p>We had a runaway script clog all of the memory on a micro EC2 Ubuntu instance. Not enough that the kernel <acronym title="Out-of-Memory">OOM</acronym> killer would do anything, and not enough that <em>the script itself</em> hit the PHP memory limit, but enough to make the instance become unresponsive for 45 minutes.</p>
<p>I <em>have</em> sent Linux into thrashing, back in the old days when typical desktop RAM sizes were less than 1 GB and SSDs weren’t available yet. What surprised me was <em>just how similar</em> “running out of RAM” felt in modern times, even with the OOM killer. It let the system bog down instead of killing a process!</p>
<p>We chose to mitigate the issue at work by expanding the instance, so that it has more RAM than <code>memory_limit</code> now. It will take more than one simultaneous runaway script to bring it down in the future. (We also fixed the script. I don’t like throwing resources at problems, in general.)</p>
<p>Then one day, via <a href="https://utcc.utoronto.ca/~cks/space/blog/linux/SystemdOomdNowDisabled?showcomments#comments">pure serendipity</a>, I found out about <a href="https://github.com/rfjakob/earlyoom">earlyoom</a>. I have added it to our pet instances, and I’m considering it for the cattle template, but it hasn’t been well-tested due to our previous mitigations. The instance simply <em>doesn’t</em> run out of RAM anymore.</p>
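<p>Getting it going is about as simple as it gets; the package is in the Debian and Ubuntu repositories (the service may already be enabled by the package, but being explicit doesn’t hurt):</p>
<pre><code># apt install earlyoom
# systemctl enable --now earlyoom</code></pre>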
<p>At home, I first set up swap on zram so that Ubuntu Studio would have a place to “swap out” 2+ GB (out of 12 GB installed), and then recently added a swap partition <a href="https://www.decodednode.com/2022/12/debootstraping-recovery-partition.html">while I was restructuring things</a> anyway. It’s not great for realtime audio to swap; but “not having swap” doesn’t appear to change the consequences of memory pressure, so I put some swap in. With a dedicated swap partition added, I reduced the zram area to 512 MB. I still want to save the SSD if there’s a small-to-moderate amount of swap usage.</p>
<p><strong>UPDATE: <a href="https://www.decodednode.com/2023/09/update-on-earlyoom.html">This was imperfect, as it turns out;</a></strong> if you have a swap partition, you should <em>remove</em> <code>zram</code>, and use <code>zswap</code> instead.</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-41838161298119239802022-12-27T15:40:00.000-05:002022-12-27T15:40:00.093-05:00debootstrap’ing a Recovery Partition<p>One of the nicer things about trying Fedora on my <a href="https://system76.com/laptops/darter">work laptop</a> is that when it broke the boot loader, there was a functioning Recovery Mode.</p>
<p>My desktop relies on <a href="https://github.com/morrownr/88x2bu">a particular driver</a> for WiFi, and upgrading the kernel (e.g. from Ubuntu 22.04 to 22.10) requires fully reinstalling it. But what if the kernel upgrades to a version that isn’t supported by the copy of the driver I happened to have on disk? And I didn’t want to “just” (disassemble and move the PC to) plug in an Ethernet cable?</p>
<p>I used the Fedora live environment to make a little bit of room for an 8 GiB partition at the end of the disk (and a 2 GiB swap partition, as long as I was there), and then I ran <code>debootstrap</code> to fill it in. This is about what surprised me doing that.</p>
<p>tl;dr: debootstrap is <em>a lot more aggressively minimalist,</em> more like Arch, than I would have expected.</p>
<a name='more'></a>
<p>Using debootstrap from a later Ubuntu (22.10) to install Debian Bullseye (2021-08) seemed to work fine.</p>
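<p>The invocation itself is short. Something like the following, with <code>/dev/sdz9</code> standing in for the new 8 GiB partition (as always, substitute your own device node):</p>
<pre><code># mkfs.ext4 /dev/sdz9
# mount /dev/sdz9 /mnt
# debootstrap --arch amd64 bullseye /mnt http://deb.debian.org/debian
# chroot /mnt /bin/bash</code></pre>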
<p>The guide I was reading had an example sequence of how debootstrapping should work, suggesting I run <code>dselect</code> as my first thing in the chroot. Well, the command wasn’t found, and when I installed it, I hated it. The package list is grouped by <em>several</em> layers I don’t find terribly relevant, as I’m not a Debian release manager.</p>
<p>I went back to the familiar <code>apt</code> command to install dkms, build-essential, and git. I cloned the driver from github. Those were the things I knew I would need. I would come back later to install the things I didn’t know I would need: <strong>linux-image-amd64,</strong> iw, rfkill, iwd, and (because I use Dvorak) console-setup.</p>
<p>I also tracked down and ran <code>dpkg-reconfigure locales</code> in the chroot, because perl was constantly complaining that there was no <code>en_US.UTF-8</code> locale. The locale setting had leaked in from the host environment outside the chroot, but no locales had actually been generated inside it.</p>
<p>I found the missing kernel right away; I couldn’t set up GRUB in Ubuntu to make use of Debian’s kernel, because <em>it wasn’t there.</em></p>
<p>The next hurdle was that the system booted with a read-only root partition. It turns out that debootstrap leaves <code>/etc/fstab</code> blank. I ran <code>mount -o remount,rw /</code> to fix it, then mounted the Ubuntu partition and copied the syntax from its <code>fstab</code> file. I forgot about <code>blkid</code>; I got the correct UUID with <code>lsblk -o NAME,UUID</code> instead. I don’t want to lock in a fixed physical architecture (i.e. the traditional /dev/sdb3 names.)</p>
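<p>A minimal root entry, with a placeholder UUID, looks something like this:</p>
<pre><code># /etc/fstab  (get the real UUID from lsblk -o NAME,UUID)
UUID=01234567-89ab-cdef-0123-456789abcdef  /  ext4  errors=remount-ro  0  1</code></pre>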
<p>With that out of the way, I moved on to installing the driver, which is where iw and rfkill came in. The installation only told me that <code>iw</code> was missing, but I suspected there would be more, so I searched the script and found everything it wanted from <code>iw</code> downward. It was back to Ubuntu (working WiFi) to chroot in and install those.</p>
<p>With that, the driver built fine; it took another cycle through Ubuntu to read the Arch wiki, install <code>iwd</code>, set up a <code>.network</code> file for systemd, and reboot to Debian to supply the WPA2 password. That last bit was a matter of running <code>iwctl</code>, typing <code>station wlx984827[…] connect [SSID]</code>, and following the password prompt. Success! I had an IP address!</p>
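<p>The <code>.network</code> file doesn’t need much. A minimal sketch, assuming DHCP and the usual <code>wlx…</code> predictable interface names (the file name itself is arbitrary):</p>
<pre><code># /etc/systemd/network/25-wireless.network
[Match]
Name=wlx*

[Network]
DHCP=yes</code></pre>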
<p>I went ahead and installed <code>links</code> and <code>gpm</code>, and then I was done.</p>
<p>(To boot Debian, I added a <code>/boot/grub/custom.cfg</code> file, sourced by Ubuntu’s <code>/etc/grub.d/41_custom</code>. This must not be confused with <code>/etc/grub.d/40_custom</code>, which one would edit directly and thus cause conffile conflicts forever after. The custom.cfg is a pair of menuentry blocks for the <code>/vmlinuz</code> and <code>/vmlinuz.old</code> symlinks in Debian’s root partition, with kernel options <code>ro quiet splash</code>. debootstrap didn’t install a boot loader, so I left the partition that way. The Debian GRUB can’t take over the boot process if there’s no Debian GRUB. If I break Ubuntu’s GRUB, I can’t boot, but the main thing I’m worried about is breaking Ubuntu’s <em>internet.</em>)</p>
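<p>A menuentry for this kind of setup might look roughly like the following; this is my sketch rather than the exact file, and the UUID is a placeholder for the Debian root filesystem. The second entry is the same, but pointing at the <code>.old</code> symlinks.</p>
<pre><code>menuentry "Debian recovery" {
    search --no-floppy --fs-uuid --set=root 01234567-89ab-cdef-0123-456789abcdef
    linux /vmlinuz root=UUID=01234567-89ab-cdef-0123-456789abcdef ro quiet splash
    initrd /initrd.img
}</code></pre>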
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-85904677885221970372022-12-17T19:39:00.004-05:002022-12-19T20:38:40.721-05:00Container Memory Usage<p>How efficient is it to run multiple containers with the same code, serving different data? I am most familiar with a “shared virtual hosting” setup, with a few applications behind a single Web frontend and PHP-FPM pool. How much would I lose, trying to run each app in its own container?</p>
<p>To come up with a first-order approximation of this, I made a pair of minimal static sites and used the <code>nginx:mainline-alpine</code> image (ID 1e415454686a) to serve them. The overall question was whether the image layers would be shared between multiple containers <em>in the Linux memory system,</em> or whether each container would end up with its own copy of everything.</p>
<p><strong>Updated 2022-12-19:</strong> This post has been substantially rewritten and expanded, because Ubuntu would not cleanly reproduce the numbers, requiring deeper investigation.</p>
<a name='more'></a>
<h3>Test Setups</h3>
<p>The original test ran on Debian 11 (Bullseye) using Podman, rootless Podman, the “docker.io” package, and uncontained nginx-mainline. The first three are from the Debian 11 package repository. Due to installing the VM guests on separate days, the Podman VM had kernel 5.10.0-19, and the Docker and uncontained VMs had kernel 5.10.0-20. Debian VMs were configured with 2 CPUs, 1024 MiB of RAM, and a 1024 MiB swap partition (unneeded.)</p>
<p>The Ubuntu test ran on Ubuntu 22.10 (Kinetic) with Podman, rootless Podman, and “docker.io” only; the uncontained test was not reproduced. Ubuntu also used the <code>fuse-overlayfs</code> package, which was not installed on Debian, so rootless Podman shows different sharing behavior in the <code>lsof</code> test.</p>
<p>The following versions were noted on the Ubuntu installations: <strong>docker.io</strong> used docker.io 20.10.16-0ubuntu1, containerd 1.6.4-0ubuntu1.1, and runc 1.1.2-0ubuntu1.1. <strong>podman</strong> used podman 3.4.4+ds1-1ubuntu1, fuse-overlayfs 1.9-1, golang-github-containernetworking-plugin-dnsname 1.3.1+ds1-2, and slirp4netns 1.2.0-1.</p>
<p>In an attempt to increase measurement stability, ssh, cron, and unattended-upgrades services were all stopped and deactivated on Ubuntu.</p>
<h3>Procedures</h3>
<p>The test cycle involved cold-booting the appropriate VM, logging into the console, checking <code>free</code>, and starting two prepared containers. (The containers were previously run with a bind-mount from the host to the container’s document root.) I accessed the Web pages using <code>links</code> to be sure that they were working properly, and then alternately checked <code>free</code> and stopped each container. I included a <code>sleep 1</code> command between stopping the container and checking the memory, to give the container a chance to exit fully.</p>
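<p>For concreteness, the run commands for a setup like this look something like the following; this is a reconstruction rather than the exact commands, and the ports and host paths are placeholders (rootful Podman shown; the Docker commands are the same apart from the binary name):</p>
<pre><code># podman run -d --name site1 -p 8081:80 \
    -v /srv/site1:/usr/share/nginx/html:ro \
    docker.io/library/nginx:mainline-alpine
# podman run -d --name site2 -p 8082:80 \
    -v /srv/site2:/usr/share/nginx/html:ro \
    docker.io/library/nginx:mainline-alpine</code></pre>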
<p>On a separate run from finding the memory numbers, I also used <code>lsof</code> to investigate what the kernel reported as open files for the containers. In particular, lsof provides a “NODE” column with the file’s inode number. If these are different for the same file in the container image, then it shows that the container is not sharing the files.</p>
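<p>The check itself boils down to pulling up the nginx processes from both containers and comparing the DEVICE and NODE columns for their open files; roughly:</p>
<pre><code># lsof -p "$(pgrep -d, -x nginx)"
[compare DEVICE and NODE for each container's copy of the same file]</code></pre>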
<p>The uncontained test is similar: boot, login on the console, check RAM, start nginx.service, access the pages, check the memory, stop nginx.service, and check the memory. The <code>lsof</code> research does not apply; multiple nginx instances do not exist.</p>
<p>Due to memory instability observed in the first round of Ubuntu testing, the tests were repeated with <a href="https://github.com/pixelb/ps_mem/">ps_mem</a> used to observe the PIDs associated with the containers, in order to get a clearer view of RAM usage of the specific containers.</p>
<p>Finally, a separate round of tests was done with <code>ps_mem</code> again, to get the breakdown by process with both containers running.</p>
<h3>Debian Results</h3>
<p>Limitation: I used <code>free -m</code> which was not terribly precise.</p>
<p>The Podman instance boots with 68-69 MiB of RAM in use, while the Docker instance takes 122-123 MiB for the same state (no containers running.)</p>
<p>Rootless Podman showed <strong>different</strong> inode numbers in lsof, and consumed the most memory per container: shutting things down dropped the used memory from 119 to 96 to 72 MiB. Those are drops of <strong>23 and 24 MiB.</strong></p>
<p>Podman in its default (rootful) mode shows the <strong>same</strong> inode numbers, and consumes the least memory per container: the shutdown sequence went from 77 to 75 to 73 MiB, dropping <strong>2 MiB</strong> each time.</p>
<p>Docker also shows the <strong>same</strong> inode numbers when running, but falls in between on memory per container: shutdown went from 152 to 140 to 129 MiB, which are drops of <strong>12 and 11 MiB.</strong></p>
<p>In the uncontained test, for reference, memory was difficult to measure. On the final run, <code>free -m</code> reported 68 MiB used after booting, 70 MiB while nginx was running, and 67 MiB after nginx was stopped. This is reasonable, since the nginx instance shares the host’s dynamic libraries, especially glibc.</p>
<h3>Ubuntu Results</h3>
<p>In the interests of being open and transparent about the quality of the methodology, the discredited data is also being reported here.</p>
<h4>Rootless Podman</h4>
<pre><code> (KiB) used free shared
boot 184360 1501076 1264
1 container 183144 1494032 1336
2 containers 205780 1470908 1400
1 container 177096 1493032 1356
final 190840 1479180 1312
</code></pre>
<p>Note that overall memory usage goes “down” after starting the first container, and “up” when stopping the second container.</p>
<p>The ps_mem results for slirp4netns, containers-rootlessport(-child), conmon, and the nginx processes:</p>
<pre><code>2 containers 64.3 MiB RAM
1 container 46.0 MiB
</code></pre>
<p>Matching the Debian results, rootless podman adds significant memory overhead (39.8% or 18.3 MiB) in this test.</p>
<p>With <code>fuse-overlayfs</code> installed, <code>lsof</code> showed the same inode numbers being used between the two containers, but on different devices. Previously, on Debian, they appeared on the actual SSD device, but with different inode numbers. The “same inodes, different device” matches the results when running the containers in rootful mode on Ubuntu. I did not pay attention to the device numbers in rootful mode on Debian.</p>
<p>The two-container breakdown (note again, this is a separate boot from the previous report, so does not total the 64.3 MiB shown above):</p>
<pre><code>Private Shared Sum Processes
708.0 K 434.0 K 1.1 M conmon
664.0 K 561.0 K 1.2 M slirp4netns
2.5 M 5.4 M 7.9 M nginx
15.4 M 12.2 M 27.6 M podman
15.3 M 12.6 M 27.9 M exe
65.6 MiB total
</code></pre>
<p>“podman” corresponds to <code>containers-rootlessport-child</code> in the output of <code>ps</code>, and “exe” is <code>containers-rootlessport</code>.</p>
<h4>Rootful Podman</h4>
<pre><code> (KiB) used free shared
boot 174976 1503404 1268
1 container 183880 1478800 1352
2 containers 194252 1467796 1420
1 container 164008 1497780 1372
final 184480 1477396 1324
</code></pre>
<p>The measurement problem was even more dramatic. Memory usage plummeted to “lower than freshly booted” levels after stopping one container, then bounced back <strong>up</strong> after stopping the second container. Neither of these fit expectations.</p>
<p>Rootful podman only needs the conmon and nginx processes, which leads to the following ps_mem result:</p>
<pre><code>2 containers 9.1 MiB RAM
1 container 6.6 MiB
</code></pre>
<p>The overhead remains high at 37.9%, but it is only 2.5 MiB due to the much lower starting point.</p>
<p>Here’s the breakdown with both containers running:</p>
<pre><code>Private Shared Sum Processes
708.0 K 519.0 K 1.2 M conmon
2.6 M 5.4 M 8.0 M nginx
9.2 MiB total
</code></pre>
<p>Without the <code>containers-rootlessport</code> infrastructure, memory usage is vastly lower.</p>
<h4>Docker</h4>
<pre><code> (KiB) used free shared
boot 192088 1430660 1104
1 container 213896 1355964 1192
2 containers 246264 1322052 1280
1 container 245940 1322052 1192
final 194276 1373788 1104
</code></pre>
<p>Calculating the deltas would suggest 21.3 MiB and 31.6 MiB to start the containers, but then 0.32 MiB and 50.4 MiB released when shutting them down.</p>
<p>Testing with ps_mem across all the container-related processes (<code>docker-proxy</code>, <code>containerd-shim-runc-v2</code>, and the <code>nginx</code> main+worker processes), I got the following:</p>
<pre><code>2 containers 25.4 MiB RAM
1 container 19.3 MiB
</code></pre>
<p>That suggests that the second container added 31.6% overhead (6.1 MiB) to start up.</p>
<p>The breakdown for 2-container mode:</p>
<pre><code>Private Shared Sum Processes
2.8 M 1.7 M 4.4 M docker-proxy
2.8 M 5.3 M 8.0 M nginx
5.0 M 7.9 M 12.9 M containerd-shim-runc-v2
25.4 MiB total
</code></pre>
<p>We see that containerd-shim-runc-v2 is taking just over half of the memory here. Of the rest, a third goes to docker-proxy, leaving less than one-third of the total allocation dedicated to the nginx processes inside the container.</p>
<h4>Uncontained</h4>
<p>I only collected stats for <code>ps_mem</code> this time:</p>
<pre><code>Private Shared Sum Processes
1.4 M 1.6 M 3.0 M nginx
</code></pre>
<p>This configuration is two document roots served by one nginx setup, rather than two nginx setups, so isolation is even lower than simply being uncontained. However, it represents a lower bound on what memory usage could possibly be.</p>
<h3>Conclusions</h3>
<p>Running podman rootless costs quite a bit of memory, but running it in rootful mode beats Docker’s consumption.</p>
<p>Both container managers can share data from the common base layers while running in memory, but Podman may require <code>fuse-overlayfs</code> to do so when running rootless.</p>
<p>For every answer, another question follows. It’s not that the project is finished; I simply quit working on it.</p>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3922219755971684412.post-62396163372534150582022-12-03T16:55:00.003-05:002022-12-19T20:42:36.319-05:00Failing to Install Fedora 37<p>Waiting for yet another 80+ MB download of a <code>.deb</code> file, I decided to try to dual-boot Fedora on my Pop!_OS <a href="https://system76.com/laptops/darter">laptop</a> (darp8). Because <code>drpm</code>s exist. That’s it, that’s the whole reason. <br /></p>
<p>[<strong>Update 2022-12-04:</strong> Because all resize-related commands use “as much space as possible” by default, I decided to try again, shrinking the filesystem/LV/PV more than was necessary, and then re-expanding them after the partition was changed. I got Fedora installed, but the system76 extension for the power manager <a href="https://github.com/pop-os/gnome-shell-extension-system76-power/issues/85">doesn’t work on Gnome 43</a>. Gnome claims <a href="https://gjs.guide/extensions/overview/updates-and-breakage.html#monkey-patching">it’s not their fault that the system they created</a> frequently breaks extensions in general; I assume they feel the same about the specific case here, where they <a href="https://gjs.guide/extensions/upgrading/gnome-shell-43.html#quick-settings">changed the menu API</a> instantly and completely.]</p>
<p>[<strong>Update 2022-12-19:</strong> I pretty quickly gave up on Fedora. It had a tendency to result in the laptop rebooting into recovery mode after using the Fedora install. Booting Linux is clearly not important enough to get standardized. Oh well!]</p>
<a name='more'></a>
<p>I had asked for encryption during the initial setup, and the machine is UEFI with coreboot and open firmware, so the disk has a GPT setup allocated thus:</p>
<ol>
<li>0.5 GB EFI SYSTEM</li>
<li>4.0 GB RECOVERY</li>
<li>923 GB (no name LUKS type)</li>
<li>4.0 GB (no name encrypted swap)</li>
</ol>
<p>I knew <a href="https://www.decodednode.com/2022/07/rescuing-encrypted-popos-2204.html">from past exploration</a> that the LUKS partition contained an LVM physical volume, with a volume group and logical volume filling the whole space. It’s not so hard to <code>resize2fs</code> and then <code>lvreduce</code>… very carefully. I had 99 GB of data in use, so I booted into the on-disk recovery image this time, shrank the filesystem on the LV to 400 GiB, then set the LV itself to the matching number of LVM extents. That part worked flawlessly.</p>
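<p>For the record, the shrink sequence is conceptually the following. This is only a sketch with placeholder VG/LV names, and shrinking filesystems can destroy data, so double-check sizes and have a backup first. Shrinking the filesystem a little below the target and growing it back afterward avoids any off-by-one-extent surprises:</p>
<pre><code># e2fsck -f /dev/mapper/data-root
# resize2fs /dev/mapper/data-root 395G
# lvreduce -L 400G /dev/data/root
# resize2fs /dev/mapper/data-root</code></pre>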
<p>At this point, I headed into the Fedora installer. Things went okay, until selecting partitioning. I exclusively used the “Advanced Custom (Blivet GUI)” option, and that was kind of a mess. I had to unlock the LUKS partition; afterward, the LVM volume group appears as a second entry in the sidebar. Fair enough. It had my 400 GB LV inside, and some free space, and then things started going off the rails.</p>
<p>Failure #1: I didn’t understand the <code>Name</code> field, so I left it blank, and Blivet named it (the new logical volume) <code>00</code>. That seemed like it would be unnecessarily confusing later on, so I deleted that, and put in a new LV with a better name.</p>
<p>Failure #2: At this point, Blivet said it had 3 actions pending: create/format, delete, create/format. “Undo” would take out the only change I wanted to keep, so I clicked Reset All. It cleared the operations queue, but it also put the GUI into a state where it reported the LUKS partition was locked (it wasn’t), but <i>also</i> would not let me unlock it. I re-locked it through the Terminal, but it didn’t help. I rebooted to try again.</p>
<p>Alternative hypothesis: maybe the VG was still in the sidebar and I didn’t notice it. I was too n00b to cover every case yet.</p>
<p>Failure #3: After this point, Blivet never had the keyboard in Dvorak ever again, even after cold boots of the installer. I’m actually not certain if it did the first time, or if that detail just got lost with all the other stuff going on at that point.</p>
<p>Failure #4: Fedora does not want /boot to be on a logical volume. There is a bug, where the discussion is basically, “we should support everything reasonable, but md-raid is too hard.” I found another thread somewhere, where Lennart “You’re Doing It Wrong” Poettering said that the EFI System Partition (ESP) should be mounted at <code>/efi</code>, or even <code>/boot</code> because there’s no point to having <code>/boot</code> when using EFI, and <code>/boot/efi</code> is “dumb” [sic] (no reasoning given.) Fedora does not want the ESP to be mounted anywhere except <code>/boot/efi</code>, though.</p>
<p>Failure #5: I discovered it was possible to use <code>pvresize</code> to shrink the LVM PV. I wasn’t entirely sure that I could have the partitions out of order (it seemed like the easiest thing for an editor to do would be to append a 5th GPT entry, pointing to the 4th extent on disk) but I decided to try. Unfortunately, <code>fdisk</code> reported 32,768 more sectors allocated to nvme0n1p2 than <code>pvresize</code> said in its “pretending the disk is … not Y sectors” message, which completely destroyed my confidence that I could resize the partition around it correctly.</p>
<p>I gave up at this point. I can’t coax the system into a “supported” configuration, and I’m not really willing to install an OS on a production machine—this is the one where the money is made—in an unsupported, “may fail at any time” configuration.</p>
<p>Meanwhile, in Pop!_OS, the system boots just fine with /boot on LVM and the kernel/initrd stored in the ESP. I really thought the whole point of EFI was to make OS-specific boot loaders irrelevant. (Like Multiboot, but complicated, albeit with Microsoft and Intel supporting it.) I don’t understand why the Fedora “Let’s Remove BIOS Support” Project requires GRUB and <code>/boot</code> (non-nested) with UEFI.</p>Unknownnoreply@blogger.com0