<?xml version="1.0" encoding="utf-8"?>
   <feed xmlns="http://www.w3.org/2005/Atom">
     <title type="text">dcz's posts</title>
     <subtitle type="html">
       Thoughts on software and society.
     </subtitle>
     <updated>2026-03-08T14:00Z</updated>
     <link rel="alternate" type="text/html"
      hreflang="en" href="https://dorotac.eu/"/>
     <link rel="self" type="application/atom+xml"
      href="https://dorotac.eu/atom.xml"/>
     <rights>CC-BY-SA 4.0</rights>
       <author>
         <name>dcz</name>
         <uri>https://dorotac.eu/</uri>
       </author>
     
     <entry>
       <title>Wayland input method project post-mortem</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/im-finished/"/>
       <id>tag:dorotac.eu,2026-03-08:posts/im-finished</id>
       <updated>2026-03-08T14:00Z</updated>
       <published>2026-03-08T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Wayland input method project post-mortem</h1>
<p>Input methods on Wayland have a bit of a chicken-and-egg problem.</p>
<p>That topic has been my main occupation for the past year, thanks to <a href="https://nlnet.nl/project/WaylandInput/">NLNet</a>, and thanks to the fact that I was a cause of some of the problems in the first place.</p>
<p>But the time has come to move on, and take a less active role. So here's a summary of what I did, what I didn't do, and what I think about the whole topic.</p>
<h2>Chicken and egg</h2>
<p>Long story short, the chicken is users and the egg is implementations.</p>
<p>An input method is a very user-facing thing. It's already in the name: it's used for input. It's for the user to interact with the computer. Let the smartphone user type, the Japanese speaker write Kanji. Propose autocompletion from a dictionary. Use the right layout or language. Stay out of the way when unneeded.</p>
<p>For such a user-focused feature, the work I did was exceptionally user-unfocused. Tech demos instead of user testing. There was no one to use what I did every day, or even show me an example of a daily task. What I created isn't enough even for myself. I couldn't empirically evaluate if what I did was an improvement.</p>
<p>Why?</p>
<p>It comes down to manpower and demand.</p>
<h3>Manpower</h3>
<p>Most application software people use daily are built using the GTK or Qt toolkits. They most likely communicate with the input method through Mutter or Kwin as the compositor. The third relevant component to me is an Input Method Engine (IME), like fcitx or ibus.</p>
<p>To test my work on someone's daily workflow, I'd have to modify all 3 components and find a user who uses that specific combination. Then take them through the trouble of setting up the patched versions on their system.</p>
<p>It's not impossible, but for a single person, it's infeasible. Right from the start of this project, taught by the lessons implementing text-input in GTK and wlroots, I knew that I'm too stupid to write C/C++ code, and I can't afford the time debugging what I would have written. So I selected a Rust-based stack of <a href="https://github.com/Smithay/smithay/">smithay</a>, <a href="https://lap.dev/floem/">floem</a>, and, uh, <a href="https://codeberg.org/dcz/stiwri">my own input method</a> because I couldn't find any (now I know about <a href="https://github.com/Riey/kime">kime</a> – a bit late, though). Sadly, pretty much no one is using that combo, but at least it made proofs of concept possible.</p>
<h3>Demand</h3>
<p>But that's not the only way to get a working stack for users to test. What if someone else did the work? What if the relevant projects decided to implement my experiments themselves, to let their users test? I submitted all the protocols as <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/tree/8fcd62b39d4f893bc7aaab6f3857d4aa2c61beb0/experimental/xx-input-method">official</a> <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/tree/8fcd62b39d4f893bc7aaab6f3857d4aa2c61beb0/experimental/xx-keyboard-filter">experiments</a> in <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/tree/8fcd62b39d4f893bc7aaab6f3857d4aa2c61beb0/experimental/xx-text-input">wayland-protocols</a>, after all!</p>
<p>But that hasn't happened.</p>
<p>What gives? My suspicion is that most of the time, there isn't an expert in the area who could do the work. The rest of the time, the improvements aren't seen as worth the extra effort. Users aren't burning developers' effigies, demanding better Kanji. Linux mobile users aren't flooding bug trackers with complaints that the keyboard shoves their apps around for no good reason (it totally does). Linux Mobile is Android, after all, right? (<a href="https://phosh.mobi/">No, it's not.</a>) Android doesn't use Wayland.</p>
<p>A bit more puzzling is the relative silence of the Chinese users. But only a bit. Think about it: they are on the other side of The Great Firewall. The other side of the English-Chinese language barrier. The other side of the chasm between the Western and the Eastern cultures. How can we <em>not</em> expect to see a disconnect?</p>
<h4>Good enough</h4>
<p>I think the most important reason driving this lack of demand is that the existing solution is good enough. Input methods on Wayland work 90% of the time. They fall apart only 10% of the time, and everyone knows the last 10% is 90% of the work.</p>
<p>Unfortunately, that last 10% also contains the difference between &quot;kinda sorta works&quot; and &quot;a pleasure to use&quot;. The latter has been my motivation for the past year. And common programmer's hubris. <em>Are you saying, I can't do that???</em></p>
<p>So let's take a look at the 90% we already have, and then move on to guess at what the remaining 10% could look like.</p>
<h2>The 90%</h2>
<p>After reading this, you'll have a pretty good idea what each feature is necessary for.</p>
<p>I'll mark features relevant to the mobile use case with an <strong>M</strong>, and those relevant for input methods with an <strong>I</strong>.</p>
<ul>
<li><strong>IM</strong>: Send text</li>
<li><strong>IM</strong>: Select a language</li>
<li><strong>IM</strong>: Announce text purpose</li>
<li><strong>IM</strong>: Display pre-edit</li>
<li><strong>IM</strong>: Less ambiguity in corner cases</li>
<li><strong>IM</strong>: Compatibility</li>
<li><strong>M</strong>: Display a panel</li>
<li><strong>M</strong>: Cursor navigation</li>
<li><strong>I</strong>: Send key events</li>
<li><strong>I</strong>: Grab keyboard</li>
<li><strong>I</strong>: Style the pre-edit</li>
<li><strong>I</strong>: Display a popup</li>
</ul>
<h3>Common features</h3>
<p><strong>Purpose and language selection</strong> inform the input method what's expected inside the text field. This is such a generic thing, that honestly, it could even be useful with a keyboard and nothing else. Automatic layout switching between windows is already a thing, with the input method protocols it could be per text input field, and determined by the application.</p>
<p><strong>Pre-edit</strong> is a piece of text that you just typed, but which may change and is not yet considered final.</p>
<p>Chinese, Japanese, Korean (CJK) input methods use them to help composing the characters.</p>
<p>On my Maemo, it's used to display the most likely autocompletion.</p>
<p><img src="maemo.png" alt="Pre-edit text displayed on Maemo" /></p>
<h3>Features mostly useful for CJK input method</h3>
<p>CJK IMEs often show a <strong>popup window</strong> next to the typed text, with several candidate outcomes. This is called a candidate window, which needs to be allowed in the protocol.</p>
<p><img src="candidate.png" alt="Candidate window example" /></p>
<p>When the IME is paired with a keyboard, it needs to prevent the application from getting some key events. It wouldn't be helpful if, when you typed &quot;[kou](https://en.wikipedia.org/wiki/W%C4%81puro_r%C5%8Dmaji&quot;, the text editor received &quot;kou&quot;. You want to stop those events, and give them to the IME, which can turn them into &quot;こう&quot;. This is what they <strong>keyboard grab</strong> is for. At the same time, you still want to forward events like &quot;Ctrl+C&quot; to the application, so that's why we have <strong>sending key events</strong>.</p>
<h3>Features mostly useful for mobile text input</h3>
<p>No matter if you're using a QWERTY input method or <a href="https://en.wikipedia.org/wiki/MessagEase">MessagEase</a>, on your mobile, you need to display a special <strong>panel</strong> to be able to press the buttons. (If you thought about T9, you can feel smug now. You still need an input method, though.)</p>
<p><img src="stevia.png" alt="On-screen keyboard panel of Stevia" /></p>
<p>Hitting an exact position in the text with your finger that's 3x the size of a letter on the screen is still hard, though, and application toolkits are failing to make that useable, so <strong>cursor navigation</strong>, which makes it possible for an input method author to come up with something better, remains relevant.</p>
<h3>Complication</h3>
<p>That would be about it. Except those features are a bit spread across different protocols, some of which aren't even sanctioned <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols">officially</a>. So you can't have both popups and a panel on your system at the same time.</p>
<p>And what about the other 10%?</p>
<h2>The other 90%</h2>
<p>If the results of my work as they are today were adopted and displaced all the existing protocols, we'd get the following improvements:</p>
<h3>CJK IME</h3>
<p>Control over the <strong>position of the popup</strong> window. Maybe you are typing right-to-left. Maybe up-to-down. I don't know, but the experimental popup uses the same positioner mechanism as the popups you see after right clicking. In this protocol, you can even have <strong>multiple popups</strong>!</p>
<p>Actually not an improvement because a dedicated <strong>panel</strong> window is missing from experimental. That's an intentional oversight coming from lack of manpower. Sorry. Layer-shell works OK for me.</p>
<p>Another missing feature is <strong>pre-edit styling</strong>. That one is missing because no user ever came to me to have a chat how they use it and why it is important. And less features is less complexity = more better. Lack of demand in action.</p>
<p>Then, there's no <strong>language information</strong> sent from the application to the IME. That's an oversight that should be fixed, although in such a way that doesn't favor the <a href="https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes">officially recognized languages</a>, but instead lets the speakers define the language they use, no matter how <a href="https://www.eurotrad.com/en/magazine/what-unknown-languages/">obscure</a> or <a href="https://en.wikipedia.org/wiki/Pidgin#List_of_notable_pidgins">disorganized</a>, without having to beg an authority for recognition.</p>
<h4>No more keyboard grab. No more generating keyboard events</h4>
<p>This is another missing feature, but it's actually an improvement!</p>
<p>You see, behind keyboard event handling hides a realm where things – mostly key maps – wait for any misstep you made to turn your day into a hell of edge cases.</p>
<p>It's a source of real life bugs that had been reported to me unofficially, as &quot;the Wayland input method authority&quot;.</p>
<p>Consider the happy path: the user presses Ctrl+B. The IME forwads the key press, to the application, and the application makes text bold.</p>
<p>Let's make things more complicated. The user presses Ctrl+Б (same key as Ctrl+B, just on a Russian Phonetic keyboard). The input method forwards the Б key sym, the application is confused and can't figure it out because it actually listens to key code 56.</p>
<p>Maybe we should send key codes then? Okay: the user presses Ctrl+Б (Russian Phonetic). The input method engine forwards the key code for Ctrl+Б, but - oops - in the regular Russian key map, which is key code 59. Not 56. The application is confused.</p>
<p>Those examples represent bugs. The one app could translate the keysym back to keycode (but what if there are 2 valid solutions?). The other input method engine could use the correct key map. But why open the opportunities for all those bugs where all you really want is forward the exact events the user already pressed?</p>
<p>So the experimental input method protocol provides a filtering mechanism instead. The IME can choose to intercept button presses that help typing your Kanji. Those will then not reach the application, and things work as expected.</p>
<p>As a bonus, this makes it easier to support improvements to the keyboard protocol like server-side <strong>key repeat</strong>, which solves <a href="https://www.csslayer.info/wordpress/linux/key-repetition-and-key-event-handling-issue-with-wayland-input-method-protocols/">more problems</a>.</p>
<h4>No more generating keyboard events, again</h4>
<p>I hear you scream: &quot;foul! Stop! Generating keys is still useful! Think of automating the desktop! And mobile keyboards need to send the »enter« key when submitting text!&quot; To this, I reply: don't worry, there's a better way out.</p>
<p>I wasn't actually done complaining about keyboard events handling. In this Pandora's box, you can't know the consequences of your key presses in advance. There is no standard mandating that pressing Ctrl+X results in removing text. There isn't even a standard saying what to do when you press the humble &quot;a&quot;! Applications could do anything they want based on the current key map, locale, and the phase of the Moon, if their authors so please.</p>
<p>Trying to predict any of that in an input method engine is folly. You'd need to know everything about every application. Or cover 90% and hope for the best, but we're trying to be better than that.</p>
<p>So for the mobile use case, there are <strong>actions</strong>. Currently, there's only one: finish editing. Works like pressing &quot;Enter&quot; would: submit the field, or go to the next one.</p>
<p>There are more ideas outside of experimental, depending on the use case: adding more generic actions, defining a shortcuts protocol, and fixing virtual-keyboard. Keep reading, they are described later in this post.</p>
<h3>Across the board</h3>
<p><strong>Consolidation</strong>. While you still kinda have to choose between popups and panel, the experimental protocols are designed to be exactly compatible with text-input-v3 and its future developments, where the (precious little) social momentum seems to be.</p>
<p><strong>Atomic updates</strong> group requests into units, applied all at once. Going from &quot;blog pose|&quot; to &quot;blog post|&quot; takes two logical steps: remove the last letter, add &quot;t&quot;. If the IME submits them as two separate steps, the application doesn't know the intended output. As it receives the first step, it will draw and display &quot;blog pos|&quot;. On receiving the second step, it discards that, and draws &quot;blog post|&quot;. The first step ends up a waste of time and battery life. In extreme cases, it could lead to flickering as, for example, the candidates popup changes position to follow the cursor.</p>
<p>With atomic updates, the application can safely wait until the end of the command to start doing its work.</p>
<hr />
<p>The protocols so far make a dangerous assumption: that requests get received immediately. What makes it dangerous is byte indexing.</p>
<p>Whenever there is a need to change cursor position, the new position is specified as a byte offset. This works well if the part of the system choosing the offset knows exactly and without delay what text is being indexed.</p>
<p>But Wayland is a distributed system. After the IME receives the contents of the text field, and before it chooses a byte offset, the text can change. With a bit of bad luck, the new context may fall inside a UTF-8 code point. The result could be trying to delete half a character like &quot;я&quot;. Complete nonsense.</p>
<p><img src="diagram.svg" alt="A diagram illustrating how deleting 1 byte can turn from correct to wrong." /></p>
<p>Existing protocols don't say what to do, but the experimental ones <strong>remove such ambiguities</strong>.</p>
<h4>Compatibility</h4>
<p>This is a bigger problem, so it deserves its own section again.</p>
<p>With the protocols evolving, the need to <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/issues/306">stay <strong>compatible</strong></a> between the IME and the application arises.</p>
<p>In this model of input method, the compositor acts as a little more than a forwarder of events between the application and the input method. All is fine if they use matching protocol versions – ones that were literally made for each other, like text-input-v3.1 and input-method-v2.1. But what if they don't? Wayland allows clients to negotiate any protocol version they like, up to the highest one supported by the compositor, so there's no guarantee they match.</p>
<p>Imagine that (1.) the application lands on a higher protocol version and the IME on a lower one. In the best case, the user won't be able to take advantage of the app's features. Worst case, the protocol semantics differ slightly (which Wayland allows) – for example strings are in UTF-8 in version .4, but version .5 switches to UTF-16 – and both sides can't communicate, the user can't type and throws their useless laptop at the developer.</p>
<p>What about the opposite situation? 2. The application uses a lower protocol version, and the IME uses a higher one, that the application never heard of. Even if the above doesn't happen, the user may try to use some feature in the IME that the app just doesn't understand. Not a good look – it would be good to at least hide those. But that needs another communication channel.</p>
<p>So that's my solution: to have the compositor – which knows who uses which protocol version – communicate a compatibility level to the IME. In addition, when the protocol contains optional features (like different <em>actions</em>), it directly communicates what it can handle, so the IME can not present the unsupported options to the user.</p>
<p>But this is not a perfect solution. It only addresses the case where the application is less capable or older than the input method. After a little brainstorming with other Wayland developers about two-directional capability negotiation, it was suggested that it's too much trouble to implement: it would mean that every application library needs to implement an unusual, complicated version negotiation mechanism.</p>
<p>Instead, we cut the Gordian knot by requiring that the IME is always up to date.</p>
<p>This is a trade-off, and like all trade-offs, it comes with downsides. The users can't keep using unsupported IMEs any longer. Also, when shipping a distribution, all shipped IMEs must be updated or removed before any compositor starts supporting newer protocol versions.</p>
<p><strong>Addendum:</strong> key forwarding.</p>
<p>Forwarding keys through the IME actually shares many of the above considerations, regardless if we're talking about event filtering or grabbing all and re-generating some key events. The IME needs to understand the events to make a forwarding decision, and the application needs to understand what the IME sends, so they must agree on a common set of functionality.</p>
<p>Thankfully, keyboards are a lot better understood than input methods. The protocols change a lot less, and it's clear what the base functionality is. Still, the protocols sometimes change, like the addition of server-side key repeat.</p>
<p>With grab and re-send, the common features were defined in the input method protocol directly. That's simple, but you need to redefine everything keyboard-related.</p>
<p>The <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/tree/main/experimental/xx-keyboard-filter?ref_type=heads">filtering protocol</a> lets the IME and the application use the keyboard protocol. The only alteration is that key events may be withheld. That should cause only few situations where sides can't just communicate. As a downside, the compositor must figure out how to make it work when they can't.</p>
<p>While making the protocol, I was aware of this tradeoff, so this protocol is separate from input method and could be rejected if the consensus is that the tradeoff sucks.</p>
<h2>Input method version 3.2</h2>
<p>What do we gain if we don't go experimental yet, but do take in the v3.2 improvements?</p>
<p>With it, the application takes part in setting the <strong>visibility of the on-screen keyboard panel</strong>, if there is one. I think it's something Android applications like to do.</p>
<p>There's a small improvement in <strong>synchronizing</strong> the displayed windows that fixes edge cases.</p>
<p>It allows the IME to take over <strong>displaying pre-edit</strong> for applications that really don't want to do it.</p>
<p>Also, it gives the application more say in what <strong>kind of text</strong> the input method engine should provide it. While it's useful to send information like numbers-only, I expect <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/282#note_3246106">this attempt</a> to be primarily used in a way that <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/136#note_2882916">makes things worse for the user</a>. But hey, we don't have any input method user testing, so there's literally zero empirical information to help make the right call.</p>
<p>It also completely neglects the IME counterpart protocol, where things like popups live.</p>
<h2>Future</h2>
<p>Of course, there's more. There <em>are</em> a few users who made their wishes known.</p>
<p>What to do when the <strong>text field loses focus</strong> while pre-edit is ongoing? <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/issues/40#note_863260">Korean writers expect</a> the pre-edit to be turned into permanent text. Chinese writers don't.</p>
<p>Some Chinese writers even suggested that the IME could come back to the same state when focusing the same field again.</p>
<p>That leads to wider persistence considerations. If I always chat with Дима in Russian, I'd like to always have my preferred Russian layout and the Russian language dictionary activated. The application could supply <strong>stable text input field identifiers</strong> for the IME, so the IME could save my settings permanently.</p>
<p>The keyboard Pandora's box could be closed back again with more work around <strong>shortcuts</strong> and <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/478">actions</a>. Think copy, paste, undo. There's a bunch of actions already defined in XKB in the form of keysyms (XKB_KEY_Undo, XKB_KEY_Redo, XKB_KEY_Menu, and so on), but it would be nice to have a protocol that's separate from keyboards entirely so that we don't try to extend it back into the problem we're just trying to avoid.</p>
<p>Further along that way, using <a href="https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.portal.GlobalShortcuts.html">global shortcuts</a> or <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/216">something similar</a>, we could eventually build a system that allows <strong>generating arbitrary actions</strong>, while also preventing <a href="https://gitlab.gnome.org/GNOME/gtk/-/issues/4337">shortcut clashes</a> (because all shortcuts would be registered centrally).</p>
<p>I saw <strong>type-to-search</strong> functionality being requested: where the application doesn't focus a text field, but the field will get focussed as soon as something is typed in. Thinking about it deeper, text-input-v3.2's ability for the application to hide the panel seems like it could solve it: for the purpose of the IME, the field is focussed, but no panel is shown.</p>
<h2>Guesses</h2>
<p>Things that no one asked for, but I think input method is the right place for them, are:</p>
<ul>
<li><strong>speech to text</strong>. I think all is in place for that already.</li>
<li><strong>handwriting recognition</strong>. On a train, I saw a lady write on a tablet with a stylus. The text turned into typed text automatically. I think we'd need to define another popup type, which covers the whole text field, and which the IME can make transparent to receive stylus events. This definitely needs more thought.</li>
<li><strong>gamepad input</strong>, and other unusual input devices. I want my computer to be a proper game console! So does Valve, and <a href="https://github.com/ValveSoftware/gamescope">they already do it</a>. But Valve uses their own input method protocol. Hey Valve, why not join the standards track?</li>
</ul>
<h2>Non-goals</h2>
<p>The broad umbrella of user input is broad. Some things look like they share concern with input methods, but the commonality might be not so big.</p>
<p>One important out-of-scope case is <strong>keyboard emulation</strong>. If you look at an on-screen-keyboard from an input method and one used to emulate an array of switches, you may not immediately see the difference. Looking deeper, one may have buttons like &quot;copy&quot;, the other will have &quot;Alt&quot; and &quot;Control&quot;.</p>
<p>Apart from needing a panel, though, those are entirely different things. My input method effort completely ignores keyboard emulation. It would be just too much to handle.</p>
<p>But I can offer you a hint if that's what you need: fix the virtual-keyboard protocol. Rip out key map switching, and make it always follow the system key map. Switching key maps is a headache and the reason why the v1 of the protocol was not accepted in wayland-protocols. Go ahead and try to implement it and you'll see why.</p>
<p><strong>Typing in terminal windows</strong> with an IME is also out of scope. We're back to the problem where key presses have no standardized consequences. When you type &quot;:q!&quot;, will you get this text or will you lose all your data? Nobody knows, because there's no protocol to inform the input method whether it's in text edit mode.</p>
<p>If you wanted to correct this, you'd basically need to take all I did in Wayland and redesign it for VT100. Or whatever protocol terminals use these days.</p>
<p>Sorry, fellow hackers, I'm not motivated enough to do that. I code in <a href="https://kate-editor.org/">Kate</a>. Implement dumb keyboard emulation instead.</p>
<h2>Upstreaming</h2>
<p>This effort will live or die depending on adoption, and adoption survives on upstreaming.</p>
<p>All along, I kept in touch with KDE through their <a href="https://kde.org/goals/">input goal</a>. They aren't yet implementing the experiments in their on-screen keyboard or in Kwin, so I hope this writeup helps clarify the benefits.</p>
<p><a href="https://codeberg.org/dcz/stiwri">My own</a> experimental IME is, sadly, useless. It's just too slow for practical use. If you're familiar with integrating <a href="https://github.com/emilk/egui">egui</a> and you want to fix reinitializing the GPU on every redraw, you can be my friend.</p>
<p>When upstreaming things to toolkits, I had generally good experience. Thanks, <a href="https://github.com/rust-windowing/winit">winit</a> maintainers! Thanks, <a href="https://lap.dev/floem/">floem</a> folks! But I only managed to add some things missing from text-input-v3.1 before running out of steam, so experimental improvements haven't been tried yet.</p>
<p>I made the least progress on the compositor side, where even MRs bringing it up to date with base Wayland protocols have been languishing for months.</p>
<p>Perhaps the lukewarm response to my work means that it comes before its time. Maybe what we need is a breakaway success of Linux Mobile, which would push input methods onto the main stage. Maybe we need a large business <em>cough</em> like Valve <em>cough</em> to do their own thing while contributing to the upstream.</p>
<p>Or maybe we need to wait until X11 stops being supported and the previously working ways to input CJK text stop working, causing riots that are too hard to ignore by the developer community.</p>
<h2>What next?</h2>
<p>Myself, I'm taking a back seat now. I plan to push this forward only when I feel particularly inspired – after all, I have <a href="https://jazda.org/">other</a> <a href="https://nlnet.nl/project/MT818x_MT819x-firmware/">things</a> I'm burning to do now.</p>
<p>So, will input methods become a pleasure to use on Wayland? Now it's up to you – the community.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Free Software hasn&#39;t won</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/fosswon/"/>
       <id>tag:dorotac.eu,2025-10-10:posts/fosswon</id>
       <updated>2025-10-10T14:00Z</updated>
       <published>2025-10-10T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <p>This is a translated version of a talk I gave at <a href="https://piwo.sh">P.I.W.O</a> in June, with cleanups and adjustments for the blog form.</p>
<h1>Free Software hasn't won</h1>
<p>…that doesn't sound right. I made the slides in <a href="https://inkscape.org/">Inkscape</a>, on a computer running <a href="https://kde.org/">KDE</a> and Linux, I use <a href="https://www.firefox.com/">Firefox</a> regularly. But maybe that's just me. What about you, are you using Free Software? Hands up! [hands go up in the audience] Of course! What nonsense, &quot;Free Software hasn't won&quot;. Someone replaced my slides, hey conference staff!</p>
<p><strong>Staff:</strong> <em>The other folder.</em></p>
<p>[Browsing to a directory named &quot;other folder&quot;, opening file called &quot;your slides dimwit.pdf&quot;]</p>
<p>Now, those are finally my slides.</p>
<p>Hello audience, my name is Dorota, and I'm going to talk about how</p>
<h1>Open Source has won.</h1>
<p>And that's not recent, the news has been out in 2008, and has been regularly repeated since by reputable press: <a href="https://www.zdnet.com/article/linux-and-open-source-have-won-get-over-it/">ZDNET</a>, <a href="https://www.linuxjournal.com/content/open-source-winning-and-now-its-time-people-win-too">Linux Journal</a>, <a href="https://www.wired.com/2016/08/open-source-won-now/">Wired</a>, and so on.</p>
<p>Those press articles list a multitude of examples to prove it.</p>
<p><img src="logos.svg" alt="Logos of businesses and project mentioned in the articles above" /></p>
<p>Linux, Ruby, Red Hat, uh, GitHub? Does that mean I can download GitHub and run it on my own server? Microsoft? Come on, that's some kind of a joke. Those slides are manipulated! So what else do they contain? Oh, <a href="https://geoawesome.com/open-and-closed-source-who-wins-the-race/">this quote</a> is all right:</p>
<blockquote>
<p>Open source won. It’s not that an enemy has been vanquished or that proprietary software is dead, there’s not much regarding adopting open source to argue about anymore. After more than a decade of the low-cost, lean startup culture successfully developing on open source tools, it’s clearly a legitimate, mainstream option for technology tools and innovation.</p>
</blockquote>
<p>Oh, the name of the quoted person is wrong. Looks like an attack on my reputation! Anyways.</p>
<p>The point is, if we want to build something new, using Free Software is not a hindrance. And thats super important, because <a href="https://a16z.com/why-software-is-eating-the-world/">software is eating the world</a>. What does it mean? It means software keeps appearing in areas where there used to be no software before. That, in turn, means that we're slowly giving up control over more and more areas of life to those who made the software. After all, their software controls those areas of life from now on.</p>
<p>That's why it's great that there's always an alternative available, that we can select software that is free which grants control to us, and not just its manufacturer. So we have alternate <a href="https://fedoraproject.org/">operating systems</a> made with <a href="https://kernel.org/">Linux</a>. There are <a href="https://www.python.org/">very</a> <a href="https://rust-lang.org/">many</a> <a href="https://llvm.org/">programming</a> <a href="https://www.swi-prolog.org/">languages</a> <a href="https://isocpp.org/">to</a> <a href="https://www.scala-lang.org/">choose</a> <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript">from</a>. We can choose one of the <a href="https://zero-k.info/">open</a> <a href="http://openalchemist.free.fr/">games</a>, or <a href="https://krita.org/">graphics</a> or <a href="https://ardour.org/">audio creation</a> software without resorting to closed software.</p>
<p>Similarly, we don't need closed software to <a href="https://github.com/prusa3d/Prusa-Firmware-Buddy">print in 3D</a>, or to build a <a href="https://puri.sm/products/librem-5/">mobile computer</a> (also known as smartphone) or a <a href="https://github.com/InfiniTimeOrg/InfiniTime">smart watch</a>. There are graphics cards which run <a href="https://wiki.gentoo.org/wiki/Nouveau#Firmware">completely free of closed firmware</a> (Upon asking nouveau devs, they confirmed they wrote some firmware. Nvidia Kepler from 2012 is the last model where free firmware is allowed). There are such bicycles! Pretty much everyone owns one that rides without closed software. There are also sewing machines:</p>
<p><img src="sewing.png" alt="A fully mechanical sewing machine from the '90s" /></p>
<p>There are comms systems:</p>
<p><img src="doorbell.png" alt="A doorbell/intercom for 20 flats like one you might remember from the '90s" /></p>
<p>There are cars, and have been for a long time:</p>
<p><img src="beetle.jpg" alt="The classic half-century-old VW Beetle had no electronics inside" /> <img src="lada.jpg" alt="So did some Lada models which were not impressive when they came out in the 1980s" /></p>
<p>There are hard drives:</p>
<p>[the slides go blank]</p>
<p>There are wireless headphones, TVs... [slides remain blank] wait, something's wrong. There are phones! [Slides stay mockingly blank, noise of frantic clicking.]</p>
<p>[…]</p>
<p>[…]</p>
<p>[…]</p>
<p><img src="aster-72.jpg" alt="A pulse-dial Aster-72 rotary phone which belongs firmly in the 20th century" /></p>
<p>Crap. This isn't right.</p>
<p><img src="schemat_Aster_RWT.jpg" alt="A yellow card showing the schematic of the Aster phone" /></p>
<p>Oh, now I get it! The only kind of a phone that grants us openness is an analog phone. That reminds me of the time we were building the aforementioned Librem 5. There was a problem finding the modem for it. The reason is, one company controls the necessary patents necessary to connect to cellular networks. That company can and does impose arbitrary conditions on anyone using their integrated circuits. That made it very difficult to find modems that match our needs, and at the same time any reseller is willing to sell us. The resellers worried that by passing them on to us, they could break some distribution rule and not be able to get any modems in the future.</p>
<p>So that's a no. But I know what will be open for sure.</p>
<p><img src="GNU.png" alt="The stylized gnu logo of the GNU project" /></p>
<p>Richard Stallman started an important project called <a href="https://www.gnu.org/">GNU</a> in 1983. In one of his <a href="https://en.linuxadictos.com/stallman-y-la-impresora-el-origen-de-las-licencias-libres-de-software.html">interviews</a>, as he describes how he started the project, he mentions a certain device that his university bought, but which didn't work very well. He wanted to improve it, but no one wanted to share the device's sources with him. That was an offense! Why wouldn't anyone share the code?</p>
<p>What was that device?</p>
<p>That was a printer. Considering that the GNU project started in 1983 and that the story's from 1981, it works out to over 40 years of fighting for printer freedom. So let's reveal our open alternative.</p>
<p><img src="Colored_pencils_chevre.jpg" alt="Colored pencils" /></p>
<p>Oh come on. This cannot be. Have you ever used a printer? If you even manage to find a driver, if you even manage to connect to the printer, then it's still going to print single-sided black-and-white when you asked double-sided color. And despite having to put up with that, Free Software people still haven't gotten frustrated enough to solve the problem once and for all? Unbelievable.</p>
<p>I have a theory. People who say &quot;Open Source has won&quot; are only taking into account a small part of what software is out there. Take a look at this list: it's a map showing which kinds of software force you into running something closed (bold) and which have open options available (italics).</p>
<ul>
<li>Applications: <em>Blender, Firefox, KiCAD</em> – <strong>Twitter, YouTube</strong></li>
<li>Operating System: <em>GCC, Apache, OpenSSL</em></li>
<li>Kernel: <em>Linux, Zephyr, FreeRTOS</em></li>
<li>Firmware: <em><a href="https://coreboot.org/">Coreboot</a></em> – <strong>modem, GPU</strong></li>
<li>Appliances: <em>Prusa 3D, <a href="https://github.com/airgradienthq/arduino">Airgradient</a></em> – <strong>washing machine, TV</strong></li>
</ul>
<p>What picture does this paint? Things programmers care about directly, like the OS and the kernel, are quite well covered. Whatever we need, there's an open version. Applications are also more or less fine. There's a Web browser, there's creativity software. The problem appears when you try to participate in social media. Sure, there are alternatives. But <a href="https://joinmastodon.org/">Mastodon</a>, or <a href="https://joinpeertube.org/">PeerTube</a> are separate networks from the closed ones, so they won't help much when trying to reach people who aren't yet using them.</p>
<p>Looking at the lower layers, like appliances or firmware, there seem to be options. But those options are limited to a couple niches, and with most things we buy, like a TV or a PC component – sorry, pal, there's simply no choice at all.</p>
<h2>All the firmwares in the average laptop</h2>
<p>How many processors are there in a typical laptop? By &quot;processor&quot; I mean something that needs its own software. For example, a GPU has its own processor that needs software, or a hard drive, or a keyboard. Here's a diagram of my personal estimate of what separate components need software:</p>
<p><img src="firmwares.png" alt="A picture of a laptop with highlighted components: camera, touch screen, touchpad, Embedded Controller, SSD, battery, HDD, RAM, WiFi+Bluetooth card, sound card, BIOS, Intel ME." /></p>
<p>I estimate there are 10 to 15 separate processors on a typical laptop. Just the graphics card may host five of them.</p>
<p>What does that mean for free software? Normally, all that's open – Linux, drivers, applications – all of this is confined to the main CPU. Now imagine you want to use this operating system through some human-friendly interface, like the touch screen or the keyboard. Those are running closed software, so if you want to enter any sort of data on your average laptop, it's game over: you can't make a move without dependence on closed software.</p>
<p>Same story with the graphics card. You won't display anything without closed software. What a fail. Okay, let's ditch keyboards and displays because this is a server. But that's a fail, too: to communicate over a network card, you still need software that it's running and that hasn't been opened. Suppose that we managed to somehow solve this problem. We hit the wall anyway when we try to store data: SSDs as well as HDDs are running their own closed software. I haven't heard of a single case of open software ever running on a storage device!</p>
<p>But that's not even the worst. The peak of lameness is <a href="https://web.archive.org/web/20170828150536/http://blog.ptsecurity.com/2017/08/disabling-intel-me.html">the processor inside the processor</a>. Have you already heard of Secure Boot? It's a piece of BIOS that is loaded onto the processor inside the main processor before the main operating system. Secure Boot allows the manufacturer choose which software the user can run. A similar system exists on Android phones to lock them to a particular system. Manufacturers of Android-based phones are not shy about restricting what the user can run on their devices.</p>
<h2>That runs against user's freedom!</h2>
<p>User freedom exists only when the <a href="https://www.gnu.org/philosophy/free-sw.en.html">Four Freedoms of Software</a> are upheld:</p>
<ul>
<li>0: freedom to run the program for any purpose,</li>
<li>1: to study and change it,</li>
<li>2: to share copies of it,</li>
<li>3: to improve it and share the improvements.</li>
</ul>
<p>…but those are just words. Who cares about that? This theory is only ever going to be relevant to us computer experts, right?</p>
<p>Except… could it be that you're the family's tech support expert? Does your uncle/mum/grandma come to you carrying their malfunctioning Android phone, hoping for you to make everything right again?</p>
<p>And have you ever disappointed them? Has it already happened that their phone was simply too old and unsupported to be useful any more? Have you already told someone they need to pay up to replace a phone that seems perfectly functional?</p>
<p>Sadly, that's what the Android manufacturer <a href="https://www.nextpit.com/how-tos/how-many-android-updates-manufactures-offer">support</a> <a href="https://forum.fairphone.com/t/what-kind-of-guarantee-is-there-regarding-updates/105832">timelines</a> say: typically after 4, exceptionally after 8 years, they will no longer release security updates. That makes devices too insecure to use, and turns them into e-waste.</p>
<p>What does Free Software have to do with it? I don't know, but my Lenovo laptop, 13 years since release of the hardware, the operating system is still receiving regular security updates. I suspect this has something to do with the lack of a boot loader lock and the openness of all the drivers. That's unlike Android. Even if there's no explicit lock, the drivers are so rarely open that the community rarely has the manpower to create a custom ROM for a given device.</p>
<h2>Rug pulls</h2>
<p>The couple hundred bucks that your aunt might need to pony up to get a new phone pale in comparison to how much you need to pay for some cloud-only devices. Some cloud-enabled gadgets don't let the user choose an alternative provider for services the device requires. What happens when the company <a href="https://arstechnica.com/gadgets/2024/12/startup-will-brick-800-emotional-support-robot-for-kids-without-refunds/">shuts down</a> the <a href="https://www.gamesradar.com/games/racing/210-days-after-nintendo-shut-down-the-3ds-and-wii-u-online-servers-the-last-connected-player-finally-signs-off-after-his-console-crashes-its-over/">online service</a>? Of course, the device <a href="https://www.pcmag.com/news/2300-magic-leap-1-headset-will-stop-working-after-2024">becomes an expensive brick</a>. Imagine someone setting your 2300 bucks on fire just like that.</p>
<p>That's still nothing compared to what some other people have to deal with. Imagine you're a farmer, and your harvest is on the field, ready to get cut and brought in. T here's a storm brewing, so you jump into the combine harvester, and start the work. Oh no! The machine broke down! Not to worry, you're a resourceful farmer and you have the necessary spare part. You install it and start the machine, except… it tells you: &quot;Unauthorized component. Please contact customer service&quot;. Now you're in real trouble because it could take <a href="https://pirg.org/media-center/report-tractor-right-to-repair-would-save-u-s-farmers-4-2-billion/">9 months</a> for the customer service to solve the problem. You can't harvest the food worth tens of thousands of dollars, you're that much in the red. Game over, your farm is bankrupt. But that's not the end of the world, is it?</p>
<p><img src="pacemaker.jpg" alt="Two x-ray pictures of a human chest, with some wires and a chunky artificial shape visible" /></p>
<p>This is what a pacemaker looks like. Why would I mention those in a talk about software freedom? You see, a pacemaker is a complex device which must examine and diagnose the patient continuously, in real time, in order to perform its function. Its task is to detect a dangerous condition and perform a medical procedure in response to it. It needs software to do this complicated task. But if the device isn't perfect at diagnosing, that's a big problem. I'm not a medical expert, but getting your heart shocked when it's not necessary sounds dangerous in its own right. When it runs closed software that does not grant us the freedom to modify it, we have to resort to begging the manufacturer to fix it. And when we get no freedom to study it, we can't even avoid the circumstances that make it misfire!</p>
<p>But don't take my word for it. I only know of this problem because of <a href="https://punkrocklawyer.com/">Karen Sandler</a>, whose <a href="https://sfconservancy.org/about/staff/">involvement with Free Software</a> is intertwined with this problem since the beginning.</p>
<p>The bottom line is, if we have people who have no other choice but to trust their own body to a <a href="https://spectrum.ieee.org/bionic-eye-obsolete">piece of closed software and a single manufacturer</a>, how could we possibly say that Open Source had won?</p>
<h2>Appliances and copyleft</h2>
<p>Are you responsible for building an appliance? I bet you're using Open Source software in it, aren't you? Then licenses like the MIT require you to include a notice about the authors of the source code together with the software you distribute. There's a whole <a href="https://daniel.haxx.se/blog/2016/10/03/screenshotted-curl-credits/">gallery</a> of those on the curl website, ranging from cars to food processors. Are you feeling proud for releasing a device with Free Software in it? Not so quick! Can the user of your device study and modify the software you gave them? Have you actually granted them the Four Freedoms?</p>
<p>Permissive licenses <a href="https://www.gnu.org/licenses/license-list.html#Expat">like the MIT license are Free Software</a>, so they let you do all that the Four Freedoms promise. But they also allow you to do another thing: to close the software again by never granting those freedoms regarding your own modifications. If that's what happened, then freedom for me, not for thee. You, the manufacturer, reaped the benefits, the user can't, sucks to be the user.</p>
<p>The responsibility to prevent this falls on us, computer experts. When we create software, we have the choice of license we want to release it under. And we should be using what's called &quot;copyleft&quot;: it's a term that applies to licenses which prevent code once released under that license from being closed again. The most widespread copyleft license is the <a href="https://www.gnu.org/licenses/quick-guide-gplv3.html">Gnu General Public License</a> (GPL), and I recommend that you all use that one.</p>
<h2>Licenses and more</h2>
<p>Licenses are not the only thing relevant for Free Software. There are other things to fight:</p>
<ul>
<li>patents, like in the case of cellular modems,</li>
<li>hardware locks, like Android's,</li>
<li>project management.</li>
</ul>
<p>As for the last point, recently Google gave <a href="https://arstechnica.com/gadgets/2025/03/google-makes-android-development-private-will-continue-open-source-releases/">an amazing example</a> by restricting access to sources in development to select manufacturers. Everyone else will not get continuous updates, but only once per major release. This illustrates how much influence over the practical usability of a project management decisions have. This is not a change in licensing, and it's also not a technical change, so it's not immediately visible under those lenses.</p>
<p>Instead, it's a consequence of who's in charge. In this case, it's not a community who controls the Android project, but a for-profit corporation. At the same time, it's regular people who are on the user side of the project. Is it any wonder that the goals of a corporation and those of regular people differ? Is it any wonder that the corporation is making changes that suit it even when they don't suit the community of users? When those are the conditions under which a project is developed, it can have deep consequences, even on an architectural level.</p>
<p>Take <a href="https://www.debian.org/">Debian</a> as a point of comparison. The first statement on the web page already says &quot;Debian is a Community of People!&quot;. The software is being developed and used by the same people. They won't make it harder to use. They provide a complete operating system, publish all the sources, and <a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1099130">purge anything that isn't open enough</a>. On the other hand, Android has long been <a href="https://arstechnica.com/gadgets/2018/07/googles-iron-grip-on-android-controlling-open-source-by-any-means-necessary/">replacing</a> open components by closed ones, making AOSP (the open part of android) all but <a href="https://www.esper.io/blog/aosp-missing-features-google-gms">unuseable on its own</a>.</p>
<h2>Why?</h2>
<p>I suspect this situation has something to do with how computers and appliances have been developed historically. Computers have their roots in academia. When they were sold, they were always advertised as blank slates, general purpose devices, as opportunities to do what you choose to do with them. Not so much for appliances. Those have always had a single purpose. Except they kept getting complicated, until they entered a level of complexity where they needed to incorporate computers in order to perform their function. But they kept being manufactured as appliances, with only a handful of people being expected or allowed to exercise control over them. Incorporating computers didn't change the culture around them.</p>
<p>This is just a guess and I don't know how correct it is. For example Apple was always a computer manufacturer, but they are making computers now as if those were appliances.</p>
<h2>What now?</h2>
<p>The responsibility is ours – computer nerds' – to make Free Software win. When we build a hardware device, we must publish the firmware sources. We must publish technical documentation – it often so happens that the device documentation needed to make open firmware is missing or incomplete (another war story from the Librem 5, camera sensors this time).</p>
<p>As users, or institutional customers, we should demand that the manufacturer provides open sources for any firmware they are shipping with their devices.</p>
<p>But there's one more way: political pressure. I expect this to be a more effective method than individual action. After all, EU managed to convince phone manufacturers to <a href="https://www.bbc.com/news/technology-58665809">standardize on USB-C ports for charging</a>, as well as to <a href="https://www.androidcentral.com/phones/eu-repair-push-for-defective-consumer-goods">extend the warranty period</a>. Perhaps they could also force computer manufacturers to not install boot loader locks. It would fit nicely into the <a href="https://eur-lex.europa.eu/EN/legal-content/summary/copyright-and-related-rights-in-the-information-society.html">Information Society Directive</a>. It says things like:</p>
<blockquote>
<p>Member States must provide legal protection against any person knowingly performing without authority any of the following acts:</p>
<ul>
<li>the removal or alteration of any electronic rights-management information;</li>
</ul>
</blockquote>
<p>…oh. So instead of jailing people who put locks on devices they no longer own, it enforces the jailing of those who remove them from their own devices. Great.</p>
<p>Dear European Comission, please always have someone with a clue in the room who can explain the consequences of your ideas in a way you can understand. You can do it, already did a couple times, like above. But work on consistency, okay? Pinky promise?</p>
<p>Here are some people with a clue: <a href="https://fsfe.org/">Free Software Foundation Europe</a> with their <a href="https://publiccode.eu">Public Money Public Code</a> open letter, the <a href="https://repair.eu/">Right to Repair</a> movement, as well as the <a href="https://european-pirateparty.eu/">European Pirate Party</a>.</p>
<p>I recommend anyone who cares to join forces with them. But if you don't want to engage politically, there are also financial way of support. And I don't even mean (although I do encourage) donating. I mean supporting Free Software friendly manufacturers! Buy the Librem 5 from <a href="https://puri.sm">Purism</a>, or a 3D printer from <a href="https://www.prusa3d.com/">Prusa</a>, or a smartwatch running <a href="https://www.espruino.com/Bangle.js2">Espruino</a>. You see, it's expensive to manufacture any sort of hardware. It doesn't help that the markets are already saturated with closed products. Even if open source, hackable products are superior, it will take people at large a long time to realize that this is a superpower. Free Software thrived in culture of repair and modification. But this culture has been suffocated in the wider society with closed, throwaway items, so few people recognize its benefits. That unsustainable crowding out makes another obstacle for Open Source friendly products in the current markets.</p>
<p>There's a noble exception here. What makes it even more unusual is that it comes from Google. It's Chromebooks. Google has a set of requirements that all Chromebook manufacturers must fulfill, and one of them is having a completely open BIOS, together with the Embedded Controller firmware. All Chromebooks I'm aware of run Coreboot. They still contain some closed software, notably the RAM startup software, which, I believe, is present in all laptops, but! ARM-base Chromebooks are able to run with a completely open BIOS apart from that. So if anyone wants to take care of this together with me, I have this <a href="https://nlnet.nl/project/MT818x_MT819x-firmware/">NLNet project</a> to make it as easy as possible to run regular, mainline Linux on them. So please, contact me, if you're that person.</p>
<h2>The world</h2>
<p>A short quiz: how many devices can you count around you which contain processors?</p>
<p>Some hints: TV, camera, toothbrush, oscilloscope, e-book reader, radio receiver, dishwasher, router, washing machine, vacuum cleaner, bathroom scales.</p>
<p>Now think wider. When I went to the supermarket, the vegetable section had a scales that printed labels with barcodes. They were equipped with touch screens. You bet there's a processor and a load of firmware in those. But shops are chock-full of processors in my part of the world. There are thousands of price labels in each of those stores, and they are all e-paper screens. I'm fairly sure you need software to drive those and receive wireless updates.</p>
<p>Keep going and you might realize that the software running in your car allows <a href="https://www.theguardian.com/technology/2016/sep/20/tesla-model-s-chinese-hack-remote-control-brakes">remote control</a>. Or <a href="https://www.theregister.com/2023/12/08/polish_trains_geofenced_allegation/">in your train</a>. That snafu wouldn't have occured if the railways had access to sources of the train software.</p>
<p>What about other business uses? Car diagnostic stations? Medical equipment? Accounting software?</p>
<p>Software is really eating the world, and it's closed software which is <em>everywhere</em> around us, without free options. What's the regular person's role in this? They give up control over entire areas of their lives to others, others who often can't be supervised or replaced.</p>
<p>You know, we messed up. There's no other way to put it. We even let closed software sneak into our own home field: computers. Sure, the interfaces are open. There's SATA, there's PCI. We can swap parts if we want to, we can run Linux there, all is fine. Except it's not, because peripherals are as important as the core, and we, software people, lost control of the peripherals of our darlings already.</p>
<h2>Wasted potential</h2>
<p>In theory, it's possible that someone opens a piece of software regardless of the wishes of the original authors. The whole game modding scene is about that. Here's an example of someone running Tetris on a pocket camera:</p>
<p><img src="ZX3%20tetris.mp4" alt="ZX3 running Tetris with hacked firmware" /></p>
<p>But going against the manufacturer is just wasted work. Imagine the difference between hacking it in and modifying the official sources. The potential, things we could achieve if we didn't have to break doors that are open! So here's a silly example: I have an action camera. Due to some <a href="https://www.dpreview.com/articles/0794343949/wto-looking-at-moves-to-remove-30-minute-limit-from-digital-cameras">stupid law</a>, the camera breaks off every recording as soon as it reaches the 30 minutes mark. Now I have 20 years of coding experience. Having source code, I could have fixed the problem and went on with my life. Another example, another camera: I am making a time lapse from my window. Every day at 10:00, I take a picture from a camera that just sits there. But this camera has no time lapse feature, so I must go there in person every time. Why can't I fix this? Of course, no source code.</p>
<h2>Epilogue</h2>
<p>There's now a <a href="https://www.crowdsupply.com/open-tools/open-printer">new printer</a> project that advertises itself as open source. But if you look at the details, it's actually not. Instead, it uses a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">source-available license</a> which does not grant you Freedom 0 – you must not use the sources for commercial purposes. Better than nothing, I guess.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Notes on DMABUF and video</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/DMABUF/"/>
       <id>tag:dorotac.eu,2025-07-12:posts/DMABUF</id>
       <updated>2025-07-12T14:00Z</updated>
       <published>2025-07-12T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Notes on DMABUF and video</h1>
<p>While working with the Linux video subsystem, especially while working on the <a href="https://source.puri.sm/Librem5/millipixels">Librem 5</a> even before <a href="https://libobscura.codeberg.page/">libobscura</a>, the term &quot;DMABUF&quot; appeared a lot. Yes, it's a buffer. No, you can't always use it.</p>
<p>It's always been some kind of a magically limited but also powerful thing that you must master if you want fast video, but you're likely to hit a corner case if you're not careful.</p>
<p>And the knowledge is spread all over.</p>
<p>So this is my latest understanding of what it is and how to use it. I intend to update this in follow-up posts, so please correct me if you see something wrong.</p>
<h2>Background</h2>
<p>Video devices operate on pictures. Those pictures are large amounts of data, and they need to travel somewhere to be useful: your screen and your eyes, or your hard drive, or the internet. Typically, your screen, though.</p>
<p>Moving around big pictures at 30 frames per second means moving around a lot of data across different components: all the way from the camera device to the GPU and the display. Moving data from one place to another is copying, and every copy along the way takes up some computing resources. So in order to have a smooth experience, unneeded copies should be eliminated.</p>
<h2>DMABUF</h2>
<p>One mechanism to avoid copies is called DMABUF: DMA (direct memory access) buffers. Those buffers can be used directly by devices like the camera controller, or the GPU. In the case of video coming from the camera, the idea is that once the camera data arrives from the hardware processing units into the main memory, the GPU can access it directly (with its DMA engine), without creating another &quot;GPU area&quot; and copying image data there from a &quot;camera area&quot;.</p>
<p>Unless it can't. DMABUF buffers are always associated with a device. This allows for the buffer to live in a special-purpose area of memory, like the GPU VRAM, where other devices might not be able to access it.</p>
<p>Location is one reason why DMABUF buffers are not exchangeable, but different devices can do different things. If your camera's DMA engine can access only the first 4GiB of memory, the buffer you allocated for your GPU might be out of reach. Or take the shape of the data: if your camera produces buffers with rows being a multiple of 4 bytes, but your GPU can read rows being multiples of 32 bytes, then the GPU may misinterpret the images.</p>
<h2>DMABUF life cycle</h2>
<p>I'm working on a camera application, so this is the case I focus on: buffers get filled, I read them out later.</p>
<p>At a high level, every buffer needs to be allocated, then fed to the camera capture device, then received once it's filled and read out - by the GPU or otherwise. Then, finally, the buffer is de-allocated.</p>
<p>In V4L2, this is achieved by the following sequence of <a href="https://docs.kernel.org/userspace-api/media/v4l/vidioc-reqbufs.html">IOCTL</a>s:</p>
<ol>
<li><code>VIDIOC_REQBUFS(N, MMAP)</code> allocates multiple (N) DMABUF buffers, without giving any access to the user yet</li>
<li><code>fd = VIDIOC_EXPBUF</code> returns a handle in the form of a file descriptor (fd)</li>
<li><code>VIDIOC_REQBUFS(N, DMABUF)</code> - not sure why this is needed</li>
<li><code>VIDIOC_QBUF(fd, DMABUF)</code> passes the buffer to the device for writing</li>
<li><code>VIDIOC_DQBUF</code> waits until a previous buffer is ready and returns it</li>
<li><code>VIDIOC_REQBUFS(0, DMABUF)</code> not sure what it does</li>
</ol>
<p>TODO: which call assigns a size to the buffers? They are magically the right size... and what if the format changes?
TODO: Which call releases the memory?</p>
<p>In kernel documentation, there are some mentions of &quot;<a href="https://docs.kernel.org/userspace-api/media/v4l/dmabuf.html">importing</a>&quot; a buffer into the device. I find this unhelpful for understanding how this works. The &quot;import&quot; operation is not the opposite of the &quot;<a href="https://docs.kernel.org/userspace-api/media/v4l/vidioc-expbuf.html#vidioc-expbuf">export</a>&quot; operation, which was very confusing when I was just trying to figure it all out. It just seems to be the act of enqueueing. It would mesh with my mind better if all mentions of &quot;import&quot; were unwrapped and instead the focus was on what the operation actually does.</p>
<h2>Reading the data</h2>
<p>The handles to the DMA buffers are file descriptors. To read the data out, mmap them. In Rust, it's as simple as:</p>
<pre><code>let outf = unsafe { File::from_raw_fd(fd) };
let outfmap = unsafe { memmap2::Mmap::map(&amp;outf) }?;

println!(&quot;  length    : {}&quot;, outfmap.len());

// Prevent File from dropping and closing the fd implicitly.
let _ = File::into_raw_fd(outf);
</code></pre>
<p>Or is it?</p>
<p>The buffers may reside in some other part of memory, like maybe on the GPU.</p>
<p>And can you expose a GPU buffer directly to the CPU? Maaaybe – there are unified memory architectures out there, like some systems with integrated graphics. But unless you know exactly what hardware you're running on, you may find out that the buffer cannot be mapped.</p>
<p>Even if it can be mapped, I'm guessing that access might be slow, for example if using a bounce buffer (can anyone fact check me on this?). That kind of defeats the speed point.</p>
<p>But that's not all the problems coming from different memory placement yet. You have to pay attention to caching. See, the buffer receives data by DMA, meaning that the memory is updated behind the back of the CPU. Which is exactly the point: the CPU should not be busy watching the transfer. BUT! If the CPU is caching some data from the buffer, any update will make the cache out of date and the CPU will never know.</p>
<p>So you have to make sure the CPU throws away or updates any old cached memories of the updated area.</p>
<p>There's an IOCTL for that: <a href="https://docs.kernel.org/driver-api/dma-buf.html#cpu-access-to-dma-buffer-objects"><code>DMA_BUF_IOCTL_SYNC</code></a>. I won't get into details, unless someone asks me to explain the docs.</p>
<h2>Complete reading example</h2>
<p>For your convenience and my own, I took an existing <a href="https://github.com/mripard/dma-buf/">dma-buf</a> crate and adjusted it to be a little nicer to use in libobscura. Here's how you map a buffer without fretting about safety in <a href="https://codeberg.org/libobscura/libobscura/src/branch/master/crates/dma-boom">dma-boom</a>:</p>
<pre><code>use dma_boom::DmaBuf;
use dma_boom::test;

// It's up to you to find a working buffer.
let buf: &amp;DmaBuf = test::get_dma_buf();

{
    // Request sync and create an access guard.
    // Multiple read-only accesses can co-exist
    let mmap = buf.memory_map_ro().unwrap();
    // The actual slice
    let data = mmap.as_slice();
    if data.len() &gt;= 4 {
        println!(&quot;Data buffer: {:?}...&quot;, &amp;data[..4]);
    }
} // `mmap` goes out of scope and unmaps the buffer
</code></pre>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>libobscura</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/libobscura/"/>
       <id>tag:dorotac.eu,2024-11-05:posts/libobscura</id>
       <updated>2024-11-05T14:00Z</updated>
       <published>2024-11-05T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Announcing libobscura</h1>
<p>Let me formally introduce you to my most recent overly-ambitious experiment: <em><a href="https://libobscura.codeberg.page">libobscura</a></em>.</p>
<p><img src="libobscura.svg" alt="libobscura" /></p>
<p><strong>Libobscura is a friendly library to use cameras on Linux.</strong></p>
<p>At least that's the goal.</p>
<h2>What does &quot;friendly&quot; mean?</h2>
<ul>
<li><em>It's hard to use it wrong.</em> No segfaults. Errors guide you to the right track.</li>
<li><em>Point-and-shoot.</em> If that's all you need, you get a RGB buffer in ten lines of code.</li>
<li><em>It's easy to add support for new devices.</em> Great documentation and a good internal API are the goals.</li>
<li><em>It's easy to contribute to.</em> Send patches using the <a href="https://codeberg.org/libobscura/libobscura/pulls">web interface</a>, not a mailing list.</li>
</ul>
<p>TL;DR: with the simple buffer API you can get a frame in 6 calls, and map it to CPU in another 2:</p>
<pre><code>let cameras_list = vidi::actors::camera_list::spawn()?;
let cameras = cameras_list.cameras();
let camera = cameras_list.create(&amp;cameras[0].info.id)
    .expect(&quot;No such camera&quot;)
    .expect(&quot;Failed to create camera&quot;);

let mut camera = camera.acquire();
if let Ok(ref mut camera) = camera {
    let mut stream = camera.start(
        Config{fourcc: FourCC::new(b&quot;YUYV&quot;)},
        4
    ).unwrap();
    loop {
        let (buf, meta, _next) = stream.next().unwrap();
        let mmap = buf.memory_map_ro().unwrap();
        let data = mmap.as_slice();
    }
}
</code></pre>
<p>Go to project <a href="https://codeberg.org/libobscura/libobscura/src/branch/master/crates/vidi-examples">examples</a> for more.</p>
<p><img src="baby.png" alt="A baby placing a missing block. They are stacked in the Bayer pattern." /></p>
<p>Figure 1: Libobscura will never be friendly enough for every audience.</p>
<h2>What cameras?</h2>
<p>Any webcams, industrial cameras, image sensors working exposed by the V4L2 interface are currently in scope.</p>
<h2>What does &quot;experiment&quot; mean?</h2>
<p>There are already other libraries for camera support on Linux. You can use the <a href="https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/v4l2.html">V4L2 APIs</a> directly, or use <a href="https://libcamera.org/">libcamera</a>, or <a href="https://gitlab.com/megapixels-org/libmegapixels">libmegapixels</a>.</p>
<p>They all strike various middle points on the power vs user-friendliness scale. Having worked with all of them while developing camera support for the <a href="https://puri.sm/products/librem-5/">Librem 5</a>, I never got the impression that any of them are particularly easy to use.</p>
<p>Libobscura is an experiment because it tries to find an API that fulfills the needs of most people and remains hard to use wrong.</p>
<p><img src="diagram.png" alt="A diagram showing two axes: vertically ease of use and horizontally covered use cases. A couple of projects are positioned below an area called &quot;unexplored&quot;, whichis itself under am area called &quot;unachievable&quot;. The borders are roughly moving to less easy to use as covered use cases grow." /></p>
<p>Figure 2: My &quot;Perfect tool&quot; conjecture. Imagine you have a perfectly well useable tool covering some of your needs. If your needs grow, the perfect tool covering those tools cannot be easier to use than your old tool. And humans are imperfect. This applies to APIs, as well. I put my rough idea of where I think a couple of examples fall. Libcamera tries to be as general as possible, so it's on the far right. OpenCV has Python bindings, so it's up top. Libobscura starts out a bit to the left and top, and I hope to push it to the right, to explore how many use cases can be added while staying easy.</p>
<p>Perhaps it's impossible to improve on what current libraries do. But now libobscura is a space where it's possible to try out radical changes without bothering the maintainers of existing libraries – for example, to try out an entirely new approach.</p>
<h2>Radical approaches</h2>
<p>There are a couple of radical approaches in libobscura that earn it the &quot;experiment&quot; status.</p>
<h3>Rust</h3>
<p>Rust is a memory-safe systems language. Libobscura uses it to ensure that the APIs are hard to use wrong. A Rust API built with a little care does not let you crash with a segfault – the compiler will alert you before you can create problems.</p>
<p>The Linux kernel started its <a href="https://rust-for-linux.com/">own Rust experiment</a> in 2020. Linux is a low-level project. Camera support is a low-level topic (many people working on libcamera also work on the kernel). If Rust is interesting for Linux, then it's interesting for libobscura.</p>
<h3>Get RGB data</h3>
<p>A camera library can't be described as easy to use if the pictures it gives you need to be processed before displaying.</p>
<p>The typical format to display data is RGB while cameras return either Bayer or YUV data. With libobscura, the goal is to provide those conversions transparently, without any extra effort on behalf of the user or camera backend.</p>
<h3>GPU acceleration</h3>
<p>The conversions might not be implemented in hardware, but don't worry, libobscura has your back! It comes with a GPU-accelerated image processing library called <a href="https://codeberg.org/libobscura/libobscura/src/branch/master/crates/crispy">crispy-img</a>.</p>
<p>Why GPU? The idea started with the Librem 5, which will often record video while running on battery. Offloading this work to the GPU is a clear win, and I expect most Linux video-capable devices to have some form of GPU. And we're open to a CPU decoder if you want to contribute!</p>
<p><img src="painter.png" alt="A painter looking at a canvas with 4 colored strokes: 2 colored, 2 gray, representing YUYV. The painter holds a brush at a canvas on the other side with 3 colored strokes: red, green, and blue." /></p>
<p>Figure 3: An artist's impression of a pixel format conversion shader.</p>
<h2>Status</h2>
<p>Because libobscura is only two months old as a funded project, the current status is &quot;proof of concept&quot;. There's a safe and limited user API, another more tricky but zero-copy API, the GPU accelerated library can handle some conversions, and camera support is relatively simple. USB camera demo works.</p>
<p>But there are still <a href="https://codeberg.org/libobscura/libobscura/issues">goals to achieve</a>:</p>
<ul>
<li><a href="https://codeberg.org/libobscura/libobscura/issues/1">change controls</a> like brightness or focus,</li>
<li><a href="https://codeberg.org/libobscura/libobscura/issues/2">transparently integrate</a> GPU processing,</li>
<li>have a way to choose your <a href="https://codeberg.org/libobscura/libobscura/issues/3">preferred output format</a>,</li>
<li>implement <a href="https://codeberg.org/libobscura/libobscura/issues/5">auto-* algorithms</a> when the camera doesn't have auto mode,</li>
<li>replace <a href="https://codeberg.org/libobscura/libobscura/issues/6">LGPL-licensed pieces</a> from libvidi and crispycam (not great for link-time-optimized code)</li>
<li>add <a href="https://pipewire.org/">Pipewire</a> <a href="https://codeberg.org/libobscura/libobscura/issues/7">integration</a>, so your browser can just pick up your camera feed for teleconferencing,</li>
<li>add <a href="https://codeberg.org/libobscura/libobscura/issues/9">Librem 5 support</a> as a realistic, useful verification of concept.</li>
</ul>
<p>Those will be the next steps for the project. First implement all the functionality and then extend support for more devices. This way we can catch corner cases in the API that are bound to appear with unusual setups.</p>
<h2>The future is YOU</h2>
<p>I'm making libobscura in the hope that it will be useful to people. As a base to build software, or to ship devices, or to learn what software architectures fit this problem.</p>
<p>When I sent my funding request to Prototype Fund, I didn't expect to be taken seriously. After all, what motivated me most was that it was a cool challenge. Apparently they believed in me more than I did, because I got funding until March 2025.</p>
<p>What happens until then and what happens next depends on how useful this work actually is to YOU. The ultimate goal of any software is to be useful, otherwise what's the point?</p>
<p>So I invite YOU to analyze it, try it out, give feedback, experiment with it, copy the concepts, <a href="https://libobscura.codeberg.page/community.html#contributing">contribute</a> to docs, illustrations, and code, fork it entirely! Regardless if you come from the <a href="https://postmarketos.org/">Mobile</a> <a href="https://mobian.org/">Linux</a> <a href="https://manjaro.org/">community</a>, or <a href="https://www.raspberrypi.com/documentation/accessories/camera.html">Raspberry Pi</a>, or you have a laptop with this <a href="https://github.com/linux-surface/linux-surface/wiki/Camera-Support">IPU thing</a>, or if you're from libcamera. When you improve, libobscura fulfills its goals.</p>
<p>So see you on the <a href="https://codeberg.org/libobscura/libobscura">project page</a>!</p>
<h2>Thanks</h2>
<p>Thank you to <a href="https://puri.sm">Purism</a>, for inviting me to the Linux camera work and funding the base which I'm using now.</p>
<p>Thank you to all the libcamera people! I stol... was inspired by libcamera's architecture, and you also answered my countless questions both during Librem 5 development and recently.</p>
<p>Thank you <a href="https://blog.brixit.nl/">Martijn Braam</a> for showing how to create simple config files for many mobile phones.</p>
<p>Thank you <a href="https://www.bmbf.de">BMBF</a> for providing the funding money, and thanks <a href="https://prototypefund.de/">Prototype Fund</a> for connecting me to it :)</p>
<p><img src="bmbf.jpg" alt="Logo: sponsored by the Federal Ministry of Education and Research" /></p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>IPv6 in your home</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/ipv6_homenet/"/>
       <id>tag:dorotac.eu,2024-11-05:posts/ipv6_homenet</id>
       <updated>2024-11-05T14:00Z</updated>
       <published>2024-11-05T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>IPv6 in your home</h1>
<p>IPv4, IPv6, who cares? As long as it works, right?</p>
<h2>The death of IPv4 is greatly exaggerated</h2>
<p>We ran out of IPv4 addresses, and so what? I can still watch Youtube videos, have calls over Zoom, buy things on Amazon. The Internet works due to the magic of NAT! Channeling Oprah, you get an address, you also get an address, and you too get the same address!</p>
<p>Except this all works because of lots of money and effort.</p>
<p>Effort because you can't just make a call to your friend over IPv4. You're both likely behind a <a href="https://en.wikipedia.org/wiki/Network_address_translation?useskin=vector#One-to-many_NAT">one-to-many NAT</a>, meaning you can't receive connections. So who will initiate one? <a href="https://de.wikipedia.org/wiki/Session_Traversal_Utilities_for_NAT">STUN</a> offers some painful, costly, partial, and annoying solutions involving having a dedicated address anyway.</p>
<p>An IPv4 address costs money, too. A single IPv4 address costs <a href="https://auctions.ipv4.global/prior-sales">26 USD in bulk</a> as of 2024-11-05, and I'm paying for a 3EUR per month for a cheap VPS just for the sake of having one address for myself.</p>
<p>Why bother? Because if you want a little place for yourself on the internet to serve stuff from, you need an address. Manage your own game server? Expose a quirky service? Have an actual, direct call? Seed torrents of your favorite classic art? Connect to your home network while away? I want to do all those things, so I bought myself an address.</p>
<p>Of course, taking an address from the limited pool makes me part of the problem.</p>
<p>But I want to connect to the plant watering system that I left at home! But I want to download stuff from the off-site backup that's at my friend's place, behind NAT! Yes, I could buy a server in a data center and install Wireguard there. I did it, it sucks. High pings, slow transfers. Help! Let me have direct connections.</p>
<h2>IPv6</h2>
<p>IPv6 comes to the rescue! The address pool is about 420^π bazillions addresses, so every address is really cheap. If you <a href="https://old.reddit.com/r/ipv6/comments/16dahsc/how_do_i_as_a_private_person_apply_for_an_ipv6/">want to apply</a> for a subnet, a random <a href="https://voldeta.com/en/product/ipv6-pi-support-48/">500 EUR package</a> gives you 2¹²⁸⁻⁴⁸=2⁸⁰ addresses, which is 2417851639229258137600 addresses per EUR yearly.</p>
<p>Global IPv6 addresses are effectively free.</p>
<p>If you have a decent Internet provider, they will assign you a /60 subnet and let all your LAN devices grab a /64. Now anyone can connect to you, so enjoy! And put up your firewall.</p>
<h2>That /64 thing</h2>
<p>What's that about a /64 subnet? A subnet for every device?</p>
<p>Well, that's how IPv6's DHCP equivalent works. You can't normally get anything smaller than that through autoconfiguration. Which is not a problem if you have a decent Internet provider.</p>
<p>But if you've got only a half-decent provider, they might only offer you a /64. That typically happens on mobile connections, but not only. (Also, this can happen if you like subdividing networks like Russian dolls. IPv4 you could just chain NATs, IPv6 has no NAT.)</p>
<p>So what do you do?</p>
<h2>OpenWRT</h2>
<p>I don't believe I have to introduce <a href="https://openwrt.org/">OpenWRT</a> to any of my readers. Newcomers, this is <strong>the</strong> Open Source operating system for routers.</p>
<p>It can extend IPv6 connections to connected computers <em>even if there's only a /64 available</em>! And it has this nifty Web interface called LuCi.</p>
<p>While there are multiple guides for the command-line, there are no guides for configuring IPv6 forwarding for the Web interface. So here's mine.</p>
<h2>IPv6 relay mode in Luci</h2>
<p>First, set up an IPv4 WAN network (if you care). I'll call it <em>wwa</em>n. Remember to set up a LAN interface if you don't have one. Ready? Then set up a WAN network for IPv6. I'l call it <em>relay6</em>, set it to DHCP client and select &quot;Alias Interface: @wwan&quot; as the device.</p>
<p><img src="relay%20setup.png" alt="Adding new interface &quot;relay 6&quot;" /></p>
<p>Once you have it, navigate to &quot;DHCP server&quot; and set one up.</p>
<p><img src="dhcp%20setup.png" alt="Interfaces » relay6 → DHCP Server → button: Set up DHCP Server" /></p>
<p>Yes, I know, it's weird. We don't want to provide addresses on this interface. That's why the next step is checking the &quot;Ignore interface&quot; box.</p>
<p><img src="dhcp%20ignore.png" alt="Interfaces » relay6 → DHCP Server → General Setup → checkbox: Ignore interface" /></p>
<p>Once that's done, go to DHCP IPv6 settings.</p>
<p><img src="empty%20settings.png" alt="Interfaces » relay6 → DHCP Server → IPv6 Settings → unchecked &quot;Designated master&quot; checkbox and 3 drop-downs, each on &quot;disabled&quot;" /></p>
<p>Make this interface a designated master and change 3 dropdowns (RA-Service, DHCPv6-Service, NDP-Proxy) to &quot;relay mode&quot;.</p>
<p><img src="filled%20settings.png" alt="Interfaces » relay6 → DHCP Server → IPv6 Settings → checked &quot;Designated master&quot; checkbox and 3 drop-downs set to &quot;relay mode&quot;" /></p>
<p>We're done with the WAN interface, but the LAN needs to be adjusted. Here, it's just a static address on a bridge device.</p>
<p><img src="lan%20info.png" alt="Interfaces » LAN → General Settings → Protocol: Static address" /></p>
<p>Go to DHCP Server → IPv6 Settings and change all the dropdowns to &quot;relay mode&quot;.</p>
<p><img src="lan%20settings.png" alt="Interfaces » LAN → DHCP Server → IPv6 Settings → unchecked &quot;Designated master&quot; checkbox and 3 drop-downs set to &quot;relay mode&quot;" /></p>
<p>Save all the changes and apply them. For me, the router immediately received an address.</p>
<p>My laptop also got an address immediately, but I had to reconnect to get the default route populated (otherwise you can't connect to the Internet).</p>
<p>Check a device connected to that router, it should get the address, too.</p>
<h3>Default route</h3>
<p>One snag, though.</p>
<p>The default route was not set for my laptop. I had to modify the connection manually and add one. The result looks like this:</p>
<pre><code>[me@foobar ~]$ ip -6 r
default via fe80::abcd:ef12:fecd:6573 dev wlp2s0 proto ra metric 600 pref high
</code></pre>
<h2>Bonus: ULA</h2>
<p>ULA is something that doesn't exist in IPv4.</p>
<p>ULA is the closest counterpart to an IPv4 local address. I use it to have stable addresses within my network. Even if the upstream changes their prefix (effectively your network address) and all previous addresses become invalid at 6:00 every morning, breaking all connections (thanks Telekom), and even if the upstream goes down at all, ULA will keep your network internally connected.</p>
<p>I put the names of all hosts on my network in /etc/hosts, like a troglodyte on the early Internet before DNS.</p>
<p>It sounds complicated if you come from IPv4. There, every computer knows its own address. 192.168.1.77. Great. That's it. What's my public address? No idea.</p>
<p>In IPv6, every computer can easily have multiple addresses. A global one, some ULA addresses. It doesn't become a mess because those addresses are completely independent (link-local ones are a bit special, though). ULA networks are dropped on the edge of the public Internet, so they can't even be used for Internet access.</p>
<p>OpenWRT supports using ULA in Network → Global network options.</p>
<p>For me, it doesn't always get picked up, but it's there on some &quot;Static address&quot; interfaces. I don't really know what controls it, but I have one on the same interface as my LAN:</p>
<p><img src="ula.png" alt="&quot;ula&quot; named network on interface bridgr spanning several devices, with protocol &quot;Static address&quot; and an IPv6/64 ULA address without an IPv4 address" /></p>
<h2>The end</h2>
<p>Now I have IPv6 in my local network! I can seed the world with torrents! I can host a quirky server for 10 minutes (remember to open firewall)! I can connect to the servers I manage at a friend's place!</p>
<p>Oh wait, that server is behind its own firewall and I never enabled IPv6 on that one -_-. I guess I'm not done blogging about IPv6 yet.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Bike dynamo</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/dynamo/"/>
       <id>tag:dorotac.eu,2024-03-07:posts/dynamo</id>
       <updated>2024-03-07T14:00Z</updated>
       <published>2024-03-07T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Bike dynamo</h1>
<p>I was recently planning a longer bike tour, where I would enjoy the wonders of sleeping under a tent and a wide open sky.</p>
<p>Retreating to nature is fun, but as a society, we're addicted to electricity. As a data hoarder, the very least I need to do is to keep my GPS tracker charged. The camera with a full battery is nicer than a dead one, too. This poses some logistical challenges if you don't want to regularly check into hotels, or libraries, or restaurants. And when cycling, I like having the freedom of not <em>having</em> to do many things.</p>
<p>This means I <em>should</em> produce my electricity myself to rely less on civilization.</p>
<p>I'll mention solar cells just to dismiss them: they don't work when I cycle, they don't work when I sleep. Next!</p>
<h2>Dynamo</h2>
<p>If the rumor is to be trusted, a bike dynamo is not a dynamo, but rather an alternator. It produces alternating current. But how exactly? What are the pitfalls of using a dynamo to power circuits?</p>
<h3>AC</h3>
<p>Alternating current is not directly useful for charging batteries. The AC needs to be rectified first, and a rectifier bridge made out of 4 diodes takes care of that.</p>
<h3>Voltage</h3>
<p>After the rectifier stage, we have direct current. But what voltage? Most bike dynamos are rated for 6V. But if you connect a multimeter (and nothing else) to the dynamo and spin the wheel, you'll get more like 30V to 100V AC peaks. Those rectify to way more than 6V. You're about to fry your USB device if you plug it in.</p>
<p>On the other hand, attach a small resistance and you don't even get the rated 6V. What the heck?</p>
<p>So I decided to run an experiment. I attached various resistive loads to a wheel with a hub dynamo, caught the axle in a vise, whipped out a hand drill, and held it against the tire to spin the wheel at an equivalent of about 20km/h.</p>
<p>This is a table of voltages I got:</p>
<table>
<thead>
<tr>
<th>resistance [Ω]</th>
<th>AC voltage[V]</th>
</tr>
</thead>
<tbody>
<tr>
<td>3200</td>
<td>28</td>
</tr>
<tr>
<td>1000</td>
<td>23</td>
</tr>
<tr>
<td>1000</td>
<td>22.4</td>
</tr>
<tr>
<td>470</td>
<td>26</td>
</tr>
<tr>
<td>470</td>
<td>23.2</td>
</tr>
<tr>
<td>68</td>
<td>22</td>
</tr>
<tr>
<td>22</td>
<td>13.2</td>
</tr>
<tr>
<td>6</td>
<td>8.6</td>
</tr>
<tr>
<td>4</td>
<td>4</td>
</tr>
<tr>
<td>2</td>
<td>1.8</td>
</tr>
<tr>
<td>0.8</td>
<td>0.6</td>
</tr>
<tr>
<td>0.4</td>
<td>0.4</td>
</tr>
</tbody>
</table>
<p>Some values repeat because I couldn't believe my eyes (j/k I was just sloppy at taking notes).</p>
<p>Clearly, a dynamo is not a voltage source. You'll find out as much after five minutes of research.</p>
<h3>Current</h3>
<p>But is it a current source? If so, then the current should be equal across all measurements. And current = voltage / resistance, so we can quickly calculate it and add it to the table:</p>
<table>
<thead>
<tr>
<th>resistance [Ω]</th>
<th>AC voltage[V]</th>
<th>AC current[A]</th>
</tr>
</thead>
<tbody>
<tr>
<td>3200</td>
<td>28</td>
<td>0,0087</td>
</tr>
<tr>
<td>1000</td>
<td>23</td>
<td>0,023</td>
</tr>
<tr>
<td>1000</td>
<td>22.4</td>
<td>0,022</td>
</tr>
<tr>
<td>470</td>
<td>26</td>
<td>0,055</td>
</tr>
<tr>
<td>470</td>
<td>23.2</td>
<td>0,049</td>
</tr>
<tr>
<td>68</td>
<td>22</td>
<td>0,32</td>
</tr>
<tr>
<td>22</td>
<td>13.2</td>
<td>0,6</td>
</tr>
<tr>
<td>6</td>
<td>8.6</td>
<td>1,4</td>
</tr>
<tr>
<td>4</td>
<td>4</td>
<td>1</td>
</tr>
<tr>
<td>2</td>
<td>1.8</td>
<td>0,9</td>
</tr>
<tr>
<td>0.8</td>
<td>0.6</td>
<td>0,75</td>
</tr>
<tr>
<td>0.4</td>
<td>0.4</td>
<td>1</td>
</tr>
</tbody>
</table>
<p>The current isn't always equal, as it goes from almost nothing with a high resistance to over 1A. So no current source, either?</p>
<h3>Fitting a dynamo model into a current hole</h3>
<p>There's no clarity from looking at raw data in a table, so instead I decided to guess a model and see whether it matches the data.</p>
<p>First I had to decide whether I wanted to model a current source, or a voltage source. Fortunately, <a href="https://en.wikipedia.org/wiki/Th%C3%A9venin%27s_theorem">Thévenin's theorem</a> says that one can be converted to another. So I simplified my life by modelling my circuit as a voltage source with internal resistance:</p>
<p><img src="vsource.png" alt="A circuit of a voltage source with internal resistance connected to a resistor" /></p>
<p>The current(resistance) function to fit to match the data looks like this:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>i</mi><mrow><mo stretchy="false">&#x00028;</mo><mi>r</mi><mo stretchy="false">&#x00029;</mo></mrow><mo>&#x0003D;</mo><msub><mi>v</mi><mi>s</mi></msub><mo>&#x0002F;</mo><mrow><mo stretchy="false">&#x00028;</mo><msub><mi>r</mi><mi>s</mi></msub><mo>&#x0002B;</mo><mi>r</mi><mo stretchy="false">&#x00029;</mo></mrow></mrow></math><p>where r_s is the internal source resistance.</p>
<p>After letting gnuplot find the best fit, it finds v_s = 23.9184  and r_s = 14.492. The plot looks like this:</p>
<p><img src="current.svg" alt="A plot of current versus resistance" /></p>
<p>Given that the data already contains 28V, I don't think it's a great fit. And I suspect there's more power hiding near the 1.4A, which should form a peak, but there isn't one in the best fit. Given that, the model seems wrong.</p>
<p>But I don't know what other model I could use. If you know, I'm eager to hear from you!</p>
<p>But this model is enough to estimate that any circuit dealing with the dynamo should withstand about 30V at about 20km/h. Investigation at higher speeds will continue after I find a faster hand drill.</p>
<h3>Spikes</h3>
<p>Except when the voltage spikes. If you're like me, your connectors will not be attached with micrometer precision and steam engine forces. It may get torn by accident or just get loose from all the experimenting. As the spinning dynamo on a bike in motion disconnects from the load, its resistance jumps to infinity, and voltages on the leads spike to 100V or maybe even above that.</p>
<p>And when the connection catches again, your circuits are in for a shock. Literally. Look at this vertical line at the beginning of the graph. That's the moment I connected the spinning dynamo.</p>
<p><img src="spike.png" alt="Oscilloscope showing a spike" /></p>
<p>The shock isn't long, but apparently it's enough. This matter has been brought to my attention because someone had had multiple lights fried for no clear reason.</p>
<p>Thankfully, it's possible to mitigate. Experiments showed me that sticking a 30µF capacitor rated for 50V in parallel to the load is enough to absorb the shocks. A TVS diode (or two) might also be sufficient.</p>
<h3>Overvoltage</h3>
<p>Okay, but what if 30V is still too much? You'd like to cut it off, even at the cost of losing some of the energy, because you just can't find a reasonably sized or priced component that can handle it?</p>
<p>Notice that current goes WAY down above about 23V and stick two Zener diodes in series, parallel to the load. A Zener diode cannot handle much current, but there isn't much to divert, either.</p>
<p>If that's still too much for you, add a resistor in parallel to the load. Choose one based on your preferred voltage from the table above, just be aware that it caps the voltage by always diverting some of the power.</p>
<h3>Power</h3>
<p>One more thing we can estimate is power. The spice must flow into those batteries! How much can we count on? Power = voltage · current. Skipping directly to the plot:</p>
<p><img src="power.svg" alt="A plot of power versus resistance. It has two peaks" /></p>
<p>Again, the fitted curve is kind of crazy, but at least we already know that we can cross 10W. I strongly suspect this is not the last word of the dynamo, given online tests, but more measurements are needed.</p>
<h2>Sources</h2>
<p>The data for the plots and the code can be downloaded: <a href="dynamo.data">data</a>, <a href="dynamo.gnuplot">plot</a>. Run with <code>gnuplot -p ./dynamo.gnuplot</code></p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>State of input method</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/input_broken/"/>
       <id>tag:dorotac.eu,2024-02-10:posts/input_broken</id>
       <updated>2024-02-10T14:00Z</updated>
       <published>2024-02-10T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>State of input method</h1>
<p><em>This is an edited version of the &quot;Input method on Wayland is broken and it's my fault&quot; talk which I presented at FOSDEM 2024.</em></p>
<p>My encounter with input methods started with the GNU/Linux-based Librem 5 phone, when I got hired to make sure people can type text on it.</p>
<p><img src="l5.png" alt="Librem 5" /></p>
<p>As you can see, the phone has no keyboard. That would make entering text difficult on a typical Linux computer. Thankfully, we're not the first who needed to enter text without depending on a physical keyboard. The solutions to that are collectively called <em>input methods</em>. They can cover handwriting recognition, or entering scripts like Chinese where the characters on the keyboard are by themselves insufficient, or – and this is what we're looking for – selecting characters on on-screen &quot;keyboards&quot;.</p>
<p><img src="banglay.png" alt="A screenshot of typing banglay" /></p>
<p>But how do we place an input method within the context of a GNU/Linux operating system?</p>
<p>Consider Wayland, which has been chosen as the centerpiece of the user interface on the Librem 5.</p>
<p>Wayland is a set of protocols connecting applications to the user, by the means of outputs, like the graphical display, and inputs: the mouse and keyboard. (It doesn't cover anything relating to sound, though, nor joysticks, for some reason). My task was to find an alternative to typing, so it would cover some of the tasks that a keyboard typically does. That suggests that Wayland is indeed the right place to put an input method – in place of the keyboard.</p>
<p>It turns out that I wasn't the only one having that thought. Wayland has seen attempts to create protocols for this: <code>zwp_input_method_unstable_v1</code> and <code>zwp_text_input_unstable_v2</code>.</p>
<p>I took those protocols and tried to implement them, but I used them wrong, resulting in breakage which I'm about to warn you against.</p>
<p>Haha, just kidding. My transgressions against Wayland were much worse.</p>
<p>The two existing protocols were very basic, and didn't really cover the actual needs of nontrivial input methods. I decided to update them and create improved versions. This is where it gets much worse: I embedded mistakes within the protocols, so no matter how hard you try, you can't use them correctly.</p>
<h2>Mistake 1: being too conservative</h2>
<p>A user of a modern, keyboard-less mobile phone expects the on-screen keyboard to do more than just input text. Actions like moving the cursor within a text field, moving to the next one, or submitting the form are not related to typing – they don't produce text. Yet, because of the expectation, the tech stack on the Librem 5 had to support this.</p>
<p>I decided to not introduce too many changes at the same time, and re-use an experimental keyboard emulation protocol that we supported for other reasons. That protocol would be used to submit just the non-text actions.</p>
<p>It looked great, until it became clear that it's a dead end.</p>
<p>To understand why, you have to take a look at how keyboard protocols work. What the user wants is to submit text. With a text-input protocol, it's easy. Entering the text &quot;Błąd&quot; would look something like this:</p>
<ol>
<li>Use text &quot;B&quot;</li>
<li>Use text &quot;Bł&quot;</li>
<li>Use text &quot;Błą&quot;</li>
<li>Use text &quot;Błąd&quot;</li>
<li>Finish</li>
</ol>
<p>But a keyboard is not at its core a device to enter text, but to press keys. It's concerned with buttons and whether they were pressed and released. Our simple example looks more like:</p>
<ol>
<li>KeyDown <code>SHIFT</code></li>
<li>KeyDown <code>B</code></li>
<li>KeyUp <code>B</code></li>
<li>KeyUp <code>SHIFT</code></li>
<li>KeyDown <code>AltGr</code></li>
<li>KeyDown <code>L</code></li>
</ol>
<p>and so on.</p>
<p>In reality, the buttons don't even have names, but numbers that need to get resolved to names. It looks closer to this:</p>
<ol>
<li>KeyMap *0xffffbced</li>
<li>ModifierDown 0x1</li>
<li>KeyDown 0x102</li>
<li>KeyUp 0x102</li>
<li>ModifierUp 0x1</li>
<li>ModifierDown 0x4</li>
<li>KeyDown 0x56</li>
</ol>
<p>and so on.</p>
<p>The tables responsible for resolving button numbers to actual text are called keymaps, and they are the problem here.</p>
<p>There are keyboards that support multiple scripts. A button can be labelled with &quot;Q&quot; and &quot;Й&quot; at the same time, even though its number can't be changed. The user is responsible for telling the software to use the currently intended keymap.</p>
<p>Then, there are custom keyboards. Those can send arbitrary key numbers as well, but there's no guarantee that a button labeled &quot;P&quot; on one sends the same number as the button labeled &quot;P&quot; on another. Again, the user must indicate the intended keymap.</p>
<p>An emulated keyboard is a kind of a custom keyboard, needing a custom keymap, independent of any physical keyboards that might be connected.</p>
<p>But Wayland combines all keyboard events into a single stream before giving them to an application.</p>
<p><img src="keymap.svg" alt="Diagram showing two different keyboards with different letters for the same code, and how they are interpreted as a single keyboard by the application" /></p>
<p>That means the application must always have the correct key map for every key press. This works until you consider that a user might want to smash and hold keys on multiple distinct keyboards at the same time. Long story short, Wayland maintainers told me that it's basically impossible to make this work due to corner cases, and they won't accept the keyboard emulation protocol.</p>
<p>This is bad. A protocol without Wayland buy-in will not get widespread adoption. And we must support non-text events if we want to meet users' expectations about how a phone on-screen keyboard works.</p>
<p>This makes emulating keyboards a dead end, and this mistake results in having no viable solution for the phone use case.</p>
<h3>Actions</h3>
<p>Not all is bad. There has been some talk about an &quot;actions&quot; protocol to cover those needs. It's still unclear, but it could allow the user to do things like copy and paste, select all, submit the form, etc.</p>
<h2>Mistake 2: bad synchronization</h2>
<p>What should happen if you're entering two words into a text field but the application lags for a moment?</p>
<p>If you're answer is &quot;the application should show both words&quot;, then I agree with you.</p>
<p>Except this is forbidden in my protocol. The text-input protocol I came up with will drop events a lot for that reason, and you'll end up with missing letters. That's bad.</p>
<p>I did it for a reason. An input method must be aware of the text inside the input field, if only to present the user with autocomplete suggestions. I came up with a protocol which carries the state identifier with every request.</p>
<p><img src="state.svg" alt="Diagram showing input method sending &quot;M&quot; based on state 0, and input method in state 0 accepting it" /></p>
<p>Every event of an input method pertains to some state, so that the application can reject events if they apply to text which is no longer there.</p>
<p><img src="lag.svg" alt="Diageam showing input method showing &quot;M&quot; based on state 0, and then &quot;Mo&quot; based on state 0 again, and input method being in state 0 and accepting &quot;M&quot; and afterwards being in state 1nd rejecting &quot;Mo&quot; due to state mismatch" /></p>
<p>So when the input method sends two events at once, and both apply to the original state, then after the first event is applied, the original state is already gone and the second event is automatically invalid.</p>
<p>&quot;That's wrong&quot;, I hear you say. &quot;Why not just let the input method override state changes? It would only be a problem if the user is typing into the input field from another source, and this is such a contrived case that it's not worth caring about.&quot;</p>
<p>To this, I say: I don't want to mandate which use cases are contrived or not. Initially, that was my only reasoning behind this aspect, but before FOSDEM, I came up with an actually realistic use case: two people editing the same text field. We don't even have to get into Wayland seats. It's already a thing in collaboratively edited Web documents: the text may change here while you're typing there. Should your input method automatically override the other person's edits?</p>
<p>This is a hard problem, and I have no solution to it.</p>
<h2>Progress</h2>
<p>Sadly, since I was moved to other work, there has been very little progress on those fronts. <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/commits/wip/text-input-next/?ref_type=heads">The last change</a> to the new text-input protocol is 2 years old, and it adds my 4 years old commit.</p>
<p>If you want to take part in the effort to change this, feel free to contact me.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>After Rustlab</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/rustlab/"/>
       <id>tag:dorotac.eu,2023-11-25:posts/rustlab</id>
       <updated>2023-11-25T14:00Z</updated>
       <published>2023-11-25T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>After RustLab</h1>
<p>Last week I attended the <a href="https://rustlab.it/">RustLab</a> conference for the first time. This report was going to be rather boring, but one of the presentations triggered me, so I'm going to put the (hopefully interesting) hot take right here in the front.</p>
<h2>Last-minute talk</h2>
<p>The talk about &quot;confidential computing&quot; was not on the schedule. It took the slot of a cancelled one, after some confusion regarding lightning talks in the same spot. Please excuse me that I don't know the exact title or the author (I will update the post once I find the recording).</p>
<h2>Same, but different</h2>
<p>The speaker asked the audience for a show of hands: who heard the term &quot;confidential computing&quot;? Very few hands went up.</p>
<p>Then he explained that &quot;confidential computing&quot; prevents anyone from accessing confidential data, and even the staff at the data center cannot access your computations. This is due to hardware features like TrustZone, hypervisors, Intel SGX, and similar.</p>
<p>Sounds familiar?</p>
<p>How many people would have put their hands up if he asked about &quot;trusted computing&quot;? Or &quot;DRM&quot;? Was the selection of the term intentional? I can only wonder.</p>
<p>Trusted computing can be used to empower. At <a href="https://puri.sm">Purism</a>, my co-workers used <a href="https://osresearch.net/">trusted computing</a> to give the user control over who can tamper with their OS. My own team added a smart card reader to the <a href="https://puri.sm/products/librem-5/">Librem 5</a> for similar purposes. But this is an exception, rather than a rule. Hardware manufacturers use trusted computing technologies to <a href="https://www.xda-developers.com/huawei-honor-request-bootloader-unlock-code/">prevent</a> <a href="https://xdaforums.com/t/unlock-bootloader-yes-but-unlocking-seems-impossible.3788708/">people</a> from loading their software of choice on the devices they own. Some don't give up, and eventually <a href="https://switch.homebrew.guide/">break into</a> their own devices to gain full(-er) access.</p>
<p>The speaker completely ignored that topic, instead describing the rather tame use case where we pay someone else to do our computations – the classic cloud computing arrangement. Trusted computing then ensures that our contractor doesn't see or mess with our inputs or outputs. That's a quite reasonable take on things.</p>
<p>Except it's like presenting the scientific benefits of the ballistic rocket and staying silent on its applications in war.</p>
<h2>Root of trust</h2>
<p>For a talk about controlling access to data, there was disappointingly little said about who's wielding that control. Even when someone from the audience (<em>moi</em>) asked about it, the answer didn't really include the core concept of processing secrets: the <em>root of trust</em>.</p>
<p>In very simple terms, the root of trust is the part of the system that the data cannot be hidden from.</p>
<p>As the presenter said, CPU doing the actual processing is one. Translating to more useful terms: you must trust the CPU maker to create a CPU free of bugs and not to intentionally exfiltrate your data. This is the CPU has direct access to your data, and verifying that the CPU is OK after buying it is <a href="https://security.stackexchange.com/questions/241303/how-can-you-trust-that-there-is-no-backdoor-in-your-hardware">damn near impossible</a>.</p>
<p>But there's an even more important party here. It's the one who actually wants to process the data. The one who loaded the application. If the application could not access data it wants to confidentially process, it wouldn't be useful for much. So the application loader also holds the keys to the kingdom. Quite literally, because the CPU interface is designed to grant access to the confidential computing facilities to anyone who holds one of the cryptographic keys burned (in)directly into the CPU.</p>
<p>If you load the application, but haven't thoroughly verified its source code, then the authors of the source also become roots of trust – but that's a topic for another day.</p>
<h2>It's mine! No, it's mine!</h2>
<p>So, who's the key holder in practice?</p>
<p>For Secure Boot on x86, Microsoft is one by default. Linux distributions like Fedora need to beg Microsoft to <a href="https://jfearn.fedorapeople.org/fdocs/en-US/Fedora_Draft_Documentation/0.1/html/UEFI_Secure_Boot_Guide/sect-UEFI_Secure_Boot_Guide-Implementation_of_UEFI_Secure_Boot-Shim.html">give them access</a>, or otherwise installation would stop with scary warnings by default.</p>
<p>For Android boot, the phone manufacturer typically holds the keys. The firmware running in <a href="https://genode.org/documentation/articles/trustzone">TrustZone</a> typically cannot be replaced by anyone else.</p>
<p>But if you're the user, the situation is not hopeless. Many CPU manufacturers on the ARM side keep TrustZone accessible. On the Librem 5, we deliberately left all 4 key slots open, for situations where the owner wants to create their own trusted computing environment.</p>
<p>Some people say that not allowing the owner to access the keys is a good thing. After all, the keys protect the owner's confidential data from attackers, and if the users got the freedom to protect themselves, they would certainly hurt themselves, therefore they should not have the option. I will mercifully not comment on this kind of argument.</p>
<p>Instead, I will encourage you, dear hardware and software maker, to stop for a moment, and think. Are you using confidential computing to empower people who own their hardware, or to split them from the computing power they paid for and own? Are you letting them make decisions, or making it for them?</p>
<p>This talk forgot to place humans in context of confidential computing, even though all computing is one for humans. Please don't make that mistake and don't forget who your work serves.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Two-factor login</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/2fa_login/"/>
       <id>tag:dorotac.eu,2023-10-29:posts/2fa_login</id>
       <updated>2023-10-29T14:00Z</updated>
       <published>2023-10-29T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Two-factor login</h1>
<p>Do you still type a password to get into your laptop? What about in the office? Or when you're on the train? What if there are cameras above your head? Have you ever worried that a snooper might get your password and do bad things with it?</p>
<p>I'm sure you haven't. Like a properly security-aware hacker, you disabled password-based SSH login already. You also don't reuse your laptop's password for anything else, especially not for your backups. Even if someone snoops the password, all they get is access to the laptop itself. Right?</p>
<p>Haha, I betcha. Let's face it: opsec is hard. Upgrading your opsec skill takes effort. Good opsec reduces the likelihood that secrets will be spilled.</p>
<p>That's why I decided to augment my laptop with two-factor login.</p>
<p>Now, I can whip out my laptop on the train, and log in to it without worrying too much whether the password gets intercepted. It's unique, so the attacker won't hack any of my other properties (and I'm not leaving my laptop sitting around). I see you grumble: &quot;another unique password is hard to remember! What's the point if you could just change your login password?&quot; The hardware token makes short and memorable passwords possible, because it bricks itself after 3 failed guesses! So I can get away with a password that's really easy to remember, without losing resistance to brute forcing.</p>
<p>Now I can also leave the laptop in a public space for 5 minutes. Even the security guard who used the ceiling camera to record me typing the password won't get into my laptop undetected. After all, I took the hardware token with half of my login credentials with me!</p>
<p><strong>WARNING:</strong> I only cover the user login procedure in this guide. Not unlocking the disk encryption. The evil security guard can still compromise the laptop by watching me enter the decryption password and then rebooting the system. Stealing the laptop is not in my threat model for this blog post.</p>
<h2>Second factor</h2>
<p>Two-factor authentication splits your secret into pieces of two different types (factors), which have different security properties. This is to reduce the likelihood of the entire secret getting intercepted. Each factor may get intercepted easily on its own, but both at the same time is going to be much more difficult because of how they differ.</p>
<p>There are two most common factors: &quot;something you know&quot;, like a password, and &quot;something you have&quot;, like a hardware token. Passwords can be intercepted by watching someone type them, or by key logging, or by retrieving it from badly secured storage. A hardware token cannot be duplicated, and its secret can only be intercepted by physically stealing the token.</p>
<p>While hardware tokens contain some data constituting the secret, it's not possible to get that data out. If you <em>can</em> duplicate the secret, then the token not doing its basic job as a &quot;something you have&quot;, and stops being a second factor to a password. It basically becomes a password with extra steps.</p>
<h2>Login setup</h2>
<p>This guide will show you how to enable 2-factor authentication with a password and a smart card (Nitrokey Pro) on Fedora 37. It does not disable the original password, so that locking yourself out is harder.</p>
<p>First, install some packages:</p>
<pre><code>sudo dnf install sssd opensc p11-kit gnutls-utils nitrokey-app easy-rsa
</code></pre>
<p>Let's play around with the smart card itself. Plug it into the USB port and give it a go. First, run <em>nitrokey-app</em> and set up the user and administrator passwords (if you have trouble finding the options, check out the menu bar).</p>
<p>Now, play around with the key.</p>
<pre><code>$ p11tool --list-tokens
[...]

Token 2:
        URL: pkcs11:model=PKCS%2315%20emulated;manufacturer=ZeitControl;serial=xxxxxx;token=OpenPGP%20card%20%28User%20PIN%29
        Label: OpenPGP card (User PIN)
        Type: Hardware token
        Flags: RNG, Requires login
        Manufacturer: ZeitControl
        Model: PKCS#15 emulated
        Serial: 000500005c61
        Module: opensc-pkcs11.so


Token 3:
        URL: pkcs11:model=PKCS%2315%20emulated;manufacturer=ZeitControl;serial=xxxxxx;token=OpenPGP%20card%20%28User%20PIN%20%28sig%29%29
        Label: OpenPGP card (User PIN (sig))
        Type: Hardware token
        Flags: RNG, Requires login
        Manufacturer: ZeitControl
        Model: PKCS#15 emulated
        Serial: 000500005c61
        Module: opensc-pkcs11.so
</code></pre>
<p>Remember this command, cause the URL field will be needed later.</p>
<h3>Certificates</h3>
<p>The login procedure involves checking PKCS12 certificates, so let's set up one using easy-rsa. First, create a file called &quot;vars&quot; in some directory, and put this inside:</p>
<pre><code>set_var EASYRSA_KEY_SIZE 4096
</code></pre>
<p>Now, let's create the entire certificate chain for the user &quot;thatsme&quot;. It will ask you for different passwords: one for the certificate authority (you only need to use it when adding another certificate), and one for the private key (you'll need it once, when uploading the private key to the smart card), so pay attention to them.</p>
<pre><code>ln -s /usr/share/easy-rsa/3.0.8/ easyrsa
EASYRSA_VARS_FILE=`pwd`/vars ./easyrsa/easyrsa init-pki
EASYRSA_VARS_FILE=`pwd`/vars ./easyrsa/easyrsa build-ca
EASYRSA_VARS_FILE=`pwd`/vars ./easyrsa/easyrsa build-client-full thatsme
</code></pre>
<p>Whew, this wasn't as difficult as the last time I tried to set up a new certificate authority. Good job, easy-rsa folks!</p>
<p>Now, store the generated private key and the corresponding certificate (approved by the newly minted certificate authority) into the smart card.</p>
<p><strong>WARNING:</strong> we generated a private key on the host computer. It can be intercepted by a rogue application on this computer, especially if you don't delete it fully (and deleting is harder than it seems). The interceptor will be able to follow the next steps to duplicate your smart card. If you want to be super careful, you should <a href="https://github.com/OpenSC/OpenSC/wiki/Card-personalization">generate the key directly on the smart card</a>, and use Certificate Signing Requests (CSRs) with easy-rsa to <a href="https://wiki.archlinux.org/title/Easy-RSA#Sign_the_certificates_and_pass_them_back_to_the_server_and_clients">approve the key</a>. This is not relevant to me, because I not will use the smart card to log in to any other computer. If someone snooped the key from the laptop, I'm already hosed. So I chose the easy way to generate the key.</p>
<pre><code>pkcs15-init --delete-objects privkey,pubkey --id 3 --store-private-key ./pki/private/thatsme.key --format pem --auth-id 3 --verify-pin
pkcs15-init --id 3 --store-certificate ./pki/issued/thatsme.crt
</code></pre>
<p>See how snugly the key and the certificate both sit on your smart card now? Use the URL you noted earlier to show them.</p>
<pre><code>$ p11tool --provider /usr/lib64/opensc-pkcs11.so --list-all pkcs11:model=PKCS%2315%20emulated;manufacturer=ZeitControl;serial=xxxx;token=OpenPGP%20card%20%28User%20PIN%29
Object 0:
        URL: pkcs11:model=PKCS%2315%20emulated;manufacturer=ZeitControl;serial=xxxx;token=OpenPGP%20card%20%28User%20PIN%29;id=%03;object=Authentication%20key;type=public
        Type: Public key (RSA-4096)
        Label: Authentication key
        Flags: CKA_WRAP/UNWRAP; CKA_EXTRACTABLE; 
        ID: 03

Object 1:
        URL: pkcs11:model=PKCS%2315%20emulated;manufacturer=ZeitControl;serial=xxxx;token=OpenPGP%20card%20%28User%20PIN%29;id=%03;object=Cardholder%20certificate;type=cert
        Type: X.509 Certificate (RSA-4096)
        Expires: Sun Sep 28 09:07:36 2025
        Label: Cardholder certificate
        ID: 03
</code></pre>
<p>I tried listing the private keys, but for some reason, p11tool won't display any.</p>
<h2>Login</h2>
<p>Now that the card is prepared, let's set up the login. First, copy the certificate authority to where sssd can find it. Any key signed with this certificate will be candidate for login.</p>
<pre><code>sudo cp pki/ca.crt /etc/sssd/pki/sssd_auth_ca_db.pem
sudo chmod ugo-w /etc/sssd/pki/sssd_auth_ca_db.pem
</code></pre>
<p>Then, set up sssd itself by creating the file <code>/etc/sssd/conf.d/base.conf</code> with the following contents:</p>
<pre><code>[sssd]
services = nss, pam
domains = files
disable_netlink = true

[domain/files]
id_provider = files

[pam]
pam_cert_auth = True
pam_p11_allowed_services = +kde, +kscreensaver, +sddm

[certmap/files/thatsme]
matchrule = &lt;SUBJECT&gt;^CN=thatsme$
</code></pre>
<p><em>NOTE:</em> The common name (file name) given to easy-rsa must match the username on this system. If anyone knows how to break that relationship, please tell me! I know it has to do with <a href="https://www.mankier.com/5/sss-certmap">sss-certmap</a>, but the docs are incomprehensible to me.</p>
<p>Finally, enable sssd. Do this as root:</p>
<pre><code>systemctl enable sssd
authselect select sssd with-smartcard --force
systemctl restart sssd
</code></pre>
<p>And that's it. When the smart card is plugged in, you should be able to see this:</p>
<pre><code>$ sudo echo &quot;ok&quot;
PIN for OpenPGP card (User PIN):
</code></pre>
<p>rather than the normal</p>
<pre><code>$ sudo echo &quot;ok&quot;
Password: 
</code></pre>
<p>Similarly, the KDE screen locker will accept the smart card password rather than the user password. It doesn't indicate that this is happening though, other than being really slow with the card in.</p>
<h3>Debugging</h3>
<p>Enable logging by adding the <code>debug_level</code> entry:</p>
<pre><code>[pam]
pam_cert_auth = True
debug_level = 9
</code></pre>
<p>and restarting sssd.</p>
<p>The two relevant log files are here:</p>
<p><code>/var/log/sssd/sssd_pam.log</code></p>
<p>and</p>
<p><code>/var/log/sssd/p11_child.log</code></p>
<p>but they are not that helpful, because sssd documentation doesn't really explain a lot.</p>
<p>Good luck upgrading your opsec!</p>
<h2>Actually, no login</h2>
<p>I did all of that, I rebooted my computer, I plugged in the Nitrokey, and I entered my password at the SDDM login screen. The prompt takes some long seconds to resolve – good, it's working with the card.</p>
<blockquote>
<p>Login refused</p>
</blockquote>
<p>Wait, what? I try again, same result. I try my regular password, and that goes through.</p>
<p>The logs in /var/log/sssd/sssd_pam.log show:</p>
<pre><code>[pam_reply] (0x4000): [CID#3] pam_reply initially called with result [9]: Authentication service cannot retrieve authentication info. this result might be changed during processing
[pam_reply] (0x0200): [CID#3] blen: 22
[pam_reply] (0x0200): [CID#3] Returning [9]: Authentication service cannot retrieve authentication info to the client
[client_recv] (0x0200): [CID#3] Client disconnected!
</code></pre>
<p>Meanwhile, a successful use in sudo:</p>
<pre><code>[pam_reply] (0x4000): [CID#9] pam_reply initially called with result [9]: Authentication service cannot retrieve authentication info. this result might be changed during processing
[pam_reply] (0x0040): [CID#9] Assuming offline authentication setting status for pam call 249 to PAM_SUCCESS.
[pam_eval_prompting_config] (0x4000): [CID#9] No prompting configuration found.
[may_do_passkey_auth] (0x0400): [CID#9] Passkey auth not possible, SSSD built without passkey support!
[pam_reply] (0x0200): [CID#9] blen: 22
[pam_reply] (0x0200): [CID#9] Returning [0]: Success to the client
</code></pre>
<p>I don't know if this difference is relevant. My search engine skills failed me, and I eventually gave up.</p>
<p>I guess I'll just have to use the main password for login. Thankfully, when traveling, I resume the laptop from sleep, and the screen locker accepts the smart card password, so it's not that bad.</p>
<p>Either way, if you know how to solve this problem (any sddm devs around?), please let me know! I'll update this post with your info.</p>
<hr />
<p>Thanks to one of my coworkers at <a href="https://puri.sm">Purism</a>, for showing me that not every password must be guarded jealously.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Non-English speech synthesis</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/speech/"/>
       <id>tag:dorotac.eu,2023-09-02:posts/speech</id>
       <updated>2023-09-02T14:00Z</updated>
       <published>2023-09-02T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Non-English speech synthesis</h1>
<p>If you're like me, you've been on the English-language Internet for so long that English is how you think of computer things.</p>
<p>Sadly, the English Internet has blind spots. The most obvious one of those is that the Eglish Internet doesn't care about other languages. So if you go and ask the search engine about &quot;Linux speech synthesizers&quot;, hoping to get a robot voice to say &quot;<a href="https://jazda.org">jazda</a>&quot;, you'll realize that the results have nothing to offer you.</p>
<p>Instead, you stop for a moment, and decide to go to the smaller, a bit backwater, non-English Internet, and you ask &quot;syntezator mowy linux&quot;, knowing that search results are likely to be infested with non-free solutions which you don't want to see, and you might not even find a free one.</p>
<p>The results are just as bad as in the English search, but somehow still worse. There's the unmaintained, one-person effort, a tutorial for Festival which asks you to download the voice bank from a long-dead URL, and a bunch of links referring to espeak which <a href="jazda_espeak.wav">sounds awful even for a robot voice</a> when asked to go <a href="https://github.com/espeak-ng/espeak-ng/issues/539">outside of its English comfort zone</a>. The guides to installing an Android software aren't even worth mentioning.</p>
<p>But there's something sparkling in the mud. A <a href="https://bon.uw.edu.pl/magda-i-natan-dwa-polskie-glosy-dla-syntezatora-mowy-rhvoice/">university website</a> links to <a href="https://rhvoice.org/">RHVoice</a>, which is even <a href="https://packages.debian.org/search?keywords=rhvoice&amp;searchon=names&amp;suite=stable&amp;section=all">packaged for Debian</a> – although I don't know why it's in the non-free repo, the license on the github page is listed as GPL. If it sounds like an easy solution, it almost is. Almost.</p>
<h2>RHVoice</h2>
<p>I don't run Debian, so I install the software in a container.</p>
<pre><code>podman run -ai debian:bookworm
</code></pre>
<p>Once inside, I install vim, and I install the package and some voices:</p>
<pre><code>echo 'deb http://deb.debian.org/debian bookworm non-free&quot; &gt;&gt; /etc/apt/sources.list
apt update
apt install rhvoice rhvoice-polish
</code></pre>
<p>Then, I tell it to say something:</p>
<pre><code>echo &quot;jazda&quot; | RHVoice-test -p natan -o - &gt; test.wav
</code></pre>
<p>This will place the recording in a file. On the host side, I retrieve it using the container ID:</p>
<pre><code>podman cp 1ea7197c68f6:test.wav Documents/jazda.wav
</code></pre>
<p>That's it! The recording <a href="jazda.wav">sounds okay</a>, and I can put it alongside the IPA pronunciation on the <a href="https://jazda.org">web site</a>.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>A biased guide to tech conferences</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/conferences/"/>
       <id>tag:dorotac.eu,2023-08-11:posts/conferences</id>
       <updated>2023-08-11T14:00Z</updated>
       <published>2023-08-11T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>A biased guide to tech conferences</h1>
<p>This year's Camp was my second so far. Despite knowing it already, and knowing how much is happening there, I left it feeling that I missed out on a lot. The previous camp was so much more exciting! Was this year's Camp different, or was it me?</p>
<p>The steamy weather affected everyone badly, putting all of us campers into a sleepy mood. But, once I thought about it, I noticed that I made some unprovoked mistakes, myself. Information wants to be free, so here's an easily digested guide for all of you, so that you don't have to make the same mistakes.</p>
<h2>My biases</h2>
<p>Out of the lack of experience otherwise, I'll describe what works for me when I come alone by train, and with an amount of luggage that does not impede my mobility much. The advice may need some adjustments for other cases.</p>
<h2>Early</h2>
<p>To get the most out of a conference, becoming a speaker might be useful. Forget about the long-term popularity boost. It's worth presenting just because of the immediate benefits. You'll meet the most fun people: the organizers before your talk, and afterwards, the listeners who are interested in the same topics as you are. Don't leave the stage immediately after you finished presenting.
Oh, and did I mention that a speaker's badge sometimes grants access to the backstage area?</p>
<p>To snatch a slot, you have to prepare early. The deadline for a typical &quot;Call for Papers&quot; is 6-2 months before the event. Look for announcements. If you miss it, then, depending on the event, you might still get another chance.</p>
<p>It may be worth to get the ticket early. In some places, tickets go up in price as the date of departure approaches. If you have a choice, snatch a ticket with a fast connection while you can. Sure, you can travel to the GPN in Karlsruhe from Berlin on the flat-rate Deutschlandticket, but you're going to face over 12h of travel. Maybe it's worth it to get there in half the time by paying a couple EUR for an IC ticket well in advance.</p>
<p>Now that you know you have to choose your tickets early, when to go? There are conferences where participants are the main attraction, and where they come with all their own toys. Find out if your conference is one of those, and if it is, come a day early, to help with unloading and the set-up. People who come early are the organizers, and the most proactive members of the community, so this is whose friendship you can win by carrying boxes and connecting cables. You can fall back on it over the conference, and you'll be remembered online and at future events.</p>
<p>Staying after the party is over will not earn you friends for the conference that just ended. But it may give you eternal gratitude by the few people who stayed. As I'm writing this, there are still people on the Camp grounds <a href="https://events.ccc.de/2023/08/26/camp23-feedback/">dismantling</a> the event. It's also going to pay off if the people you help with the teardown are the same who you meet at your local hackerspace.</p>
<h3>CCC events</h3>
<p>Events like the Camp and the Congress are self-organized to the extreme. They explicitly want visitors to come in groups, and organize activities, rather than to leave everything to the central organizers. Groups provide stages to have talks on, toys to play with, electricity to use, places to hang out, and, at the Camp, food to eat. While some of those benefits are available to anyone, you'll be seen a lot more welcome if you earn your participation by helping the group, and it's best to start early.</p>
<p>Go to your local hackerspace, and check who's going. Offer your help unpacking, and people will simply like you. Help organizing the talks on your stage, and you'll get another shot at presenting a talk (see above why talks are important). Come to the meeting before trip, and you might find out that someone has free space in the car to carry you or your cool hardware projects to the destination.</p>
<h2>Hotels</h2>
<p>There's not much to be said about them. It's best to choose the same hotel where others from the conference are already staying. This way, you'll be able to take the conference chat all the way back, and maybe even start nerd-talking at breakfast.</p>
<h2>Gadgets</h2>
<p>A tech conference cannot exist without blinkenlights. If you have the space, bring your favorite toys. But only <em>if</em> --. don't carry your huge 3d printer on the train, thank you. Take your projects, posters, <strong>stickers</strong>, -junk- rare stuff you don't need --. anything that draws attention --. and give them away or show them off later. Just don't get so lost in the project that you hack on it alone. You can always do that at home.</p>
<p>Oh, and some events run a phone network with DECT phones, so if you have one, take it with you.</p>
<h2>Travel</h2>
<p>The day of the trip is now close. You've prepared well. A printed ticket to prevent troubles with a dead phone battery. A bottle of water, because train trips can be exhausting. Some money, to buy food and a replacement bottle when you inevitably lose the one you came with.</p>
<p>But don't forget that the conference starts already on the train. Decorate yourself with nerd jewelry: armbands from previous hacker conferences, helper T-shirts, sticker-adorned laptops, and IKEA sharks are dead giveaways of a nerd coming to a social event. When you start seeing those signs, the conference is officially open. You've met conference people. Talk to them as if you already arrived! After hours on the train, they are probably just as bored as you are. (Be careful not to talk to them so intensely that you miss your next train – true story.) Maybe you'll discover that you have common interests. Some prompts, depending on the exact situation: &quot;Are you going to FrOSCon too?&quot;, &quot;Is this a Framework laptop? Can you remove the HDMI port?&quot;, &quot;Hey, I love Rust, too!&quot;, &quot;Your cat ears are falling off.&quot;</p>
<p>If you don't know how to continue, the train trip is a good place to practice. A lot of fun from the conference comes from finding people with cool interests. The more people you quiz, the most likely you'll find the outstanding ones, so talk to many people! In case of a stall in the conversation, it's best to ask about their projects and reasons for coming.</p>
<h2>Arrival, helpers</h2>
<p>After the arrival you might want to set up your base. This is obligatory if you're camping, but it's still useful if you just want somewhere to place your toys. Choose it based on the people around. Your hackerspace? Perfect. Sleeping spots? This way the party never pauses. A cool installation? More opportunities to catch passersby for a chat.</p>
<p>There is something more important, though. Register yourself as a helper (angel, troll, whatever it's called this week), and have your schedule filled for you with hanging out with people who care (orga team, bar shifts), leaving your mark on things things (construction), and going behind closed doors (video team), among other things. And all of that while being respected for helping make the event run! Maybe you'll even get a T-shirt for your trouble. Being a helper is a win-win-win.</p>
<h2>Schedule</h2>
<p>Just don't overdo it. You won't be the only helper, and if the only open shift remaining is babysitting the children playground at 7:00, maybe that's not the best use of your time.</p>
<p>Treat the shift schedule as another conference track. Welcome desk shift at 11? Great, but the talk at 11:15 about cyber security in Bhutan is presented by this guy I want to meet.</p>
<h2>Walk, watch, talk, play</h2>
<p>Every conference is different, even if it's just a re-edition of the same event. Go around, discover the hidden corners of it. Pay attention to what you see. Maybe there's a poster with the DECT number to the model railway operator group. Check it out. Maybe there's a puzzle to solve. Maybe there's an easter hunt going on, and the corner hides a prize. Or maybe there's a secret cabal of uber-nerds discussing plans of cyberspace domination. Join them! Don't be afraid to talk to people. Or if you're still afraid, find another curious hacker, and go explore together (it helps). Go sweeping, and visit each stall, asking people there what they are up to (they are usually up to something).</p>
<p>Stay curious, and not just because it's a hacker's virtue, but also because curiosity will make your day exciting.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Style me</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/styles/"/>
       <id>tag:dorotac.eu,2023-08-11:posts/styles</id>
       <updated>2023-08-11T14:00Z</updated>
       <published>2023-08-11T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Style me! — announcement</h1>
<p>Regular readers of this blog (meaning: me) could notice a change last month. The change is a new section of the web page called &quot;<a href="/userstyles">Styles</a>&quot;, linked from the navigation section. What is it?</p>
<p>It's a return to the roots. It's a revolution that comes back to the beginning. It's the return of control of the Web experience back to the user.</p>
<p>It's a dark style for dorotac.eu.</p>
<h2>Hoops to jump</h2>
<p>Of course, I'm not doing this like everyone else, oh boy no I don't. There is no JS on this web site which activates when you click the style link. The website does not set any cookie to select the style in the future. There is no button which you can click on an unmodified browser to make the style work – sadly.</p>
<p>The style is a (<a href="https://github.com/openstyles/stylus/wiki/UserCSS">User</a>)CSS file, containing the CSS to add to the page, and pretty much nothing else.</p>
<p>Unfortunately, no browser I'm aware of is able to use user styles without modifications. I recommend the <a href="https://addons.mozilla.org/en-US/firefox/addon/styl-us/">Stylus</a> extension: install it first, then choose the style.</p>
<h2>User agent</h2>
<p>Not many people realize, but browsers were once called &quot;<a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent">user agents</a>&quot;. To me, it sends a clear message: this browser is your agent. It serves you, respectable user. It's there to further your interests on the Web.</p>
<p>Sadly, this was not meant to last. Browsers have been co-opted by interests other than the user. The <a href="https://gs.statcounter.com/browser-market-share">most popular</a> browser is made by an <a href="https://www.nature.com/articles/s41599-017-0021-4?error=cookies_not_supported&amp;code=7e4b8ef6-5989-41e3-bc01-1072bbb3a58b">advertising company</a>, which have strong incentives to respect the rights of individuals on the web – oh <a href="https://www.digitalguardian.com/de/blog/google-fined-57m-data-protection-watchdog-over-gdpr-violations">wait</a>, <a href="https://www.tessian.com/blog/biggest-gdpr-fines-2020/">never</a> <a href="https://www.thedrum.com/news/2022/11/15/googles-400m-penalty-the-impact-the-5-heftiest-data-privacy-fines-2023-ad-plans">mind</a>.</p>
<p>Those interests, naturally, try to push their own interests, even if it wrests choice and control of their Web experience away from the user. First, <a href="https://wiki.mozilla.org/Media/EME">DRM support was standardized</a>, encouraging companies to give the user a choice putting them in control: either we control your video player, or you can't see our stuff. A little further on, Google Chrome switched to <a href="https://nordvpn.com/blog/manifest-v3-ad-blockers/">Manifest v3</a>, which removed support for a large swath of ad blockers (connect the dots yourself). Most recently, Google pushed for standardizing a version of DRM which would allow web sites to put the user in a situation where they <a href="https://arstechnica.com/gadgets/2023/07/googles-web-integrity-api-sounds-like-drm-for-the-web/">either use a pre-approved browser, or go home</a> via the WEI proposal. Control is going away from the user.</p>
<h2>User styles</h2>
<p>User styles are one of those things that bring a bit of control back to the user. They allow the user to see the Web as they please, even if the original designers imagined it otherwise. And the designers of the original Web page can't do squat about this!</p>
<p>This is the reason that dorotac.eu is so sparsely styled. I don't want to take your control away. Do you have a fancy green-on-blue default? Sure, I'll play with your rules. Do you want to add your own CSS via user styles? I keep custom stuff to the minimum, so that you can reorganize it easily.</p>
<p>The dark style on here is actually just a tweaked copy of the style I apply to all Web pages. Take it, and modify it! Oh, and if you create something really wild, send it to me. I'll publish it alongside!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Debugging Rust in QtCreator</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/debugging_rust/"/>
       <id>tag:dorotac.eu,2023-06-20:posts/debugging_rust</id>
       <updated>2023-06-20T14:00Z</updated>
       <published>2023-06-20T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Debugging Rust in QtCreator</h1>
<p><strong>Want to get straight to the juice? <a href="https://framagit.org/dcz/qtc-rustdebug/">Here's the repository</a> with the helpers.py file for debugging Rust with QtCreator.</strong></p>
<p>They say debugging is twice as hard as coding in the first place, so if you write the most clever code you can, you can't debug it by definition. Instead, you should write code at most half as clever as you are, so that you are able to actually debug it.</p>
<p>In reality, a lot of your skill depends on your tools, so if you have a bad/good debugger, you have to be more/less careful.</p>
<p>I'm only a human, and I make mistakes: sometimes I write clever code. It's embarrassing to say, but sometimes I even enjoy it. When the inevitable bug appears, I desperately search for any ways to make the chore of debugging easier. For me, that means avoiding a naked <em>gdb</em> for something more <em>visual</em>. I like to see an overview of variables all at the same time, know their types without asking explicitly, and to be able to see which variables have changed.</p>
<p>Because of those preferences, I use <a href="https://www.qt.io/product/development-tools">QtCreator</a> as my debugger.</p>
<h2>C? C++? Rust?</h2>
<p>QtCreator has been built with C and C++ in mind. That's all fine, but when I use Rust, it doesn't really know what to do with the variables I want to see. It doesn't understand enums, inserting weirdly named fields into them, and it gets completely lost when it comes to showing the contents of a slice or a string. Take a look at what a String looks like:</p>
<p><img src="qtstring.png" alt="QtCreator displays String as a deply nested structure" /></p>
<p>And now at this <code>Option::None</code> value:</p>
<p><img src="none.png" alt="QtCreator displays None in a nested way, including an inaccessible None variant and a #1 field with &quot;0&quot; in it" /></p>
<p>That's not great. Slices, strings, and enums like Option or Result are Rust's bread-and-butter. It's challenging to write any code without those. Does that mean that I have to go back to gdb? Gdb understands Rust, after all:</p>
<pre><code>test::printn (n=...) at test.rs:57
57          println!(&quot;{:?}&quot;, n);
(gdb) p n
$2 = test::Newtype (5)
</code></pre>
<p>...more or less:</p>
<pre><code>test::printS (t=...) at test.rs:33
33          println!(&quot;{}&quot;, t);
(gdb) p t
$1 = alloc::string::String {vec: alloc::vec::Vec&lt;u8, alloc::alloc::Global&gt; {buf: alloc::raw_vec::RawVec&lt;u8, alloc::alloc::Global&gt; {ptr: core::ptr::unique::Unique&lt;u8&gt; {pointer: core::ptr::non_null::NonNull&lt;u8&gt; {pointer: 0x5555555abaa0}, _marker: core::marker::PhantomData&lt;u8&gt;}, cap: 3, alloc: alloc::alloc::Global}, len: 3}}
(gdb) c
test::printSl (s=...) at test.rs:73
73          println!(&quot;{:?}&quot;, s);
(gdb) p s
$5 = &amp;[u8] {data_ptr: 0x555555598061, length: 2}
</code></pre>
<h2>Helpers</h2>
<p>Worry not. QtCreator's authors were smart enough to predict the need for such functionality. They provided a hook for something called &quot;debugging helpers&quot;. Those are small programs which interact with QtCreator's gdb's session in order to turn the data into something the debugger panel can understand.</p>
<p>Long story short, I spent a bunch of time hacking a <a href="https://framagit.org/dcz/qtc-rustdebug/-/blob/master/helpers.py">helpers.py</a> file which you can put into QtCreator, and which decodes the most common Rust types. Get it in my <a href="https://framagit.org/dcz/qtc-rustdebug/">Framagit repository</a> and read the README.</p>
<p><img src="string.png" alt="QtCreator presents human-readable contents of a Vec&lt;String&gt; in a readable way" /></p>
<h2>Limitations</h2>
<p>The project for which I needed this didn't make use of many types, so many things from the standard library are missing and may not work. Things like Path[Buf], OsStr[ing], Cells, Mutex, Arc are completely untouched. They shouldn't be hard to add, though. Patches accepted!</p>
<p>Beware! Rust doesn't have a stable ABI for the debug information, so those helpers are tuned for binaries produced with a specific version of Rust. I'll try to keep them up to date, but it's not a guarantee.</p>
<p>Finally, because I try to keep compatibility with plain C and with C++, I keep building on the basic helpers provided by QtCreator. Honestly, they are not thought out that great, and they need a rewrite, but instead I monkey-patch what was needed. If your answer it &quot;well, contribute it upstream&quot;, then I'll just make it clear that I refuse to sign over my rights to the code by signing the <a href="https://www.qt.io/community/legal-contribution-agreement-qt">Qt Company's</a> Contributor License Agreement. Qt Company: if you're reading this, you're allowed to take the code under the terms of the GPLv3, but you do not get the permission to relicense it, just like I didn't.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>SSH keys everywhere</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/ssh-keys/"/>
       <id>tag:dorotac.eu,2023-06-18:posts/ssh-keys</id>
       <updated>2023-06-18T14:00Z</updated>
       <published>2023-06-18T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>SSH keys everywhere</h1>
<p><em>This blog post is an edited and expanded version of the lightning talk I gave last week at <a href="https://entropia.de/GPN21/en">GPN21</a>.</em></p>
<p>As a reader of my blog, you probably know what SSH is. If not — the relevant part for this session is that it's a system allowing for secure, authenticated connections between two computers. It allows using passwords to log in, but the interesting part is when the user is logged in based on the provided key. Nearly every Linux hacker has a key pair which is used to log in to remote computers [citation needed].</p>
<p>Normies also enjoy secure systems, but those are based on a different technology. Connecting to web sites is secure and authenticated thanks to the layer called TLS. However, unlike with SSH, pretty much no one uses TLS keys to <em>log in</em> to web sites. Instead, the typical login process proceeds by making a TLS connection anonymously, and then a login mechanism built on HTTP or some other protocol (IMAP, XMPP, SMTP, IRC all have one) on top of TLS takes over.</p>
<p>Needless to say, using TLS to authenticate the server, and then a custom protocol to authenticate the client is more complicated than using TLS to authenticate both. SSH wins here by using the same mechanism (a public/private keypair) in both directions.</p>
<p>Another advantage of SSH is the one I mentioned before: many hackers already have a set of keys, and most understand how to use them.</p>
<p>However, there are not many libraries for creating custom services using the SSH protocol, compared to the ubiquitous TLS. What if I wanted to make a custom, secure, authenticated application, but avoid hacking around SSH's niche protocol?</p>
<h2>Would it be possible to use SSH keys in a TLS connection?</h2>
<p>You'll be pleased to hear that the answer is <a href="https://framagit.org/dcz/quishka"><strong>YES</strong></a>.</p>
<p>I put some effort into building a proof-of-concept pair of programs (server and client) which communicate using a TLS-based connection secured and authenticated using existing SSH keys.</p>
<p>Why is that possible?</p>
<p>Both TLS and SSH protocols are based on a similar set of cryptographic primitives. <a href="https://datatracker.ietf.org/doc/html/rfc8446#appendix-B.3.1.3">TLS 1.3 certificates support</a> three signature schemes: RSA keys, ECDSA NIST P-256, P-384, and P-512 elliptic curves, and the pair of elliptic curves called ED25519  and ED448 (which seem to be <a href="https://cabforum.org/baseline-requirements-documents/">unsupported by most Web browsers</a>). SSH public keys come in <a href="https://man.freebsd.org/cgi/man.cgi?query=ssh-keygen&amp;apropos=0&amp;sektion=0&amp;manpath=FreeBSD+13.2-RELEASE+and+Ports&amp;arch=default&amp;format=html">different flavors</a>: dsa	| ecdsa	| ecdsa-sk | ed25519 | ed25519-sk | rsa. The ecdsa key, once generated, expands to ecdsa-sha2-nistp256.</p>
<p>As you can see, there is some overlap: rsa, ecdsa, ed25519.</p>
<p>If we manage to extract the key data from the SSH public/private key files, and turn them into TLS-compatible public/private key files, then we should be pretty much set!</p>
<p>Except that there is no &quot;public key file&quot; in TLS. The closest equivalent would be the <a href="https://datatracker.ietf.org/doc/html/rfc5246#section-7.4.2">certificate</a> <a href="https://datatracker.ietf.org/doc/html/rfc5246#section-7.4.6">file</a>. The certificate adds some extra data about the certifying authority on top of the key file. But that's fine. We can skip all that, and create a self-signed certificate. (While certificates also <a href="https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Certificate-based_Authentication">exist in SSH</a>, most setups don't use them, so there's no need to bother with them in a proof of concept.)</p>
<p>Now we have a clear path to SSH key support in TLS connections!</p>
<h2>SSH security model</h2>
<p>Remember known_hosts and authorized_keys? Those are not things used in most TLS-based applications. TLS applications are usually based on the Web <a href="https://en.wikipedia.org/wiki/Public_key_infrastructure">PKI</a> concept, where some entities are trusted by default by everyone to prove the identity of the servers. SSH is different: nothing is trusted by default. When you connect to a server, you get a choice to trust it to be who it promises to be or not. If you don't, the client will prevent the connection. Your choice gets stored in known_hosts. Similarly, the server administrator registers your key as allowed to log in by placing it in the authorized_key file. If you're not registered, the server will reject your connections.</p>
<p>By reusing existing SSH keys, we apply the SSH model to TLS, and throw away centralized certificate authorities.</p>
<h2>Implementation</h2>
<p><a href="https://framagit.org/dcz/quishka">The proof-of-concept application</a> is simple: the client tries to connect, checks the server public key, presents its own, and if everything goes right, the connection gets accepted by both parties, and the server sends the client a single &quot;hello&quot;.</p>
<pre><code>$ cargo run --bin server -- -k foo 
    Finished dev [unoptimized + debuginfo] target(s) in 1.27s
     Running `target/debug/server -k foo`
listening on [::1]:4433
connection incoming
connection remote [::1]:43131, proto &quot;&lt;none&gt;&quot;
established
accepted
</code></pre>
<pre><code>$ cargo run --bin client -- -i foo2 --known_hosts known
    Finished dev [unoptimized + debuginfo] target(s) in 0.22s
     Running `target/debug/client -i foo2 --known_hosts known`
connecting from [::]:43131
opening
hello
</code></pre>
<p>The only supported key type on both sides is ED25519. Make sure to create new keys for the client and the server: this is a proof of concept, full of cut corners. It's possible that some of the corners became a security hole!</p>
<pre><code>ssh-keygen -t ed25519 -f my_special_key
</code></pre>
<p>This is what happens if a client with an unknown key tries to connect:</p>
<pre><code>     Running `target/debug/server -k foo`
listening on [::1]:4433
connection incoming
Unknown pubkey with fingerprint: [F0, 34, FB, EA, 59, B, 2D, 40, BD, 37, C1, 24, B1, 7D, 4C, D7, E4, 49, E3, 86, 7, A8, A9, 23, 1B, 1A, 9B, 6A, 5B, C5, 74, D6], rejecting connection.
IF THIS KEY SHOULD GAIN ACCESS, add this entry to your authorized_keys file:
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOD833SxhcyNayIdwoZfNJ12PK7heXfl5iHHKJWU/okB
connection failed: the cryptographic handshake failed: error 40: invalid peer certificate contents: Public key unknown
</code></pre>
<h3>ssh-agent</h3>
<p>The best way to ensure security of keys is to not have direct access to them. This is the job of ssh-agent: it loads a key, and signs everything you throw at it. The application never sees the key, which means it cannot leak it. This is perfect for applications written by someone so unskilled as me, so I implemented ssh-agent support in the client. I skipped it for the server, but it should be easy to do as well.</p>
<h3>ssh-keygen and openssl</h3>
<p>Without those tools, I would not have gotten anywhere. The first one can make a PEM public key from a SSH private key, and the other can inspect created keys and certificates.</p>
<pre><code>ssh-keygen -e -m pem -f my_private_key &gt; my_pub.pem
openssl asn1parse -inform pem -in my_pub.pem
    0:d=0  hl=2 l=  89 cons: SEQUENCE          
    2:d=1  hl=2 l=  19 cons: SEQUENCE          
    4:d=2  hl=2 l=   7 prim: OBJECT            :id-ecPublicKey
   13:d=2  hl=2 l=   8 prim: OBJECT            :prime256v1
   23:d=1  hl=2 l=  66 prim: BIT STRING        
</code></pre>
<p>Those help verify that the keys converted inside the application are indeed those which openssh would convert itself.</p>
<h3>My eyes are bleeding</h3>
<p>Have you seen the source code? Yes, I'm using 3 different ssh-related libraries. Each has parts which the two others don't, and which I needed to pull through. Also, the project came up because I was trying to learn <a href="https://en.wikipedia.org/wiki/QUIC">QUIC</a>, so it's made more convoluted because of that. As you can see, it's possible to write garbage, duplicates, unstructured code in Rust as well. I swear I will fix some of those things once I come back to the project.</p>
<p>Contributions welcome ;)</p>
<h2>Lost SSH goodies</h2>
<p>The SSH network protocol is not just security and key exchange. There are also things like channels and some awareness of X11/port forwarding specified in the <a href="https://datatracker.ietf.org/doc/html/rfc4254">RFC</a>. Most standalone applications would not feel their lack after switching to TLS, but what about those which would? Those who open multiple channels and forward data around?</p>
<p>TLS can run over TCP which is single-stream-at-a-time, but I took another approach. QUIC incorporates TLS, and also [supports streams[(https://www.rfc-editor.org/rfc/rfc9000.html#name-streams) natively. Best of both worlds!</p>
<p>That being said, there exists an expired <a href="https://www.ietf.org/archive/id/draft-bider-ssh-quic-09.html">draft for SSH over QUIC</a> already. According to my reading, it does not reuse SSH keys to establish the TLS layer, so in the end, it encrypts everything twice: once on the TLS, and once on the SSH layer (please correct me if I'm wrong). Not great for my server which is already limited by CPU, rather than network throughput.</p>
<p>There's also <a href="https://github.com/moul/quicssh">an implementation</a> of SSH over QUIC, but it looks more like a simple proxy, than anything more involved. Twice encrypted again.</p>
<h2>Future</h2>
<p><a href="https://framagit.org/dcz/quishka">Quishka</a>, the project I wrote, is intended to become a library supporting arbitrary applications. If you decide that it's worth exploring the idea with me, get in touch!</p>
<p>With QUIC, there's few remaining reasons to stick to the vanilla SSH protocol, and I would definitely want to see the official server support it. Multiple nonblocking streams would make multiple file transfers in sshfs so much smoother, for example. I would be totally stoked if this toy project revived the idea!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Shell recipes</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/recipes/"/>
       <id>tag:dorotac.eu,2023-05-10:posts/recipes</id>
       <updated>2023-05-10T14:00Z</updated>
       <published>2023-05-10T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Shell recipes</h1>
<p>The last time I organized a workshop at the local hackerspace, I made copious use of the shell. The audience ranged from ambitious youngsters to greybeards who ate their teeth on decades of Linux. And yet, I managed to surprise them with a tiny piece of bash.</p>
<p>Here's the piece:</p>
<pre><code>mv /home/dcz/file.md{,.bak}
</code></pre>
<p>Notice the brackets and the comma? This syntax turns the path into two paths:  <code>/home/dcz/file.md</code> and <code>/home/dcz/file.md.bak</code>. How could a shell-hater like me know of it, while true hackers didn't? I think it's a testament to shell's discoverability of features. (Hint: it's awful.)</p>
<h2>The tricks file</h2>
<p>It turns out that other people I know have a similar problem with the shell: they remember that they did something, but they can't easily find the way after some time. They create a file containing useful commands or incantations. I do as well, and I was asked to share it.</p>
<p>Most of those tricks will not directly apply to you. But they can serve as a base to whatever use case you have in mind, after slight adjustments. And of course I'm not responsible for any loss of data resulting from using my recipes.</p>
<p>So here goes!</p>
<h3>ARM emulation wth binfmt</h3>
<p>This one is useful before you want to run a binary compiled for the ARM architecture. It requires QEMU to be installed, and a chroot containing an ARM file system. I usually use podman to pull an arm64 container, so I don't have to create the file system myself. I use it to build binaries for the Librem 5 without having to set up the cross-compiler.</p>
<pre><code>echo ':aarch64:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7:\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff:/usr/bin/qemu-aarch64-static:F' &gt; /proc/sys/fs/binfmt_misc/register

echo ':arm:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/bin/qemu-arm-static:F' &gt; /proc/sys/fs/binfmt_misc/register
</code></pre>
<p>The podman command for armhf is:</p>
<pre><code>podman run -ti -v /containers/build:/mnt/build:z --name=debian_armhf multiarch/debian-debootstrap:armhf-buster /bin/bash
</code></pre>
<h3>Copy a hard drive with bad sectors (untested)</h3>
<p>When a hard drive has bad sectors, the reading process will eventually hit an area that cannot be read immediately. Most copying programs will retry several times, or abort outright. This is wasting lots of time if you have thousands of sectors. Because I am diligent with backups, when my hard drive failed, I wanted to read it back to save the 2 hours needed to recreate some workspaces. A couple of missing files would not have hurt. So I came up with this command to read back everything, but without trying to hard.</p>
<p>Note that I specify disks by ID which contains the serial number. I <strong>never</strong> use <code>/dev/sdx</code> because it's too easy to make a mistake and overwrite something important.</p>
<p>CAUTION: I think I actually wrote this and never tested it.</p>
<pre><code>dd iflag=fullblock conv=sync,noerror bs=1M if=/dev/disk/by-id/ata-Hitachi_xxx of=/dev/disk/by-id/ata-Hitachi_yyy
</code></pre>
<h3>Trace file access</h3>
<p>Applications break in thousand ways. Sometimes they crash outright, or don't pick up the correct configuration, but they won't tell you what they were actually expecting. For those cases, I take them on the <em>strace</em> dissection table and observe their calls <em>in vivo</em>.</p>
<p>Here's the command that will tell you everything about how an application uses files. Most useful for figuring out where it attempts to look for configuration files, or dynamic libraries (the <code>LD_LIBRARY_PATH</code> env var), or pkgconfig instructions (<code>PKG_CONFIG_DIR</code>). One of my most used recipes.</p>
<pre><code>strace -fe trace=access,creat,open,openat,unlink,unlinkat --decode-fds=all
</code></pre>
<h3>Socat</h3>
<p><em>Socat</em> is a great tool to translate streams from one form to another. It can listen or connect, convert between Unix pipes or TCP sockets. Here's a basic forwarder from localhost to another:</p>
<pre><code>socat TCP-LISTEN:8000,fork TCP:192.168.4.4:8000
</code></pre>
<h3>Git change authors</h3>
<p>There are reasons you might forget to set your git identity. A new computer, or a computer where you want to work under multiple identities. You might realize that you had used the wrong email for this repository. Here's a recipe which alters the history of any repository by replacing your name on all commits:</p>
<pre><code>git filter-repo --name-callback 'return b&quot;dcz&quot;' --email-callback 'return b&quot;dcz@example.com&quot;'
</code></pre>
<h3>Git change identity</h3>
<p>Here's a related one. If you have different identities, you might have different ssh keypairs for each. Here's how you set a custom keypair for the current repository. Don't forget to ssh-add.</p>
<p><strong>WARNING</strong>: Opsec is hard. I think SSH will still disclose your other identities to the server when connecting.</p>
<pre><code>git config --local core.sshcommand &quot;ssh -i ~/.ssh/id_rsa.dcz -F /dev/null&quot;
</code></pre>
<h3>Git serve</h3>
<p>When you want to share your local git repository to the wide world, you need git-daemon installed, and the following entry in the .git/config file:</p>
<pre><code>[alias]
    serve = !git daemon --reuseaddr --verbose --base-path=. --export-all ./.git
</code></pre>
<h3>GPT partitioning tool</h3>
<p>This is the simplest recipe I have, but not the least useful.</p>
<pre><code>sgdisk
</code></pre>
<h3>Search for text files</h3>
<p>The <em>grep</em> program will by default search for the given string in all files, including binary files. This is almost never what I want: when I look for text, I want to find text, and including binaries slows the search for nothing. Even worse, even when my string exists in the binary file, all <em>grep</em> says is &quot;binary file matched&quot;, dropping any context. Completely useless. I use this to search only in text files:</p>
<pre><code>grep -r -I text dir
</code></pre>
<h3>Warm-plug of SATA drives</h3>
<p>You can plug in SATA hard drives when the computer is running. But they will not always get detected. This is how you poke the bus to try to find newcomers (try host0 through host3):</p>
<pre><code>echo '- - -'  &gt; /sys/class/scsi_host/host0/scan
</code></pre>
<p>Then you can kick out the drive again by doing:</p>
<pre><code>udisks --detach /dev/disk/by-id/ata-xxxx
</code></pre>
<p>and unplug it.</p>
<h3>Show most recent messages in Gajim</h3>
<pre><code>sqlite3 ./.local/share/gajim/logs.db
select * from logs where message != '' order by time desc limit 10;
</code></pre>
<h3>Show network connections in real time</h3>
<p>It works also on OpenWRT. Needs tcpdump on the remote host and etherape on the host which inspects the traffic.</p>
<pre><code>ssh root@192.168.5.1 tcpdump -n -i br-lan -w - not port 22 | etherape -m ip -r -
</code></pre>
<h3>Nmap ping scan</h3>
<p>Check who's up on the local network. A regular scan with nmap takes too much time.</p>
<pre><code>nmap -sn -PE 192.168.5.0/24
</code></pre>
<h3>Graphical podman</h3>
<p>If you want to sandbox games in your podman, it's possible with some tricks:</p>
<p>Wayland:</p>
<pre><code>podman run -e XDG_RUNTIME_DIR=/tmp            -e WAYLAND_DISPLAY=wayland-0     --security-opt label=disable       -v $XDG_RUNTIME_DIR/wayland-0:/tmp/wayland-0:rw -ti debian:buster /bin/bash
</code></pre>
<p>X11:</p>
<pre><code>podman run -ti -v /tmp/.X11-unix:/tmp/.X11-unix:rw --security-opt label=disable game_container
</code></pre>
<h3>Ripping a CD</h3>
<p>I never actually tried this one. It's supposed to read all the data, and produce a more complete file than just ISO. ISO does not contain audio tracks, for example:</p>
<pre><code>cdrdao read-cd --read-raw --read-subchan
</code></pre>
<h3>Touchpad</h3>
<p>Ever played a shooter game on the touchpad? Well, I was once bored enough on the train… But I needed to to make it possible to aim and move first. I did it by not disabling the touchpad while the keyboard is in use. First, check the device number in <code>xinput list</code>, and then:</p>
<pre><code>xinput set-prop 11 298 0
</code></pre>
<h3>Unrar</h3>
<p>RAR files have an Libre decompressor, but it doesn't work on all files. I refuse to pollute my system with such tools, so I keep them contained in a container.</p>
<pre><code>podman run --rm -v $PWD/unrar:/files:z maxcnunes/unrar:latest unrar e -or -r myfile.rar
</code></pre>
<h3>Listening to WOL packets</h3>
<p>Nothing tricky in this one.</p>
<pre><code>tcpdump -i enp9s0 'ether proto 0x0842'
</code></pre>
<h3>Widen</h3>
<p>I don't use fixed width fonts in editors. Proportional fonts are mostly superior, but they have a downside. Sometimes &quot;ASCII&quot; diagrams are useful to quickly sketch an idea visually, and they are typically drawn assuming a fixed font on the reader end, so they appear as misaligned garbage when I see them.</p>
<p>Instead of assuming a fixed font, the authors could <em>ensure</em> fixed font. This is what this Python function does: it converts ASCII to fixed width, on the <em>character</em> level.</p>
<pre><code>WIDE_MAP = {i: i + 0xFEE0 for i in range(0x21, 0x7F)}
WIDE_MAP[0x20] = 0x3000

def widen(s):
    &quot;&quot;&quot;
    Convert all ASCII characters to their full-width counterpart.

    &gt;&gt;&gt; print widen('test, Foo!')
    ｔｅｓｔ，　Ｆｏｏ！
    &gt;&gt;&gt;
    &quot;&quot;&quot;
    return s.translate(WIDE_MAP)
</code></pre>
<h2>Video tricks</h2>
<p>Ffmpeg is so versatile that it warrants an entire section.</p>
<h3>Keyframes</h3>
<p>Dropping all frames from the video except for keyframes results in a movie at normal speed but a very low FPS and lower size.</p>
<p>For video streams encoded in h264:</p>
<pre><code>ffmpeg -i test.mkv -c copy -map v -bsf:v &quot;filter_units=remove_types=1&quot; key.mkv
</code></pre>
<p>This one is h265, I think:</p>
<pre><code>ffmpeg -i IN.MOV -c:v copy -c:a copy -bsf:v noise=drop='not(key)' out.mkv
</code></pre>
<h3>Metadata</h3>
<p>To write all metadata (global, video, audio) to a file, use:</p>
<pre><code>ffmpeg -i in.mp4 -c copy -map_metadata 0 -map_metadata:s:v 0:s:v -map_metadata:s:a 0:s:a -f ffmetadata in.txt
</code></pre>
<p>To add all metadata from a file, use:</p>
<pre><code>ffmpeg -i in.mp4 -f ffmetadata -i in.txt -c copy -map_metadata 1 out.mp4
</code></pre>
<h3>Reencode as h265</h3>
<p>Those settings worked for me to get good quality:</p>
<pre><code>ffmpeg -i ./XXX.MP4 -acodec copy -vcodec libx265 -bufsize 2M -maxrate 1M ./20.mkv
</code></pre>
<h3>Extract the music</h3>
<p>Losslessly. This commands works for music encoded using the AAC codec, but other codecs mean you have to use different output formats.</p>
<pre><code>ffmpeg -i ../Music/movie -vn -acodec copy -bsf:a aac_adtstoasc out.m4a
</code></pre>
<h3>Replace the sound track</h3>
<p>When your camera can't record the sound in a good enough quality and you have to use a separate sound recorder, you will end up with a video file and a music file. This is how to synchronize the times and merge them:</p>
<pre><code>ffmpeg -itsoffset -00:00:59.8 -i an.wav -i XXX.MOV -ss 00:00:00 -c:a flac -c:v copy -map 0 -map 1 -disposition:a:1 none -map_metadata 1 out.mkv
</code></pre>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>User-oriented desktop, part 1</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/desktop-part1/"/>
       <id>tag:dorotac.eu,2023-05-01:posts/desktop-part1</id>
       <updated>2023-05-01T14:00Z</updated>
       <published>2023-05-01T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>User-oriented desktop, part 1</h1>
<p>Are you unhappy about current computer user interfaces? Do you mourn the unrealized potential of the desktop metaphor? Are tears of nostalgia coming to you when you see Windows 95?</p>
<p>I those feelings are real. You might be seeing something more than just personal taste. There's something actual that we have lost over time – possibly something few people ever saw – that is rubbing against the loudest complainers about user interfaces.</p>
<p>I don't know exactly what it is, so I'm writing this series of essays to narrow it down. Let's start with</p>
<h2>Windows 95</h2>
<p>What is the first thing you notice when you launch a couple programs on this ancient version of Windows? No, I don't mean the <a href="https://www.interaction-design.org/literature/topics/skeuomorphism">skeuomorphic</a> look-and-feel. Tell me the second thing! That's right: most windows look similar to every other window. the control panel, the start menu, Word, Paint Shop Pro, ACDSee, Creatures 2, WinRAR. Apart from the occasional program from Windows 3.11, and the notable exception of Winamp, they are following the same theme, and they will all obey if you change your scroll bar sizes or button colors.</p>
<p><img src="win95cpl.png" alt="Windows 95 control panel" /> Image credit: <a href="http://toastytech.com/guis/">Nathan</a></p>
<p><img src="Paint%20Shop%20Pro%204.png" alt="Paint Shop Pro 4 on Windows 95" /> Image credit: <a href="https://winworldpc.com/product/paint-shop-pro/4x">WinWorld</a></p>
<p>Now let's come back to today and take a look at Windows. From <a href="https://ntdotdev.wordpress.com/2023/01/01/state-of-the-windows-how-many-layers-of-ui-inconsistencies-are-in-windows-11/">reviews</a>, I gathered that this is no longer the case. Steam, Discord, iTunes all look different, and they also look different than builtin software like the Control Panel, which looks different from another builtin, the Microsoft Store.</p>
<p><img src="download-itunes-windows-10.jpg" alt="Itunes and Microsoft Store on Windows 10" /> Image credit: <a href="https://pureinfotech.com/install-itunes-windows-10/">Pureinfotech</a></p>
<p><img src="controlpanel6-1170x658.png" alt="Two control panel windows on Windows 10" /> Image credit: http://windows10quick.com/</p>
<p>Linux does not fare a lot better, with 3 different look-and-feels on my computer. GTK2 applications differ from GTK3, which look different from Qt, and all of those have a different setup for the user to apply themes (which carry over only imperfectly). On top of that, I have the occasional totally-out-of-place application like Blender, Cura or RawTherapee, which come with their own look-and-feel, and might or might not support a bespoke theme setting system.</p>
<p><img src="linux.png" alt="Linux applications: Nemiver on GTK3, Antimicrox on Qt5, Blender, and RawTherapee" /></p>
<p>I'm not entirely pleased by this. Spending time making every other app not burn my eyes out is not my idea of a good time. Why do I bother? Because I have needs which are fulfilled by having consistency. Let's put down some factors promoting → consistency in a diagram.</p>
<p><img src="consistency.svg" alt="Factors around consistency" /></p>
<p>See, consistency promotes ease of use when I know what to look for and what to expect. It promotes my self-expression when tied to my custom themes. That boosts my feeling of ownership, which is also a need of mine.</p>
<p>Then why is the diagram not showing any reason to be inconsistent, you ask? It's because it's missing another relationship: factors inhibiting ⊣ consistency.</p>
<p><img src="consistency-whole.svg" alt="Both promoting and inhibiting factors related to consistency" /></p>
<p>One thing is certain: making software consistent with your competition's software makes your brand blend together with the competitor's. Your software becomes less distinct and memorable, which is a grave sin in an economy as focused on attention and loyalty as ours. So let's redraw the diagram from the perspective of a developer who has needs and who chooses what to influence o⊣ o→ to fulfil those needs.</p>
<p><img src="branding.svg" alt="Feedback loops round consistency, with developer's needs considered" /></p>
<p>There are two main forces which influence consistency: the need to attract users mostly promotes consistency (and works in favor of user's needs), and the need to stand out as a brand, which is entirely against consistency and the user. At the end of the chain, increases in brand recognizability provide a feedback signal ⤏ that the need to have the brand distinct is getting fulfilled. Similarly with ease of use and the need to satisfy users.</p>
<p>This makes our model contain two opposing feedback loops! They balance each other to some extent. Branding cannot take over to abolish consistency entirely because it damages ease of use to an extent that nobody wants to use the software. The need to satisfy users cannot achieve peaks of consistency if the software has to compete with others, because it makes the software forgettable, and at some point it's more efficient to get more users by investing in branding rather than ease of use.</p>
<p>Those feedback loops can be viewed as working for two masters: one for the user, and one for the developer. And, watching the progress of user interfaces, I think we shifted to the developer more. The branding feedback loop is especially strong now, and keeps consistency levels low.</p>
<p>As a user, I'm asking myself the question: how can the balance be shifted back towards the user?</p>
<h2>What would ultimate consistency look like?</h2>
<p>If the need for branding comes from competition, could more cooperation break the feedback loop's power? And what other factors influence this network of relationships?</p>
<p>In part 2, we'll expand this simplified picture with additional factors.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Keypress entropy</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/password/"/>
       <id>tag:dorotac.eu,2023-04-20:posts/password</id>
       <updated>2023-04-20T14:00Z</updated>
       <published>2023-04-20T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Keypress entropy</h1>
<p>A good password is hard to type. Is it a passphrase? My hand is strained from just looking at <a href="https://en.wikipedia.org/wiki/Diceware">diceware</a>'s suggested passphrase <code>NinetiethStunnedPurityStumbleDorsalReps</code> (39 chars, 45 Keypress). <a href="https://www.passwordstore.org/">pass</a> is not much better with <code>''~E;I|pWZ)K:\?~=U&amp;q-|8^V</code>. That's  25 chars, but 17 shift presses on a US layout. Together it's 42 presses, almost as many as the passphrase. It's not a problem for <em>pass</em>. It's a manager, and it will remember the passwords for me so that I don't have to type it.</p>
<p>But I still need to type the master password. And it needs to be <strong>good</strong>, with 128 bits of <a href="https://en.wikipedia.org/wiki/Password_strength#Entropy_as_a_measure_of_password_strength">entropy</a>  (entropy measures how many choices were made to generate the password) and a glittery wrapper. So I should just take the one <em>pass</em> gives me, right?</p>
<p>No, not right. There's no reason I should be mashing shift like crazy. It only doubles the available characters - 1 extra bit of entropy for each. For the 25-characters password, the shift key is responsible for 25 bits of entropy (about 6 characters), but for 17 extra key presses. That's not a good deal at all.</p>
<p>The wrong choice of a password generator hurts multiple times every day, when I type in the various passwords: my computer unlock password, the one to unlock the password manager, another computer, and whenever I need to install some new package or just use <em>sudo</em>. That's too much typing to rely on badly generated passwords.</p>
<h2>Generating key presses</h2>
<p>But I know how to generate passwords specifically for typing them in. It's to switch focus from characters to key presses. Just like the choice of a character from a given set represents entropy, so does choosing a key from the available ones. So instead building the password by choosing among characters:</p>
<blockquote>
<p>a b c ... ABC ... 1 2 3 ... ! ? , ...</p>
</blockquote>
<p>let's choose among keys:</p>
<blockquote>
<p>` 1 2 ... q w e ... shift ...</p>
</blockquote>
<p>Let's forget for a moment about how many thousands of symbols Unicode contains. No password manager generates ⸘, because they <em>do</em> typically make a concession to typists by limiting themselves to some version of the ASCII set of characters.</p>
<p>The actual difference between generating characters and keys is that there are about 100 characters the typical US keyboard layout can produce, using about 50 keys. See that 100/50 = 2? That's the extra bit of entropy we're giving up per character. We have to make up for it if we don't want to trade security for less button mashing.</p>
<h2>A password generator</h2>
<p>So how do we generate a password that still has our 128 bits of entropy, but isn't as buttony as the naively generated one?</p>
<p>We write our own generator, of course.</p>
<p><a href="hasword.rs">Hasword</a> is a short Rust program I wrote for this purpose. (It's listed at the end of this post.) It generates a password, and tells you how many bits of entropy were used to generate it, the length in characters, and the number of key presses.</p>
<p>Here is an example of 128-bits-of-entropy passwords generated naively, from all characters:</p>
<pre><code># cargo run 128
g`{m&gt;|.4-&gt;Q1.^(&amp;n7&lt;J
Entropy: 131.39711
Chars: 20
Presses: 30
Modifiers: 10
</code></pre>
<p>And here's the same, except generated from key presses:</p>
<pre><code># cargo run 128
t i8ep51\:,.9y0me 94;z
Entropy: 129.07819
Chars: 22
Presses: 23
Modifiers: 1
</code></pre>
<p>Oh no, the lower per-character entropy in the typist's password makes it 2 characters longer than the naive password. But it's a <em>total win</em> in the button mashing department: it's 10% longer but it needs only 23/30≈3/4 the key presses without being any less secure!</p>
<p>Now, the only problem left is to commit it to memory…</p>
<hr />
<p>Here's the source code for the password generator. I use rdrand to block as many ways as possible for the attacker to guess the random numbers, but I do not guarantee that I did anything right here. You use it on your own responsibility!</p>
<pre><code>/* COPYING: allowed under GNU GPL v3 or later. 
To build, do `cargo init` and add this to Cargo.toml:
[dependencies]
rand = &quot;0.8&quot;
rdrand = &quot;*&quot;
*/
use rand;
use rand::seq::SliceRandom;
use std::env;
use std::iter::repeat;
use std::str::FromStr;

enum C {
    Char(char),
    Shift,
}

use C::*;

const PLAIN: &amp;[C] = &amp;[
    Char('`'), Char('1'), Char('2'), Char('3'), Char('4'), Char('5'),
    Char('6'), Char('7'), Char('8'), Char('9'), Char('0'), Char('-'),
    Char('='), Char('\\'), Char(']'), Char('['), Char('p'), Char('o'),
    Char('i'), Char('u'), Char('y'), Char('t'), Char('r'), Char('e'),
    Char('w'), Char('q'), Char('a'), Char('s'), Char('d'), Char('f'),
    Char('g'), Char('h'), Char('j'), Char('k'), Char('l'), Char(';'),
    Char('\''), Char('/'), Char('.'), Char(','), Char('m'), Char('n'),
    Char('b'), Char('v'), Char('c'), Char('x'), Char('z'), Char(' '),
    Shift,
];

const SHIFTED: &amp;[char] = &amp;[
    '~', '!', '@', '#', '$', '%', '^', '&amp;', '*', '(', ')', '_', '+', '|', '}', '{', 'P',
    'O', 'I', 'U', 'Y', 'T', 'R', 'E', 'W', 'Q', 'A', 'S', 'D', 'F', 'G', 'H', 'J', 'K',
    'L', ':', '&quot;', '?', '&gt;', '&lt;', 'M', 'N', 'B', 'V', 'C', 'X', 'Z',
];

fn all_chars() -&gt; Vec&lt;char&gt; {
    PLAIN[0..(PLAIN.len() - 1)].iter()
        .map(|c| match c {
            Char(c) =&gt; *c,
            _ =&gt; panic!(),
        })
        .chain(SHIFTED.iter().map(|c| *c))
        .collect()
}

/// Generates random characters. Considers entropy from each character.
fn chars_generator&lt;R: rand::RngCore&gt;(mut r: &amp;mut R)
    // entropy, keypresses, character
    -&gt; (f32, u8, char)
{
    let chars = all_chars();
    let bits_entropy_char: f32 = (chars.len() as f32).log2();
    
    let c = chars.choose(&amp;mut r).unwrap();
    let presses = 1 + SHIFTED.iter()
        .find(|p| *p == c)
        .is_some() as u8;
    
    (bits_entropy_char, presses, *c)
}

/// Same, but considers entropy from each keypress.
fn typing_generator&lt;R: rand::RngCore&gt;(mut r: &amp;mut R)
    -&gt; (f32, u8, char)
{
    let bits_entropy_plain: f32 = (PLAIN.len() as f32).log2();
    let bits_entropy_shifted: f32 = (SHIFTED.len() as f32).log2();

    match PLAIN.choose(&amp;mut r).unwrap() {
        Char(c) =&gt; (bits_entropy_plain, 1u8, *c),
        Shift =&gt; (
            bits_entropy_plain + bits_entropy_shifted,
            2,
            *SHIFTED.choose(&amp;mut r).unwrap()
        ),
    }
}

fn main() {
    let needed_entropy = env::args().skip(1).next()
        .and_then(|s| f32::from_str(&amp;s).ok())
        .unwrap_or(128.0);
    
    let mut r = rdrand::RdRand::new().unwrap();

    let gen = typing_generator;
    // uncomment the following line to use naive mode
    //let gen = chars_generator;

    let seq = repeat(()).map(|()| gen(&amp;mut r))
    
    let mut e = 0.0;
    let mut chars = 0;
    let mut presses = 0;
    for (en, p, c) in seq {
        e += en;
        presses += p;
        chars += 1;
        print!(&quot;{}&quot;, c);
        if e &gt; needed_entropy {
            break;
        }
    }
    println!();
    println!(&quot;Entropy: {}&quot;, e);
    println!(&quot;Chars: {}&quot;, chars);
    println!(&quot;Presses: {}&quot;, presses);
    println!(&quot;Modifiers: {}&quot;, presses - chars);
}
</code></pre>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Maps à la carte</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/quick_maps/"/>
       <id>tag:dorotac.eu,2023-01-26:posts/quick_maps</id>
       <updated>2023-01-26T14:00Z</updated>
       <published>2023-01-26T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Maps à la carte</h1>
<p>Those who follow <a href="https://fosstodon.org/@dcz">my Mastodon account</a> will know that I have a thing for bicycle trips. The bicycle is a technology that gave humans <a href="https://www.theatlantic.com/technology/archive/2014/06/the-technology-craze-of-the-1890s-that-forever-changed-womens-rights/373535/">immense freedom</a> back when it appears, and it remains one of my most favourite ways to meet the world on my terms.</p>
<p>Sometimes unrestricted freedom is the best, and sometimes we need to settle on a direction. And the best thing balancing freedom and directions on a bike is a map.</p>
<p><img src="demo.png" alt="A plain map using only 4 colors: black for roads and labels, green for greenery, blue for water, and red for tow name" /></p>
<p>Maps are typically made for a group of people. Thanks to OpenStreetMap you can take a look at a map drawn for <a href="https://www.openstreetmap.org/">everyone</a>, for the <a href="https://www.opencyclemap.org/">cyclist</a>, or even for a <a href="https://hikebikemap.org/">biker and hiker</a>. There's a service for generating <a href="https://print.get-map.org/">maps ready for printing</a>, too. However close they come, there are no easy tools to make a map for <em>me</em>, according to <em>my</em> preferences, showing things <em>I</em> need to plan my cycling.</p>
<blockquote>
<p>Yes, I'd like a map made, please. For cycling. Make the asphalt local roads thickest. Motorways? Skip them. Actually, draw them the same way as fences. And canals. Mark footpaths too, please. And information posts, and tourist information points. While we're at it, mark monuments in a different color. Oh, and don't forget about bike repair shops, public water sources and grocery stores. Make them red! Mountain springs? Well, only if there aren't any wells nearby...</p>
</blockquote>
<p>– me, ordering a cycling map in my imagination</p>
<p>Thankfully, I'm a problem solver, and I know a solution must exist.</p>
<h2>Playing with crayons</h2>
<p>If only I could draw the map on the computer like I drew fictional maps with crayons as a kid. A stroke there, a pictogram there, each takes two seconds total.</p>
<p>The maps I mentioned before use some sort of styles to generate the differently colored pictures out of the same data. The services are made for serving maps, rather than editing them, though. Tools like <a href="https://wiki.openstreetmap.org/wiki/Mapnik">Mapnik</a> require lots of set up, and even after it's done, editing the styles presumably isn't so comfortable.</p>
<p>What tools do the serious style designers use? Such a tool would put the ease of changing the looks at the forefront, even sacrificing speed…</p>
<p><em>A while later</em></p>
<p>Well, I found it. The project at the core of it is understaffed, and written in JavaScript, but works amazingly well. It's called <a href="https://github.com/tilemill-project/tilemill">TileMill</a>, it's open source, and the maintainers are looking for fresh contributors!</p>
<p>Here's how to set up everything and render a map of your region.</p>
<h2>TileMill</h2>
<p>As a first step, I recommend creating a separate user on your machine. The project uses NPM, and that wil leave lots of cruft in your home directory. In addition, we use the dreaded curl-to-shell pattern, which better be contained away from valuable data.</p>
<p>I'm using Fedora 37 as the base system. Fedora doesn't ship with nvm, and the required version of npm doesn't work, so let's fix this:</p>
<pre><code>curl https://raw.githubusercontent.com/creationix/nvm/master/install.sh | bash
source ~/.bashrc
</code></pre>
<p>Now, install some needed packages:</p>
<pre><code>sudo dnf install git wget osm2pgsql postgis postgresql-server postgresql-contrib vim
</code></pre>
<p>Then follow the <a href="https://tilemill-project.github.io/tilemill/docs/install/">installation instructions</a> with some changes:</p>
<pre><code>git clone https://github.com/tilemill-project/tilemill
cd tilemill
nvm install lts/carbon
nvm use v8.17.0
npm install # this will take a while
npm start
</code></pre>
<p>Ignore the rest of the installation instructions. Now we have an empty instance of TileMill running on <a href="http://localhost:20009">http://localhost:20009</a>. Sadly,  have no idea how to change the ports :( Anyway, navigate there with your browser and start a new project. Uncheck &quot;default data&quot;, we'll use our own.</p>
<p><img src="new_project.png" alt="A Web browser showing localhost:20009. on the left a list with &quot;Projects&quot; selected, central a &quot;New project&quot; button above a project tile &quot;Open Streets, DC&quot;" /></p>
<p>You'll see an empty project. Now that we have the renderer running, let's change tracks and prepare some data.</p>
<p><img src="empty_project.png" alt="A panel with multiple options on the left, showing &quot;Editor&quot; selected. Left half occupied with a blue expanse and map zoom controls. Right half is an editor with style.mss file open. Inside, there's a Map {background-color: blue}" /></p>
<h2>OpenStreetMap Data</h2>
<p>You won't draw a map if you don't know anything about your area. Thankfully, collecting geographical data is what the OpenStreetMap project is best at. Don't be fooled, the map is only a side thing. It should really be called OpenStreetData.</p>
<p>Let's get the data of region that interests us from <a href="https://download.geofabrik.de/">Geofabrik</a>. I'll choose Münster for demonstration purposes: Germany is <em>dense</em> with all kinds of detailed data added by volunteers, and Münster in particular is famous for <a href="https://www.thelocal.de/20150220/cycling-survey-best-cities-germany-association-gives-mnster-top-marks">its bike-friendliness</a>. Now download the .osm.pbf file:</p>
<pre><code>https://download.geofabrik.de/europe/germany/nordrhein-westfalen/muenster-regbez-latest.osm.pbf
</code></pre>
<p>TileMill does not support loading it directly, but it supports something better:connecting directly to a geographic database. The cost of that is that we need to set up the database ourselves.</p>
<p>A while back, we installed the necessary packages: Postgresql with Postgis. Now it's time to configure them.</p>
<pre><code>sudo postgresql-setup --initdb --unit postgresql
</code></pre>
<p><strong>CAUTION</strong>: it's possible that I messed up the access controls here. An attacker on the same network might be able to get in your (data)base and kill your landmarks.</p>
<p>Edit the file <code>/var/lib/pgsql/data/pg_hba.conf</code> to widen permissions. Afterwards, the relevant part should look more like this:</p>
<pre><code># TYPE  DATABASE        USER            ADDRESS                 METHOD

# &quot;local&quot; is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
</code></pre>
<p>Create the database &quot;osm&quot;, to keep our geographical data.</p>
<pre><code>systemctl restart postgresql
psql -U postgres -c &quot;create database osm;&quot;
psql -U postgres -d osm -c 'CREATE EXTENSION postgis;'
osm2pgsql -c -G -U postgres -d osm ./muenster-regbez-latest.osm.pbf # you can list additional files here!
</code></pre>
<p>This will take a moment, but if it succeeds, your data is now safely stored. Test your network access:</p>
<pre><code>psql -h 127.0.0.1 -p 5432 -U postgres -d osm
</code></pre>
<p>If this succeeds, then TileMill will be able to access the database, too.</p>
<h2>My own map</h2>
<p>Go back to the browser, and find the &quot;layers&quot; icon. It's shown on the picture:</p>
<p><img src="layers_icon.png" alt="Four vertically stacked icons. A triangle tip ends on the last one, looking like a stack of papers." /></p>
<p>Add a new layer, select PostGIS, and give it &quot;dbname=osm host=localhost port=5432 user=postgres&quot; in the &quot;Connection&quot; field. Write &quot;highway&quot; in &quot;Class&quot;. Finally, enter the query in &quot;Table or subquery&quot;:</p>
<pre><code>(select * from planet_osm_line where highway!='') as lines
</code></pre>
<p><img src="layer_dialog.png" alt="A form with 3 options above, PostGIS selected. It contains fields: ID, Class, Connection, Table or subquery." /></p>
<p>We're almost there. Now try this style:</p>
<pre><code>Map {
  background-color: #b8dee6;
}

#highway {
  line-color: #808080;
  line-width: 1.0;
}
</code></pre>
<p>Enter it in <code>map.mss</code>, and press &quot;Save&quot; (or Ctrl+S). Suddenly…</p>
<p><img src="blob.png" alt="A crop of the empty project view, containing only the left half. There's now a small gray shape among the blue. Zoom level is at 3." /></p>
<p>What is that? Zoom in, please.</p>
<p><img src="highways.png" alt="The whole empty projct view again. The blue is criss-crossed with gray lines. Zoom level is at 10. The style.mss file now contains the #highway section from above." /></p>
<p>Now, this looks like the communication network in the region!</p>
<h2>Come on, paint my world</h2>
<p>I'm not going to give you a hand-holding here, but I'll leave you with a couple useful tips.</p>
<ol>
<li>
<p><a href="https://tilemill-project.github.io/tilemill/docs/crashcourse/styling/">TileMill's documentation</a> is quite decent for educating you how to style things.</p>
</li>
<li>
<p>Remember that your database holds 3 types of objects: points, lines, and polygons. Those don't have to be the same as in OSM, and, in fact, I don't know how relations are represented, if at all.</p>
</li>
<li>
<p>You can filter objects using CSS styles, like this:</p>
</li>
</ol>
<pre><code>#highways[highway='path'][bicycle='permit'] { foo; }
</code></pre>
<ol start="4">
<li>But dedicated layers are faster at filtering data:</li>
</ol>
<pre><code>(select * from planet_osm_line where highway='path' and bicycle='permit') as bikes
</code></pre>
<ol start="5">
<li>Take a good look at your database with <code>psql -h 127.0.0.1 -p 5432 -U postgres -d osm</code>.</li>
</ol>
<p>Tables:</p>
<pre><code>osm=# select * from # I pressed tab here
geography_columns    pg_toast.            planet_osm_roads
geometry_columns     planet_osm_line      public.
information_schema.  planet_osm_point     spatial_ref_sys
pg_catalog.          planet_osm_polygon  
</code></pre>
<p>Tag values:</p>
<pre><code>osm=# select distinct highway from planet_osm_line where highway!='';
    highway     
----------------
 trunk
 road
 disused
 footway
 cycleway
 services
 secondary
 traffic_island
 tertiary
 abandoned
</code></pre>
<p>Example data (caution, long lines):</p>
<pre><code>select * from planet_osm_roads limit 5;
</code></pre>
<ol start="6">
<li>Use more layers!</li>
</ol>
<h2>Printing</h2>
<p>This application is not perfect for printing. It doesn't have a built-in rose of winds, nor a scale, and you're stuck with the Web Mercator projection. <a href="https://qgis.org/">QGIS</a> is way better in that respect, but also creating styles in QGIS is a lot more painful.</p>
<p>In a pinch, however, it's sufficient, and exporting is easy, too. The export menu is on the top-right. Make sure to aim at 600pixels per 2.54cm, and to select a zoom level that makes things readable.</p>
<p>That's it! Have fun with your new set of crayons!</p>
<h2>Example</h2>
<p>The picture in the first section is from a live demo I performed at the local hackspace. Here's the style for it:</p>
<pre><code>Map {
  background-color: #fff;
}
 
#lines {
  line-color: #808080;
  line-width: 0.0;
  
  [waterway='stream'] {
    line-color: #aaf;
    line-width: 1.0;
  }
  
  [waterway='river'] {
    line-color: #aaf;
    line-width: 2.0;
  }

  [highway='path'] {
    [bicycle='permit'],
    [bicycle='yes'],
    [bicycle='designated'],
    [bicycle='official'],
    [bicycle='permissive'],
    [bicycle='use_sidepath'] {
      line-color: #444;
      line-width: 1.0;
    }
  }
  //[bicycle='permit'],
  //[bicycle='yes'],
  [bicycle='designated'],
  [bicycle='official'],
  //[bicycle='permissive'],
  [bicycle='use_sidepath'],
  [highway='cycleway']{
    line-color: #444;
    line-width: 2.0;
  }
  
  [highway='track'][surface='asphalt'][bicycle!='no'],
  [highway='service'],
  [highway='unclassified'],
  [highway='residential'],
  [highway='tertiary'] {
    line-width: 1.0;
  }
}

#lines {
  line-color: #808080;
  line-width: 0.0;

[surface='asphalt'] {
    //[bicycle='permit'],
    //[bicycle='yes'],
    [bicycle='designated'],
    [bicycle='official'],
    //[bicycle='permissive'],
    [bicycle='use_sidepath'],
    [highway='cycleway']{
      line-width: 2.0;
    }
  }
}
#roads {
  line-color: #808080;
  line-width: 0.0;/*
  [highway='secondary'] {
    line-width: 2.0;
  }
  */
  [highway='primary'],
  [highway='trunk'],
  [highway='motorway'] {
    line-color: #d8c1b1;
    line-width: 2.0;
  }
}

#toiletten {
  [amenity='toilets'] {
    marker-width: 6;
    marker-fill: #070;
    marker-line-width: 0;
  }
  [amenity='drinking_water'] {
    marker-width: 6;
    marker-fill: #700;
    marker-line-width: 0;
  }
}

#rathaus {
  [amenity='townhall'] {
    polygon-fill: #f00
  }
}

#towns {
  [place='village'] {
    text-name: [name];
    text-fill: #f00;
    text-face-name: 'Droid Sans Regular';
  }
  [place='town'],
  [place='city']{
    text-name: [name];
    text-fill: #f00;
    text-face-name: 'Droid Sans Regular';
    text-size: 20;
  }
}
</code></pre>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Spots on the Sun and worn-out clothes</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/holes/"/>
       <id>tag:dorotac.eu,2023-01-17:posts/holes</id>
       <updated>2023-01-17T14:00Z</updated>
       <published>2023-01-17T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Spots on the Sun and worn-out clothes</h1>
<p>Have you seen the <a href="https://xkcd.com/2725">latest XKCD</a>? It imagines a world where the Sun gets completely covered with dark spots, as opposed to only partially:</p>
<p><a href="https://xkcd.com/2725"><img src="https://imgs.xkcd.com/comics/sunspot_cycle.png" alt="A graph of a cyclical relationship between sunspot number and time" /></a></p>
<p>Can you see the surprising observation? The more black you add, the more spots you have. Until a certain point, where adding more black causes spots to merge. If this problem sounds remote, imagine a shirt. It starts out with 4 holes, but as it accumulates tears, it gains extra holes. Keep wearing it, and the holes will grow in size, eventually merging, until your shirt will be as good as new! At least when it comes to the number – but not the size – of holes.</p>
<p><img src="holes.jpg" alt="A blue textile full of holes" /></p>
<p>But xkcd's plots didn't look quite right. Why is the sunspot number sinusoidal? And will it really get seriously dark while there's more than one spot?</p>
<p>I can't answer the first question definitively, as I don't know what model the author chose for darkness emerging. But I can check my own guess: if spots appear randomly and uniformly, as if a child ripped holes in your shirt deliberately, then I expect a sharp rise at first (unlike xkcd), then a slow incline until peak spots, and finally a slower decline… I guess. Like this:</p>
<p><img src="count.png" alt="Hand-drawn plot, just as described" /></p>
<p>But xkcd is explicit about the number of spots everywhere, so the second question is possible to answer definitively. Why do I think Sun wouldn't get so dark unless there's only one spot? It's because there's plenty of space for bright areas even with a couple big spots. When you add extra dark areas, it's going to be hard to keep them separated.</p>
<p><img src="bigspots.png" alt="A drawing of a circle with three separate dark squiggles and much white space between them" /></p>
<p>Okay, but how do I intend to test it? I don't have a miniature Sun, after all. If you're guessing that I'm going to use up my shirt supply for science, I'm also going to disappoint you. Of course, I'm using a simulation (code at the bottom).</p>
<p>The simulation makes some assumptions. First, we're poking white holes in a black canvas. The canvas is square and flat. Each poke leaves a white plus-shaped mark containing 5 pixels.</p>
<h2>Experimentation</h2>
<p>So I ran the simulation on a couple canvas sizes, gathered the number of pokes to reach perfect black, and watched the brightness as the peak was reached and crossed.</p>
<p>That's what I gathered:</p>
<p><img src="not_quite_xkcd.png" alt="A picture based on xkcd but with my own data" /></p>
<p>Looks different, doesn't it? The most striking is that the number of spots falls to 1 and remains there for the majority of the cycle. It's not sinusoidal at all. That's easy to explain: it's hard to hit a solid piece when there are only holes remaining, and my model doesn't try to be smart about it. It usually takes 9000 pokes to cover the entire picture, whereas a single hole emerges before the 3000 mark.</p>
<p>But that doesn't mean that it's always completely dark, either. As I predicted, reaching a single sunspot is not enough to put out the Sun. The grayness on the lower plot doesn't succumb to complete darkness immediately. But it's hard to see, so I prepared another plot:</p>
<p><img src="brightness.png" alt="A plot of brightness against number of spots" /></p>
<p>On this plot, we still have about 1/3 brightness when a single superspot takes over. That's not what xkcd expected! Another interesting thing is that we hit peak spots before dropping to 50% brightness.</p>
<p>Here's my methodology for the charts:</p>
<p>I don't have a firm opinion on how to model the disappearance of spots, so I just mirrored their appearance in the plots. I also assumed that spot sizes are uniform, and fudged the average spot size to be 1/10_000 of the whole (after looking at figure 4 in <a href="https://www.aanda.org/articles/aa/pdf/2005/45/aa3415-05.pdf">On the size distribution of sunspot groups in the Greenwich sunspot record 1874–1976</a>), giving a canvas 70px in size for a 5px spot.</p>
<p>Late nights bring weird ideas.</p>
<h2>Simulation code</h2>
<p>To run the simulation on a 40x40 pixels canvas, use <code>python3 holes.py 40</code>. This is <code>holes.py</code>:</p>
<pre><code>#!/usr/bin/env python3
# how many holes?

import cv2
import numpy as np
import itertools
import random
import sys

rand = random.randrange

w= int(sys.argv[1])
h=w

image=np.zeros((h,w,1),np.uint8)

try:
    for i in itertools.count():
        cv2.circle(image, (rand(0, w), rand(0, h)), radius=1, color=(255), thickness=-1)
        contours, hierarchy = cv2.findContours(image, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)

        if i % 10 == 0:
            cv2.imwrite('out_{}.png'.format(i), (255 - image))

        spots = [0 for c in contours if cv2.contourArea(c,True) &lt; 0]
        whites = cv2.countNonZero(image)
        print(', '.join(map(str, [i, len(spots), whites])))
        if i &gt; 100 and len(contours) == 1:
            break

except KeyboardInterrupt:
    cv2.imwrite('out.png', image)
</code></pre>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Graphical debugging on the Librem 5 using QtCreator</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/remote_debugging/"/>
       <id>tag:dorotac.eu,2022-11-04:posts/remote_debugging</id>
       <updated>2022-11-04T14:00Z</updated>
       <published>2022-11-04T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Graphical debugging on the Librem 5 using QtCreator</h1>
<p><strong>This blog post was paid for by <a href="https://puri.sm">Purism</a>, as part of the work I do on the <a href="https://puri.sm/products/librem-5/">Librem 5</a> mobile phone.</strong></p>
<p><img src="debugging.png" alt="QtCreator debugging the simple-cam program" /></p>
<p>I've always had a weak spot for debugging with graphical, user-friendly tools. Whenever I run bare GDB, I'm limited to what I can see at once, and learning about the choices available to me is a chore. Navigating the code is not integrated either. Clearly, I need some extra tooling to make GDB worthwhile.</p>
<p>QtCreator is such a tool. It checks the above boxes, and also some more. One feature in particular was taunting me for a while.</p>
<h2>Remote debugging</h2>
<p>Wouldn't it be great to compile stuff on my powerful desktop and debug it from my laptop on the sofa? Or, for that matter, to debug programs that access the camera on the Librem 5?</p>
<p>It isn't as easy as it could be. Sure, you could always attach to a GDB server ad-hoc, but then you have to deploy manually, start the GDB server, and you don't get to see the code you're debugging. Kinda useless.</p>
<p>There's also the mode where the debugger integrates with your project, taking care of deployment and startup automatically using SSH. Just like local debugging! Unfortunately, it quickly became very clear that it's meant for embedded devices. There's a strong focus on setting up a cross-compiling toolchain, compiling locally, and only then doing the remote debugging. Sorry, ain't nobody got time for that.</p>
<p>But QtCreator is flexible enough to allow for custom deployment, and I finally cracked it! This is how you get</p>
<h2>Remote debugging on the Librem 5</h2>
<p>Instructions for QtCreator 4.13.2.</p>
<h3>Devices and Kits</h3>
<p>Let's make QtCreator know that you have a device you want to take care of. Go to Go to Tools → Options → Devices → Devices tab → Add. You'll be presented with a choice, select &quot;Generic Linux device&quot;. Fill in choices in &quot;Connection&quot;. I chose the name &quot;evergreen&quot;. The next screen will let you choose the SSH key to use. If you're confused here, you should read up about SSH key-based authentication. QtCreator will open SSH connections to the Librem 5, so you don't want to get password prompts all the time (once you have the key, make sure it's in your keychain. <code>ssh-add ~/.ssh/my_key.pub</code> works for me).</p>
<p>QtCreator has a notion of &quot;kits&quot;, which define the set of libraries used for building a project. We'll need to define one on the new device for the Librem 5. Go to Tools → Options → Kits, and add a new one. Duplicating the default one also works, I think most fields don't matter. Choose the name for your kit (I chose &quot;evergreen&quot; again), change device type to &quot;Generic Linux Device&quot;, and select the device you just added. Make sure to select some compiler, otherwise you'll get complaints from QtCreator later.</p>
<p><img src="kit.png" alt="My &quot;evergreen&quot; kit" /></p>
<h3>Building and deployment</h3>
<p>You need to import a project if you want to compile and then debug things. I'm going to skip this – you can easily look up some tutorials online – and go straight to the details regarding remote deployment.</p>
<p>Deployment is configured in the &quot;projects&quot; section on the left panel. I select the project I'm interested in (<a href="https://git.libcamera.org/libcamera/simple-cam.git/tree/">simplecam</a>), and notice that, in addition to the default &quot;Desktop&quot; kit, there's also an &quot;evergreen&quot;.</p>
<p>There are two positions here: &quot;Build&quot; and &quot;Run&quot;.</p>
<h4>Build</h4>
<p>Let's rework &quot;Build&quot; first. Make sure that there are no build or clean steps. (If you intend to build the project manually, you can now skip to the &quot;Run&quot; subsection.)</p>
<p>We're going to use this step to synchronize git sources between the local checkout and the remote one. That means we need to create a checkout on the remote host. Adjust the following command to match your preference, and issue it:</p>
<pre><code>ssh purism@10.42.0.185 git init ~/simple-cam
</code></pre>
<p>Now, add the new repository as your remote in the local git repository (adjust to match the remote host):</p>
<pre><code>git remote add evergreen purism@10.42.0.185:~/simple-cam
git push evergreen master:foo
</code></pre>
<p>Now come back to the remote host and initialize the build directory.</p>
<pre><code># this is on purism@10.42.0.185
mkdir simbuild
cd simbuild
meson ~/simple-cam
</code></pre>
<p>Now, save the following script somewhere on your local computer. It'll be responsible for deploying each version. <strong>Make sure to adjust the paths if yours are different!</strong></p>
<pre><code>#!/bin/sh
set -e

REMOTE=purism@10.42.0.185
DEST=&quot;~/simple-cam&quot;

ssh $REMOTE &quot;git -C $DEST checkout -f foo&quot;
git push evergreen -f HEAD:master
ssh $REMOTE &quot;git -C $DEST checkout master&quot;
git push evergreen -f HEAD:foo
ssh $REMOTE &quot;cd ~/simbuild &amp;&amp; meson $DEST &amp;&amp; ninja &amp;&amp; ninja install&quot;
</code></pre>
<p>Finally, go back to QtCreator's &quot;Project&quot; section, and choose the &quot;Build&quot; configuration for Evergreen. Add a process step, and point it to the file you just created. Make sure it executes in the git checkout by setting &quot;working directory&quot; to &quot;%{sourceDir}&quot;.</p>
<p><img src="build.png" alt="Build configuration" /></p>
<p>I used &quot;sh&quot; as &quot;Command&quot;, and the path to the script (redacted) as Arguments.</p>
<h3>Run</h3>
<p>Here's the meat of the operation. You're given one method: &quot;Deploy to Remote Linux Host&quot;. We don't actually deploy anything here, because the executables are built on the remote host. Remove all deployment steps.</p>
<p>Add a new run configuration on the remote host. For me, it's called &quot;Custom Executable (on Evergreen)&quot;. Add the executable path on the remote host. I use: <code>/home/purism/simbuild/simple-cam</code>. &quot;Local executable&quot; can stay empty. Environment changes take effect as expected.</p>
<p><img src="run.png" alt="Run configuration" /></p>
<h2>Debugger</h2>
<p>When I tried debugging at first, I was dropped into disassembly mode. That is clearly not optimal. A lot of the appeal of a debugger is seeing the source code being debugged.</p>
<p><img src="disassembly.png" alt="disassembly mode" /></p>
<p>Fixing this involves some extra steps. First of all, this affects only some binaries. For me, what was not picked up was what I installed manually: the simple-cam binary, and libcamera. To fix this, you need to find the correct mapping between the files on the local computer and the paths embedded in the remote executable.</p>
<p>This method worked for me: I added a new breakpoint at &quot;main&quot; <strong>by function name</strong>, and started debugging. Make sure to select the correct configuration: it's on the left panel, above the &quot;Run&quot; button. Select your project, the kit you just made, and the remote executable.</p>
<p><img src="configuration.png" alt="Configuration" /></p>
<p>Then, I opened the debugger log via View → Views → Debugger Log. Put the cursor in the &quot;Command&quot; prompt while the program is paused on disassembly, and type &quot;bt&quot;. In the right pane you'll get something like:</p>
<pre><code>1918bt
&gt;&amp;&quot;bt\n&quot;
&gt;~&quot;#0  main () at ../simple-cam/simple-cam.cpp:139\n&quot;
&gt;1918^done
</code></pre>
<p><img src="debugger.png" alt="Debugger log" /></p>
<p>This shows the path embedded in the binary that QtCreator (or GDB?) failed to find. Now you need to map it to a path on your local file system. Open Tools → Options → Debugger → General, and add the paths to the &quot;Source Path Mapping&quot; table. Here, I enter &quot;../simple-cam&quot; as &quot;source path&quot;, and my local path to &quot;simple-cam&quot; as &quot;target path&quot;.</p>
<p><img src="mapping.png" alt="Sources mapping" /></p>
<p>Keep in mind that I redacted part of the local paths.</p>
<p>You can stop the debugging now, and start it again. Now you should see the sources for the paths you configured. You should see something like this in the &quot;Application Output&quot; tab on the bottom:</p>
<pre><code>18:09:20: Checking available ports...
18:09:21: Found 101 free ports.
18:09:21: Starting gdbserver --multi :10000...
18:09:21: Debugging starts
Listening on port 10000
Remote debugging from host ::ffff:10.42.0.1, port 44146
Process /home/purism/simbuild/simple-cam created; pid = 22866
File transfers from remote targets can be slow. Use &quot;set sysroot&quot; to access files locally instead.
</code></pre>
<p>Congratulations, you are now debugging your Librem 5 from your workstation!</p>
<h3>Debugging symbols</h3>
<p>If you need to debug a call into a library that isn't part of your application, you may see that this setup balks. It either ignores the call, drops down to assembly, or displays &quot;??&quot; in the stack trace.</p>
<p><img src="bad_stack.png" alt="Assembly view and question marks in the stack view" /></p>
<p>This is a telltale symbol of missing debugging information for the library that the remote host is executing. But even if you install the debugging symbols and sources (in Fedora it's <code>dnf debuginfo-install [name]</code>, in Debian it's <a href="https://michael.stapelberg.ch/posts/2019-02-15-debian-debugging-devex/">hell</a>), on the next run you get… no change.</p>
<p>It turns out that the debug symbols and sources must be available on the host side of the debugging connection. Let's get the symbols over:</p>
<pre><code>scp -r purism@10.42.0.185:/usr/lib/debug /somewhere/l5_debug/
</code></pre>
<p>and hook them up to GDB by adding a new line to Tools → Options → Debugger → GDB → Additional Startup Commands:</p>
<pre><code>set debug-file-directory /somewhere/l5_debug/debug
</code></pre>
<p>Now you'll be able to see symbol names, but you still can't inspect the sources of the library. Sadly, you have to find the sources of the package on the remote end, and hook them up yourself in the &quot;Source Path Mapping&quot; table, like we did a couple paragraphs earlier. Copy if needed, no automation here.</p>
<h2>Limitations</h2>
<p>This relies on git to push your changes to the remote host, so <strong>you must commit your changes if you want to run them</strong>. Otherwise the code shown may not match the code executed.</p>
<p>It could be adjusted for plain files, but this is what I have now.</p>
<p>When you try to debug libraries provided by the OS, make real sure that you're using the correct sources. Debugging something when the cursor is randomly one line off is hard to notice, and it makes for a frustrating debugging session.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Jazda: Rust on my bike</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/jazda_rust/"/>
       <id>tag:dorotac.eu,2022-09-05:posts/jazda_rust</id>
       <updated>2022-09-05T14:00Z</updated>
       <published>2022-09-05T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Jazda: Rust on my bike</h1>
<p><em>This blog post is an improved version of an impromptu talk I gave at FrOSCon two weeks ago.</em></p>
<p>I like to do things that don't quite make sense. One of them is putting rust on my bicycle.</p>
<p><img src="bike.png" alt="The front of a dirty purple bicycle. In the background, windows of a building which carries the FrOSCon logo." /></p>
<p>This is my bicycle, and it's made out of steel. It has some rust already, but it's not the kind of &quot;rust&quot; I want to talk about.</p>
<p><img src="tube.png" alt="The seat tube shown from above. On the outside, it's covered with light dirt, and on the inside, the colour is dirty orange" /></p>
<p>I put Rust the programming language on my bike.</p>
<h2>Bicycle computer</h2>
<p>I bought my first bicycle computer before I could even program. It was a simple mechanical device that is mounted near the wheel. It counts wheel rotations, and displays a distance.</p>
<p><img src="mechanical.png" alt="A metal box with a counter behind a window. On the left, there's a propeller-like element. On the bottom there's a mounting bracket. The display is made of 4 black 0s and one red ring to the right. The red digit reads 3, and partially rotated away, so that the bottom is not in the window's frame." /></p>
<p>My next upgrade was an electronic device, one unlike what you could buy today. Those have more functionality: they may display distance, speed, and time travelled. I used one for over a decade, and it was possibly the best spent 20 EUR in my life.</p>
<p><img src="electronic.png" alt="A squareish plastic display unit. It has a wide button on the top edge and another one on the bottom edge. In between, taking most of the space, is a transparent window with a LCD display. 7-segment digits show &quot;0&quot;: one on the top labelled &quot;KMH, and 3 on the bottom reading &quot;0.00 TRP&quot;" /></p>
<p>Two things happened between then and now. One is that I learned to code. The other is that the battery ran out. Instead of replacing the battery, I came to the obvious conclusion: I can code, so I can make a better one!</p>
<h2>Hardware</h2>
<p>I've been struggling to design the hardware for my bike computer for years. Sadly, I suck at hardware design. I can't solder, I can't design a plastic case, I can't build a device that won't shake apart during a most peaceful ride. It became clear that if I want a bike computer, I should start with existing hardware.</p>
<p>Meanwhile, the situation in the outside world slowly changed. Smartphones gained popularity, Internet of Things followed, smart watches started utilizing the new tiny and powerful components. Bicycle computers started taking advantage of the new powers too.</p>
<p>I'm much better at programming than at hardware, so I stopped to consider: should I take advantage of that influx of hardware, and adapt an existing device to my needs?</p>
<p>I could slap a smartphone on the handlebars, write an application and call it a day. Or I could find a smaller device, more like the bicycle computer I retired, and use it as the base for my software. But which path should I choose?</p>
<p><img src="modern.png" alt="A rounded-rectangular gadget, showing a table of values on a black-and-white LCD: speed, stopwatch, distance, time, heart rate, cadence. It has 4 small buttons, 2 on the left and 2 on the right." /></p>
<p>There are many cyclists in the world. There are those who commute to work every day, others who go on family trips, some cycle to deliver things, and yet some like to race. They have different goals and needs while cycling, and each is interested in something different from a cycling computer.</p>
<p>This time, I'm building one that makes <em>me</em> happy. Which of my needs as a cyclist can a bike computer help with?</p>
<p>I usually cycle in places I roughly know, so I don't need a map, or a big, fragile screen to display one. I want to time my rides, so I want a screen that's big enough to show a few numbers in a big font, and the bike computer must be sturdy enough when I push the tempo over rocky trails. When I cycle, I leave my phone at home, so I won't miss anything if the bike computer is 100% offline. I draw on maps for OpenStreetMap, so I want some sensors like GPS, and a place to store their data. Thankfully, handling sensors and statistics doesn't require much computational power.</p>
<p>A smartphone meets those needs, but has some important downsides. Most smartphones aren't well readable in sunlight. They have short battery life – days compared to weeks – and they are not very resilient compared to other gadgets. I often get caught in the rain, and I'm pretty sure too much road dust or violent shaking is not healthy for smartphones – even if they don't come flying out of the harness.</p>
<p>The conclusion is pretty clear: I would be better served by a dedicated device. And so I started my search.</p>
<p>However, I'm only a single hacker, with limited time and abilities. I want to build a bike computer from scratch, I have another need – as a hacker, not cyclist: the device must be simple enough so that I can hack on it myself.</p>
<p>And I did see some interesting examples, like <a href="https://www.reddit.com/r/linuxmasterrace/comments/ehn19r/how_to_hack_stages_cycling_dash_l50_or_m50/">one based on Linux</a>. I rejected this one purely because I was afraid about the hardware complexity – if the manufacturer decided they needed a full-blown OS, then there must be a lot of hardware to manage there.</p>
<p>Finally, I found one. The <a href="https://www.espruino.com/Bangle.js2">Bangle.jS 2</a> smart watch checked most of the boxes: lots of sensors, Bluetooth Low Energy, a screen readable in the sunlight, and even a GPS receiver! It only has one button, and it can't make sounds, but it was reverse engineered, and ready to flash with custom software. It was then or never: if I didn't use this as my base, I would probably never finish the bike computer project.</p>
<h2>Rust</h2>
<p>There are lots of choices in the embedded space. Operating systems like <a href="https://nuttx.apache.org/">Nuttx</a>, <a href="https://www.espruino.com/">Espruino</a>, <a href="https://os.mbed.com/mbed-os/">Mbed OS</a>, <a href="https://www.riot-os.org/">Riot</a>, <a href="https://www.zephyrproject.org/">Zephyr</a>. Or even running on bare metal. But my goal was clear:</p>
<p><img src="rust-logo-blk.svg" alt="Rust-lang logo" /></p>
<p>I wrote enough C code to know I don't want to use it. The 2 Rust-enabled options were Riot and bare metal using <a href="https://github.com/rust-embedded">rust-embedded</a> crates.</p>
<p>Actually, there were 3 options. At the last moment, I discovered <a href="https://tockos.org/">Tock</a>, an OS written entirely in Rust. It has an advantage over all other options (except Espruino): it's a pre-emptive operating system, able to load applications at runtime.</p>
<p>And loading new apps is a standard function of your computer, of your smart phone, and some smart watches. Why isn't this standard on bike computers yet? Puzzling, but if the sports gadgets manufacturers won't do that, I gladly will.</p>
<h2>Tock project</h2>
<p><img src="apps.svg" alt="Diagram showing 2 layers: Tock kernel and multiple apps on top: speed meter written in Rust, clock written in C, and space for more." /></p>
<p>Applications made for Tock are native code, can be written in C or Rust. Because of the multiprocessing architecture of Tock, we can split functionality into apps, like a speed meter, or a clock, and not worry how buggy they are: they may crash all they want, but they are separated from each other, so a crashing clock won't bring your down speed display.</p>
<p>Imagine a future where people load applications from the internet on the bike computer. Being able to just ignore a crashing app will be absolutely necessary.</p>
<p>But this kind of safety comes with a cost. You have to write a lot more code to abstract hardware resources. Instead of writing just one device driver, you actually need to write 3 pieces: one kernel driver, one userspace driver, and one multiplexer.</p>
<p><img src="stack.svg" alt="Diagram showing the 3 layers: hardware, kernel, app. Hardware is the GMC303 chip. On top in the kernel, GMC303 driver underlies the compass syscall driver. On top o that, as part of the map app, there's the compass API" /></p>
<h2>Status</h2>
<p>Despite the slower pace, I managed to put Tock OS on the Bangle.js 2 hardware rather quickly. I started with a demo displaying speed:</p>
<p><img src="speed.png" alt="A smart watch on a wooden background. It reads &quot;14&quot;, and there's a 30° arc to the left of the number, centered on the number. There's a small &quot;33&quot; in the lower right corner." /></p>
<p>The demo is part of the <a href="http://jazda.org">Jazda project</a> (which is what I called the bike computer), and it shows that the display stack is working, and that the GPS stack is working.</p>
<p>There's still lots of work before the grand vision can be realized. The main parts are:</p>
<ul>
<li>Bluetooth support</li>
<li>Concurrency: there are compiler shortcomings that prevent this part of the OS from really gaining speed</li>
<li>Communication between apps</li>
<li>Communication with a computer</li>
<li>Publishing apps online: a website, payment service</li>
</ul>
<p>Jazda started as a hobby project, so I never forget about fun projects:</p>
<ul>
<li><a href="https://gitlab.com/dcz_self/seismos">Seismos</a>, the sensor collection file system</li>
<li><a href="https://framagit.org/jazda/core/-/tree/master/ray-graphics">Ray-graphics</a>, the <a href="https://www.ronja-tutorials.com/post/034-2d-sdf-basics/">SDF</a>-based graphics library</li>
<li>Skitram, the unpublished time-series data compression library</li>
</ul>
<h2>GPS vs BLE</h2>
<p>Currently, the greatest shortcoming of Jazda is the lack of Bluetooth Low Energy. This is a wireless protocol normally used by bicycle sensors. What Jazda is using right now is GPS readings. Unfortunately, GPS is rather energy-hungry, which means the device can only display speed for 5 hours before turning off, and the readouts aren't very accurate, either.</p>
<p>Tock has a strict policy of not allowing direct hardware access, so I can't just snatch an external BLE stack. If I did that, I would be stuck maintaining the result myself, because it wouldn't be accepted upstream. Alternatively, I could write a BLE stack myself, but Bluetooth is notoriously difficult to implement correctly.</p>
<p>For now, I'll just keep working on the other parts.</p>
<h2>Innovation</h2>
<p>Thankfully, speed readouts are overrated. There are plenty of other things that can be done without them. I collected some ideas:</p>
<ul>
<li>drawing a situational map</li>
<li>wheelie detection</li>
<li>surface roughness scanning for OpenStreetMap</li>
<li>sonar mode for chasing a recorded ride</li>
<li>???</li>
<li>profit</li>
</ul>
<p>Some of them come from myself, some were suggestions I heard. Perhaps I won't be able to implement them all, but that's fine, because Jazda is open source software. Anyone is allowed to implement any crazy idea without signing an NDA and without the need to find a job at a sports equipment company. Having a shower thought and some perseverance is all that's necessary.</p>
<h2>Community</h2>
<p>But it's always easier to hack when other people can help, so you're invited to join our chat on Matrix or IRC: #jazda:<a href="https://web.libera.chat/">libera.chat</a> . You can also reach out to me there to order development kits for Jazda. Those come with a unique breakout board to make it easy to reflash the Bangle.js 2 smart watch with the Jazda firmware.</p>
<p><img src="devkit.png" alt="A USB programmer connected via a ribbon cable to a breakout board, connected to a USB cable that ends outside of the picture." /></p>
<p>If you're a hardware hacker, you're especially welcome. Perhaps you could help us build a future version of Jazda, without <em>any</em> shortcomings… and without useless heart sensors ;)</p>
<p><a href="https://jazda.org"><img src="jazda.svg" alt="Jazda logo: &quot;jz&quot; stylized as a human on a bike" /></a></p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Why Jazda?</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/why/"/>
       <id>tag:dorotac.eu,2022-06-05:posts/why</id>
       <updated>2022-06-05T14:00Z</updated>
       <published>2022-06-05T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Why Jazda?</h1>
<p>This is a cleaned up transcript of a lightning talk I gave at the Gulaschprogrammiernacht 20.</p>
<hr />
<p>Have you ever had a project that you worked on for so long that you forgot why you're doing it? I was just asked &quot;why are you doing this?&quot; about a project I started 10 years ago, and on which I really started working this year. The project is called Jazda [https://jazda.org].</p>
<p>Why do I do it? It's an open source bicycle computer. There are a few of them, but not many. The general idea of a bicycle computer has been explored in so many ways. There's Garmin, which is almost a monopoly. You can also buy one at Kaufland for 10 euro. So why do you need another one?</p>
<p>If you take a fancy one like the Garmin, you'll find that it's not programmable. You cannot do it, you're not allowed to do it. If you take the Kaufland bike computer, it's just too simple. If you try to mount your phone on the bike, it's bulky and inconvenient. So yeah, I wanted something that I could program to get the four freedoms of software and to experiment with it.</p>
<p>There are other options. There are smart watches, like the PineTime, or Bangle.js. There is the Cyclotron, which is a simple bike computer that's also open source, including custom hardware. But I don't actually want any of those. Why not? First off, a bike computer needs to be readable in sunlight. Most smart watches can't do that. Bangle.js 2 can, but JavaScript? Sorry, I'm not a fan. The Cyclotron is the closest to what I want, but since I came all the way here, then maybe… let's not stop here?</p>
<p>If I use a simple single-purpose device, and connect it to the computer with a flasher to load software, it would be too technical, I would be basically the only person in the world interested in using it. What if I took the next step? What about a bicycle computer with apps?</p>
<p>Why should a bicycle computer be more difficult to use than an Apple phone or watch, or your Android? And what if you had a nice SDK to program it? Yeah, that's something to do.</p>
<p>Basically, this is the idea with Jazda. The project is still early, still not there, the goal has not been reached yet.</p>
<p>But you can already buy them from me, so I'm not still in the weeds, but there's still a lot of work required.</p>
<hr />
<p>At the conference, my talk ended by inviting folks to talk to me and describing my hangout spot. It was a little awkward to explain.</p>
<p>Online, it's easier. I hang out on the IRC/Matrix channel #jazda:libera.chat . Stop by sometime!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Git-botch-email</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/git-botch-email/"/>
       <id>tag:dorotac.eu,2022-03-13:posts/git-botch-email</id>
       <updated>2022-03-13T14:00Z</updated>
       <published>2022-03-13T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Git-botch-email</h1>
<p>I don't really like git-send-email. I avoid projects that use it, if I can.</p>
<p>It could have been much better, but I (thankfully) use it rarely enough that I can't come up with its list of sins on the spot, and I don't end up getting my arguments acknowledged.</p>
<p>Today (2021-09-24) I have the dreadful need to send a patch to the Linux kernel, and I'm going to make the most of it by lining up my thoughts. Hopefully someone takes it to heart and creates a <code>git-sends-email-better</code>.</p>
<h2>Email is all right</h2>
<p>Don't get me wrong, I like the decentralization aspect of email, I like that it's based on open standards, that I can use an arbitrary client to send it, and that I don't have to run random code on my computer just to submit or review patches. Forges fail at those, to various extents.</p>
<h2>But git adds its own quirks</h2>
<p>I have a tree with my commit on top: it changed 3 lines. I want someone to include my changes on top of the official tree. Let's assume that I read the project's contribution guide, and I know that I need to use git-send-email, and I also know where to send the change. Let's begin.</p>
<pre><code>$ git send-email 
No patch files specified!
git send-email [options] &lt;file | directory | rev-list options &gt;
[...]
</code></pre>
<p>&quot;patch files&quot;? I thought git operated on the basis of commits and trees. Why would I want to send a patch file using git? Anyway, there's a &quot;rev-list&quot; possiblity, so we can proceed with that:</p>
<pre><code>$ git send-email 00082f898de21fd5ebb28dc561c173f6fde8e44a
/tmp/7yZAWWiApC/0001-media-imx-Fix-rounding.patch
To whom should the emails be sent (if anyone)?
</code></pre>
<p>I answer the questions, and then I get:</p>
<pre><code>Send this email? ([y]es|[n]o|[e]dit|[q]uit|[a]ll):
</code></pre>
<p>Lol, no. It's my first submission, and I want to review it. Possibly by giving it to someone else first. I'm only human, and I make mistakes.</p>
<p>But where's the &quot;save&quot; option?</p>
<p>Okay, never mind. Maybe there is a &quot;save&quot; option further along the way. After all, I didn't give git any email access yet – I just want a dry run. It's not going to throw away the edits it offers you to perform, right?</p>
<pre><code>Send this email? ([y]es|[n]o|[e]dit|[q]uit|[a]ll): y
sendmail: Cannot open mail:25
Died at /usr/libexec/git-core/git-send-email line 1497, &lt;FIN&gt; line 3.
</code></pre>
<p>…Actually, it totallydid throw them away. Thankfully I didn't make any edits (j/k, the first time in my life I tried to send a patch, I followed a tutorial and already wrote a heap of text at this step. That hurt).</p>
<p>Do you know how the cat keeps meowing but you have no idea what it wants? If only it could speak human language.</p>
<p>If only git-send-email could speak human language.</p>
<p>Some searching later, it turns out that, indeed, git wanted to have access to my SMTP server to send the email itself.</p>
<h3>Git sending emails</h3>
<p>Let's get back to the save fiasco. I have a perfectly cromulent email client which can import .mbox or .eml files, and send them on. It's already configured for my email server, it can sign my emails, I trust it with my passwords, and it's customized to save sent emails to IMAP.</p>
<p>Why can't I save my git patch, and load it into my email client, to send it with my GPG signature?</p>
<p>Why should I trust git with full access to my email outbox? Why shouldn't I be able to use my usual Mail User Agent (MUA) to do the sending in a way that I pre-approved, respecting the security level I want to maintain, including TLS versions? Email is based on standards, after all.</p>
<p>I just hope Git doesn't handle the email password itself, given its <a href="https://nvd.nist.gov/vuln/detail/CVE-2020-5260">track</a> <a href="https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=git">record</a>.</p>
<p>Oh, and don't forget that if you have a fancy multi-identity setup in your MUA, you have to duplicate it with git-send-email. Sure, git has a global config file. But If I configured that to my work account, then I'm one mistake away from sending a personal contribution to some other random repo from my work email. Double difficulty if you contribute under multiple different contexts.</p>
<p>So I'm stuck at having to <a href="https://www.freedesktop.org/wiki/Software/PulseAudio/HowToUseGitSendEmail/">configure</a> every repo which demands git-send-email separately (use <code>--local</code> instead of <code>--global</code>. With git-send-email, opsec is harder.</p>
<p>Just give me the damn save already!</p>
<h3>Acceptance</h3>
<p>Okay, I guess I have little choice in the matter. <em>Sigh</em>. Let's change the email to point to myself as the addressee. This will obviously not be a dry run for the final email because of the address mismatch, but hopefully nothing starts on fire when we change the addressee later.</p>
<p><img src="sent_email.png" alt="sent email" /></p>
<p>Great, it seems to look … odd. I was not expecting the patch to be part of the message. How am I supposed to download and apply it? By copying and pasting? It makes sense: those who reply can easily add their comments inline. I guess I can live with it. As long as I don't change the contents of the patch, it should be possible to extract.</p>
<h3>Intermission</h3>
<p>But wait, git-send-email allows you to edit the message before sending, and mess it up in any way you want?!?</p>
<p><img src="botched_email.png" alt="screenshot of an email with an extra &quot;From:&quot; header in a terminal editor" /></p>
<pre><code>Send this email? ([y]es|[n]o|[e]dit|[q]uit|[a]ll): y
OK. Log says:
[...]
Result: 250
</code></pre>
<p>LOL, git-send-email apparently doesn't do any validation. I hope that email didn't actually get forwarded anywhere. I put in &quot;none&quot; in the address field, but I dont really know well enough how git-send-email works. I've seen it add addresses. That wouldn't have worried me with my MUA.</p>
<h3>Back to the draft</h3>
<p>Oh, right, maybe I can actually edit the draft from my MUA this time! If git-send-email doesn't try to stop me from being an idiot, I have no reason to use it now.</p>
<p>I copied the email from inbox to drafts, and added the timely message.</p>
<h3>Intermission 2</h3>
<blockquote>
<p>Sometimes it's convenient to annotate patches with some notes that are not meant to be included in the commit message. [...] Such messages can be written below the three dashes &quot;---&quot; that are in every patch after the commit message.</p>
</blockquote>
<p>So, if I want to include extra context inside my email, I should cram it in between lines of computer-readable text and hope for the best?</p>
<blockquote>
<pre><code>The changes from 451a7b7815d0b pass 2^n to round_up, while n is needed.

Fixes: 451a7b7815d0b
---
Hi, I tested the patch on the Librem 5.
drivers/staging/media/imx/imx-media-utils.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/imx/imx-media-utils.c b/drivers/staging/media/&gt; imx/imx-media-utils.c
index 5128915a5d6f..a2d8fab32a39 100644
</code></pre>
</blockquote>
<p>That must be the best way ever invented to ensure people accidentally alter the wrong thing.</p>
<h3>Applying the patch</h3>
<p>I was hoping not to have to go through the ordeal, but I don't want to bother my co-workers with my non-work blog. I'm being too nice.</p>
<p>I was joking when I said &quot;copy-paste&quot; the email content, and yet, this is how <a href="https://stackoverflow.com/a/51995810">Stack Overflow</a> deals with it. And while I'm finding a bunch of guides for sending, that's the only answer dealing with receiving git-send-email results. Alas.</p>
<p>Surprisingly, it worked-for-me.</p>
<p>A couple exchanged emails later, and… I need to change the commit contents. Since the change was small, I just winged it, and altered the email directly, but if I actually had to go through the entire ordeal of git-send-email, I'd lose my changes again.</p>
<h3>Revisions</h3>
<p>Finally, my email went to review. Of course, it didn't come out unscathed. I had to submit a second version, which consisted of multiple commits.</p>
<p>Here again, git-send-email showed how being careful results in paper cuts: each commit is sent as a separate email. That would have been okay of I could just load them into my draft folder and send from there, but, as we established earlier, the only way to do that is to send the email to myself. That means I have to change the recipient list for each email in the series manually. What botheration!</p>
<h2>More tool problems</h2>
<p>Even after the patch was sent successfully, it's still less than a commit. It does not preserve the history of how the author got there. The base tree is discarded, and only the diff remains. Without an external convention, the reviewer does't even know which tree to apply the patch on, much less where it was originally tested. It's impossible to merge trees using git-send-patch either. There's not much &quot;git&quot; in it, really, because commits and trees are what git is made of.</p>
<h2>Linux problems</h2>
<p>There's a separate class of problems that are not the consequence of the tool, but the consequence of the culture which is often associated with the tool. It's fixable without any software changes, but it needs changes to wetware – which is probably even more difficult.</p>
<p>Here's a loose list, based on the Linux kernel:</p>
<p>How to find the correct address to direct the patches? How to find the correct tree to apply the patch on? Magical incantations are sometimes required like &quot;Signed-off-by:&quot; that have a meaning other than its constituent words (no, it doesn't mean your email is signed). Those things are documented, but they are not universal, and not obvious. And they are not difficult in the same way as writing a kernel patch is difficult – those are pure bureaucracy overhead, which does not have any bearing on code quality.</p>
<p>More overhead is in supporting ancient clients that can't handle MIME or compression. Those are banned in the kernel, instead of fixing clients to handle compression and inline disposition. Another exercise to support inflexible tools at the expense of human effort.</p>
<p>And the mother of all pet peeves: hard-wrapping of commit messages. Do you prefer to read prose by incessant scrolling, or with nonsense lines? Random example from the Linux kernel mailing list:</p>
<p><a href="scroll.webm">Monospaced text hard-wrapped wider than the display, reader scrolls every line horizontally</a></p>
<p><img src="wrapping.png" alt="Prose with lines wrapped to random widths" /></p>
<p>This is what reading a patch would look like on the Librem 5 – depending on how you prefer to suffer – and it's also what any other pre-formatted text looks like. Including your commit messages. Have mercy and don't do hard wrapping.</p>
<h2>It can be better!</h2>
<p>Those are, in the end, not unsurmountable problems, at least not all of them. <em>When used together with a competent email client</em>, git-send-email does its job passably. The only paper cut is not being able to export the emails without sending them.</p>
<p>It gets worse if there's no competent email client. Then the task of editing, reviewing, saving for later and coming back, and fixing mistakes falls entirely on the same tool. As far as I can tell, it's rather tedious, and the ability to get an overview is rather poor. There's no way to save and resume at all.</p>
<p>I'm not going to give it a score better than &quot;passable&quot; even with this papered over: it remains a fact that git-send-email almost encourages the submitter to accidentally mess up the patch by entering text in the wrong place. Until the payload is clearly delineated from the cover letter (as an inline attachment?), this cannot be solved.</p>
<hr />
<h2><strong>Some time later</strong></h2>
<p>One of my patches was eventually accepted. The rest have been picked up by some poor soul who has been trying to get them upstreamed for the past 2 months.</p>
<p>git-send-email didn't get in the way any more, but I still messed up. I missed some feedback in my inbox, and thought the patches were completely forgotten, until the new person stepped in. You might blame my way of reading email, and you would be right: my email inbox is a mess. I've been trying to fix it for the past 5 years with little success. Meanwhile, I rarely lose feedback when it's placed on a web page, because it organizes conversations in a sensible manner.</p>
<p>Thankfully, that problem is not inherent to email, and has been solved by <a href="https://sr.ht/">sourcehut</a>. It closes the gap between an email-only workflow and a web-only forge by providing a single contact point, and displaying the knowledge (including historical data) in a decent manner. My analysis wouldn't be complete without mentioning it, and I ask projects using git-send-email to adopt it or something similar: I can't handle naked emails, and I'm not the only one!</p>
<p>But, being an internet service, sourcehut doesn't fix the problems in git-send-email. Perhaps they could come up with an improve version of that tool? I sure hope they could.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>A potion of experience</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/potion/"/>
       <id>tag:dorotac.eu,2022-03-04:posts/potion</id>
       <updated>2022-03-04T14:00Z</updated>
       <published>2022-03-04T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>A potion of experience</h1>
<p>You may notice that this blog has just been enriched, especially if you're following my <a href="https://fosstodon.org/@dcz">Mastodon account</a>. See that thing all the way past the article? It's a comments section. Cause I want to have a conversation with my readers, rather than keep shouting into the void.</p>
<p>You can surely see it's pretty basic.</p>
<p>I made it myself. See, I had my eye for a while on this elixir that would give me some network development experience, and I decided to use it. I had to supplement it with some extra stuff to get anything more than <em>just</em> experience, but I'll talk about that later.</p>
<h2>Glug glug</h2>
<p>It's not a mystery: it's just <a href="https://elixir-lang.org/">Elixir</a>. It's a language running on the BEAM virtual machine, and taking advantage of the Erlang/OTP runtime. OTP has fascinated me for a long time, with its &quot;<a href="https://lwn.net/Articles/191059/">crash-only</a>&quot; approach, a concurrent take on the <a href="https://www.brianstorti.com/the-actor-model/">Actor model</a>, and aggressive immutability.</p>
<p>In short, it's a programming paradigm outside of the orthodoxy I've known. And what better source of new insights is there than an unusual point of view?</p>
<p>Disclaimer: I'm still rather new to Elixir, so if I mess up some terminology, let me know… in the comments ;)</p>
<h2>Level up</h2>
<p>The feature of Elixir that I liked the most comes from Erlang/OTP: it's the actor model. Your program gets split into… uh, &quot;applications&quot;, which operate as independent threads, and can exchange messages. For example, the web module sends a message to the email module to let me know I should approve a new comment, and doesn't even wait for an answer.</p>
<p>If the email sender crashes, the web service keeps churning on, and only the email component gets restarted. And I don't even need to care about the restart logic!</p>
<p>That's already cool, but what if I want extra logic in the email sender? I don't want to be spammed by a lot of emails when my blog reaches the front page of Hacker News, but rather have the email module wait for 5 minutes between notifications.</p>
<p>Immutability then changes the rules of the game: you can't just save the time of the last email in a local variable, and handle messages in a loop. There's a kind of a local database for each application, where data can be explicitly stored. One thing comes to another, and I implemented a state machine to deal with the notifications.</p>
<p>And that's where I levelled up: I stated using state machines, which exchange messages to cause transitions. It's an excellent organization to debug, because you can serialize the state of your module, and see how exactly external events cause changes in the state. If you add a little more effort to turn side effects into more messages, you can perform &quot;in vitro&quot; state transformations as part of your test suite.</p>
<h2>Bitter taste</h2>
<p>Elixir is not all good though. The one especially bittersweet part is how much metaprogramming it allows. The basic syntax ends up being simple (or so I'm told), but once you actually start using Elixir, expect surprising syntax constructs. Here are a couple of examples that keep confusing me.</p>
<p>Most statements are contained between <code>do</code> and <code>end</code>, <a href="https://elixir-lang.org/getting-started/case-cond-and-if.html">like this</a>:</p>
<pre><code>if true do
  x = x + 1
  y = y + 1
end
</code></pre>
<p>But not stuff inside branches of a <code>case</code> expression:</p>
<pre><code>case foo do
    true -&gt;
      x = x + 1
      y = y + 1
    false -&gt;
      x = x
end
</code></pre>
<p>I feel uneasy each time I write this. What's the delimiter marking the end of the <code>true</code> branch? Note that indentation doesn't matter here.</p>
<hr />
<p>Another annoying property of some Elixir libraries is spooky action at a distance, where the reasons for doing something are implicit and hidden away behind layers of abstraction. Here's an excerpt from the <a href="https://elixirschool.com/en/lessons/misc/plug">Plug library tutorial</a>:</p>
<pre><code>defmodule Example.Router do
  use Plug.Router

  alias Example.Plug.VerifyRequest

  plug Plug.Parsers, parsers: [:urlencoded, :multipart]
  plug VerifyRequest, fields: [&quot;content&quot;, &quot;mimetype&quot;], paths: [&quot;/upload&quot;]
  plug :match
  plug :dispatch

  get &quot;/&quot; do
    send_resp(conn, 200, &quot;Welcome&quot;)
  end

  get &quot;/upload&quot; do
    send_resp(conn, 201, &quot;Uploaded&quot;)
  end

  match _ do
    send_resp(conn, 404, &quot;Oops!&quot;)
  end
end
</code></pre>
<p>Take a look at the <code>plug</code> lines. They will take care of encoding and verification. It's nifty because you don't have to worry, just slap those in. It's confusing because if you're a newbie and want to stay near the basics, those constitute a barrier. It's opaque, because there doesn't seem to be a syntactical opening to declare some calls to be affected by those filters.</p>
<p>In the end, I avoided the problem by never using such clever tricks, at the cost of having less learning material (not that the material with tricks taught me anything).</p>
<hr />
<p>In the end, it comes out as a mostly positive coding experience, and I'll choose Elixir over Django in the future.</p>
<p>But coding the web app is not all.</p>
<h2>Hangover</h2>
<p>After coding, I had to deploy it on my server somehow, or else experience is all I get. This grew to be a full half of the experience, and a half I'd rather do without.</p>
<p>Let me start with Ansible.</p>
<p>I don't like it.</p>
<p>It disappoints me in one crucial area: it does not compose. I can't easily create Ansible instructions to deploy <a href="https://gitlab.com/dcz_self/beng">Beng</a> on a pristine system for developers, and then reuse the same instructions to deploy it on my infra alongside other pages. Isn't that what programming languages are good at? Executing batches of instructions differing by parameters? Perhaps I'm too much of an Ansible noob, but I haven't found the necessary flexibility there.</p>
<p>So I wrote a couple idempotent shell scripts to replace Ansible.</p>
<p><strong>INTERMISSION</strong></p>
<p>Back to Elixir - it turns out that the binaries need a certain version of the Erlang runtime to be present on the destination system. CentOS 7 didn't have the same version as my development machine. I couldn't build them in a CentOS container either.</p>
<p>Long story short, I gave up on CentOS and went with Nix to build the comments app.</p>
<p>The upside is that if I want, NixOS can take over a lot of what I needed Ansible for: building the software, configuring it, installing dependencies. The downside is, when I came back to the app 6 months later, my Nix package utterly and completely doesn't build. I'm not so sure about employing Nix for server config duties now.</p>
<p><strong>END INTERMISSION</strong></p>
<p>My shell scripts were still needed to move the data and configs from the development machine to the server. Step by step, they grew into something bigger. Something monstrous. Something like… Ansible? But with some neat features that Ansible doesn't have:</p>
<ul>
<li>my system can deploy stuff in Docker containers or SSH hosts. Ansible needs SSH to function.</li>
<li>My system supports actual conditionals.</li>
</ul>
<p>Sadly, it doesn't compose much better. I still have some hardcoded things I don't want to share with the world, so it will remain unpublished for now.</p>
<h2>Aftermath</h2>
<p><a href="https://gitlab.com/dcz_self/beng">The Elixir app</a> and the deployment sweat were a good lesson in humility. Originally, I estimated the whole thing to take a week. It took two weeks of intense work across several months. Last week I estimated the deployment portion to take an evening. That alone took a week. But now I have something to show for it, and the lessons I learned from Elixir levelled me up as a programmer, so… I guess it was worth it.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Browser tab archaeology</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/tabcheology/"/>
       <id>tag:dorotac.eu,2022-02-09:posts/tabcheology</id>
       <updated>2022-02-09T14:00Z</updated>
       <published>2022-02-09T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Browser tab archaeology</h1>
<p>Tab hoarding is a hot topic these days. There are people who feel <a href="https://fosstodon.org/@floppy/107690840449591767">bad</a> about it. Even the name &quot;hoarding&quot; is negatively coloured. Obviously, the internet is making a big deal out of using tabs as bookmarks, because it seems like we have multiple bookmarking options, and <a href="https://news.ycombinator.com/item?id=30166357">they all suck in practice</a>.</p>
<p>But I'm shameless about it: browser tabs <em>are</em> my reading list, and browser windows are how I keep groups of links on a single topic that might be useful one day. At least until and unless a better solution emerges.</p>
<h2>Sediments</h2>
<p>And I hope it does, cause my tab count keeps increasing! Oh, the times when I used to have 120 tabs! Long lost in the murky waters of history, tabs buried under further 100 layers of sedimentary URLs!</p>
<p>If only I was an archaeologist, I would order an expedition to find the gems I buried and forgot about. &quot;Oh, that bunch of tabs comes from the period when I wanted to learn Haskell! A very valuable finding!&quot;</p>
<p>If I was a tab geologist, I would order a vertical cut through the layers of rocks, look at the patterns, and guess the macro-scale processes that unconsciously guide my tab hoarding today.</p>
<p>Actually, maybe I am a tab geologist!</p>
<h2>Simple Tab Groups</h2>
<p>I have this folder on my computer. It contains the history of my open tabs – daily snapshots going back two years. Here's an example file called <code>stg-backup-2022-01-23@drive4ik.json</code>:</p>
<pre><code>{
    &quot;version&quot;: &quot;4.5.2&quot;,
    &quot;groups&quot;: [
        {
            &quot;id&quot;: 82,
            &quot;title&quot;: &quot;default&quot;,
            [...]
            &quot;tabs&quot;: [
                {
                    &quot;url&quot;: &quot;https://en.wikibooks.org/wiki/Haskell&quot;,
                    &quot;title&quot;: &quot;Haskell - Wikibooks, open books for an open world&quot;
                },
</code></pre>
<p>You see, I'm not the only one who uses tabs extensively. There's a lot of Firefox extensions to make them more powerful, and the one I settled on is called <a href="https://addons.mozilla.org/en-US/firefox/addon/simple-tab-groups/">Simple Tab Groups</a>. That's where the snapshots come from.</p>
<h2>Excavation</h2>
<p>Equipped with the historical snapshots and a string of Python, I dug into my own past. The entire 2 years stood in front of me, and revealed some facts that I might not have liked to know. The worst of them all is that I keep descending. When the first measurement was taken, I had merely 120 tabs open, and I peaked at 298. After days of concerted effort, I'm back to 205.</p>
<p>You can see it clearly here, on the age diagram:</p>
<p><img src="age.png" alt="age graph" /></p>
<p>It shows the general progression from prehistory on the left to modernity on the right. On the vertical scale, each non-white pixel is one open tab (yes, tabs are attached to the top of the image, fight me). Each day has a different colour assigned to it, and a tab opened on that day stays that color for its entire life.</p>
<p>Okay, that's not <em>exactly</em> right. Those dark horizontal streaks and dots – I think those are URLs I opened again later. Perhaps the front page of a news site, or simple the empty new tab.</p>
<p>As you can see, tabs opened in ancient times are assigned black, and later, differently colored layers gradually accumulate.</p>
<h2>Clusters</h2>
<p>While the smooth colors show the long term layering, they don't show the patterns: do tabs settle down one by one, or do they get forgotten in batches?</p>
<p>Another colouring helps answer this. Here, neighbouring days have contrasting colours:</p>
<p><img src="days.png" alt="days graph" /></p>
<p>And the answer is… a single tab can get forgotten, but about as often as a bunch of tabs. I wonder if I'm special in this regard.</p>
<h2>Prehistory</h2>
<p>But the last view still didn't shine any light on the dreaded dark layer. Thankfully, while dating is impossible, we can still analyze the material each tab was made of. i decided to colour each domain a different colour. Here's the result:</p>
<p><img src="domains.png" alt="domains graph" /></p>
<p>It's curious to see that there is clustering here too. What could the wide bands correspond to? A Wikipedia binge? Opening lots of comment threads on a social network? That kind of makes sense, until you notice that each wide band has a different colour – different domain! They correspond to some sorts of deep dives into topics I guess.</p>
<h2>Statistics</h2>
<p>A side effect of having access to that data is that I can also take some simple statistics. It turns out that, over 641 days,</p>
<ul>
<li>I visited 3659 URLs</li>
<li>across 1213 domains,</li>
<li>my last tab was usually something on <a href="https://news.ycombinator.com">Hacker News</a>,</li>
<li>and I beat the record of tab hoarding on 2022-01-28 with 298 tabs open.</li>
</ul>
<h2>Pickaxe and brush</h2>
<p>I'm not one to hoard the tools of my trade, however. Any aspiring introspector can take advantage of what I did here.</p>
<p>You'll need Python 3 and the Pillow library. Then, <a href="tas.py">download this script</a>, and run it:</p>
<pre><code>python3 tas.py ~/Downloads/my-tabs/ my_group image_name
</code></pre>
<p>Remember to replace <code>my_group</code> with the name of the tab group you wish to dig into!</p>
<p>So, are there any other Simple Tab Groups users here? Please show us what your hoarding looks like. If not, see you in another year!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Rust on Maemo</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/rust_maemo/"/>
       <id>tag:dorotac.eu,2021-08-05:posts/rust_maemo</id>
       <updated>2021-08-05T14:00Z</updated>
       <published>2021-08-05T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Rust on Maemo</h1>
<p>Everyone wants to run Rust on their phone, right?</p>
<p>But it's not so easy when your pocket device is the Nokia N900. You'd think that I should be using the <a href="https://puri.sm/products/librem-5/">Librem 5</a>, but my pockets disagree. Actually, I agree with them, too, at least as long as I use the device to collect GPS tracks.</p>
<p>The phone is over 10 years old by this point, and it's not open, which means it's running outdated software. Nokia stopped supporting Maemo around 2011, and the last <a href="https://wiki.maemo.org/Community_SSU">community update</a> happened a couple of years ago too. Despite that, I'd like to run some of my own code on it. The canonical way to do it is to use the SDK via a <a href="https://github.com/accupara/docker-images/tree/master/mobile/maemo">Docker image</a>, but it's rather clumsy.</p>
<h2>GNU libc</h2>
<p>My first attempt to compile code for the N900 was to use a recent version of Debian, running on an emulated ARM CPU. It didn't go far: after copying to the phone, the program failed to run. Glibc is too old.</p>
<p>Easy, I thought! I can always package up glibc together with the program via static linking. As I tried it out, I was greeted with a crash message along the lines of:</p>
<blockquote>
<p>This kernel is too old.</p>
</blockquote>
<p>Jolly, so Linux version 2.6.28 doesn't meet modern standards. It seems my road is blocked, and this was just a &quot;hello world&quot; C program.</p>
<h2>Musl</h2>
<p>A while later, I realized that the kernel complaint probably came from glibc as well. But Rust doesn't have to use glibc, it can also use musl as the C library! Let's check Rust's <a href="https://doc.rust-lang.org/nightly/rustc/platform-support.html">platform support page</a>. The CPU on the N900 supports the ARMv7 instruction set, so we're looking for <code>armv7-unknown-linux-</code>. And here we go: apart from <code>armv7-unknown-linux-gnueabi</code>, indicating the usage of glibc, there's also <code>armv7-unknown-linux-musleabi</code> for musl.</p>
<p>Looking closer at musl's <a href="https://www.musl-libc.org/faq.html">FAQ page</a> and the section on requirements:</p>
<blockquote>
<p>Linux 2.6 or later.</p>
</blockquote>
<p>Awesome! That means our ancient kernel is supported.</p>
<p>Rust has the ability to cross-compile, which makes testing the solution easy. I picked up [my earlier project] related to the N900, added a <code>.cargo/config</code> file, and typed in the following commands:</p>
<pre><code>dnf install -y gcc # needed for syn at compile time
dnf install -y lld # no armel gcc toolchain for Fedora, so use clang linker cause it does armel out of the box
curl https://sh.rustup.rs -sSf | sh -s -- -t armv7-unknown-linux-musleabi
cd /mnt/src/grconverter
source $HOME/.cargo/env
cargo build --target armv7-unknown-linux-musleabi
cp target/armv7-unknown-linux-musleabi/debug/read /mnt/n900 # copy onto the phone
</code></pre>
<p>…and it's alive!</p>
<pre><code>$ ./read MyDocs/gpsrecorder/gpstrack-20210804-140132.gpsr | head
Header(
    Header {
        magic: [
            71,
            80,
            83,
            82,
        ],
        format: 3,
        time: 1628078492,
</code></pre>
<p>Or at least half-alive. Linking to external libraries is still an open question, but the standard library works, which means pure-Rust crates are good to go.</p>
<p>Happy hacking!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Reverse engineering a pen</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/REing a pen/"/>
       <id>tag:dorotac.eu,2021-07-15:posts/REing a pen</id>
       <updated>2021-07-15T14:00Z</updated>
       <published>2021-07-15T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Reverse engineering a pen</h1>
<p>Like a lot of people, I have a favorite pen. I ran across it randomly, but I fell in love with it instantly. It glides across paper with barely any pressure, leaves strong, confident, and uniform streaks, and it's built like a tank to boot. But, unlike most pens that are worthy of appreciation, this pen is really many pens – it's disposable.</p>
<p><img src="pen.png" alt="A red ink Uni-Ball Eye pen" /></p>
<p>While the ink is bright an saturated, and there's a selection of colours, the Uni-Ball Eye is not meant to be refilled. To someone trying to reduce my waste production, using up one perfectly good pen every year is completely unacceptable. Knowing tricks back from the bad days of ink jet printers, I bought a vial of fountain pen ink, a syringe, and I set out to refill the carcass of my most recent pen.</p>
<p>Knowing that this is a blog post about refilling a pen, you might already guess that it wasn't as easy as it sounds.</p>
<h2>Sesame open</h2>
<p>Before refilling, I destroyed a previous used up carcass to find best ways to inject liquid into the pen. It took me a while with the scissors – the outer shell is <em>really</em> thick – but eventually I found out that the back end has a thinner wall, and a needle did the job.</p>
<p>I also dissected the funny comb-like part near the tip that always looked like it filled with a little bit of ink. It's actually a second ink vessel, whose role I discover while refilling. Surprisingly, this vessel is not connected to the tip. There's a wick going from the tip straight to the big compartment, <em>inside</em> the tubular, comb-like second vessel.</p>
<p>Here's a diagram:</p>
<p><img src="diagram%20annotated.svg" alt="A diagram of a pen. From end to tip, a useless? hole leading to an empty container. A weak wall separating it from the ink container, which contains a thin wick leading all the way to the tip. The ink container ends in a permeable wall, which leads to a container defined by a &quot;comb&quot; structure. The &quot;comb&quot; is a pipe with the wick inside, and many flat rings on its outside. The rings are not reaching the container wall. Two guessed passages connect the end of this container with two &quot;nostrils&quot; near the tip." /></p>
<p>(The walls are much thicker in reality.)</p>
<h2>Shooting it up</h2>
<p>For the first refill, I simply poked a hole in the back of the pen, placed the conjoined pen and syringe horizontally, and squirted in a few drops of the liquid. The wick became flooded with ink, and I could write again! But… all the ink slowly seeped into the comb container, leaving the wick dry after a minute. Even worse, two &quot;nostrils&quot; near the tip readily released ink as long as the comb was full. What's going on?</p>
<p><img src="comb.png" alt="Comb compartment" /> This comb container on a new pen is already partially filled. The red streaks are ink.</p>
<p>An obvious culprit is gravity, but the ink wasn't flowing all at once, so there must be more to it.</p>
<p>The comb shape drew my attention. Being composed mostly of walls, it gives the liquid plenty of tight spaces to which it can stick. Did the manufacturer intend to make use of the capillary effect? It would help liquid flow into the comb.</p>
<p>There was the open question of whether the pen was originally airtight, but adding pressure into considerations would make the problem too complicated, so I followed the next obvious lead.</p>
<h2>Capillary effect</h2>
<p>If I could prevent the comb compartment from filling up, ink would stay in the main vessel, and I would write indefinitely. The only difference, apart from the opened end, between my refill and the original, was the ink used. Perhaps the original ink was tuned specifically for this pen design. Hoping to alter the liquid to be less eager to climb up the comb, I tried a couple mixes with glycerin and water.</p>
<h2>Nosebleed</h2>
<p>I tried those tests also with the syringe hole plugged. That's when something unusual happened: the nostrils produced bubbles of air, like a kid who dropped an ice cream. Clearly, that was not just gravity's fault: air doesn't flow out under its own weight.</p>
<p>The story of fountain pens leaking inside airplanes came to my mind. Perhaps this pen had similar considerations?</p>
<h2>Air pressure</h2>
<p>I had reasons to believe that yes, they do. First off, why would a pen need &quot;nostrils&quot; connecting to the ink compartment, if not to equalize pressure? Coupled with the comb compartment, this made sense. When the pen is in an upright position, ink flows towards the tip, and there's a pocket of air near the butt. That air would expand when surrounding air pressure drops, and push out ink from the main vessel. The comb area would contain the excess, and the comb structure itself would prevent it from flowing out.</p>
<p>Every time I tried to use the pen in the morning, it cried me tears of ink, flourishing my handwriting in a special way. Pressure is the hint. Mornings are cold. My hands are warm. Gases expand with temperature. Air pushes out ink. It all makes sense!</p>
<h2>Flawed methodology</h2>
<p>Sadly, my tests so far were all flawed: I never allowed the comb to dry up so far, and if the pressure theory was right, then the pen could work correctly only when the comb had plenty of empty space. I came up with a new procedure:</p>
<ul>
<li>use the pen until all the ink was released – via the tip or the nostrils</li>
<li>carefully plug the injection point.</li>
</ul>
<p>If my theory was right, then the following would happen:</p>
<ul>
<li>the liquid would have enough surface tension to not flood the comb after a few hours</li>
<li>the wick would stay submerged in ink, and never dry up</li>
<li>writing on cold mornings would make the comb fill up a little</li>
<li>but it would not be enough to fill up the comb and cause spills.</li>
</ul>
<h2>Experiments</h2>
<p><img src="blot.png" alt="A red ink blot" /> My daily companion when researching this article: the ink blot.</p>
<p>The plan didn't work out flawlessly, and every attempt was about as fun as watching ink dry. That's because using up all ink took up ages, and the ink blots added a special flourish to my handwriting. I made several failures to plug the injection point: it turns out that just sticking a needle in there is not enough, and ink flows out the back. What's good enough is glueing it up using some plastic foil and liquid latex.</p>
<p><img src="plug.png" alt="The end of a pen plugged with a piece of paper, all soaked with red." /></p>
<p>While that works to keep the ink out the back end, it turns out that the ink still flows out the front! That invalidates my theory. Or does it?</p>
<p>I didn't want to seal the pen irreversibly in case that I needed to adjust something, so the end was probably letting air seep in a little. I left the pen overnight in different positions, and, after watching the ink in the comb change, concluded that gravity was responsible for the failure this time.</p>
<p>Without any better ideas, I went ahead and mixed in some glycerin into the ink vessel, which seems to have stopped the outflow. Success!</p>
<h2>Inkblots</h2>
<p>That last experiment proved the minimal changes needed for the Uni-Ball Eye to be reuseable:</p>
<ul>
<li>use ink with about 10% glycerin</li>
<li>refill from the back while the comb compartment is not full</li>
<li>plug in the refill hole afterwards.</li>
</ul>
<p>The comb compartment is probably a way to avoid leaks due to pressure.</p>
<p><img src="comparison.png" alt="Two pens next to similar squiggles of red ink. The bottom is broader and brighter than the top one." /> The top pen in this picture is the refilled one. In reality, both inks look darker.</p>
<p>The ink I used is not as intense as the original one, and it doesn't flow so easily, but on the other hand, it's a bit thinner, and still flows decently enough. It can work as a cheap alternative to a new pen, which normally costs about 3 EUR per piece.</p>
<h2>Lessons learned</h2>
<p>Turns out that there's a good deal of thought and complexity embedded in every day objects. Physical objects are take more effort to understand than as software. You need to control your environment, retrying the experiment can take a lot of time, and you may get yourself dirty in the middle. None of this will be a surprise to an experimental scientist, of course.</p>
<p>It's still a joy to explore, and the results are going to be a fair bit easier to explain to a random human – the outcome is satisfyingly concrete!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Start in the middle</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/in_the_middle/"/>
       <id>tag:dorotac.eu,2021-07-11:posts/in_the_middle</id>
       <updated>2021-07-11T14:00Z</updated>
       <published>2021-07-11T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Start in the middle</h1>
<p>It's a piece of advice I read years ago in the depths of the 'Net, which I can't find despite best efforts any more. It's surprising, because it's such an obvious guidance! Especially for those easily distracted, or prone to losing motivation.</p>
<p>TL;DR: When you work on something new, jump into the thick of it. Don't let yourself be sidetracked, and attack directly what interests you.</p>
<h2>Storybuilding</h2>
<p>Homer put Odysseus <em>in media res</em> as Odyssey begins: he's trying to return home after being shipwrecked. There's little backstory, or discussion about motivations. You focus on the current state, and wonder: how did he end up so? What challenges will he encounter? Can he succeed? You're hooked, and keep reading to see the questions answered.</p>
<p>It's good news for the distracted creator, because those are the same questions you may ask yourself when you have an awesome idea for a new project. What are the challenges? Can the idea succeed? You can get hooked on your own project by cutting out the fluff, and jumping straight into it.</p>
<h2>Focus</h2>
<p>No one has built Skynet yet. If you are overly ambitious, chances are that you will have to build a lot of supporting infrastructure. Instead, choose the core element and stick to it. If you want to take pictures of all the graffitti in your town, don't bother with setting up a web site, just take photos. If your idea is to program a fancy reverb effect, don't bother turning it into a library.</p>
<p>In fact, don't bother with all the surroundings. Before you write all the READMEs, set up a documentation site, and publish the library on your favourite web site, you're risking that you run out of energy to actually do the thing you set out to do.</p>
<h2>Side tracks</h2>
<p>Sometimes you might want to build something that is so original that no one even built the tools you could use. A spaceship out of water bottles? Perhaps needs heavy duty glue, and all the glues you found are suboptimal. Suddenly, you're spending days researching chemistry, and the spaceship project is not moving anywhere.</p>
<p>Everything is cool if you enjoy your glue side project, but if your only motivation is to build a rocket, then perhaps you need to cut your losses and use one of the store glues. Get back to the middle: the spaceship, before you start hating the project! Once you nail the central part, either you're going to have so much motivation you'll deal with the side projects, or you will blissfully not care any more. Win-win!</p>
<h2>Software</h2>
<p>The article is written about general projects, but software is especially prone to it, because there's so much complexity and potential challenges. I wouldn't have finished <a href="/posts/fluwid">fluw</a> if I didn't take this approach. Thankfully, knowledge stays, and the more you build, the easier it will be to choose tools and minimize the time spent not in the middle of things.</p>
<h2>Protagonist</h2>
<p>You're the protagonist in your life, and challenge you create for yourself are stories you live. So take advice from the art of writing, start in the middle, and chances are that you're going to enjoy more what you do.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>When being a nerd paid off</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/nerd/"/>
       <id>tag:dorotac.eu,2021-07-06:posts/nerd</id>
       <updated>2021-07-06T14:00Z</updated>
       <published>2021-07-06T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>When being a nerd paid off</h1>
<p>A few days ago, my knowledge of obscure computing facts surprisingly saved the day! Surprisingly, because sticking to garden variety best principles would have left me worrying. What was the deal? It's about a deleted photo on my camera.</p>
<hr />
<p>It was a sunny day, so I copied photos off my camera onto the computer, and set out with a sense of journey and a clean storage. Or so I thought: when I stopped to take a photo, my camera complained about full storage, so I selected &quot;delete all photos&quot;. While the screen showed &quot;erasing&quot;, I realized in horror: I took a cute picture of a dilapidated wagon already! And I'm not going back there again to take another one.</p>
<p>Common advice in those situations says: don't use the storage medium under no circumstances, and attempt recovery on a computer. Then you might be lucky and keep your deleted data from being overwritten.</p>
<p>This advice is right, but the trip was just beginning, and I was not ready to stop taking photos. Thankfully, my nerdy sense told me that I have a very good chance to get the picture back even if I take more photos – as long as I don't go crazy with them. That's due to some secrets that I learned about technology:</p>
<ul>
<li>computers (and therefore digital cameras) are made by humans, and humans like simplicity,</li>
<li>modern storage is logically structured a bit like a cassette tape (or a photo film),</li>
<li>&quot;deleted&quot; data does not actually get cleared until it's overwritten by new data.</li>
</ul>
<p>You might be surprised to learn about the similarity of storage to tape, but it's there on hard drives, actual tape drives, SD cards, and SSDs. The structure exposed to the programmer is of a long string of small pieces of data (sectors), each of which has a number, starting from 0, almost like houses on the street.</p>
<p><img src="sectors.svg" alt="A drawing of numbered houses on Sector Street, numbers 1 to 5" /></p>
<p>Accessing sectors (by writing or reading) is simplest when it's done in order. We're only humans, and why use an elaborate scheme if we can just move from one to the next? And so, the reading head slides steadily on the tape, the film in a camera moves by one frame after each photo, the postman visits houses in order. And it's not just out of convenience: it's clearly the fastest when there's a moving physical object involved. Surprisingly, writing in order is even the fastest way to <a href="https://www.anandtech.com/show/2738/25">write</a> data on SSDs, which have no moving parts!</p>
<h2>Append only</h2>
<p>This must be coupled with an observation: cameras – or at least my camera – only add new pictures onto the SD card (our &quot;tape&quot;), and when deletions happen, the entire contents of the card are marked as deleted. Just like in a film camera: once a photo is taken, it's taken. We can replace the entire tape, but we don't replace a single picture with a blank.</p>
<p>This makes life easier, because imagine if we had a tape where we could erase already taken pictures. At first it's okay: we take 30 pics, and we hit the end. Then we erase pictures number 7, 2, 17, and 22. To take another picture, we would have to fiddle with the tape to find an empty spot, then rewind it to the right position (make sure it's not overlapping!), and do it again until the tape is full again. That's no fun. It's easier if pictures come one after another.</p>
<p>And while I don't have any evidence, I assume that's how my digital camera works, too. It's not a general purpose computer, so it doesn't have to deal with a lot of deletion. I can safely assume that the picture of the wagon (which was the last picture on my SD card before it filled up) will not occupy the same sectors as the first 10 pictures I take after the card is erased, and that it will not get overwritten.</p>
<p>With a peace of mind (and ready to be proven wrong), I continued to take photos of the trip. When back home, I fired up <em>testdisk</em>. Lo and behold, I recovered the picture!</p>
<p><img src="wagon.jpg" alt="A dilapidated wooden wagon with one wheel modern and one missing" /></p>
<h2>The real world</h2>
<p>Okay, maybe I just got lucky. In reality, there's one thing that changes size. For SD cards, it's usually the File Allocation Table. It contains the numbers of sectors which contain pieces of each file, along with the names, and a bunch of extra info. As enough files and directories get added, it probably changes in size.</p>
<p>Another complication is the mysterious &quot;PRIVATE&quot; directory. I presume this contains only hibernation data to make the camera start faster, and I'm not seeing any reason for it to change size at all (what kind of generated data does a camera need anyway?).</p>
<p>In the end, what matters is not exactly that sectors get written from 0 up. What matters is that there is a defined order of writing which doesn't change between full card erasures. That's much more difficult to verify.</p>
<h2>FAT32</h2>
<p>FAT32 doesn't actually mark sectors as &quot;deleted&quot;. It marks the entire file as deleted, without permanently erasing any of its information until it's needed. Testdisk knows this, and it makes recovery a breeze. The entire operation took 5 minutes.</p>
<h2>CAUTION</h2>
<p>This is all based on high level guessing, so don't rely on this. I made peace with losing the deleted picture when I decided to keep shooting new ones. Basic advice is still best: if you accidentally delete something important, immediately stop using the storage, <strong>do not mount the file system</strong>, and make a <em>block level</em> copy. And why don't you restore it from a backup anyway?</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Linux kernel flame graphs</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/flame_graph/"/>
       <id>tag:dorotac.eu,2021-06-19:posts/flame_graph</id>
       <updated>2021-06-19T14:00Z</updated>
       <published>2021-06-19T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Linux kernel flame graphs</h1>
<p>This post is a nothing-burger. All I wanted was to have those words together on the Internet: &quot;Linux kernel flame graph profiling&quot;, because I spent about an hour looking for something I already knew would take 2 minutes to execute.</p>
<p>So, to whoever comes to ask the same question as I did back then, the solution is not <code>perf</code>, it' also not <code>trace-cmd</code>, but instead it's eBPF-based. It's <a href="https://github.com/iovisor/bcc/">bcc</a>. Read the <a href="https://github.com/iovisor/bcc/blob/master/tools/profile_example.txt">description of the profile tool</a>. TL;DR:</p>
<pre><code>./bcc/tools/profile.py -dfK &gt; profile.flame
flamegraph.pl &lt; profile.flame &gt; flame.svg
</code></pre>
<p>Conveniently, Fedora has both flamegraph and the bcc library (not the tools) in the repos already.</p>
<p>This is what comes out the other end:</p>
<p><img src="flame.svg" alt="Flame graph showing multiple programs including Web, Xorg, plasmashell. There are 2 large spikes named , and two spikes called , with  up on the stack. Those two take up almost 50% width" /></p>
<p>As you can see, my system is spending <em>a lot</em> of time in the kernel, messing with btrfs. Then it overheats, causing it to spend <em>even more</em> time in the kernel, due to idle injection. Then, it gets really sluggish, making the operator look for kernel profiling tools, and the operator finally writes a blog post about this.</p>
<p>The bug looks like <a href="https://bugzilla.kernel.org/show_bug.cgi?id=212185">this report</a>.</p>
<h2>Nothing-burger</h2>
<p>If you thought I had something valuable to say, then… Well okay, maybe I shouldn't write blog posts if they are useless to anyone but me. You win. Give me a moment, I'll come up with something insightful to say.</p>
<hr />
<p>Maybe you're not familiar with flame graphs, but they are a really nice way to check for performance issues. They show the amount of time the resource (in this case the CPU) is busy with each task. The width of the graph is 100% time, and the width of the different bars is the portion of time occupied by something.</p>
<p>This flame graph does not distinguish between different runs of the same task. If we were monitoring tasks done by Joe over the day, and he'd spend the morning slicing potatoes, then chopped wood, and then sliced potatoes again, this kind of a graph would only distinguish two tasks: slicing potatoes and chopping wood.</p>
<p>This makes sense when you consider why the bars are stacked vertically: each bar on top of another is a sub-task. If a task is composed of a hundred sub-tasks, presenting them together makes the picture clear. When we care about optimization, then it's useful to know how the total time spent peeling compares to the time spent cutting. The times for each potato aren't so relevant. It doesn't even matter a lot how we order the results on the left-right axis!</p>
<p>It's especially relevant for computers, where flame graphs may be used to get an idea how much time is spent in a procedure that may be called a thousand times. While the graph won't say how many times it was executed, it will show where it was called from.</p>
<h2>Bonus</h2>
<p>This is about the kernel, which is notoriously hard to debug, so let me share my most useful snippets for debugging. Those use trace-cmd, and give extra information without the need to recompile.</p>
<ol>
<li>Prints the names of functions (with some exceptions like inlining) along the execution path:</li>
</ol>
<pre><code>sudo trace-cmd record -p function_graph -g start_procedure -F ./program_under_test
</code></pre>
<ol start="2">
<li>Prints the call stack whenever the chosen procedure is executed:</li>
</ol>
<pre><code>sudo trace-cmd record -p function_graph -g procedure_under_test -n printk -n dev_printk_emit -F ./program_under_test
</code></pre>
<p>View results with <code>trace-cmd report</code>.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Simulating fluwids</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/fluwid/"/>
       <id>tag:dorotac.eu,2021-04-06:posts/fluwid</id>
       <updated>2021-04-06T14:00Z</updated>
       <published>2021-04-06T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Simulating fluwids</h1>
<p><a href="https://xkcd.com/356/">Nerd sniping</a>. It doesn't apply only to mathematical puzzles, and the victim doesn't have to be another person. It's not uncommon to nerd snipe oneself with a nifty programming idea. Especially one that looks simple enough to turn into a working demo in a matter of hours, if not minutes.</p>
<p>This is a story of how I decided to have a working fluid simulation later in the evening.</p>
<hr />
<p>This article uses MathML for rendering equations. If you don't see any, consider using a browser that can display maths, like <a href="https://www.mozilla.org/en-US/firefox/new/">Firefox</a>.</p>
<h2>Divergence</h2>
<p>The idea entered my consciousness as I was daydreaming about cells floating in liquids, to give <a href="https://github.com/dcz-self/breedmatic">Breedmatic</a> a biological special effect. As I considered what makes things flow, I was struck with a realization: a simple flowing fluid can be modelled with one equation!</p>
<p>If we simulate the fluid flow as a vector field <em>F</em>, then any volume <em>V</em> inside it satisfies our first equation:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><msub><mi>∬</mi><mi>S</mi></msub><mi>F</mi><mi>n̂</mi><mi>⋅</mi><mi>d</mi><mi>S</mi><mo>&#x0003D;</mo><mn>0</mn></mrow></math><p>where <em>n̂</em> is the normal vector to the surface <em>S</em>.</p>
<p>In simpler words, we can slice the fluid into any volume <em>V</em>, but the outflow of fluid is always going to be 0.</p>
<p>This itself is derived from the <strong><a href="https://en.wikipedia.org/wiki/Divergence">divergence</a> formula</strong>:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>d</mi><mi>i</mi><mi>v</mi><mrow><mo stretchy="false">&#x00028;</mo><mi>F</mi><mo stretchy="false">&#x00029;</mo></mrow><mo>&#x0003D;</mo><munder><mo>lim</mo><mrow><mi>V</mi><mi>→</mi><mn>0</mn></mrow></munder><mfrac><mrow><msub><mi>∬</mi><mi>S</mi></msub><mi>F</mi><mi>n̂</mi><mi>⋅</mi><mi>d</mi><mi>S</mi></mrow><mi>V</mi></mfrac></mrow></math><p>Divergence is essentially the measure of the amount of &quot;fluid&quot; coming out from a given volume. Except in this case, the volume is infinitely small, and divergence is defined at any <em>point</em> inside the vector field.</p>
<p>Can you see how simple and powerful the concept is? Fluid into = fluid out of. We could easily create a grid representation of a 2D fluid that upholds this property. Start with a trivial grid, then subdivide it cleverly… until a large and pretty flowing vector field is formed. Piece of cake, that shouldn't take more than 1 hour of work.</p>
<h2>The grid</h2>
<p>I took inspiration for the computer representation of the map from wind forecasts. Like this example from <a href="https://mapy.meteo.pl/">meteo.pl</a>:</p>
<p><img src="windmap.png" alt="Baltic Sea and neighboring lands overlaid with wind barbs between 5 and 25 knots. Most of them point East, but on the Eastern shore they turn North" />
This map uses <a href="https://a.atmos.washington.edu/wrfrt/descript/definitions/windbarbs.html">wind barbs</a> to indicate wind strength and direction in each cell of the underlying grid.</p>
<p>If we use square cells, it becomes a straightforward computer analogue of a vector field, with one important difference: the cell size inside a vector field is infinitely small. Computers can't do that (easily), but it's good enough for me.</p>
<p><img src="grid.svg" alt="A grid with an arrow in every cell" /> The grid keeps the direction of flow in every cell.</p>
<p>Our first equation still needs to be upheld here, and not just in an abstract form. The easiest way to do that is to use one of the many formulas approximating divergence on a grid that we can find online. They have the advantage of being simple to calculate, operating on just 4 neighboring cells. Instead of writing them down, I will show what they calculate:</p>
<p><img src="grid_divergence.svg" alt="A grid with 4 cells in a square. with distances between the start and the end of each arrow marked with a letter. a to d Caption reads &quot;div=a+b+c+d&quot;" /> Divergence is approximated by 4 components, one for each field. Each says how much the flow points outward of the middle of the group of 4 cells, and then they are summed up.</p>
<p>Our goal is to keep the inwardness of the arrows equal to their outwardness. Simple, right?</p>
<hr />
<p>Writing this project, I made a couple of mathematical assertions that I can't be bothered or am not able to prove. One of those is wrong, can you spot it before the grand reveal?</p>
<hr />
<p><strong>Assertion 1</strong> enter stage right.</p>
<p>That if divergence = 0 in every possible group of 4 neighboring cells on the grid, then it's not possible to find any set of cells where divergence ≠ 0.</p>
<p>I think it's reasonable, because groups of 4 neighboring cells are overlapping, and cover the entire fluid together with their divergence = 0 guarantee. If it's false, then divergence could appear on a larger scale without appearing on a smaller scale first.</p>
<p><em>Correction:</em> There is a hidden mistake here, that nevertheless doesn't affect algorithms as described in this post. Imagine a grid of cells where those in odd rows flow left, and those in even rows flow right. Taking each pair, the divergence is 0, but take 3 of them, and there will be 2 of one kind and 1 of the other. Unbalanced, and diverging! A better formulation should state:</p>
<p><strong>Assertion 1 (corrected)</strong></p>
<p>If divergence = 0 in every possible group of 4 neighboring cells on the grid, then it's not possible to find any set of cells where divergence ≠ 0, if that set is made of non-overlapping groups of 4 neighboring cells.</p>
<p>If we consider only pieces with 0 divergence, then the collection of them obviously has 0 divergence too.</p>
<p>Thanks to Roman for pointing it out.</p>
<h2>Building the grid</h2>
<p>We have an equation which is defined on a 4-group. How do we create a grid of size, say, 16×16 cells while maintaining the equation? Obviously, divide and conquer.</p>
<p>Start with a single 4-group. Insert some flow between the cells. Split it while maintaining the equation, and then insert some extra flow. Rinse, repeat.</p>
<p><img src="grid_growing.svg" alt="A 2×2 grid with zero flows gains some flows, and then turns into a 4×4 grid with 4 big flows, one in each quadrant. Finally, the 4×4 grid gets a separate flow in each cell." /></p>
<p>Unrolling this step by step, we start with a group of 4 undisturbed cells of fluid flow.</p>
<p><img src="grid_4_start.svg" alt="4 cells in  rows, each with a zero symbol inside." /></p>
<p>There is no flow, so divergence = 0 and our requirement is maintained. So far so good. But how do we introduce some flow into it? A fluid that doesn't flow makes for a boring simulation indeed.</p>
<p>Introducing stir. Notice that divergence will not be changed if the fluid keeps flowing in circles, no matter how fast, so let's do that:</p>
<p><img src="grid_4_stir.svg" alt="4 cells in arrows chasing each other, marking that no arrow points outwards. Each arrow has a letter &quot;a&quot; to &quot;d&quot; next to it, and all are =0. Caption says &quot;div=a+b+c+d=0&quot;" /> Flow in a stirred 4-group.</p>
<p>Great, now let's split it and cross the boundary from a 2×2 grid into a 4×4 world. We need new assertions here:</p>
<p><strong>Assertion 2</strong>: if a single cell is split into 4 cells all of the same flow, the resulting 4-group has 0 divergence.</p>
<p>This seems straightforward to me. First, a single cell in our model cannot have any divergence. Fluid comes in, fluid goes out. Making it bigger doesn't change anything, the flow is still straight through.</p>
<p><strong>Assertion 3</strong>: when a grid of 0 divergence can be split into 4-cells by turning each cell into 4 identical ones, the resulting divergence stays 0.</p>
<p>This one reaches a bit farther, but it's essentially the same principle: making a small thing bigger doesn't change the basic properties of the whole.</p>
<p><img src="grid_split_4.svg" alt="A grid of 4 arrows turns into a grid of 16 arrows, with each quadrant having only 1 kind of arrow." /> This is what the grid looks like after splitting.</p>
<hr />
<p>With assertions stated, we can do the stirring again:</p>
<p><img src="grid_stir.svg" alt="A 4×4 grid with curly arrows embracing each intersection point between grid lines." />
We stir around each middle point, each with a different force.</p>
<p>But now we hit another assertion!</p>
<p><strong>Assertion 4</strong>: Stirring a 4-group does not affect the divergence of any overlapping 4-groups.</p>
<p>This one sounds less reasonable. Let's quickly check the basic cases with stir magnitude = a.</p>
<p>First, is overlapping on the side enough to disturb the neighbors? The divergence before is indexed with 0, and after applying neighbor's stir with 1.</p>
<p><img src="group_overlap_side.svg" alt="Shows a grid made of two 4-groups overlapping with 2 cells, and marks the contribution of new arrows from one 4-group to the other 4-group as &quot;a&quot; and &quot;-a&quot;." /> The green 4-group and the one to the left share 2 cells. The one to the left was stirred with force <em>a</em>.</p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>d</mi><mi>i</mi><msub><mi>v</mi><mn>1</mn></msub><mo>&#x0003D;</mo><mi>d</mi><mi>i</mi><msub><mi>v</mi><mn>0</mn></msub><mo>&#x0002B;</mo><mi>a</mi><mo>&#x02212;</mo><mrow><mo stretchy="false">&#x00028;</mo><mo>&#x02212;</mo><mi>a</mi><mo stretchy="false">&#x00029;</mo></mrow><mo>&#x0003D;</mo><mi>d</mi><mi>i</mi><msub><mi>v</mi><mn>0</mn></msub></mrow></math><p>The extra flow comes in, but it leaves again. What about overlapping on the corner?</p>
<p><img src="group_overlap_corner.svg" alt="Shows a grid made of two 4-groups overlapping with 1 corner cell, and marks the contribution of new arrows from one 4-group to the other 4-group as 0." /></p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>d</mi><mi>i</mi><msub><mi>v</mi><mn>1</mn></msub><mo>&#x0003D;</mo><mi>d</mi><mi>i</mi><msub><mi>v</mi><mn>0</mn></msub><mo>&#x0002B;</mo><mn>0</mn><mo>&#x0003D;</mo><mi>d</mi><mi>i</mi><msub><mi>v</mi><mn>0</mn></msub></mrow></math><p>Here, the flow is perpendicular to the center between the 4 cells, so it never even enters the equation. Great!</p>
<p>Assertion upheld, we have all the pieces to divide and conquer. Both splitting and introducing stir is ours to have. It's time to implement the algorithm and enjoy the results.</p>
<h2>The reckoning</h2>
<p>After all this thinking, armed with the above knowledge and confidence, writing the code seemed like a formality.</p>
<p>Not so. As I progressed through stages of stirring and splitting, a red light furiously engaged. CODE RED! CODE RED! INTEGRATION TESTS FAILING. It was the 4x4 grid. And the 8×8 grid. The divergence scanner reported anomalies never seen on smaller scales. Manual checks revealed that the checks were not just due to inaccuracies that computer calculations always make. Divergence was in places almost as strong as the average cell's flow!</p>
<p>But how? Where did I go wrong?</p>
<p>After some head-wall interactions, I extracted the offending 4-group. Half of it was the edge, which I added with an unchangeable flow of 0, the other half came from splitting a cell with a positive flow equal <em>f</em>. This is what it looked like:</p>
<p><img src="offender.svg" alt="A diagram of 2 cells: one with an arrow, other gray and with a null symbol. This diagram turns into a bigger diagram where the left 4-group has the same arrow in each cell, and all cells to the right have zeros inside. Marked are diverge components on the edge 4-group, all of which except one are 0." /> Edge of the grid can never have any flow other than 0, so it's marked in gray.</p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>d</mi><mi>i</mi><mi>v</mi><mo>&#x0003D;</mo><mi>a</mi><mo>&#x0002B;</mo><mi>b</mi><mo>&#x0002B;</mo><mi>c</mi><mo>&#x0002B;</mo><mi>d</mi><mo>&#x0003D;</mo><mo>&#x02212;</mo><mi>f</mi><mi>≠</mi><mn>0</mn></mrow></math><p>As you can see, the divergence comes out to <em>-f</em>. This proves that <em>Assertion 3</em> is wrong. Splitting can affect flow after all, on the boundary between cells.</p>
<p>But maybe I could salvage it? Find a better way to split? Come up with necessary equations? So I have tried, but never found any set of equations I could write down. I had to give up hope and search for another way.</p>
<p>At that point, the clock struck midnight.</p>
<h2>Try again</h2>
<p>Burning with even more nerd-snipedness than before, my attention turned to overanalysis. Accurate splitting got me in trouble. And the splitting comes from the way my grid looks. What are other representations of flows inside a vessel? Is there one where splitting flows is easy?</p>
<p>There is one. But it does not hold directions like a weather map. It resembles a series of barriers instead, tracking flows <em>between</em>, not <em>inside</em> cells. And divergence is easily calculated, too, for each region that would have been a cell in the previous model:</p>
<p><img src="barrier_divergence.svg" alt="4 lines arranged in a square, an arrow pointing outwards crosses each. Arrows marked a-b." /></p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>d</mi><mi>i</mi><mi>v</mi><mo>&#x0003D;</mo><mi>a</mi><mo>&#x0002B;</mo><mi>b</mi><mo>&#x0002B;</mo><mi>c</mi><mo>&#x0002B;</mo><mi>d</mi></mrow></math><p>Can it be stirred?</p>
<p><img src="barrier_stir.svg" alt="4 cells, with walls they share highlighted. The shared walls have arrows of equal size crossing them. Following the arrows brings us back to starting cell." /> When the stirring flow crosses into a cell, the same amount leaves the cell, leaving its divergence unchanged.</p>
<p>And what does splitting look like?</p>
<p><img src="barrier_split.svg" alt="A square with sides marked a-d gets split with 2 lighter crossing lines in the middle. All lines are split into 2 as well, resulting in 4 squares sharing sides." /> Here, the added walls are marked in lighter gray. Arrows have been replaced by their values.</p>
<p>Almost good. The values are not defined, but they are also not horribly floating. Let's try limiting the unknowns by simply forcing the boundary values to be halved.</p>
<p><img src="barrier_split_boundary.svg" alt="The same diagram as above: 4 cells sharing 4 walls. 8 walls which are not shared are each marked as 1/2 times a-d, with the same letter on both walls on one side." /></p>
<p>The flows at 4 barriers inside still need to be calculated, but now they don't depend on anything outside this small piece. But now another piece of picture is revealed. We can calculate how much surplus fluid each subcell receives from the outside by comparing its 2 boundary edges:</p>
<p><img src="barrier_split_surplus.svg" alt="The same 4 cell arrangement, except this time each external corner has an arrow from one wall to the other. Arrows are marked s₁-s₄." /> s₁ = (b - a)/2 and so on.</p>
<p>If we distribute the surplus across the subcells, we can calculate the flows between them! But it's not so easy either. Notice that we can stir inside our 4-group, and that won't affect how much surplus fluid each cell has. That gives us some freedom we don't necessarily need.</p>
<h2>Waterfall</h2>
<p>One way to estimate flows between subcells is to look at the surplus, and consider the cells to be a waterfall. Where would the surplus flow? Downwards, from the highest point.</p>
<p><img src="surplus_waterfall.svg" alt="4 labels s₁-s₄ arranged on neighboring columns of different heights." /></p>
<p>Calculating this is quite straightforward if we know which cell has the highest surplus. The trick to do that is arranging cells in a circle, where consecutive ones share a barrier: <em>abcd</em>, and <em>a</em> again. If we create an array of <em>abcdabc</em>, we can always find the index of the highest value and start a new array there.</p>
<p><img src="array_circle.svg" alt="Different slices of length 4 from the abcdabc array." /></p>
<p>The actual algorithm practically writes itself, and is left as an exercise for the reader. It can be somewhat freeform, but if it eliminates divergence, it's good enough. My version is available in <a href="https://gitlab.com/dcz_self/fluw">fluw</a>.</p>
<h2>Simulation</h2>
<p>The new approach knows the same tricks as the old one: stirring and splitting. And it can do them without mistakes, too. But the representation is different. It's a network of barriers:</p>
<p><img src="barrier.svg" alt="4 squares made up of lines. Each shares 2 sides with its neighbors." /></p>
<p>which is not so easy to use in simulations. Thankfully, we can transform it into a grid by summing up barrier flows as components of the flow vector.</p>
<math xmlns="http://www.w3.org/1998/Math/MathML" display=""><mrow><mi>v⃗</mi><mo>&#x0003D;</mo><mrow><mi>a</mi><mo>&#x02212;</mo><mi>c</mi><mo>&#x0002C;</mo><mi>b</mi><mo>&#x02212;</mo><mi>d</mi></mrow></mrow></math><p>And we can also run our previous divergence detector, which… detects no divergence!</p>
<p><a href="https://porcupinefactory.org/data/fluw.webm">Video of the fluid simulation.</a></p>
<p>Mission complete! After several evenings of work, I ended up with <a href="https://gitlab.com/dcz_self/fluw">fluw</a>, which you can run on your computer too.</p>
<hr />
<p><em>Author's note:</em> The events didn't happen exactly as I described them here, but I prefer to spare the readers from reality which is way more boring and less dramatic than that.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>A beautiful blossom of engineering excellence</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/blossom/"/>
       <id>tag:dorotac.eu,2021-04-02:posts/blossom</id>
       <updated>2021-04-02T14:00Z</updated>
       <published>2021-04-02T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>A beautiful blossom of engineering excellence</h1>
<p>Have you ever created something perfect? Simple and functional, robust, and with a core that needs no further adjustment?</p>
<p>I have, recently. I named it <a href="https://gitlab.com/dcz_self/odwal">odwal</a>. I would have chosen a more positive name if I had known the peace of mind it bestowed upon me in the last several months.</p>
<p><em>Odwal</em> is perfectly underhanded. It doesn't look special, and I use it only to decrypt and mount my offline backup drives. But it takes care of all the tedium of the process: it deals with failures, and it stops the drives after I'm done.</p>
<p>It was not entirely my invention. The core perfectness comes from the idea of a <a href="https://en.wikipedia.org/wiki/Dependency_graph">dependency graph</a>, known to many from the <a href="https://www.gnu.org/software/make/">make</a> tool. Whereas <em>make</em> is only concerned with creating resources, <em>odwal</em> takes care of releasing them in the right order too.</p>
<p>This snippet takes care of mounting and unmounting my backup:</p>
<pre><code>(defstep backup_hdd
  (BlockStorage
    args (id &quot;ata-redacted&quot;) ; filename from /dev/disk/by-id
  )
)

(defstep decrypted
    (EncryptedBlock
        args (
            path parent.0.path
            name &quot;full1&quot;
         )
         parents (backup_hdd)
    )
)

(defstep full_partitions
    (Partitions
       args (path parent.0.path)
       parents (decrypted)
    )
)

(Mount
    args (
        path &quot;/dev/mapper/full1p1&quot;
        destination &quot;/backups/full&quot;
    )
    parents (full_partitions)
)
</code></pre>
<p>First, I connect the USB enclosure to my computer. Then, I run <em>odwal</em>:</p>
<pre><code># python3 -m odwal ./odwal.to hold Mount:/dev/mapper/full1p1
[here odwal checks for the drive]
Enter passphrase for /dev/sdc:
[here odwal mounts the partition]
Press enter to tear down
</code></pre>
<p><em>Odwal</em> asks me for the decryption password while preparing the mount, and then it gracefully waits until I no longer need it. Once I'm finished, I press enter, and <em>odwal</em> spins down the drive.</p>
<p>Even if I plug in the wrong drive, I don't have to worry. <em>Odwal</em> won't touch it. Even if I mess up the configuration, no harm is done. All I need to do is run <code>odwal clear</code>, and the world is good again.</p>
<p>The relief of using this compared to a bunch of shell scripts is impossible to describe. And it's only 400 lines of Python!</p>
<p>I hope one day I can bring my backup process halfway to the level of delight <em>odwal</em> represents.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>My objective function will blow it up</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/objective/"/>
       <id>tag:dorotac.eu,2020-11-29:posts/objective</id>
       <updated>2020-11-29T14:00Z</updated>
       <published>2020-11-29T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>My objective function will blow it up</h1>
<p>I have a preference for playing god. Build a system, observe, disrupt, see what new equilibrium it reaches. This manifests in weirdest places: gardening, cleaning, feeding the birds, computer games. Here's the story of how it turned me into a loss function and how you could become one too.</p>
<p>Computer games define the main thread today. It all played out in the span of mere weeks before publishing the post.</p>
<h2>Fuel</h2>
<p>It started from an unexpected angle this time: not <em>Black and White</em>, where you literally play god to your people, nor <em>Civilization</em> where your actions determine the prosperity of a city. Instead, what set me on the path to explosion was <em>Crimsonland</em>, namely <a href="http://phoenix.ee/old-crimsonland-1-02-1-3-0-1-4-0/">the freeware 2002 version</a>. As I mourned that it's not playable under Linux, I considered making a Libre clone. However, after succumbing to a few more waves of monsters, I also concluded that it's perfect at what it does, therefore any potential attempt to replicate it would end up a great disappointment. No point trying. I let my attention wander.</p>
<p><img src="http://phoenix.ee/wp-content/uploads/2012/04/cl1-1024x788.png" alt="Crimsonland in action, credit Phoenix.ee" /></p>
<h2>Oxidizer</h2>
<p><em>Creatures</em> is be a name you might have heard in the '90s and early '00s. A series of games that could be as sophisticated tamagotchi, they put you in charge of the well-being of humanoid creatures called Norns. I stumbled across a <a href="https://www.youtube.com/watch?v=mxperoO4Kd8">gameplay video</a>, which showed some gameplay, including sharing tips about punishing and rewarding Norns for their behaviour, breeding them, showing the contents of the neural network that makes their brains, and analyzing the chemicals flowing through their bodies. Turns out they have genes too!</p>
<p><img src="Creatures_2_Norns.JPG" alt="Norns splashing in the water. The pink one is about to jump off a pier, another is diving deep among the fishes. Credit: Wikipedia" /></p>
<p>To me, that moved the game from &quot;being a parent&quot; to &quot;playing god&quot; territory. I immediately tried it out. Turns out that playing a nurturing god is only exciting when there's something going on, and the Norns learn and explore rather slowly. As I let them be, as they walked back and forth on a second screen, my attention went astray again…</p>
<h2>Detonator</h2>
<p>My next chance encounter was the <a href="https://news.ycombinator.com/item?id=24983956">link</a> to Bevy, the game engine written in Rust. Nice, I've been wanting to learn to use an Entity Component System for a while, and I'm unwilling to deal with C. I never made games because of how tedious basic graphics are, but if I had anything to use it for, I would give it a try.</p>
<h2>Fire in the hole!</h2>
<p>And then it all clicked.</p>
<p>What if I took Crimsonland and gave the monsters a set of genes – simple enough to let them sense the enemy and walk – and culled them using the protagonist's weapons, it would be artificial selection!</p>
<h2>Selection by gun</h2>
<p>What's so cool about that? It's that the programmer is no longer directly in control. A well executed evolutionary algorithm may take you to unexpected places. It may be undesired when it subverts your goals and <a href="https://deepmind.com/blog/article/Specification-gaming-the-flip-side-of-AI-ingenuity">do something you didn't want it to</a> – there was a group who tried to teach a simulated human to throw a ball, but ended up with one <em>running with it</em> instead (sadly, I lost the link). It may be neutral when the shape of its solution is <a href="https://en.wikipedia.org/wiki/Evolved_antenna">just weird</a>, or, like I hope to see here, it may bring surprise and variety into the game. That's on top of the fact that <em>I don't really know</em> how to program a smart AI for a monster myself.</p>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/ff/St_5-xband-antenna.jpg/220px-St_5-xband-antenna.jpg" alt="Evolved antenna looking like a randomly bent wire" /></p>
<p>Now that we know what evolution is good for, what are the pieces it requires to take off?</p>
<p>First, we need a bunch of creatures: the gene pool. That's a given both in the virtual world and on farms. Evolution does not work on individuals, but on groups, and it's the group of individuals that keeps improving.</p>
<p>Next, in order for the group to improve, there needs to be a way to know the &quot;better&quot; ones from &quot;worse&quot; ones. For a farmer, this may be the number of laid eggs. For a shooter, let's start with the rule that every creature (genome) that manages to score a bite gets a &quot;goodness&quot; point. That's also called the &quot;objective function&quot;.</p>
<p>It's no use knowing what is better if we don't do something about. We want improvement in our gene pool, so genomes scoring higher will more often be the source of newly spawned creatures, and those less bitey will be underrepresented. That keeps the balance on the high side.</p>
<p>But those spawned creatures will not just be genetic copies of their ancestors. If they were, our gene pool would only be able to lose genetic variety: some heritage lines (the unsuccessful ones) would die off forever, leaving emptiness in its place. So we need to restore some of that variety. Instead of spawning exact clones, we give them a chance to mutate! Less bitey mutations rarely score high, and so they rarely reproduce. Little harm done. However, better monsters score high often, and they will spawn copies more often – gene pool improvement!</p>
<p>We end up with a dominating loop: creature -&gt; bite -&gt; mutate -&gt; better creature -&gt; bite -&gt; mutate. There are other paths which arrive at a dead end, like: creature -&gt; bite -&gt; mutate -&gt; worse creature -&gt; fail, or bad creature -&gt; fail -&gt; not breed. As long as the loop dominates, however, we're improving.</p>
<p>There, now we are on the path to breed ultimate killing machines, and I can play god with an entire species of virtual beings.</p>
<h2>Breedmatic</h2>
<p>The result of that all of thinking is a prototype game <a href="https://github.com/dcz-self/breedmatic">Breedmatic</a>. Adhering to my principle of starting in the middle, from the most interesting part, I went on to create a bunch of genetically modifiable monsters. They are very simple: sense, turn, walk. The genes describe how they turn relative to their target: their &quot;brain&quot;. There's one hidden part of the story though: breeding and maintaining a healthy population.</p>
<p>It turns out that implementing a good genetic algorithm takes a lot of time. There are considerations regarding the size of gene pools, the rates of mutation of different parts of the genome, the selection of the best objective function, and even the strategy for preserving diversity: should reproduction be asexual? Diploid? How many sexes? That is a lot of sessions of shooting monsters. Hands get tired, progress is slow. In a moment of inspiration, I decided to sidestep the problem with… an evolutionary algorithm. As of version 0.2, the shooter itself is driven by evolution, in much similar way as the monsters.</p>
<p>But they start blank, so you can watch the shooters over hundreds of attempts going from barely moving through phases of spinning randomly to somewhat competent, and even sometimes coming up with weird moves which end up scattering the otherwise accurate laser, which end up being nonsensically effective against the incoming hordes.</p>
<p><a href="https://porcupinefactory.org/data/breed3.mkv">A video of an evolved shooter that actually kinda gets it. Seizures help when shooting lasers, apparently.</a></p>
<p>Playing god has never been so comfy: no need to even move a finger.</p>
<h2>Is this fun tho?</h2>
<p>Sure it is. Watching the struggle of each and every shooter, I can't help but pray to the random number generator gods for luck in the next draw. &quot;May it be 55's offspring next! It was turning the right way, just opposite!&quot;. The chance of hitting gold with the brain connections keeps me on the edge of the seat, and I lose track of time. Every session, there's a chance that a previously unseen combination appears – and the brain is only made of 7 neurons!</p>
<h2>But you promised I could be the objective function!</h2>
<p>And so you can! The most recent version from git lets you take over control. The monsters are not bred for variety, and the gameplay is basic though, so it's rather boring. At least compared to watching evolution go, so I have a better idea going in that direction.</p>
<p>Did you play <a href="http://www.massivechalice.com/">Massive Chalice</a>? It combines a tactical game with the management of a royal lineage. There are not many games of this sort, and the premise is effectively the same as mine here: to end up with a genetically superior population. Following the inspiration, future versions of Breedmatic will aim to put the player – not the algorithm – in the role of the gene pool manager. Hopefully that's more fun than shooting questionably intelligent monsters.</p>
<h2>Evolve</h2>
<p>Breedmatic is open, so if you are dismayed with the intelligence or shape of the creatures, feel free to fork it or contribute. Perhaps we can share ideas regarding the actual gameplay. The prototype could still evolve in many directions!</p>
<p>Otherwise, watch this space. While the next post may not be about evolution, I always have some crazy project in progress.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Geo in GL part 1: Flattening Earth</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/flat_earth/"/>
       <id>tag:dorotac.eu,2020-09-30:posts/flat_earth</id>
       <updated>2020-09-30T14:00Z</updated>
       <published>2020-09-30T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Geo in GL part 1: Flattening Earth</h1>
<p>It's my guilty pleasure to flatten the Earth. To put it on various planar surfaces: pages of paper, scrolls, billboards, screens. Many prominent people attempted that before me: <a href="https://en.wikipedia.org/wiki/Gerardus_Mercator">Mercator</a> and <a href="https://en.wikipedia.org/wiki/Cassini_projection">Cassini</a>, just to name two.</p>
<p>Yes, I'm talking about maps and projections.</p>
<p>This is the first part of the story of how I used OpenGL <a href="https://en.wikipedia.org/wiki/Shader">shaders</a> to project the Earth onto the flat computer screen. Here I introduce the reader to the basics of <a href="https://en.wikipedia.org/wiki/Map_projection">projections</a>, write about the scope of my project, and then choose a projection that fits the scope.</p>
<p><strong>Warning</strong>: if you believe the Earth is already flat, this blog post won't offer anything interesting for you. You might still be interested in the following parts in the series.</p>
<h2>What's the point?</h2>
<p>I read someone online say shaders are bad for drawing maps: they aren't precise enough to distinguish distances a meter apart! Of course, I don't believe everything online, so I decided to check it myself, while refreshing my OpenGL knowledge, while playing with maps.</p>
<p>Which brings me to the next point.</p>
<h2>I like maps</h2>
<p>While there are many representations of the real world, maps feel the most real. Some models turn the real into an unrecognizeable mess of abstract concepts. Maps are elegant, because they copy the real thing, just smaller. They aim to be recognizeable and familiar, so making them is always visually satisfying.</p>
<h2>Distortions</h2>
<p>Maps are not perfect, though. They are not scale models, and they contain some abstract parts: a mountain represented on a map will be as flat as the paper or screen it's on. If it's marked, it will be either a drawing of a mountain, or a squiggle of height lines.</p>
<p>Most maps are also not globes. Our planet is round, and that means a flat map can't capture it accurately. Like a flattened orange peel, the map will get distorted: there will be tears or stretched places. The different ways to flatten the Earth to fit on a map are called <em>projections</em>.</p>
<p><img src="ortho.png" alt="3D-like Globe" />
This is also not a globe. It's just its picture! The picture is called <em>orthographic projection</em>.</p>
<h2>The scope</h2>
<p>There are multitudes of projections available. To narrow it down, it helps to define what kind of maps are interesting in this project. With that in mind, the initial version of the program will</p>
<ul>
<li>display only small areas: not more than 100km in size. The smaller the area, the flatter the Earth surface, so distortions are smaller.</li>
<li>Load a GPS track and display it as a squiggle. My GPS tracks are usually only a few hours long, and fit in the 100km size limit.</li>
<li>Display nothing else. I don't want to get carried away and never prove what I set out to prove.</li>
</ul>
<h2>Projection choices</h2>
<p>The last big decision was to choose a projection.</p>
<h3>WGS84</h3>
<p>Normally, we're used to seeing geographical coordinates of a point looking like: &quot;<em>54,5°N 23,7°E</em>&quot;. This representation seems obvious. We live on a sphere, and so we use <a href="https://en.wikipedia.org/wiki/Spherical_coordinate_system">spherical coordinates</a>. However, the prime meridian, and the precise measurement of the Earth axis, among oteher things, need to be standardized, so this representation is called <a href="https://gisgeography.com/wgs84-world-geodetic-system/">WGS84</a>. Since this is what GPS is using, it would be trivial to just plot the points from track on the screen as they are. However, the results would end up rather unappetizing:</p>
<p><img src="wgs84.png" alt="Map of the world projected using WGS84" /></p>
<p>The distortions are huge! Can you see the grid sizes? Places at the equator are much smaller than at the poles, and the poles are extremely stretched sideways too – if you look at the globe, the grid is composed of wedges close to the poles. This would be a very unpleasant map to look at, even on our small scale.</p>
<h3>Mercator</h3>
<p>The stretching is easily fixable by applying a varying stretching factor across the vertical position. This gives us the <a href="https://en.wikipedia.org/wiki/Mercator_projection">Mercator projection</a>:</p>
<p><img src="600px-Mercator_projection_Square.JPG" alt="Map of the world projected with the Mercator projection" /> Image copyright <a href="https://commons.wikimedia.org/wiki/User:Strebe">Strebe</a>, CC-BY-SA 3.0 Unported.</p>
<p>This projection is easy to calculate, and it covers most of the world, so it's very popular on the Internet. Most online maps use it, including OpenStreetMap's own <a href="https://osm.org">osm.org</a>. However, it's still rather bad: the stretching factor approaches infinity as we approach the poles, so the map doesn't have a top and bottom edge. Instead it's just cut off somewhere. The other problem is that the areas at the equator are still much smaller than near poles.</p>
<h3>Something different</h3>
<p>It seems that both of those projections are quite bad at some places. But they are also both quite good at certain places: in this case, at the equator. So why don't we choose the best projection for our data?</p>
<p>That's what paper maps do. Unlike general purpose computer map viewers, paper maps are often limited to a small area on the Earth. They can focus on making this area undistorted, and they can ignore all the misshapen mess that's not printed on the paper. I deliberately chose to limit my program's area of interest to a 100km radius to take advantage of this.</p>
<p>For this, we need a projection that's not general like WGS84, or like the Mercator projection, but instead focussed on the area in question. To look at the North Pole in an undistorted way, we could choose the The North Pole Lambert Azimuthal Equal Area projection:</p>
<p><img src="espg102017.png" alt="Map of the North Pole and surrounding continents" /></p>
<p>Here, I decided to increase my difficulty: I will not make an assumption about where on the Earth my GPS track is. It could be Europe, the Equator, or the Arctic. Instead, I will try to hit the bullseye and make sure that my projection favors the area I'm displaying, no matter what it is. For that, I need to find a family of projections that I can adjust.</p>
<p>Thankfully, people have come up with <a href="https://proj.org/operations/projections/index.html">plenty of different ways of projecting</a>:</p>
<p><a href="https://proj.org/operations/projections/alsk.html"><img src="alsk.png" alt="Alaska projected stereographically" /></a>
<a href="https://proj.org/operations/projections/moll.html"><img src="moll.png" alt="Oval world in Mollweide projection" /></a>
<a href="https://proj.org/operations/projections/wink1.html"><img src="wink1.png" alt="Oval world in Winkel I projection" /></a>
<a href="https://proj.org/operations/projections/vandg.html"><img src="vandg.png" alt="Circle world in van der Grinten (I) projection" /></a>
<a href="https://proj.org/operations/projections/gstmerc.html"><img src="gstmerc.png" alt="World as a crumpled ball of paper in Gauss-Schreiber Transverse Mercator projection" /></a>
<a href="https://proj.org/operations/projections/tobmerc.html"><img src="tobmerc.png" alt="Diamond shaped world in Tobler-Mercator projection" /></a>
<a href="https://proj.org/operations/projections/imw_p.html"><img src="imw_p.png" alt="Cylinder world without poles in International Map of the World Polyconic projection" /></a>
<a href="https://proj.org/operations/projections/ocea.html"><img src="ocea.png" alt="World with North Pole at the bottom edge, and South Pole at the top edge in Oblique Cylindrical Equal Area projection" /></a><br />
Various projections: Alaska, Mollweide, Winkel I, van der Grinten (I), Gauss-Schreiber Transverse Mercator, Tobler-Mercator, International Map of the World Polyconic, Oblique Cylindrical Equal Area. (Images courtesy <a href="https://proj.org">PROJ</a>.)</p>
<p>Some of them are tweakable, to name a few: <a href="https://proj.org/operations/projections/aeqd.html">azimuthal equidistant</a>, <a href="https://proj.org/operations/projections/cass.html">Cassini</a>, <a href="https://proj.org/operations/projections/gnom.html">gnomonic</a>.</p>
<p>Most of them preserve shapes over small areas rather well, so the choice is not that important. Nevertheless, I'm not an expert, so it's possible that I'm underestimating distortions. To stay on the safe side, I choose the azimuthal equidistant, because it promises that distances from the central point can be measured with a ruler. I used it previously with an overlaid kilometer grid and it worked well.</p>
<p>Here's the projection in <a href="https://proj.org/">proj</a> form, so you can apply it in <a href="https://www.qgis.org/en/site/">QGIS</a>. Don't forget to adjust <code>lat_0</code> and <code>lon_0</code> to your area of interest!</p>
<pre><code>+proj=aeqd +lat_0=54 +lon_0=24 +x_0=0 +y_0=0 +a=6371000 +b=6371000 +units=m +no_defs
</code></pre>
<h2>Next steps</h2>
<p>I have the projection, but I still don't know the math behind it, and don't have any code. Stay tuned for the next part!</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Wayland and input methods</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/input_method/"/>
       <id>tag:dorotac.eu,2020-08-15:posts/input_method</id>
       <updated>2020-08-15T14:00Z</updated>
       <published>2020-08-15T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Wayland and input methods</h1>
<p>Wayland is gradually getting the ability to support input methods natively. Actually, it's plural &quot;abilities&quot;, because there are several pieces related to the functionality. Working in this area, I had to explain this to newcomers so often that I decided to write this blog post instead, to explain to everyone what's going on here, once and for all.</p>
<h2>Text input</h2>
<p>Quick recap. The purpose of an input method is twofold: to give applications text from the user, and to recognize when and what kind of text is expected.</p>
<p><img src="purpose.svg" alt="Input methods help the user enter text in an efficient way" /></p>
<p>The most basic thing to do that under Wayland is the <strong>text-input</strong> protocol. It takes text from the compositor, and gives it to applications. It lets applications tell the compositor when and what kind of text they need. The protocol doesn't worry about the user, instead leaving that to the compositor.</p>
<p><img src="text_input.svg" alt="Text input connects applications to the compositor" /></p>
<h2>Input method</h2>
<p>The compositor can take two paths in order to let user input the text. Either it takes the burden of communication on its own, by handling input itself, or it can delegate that task to some other program.</p>
<p><img src="input_method.svg" alt="Input method inserts itself between compositor and privileged program" /></p>
<p>In Wayland, the <strong>input-method</strong> protocol was designed to help. It is very similar to <em>text-input</em>, because it lets <em>a program</em> send text <em>to the compositor</em>, and allows <em>the compositor</em> to tell <em>the program</em> what kind of text is needed. Notice the inversion! This time, the program is special. It communicates with the user, and then gives the text to the compositor. The compositor is then typically going to send the text onward to the currently focused application using <em>text-input</em>, creating a chain: special program → compositor → focused application.</p>
<p><img src="focused.svg" alt="Input method is one, text inputs are many." /></p>
<p>This protocol has place for some additional responsibilities, too. Because there is typically only one application using this protocol, it can do things which would not work with multiple applications. One of them is grabbing the keyboard, known to CJKV language users. <em>Input-method</em> allows the special program to ask the compositor to send it all keyboard presses (&quot;exclusive grab&quot;). Taking over the keyboard makes it possible to send the text &quot;你好&quot; when &quot;nihao&quot; is typed. The other extension is meant for creating a special pop up window, which the compositor places next to the text field, and which can be used to show typing completions.</p>
<h2>Virtual keyboard</h2>
<p>Input methods support would have been complete here, if all we cared about was text. However, the world is not so simple, and we have to deal with additional categories of input before being useful:</p>
<ul>
<li>text in legacy applications which don't support text-input, and</li>
<li>triggering actions which would normally need a keyboard.</li>
</ul>
<p>Both of them can be addressed by using a keyboard. But what if we're using a tablet computer, a TV, game console, or a phone, and there isn't one to speak of? We can address this issue by emulating button presses. Again, there are two basic ways to address this. The compositor can come up with something on its own, or it can delegate the task to another program.</p>
<p>The protocol <strong>virtual-keyboard</strong> is designed for programs which want to tell the compositor to issue &quot;fake&quot; keyboard button press events.</p>
<h2>Together</h2>
<p>A fully-fledged input method program will be a Wayland client using the <em>input-method</em> protocol for submitting text, but also supporting <em>virtual-keyboard</em> for submitting actions, and as a fallback for legacy applications.</p>
<p><img src="virtual_keyboard.svg" alt="Virtual keyboard in parallel to input method" /></p>
<p>A compositor would ferry text around between the input method program and whichever application is focused. It would also carry synthetic keyboard events from the input method program to the focused application.</p>
<p>An application consuming text would support <em>text-input</em>, and it would send enable and disable events whenever a text input field comes into focus or becomes unfocused. It would also accept keyboard events.</p>
<p>Legacy applications won't send enable and disable, even when a text field is focused, and the user ready to type. When that happens, the compositor and the input method won't realize that text should be submitted now. If the input method uses an on-screen keyboard, it could remain hidden! Because of that, it's best to always make sure the user can bring up the input method and input text, which would then be delivered as keyboard events (which are always supported by applications).</p>
<h2>Current state</h2>
<p>As of 2020-08-15, the latest versions of relevant protocols are:</p>
<h3><a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/master/unstable/text-input/text-input-unstable-v3.xml"><em>text-input-unstable-v3</em></a></h3>
<p>Accepted in <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/">wayland-protocols</a>. Designed by me, based on <a href="https://gitlab.gnome.org/GNOME/mutter/-/blob/efd7a4af5e37299f17011a7f39cc66d8416a1bf9/src/wayland/protocol/gtk-text-input.xml"><em>gtk-text-input</em></a> by Carlos Garnacho, and on <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/master/unstable/text-input/text-input-unstable-v1.xml"><em>text-input-unstable-v1</em></a>.</p>
<h3><a href="https://github.com/swaywm/wlroots/blob/master/protocol/input-method-unstable-v2.xml"><em>input-method-unstable-v2</em></a></h3>
<p>Used in <a href="https://github.com/swaywm/wlroots/">wlroots</a>. Designed by me, based on <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/master/unstable/text-input/text-input-unstable-v3.xml"><em>text-input-unstable-v3</em></a> and <a href="https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/master/unstable/input-method/input-method-unstable-v1.xml"><em>input-method-unstable-v1</em></a>.</p>
<h3><a href="https://github.com/swaywm/wlroots/blob/master/protocol/virtual-keyboard-unstable-v1.xml"><em>virtual-keyboard-unstable-v1</em></a></h3>
<p>Used in <a href="https://github.com/swaywm/wlroots/">wlroots</a>. Designed by me, based on the <a href="https://gitlab.freedesktop.org/wayland/wayland/-/blob/4e16ef0aed8db425afc8910b2a9708b57e165d39/protocol/wayland.xml#L2171"><em>wl_keyboard</em></a> interface.</p>
<h2>Future</h2>
<p>There are still some topics open. The most important one is about fixing the deficiencies in <em>text-input</em>, and updating <em>input-method</em> to match. Another one is regarding whether <em>virtual-keyboard</em> is even worth the effort, considering how it stirs up some conflict with Wayland's design.</p>
<p>Less important is implementing the additional features of <em>input-method</em>.</p>
<p>There's also the exploratory idea of designing a protocol dedicated to submitting actions like &quot;undo&quot;, &quot;submit&quot;, &quot;next field&quot;, but not text, in order to eliminate the need to emulate keys in modern keyboard.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Word embedding fun</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/embeddings/"/>
       <id>tag:dorotac.eu,2020-07-16:posts/embeddings</id>
       <updated>2020-07-16T14:00Z</updated>
       <published>2020-07-16T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Word embedding fun</h1>
<p>I'm terrible at naming things. When others come up with slick user names like &quot;el_duderino&quot;, or &quot;pseudolus&quot;, I struggle to come up with anything better than &quot;blob123&quot;. Right now I'm busy setting up a new project which needs a name, and I struggle with inspiration.</p>
<p>What if there was a machine to transform terrible ideas into good ones? Take a trope, and tweak it a little to get something original?</p>
<h2>Word vectors</h2>
<p>I'm not going to pretend that I know exactly what the term &quot;<a href="https://en.wikipedia.org/wiki/Word_embedding">word embedding</a>&quot; means. What matters is that words in a human language can be associated to vectors in a linear space. With a little thinking, we can leverage that property to find words with similar meanings, or solve analogies like this one:</p>
<blockquote>
<p>What is to woman as king is to man?</p>
</blockquote>
<p>Of course, it's &quot;queen&quot;. The example is the classic one, so please read <a href="https://blog.acolyer.org/2016/04/21/the-amazing-power-of-word-vectors/">this</a> because I couldn't explain it better than that article.</p>
<p>Okay, so what about the trope? Let's try with a political meme:</p>
<blockquote>
<p>What's the anarchist version of &quot;<a href="https://knowyourmeme.com/memes/cultures/fully-automated-luxury-gay-space-communism">fully automated luxury gay space communism</a>&quot;?</p>
</blockquote>
<h2>Playing with words</h2>
<p>To answer the question, I installed <em>gensim</em> from pip, and fetched some <a href="https://radimrehurek.com/gensim/auto_examples/howtos/run_downloader_api.html">ready-made data</a>. Using Python3 interactively, I started this way:</p>
<pre><code>import gensim.downloader as api
corpus = api.load('text8')
from gensim.models.word2vec import Word2Vec
model = Word2Vec(corpus)
</code></pre>
<p>Now that I had a vector space with words, I started playing with it, translating the meme word by word. If &quot;fully&quot; is a &quot;communism&quot; term, what's its &quot;anarchism&quot; version?</p>
<pre><code>&gt;&gt;&gt; model.wv.similar_by_vector(-model.wv['communism'] + model.wv['anarchism'] + model.wv['fully'])
[('fully', 0.7874691486358643), ('correctable', 0.5339317321777344), ('strictly', 0.533452033996582), ('wholly', 0.5306737422943115), ('totally', 0.5066695213317871), ('properly', 0.4988959729671478), ('locally', 0.4917019009590149), ('entirely', 0.4908182919025421), ('statically', 0.48547613620758057), ('completely', 0.4838995933532715)]
</code></pre>
<p>Turns out it's still &quot;fully&quot;. Going all the way, we get &quot;fully introductory ornamental gay space humanism&quot;. That's a bit surprising! According to the model, &quot;communist&quot; is to &quot;communism&quot; as &quot;anarchist&quot; is to &quot;humanism&quot;. I blame it on the model being relatively small.</p>
<h2>Readable encyclopedic excellent lesbian topological altruism</h2>
<p>This is the best nugget I got out of the <em>text8</em> corpus. It seems to be giving a nice selection of words with similar meanings, although changing the political option doesn't seem to do much. Other examples of computer-generated silliness include &quot;perfectly analytical aquarium bisexual seti individualist&quot; and &quot;fully extensible cheddar gay space anarchism&quot;. I'm going to keep &quot;fully extensible cheddar&quot; for a different project…</p>
<h2>Using a larger model</h2>
<p>Seeing the obvious drawbacks in accuracy and lack of sensitivity to political stances, I tried to use a serious model, based on the Wikipedia. <em>Gensim</em> doesn't disappoint in that department either:</p>
<pre><code>model = api.load(&quot;fasttext-wiki-news-subwords-300&quot;)
</code></pre>
<p>After coming back from the tea break, we are in a 300-dimensional model, containing the condensed knowledge of Wikipedia. Let's see how it fares:</p>
<pre><code>&gt;&gt;&gt; model.wv.similar_by_vector(-model.wv['communist'] + model.wv['anarchist'] + model.wv['communism'])
[('anarchism', 0.883374810218811), ('anarchist', 0.8050785064697266), ('Anarchism', 0.7585680484771729), ('anarcho-syndicalism', 0.7496938705444336), ('anarcho-communism', 0.7432658672332764), ('anarchists', 0.7426955699920654), ('anarcho-socialism', 0.7400932312011719), ('anarchistic', 0.7102389931678772), ('anarcho-capitalism', 0.7090942859649658), ('anarchisms', 0.7069689035415649)]
</code></pre>
<p>Good! It could derive that the &quot;anarchism&quot; equivalent of &quot;communist&quot; is &quot;anarchist&quot;.</p>
<pre><code>&gt;&gt;&gt; model.wv.similar_by_vector(-model.wv['communism'] + model.wv['anarchism'] + model.wv['fully'])
[('fully', 0.8348307609558105), ('fullly', 0.6537811756134033), ('fuly', 0.6530433893203735), ('thoroughly', 0.6464489102363586), ('completely', 0.6422995328903198), ('properly', 0.6398820877075195), ('adequately', 0.6317397356033325), ('fully-', 0.6239138841629028), ('comprehensively', 0.6175973415374756), ('partially', 0.606654167175293)]
</code></pre>
<p>Uh, there's a bit of repetition there. What about other words?</p>
<pre><code>&gt;&gt;&gt; model.wv.similar_by_vector(-model.wv['communism'] + model.wv['anarchism'] + model.wv['space'])
[('space', 0.8099461793899536), ('sub-space', 0.6615338325500488), ('spaces', 0.6537838578224182), ('work-space', 0.6258285045623779), ('non-space', 0.6244523525238037), ('space--and', 0.6012811064720154), ('workspace', 0.6001818776130676), ('WPspace', 0.5913711786270142), ('space-', 0.5863816738128662), ('space--', 0.5790454745292664)]
</code></pre>
<p>Disappointing :( I had to dig through a lot of results to get anything not related to the original word. Here are some examples: &quot;adequately anarchist high-end lesbian column-free libertarianism&quot;, &quot;completely computerized boutique-style heterosexual  hypersphere anti-capitalism&quot;.</p>
<h2>Different APIs</h2>
<p>There's another API to create analogies: it's the <code>most_similar</code> call. It works like this:</p>
<pre><code>&gt;&gt;&gt; model.most_similar(positive=['anarchism', 'space'], negative=['communism'])
[('sub-space', 0.6117153763771057), ('spaces', 0.5981306433677673), ('work-space', 0.5736098885536194), ('non-space', 0.5651520490646362), ('workspace', 0.5563109517097473), ('space--and', 0.542439341545105), ('WPspace', 0.5411180257797241), ('space-', 0.5368467569351196), ('meta-space', 0.5321565866470337), ('space--', 0.5256682634353638)]
</code></pre>
<p>The results are a little different, although I don't know why. Perhaps it's about a different way of measuring similarity.</p>
<h2>Word salad</h2>
<p>Using models to reveal some hidden properties of political options didn't work out, but the smaller model was a decent thesaurus. I suspect this is due to less words, creating more varied results near the best answer. I don't know how the 2 APIs differ. While I expected the similarity function to work based on Euclidean distance, it seems to be based on cosine distance instead. Perhaps that's the reason the anchor word sometimes ends up in results as well.</p>
<p>All in all, I think word vectors are going to end up being a nice toy, when some weirdness is needed for brainstorming. Let &quot;aquarium&quot; be an example: it's far out in terms of adjectives, but it is indeed not entirely unrelated to &quot;luxury&quot;.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>It&#39;s not about keyboards</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/not_keyboard/"/>
       <id>tag:dorotac.eu,2020-07-03:posts/not_keyboard</id>
       <updated>2020-07-03T14:00Z</updated>
       <published>2020-07-03T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>It's not about keyboards</h1>
<p>My work on <a href="https://source.puri.sm/Librem5/squeekboard/">Squeekboard</a> and the corresponding Wayland protocols is open and public. That lets people come and contribute, and give feedback. Those people come with different expectations about the result of my work, and there's one that looks enticing but doesn't really correspond with what my work is about. It causes lots of confusion, so I would like to dispel it.</p>
<p>Squeekboard's goal is not to be a <em>keyboard</em>.</p>
<p>Instead, I designed Squeekboard and the pieces around it to support <em>text composition</em> first and foremost. While a <em>keyboard</em> is one method of composing text, it usually does other things as well:</p>
<ul>
<li>playing musical notes</li>
<li>turning things on and off</li>
<li>shooting virtual monsters</li>
<li>moving within virtual space</li>
<li>instructing the computer to do arbitrary actions.</li>
</ul>
<p>Squeekboard doesn't want to do any of those things, unless they help composing text. To understand why, let's look at the basic properties of keyboards and Squeekboard. Keyboards</p>
<ul>
<li>have labels which don't change</li>
<li>have keys which stay in place and retain their shape forever</li>
<li>have keys which sense force</li>
<li>have keys which provide precise tactile feedback.</li>
</ul>
<p>Meanwhile, Squeekboard:</p>
<ul>
<li>occupies a changeable area of the screen</li>
<li>has pixels which change color but not shape</li>
<li>has touch points which overlap the display</li>
<li>has touch points that capture &quot;pressure&quot;.</li>
</ul>
<p>As you can see, they are very different beasts, with different constraints and abilities</p>
<h2>It's just an app!</h2>
<p>Squeekboard has some ability to emulate the keyboard buttons and register touches, so in principle it could share the goals of a keyboard. The key to understanding why this is not a useful useful goal is to realize that all of Squeekboard's unique abilities are the same as of any other graphical program.</p>
<p>That gives us a new constraint: Squeekboard should not duplicate the functionality of the application it controls.</p>
<p>A physical keyboard has it easy: touch feedback and less hand movement always adds value. But what can an on-screen one offer? Let's think about the interactions the user has with an application.</p>
<p>The user might want to close or resize the application. That's easy, the compositor takes care of that. The user might want to issue an application-specific action, like rendering a scene, or moving the camera. Arguably, this is best handled by the application, which can present the relevant information. Commonly encountered actions like saving the file, copying or opening preferences are usually supported in a built-in way too. What does that leave us?</p>
<p>Text input.</p>
<p>Few applications give user an interface to enter text directly, making it the obvious responsibility of Squeekboard as an input method.</p>
<h2>Extra powers</h2>
<p>Dropping the preconception that an on-screen input method is a keyboard opens new possibilities for text input. Changing button shapes and numbers, presenting useful information in labels, offering corrections and suggestions, gestures, handwriting recognition, embedded input methods, and sending text directly to the application (not as button presses) are not things that can be done with just a matrix of buttons.</p>
<p>While Squeekboard doesn't support many of those features, the groundwork for most of them is included in the Wayland protocols developed alongside it.</p>
<p>Looking past the historically proven idea of a keyboard opens a different view on how touch-based human-computer interfaces should work, and Squeekboard is there to make best use of the new possibilities.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Why I avoid Cargo: dependency versions</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/cargo_versions/"/>
       <id>tag:dorotac.eu,2020-06-27:posts/cargo_versions</id>
       <updated>2020-06-27T14:00Z</updated>
       <published>2020-06-27T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Why I avoid Cargo: dependency versions</h1>
<p>Ever since I started making use of <a href="https://doc.rust-lang.org/cargo/">Cargo</a> to build the Rust pieces of <a href="https://source.puri.sm/Librem5/squeekboard/">Squeekboard</a>, I've hit a wall whenever I needed to add something nontrivial to the build process. I haven't documented them before, so whenever I complained about Cargo, I ended up looking ridiculous without having anything to show for the complaints.</p>
<p>Last week I spent three days solving a problem with building Squeekboard that should have been solveable in 30 minutes, in large part due to Cargo.</p>
<h2>Buster and Bullseye</h2>
<p>Squeekboard is a crucial part of the software stack of the <a href="https://puri.sm/products/librem-5/">Librem5</a> mobile phone. It's primary goal is to fit in well in <a href="https://pureos.net/">PureOS</a>, which powers the phone. PureOS is in turn based on Debian, and inherits a lot of its practices, which are followed by projects related to the Librem5, including Squeekboard.</p>
<p>One such practice is that Rust crates are vendored by Debian, and not by the application author. They are installable with <code>apt</code>, and end up in a local repository in <code>/usr/share/cargo/registry</code>.</p>
<p>Librem5's base operating system is moving from Debian 10 to Debian 11. I was <a href="https://source.puri.sm/Librem5/squeekboard/-/issues/200">asked</a> to make Squeekboard work on both in the transition period. Debian 10 ships with gtk-0.5.0, whereas Debian 11 ships with gtk-0.7.0, which contain some incompatibilities, and Squeekboard's build process must adjust to them, depending on where the build happens.</p>
<p>Piece of cake: there's one variable, which needs to be turned into one decision. I did this a million times. Little did I know Cargo hates simple solutions.</p>
<h2>What's the version we're using?</h2>
<p>It's not unusual for projects to support two versions of the same dependency. Perhaps the versions come from different vendors. C programs don't have trouble with this:</p>
<pre><code>#include &lt;gtk.h&gt;
#if GTK_VERSION==0.5
  // old stuff
#endif
</code></pre>
<p>Rustc won't know dependency versions by itself, but Cargo should turn it to something like this:</p>
<pre><code>#[cfg(dependency.gtk.version = &quot;0.5.*&quot;)]
use gio::SimpleActionExt; // Doesn't exist in later versions of gtk
</code></pre>
<p>But… I haven't managed to find any equivalent. It makes sense, Cargo can pull several copies of the same crate, and I think their pieces can even be used in the same files if one is not careful. So there's no good reason this should have worked. Fair, let's try something else.</p>
<h2>What's the available version?</h2>
<p>If the compilation process can't tell us what versions we're dealing with, perhaps we can check that before compilation. In Meson, we'd do something like this:</p>
<pre><code>gtk-rs = dependency(&quot;gtk-rs&quot;)
if gtk.version().startswith(&quot;0.5.&quot;)
      add_project_arguments('--cfg old_gtk', language : 'rust')
endif
</code></pre>
<p>And then, in Rust:</p>
<pre><code>#[cfg(&quot;old_gtk&quot;)]
use gio::SimpleActionExt; // Doesn't exist in later versions of gtk
</code></pre>
<p>Now the only remaining thing is to create the dependency lookup procedure. While Squeekboard is Debian-focused, building on non-Debian systems is still important, so it must build with crates.io. Thankfully, we have <code>cargo search</code>. Let's try with a vendored registry:</p>
<pre><code>root@c684b7b31b07:/# CARGO_HOME=/mnt/build/eekboard/debian/cargo/ cargo search gtk
error: dir /usr/share/cargo/registry does not support API commands.
Check for a source-replacement in .cargo/config.
</code></pre>
<p>Oh no. We can forget about it. I'm not willing to write a tool that searches for crates in all the ways that Cargo supports, and I'm honestly boggled that Cargo doesn't properly do it itself.</p>
<p>As a bonus, the output of cargo search just doesn't make sense. I'm looking for the regex crate, which is at version &quot;1.3.9&quot;:</p>
<pre><code>$ cargo search regex=1
combine-regex-1 = &quot;1.0.0&quot;      # Re-export of regex 1.0 letting combine use both 0.2 and 1.0
webforms = &quot;0.2.2&quot;             # Provides form validation for web forms
</code></pre>
<p>Totally useless :(. But I still have a few tricks up my sleeve.</p>
<h2>Use a build flag</h2>
<p>Isn't this obvious? Let's skip all that dependency detection, and just order the build system to use one, in the dumbest fashion possible. The way you'd do it with Meson:</p>
<pre><code>if get_option(&quot;legacy&quot;) == true
  gtk-rs = dependency(&quot;gtk-rs&quot;, version=0.5)
  add_project_arguments('--cfg old_gtk', language : 'rust')
else
  gtk-rs = dependency(&quot;gtk-rs&quot;, version&gt;0.5)
endif
squeekboard = executable('squeekboard',
  dependencies: [gtk-rs],
)
</code></pre>
<p>Add to that the conditional in the Rust file, and we're done. Cargo is not Meson, but we can call it with flags too, and then instruct it to use the right dependency. Right?</p>
<p>Wrong.</p>
<p>While Cargo allows choosing dependency versions, they are selected based on target, not on flags. You can use:</p>
<pre><code>[target.'cfg(unix)'.dependencies]
openssl = &quot;1.0.1&quot;
[target.'cfg(not(unix))'.dependencies]
openssl = &quot;1.0.0&quot;
</code></pre>
<p>You can remember that the <code>cfg()</code> syntax supports features:</p>
<pre><code>[target.'cfg(feature=&quot;legacy&quot;)'.dependencies]
gtk=&quot;0.5.*&quot;
[target.'cfg(not(feature=&quot;legacy&quot;))'.dependencies]
gtk=&quot;0.7.*&quot;
</code></pre>
<p>but then you see this:</p>
<pre><code># cargo build
warning: Found `feature = ...` in `target.'cfg(...)'.dependencies`. This key is not supported for selecting dependencies and will not work as expected. Use the [features] section instead: https://doc.rust-lang.org/cargo/reference/features.html
</code></pre>
<p>And when you follow that web page, you learn that you can specify other dependencies, but not other versions. Foiled again!</p>
<p>Again, I'm boggled why such a basic piece of functionality is working in such a complicated and restrictive way. Wouldn't it be easier to abolish the weird <code>feature = [dep]</code> syntax and instead let <code>foo.dependencies</code> work, with all the fine-grained control over what the dependencies are?</p>
<h2>Embedded crates</h2>
<p>But I didn't stop there. If the deps can't be specified the usual way, then let's get there through a back door. If I can't choose a version, I will choose a crate. 0.5 turns into a crate called &quot;legacy&quot;, and 0.7 into one called &quot;current&quot;. My Cargo.toml looked like this:</p>
<pre><code>[dependencies]
current = {path=&quot;deps/current&quot;, optional=true}
old = {path=&quot;deps/old&quot;, optional=true}

[features]
legacy = [&quot;old&quot;]
</code></pre>
<p>The <code>deps/old/Cargo.toml</code> contained the actual version:</p>
<pre><code>[dependencies]
gtk-rs = &quot;0.5.*&quot;
</code></pre>
<p>and the current one had <code>0.7.*</code>. When we choose the crate, we choose the dependency with it! So clever! There's no way it won't work!</p>
<pre><code>root@c684b7b31b07:~/foo/foo# CARGO_HOME=/mnt//build/eekboard/debian/cargo/ cargo build
error: failed to select a version for the requirement `gtk = &quot;0.5.*&quot;`
  candidate versions found which didn't match: 0.7.0
  location searched: directory source `/usr/share/cargo/registry` (which is replacing registry `https://github.com/rust-lang/crates.io-index`)
required by package `old v0.1.0 (/root/foo/foo/deps/old)`
    ... which is depended on by `foo v0.1.0 (/root/foo/foo)`
</code></pre>
<p>What!? This cannot be! Why oh why?</p>
<p>My only guess is that cargo pulls all the dependencies indiscriminately, regardless of whether it actually needs them or not. Obviously, this entire shebang is because we <em>don't</em> have one of them! Why is Cargo missing the point of choosing dependency versions so hard?</p>
<h2>Nuclear option</h2>
<p>With this in mind, there's only one solution left. If Cargo is greedy enough to snatch everything it sees, then we'll just not let it know there are possibly any other dependencies. We'll generate a Cargo.toml after we know which dependency we need, and we'll never let Cargo know about the other dependency. The details of the generation are quite straightforward: just change depedencies based on build flags. The build scripts get complicated, though. Now they must include <code>--manifest-path</code>. This looks trivial, but this changes the root of the crate, so now we need to give it the path to the sources again:</p>
<pre><code>[lib]
name = &quot;rs&quot;
path = &quot;@path@/src/lib.rs&quot;
crate-type = [&quot;staticlib&quot;, &quot;rlib&quot;]

# Cargo can't do autodiscovery if Cargo.toml is not in the root.
[[bin]]
name = &quot;test_layout&quot;
path = &quot;@path@/src/bin/test_layout.rs&quot;

[[example]]
name = &quot;test_layout&quot;
path = &quot;@path@/examples/test_layout.rs&quot;
</code></pre>
<p>But it works. Finally.</p>
<p>Sadly, we sacrificed autodetection of tests and binaries, as well as distro-agnosticism.</p>
<h2>You're doing it all wrong!</h2>
<p>I'm sure some of you will see the struggle and consider it self-inflicted. &quot;Why didn't you try X?&quot;, &quot;You're fighting the tool because your model is outdated!&quot;, &quot;This is not a bug&quot;. Some of them will be valid criticisms, and I'm going to address those I could think of.</p>
<h3>Use multiple Cargo.lock versions</h3>
<p>I could have used &quot;*&quot; as the dependency version, and instead rely on two versions of Cargo.lock to select the actual dependency I want. That is troublesome.</p>
<p>I don't want to know exactly what versions of Rust crates (or any other deps really) the upstream distro is shipping. This is why I use distros in the first place: they remove some burden of deps auditing from me. All I want to know is the minor version for API compatibility reasons. However, Cargo.lock forces me to care about dependencies by asking for a hash of each. I would have to find out what dependency is available, and feed Cargo.lock with its hash. That hits the same problem with detecting what we have again.</p>
<h3>Vendor your crates yourself</h3>
<p>Squeekboard is just one project out of many in PureOS, and all the others follow Debian's best practices: use Debian's software versions whenever possible. On one hand, this is designed to set right expectations, on the other it relegates the responsibility for auditing and updating to Debian, as explained in the previous argument.</p>
<p>In this light, I'm already letting Debian vendor my crates, and that won't change.</p>
<h3>Cargo is designed for crates.io, not for distributions or local repositories</h3>
<p>I'm not sure if this is Cargo's explicit goal to be useful with crates.io, but my experience says other sources are in practice playing catch-up. If I relied exclusively on crates.io, I would have no problems.</p>
<p>However, the build policy in PureOS requires builds to happen without a network connection. That means we can't use crates.io. We've already eliminated vendoring before, and so we're stuck with some form of a local repository, which clearly isn't well-supported by Cargo.</p>
<h2>Insights</h2>
<p>My insight from this adventure is that Cargo doesn't prioritize users who want to control their own sources of software at the moment. It's inflexible in the way that favours crates.io, where the implicit assumption is that all possible crates and versions are available. After all, it crashes when a non-available version is specified even if unused.</p>
<p>Cargo also doesn't want application developers to offload dependency checking. It's recommended to commit Cargo.toml together with the application, but there's no provision to ignore it when doing a build with local versions of the same crates, like when you do in distro builds.</p>
<p>Cargo is not composable the way other build systems are. When building a C program, artifacts in the form of shared or static libraries are placed in a well-known directory. Cargo creates an opaque directory structure for compiled crates in the <code>target</code> directory, which cannot be used by another build system later due to lack of documentation. Sadly, this means that when Cargo fails to solve a need, there's no alternative.</p>
<p>The same creates a network effect, where Cargo is de facto the only tool that builds useful Rust programs. It's difficult to the point of pointlessness to &quot;build&quot; serde and gtk-rs separately, and then &quot;link&quot; them with the main program using manual <code>rustc</code> calls.</p>
<p>Creating composable artifacts would undermine Cargo's monopoly and allow a better integration with distributions as well as other programming languages in my opinion. Build systems should not infect all software they touch.</p>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>From C to exec: Part 3</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/gcc_symbols/"/>
       <id>tag:dorotac.eu,2020-06-06:posts/gcc_symbols</id>
       <updated>2020-06-06T14:00Z</updated>
       <published>2020-06-06T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>From C to exec: Part 3</h1>
<p>In <a href="/posts/linking/">Previous</a> <a href="/posts/dynamic_linking/">parts</a>, we followed a story of a piece of code transformed into a linked executable. The center stage was taken by symbols, around which the entire process of linking happens. They are the crucial component that lets us put programs fro multiple pieces. However, we watched what happens to symbols after they are there, but we haven't seen how to control their existence, and how to solve problems we may encounter.</p>
<p>The third part focuses on practical problem solving related to symbols and C.</p>
<h2>Emitting symbols in C</h2>
<p>You might remember the term &quot;translation unit&quot;. It's just a more convenient name for &quot;a thing that is compiled together&quot;, the bunch of source files that result in a single object file. This simple definition will be used to refer to the sources as opposed to the object file.</p>
<p>C does not support namespaces, so whenever you define a function under the same name in a different context (e.g. a logging function, an error handler or a container operation), they truly take the same name. This can lead to trouble. Let's introduce a new set of files: <code>floats.c</code>:</p>
<pre><code>#include &lt;stdio.h&gt;
void print_number(float n) {
  printf(&quot;Float %f\n&quot;, n);
}

void multiply_by_tau(int number) {
  print_number(6.68 * number);
}
</code></pre>
<p>Then <code>float.h</code>:</p>
<pre><code>void multiply_by_tau(int number);
</code></pre>
<p>Finally, <code>main.c</code>:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &quot;float.h&quot;

void print_number(int number) {
  printf(&quot;Integer %d\n&quot;, number);
}

void power_of_two(int number) {
  print_number(2 &lt;&lt; number);
}

int main(void) {
  multiply_by_tau(100);
  power_of_two(10);
  return 0;
}
</code></pre>
<p>In this example, we end up having two different functions called <code>print_number</code>. One is in the translation unit with <code>main.c</code>, the other in <code>float.c</code>. They are only called from the relevant translation unit, so we should be fine, right?</p>
<pre><code># gcc -c main.c -o main.o
# gcc -c float.c -o float.o
# nm main.o | grep print_number
0000000000000000 T print_number
# nm float.o | grep print_number
0000000000000000 T print_number
# gcc float.o main.o -o main
/usr/bin/ld: main.o: in function `print_number':
main.c:(.text+0x0): multiple definition of `print_number'; float.o:float.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
</code></pre>
<p>Not really. We emitted a symbol in both of them, and the linker doesn't know which one to choose. Oops. We have to tell the compiler explicitly that each of those functions will only ever be called from the same translation unit. C defines a keyword <code>static</code> for this purpose. Let's check if it works. <code>float.c</code> becomes:</p>
<pre><code>#include &lt;stdio.h&gt;
static void print_number(float n) {
  printf(&quot;Float %f\n&quot;, n);
}

void multiply_by_tau(int number) {
  print_number(6.68 * number);
}
</code></pre>
<p>And <code>main.c</code> now looks like this:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &quot;float.h&quot;

static void print_number(int number) {
  printf(&quot;Integer %d\n&quot;, number);
}

void power_of_two(int number) {
  print_number(2 &lt;&lt; number);
}

int main(void) {
  multiply_by_tau(100);
  power_of_two(10);
  return 0;
}
</code></pre>
<p>And the compilation:</p>
<pre><code># gcc -c float.c -o float.o
# gcc -c main.c -o main.o
# gcc float.o main.o -o main
# ./main
Float 668.000000
Integer 2048
# nm main.o
0000000000000045 T main
                 U multiply_by_tau
0000000000000022 T power_of_two
                 U printf
0000000000000000 t print_number
# nm float.o
0000000000000024 T multiply_by_tau
                 U printf
0000000000000000 t print_number
</code></pre>
<p>Success! We turned<code>print_number</code> into a <a href="https://www.technovelty.org/code/why-symbol-visibility-is-good.html"><em>local symbol</em></a>, (<code>t</code> is lowercase) which is ignored during linking.</p>
<h2>Keywords</h2>
<p>C defines keywords for controlling linkage type. Those are:</p>
<ul>
<li><code>extern</code> for external linkage, i.e. emitting a symbol</li>
<li><code>static</code> for internal linkage, i.e. limiting usage to the current translation unit</li>
<li>no keyword defaults to <code>extern</code></li>
</ul>
<p>However, functions aren't the only items that can be assigned symbols. While any kind of memory area can get a symbol, global variables are the other most common usage. Let's see it all in action in <code>linkage.c</code>:</p>
<pre><code>int v_def = 0;
extern int v_ext = 0;
static int v_stat = 0;

void f_def(void) {};
extern void f_ext(void) {};
static void f_stat(void) {};
</code></pre>
<p>Looking at the symbols shows us:</p>
<pre><code># gcc -c linkage.c -o linkage.o
linkage.c:2:12: warning: ‘v_ext’ initialized and declared ‘extern’
    2 | extern int v_ext = 0;
      |            ^~~~~
# nm linkage.o
0000000000000000 T f_def
0000000000000007 T f_ext
000000000000000e t f_stat
0000000000000000 B v_def
0000000000000004 B v_ext
0000000000000008 b v_stat
</code></pre>
<p>As expected, the default is to emit a symbol, and <code>static</code> makes the symbol local. But there's an unexpected warning from the compiler, suggesting that <code>extern</code> on a variable does more than just emit a symbol for it. Let's try to apply compiler's advice and not assign any value in<code>variable.c</code>:</p>
<pre><code>extern int v_ext;
</code></pre>
<p>Looking at symbols:</p>
<pre><code># gcc -c variable.c -o variable.o
# nm variable.o
#
</code></pre>
<p>…nothing? Let's try something else in <code>variable.c</code>:</p>
<pre><code>extern int v_ext;

void use_variable(void) {
  v_ext = 0;
}
</code></pre>
<p>Symbols:</p>
<pre><code># gcc -c variable.c -o variable.o
# nm variable.o
0000000000000000 T use_variable
                 U v_ext
</code></pre>
<p>There it is, undefined! Without the body of the variable, the compiler took <code>extern</code> to mean &quot;not defined here&quot; instead of &quot;emit a symbol&quot;. No wonder that we didn't see its trace when we didn't try to make use of it. This is similar to <em>function prototypes</em> we covered earlier.</p>
<p>Notice that <code>U</code> printed by the <code>nm</code> program does not mean &quot;symbol of the undefined kind present&quot;. Instead, it means &quot;the relocation table calls for this symbol, but we don't have it&quot;. Here, <code>nm</code> mixes up the symbol table and the relocations table, which can be confusing.</p>
<h2>Using variables</h2>
<p>Variables with the <code>extern</code> symbol may sound confusing at first, but they work the same way functions do. Let's take an example of three files. First is <code>main_var.c</code></p>
<pre><code>#include &lt;stdio.h&gt;
#include &quot;var.h&quot;

int main(void) {
  set_var(10);
  printf(&quot;var is %d\n&quot;, var);
}
</code></pre>
<p>The file <code>var.h</code>:</p>
<pre><code>extern int var;
extern void set_var(int v);
</code></pre>
<p>And finally, <code>var.c</code>:</p>
<pre><code>int var = 5;
void set_var(int v) {
  var = v;
}
</code></pre>
<p>Compiling them shows us that both the <code>set_var</code> function and the <code>var</code> variable are needed by <code>main_var.c</code>, and provided by <code>var.c</code>:</p>
<pre><code># gcc -c main_var.c -o main_var.o
# gcc -c var.c -o var.o
# gcc main_var.o var.o -o main_var
# ./main_var 
var is 10
# nm main_var.o
0000000000000000 T main
                 U printf
                 U set_var
                 U var
# nm var.o
0000000000000000 T set_var
0000000000000000 D var
</code></pre>
<p>As you can see, we managed to modify a variable from a different file using a function from a different file, and then read it out directly. This mechanism is commonly used to give programs access to memory managed by a library.</p>
<h2>Header file trouble</h2>
<p>There is a situation is when a function clashes with itself. It sounds silly, but this can happen if the same function creates a symbol in multiple translation units.</p>
<p>This usually happens when a function lives inside a header of a library we're using. Headers are not actually restricted to holding function prototypes, but instead they are fully fledged pieces of C code, verbosely included in the source file. Let's create <code>source.c</code>:</p>
<pre><code>const int before;
#include &quot;header.h&quot;
const int after;
</code></pre>
<p>And <code>header.h</code>:</p>
<pre><code>const int inside;
</code></pre>
<p>After running the C preprocessor, which happens as the first compilation step, we get:</p>
<pre><code>$ gcc -E source.c
# 1 &quot;source.c&quot;
# 1 &quot;&lt;built-in&gt;&quot;
# 1 &quot;&lt;command-line&gt;&quot;
# 31 &quot;&lt;command-line&gt;&quot;
# 1 &quot;/usr/include/stdc-predef.h&quot; 1 3 4
# 32 &quot;&lt;command-line&gt;&quot; 2
# 1 &quot;source.c&quot;
const int before;
# 1 &quot;header.h&quot; 1
const int inside;
# 3 &quot;source.c&quot; 2
const int after;
</code></pre>
<p>We can see here that the text from the header was indeed included in place of the <code>#include</code> directive.</p>
<p>Now, why would anyone actually <em>want</em> to place a function in the header?</p>
<p>As you can see, the inclusion of a full function inside header means that it will land in the translation unit of the <em>calling</em> file, instead of the <em>providing</em> file, unlike in our library examples. Here we don't even <em>have</em> a separate &quot;providing&quot; file. We can see that our single translation unit contains what the header provided:</p>
<pre><code># gcc -c source.c -o source.o
# nm source.o
0000000000000004 C after
0000000000000004 C before
0000000000000004 C inside
</code></pre>
<p>The most obvious purpose for placing a function in the same translation unit as the caller is optimization. Making a function call is slow. In broad strokes, the compiled code usually saves a generic set of working data, as part of issuing a call and restores it as part of returning from it. But this doesn't make sense if our function is relatively simple. Sometimes it's more efficient to copy the contents of the function instead of calling it. The name for that is <em>inlining</em>.</p>
<p>A call to a function in a dynamic library can't be inlined automatically. That would require some advanced processing at the time our program is starting. A call to a function in a static library can sometimes be inlined. That is called <em>link-time optimization</em>, and it's done at linking time as a sort of additional, slow compilation step.</p>
<p>On the other hand, a call to a function in the same translation unit can be inlined as part of the compilation process even before the linking, and without much additional penalty. If we place the body of our function in the header, this is exactly what we get, regardless of the kind of library we create in our other translation units.</p>
<p>But what does this have to do with a symbol clashing with itself? Let's introduce an example of a library that must store something for bookkeeping, but doesn't want to be a burden for the calling code. Starting with <code>store.c</code>:</p>
<pre><code>int calls_made = 0;
</code></pre>
<p>Corresponding <code>store.h</code>:</p>
<pre><code>extern int calls_made;

void log_call(void) {
  calls_made += 1;
}
</code></pre>
<p>Now, <code>user.c</code>:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &quot;store.h&quot;

void print_and_bump(void) {
  puts(&quot;Caller print&quot;);
  log_call();
}
</code></pre>
<p>Its <code>user.h</code>:</p>
<pre><code>void print_and_bump(void);
</code></pre>
<p>Finally, <code>clash.c</code>:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &quot;store.h&quot;
#include &quot;user.h&quot;

int main(void) {
  puts(&quot;main print&quot;);
  log_call();
  print_and_bump();
  return 0;
}
</code></pre>
<p>Let's try compiling and linking this project:</p>
<pre><code># gcc -c store.c -o store.o
# gcc -c user.c -o user.o
# gcc -c clash.c -o clash.o
# gcc store.o user.o clash.o -o clash
/usr/bin/ld: clash.o: in function `log_call':
clash.c:(.text+0x0): multiple definition of `log_call'; user.o:user.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
</code></pre>
<p>That's an error! Remember that we placed the <code>log_call</code> function into the header file. That means that every translation unit including the header file will get a copy. We can see that this is what happened:</p>
<pre><code># nm store.o | grep log_call
# nm user.o | grep log_call
0000000000000000 T log_call
# nm clash.o | grep log_call
0000000000000000 T log_call
# gcc -E user.c
[...]
# 1 &quot;store.h&quot;
extern int calls_made;

void log_call(void) {
  calls_made += 1;
}
# 3 &quot;user.c&quot; 2
[...]
# gcc -E clash.c
[...]
# 1 &quot;store.h&quot;
extern int calls_made;

void log_call(void) {
  calls_made += 1;
}
# 3 &quot;clash.c&quot; 2
[...]
</code></pre>
<p>The solution is straightforward. Let's change <code>store.h</code> to show this:</p>
<pre><code>extern int calls_made;

static void log_call(void) {
  calls_made += 1;
}
</code></pre>
<p>Linking test:</p>
<pre><code># gcc -c user.c -o user.o
# gcc -c clash.c -o clash.o
# gcc store.o user.o clash.o -o clash
# nm clash.o | grep log_call
0000000000000000 t log_call
# nm user.o | grep log_call
0000000000000000 t log_call
</code></pre>
<p>That did it! Our static function's symbol turned local, and stopped being a duplicate.</p>
<h2>Inline woes</h2>
<p>You might know the C99 keyword <code>inline</code> already. It's meant to be a suggestion to the compiler that the function should be copied instead of called. It turns out that it affects linking.</p>
<p>Taking the example from above and modifying it slightly, we get a new <code>store.h</code>:</p>
<pre><code>extern int calls_made;

inline void log_call(void) {
  calls_made += 1;
}
</code></pre>
<p>If you think that <code>inline</code> implies no need for any sort of linking, you may be surprised:</p>
<pre><code># gcc -std=c99 -c user.c -o user.o
# gcc -std=c99 -c clash.c -o clash.o
# gcc store.o user.o clash.o -o clash
/usr/bin/ld: user.o: in function `print_and_bump':
user.c:(.text+0xf): undefined reference to `log_call'
/usr/bin/ld: clash.o: in function `main':
clash.c:(.text+0xf): undefined reference to `log_call'
collect2: error: ld returned 1 exit status
# nm user.o | grep log_call
                 U log_call
</code></pre>
<p>I'm not going to dive into what exactly happened here, but it boils down to C99 designers wanting to avoid too many copies when inlining isn't possible. A symbol may still be needed for the function, as <a href="https://gustedt.wordpress.com/2010/11/29/myth-and-reality-about-inline-in-c99/">explained</a> by Jens Gustedt. That explanation offers a solution in the form of placing the following line into exactly one translation unit:</p>
<pre><code>extern inline void log_call(void);
</code></pre>
<p>Alternatively, use the <code>static</code> version, although it may offer a different rate of success in inlining or emit too many copies.</p>
<h2>Bad declaration</h2>
<p>Remember that symbols don't carry type information? One of the most confusing problems is using something believing it's one type when in reality it's another. For a quick example, see a botched version check in file <code>version.c</code>:</p>
<pre><code>const char version[] = &quot;10.1&quot;;
</code></pre>
<p>The header <code>version.h</code> has a mistake! The variable is declared as <code>int</code> instead of <code>char</code>:</p>
<pre><code>extern const int version;
</code></pre>
<p>The file <code>check.c</code> is using the declaration from the header:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &quot;version.h&quot;

int main(void) {
  printf(&quot;Version %d\n&quot;, version);
  return 0;
}
</code></pre>
<p>And when it tries to read the <code>version</code> variable…</p>
<pre><code># gcc -c version.c -o version.o
# gcc -c check.c -o check.o
# gcc version.o check.o -o check
# ./check
Version 825110577
</code></pre>
<p>That doesn't look right… Nowhere during the entire process was the type of the actual variable compared to the one declared in the header. The compiler believed what was the header file, the linker matched the symbol simply by name, and the mistake only appeared when we tried to execute the program.</p>
<p>This time we got off easily, with an obvious problem. But the bugs can crash your code outright – if the type in the mistaken definition is larger than the actual type, or be subtle, if the wrong type is only slightly bigger and causes unrelated memory corruption, or when an <code>int</code> is taken for a <code>float</code>.</p>
<p>Your C++ compiler can catch some of those errors:</p>
<pre><code># g++ -c check.c -o check.o
# g++ -c version.c -o version.o
# g++ version.o check.o -o check
/usr/bin/ld: check.o: in function `main':
check.c:(.text+0x6): undefined reference to `version'
collect2: error: ld returned 1 exit status
# nm version.o
0000000000000000 r _ZL7version
# nm check.o
0000000000000000 T main
                 U printf
                 U version
</code></pre>
<p>Unfortunately, C offers no solution but to always keep the headers correct!</p>
<h2>Summary</h2>
<p>Those are the most common but annoying problems you may encounter which stem from the way C and symbols interact. If you're intrigued about some other linking problem, let me know about it. You can find me on the &quot;about me&quot; page.</p>
<h2>Glossary</h2>
<p>Important terms this episode:</p>
<ul>
<li><em>inlining</em> – copying a function's body instead of making a call</li>
<li><em>link-time optimization (LTO)</em> – attempting to inline functions while linking, across translation unit boundaries</li>
</ul>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>From C to exec: Part 2</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/dynamic_linking/"/>
       <id>tag:dorotac.eu,2020-06-01:posts/dynamic_linking</id>
       <updated>2020-06-01T14:00Z</updated>
       <published>2020-06-01T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>From C to exec: Part 2</h1>
<p>The <a href="/posts/linking/">last part</a> explored static linking, but stopped before actually executing our executable. If you think this was all the linking, you're in for a surprise!</p>
<p>This post is best followed using the code from the previous part.</p>
<h2>Act 1: Unexpected symbol</h2>
<h3>Scene 1: Undefined?</h3>
<p>Did you notice the lonely symbol in the last act? Let's call it forth:</p>
<pre><code># gcc -c -o printer.o printer.c
# ar rcs libprinter.a printer.o
# nm libprinter.a

printer.o:
0000000000000000 T print_hello
                 U puts
</code></pre>
<p>That's a lonely &quot;Undefined&quot; symbol! And where did it even come from? Remember <code>printer.c</code>:</p>
<pre><code>#include &lt;stdio.h&gt;
void print_hello(void) {
  puts(&quot;Hello World!\n&quot;);
}
</code></pre>
<p>We call <code>puts</code>, but we never define it… Previously, when we tried to use a symbol that we didn't define, gcc refused to finish the linking, but here it works:</p>
<pre><code># gcc hello.o libprinter.a -o hello
# ./hello
Hello World!
</code></pre>
<p>What gives?</p>
<h3>Scene 2: Deus ex library</h3>
<p>Before we solve the mystery, let me throw another hint towards the solution: <code>puts</code> is hidden in a library! More specifically, it's in libc:</p>
<pre><code># nm /lib64/libc.so.6 | grep puts
[…]
0000000000073280 W puts
[…]
</code></pre>
<p>Here, &quot;W&quot; means it's a &quot;weak&quot; but defined symbol.</p>
<p>We know where the lost symbol is, and where it's missing from. But we still don't know the connection between the two places! The newest trace we have is the <code>/lib64/libc.so.6</code> file...</p>
<h3>Scene 3: Dynamic libraries</h3>
<p>We're familiar with static libraries already. They have the <code>.a</code> file ending, and they are an ingredient for the executable, and are no longer used after linking. But another variety exists. Like static libraries, it contain symbols, but its usefulness extends beyond linking time, and into the actual execution. They are called <em>dynamic libraries</em>, and their file names traditionally end with <code>.so</code>, for &quot;shared object&quot;.</p>
<pre><code># file /lib64/libc.so.6
/lib64/libc.so.6: symbolic link to libc-2.29.so
# file /lib64/libc-2.29.so
/lib64/libc-2.29.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=51377236ad01808e26404c98faa41d72f11a46a5, for GNU/Linux 3.2.0, not stripped, too many notes (256)
</code></pre>
<p><code>libc.so.6</code> is one of them.</p>
<h3>Scene 4: Reconstruction</h3>
<p>If we want to make progress on the story of the misplaced <code>puts</code>, we need to understand dynamic libraries. Time to create our own!</p>
<p>We already created a static library before, so let's use the same procedure, but make a dynamic one this time:</p>
<pre><code># gcc -fPIC -c printer.c -o printer.o
# gcc -shared printer.o -o libprinter.so
# file libprinter.so
libprinter.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=dd2595216f9c10b317ab3dbece8450a31fcaf672, not stripped
# nm libprinter.so | grep hello
0000000000001109 T print_hello
</code></pre>
<p>The procedure is similar to building a static library, except packaging with <code>ar</code> was replaced by another linking step (with <code>-shared</code>), making it look more like creating an executable. The object file gets created in a slightly different way, with the <code>-fPIC</code> flag being obligatory.</p>
<p>Most importantly, we see our <code>print_hello</code> symbol as present! Let's link the whole program together:</p>
<pre><code># gcc hello.o libprinter.so -o hello_dynamic
# file hello_dynamic
hello_dynamic: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=5a8e0d151089682720545349da87b330ab089805, not stripped
</code></pre>
<p>So far so good, this looks just like our previous static binary. Let's make sure all is fine.</p>
<pre><code># nm hello_dynamic | grep hello
                 U print_hello
# ./hello_dynamic
./hello_dynamic: error while loading shared libraries: libprinter.so: cannot open shared object file: No such file or directory
</code></pre>
<p>Uh oh! We lost <code>print_hello</code> along the way, and the program now tries to open the shared library we created. The one that contains <code>print_hello</code>! Could there be a connection? What made the program try to load a file if all we do is printing text?</p>
<h3>Scene 5: Dynamic linker</h3>
<p>The culprit here is the dynamic linker. It's part of the operating system responsible for loading programs. Remember the <code>interpreter</code> part of our file descriptions?</p>
<pre><code>[...] dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2 [...]
</code></pre>
<p>The interpreter's responsibility is to execute executables. It sets up basic resources, links them together if needed (!), and finally gives control to the program by running the entry point (in C that ends up being the function corresponding to the <code>main</code> symbol).</p>
<p>You may be surprised to see linking here again. Didn't we do enough of that already? We linked object files together with the dynamic library, and the dynamic library needed to be linked itself. Why again?</p>
<p>Well, that's because we chose to create a dynamic library instead of a static one. The static library's selling point is that its code is injected (at the time of linking) into the executable. The dynamic library's point is that its code is injected (at the time of execution) into the running program. The former option provides some degree of certainty that the program doesn't change, while the latter gives some flexibility: the dynamic library can get changed and updated without the need to recreate the executable.</p>
<p>When the interpreter complains about <code>cannot open shared object file</code>, it means that it needed the library for the linking step, but couldn't find it in the standard location. The <code>ldd</code> command lists all dynamic libraries that need to be linked together before running the executable:</p>
<pre><code># ldd ./hello_dynamic
        linux-vdso.so.1 (0x00007ffc35be9000)
        libprinter.so =&gt; not found
        libc.so.6 =&gt; /lib64/libc.so.6 (0x00007fc874215000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc874419000)
</code></pre>
<p>Our shared library is <code>not found</code>, because it's not in one of the standard paths. Thankfully, we can tell the dynamic linker to look for extra libraries in the current directory:</p>
<pre><code># LD_LIBRARY_PATH=`pwd` ldd ./hello_dynamic
        linux-vdso.so.1 (0x00007ff5f4e32000)
        libprinter.so =&gt; /home/rhn/libprinter.so (0x00007ff5f4e28000)
        libc.so.6 =&gt; /lib64/libc.so.6 (0x00007ff5f4c26000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff5f4e33000)
# LD_LIBRARY_PATH=`pwd` ./hello_dynamic
Hello World!
</code></pre>
<p>Great, we directed the computer to find our lost <code>print_hello</code> symbol inside a shared library!</p>
<h3>Scene 6: Standard libraries</h3>
<p>The list coming from <code>ldd</code> is suspicious. In our final step using gcc, we linked <code>hello.o</code> together with <code>libprinter.so</code>. Where did the other dynamic libraries on the list come from?</p>
<p>The answer is: kernel and standard libraries.</p>
<p>If you look at our statically linked executable, you will see the same list:</p>
<pre><code># ldd ./hello
        linux-vdso.so.1 (0x00007ffeda3d9000)
        libc.so.6 =&gt; /lib64/libc.so.6 (0x00007f4e3293a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f4e32b3e000)
</code></pre>
<p>The libraries here are required for the execution of virtually all programs, therefore gcc includes them implicitly. If you remember all the way back, we found <code>puts</code> in one of them: in <code>libc.so.6</code>. It turns out that all our executables do eventually link with that library, just well after the executable is created. That's why we were allowed to use <code>puts</code> even without knowing where it was defined.</p>
<h2>Final words</h2>
<p>With this, you should have a pretty good understanding of where linking happens on a modern computer. Linking still has a lot of quirks when working with C — after all, we may want some control over what we mark as symbols and how. That will et covered in a future part.</p>
<h2>Glossary</h2>
<ul>
<li><em>interpreter</em>: the software that executes executables</li>
<li><em>standard library</em>: libraries containing basic functionality that most programmers don't need to think about</li>
</ul>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>From C to exec</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/linking/"/>
       <id>tag:dorotac.eu,2020-05-28:posts/linking</id>
       <updated>2020-05-28T14:00Z</updated>
       <published>2020-05-28T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>From C to exec</h1>
<p>As the idea of having a technical blog grew in my mind and came closer to reality, it became obvious that it needs to host some kind of useful content apart from fluff and opinions. For this, I asked a few of my many technical friends:</p>
<blockquote>
<p>What can I write about for you?</p>
</blockquote>
<p>The suggestion that I liked the most was to write about the process that turns a C source file into an executable. The topic seems quite opaque, and indeed I don't recall seeing many ways to get into it without already having some background knowing about linking, translation units, and symbols.</p>
<p>The goal of this post is to establish this background for a reader who has a passing familiarity with C.</p>
<p>The seasoned reader should be warned that this is not a rigorous summary of the C standard, but rather a gentle introduction to the practical world of linking ELF files on GNU/Linux with GCC. Simplicity prevails, so while the C standard gives some room for different results, we'll explore only the case of unoptimized code and GCC version 9.</p>
<h2>Act 1: Hello World</h2>
<p>If you've dealt with C before, you're probably familiar with the <code>hello.c</code> file.</p>
<pre><code>#include &lt;stdio.h&gt;
int main(void) {
  puts(&quot;Hello World!&quot;);
  return 0;
}
</code></pre>
<p>The process of turning code into an executable usually consists of at least 2 main steps: <em>compilation</em> and <em>static linking</em>. Depending on how you've usually done your programming, you may be familiar with different ways of performing those two steps. Let's take a look at two most popular ones.</p>
<h3>Scene 1: Executable</h3>
<p>This unsuspecting source file is about to undergo transformations that push it through the entire pipeline, changing it beyond recognition into a dynamic executable. Prepare your terminals:</p>
<pre><code># gcc hello.c -o hello
# ./hello
Hello World!
# file hello
hello.o: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=c1b39511769ab152611cd8ad83983dc724333b7a, not stripped
</code></pre>
<p>Congratulations, we've turned code into an excutable!</p>
<h3>Scene 2: Linking</h3>
<p>But wait, wasn't this supposed to be 2 steps? Indeed, gcc does them both at the same time when not passed the <code>-c</code> flag. Let's try again:</p>
<pre><code># gcc -c hello.c -o hello.o
# ./hello.o
bash: ./hello.o: Permission denied
# file hello.o
hello.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
</code></pre>
<p>That's not the same kind of file as before! What we got is an object file instead: a compiled version of the source code, without any linking applied. Notice the &quot;relocatable&quot; instead of &quot;executable&quot;.</p>
<p>Let's link it now:</p>
<pre><code># gcc hello.o -o hello
# ./hello
Hello World!
</code></pre>
<p>That's an executable again!</p>
<p>You may wonder: what happened? how is the intermediate file different from the executable? why is it useful? Let's move on to the second act...</p>
<h2>Act 2: Symbols</h2>
<p>Here we discover why linking is useful and necessary.</p>
<p>Our project is growing more complicated. It's composed of two source files now.</p>
<p><code>hello.c</code> morphs:</p>
<pre><code>#include &quot;print.h&quot;
int main(void) {
  print_hello();
  return 0;
}
</code></pre>
<p><code>printer.c</code> contains:</p>
<pre><code>#include &lt;stdio.h&gt;
void print_hello(void) {
  puts(&quot;Hello World!&quot;);
}
</code></pre>
<p>Finally, they are joined by <code>print.h</code>:</p>
<pre><code>void print_hello(void);
</code></pre>
<h3>Scene 1: Object files</h3>
<p>This is a more complicated project. A function from one file is referenced in another! And there's a new header file as well. How do we turn this into an executable? Of course, we could just use <code>gcc</code> without the <code>-c</code> flag, and hope that it figures out everything on its own. But that's not why we're here, so let's go step by step:</p>
<p>This compiles the <code>printer.c</code> file into an object file <code>printer.o</code>:</p>
<pre><code>gcc -c printer.c -o printer.o
</code></pre>
<p>So far so good. This produces the <code>hello.o</code> file, which is compiled <code>hello.c</code>:</p>
<pre><code>gcc -c hello.c -o hello.o
</code></pre>
<p>Wait a minute… <code>hello.c</code> calls a function in <code>printer.c</code>, but neither the command, nor <code>hello.c</code> reference <code>printer.c</code> anywhere! All they have is the name <code>print_hello</code>, but not the body of the function. How does the compiler know what call to insert?</p>
<p>Turns out that the compiler doesn't actually need to resolve all names to function bodies immediately. What it does instead is use <em>symbols</em> to step around the problem. When a source file containing a function (or a global variable) named <code>print_hello</code> gets compiled, the resulting object file (or a library) gets a symbol called <code>print_hello</code>. On the other hand, any object file <em>using</em> that symbol creates a reference to it (in the <em>relocation table</em>), to be filled later: at linking time.</p>
<p>Let's see it for ourselves:</p>
<pre><code># nm printer.o
0000000000000000 T print_hello
                 U puts
</code></pre>
<p>According to <code>man nm</code>, <code>T</code> means the symbol is defined and in the <em>Text</em> section. What about the other object file?</p>
<pre><code># nm hello.o
0000000000000000 T main
                 U print_hello
</code></pre>
<p>As expected, <code>print_hello</code> is referenced, but <code>U</code> for &quot;Undefined&quot;. It's simply not in this file.</p>
<p>To fill in the vacancy, let's do the final step, and link both files together:</p>
<pre><code># gcc hello.o print.o -o hello
# nm hello
[...]
0000000000401126 T main
0000000000401136 T print_hello
[...]
</code></pre>
<p>There are many more symbols, but the one we're interested in is there: <code>print_hello</code> is now present. Notice that <code>main</code> is present as well, meaning that we have the bodies of <em>both</em> of our source files in the same compiled file now.</p>
<p>The linker has performed <em>relocation</em>. If you remember from before, the <em>object file</em> type returned by the <em>file</em> command was <code>relocatable</code>. Our binary is now <em>executable</em>, meaning that we can't repeat the same operation again to add some symbols we forgot about any more. But the linker will not let you forget anything anyway:</p>
<pre><code># gcc hello.o -o hello
/usr/bin/ld: hello.o: in function `main':
hello.c:(.text+0x5): undefined reference to `print_hello'
collect2: error: ld returned 1 exit status
</code></pre>
<h3>Scene 2: Header files</h3>
<p>The other anomaly introduced in this example is the header file. It contains only one line:</p>
<pre><code>void print_hello(void);
</code></pre>
<p>What is it for? Suppose we didn't have it. The only information about <code>print_hello</code> would come from its call, which looks like this:</p>
<pre><code>  print_hello();
</code></pre>
<p>Can you guess the type of the function? Well, sort of: it takes no arguments, but it may return anything. And if we made a mistake, like this:</p>
<pre><code>  print_hello(&quot;world&quot;);
</code></pre>
<p>a compiler guessing the type of arguments would happily go on trusting what we wrote, instead of warning us of the extra argument. The line in the header file, called <em>prototype</em>, informs the compiler about which functions are available, and what type they have.</p>
<p>As an aside, you might notice that this was already partially covered by the linker: when function <code>print_hello</code> was unavailable in our example, we received am &quot;undefined reference&quot; message from the linker. Couldn't we get rid of headers and let the linker catch those errors? Indeed, gcc-c++ uses some additional information to store function type:</p>
<pre><code># cp printer.c printer.cpp
# g++ -c printer.cpp -o printerpp.o
# nm printerpp.o 
                 U puts
0000000000000000 T _Z11print_hellov
# c++filt _Z11print_hellov
print_hello()
</code></pre>
<p>This is called <em>symbol mangling</em>, and C compilers don't generally use it, leaving us with obligatory prototypes, and, in practice, header files.</p>
<p>Despite this, C++ still requires prototypes inside the header files, for reasons that are not clear to me. I suspect compatibility with C, or even IDE suggestions.</p>
<h3>Scene 3: Static libraries</h3>
<p>Static libraries are used when additional functionality provided by third party is needed. They also provide symbols, and may depend on other libraries. If this sounds familiar, it's because it is! That's the same core functionality as object files.</p>
<p>In fact, if you look at a shared library on Linux (file name ending with <code>.a</code>), it's just an <em>ar</em> archive containing multiple object files.</p>
<pre><code># ar -t /usr/lib64/libm-2.29.a
s_lib_version.o
s_matherr.o
s_signgam.o
fclrexcpt.o
fgetexcptflg.o
fraiseexcpt.o
[...]
</code></pre>
<p>Let's turn our printer into a static library and see for ourselves:</p>
<pre><code># gcc -c -o printer.o printer.c
# ar rcs libprinter.a printer.o
# nm libprinter.a

printer.o:
0000000000000000 T print_hello
                 U puts
# gcc hello.o libprinter.a -o hello
# ./hello
Hello World!
</code></pre>
<h2>Intermission</h2>
<p>Those steps are sufficient if you need to create a binary for the bare metal, for example for an Arduino with an AVR microcontroller, or if you're creating a basic operating system yourself. The executable needs to be provided in the final form, taking minimal advantage of libraries present on the system. Anything with an operating system permitting <em>dynamic (shared) libraries</em> (like Windows or Linux) is likely to have a 3rd step, which happens immediately before execution: <em>dynamic linking</em>. We will cover that in a future chapter, together with various gotchas related to how C emits symbols.</p>
<h2>Glossary</h2>
<p><em>compilation</em>: turning a <em>source</em> representation of a program into <em>machine code</em></p>
<ul>
<li><em>executable</em>: a runnable program</li>
<li><em>header file</em>: a file in the C language containing, among others, <em>prototypes</em></li>
<li><em>linking</em>: combining different <em>object files</em> together</li>
<li><em>object file</em>: a <em>compiled</em> form of the <em>source</em></li>
<li><em>prototype</em>: a piece of text telling the compiler about the existence of a certain <em>symbol</em> and its type</li>
<li><em>relocation</em>: filling <em>symbols</em> from one <em>object file</em> into others</li>
<li><em>relocation table</em>: a list of references to missing symbols</li>
<li><em>source code</em>: text written in a human-readable form, e.g. in the C language</li>
<li><em>static library</em>: a collection of <em>object files</em></li>
<li><em>symbol</em>: a piece of program that can be accessed from another <em>object file</em></li>
</ul>

         </div>
       </content>
     </entry>
     
     <entry>
       <title>Brutalist blogging</title>
       <link rel="alternate" type="text/html"
        href="https://dorotac.eu/posts/brutalism/"/>
       <id>tag:dorotac.eu,2020-05-19:posts/brutalism</id>
       <updated>2020-05-19T14:00Z</updated>
       <published>2020-05-19T14:00Z</published>
       <content type="xhtml" xml:lang="en">
         <div xmlns="http://www.w3.org/1999/xhtml">
           <h1>Brutalist blogging</h1>
<p>I started this website in order to share bits of my knowledge. While I don't know everything, I have some workflows and opinions on a range of topics, from systems programming, through user interfaces, to collaboration. I hope that some of my insights will be interesting to others.</p>
<p>One of my insights is that computers are under-utilized. They are powerful, yet few people are able to command their power. The deluge of programming resources seems to be losing against the dwindling cues to do something novel with the machines. Today's phones neither come with BASIC, nor Turbo Pascal, apps don't lend themselves to modification via replacing config or art files, and Excel is not as ubiquitous as on desktops. The end result is closer to reading someone's books of magic than writing them.</p>
<p>As I started writing this Markdown document, I noticed that I was on the verge of the magical realm, and it was up to me to leave cracks in the magical armor, through which the inner workings of the post could be seen. I took inspiration from <a href="https://en.wikipedia.org/wiki/Brutalist_architecture#Characteristics">brutalist architecture</a>, which tries to be <a href="https://medium.com/on-architecture-1/the-new-brutalism-6601463336e8">structurally honest</a>:</p>
<blockquote>
<p>Whatever has been said about honest use of materials, most modern buildings appear to be made of whitewash or patent glazing, even when they are made of concrete or steel. Hunstanton appears to be made of glass, brick, steel and concrete, and is in fact made of glass, brick, steel and concrete. Water and electricity do not come out of unexplained holes in the wall, but are delivered to the point of use by visible pipes and manifest conduits. One can see what Hunstanton is made of, and how it works, and there is not another thing to see except the play of spaces.</p>
</blockquote>
<p>In a parallel between architecture and computing, I want my work to stand out against mainstream technology by being more honest transparent, and less secretive and magical. Brutalist blogging, if you will.</p>
<p>The first property of this blog is its simplicity. Minimal CSS, HTML, little scripting, and no minification make it human-inspectable. I intend to add more transparency as time goes on: each entry's raw <a href="https://commonmark.org/">CommonMark</a> source can be accessed by adding <code>.md</code> to the URL. And finally, the <a href="/bm.py">HTML renderer</a> preserves some of the source markup, like <code>#</code> for headings, <code>&gt;</code> for quotes, to let the reader see the inner workings even if it never occurred to them to ask about them.</p>
<p>As a strong believer in the notion that computers should empower their users, I hope that my little contribution helps dispel the magic.</p>

         </div>
       </content>
     </entry>
     
   </feed>
