<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>rust &#8212; binarycat</title>
    <link>https://paper.wf/binarycat/tag:rust</link>
    <description>blog about programming and general tech stuff</description>
    <pubDate>Wed, 29 Apr 2026 16:32:17 +0000</pubDate>
    <item>
      <title>Safe Delayed Initialization for Lifetime Extension</title>
      <link>https://paper.wf/binarycat/safe-delayed-initialization-for-lifetime-extension</link>
      <description>&lt;![CDATA[A niche programming pattern to satisfy the borrow checker.]]&gt;</description>
      <content:encoded><![CDATA[<p>A niche programming pattern to satisfy the borrow checker.
</p>

<p>Many Rust programmers don&#39;t know the exact semantics of <code>let</code>:</p>

<p>A variable declared with <code>let</code> must be assigned before it is read on every control-flow path that uses it, and if it is not declared <code>mut</code>, it can be assigned at most once on any path.</p>
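
<p>As a small sketch of that rule (the <code>classify</code> function is made up for illustration), a non-<code>mut</code> variable can be declared without a value as long as every path that reads it has assigned it exactly once:</p>

<pre><code class="language-rust">// hypothetical helper, only here to illustrate the rule
fn classify(n: i32) -&gt; &amp;&#39;static str {
    let label: &amp;&#39;static str; // not `mut`: one assignment per path
    if n &lt; 0 {
        label = &#34;negative&#34;;
    } else if n == 0 {
        label = &#34;zero&#34;;
    } else {
        label = &#34;positive&#34;;
    }
    // every path above assigned `label`, so it can be read here
    label
}

fn main() {
    println!(&#34;{}&#34;, classify(-3));
    println!(&#34;{}&#34;, classify(0));
}
</code></pre>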

<p>This means we don&#39;t have to initialize variables if we don&#39;t use them:</p>

<pre><code class="language-rust">fn main() {
    let _x: u8; // never assigned a value
}
</code></pre>

<p>On its own, this isn&#39;t very helpful, but it gets slightly less useless when we combine it with conditional control flow:</p>

<pre><code class="language-rust">use rand;

fn main() {
    let x: u8; 
    
    if rand::random::&lt;bool&gt;() {
        x = rand::random();
        // the borrow checker knows that
        // `x` is always assigned before
        // it is read.
        println!(&#34;x is {x}&#34;);
    }
}
</code></pre>

<p>Ok, still not very useful yet.  Where this gets <em>actually</em> useful is when we have a reference that we want to <em>sometimes</em> point to some data we allocated on the heap.</p>

<pre><code class="language-rust">use rand;

fn main() {
    let x_string: String; 
    let mut x = &#34;nothing&#34;;
    
    if rand::random::&lt;bool&gt;() {
        let n: u8 = rand::random();
        x_string = n.to_string();
        // because `x_string` is declared
        // in the same scope as `x`,
        // we can do this.
        // if `x_string` was
        // declared within this `if` block,
        // it would not live long enough.
        x = &amp;x_string;
    }
    
    // pretend this is really complicated code
    // that we don&#39;t want to repeat.
    println!(&#34;x is {x}&#34;);
}
</code></pre>

<p>Many less-experienced Rust programmers would just use <code>&#34;nothing&#34;.to_string()</code> here, always storing <code>x</code> on the heap, but this has the downside of a needless allocation.</p>

<p>Of course, in this example, we could just initialize <code>x_string</code> with a dummy value and make it <code>mut</code>, like <code>let mut x_string = String::new()</code>, but this has a few issues:</p>
<ol><li>it implies the string may be mutated several times in-place.</li>
<li>if we forget the <code>x_string =</code> line, the compiler won&#39;t warn us, and we&#39;ll end up printing the empty string.</li>
<li>not all types have such a cheap/easy way to create a dummy value like this.</li></ol>

<p>Another alternative would be using <code>MaybeUninit</code>, which may be necessary if your control flow is significantly more complex, but this should be avoided if possible due to the potential to cause Undefined Behavior.</p>
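
<p>For comparison, here is a rough sketch of what a <code>MaybeUninit</code> version of the previous example could look like. The <code>describe</code> function and the <code>initialized</code> flag are made up for illustration. The point is that the compiler no longer tracks initialization, so each <code>unsafe</code> block relies on that flag being kept correct by hand:</p>

<pre><code class="language-rust">use std::mem::MaybeUninit;

// hypothetical rewrite of the example above, taking the random
// choice as a parameter so the behavior is easy to check
fn describe(n: Option&lt;u8&gt;) -&gt; String {
    let mut slot: MaybeUninit&lt;String&gt; = MaybeUninit::uninit();
    let mut initialized = false;

    if let Some(n) = n {
        slot.write(n.to_string());
        initialized = true;
    }

    let x: &amp;str = if initialized {
        // SAFETY: `slot` was written whenever `initialized` is true.
        let s: &amp;String = unsafe { slot.assume_init_ref() };
        s.as_str()
    } else {
        &#34;nothing&#34;
    };
    let out = format!(&#34;x is {x}&#34;);

    if initialized {
        // SAFETY: `slot` holds a valid `String`, and `x` is not used
        // after this point. Forgetting this call would leak the string.
        unsafe { slot.assume_init_drop() };
    }
    out
}

fn main() {
    println!(&#34;{}&#34;, describe(Some(7)));
    println!(&#34;{}&#34;, describe(None));
}
</code></pre>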

<p>If you&#39;re interested in a more real-world example of this pattern, rustdoc uses this in several places, such as in <a href="https://github.com/rust-lang/rust/blob/a4a11aca5ecf24dfff3c00715641026809951305/src/librustdoc/clean/types.rs#L759" rel="nofollow"><code>Type::attributes</code></a>.</p>

<hr>

<p><a href="/binarycat/tag:rust" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">rust</span></a> <a href="/binarycat/tag:programming" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">programming</span></a></p>

<hr>

<p>You can follow this blog <a href="https://paper.wf/binarycat/feed/" rel="nofollow">via its RSS feed</a> or by searching for <a href="https://paper.wf/@/binarycat@paper.wf" class="u-url mention" rel="nofollow">@<span>binarycat@paper.wf</span></a> on your Mastodon/ActivityPub instance.</p>
]]></content:encoded>
      <guid>https://paper.wf/binarycat/safe-delayed-initialization-for-lifetime-extension</guid>
      <pubDate>Fri, 21 Mar 2025 18:51:05 +0000</pubDate>
    </item>
    <item>
      <title>Cursed Rust</title>
      <link>https://paper.wf/binarycat/cursed-rust</link>
      <description>&lt;![CDATA[Rust is a language with a lot of features.  Sometimes those features have rough edges.  Sometimes those rough edges are funny.  Let&#39;s look at some.]]&gt;</description>
      <content:encoded><![CDATA[<p>Rust is a language with a lot of features.  Sometimes those features have rough edges.  Sometimes those rough edges are funny.  Let&#39;s look at some.
</p>

<h2 id="copy-and-clone-can-diverge" id="copy-and-clone-can-diverge"><code>Copy</code> and <code>Clone</code> can diverge</h2>

<pre><code class="language-rust">#[derive(Debug)]
struct OhNo(u32);

impl Clone for OhNo {
    fn clone(&amp;self) -&gt; Self {
        OhNo(self.0 + 1)
    }
}

impl Copy for OhNo { }

fn main() {
    let oh = OhNo(3);
    dbg!(oh.clone());
    dbg!(oh);
}
</code></pre>

<p>Even worse, <a href="https://doc.rust-lang.org/1.81.0/std/marker/trait.Copy.html" rel="nofollow">according to the docs</a>, this isn&#39;t even a logic error, unlike inconsistent implementations of <a href="https://doc.rust-lang.org/1.81.0/std/cmp/trait.PartialOrd.html" rel="nofollow"><code>PartialOrd</code></a> and <a href="https://doc.rust-lang.org/1.81.0/std/cmp/trait.Ord.html" rel="nofollow"><code>Ord</code></a>.</p>

<h2 id="really-long-place-expression" id="really-long-place-expression">Really long <del>place</del> expression</h2>

<p>a <a href="https://doc.rust-lang.org/reference/expressions.html#place-expressions-and-value-expressions" rel="nofollow">place expression</a> is an expression you can take the address of.</p>

<p><del>turns out <code>if</code> statements are place expressions.  this is so obscure that not even the reference knows this, despite it being true since rust 1.0</del></p>

<pre><code class="language-rust">fn main() {
    let arr = [10, 20];
    let arr_ref = &amp;if true {
        arr[0]
    } else {
        arr[1]
    };
    dbg!(arr_ref);
}
</code></pre>

<h2 id="krate-vs-crate" id="krate-vs-crate">krate vs crate_</h2>

<p><a href="https://rustc-dev-guide.rust-lang.org/conventions.html" rel="nofollow">t-compiler says to use <code>krate</code></a>, <a href="https://doc.rust-lang.org/stable/style-guide/items.html" rel="nofollow">t-style says to use <code>crate_</code></a></p>

<p>sidenote: how many of you have actually read the style guide? how many actually know that it exists?</p>

<h2 id="rust-has-reference-variables-kinda" id="rust-has-reference-variables-kinda">Rust has reference variables! kinda..</h2>

<pre><code class="language-rust">use std::cell::Cell;

fn main() {
    let x = Cell::new(1);
    let ref y = x;
    x.set(2);
    let ref z = x;
    assert_eq!(y, z);
}
</code></pre>

<p>This is mostly just silly.</p>

<h2 id="is-actually-useful" id="is-actually-useful"><code>&amp;*</code> is actually useful</h2>

<p>previous entries are here because they are weird and obscure. <code>&amp;*</code> is here because it is actually useful and meaningful, despite the fact that it would be a very silly no-op in most languages.</p>
<ul><li>invoking <code>Deref</code> without importing the trait</li>
<li>reborrowing a mutable reference as shared</li>
<li>turning raw pointers into references</li></ul>
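
<p>a minimal sketch of those three uses. the <code>demo</code> function and its variable names are made up for illustration:</p>

<pre><code class="language-rust">// hypothetical demo; returns the three results so they can be checked
fn demo() -&gt; (usize, i32, i32) {
    // 1. invoking `Deref` without importing the trait:
    //    `*owned` goes through `String`&#39;s `Deref` impl and yields a `str`
    let owned = String::from(&#34;hello&#34;);
    let as_str: &amp;str = &amp;*owned;

    // 2. reborrowing a mutable reference as shared
    let mut n = 1;
    let exclusive: &amp;mut i32 = &amp;mut n;
    *exclusive += 1;
    let shared: &amp;i32 = &amp;*exclusive;
    let seen = *shared;

    // 3. turning a raw pointer into a reference
    let value = 7;
    let ptr: *const i32 = &amp;value;
    // SAFETY: `ptr` was just created from a live, aligned `i32`.
    let from_raw: &amp;i32 = unsafe { &amp;*ptr };

    (as_str.len(), seen, *from_raw)
}

fn main() {
    println!(&#34;{:?}&#34;, demo());
}
</code></pre>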

<h2 id="addendum-it-s-not-just-if" id="addendum-it-s-not-just-if">Addendum: it&#39;s not just if</h2>

<p>A reader asked me if a <code>loop</code> + <code>break</code> could also be a place expression.  I thought to myself “well obviously <em>that</em> won&#39;t work”, before testing it out and realizing in horror that it does:</p>

<pre><code class="language-rust">fn main() {
    let arr = [10, 20];
    let arr_ref = &amp;loop { break arr[0]; };
    dbg!(arr_ref);
}
</code></pre>

<p>now this is weird.</p>

<h2 id="errata-2024-10-08-temporary-lifetime-extension-is-complicated" id="errata-2024-10-08-temporary-lifetime-extension-is-complicated">Errata 2024-10-08: temporary lifetime extension is complicated</h2>

<p><code>if</code> is not a place expression. I was under the impression that the previous examples would not work unless it was, but actually my understanding of temporary lifetime extension was simply incomplete.</p>
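
<p>A small sketch that is consistent with this reading, assuming the value produced by the <code>if</code> is copied into a temporary whose lifetime is then extended. The <code>snapshot</code> function is made up for illustration. If <code>arr_ref</code> pointed into <code>arr</code>, the later write to <code>arr[0]</code> would be rejected by the borrow checker:</p>

<pre><code class="language-rust">// hypothetical helper; returns what the reference sees and what the array holds
fn snapshot() -&gt; (i32, i32) {
    let mut arr = [10, 20];
    let arr_ref = &amp;if true { arr[0] } else { arr[1] };
    // `arr` is not borrowed here: `arr_ref` refers to a temporary copy
    arr[0] = 99;
    (*arr_ref, arr[0])
}

fn main() {
    println!(&#34;{:?}&#34;, snapshot());
}
</code></pre>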

<h2 id="addendum-2024-11-05-is-overloaded-on-strings" id="addendum-2024-11-05-is-overloaded-on-strings">Addendum 2024-11-05: + is overloaded on strings</h2>

<p>rust typically avoids C++ style operator overloading, favoring the haskell style of always having operators represent the same semantic operation.</p>

<p>nonetheless, <a href="https://doc.rust-lang.org/nightly/std/string/struct.String.html#method.add" rel="nofollow">you can use <code>+</code> for string concatenation</a>, and you can use <code>+=</code> as an alternative to <code>push_str</code>.</p>
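
<p>a quick sketch of both operators (the <code>shout</code> function is made up for illustration):</p>

<pre><code class="language-rust">// hypothetical helper showing `+`, `+=` and `push_str` side by side
fn shout(name: &amp;str) -&gt; String {
    // `String + &amp;str` consumes the left-hand `String` and appends to it
    let mut greeting = String::from(&#34;hello, &#34;) + name;
    // `+=` appends in place, just like `push_str`
    greeting += &#34;!&#34;;
    greeting.push_str(&#34;!&#34;);
    greeting
}

fn main() {
    println!(&#34;{}&#34;, shout(&#34;world&#34;));
}
</code></pre>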

<h2 id="addendum-2024-11-08-calling-methods-of-unnameable-traits" id="addendum-2024-11-08-calling-methods-of-unnameable-traits">Addendum 2024-11-08: Calling methods of unnameable traits</h2>

<p>Ever wanted an api that&#39;s <em>impossible</em> to use without glob imports? well now you can!</p>

<pre><code class="language-rust">mod greet_ext {
    mod inner {
        pub trait GreetExt {
            fn greet(&amp;self);
        }
        
        impl&lt;T&gt; GreetExt for T {
            fn greet(&amp;self) {
                println!(&#34;hello, world!&#34;);
            }
        }
    }
    pub use inner::GreetExt as _;
}

pub use greet_ext::*;

fn main() {
    1.greet();
}
</code></pre>

<hr>

<p><a href="/binarycat/tag:rust" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">rust</span></a> <a href="/binarycat/tag:programming" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">programming</span></a></p>

]]></content:encoded>
      <guid>https://paper.wf/binarycat/cursed-rust</guid>
      <pubDate>Thu, 03 Oct 2024 21:04:57 +0000</pubDate>
    </item>
    <item>
      <title>inferred implicit parameters for ergonomic object capabilities</title>
      <link>https://paper.wf/binarycat/inferred-implicit-parameters-for-ergonomic-object-capabilities</link>
      <description>&lt;![CDATA[a system that elegantly provides the security benefits of an effect system and ocaps, while also being convenient to use.]]&gt;</description>
      <content:encoded><![CDATA[<p>a system that elegantly provides the security benefits of an effect system and ocaps, while also being convenient to use.

inspired by <a href="https://docs.scala-lang.org/scala3/book/ca-context-parameters.html" rel="nofollow">scala&#39;s implicit parameters</a> and <a href="https://doi.org/10.48550/arXiv.1907.07154" rel="nofollow">the object-capability model</a> (ocaps).</p>

<h2 id="background-what-is-an-ocap" id="background-what-is-an-ocap">background: what is an ocap</h2>

<p>put simply, an object-capability is an <em>object</em> that represents a <em>capability</em> to do something.</p>

<p>for example, on unix systems, having a file descriptor gives you the <em>capability</em> to read and/or write that file, depending on how it was created.</p>

<p>where this becomes useful is when the ability to create these objects is restricted.  returning to the fd example, a setuid program might open all the relevant files as root, then drop its privileges to those of a regular user.  if this is done before interpreting user input, it means whatever that input causes the program to do, it can&#39;t do anything that requires root, other than accessing those specific files.</p>
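
<p>as a rough sketch of the idea in rust (the <code>record</code> function is made up for illustration, and a <code>Vec&lt;u8&gt;</code> stands in for an already-opened file): a function that is only handed a writable handle can only affect whatever that handle refers to.</p>

<pre><code class="language-rust">use std::io::{self, Write};

// hypothetical function: it receives one writable handle and nothing
// else, so that handle is the only thing it is able to write to
fn record(sink: &amp;mut impl Write, entry: &amp;str) -&gt; io::Result&lt;()&gt; {
    writeln!(sink, &#34;{entry}&#34;)
}

fn main() -&gt; io::Result&lt;()&gt; {
    // a `std::fs::File` opened earlier by more privileged code
    // could be passed in exactly the same way
    let mut sink: Vec&lt;u8&gt; = Vec::new();
    record(&amp;mut sink, &#34;user input&#34;)?;
    print!(&#34;{}&#34;, String::from_utf8_lossy(&amp;sink));
    Ok(())
}
</code></pre>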

<h2 id="step-1-implicit-parameters" id="step-1-implicit-parameters">step 1: implicit parameters</h2>

<p>when calling functions, implicit parameters are automatically filled in with a value from the surrounding context unless one is explicitly provided.</p>

<p>implicit parameters are primarily identified by their type, so their name can be omitted if the function body never refers to them by name.</p>

<pre><code class="language-scala">def add(a: Integer)(using b: Integer) = a + b;

def add_more(a: Integer)(using Integer) = add(a) + 1

given Integer = 2;

add(0) // 2
add_more(1) // 4
add_more(5)(using 10) // 16
</code></pre>

<p>implicit parameters are useful for programs that are going to be passing around the same arguments to a lot of functions.</p>

<h2 id="step-2-using-implicit-parameters-for-ergonomic-object-capabilities" id="step-2-using-implicit-parameters-for-ergonomic-object-capabilities">step 2: using implicit parameters for ergonomic object-capabilities</h2>

<p>the main problem with ocaps is simple: passing all those capability tokens around is tedious and annoying.</p>

<p>without implicit parameters, if all the functions in your codebase log and access the filesystem, then every function call in your code will have two extra parameters for the logging and filesystem handles.</p>
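
<p>a minimal rust sketch of that tedium. the <code>Logger</code> and <code>FsHandle</code> types and both functions are hypothetical, made up only to show the handles being threaded through by hand:</p>

<pre><code class="language-rust">// hypothetical capability types, invented for this sketch
struct Logger {
    lines: Vec&lt;String&gt;,
}

struct FsHandle {
    root: String,
}

fn load_config(log: &amp;mut Logger, fs: &amp;FsHandle, name: &amp;str) -&gt; String {
    let path = format!(&#34;{}/{}&#34;, fs.root, name);
    log.lines.push(format!(&#34;reading {path}&#34;));
    path
}

// `start_app` does nothing with the handles itself,
// but it still has to accept them and pass them along
fn start_app(log: &amp;mut Logger, fs: &amp;FsHandle) -&gt; String {
    load_config(log, fs, &#34;app.conf&#34;)
}

fn main() {
    let mut log = Logger { lines: Vec::new() };
    let fs = FsHandle { root: String::from(&#34;/data&#34;) };
    println!(&#34;{}&#34;, start_app(&amp;mut log, &amp;fs));
    println!(&#34;{:?}&#34;, log.lines);
}
</code></pre>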

<p>with implicit parameters, you can omit those params from the call, and they will automatically be filled in if the containing function has implicit parameters of a compatible type.</p>

<h2 id="step-3-inferred-implicit-parameters-for-even-less-boilerplate-while-maintaining-control" id="step-3-inferred-implicit-parameters-for-even-less-boilerplate-while-maintaining-control">step 3: inferred implicit parameters for even less boilerplate, while maintaining control</h2>

<p>while implicit parameters exist to reduce repetition in function calls, <em>inferred</em> parameters exist to reduce repetition in function <em>definitions</em>.</p>

<p>inferring implicit parameters is opt-in per-function (via a simple syntax such as <code>implicit ...</code> at the end of the argument list), and inferred parameters will be expanded in auto-generated docs.</p>

<p>the logic for inferring parameters is simple: if the containing function opts in, any function calls that would report missing implicit parameters will instead result in those parameters being added to the containing function.</p>

<p>note that if you want to manipulate a parameter directly, you would still have to name it as an implicit parameter, it could not be inferred.  this is to prevent functions from containing references to identifiers they do not declare.</p>

<h2 id="ideas-for-proof-of-concept-implementations" id="ideas-for-proof-of-concept-implementations">ideas for proof-of-concept implementations</h2>

<p>it should be fairly easy to implement a prototype of this via code transformation.</p>

<p>a good candidate would be Julia, since it has a macro system, types, and is dynamic.</p>

<h2 id="potential-use-in-a-faster-and-more-secure-operating-system" id="potential-use-in-a-faster-and-more-secure-operating-system">potential use in a faster and more secure operating system</h2>

<p>modern OSes have many security features that require dynamic checking, often with hardware support.<br>
one obvious example is process memory isolation, which prevents one program from accessing the memory of another.</p>

<p>there have been a few past attempts to replace these dynamic security checks with static analysis (i&#39;m sure they exist, but unfortunately i can&#39;t find them right now), however these were mostly held back by the compiler technology of the time (most of them used a modified version of C).</p>

<p>the main limitation of this would be having to distribute programs as source code in order to get substantial benefit, but with modern JIT compiler technology (as well as global compiler caches), this should be much more manageable.</p>

<p>instead of having an external manifest that tells what permissions an application needs, these permissions would be obvious based on the signature of that program&#39;s <code>main</code> function.</p>

<h3 id="more-performant" id="more-performant">more performant?</h3>

<p>ring transitions and syscalls are expensive, and there are other costs associated with process memory isolation, such as having to have a general-purpose heap allocator within every process.  these in-process allocators can only receive whole pages from the OS, and more importantly, they can only <em>return</em> whole pages to the OS, and only under certain circumstances.</p>

<p>having a single shared address space has the potential to improve performance significantly.</p>

<h3 id="more-secure" id="more-secure">more secure?</h3>

<p>wouldn&#39;t removing the kernel/userspace split make things less secure?</p>

<p>well, if that&#39;s all you were doing, then sure!</p>

<p>but there&#39;s a few more pieces of the puzzle:</p>
<ol><li>most important stuff is in userspace anyways</li>
<li>we&#39;re not just removing it, we&#39;re <em>replacing</em> it, and we&#39;re replacing it with something much easier to use.</li></ol>

<h4 id="filesystems" id="filesystems">filesystems</h4>

<p>the biggest security hole of modern desktop operating systems is the filesystem, where keeping files secure requires creating fake “users” for a program and using setuid hacks.</p>

<p>mobile operating systems manage this by simply locking down the filesystem and requiring the use of alternate apis for inter-process communication.</p>

<p>our operating system would take a different approach, where a program can only access parts of the filesystem given to it as objects, either via a drag&amp;drop file manager, or via the system shell, which would logically be a repl for our new programming language.</p>

<p>an example of ocap filesystem access is in the experimental <a href="https://developer.mozilla.org/en-US/docs/Web/API/FileSystemDirectoryHandle" rel="nofollow"><code>FileSystemDirectoryHandle</code></a> browser api.</p>

<h1 id="related-work" id="related-work">related work</h1>
<ul><li><a href="https://www.gnu.org/software/guile/manual/html_node/Parameters.html" rel="nofollow">guile parameters</a></li></ul>

<h1 id="addendum-2024-09-30-isolation-blocks" id="addendum-2024-09-30-isolation-blocks">Addendum 2024-09-30: isolation blocks</h1>

<p>it has come to my attention that i have neglected to specify a useful concept: isolation blocks.</p>

<p>any function calls inside an isolation block will not resolve implicit parameters to a declaration outside the block.</p>

<p>this is not strictly necessary for the model to work, the same effect can be accomplished by using wrapper functions and passing parameters explicitly, but it makes things much easier to use.</p>

<p>this is especially useful for interactive use, where it essentially allows you to enter a sandbox just by starting an isolation block. specific ocaps can be imported into the sandbox with the equivalent of scala&#39;s <code>given</code> keyword.</p>

<h1 id="addendum-2024-10-08-pwd-as-an-implicit-parameter" id="addendum-2024-10-08-pwd-as-an-implicit-parameter">Addendum 2024-10-08: $PWD as an implicit parameter</h1>

<p>one concept that is very pervasive in UNIX systems is that of a “current directory”.</p>

<p>there is one big problem with this: it&#39;s a process-wide variable.</p>

<p>one solution to this that has emerged is the <code>openat()</code> family of functions, which allow you to open a file relative to a directory file descriptor.</p>

<p>unfortunately, this is almost never used, due to the simple fact that passing around dir handles everywhere is kinda annoying.</p>

<p>under this new paradigm, everything that would care about the “current directory” instead takes an implicit DIR handle which all paths are opened relative to.</p>

<p>this removes pitfalls such as <code>chdir()</code> in multi-threaded applications, while also making it clear in auto-generated docs which functions care about the current directory.</p>

<p>if we require all paths to be simple relative paths (i.e. not starting with <code>/</code> and not containing <code>..</code>), then we get the ocap-like isolation described before, and each function would essentially run in its own chroot.</p>
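
<p>a rough sketch of what such a check could look like in rust. the <code>is_simple_relative</code> function is hypothetical, and this version is a little stricter than the rule above, since it only accepts plain name components. it also only inspects the path string, so it says nothing about symlinks:</p>

<pre><code class="language-rust">use std::path::{Component, Path};

// hypothetical check: accept a path only if it is non-empty and every
// component is a plain name (no root, no `..`, no leading `.`)
fn is_simple_relative(path: &amp;str) -&gt; bool {
    !path.is_empty()
        &amp;&amp; Path::new(path)
            .components()
            .all(|part| matches!(part, Component::Normal(_)))
}

fn main() {
    for candidate in [&#34;notes/todo.txt&#34;, &#34;/etc/passwd&#34;, &#34;a/../b&#34;] {
        println!(&#34;{candidate}: {}&#34;, is_simple_relative(candidate));
    }
}
</code></pre>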

<hr>

<p><a href="/binarycat/tag:programming" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">programming</span></a> <a href="/binarycat/tag:rust" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">rust</span></a></p>

]]></content:encoded>
      <guid>https://paper.wf/binarycat/inferred-implicit-parameters-for-ergonomic-object-capabilities</guid>
      <pubDate>Wed, 11 Sep 2024 01:31:55 +0000</pubDate>
    </item>
  </channel>
</rss>