Faking Specialization on Stable Rust: Field Notes on Autoref Specialization

Jiho Park, Suyoung Hwang · 2026-05-22


Autoref method resolution: Rust tries receiver types w, &w, and &mut w in order and picks &w, the first match with the fewest references.

Rust has a handful of features that still haven’t stabilized, and one that’s been stuck for ages is specialization. It’s the feature where the compiler automatically picks the right impl: if this type implements this trait, use the more specific one; otherwise fall back to the general one. But #![feature(specialization)] is still locked to nightly, tangled up with soundness issues, and there’s no telling when it’ll land.

And yet, when you’re writing macros or code generators, the moment comes when you genuinely need it. The type of an expression the user hands you, or the type of the slot you’re shoving a value into, might be one of two (or more) things, and you want to take it and handle each case differently. The workaround for this, usable even on the stable channel, is the autoref specialization trick that @dtolnay documented in the case-studies repo. Today we’ll lay out what it is, plus two cases where we actually put it to work. Both started out feeling like “surely this needs nightly?” and both ended up resolving cleanly on stable.

Exploiting the method resolution rules

The core idea is surprisingly simple. When Rust resolves x.method(), it adds autorefs one step at a time to settle on the receiver, looking for a matching impl as it goes. It tries x, &x, and &mut x, in that order, and picks the impl that matches with the fewest references.

You can use this priority to place the “more specific impl” and the “fallback impl” at different reference depths.

struct Wrapper<T>(T);
 
// High priority: applies only when T: Display. Implemented directly on Wrapper<T>.
trait ViaDisplay {
    fn tell(&self) -> &'static str;
}
impl<T: std::fmt::Display> ViaDisplay for Wrapper<T> {
    fn tell(&self) -> &'static str { "display" }
}
 
// Low priority: applies to any T, but implemented on &Wrapper<T> -> one extra autoref needed.
trait ViaFallback {
    fn tell(&self) -> &'static str;
}
impl<T> ViaFallback for &Wrapper<T> {
    fn tell(&self) -> &'static str { "fallback" }
}
 
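fn main() {
    struct Opaque; // stand-in for any type that has no Display impl

    let w = Wrapper("hello");
    assert_eq!((&w).tell(), "display");   // &str: Display

    let w = Wrapper(Opaque);
    assert_eq!((&w).tell(), "fallback");  // Opaque has no Display impl
}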

Call (&w).tell() and the compiler tries to match on the receiver type &Wrapper<T>. If T: Display, the ViaDisplay impl attached to Wrapper<T> matches without any extra reference, so it wins. If T isn’t Display, that candidate disappears, and resolution falls to ViaFallback on &Wrapper<T>, which takes one more reference to reach. The whole branch resolves at compile time, so the runtime cost is zero.
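One consequence is worth seeing in code: the choice is made from the types visible where the call is written. The sketch below repeats the Wrapper, ViaDisplay, and ViaFallback definitions so it stands alone; the tell! macro, tell_generic, and Opaque are hypothetical names added for illustration. As far as we understand the resolution rules, a generic function body only knows an unbounded T, so the Display candidate never applies there and the fallback is chosen every time.

```rust
use std::fmt::Display;

#[allow(dead_code)]
struct Wrapper<T>(T);

trait ViaDisplay {
    fn tell(&self) -> &'static str;
}
impl<T: Display> ViaDisplay for Wrapper<T> {
    fn tell(&self) -> &'static str { "display" }
}

trait ViaFallback {
    fn tell(&self) -> &'static str;
}
impl<T> ViaFallback for &Wrapper<T> {
    fn tell(&self) -> &'static str { "fallback" }
}

// Hypothetical macro: it expands at the call site, where the concrete type is known.
macro_rules! tell {
    ($value:expr) => {
        (&Wrapper($value)).tell()
    };
}

// Hypothetical generic function: inside this body `T` carries no `Display` bound,
// so the high-priority candidate is rejected no matter what the caller passes.
fn tell_generic<T>(value: T) -> &'static str {
    (&Wrapper(value)).tell()
}

// Hypothetical type without a `Display` impl.
struct Opaque;

fn main() {
    assert_eq!(tell!("hello"), "display");
    assert_eq!(tell!(Opaque), "fallback");
    // Same argument as the first line, but resolved inside generic code.
    assert_eq!(tell_generic("hello"), "fallback");
}
```

That is why the trick tends to live in macros and generated code rather than in ordinary generic functions.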

In practice you usually add one more step. You have each impl return a distinct tag type (a zero-sized type), then hang another trait method off that tag. Now “which path did we take” propagates as a type rather than a value, and the real work happens on top of that tag.

struct TagHigh;
struct TagLow;
 
// Step 1: encode which impl was selected as a tag type.
// Step 2: dispatch the real work on that tag.
let tag = (&wrapper).kind();   // TagHigh or TagLow
let out = tag.run(raw);        // handle `raw` differently per tag

That’s the whole pattern. Once it’s in your hands, you can draw “branches that split on type at compile time” wherever you like, all on stable.
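To make the two steps concrete, here is a minimal runnable sketch. It keeps the names kind, run, TagHigh, and TagLow from the outline above; everything else (HighKind, LowKind, the describe! macro, Opaque, and a Wrapper that borrows the value) is an assumption made for illustration. For brevity the second step is an inherent method on each tag rather than a separate trait.

```rust
use std::fmt::Display;

// Borrows the value just long enough to pick a tag.
#[allow(dead_code)]
struct Wrapper<'a, T>(&'a T);

// Zero-sized tags: they carry "which impl won" as a type.
struct TagHigh;
struct TagLow;

// Step 1, high priority: T: Display, implemented on Wrapper itself.
trait HighKind {
    fn kind(&self) -> TagHigh { TagHigh }
}
impl<T: Display> HighKind for Wrapper<'_, T> {}

// Step 1, low priority: any T, implemented on &Wrapper (one extra autoref).
trait LowKind {
    fn kind(&self) -> TagLow { TagLow }
}
impl<T> LowKind for &Wrapper<'_, T> {}

// Step 2: each tag does the real work, with bounds that fit its branch.
impl TagHigh {
    fn run<T: Display>(self, raw: T) -> String { format!("display: {raw}") }
}
impl TagLow {
    fn run<T>(self, _raw: T) -> String { String::from("opaque value") }
}

// Hypothetical macro tying the two steps together.
macro_rules! describe {
    ($value:expr) => {{
        let raw = $value;
        let tag = (&Wrapper(&raw)).kind(); // TagHigh or TagLow
        tag.run(raw)                       // handle `raw` differently per tag
    }};
}

// Hypothetical type without a `Display` impl.
struct Opaque;

fn main() {
    assert_eq!(describe!(42_i32), "display: 42");
    assert_eq!(describe!(Opaque), "opaque value");
}
```

Because the tag is a type, TagHigh::run can demand T: Display while TagLow::run demands nothing, and the compiler checks each branch only against the inputs that reach it.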

Proof it isn’t a toy: anyhow

The best evidence that this technique holds up in the real world is anyhow. Pass a single argument to the anyhow!(...) macro, and if it’s an error type implementing std::error::Error, the macro wraps it as is; if not, it treats it as a message or format string. The user never has to pick a different macro for the two cases. Whether you write anyhow!(err) or anyhow!("failed: {code}"), it just works.

That “the user never has to think about it” experience is the whole point. A macro’s ergonomics come down to how many cases it absorbs on the user’s behalf, and autoref specialization makes that possible on stable, at zero cost. A good part of why anyhow is so widely used is this smoothness.
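The sketch below is a std-only miniature of that behavior, not anyhow’s actual source. Every name in it (report!, ErrorKind, MessageKind, ErrorTag, MessageTag, BoxError) is hypothetical. It only aims to show how a single macro argument can be routed either to “keep the error as is” or to “turn the message into an error”.

```rust
use std::error::Error;
use std::fmt::Display;

type BoxError = Box<dyn Error + Send + Sync + 'static>;

struct ErrorTag;
struct MessageTag;

// High priority: the argument is itself an error type.
trait ErrorKind {
    fn report_kind(&self) -> ErrorTag { ErrorTag }
}
impl<E: Error + Send + Sync + 'static> ErrorKind for E {}

// Low priority (one extra autoref): anything printable is treated as a message.
trait MessageKind {
    fn report_kind(&self) -> MessageTag { MessageTag }
}
impl<M: Display> MessageKind for &M {}

impl ErrorTag {
    // Box the error unchanged, so it can still be downcast to its concrete type.
    fn build<E: Error + Send + Sync + 'static>(self, err: E) -> BoxError {
        Box::new(err)
    }
}
impl MessageTag {
    // Render the message and let std turn the String into a boxed error.
    fn build<M: Display>(self, msg: M) -> BoxError {
        msg.to_string().into()
    }
}

// Hypothetical macro: one entry point for both kinds of argument.
macro_rules! report {
    ($arg:expr) => {{
        let arg = $arg;
        let tag = (&arg).report_kind();
        tag.build(arg)
    }};
}

fn main() {
    let parse_err = "x".parse::<i32>().unwrap_err();
    let wrapped = report!(parse_err);
    assert!(wrapped.downcast_ref::<std::num::ParseIntError>().is_some());

    let code = 7;
    let adhoc = report!(format!("failed: {code}"));
    assert_eq!(adhoc.to_string(), "failed: 7");
    assert!(adhoc.downcast_ref::<std::num::ParseIntError>().is_none());
}
```

Unlike the Display-or-fallback example, both impls here carry bounds, so an argument that is neither an error nor printable is rejected at compile time instead of silently taking a catch-all branch.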

Where we used it (1): absorbing the four block shapes of a caching macro

We built a macro that stores results in a persistent cache. The user wants to express “this computation is expensive, so cache it and pull it from the cache next time” in a single line. Set the incidental arguments (key, dependencies, and so on) aside for a moment; the core looks like this.

let x = cached!(..., { compute() });

The problem is that the shape of the block the user puts inside the braces isn’t just one thing. There are two independent axes.

  • Sync / async: it might be a block that simply computes a value, or one that uses .await inside and ultimately produces a Future.
  • Infallible / fallible: it might just return a value, or it might return a Result<T, E> so it can use ? partway through.

Multiply the two axes and you get four block shapes: sync-infallible, sync-fallible, async-infallible, async-fallible. We didn’t want to split this into four macros, or tell the user “this is the async version, use a different macro.” The user should just write the block however feels natural to them, and the macro should adapt.

So we absorbed both axes with autoref specialization. We wrap the block’s evaluated result, first figuring out “is this still a Future that needs awaiting, or an already-ready value” and unrolling it with await if needed, then figuring out “is this a Result<T, E> or a plain T” and handling it accordingly.

// Normalize the block's result: await it if it's still a Future, unwrap it (like `?`)
// if it's a Result, so every input shape merges into one "value + maybe-error" path.
let wrapper = Wrapper::new(&raw);
let tag = (&wrapper).result_kind();        // TagResult or TagItem
let normalized = tag.execute_result(raw);  // Result -> map error into the cache's error type; value -> wrap in Ok

If it’s identified as a Result, we convert the user’s error into the cache library’s error type and pass it along; if it’s an ordinary value, we wrap it in Ok and merge it onto the same path. The Future axis works the same way. All four input shapes converge onto a single shared one-liner inside the macro, and the user never has to care whether they used ?, .await, both, or neither inside the block.

What’s fun here is that we composed two specialization axes. Autoref specialization splits on only one axis at a time, but by passing tag types through stage by stage, you can apply several axes in sequence and cover the entire combinatorial space.
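Here is a rough sketch of that composition, under a few assumptions: the macro always expands to an async block, CacheError stands in for the cache library’s error type, and the names normalize!, future_kind, settle, FutureKind, ReadyKind, TagFuture, TagReady, and block_on are all made up for this example (result_kind, execute_result, TagResult, and TagItem follow the snippet above). The real macro also deals with keys and dependencies, which are left out here.

```rust
use std::fmt::Display;
use std::future::{ready, Future, Ready};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the cache library's error type.
#[derive(Debug, PartialEq)]
struct CacheError(String);

// Borrows the block's result just long enough to pick a tag.
#[allow(dead_code)]
struct Wrapper<'a, T>(&'a T);

// Axis 1: still a Future, or an already-ready value?
struct TagFuture;
struct TagReady;
trait FutureKind {
    fn future_kind(&self) -> TagFuture { TagFuture }
}
impl<F: Future> FutureKind for Wrapper<'_, F> {}
trait ReadyKind {
    fn future_kind(&self) -> TagReady { TagReady }
}
impl<T> ReadyKind for &Wrapper<'_, T> {}
impl TagFuture {
    fn settle<F: Future>(self, raw: F) -> F { raw }
}
impl TagReady {
    fn settle<T>(self, raw: T) -> Ready<T> { ready(raw) }
}

// Axis 2: a Result<T, E>, or a plain T?
struct TagResult;
struct TagItem;
trait ResultKind {
    fn result_kind(&self) -> TagResult { TagResult }
}
impl<T, E> ResultKind for Wrapper<'_, Result<T, E>> {}
trait ItemKind {
    fn result_kind(&self) -> TagItem { TagItem }
}
impl<T> ItemKind for &Wrapper<'_, T> {}
impl TagResult {
    fn execute_result<T, E: Display>(self, raw: Result<T, E>) -> Result<T, CacheError> {
        raw.map_err(|e| CacheError(e.to_string()))
    }
}
impl TagItem {
    fn execute_result<T>(self, raw: T) -> Result<T, CacheError> { Ok(raw) }
}

// Apply the axes in sequence: settle the Future axis first, then the Result axis.
macro_rules! normalize {
    ($block:block) => {
        async {
            let raw = $block;
            let tag = (&Wrapper(&raw)).future_kind();
            let settled = tag.settle(raw).await;
            let tag = (&Wrapper(&settled)).result_kind();
            tag.execute_result(settled)
        }
    };
}

// Tiny busy-polling executor so the sketch runs without an async runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    // sync-infallible, sync-fallible, async-infallible, async-fallible
    assert_eq!(block_on(normalize!({ 6_i32 * 7 })), Ok(42));
    assert_eq!(block_on(normalize!({ "42".parse::<i32>() })), Ok(42));
    assert_eq!(block_on(normalize!({ async { 6_i32 * 7 } })), Ok(42));
    assert_eq!(block_on(normalize!({ async { "42".parse::<i32>() } })), Ok(42));
    // A failing block surfaces as the cache's error type.
    assert!(block_on(normalize!({ "x".parse::<i32>() })).is_err());
}
```

The order matters: the Result axis is resolved on the settled value, so an async block that yields a Result passes through both stages without either stage knowing about the other.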

Where we used it (2): molding a TOML document into a Rust struct

The second case is the bigger one. The job: take a TOML document and generate code that builds a strongly-typed Rust struct out of it. The config values don’t stay as a dynamic Value tree — they get assembled into the actual struct the program declared.

Here’s what makes it tricky. The target is a fixed struct, with a definite type and definite fields. But it’s unknown to the code generator: it’s whatever struct the user pointed us at, and the generator only ever sees the TOML, never the type definition. So the generator just walks the TOML tree and emits construction code that mirrors its structure.

The snag is a pattern in nearly every Rust config struct: some fields are Option<T>, some are plain T.

struct ServerConfig {
    name: String,
    timeout: Option<u32>,
    // ...
}

In the TOML, name = "x" and timeout = 30 look identical — a key with a scalar value. But the generated code has to emit name: "x".into() for one and timeout: Some(30) for the other, and the generator can’t tell them apart: it doesn’t know the field types.

Settling this at codegen time would mean threading the struct definition through the generator and emitting different code per field — and every other shape it has to handle (nested structs, vectors, enums, …) multiplies that branching until the generator is a monster you’re afraid to touch.

So we did the opposite. The generator emits one shape of code for every field, and the Option-or-not decision is pushed entirely onto the type system. The generated code rests on just three pieces, none of which inspect a field’s type:

  • set — write a value into a field. If the field is Option<U>, wrap the value in Some; if it’s plain U, assign directly.
  • peel — hand back a &mut into a field so the recursion can drill deeper. If the field is Option<U>, fill it with a default and return &mut U of the inside; if it’s plain U, just return &mut U.
  • navigate! — a helper macro that expands into a chain of peels, one per step of the path, walking down to the target slot and transparently stripping any Option layer it crosses along the way.

With those, the generated code is always the same shape, whatever the field:

// The generator never looks at the field's type. Always this:
navigate!(root.field).set(value);

set (and peel likewise) splits via autoref specialization:

// navigate!(...) hands back a handle pointing at one field.
struct Slot<'a, T>(&'a mut T);
 
// High priority: the field is Option<U> -> wrap the value in Some.
trait SetOpt<U> {
    fn set(&mut self, value: U);
}
impl<U> SetOpt<U> for Slot<'_, Option<U>> {
    fn set(&mut self, value: U) { *self.0 = Some(value); }
}
 
// Low priority (one extra autoref): the field is plain U -> assign directly.
trait SetPlain<U> {
    fn set(&mut self, value: U);
}
impl<U> SetPlain<U> for &mut Slot<'_, U> {
    fn set(&mut self, value: U) { *self.0 = value; }
}
 
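// One call shape, both field types:
fn main() {
    let mut optional: Option<i32> = None;
    let mut plain: i32 = 0;

    (&mut Slot(&mut optional)).set(10);
    (&mut Slot(&mut plain)).set(10);

    assert_eq!(optional, Some(10)); // field: Option<i32>  =>  Some(10)
    assert_eq!(plain, 10);          // field: i32          =>  10
}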

A Slot<'_, Option<U>> matches the high-priority SetOpt impl with no extra autoref, so it wins; a plain Slot<'_, U> finds nothing there and falls through to SetPlain on &mut Slot. peel is built the same way — the Option<U> impl does a get_or_insert_with(U::default) before handing back the inner reference, the plain U impl returns the reference as-is — so the recursion can step into a nested field without ever knowing whether it had to cross an Option to get there.
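As a sketch of how peel and navigate! could fit together with set: the PeelOpt and PeelPlain traits, the @walk helper rules, and the ServerConfig and Tls structs below are assumptions for illustration, not our generator’s actual output. The macro keeps each path inside one expression so that every temporary Slot lives until the final set call has run.

```rust
struct Slot<'a, T>(&'a mut T);

// set: wrap in Some for Option<U> fields, assign directly otherwise.
trait SetOpt<U> {
    fn set(&mut self, value: U);
}
impl<U> SetOpt<U> for Slot<'_, Option<U>> {
    fn set(&mut self, value: U) { *self.0 = Some(value); }
}
trait SetPlain<U> {
    fn set(&mut self, value: U);
}
impl<U> SetPlain<U> for &mut Slot<'_, U> {
    fn set(&mut self, value: U) { *self.0 = value; }
}

// peel, high priority: Option<U> field, fill with a default and step inside.
trait PeelOpt<U> {
    fn peel(&mut self) -> &mut U;
}
impl<U: Default> PeelOpt<U> for Slot<'_, Option<U>> {
    fn peel(&mut self) -> &mut U { self.0.get_or_insert_with(U::default) }
}
// peel, low priority (one extra autoref): plain U field, hand it back as is.
trait PeelPlain<U> {
    fn peel(&mut self) -> &mut U;
}
impl<U> PeelPlain<U> for &mut Slot<'_, U> {
    fn peel(&mut self) -> &mut U { &mut *self.0 }
}

// Hypothetical navigate!: one peel per intermediate step, a Slot on the last field.
macro_rules! navigate {
    (@walk ($cur:expr) $last:ident) => {
        (&mut Slot(&mut $cur.$last))
    };
    (@walk ($cur:expr) $head:ident . $($tail:ident).+) => {
        navigate!(@walk ((&mut Slot(&mut $cur.$head)).peel()) $($tail).+)
    };
    ($root:ident . $($path:ident).+) => {
        navigate!(@walk ($root) $($path).+)
    };
}

// Hypothetical target structs; the generated code never mentions their field types.
#[derive(Debug, Default, PartialEq)]
struct Tls {
    port: u16,
    cert: Option<String>,
}
#[derive(Debug, Default, PartialEq)]
struct ServerConfig {
    name: String,
    timeout: Option<u32>,
    tls: Option<Tls>,
}

fn main() {
    let mut cfg = ServerConfig::default();

    // What the generator might emit: the same shape on every line.
    navigate!(cfg.name).set(String::from("x"));
    navigate!(cfg.timeout).set(30);
    navigate!(cfg.tls.port).set(8443);
    navigate!(cfg.tls.cert).set(String::from("cert.pem"));

    assert_eq!(cfg.name, "x");
    assert_eq!(cfg.timeout, Some(30));
    assert_eq!(
        cfg.tls,
        Some(Tls { port: 8443, cert: Some(String::from("cert.pem")) })
    );
}
```

Note that crossing cfg.tls required Tls: Default here, since the Option branch of peel has to create the inner value before it can hand back a reference to it.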

The payoff: the generator stays a dumb recursion that only sees TOML structure, and every “is this field optional?” question is answered at compile time from the type at the call site.

Why go to this trouble when serde already turns TOML into structs? Because serde does it at runtime: every time the program starts, it parses the document and matches the parsed values against the target type. If the config doesn’t fit, you find out then, as a startup crash. Our generator does all of that at build time and emits a plain construction expression, for example ServerConfig { name: "x".into(), timeout: Some(30) }. A mistyped value or a missing field stops being a runtime failure and becomes a compile error. The generator doesn’t have to know which fields are Option and which aren’t. Autoref specialization settles that at each call site, at compile time, from the type alone.

To sum up

Where autoref specialization shines is, ultimately, when you have to absorb a diversity of types at an API boundary but don’t want to push that diversity onto the user (or the code generator).

  • When a macro has to take a block the same way no matter which combination of sync/async, infallible/fallible it is
  • When you want a code generator to emit one uniform kind of code without knowing whether the target type is Option-wrapped, needs a conversion, or has enum structure, and to let the type system fill in the rest
  • When an argument might be an error type, or might just be a message (anyhow)

The common threads: it (1) works on the stable channel, (2) resolves at compile time so there’s no runtime cost, and (3) lets you compose multiple axes via tag types to cover the entire combinatorial space. And above all, because it soaks up that diversity, the outer interface gets simpler. The user doesn’t have to distinguish four block shapes, and the code generator only has to emit one kind of code.

Until nightly specialization stabilizes, this method-resolution trick looks like it’ll stay useful for a good while yet. Thanks to dtolnay, whose case-studies repo is a treasure trove of “wait, you can do that in stable Rust?” moments.