Converting Markdown to PDF Inside a Binary

Suyoung Hwang · 2026-05-15

At some point every service ends up needing to generate PDFs on the fly. People just want things as documents. That’s apparently the protocol now.

We had the same requirement, but with a twist: not a fixed-format report, but a dynamic PDF generated from LLM-produced Markdown, integrated into an MCP server.

When you look for technical solutions to this, you generally land on one of two paths:

The TeX route: Pandoc with a TeX backend (Markdown → TeX → PDF)
The browser route: Playwright or similar (Markdown → HTML → PDF)

Outside of scientific or math-heavy documents where TeX genuinely shines, the browser-based approach seems to be where most people land these days, and it’s easy to see why. CSS gives you a lot of customization for free. But both approaches share the same core problems: you’re shipping a huge binary into your production image (Chromium, or TeX Live), you need a process pool or a message queue to make it work reliably at scale, and performance characteristics are unpredictable enough that you end up tuning it as its own separate concern.

So you’re stuck. What if there were a third option?

Litho

Litho is a Markdown-to-PDF converter that uses neither TeX nor a browser. Instead it goes through Typst: Markdown → Typst → PDF.

And the headline number is kind of absurd: a single function call, ~2ms per document.

let markdown_text: String = std::fs::read_to_string("document.md")?;
let pdf_bytes: Vec<u8> = litho::markdown_to_pdf(markdown_text)?;  // 2ms
std::fs::write("converted.pdf", pdf_bytes)?;

Typst is a typesetting compiler written in Rust. The organization has it fully open-sourced, and it’s published on crates.io. From a programming standpoint, the interesting part is that the Typst crate exposes a World trait that lets you embed the compiler into your own binary, which means third-party crates like Litho can customize the compilation pipeline without forking anything.

At runtime, litho::markdown_to_pdf does two things in sequence:

Markdown → Typst: parses the input Markdown and produces a .typ source file
Typst → PDF: hands that source file to the Typst compiler and gets bytes back

Most of the interesting engineering happened in building those two stages, and that’s what the rest of this post is about.

From Regex to an AST

The first attempt at the Markdown-to-Typst conversion was regex-based. Match headers, then bold/italic, then code blocks, then lists, tables, links, footnotes, in some order, and transform each one.

Problems showed up quickly. The classic example: pipe characters inside a Python code block getting picked up as table delimiters.

Take a look at this:
 
```python
row = "|name|age|"
cells = row.split("|")
```
 
| input | output |
|-------|--------|
| ...   | ...    |

I fixed bugs by adjusting processing order and adding tests, but every fix broke something else. There was always another edge case lurking.

The next attempt introduced placeholders as a kind of ad-hoc state management: replace protected content with special tokens like <<CODEBLOCK0>>, run the transformations, then restore. It worked, but every new Markdown feature you added required thinking through its interactions with all the existing ones. With $N$ features, you’re looking at $O(N^2)$ failure modes to keep in your head. Nested blockquotes, lists inside blockquotes with math blocks inside those, and on and on. Things that seem simple in Markdown are genuinely nasty to handle with string substitution.

Building this was not fun, and it increasingly felt like I was in the process of reimplementing an AST parser from scratch, badly. So I switched to pulldown-cmark, a proper Markdown AST parser.

pulldown-cmark processes Markdown source and emits a stream of typed events: Start(Heading), Text("hello"), End(Heading), and so on. Having an event stream means you need something state-machine-like to consume it, so Litho now maintains a struct that tracks the relevant state throughout a conversion:

struct ConversionState {
    list_stack: Vec<bool>,      // ordered vs. unordered at each depth
    in_table_cell: bool,
    in_code_block: bool,
    table_rows: Vec<Vec<String>>,
    emitted_footnotes: HashSet<String>,
    // ...
}

Tables were one of the clearest wins from this approach. Typst’s #table() requires the column count to be declared upfront. In the regex version I was counting pipe characters line-by-line and hand-rolling inline code exceptions. With the event stream, it’s almost trivial: watch for the table start event, accumulate rows in table_rows as cell events come in, and flush everything to Typst source when the table ends. What was a nightmare becomes just bookkeeping.

Footnotes had their own interesting constraint. Typst footnotes work like this: first use requires #footnote[content] <label> to declare the label and content together, and subsequent uses just reference back with @label.

Markdown:  Text with a citation[^1] and reused[^1].
           [^1]: See Knuth, 1984.
 
Typst:     Text with a citation#footnote[See Knuth, 1984.] <1> and reused@1.

The implementation collects all [^label]: content definitions upfront into a HashMap. Then during conversion, the first FootnoteReference event for a given label emits #footnote[content] <label>, and every subsequent one emits @label. Whether a footnote has been emitted yet is tracked in emitted_footnotes.

AST and Regex, Together

pulldown-cmark isn’t a complete solution on its own. It does support an ENABLE_MATH flag that parses $...$ into math events, but it doesn’t handle LaTeX conventions like \[...\] or $...$. We disabled ENABLE_MATH and went back to regex for math specifically.

Concretely: we placeholder-protect inline and block math expressions before passing to the AST parser, then post-process them. Because math only really interferes with code blocks and blockquotes, the $O(N^2)$ blowup from before doesn’t resurface. The math-to-Typst conversion itself is handled by MiTeX, a Typst package that provides #mi and #mimath functions for rendering TeX-syntax math in Typst.

We considered converting the LaTeX convention syntax to dollar-sign syntax first, then feeding it to pulldown-cmark, but making that reliable required understanding and matching pulldown-cmark’s internal parsing behavior. The work was roughly the same either way, and doing it all in regex/placeholder was just simpler to reason about.

Bundling Everything into the Binary

Rust’s build-time metaprogramming is genuinely useful here. The include_bytes! family of macros lets you embed arbitrary files into your binary at compile time, where they show up as &'static [u8] slices. Litho uses this to bundle 13 fonts and the MiTeX package directly:

static FONTS: [&[u8]; 13] = [
    include_bytes!(concat!(env!("CARGO_MANIFEST_DIR"), "/fonts/DejaVuSansMono.ttf")),
    include_bytes!(concat!(env!("CARGO_MANIFEST_DIR"), "/fonts/LibertinusSerif-Regular.ttf")),
    // ... 11 more
];
 
static MITEX_TARBALL: &[u8] = include_bytes!(concat!(
    env!("CARGO_MANIFEST_DIR"),
    "/packages/mitex-0.2.6.tar.gz"
));

The paths are constructed with env! and concat!, and include_bytes! pulls the file contents in as &'static [u8; N]. These end up in .rodata on ELF and __TEXT/__const on Mach-O. This adds a few MB to the binary, and LTO won’t strip it since the data is actually referenced.

For MiTeX, a OnceLock handles the first-access decompression: the tarball is extracted into a HashMap<FileId, Vec<u8>> on the first call and shared from then on. Since it never changes after initialization, there’s no locking needed after that point. Typst doesn’t have a dependency manifest like typst.toml; packages are imported implicitly from source. MiTeX v0.2.6 has no further dependencies (earlier versions depended on xarrow), so the bundle stays self-contained.

When a Library Has Outlived Its Usefulness

The early version of Litho used typst-as-lib, a crate that wraps the Typst compiler in a friendlier interface. Even after exhausting the optimizations available without looking inside it, single-threaded compilation was around 15ms. Worse, adding threads made things slower in a roughly linear way.

Performance degrading linearly with thread count is a strong sign of a critical section somewhere. Profiling all the way down into the library’s internals pointed at a struct called CachedFileResolver. Every call to source() or file() acquired a mutex, then hashed a FileId to look up a value inside a Mutex<HashMap>. The Typst compiler calls these methods many times per compilation, and the mutex contention accumulated into a serious bottleneck.

The fix was to implement Typst’s World trait directly. World is the interface the compiler uses to talk to the outside world: the standard library, source files, fonts, file data, the current date. Implement that, and you control everything. It sounded daunting but turned out to be straightforward. There are five required methods, most of them read-only lookups, and the whole implementation ended up around 80 lines:

#[comemo::track]
impl typst::World for LithoWorld {
    fn source(&self, id: FileId) -> FileResult<Source> {
        if id == self.main.id() { return Ok(self.main.clone()); }
        let bytes = self.resolver.resolve_bytes(id)?;
        let text = std::str::from_utf8(&bytes)...;
        Ok(Source::new(id, text.into()))
    }
 
    fn file(&self, id: FileId) -> FileResult<Bytes> {
        self.resolver.resolve_bytes(id).map(Bytes::new)
    }
    // ...
}

With the mutex gone, package lookups go through an Arc<HashMap> initialized once at startup. Fonts and the standard library are handled via OnceLock and shared across all compilations. Single-threaded compile time dropped from 15ms to 2ms.

There’s a general lesson here that I think applies more broadly: most libraries exist to let you focus on your actual problem. When a library starts getting in the way of that, it’s probably run its course. In 2026, with LLM assistance, the cost of just implementing the thing yourself has come down a lot. I’d reach for that option sooner next time.

Discarding Compiler Errors

There are two places in Litho where errors can occur: Typst compilation and PDF rendering. The Markdown parser and converter are fail-safe.

The Typst compiler does return diagnostics, but surfacing them directly would be confusing. An error there could be from the input, or it could be a bug in Litho’s conversion logic. Exposing Typst source-level error messages to callers of litho::markdown_to_pdf would tell them something about the internals they probably don’t care about and likely couldn’t act on. Instead, Litho logs diagnostics at error or warn level via tracing.

So the public error type ends up looking like this:

pub enum Error {
    Compile,
    Pdf,
}

If something goes wrong and you need to debug it, markdown_to_typst and typst_to_pdf are exposed separately so you can inspect the intermediate Typst source directly.

Benchmarks

Before the numbers: Litho does a subset of what Pandoc does. No citation processing, no template expansion, nowhere near the full LaTeX feature set. So this isn’t a fair apples-to-apples comparison, and you should read it with that in mind.

The test document is a 653-byte Markdown file containing math, tables, code blocks, and footnotes. Measurements on Apple M4 Pro, release build.

Single-threaded performance (10 runs):

Engine	Mean	Median	Min	Max
Litho	2.14ms	2.10ms	1.90ms	2.75ms
Chromium	2.27s	2.26s	2.25s	2.33s
pandoc + pdflatex	884.85ms	844.71ms	826.12ms	1.18s
pandoc + tectonic	1.06s	1.05s	1.04s	1.10s

Two to three orders of magnitude faster. Again, not the same workload, so take it with a grain of salt. But it’s enough to call this a real third option.

Concurrency:

Because all shared state (fonts, package files, standard library) is read-only after initialization, Litho should scale linearly with core count. It does.

Threads	Wall time	Per-thread mean	Min	Max
1	2.39ms	2.36ms	2.33ms	2.38ms
2	2.66ms	2.62ms	2.40ms	2.84ms
4	3.21ms	3.08ms	2.68ms	3.49ms
8	4.96ms	4.75ms	3.72ms	5.46ms
16	10.62ms	9.92ms	5.54ms	11.49ms

Wall time climbing into the 10ms range at 16 threads makes sense given the M4 Pro’s 14-core layout (10 performance + 4 efficiency). At that point you’re competing for CPU, not shared data.

Deployment:

Engine	Cold start	Size / dependencies
Litho	~61ms (tarball decompression)	47 MB self-contained
Chromium	>15s (Chromium download)	253 MB
pandoc + pdflatex	>15s (TeX Live install)	226 MB + TeX Live
pandoc + tectonic	>15s (TeX package fetch)	226 MB + tectonic

Tectonic takes a similar philosophical approach on the TeX side: self-contained engine, on-demand package fetching. But Pandoc still shells out to it as a subprocess, so you can’t get the in-process benefits that make Litho’s latency what it is.

What We Left Out

Litho doesn’t support images, figure/table captions, or citations.

Images would require fetching bytes, decoding them, and injecting them into the in-memory file resolver. You could expose a resolver interface for callers to provide, but general URL-based image loading would require runtime I/O, which cuts against the core design goal of doing everything in-process without network or filesystem access during compilation. Citations require a separate .bib or Hayagriva file; that’s just how Typst is designed.

Captions are a different kind of problem. Markdown doesn’t have a standard syntax for attaching a caption to a figure or table, so supporting it would mean inventing a syntax extension or adopting an existing one and threading it through the entire converter. That’s non-trivial.

None of these are technically impossible. They’re all tradeoffs where you give up API simplicity in exchange for scope, and so far that tradeoff hasn’t seemed worth it.

Litho is currently running in production as an internal library on our server. If you’re solving a similar problem, hopefully this writeup saves you some time.