vmx

the blllog.

Floats on ATProto

2026-05-28 12:51

ATProto currently does not support floats in its data model. I’ve created a proof of concept for floats on ATProto that works across the full stack: PDS, Relay and Jetstream.

When I say floats, I mean the IEEE 754 double-precision binary floating-point format. This blog post explains some of the details and the reasoning behind why things are done in a specific way. All this is really a proof of concept; floats could also be implemented differently. The following is based on my experience working on IPLD and JSON-based databases/indexes like Apache CouchDB and Noise.

Why

ATProto has gone four years without floats. Why would you need them now?

ATProto is more than just micro-blogging. The science community is looking into ways to make good use of it, and floats are already used heavily in that space. Not supporting them at the protocol level hinders adoption and calls for workarounds at the application level that are future interoperability headaches in the making.

In the widely used GeoJSON, geometries are represented as nested arrays of floats. ATGeo is currently using the “floats as strings” workaround, but they would be more than happy to use proper floats. Tom MacWright came up with a hack to support floats on ATProto for his use case.

Emily Hunt mentions in her [Nebra] talk at the CID Congress that float support would be great for posting astronomy data to the AT Protocol, such as the NASA General Coordinates Network events in her eco.astrosky.transient.gcn Lexicon. As a workaround, she is storing the whole JSON as a string. For tools like Exhibit, floating-point support is essential for IIIF annotations. These annotations use spatial coordinates to target specific regions of a canvas, which rarely map cleanly to whole-pixel boundaries.

I think the future is now and I believe I found a way to “ensure reliable round-trip encoding of floats”.

How

There are two distinct cases to consider: parsing JSON without any further context, and generating JSON from native types using Lexicons.

JSON without context

JSON has only a single numeric type; there’s no distinction between floats and integers. That’s a problem when you receive JSON without any further context. Programming languages with distinct types usually solve this by treating numbers that consist of digits only (with an optional sign) as integers, and everything else as floats. For more details on how to do this in JavaScript, see my previous blog post about JSON, CBOR and numeric types.
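The digits-only rule can be sketched as follows. This is a minimal illustration, not code from the proof of concept. `classifyJsonNumber` is a hypothetical name, and it works on the raw number token because `JSON.parse` on its own discards the distinction.

```typescript
type NumberKind = "integer" | "float";

// Classify a raw JSON number token by its textual form only.
// Hypothetical helper for illustration; it assumes `token` is already a
// syntactically valid JSON number.
function classifyJsonNumber(token: string): NumberKind {
  // Digits with an optional leading minus sign: integer.
  // Anything with a fraction or an exponent: float.
  return /^-?\d+$/.test(token) ? "integer" : "float";
}

// JSON.parse cannot make this distinction: both tokens become the same JS number.
const sameValue: boolean = JSON.parse("42") === JSON.parse("42.0"); // true

const kinds: NumberKind[] = ["42", "-7", "42.0", "23e2"].map(classifyJsonNumber);
// kinds is ["integer", "integer", "float", "float"]
```

Note that `23e2` has an integral value (2300) but is classified as a float, because only the textual form is inspected.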

JS, Lexicons and floats

The second case is generating JSON from native types of a programming language. When creating new records on ATProto, we have the luxury of even more type information than the underlying language might support, thanks to Lexicon schemas. The following sections focus on JS, as that’s what matters for the reference PDS implementation.

Using Lexicons, we can distinguish between integers and floats even within JS. When generating the JSON version of a record within the SDK, we can ensure that numbers are formatted according to their intended type. In our case that means a floating-point number like 42.0 is actually encoded as such, and not simply as 42.
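A minimal sketch of schema-guided number formatting is shown below. It assumes a heavily simplified schema shape (a map from field name to `"integer"` or `"float"`), which is not the real Lexicon format, and `formatFloat`, `formatInteger` and `encodeRecord` are hypothetical names rather than SDK functions. A custom serializer is needed because `JSON.stringify(42.0)` yields `42`.

```typescript
// Simplified stand-in for a Lexicon: field name -> intended numeric type.
type SimpleSchema = Record<string, "integer" | "float">;

// Format a float so that its JSON text always reads as a float.
function formatFloat(value: number): string {
  if (!Number.isFinite(value)) {
    throw new RangeError("JSON cannot represent NaN or Infinity");
  }
  if (Object.is(value, -0)) return "-0.0";
  const text = String(value);
  // Whole numbers print as digits only, so append ".0". Values such as
  // 1e21 already print with an exponent and need no change.
  return /^-?\d+$/.test(text) ? text + ".0" : text;
}

function formatInteger(value: number): string {
  if (!Number.isSafeInteger(value)) {
    throw new RangeError(`${value} is not a safe integer`);
  }
  return String(value);
}

// Encode a flat record of numbers, using the schema to pick the format.
// Fields missing from the schema are treated as integers in this sketch.
function encodeRecord(record: Record<string, number>, schema: SimpleSchema): string {
  const members = Object.keys(record).map((key) => {
    const value = record[key];
    const text = schema[key] === "float" ? formatFloat(value) : formatInteger(value);
    return `${JSON.stringify(key)}:${text}`;
  });
  return `{${members.join(",")}}`;
}

const json: string = encodeRecord(
  { count: 42, ratio: 42.0 },
  { count: "integer", ratio: "float" },
);
// json is {"count":42,"ratio":42.0}
```

In JS both fields hold the same value, so only the schema decides which one keeps its decimal point.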

No changes are needed to the public API; everything happens at a lower level. client.create() and client.put() already take a Lexicon as a parameter. The new behaviour is that this information will now always be used when generating JSON. An escape hatch via client.createRecord() and client.putRecord() remains available for cases where you need to work around this.

JS and unexpected fields

The current Lexicon specification allows for unexpected fields. Those are fields not specified in the Lexicon. They are currently ignored, but can be used for cases where user input adds arbitrary data.

The simplest approach with the least surprise is to check whether a numeric value can be represented as an integer without any loss and, if so, treat it as one. Otherwise, consider it a float. This has one minor edge case: if you update a record, the underlying CBOR type could change. If you first have a value of 42.3 and later change it to 42.0, the type shifts from float to integer. In practice I don’t think that’s a real problem; it’s more of a theoretical one. If you have hard constraints, such as a schema, you should make them part of the Lexicon.
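The edge case is easier to see at the byte level. The sketch below applies the lossless-integer check and then encodes the value following the standard CBOR rules (RFC 8949): unsigned integers use major type 0, and doubles use the initial byte 0xfb followed by eight big-endian bytes. `inferKind` and `encodeNumber` are hypothetical names, using `Number.isSafeInteger` as the check is my assumption, and the encoder only covers non-negative integers below 2^32 to stay short.

```typescript
// Treat a number as an integer when it has no fractional part and is
// exactly representable in JS (assumption: the safe-integer range).
function inferKind(value: number): "integer" | "float" {
  return Number.isSafeInteger(value) ? "integer" : "float";
}

function encodeNumber(value: number): Uint8Array {
  if (inferKind(value) === "float") {
    const bytes = new Uint8Array(9);
    bytes[0] = 0xfb; // major type 7, additional info 27: 64-bit float
    new DataView(bytes.buffer).setFloat64(1, value); // big-endian by default
    return bytes;
  }
  if (value < 0 || value >= 0x100000000) {
    throw new RangeError("sketch only encodes integers in [0, 2^32)");
  }
  // Major type 0: the argument is stored in the shortest form that fits.
  if (value < 24) return new Uint8Array([value]);
  if (value < 0x100) return new Uint8Array([0x18, value]);
  if (value < 0x10000) return new Uint8Array([0x19, value >>> 8, value & 0xff]);
  return new Uint8Array([
    0x1a,
    value >>> 24,
    (value >>> 16) & 0xff,
    (value >>> 8) & 0xff,
    value & 0xff,
  ]);
}

const before: Uint8Array = encodeNumber(42.3); // 9 bytes, starting with 0xfb
const after: Uint8Array = encodeNumber(42.0); // 2 bytes: 0x18 0x2a
```

The same field therefore switches from a 9-byte float to a 2-byte integer between the two versions of the record.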

This approach may not always be sufficient. For example, if you have a key-value map where values should always be floats but you don’t know the exact keys, that currently cannot be expressed in a Lexicon. The proof of concept handles this by wrapping such values in a distinct type called LexFloat(). This ensures the number always has a decimal point in its JSON representation, even when it could be represented as an integer. Similarly, LexInteger() can be used to enforce an integer.
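One way such wrappers could work is sketched below. This is not the proof of concept’s implementation of LexFloat() or LexInteger(). The classes `FloatValue` and `IntegerValue` and the functions `encodeValue` and `encodeMap` are hypothetical stand-ins that only show the idea of carrying the intended type alongside the number.

```typescript
// Hypothetical wrappers that tag a number with its intended type.
// The `kind` field keeps the two classes structurally distinct for TypeScript.
class FloatValue {
  readonly kind = "float";
  readonly value: number;
  constructor(value: number) {
    this.value = value;
  }
}

class IntegerValue {
  readonly kind = "integer";
  readonly value: number;
  constructor(value: number) {
    if (!Number.isSafeInteger(value)) {
      throw new RangeError(`${value} is not a safe integer`);
    }
    this.value = value;
  }
}

type NumericValue = number | FloatValue | IntegerValue;

function encodeValue(input: NumericValue): string {
  const value = typeof input === "number" ? input : input.value;
  if (!Number.isFinite(value)) {
    throw new RangeError("JSON cannot represent NaN or Infinity");
  }
  const text = String(value);
  // Only an explicit float wrapper forces a decimal point onto whole numbers.
  const forceFloat = typeof input !== "number" && input.kind === "float";
  return forceFloat && /^-?\d+$/.test(text) ? text + ".0" : text;
}

// Encode a key-value map whose keys are not known in advance.
function encodeMap(map: Record<string, NumericValue>): string {
  const members = Object.keys(map).map(
    (key) => `${JSON.stringify(key)}:${encodeValue(map[key])}`,
  );
  return `{${members.join(",")}}`;
}

const readings: string = encodeMap({
  a: new FloatValue(1),
  b: new FloatValue(2.5),
  c: 3,
});
// readings is {"a":1.0,"b":2.5,"c":3}
```

The unwrapped value `c` falls back to the lossless-integer rule, while `a` keeps its decimal point even though it is a whole number.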

The proof of concept

The proof of concept for floats on ATProto does not only patch the PDS; it covers the full stack, from using the JS/TypeScript SDK to create records, to posting them on a local PDS, to observing them on the Relay/Jetstream. The README of the repo describes how to run everything along with the exact commands, so I won’t repeat that here.

I’ll just outline what to expect, so you can decide whether it’s worth trying yourself. The repo checks out the atproto, indigo and jetstream repos at the commits that happened to be current when I started the implementation. The patches directory contains the actual implementation of float support. The main changes are:

  • JS/TypeScript SDK: Make record creation schema-aware
  • PDS/Relay: Make them aware of the integer/float distinction in JSON
  • Jetstream: Use the patched indigo code from the Relay

There are three examples in the repos/atproto/examples directory (once patched). One demonstrates float encoding without running any service. The other two are meant to be used together when a local PDS and Relay are running. They insert data into the PDS and display the pretty-printed CBOR the Relay received.

Behavioural changes

Introducing floats obviously changes some parts of ATProto. The good news is that the public API of the JS/TypeScript SDK doesn’t need to change. Only additional helpers for special cases are added (like LexFloat()). When a record contains a number that cannot be represented as an integer, the SDK no longer throws an error.

I’ve added range support for floats to the Lexicon to mirror what integers already support, though I’m not sure that’s really necessary.
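Lexicon integers can carry `minimum` and `maximum` constraints. Assuming a float definition mirrors those two names (an assumption on my side, the proof of concept may differ), a range check could look like the sketch below. `FloatDef` and `checkFloat` are hypothetical.

```typescript
// Hypothetical float definition, modelled on the integer constraints.
interface FloatDef {
  type: "float";
  minimum?: number;
  maximum?: number;
}

// Returns null when the value is valid, otherwise a short error message.
function checkFloat(value: number, def: FloatDef): string | null {
  if (!Number.isFinite(value)) return "value must be a finite number";
  if (def.minimum !== undefined && value < def.minimum) {
    return `value must be >= ${def.minimum}`;
  }
  if (def.maximum !== undefined && value > def.maximum) {
    return `value must be <= ${def.maximum}`;
  }
  return null;
}

const latitude: FloatDef = { type: "float", minimum: -90, maximum: 90 };
const inRange = checkFloat(48.137, latitude); // null
const outOfRange = checkFloat(91.5, latitude); // "value must be <= 90"
```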

One breaking change I do consider worthwhile is in the JSON representation. Currently a number like 23e2 is treated as an integer. I believe it should be treated as a float instead, which would align ATProto with how programming languages usually handle this.

Conclusion

This blog post introduces a proof of concept for floats on ATProto. It should be taken as such; it’s not a finished solution. I’m sure I’ve missed subtle things that people working on ATProto daily would catch. It’s a starting point that hopefully shows that adding floats isn’t as daunting as it sounds, and that we can still have a system that behaves deterministically and remains easy to reason about.

If you have a use case for floats, or you think this is all a terrible idea, please let me (and the rest of the world) know, either via Bluesky or on this ATProto community forum thread.

Categories: en, ATProto


By Volker Mische
