Why base 58? I've never even heard of base 58.

__sy__ · on May 11, 2022

from the RFC:

> Base58 is designed with a number of usability characteristics in mind that Base64 does not consider. First, similar looking letters are omitted such as 0 (zero), O (capital o), I (capital i) and l (lower case L). Doing so eliminates the possibility of a human being mistaking similar characters for the wrong character. Second, the non-alphanumeric characters + (plus), = (equals), and / (slash) are omitted to make it possible to use Base58 values in all modern file systems and URL schemes without the need for further system-specific encoding schemes. Third, by using only alphanumeric characters, easy double-click or double tap selection is possible in modern computer interfaces. Fourth, social messaging systems do not line break on alphanumeric strings making it easier to e-mail or message Base58 values when debugging systems.

jffry · on May 11, 2022

The linked README links a draft RFC [1], the introduction section of which contains an explanation.

[1] https://datatracker.ietf.org/doc/html/draft-msporny-base58#s...

NavinF · on May 11, 2022

Wow how did I never hear about this? I only read the first 2 paragraphs under Introduction, but it addresses all the qualms I have with base64 and its many variants.

teraflop · on May 11, 2022

The RFC doesn't mention an important drawback of base58, which is that the computational cost of encoding or decoding it is quadratic in the length of the input -- unlike base64, which is linear-time.

I assume the performance is acceptable when you're dealing with very small API tokens, but it would be totally unsuitable as a replacement for base64 in the context of, say, email attachments. Even using it for something like a PKI certificate would probably be asking for trouble.

dividuum · on May 11, 2022

I’m confused by this. Maybe I misunderstand the algorithm given in the draft linked above, but that doesn’t look O(n^2) at all. Intuitively it also doesn’t make sense as quadratic somehow implies that each character depends on all others (so for each of n chars you’d have do do something with all other n chars -> n^2)

oefrha · on May 11, 2022

It’s basically a naive algorithm of converting a number in base 256 (each byte is a digit) to base 58. The algorithm is super-linear and worst case quadratic due to all the carrying. Maybe average case quadratic too? Not sure.

teraflop · on May 11, 2022

Oops, I think my previous comment was inaccurate, and it's too late for me to edit it. It's only decoding that's quadratic, not encoding. Sorry for the confusion.

It's not really obvious (IMO) from the wording of the specification, but step 2 of the decoding algorithm is actually a nested loop.

dividuum · on May 11, 2022

Still not seeing it. As it's an encoding, not a compression, it cannot produce a large amount of output. At most I guess ceil(log2(256)/log2(58)). So encoding will produce a slightly larger output, while decoding will produce a slightly shorter output. Still all linear.

teraflop · on May 11, 2022

The output size is linear, but since each byte of output requires O(n) work to compute, the total time to decode an encoded string is O(n^2). Or to put it another way, the decoding throughput (in bytes per second) is inversely proportional to the total length of the string.

dividuum · on May 11, 2022

Odd. Certainly different to how I thought it works (equivalent to base64, just with divmod instead of bit-shifting). Thanks for fixing my intuition. Related: https://cs.stackexchange.com/q/21736

nick238 · on May 11, 2022

I usually just use base62 (ASCII letters/numbers). Visual inspection/comparison of a long random value seems unlikely, and when you get into 10+ random gibberish characters, you're just as likely to gloss over a transpositional error

seveibar · on May 11, 2022

Base62 is a great alternative! The libraries/alphabet around base58 are fairly common thanks to cryptocurrencies, but base62 libraries are generally available and easy to produce if needed.

UncleEntity · on May 11, 2022

Pretty sure that’s what they use for bitcoin addresses.