Yeah, fat pointers are definitely a viable approach, but a lot of the existing C...

matthewfcarlson · 2025-10-28T22:00:59 1761688859

You’re off by a few orders of magnitude. I’ll grant you that “what is the bootloader” becomes a very complex question. Even if you scope it to just “the code physically etched into the chip as the mask ROM” (secureROM), you’re talking hundreds of thousands of lines. If you’re talking about all the code that runs before the kernel starts executing, you’re talking hundreds of millions.

kragen · 2025-10-28T22:06:50 1761689210

No, I was only talking about the pre-existing C code that wasn't written for the bootloader, which therefore might have incompatibilities with fat pointers you had to hunt down and fix.

Also I'm really skeptical about your "hundreds of millions" number, even if we're talking about all the code that runs before the kernel starts. How do you figure? The entire Linux kernel doesn't contain a hundred million lines of code, and that includes all the drivers for network cards, SCSI controllers, and multiport serial boards that nobody's made in 30 years, plus ports to Alpha, HP PA-RISC, Loongson, Motorola 68000, and another half-dozen architectures. All of that contains maybe 30 million lines. glibc is half a million. Firefox 140.4.0esr is 33 million. You're saying that the bootloader is six times the size of Firefox?

Are you really suggesting that tens of gigabytes of source code are compiled into the bootloader? That would make the bootloader at least a few hundred megabytes of executable code, probably gigabytes, wouldn't it?

jacquesm · 2025-11-02T20:16:56 1762114616

Hundreds of millions is clearly nonsense, but Grub2, which is arguably a (second-stage) bootloader, is half a million lines. It's interesting that this is a bootloader, and not an operating system in its own right, and a nice example of software bloat.

1718627440 · 2025-10-30T08:46:19 1761813979

Why not assume it fits into a uintptr_t, which is the type for exactly this purpose?
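
To make the uintptr_t suggestion concrete, here is a minimal sketch of the round trip. The function names `roundtrip_ok` and `long_holds_pointer` are made up for illustration. Note that uintptr_t is an optional type in ISO C, and that a size comparison says nothing about out-of-band state such as the tag bits some fat-pointer designs carry.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trip a pointer through uintptr_t. ISO C guarantees that a valid
   void * converted to uintptr_t and back compares equal to the original. */
int roundtrip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer */
    void *back = (void *)bits;       /* integer -> pointer */
    return back == p;
}

/* Hypothetical check: is unsigned long at least as wide as a pointer?
   This holds on ILP32 and LP64, but not on LLP64 (64-bit Windows), and
   not on an ABI where pointers are wider than long. */
int long_holds_pointer(void)
{
    return sizeof(unsigned long) >= sizeof(void *);
}

/* Self-check showing how the two helpers are meant to be called. */
void demo_uintptr(void)
{
    int x = 0;
    assert(roundtrip_ok(&x));
    (void)long_holds_pointer();      /* result depends on the ABI */
}
```

Code written against uintptr_t keeps working when pointers grow; code written against unsigned long only works where `long_holds_pointer()` happens to return 1.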

ummonk · 2025-10-28T22:23:19 1761690199

Couldn’t one just make long bigger, then, so that it matches?

kragen · 2025-10-28T22:36:33 1761690993

Maybe so; I haven't tried. Probably a lot less code depends on unsigned long wrapping at 2⁶⁴ than used to depend on unsigned int wrapping at 2¹⁶, and we got past that. But stability standards were lower then. Any code that runs on both 32-bit and 64-bit LP64 systems can't be too dependent on the exact sizeof long, and sizeof long already isn't sizeof int the way it was on 32-bit platforms.
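
As an illustration of the kind of code that does depend on unsigned long wrapping at 2⁶⁴, consider a multiplicative hash. The sketch below uses 64-bit FNV-1a as the example; the function names are made up for illustration, and nothing in this thread says any particular codebase is written this way.

```c
#include <assert.h>
#include <stdint.h>

/* Width-dependent version: produces 64-bit FNV-1a only where unsigned long
   is exactly 64 bits, because it relies on the multiply wrapping at 2^64. */
unsigned long hash_implicit(const char *s)
{
    unsigned long h = 14695981039346656037UL;   /* FNV-1a 64-bit offset basis */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 1099511628211UL;                   /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Width-independent version: uint64_t wraps at 2^64 whatever long is. */
uint64_t fnv1a_64(const char *s)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);  /* same offset basis, in hex */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= UINT64_C(0x100000001b3);           /* same prime, in hex */
    }
    return h;
}

/* Self-check: the two agree exactly when unsigned long is 64 bits wide. */
void demo_hash(void)
{
    assert(sizeof(unsigned long) != 8 || hash_implicit("abc") == fnv1a_64("abc"));
}
```

If long were widened and its arithmetic widened with it, `hash_implicit` would silently start returning different values, while `fnv1a_64` would not change.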

ummonk · 2025-10-29T18:26:16 1761762376

I'd actually keep it still wrapping at 2^64, with the extra metadata not participating in arithmetic operations...

kbolino · 2025-10-29T13:57:55 1761746275

That seems worse.

For all the wrong code that assumes long can store a pointer, there's likely a lot more wrong code that assumes long can fit in 64 bits, especially when serializing it for I/O and other-language interop.

Also, 128-bit integers can't fit in standard registers on most architectures, and don't have the full suite of ALU operations either. So you're looking at some serious code bloat and slowdowns for code using longs.

You've also now got no "standard" C type (char, short, int, long, long long) which is the "native" size of the architecture. You could widen int too, but a lot of code also assumes an int can fit in 32 bits.
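
The serialization concern can be sketched as follows. Code that dumps a raw long to disk or the wire bakes in the current width; the hedged alternative below writes a fixed 8-byte little-endian encoding instead. The names `put_i64_le` and `get_i64_le` are hypothetical, not from any library mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fragile pattern (shown only as a comment): the on-disk size silently
   follows sizeof(long), so widening long changes the file format.
       fwrite(&some_long, sizeof some_long, 1, fp);                      */

/* Write v as exactly 8 little-endian bytes, independent of sizeof(long)
   and of host byte order. */
void put_i64_le(unsigned char out[8], int64_t v)
{
    uint64_t u = (uint64_t)v;
    for (size_t i = 0; i < 8; i++)
        out[i] = (unsigned char)(u >> (8 * i));
}

/* Read 8 little-endian bytes back. The final conversion to int64_t is
   implementation-defined for values above INT64_MAX before C23; every
   mainstream compiler treats it as two's-complement reinterpretation. */
int64_t get_i64_le(const unsigned char in[8])
{
    uint64_t u = 0;
    for (size_t i = 0; i < 8; i++)
        u |= (uint64_t)in[i] << (8 * i);
    return (int64_t)u;
}

/* Self-check: a value survives the encode/decode round trip. */
void demo_serialize(void)
{
    unsigned char buf[8];
    put_i64_le(buf, -2);
    assert(get_i64_le(buf) == -2);
}
```

Code already written this way would be unaffected by a wider long, as long as the values it stores still fit in 64 bits.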

ummonk · 2025-10-29T16:54:00 1761756840

No, it should only do arithmetic on the first 64 (or 32) bits. The extra metadata should be copied unchanged.
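
A rough model of that proposal in plain C, purely as a sketch: the struct `fat_long` and its helpers are hypothetical, the value half is unsigned here so that wraparound is well defined, and the rule for which operand's metadata survives an addition is an assumption, since the thread does not specify one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical widened long: 64 value bits plus 64 opaque metadata bits. */
typedef struct {
    uint64_t value;  /* the only part that takes part in arithmetic */
    uint64_t meta;   /* e.g. bounds or provenance; zero for plain numbers */
} fat_long;

/* A plain number carries no metadata. */
fat_long fat_from_u64(uint64_t v)
{
    fat_long r = { v, 0 };
    return r;
}

/* Add the value halves modulo 2^64 and copy metadata through unchanged.
   Assumption: keep the left operand's metadata if it has any, otherwise
   the right operand's, so both ptr + n and n + ptr keep the pointer's. */
fat_long fat_add(fat_long a, fat_long b)
{
    fat_long r;
    r.value = a.value + b.value;
    r.meta = a.meta != 0 ? a.meta : b.meta;
    return r;
}

/* Self-check: arithmetic wraps at 2^64 and never touches the metadata. */
void demo_fat_long(void)
{
    fat_long p = { UINT64_MAX, 0xABCD };
    fat_long r = fat_add(p, fat_from_u64(1));
    assert(r.value == 0 && r.meta == 0xABCD);
}
```

Even in this toy form, `sizeof(fat_long)` is 16 on typical ABIs, which is the part that remains visible to user code.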

kbolino · 2025-10-29T17:29:21 1761758961

Ok, I think I follow. You'd widen the type under the hood but not expose this fact to user code.

However, most longs are just numbers that have no metadata. I guess you'd set the metadata portion to all zeroes in that case. This feels like a reified version of Rust's pointer provenance, and I think you would have to expose some metadata-aware operations to the user. In which case, you're inviting some code rewrites anyway.

While not as bad as the register/ALU ops issue, you're still making all code pay a storage size penalty, and still adding some overhead to handle the metadata propagating through arithmetic operations, just to accommodate bad code, plus it complicates alignment and FFI.

ummonk · 2025-10-29T18:24:54 1761762294

It would still be exposed to user code that checks its size with sizeof, but yeah the long would only have numerical values between 2^-63 and 2^63-1.

And yes, there would still be some overhead for storing and propagating the metadata, and struct alignment would change and FFI wouldn't work with longs.

ummonk · 2025-10-30T11:35:07 1761824107

Err I meant -2^63, that’s embarrassing.

kragen · 2025-11-01T15:11:58 1762009918

Heh. I missed that.

jibal · 2025-10-29T06:08:40 1761718120

There's a lot of code that makes assumptions about the number of bytes in a long rather than diligently using sizeof ... remember, the whole point here is low quality code.
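
For illustration, this is the sort of hard-coded size assumption meant here, next to the sizeof-based form. The helper names `alloc_longs` and `copy_longs` are made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fragile pattern (shown only as a comment): assumes 8 bytes per long, so
   it under-allocates if long grows and over-allocates where long is 4.
       long *v = malloc(n * 8);                                          */

/* Allocate n zeroed longs, letting the compiler supply the element size. */
long *alloc_longs(size_t n)
{
    return calloc(n, sizeof(long));
}

/* Copy n longs; the byte count is derived from the element type. */
void copy_longs(long *dst, const long *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* Self-check showing the intended use. */
void demo_sizeof(void)
{
    long src[2] = { 10, -20 };
    long *dst = alloc_longs(2);
    assert(dst != NULL);
    copy_longs(dst, src, 2);
    assert(dst[0] == 10 && dst[1] == -20);
    free(dst);
}
```

Only the commented-out form breaks when the width of long changes; the sizeof-based helpers adapt automatically.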

cryptonector · 2025-10-29T05:08:20 1761714500

It's going to break stuff one way or another.