This is the kind of misinformation that makes people more wary of floats than they should be.
The same series of operations with the same input will always produce exactly the same floating point results. Every time. No exceptions.
Hardware doesn't matter. Breed of CPU doesn't matter. Threads don't matter. Scheduling doesn't matter. IEEE floating point is a standard. Everyone follows the standard. Anything not producing identical results for the same series of operations is *broken*.
What you are referring to is the result of different compilers doing a different series of operations than each other. In particular, if you are using the x87 fp unit, MSVC will round 80-bit floating point down to 32/64 bits before doing a comparison, and GCC will not by default.
Compilers don't even use 80-bit FP by default when compiling for 64-bit targets, so this is not a concern anymore, and hasn't been for a very long time.
There are just so many "but"s to this that I can't in good faith recommend that people treat floats as deterministic, even though I'd very much love to (and I make such assumptions myself, caveat emptor):
- NaN bits are non-deterministic. x86 and ARM generate different sign bits for NaNs. Wasm says NaN payloads are completely unpredictable.
- GPUs don't give a shit about IEEE-754 and apply optimizations ranging from DAZ to -ffast-math.
- sin, rsqrt, etc. behave differently when implemented by different libraries. If you're linking libm for sin, you can get different implementations depending on the libc in use. Or you can get different results on different hardware.
- C compilers are allowed to "optimize" a * b + c to FMA when they wish to. The standard only technically allows this merge within one expression, but GCC enables this in all cases by default on some `-std`s.
You're technically correct that floats can be used right, but it's just impossible to explain to a layman that, yes, floats are fine on CPUs, but not on GPUs; fine if you're doing normal arithmetic and sqrt, but not sin or rsqrt; fine on modern compilers, but not old ones; fine on x86, but not i686; fine if you're writing code yourself, but not if you're relying on linear algebra libraries, unless of course you write `a * b + c` and compile with the wrong options; fine if you rely on float equality, but not bitwise equality; etc. Everything is broken and the entire thing is a mess.
Yes, there are a large number of ways to fall into traps that cause you to do a different series of operations when you didn't realise that you did. But that's still ultimately what all your examples are. (Except the NaN thing!)
I still think it's important to fight the misinformation.
Programmers have been conditioned to be so afraid of floats that many believe that doing a + b has an essentially random outcome when it doesn't work that way at all. It leads people to spend a bunch of effort on things that they don't need to be doing.
Cars are not popular because people pushed them. Cars are popular because the utility is undeniable.
This is true for any kind of transformative technology. Marketing and lobbying can only get you so far. If something has enough utility, it will be used regardless of what people say they want.
> Cars are not popular because people pushed them. Cars are popular because the utility is undeniable.
I think this is somewhat of a chicken and egg problem. Cars' utility is undeniable partially because society has twisted itself thoroughly around The Car being an assumed part of it. This societal change was both pulled (by car customers) and pushed (by car manufacturers).
Yes absolutely—I think cars have obvious utility as machines, but there has now been 100 years of building everything around them and changing laws in such a way that encourages their use: through direct and indirect subsidy, land use rules that largely outlaw building cities in any way other than sprawl that itself increases the importance and utility of cars, and various other preferential regulations that often tolerate the harms in a way that is not applied elsewhere (c.f. panic over e-bike safety vs American highway safety overall).
Cars won because they were (and are) better than the alternatives. The need for powerful individual transportation with utility has always existed, and was originally met with horses. Bicycles meet the transportation need, but not the need for utility. Cars do both, and they do it better than anything else. Even before fueling infrastructure was rolled out, you could still run a car on petroleum you bought from the chemist, which is still infinitely better than the acres of pasture you need for horses. If you had an early diesel, it would run on oil, which is even easier.
The idea that cars needed all this infrastructure that other alternatives didn't just doesn't match the reality of the history of the automobile. And yes, we've leaned on those advantages in the century since, which has also created vast areas where a car is necessary to participate in society, but we only did so because the advantages and utility were so undeniable.
Because you can't adopt that syntax after the fact. There are 30 years of C++ in the real world; initializing everything by default unless you opt in will break some performance-critical code that should not initialize everything (until it is updated manually - it has to be manual because tools are not smart enough to know where something was intentionally left uninitialized 100% of the time).
Thus the current "erroneous behavior". It means this isn't undefined behavior (compilers used to optimize out code paths where an uninitialized value is read, and this did cause real-world bugs even when it didn't matter what value was read). It also means the compiler is free to put whatever value it wants there - one of the goals was that the various sanitizers that check for use of uninitialized values still need to work, since the vast majority of the time, reading an uninitialized value is a bug in the code.
There are a lot of situations where a compiler cannot tell if a variable would be used uninitialized, so we can't rely on compiler warnings (it sometimes needs solving the halting problem).
> There are a lot of situations where a compiler cannot tell if a variable would be used uninitialized, so we can't rely on compiler warnings (it sometimes needs solving the halting problem).
It's an explicit choice in C++ to always accept correct programs (the alternative being to always reject incorrect programs†). The committee does not have to stick by this bad decision in each C++ version - of course they aren't likely to stop making the same bad choice, but it is possible to do so.
If you're allowed to take the other side, you can of course (Rust and several other languages do this) reject programs where the compiler isn't satisfied that you definitely always initialize the variable before its value is needed. Most obviously (but it's pretty annoying, so Rust does not do this) you could insist on the initialization as part of the variable definition in the actual syntax.
† You can't have both, by Rice's Theorem. Henry Rice got his PhD for figuring out how to prove this, last century, long before C++ was conceived. So you must pick one or the other.
> Because you can't adopt that syntax after the fact.
The `= void` syntax can be adopted after the fact because it is currently not valid.
D (unlike C++) always has a default initializer, but does not allow a default constructor. This is sometimes controversial, but it heads off all kinds of problems.
The default initializer for floating point values is NaN. (And for chars it is 0xFF.) The point of this is for the value to not "happen" to work.
> there is 30 years of C++ in the real world, initializing everything by default unless you opt-in will break some performance critical code that should not initialize everything
...But the change to EB in this case does initialize everything by default?
No it doesn't. It says the value is unspecified but it exists. Some compilers did initialize everything before (this was common in debug builds). Some of them will in the future, but most won't do anything different.
The only difference is that some optimizers used to eliminate code paths where they could prove that path would read an uninitialized variable - causing a lot of weird bugs in the real world.
The precise value is not specified, but whatever value is picked also has to be something that isn't tied to the state of the program so some kind of initialization needs to take place.
Furthermore, the proposal explicitly states that (some) variables are initialized by default:
> Default-initialization of an automatic-storage object initializes the object with a fixed value defined by the implementation
> The automatic storage for an automatic variable is always fully initialized, which has potential performance implications.
We use floating point operations with deterministic lockstep between a server compiled with GCC on Linux, a Windows client compiled with MSVC, and an iOS client running on ARM, which I believe is compiled with Clang.
Works fine.
This is not a small code base, and no particular care has been taken with the floating point operations used.
I've written a lot of code using that method, and never had any portability issues. You use types with the number of bits in the name.
Hell, I've slung C structs across the network between 3 CPU architectures. And I didn't even use htons!
Maybe it's not portable to some ancient architecture, but none that I have experienced.
If there is undefined behavior, it's certainly never been a problem either.
And I've seen a lot of talk about TLB shootdown, so I tried to reproduce those problems but even with over 32 threads, mmap was still faster than fread into memory in the tests I ran.
Look, obviously there are use cases for libraries like that, but a lot of the time you just need something simple, and writing some structs to disk can go a long way.
As proven by many languages without native support for plain old goto, it isn't really required when proper structured programming constructs are available, even if it happens to be a goto under the hood, managed by the compiler.
My point is that it's bad debating style: 'Everyone knows C is bad for all kinds of reasons, ergo, even when someone presents their own actual experience, I can respond with a refrain that sounds good.'
Not using goto because you've heard it's always bad is the same kind of thing. Yes, it has issues, but that isn't a reason to brush off anyone who has actual valid uses for it.
C allows most of this, whereas C++ doesn't allow pointer aliasing without compiler flags, tricks, and problems.
I agree you can certainly just use bytes of the correct sizes, but often, to get the coverage you need for the data structure, you end up writing some form of wrapper or fixup code - which is still easier, and gives you more control, than most of the protobuf-like stuff that introduces a lot of complexity and tons of code.
Check your generated code. Most compilers assume that packed also means unaligned and will generate unaligned load and store sequences, which are large, slow, and may lose whatever atomicity properties they might have had.
Modern versions of standard C aren't very portable either: unless you plan to stick to the original version of K&R C, you have to pick and choose which implementations you plan to support.
I disagree. Modern C with C17 and C23 make this less of an issue. Sure, some vendors suck and some people take shortcuts with embedded systems, but the standard is there and adopted by GCC, Clang and even MSVC has shaped up a bit.
Well, if that is the standard for portability then may_alias might as well be standard. GCC and Clang support it and MSVC doesn't implement the affected optimization as far as I can find.
Within the context of this discussion portability was mentioned as key feature of the standard. If C23 adoption is as limited as the, possibly outdated, tables on cppreference and your comments about gcc, clang and msvc suggest then the functionality provided by the gcc attribute would be more portable than C23 conformant code. You could call it a de facto standard, as opposed to C23 which is a standard in the sense someone said so.
That seems highly unlikely. Let's assume that all compilers use the exact same padding in C structs, that all architectures use the same alignment, and that endianness is made up, that types are the same size across 64 and 32 bit platforms, and also pretend that pointers inside a struct will work fine when sent across the network; the question remains still: Why? Is THIS your bottleneck? Will a couple memcpy() operations that are likely no-op if your structs happen to line up kill your perf?
I guess to not have to set up protobuf or asn1. Those preconditions of both platforms using the same padding and endianness aren't that hard to satisfy if you own it all.
But do you really have such a complex struct where everything inside is fixed-size? I wouldn't be surprised if it happens, but this isn't so general-purpose like the article suggests.
No defined binary encoding, no guarantee about concurrent modifications, performance trade-offs (mmap is NOT always faster than sequential reads!) and more.
C has had fixed-size int types since C99. And you've always been able to define struct layouts with perfect precision (struct padding is well defined and deterministic, and you can always use `__attribute__((packed))` and bit fields for manual padding).
Endianness might kill your portability in theory, but in practice, nobody uses big endian anymore. Unless you're shipping software for an IBM mainframe, little endian is portable.
You just define the structures in terms of some e.g. uint32_le etc types for which you provide conversion functions to native endianness. On a little endian platform the conversion is a no-op.
It can be made to work (as you point out), and the core idea is great, but the implementation is terrible. You have to stop and think about struct layout rules rather than declaring your intent and having the compiler check for errors. As usual C is a giant pile of exquisitely crafted footguns.
A "sane" version of the feature would provide for marking a struct as intended for ser/des at which point you'd be required to spell out every last alignment, endianness, and bit width detail. (You'd still have to remember to mark any structs used in conjunction with mmap but C wouldn't be any fun if it was safe.)
I don't think it's faster than a Windows game running Vulkan, though, is it? Like, if you benchmarked a game that has native DX12 and Vulkan modes (such as Wolfenstein II: The New Colossus, I believe), it will probably have higher FPS in Vulkan mode, right?
Well our game runs faster in DX12 under Proton than Vulkan under Proton.
Of course since Proton uses Vulkan to implement DX12, it means that our Vulkan implementation is simply worse than the one that Valve created to emulate DX12.
I'm sure it's possible to improve that, but it implies that the way to get the best performance out of Vulkan is less obvious than the way to get it out of DX12.
God Roko's Basilisk is the most boring AI risk to catch the public consciousness. It's just Pascal's wager all over again, with the exact same rebuttal.
The culture that brought you "speedrunning computer science with JavaScript" and "speedrunning exploitative, extractive capitalism" is back with their new banger "speedrunning philosophy". Nuke it from orbit; save humanity.
I have never liked the word "manager" because it's rarely useful in practice. The best managers are actually doers who can cut through the performative transparency and "DDoS attack" of constant updates to see when communication is being used to manipulate or mask failure rather than report progress - because the last thing you want is for your employees to become politicians.