P550 is, like, what, only a year old? I suppose there has been some laughing at it at least.
Also Kendryte K230 / C908, but only on vector mem ops, which adds a whole other mess onto this.
I'd hope all the massive OoO will have fast misaligned mem ops, anything else would immediately cause infinite pain for decades.
But of course there'll be plenty of RVA23 hardware that's much smaller eventually too, once it becomes a general expectation instead of "cool thing for the very-top-end to have".
I do agree that it'd be reasonable to just assume fast misaligned ops, but for whatever reason gcc and clang just don't, and that's what we have for defaults.
It has taken a while for this core to appear in an SoC suitable for SBCs. Intel was originally announced as doing that, and got as far as showing a working SoC/board at the Intel Innovation 2022 event in September 2022.
Someone who attended that event was able to download the source code for my primes benchmark and compile and run it, at the show, and was kind enough to send me the results. They were fine.
For reasons known only to Intel, they subsequently cancelled mass production of the chip.
ESWIN stepped up and made the EIC7700X, as used in the Milk-V Megrez and SiFive HiFive Premier P550, which did indeed ship just over a year ago.
But technically we could have had boards with the Intel chip three years ago.
Heck we should have had the far better/faster Milk-V Oasis with the P670 core (and 16 of them!) two years ago. Again, that was business/politics that prevented it, not technology.
> No, it was released to customers in June 2021, almost five years ago.
Ah, okay. (still, like, at least a couple decades newer than the last x86-64 chip with slow unaligned mem ops, if such ever existed at all? Haven't heard of / can't find anything saying any aarch64 ever had problems with them either, so still much worse for the RISC-V side).
Well, I suppose we can hope that business/politics messes will all never happen again and won't affect anything RVA23.
> I do agree that it'd be reasonable to just assume fast misaligned ops, but for whatever reason gcc and clang just don't, and that's what we have for defaults.
This very much has a "for now" on it. Once there is actually widespread hardware with the feature, I would be very surprised if the compilers don't update their heuristics (at least for RVA23 chips).
Indeed we shall hope heuristics update; but of course if no compilers emit it, hardware has no reason to actually bother making misaligned ops fast, so it's primed for going wrong.
Hardware devs traditionally have been pretty good at helping the compiler teams with things like this (because it's a lot cheaper to improve the compiler than your chip).
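For anyone wanting a concrete picture of what the compiler default discussed above affects, here is a minimal sketch in C. The helper names (`load_u32_unaligned`, `sum_words_at_offset`) are made up for illustration. The `memcpy` idiom is the portable way to express a possibly misaligned load; how it gets lowered is up to the compiler. As far as I know, GCC and Clang targeting RISC-V will expand it into byte loads and shifts unless told misaligned access is acceptable (e.g. `-mno-strict-align`, or a `-mcpu`/`-mtune` that marks it as fast), but that depends on compiler version and target settings, so check the generated assembly rather than taking my word for it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from an address that may not be 4-byte aligned.
   memcpy keeps this well-defined C; the compiler chooses whether to emit
   one (possibly misaligned) word load or several byte loads. */
static inline uint32_t load_u32_unaligned(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: sum `count` 32-bit words starting `offset` bytes
   into `buf`. With an odd offset, every load is misaligned. */
uint32_t sum_words_at_offset(const unsigned char *buf, size_t offset, size_t count) {
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += load_u32_unaligned(buf + offset + 4 * i);
    return sum;
}
```

Comparing the output of this with and without `-mno-strict-align` is a quick way to see whether a given toolchain assumes fast misaligned scalar access for the selected target.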