Coming from an 8-bit assembly background, I've always found the alignment size/speed tradeoff a little vexing: either the access is slower, or we waste memory on padding. I kept thinking "couldn't the hardware be better?", so finding out that x86 has evolved to the point where unaligned accesses have basically no penalty (at least when they don't straddle a cache line) was really pleasing. No more wasted bytes, so we can have both small and fast. (How do they do this? By making the data paths to cache and memory a lot wider than a word, among other things.) IMHO ARM is just starting to see the value of this and is catching up. If only the other architectures would do the same...
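To make the tradeoff concrete, here's a rough C sketch of the same record laid out both ways. The struct names and the helpers `load_u32` and `bytes_saved` are made up for illustration, and `__attribute__((packed))` is a GCC/Clang extension rather than standard C, so treat this as an assumption-laden example and not a portable recipe.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint8_t, uint32_t */
#include <string.h>   /* memcpy */

/* Natural layout: the compiler inserts padding after `tag` so that
 * `value` is aligned (typically 3 bytes, giving sizeof == 8). */
struct padded {
    uint8_t  tag;
    uint32_t value;
};

/* Packed layout (GCC/Clang extension, assumed here): no padding,
 * so `value` sits at offset 1 and is misaligned. */
struct __attribute__((packed)) packed {
    uint8_t  tag;
    uint32_t value;
};

static_assert(sizeof(struct packed) == 5, "packed layout has no padding");
static_assert(offsetof(struct packed, value) == 1, "value is misaligned");

/* Hypothetical helper: load a 32-bit value from any address.
 * memcpy is well-defined for unaligned sources; on x86 compilers
 * usually turn it into a single plain load. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: bytes saved per record by packing
 * (3 on common ABIs where the padded struct is 8 bytes). */
size_t bytes_saved(void)
{
    return sizeof(struct padded) - sizeof(struct packed);
}
```

On hardware where unaligned loads are cheap, the packed version gets the smaller footprint at little or no speed cost; on hardware where they aren't, the compiler has to fall back to slower byte-wise access for `value`.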