You're probably better off reading the CPU manuals and old Mac/Amiga/Atari optimization-trick text files. The biggest problem you'll face is that the ELF ABI is documented only in a book that's long out of print, so in practice the only authoritative source is GCC's source code.
Old 1980s magazines like BYTE or DTACK Grounded may be useful as well.