Developer time is the most valuable resource. Writing in Python may give you slower code, but in reality it is much more efficient because you will get far more work done!
>>44
It wasn't about transistor count but keeping the delay between each pipeline stage as short as possible. It's the same with the odd add behaviour --- they probably figured propagating a carry through all 32 bits could be too slow for 1 cycle, so cut it up into 2 pieces with early-out.
AMD's "faildozer" was along the same lines too... but they came a few years later so the process advantage helped, although it still wasn't enough.
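For anyone who wants to see the split add concretely, here's a rough software sketch of adding two 32-bit values as two 16-bit halves, with the carry from the low half fed into the high half. It only models the arithmetic, not the timing or the actual circuit, and the name `staggered_add32` is made up for illustration.

```python
MASK16 = 0xFFFF

def staggered_add32(a, b):
    """Add two 32-bit values as two 16-bit halves, low half first.

    Illustrative model only: real hardware does this in adder logic,
    not in software, and this says nothing about cycle timing.
    """
    # Step 1: add the low 16 bits; the carry out is bit 16 of the partial sum.
    low = (a & MASK16) + (b & MASK16)
    carry = low >> 16

    # Step 2: add the high 16 bits plus the carry from step 1.
    high = ((a >> 16) & MASK16) + ((b >> 16) & MASK16) + carry

    # Recombine, dropping the carry out of bit 31 so the result wraps at 32 bits.
    return ((high & MASK16) << 16) | (low & MASK16)
```

The low half is finished before the high half starts, so anything that only needs the low 16 bits doesn't have to wait for the full carry chain.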