Yes, the step you're missing is "and build the table". Dynamic memory allocation is something you want to avoid, as are any artificial restrictions on the number of inputs or outputs. The current solution is slow, but there is really no limit on tx size.
Plus, memory is significantly restricted in the embedded world. TREZOR actually uses a pretty powerful (and expensive) MCU just because it needs to do such validations and calculate such hashes. With SIGHASH_WITHINPUTVALUE or similar, we could cut the hardware cost significantly.
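
To illustrate the "no table" point, here is a rough sketch in C of keeping only fixed-size state while inputs are processed one at a time. It is a hypothetical illustration, not TREZOR firmware: the names InputTotals, totals_init, totals_add and totals_fee are made up, and it assumes each input's value has already been verified by streaming its previous transaction before totals_add is called.

```c
#include <stdint.h>

/* Hypothetical sketch: fixed-size state, no per-input table, no malloc. */
typedef struct {
    uint64_t total_in;     /* running sum of verified input values (satoshi) */
    uint32_t inputs_seen;  /* number of inputs processed so far */
} InputTotals;

void totals_init(InputTotals *t) {
    t->total_in = 0;
    t->inputs_seen = 0;
}

/* Add one input value (assumed already verified against its previous tx).
 * Returns 0 on success, -1 if the sum would overflow. */
int totals_add(InputTotals *t, uint64_t value) {
    if (value > UINT64_MAX - t->total_in) {
        return -1;
    }
    t->total_in += value;
    t->inputs_seen++;
    return 0;
}

/* Compute fee = inputs - outputs.
 * Returns 0 on success, -1 if the outputs exceed the inputs. */
int totals_fee(const InputTotals *t, uint64_t total_out, uint64_t *fee) {
    if (total_out > t->total_in) {
        return -1;
    }
    *fee = t->total_in - total_out;
    return 0;
}
```

Memory use stays the same for 2 inputs or 2000. The cost is that every previous transaction has to be streamed through the device again, which is where the slowness comes from.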
Marek