Two more half-baked thoughts:
We should be able to assume that the majority of transaction data (except for coinbase) has already been propagated. As Jeff said, incentivizing nodes to propagate transactions is a very good thing (the signature cache already gives a small incentive to miners to propagate and not 'hoard' transactions).
So the only information that theoretically needs to be propagated is which transactions a miner is including in their block, and in what order they are included.
But if there was some agreed-upon canonical ordering, then it should theoretically be possible to take shortcuts in the "what order".
You'd start with setof(transactions I think everybody knows about)
Select some subset, based on miner's policy
Sort that subset with the canonical ordering algorithm
Very efficiently broadcast, taking all sorts of shortcuts assuming most of your peers already know the set you started with and expect the same canonical ordering (see gmaxwell's thoughts on block encoding).
Second half-baked thought:
I wonder if broadcasting your transaction selection policy ("11KB of free transactions, sorted by priority, then 111K of fee-paying transactions, sorted by fee") might make it possible to save even more bandwidth by letting your peers create a very good approximation of your block with just that information....