This thread is discussing two unrelated things.
Your first email asked about transaction pruning ("stubbing"). You're correct. This doesn't do anything for initial chain download bandwidth or time. In fact it makes it slower because you have the overhead of deleting the old transactions. It exists purely to save disk space.
Christians reply is about simplified payment verification (SPV) mode. It is unrelated to transaction pruning. SPV clients can download only the chain headers with no bodies all the way from the genesis block until the creation time of their youngest key. This does reduce initial setup time and in fact is now implemented in BitCoinJ, but it's still linear in the length of Bitcoins life, so that's ultimately unsustainable. You need a regular series of checkpoints signed by a trusted developer and a circular block store to have truly bounded overheads. The merkle tree is still useful because it allows for SPV clients to receive only the transactions of interest yet have nearly the same assurances that downloading full blocks would give - remote nodes can now hide transactions from you (dos) but not invent new ones.
SPV clients do not use "number of blocks on top" as a way to decide validity. They look for the best chain they can find, same as a regular node does. As Satoshis paper says, if an SPV node has access to the P2P network and is also talking to you, you can defraud it for as long as you can dominate the networks hash power (51% attack) because you can create a harder chain than everyone else can. However your invalid blocks won't be accepted by the rest of the network regardless of how many there are or how much work they represent, so as soon as you stop dominating the network the correct chain will catch up and replace yours, resulting in the fraud being detected and shown to the SPV user.