The main processor load is signature verification. Each verification needs a hash function call, some large-number (modular) arithmetic and an elliptic curve operation, and the elliptic curve step is by far the longest.
It takes around 1 ms per signature on normal hardware, though optimized code is faster.
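The main task is to compute
R = u1 * G + u2 * Q
and check that the x coordinate of R matches the r value in the signature.
G is a constant (the curve's generator point); u1, u2 and Q (the signer's public key) differ from one signature to the next.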
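
To make the three steps concrete, here is a rough sketch of the check in plain Python (3.8 or later, for the modular inverse via pow). It assumes the secp256k1 curve and a double SHA-256 message hash, as Bitcoin uses. The helper names (point_add, point_mul, sign, verify, message_hash), the private key and the fixed nonce are all made up for illustration, and the slow affine double-and-add arithmetic is nothing like what an optimized library does. Treat it as a sketch of the maths, not as real client code.

```python
import hashlib

# secp256k1 domain parameters: field prime P, group order N, generator G.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) = infinity
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def point_mul(k, pt):
    """Scalar multiplication k * pt by simple double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def message_hash(msg):
    """Step 1: the hash function call (double SHA-256), as an integer."""
    digest = hashlib.sha256(hashlib.sha256(msg).digest()).digest()
    return int.from_bytes(digest, "big")


def sign(z, d, k):
    """Toy signer with a caller-supplied nonce k; demonstration only."""
    r = point_mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s


def verify(z, sig, Q):
    """Check signature (r, s) on hash z against public key Q."""
    r, s = sig
    if not (1 <= r < N and 1 <= s < N):
        return False
    w = pow(s, -1, N)            # Step 2: large-number maths mod N
    u1 = z * w % N
    u2 = r * w % N
    # Step 3: the expensive part, R = u1 * G + u2 * Q
    R = point_add(point_mul(u1, G), point_mul(u2, Q))
    return R is not None and R[0] % N == r


d = 0x1234567890ABCDEF                    # hypothetical private key
Q = point_mul(d, G)                       # matching public key
z = message_hash(b"example transaction")
sig = sign(z, d, k=0xDEADBEEF)            # fixed nonce so the demo is repeatable
print(verify(z, sig, Q))
print(verify(message_hash(b"tampered"), sig, Q))
```

Only G is shared between calls here. Everything else passed to verify (the hash z, the pair r and s, and the public key Q) has to be supplied fresh for each signature.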
There was talk of batch verification of signatures. The process might take, say, 16 signatures and verify them together.
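I think a lot of the benefit of the GPU would be lost to communication bandwidth, since almost all of the inputs are different for every signature and would have to be sent across. GPU miners, by contrast, benefit from needing very little information to be sent to the GPU routine.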