You have some computation/logic that you want to implement. Should you implement it so that it is executed by a CPU or a GPU? Or should you implement it as an integrated circuit (ASIC)? Or should you rewire an FPGA to implement your logic?
In theory, any computation/logic you want can be done in any of the above ways (they are all Turing complete). In practice, the method you choose depends on the characteristics of your computation and on which trade-offs you are willing to accept (difficulty of implementation, speed, power consumption, etc).
The following image summarizes the trade-offs (if the image makes sense to you, reading the rest of this article is optional).
In general, implementing something in hardware is harder and more expensive than implementing it in software, but the end product can run faster.
It is harder to implement in hardware because, to get a speed advantage there, you have to make sure your algorithm is highly parallel. So if 1) your algorithm is highly parallelizable (some aren’t) and 2) you are willing to spend the time and effort to parallelize it, then and only then will you get a speed advantage by using hardware. The same applies to writing software for a GPU, because both custom hardware and GPUs have a speed advantage only when the problem has been parallelized enough.
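To make "parallelizable" a bit more concrete, here is a minimal Python sketch contrasting the two kinds of work. The function names (`scale`, `map_in_chunks`, `iterate`) and the arithmetic inside them are made up for illustration. The sketch only shows the dependency structure; it does not measure any speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def scale(x):
    # Each output depends only on its own input, so the elements
    # can be processed in any order, or all at the same time.
    return 3 * x + 1

def map_in_chunks(values, workers=4):
    # Split the input into contiguous chunks, roughly one per worker.
    # Each chunk is processed independently of the others, and the
    # partial results are concatenated back in the original order.
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda chunk: [scale(x) for x in chunk], chunks)
    return [y for part in parts for y in part]

def iterate(x0, steps):
    # Step i needs the result of step i - 1, so the steps form a chain.
    # Adding more workers does not obviously shorten this chain.
    x = x0
    for _ in range(steps):
        x = (x * x + 1) % 1_000_003
    return x
```

The first kind of work (`map_in_chunks`) is the kind that maps well onto a GPU, FPGA or ASIC, because the chunks never have to wait for each other. The second kind (`iterate`) is a chain of dependent steps, so it gains little from being spread across many units.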
So in summary, you will only get a speed advantage on a GPU or in hardware if you can sufficiently parallelize your algorithm; if you can’t, you end up with a speed disadvantage! So do not move to a GPU or hardware (ASIC/FPGA) unless you are able and willing to sufficiently parallelize your solution.
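One way to put rough numbers on this is Amdahl's law, which this article does not otherwise use, extended here with a fixed overhead term. The overhead term is an assumption added for this sketch, standing in for costs such as moving data to and from the device. The function name `speedup` and the example fractions are likewise hypothetical.

```python
def speedup(parallel_fraction, units, overhead=0.0):
    # Run time is expressed relative to a single-unit run time of 1.0:
    #   serial part + parallel part spread over `units` + fixed overhead.
    # `overhead` is a hypothetical extra cost, as a fraction of the
    # original run time; it is not part of the classic Amdahl formula.
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / units + overhead)

# 95% of the work parallelizes across 1000 units: a large win.
mostly_parallel = speedup(0.95, 1000)

# Only 10% parallelizes, and moving to the device costs 20% extra:
# the result is below 1.0, i.e. slower than staying on one unit.
mostly_serial = speedup(0.10, 1000, overhead=0.2)
```

Under these assumed numbers, the mostly parallel case comes out a little under 20 times faster, while the mostly serial case ends up slower than the starting point. That is the "speed disadvantage" described above.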
Latency in this context is the time between an input arriving at the device and the response coming back from it. FPGAs can have a latency on the order of 1 microsecond, whereas even great CPUs will be at around 100 microseconds.
ASICs are the most expensive because after you design your circuit, you have to get it fabricated ("printed"). You submit your design to a fabrication company, and they will charge you a nice sum for your order. An FPGA is just a “rewireable” piece of hardware (whereas the logic in an ASIC is fixed), so it is already printed. You save a lot of money by grabbing already printed hardware and just rewiring it yourself for your logic. Quick aside: you specify the wiring of an FPGA in an HDL (hardware description language), such as Verilog or VHDL.
ASICs, FPGAs and GPUs are more power efficient than CPUs (in computation per watt) because they do an immense amount of computation in one cycle, or a few, by doing it in parallel. A CPU would spend hundreds or even thousands of cycles to do the same amount of computation, and every cycle takes up power.
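The cycle-count argument can be sketched as a toy model. It assumes every parallel lane finishes one operation per cycle and that all operations are independent. It deliberately ignores that wider hardware also draws more power per cycle, so it only counts cycles. The name `cycles_needed` and the numbers are illustrative.

```python
import math

def cycles_needed(operations, lanes):
    # With `lanes` independent operations completing per cycle, the work
    # takes ceil(operations / lanes) cycles in this simplified model.
    return math.ceil(operations / lanes)

one_lane = cycles_needed(4096, 1)      # 4096 cycles, one operation at a time
many_lanes = cycles_needed(4096, 1024) # 4 cycles with 1024 lanes
```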
GPUs are still more general purpose than ASICs and FPGAs, so they consume more power; however, they are getting better (power consumption wise).