I feel like there's a ton of unexplored territory in the hardware space for extremely low-power ML circuits. As we know, ML tolerates very low precision, which is why chips with fast INT8 and FP16 paths are getting more popular.
However, these lower-power chips are still designed like classical computers: fully digital and deterministic. What if you instead had a few thousand analog multipliers computing matrix products in parallel? An analog multiply can use far less power than a digital one, at the cost of accuracy and determinism, which might not matter much for some ML tasks.
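To get a feel for the accuracy trade-off, here's a rough numerical sketch (my own toy model, not any real analog design): each multiply is corrupted by independent multiplicative Gaussian noise, while accumulation is assumed exact. The errors largely average out over the inner dimension, so the matrix product stays close to the exact result.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_matmul(a, b, noise=0.02):
    # Crude model of an analog multiplier array: each individual
    # product picks up a random gain error (1 + eps), eps ~ N(0, noise).
    outer = a[:, :, None] * b[None, :, :]            # all pairwise products
    outer *= 1 + rng.normal(0, noise, outer.shape)   # per-multiply noise
    return outer.sum(axis=1)                         # accumulation assumed exact

a = rng.standard_normal((64, 128))
b = rng.standard_normal((128, 32))

exact = a @ b
noisy = noisy_matmul(a, b)
rel_err = np.linalg.norm(noisy - exact) / np.linalg.norm(exact)
print(rel_err)  # stays small: independent per-multiply errors mostly cancel
```

With 2% noise per multiply, the relative error of the whole product lands around the same 2%, comparable to aggressive quantization, which is the intuition for why this might be acceptable for inference.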