The network seems to run slower on GPUs than on CPUs.
I suspect that this is because the 2D convolution isn't designed to work efficiently if the height of the input is 1.
It shouldn't be too difficult to write some custom code to perform an efficient 1D convolution.
For example, an FFT-based convolution could be used for this.
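
To illustrate the FFT idea, here is a minimal pure-Python sketch. It is not code from this project, and the names `fft`, `fft_conv1d` and `direct_conv1d` are made up for the example. It assumes non-empty real-valued inputs and computes the full linear convolution by zero-padding both sequences to a power of two, multiplying their spectra, and transforming back. A real GPU implementation would presumably rely on a library FFT and batch over channels; this only shows the arithmetic.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized).

    len(x) must be a power of two. With inverse=True the exponent sign
    is flipped; the caller is responsible for dividing by len(x).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def fft_conv1d(signal, kernel):
    """Full linear convolution of two real sequences via the FFT."""
    out_len = len(signal) + len(kernel) - 1
    # Pad to the next power of two so the circular convolution computed
    # by the FFT does not wrap around into the linear result.
    n = 1
    while n < out_len:
        n *= 2
    a = fft([complex(v) for v in signal] + [0j] * (n - len(signal)))
    b = fft([complex(v) for v in kernel] + [0j] * (n - len(kernel)))
    prod = [p * q for p, q in zip(a, b)]
    res = fft(prod, inverse=True)
    return [v.real / n for v in res[:out_len]]


def direct_conv1d(signal, kernel):
    """Reference O(N*K) convolution, used only to check the FFT version."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out
```

Note that convolution layers in neural networks usually compute a cross-correlation, so the kernel would need to be reversed before calling `fft_conv1d` to match that behaviour. The FFT route also tends to pay off only for fairly long kernels; for short ones a direct 1D kernel may well be faster.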