edit: Well, you could TRY an FFT block size of 8 or 16 samples and do the convolution in the frequency domain, but I'd imagine plain brute force (direct time-domain convolution) is probably going to be faster. Using an FFT block that's longer than twice the kernel size is never profitable, and it usually makes sense to use a block of at most about half your kernel size and handle the rest with frequency-domain delays (i.e. partitioned convolution), 'cos the cost of multiply-accumulating over a small number of buffers tends to be less than the overhead of a larger FFT. It's very much one of those things where the big-O complexity is not the whole truth, though, and you'll have to tune it manually for the platform (e.g. caches and all that).
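For anyone wondering what "frequency-domain delays" looks like in practice, here's a rough NumPy sketch of uniformly partitioned overlap-save convolution. The function name `partitioned_convolve`, the `block` parameter and the default of 16 are just my picks for illustration, and this is nowhere near a tuned real-time implementation. The kernel is chopped into `block`-sized partitions, each input block gets one FFT of size `2 * block`, and the spectra of older input blocks sit in a delay line that is multiply-accumulated against the partition spectra.

```python
import numpy as np

def partitioned_convolve(x, h, block=16):
    """Uniformly partitioned overlap-save convolution (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    n_fft = 2 * block                       # FFT size is twice the block size
    n_parts = -(-len(h) // block)           # ceil(len(h) / block) kernel partitions

    # Zero-pad the kernel to a whole number of partitions, then take the
    # spectrum of each partition (each one zero-padded to n_fft).
    h_pad = np.zeros(n_parts * block)
    h_pad[:len(h)] = h
    H = np.fft.rfft(h_pad.reshape(n_parts, block), n=n_fft, axis=1)

    out_len = len(x) + len(h) - 1
    n_blocks = -(-out_len // block)
    x_pad = np.zeros(n_blocks * block)
    x_pad[:len(x)] = x

    # Frequency-domain delay line: row p holds the input spectrum from p blocks ago.
    fdl = np.zeros((n_parts, n_fft // 2 + 1), dtype=complex)
    prev = np.zeros(block)
    y = np.empty(n_blocks * block)

    for b in range(n_blocks):
        cur = x_pad[b * block:(b + 1) * block]
        fdl = np.roll(fdl, 1, axis=0)       # age every stored spectrum by one block
        fdl[0] = np.fft.rfft(np.concatenate((prev, cur)))
        acc = (fdl * H).sum(axis=0)         # multiply-accumulate over all partitions
        # Overlap-save: only the last `block` samples are free of circular wrap-around.
        y[b * block:(b + 1) * block] = np.fft.irfft(acc, n=n_fft)[block:]
        prev = cur

    return y[:out_len]
```

The output should match `np.convolve(x, h)` up to rounding. Note there's only one forward and one inverse FFT per block no matter how long the kernel is; the per-partition work is just the complex multiply-accumulate, which is the trade-off described above. Whether it actually beats brute force for a given kernel and block size is something you'd have to measure.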