Ah, was not aware of that particular octaver algorithm. Sounds really interesting. Did you derive that code from the paper or was it already published somewhere?
All the code outside of the lib directory is my own*. I'd say I followed the spirit of the paper, but I didn't end up slavishly copying every detail it specifies. I'm not aware of an existing implementation published anywhere.
More information than you bargained for inbound...
ERB-PS2 Algorithm
The paper that passtheducky mentioned, which describes the ERB-PS2 algorithm:
https://core.ac.uk/download/pdf/80719011.pdf
To summarize ERB-PS2 for everyone: the dry signal is fed into a bank of narrow bandpass filters in order to isolate individual frequencies (and their phases). This is similar to what an FFT does, but it allows unequal frequency bin sizes and doesn't require evaluating the entire spectrum. Each filter output is then shifted to double (or half) its original frequency, and the results from all filters are mixed back together.
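To make the "narrow bandpass filter that also gives you phase" idea concrete, here is a minimal sketch of one band. It uses a complex one-pole resonator, which is only a stand-in chosen for brevity: it is not the filter design from the paper and not the one in the repo. The class name, the `r` parameter, and the gain normalization are all assumptions made for illustration.

```python
import cmath
import math

class ComplexResonator:
    """One band of a hypothetical filter bank (illustrative stand-in only).

    The output is complex: its real and imaginary parts form a quadrature
    pair for whatever part of the input sits near center_hz.
    """

    def __init__(self, center_hz, sample_rate_hz, r=0.995):
        w = 2.0 * math.pi * center_hz / sample_rate_hz
        # Pole just inside the unit circle; r closer to 1 means a narrower
        # band, but also a slower response (more delay).
        self.pole = r * cmath.exp(1j * w)
        # Scale so a unit-amplitude real sine at center_hz comes out with
        # magnitude close to 1.
        self.gain = 2.0 * (1.0 - r)
        self.state = 0j

    def process(self, sample):
        self.state = self.gain * sample + self.pole * self.state
        return self.state

# Usage: feed a 440 Hz sine into a 440 Hz band and let it settle.
fs = 48000.0
band = ComplexResonator(440.0, fs)
for n in range(8000):
    out = band.process(math.cos(2.0 * math.pi * 440.0 * n / fs))
```

A full bank would run many of these at different center frequencies, shift each complex output, and sum the results.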
The ERB part of the name refers to how the filters are spaced using a model of how the human ear receives sound.
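For reference, ERB stands for "equivalent rectangular bandwidth". The sketch below uses the widely quoted Glasberg and Moore approximation; whether the paper uses exactly these constants is an assumption here, and the function names are made up for illustration. It shows what "constant ERB spacing" looks like in terms of filter center frequencies.

```python
import math

def erb_bandwidth_hz(f_hz):
    """Approximate auditory filter bandwidth (Hz) at center frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_rate(f_hz):
    """Map a frequency in Hz to its position on the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(e):
    """Inverse of erb_rate()."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def erb_spaced_centers(f_lo, f_hi, n):
    """n center frequencies from f_lo to f_hi, evenly spaced in ERB-rate."""
    lo, hi = erb_rate(f_lo), erb_rate(f_hi)
    step = (hi - lo) / (n - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n)]

# Example: 20 bands between 50 Hz and 2 kHz (arbitrary illustrative numbers).
centers = erb_spaced_centers(50.0, 2000.0, 20)
```

The gaps between neighboring centers grow with frequency, roughly tracking how the ear's resolution coarsens higher up.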
The PS2 part of the name refers to "Phase Scaling by 2", the process used to shift the frequency. In my opinion, this is the key innovation over the Rollers algorithm, which used single-sideband modulation to perform the frequency shift.
What I Changed
The actual "Phase Scaling" equation that I used is from the paper (page 29). That said, the paper only covers octave up. If you expand the equation for octave down (γ = 1/2), you end up with a square root whose sign needs to alternate between positive and negative for each cycle of the original signal. The good news is that we already have a quadrature signal, which makes detecting these transitions fast and easy. At least, easy once you realize it. I don't have a ton of signals background, and it took me a while to figure this out.
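Here is a rough sketch of the idea, derived from plain trig identities rather than copied from the paper or the repo. It assumes each band gives a quadrature pair (x, y) = (a cos φ, a sin φ); the names `octave_up` and `OctaveDown` are hypothetical. Octave up is a cos 2φ = (x² - y²) / a. Octave down is a cos(φ/2) = ±sqrt(a (a + x) / 2), and the sign has to flip every time φ wraps past ±π, which shows up as y changing sign while x is negative.

```python
import math

def octave_up(x, y):
    """Double the phase of the quadrature pair, keep the amplitude."""
    a = math.hypot(x, y)
    if a == 0.0:
        return 0.0
    return (x * x - y * y) / a

class OctaveDown:
    """Halve the phase; tracks the square-root sign across input cycles."""

    def __init__(self):
        self.sign = 1.0
        self.prev_y = 0.0

    def process(self, x, y):
        a = math.hypot(x, y)
        # The phase wraps when the pair crosses the negative x axis:
        # x is negative and y changes sign. Flip the sign there.
        if x < 0.0 and (self.prev_y >= 0.0) != (y >= 0.0):
            self.sign = -self.sign
        self.prev_y = y
        # max() guards against tiny negative values from rounding.
        return self.sign * math.sqrt(max(0.0, 0.5 * a * (a + x)))

# Usage: a clean 110 Hz quadrature pair in, 220 Hz and 55 Hz out.
fs = 48000.0
down = OctaveDown()
ups, downs = [], []
for n in range(2000):
    th = 2.0 * math.pi * 110.0 * n / fs
    x, y = math.cos(th), math.sin(th)
    ups.append(octave_up(x, y))
    downs.append(down.process(x, y))
```

The square root is zero exactly where the sign flips, so the output stays continuous through the transition.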
Like the paper, I use a bank of narrow-band filters that generate quadrature signals. I used a different bandpass filter design than what was specified, mostly because it was the first (and only) thing that I got to work well.
For the filter spacing, I ultimately gave up on worrying about the constant-ERB-bandwidth concept, which didn't produce the results I wanted (but maybe I was doing it wrong?). Instead, I did the same thing with the Sub 'N' Up (SNU) that the author of the paper did with the POG: I fed it a frequency sweep and analyzed the output using an FFT. This convinced me that the SNU is using the same algorithm, and it revealed a couple of key ideas:
First, the SNU completely ignores any input frequency over about 2 kHz. This greatly reduces the number of filters needed. It also allows downsampling before performing the octave shifting. Both of these features proved to be essential for the amount of processing power available on the Daisy Seed.
Second, the SNU filter spacing is tighter than what I understood the paper to be recommending. Its filter placement more or less follows an exponential curve that is positioned so there are fewer filters per semitone at the lower frequencies. This keeps the low-frequency filters from getting too narrow, which would introduce more delay. I followed a similar curve for my own implementation.
The very lowest SNU filters are spaced out even further, and some are excluded from the down-shifts where the result would fall below 20 Hz. I did not replicate these details.
I think that covers it.
* Correction: the fast inverse square root function is cribbed from elsewhere; I forgot about it when I was writing this.