That'll work. C5 is one of the two capacitors that tune the input filter, LPF1. The 6.8nF cap that I add in my filter mod is the second. R3-R5 also influence the filter's tuning. Adding the 6.8nF cap changes LPF1 from a 1st-order filter to a 2nd-order filter. Higher-order filters have a sharper cutoff, i.e. they attenuate more of the unwanted signal above the cutoff freq. C6 is one of the two capacitors that tune the output filter, LPF2. C11 is the other cap. I did not feel the need to retune LPF2.
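To put rough numbers on "sharper cutoff", here is a minimal sketch comparing ideal 1st-order and 2nd-order Butterworth low-pass responses. The 5 kHz cutoff is a made-up value, not something read off the schematic, and the real LPF1 response depends on the actual part values (C5, the 6.8nF cap, R3-R5), so treat this as an illustration of the trend only.

```python
import math

def butterworth_attenuation_db(freq_hz, cutoff_hz, order):
    """Attenuation in dB of an ideal Butterworth low-pass of the given order."""
    ratio = freq_hz / cutoff_hz
    return 10.0 * math.log10(1.0 + ratio ** (2 * order))

CUTOFF_HZ = 5000.0  # hypothetical cutoff, NOT taken from the pedal's schematic

for multiple in (1, 2, 4, 10):
    f = multiple * CUTOFF_HZ
    first = butterworth_attenuation_db(f, CUTOFF_HZ, 1)
    second = butterworth_attenuation_db(f, CUTOFF_HZ, 2)
    print(f"{f:8.0f} Hz  1st-order: {first:5.1f} dB  2nd-order: {second:5.1f} dB")
```

Both filters are about 3 dB down at the cutoff. Well above it, the 1st-order response falls at roughly 6 dB per octave and the 2nd-order at roughly 12 dB per octave, which is why the extra cap knocks down more of the high-freq junk.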
With time-sampled systems, filtering is a trade-off between bandwidth and noise. Let too much high-freq content in and it gets aliased, which sounds dissonant & noisy. The output from a sampled system contains quantization noise and sampling images that are also dissonant & noisy. Filtering reduces that unwanted crap. The filtering requirements depend on the sample rate, which in the case of most delays is variable. Filtering that sounds good with short delay settings may not sound good at longer delay settings.
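A small sketch of why the sample rate matters: any input above half the sample rate folds back down to a lower, unrelated frequency. The sample rates below are made-up numbers for illustration, not measured PT2399 clock rates.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone shows up after ideal sampling (no input filter)."""
    folded = freq_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded
    return folded

TONE_HZ = 13000  # hypothetical high-freq content in the input signal

# Hypothetical sample rates: higher for short delays, lower for long delays.
for sample_rate_hz in (40000, 30000, 20000):
    heard = alias_frequency(TONE_HZ, sample_rate_hz)
    status = "passes as-is" if heard == TONE_HZ else "aliased"
    print(f"fs = {sample_rate_hz} Hz: {TONE_HZ} Hz tone appears at {heard} Hz ({status})")
```

With these numbers the 13 kHz tone survives at 40 kHz and 30 kHz sampling but lands at 7 kHz when sampled at 20 kHz. A fixed input filter that is adequate at the higher rate lets more aliasing through once the rate drops.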
Analog delays (BBDs) are also time-sampled and benefit from the same filtering. Tape delays use filtering on the input and output to reduce tape hiss. The filtering, or lack thereof, contributes to the characteristic voice of the various delays.
I mentioned the FV-1 earlier. Like the PT2399, it is a digital sampled system. What makes the FV-1 so much clearer (more bandwidth and lower noise) is its higher sampling rate and resolution.
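As a rough feel for what resolution buys you, the textbook figure for an ideal N-bit converter with a full-scale sine input is an SNR of about 6.02*N + 1.76 dB. The bit depths below are purely illustrative and are not datasheet figures for the PT2399 or the FV-1; real chips fall short of the ideal number.

```python
def ideal_quantization_snr_db(bits):
    """Ideal SNR of an N-bit uniform quantizer driven by a full-scale sine."""
    return 6.02 * bits + 1.76

# Illustrative bit depths only, not claims about any specific chip.
for bits in (8, 12, 16):
    print(f"{bits:2d} bits: about {ideal_quantization_snr_db(bits):.1f} dB")
```

Each extra bit is worth about 6 dB less quantization noise.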