UPDATED CODE RE: [music-dsp] faster Hermite? - or not..
xk at myrealbox.com
Tue Apr 9 00:05:31 EDT 2002
> You will note that my test timing used a conditional based on argv() to
> set the "x" value of the interpolation. I have a hard time figuring out how
> the compiler could presciently know what I'd feed into it. Perhaps it only
> calculates the function value once instead of four times... which seems to
> be what it's doing. It also folds the constants I pass for the Y0-Y3,
> naturally. Scary!
> Re-building the code with changes (see below) I now get 6.3 cycles per
> call on the Athlon and 9.4 cycles per call on the P-III, including loop
> overhead. Looking at the code generated, that seems quite probable (and
> this time it's doing the calculation every time, darnit!). It really does
> seem that the code is mostly sequentially dependent :-(
The results are still incorrect. Although I get the same numbers as you,
the asm code is very short for 4 rolls of Hermite. I think that is because
all the Hermite calls use the same parameters (y1, ... , y3), so the c0->c3
coefficients are computed only once and then reused. You also have to feed
different y1->y3 to each call so that the compiler cannot assume anything.
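
To illustrate the point about varying the inputs, here is a rough sketch of
a timing loop in which every call sees a different set of samples. It is not
the code under discussion: hermite4() is the common 4-point, 3rd-order x-form
(assumed here to be the routine being timed, with samples named y0..y3), and
hermite_sum() is a hypothetical harness invented for this example. The
volatile accumulator is one way to stop the compiler from throwing the
results away.

```c
#include <assert.h>
#include <stddef.h>

/* 4-point, 3rd-order Hermite (x-form). Assumed formulation, not
   necessarily the exact routine timed in this thread. */
static float hermite4(float x, float y0, float y1, float y2, float y3)
{
    float c0 = y1;
    float c1 = 0.5f * (y2 - y0);
    float c2 = y0 - 2.5f * y1 + 2.0f * y2 - 0.5f * y3;
    float c3 = 1.5f * (y1 - y2) + 0.5f * (y3 - y0);
    return ((c3 * x + c2) * x + c1) * x + c0;
}

/* Hypothetical harness: slide a 4-sample window across buf so that each
   call gets different y0..y3 and c0..c3 must be recomputed every time. */
static float hermite_sum(const float *buf, size_t len, float x)
{
    assert(len >= 4);
    volatile float sink = 0.0f;  /* volatile: results cannot be discarded */
    for (size_t i = 0; i + 3 < len; ++i)
        sink += hermite4(x, buf[i], buf[i + 1], buf[i + 2], buf[i + 3]);
    return sink;
}
```

For timing, buf and x would need to come from something the compiler cannot
see at build time (argv, a file, rand()), otherwise the constants may still
get folded.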