[music-dsp] C++ performance

Didier Dambrin didid at skynet.be
Thu Oct 28 08:36:19 EDT 2010


While I don't know C++ and I have to write asm all the time (because 
Delphi's compiler hasn't evolved since the 386 came out), I think that what 
helps the most is to process in layers.
That is, don't write big complex loops processing 1 sample at a time, but 
process a good amount of samples in layers. That way the plain code will 
have more chances to be well optimized, and you will be able to use 
block-processing libraries.

Besides, coding in layers is also clearer.
(This of course doesn't always work; it isn't suitable for processing in which 
feedback occurs.)
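
To illustrate the layered approach, here is a minimal C++ sketch. It is only an illustration under stated assumptions, not code from any real project: the names apply_gain, hard_clip, mix_into, process_block and kBlockSize are all hypothetical, and the chain (gain, then clip, then mix) was picked just because none of its stages has feedback. Each layer is a trivial loop over the whole block, which a compiler has a fair chance of vectorizing and which could be swapped for a block-processing library call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical maximum block length, chosen only for this sketch.
constexpr std::size_t kBlockSize = 64;

// Layer 1: scale every sample in the block by the same gain.
void apply_gain(float* buf, std::size_t n, float gain) {
    for (std::size_t i = 0; i < n; ++i) buf[i] *= gain;
}

// Layer 2: hard-clip every sample to the range [-limit, +limit].
void hard_clip(float* buf, std::size_t n, float limit) {
    for (std::size_t i = 0; i < n; ++i) {
        if (buf[i] > limit) buf[i] = limit;
        else if (buf[i] < -limit) buf[i] = -limit;
    }
}

// Layer 3: add the processed block into the output block.
void mix_into(float* dst, const float* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] += src[i];
}

// Run the layers one after another over the whole block, instead of
// doing gain + clip + mix inside one loop that handles a single sample
// per iteration. The input is copied to a scratch buffer and left untouched.
void process_block(const float* in, float* out, std::size_t n) {
    assert(n <= kBlockSize);
    float scratch[kBlockSize];
    std::copy(in, in + n, scratch);
    apply_gain(scratch, n, 2.0f);
    hard_clip(scratch, n, 1.0f);
    mix_into(out, scratch, n);
}
```

A caller would invoke process_block once per block of at most kBlockSize samples. A stage with feedback (a recursive filter, for instance) cannot be split up this way, because each output sample depends on the previous one.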

But I'm not a fan of "optimize later": you could end up with something that 
can't be made fast enough, and then it won't be usable at all. To know how far 
you can go, you need to know the speed of your building blocks.



----- Original Message ----- 
From: "Michael Gogins" <michael.gogins at gmail.com>
To: "A discussion list for music-related DSP" <music-dsp at music.columbia.edu>
Sent: Thursday, October 28, 2010 2:26 PM
Subject: Re: [music-dsp] C++ performance


I tried to make my tests reasonably realistic, as I am aware of the
issues you raise. My loops performed arithmetic on randomly generated
numbers in the arrays.

I also did examine the generated assembler.

The designers and implementers of the Standard C++ Library state that
a primary design objective for containers and algorithms is to equal,
for all practical purposes, the performance of hand-coded assembler.
My experience and indeed my tests lead me to conclude that in at least
some cases, this objective has indeed been met.
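
For readers who want to repeat this kind of comparison, here is a minimal sketch of the three access styles discussed in this thread (std::vector iterator, array indexing, pointer arithmetic). It is not the original test code, which was not posted; the function names, the arithmetic in the loop body and the seconds_for helper are assumptions made for illustration. Timings will depend on the compiler, its options and the target, so the generated assembler still deserves a look.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Style 1: walk a std::vector<double> with its iterator.
double sum_iterator(const std::vector<double>& v) {
    double s = 0.0;
    for (std::vector<double>::const_iterator it = v.begin(); it != v.end(); ++it)
        s += *it * 0.5;
    return s;
}

// Style 2: index into a plain buffer.
double sum_index(const double* buf, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += buf[i] * 0.5;
    return s;
}

// Style 3: pointer arithmetic with an explicit end pointer.
double sum_pointer(const double* buf, std::size_t n) {
    double s = 0.0;
    const double* end = buf + n;
    for (const double* p = buf; p != end; ++p) s += *p * 0.5;
    return s;
}

// Hypothetical timing helper: calls f() `repeats` times and returns the
// elapsed wall-clock time in seconds. The volatile sink keeps the
// optimizer from discarding the work being measured.
template <typename F>
double seconds_for(F f, int repeats) {
    assert(repeats > 0);
    volatile double sink = 0.0;
    const auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) sink = sink + f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Each style would be timed over the same randomly filled data, for example with seconds_for([&] { return sum_iterator(v); }, 1000), and the three results compared across several runs.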

That reinforces the advice, "write clear code, then profile, then
analyze hot spots for re-design and/or optimization."

I cannot over-emphasize the importance of clarity and simplicity in
programming. Clever tricks will obscure the clarity of the algorithms,
if not always for the author of the code, then certainly for its
maintainers.

Regards,
Mike

On Wed, Oct 27, 2010 at 9:03 PM, Alan Wolfe <alan.wolfe at gmail.com> wrote:
> Feel free to ignore this message but most benchmarks only benchmark
> how something performs in benchmarks (not realistic scenarios)
>
> I'm not saying that vector and direct array access may necessarily
> have very different performance, but if you are going to benchmark you
> should make sure and benchmark an actual realistic situation instead
> of a tight loop.
>
> If you already know this, or disagree, feel free to ignore :P
>
> On Wed, Oct 27, 2010 at 5:44 PM, Michael Gogins
> <michael.gogins at gmail.com> wrote:
>> I did the experiment comparing std::vector<double>::iterator versus
>> double buffer [mysize] access versus pointer arithmetic access using
>> both Microsoft VC++ 2005 and MingW 3.4.5 (I think those were the
>> versions). There was no significant difference at all in speed, and
>> none that I could see (though I'm not an expert) in the generated
>> assembler.
>>
>> Regards,
>> Mike
>> On Wed, Oct 27, 2010 at 2:21 PM, robert bristow-johnson
>> <rbj at audioimagination.com> wrote:
>>>
>>> On Oct 27, 2010, at 1:52 PM, Thomas Strathmann wrote:
>>>
>>>>
>>>> The register qualifier is just a hint. The compiler does not 
>>>> necessarily
>>>> hold the variable in a CPU register. Likewise, the comparison need not 
>>>> be
>>>> implemented using subtraction if your processor has more comparison
>>>> instructions than just a test for zero. gcc 4.2.1 for Intel 32 bit 
>>>> loads
>>>> either 0 (in your case) or CHUNK_SIZE-1 (in the other case) and does 
>>>> the
>>>> appropriate comparison. So in both cases you get a load. Bottom line is: 
>>>> You
>>>> will still have to look at the generated code and/or do profiling. No 
>>>> amount
>>>> of trickery with qualifiers and restructuring your code will 
>>>> automatically
>>>> change anything. But it does not hurt to give the compiler some advice. 
>>>> It's
>>>> always free to ignore it and will often enough do so.
>>>>
>>>>> sometimes you *should* spell it out for the compiler. i only wish the
>>>>> promotion to (long long) was unnecessary. i wish that in C and C++, it
>>>>> was automatically understood that the type of the result of 
>>>>> multiplying
>>>>> two N-bit numbers was a type with a 2N-bit number if a 2N-bit type is
>>>>> available. that is one flaw of C or C++.
>>>>
>>>> Spelling it out for the compiler is using idioms appropriate for your
>>>> language, compiler, and runtime (point 1). But you can easily overdo 
>>>> it,
>>>> especially if you just think that you know better than the compiler. 
>>>> Point 4
>>>> does not apply if you actually know how the compiler translates certain 
>>>> code
>>>> sequences. It's dangerous to think that if compiler X for language Y in
>>>> version Z and options O for target T produces such and such code then 
>>>> in
>>>> another setting it must certainly also emit the same code.
>>>
>>>
>>> i still think that this code:
>>>
>>> y1 += (long long)b0 * (long long)(*input++);
>>> y1 += (long long)a1 * (long long)output_sample;
>>> output_sample = (long)(y1>>24);
>>> *output_ptr++ = output_sample;
>>>
>>> has a better chance of being compiled into the simplest machine code 
>>> than
>>>
>>> *output_ptr++ = (long)( (y1 + (long long)a1*(y1>>24) + (long
>>> long)b0*(long long)(*input++) )>>24 );
>>>
>>>
>>>
>>> On Oct 27, 2010, at 2:08 PM, Ross Bencina wrote:
>>>
>>>> robert bristow-johnson wrote:
>>>>>
>>>>> for (register int i=CHUNK_SIZE; i>0; i--)
>>>>> {
>>>>
>>>> [snip]
>>>>>
>>>>> *output_ptr++ = output_sample;
>>>>> }
>>>>
>>>> When I looked at this I thought "that isn't spelling out anything" --  
>>>> but
>>>> really, it depends on how advanced your compiler is. For a simple 
>>>> compiler
>>>> that trusts you, I agree. But I imagine for most modern mainstream 
>>>> compilers
>>>> this isn't spelling out anything.
>>>>
>>>> If it thought it appropriate the compiler could easily rewrite your
>>>> example as:
>>>>
>>>> long *end = output_ptr + CHUNK_SIZE;
>>>> while( output_ptr != end ){
>>>> ...
>>>> *output_ptr++ = output_sample;
>>>> }
>>>>
>>>> or rewrite the latter as the former, or whatever other implementation 
>>>> is
>>>> known to be fastest on the target, given available registers etc etc
>>>
>>> using output_ptr instead of "i" is fine, but it might not save a 
>>> register
>>> (if "end" must be saved). and since end is compared or subtracted from a
>>> copy of output_ptr, i do not believe it saves an addition (compared to
>>> decrementing i).
>>>
>>>
>>> --
>>>
>>> r b-j rbj at audioimagination.com
>>>
>>> "Imagination is more important than knowledge."
>>>
>>>
>>>
>>>
>>>
>>> --
>>> dupswapdrop -- the music-dsp mailing list and website:
>>> subscription info, FAQ, source code archive, list archive, book reviews, 
>>> dsp
>>> links
>>> http://music.columbia.edu/cmc/music-dsp
>>> http://music.columbia.edu/mailman/listinfo/music-dsp
>>>
>>
>>
>>
>> --
>> Michael Gogins
>> Irreducible Productions
>> http://www.michael-gogins.com
>> Michael dot Gogins at gmail dot com
>>
>



-- 
Michael Gogins
Irreducible Productions
http://www.michael-gogins.com
Michael dot Gogins at gmail dot com

