[Portaudio] Alsa: support for sub-device & 24-bit BE format only

Ross Bencina rossb-lists at audiomulch.com
Fri Mar 30 23:33:32 EDT 2012

Hi Alan,

Thanks for your extensive explanation, I appreciate you taking the time 
to explain it. I don't think this has been considered in this detail before.

I have asked some questions inline below.

On 30/03/2012 11:51 PM, Alan Horstmann wrote:
> Hi Dmitry, Ross,
> On Friday 30 March 2012 10:55, you wrote:
>> Thank you for the details you provided regarding this case.
>>> It is not uncommon for audio hardware to have very specific hardware
>>> formats.
>> Yes, I believe so, but with Audio4DJ this is the reality. It does
>> provide only 3-byte 24-bit BigEndian format for capturing/playback.
>> Portaudio's Alsa implementation was missing ability to handle such
>> case.
> Very limited raw hardware flexibility is the reality with a great many sound
> cards in one way or another, but doesn't need fixing in Portaudio, IMHO.

I'm not sure we are talking about "fixing". Some reasonable questions 
appear to be:

- Should PA support talking direct to hw devices? If so, how much 
support? This appears to be somewhat analogous to the WDM/KS host API 
where we are getting as close to the metal as possible.

- Should PA ~usually~ talk to hardware via plughw devices? Apparently it 
does not at the moment, although this discussion appears to be heading 
in that direction.

>>> But Alsa provides a 'plug' layer that handles conversions
>> But does 'plug' operate with same low latency as it happens with
>> direct talking to hardware? What is the additional overhead of routing
>> audio through 'plug'? Is this routing always safe and stable for
>> all Linux versions?
> The 'plug' layer adds no latency AFAIK; it is the standard mechanism for
> working with audio on Alsa

Is the 'plug' layer something that can be dynamically configured by PA 
code? Is it like an adapter object that can be imposed by PortAudio 
between the hardware and PA code? Or is it something that must be 
correctly configured by the user, independently of and outside the 
control of PA?

What I'm getting at here, is that it seems unclear to me whether PA can 
rely on the plug layer if it's something that has to be correctly 
configured by the user to "just work".

>> To avoid consideration of these questions

Note to Dmitry: it is not appropriate to commit code that "avoids 
consideration of questions" -- things need to be correct, or at least 
discussed among peers.

>> to my
>> view it is better to handle format conversion inside Portaudio's Alsa
>> implementation which is usually overhead of 1 additional CPU
>> instruction and provide low-latency stable operation out of the box.
>>> Otherwise does channel mapping and rate conversion need also to be
>>> provided to support all hardware raw?
>> Well, Portaudio's ASIO implementation handles byte swapping of host
>> format to native endianness and I do not see a reason why Alsa
>> implementation can't do the same. Are you concerned about increased
>> size of binary, or some other things? What are the most important
>> reasons for not doing that with Alsa too.
> The fact that Alsa provides it, supported by a world-wide community, several
> full-time paid developers etc.  I presume ASIO does not provide that
> capability in the host api?

The native ASIO API absolutely requires PA to deliver samples in the 
correct byte order and packing.

It remains unclear to me whether ALSA has the same requirement. It seems 
to hinge on whether hw devices should be directly useable from PA.
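
[To make the "byte order and packing" point concrete, here is a minimal 
sketch of the kind of converter under discussion, assuming packed 3-byte 
big-endian signed 24-bit samples as described for the Audio4DJ. The 
function names are hypothetical; this is not the committed patch and not 
an existing PortAudio converter.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unpack `count` packed 3-byte big-endian signed
   24-bit samples into sign-extended native int32_t values. */
static void be24_to_int32(const uint8_t *src, int32_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        /* Most significant byte comes first in big-endian order. */
        uint32_t v = ((uint32_t)src[3 * i] << 16)
                   | ((uint32_t)src[3 * i + 1] << 8)
                   |  (uint32_t)src[3 * i + 2];
        /* Portable sign extension from bit 23. */
        dst[i] = (int32_t)(v ^ 0x800000u) - 0x800000;
    }
}

/* Hypothetical helper: pack the low 24 bits of each int32_t sample
   into 3 bytes, most significant byte first. */
static void int32_to_be24(const int32_t *src, uint8_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        uint32_t v = (uint32_t)src[i];   /* two's-complement bit pattern */
        dst[3 * i]     = (uint8_t)(v >> 16);
        dst[3 * i + 1] = (uint8_t)(v >> 8);
        dst[3 * i + 2] = (uint8_t)v;
    }
}
```

[The per-sample work is a few shifts and ORs. Whether that is cheaper or 
more expensive than going through 'plug' is exactly what has not been 
measured yet.]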

>> To my view, not handling byte swapping for Alsa in Portaudio is a bug
>> because Alsa API provides declaration for LE and BE formats. If BE
>> format is ignored and not processed then this is the bug. If API
>> provides ability that format can be in BE format then Alsa
>> implementation shall be ready to handle such situation. Using 'plug'
>> to my view is just a workaround for this bug.
> At heart, I think there is perhaps a mis-understanding of the structure of
> Alsa, although I would not claim to be an expert.

There is also some misunderstanding of PortAudio. I will comment on 
this further in reply to Dmitry's next email, but suffice to say this is 
not a one-dimensional problem, and arguing it as if it were a simple 
question of whether or not to support byte swapping is not going to get 
us very far without some guiding principles for ALSA usage and PortAudio.

> I will try to communicate
> this as best I can, taking a step back to look from a distance; these points
> are probably all known to you, but I am putting them in an order so-as to
> follow a line of discussion.
> Sound card hardware varies a lot in its design, capabilities, limitations and
> target market.  Eg those based on the Envy24 chipset (I happen to know these
> chips reasonably well) are 96KHz capable semi-pro devices such as the Delta
> 1010.  These chips ONLY handle playback data as 10 channels of 4-bytes
> containing 24-bit signed pcm values.  It is not possible to send 2 channels,
> or to send 16-bit data in 2 bytes pairs.  Regardless of the operating system,
> drivers, host api, etc that is always what occurs on the PCI bus.  However,
> it can switch hardware clocks to most common rates, whereas many
> motherboard-based systems can only operate at 48KHz.  So a card may have many
> limitations of rate, format, channels etc.

A question here is: to what extent should PA support talking direct to 
devices with limited capabilities.

Other PortAudio implementations span the full range from relying on OS 
level conversion objects (e.g. PA/CoreAudio uses an AU for sample rate 
conversion) down to doing everything it has to itself (ASIO, WDM/KS?).

In general, if there is a "standard way" PortAudio should, by default, 
do it that way.

It seems to me, from what you say Alan, that the "standard way" for ALSA 
is to talk to plughw devices, not direct to raw hw devices.

The question then becomes: to what extent should PA support direct raw 
access to hw devices? I would say, "as much as practical". If it's 
possible to dynamically instantiate ALSA "plug" infrastructure to do the 
heavy lifting then we should do it that way; if not, then maybe having 
byte swappers in PA is not so bad.

> Host apis provide a framework to use audio on the given machine.  Each host
> api provides a mechanism to send data to the sound card through some kind of
> software 'drivers'.  But different apis take divergent approaches to how this
> is done.
> Linux first had OSS, which presents the Envy24 cards as a series of stereo
> devices, 16-bit only (24-bit is not supported), and handles all the
> conversions etc internally invisibly.  I believe some of the Windows hosts do
> similarly?
> However Alsa set out to give more flexibility and capability.  So it is
> separated into at least 2 layers.  At the lowest level are the hardware
> drivers that run the sound card chipsets.  Each type of board has its own
> driver.  They run in what is known as Linux 'kernel-space' and simply expose
> the hardware as is to the Alsa user-space library, which apps can access.
> But the drivers provide no conversions/adjustments to any of the hardware
> capabilities or limitations.
> The Alsa user-space provides a powerful and flexible configuration capability.
> Virtual devices are created that allow access to the card in a way that
> incorporates modification of the pcm stream by means of the 'plugin' layer.
> Config files unique to each type of card are provided as standard, and load
> automatically when the card is opened.  The user can also add his/her own
> features.  So although a system may have just one sound card, it might
> normally have, say, 8 devices for that card, eg hw, plughw, front, side,
> rear, surround41, dmix, default, etc.

As I asked above, it is unclear how much scope PA has for accessing hw 
directly and still using the plug conversion infrastructure.

Dmitry also raises a valid point about the overhead of plug. Although I 
am doubtful about the overhead myself, I think this needs to be 
clarified. The case needs to be established for why PA is currently 
targeting hw devices.

> The 'hw' device provides access direct to the hardware with all its
> limitations, which could be fixed rate, fixed number of channels, fixed
> format etc.  Those that work with Alsa should understand that fact, which is
> why it puzzles me that guys at Mixx would be pushing for certain direct
> hardware limitations to be worked around in Portaudio.

It seems reasonable to me that low level direct exclusive access should 
be supported by PA *as an option*. Other host APIs also have sample rate 
and channel limitations, so if byte order and packing are the only issues 
then it doesn't seem like a gross violation to me to support this. But 
it should be put forward as embodying the representative approach and 
reason for being of PortAudio.

> On the Envy24 cards,
> stereo cannot be played out of the hw device at all.  But Alsa provides the
> virtual devices that achieve all the functionality of other host apis (and
> more) as a result of the configured plugins.  So 'front' on an Envy24 card
> includes channel mapping using the 'route' plugin so it can handle stereo
> streams.  The configurations can be stacked so nowadays the 'default' device
> includes multiple-stream mixing through 'dmix', with format, rate, channel
> conversion through the 'plug' plugin.  It can get rather complex.

As I asked above, is it possible for PA to dynamically assemble plugins? 
Or is it intended that this is outside the scope of using ALSA?
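
[For reference, the channel mapping Alan describes for the Envy24 case 
amounts to something like the sketch below: stereo frames are spread into 
the fixed 10-channel, 4-bytes-per-sample frames the hardware accepts. 
This only illustrates the idea; it is not the ALSA 'route' plugin code. 
Placing left/right on channels 0 and 1 is my assumption, and the 
position of the 24-bit value inside each 4-byte slot is left untouched 
because it was not stated.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { HW_CHANNELS = 10 };  /* fixed playback frame width quoted for Envy24 */

/* Hypothetical helper: expand `frames` interleaved stereo frames into
   interleaved 10-channel frames. Left/right go to channels 0 and 1
   (an assumption); the remaining channels are set to silence. */
static void stereo_to_hw_frames(const int32_t *stereo, int32_t *hw,
                                size_t frames)
{
    assert(stereo != NULL && hw != NULL);
    /* Zero every output sample first so unused channels are silent. */
    memset(hw, 0, frames * HW_CHANNELS * sizeof *hw);
    for (size_t f = 0; f < frames; ++f) {
        hw[f * HW_CHANNELS + 0] = stereo[2 * f + 0];
        hw[f * HW_CHANNELS + 1] = stereo[2 * f + 1];
    }
}
```

[If PA had to do this itself for every raw hw device, it would also need 
to know the channel layout of each card, which is the information the 
per-card ALSA config files already carry.]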

> That is why it is NOT a poor workaround to use Alsa plugins; they are *THE*
> provided mechanism for doing all the conversions that other host apis may
> hide, including endian conversion.  In the case of the Audio4DJ, when running
> under Windows apis the card must still be using _BE format itself, so the
> apis (except ASIO) must be doing the conversion for the hardware.  It's just
> that they don't (?) also show the raw _BE device as well, which Alsa does.  I
> don't think any of the Portaudio converters are needed when using Alsa
> through plugins.

[as an aside: pa converters may be needed for float formats if we wish 
to assert the PA int<->float scaling rules, whatever they may be now or 
in future]

> There are 2 side notes:
> a) if the Audio4DJ does not work with the plugins in place, that would be an
> Alsa bug.

This needs to be established.

> b) it may be that Portaudio-alsa should show the 'plughw' device instead of
> deliberately ignoring it, although the PA_ALSA_PLUGHW method exists.

This sounds like it might be the case to me. It seems like there are so 
many potential ALSA "devices" that it does not present a particularly 
usable picture to the user. The namespace seems to confuse things quite 
badly. Having hw and plughw in the same list for example would be quite 
confusing to a user. Is there any way to impose a namespace? For example 
perhaps PA could list the plughw device by default and use a host-API 
specific flag for accessing hw directly, if that is deemed necessary.
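
[A rough sketch of what I mean, assuming the usual ALSA naming where 
"hw:CARD,DEV" and "plughw:CARD,DEV" refer to the same card and device. 
The AccessMode enum and both functions are hypothetical and not part of 
the PortAudio API; the point is only that the choice could be a flag 
rather than two entries in the device list.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical access modes; not part of the PortAudio API. */
typedef enum { ACCESS_PLUG, ACCESS_RAW_HW } AccessMode;

/* Hypothetical helper: write "plughw:C,D" (default) or "hw:C,D" into buf.
   Returns 0 on success, -1 if the buffer is too small. */
static int build_alsa_pcm_name(char *buf, size_t size, int card, int device,
                               AccessMode mode)
{
    assert(buf != NULL && card >= 0 && device >= 0);
    const char *prefix = (mode == ACCESS_RAW_HW) ? "hw" : "plughw";
    int n = snprintf(buf, size, "%s:%d,%d", prefix, card, device);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Hypothetical helper: true if `name` addresses the raw hw device,
   so a device list could hide or tag such entries. */
static int is_raw_hw_name(const char *name)
{
    assert(name != NULL);
    return strncmp(name, "hw:", 3) == 0;
}
```

[The resulting string is what would be handed to the ALSA open call. 
Whether hiding hw behind a flag is acceptable to existing users of 
PA_ALSA_PLUGHW is a separate question.]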

> Sorry that my explanation is long, and never as clear as my understanding.

Thanks for taking the time to write. It is best that we document and 
clarify things.

