Skip to content

Audio setup dialog #32

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 45 commits into
base: master
Choose a base branch
from

Conversation

xyzzy42
Copy link
Contributor

@xyzzy42 xyzzy42 commented Mar 28, 2021

This is an audio setup dialog that allows setting the device and rate. This includes support for changing the same rate at runtime from 44.1 kHz too. Device and rate are saved to config file.

I did this quite a long time ago, before the other audio setup PR #21 was made, and finally got around to finishing it and porting it to my other work.

I've tested this on Linux with ALSA and on OS X and it works on both.

All my other outstanding PRs show up here too since it builds on them. Only the last 6 commits are new. I don't know of a way to make them not appear in this PR and yet have the PR show up in @vacaboja 's repo, where I can't create branches.

Create some utilities for creating menu items so there isn't so much
copy and paste of the code.
It's quite large.  Split it into new -functions that create the various
parts of the output panel.

The waveforms (tic, toc, period, debug), the paper strip chart and its
controls, and the window under the output display that contains them,
each get their own function.

There is some minimal support for a vertical vs horizontal paperstrip
chart, but it's not used yet.
This lets the size of the paperstrip vs the waveforms be adjusted by
sliding the pane between them.

To get the layout right, we need to give the waveforms a minimum size.
300x150 seems about the minimum.  It should look basically the same as
it did before.

Each snapshot has its own unique pane position.  Adjusting one tab in
the notebook doesn't adjust the other tabs.  It might be nice if that
weren't the case and they were all synced.
When light mode is toggled, it's possible for the buffer pointer or
timestamp to get messed up if it happens while the callback is running.

The callback will read the current values when it starts, then light
mode switch resets them to zero, and then when the callback finishes it
updates them based on the original values, effectvely undoing to reset
to zero.

This doesn't cause a crash, but it does mean the timestamp isn't quite
right.  It also will cause problems if any code wants to process the
audio data as it arrives.

It could cause a single callback to be processed with the incorrect high
pass filter.

The easiest way to fix this is to pause the input stream while doing the
mode switch.  It could be fixed by making the entire callback a critical
section or having the mode switch done asyncrounsly via the callback,
but this makes the callback more complex and adds overhead.  It seems
better to add the extra work to the mode switch rather than the
callback, since the former is a rare operation that isn't performance
critical while the latter is.
Silences a compiler warning.  This could actually happen if the save
file had no snapshots.  It should always have some, but format does
allow for zero, which would probably crash.
I measured this as reducing CPU usage from ~78% to ~45%.  Almost
doubling the speed, which is expected as described below.

Tg processes the audio in a series of steps that double in size:  2, 4,
8, and then 16 seconds long.  This is the one to four dots shown in the
display.

Tg starts at step 1 (2 seconds) and goes up, stopping if a step doesn't
pass.  Data from smaller steps isn't used if a larger step passes.

This means when running at a constant step 4 with good signal, CPU usage
is almost double what it needs to be, since steps 1-3 are processed and
unused and add up to 14 seconds, almost as much as step 4's 16 seconds.

Change this to start at the previous iteration's step.  With good
signal, only step 4 will be processed and CPU usage is cut almost in
half.  If the previous step fails, it will try smaller steps.  If it
passes and is not yet the top step, it will try larger steps.

End result should end up on the same step as before, but get there
sooner.  It could be slower if the step drops a lot, e.g. from 4 to 0,
but this happens far less often than the step saying the same or nearly
the same.

To do this, I moved the step logic out of analyze_pa_data() and entirely
into compute_update().  analyze_pa_data() will now just processes one
step and compute_update() decides which step(s).  Previously,
analyze_pa_data() did all steps and compute_update() decided which step
to actually use.

There is a small change to the algorithm.  To be good, a step needs to
pass a number of changes done in process().  Then there is one more
check, of sigma vs period, done in compute_update().

Previously a step didn't need to pass the sigma check and it still
counted enough to increase last_tic and show up as a signal dot in the
display.  But the data wasn't actually used unless it passed to sigma
check too.

I no longer keep track of "partially" passing steps like this.  Either
it passes all checks, including the sigma check, or not.  last_tic and
signal level only count fully passing steps.

I see no practical difference in my tests, but I think it could show up
with some kind of marginal signal that has a high error in period
estimation with the longer steps.
Previously max signal level (4 dots) but with no amplitude measured was
counted as signal level 3.  But level 3 or lower with no amplitude sill
counts as the same level.

Have compute_update() no longer do this signal level adjustment and just
report the level used, which indicates the averaging interval.

The dot graphic will now indicate "no amplitude" by using a hollow dot
for the final signal level's dot.

This way the number of dots always shows the averaging interval and a
hollow dot always shows that the signal is too poor to measure
amplitude.
@xyzzy42
Copy link
Contributor Author

xyzzy42 commented Apr 7, 2021

What the audio setup dialog looks like:
image

xyzzy42 added 2 commits April 8, 2021 21:27
This was the last one from systemd before they switched to Meson.
Needed to update some of the autoconf script to match a change w.r.t.
CFLAGS in the upstream attributes.m4.
Another AC_LANG_SOURCE warning that wasn't fixed yet.
@xyzzy42 xyzzy42 force-pushed the trentpi/audio-setup-dialog branch from f8d76fe to 7b09299 Compare April 12, 2021 10:37
@xyzzy42 xyzzy42 marked this pull request as ready for review April 12, 2021 10:39
xyzzy42 added 3 commits April 13, 2021 13:41
Periods (two beats) longer than 500 ms were rejected as too long.  This
effectively limits the low end of BPH to 14400.  Any slower and the
correct period length is greater than 500 ms, e.g. 12000 BPH is a 600 ms
period.

Change the limit to be 20% slower than the nominal value for the BPH.
If the BPH is set, then use that, when BPH is guessed, use MIN_BPH.

A MIN_BPH of 12000 will increases the max period to 720 ms when
guessing.  It could be longer if one were able to set a slower than
MIN_BPH period length.
Allow lower BPH values.  Create another setting that affects the BPH
guessing code so it doesn't get confused by fractions of faster beats.

Lowest value is 2.25 Hz or 8100 BPH.  Lower rates do not work well or at
all.

Tg uses a minimum 2 second detection window and needs to see two periods
in this window.  So that it can see the 1st period matches the 2nd
period and the time between them is about what it should be from the
BPH.  In order to get two complete periods (two beats each) in a 2
second window, one needs 7200 BPH.  Below this and there might only be
one period in the window and nothing to detect.

But even at 7200, the beats at the beginning and end of the window might
be cut off, so it's necessary to go a little faster to be reliable.
This works on Fedora, Redhat, CentOS, etc.  It also works for
autobuilding on COPR.

It will regenerate the autoconf code, which allows it to work from the
git tree, i.e. without the configure script that is only in the release
tarballs.

The version number will be injected automatically, both when building
with autoconf and on COPR, which was quite a challenge.

When the configure script is run, it will preprocess the spec file and
an include file for it and inject the version.  The spec file's macro
logic will detect that this has been done and use it.

But on COPR the SRPM is created by rpkg without running configure, and
there is no way to add extra commands to do it, so this won't work.
However, rpkg has its own preprocessing systems for spec files and so
that is used to inject the version number.  The spec file macro logic
will use this if the autoconf injected version is not present.

It's also possible to specify the version manually on the rpmbuild
command line by defining the macro "version".

If pkgrel is defined then a release is being generated and the RPM
will be the release specified in pkgrel.

If pkgrel is not defined, then it's assumed a snapshot is being built
and the git date and commit are added following the Fedora guidelines
for snapshots.  Again, this needs to be done in two different ways to
support building from the git checkout directly or using COPR, which
builds binary RPMs from an SRPM rather than from a git checkout.
@xyzzy42 xyzzy42 force-pushed the trentpi/audio-setup-dialog branch from 7b09299 to 21a5eab Compare April 18, 2021 07:59
xyzzy42 added 13 commits April 18, 2021 01:11
This allow changing the layout so the paperstrip is horizontal,
scrolling to the left, with the three waveforms arraigned horizontally
below the paperstrip.

The effect is to make the paperstrip longer and give it more space.

For many tasks the paper strip is much more useful than the waveform
displays.  This allows adjusting the layout to be more useful in those
cases.

A new checkbox in the menu is added to control this.  It can be changed
on the fly and the widgets will be re-arraigned.

A config file option, "vertical_paperstrip", is added to remember the
current setting.  Default vertical layout to true if not present in
config file.

If multiple snapshots are present, the change of orientation will affect
all of them, rather that just the current viewed one.  It would be
possible to change this.
Using the size of the main app window, what was done, doesn't work so
well now that the space allocation between the paperstrip and waveforms
can be changed.

This way making the waveforms (or paperstrip) smaller will give a
smaller font that fits in the smaller space better.
Create a new struct for the output_panel snapshot display parameters.
Instead of being mixed into the snapshot data from the computer, there
will just be a pointer to the display parameter struct in the snapshot.

The clone and delete snapshot functions will clone and delete the
display parameters, if they exist.  The computer's snapshots don't have
display parameters so the pointer will just be NULL for them.

Each time a new snapshot is passed from the computer to the interface,
the pointer to the display parameters can be kept so that each parameter
doesn't need to be copied.

paperstrip_draw_event() will allocate a new display parameter when it
gets a snapshot for the first time.

This will make it a lot easier to add new display parameters just by
adding them to the struct.  And they can be initialized in the output
panel code that uses them and knows what they should be.  The computer
doesn't need to know anything about them.

This does break the savefile format slightly since the struct field
names are coded into the savefile.  Maybe that design has some drawbacks
when it comes to backward compatibility.  In this case the trace
centering parameter is the only one broken and it doesn't really matter.
Since the space allocation between the paperstrip and waveforms can be
changed now, using just the main window size to control font size
doesn't work as well.  E.g., a small waveform in a large app window that
is mostly paperstrip will still get a large font, because the app window
is large.

Also adjust font spacing some.  Use the vertical size of the font to
adjust legend in paperstrip and provide consistent margins in waveforms.
Doesn't depend on state having the widget saved or what name is used
this way.  Also avoids unused parameter warning.
Add a pair of buttons that overlay the paperstrip.  One is a Gtk slider
button which allows zooming in and out with a slider.  The other will
reset to the default zoom.

This zoom is just zoom the chart in the horizontal direction, i.e. error
in each beat.  Not vertically, which would show more or fewer beats at
once.

The zoom ranges from the 1.0 to 0.01.  E.g., for a 10 beat/sec movement,
the chart's width will be at most 100 ms, an entire beat, to 10 ms, the
default, down to 1 ms.

The forced switch from 0.1 in normal mode to 0.01 in calibration mode is
removed.  If one wants to zoom in for calibration, then one is free to
do that.

Due to the way the paperstrip is drawn, only zoom values that are exact
integer fractions will work.  E.g., 0.5 = 1/2, 0.2 = 1/5, and 0.10 =
1/10, will all work.  But 0.12 = 1/8.333 will not, because 8⅓ is not an
integer.  This makes the initial zoom jumps quite large.
Re-write most of the code for generating the paperstrip.

A new algorithm is used to place the dots which has an advantage in that
it can display correctly with any level of zoom, not just at integer
scales.

What we do is find each event's relative position from the one before
it.  A relative offset of 0 would mean the event is exactly a multiple
of the beat time from the previous event.  A positive or negative value
means it is earlier or later than expected.  We can then scale this
offset by any value.

Since only the relative position between events is used there is no
longer a problem when the accumulated offset wraps around a beat length
but not the chart width.

With all event positions relative, we need some absolute position from
which to start.  This is done by saving the absolute offset of the most
recent event as an anchor.  The next display will place the previous
anchor at the same absolute offset and update the anchor to the new most
recent event.  This way the dots do not shift side-to-side as they
scroll.

Having done this, much of the other code relating to the paperstrip
becomes simpler and there is a net reduction in lines of code.

The dots are placed in nearly the same positions as before, with one
significant difference.  The slope is reversed, i.e. a movement running
fast will slope up and to the left.  This works better for the math, and
appears to match what other timegraphers do.
The snapshot display parameters should be initialized to NULL.

Really, it would be better to always use calloc() so that every field is
automatically initialized to zero/NULL rather than needing to remember
to manaully do it for each field.
Check to see if the computer has figured out a rate, beat error,
amplitude, etc. before drawing the rate slope lines.  Otherwise the
value of rate is uninitialized.
When at the default zoom scale, hide the zoom reset button.  No need to
have it on the screen since it won't do anything.  And it tells you that
the scale is already at default.

I noticed the zoom reset button in Chrome disappeared when at 100% and
liked this feature.
Rounding the current time down might cause the newest event to appear to
be from the future.  That confuses the paper strip math.  Round up so
that doesn't happen.
Turn a boolean vertical flag into a gtk orientation.  Was tired of
repeating this long expression.
Rotating the paperstrip just does the graphics, not the GUI widgets, so
they weren't working as well in horizontal mode.  Add some code to pack
and orient them well for either vertical or horizontal layout mode.
xyzzy42 added 2 commits April 18, 2021 11:16
This way settings that affect audio can be read in before audio starts.

Since the config file is now read before starting audio, it is possible
to start in light or normal mode as configured.

If light mode was configured, then previously audio would be started in
normal mode and then immediatly switch to light mode after processing a
few callbacks.
When there is just one period in the processing buffer it is not
possible to calculate the sample standard deviation (sigma).  In this
case, the period value was being used as the sigma estimate, which
effectively gives a huge sigma when there is 1 period in the buffer.

This results in the processing buffer being rejected as bad and a larger
buffer, which would have multiple periods, is never tried.

One period per buffer would happen with a combination of long period and
short buffer, e.g. a small BPH in light mode, since light mode uses
buffers half as long as normal mode.

Use 0 as the sigma value for 1 sample.  This means the buffer will pass
the sigma check and a larger buffer will be tried.
@xyzzy42 xyzzy42 force-pushed the trentpi/audio-setup-dialog branch from 21a5eab to f852bb7 Compare April 19, 2021 01:22
xyzzy42 added 12 commits April 18, 2021 20:11
This moves the high pass filter out of prepare_data() and into the audio
callback.

prepare_data() will process the same data many times, up to 300 times,
as it is repeatedly called to processes buffers that mostly overlap the
last set of buffers.

Running the HPF on incoming audio will process about 300x less data
without changing the result.  This produces about a 4% overall speed
improvement.

In the next step in the audio processing, the noise filter, each sample
depends on every other sample in the buffer, so all samples need to be
re-processed each time the buffers change.  So it's not possible to do
more processing past the HPF as the audio is received.

I also benchmarked code that would apply the HPF while coping the data
out of from the portaudio buffer to the ring buffer.  I.e., a biquad
filter function that did not modify the data in place, but instead had
separate input and output pointers.  Dealing with the audio
combinations, mono/stereo, normal/light, took more code and there was no
significant speed improvement.

I also tried moving the HPF out of the audio callback and into the
compute thread, while still applying the filter only once to new audio,
but this was slower, doesn't spread the load across multple CPUS as
well, and was more complex.
The buffer used for audio data is a static size based on what's needed
for 32 seconds of data at the fixed 44.1kHz sampling rate.  This doesn't
work as well if one wants to change the sample rate and possibly support
high rates.

Dynamically allocate it.  If the sampling rate were to increase,
re-allocate a larger buffer for it.
Commonly used macro to get the number of items in a statically sized
array.
This changes will make it easier to change audio device or rates after
Portaudio is already started.

The code needed to do this is taken out of start_portaudio() and put
into new functions open_stream() and set_audio_device().  The latter
function is global and can be used to change device or rate.

State needed to keep track of audio device, rate, streams, etc. is
moved into a static global struct.

A different sampling rate than PA_SAMPLE_RATE can be used if
start_portaudio() is passed that value in nominal_sample_rate.
Since the HPF is mostly used by the audio callback, put it into the
callback info struct.

And put the audio callback struct into the audio context struct.
The audio code needs to know, and does know, the sample rate and light
mode, since it is the code that configures portaudio to the requested
sample rate and does the light mode decimation.

Add a method to get the effective sample rate from the audio context,
taking light mode into account.  Code that does with the current audio
state can use that instead of being passed that info.

set_audio_light() doesn't need sample rate passed to it anymore.
fill_buffers() and get_timestamp() don't need light mode passed to them
anymore.
@xyzzy42 xyzzy42 force-pushed the trentpi/audio-setup-dialog branch 2 times, most recently from 8c8a062 to 80f2c41 Compare April 19, 2021 06:31
xyzzy42 added 5 commits April 19, 2021 12:28
A new exported interface, get_audio_device(), get_audio_devices(), and
set_audio_device(), is added to audio.c.

It allows querying what devices are present, what sample rates they
support, what device is active, and changing the active device and rate.

Portaudio's ALSA driver won't let you know what rates are allowed if the
device is currently active, due to the way it tries to re-open the
device to query the rate.

This also causes a problem querying rates on a direct hardware device,
e.g. "hw:0,0", if a virtual device, e.g. "pulse" or "dmix", is active
and using that hardware.

So the common use case I want to do, where "pulse" is the default on
startup and then I want to switch to the direct hardware device to avoid
resampling, and get lower latency and better time stamping of audio,
doesn't work.  Because the hardware device is "in use" through pulse.

To get around this, a predefined list of common rates are queried on
start up and the results cached.  This way it's not necessary to query
devices after audio capture has started.
Lets one select the sound device and the rate.

All devices appear in the list, but non-suitable ones will be grayed
out.

There is a predefined list of common rates and non-supported rates for
the selected device will be grayed out.

The rate combo box allows typing in a rate not in the list and that rate
will be tried.  This design allows for any rate, but doesn't waste too
much time querying the audio devices for all possible rates on startup.

If setting the new device/rate doesn't work, it will automatically go
back to the previous values.

The dialog is modal.  It's necessary to select new settings and hit OK
or Cancel beforing going back to the app.

Mask the dialog handler when the rate list is populated so callbacks
aren't run as new items as added.
This will save the audio device and rate selected to the config file.

Initially the saved device and rate are both codes for "default", so the
current behavior of using the Portaudio default device at 44.1 kHz
remains.  Only if the audio setup dialog is used to select something
different, and that selection is started successfully, does a specific
device and rate get saved.

Additionally, selecting the default device in audio setup is still saved
as "default", rather than the specific device that happens to be the
current default.

If the saved device is not available when starting up it will
automatically revert to the default.
Extend audio interface to allow setting the HPF cutoff frequency when
opening the audio device as well as changing it on the fly.  Add
a method to get the current filter parameters.

Also add ability to disable HPF by changing cutoff to 0.
Add a control to the setup dialog for the HPF cutoff frequency.  Also
save it to the config file.

Slider has ticks for Off and also for the default 3kHz frequency.  The
fill level of the slider is set to limit the cutoff to the Nyquist
frequency of the selected sampling rate.

To do that, I needed to get a callback on rate combobox change, and the
manual entry of the combobox made that a challenge.
@xyzzy42 xyzzy42 force-pushed the trentpi/audio-setup-dialog branch from 80f2c41 to f2f39dd Compare April 19, 2021 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant