This adds long term goals for rodio #654

Draft: wants to merge 1 commit into master

Conversation

dvdsk
Collaborator

@dvdsk dvdsk commented Dec 6, 2024

Rodio has existed for about 9 years. When it was written there were no audio libraries for Rust. Therefore rodio can do everything, but not everything perfectly. A lot has changed since then, and new libraries have popped up for specific goals such as game audio (kira) and digital signal processing. This is a good thing in my opinion: when developing a library you have to make choices, and they exclude some use cases.

When I started maintaining rodio I had a short exchange with the bevy devs (1/3 of rodio downloads are from bevy) about their planned move from rodio to kira. Note that they have not migrated to kira as of this writing. See bevyengine/bevy#9076 (comment)

I also had a short exchange with the kira dev to see if kira could support all audio use cases. That seems not to be the case. See: tesselode/kira#87

Over the past few weeks we have collected the use-cases for rodio, see issue #626.

After talking to the bevy dev and kira dev I made some goals for rodio in my head. Now that rodio is (very) actively maintained we need to write those down and discuss them. That way we do not make contradictory decisions in rodio.

Please let me know what you think.

@dvdsk dvdsk marked this pull request as draft December 6, 2024 17:38
@nednoodlehead

As someone who has an actively developed music app, there is only one more thing that I'm hoping to get out of rodio: as I mentioned in my comment, I am currently unable to apply a crossfade-like effect to my music.
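
To make the request concrete, here is a minimal sketch of what a crossfade does to two overlapping sample buffers. It uses a plain linear (equal-gain) ramp and a hypothetical `crossfade` function; it is not a rodio API, and a real implementation would have to work on streaming sources rather than slices.

```rust
/// Hypothetical helper, not part of rodio: linearly crossfades from
/// `outgoing` to `incoming` over the overlapping region of the two slices.
fn crossfade(outgoing: &[f32], incoming: &[f32]) -> Vec<f32> {
    let n = outgoing.len().min(incoming.len());
    (0..n)
        .map(|i| {
            // t runs from 0.0 (only the outgoing track) to 1.0 (only the incoming track).
            let t = if n > 1 { i as f32 / (n - 1) as f32 } else { 1.0 };
            outgoing[i] * (1.0 - t) + incoming[i] * t
        })
        .collect()
}
```

The tail of the current song would be passed as `outgoing` and the head of the next song as `incoming`; the result replaces both in the output.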

Again, I don't have a lot to offer in terms of implementation, but I would be open to discussing potential ideas and brainstorming how this could work, since I consider how I built my app to be a gold standard for how music apps should work (all music is in one big vec, and songs are chosen at random or in order).

I would be happy to do a breakdown of my implementation and how it works with the rest of the app. I would assume it would be pretty helpful to people making a similar app. My implementation is in iced, but it could be broken down in a way where it could be applied to more gui / tui apps in a general sense.

I also wouldn't mind doing tutorials / docs for general use-cases. Things for the average user.

@PetrGlad
Collaborator

PetrGlad commented Dec 7, 2024

Regarding that Bevy thread.
I wonder what exactly makes rodio not real-time safe? CPAL pulling a sample does initiate processing, but I think that should be mitigated by output buffering. It seems that there were no specific efforts to limit the time it takes to produce a sample; is that a problem? I would not be surprised if some artifacts are caused by bugs, not by thread scheduling per se. Sink does seem to do too much for my taste, but its delays seem to be predictable.

Yes, dyn dispatch will help with dynamic config. The only question is whether rodio is used on low-power devices and whether they need special optimization. Maybe using dyn and atomics would do.

Rodio is bound to CPAL, although the only real dependency is in the output stream.

@PetrGlad
Collaborator

PetrGlad commented Dec 7, 2024

I would maybe also clarify if rodio aspires to be more portable or cpal remains a requirement.

@dvdsk dvdsk mentioned this pull request Dec 10, 2024
@dvdsk
Collaborator Author

dvdsk commented Dec 10, 2024

Regarding that Bevy thread.

When that was written I had only been maintaining rodio for a short while. The same is true of when I asked the Kira dev whether Kira could support all playback use cases. Now, one year later, I have my doubts whether game audio and general playback are really separate use cases that need specialist libraries. Looking at the arguments made:

cannot be done dynamically unless all combinations are present at all times. This hinders attempts at building dynamical audio systems, which might want to add and remove processing at runtime

I wonder if they investigated the overhead of Kira's audio pipeline. My guess is Kira does something like apply a Vec of Box<dyn Effect> to each frame/chunk/span, Effect being some function performed on a frame/chunk/span (let's get that terminology down). Maybe that's faster on average than a rodio pipeline when most effects only need to be sporadically enabled, but I would love to see a benchmark on that.
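
For reference, the guessed design could look roughly like the sketch below. It only illustrates the guess above: the `Effect` trait, `Track`, `Gain` and `HardClip` are hypothetical names and are not taken from Kira's or rodio's actual code.

```rust
/// Hypothetical effect trait: some function performed on a chunk of samples.
trait Effect {
    fn process(&mut self, chunk: &mut [f32]);
}

/// Multiplies every sample by a fixed factor.
struct Gain(f32);

impl Effect for Gain {
    fn process(&mut self, chunk: &mut [f32]) {
        for s in chunk.iter_mut() {
            *s *= self.0;
        }
    }
}

/// Clamps every sample to the range [-limit, limit].
struct HardClip(f32);

impl Effect for HardClip {
    fn process(&mut self, chunk: &mut [f32]) {
        for s in chunk.iter_mut() {
            *s = s.clamp(-self.0, self.0);
        }
    }
}

/// A track that owns a list of boxed effects which can change at runtime.
struct Track {
    effects: Vec<Box<dyn Effect>>,
}

impl Track {
    /// One dynamic dispatch per effect per chunk, not per sample.
    fn process(&mut self, chunk: &mut [f32]) {
        for effect in self.effects.iter_mut() {
            effect.process(chunk);
        }
    }
}
```

In this shape, adding or removing an effect at runtime is a push or pop on the Vec, and the dispatch cost is paid once per chunk. Whether that beats a statically composed pipeline is exactly what a benchmark would have to show.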

as audio processing is at the mercy of the OS descheduling the audio thread

As far as I know that is always the case: the OS can always decide to deschedule your thread. You can influence it, and maybe that's something kira does, but on most systems you need a very special kernel and sudo/admin rights to get realtime perf.

Kira has a mixer feature, with flexible routing, which is much more conducive to usage in games. Its tracks are the place to put effects, and tracks can be created at runtime, which means you can both do bus processing and individual effects processing as well.

You can do that in rodio too, though it might be more cumbersome.

Effects and tracks can be controlled separately from the audio thread structures, which means not having to deal with explicit synchronization; instead Kira uses controllers with command queues that are applied on each audio callback

This is periodic_callback with an mpsc::recv() as its body and a period of zero.
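
A minimal sketch of that pattern, using only the standard library: a command queue that is drained at the start of every audio callback. `Command`, `AudioState` and `fill` are hypothetical names for illustration, not rodio or Kira APIs. The sketch also assumes the non-blocking `try_recv` rather than `recv`, so the callback never waits on the control thread.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical control messages sent from the application thread.
enum Command {
    SetVolume(f32),
    Pause(bool),
}

/// State owned by the audio thread; no locks are needed to change it.
struct AudioState {
    volume: f32,
    paused: bool,
    commands: Receiver<Command>,
}

impl AudioState {
    fn new() -> (Sender<Command>, AudioState) {
        let (tx, rx) = channel();
        let state = AudioState { volume: 1.0, paused: false, commands: rx };
        (tx, state)
    }

    /// Called once per audio callback: apply queued commands, then render.
    fn fill(&mut self, out: &mut [f32], input: &[f32]) {
        // Drain everything that arrived since the previous callback.
        while let Ok(cmd) = self.commands.try_recv() {
            match cmd {
                Command::SetVolume(v) => self.volume = v,
                Command::Pause(p) => self.paused = p,
            }
        }
        for (o, i) in out.iter_mut().zip(input) {
            *o = if self.paused { 0.0 } else { *i * self.volume };
        }
    }
}
```

The application keeps the `Sender` and calls `send` from any thread; changes take effect at the next callback boundary, which is why this alone does not give sample-accurate timing.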

and supports sample-accurate automation by providing an instant at which to apply the change, or locking to a clock for musical time-based events

We do not have an API like that. I can think of a rough way to implement it right now, but Kira, being purpose-built for this, might have a better answer.
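
One rough way to picture sample-accurate automation is sketched below: the change carries the absolute sample index at which it must take effect, and the processing loop checks that index per sample. `ScheduledGain` is a hypothetical type for illustration only; it is neither the rough implementation mentioned above nor Kira's approach.

```rust
/// Hypothetical gain stage that can switch gain at an exact sample index.
struct ScheduledGain {
    gain: f32,
    /// Number of samples processed so far.
    position: u64,
    /// Pending change as (absolute sample index, new gain).
    pending: Option<(u64, f32)>,
}

impl ScheduledGain {
    fn new(gain: f32) -> Self {
        ScheduledGain { gain, position: 0, pending: None }
    }

    /// Request that `new_gain` applies from sample `at_sample` onwards.
    fn schedule(&mut self, at_sample: u64, new_gain: f32) {
        self.pending = Some((at_sample, new_gain));
    }

    fn process(&mut self, chunk: &mut [f32]) {
        for s in chunk.iter_mut() {
            // Apply the pending change once its instant has been reached.
            if let Some((at, new_gain)) = self.pending {
                if self.position >= at {
                    self.gain = new_gain;
                    self.pending = None;
                }
            }
            *s *= self.gain;
            self.position += 1;
        }
    }
}
```

Because the change is keyed on the sample counter, it lands on the same sample regardless of how the stream happens to be split into chunks.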

@dvdsk
Collaborator Author

dvdsk commented Dec 10, 2024

I would maybe also clarify if rodio aspires to be more portable or cpal remains a requirement.

Cpal is pretty portable, right? Though I would love to get rid of the dependency on libasound-dev that we have through cpal, since that makes cross-compiling a chore. It might be worth decoupling us from it.

@dvdsk
Collaborator Author

dvdsk commented Dec 10, 2024

Yes, dyn dispatch will help with dynamic config. The only question is whether rodio is used on low-power devices and whether they need special optimization. Maybe using dyn and atomics would do.

Using dyn dispatch blocks so many compiler optimizations though. Matching on an enum of different implementations can be about 10x faster than using dyn [1]. Whether an effect is enabled or disabled is just a bool in rodio. Checking a bool must be a hell of a lot faster than using dyn dispatch.
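
The two alternatives mentioned here can be sketched as follows: an enum whose variants are matched statically, with each effect carrying an `enabled` flag. All names (`AnyEffect`, `Amplify`, `Invert`, `run_chain`) are hypothetical and only illustrate the idea; this is not how rodio's sources are actually laid out, and the sketch makes no claim about measured speed.

```rust
/// Hypothetical effect that scales a sample when enabled.
struct Amplify {
    factor: f32,
    enabled: bool,
}

/// Hypothetical effect that flips the sign of a sample when enabled.
struct Invert {
    enabled: bool,
}

/// Closed set of effects: the compiler sees every implementation,
/// so the match arms can be inlined, unlike a call through `dyn`.
enum AnyEffect {
    Amplify(Amplify),
    Invert(Invert),
}

impl AnyEffect {
    #[inline]
    fn apply(&self, sample: f32) -> f32 {
        match self {
            // Enabling or disabling an effect is just a bool check.
            AnyEffect::Amplify(a) => {
                if a.enabled { sample * a.factor } else { sample }
            }
            AnyEffect::Invert(i) => {
                if i.enabled { -sample } else { sample }
            }
        }
    }
}

/// Runs one sample through every effect in order.
fn run_chain(effects: &[AnyEffect], sample: f32) -> f32 {
    effects.iter().fold(sample, |s, e| e.apply(s))
}
```

The trade-off is that the set of effects is fixed at compile time, which is the limitation the Bevy thread objected to.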

whether rodio is used on low power devices, and do they need special optimization

It is, and yes they do (at least I do). On low-power mobile devices every bit of CPU you take costs battery life. Users tend to listen to audio for hours on end, so a small difference in perf is noticeable.

Footnotes

  1. https://docs.rs/enum_dispatch/latest/enum_dispatch/

@roderickvd
Collaborator

Regarding that Bevy thread.

When that was written I had only been maintaining rodio for a short while. The same is true of when I asked the Kira dev whether Kira could support all playback use cases. Now, one year later, I have my doubts whether game audio and general playback are really separate use cases that need specialist libraries. Looking at the arguments made:

I did not know about Kira before. Looking through its documentation, I also have the feeling it would fit.

cannot be done dynamically unless all combinations are present at all times. This hinders attempts at building dynamical audio systems, which might want to add and remove processing at runtime

I wonder if they investigated the overhead of Kira's audio pipeline. My guess is Kira does something like apply a Vec of Box<dyn Effect> to each frame/chunk/span, Effect being some function performed on a frame/chunk/span (let's get that terminology down). Maybe that's faster on average than a rodio pipeline when most effects only need to be sporadically enabled, but I would love to see a benchmark on that.

From librespot discussion experience, I can recommend defining the potato spec: what's the lowest-performing hardware that you are willing to support? It helps guide decision-making. For example, in librespot we said it's a Raspberry Pi Zero - but to use dithering and limiting, you need a Zero 2W.


Picking up from #626, I use Rodio as a cross-platform, fire & forget audio playback library. The only DSP I use is volume attenuation and AGC, although I'd like to use fades more easily, as I'll point out below.

While integrating Rodio more tightly into pleezer, I found a couple of things that I think could use improvement:

  • (context for the points below): For gapless playback, I concluded you pretty much need to use queue. append_with_signal is a much-needed feature because from a streaming point of view, it's the one cue (heh) to start downloading the next track and append it to the queue.

  • It's tricky though to reliably get the current playback position. Sink::get_pos gets you the latest playtime since it last seeked to some position, or since it first started, whichever comes first, so you need append_with_signal and try_seek callers to cache that playtime and subtract get_pos from it to get the real position.

    Beyond the ergonomics of it, I'd also prefer to get the position from the decoder (should it be that kind of Source), because the true position after seeking may be slightly before or after the seek target.

  • A few things I understand technically, but found unintuitive:

    • SourcesQueueInput::clear clears any upcoming sources, but not the current one.
    • When the queue is set to output silence until it gets a new item in the queue, Sink::empty is false.
    • Calling Sink::clear drops the SourcesQueueOutput and sets the playback state to stopped.

Again, technically I understand, but here I think we could find a more ergonomic interface from the point of view of "maintaining a playlist" and applying DSP not only on individual sources, but on the audio output.

  • The Symphonia decoder in particular offers a lot of functionality that would be worth exposing. For example, (optionally) coarse seeking helps HTTP streaming very much. The MediaSource could be provided with more metadata like file size if known, which helps duration calculations. And in the "nice to have" department, track Metadata could offer useful metadata like replay gain.

  • Some more "nice to haves":

    • DSP optionally in the log domain (volume, gain ramps)
    • Optional quick fade-out / fade-ins on volume or position changes to reduce pops & clicks
    • Optional dithering
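
To illustrate the first two "nice to haves", here is a minimal sketch of gain handled in the log (decibel) domain, plus a short ramp that is linear in dB. The conversion uses the standard amplitude relation gain = 10^(dB/20); `db_to_linear` and `apply_db_ramp` are hypothetical helper names, not rodio functions, and dithering is left out.

```rust
/// Converts a gain in decibels to a linear amplitude factor.
fn db_to_linear(db: f32) -> f32 {
    10.0_f32.powf(db / 20.0)
}

/// Applies a gain ramp that moves linearly in dB from `start_db` to `end_db`
/// across the buffer. A short ramp like this can soften volume jumps.
fn apply_db_ramp(samples: &mut [f32], start_db: f32, end_db: f32) {
    let n = samples.len();
    for (i, s) in samples.iter_mut().enumerate() {
        // t goes from 0.0 at the first sample to 1.0 at the last one.
        let t = if n > 1 { i as f32 / (n - 1) as f32 } else { 1.0 };
        let db = start_db + (end_db - start_db) * t;
        *s *= db_to_linear(db);
    }
}
```

For example, `apply_db_ramp(&mut buf, -20.0, 0.0)` fades a buffer in from one tenth of its amplitude to full amplitude.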

@dvdsk
Collaborator Author

dvdsk commented Jan 15, 2025

@roderickvd

Thank you very much for your detailed points both here and in the user stories thread. I agree with everything you said, and as I see it, it should all be doable 🎉. Some things should already work, so those are bugs; others fit best in a new Player struct that would replace Sink (planned); and yet others will need some new Sources or additions to the Source trait. Feel free to open issues for all of these if they do not exist yet, then we can discuss the details there.


4 participants