Better audio conversion #47

Open
Number969 opened this issue Nov 28, 2023 · 2 comments

Comments

@Number969

Hello!

I couldn't help but notice that the audio quality for the nodes isn't the same as on the PlayStation. As stated earlier in a pull request, using ffmpeg to convert the files applies a low-pass filter to remove aliasing. You could tell ffmpeg to remove the filter, but that ignores why the filter is there in the first place and creates artifacts.

By resampling the audio to 44.1 kHz first with something like r8brain and then converting with ffmpeg, the audio is correctly dithered, no (audible) artifacts are introduced, and the frequency spectrum is preserved.

Here is how Cou001 sounds currently:

LAIN01.XA.0.c7687329.mp4

Here is the same node with my proposed approach:

LAIN01.XA.0.c7687329.mp4

Not only that, but the files are almost the same size: all of the audio-only nodes in the game currently total 166MB, while mine total 170MB.

I'd be willing to send the files if you're interested, considering I already converted them. If not, follow this approach to get the same results.

Cheers!

@Number969
Author

A good alternative would be using ffmpeg with the sox resampler library. While not as exact as r8brain, it produces better results than the default ffmpeg resampler.

So using:

ffmpeg -i "LAIN01.XA[0].wav" -af aresample=resampler=soxr -ar 44100 output.wav

Should give decently accurate results.
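
For converting many files at once, the soxr command above could be wrapped in a small script. This is only a minimal sketch, assuming ffmpeg is on the PATH and was built with libsoxr support; the function names and the directory layout are hypothetical and not part of the project. Passing the arguments as a list means the shell never sees the `[0]` in the file names, so it cannot be misread as a glob pattern.

```python
import subprocess
from pathlib import Path


def build_soxr_command(src, dst, rate=44100):
    """Return the ffmpeg argument list for a soxr resample to `rate` Hz."""
    return [
        "ffmpeg",
        "-i", str(src),
        "-af", "aresample=resampler=soxr",  # same filter as the command above
        "-ar", str(rate),
        str(dst),
    ]


def convert_all(src_dir, dst_dir, rate=44100):
    """Resample every .wav in src_dir into dst_dir (hypothetical layout)."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).glob("*.wav")):
        cmd = build_soxr_command(src, dst_dir / src.name, rate)
        subprocess.run(cmd, check=True)  # raises if ffmpeg exits non-zero
```

Calling `convert_all("wav_in", "wav_44k")` would then write one resampled file per input, keeping the original file names.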

@spaztron64
Collaborator

Indeed, as the PR notes, ffmpeg currently downsamples the input to a lower sampling rate and leaves the output to be interpolated by the client's output device. Since the PS1's output frequency is 44.1 kHz, as is the out-of-the-box output frequency of most consumer-grade computers today, ffmpeg should be configured to output its converted data at that sampling rate to prevent resampling inconsistencies.

As for which resampling algorithm should be used, this is a tricky one. Audio output accuracy is something we discussed during development, but we couldn't come up with an ideal solution that didn't involve emulating the PS1 SPU outright, or at least implementing its Gaussian interpolation ourselves. FFmpeg and all other general-purpose audio libraries don't provide any Gaussian interpolation implementation that matches, or even comes close to, the output of the PS1 SPU.

Of course, if accuracy isn't the goal, but rather the highest output quality possible, then r8brain is the way to go.
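
To illustrate what "Gaussian interpolation" means in general terms, here is a rough sketch of a 4-tap Gaussian-weighted resampler. It is not the PS1 SPU's algorithm: the SPU's actual coefficients are not reproduced here, and the `sigma` value, the tap layout, and the function names are assumptions chosen only for illustration.

```python
import math

# Taps relative to the input sample just before the read position.
TAP_OFFSETS = (-1, 0, 1, 2)


def gaussian_weights(frac, sigma=0.6):
    """Normalized Gaussian weights for a read position `frac` in [0, 1)."""
    raw = [math.exp(-((o - frac) ** 2) / (2 * sigma ** 2)) for o in TAP_OFFSETS]
    total = sum(raw)
    return [w / total for w in raw]


def resample_gaussian(samples, step, sigma=0.6):
    """Read `samples` at intervals of `step` input samples.

    step = input_rate / output_rate, so step < 1 raises the sample rate.
    Edges are handled by clamping indices to the valid range.
    """
    out = []
    n = len(samples)
    pos = 0.0
    while pos < n - 1:
        i = int(pos)
        weights = gaussian_weights(pos - i, sigma)
        acc = 0.0
        for w, off in zip(weights, TAP_OFFSETS):
            j = min(max(i + off, 0), n - 1)
            acc += w * samples[j]
        out.append(acc)
        pos += step
    return out
```

Note that a kernel like this smooths the signal even when the read position lands exactly on an input sample, which is one reason its output differs from that of a general-purpose sinc-based resampler.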
