-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add gstreamer-less flac encoder and tagging #121
Conversation
This adds a FlacEncodeTask that encodes wave files to flac files. This commit also replaces morituri's EncodeTask with FlacEncodeTask, however, in morituri, EncodeTask also does the tagging. FlacEncodeTask will not perform the tagging. So we will need an extra task for the tagging - this will be added soon. Meanwhile, do not merge this commit to master yet.
Please do not merge it yet until a few people have tested/verified it. |
It works for me, at least the encoding and the tagging does. I didn't yet test the ARC calculation on these files, mostly because I forgot to put "accuraterip-checksum" in my PATH, and it's taking some time to re-run this task now, and I want to go to bed. Testers wanted. |
Apparently accuraterip-checksum can already read flac files by itself, TMYK. Anyway, it works for me. |
Replace the gstreamer tagging code with mutagen tagging code. getTagList is rewritten to return a dictionary of tags, which are then simply passed to mutagen. The way it is set up right now is not the best - I don't think it makes sense for tagging to take place in program/cdparanoia.py ; but this is how the current code did it. I suggest that, when we rip out all the gstreamer code, we also move the tagging to a more sensible place; and then also make the tagging not be an actual 'task.Task'.
85067be
to
ae9e87e
Compare
This code should perform the CRC32 task: master...MerlijnWajer:crc32 |
Only works on wave files at this point. Should not be a problem, I think.
Added CRC32 code for testing purposes. I am able to rip CDs fine with this branch. |
morituri/program/flac.py
Outdated
""" | ||
try: | ||
check_call(['flac', '--totally-silent', '--verify', '-o', outfile, | ||
'-f', infile]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, could we get compression option configuration? I’d like to compress with --best, and unless we make it the default, this will require a config option. More generally, I suppose people might want to change flac call parameters, so having it in config would be great. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue for using --best
and not allowing for customization.
And please, let's not add anything because "someone might want to do X". Add features when there's a documented use-case applicable to more than a few users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--best
is spending a lot of CPU cycles (and electricity) for a very minor gain in size. If there is no configuration choice, I'd argue for using the default FLAC settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FLAC -8 (--best) isn't that slow on current machines. FLAC -5 is not a smart option, it often does produce FLAC files that are bigger than -8 by 1MB or more. "a lot of CPU cycles" is a very big exaggeration in my opinion. We aren't in the Pentium III era anymore, on semi-current machines the time difference between -5 and -8 is negligible.
As for @tobbez's idea of disabling customization, that's a very bad idea. There are people who will want -5, there are other people who will want -8, there are people who want ReplayGain during encode, there are people who don't. There are people who will want "-e", there are people who won't. Those are the two most common use cases that justify enabling option customization. These use cases are why EAC, XLD, CUERipper and dBpowerAmp (the four most reputable rippers) allow customization. Are we really going to convince people to ditch EAC on Wine for whipper when our response to "How do I add the replaygain option to the FLAC encoding process on whipper?" is "Change the source code and recompile it"?
Adding a customization option for this isn't bloat, it's a fundamental feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must be sure that whatever we encode to initially is the same as what we read from the CD, since we use the FLACs for accurate rip checksums. So anything that changes the contents is a bad idea.
I don't understand why people can't run something like this:
for i in $(ls *.flac); do flac -d $i -o foo.wav && flac --best --custom-option-1 foo.wav -o $1; done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not believe the goal of whipper is to attract or support users who are never going to learn how to use a proper Unix operating environment. There is no reason to add more lines of code to support a non-essential "feature" when users can do the same thing with a tiny shell script. No information about the disc/CDDA is lost by changing the compression option, so in my opinion it's bloat that flies in the face of the Unix philosophy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A script would be a good idea, but then the encoding process would be ran twice. Is a customization option (a simple entry in the configuration file with the encoding parameters, for example) that much of bloat really?
Oh, and before I forget, congratulations on the work on removing gstreamer! Great progress!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for i in $(ls *.flac); do flac -d $i -o foo.wav && flac --best --custom-option-1 foo.wav -o $1; done
^ Not safe. [1]
Rather use:
for i in *.flac; do flac -d "$i" -o foo.wav && flac --best --custom-option-1 foo.wav -o "$1"; done
Because, for example:
$ ls -1
flac file1.flac
flac file2.flac
other file.mp3
$ for i in $(ls *.flac); do echo $i ; done
flac
file1.flac
flac
file2.flac
# but:
$ for i in *.flac; do echo "$i" ; done
flac file1.flac
flac file2.flac
So maybe newcomers should not necessarily be expected to be proficient in something that even experienced users can slip sometimes...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use --best
? There is no reduction in quality, the internal check sums still work and it only takes 2 seconds to encode a track which is essentially nothing when compared to the fact that it takes 20-30 minutes to actually rip the data from CD. Conversely as pointed out above bash scripting is hackish, error prone and requires everyone to come up with their own solution to the same problem. Including the script in whipper would probably be about the same if not more cognitive load / LOC than adding an option, and certainly more than just hard-coding --best
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, wasn't morituri using gstreamer's --best
equivalent?
I am fine with making it the default. I am not sure if it is a sane default, however. I'm also fine with making it a config option - but preferrably after gst code is out. Shouldn't be too hard for someone to pick this up if they care. |
morituri/common/program.py
Outdated
tags['ALBUMARTIST'] = albumArtist.encode('utf-8') | ||
tags['ARTIST'] = trackArtist.encode('utf-8') | ||
tags['TITLE'] = title.encode('utf-8') | ||
tags['DISC'] = disc.encode('utf-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.encode('utf-8')
should be removed from all of these. mutagen prefers unicode, and will encode the tags when writing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean that mutagen prefers UTF-16 instead? UTF-8 is the most common Unicode codepage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is what tobbez said. He just said that mutagen itself will encode it to utf-8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see now! Never mind my comment then, I'm dumb!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace:
tags['DISC']
->tags['ALBUM']
tags['musicbrainz-trackid']
->tags['MUSICBRAINZ_TRACKID']
ret["musicbrainz-artistid"]
->ret['MUSICBRAINZ_ARTISTID']
tags['musicbrainz-artistid']
->tags['MUSICBRAINZ_ARTISTID']
ret["musicbrainz-albumid"]
->ret['MUSICBRAINZ_ALBUMID']
tags['musicbrainz-albumid']
->tags['MUSICBRAINZ_ALBUMID']
ret["musicbrainz-albumartistid"]
->ret['MUSICBRAINZ_ALBUMARTISTID']
tags['musicbrainz-albumartistid']
->tags['MUSICBRAINZ_ALBUMARTISTID']
ret["musicbrainz-discid"]
->ret['MUSICBRAINZ_DISCID']
tags['musicbrainz-discid']
->tags['MUSICBRAINZ_DISCID']
Remove:
FLAC = 'flac'
(morituri/program/flac.py, line 6)
morituri/program/flac.py
Outdated
""" | ||
try: | ||
check_call(['flac', '--totally-silent', '--verify', '-o', outfile, | ||
'-f', infile]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using --silent
instead of --totally-silent
and capturing any errors written (stderr=subprocess.PIPE
). Would be much more helpful to know that flac failed with a specific error, rather than just the fact that it failed.
I've tested the without-gstreamer branch against |
The without-gstreamer definitely breaks some image features: retagging is one of them. Thanks for testing. The without-gstreamer commit was more to prove that we have all the required functionality ready, without gstreamer. It wasn't meant to be merged yet. |
I think I've dealt with most suggestions. I've not included '--best' right now, and I am not yet parsing the flac stderr - but noted that we should do that soon. As far as I am concerned, this is ready for merging. |
On 04/02/17 18:49, Mariusz wrote:
So maybe newcomers should not necessarily be expected to be proficient
in something that even experienced users can slip sometimes...
This is based on assumptions that you made. I wrote a simple example -
to get the general idea across. In no way does this invalidate the
suggestion.
Are you volunteering to write these contrib scripts?
|
On 04/02/17 22:09, Kevin Mitchell wrote:
***@***.**** commented on this pull request.
Why not just use |--best|? There is no reduction in quality, the
internal check sums still work and it only takes 2 seconds to encode a
track which is essentially nothing when compared to the fact that it
takes 20-30 minutes to actually rip the data from CD. Conversely as
pointed out above bash scripting is hackish, error prone and requires
everyone to come up with their own solution to the same problem.
Including the script in whipper would probably be about the same if not
more cognitive load / LOC than adding an option, and certainly more than
just hard-coding |--best|.
In my experience, even without --best, it can take anywhere from 10
seconds to 1 minute to encode a wave file with flac on a recent i5. If
morituri/whipper would be properly optimised for speed, a whole CD would
take less than 10 minutes, 5 minutes if we leave out some runs. The
software based on morituri that I wrote does it in 2-3 minutes if the AR
records are available and match.
I'm not going to comment any more on what you guys refer to as the
"pointed solution" - I have suggested a way that people can *very
easily* customise the output to their liking. A proper script is not
error prone, and it also doesn't require everyone to come up with their
own solution to the problem. We could simply ship a script to do it. As
you can see, there are also people who are not in favour of using
--best. File an issue if this is such a big deal.
To me it seemed that the people who wanted --best stated that they don't
care about the CPU cycle waste. Well, then why not re-encode the file
with --best after the rip is done?
Please make an issue/feature-req instead, and provide some useful
numbers. If --best is really the best solution, why is it not the
default option for FLAC?
There really some other things I want to get fixed first, so I don't
want to dive into this further - but an issue on github seems like the
right way to deal with this, not the conversation on a merged PR.
|
FWIW, I've updated my |
Two commits - the first commit perform the FLAC encoding using the Xiph.org 'flac' program. Because whipper/morituri's gstreamer 'encoding' code also tagged - the second commit adds tagging using mutagen.
Note that these two commits mostly do not remove gstreamer code (the tag one does, however). We need to add an equivalent of the CRC32 task (or simply remove it?). Once we have that in place, as far as I am concerned, we can start removing ALL the gstreamer code.
See #29