Support capturing audio output from sound card #629
Comments
Why does this need special treatment? |
When There should be a means to directly select It is not immediately clear at the specification that to select What is the canonical procedure to capture audio output (not input from a microphone) from the system? |
@alvestrand Firefox lists
(emphasis added) at How To Record Speaker Output on Linux
Firefox throws an |
At https://w3c.github.io/mediacapture-main/#methods-5
does a PR need to be filed to explicitly include
? |
getUserMedia is focusing on microphones and cameras, extending it to audiooutput would probably not work well. The getDisplayMedia spec provides the ability to capture system sound, see https://w3c.github.io/mediacapture-screen-share/#dom-mediadevices-getdisplaymedia. |
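As a rough illustration of the getDisplayMedia route mentioned above, here is a minimal sketch that requests display capture with audio and keeps only the audio tracks. The function name `captureDisplayAudio` and the injected `mediaDevices` parameter (normally `navigator.mediaDevices`) are assumptions made for this example; whether an audio track is delivered at all depends on the browser, the platform, and what the user chooses to share.

```javascript
// Sketch only: capture audio through getDisplayMedia and discard the video.
// `mediaDevices` is injected (normally navigator.mediaDevices) so the function
// can also be exercised with a stub outside a browser.
async function captureDisplayAudio(mediaDevices) {
  // Video is requested as well because audio-only display capture requests
  // may be rejected by some implementations (assumption, not verified here).
  const stream = await mediaDevices.getDisplayMedia({ video: true, audio: true });

  const audioTracks = stream.getAudioTracks();
  if (audioTracks.length === 0) {
    // The user agent or the user did not share any audio: release everything.
    stream.getTracks().forEach((track) => track.stop());
    throw new Error("Display capture returned no audio track");
  }

  // Only the audio is wanted, so stop the video tracks right away.
  stream.getVideoTracks().forEach((track) => track.stop());
  return audioTracks;
}
```

In a page this would be called as `captureDisplayAudio(navigator.mediaDevices)` from a user gesture, and the returned tracks could then be wrapped in a new `MediaStream` for recording.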
It is already possible to capture audio output with It is simply not clear that procedure is possible. Chromium only lists microphone and camera at UI prompt. "Speakers" or "Audio Output" should be listed, where the equivalent of |
Is the suggestion to file an issue at that specification? Attempted to get audio using What is the canonical procedure per any specification or implementation relevant to media streams and capture origin to capture only audio output? |
From #651 (comment):
I don't think mediacapture-main should specify generic capture of audio output from a browser or document without the context of a compelling user problem that it best solves. Especially not to work around another web API failing to provide adequate access¹ to audio it generates, to solve a use case that seems reasonable in that spec's domain. Instead, I'd ask that API to solve it without workarounds. Feel free to refer to mozilla/standards-positions#170 (comment). The audio capture support in 1. In the form of a |
Compelling use cases are listed at #651 (comment). As previously stated, that functionality is already possible at both Firefox and Chromium. What is missing is the specification acknowledging the technical facts and canonical example of how to do so.
are not true and correct. |
@jan-ivar One example proof of
being not true and correct https://bugzilla.mozilla.org/show_bug.cgi?id=1604994#c5. Are you stating that the use cases in the non-exhaustive list at #651 (comment) are not compelling? Are you asking for an exhaustive list of individuals and institutions that have published use cases for capturing audio output? |
"that it best solves". I think those are compelling use cases for web speech to solve cleanly. |
@jan-ivar There is no bright-line rule that the input to Consider an individual with vision impairment. They have a book they want to read or write. They can feed the text of the book to In reverse, audio output can be converted to plain text (in one or more languages) or Braille, etc. Without the ability to capture audio output it becomes difficult to test input and output.
Well, you can refer to the document that you cited mozilla/standards-positions#170 (comment), in this case the author of the post is correct in their analysis
Essentially, the Web Speech API is dead. While a novel and worthy start, there are several issues with the current specification. Media Capture and Streams is well-suited to take on the task of filling in the holes, which are actually in accord with what is already possible. Am still not gathering the reluctance to acknowledge that the procedure is already technically possible. |
Am performing due diligence before writing TTS/SST for the web platform from scratch, since the technology and infrastructure already exist within Media Capture and Streams and there is no reason to repeat what is already technically possible. All that is really needed at this point, as far as capturing audio output under the umbrella of this specification is concerned, is the acknowledgement that the behaviour is already technically possible. Once that acknowledgement is made, the canonical algorithm to do so can be incorporated into the specification officially. Have listed compelling use cases for the subject matter of TTS/SST, though audio capture is not limited to those use cases alone. How users utilize the functionality, once unequivocally specified, is up to them. It does not appear to be a difficult task to amend the specification to acknowledge what is already possible, even if those possible outputs were or are unintended, and to make sure that an algorithm is clearly defined to guide implementation of the edge cases, if you will, or unintended consequences of the power of this API to be useful beyond the perhaps limited intent conceived by the original authors of the technical document. It is mathematically impossible to conceive of all of the possible use cases from within the official body, hierarchical structure, or any variance of a system itself, no matter the field or domain of human activity. Though try to avoid citing secondary sources, in this case Wikipedia provides a concise synopsis of the indisputable mathematical fact demonstrated by Kurt Gödel's Incompleteness Theorem (On Formally Undecidable Propositions of "Principia Mathematica" and Related Systems, 1931)
|
Jan-Ivar said: "I don't think mediacapture-main should specify generic capture of audio output from a browser or document without the context of a compelling user problem that it best solves. Especially not to work around another web API failing to provide adequate access¹ to audio it generates, to solve a use case that seems reasonable in that spec's domain. Instead, I'd ask that API to solve it without workarounds. Feel free to refer to mozilla/standards-positions#170 (comment). The audio capture support in [BA] Closing this issue. |
@aboba The closure of this issue leads to other questions. If capturing from a sound card were not possible at all, and if implementations did not provide a means to create fake media devices and fake input streams available to the main thread, then closure of this would be in accord with the position that capturing from an output device (or any other device, virtual or otherwise) is simply not possible under the umbrella of this specification. However, since browsers already do allow for setting a fake media device and fake input media stream, the question then becomes what is the canonical procedure to create a specification compliant fake media device and fake media stream from a file (https://chromium.googlesource.com/chromium/src/+/4cdbc38ac425f5f66467c1290f11aa0e7e98c6a3/media/audio/fake_audio_output_stream.cc; https://chromium.googlesource.com/chromium/src/+/4cdbc38ac425f5f66467c1290f11aa0e7e98c6a3/media/audio/fake_audio_manager.cc; https://stackoverflow.com/a/40783725). It is not as if the substance of this feature request is not already possible. Am asking for the functionality to be officially standardized. The alternative is having to embark on writing and implementing code outside of the official standard (https://github.com/auscaster/webrtc-native-to-browser-peerconnection-example), if only to prove the requirement is possible. Not specifying the same will only lead to disparate code in the wild that will eventually lead right back to this specification.
The problem is that implementation has several bugs that can be fixed by the official specification body, to allow for creation of a Will ask the question ("How to create a fake media device and MediaStream from a file") officially in an issue. |
As already stated, the Web Speech API is essentially dead. Filed issues at least over a year ago to implement SSML parsing at both Chromium and Firefox. The fix is relatively simple and the patch has already been posted (https://bugs.chromium.org/p/chromium/issues/detail?id=795371#c18). ChromiumOS authors, again, focus on extension code using Besides, am banned from WICG for 1,000 years and when joined the W3C (https://lists.w3.org/Archives/Public/public-wicg/2019Oct/0000.html) to contribute to Web Speech API, was un-joined from that parent organization, for fraudulent reasoning easily proven hypocritical Have zero confidence that W3C and especially WICG are operating in a non-biased manner. |
@aboba Though, as usual, the closure of this official feature request will probably be beneficial in the end, in order to be free of the (at times moving, and occasionally arbitrary) constraints of any specification when trying to meet a requirement for a use case. Performed due diligence in any case in an attempt to get what is already possible actually specified. |
I'm not sure what the etiquette is for commenting on closed issues, so let me know if this would be better posted elsewhere, but I have a use case that has not been considered and is completely unrelated to web speech: music production, video editing, and other similar work. Many music producers record or stream the music production process for educational purposes, or for live performances. This is even more common now that so many musicians aren't able to perform live due to COVID. In these cases, there doesn't appear to be any viable substitute for capturing the main output of an external audio interface/sound card.
If you want the sound card input, yes. But in use cases related to music production or video editing, if I want to stream or record, the sound card's main output is going to be the only useful audio to capture. None of the other sources will have all parts of the master mix. The DAW needs to use the resources the sound card provides directly instead of just passing audio data to it like most audio-light applications do, so the sound can't be grabbed using the 'desktop' generic source. In the specific application I'm writing I was disappointed to find that calling getUserMedia with an audiooutput device ID confusingly gave up and instead gave me a stream containing audio from my laptop's built-in mic. |
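The silent fallback to the built-in mic described above can at least be made visible. The sketch below lists `audiooutput` devices with `enumerateDevices()` and then requests a device with `deviceId: { exact: ... }`, which, as far as I understand the constraint model, should reject with an `OverconstrainedError` instead of quietly substituting another input. The helper names and the injected `mediaDevices` parameter are assumptions for illustration, and this does not make output capture work; it only turns the fallback into an explicit failure.

```javascript
// Sketch only: helper names are hypothetical; `mediaDevices` is normally
// navigator.mediaDevices and is injected so a stub can be used for testing.

// Return the devices the browser reports as audio outputs.
async function listAudioOutputs(mediaDevices) {
  const devices = await mediaDevices.enumerateDevices();
  return devices.filter((device) => device.kind === "audiooutput");
}

// Build constraints that require exactly this device, with no fallback.
function exactDeviceConstraints(deviceId) {
  return { audio: { deviceId: { exact: deviceId } } };
}

// Resolve with a stream, or with null when the device cannot be opened
// as a capture source (reported as OverconstrainedError).
async function openExactDevice(mediaDevices, deviceId) {
  try {
    return await mediaDevices.getUserMedia(exactDeviceConstraints(deviceId));
  } catch (err) {
    if (err && err.name === "OverconstrainedError") {
      return null;
    }
    throw err; // permission errors and the like are not swallowed
  }
}
```

With these helpers, a `null` result for an `audiooutput` device ID signals that the browser refused to treat that device as a capture source, rather than returning unrelated microphone audio.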
Agree. Reading this language https://chromium-review.googlesource.com/c/chromium/src/+/1064373/
it appears that Chromium just refuses to open certain devices - rather than fix the actual problem with PulseAudio being described, where Mozilla Firefox and Nightly do not have that problem, and using Perhaps you can actually fix whatever problem was reported at https://issuetracker.google.com/79580580 re PulseAudio in order for Chromium to be in alignment with your evaluation? |
Add the capability to capture audio output from the sound card. Whether headphones are plugged in or not, it should be possible to capture audio output from the system sound card.
navigator.mediaDevices.getUserMedia({audio:{speakers:true}})
or
navigator.mediaDevices.getUserMedia({audio:{headphones:true}})
or
navigator.mediaDevices.getUserMedia({audio:{output:true}})
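The `speakers`, `headphones` and `output` constraints above are proposals from this issue, not names defined by the specification. A page can check which constraint names a browser recognizes with `getSupportedConstraints()`; the small sketch below (the helper name `unsupportedConstraints` is hypothetical) reports which of the requested names are not recognized. As far as I understand, unrecognized names are ignored rather than rejected, so such a call would likely behave like a plain audio request.

```javascript
// Sketch only: report which constraint names the browser does not recognize.
// `mediaDevices` is normally navigator.mediaDevices; it is injected here so
// the helper can be exercised with a stub.
function unsupportedConstraints(mediaDevices, names) {
  const supported = mediaDevices.getSupportedConstraints();
  // Recognized constraints appear as members set to true.
  return names.filter((name) => supported[name] !== true);
}
```

For example, `unsupportedConstraints(navigator.mediaDevices, ["speakers", "headphones", "output"])` would be expected to return all three names in a browser that implements none of the proposed constraints.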