
obs-webrtc: Add VDO.Ninja support and improve NAT traversal #11698

Closed

Conversation

@steveseguin commented Jan 3, 2025

Disclaimer:

  • I used Claude AI to assist in my coding as I am not a strong C++ developer
  • I don't expect this PR to be accepted, yet I am hoping even still it might provide value/discussion.

Description

Improve NAT traversal for WHIP output by making ICE candidate gathering more reliable
and adding configurable STUN support for peer-to-peer connections.

Major changes:

  1. Improve NAT traversal reliability

    • Gather all ICE candidates before creating the offer (see the sketch below)
    • Add configurable port support via URL parameters
    • Set safe default port ranges (49152-65535)
  2. Add VDO.Ninja-specific optimizations

    • Add default STUN servers for whip.vdo.ninja endpoint
    • Support custom port configuration for firewall-restricted environments
  3. Simplify and improve error handling

    • Better error logging categorization
    • More specific SDP-related error messages
    • Improved status code handling

Note: This temporarily removes trickle ICE support in favor of a more reliable
bundled candidate approach. Trickle support can be re-added when LibDataChannel
is updated in a future PR.
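
To illustrate the bundled-candidate flow, here is a rough sketch of the libdatachannel usage; the function name, the STUN URL, and the 5-second timeout are placeholders rather than the exact code in this PR:

    #include <rtc/rtc.hpp>
    #include <atomic>
    #include <chrono>
    #include <memory>
    #include <string>
    #include <thread>

    // Sketch: restrict the port range, add a STUN server, add a send-only
    // video track, then wait for ICE gathering to finish so the offer
    // already contains every candidate.
    std::string BuildOfferWithAllCandidates()
    {
        rtc::Configuration cfg;
        cfg.iceServers.emplace_back("stun:stun.l.google.com:19302"); // placeholder default
        cfg.portRangeBegin = 49152; // non-privileged range, as in this PR
        cfg.portRangeEnd = 65535;

        auto pc = std::make_shared<rtc::PeerConnection>(cfg);

        std::atomic<bool> gathered{false};
        pc->onGatheringStateChange([&](rtc::PeerConnection::GatheringState state) {
            if (state == rtc::PeerConnection::GatheringState::Complete)
                gathered = true;
        });

        rtc::Description::Video video("video", rtc::Description::Direction::SendOnly);
        video.addH264Codec(96);
        pc->addTrack(video);

        pc->setLocalDescription(); // starts candidate gathering

        // Poll with a timeout instead of trickling candidates later.
        for (int waited = 0; !gathered && waited < 5000; waited += 100)
            std::this_thread::sleep_for(std::chrono::milliseconds(100));

        auto desc = pc->localDescription();
        return desc ? std::string(*desc) : std::string(); // SDP with candidates included
    }

Blocking on gathering adds a little startup delay, but the resulting offer also works with servers that never accept trickled candidates.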

More details on specific changes:

  1. Changed LibDataChannel version requirement from 0.20 to 0.21
  • Ideally we upgrade this to the newest version to get trickle support working more reliably
  • I noticed no difference moving from 0.20 to 0.21, but it was an easy change to make.
  2. Removed trickle ICE candidate support since it wasn't working for me
  • Removed trickle_endpoint and related code
  • Eliminated SendTrickleCandidate() function
  • Removed candidate gathering state handlers
  • Removed OnIceCandidate() function

Reason: The asynchronous trickle ICE wasn't working reliably for me and my understanding is it may need an update to LibDataChannel to correctly function. My new approach of gathering all candidates before creating the offer, while potentially slower, is more reliable, universal, and simpler to implement correctly.

I would want to add Trickle support back in, but only after LibDataChannel is updated to a much newer version. I've not invested time into trying to get LibDataChannel updated, so that might be a better solution to start with.

  3. Simplified connection handling:
  • Removed ICE state change monitoring
  • Removed gathering state change monitoring
  • Simplified error handling and status reporting
  4. Added STUN servers only if VDO.Ninja is used
  • Added default STUN servers (Google/Cloudflare) when using whip.vdo.ninja
  • Hard-coded, so no remote setting is needed and there is no added latency before candidate gathering starts

Reason: I made this limited to VDO.Ninja simply because I didn't want to assume what STUN servers should be used for other WHIP destinations. Since whip.vdo.ninja is an open API/service, anyone can use it to help with peer-to-peer WHIP routing, so it felt like a good solution for now. Privacy-focused users might not want to use Google's STUN servers, for example, but I'll leave that decision up to you.

VDO.Ninja's WHIP service already uses Google's and Cloudflare's STUN servers, however, along with its own free TURN. It could offer its own STUN if there were a compelling reason to.

Instead of hard-coding the STUN servers, a better solution could be to get trickle support working properly and/or to let users specify the STUN/TURN servers via a settings menu in the OBS user interface. URL query parameters, such as &stun=stun.stunserver.com, could be an additional way to achieve this.

  5. Modified error handling:
  • Changed some LOG_INFO to LOG_ERROR for better severity indication
    • Debugging WHIP issues becomes easier if the error output is accessible via the OBS log file
  • Improved error messages and status codes
  • Added more specific error handling for SDP-related issues
  6. Added automatic port configuration (a parsing sketch follows this list):
  • Parse a specific port from URL parameters (?port= or &port=)
    • This helps with port forwarding on consumer routers when STUN is not available
    • Corporations can whitelist a single port more easily than a range of ports
  • Set the port range to 49152-65535 (non-privileged ports)
  • Validate the port range (1024-65535)
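
For illustration, the port parsing could look roughly like the sketch below; ParsePortFromUrl is a hypothetical helper name, not necessarily what the patch uses:

    #include <cstdint>
    #include <optional>
    #include <string>

    // Hypothetical helper: extract and validate a "port" query parameter,
    // e.g. "https://whip.vdo.ninja/xyz?port=50000" or "...&port=50000".
    static std::optional<std::uint16_t> ParsePortFromUrl(const std::string &url)
    {
        for (const std::string key : {"?port=", "&port="}) {
            size_t pos = url.find(key);
            if (pos == std::string::npos)
                continue;
            size_t start = pos + key.size();
            size_t end = url.find('&', start);
            std::string value = (end == std::string::npos)
                                    ? url.substr(start)
                                    : url.substr(start, end - start);
            try {
                int port = std::stoi(value);
                if (port >= 1024 && port <= 65535) // validation range from the PR
                    return static_cast<std::uint16_t>(port);
            } catch (...) {
                // ignore malformed values and keep looking
            }
        }
        return std::nullopt;
    }

A port found this way can then be pinned by setting portRangeBegin and portRangeEnd to the same value, assuming libdatachannel accepts a one-port range.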

The key architectural change is moving from a trickle ICE approach to gathering all candidates before creating the offer, which should improve NAT traversal reliability at the cost of slightly longer initial connection times. Many users can't open 40,000 ports on their home router, but they can easily open a couple.
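
For contrast, the removed trickle path hooks libdatachannel's per-candidate callback and ships each candidate separately after the offer. A minimal sketch follows; the SendTrickleCandidate() body here is only a placeholder for the PATCH that a real implementation would send:

    #include <rtc/rtc.hpp>
    #include <iostream>
    #include <memory>
    #include <string>

    // Placeholder for the removed SendTrickleCandidate(): in the trickle
    // model each candidate would be PATCHed to the WHIP resource URL.
    static void SendTrickleCandidate(const std::string &candidate, const std::string &mid)
    {
        std::cout << "would PATCH candidate '" << candidate << "' (mid " << mid << ")\n";
    }

    static void RegisterTrickleHandler(const std::shared_ptr<rtc::PeerConnection> &pc)
    {
        // Candidates arrive asynchronously, possibly after the offer was already POSTed.
        pc->onLocalCandidate([](rtc::Candidate candidate) {
            SendTrickleCandidate(std::string(candidate), candidate.mid());
        });
    }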

Motivation and Context

  • To allow for direct OBS to OBS low-latency video sharing, using WHIP (peer to peer), without needing a server, resulting in:

    • low latency
    • lower costs
    • easier to use
    • potentially lower network bandwidth
    • potentially higher max bitrate / quality
  • Many VDO.Ninja users were unable to make use of the OBS WHIP output over the Internet, from OBS to OBS, or OBS to VDO.Ninja, where NAT traversal was needed. It only worked on a LAN for them.

  • Publishing to a WHIP server doesn't tend to run into these issues, but when dealing with peer-to-peer traffic, ICE candidates are vital. I've made this step more reliable and universally friendly for a variety of WHIP scenarios.

  • Trickling of ICE candidates after the initial offer is made isn't broadly supported by WHIP servers

    • It also requires multiple back and forth messaging steps, rather than just a single POST request.
    • It can still be supported, but it will need to be added back once LibDataChannel is updated

How Has This Been Tested?

  • Windows builds in production environments

    • Several VDO.Ninja community users have verified it has worked for them; no issues reported
  • Multiple NAT/firewall configurations

  • VDO.Ninja WHIP endpoint

  • I've not done enough testing on TURN servers

  • I've not done enough testing to see how this works with services other than VDO.Ninja's WHIP service

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Tweak (non-breaking change to improve existing functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code has been run through clang-format.
  • [x] I have read the contributing document.
  • [x] My code is not on the master branch.
  • [x] The code has been tested.
  • [x] All commit messages are properly formatted and commits squashed where appropriate.
  • [x] I have included updates to all appropriate documentation.

Modify WHIP publishing to support NAT traversal by gathering ICE candidates
before creating the offer. Changes include:
- Roll back LibDataChannel to version 0.20 for better compatibility
- Add default STUN server support for VDO.Ninja (Google/Cloudflare)
- Remove trickle ICE in favor of bundled candidates in initial offer
- Add port configuration via URL parameters for VDO.Ninja endpoint
- Set default port range to non-privileged ports (49152-65535)
- Clean up error handling and logging
- Remove unused transfer-related functions
@@ -207,13 +208,46 @@ bool WHIPOutput::Init()
bool WHIPOutput::Setup()
{
	rtc::Configuration cfg;

	// Add default STUN servers if using VDO.Ninja
	if (endpoint_url.find("https://whip.vdo.ninja") == 0) {

You can return the STUN servers you want via HTTP Headers. I think we only accept TURN servers right now? If we fix the parsing should work with STUN!

     Link: <stun:stun.example.net>; rel="ice-server"
     Link: <turn:turn.example.net?transport=udp>; rel="ice-server";
           username="user"; credential="myPassword"; credential-type="password"
     Link: <turn:turn.example.net?transport=tcp>; rel="ice-server";
           username="user"; credential="myPassword"; credential-type="password"
     Link: <turns:turn.example.net?transport=tcp>; rel="ice-server";
           username="user"; credential="myPassword"; credential-type="password"

Sorry this is me, wrong account

	}

	// Wait for ICE gathering to complete
	int gather_timeout = 0;
Why block for gathering? Would prefer not to do this, maybe I can help fix what is happening on this.

@Sean-Der (Contributor) commented Jan 4, 2025

I really like the port range restriction idea. Users who don't get automatic NAT Traversal can benefit from that a lot!

I am hesitant to put it in the URL/WHIP itself though. Would be nice if it is something more config driven.

@Sean-Der (Contributor) commented Jan 4, 2025

> I don't expect this PR to be accepted, yet I am hoping even still it might provide value/discussion.

I think if we split it up into smaller pieces will be more approachable! So much is happening in this PR :)

@steveseguin (Author) commented Jan 4, 2025

Thank you for the feedback, Sean.

> I think if we split it up into smaller pieces will be more approachable! So much is happening in this PR :)

I do agree. I was a bit hesitant to make a PR out of this, yet it's been going on a couple months since I hacked this out for some users and I've not found the time to do more with it.

> You can return the STUN servers you want via HTTP Headers. I think we only accept TURN servers right now? If we fix the parsing should work with STUN!

The problem with this is that, to get the STUN servers, you need to get the answer back from the server after making the offer, I believe. You're then forced to make PATCH requests, which aren't supported by all WHIP destinations or often fail without reason (e.g. Cloudflare's WHEP). But specific to this situation, I just couldn't get LibDataChannel to let me modify/update the STUN servers and get candidates after the initial offer was made.

My post-offer candidates issue might be my fault though; I struggled with LibDataChannel and ultimately decided that maybe LibDataChannel 0.21 just does not support that ability? I didn't manage to update LibDataChannel to confirm this suspicion, and instead just went with this simpler approach. Including ICE candidates with the offer has proven highly universal and reliable for me, so even with PATCH request support, the option to use a default STUN service I think has value.

> I really like the port range restriction idea. Users who don't get automatic NAT Traversal can benefit from that a lot!

Awesome! I've had numerous user types unable/unwilling to open a broad port range, and this ability would go a long way in overcoming their issues. I'd agree that if it's a setting in the GUI, it would be better. If it's a URL option though, it would allow the WHIP service provider to give the user a single URL and that's it -- no other settings or fields to fill out.

There was/is already precedent for this URL-parameter settings approach in OBS Studio, too. "&layer-width=1920&layer-height=1080&layer-name=Browser" can pre-configure the OBS browser source with a custom label and resolution when dragged into OBS.

Regardless, options to specify TURN/STUN servers, SVC profile options, or candidate-gathering timeout limits are all things I can see being valuable in an advanced settings menu. I can of course ramble off a million other feature ideas.. you know my contact ;)

@gxalpha (Member) commented Jan 4, 2025

I’ll mark this as draft for now since there appears to be no intention of having this merged as-is (to prevent accidental merges).

@gxalpha marked this pull request as draft January 4, 2025 10:59
@WizardCM added the Enhancement (Improvement to existing functionality) and New Feature (New feature or plugin) labels Jan 4, 2025
@PatTheMav (Member)

Closing this. Please refrain from opening pull requests for anything but finished code that is ready to be merged into the main code base and that compiles and runs on all supported platforms.

Also, for future reference, please refer to the section about AI and machine learning systems in our contributing document.

@PatTheMav PatTheMav closed this Jan 7, 2025
Labels: Enhancement (Improvement to existing functionality), New Feature (New feature or plugin)
7 participants