Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add res_init #585

Merged
merged 8 commits into from
May 3, 2017
Merged

Add res_init #585

merged 8 commits into from
May 3, 2017

Conversation

jonhoo
Copy link
Contributor

@jonhoo jonhoo commented Apr 27, 2017

The res_init function, while deprecated, is critical for making networked applications work correctly when switching between networks, or between being offline and online. By default, getaddrinfo and friends use a cached version of /etc/resolv.conf, which means that network changes are not picked up by applications after they first start up. This has bitten Firefox, Pidgin, MongoDB, and more in the past.

The logic behind exposing only res_init is that it is the only res_* function that is frequently called directly by user applications. The other res_* functions provide low-level access to domain lookups, whereas res_init is useful even if the application itself is not concerned with doing lookups.

Behind the scenes, getaddrinfo in glibc ultimately calls res_nsearch with &_res. res_init operates directly on this global reference, and is thus more useful to expose for most applications than the non-deprecated res_ninit function (which operators on an arbitrary res_state).

As far as I can tell, res_init is available in FreeBSD, NetBSD, OpenBSD, Solaris, Linux, and macOS.

@alexcrichton
Copy link
Member

@bors: r+

Thanks!

@bors
Copy link
Contributor

bors commented Apr 27, 2017

📌 Commit ae294f9 has been approved by alexcrichton

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

@alexcrichton looking at the Travis build thus far, it looks like a number of tests are failing with empty logs.. Is this a known issue?

@alexcrichton
Copy link
Member

I think Travis is doing some DB maintenance right now, and tha may be causing issues with logs :(

@bors
Copy link
Contributor

bors commented Apr 27, 2017

⌛ Testing commit ae294f9 with merge 703a5e7...

bors added a commit that referenced this pull request Apr 27, 2017
Add res_init

The `res_init` function, while deprecated, is critical for making networked applications work correctly when switching between networks, or between being offline and online. By default, `getaddrinfo` and friends use a cached version of `/etc/resolv.conf`, which means that network changes are not picked up by applications after they first start up. This has bitten [Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=214538), [Pidgin](https://developer.pidgin.im/ticket/2825), [MongoDB](https://jira.mongodb.org/browse/DOCS-5700), and more in the past.

The logic behind exposing only `res_init` is that it is the only `res_*` function that is frequently called directly by user applications. The other `res_*` functions provide low-level access to domain lookups, whereas `res_init` is useful even if the application itself is not concerned with doing lookups.

Behind the scenes, `getaddrinfo` in glibc [ultimately calls](https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/resolv/nss_dns/dns-host.c#L196) `res_nsearch` with `&_res`. `res_init` operates directly on this global reference, and is thus more useful to expose for most applications than the non-deprecated `res_ninit` function (which operators on an arbitrary `res_state`).

As far as I can tell, `res_init` is available in [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=res_init&manpath=SuSE+Linux/i386+11.3), [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?res_init+3+NetBSD-6.1), [OpenBSD](http://man.openbsd.org/resolver.3), [Solaris](http://www.polarhome.com/service/man/?qf=res_init&tf=2&of=Solaris&sf=), [Linux](https://linux.die.net/man/3/res_init), and [macOS](https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/res_init.3.html).
@bors
Copy link
Contributor

bors commented Apr 27, 2017

💔 Test failed - status-travis

@alexcrichton
Copy link
Member

Maybe a header file isn't #include'd?

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

Ah, good catch! Added.
Any chance you could also take a look at the related rust-lang/rust#41570 ?

@alexcrichton
Copy link
Member

Oh hm, is this required to link to the resolv library? That changes the calculus a bit here as it runs a high risk of causing accidental breakage :(

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

@alexcrichton yeah, unfortunately it seems like it does, at least on some platforms. Which is kind of strange, given that getaddrinfo internally calls these methods (and thus must somehow be able to get at the relevant resolv symbols). musl doesn't require this, since res_init is implemented directly, nor do the x86_64 targets (why?), but some other glibc targets do...

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

Also, interestingly, cargo complains about "undefined reference to res_init" even with -lresolv, despite

#include <netinet/in.h>
#include <resolv.h>

int main() {
        res_init();
}

compiling just fine locally for me with both gcc and clang without additional linking options. Have also now tested with gcc on i686, and still compile fine without -lresolv.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

Actually, we may be in luck @alexcrichton. See ae5cf30

@alexcrichton
Copy link
Member

Nice! Perhaps we could leave in the leading double undescore?

Or is that just the symbol name and the C identifier is res_init?

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

The C identifier (and what shows up in all the manpages) is res_init. The rename to __res_init is completely masked from C applications by the resolv.h macro:

#define res_init __res_init

@aidanhs
Copy link
Member

aidanhs commented Apr 27, 2017

I can only see res_init in musl, not __res_init.

(it's also a no-op in musl, but that's an aside)

@alexcrichton
Copy link
Member

@bors: r+

Ok thanks for the info @jonhoo!

@aidanhs ah we'll probably get a link error then, but let's see what travis says.

@bors
Copy link
Contributor

bors commented Apr 27, 2017

📌 Commit ae5cf30 has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Apr 27, 2017

⌛ Testing commit ae5cf30 with merge 5e16276...

bors added a commit that referenced this pull request Apr 27, 2017
Add res_init

The `res_init` function, while deprecated, is critical for making networked applications work correctly when switching between networks, or between being offline and online. By default, `getaddrinfo` and friends use a cached version of `/etc/resolv.conf`, which means that network changes are not picked up by applications after they first start up. This has bitten [Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=214538), [Pidgin](https://developer.pidgin.im/ticket/2825), [MongoDB](https://jira.mongodb.org/browse/DOCS-5700), and more in the past.

The logic behind exposing only `res_init` is that it is the only `res_*` function that is frequently called directly by user applications. The other `res_*` functions provide low-level access to domain lookups, whereas `res_init` is useful even if the application itself is not concerned with doing lookups.

Behind the scenes, `getaddrinfo` in glibc [ultimately calls](https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/resolv/nss_dns/dns-host.c#L196) `res_nsearch` with `&_res`. `res_init` operates directly on this global reference, and is thus more useful to expose for most applications than the non-deprecated `res_ninit` function (which operators on an arbitrary `res_state`).

As far as I can tell, `res_init` is available in [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=res_init&manpath=SuSE+Linux/i386+11.3), [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?res_init+3+NetBSD-6.1), [OpenBSD](http://man.openbsd.org/resolver.3), [Solaris](http://www.polarhome.com/service/man/?qf=res_init&tf=2&of=Solaris&sf=), [Linux](https://linux.die.net/man/3/res_init), and [macOS](https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/res_init.3.html).
@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 27, 2017

@alexcrichton @aidanhs I think I tracked down all the major ones in fe94999

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Apr 27, 2017

📌 Commit fe94999 has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Apr 27, 2017

⌛ Testing commit fe94999 with merge faa15ce...

bors added a commit that referenced this pull request Apr 27, 2017
Add res_init

The `res_init` function, while deprecated, is critical for making networked applications work correctly when switching between networks, or between being offline and online. By default, `getaddrinfo` and friends use a cached version of `/etc/resolv.conf`, which means that network changes are not picked up by applications after they first start up. This has bitten [Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=214538), [Pidgin](https://developer.pidgin.im/ticket/2825), [MongoDB](https://jira.mongodb.org/browse/DOCS-5700), and more in the past.

The logic behind exposing only `res_init` is that it is the only `res_*` function that is frequently called directly by user applications. The other `res_*` functions provide low-level access to domain lookups, whereas `res_init` is useful even if the application itself is not concerned with doing lookups.

Behind the scenes, `getaddrinfo` in glibc [ultimately calls](https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/resolv/nss_dns/dns-host.c#L196) `res_nsearch` with `&_res`. `res_init` operates directly on this global reference, and is thus more useful to expose for most applications than the non-deprecated `res_ninit` function (which operators on an arbitrary `res_state`).

As far as I can tell, `res_init` is available in [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=res_init&manpath=SuSE+Linux/i386+11.3), [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?res_init+3+NetBSD-6.1), [OpenBSD](http://man.openbsd.org/resolver.3), [Solaris](http://www.polarhome.com/service/man/?qf=res_init&tf=2&of=Solaris&sf=), [Linux](https://linux.die.net/man/3/res_init), and [macOS](https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/res_init.3.html).
@malbarbo
Copy link
Contributor

malbarbo commented Apr 29, 2017

I also tested the example in rust-lang/rust#41570 and it worked on a real android phone. The program prints connected when the phone is back online.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 29, 2017

Interesting.. I suspect this is because Android probably uses nscd or something similar, which does automatically refresh the cache. If you're not running a network manager, that is not the case, and you must call res_init again (at least on Linux). I guess we could then expose res_init only for Linux, since that then seems like the only place it is needed.. Thoughts @alexcrichton?

@semarie
Copy link
Contributor

semarie commented Apr 29, 2017

Android doesn't use glibc as Linux does (their libc version has at least parts from OpenBSD, but I dunno for details).

@jonhoo your underline problem seems to be unexpected behaviour in glibc (maybe a bug should be opened over there ?). But it could be more simple to workaround it with a global solution (like calling res_init unconditionnally). From your bug report, it seems Mozilla already solves it this way, so it works, no need to diverge from a working solution.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 30, 2017

@semarie I don't know that this is actually unexpected behavior as far as glibc goes. It seems quite reasonable for it to cache /etc/resolv.conf so that it doesn't need to do an additional two system calls for every new connection. My question above was more whether, since this only seems to be an issue on Linux when no network manager is in use, it's sufficient for us to just expose libc::res_init on Linux (and not on other targets)? And place the call to res_init in a cfg!(linux) in the modified lookup_host.

@jonhoo
Copy link
Contributor Author

jonhoo commented May 1, 2017

@alexcrichton would you prefer just exposing the call on Linux, or keep the current version which exposes it on all UNIX platforms?

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok getting back to this, glad it's green! I'm fine still exposing this on all platforms, I mainly just want to be as careful as possible adding a new library to link to. I figure we can just be super conservative and only add it to libstd builds and then if it gets past rust-lang/rust CI I'll be happy.

src/unix/mod.rs Outdated
not(target_os = "openbsd"),
not(target_os = "bitrig"),
not(target_os = "solaris"),
not(target_env = "musl")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be converted to a whitelist? It's mostly just targeting Linux/android/FreebSD it looks like?

src/unix/mod.rs Outdated
// except on macOS and iOS, where we must link with lib resolv
// for res_init, despite libsystem_info including it:
// http://blog.achernya.com/2013/03/os-x-has-silly-libsystem.html
#[link(name = "resolv")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little wary of doing this, especially outside the stdbuild blocks. Just in terms of possible breakage on new platforms. I do realize though that this is required to link correctly as libc on crates.io otherwise doesn't pull in new libs (it relies on libc in std to do that).

Perhaps we could elide this section (or move it down below after the clause above) and then update the copy in libstd? We can rely on rust-lang/rust CI to block changes if it'd otherwise cause linkage problems. In the meantime we could just disable verification of the res_init symbol in libc as we already know it works!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow this? What do you mean by "move it down below after the clause above"? And by "update the copy in libstd"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, basically what I mean is:

  • Ideally, no libs are added
  • If a lib must be added, it should be added without adding clauses to cfg_if!
  • If a lib must be added, disable tests in libc-test and we'll test that it works in rust-lang/rust instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we should simply not link with resolv on macOS/iOS, both on stdbuild and not(stdbuild) , and disable the test for res_init if apple?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's start off by disabling linking to resolv everywhere. Let's then only enable it in the not(stdbuild) blocks to ensure we don't accidentally break crates.io. By only enabling it in not(stdbuild) then we can't test it in this PR, but it's ok to leave it enabled on all platforms you have it defined on currently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've removed all the linking with libresolv in fcf5fc0. As noted in my discussion comment though, it seems like res_init is just never tested any more following the res_init if android => true addition made in afebd98.

@jonhoo
Copy link
Contributor Author

jonhoo commented May 3, 2017

Hmm, following afebd98, it doesn't look like res_init is tested for any of the platforms..? :/

@alexcrichton
Copy link
Member

@jonhoo hm but the current failures on CI seem to contradict your findings that it's not being tested?

@jonhoo
Copy link
Contributor Author

jonhoo commented May 3, 2017

Huh, weird.. You're right. I guess I must have actually fixed some linking errors then :p
And now you'd like me to skip tests for if apple? Done in be7e45f

@jonhoo
Copy link
Contributor Author

jonhoo commented May 3, 2017

@alexcrichton all right, I think that brings the changes in line with what you wanted?

@alexcrichton
Copy link
Member

@bors: r+

Looks great! Thanks and sorry for the runaround!

@bors
Copy link
Contributor

bors commented May 3, 2017

📌 Commit be7e45f has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented May 3, 2017

⌛ Testing commit be7e45f with merge 03562b0...

bors added a commit that referenced this pull request May 3, 2017
Add res_init

The `res_init` function, while deprecated, is critical for making networked applications work correctly when switching between networks, or between being offline and online. By default, `getaddrinfo` and friends use a cached version of `/etc/resolv.conf`, which means that network changes are not picked up by applications after they first start up. This has bitten [Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=214538), [Pidgin](https://developer.pidgin.im/ticket/2825), [MongoDB](https://jira.mongodb.org/browse/DOCS-5700), and more in the past.

The logic behind exposing only `res_init` is that it is the only `res_*` function that is frequently called directly by user applications. The other `res_*` functions provide low-level access to domain lookups, whereas `res_init` is useful even if the application itself is not concerned with doing lookups.

Behind the scenes, `getaddrinfo` in glibc [ultimately calls](https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/resolv/nss_dns/dns-host.c#L196) `res_nsearch` with `&_res`. `res_init` operates directly on this global reference, and is thus more useful to expose for most applications than the non-deprecated `res_ninit` function (which operators on an arbitrary `res_state`).

As far as I can tell, `res_init` is available in [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=res_init&manpath=SuSE+Linux/i386+11.3), [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?res_init+3+NetBSD-6.1), [OpenBSD](http://man.openbsd.org/resolver.3), [Solaris](http://www.polarhome.com/service/man/?qf=res_init&tf=2&of=Solaris&sf=), [Linux](https://linux.die.net/man/3/res_init), and [macOS](https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/res_init.3.html).
@bors
Copy link
Contributor

bors commented May 3, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 03562b0 to master...

@bors bors merged commit be7e45f into rust-lang:master May 3, 2017
@jonhoo
Copy link
Contributor Author

jonhoo commented May 4, 2017

No worries at all! It's important to get the details of these things right, and gave me some further insight into libc's innards. Now onto rust-lang/rust#41570!

jonhoo added a commit to jonhoo/rust that referenced this pull request May 4, 2017
As discussed in rust-lang#41570, UNIX systems often cache the contents of
/etc/resolv.conf, which can cause lookup failures to persist even after
a network connection becomes available. This patch modifies lookup_host
to force a reload of the nameserver entries following a lookup failure.
This is in line with what many C programs already do (see rust-lang#41570 for
details). On systems with nscd, this should not be necessary, but not
all systems run nscd.

Fixes rust-lang#41570.
Depends on rust-lang/libc#585.
jonhoo added a commit to jonhoo/rust that referenced this pull request May 4, 2017
bors added a commit to rust-lang/rust that referenced this pull request May 4, 2017
…excrichton

Reload nameserver information on lookup failure

As discussed in #41570, UNIX systems often cache the contents of `/etc/resolv.conf`, which can cause lookup failures to persist even after a network connection becomes available. This patch modifies lookup_host to force a reload of the nameserver entries following a lookup failure. This is in line with what many C programs already do (see #41570 for details). On systems with nscd, this should not be necessary, but not all systems run nscd.

Fixes #41570.
Depends on rust-lang/libc#585.

r? @alexcrichton
jonhoo added a commit to jonhoo/rust that referenced this pull request May 5, 2017
As discussed in rust-lang#41570, UNIX systems often cache the contents of
/etc/resolv.conf, which can cause lookup failures to persist even after
a network connection becomes available. This patch modifies lookup_host
to force a reload of the nameserver entries following a lookup failure.
This is in line with what many C programs already do (see rust-lang#41570 for
details). On systems with nscd, this should not be necessary, but not
all systems run nscd.

Introduces an std linkage dependency on libresolv on macOS/iOS (which
also makes it necessary to update run-make/tools.mk).

Fixes rust-lang#41570.
Depends on rust-lang/libc#585.
frewsxcv added a commit to frewsxcv/rust that referenced this pull request May 5, 2017
…-fail, r=alexcrichton

Reload nameserver information on lookup failure

As discussed in rust-lang#41570, UNIX systems often cache the contents of `/etc/resolv.conf`, which can cause lookup failures to persist even after a network connection becomes available. This patch modifies lookup_host to force a reload of the nameserver entries following a lookup failure. This is in line with what many C programs already do (see rust-lang#41570 for details). On systems with nscd, this should not be necessary, but not all systems run nscd.

Fixes rust-lang#41570.
Depends on rust-lang/libc#585.

r? @alexcrichton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants