Preserve SyntaxContext for invalid/dummy spans in crate metadata #85211

Merged
merged 1 commit into from
May 14, 2021
6 changes: 3 additions & 3 deletions compiler/rustc_metadata/src/rmeta/decoder.rs
@@ -406,17 +406,17 @@ impl<'a, 'tcx> Decodable<DecodeContext<'a, 'tcx>> for ExpnId {

impl<'a, 'tcx> Decodable<DecodeContext<'a, 'tcx>> for Span {
fn decode(decoder: &mut DecodeContext<'a, 'tcx>) -> Result<Span, String> {
let ctxt = SyntaxContext::decode(decoder)?;
let tag = u8::decode(decoder)?;

if tag == TAG_INVALID_SPAN {
return Ok(DUMMY_SP);
if tag == TAG_PARTIAL_SPAN {
return Ok(DUMMY_SP.with_ctxt(ctxt));
}

debug_assert!(tag == TAG_VALID_SPAN_LOCAL || tag == TAG_VALID_SPAN_FOREIGN);

let lo = BytePos::decode(decoder)?;
let len = BytePos::decode(decoder)?;
let ctxt = SyntaxContext::decode(decoder)?;
let hi = lo + len;

let sess = if let Some(sess) = decoder.sess {
82 changes: 41 additions & 41 deletions compiler/rustc_metadata/src/rmeta/encoder.rs
@@ -187,11 +187,48 @@ impl<'a, 'tcx> Encodable<EncodeContext<'a, 'tcx>> for ExpnId {

impl<'a, 'tcx> Encodable<EncodeContext<'a, 'tcx>> for Span {
fn encode(&self, s: &mut EncodeContext<'a, 'tcx>) -> opaque::EncodeResult {
if *self == rustc_span::DUMMY_SP {
return TAG_INVALID_SPAN.encode(s);
let span = self.data();

// Don't serialize any `SyntaxContext`s from a proc-macro crate,
// since we don't load proc-macro dependencies during serialization.
// This means that any hygiene information from macros used *within*
// a proc-macro crate (e.g. invoking a macro that expands to a proc-macro
// definition) will be lost.
//
// This can show up in two ways:
//
// 1. Any hygiene information associated with the identifier of
// a proc macro (e.g. `#[proc_macro] pub fn $name`) will be lost.
// Since proc-macros can only be invoked from a different crate,
// real code should never need to care about this.
//
// 2. Using `Span::def_site` or `Span::mixed_site` will not
// include any hygiene information associated with the definition
// site. This means that a proc-macro cannot emit a `$crate`
// identifier which resolves to one of its dependencies,
// which also should never come up in practice.
//
// Additionally, this affects `Span::parent`, and any other
// span inspection APIs that would otherwise allow traversing
// the `SyntaxContexts` associated with a span.
//
// None of these user-visible effects should result in any
// cross-crate inconsistencies (getting one behavior in the same
// crate, and a different behavior in another crate) due to the
// limited surface that proc-macros can expose.
//
// IMPORTANT: If this is ever changed, be sure to update
// `rustc_span::hygiene::raw_encode_expn_id` to handle
// encoding `ExpnData` for proc-macro crates.
if s.is_proc_macro {
SyntaxContext::root().encode(s)?;
} else {
span.ctxt.encode(s)?;
}

let span = self.data();
if self.is_dummy() {
return TAG_PARTIAL_SPAN.encode(s);
}

// The Span infrastructure should make sure that this invariant holds:
debug_assert!(span.lo <= span.hi);
@@ -206,7 +243,7 @@ impl<'a, 'tcx> Encodable<EncodeContext<'a, 'tcx>> for Span {
if !s.source_file_cache.0.contains(span.hi) {
// Unfortunately, macro expansion still sometimes generates Spans
// that are malformed in this way.
return TAG_INVALID_SPAN.encode(s);
return TAG_PARTIAL_SPAN.encode(s);
}

let source_files = s.required_source_files.as_mut().expect("Already encoded SourceMap!");
@@ -262,43 +299,6 @@ impl<'a, 'tcx> Encodable<EncodeContext<'a, 'tcx>> for Span {
let len = hi - lo;
len.encode(s)?;

// Don't serialize any `SyntaxContext`s from a proc-macro crate,
// since we don't load proc-macro dependencies during serialization.
// This means that any hygiene information from macros used *within*
// a proc-macro crate (e.g. invoking a macro that expands to a proc-macro
// definition) will be lost.
//
// This can show up in two ways:
//
// 1. Any hygiene information associated with the identifier of
// a proc macro (e.g. `#[proc_macro] pub fn $name`) will be lost.
// Since proc-macros can only be invoked from a different crate,
// real code should never need to care about this.
//
// 2. Using `Span::def_site` or `Span::mixed_site` will not
// include any hygiene information associated with the definition
// site. This means that a proc-macro cannot emit a `$crate`
// identifier which resolves to one of its dependencies,
// which also should never come up in practice.
//
// Additionally, this affects `Span::parent`, and any other
// span inspection APIs that would otherwise allow traversing
// the `SyntaxContexts` associated with a span.
//
// None of these user-visible effects should result in any
// cross-crate inconsistencies (getting one behavior in the same
// crate, and a different behavior in another crate) due to the
// limited surface that proc-macros can expose.
//
// IMPORTANT: If this is ever changed, be sure to update
// `rustc_span::hygiene::raw_encode_expn_id` to handle
// encoding `ExpnData` for proc-macro crates.
if s.is_proc_macro {
SyntaxContext::root().encode(s)?;
} else {
span.ctxt.encode(s)?;
}

if tag == TAG_VALID_SPAN_FOREIGN {
// This needs to be two lines to avoid holding the `s.source_file_cache`
// while calling `cnum.encode(s)`
2 changes: 1 addition & 1 deletion compiler/rustc_metadata/src/rmeta/mod.rs
@@ -451,4 +451,4 @@ struct GeneratorData<'tcx> {
// Tags used for encoding Spans:
const TAG_VALID_SPAN_LOCAL: u8 = 0;
const TAG_VALID_SPAN_FOREIGN: u8 = 1;
const TAG_INVALID_SPAN: u8 = 2;
const TAG_PARTIAL_SPAN: u8 = 2;
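
Taken together, the decoder, encoder, and tag hunks above move the `SyntaxContext` to the front of each encoded span, so it is still available when the position is dropped. The following is only a toy model of that layout, assuming a fixed-width little-endian buffer and a `u32` in place of `SyntaxContext`; `ToySpan`, `encode`, `decode`, and `read_u32` are hypothetical names and not part of rustc.

```rust
// Toy model of the span layout after this change. All names here are
// hypothetical stand-ins; rustc's real encoder uses `EncodeContext`.

const TAG_VALID_SPAN_LOCAL: u8 = 0;
const TAG_PARTIAL_SPAN: u8 = 2;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToySpan {
    lo: u32,
    hi: u32,
    ctxt: u32, // stands in for `SyntaxContext`
}

impl ToySpan {
    // A dummy span has no usable position (both ends are zero).
    fn is_dummy(&self) -> bool {
        self.lo == 0 && self.hi == 0
    }
}

fn encode(span: ToySpan, out: &mut Vec<u8>) {
    // The context is always written first, even when the position is dropped.
    out.extend_from_slice(&span.ctxt.to_le_bytes());
    if span.is_dummy() {
        out.push(TAG_PARTIAL_SPAN);
        return;
    }
    out.push(TAG_VALID_SPAN_LOCAL);
    out.extend_from_slice(&span.lo.to_le_bytes());
    out.extend_from_slice(&(span.hi - span.lo).to_le_bytes());
}

fn read_u32(buf: &[u8], pos: &mut usize) -> u32 {
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&buf[*pos..*pos + 4]);
    *pos += 4;
    u32::from_le_bytes(bytes)
}

fn decode(buf: &[u8]) -> ToySpan {
    let mut pos = 0;
    // Read in the same order as `encode`: context, tag, then position.
    let ctxt = read_u32(buf, &mut pos);
    let tag = buf[pos];
    pos += 1;
    if tag == TAG_PARTIAL_SPAN {
        // Position is lost, but the context survives the round trip.
        return ToySpan { lo: 0, hi: 0, ctxt };
    }
    let lo = read_u32(buf, &mut pos);
    let len = read_u32(buf, &mut pos);
    ToySpan { lo, hi: lo + len, ctxt }
}
```

In this model, encoding `ToySpan { lo: 0, hi: 0, ctxt: 7 }` and decoding the result gives back a span that still carries context `7`, which is the property the rename from `TAG_INVALID_SPAN` to `TAG_PARTIAL_SPAN` reflects.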
@@ -0,0 +1,11 @@
// revisions: rpass1 rpass2

extern crate respan;

#[macro_use]
#[path = "invalid-span-helper-mod.rs"]
mod invalid_span_helper_mod;

// Invoke a macro from a different file - this
// allows us to get tokens with spans from different files
helper!(1);
@@ -0,0 +1,14 @@
#[macro_export]
macro_rules! helper {
// Use `:tt` instead of `:ident` so that we don't get a `None`-delimited group
($first:tt) => {
pub fn foo<T>() {
// The span of `$first` comes from another file,
// so the expression `1 + $first` ends up with an
// 'invalid' span that starts and ends in different files.
// We use the `respan!` macro to give all tokens the same
// `SyntaxContext`, so that the parser will try to merge the spans.
respan::respan!(let a = 1 + $first;);
}
}
}
19 changes: 19 additions & 0 deletions src/test/incremental/issue-85197-invalid-span/auxiliary/respan.rs
@@ -0,0 +1,19 @@
// force-host
// no-prefer-dynamic

#![crate_type = "proc-macro"]

extern crate proc_macro;
use proc_macro::TokenStream;


/// Copies the resolution information (the `SyntaxContext`) of the first
/// token to all other tokens in the stream. Does not recurse into groups.
#[proc_macro]
pub fn respan(input: TokenStream) -> TokenStream {
let first_span = input.clone().into_iter().next().unwrap().span();
input.into_iter().map(|mut tree| {
tree.set_span(tree.span().resolved_at(first_span));
tree
}).collect()
}
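
The `respan` macro above depends on `Span::resolved_at`, which keeps the location of the span it is called on and takes the resolution context of its argument. Because a proc-macro crate cannot be run as an ordinary program, the sketch below models that behaviour with plain data; `ToySpan` and this free-standing `respan` function are hypothetical stand-ins, and a `u32` plays the role of `SyntaxContext`.

```rust
// Toy model of what `respan` does to each token's span.
// `ToySpan` is a hypothetical stand-in for `proc_macro::Span`.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToySpan {
    lo: u32,
    hi: u32,
    ctxt: u32, // stands in for `SyntaxContext`
}

impl ToySpan {
    // Same location as `self`, resolution context of `other`.
    fn resolved_at(self, other: ToySpan) -> ToySpan {
        ToySpan { ctxt: other.ctxt, ..self }
    }
}

// Give every span the context of the first one, leaving positions untouched.
fn respan(tokens: &[ToySpan]) -> Vec<ToySpan> {
    let first = match tokens.first() {
        Some(first) => *first,
        None => return Vec::new(),
    };
    tokens.iter().map(|t| t.resolved_at(first)).collect()
}
```

After this pass every token shares one context while keeping its own position, so a parser that only refuses to join spans with differing contexts no longer has a reason to keep tokens from different files apart.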
24 changes: 24 additions & 0 deletions src/test/incremental/issue-85197-invalid-span/invalid_span_main.rs
@@ -0,0 +1,24 @@
// revisions: rpass1 rpass2
// aux-build:respan.rs
// aux-build:invalid-span-helper-lib.rs

// This issue has several different parts. The high level idea is:
// 1. We create an 'invalid' span with the help of the `respan` proc-macro.
// The compiler attempts to prevent the creation of invalid spans by
// refusing to join spans with different `SyntaxContext`s. We work around
// this by applying the same `SyntaxContext` to the span of every token,
// using `Span::resolved_at`.
// 2. We use this invalid span in the body of a function, causing it to get
// encoded into the `optimized_mir`.
// 3. We call the function from a different crate - since the function is generic,
// monomorphization runs, causing `optimized_mir` to get called.
// 4. We re-run compilation using our populated incremental cache, but without
// making any changes. When we recompile the crate containing our generic function
// (`invalid_span_helper_lib`), we load the span from the incremental cache, and
// write it into the crate metadata.

extern crate invalid_span_helper_lib;

fn main() {
invalid_span_helper_lib::foo::<u8>();
}
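
The test's comment calls a span 'invalid' when it starts and ends in different files, and the encoder hunk detects this with `!s.source_file_cache.0.contains(span.hi)`. A rough sketch of that check follows, assuming each file occupies an inclusive range of one global offset space; `ToyFile`, `lookup`, and `is_encodable` are hypothetical names, not rustc's `SourceMap` API.

```rust
// Toy model of the "does the file holding `lo` also hold `hi`?" check.
// All names are hypothetical; rustc uses `SourceFile` and `BytePos`.

struct ToyFile {
    start: u32,
    end: u32,
}

impl ToyFile {
    // Inclusive on both ends.
    fn contains(&self, pos: u32) -> bool {
        self.start <= pos && pos <= self.end
    }
}

// Find the file that holds a given global offset, if any.
fn lookup(files: &[ToyFile], pos: u32) -> Option<&ToyFile> {
    files.iter().find(|f| f.contains(pos))
}

// A span keeps its position only if both ends fall in the same file.
fn is_encodable(files: &[ToyFile], lo: u32, hi: u32) -> bool {
    match lookup(files, lo) {
        Some(file) => file.contains(hi),
        None => false,
    }
}
```

With two files covering offsets `0..=99` and `101..=199`, a span from 10 to 20 passes the check, while a span from 10 to 150 fails it. In the real encoder that second case is written with `TAG_PARTIAL_SPAN`, so its position is dropped but its `SyntaxContext` is preserved.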