compiletime can blow up when building with optimizations and debug info #48226

Closed
matthiaskrgr opened this issue Feb 15, 2018 · 23 comments
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
P-high: High priority
regression-from-stable-to-nightly: Performance or correctness regression from stable to nightly.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@matthiaskrgr

rustc 1.25.0-nightly (3ec5a99aa 2018-02-14)
binary: rustc
commit-hash: 3ec5a99aaa0084d97a9e845b34fdf03d1462c475
commit-date: 2018-02-14
host: x86_64-unknown-linux-gnu
release: 1.25.0-nightly
LLVM version: 6.0

src/main.rs

fn main() {
    println!("Hello, world!");
}

Cargo.toml:

[package]
name = "bla"
version = "0.1.0"
authors = ["me"]

[dependencies]
x11-dl = "2.17.2"

[profile.release]
lto=true
codegen-units=1

cargo build --release
[...]
    Finished release [optimized] target(s) in 16.47 secs

So far, so good.

Now let's add debug=true to the [profile.release] section of Cargo.toml:

cargo build --release
[...]
    Finished release [optimized + debuginfo] target(s) in 294.82 secs

Oops...
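
For scale, the two build times above (16.47 secs without debug info, 294.82 secs with it) work out to a slowdown of roughly 18x. A minimal Rust sketch of that arithmetic follows; the function name `slowdown_factor` is only illustrative and not part of any tool:

```rust
/// Ratio of the slow build time to the baseline build time.
/// `slowdown_factor` is a hypothetical helper used only for this illustration.
fn slowdown_factor(baseline_secs: f64, slow_secs: f64) -> f64 {
    slow_secs / baseline_secs
}

fn main() {
    // Timings copied from the two `cargo build --release` runs above.
    let without_debuginfo = 16.47; // release [optimized]
    let with_debuginfo = 294.82; // release [optimized + debuginfo]

    let factor = slowdown_factor(without_debuginfo, with_debuginfo);
    println!("debug=true makes the build about {:.1}x slower", factor);
}
```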

Looks like most of the time is spent in LLVM:

 time: 0.011; rss: 243MB	death checking
  time: 0.000; rss: 243MB	unused lib feature checking
  time: 0.088; rss: 244MB	lint checking
  time: 0.000; rss: 244MB	resolving dependency formats
    time: 0.154; rss: 254MB	write metadata
    time: 0.111; rss: 261MB	translation item collection
    time: 0.007; rss: 261MB	codegen unit partitioning
    time: 0.430; rss: 296MB	translate to LLVM IR
    time: 0.000; rss: 296MB	assert dep graph
    time: 0.000; rss: 296MB	serialize dep graph
  time: 0.771; rss: 296MB	translation
    time: 0.325; rss: 182MB	llvm function passes [x11_dl0]
    time: 4.235; rss: 193MB	llvm module passes [x11_dl0]
    time: 279.619; rss: 194MB	codegen passes [x11_dl0]
  time: 284.345; rss: 158MB	LLVM passes
===-------------------------------------------------------------------------===
                              Register Allocation
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0189 seconds (0.0196 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0050 ( 31.8%)   0.0010 ( 31.6%)   0.0060 ( 31.7%)   0.0063 ( 32.1%)  Spiller
   0.0046 ( 28.7%)   0.0010 ( 34.2%)   0.0056 ( 29.6%)   0.0058 ( 29.7%)  Evict
   0.0033 ( 20.9%)   0.0001 (  3.2%)   0.0034 ( 18.1%)   0.0035 ( 17.8%)  Global Splitting
   0.0015 (  9.7%)   0.0005 ( 17.0%)   0.0021 ( 10.9%)   0.0021 ( 10.9%)  Seed Live Regs
   0.0014 (  8.8%)   0.0004 ( 13.9%)   0.0018 (  9.6%)   0.0019 (  9.5%)  Local Splitting
   0.0159 (100.0%)   0.0030 (100.0%)   0.0189 (100.0%)   0.0196 (100.0%)  Total

===-------------------------------------------------------------------------===
                      Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
  Total Execution Time: 0.4225 seconds (0.4328 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.1520 ( 42.6%)   0.0141 ( 21.5%)   0.1660 ( 39.3%)   0.1676 ( 38.7%)  DAG Combining 1
   0.0589 ( 16.5%)   0.0149 ( 22.8%)   0.0738 ( 17.5%)   0.0758 ( 17.5%)  Instruction Selection
   0.0386 ( 10.8%)   0.0100 ( 15.2%)   0.0486 ( 11.5%)   0.0499 ( 11.5%)  Instruction Scheduling
   0.0295 (  8.2%)   0.0077 ( 11.7%)   0.0371 (  8.8%)   0.0390 (  9.0%)  Instruction Creation
   0.0281 (  7.9%)   0.0071 ( 10.9%)   0.0352 (  8.3%)   0.0364 (  8.4%)  DAG Combining 2
   0.0269 (  7.5%)   0.0055 (  8.5%)   0.0324 (  7.7%)   0.0332 (  7.7%)  DAG Legalization
   0.0091 (  2.6%)   0.0026 (  4.0%)   0.0117 (  2.8%)   0.0121 (  2.8%)  Type Legalization
   0.0065 (  1.8%)   0.0016 (  2.5%)   0.0082 (  1.9%)   0.0085 (  2.0%)  Vector Legalization
   0.0046 (  1.3%)   0.0011 (  1.7%)   0.0057 (  1.4%)   0.0060 (  1.4%)  DAG Combining after legalize types
   0.0029 (  0.8%)   0.0008 (  1.2%)   0.0037 (  0.9%)   0.0042 (  1.0%)  Instruction Scheduling Cleanup
   0.3571 (100.0%)   0.0654 (100.0%)   0.4225 (100.0%)   0.4328 (100.0%)  Total

===-------------------------------------------------------------------------===
                                 DWARF Emission
===-------------------------------------------------------------------------===
  Total Execution Time: 264.0672 seconds (264.4415 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  261.6501 ( 99.8%)   1.0310 ( 54.4%)  262.6811 ( 99.5%)  263.0485 ( 99.5%)  Debug Info Emission
   0.5193 (  0.2%)   0.8656 ( 45.6%)   1.3848 (  0.5%)   1.3917 (  0.5%)  DWARF Exception Writer
   0.0012 (  0.0%)   0.0000 (  0.0%)   0.0013 (  0.0%)   0.0013 (  0.0%)  DWARF Debug Writer
  262.1706 (100.0%)   1.8966 (100.0%)  264.0672 (100.0%)  264.4415 (100.0%)  Total
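
The DWARF Emission table above is where the time goes: Debug Info Emission alone accounts for about 263 seconds of wall-clock time. To pick the slowest entry out of a table like this automatically, something along the following lines could work. It is a rough sketch that assumes each data row has four `seconds (percent%)` groups followed by the pass name, as in the rows above; `PassTiming`, `parse_row`, and `slowest_pass` are hypothetical names, not part of rustc or LLVM.

```rust
/// One data row of a timing table (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct PassTiming {
    wall_secs: f64,
    name: String,
}

/// Parses a row such as
/// `0.5193 (  0.2%)   0.8656 ( 45.6%)   1.3848 (  0.5%)   1.3917 (  0.5%)  DWARF Exception Writer`.
/// Returns `None` for headers, separators, and anything else that does not match.
fn parse_row(line: &str) -> Option<PassTiming> {
    // Splitting on "%)" (not on ')') keeps pass names that contain parentheses intact.
    let parts: Vec<&str> = line.splitn(5, "%)").collect();
    if parts.len() != 5 {
        return None;
    }
    // The fourth group is the wall-clock column: "<seconds> ( <percent>".
    let wall_secs = parts[3].split('(').next()?.trim().parse::<f64>().ok()?;
    let name = parts[4].trim();
    if name.is_empty() {
        return None;
    }
    Some(PassTiming { wall_secs, name: name.to_string() })
}

/// Returns the row with the largest wall-clock time, ignoring the "Total" row.
fn slowest_pass(report: &str) -> Option<PassTiming> {
    report
        .lines()
        .filter_map(parse_row)
        .filter(|row| row.name != "Total")
        .max_by(|a, b| a.wall_secs.total_cmp(&b.wall_secs))
}

// Rows copied from the DWARF Emission table above.
const DWARF_ROWS: &str = r#"
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  261.6501 ( 99.8%)   1.0310 ( 54.4%)  262.6811 ( 99.5%)  263.0485 ( 99.5%)  Debug Info Emission
   0.5193 (  0.2%)   0.8656 ( 45.6%)   1.3848 (  0.5%)   1.3917 (  0.5%)  DWARF Exception Writer
   0.0012 (  0.0%)   0.0000 (  0.0%)   0.0013 (  0.0%)   0.0013 (  0.0%)  DWARF Debug Writer
  262.1706 (100.0%)   1.8966 (100.0%)  264.0672 (100.0%)  264.4415 (100.0%)  Total
"#;

fn main() {
    if let Some(top) = slowest_pass(DWARF_ROWS) {
        println!("slowest: {} ({:.1} s wall clock)", top.name, top.wall_secs);
    }
}
```

Splitting on `%)` is a deliberate choice here, since several pass names in the full report contain parentheses of their own.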

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 282.5991 seconds (283.0060 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  263.3423 ( 94.8%)   3.8350 ( 78.1%)  267.1773 ( 94.5%)  267.5604 ( 94.5%)  X86 Assembly Printer
   8.3376 (  3.0%)   0.0692 (  1.4%)   8.4068 (  3.0%)   8.4095 (  3.0%)  Virtual Register Rewriter
   0.9262 (  0.3%)   0.0288 (  0.6%)   0.9550 (  0.3%)   0.9550 (  0.3%)  Dead Store Elimination
   0.6031 (  0.2%)   0.0385 (  0.8%)   0.6416 (  0.2%)   0.6447 (  0.2%)  Machine Instruction Scheduler
   0.4900 (  0.2%)   0.1100 (  2.2%)   0.6000 (  0.2%)   0.6182 (  0.2%)  Demanded bits analysis
   0.5595 (  0.2%)   0.0065 (  0.1%)   0.5660 (  0.2%)   0.5666 (  0.2%)  Live DEBUG_VALUE analysis
   0.2207 (  0.1%)   0.0502 (  1.0%)   0.2709 (  0.1%)   0.2706 (  0.1%)  Target Library Information
   0.1724 (  0.1%)   0.0056 (  0.1%)   0.1781 (  0.1%)   0.1793 (  0.1%)  Prologue/Epilogue Insertion & Frame Finalization
   0.1231 (  0.0%)   0.0146 (  0.3%)   0.1377 (  0.0%)   0.1351 (  0.0%)  SROA
   0.0950 (  0.0%)   0.0160 (  0.3%)   0.1110 (  0.0%)   0.1108 (  0.0%)  Combine redundant instructions
   0.0999 (  0.0%)   0.0055 (  0.1%)   0.1054 (  0.0%)   0.1063 (  0.0%)  Natural Loop Information
   0.0915 (  0.0%)   0.0150 (  0.3%)   0.1065 (  0.0%)   0.1063 (  0.0%)  Global Value Numbering
   0.0724 (  0.0%)   0.0253 (  0.5%)   0.0977 (  0.0%)   0.0975 (  0.0%)  Combine redundant instructions
   0.0769 (  0.0%)   0.0133 (  0.3%)   0.0903 (  0.0%)   0.0901 (  0.0%)  Combine redundant instructions
   0.0673 (  0.0%)   0.0182 (  0.4%)   0.0856 (  0.0%)   0.0855 (  0.0%)  SROA
   0.0805 (  0.0%)   0.0044 (  0.1%)   0.0848 (  0.0%)   0.0849 (  0.0%)  ThinLTO Bitcode Writer
   0.0778 (  0.0%)   0.0179 (  0.4%)   0.0957 (  0.0%)   0.0811 (  0.0%)  Module Verifier
   0.0692 (  0.0%)   0.0099 (  0.2%)   0.0791 (  0.0%)   0.0789 (  0.0%)  Combine redundant instructions
   0.0667 (  0.0%)   0.0120 (  0.2%)   0.0787 (  0.0%)   0.0785 (  0.0%)  Combine redundant instructions
   0.0611 (  0.0%)   0.0121 (  0.2%)   0.0732 (  0.0%)   0.0732 (  0.0%)  SLP Vectorizer
   0.0662 (  0.0%)   0.0025 (  0.1%)   0.0687 (  0.0%)   0.0689 (  0.0%)  X86 Execution Dependency Fix
   0.0509 (  0.0%)   0.0128 (  0.3%)   0.0637 (  0.0%)   0.0666 (  0.0%)  Greedy Register Allocator
   0.0559 (  0.0%)   0.0108 (  0.2%)   0.0667 (  0.0%)   0.0666 (  0.0%)  Early CSE w/ MemorySSA
   0.0428 (  0.0%)   0.0193 (  0.4%)   0.0621 (  0.0%)   0.0646 (  0.0%)  Rewrite Symbols
   0.0600 (  0.0%)   0.0012 (  0.0%)   0.0613 (  0.0%)   0.0614 (  0.0%)  X86 pseudo instruction expansion pass
   0.0588 (  0.0%)   0.0010 (  0.0%)   0.0598 (  0.0%)   0.0599 (  0.0%)  X86 LEA Fixup
   0.0556 (  0.0%)   0.0013 (  0.0%)   0.0569 (  0.0%)   0.0569 (  0.0%)  Post-RA pseudo instruction expansion pass
   0.0506 (  0.0%)   0.0026 (  0.1%)   0.0532 (  0.0%)   0.0537 (  0.0%)  Machine Copy Propagation Pass
   0.0394 (  0.0%)   0.0102 (  0.2%)   0.0496 (  0.0%)   0.0497 (  0.0%)  Memory SSA
   0.0358 (  0.0%)   0.0075 (  0.2%)   0.0433 (  0.0%)   0.0433 (  0.0%)  Combine redundant instructions
   0.0335 (  0.0%)   0.0096 (  0.2%)   0.0431 (  0.0%)   0.0430 (  0.0%)  Module Verifier
   0.0333 (  0.0%)   0.0073 (  0.1%)   0.0406 (  0.0%)   0.0405 (  0.0%)  Value Propagation
   0.0359 (  0.0%)   0.0038 (  0.1%)   0.0397 (  0.0%)   0.0404 (  0.0%)  Debug Variable Analysis
   0.0270 (  0.0%)   0.0093 (  0.2%)   0.0363 (  0.0%)   0.0380 (  0.0%)  Demanded bits analysis
   0.0361 (  0.0%)   0.0000 (  0.0%)   0.0361 (  0.0%)   0.0361 (  0.0%)  Called Value Propagation
   0.0259 (  0.0%)   0.0094 (  0.2%)   0.0353 (  0.0%)   0.0352 (  0.0%)  Combine redundant instructions
   0.0268 (  0.0%)   0.0077 (  0.2%)   0.0345 (  0.0%)   0.0345 (  0.0%)  MemCpy Optimization
   0.0328 (  0.0%)   0.0060 (  0.1%)   0.0389 (  0.0%)   0.0330 (  0.0%)  Early CSE
   0.0271 (  0.0%)   0.0050 (  0.1%)   0.0321 (  0.0%)   0.0320 (  0.0%)  Value Propagation
   0.0188 (  0.0%)   0.0100 (  0.2%)   0.0288 (  0.0%)   0.0288 (  0.0%)  Module Summary Analysis
   0.0232 (  0.0%)   0.0050 (  0.1%)   0.0282 (  0.0%)   0.0282 (  0.0%)  Aggressive Dead Code Elimination
   0.0199 (  0.0%)   0.0080 (  0.2%)   0.0279 (  0.0%)   0.0278 (  0.0%)  Deduce function attributes
   0.0221 (  0.0%)   0.0049 (  0.1%)   0.0270 (  0.0%)   0.0269 (  0.0%)  Sparse Conditional Constant Propagation
   0.0210 (  0.0%)   0.0055 (  0.1%)   0.0266 (  0.0%)   0.0265 (  0.0%)  Machine Module Information
   0.0181 (  0.0%)   0.0050 (  0.1%)   0.0231 (  0.0%)   0.0231 (  0.0%)  CodeGen Prepare
   0.0168 (  0.0%)   0.0047 (  0.1%)   0.0214 (  0.0%)   0.0225 (  0.0%)  Machine Common Subexpression Elimination
   0.0178 (  0.0%)   0.0048 (  0.1%)   0.0225 (  0.0%)   0.0225 (  0.0%)  Reassociate expressions
   0.0144 (  0.0%)   0.0066 (  0.1%)   0.0210 (  0.0%)   0.0210 (  0.0%)  Interprocedural Sparse Conditional Constant Propagation
   0.0209 (  0.0%)   0.0019 (  0.0%)   0.0228 (  0.0%)   0.0204 (  0.0%)  Simplify the CFG
   0.0158 (  0.0%)   0.0038 (  0.1%)   0.0196 (  0.0%)   0.0196 (  0.0%)  Bit-Tracking Dead Code Elimination
   0.0125 (  0.0%)   0.0061 (  0.1%)   0.0186 (  0.0%)   0.0196 (  0.0%)  Demanded bits analysis
   0.0155 (  0.0%)   0.0042 (  0.1%)   0.0197 (  0.0%)   0.0195 (  0.0%)  Jump Threading
   0.0138 (  0.0%)   0.0048 (  0.1%)   0.0187 (  0.0%)   0.0189 (  0.0%)  Post-Dominator Tree Construction
   0.0126 (  0.0%)   0.0056 (  0.1%)   0.0182 (  0.0%)   0.0182 (  0.0%)  Remove unused exception handling info
   0.0138 (  0.0%)   0.0037 (  0.1%)   0.0175 (  0.0%)   0.0175 (  0.0%)  Tail Call Elimination
   0.0122 (  0.0%)   0.0041 (  0.1%)   0.0164 (  0.0%)   0.0168 (  0.0%)  Simple Register Coalescing
   0.0120 (  0.0%)   0.0039 (  0.1%)   0.0159 (  0.0%)   0.0168 (  0.0%)  Live Interval Analysis
   0.0141 (  0.0%)   0.0026 (  0.1%)   0.0166 (  0.0%)   0.0166 (  0.0%)  Induction Variable Simplification
   0.0158 (  0.0%)   0.0001 (  0.0%)   0.0158 (  0.0%)   0.0158 (  0.0%)  Global Variable Optimizer
   0.0126 (  0.0%)   0.0031 (  0.1%)   0.0157 (  0.0%)   0.0156 (  0.0%)  Jump Threading
   0.0117 (  0.0%)   0.0036 (  0.1%)   0.0153 (  0.0%)   0.0153 (  0.0%)  Simplify the CFG
   0.0055 (  0.0%)   0.0095 (  0.2%)   0.0150 (  0.0%)   0.0152 (  0.0%)  Free MachineFunction
   0.0099 (  0.0%)   0.0049 (  0.1%)   0.0148 (  0.0%)   0.0147 (  0.0%)  Promote 'by reference' arguments to scalars
   0.0109 (  0.0%)   0.0033 (  0.1%)   0.0142 (  0.0%)   0.0146 (  0.0%)  Merge disjoint stack slots
   0.0106 (  0.0%)   0.0036 (  0.1%)   0.0142 (  0.0%)   0.0142 (  0.0%)  Branch Probability Analysis
   0.0102 (  0.0%)   0.0032 (  0.1%)   0.0134 (  0.0%)   0.0134 (  0.0%)  Simplify the CFG
   0.0097 (  0.0%)   0.0036 (  0.1%)   0.0134 (  0.0%)   0.0133 (  0.0%)  Block Frequency Analysis
   0.0096 (  0.0%)   0.0030 (  0.1%)   0.0126 (  0.0%)   0.0125 (  0.0%)  Simplify the CFG
   0.0084 (  0.0%)   0.0029 (  0.1%)   0.0113 (  0.0%)   0.0120 (  0.0%)  Peephole Optimizations
   0.0116 (  0.0%)   0.0000 (  0.0%)   0.0116 (  0.0%)   0.0116 (  0.0%)  Global Variable Optimizer
   0.0084 (  0.0%)   0.0027 (  0.1%)   0.0111 (  0.0%)   0.0114 (  0.0%)  Two-Address instruction pass
   0.0087 (  0.0%)   0.0026 (  0.1%)   0.0113 (  0.0%)   0.0113 (  0.0%)  Simplify the CFG
   0.0080 (  0.0%)   0.0031 (  0.1%)   0.0111 (  0.0%)   0.0111 (  0.0%)  Dominator Tree Construction
   0.0078 (  0.0%)   0.0033 (  0.1%)   0.0110 (  0.0%)   0.0110 (  0.0%)  Simplify the CFG
   0.0098 (  0.0%)   0.0011 (  0.0%)   0.0109 (  0.0%)   0.0109 (  0.0%)  Loop Invariant Code Motion
   0.0074 (  0.0%)   0.0030 (  0.1%)   0.0104 (  0.0%)   0.0107 (  0.0%)  Memory Dependence Analysis
   0.0067 (  0.0%)   0.0035 (  0.1%)   0.0102 (  0.0%)   0.0102 (  0.0%)  Function Alias Analysis Results
   0.0064 (  0.0%)   0.0035 (  0.1%)   0.0099 (  0.0%)   0.0100 (  0.0%)  Loop Vectorization
   0.0063 (  0.0%)   0.0037 (  0.1%)   0.0100 (  0.0%)   0.0100 (  0.0%)  Branch Probability Analysis
   0.0077 (  0.0%)   0.0022 (  0.0%)   0.0099 (  0.0%)   0.0100 (  0.0%)  Conditionally eliminate dead library calls
   0.0070 (  0.0%)   0.0028 (  0.1%)   0.0099 (  0.0%)   0.0098 (  0.0%)  Scalar Evolution Analysis
   0.0066 (  0.0%)   0.0029 (  0.1%)   0.0095 (  0.0%)   0.0097 (  0.0%)  Function Alias Analysis Results
   0.0069 (  0.0%)   0.0026 (  0.1%)   0.0094 (  0.0%)   0.0094 (  0.0%)  Remove redundant instructions
   0.0073 (  0.0%)   0.0017 (  0.0%)   0.0090 (  0.0%)   0.0094 (  0.0%)  Control Flow Optimizer
   0.0067 (  0.0%)   0.0023 (  0.0%)   0.0089 (  0.0%)   0.0090 (  0.0%)  Live Range Shrink
   0.0060 (  0.0%)   0.0023 (  0.0%)   0.0084 (  0.0%)   0.0084 (  0.0%)  Dominator Tree Construction
   0.0060 (  0.0%)   0.0024 (  0.0%)   0.0084 (  0.0%)   0.0082 (  0.0%)  Scalar Evolution Analysis
   0.0074 (  0.0%)   0.0006 (  0.0%)   0.0080 (  0.0%)   0.0080 (  0.0%)  Loop Invariant Code Motion
   0.0058 (  0.0%)   0.0022 (  0.0%)   0.0080 (  0.0%)   0.0080 (  0.0%)  Natural Loop Information
   0.0052 (  0.0%)   0.0027 (  0.1%)   0.0079 (  0.0%)   0.0079 (  0.0%)  Promote Memory to Register
   0.0053 (  0.0%)   0.0022 (  0.0%)   0.0076 (  0.0%)   0.0077 (  0.0%)  Memory Dependence Analysis
   0.0054 (  0.0%)   0.0023 (  0.0%)   0.0078 (  0.0%)   0.0077 (  0.0%)  Scalar Evolution Analysis
   0.0053 (  0.0%)   0.0019 (  0.0%)   0.0072 (  0.0%)   0.0074 (  0.0%)  Remove dead machine instructions
   0.0054 (  0.0%)   0.0020 (  0.0%)   0.0074 (  0.0%)   0.0073 (  0.0%)  Dominator Tree Construction
   0.0047 (  0.0%)   0.0025 (  0.1%)   0.0072 (  0.0%)   0.0073 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0050 (  0.0%)   0.0019 (  0.0%)   0.0070 (  0.0%)   0.0072 (  0.0%)  Machine Block Frequency Analysis
   0.0066 (  0.0%)   0.0006 (  0.0%)   0.0072 (  0.0%)   0.0072 (  0.0%)  Unroll loops
   0.0051 (  0.0%)   0.0020 (  0.0%)   0.0071 (  0.0%)   0.0071 (  0.0%)  Dominator Tree Construction
   0.0054 (  0.0%)   0.0016 (  0.0%)   0.0070 (  0.0%)   0.0070 (  0.0%)  Rotate Loops
   0.0050 (  0.0%)   0.0019 (  0.0%)   0.0070 (  0.0%)   0.0070 (  0.0%)  Natural Loop Information
   0.0048 (  0.0%)   0.0020 (  0.0%)   0.0068 (  0.0%)   0.0069 (  0.0%)  Float to int
   0.0051 (  0.0%)   0.0019 (  0.0%)   0.0069 (  0.0%)   0.0069 (  0.0%)  Natural Loop Information
   0.0057 (  0.0%)   0.0008 (  0.0%)   0.0065 (  0.0%)   0.0069 (  0.0%)  Dominator Tree Construction
   0.0049 (  0.0%)   0.0019 (  0.0%)   0.0068 (  0.0%)   0.0068 (  0.0%)  Dominator Tree Construction
   0.0045 (  0.0%)   0.0024 (  0.0%)   0.0068 (  0.0%)   0.0068 (  0.0%)  Dominator Tree Construction
   0.0049 (  0.0%)   0.0018 (  0.0%)   0.0067 (  0.0%)   0.0067 (  0.0%)  Dominator Tree Construction
   0.0038 (  0.0%)   0.0029 (  0.1%)   0.0067 (  0.0%)   0.0067 (  0.0%)  Dominator Tree Construction
   0.0047 (  0.0%)   0.0018 (  0.0%)   0.0066 (  0.0%)   0.0066 (  0.0%)  Dominator Tree Construction
   0.0044 (  0.0%)   0.0019 (  0.0%)   0.0063 (  0.0%)   0.0065 (  0.0%)  Branch Probability Analysis
   0.0047 (  0.0%)   0.0018 (  0.0%)   0.0065 (  0.0%)   0.0065 (  0.0%)  Dominator Tree Construction
   0.0047 (  0.0%)   0.0018 (  0.0%)   0.0065 (  0.0%)   0.0065 (  0.0%)  Natural Loop Information
   0.0043 (  0.0%)   0.0019 (  0.0%)   0.0063 (  0.0%)   0.0064 (  0.0%)  Memory Dependence Analysis
   0.0045 (  0.0%)   0.0018 (  0.0%)   0.0064 (  0.0%)   0.0064 (  0.0%)  Dominator Tree Construction
   0.0047 (  0.0%)   0.0016 (  0.0%)   0.0064 (  0.0%)   0.0063 (  0.0%)  PGOMemOPSize
   0.0044 (  0.0%)   0.0019 (  0.0%)   0.0064 (  0.0%)   0.0063 (  0.0%)  Function Alias Analysis Results
   0.0022 (  0.0%)   0.0041 (  0.1%)   0.0063 (  0.0%)   0.0063 (  0.0%)  Call-site splitting
   0.0043 (  0.0%)   0.0019 (  0.0%)   0.0062 (  0.0%)   0.0062 (  0.0%)  Function Alias Analysis Results
   0.0044 (  0.0%)   0.0018 (  0.0%)   0.0062 (  0.0%)   0.0062 (  0.0%)  Natural Loop Information
   0.0044 (  0.0%)   0.0019 (  0.0%)   0.0062 (  0.0%)   0.0062 (  0.0%)  Function Alias Analysis Results
   0.0044 (  0.0%)   0.0018 (  0.0%)   0.0062 (  0.0%)   0.0062 (  0.0%)  Function Alias Analysis Results
   0.0043 (  0.0%)   0.0018 (  0.0%)   0.0061 (  0.0%)   0.0062 (  0.0%)  Simplify the CFG
   0.0044 (  0.0%)   0.0018 (  0.0%)   0.0061 (  0.0%)   0.0061 (  0.0%)  Natural Loop Information
   0.0042 (  0.0%)   0.0019 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Function Alias Analysis Results
   0.0041 (  0.0%)   0.0019 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Function Alias Analysis Results
   0.0042 (  0.0%)   0.0018 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Function Alias Analysis Results
   0.0043 (  0.0%)   0.0017 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Natural Loop Information
   0.0041 (  0.0%)   0.0018 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Function Alias Analysis Results
   0.0041 (  0.0%)   0.0019 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Function Alias Analysis Results
   0.0042 (  0.0%)   0.0016 (  0.0%)   0.0058 (  0.0%)   0.0059 (  0.0%)  MachinePostDominator Tree Construction
   0.0041 (  0.0%)   0.0018 (  0.0%)   0.0059 (  0.0%)   0.0059 (  0.0%)  Function Alias Analysis Results
   0.0041 (  0.0%)   0.0018 (  0.0%)   0.0058 (  0.0%)   0.0059 (  0.0%)  Function Alias Analysis Results
   0.0038 (  0.0%)   0.0022 (  0.0%)   0.0060 (  0.0%)   0.0059 (  0.0%)  Lazy Branch Probability Analysis
   0.0039 (  0.0%)   0.0018 (  0.0%)   0.0057 (  0.0%)   0.0058 (  0.0%)  Natural Loop Information
   0.0041 (  0.0%)   0.0017 (  0.0%)   0.0057 (  0.0%)   0.0057 (  0.0%)  Canonicalize natural loops
   0.0041 (  0.0%)   0.0016 (  0.0%)   0.0057 (  0.0%)   0.0057 (  0.0%)  Natural Loop Information
   0.0038 (  0.0%)   0.0014 (  0.0%)   0.0052 (  0.0%)   0.0055 (  0.0%)  MachinePostDominator Tree Construction
   0.0039 (  0.0%)   0.0016 (  0.0%)   0.0055 (  0.0%)   0.0055 (  0.0%)  Speculatively execute instructions if target has divergent branches
   0.0043 (  0.0%)   0.0010 (  0.0%)   0.0053 (  0.0%)   0.0054 (  0.0%)  Branch Probability Basic Block Placement
   0.0037 (  0.0%)   0.0014 (  0.0%)   0.0050 (  0.0%)   0.0054 (  0.0%)  Machine Block Frequency Analysis
   0.0039 (  0.0%)   0.0013 (  0.0%)   0.0053 (  0.0%)   0.0054 (  0.0%)  Remove dead machine instructions
   0.0038 (  0.0%)   0.0016 (  0.0%)   0.0054 (  0.0%)   0.0054 (  0.0%)  MergedLoadStoreMotion
   0.0039 (  0.0%)   0.0015 (  0.0%)   0.0053 (  0.0%)   0.0053 (  0.0%)  Expand memcmp() to load/stores
   0.0038 (  0.0%)   0.0015 (  0.0%)   0.0053 (  0.0%)   0.0053 (  0.0%)  Lazy Value Information Analysis
   0.0037 (  0.0%)   0.0014 (  0.0%)   0.0051 (  0.0%)   0.0053 (  0.0%)  Machine InstCombiner
   0.0036 (  0.0%)   0.0015 (  0.0%)   0.0051 (  0.0%)   0.0052 (  0.0%)  Canonicalize natural loops
   0.0038 (  0.0%)   0.0014 (  0.0%)   0.0052 (  0.0%)   0.0052 (  0.0%)  Constant Hoisting
   0.0039 (  0.0%)   0.0011 (  0.0%)   0.0051 (  0.0%)   0.0052 (  0.0%)  Machine code sinking
   0.0037 (  0.0%)   0.0014 (  0.0%)   0.0051 (  0.0%)   0.0051 (  0.0%)  Lazy Value Information Analysis
   0.0031 (  0.0%)   0.0019 (  0.0%)   0.0050 (  0.0%)   0.0051 (  0.0%)  Dominator Tree Construction
   0.0035 (  0.0%)   0.0015 (  0.0%)   0.0051 (  0.0%)   0.0051 (  0.0%)  Canonicalize natural loops
   0.0033 (  0.0%)   0.0017 (  0.0%)   0.0051 (  0.0%)   0.0050 (  0.0%)  Simplify the CFG
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0050 (  0.0%)  Dead Global Elimination
   0.0049 (  0.0%)   0.0000 (  0.0%)   0.0049 (  0.0%)   0.0049 (  0.0%)  Dead Global Elimination
   0.0034 (  0.0%)   0.0014 (  0.0%)   0.0048 (  0.0%)   0.0048 (  0.0%)  Lazy Value Information Analysis
   0.0033 (  0.0%)   0.0014 (  0.0%)   0.0047 (  0.0%)   0.0047 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0033 (  0.0%)   0.0011 (  0.0%)   0.0044 (  0.0%)   0.0047 (  0.0%)  Slot index numbering
   0.0033 (  0.0%)   0.0014 (  0.0%)   0.0047 (  0.0%)   0.0047 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0032 (  0.0%)   0.0014 (  0.0%)   0.0046 (  0.0%)   0.0046 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0033 (  0.0%)   0.0013 (  0.0%)   0.0046 (  0.0%)   0.0046 (  0.0%)  Loop-Closed SSA Form Pass
   0.0032 (  0.0%)   0.0014 (  0.0%)   0.0047 (  0.0%)   0.0046 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0032 (  0.0%)   0.0014 (  0.0%)   0.0046 (  0.0%)   0.0046 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0032 (  0.0%)   0.0014 (  0.0%)   0.0046 (  0.0%)   0.0046 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0033 (  0.0%)   0.0012 (  0.0%)   0.0045 (  0.0%)   0.0046 (  0.0%)  Machine Block Frequency Analysis
   0.0031 (  0.0%)   0.0014 (  0.0%)   0.0045 (  0.0%)   0.0045 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0031 (  0.0%)   0.0014 (  0.0%)   0.0046 (  0.0%)   0.0045 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0030 (  0.0%)   0.0011 (  0.0%)   0.0042 (  0.0%)   0.0045 (  0.0%)  Eliminate PHI nodes for register allocation
   0.0037 (  0.0%)   0.0006 (  0.0%)   0.0043 (  0.0%)   0.0045 (  0.0%)  Instrument function entry/exit with calls to e.g. mcount() (pre inlining)
   0.0030 (  0.0%)   0.0012 (  0.0%)   0.0042 (  0.0%)   0.0045 (  0.0%)  Machine Natural Loop Construction
   0.0037 (  0.0%)   0.0008 (  0.0%)   0.0045 (  0.0%)   0.0045 (  0.0%)  Loop Strength Reduction
   0.0031 (  0.0%)   0.0014 (  0.0%)   0.0045 (  0.0%)   0.0045 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0032 (  0.0%)   0.0012 (  0.0%)   0.0044 (  0.0%)   0.0044 (  0.0%)  MachineDominator Tree Construction
   0.0031 (  0.0%)   0.0011 (  0.0%)   0.0042 (  0.0%)   0.0044 (  0.0%)  MachinePostDominator Tree Construction
   0.0031 (  0.0%)   0.0013 (  0.0%)   0.0044 (  0.0%)   0.0044 (  0.0%)  Loop-Closed SSA Form Pass
   0.0028 (  0.0%)   0.0016 (  0.0%)   0.0044 (  0.0%)   0.0043 (  0.0%)  Branch Probability Analysis
   0.0037 (  0.0%)   0.0006 (  0.0%)   0.0043 (  0.0%)   0.0043 (  0.0%)  Lower 'expect' Intrinsics
   0.0030 (  0.0%)   0.0013 (  0.0%)   0.0043 (  0.0%)   0.0043 (  0.0%)  Lazy Value Information Analysis
   0.0028 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0043 (  0.0%)  Remove unreachable machine basic blocks
   0.0028 (  0.0%)   0.0015 (  0.0%)   0.0043 (  0.0%)   0.0043 (  0.0%)  Block Frequency Analysis
   0.0030 (  0.0%)   0.0013 (  0.0%)   0.0043 (  0.0%)   0.0043 (  0.0%)  Loop-Closed SSA Form Pass
   0.0029 (  0.0%)   0.0013 (  0.0%)   0.0042 (  0.0%)   0.0042 (  0.0%)  Lazy Branch Probability Analysis
   0.0029 (  0.0%)   0.0013 (  0.0%)   0.0042 (  0.0%)   0.0042 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0030 (  0.0%)   0.0010 (  0.0%)   0.0039 (  0.0%)   0.0042 (  0.0%)  Slot index numbering
   0.0035 (  0.0%)   0.0007 (  0.0%)   0.0041 (  0.0%)   0.0042 (  0.0%)  Globals Alias Analysis
   0.0029 (  0.0%)   0.0011 (  0.0%)   0.0039 (  0.0%)   0.0041 (  0.0%)  Machine Block Frequency Analysis
   0.0026 (  0.0%)   0.0014 (  0.0%)   0.0041 (  0.0%)   0.0041 (  0.0%)  Branch Probability Analysis
   0.0028 (  0.0%)   0.0013 (  0.0%)   0.0040 (  0.0%)   0.0040 (  0.0%)  Lazy Branch Probability Analysis
   0.0027 (  0.0%)   0.0013 (  0.0%)   0.0040 (  0.0%)   0.0040 (  0.0%)  Optimization Remark Emitter
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0040 (  0.0%)  Optimization Remark Emitter
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0040 (  0.0%)  Optimization Remark Emitter
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Lazy Block Frequency Analysis
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Lazy Branch Probability Analysis
   0.0025 (  0.0%)   0.0010 (  0.0%)   0.0035 (  0.0%)   0.0039 (  0.0%)  X86 cmov Conversion
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Dead Argument Elimination
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0040 (  0.0%)   0.0039 (  0.0%)  Lazy Branch Probability Analysis
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Lazy Branch Probability Analysis
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Lazy Branch Probability Analysis
   0.0026 (  0.0%)   0.0012 (  0.0%)   0.0038 (  0.0%)   0.0039 (  0.0%)  Optimization Remark Emitter
   0.0027 (  0.0%)   0.0012 (  0.0%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Optimization Remark Emitter
   0.0024 (  0.0%)   0.0015 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Optimization Remark Emitter
   0.0026 (  0.0%)   0.0012 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Optimization Remark Emitter
   0.0026 (  0.0%)   0.0012 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Lazy Branch Probability Analysis
   0.0026 (  0.0%)   0.0012 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Optimization Remark Emitter
   0.0026 (  0.0%)   0.0012 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Lazy Block Frequency Analysis
   0.0032 (  0.0%)   0.0006 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Loop Invariant Code Motion
   0.0026 (  0.0%)   0.0012 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Lazy Block Frequency Analysis
   0.0026 (  0.0%)   0.0011 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Lazy Block Frequency Analysis
   0.0030 (  0.0%)   0.0008 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Scoped NoAlias Alias Analysis
   0.0025 (  0.0%)   0.0012 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Type-Based Alias Analysis
   0.0024 (  0.0%)   0.0014 (  0.0%)   0.0038 (  0.0%)   0.0037 (  0.0%)  Lazy Block Frequency Analysis
   0.0025 (  0.0%)   0.0011 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Lazy Block Frequency Analysis
   0.0025 (  0.0%)   0.0012 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Lazy Block Frequency Analysis
   0.0025 (  0.0%)   0.0010 (  0.0%)   0.0035 (  0.0%)   0.0037 (  0.0%)  Machine Natural Loop Construction
   0.0025 (  0.0%)   0.0011 (  0.0%)   0.0036 (  0.0%)   0.0036 (  0.0%)  LCSSA Verifier
   0.0024 (  0.0%)   0.0011 (  0.0%)   0.0035 (  0.0%)   0.0036 (  0.0%)  LCSSA Verifier
   0.0027 (  0.0%)   0.0009 (  0.0%)   0.0036 (  0.0%)   0.0036 (  0.0%)  Machine Loop Invariant Code Motion
   0.0025 (  0.0%)   0.0011 (  0.0%)   0.0036 (  0.0%)   0.0036 (  0.0%)  Lazy Block Frequency Analysis
   0.0025 (  0.0%)   0.0010 (  0.0%)   0.0035 (  0.0%)   0.0036 (  0.0%)  Live Register Matrix
   0.0024 (  0.0%)   0.0012 (  0.0%)   0.0036 (  0.0%)   0.0036 (  0.0%)  Scalar Evolution Analysis
   0.0025 (  0.0%)   0.0011 (  0.0%)   0.0036 (  0.0%)   0.0036 (  0.0%)  LCSSA Verifier
   0.0035 (  0.0%)   0.0000 (  0.0%)   0.0035 (  0.0%)   0.0035 (  0.0%)  CallGraph Construction
   0.0021 (  0.0%)   0.0014 (  0.0%)   0.0035 (  0.0%)   0.0035 (  0.0%)  Block Frequency Analysis
   0.0024 (  0.0%)   0.0009 (  0.0%)   0.0033 (  0.0%)   0.0035 (  0.0%)  MachineDominator Tree Construction
   0.0024 (  0.0%)   0.0009 (  0.0%)   0.0033 (  0.0%)   0.0035 (  0.0%)  X86 Optimize Call Frame
   0.0023 (  0.0%)   0.0011 (  0.0%)   0.0034 (  0.0%)   0.0035 (  0.0%)  Assumption Cache Tracker
   0.0023 (  0.0%)   0.0010 (  0.0%)   0.0033 (  0.0%)   0.0034 (  0.0%)  Dominator Tree Construction
   0.0021 (  0.0%)   0.0013 (  0.0%)   0.0034 (  0.0%)   0.0034 (  0.0%)  Dominator Tree Construction
   0.0021 (  0.0%)   0.0012 (  0.0%)   0.0033 (  0.0%)   0.0033 (  0.0%)  Dominator Tree Construction
   0.0020 (  0.0%)   0.0014 (  0.0%)   0.0034 (  0.0%)   0.0033 (  0.0%)  Scalar Evolution Analysis
   0.0024 (  0.0%)   0.0007 (  0.0%)   0.0032 (  0.0%)   0.0033 (  0.0%)  Stack Slot Coloring
   0.0024 (  0.0%)   0.0009 (  0.0%)   0.0032 (  0.0%)   0.0033 (  0.0%)  MachineDominator Tree Construction
   0.0022 (  0.0%)   0.0008 (  0.0%)   0.0030 (  0.0%)   0.0033 (  0.0%)  MachineDominator Tree Construction
   0.0022 (  0.0%)   0.0011 (  0.0%)   0.0033 (  0.0%)   0.0033 (  0.0%)  Dominator Tree Construction
   0.0021 (  0.0%)   0.0011 (  0.0%)   0.0033 (  0.0%)   0.0032 (  0.0%)  Scalar Evolution Analysis
   0.0023 (  0.0%)   0.0009 (  0.0%)   0.0031 (  0.0%)   0.0032 (  0.0%)  Post RA top-down list latency scheduler
   0.0022 (  0.0%)   0.0008 (  0.0%)   0.0030 (  0.0%)   0.0032 (  0.0%)  MachineDominator Tree Construction
   0.0023 (  0.0%)   0.0010 (  0.0%)   0.0032 (  0.0%)   0.0032 (  0.0%)  Unroll loops
   0.0021 (  0.0%)   0.0009 (  0.0%)   0.0030 (  0.0%)   0.0032 (  0.0%)  Machine Trace Metrics
   0.0020 (  0.0%)   0.0011 (  0.0%)   0.0031 (  0.0%)   0.0031 (  0.0%)  Natural Loop Information
   0.0020 (  0.0%)   0.0011 (  0.0%)   0.0031 (  0.0%)   0.0031 (  0.0%)  Scalar Evolution Analysis
   0.0020 (  0.0%)   0.0010 (  0.0%)   0.0031 (  0.0%)   0.0031 (  0.0%)  Natural Loop Information
   0.0022 (  0.0%)   0.0009 (  0.0%)   0.0031 (  0.0%)   0.0031 (  0.0%)  Machine Natural Loop Construction
   0.0018 (  0.0%)   0.0013 (  0.0%)   0.0031 (  0.0%)   0.0031 (  0.0%)  Function Alias Analysis Results
   0.0020 (  0.0%)   0.0011 (  0.0%)   0.0031 (  0.0%)   0.0030 (  0.0%)  Scalar Evolution Analysis
   0.0021 (  0.0%)   0.0008 (  0.0%)   0.0029 (  0.0%)   0.0030 (  0.0%)  MachineDominator Tree Construction
   0.0021 (  0.0%)   0.0007 (  0.0%)   0.0029 (  0.0%)   0.0030 (  0.0%)  Expand ISel Pseudo-instructions
   0.0020 (  0.0%)   0.0010 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Function Alias Analysis Results
   0.0018 (  0.0%)   0.0012 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Block Frequency Analysis
   0.0019 (  0.0%)   0.0011 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Scalar Evolution Analysis
   0.0020 (  0.0%)   0.0009 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Natural Loop Information
   0.0021 (  0.0%)   0.0009 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Expand reduction intrinsics
   0.0026 (  0.0%)   0.0004 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  CallGraph Construction
   0.0020 (  0.0%)   0.0008 (  0.0%)   0.0028 (  0.0%)   0.0029 (  0.0%)  MachineDominator Tree Construction
   0.0019 (  0.0%)   0.0011 (  0.0%)   0.0029 (  0.0%)   0.0029 (  0.0%)  Hoist/decompose integer division and remainder
   0.0020 (  0.0%)   0.0008 (  0.0%)   0.0028 (  0.0%)   0.0029 (  0.0%)  Machine Loop Invariant Code Motion
   0.0019 (  0.0%)   0.0010 (  0.0%)   0.0029 (  0.0%)   0.0029 (  0.0%)  Scalar Evolution Analysis
   0.0018 (  0.0%)   0.0012 (  0.0%)   0.0029 (  0.0%)   0.0029 (  0.0%)  Natural Loop Information
   0.0019 (  0.0%)   0.0010 (  0.0%)   0.0029 (  0.0%)   0.0029 (  0.0%)  Function Alias Analysis Results
   0.0018 (  0.0%)   0.0010 (  0.0%)   0.0029 (  0.0%)   0.0028 (  0.0%)  Function Alias Analysis Results
   0.0020 (  0.0%)   0.0008 (  0.0%)   0.0028 (  0.0%)   0.0028 (  0.0%)  MachineDominator Tree Construction
   0.0019 (  0.0%)   0.0008 (  0.0%)   0.0027 (  0.0%)   0.0028 (  0.0%)  Tail Duplication
   0.0019 (  0.0%)   0.0009 (  0.0%)   0.0028 (  0.0%)   0.0028 (  0.0%)  Dominator Tree Construction
   0.0018 (  0.0%)   0.0010 (  0.0%)   0.0028 (  0.0%)   0.0027 (  0.0%)  Function Alias Analysis Results
   0.0019 (  0.0%)   0.0007 (  0.0%)   0.0026 (  0.0%)   0.0027 (  0.0%)  Machine Natural Loop Construction
   0.0017 (  0.0%)   0.0007 (  0.0%)   0.0025 (  0.0%)   0.0027 (  0.0%)  Insert fentry calls
   0.0017 (  0.0%)   0.0009 (  0.0%)   0.0026 (  0.0%)   0.0026 (  0.0%)  Function Alias Analysis Results
   0.0018 (  0.0%)   0.0008 (  0.0%)   0.0026 (  0.0%)   0.0026 (  0.0%)  Dominator Tree Construction
   0.0018 (  0.0%)   0.0008 (  0.0%)   0.0026 (  0.0%)   0.0026 (  0.0%)  Machine Natural Loop Construction
   0.0019 (  0.0%)   0.0007 (  0.0%)   0.0025 (  0.0%)   0.0026 (  0.0%)  Machine Natural Loop Construction
   0.0017 (  0.0%)   0.0008 (  0.0%)   0.0026 (  0.0%)   0.0025 (  0.0%)  Natural Loop Information
   0.0025 (  0.0%)   0.0000 (  0.0%)   0.0025 (  0.0%)   0.0025 (  0.0%)  CallGraph Construction
   0.0014 (  0.0%)   0.0011 (  0.0%)   0.0025 (  0.0%)   0.0025 (  0.0%)  Loop Load Elimination
   0.0017 (  0.0%)   0.0008 (  0.0%)   0.0025 (  0.0%)   0.0025 (  0.0%)  Interleaved Access Pass
   0.0016 (  0.0%)   0.0007 (  0.0%)   0.0023 (  0.0%)   0.0025 (  0.0%)  Local Dynamic TLS Access Clean-up
   0.0017 (  0.0%)   0.0007 (  0.0%)   0.0024 (  0.0%)   0.0025 (  0.0%)  X86 Fixup SetCC
   0.0017 (  0.0%)   0.0007 (  0.0%)   0.0024 (  0.0%)   0.0024 (  0.0%)  Machine Natural Loop Construction
   0.0014 (  0.0%)   0.0010 (  0.0%)   0.0024 (  0.0%)   0.0024 (  0.0%)  Canonicalize natural loops
   0.0015 (  0.0%)   0.0006 (  0.0%)   0.0022 (  0.0%)   0.0023 (  0.0%)  Virtual Register Map
   0.0017 (  0.0%)   0.0006 (  0.0%)   0.0023 (  0.0%)   0.0023 (  0.0%)  Process Implicit Definitions
   0.0015 (  0.0%)   0.0009 (  0.0%)   0.0023 (  0.0%)   0.0023 (  0.0%)  Canonicalize natural loops
   0.0015 (  0.0%)   0.0007 (  0.0%)   0.0022 (  0.0%)   0.0023 (  0.0%)  Early If-Conversion
   0.0015 (  0.0%)   0.0008 (  0.0%)   0.0023 (  0.0%)   0.0023 (  0.0%)  Canonicalize natural loops
   0.0016 (  0.0%)   0.0006 (  0.0%)   0.0022 (  0.0%)   0.0022 (  0.0%)  Live Stack Slot Analysis
   0.0012 (  0.0%)   0.0010 (  0.0%)   0.0022 (  0.0%)   0.0022 (  0.0%)  Block Frequency Analysis
   0.0015 (  0.0%)   0.0006 (  0.0%)   0.0021 (  0.0%)   0.0022 (  0.0%)  Dominator Tree Construction
   0.0014 (  0.0%)   0.0006 (  0.0%)   0.0019 (  0.0%)   0.0022 (  0.0%)  Implement the 'patchable-function' attribute
   0.0014 (  0.0%)   0.0008 (  0.0%)   0.0022 (  0.0%)   0.0022 (  0.0%)  Canonicalize natural loops
   0.0015 (  0.0%)   0.0006 (  0.0%)   0.0021 (  0.0%)   0.0022 (  0.0%)  X86 Retpoline Thunks
   0.0015 (  0.0%)   0.0007 (  0.0%)   0.0022 (  0.0%)   0.0022 (  0.0%)  Canonicalize natural loops
   0.0015 (  0.0%)   0.0006 (  0.0%)   0.0021 (  0.0%)   0.0021 (  0.0%)  Local Stack Slot Allocation
   0.0015 (  0.0%)   0.0006 (  0.0%)   0.0021 (  0.0%)   0.0021 (  0.0%)  Insert XRay ops
   0.0014 (  0.0%)   0.0007 (  0.0%)   0.0021 (  0.0%)   0.0021 (  0.0%)  Remove unreachable blocks from the CFG
   0.0013 (  0.0%)   0.0008 (  0.0%)   0.0021 (  0.0%)   0.0021 (  0.0%)  Canonicalize natural loops
   0.0014 (  0.0%)   0.0006 (  0.0%)   0.0020 (  0.0%)   0.0021 (  0.0%)  Target Library Information
   0.0013 (  0.0%)   0.0006 (  0.0%)   0.0019 (  0.0%)   0.0021 (  0.0%)  Compressing EVEX instrs to VEX encoding when possible
   0.0013 (  0.0%)   0.0006 (  0.0%)   0.0019 (  0.0%)   0.0021 (  0.0%)  Lazy Machine Block Frequency Analysis
   0.0013 (  0.0%)   0.0007 (  0.0%)   0.0020 (  0.0%)   0.0020 (  0.0%)  Loop Distribution
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0019 (  0.0%)   0.0020 (  0.0%)  X86 Atom pad short functions
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0019 (  0.0%)   0.0020 (  0.0%)  Contiguously Lay Out Funclets
   0.0014 (  0.0%)   0.0006 (  0.0%)   0.0020 (  0.0%)   0.0020 (  0.0%)  Machine Optimization Remark Emitter
   0.0013 (  0.0%)   0.0006 (  0.0%)   0.0019 (  0.0%)   0.0020 (  0.0%)  Rename Disconnected Subregister Components
   0.0013 (  0.0%)   0.0007 (  0.0%)   0.0020 (  0.0%)   0.0020 (  0.0%)  Loop-Closed SSA Form Pass
   0.0012 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0019 (  0.0%)  Safe Stack instrumentation pass
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0018 (  0.0%)   0.0019 (  0.0%)  X86 WinAlloca Expander
   0.0013 (  0.0%)   0.0006 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  X86 PIC Global Base Reg Initialization
   0.0012 (  0.0%)   0.0008 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Loop-Closed SSA Form Pass
   0.0012 (  0.0%)   0.0008 (  0.0%)   0.0020 (  0.0%)   0.0019 (  0.0%)  Loop-Closed SSA Form Pass
   0.0013 (  0.0%)   0.0006 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Loop-Closed SSA Form Pass
   0.0012 (  0.0%)   0.0007 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Alignment from assumptions
   0.0013 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0019 (  0.0%)  Machine Optimization Remark Emitter
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  X86 vzeroupper inserter
   0.0011 (  0.0%)   0.0008 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Lazy Branch Probability Analysis
   0.0011 (  0.0%)   0.0008 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0012 (  0.0%)   0.0007 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0012 (  0.0%)   0.0007 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0012 (  0.0%)   0.0007 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0019 (  0.0%)   0.0019 (  0.0%)  X86 FP Stackifier
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0018 (  0.0%)   0.0019 (  0.0%)  Detect Dead Lanes
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0019 (  0.0%)   0.0018 (  0.0%)  Lazy Machine Block Frequency Analysis
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Lazy Machine Block Frequency Analysis
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  StackMap Liveness Analysis
   0.0012 (  0.0%)   0.0007 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0012 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Instrument function entry/exit with calls to e.g. mcount() (post inlining)
   0.0012 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Lazy Branch Probability Analysis
   0.0013 (  0.0%)   0.0005 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Machine Optimization Remark Emitter
   0.0010 (  0.0%)   0.0008 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Optimization Remark Emitter
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Lazy Branch Probability Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Lazy Branch Probability Analysis
   0.0012 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Lazy Branch Probability Analysis
   0.0011 (  0.0%)   0.0007 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Lazy Branch Probability Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Optimization Remark Emitter
   0.0012 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Loop Access Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0018 (  0.0%)  Optimization Remark Emitter
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0018 (  0.0%)   0.0017 (  0.0%)  Optimization Remark Emitter
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Optimization Remark Emitter
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lazy Block Frequency Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Optimization Remark Emitter
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Optimization Remark Emitter
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Loop Access Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lazy Block Frequency Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lazy Block Frequency Analysis
   0.0010 (  0.0%)   0.0007 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lazy Block Frequency Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Loop Access Analysis
   0.0009 (  0.0%)   0.0008 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Natural Loop Information
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lazy Block Frequency Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lazy Block Frequency Analysis
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  LCSSA Verifier
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  Lower Garbage Collection Instructions
   0.0010 (  0.0%)   0.0007 (  0.0%)   0.0017 (  0.0%)   0.0017 (  0.0%)  LCSSA Verifier
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0016 (  0.0%)   0.0017 (  0.0%)  Shadow Stack GC Lowering
   0.0011 (  0.0%)   0.0006 (  0.0%)   0.0016 (  0.0%)   0.0016 (  0.0%)  LCSSA Verifier
   0.0010 (  0.0%)   0.0006 (  0.0%)   0.0016 (  0.0%)   0.0016 (  0.0%)  LCSSA Verifier
   0.0013 (  0.0%)   0.0002 (  0.0%)   0.0015 (  0.0%)   0.0015 (  0.0%)  Unswitch loops
   0.0009 (  0.0%)   0.0000 (  0.0%)   0.0009 (  0.0%)   0.0009 (  0.0%)  Globals Alias Analysis
   0.0005 (  0.0%)   0.0001 (  0.0%)   0.0006 (  0.0%)   0.0006 (  0.0%)  Recognize loop idioms
   0.0005 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)   0.0005 (  0.0%)  Merge Duplicate Global Constants
   0.0003 (  0.0%)   0.0001 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  Delete dead loops
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Deduce function attributes in RPO
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Strip Unused Function Prototypes
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Assumption Cache Tracker
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Eliminate Available Externally Globals
   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Rotate Loops
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Loop Sink
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Infer set function attributes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Pre-ISel Intrinsic Lowering
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Force set function attributes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scoped NoAlias Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  A No-Op Barrier Pass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Profile summary info
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Pre-ISel Intrinsic Lowering
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Assumption Cache Tracker
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Transform Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Transform Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Pass Configuration
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Create Garbage Collector Module Metadata
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Type-Based Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Branch Probability Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Transform Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Library Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Pass Configuration
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Branch Probability Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Module Information
  277.6899 (100.0%)   4.9092 (100.0%)  282.5991 (100.0%)  283.0060 (100.0%)  Total

  time: 0.000; rss: 156MB	serialize work products
  time: 0.006; rss: 157MB	linking

This might be a side effect of the LLVM 6 upgrade.
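Many pass names appear several times in the table above, so the per-row numbers can hide which pass dominates overall. A minimal sketch of how one might total the wall time per pass name follows. It assumes the four numeric columns are user, system, user+system, and wall time, in that order; `parse_row` and `wall_time_by_pass` are hypothetical helper names and not part of rustc or LLVM.

```rust
use std::collections::HashMap;

// A few rows copied from the table above, plus the "Total" row.
const SAMPLE: &str = "\
   0.0020 (  0.0%)   0.0011 (  0.0%)   0.0031 (  0.0%)   0.0031 (  0.0%)  Scalar Evolution Analysis
   0.0026 (  0.0%)   0.0004 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  CallGraph Construction
   0.0020 (  0.0%)   0.0011 (  0.0%)   0.0031 (  0.0%)   0.0030 (  0.0%)  Scalar Evolution Analysis
  277.6899 (100.0%)   4.9092 (100.0%)  282.5991 (100.0%)  283.0060 (100.0%)  Total
";

/// Percentage cells split into "(" and "0.0%)", or appear whole as "(100.0%)".
fn is_percent_cell(tok: &str) -> bool {
    tok == "(" || tok.ends_with("%)")
}

/// Parse one table row into (wall_seconds, pass_name).
/// Returns None for lines that are not table rows.
/// Assumption: the fourth numeric column is wall time.
fn parse_row(line: &str) -> Option<(f64, String)> {
    let mut tokens = line.split_whitespace();
    let mut times = Vec::with_capacity(4);
    while times.len() < 4 {
        let tok = tokens.next()?;
        if is_percent_cell(tok) {
            continue;
        }
        times.push(tok.parse::<f64>().ok()?);
    }
    // Skip the percentage cell after the wall time; the rest is the pass name.
    let name: Vec<&str> = tokens.skip_while(|t| is_percent_cell(t)).collect();
    if name.is_empty() {
        return None;
    }
    Some((times[3], name.join(" ")))
}

/// Sum wall time per pass name, sorted from slowest to fastest.
fn wall_time_by_pass(log: &str) -> Vec<(String, f64)> {
    let mut totals: HashMap<String, f64> = HashMap::new();
    for (wall, name) in log.lines().filter_map(parse_row) {
        if name == "Total" {
            continue; // the summary row is not a pass
        }
        *totals.entry(name).or_insert(0.0) += wall;
    }
    let mut sorted: Vec<(String, f64)> = totals.into_iter().collect();
    sorted.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    sorted
}

fn main() {
    for (name, secs) in wall_time_by_pass(SAMPLE) {
        println!("{:>10.4}  {}", secs, name);
    }
}
```

Feeding the full log through `wall_time_by_pass` instead of `SAMPLE` would give one line per distinct pass, which makes a single runaway pass easier to spot.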

@matthiaskrgr
Member Author

matthiaskrgr commented Feb 15, 2018

Building the slow config (optimizations + debuginfo) with beta takes only 16.5 seconds, by the way, so this is a regression from beta (EDIT: stable) to nightly.

$ rustc +beta -Vv
rustc 1.24.0-beta.12 (ed2c0f084 2018-02-12)
binary: rustc
commit-hash: ed2c0f08442915c628fc855e6a784c5979a4dc83
commit-date: 2018-02-12
host: x86_64-unknown-linux-gnu
release: 1.24.0-beta.12
LLVM version: 4.0

@gsollazzo
Member

Hi! Can you reproduce the issue using stable rustc?

@matthiaskrgr
Member Author

Stable looks fine as well

$ cargo +stable build --release
   Compiling pkg-config v0.3.9
   Compiling libc v0.2.36
   Compiling lazy_static v1.0.0
   Compiling x11-dl v2.17.2
   Compiling bla v0.1.0 (file:///tmp/bla)
    Finished release [optimized + debuginfo] target(s) in 14.22 secs

@gsollazzo gsollazzo added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 15, 2018
@gsollazzo
Member

Thanks, as there is no existing label to tag beta-to-nightly regressions, I've assigned "regression-from-stable-to-nightly".

@Manishearth Shall we add a "regression_from_beta_to_nightly" tag?

@matthiaskrgr
Member Author

Maybe it would make more sense to have 1.23-to-1.24 / 1.24-to-1.25 etc. tags? Otherwise things might get confusing when nightly becomes beta and beta becomes stable. (Not sure how this is handled currently.)

@Mark-Simulacrum
Member

Currently it's not really handled, but I do have some plans as to how to best handle it. Expect to hear more soon.

@michaelwoerister
Member

cc @alexcrichton

@pnkfelix
Member

@gsollazzo I don't understand why we would need a regression_from_beta_to_nightly, at least in this case ... if the problem arose in nightly but is not seen in stable (or in beta), then it is indeed a regression from stable to nightly, no?

(in other words, I think the tag you added is fine?)

@nikomatsakis
Contributor

triage: P-high

We need to narrow this down. Can somebody try bisecting through nightly releases or PRs? That would be very helpful.

It is possible that the difference is due to the fact that -- for some time -- we were using ThinLTO as the default, i.e., maybe the problem is that lto=true is now doing full LTO? I can't remember the exact chronology there.
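For reference, the bisection asked for here amounts to a binary search for the first nightly that shows the slow build. A minimal sketch under stated assumptions: `first_bad` is a hypothetical helper, the date list is made up, and the cutoff inside the closure is an arbitrary placeholder rather than a measured result. A real check would build the reproducer with each nightly (for example `cargo +nightly-<date> build --release`) and compare the wall time against a threshold.

```rust
/// Return the index of the first entry for which `is_bad` is true.
/// Assumes the list is ordered oldest to newest and that once a nightly
/// is bad, every later nightly is bad too. Returns None if none is bad.
fn first_bad<T>(nightlies: &[T], mut is_bad: impl FnMut(&T) -> bool) -> Option<usize> {
    let (mut lo, mut hi) = (0usize, nightlies.len());
    // Invariant: everything before `lo` is good, everything from `hi` on is bad.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if is_bad(&nightlies[mid]) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    if lo < nightlies.len() {
        Some(lo)
    } else {
        None
    }
}

fn main() {
    // Hypothetical list of nightly dates to search, oldest first.
    let nightlies = [
        "2018-02-05",
        "2018-02-07",
        "2018-02-09",
        "2018-02-10",
        "2018-02-12",
        "2018-02-14",
    ];

    // Placeholder check: pretends every nightly from this made-up cutoff on is slow.
    // Replace with a real timed build of the reproducer.
    let placeholder_cutoff = "2018-02-10";

    match first_bad(&nightlies, |date| *date >= placeholder_cutoff) {
        Some(i) => println!("first slow nightly: {}", nightlies[i]),
        None => println!("no slow nightly in range"),
    }
}
```

Each probe is a full release build, so the point of the binary search is to keep the number of builds at roughly log2 of the number of candidate nightlies.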

@rust-highfive rust-highfive added the P-high High priority label Feb 15, 2018
@matthiaskrgr
Member Author

@hanna-kruppe
Contributor

Out of those, #47828 (LLVM 5 -> LLVM 6) seems by far the most likely suspect.

@nikomatsakis
Contributor

Sounds quite likely.

@gsollazzo gsollazzo reopened this Feb 15, 2018
@gsollazzo
Member

@pnkfelix You're right, but in this case the regression was from beta 1.24 to nightly 1.25, therefore the usual release cycle is inverted.

(btw, sorry for the closing-and-reopening of the issue, I pressed the wrong button...)

@alexcrichton
Member

I've bisected this to https://bugs.llvm.org/show_bug.cgi?id=36417

@Lakier15

I am not sure whether this is related, but it sounds like it.

I am working on embedded Rust and am currently on nightly.
I noticed that at some point my debug symbols stopped tracking the binary code correctly (ELF, DWARF 4).

I tracked it down to:

  • Working debug symbols: 3bcda48 2018-02-09 Nightly
  • Broken debug symbols: 45fba43 2018-02-10 Nightly

Is this particular issue the cause of the problems?

P.S. Please be aware that I am also using Xargo, and GCC to link the binary, but as neither of those changed recently, I think the problem might be related to this conversation.

@hanna-kruppe
Contributor

hanna-kruppe commented Feb 19, 2018

@Lakier15 I don't see any reason to suspect this is the same bug, even though from the dates it seems likely that the same PR (LLVM upgrade) introduced the problem. Please file a separate issue with steps to reproduce.

@Lakier15

@rkruppe I will try to produce a minimal reproducible example and file it as a separate issue.
Thanks

@nikomatsakis
Contributor

Update from @rust-lang/compiler meeting: decided to leave this as P-high until we get some more information from the LLVM side.

@pnkfelix
Member

pnkfelix commented Mar 1, 2018

triage: the LLVM ticket that @alexcrichton linked (https://bugs.llvm.org/show_bug.cgi?id=36417) has some ideas on ways to address this, but we still need someone to try those ideas out (or come up with new ones).

  • @michaelwoerister points out that the suggestions there are unlikely to be ones that we could readily implement via e.g. an LLVM IR transformation in our own pipeline ... the problematic pieces of IR here are themselves introduced by LLVM optimizations, i.e. after we have handed off the IR to LLVM.

@matthiaskrgr
Member Author

A patch has been proposed: https://reviews.llvm.org/D43956

@matthiaskrgr
Member Author

Update: the patch has landed: llvm-mirror/llvm@783006e 🎉

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 6, 2018
This pulls in the rest of LLVM's `release_60` branch (the actual 6.0.0 release)
and also pulls in a cherry-pick to...

Closes rust-lang#48226
@alexcrichton
Member

I've confirmed the LLVM patch fixes the issue here and have opened #48782 and beta-nominated it

@michaelwoerister
Member

🎉

alexcrichton added a commit to alexcrichton/rust that referenced this issue Mar 24, 2018
This pulls in the rest of LLVM's `release_60` branch (the actual 6.0.0 release)
and also pulls in a cherry-pick to...

Closes rust-lang#48226