Join forces #2

michalfita · 2019-12-09T10:22:56Z

Please join forces with https://github.com/sapir/gcc-rust and don't try to do that by yourself. We need one good Rust frontend for GCC not multiple projects that both fail.

Thank you.

SimplyTheOther · 2019-12-09T11:24:16Z

> Please join forces with https://github.com/sapir/gcc-rust and don't try to do that by yourself. We need one good Rust frontend for GCC not multiple projects that both fail.
>
> Thank you.

The projects seem to have completely different approaches to implementing a Rust front end. gcc-rust looks like it’s a wrapper for rustc itself to be used in GCC (and is written in Rust itself and so suffers the same bootstrapping problems as the current rustc, as well as the “trusting trust” issue) while gccrs is a full alternative implementation written in C++. I don’t think that there would be much overlap between the actual code required with each project as they currently stand.

michalfita · 2019-12-09T11:54:46Z

What about joining forces with https://github.com/thepowersgang/mrustc in some way?

(I'd help myself, but I'm not a compiler type of guy, too much other work to do - however grustc would be a huge benefit in convincing employers to use Rust 😄)

philberty · 2019-12-09T14:11:19Z

Hi, thank you for pointing this out.

I dropped working on this project for a few years for 2 reasons:

1. How on earth do I approach this, and why?
2. Personal life and my job were too busy to do anything.

It has taken a long time to decide what direction I wanted to go, given the size of the task.

Option 1 - Do what @sapir is doing by reusing MIR from Rust

     - Pros:
       a. No need to implement Lexer or Parser
       b. No need to implement analysis
       c. Get the same diagnostics
       d. Unifies the implementation to simply be a new backend

     - Cons:
       a. Relies on Rust
       b. Possible license issues

Option 2 - Use Rust GCC bindings to reuse the GCC JIT interface on top of the Rust compiler

     - Pros:
       a. No need to implement Lexer or Parser
       b. No need to implement analysis
       c. Get the same diagnostics
       d. Unifies the implementation to simply be a new backend
       e. Opens up the possibility of doing full JIT on the fly

     - Cons:
       a. Lacks clarity of direction

Option 3 - Full GCC C++ implementation

     - Pros:
       a. Clear vision to have an independent C++ implementation
       b. Can follow the gccgo pattern of keeping the front-end code independent, so it can be retargeted to GCC or LLVM or Mono or JVM
       c. No license issues, as it is a full reimplementation

     - Cons:
       a. Huge task

There is no clear winner in terms of the pragmatic goal of implementing Rust on GCC, as each of them will do it. In a contest of fastest and most reliable to the post it will be Option 1, and hats off to @sapir: this looks like it's shaping up very well and I wish him all the success.

Option 3 appeals to me as there is a vision of what could be possible with this front-end code, which excites me. Given the size of the task, I have taken the last few years to decide this is what my open-source focus will be for at least the next 2 years. I hope I can stick to this goal to see where I get to with it, as I feel it's very much a worthwhile goal for Rust.

NalaGinrut · 2019-12-10T18:24:07Z

@redbrain It's so sad to hear this project was stopped. What if I submit some patches in my casual time? Do you think it's fine to maintain it and accept the patches, or do you prefer another fork?

ArneBab · 2020-04-03T08:20:06Z

Is there a place where you provide updates (i.e. small news snippets)?

(I’m happy to hear that you’re working on this again!)

NalaGinrut · 2020-04-04T05:06:40Z

I've done a little research on GCC frontends and the current upstream Rust implementation. I'm interested in making some progress on Rust for GCC, but I have to finish another compiler project of mine first, so it may be a while before I have something to show. Anyway, I hope we can use Rust seriously with GCC.
BTW, I prefer option 3.

philberty · 2020-04-07T17:02:06Z

> Is there a place where you provide updates (i.e. small news snippets)?
>
> (I’m happy to hear that you’re working on this again!)

Thanks for this. I have several patches in progress to get this to a working state where people can jump in. As for news, I have just created a blog over at:

https://thephilbert.io/

Nothing there yet, but I will be putting up some stuff next week on the status and my plan.

ArneBab · 2020-04-07T21:04:44Z

You’re awesome — thank you!

SimplyTheOther · 2020-04-12T08:11:22Z

I've written a fully-operational and working lexer, parser, and AST for a potential Rust GCC frontend that I can provide as patches. The code has been targeted against stable Rust as of December 2019 and is hand-coded recursive-descent (as opposed to generated). It doesn't yet complete macro expansion, name resolution, type checking, conversion to GCC IR, etc., but should be useful as a starting point if you haven't already written the patches. It's at https://github.com/SimplyTheOther/gccrs/tree/master/gcc/rust/test3 if anyone wants to look at it.

philberty · 2020-04-12T13:10:40Z

@SimplyTheOther Thanks for this, the work looks really good here. I reckon it would be a fairly hard project to get that merged into my stuff, so how about I do it? I will rebase from your work, using just the test3 folder as the front-end, and merge in my work so you still have the git history of your work.

philberty · 2020-04-12T16:04:42Z

@SimplyTheOther I have completed a rebase with your work into this branch https://github.com/philberty/gccrs/tree/simply-philbert-rebase/gcc/rust

If you're happy with me using this as a base, I would like to push to master and continue my work on top of this. We could also change this into an organisation so you have ownership too.

NalaGinrut · 2020-04-13T07:55:05Z

@SimplyTheOther Thanks for the work! I've taken a look at your code. Now that you've finished the time-consuming parser part, I'd like to continue the work based on your existing code. And if you create an organization, I think it's better to choose a branch-based model, say, lock the master branch, and allow members to create branches, so that the contributions can be reviewed in PR.

SimplyTheOther · 2020-04-13T08:27:49Z

@philberty Using this as a base would be absolutely great. I'd be happy to contribute further too or answer any questions about design and implementation choices if you have any.

@NalaGinrut No problem. However, I'm not so sure that the parser is the time-consuming part - I imagine the type checking (since rust heavily uses type inference) and name resolution, as well as more advanced stuff like borrow checking if planned to be implemented, would take up quite a lot of time too.

NalaGinrut · 2020-04-13T08:48:58Z

@SimplyTheOther Agreed, type checking would be the hard work. I think you've roughly finished the architecture, with some functions still to be filled in. But I need time to understand the ideas in your code. To organize a community efficiently, maybe we need somewhere to hold discussions. IRC or Slack are both fine for me.

I don't think gcc-rust is a quick job, and I can only work on it in my spare time. But I hope it can continue.

philberty · 2020-04-13T21:56:34Z

I have done some more code cleanup there today. I have 1 more commit to push in, and then I will enforce PRs only to master, including on myself, to keep myself honest. I also set up a GitHub workflow to check build status etc. The next part is to share my plan somehow, either via the blog or on this GitHub project.

philberty · 2020-07-11T19:20:14Z

Posted my first hello world blog post: https://thephilbert.io/?p=11

Are people interested in YouTube-style content similar to https://github.com/SerenityOS/serenity https://www.youtube.com/channel/UC3ts8coMP645hZw9JSD3pqQ

glaubitz · 2020-07-13T09:02:13Z

@philberty It may not be much, but I have created a Bountysource campaign to support your effort. It has not gained that much traction yet, but it might in the future.

sapir · 2020-07-13T13:26:48Z

@glaubitz I think this one also applies: https://www.bountysource.com/issues/86138921-rfe-add-a-frontend-for-the-rust-programming-language

glaubitz · 2020-07-13T13:51:23Z

That's the one I created :-).

Missed a few lines in the last attempt. Whoops.

Handle constant 0 passed to the QMATH DImode add/sub handler such as with: #2 0x0000000011d409b0 in gen_adddi3 (operand0=0x7ffff5c0a128, operand1=0x7ffff5c60480, operand2=0x7ffff5c60470) at .../gcc/config/vax/vax.md:755 755 "vax_expand_addsub_di_operands (operands, PLUS); DONE;") (gdb) pr operand0 (reg:DI 31) (gdb) pr operand1 (const_int 0 [0]) (gdb) pr operand2 (const_int -1 [0xffffffffffffffff]) (gdb) causing an assertion in `vax_expand_addsub_di_operands': gcc_assert (operands[1] != const0_rtx || code == MINUS); to trigger: during RTL pass: expand .../gcc/testsuite/gcc.c-torture/compile/sync-1.c: In function 'test_op_ignore': .../gcc/testsuite/gcc.c-torture/compile/sync-1.c:33:10: internal compiler error: in vax_expand_addsub_di_operands, at config/vax/vax.c:2080 0x11815003 vax_expand_addsub_di_operands(rtx_def**, rtx_code) .../gcc/config/vax/vax.c:2080 0x11d409af gen_adddi3(rtx_def*, rtx_def*, rtx_def*) .../gcc/config/vax/vax.md:755 0x10ea2763 rtx_insn* insn_gen_fn::operator()<rtx_def*, rtx_def*, rtx_def*>(rtx_def*, rtx_def*, rtx_def*) const .../gcc/recog.h:304 0x10f7fc8f maybe_gen_insn(insn_code, unsigned int, expand_operand*) .../gcc/optabs.c:7402 0x10f67f8b expand_binop_directly .../gcc/optabs.c:1122 0x10f684cf expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*, rtx_def*, int, optab_methods) .../gcc/optabs.c:1209 0x10f6fb4f expand_unop(machine_mode, optab_tag, rtx_def*, rtx_def*, int) .../gcc/optabs.c:3013 0x10f6c493 expand_simple_unop(machine_mode, rtx_code, rtx_def*, rtx_def*, int) .../gcc/optabs.c:2200 0x10f7e2f3 expand_atomic_fetch_op(rtx_def*, rtx_def*, rtx_def*, rtx_code, memmodel, bool) .../gcc/optabs.c:7021 0x107f7523 expand_builtin_sync_operation .../gcc/builtins.c:7605 0x107ff547 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int) .../gcc/builtins.c:9430 0x10acda63 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) .../gcc/expr.c:11249 0x10abeb9f expand_expr_real(tree_node*, rtx_def*, 
machine_mode, expand_modifier, rtx_def**, bool) .../gcc/expr.c:8486 0x1085606b expand_expr .../gcc/expr.h:282 0x1086157f expand_call_stmt .../gcc/cfgexpand.c:2709 0x10865ab7 expand_gimple_stmt_1 .../gcc/cfgexpand.c:3713 0x108662fb expand_gimple_stmt .../gcc/cfgexpand.c:3877 0x10870387 expand_gimple_basic_block .../gcc/cfgexpand.c:5918 0x10872b6b execute .../gcc/cfgexpand.c:6602 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. compiler exited with status 1 FAIL: gcc.c-torture/compile/sync-1.c -O0 (internal compiler error) causing numerous failures in regression testing. While requesting an addition operation to be produced for the constant operands of 0 and -1 may seem silly, technically there is nothing wrong with it, and non-QMATH code (as with the `-mno-qmath' option) has no issues with that, so neither should QMATH code. This operation will normally be folded in later passes anyway. 
Observe then, that adding or subtracting constant 0 amounts to a move (and we even have a machine instruction available to do that with a single operation) so handle the case explicitly, swapping the addends if so required, removing the assertion failure and along with that 70 test suite failures like: FAIL: gcc.c-torture/compile/sync-1.c -O0 (internal compiler error) FAIL: gcc.c-torture/compile/sync-1.c -O0 fetch_and_nand (test for warnings, line ) FAIL: gcc.c-torture/compile/sync-1.c -O0 nand_and_fetch (test for warnings, line ) FAIL: gcc.c-torture/compile/sync-1.c -O0 (test for excess errors) FAIL: gcc.c-torture/compile/sync-2.c -O0 (internal compiler error) FAIL: gcc.c-torture/compile/sync-2.c -O0 (test for warnings, line ) FAIL: gcc.c-torture/compile/sync-2.c -O0 (test for excess errors) FAIL: gcc.c-torture/compile/sync-3.c -O0 (internal compiler error) FAIL: gcc.c-torture/compile/sync-3.c -O0 (test for warnings, line ) FAIL: gcc.c-torture/compile/sync-3.c -O0 (test for excess errors) and similarly across all the other optimization levels and compilation options covered. gcc/ * config/vax/vax.c (vax_expand_addsub_di_operands): Handle the addition or subtraction of 0.

/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128' #0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77 #1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190 #2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338 #3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370 #4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f) #5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24) #6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd) Differential Revision: https://reviews.llvm.org/D97263 Cherry-pick from 16ede0956cb1f4b692dfa619ccfa6ab1de28e19b.

…imize or target pragmas [PR103012] The following testcases ICE when an optimize or target pragma is followed by a long line (4096+ chars). This is because on such long lines we can't use columns anymore, but the cpp_define calls performed by c_cpp_builtins_optimize_pragma or from the backend hooks for target pragma are done on temporary buffers and expect to get columns from whatever line they appear on (which happens to be the long line after optimize/target pragma), and we run into: #0 fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986 #1 0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502 #2 0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827 #3 0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898 #4 0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592 #5 0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394 #6 0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601 #7 0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639 #8 0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589 #9 0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513 #10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522 #11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>) at ../../gcc/c-family/c-cppbuiltin.c:600 
assertion that LC_RENAME doesn't happen first. I think the right fix is emit those predefined macros upon optimize/target pragmas with BUILTINS_LOCATION, like we already do for those macros at the start of the TU, they don't appear in columns of the next line after it. Another possibility would be to force them at the location of the pragma. 2021-12-30 Jakub Jelinek <jakub@redhat.com> PR c++/103012 gcc/ * config/i386/i386-c.c (ix86_pragma_target_parse): Perform cpp_define/cpp_undef calls with forced token locations BUILTINS_LOCATION. * config/arm/arm-c.c (arm_pragma_target_parse): Likewise. * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise. * config/s390/s390-c.c (s390_pragma_target_parse): Likewise. gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform cpp_define_unused/cpp_undef calls with forced token locations BUILTINS_LOCATION. gcc/testsuite/ PR c++/103012 * g++.dg/cpp/pr103012.C: New test. * g++.target/i386/pr103012.C: New test.

This patch implements C++23 P2255R2, which adds two new type traits to detect reference binding to a temporary. They can be used to detect code like std::tuple<const std::string&> t("meow"); which is incorrect because it always creates a dangling reference, because the std::string temporary is created inside the selected constructor of std::tuple, and not outside it. There are two new compiler builtins, __reference_constructs_from_temporary and __reference_converts_from_temporary. The former is used to simulate direct- and the latter copy-initialization context. But I had a hard time finding a test where there's actually a difference. Under DR 2267, both of these are invalid: struct A { } a; struct B { explicit B(const A&); }; const B &b1{a}; const B &b2(a); so I had to peruse [over.match.ref], and eventually realized that the difference can be seen here: struct G { operator int(); // #1 explicit operator int&&(); // #2 }; int&& r1(G{}); // use #2 (no temporary) int&& r2 = G{}; // use #1 (a temporary is created to be bound to int&&) The implementation itself was rather straightforward because we already have the conv_binds_ref_to_prvalue function. The main function here is ref_xes_from_temporary. I've changed the return type of ref_conv_binds_directly to tristate, because previously the function didn't distinguish between an invalid conversion and one that binds to a prvalue. Since it no longer returns a bool, I removed the _p suffix. The patch also adds the relevant class and variable templates to <type_traits>. PR c++/104477 gcc/c-family/ChangeLog: * c-common.cc (c_common_reswords): Add __reference_constructs_from_temporary and __reference_converts_from_temporary. * c-common.h (enum rid): Add RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY. gcc/cp/ChangeLog: * call.cc (ref_conv_binds_directly_p): Rename to ... (ref_conv_binds_directly): ... this. Add a new bool parameter. Change the return type to tristate. 
* constraint.cc (diagnose_trait_expr): Handle CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. * cp-tree.h: Include "tristate.h". (enum cp_trait_kind): Add CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. (ref_conv_binds_directly_p): Rename to ... (ref_conv_binds_directly): ... this. (ref_xes_from_temporary): Declare. * cxx-pretty-print.cc (pp_cxx_trait_expression): Handle CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. * method.cc (ref_xes_from_temporary): New. * parser.cc (cp_parser_primary_expression): Handle RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY. (cp_parser_trait_expr): Likewise. (warn_for_range_copy): Adjust to call ref_conv_binds_directly. * semantics.cc (trait_expr_value): Handle CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY. (finish_trait_expr): Likewise. libstdc++-v3/ChangeLog: * include/std/type_traits (reference_constructs_from_temporary, reference_converts_from_temporary): New class templates. (reference_constructs_from_temporary_v, reference_converts_from_temporary_v): New variable templates. (__cpp_lib_reference_from_temporary): Define for C++23. * include/std/version (__cpp_lib_reference_from_temporary): Define for C++23. * testsuite/20_util/variable_templates_for_traits.cc: Test reference_constructs_from_temporary_v and reference_converts_from_temporary_v. * testsuite/20_util/reference_from_temporary/value.cc: New test. * testsuite/20_util/reference_from_temporary/value2.cc: New test. * testsuite/20_util/reference_from_temporary/version.cc: New test. gcc/testsuite/ChangeLog: * g++.dg/ext/reference_constructs_from_temporary1.C: New test. * g++.dg/ext/reference_converts_from_temporary1.C: New test.
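
As a rough illustration of how the new library traits might be used (a sketch, not taken from the patch): `binds_to_temporary`, `have_trait` and `check_examples` are hypothetical names, and the feature-test macro guard is an assumption added so the snippet still compiles on toolchains that do not provide the C++23 traits.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// True when this toolchain provides the C++23 traits described above.
#if defined(__cpp_lib_reference_from_temporary)
constexpr bool have_trait = true;
#else
constexpr bool have_trait = false;
#endif

// Hypothetical helper: would direct-initializing a Ref from a From bind the
// reference to a temporary? Returns false when the trait is unavailable.
template <typename Ref, typename From>
constexpr bool binds_to_temporary() {
#if defined(__cpp_lib_reference_from_temporary)
    return std::reference_constructs_from_temporary_v<Ref, From>;
#else
    return false;
#endif
}

inline void check_examples() {
    // Binding to an existing std::string lvalue never creates a temporary.
    assert((!binds_to_temporary<const std::string&, std::string&>()));
    // Building a std::string from a const char* does (the "meow" case).
    assert((!have_trait || binds_to_temporary<const std::string&, const char*>()));
}
```

A constructor template could use such a check in a `static_assert` or a constraint to reject the dangling `std::tuple<const std::string&> t("meow")` pattern at compile time.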

This patch implements some additional zero-extension and sign-extension related optimizations in simplify-rtx.cc. The original motivation comes from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:

    Failed to match this instruction:
    (set (reg:DI 88 [ _1 ])
        (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))

On many platforms the result of DImode CTZ is constrained to be a small unsigned integer (between 0 and 64), hence the truncation to 32-bits (using a SUBREG) and the following sign extension back to 64-bits are effectively a no-op, so the above should ideally (often) be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".

To implement this, and some closely related transformations, we build upon the existing val_signbit_known_clear_p predicate. In the first chunk, nonzero_bits knows that FFS and ABS can't leave the sign bit set, so the simplification of ABS (ABS (x)) and ABS (FFS (x)) can itself be simplified. The second transformation is that we can canonicalize SIGN_EXTEND to ZERO_EXTEND (as in the PR 71775 case above) when the operand's sign-bit is known to be clear. The final two chunks are for SIGN_EXTEND of a truncating SUBREG, and ZERO_EXTEND of a truncating SUBREG respectively. The nonzero_bits of a truncating SUBREG pessimistically thinks that the upper bits may have an arbitrary value (by taking the SUBREG), so we need to look deeper at the SUBREG's operand to confirm that the high bits are known to be zero.

Unfortunately, for PR rtl-optimization/71775, ctz:DI on x86_64 with default architecture options is undefined at zero, so we can't be sure the upper bits of reg:DI 88 will be sign extended (all zeros or all ones). nonzero_bits knows this, so the above transformations don't trigger, but the transformations themselves are perfectly valid for other operations such as FFS, POPCOUNT and PARITY, and on other targets/-march settings where CTZ is defined at zero.

2022-08-03 Roger Sayle <roger@nextmovesoftware.com>
Segher Boessenkool <segher@kernel.crashing.org>
Richard Sandiford <richard.sandiford@arm.com>

gcc/ChangeLog
* simplify-rtx.cc (simplify_unary_operation_1) <ABS>: Add optimizations for CLRSB, PARITY, POPCOUNT, SS_ABS and LSHIFTRT that are all positive to complement the existing FFS and idempotent ABS simplifications.
<SIGN_EXTEND>: Canonicalize SIGN_EXTEND to ZERO_EXTEND when val_signbit_known_clear_p is true of the operand. Simplify sign extensions of SUBREG truncations of operands that are already suitably (zero) extended.
<ZERO_EXTEND>: Simplify zero extensions of SUBREG truncations of operands that are already suitably zero extended.
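
The arithmetic fact behind these simplifications can be checked at the source level. The following is only a sketch in C++ rather than RTL: the function names are hypothetical, and `__builtin_popcountll` is assumed to be available (it is a GCC/Clang builtin). It shows that sign extension and zero extension agree whenever the sign bit of the narrow value is clear, and that truncating a small non-negative result to 32 bits and sign-extending it back loses nothing.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend and zero-extend a 32-bit value to 64 bits and compare.
// The two agree exactly when the sign bit of v is clear.
inline bool extensions_agree(std::int32_t v) {
    std::int64_t sign_ext = static_cast<std::int64_t>(v);
    std::int64_t zero_ext =
        static_cast<std::int64_t>(static_cast<std::uint32_t>(v));
    return sign_ext == zero_ext;
}

// POPCOUNT of a 64-bit value lies in [0, 64], so truncating it to 32 bits
// and sign-extending back to 64 bits is effectively a no-op.
inline std::int64_t popcount_round_trip(std::uint64_t x) {
    std::int64_t wide = __builtin_popcountll(x);            // 0..64
    std::int32_t narrow = static_cast<std::int32_t>(wide);  // truncation
    return static_cast<std::int64_t>(narrow);               // sign extension
}

inline void check_extensions() {
    assert(extensions_agree(64));   // sign bit clear: extensions match
    assert(!extensions_agree(-1));  // sign bit set: they differ
    assert(popcount_round_trip(~0ULL) == 64);
}
```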

In my previous patches I've been extending our std::move warnings, but this tweak actually dials it down a little bit. As reported in bug 89780, it's questionable to warn about expressions in templates that were type-dependent, but aren't anymore because we're instantiating the template. As in,

    template <typename T>
    Dest withMove() {
      T x;
      return std::move(x);
    }

    template Dest withMove<Dest>(); // #1
    template Dest withMove<Source>(); // #2

Saying that the std::move is pessimizing for #1 is not incorrect, but it's not useful, because removing the std::move would then pessimize #2. So the user can't really win. At the same time, disabling the warning just because we're in a template would be going too far, I still want to warn for

    template <typename>
    Dest withMove() {
      Dest x;
      return std::move(x);
    }

because the std::move therein will be pessimizing for any instantiation. So I'm using the suppress_warning machinery to that effect.

Problem: I had to add a new group to nowarn_spec_t, otherwise suppressing the -Wpessimizing-move warning would disable a whole bunch of other warnings, which we really don't want.

PR c++/89780

gcc/cp/ChangeLog:
* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Maybe suppress -Wpessimizing-move.
* typeck.cc (maybe_warn_pessimizing_move): Don't issue warnings if they are suppressed. (check_return_expr): Disable -Wpessimizing-move when returning a dependent expression.

gcc/ChangeLog:
* diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Handle OPT_Wpessimizing_move and OPT_Wredundant_move.
* diagnostic-spec.h (nowarn_spec_t): Add NW_REDUNDANT enumerator.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/Wpessimizing-move3.C: Remove dg-warning.
* g++.dg/cpp0x/Wredundant-move2.C: Likewise.
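
To make the two instantiations concrete, here is a self-contained sketch of the `withMove` example. The definitions of `Dest` and `Source` and the two counters are assumptions for illustration only; the commit message does not show them. The counters record which constructor each instantiation ends up calling.

```cpp
#include <cassert>
#include <utility>

int dest_moves = 0;       // times Dest's move constructor ran
int source_converts = 0;  // times a Dest was built from an rvalue Source

struct Source {};

struct Dest {
    Dest() = default;
    Dest(const Dest&) = default;
    Dest(Dest&&) noexcept { ++dest_moves; }
    Dest(Source&&) { ++source_converts; }  // conversion from an rvalue Source
};

template <typename T>
Dest withMove() {
    T x;
    return std::move(x);  // the dependent std::move the warning is about
}

inline void check_with_move() {
    withMove<Dest>();     // like #1: std::move forces a move of the local
    assert(dest_moves == 1);
    withMove<Source>();   // like #2: std::move selects Dest(Source&&)
    assert(source_converts == 1);
}
```

For the `Dest` instantiation the explicit `std::move` rules out eliding the local into the return value, which is why the warning would call it pessimizing; for the `Source` instantiation the same line is what selects the rvalue conversion.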

The eliminate reg-reg move by inverting the condition of a cmove Rust-GCC#2 peephole2 converts the following sequence:

    473: bx:DI=[r14:DI*0x8+r12:DI]
    960: r15:DI=r8:DI
    485: {flags:CCC=cmp(r15:DI+bx:DI,bx:DI);r15:DI=r15:DI+bx:DI;}
    737: r15:DI={(geu(flags:CCC,0))?r15:DI:bx:DI}

to:

    1110: {flags:CCC=cmp(r8:DI+bx:DI,bx:DI);r8:DI=r8:DI+bx:DI;}
    1111: r15:DI=[r14:DI*0x8+r12:DI]
    1112: r15:DI={(geu(flags:CCC,0))?r8:DI:r15:DI}

Please note that (insn 1110) uses register BX, but its initialization was eliminated. Avoid the conversion if the eliminated move initialized a register used in the moved instruction.

2022-11-03 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/107404
* config/i386/i386.md (eliminate reg-reg move by inverting the condition of a cmove Rust-GCC#2 peephole2): Check if eliminated move initialized a register, used in the moved instruction.

gcc/testsuite/ChangeLog:

PR target/107404
* g++.target/i386/pr107404.C: New test.

While looking at PR 105549, which is about fixing the ABI break introduced in GCC 9.1 in parameter alignment with bit-fields, we noticed that the GCC 9.1 warning is not emitted in all the cases where it should be. This patch fixes that and the next patch in the series fixes the GCC 9.1 break.

We split this into two patches since patch #2 introduces a new ABI break starting with GCC 13.1. This way, patch #1 can be back-ported to release branches if needed to fix the GCC 9.1 warning issue.

The main idea is to add a new global boolean that indicates whether we're expanding the start of a function, so that aarch64_layout_arg can emit warnings for callees as well as callers. This removes the need for aarch64_function_arg_boundary to warn (with its incomplete information). However, in the first patch there are still cases where we emit warnings where we should not; this is fixed in patch #2 where we can distinguish between GCC 9.1 and GCC 13.1 ABI breaks properly.

The fix in aarch64_function_arg_boundary (replacing & with &&) looks like an oversight of a previous commit in this area which changed 'abi_break' from a boolean to an integer.

We also take the opportunity to fix the comment above aarch64_function_arg_alignment since the value of the abi_break parameter was changed in a previous commit, no longer matching the description.

2022-11-28 Christophe Lyon <christophe.lyon@arm.com>
Richard Sandiford <richard.sandiford@arm.com>

gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix comment. (aarch64_layout_arg): Factorize warning conditions. (aarch64_function_arg_boundary): Fix typo.
* function.cc (currently_expanding_function_start): New variable. (expand_function_start): Handle currently_expanding_function_start.
* function.h (currently_expanding_function_start): Declare.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning.h: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning.h: New test.

Here the ahead-of-time overload set pruning in finish_call_expr is unintentionally returning a CALL_EXPR whose (pruned) callee is wrapped in an ADDR_EXPR, despite the original callee not being wrapped in an ADDR_EXPR. This ends up causing a bogus declaration mismatch error in the below testcase because the call to min in #1 gets expressed as a CALL_EXPR of ADDR_EXPR of FUNCTION_DECL, whereas the level-lowered call to min in #2 gets expressed instead as a CALL_EXPR of FUNCTION_DECL.

This patch fixes this by stripping the spurious ADDR_EXPR appropriately. Thus the first call to min now also gets expressed as a CALL_EXPR of FUNCTION_DECL, matching the behavior before r12-6075-g2decd2cabe5a4f.

PR c++/107461

gcc/cp/ChangeLog:
* semantics.cc (finish_call_expr): Strip ADDR_EXPR from the selected callee during overload set pruning.

gcc/testsuite/ChangeLog:
* g++.dg/template/call9.C: New test.

After r13-5684-g59e0376f607805 the (pruned) callee of a non-dependent CALL_EXPR is a bare FUNCTION_DECL rather than ADDR_EXPR of FUNCTION_DECL. This innocent change revealed that cp_tree_equal doesn't first check dependence of a CALL_EXPR before treating a FUNCTION_DECL callee as a dependent name, which leads to us incorrectly accepting the first two testcases below and rejecting the third:

* In the first testcase, cp_tree_equal incorrectly returns true for the two non-dependent CALL_EXPRs f(0) and f(0) (whose CALL_EXPR_FN are different FUNCTION_DECLs) which causes us to treat #2 as a redeclaration of #1.
* Same issue in the second testcase, for f<int*>() and f<char>().
* In the third testcase, cp_tree_equal incorrectly returns true for f<int>() and f<void(*)(int)>() which causes us to conflate the two dependent specializations A<decltype(f<int>()(U()))> and A<decltype(f<void(*)(int)>()(U()))>.

This patch fixes this by making called_fns_equal treat two callees as dependent names only if the overall CALL_EXPRs are dependent, via a new convenience function call_expr_dependent_name that is like dependent_name but also checks dependence of the overall CALL_EXPR.

PR c++/107461

gcc/cp/ChangeLog:
* cp-tree.h (call_expr_dependent_name): Declare.
* pt.cc (iterative_hash_template_arg) <case CALL_EXPR>: Use call_expr_dependent_name instead of dependent_name.
* tree.cc (call_expr_dependent_name): Define. (called_fns_equal): Adjust to take two CALL_EXPRs instead of CALL_EXPR_FNs thereof. Use call_expr_dependent_name instead of dependent_name. (cp_tree_equal) <case CALL_EXPR>: Adjust call to called_fns_equal.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/overload5.C: New test.
* g++.dg/cpp0x/overload5a.C: New test.
* g++.dg/cpp0x/overload6.C: New test.

Improve stack protector patterns and peephole2s even more:

a. Use unrelated register clears with integer mode size <= word mode size to clear stack protector scratch register.
b. Use unrelated register initializations in front of stack protector sequence to clear stack protector scratch register.
c. Use unrelated register initializations using LEA instructions to clear stack protector scratch register.

These stack protector improvements reuse 6914 unrelated register initializations to substitute the clear of stack protector scratch register in 12034 instances of stack protector sequence in recent linux defconfig build.

gcc/ChangeLog:
* config/i386/i386.md (@stack_protect_set_1_<PTR:mode>_<W:mode>): Use W mode iterator instead of SWI48. Output MOV instead of XOR for TARGET_USE_MOV0.
(stack_protect_set_1 peephole2): Use integer modes with mode size <= word mode size for operand 3.
(stack_protect_set_1 peephole2 #2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization, originally in front of stack protector sequence.
(*stack_protect_set_3_<PTR:mode>_<SWI48:mode>): New insn pattern.
(stack_protect_set_1 peephole2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization involving LEA instruction.

Use unrelated register initializations using zero/sign-extend instructions to clear stack protector scratch register.

Handle only SI -> DImode extensions for 64-bit targets, as this is the only extension that triggers the peephole in a non-negligible number of cases.

Also use explicit check for word_mode instead of mode iterator in peephole2 patterns to avoid pattern explosion.

gcc/ChangeLog:

	* config/i386/i386.md (stack_protect_set_1 peephole2): Explicitly check operand 2 for word_mode.
	(stack_protect_set_1 peephole2 #2): Ditto.
	(stack_protect_set_2 peephole2): Ditto.
	(stack_protect_set_3 peephole2): Ditto.
	(*stack_protect_set_4z_<mode>_di): New insn pattern.
	(*stack_protect_set_4s_<mode>_di): Ditto.
	(stack_protect_set_4 peephole2): New peephole2 pattern to substitute stack protector scratch register clear with unrelated register initialization involving zero/sign-extend instruction.

Since the last import from upstream libsanitizer, the output has changed and now looks more like this:

    READ of size 6 at 0x7ff7beb2a144 thread T0
        #0 0x101cf7796 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) sanitizer_common_interceptors.inc:813
        #1 0x101cf7b99 in memcmp sanitizer_common_interceptors.inc:840
        #2 0x108a0c39f in __stack_chk_guard+0xf (dyld:x86_64+0x8039f)

so let's adjust the pattern accordingly.

gcc/testsuite/ChangeLog:

	* c-c++-common/asan/memcmp-1.c: Adjust pattern on darwin.

…-int (PR target/112413)

On m68k the compiler assumes that the PC-relative jump-via-jump-table instruction and the jump table are adjacent with no padding in between.

When -mlong-jump-table-offsets is combined with -malign-int, a 2-byte nop may be inserted before the jump table, causing the jump to add the fetched offset to the wrong PC base and thus jump to the wrong address.

Fixed by referencing the jump table via its label. On the test case in the PR the object code change is (the moveal at 16 is the nop):

        a:	6536      	bcss 42 <f+0x42>
        c:	e588      	lsll #2,%d0
        e:	203b 0808 	movel %pc@(18 <f+0x18>,%d0:l),%d0
    -  12:	4efb 0802 	jmp %pc@(16 <f+0x16>,%d0:l)
    +  12:	4efb 0804 	jmp %pc@(18 <f+0x18>,%d0:l)
       16:	284c      	moveal %a4,%a4
       18:	0000 0020 	orib #32,%d0
       1c:	0000 002c 	orib #44,%d0

Bootstrapped and tested on m68k-linux-gnu, no regressions.

Note: I don't have commit rights, so I would need assistance applying this.

	PR target/112413

gcc/

	* config/m68k/linux.h (ASM_RETURN_CASE_JUMP): For TARGET_LONG_JUMP_TABLE_OFFSETS, reference the jump table via its label.
	* config/m68k/m68kelf.h (ASM_RETURN_CASE_JUMP): Likewise.
	* config/m68k/netbsd-elf.h (ASM_RETURN_CASE_JUMP): Likewise.
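The address arithmetic behind that one-byte change can be checked with a small sketch. The addresses and the first table entry are taken from the listing above; `jump_target` and the constant names are hypothetical, and the sketch assumes each table entry is an offset relative to the table's label, which is what the corrected `jmp` implies.

```cpp
#include <cstdint>

// Hypothetical model of the PC-relative jump: the CPU adds the offset it
// fetched from the jump table to whatever base address the jmp encodes.
constexpr std::uint32_t jump_target(std::uint32_t base, std::uint32_t table_offset) {
  return base + table_offset;
}

// Values read off the disassembly in the commit message.
constexpr std::uint32_t kNopAddress = 0x16;  // 2-byte alignment nop (the moveal)
constexpr std::uint32_t kTableLabel = 0x18;  // first jump table entry
constexpr std::uint32_t kFirstEntry = 0x20;  // 32-bit value stored at 0x18

// After the fix the jmp uses the table label as its base.
static_assert(jump_target(kTableLabel, kFirstEntry) == 0x38,
              "fixed jmp: offset is added to the table label");

// Before the fix the jmp used the address right after itself, which is the
// nop when padding is present, so the target is off by the padding size.
static_assert(jump_target(kNopAddress, kFirstEntry) == 0x36,
              "old jmp: offset is added to the nop's address");

static_assert(kTableLabel - kNopAddress == 2, "the error equals the nop size");
```

The two results differ by exactly the two bytes of padding, which matches the displacement byte changing from 02 to 04 in the encoded `jmp`.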

During partial ordering, we want to look through dependent alias template specializations within template arguments and otherwise treat them as opaque in other contexts (see e.g. r7-7116-g0c942f3edab108 and r11-7011-g6e0a231a4aa240). To that end template_args_equal was given a partial_order flag that controls this behavior. This flag does the right thing when a dependent alias template specialization appears as template argument of the partial specialization, e.g. in

    template<class T, class...> using first_t = T;
    template<class T> struct traits;
    template<class T> struct traits<first_t<T, T&>> { }; // #1
    template<class T> struct traits<first_t<const T, T&>> { }; // #2

we correctly consider #2 to be more specialized than #1. But if the alias specialization appears as a nested template argument of another class template specialization, e.g. in

    template<class T> struct traits<A<first_t<T, T&>>> { }; // #1
    template<class T> struct traits<A<first_t<const T, T&>>> { }; // #2

then we incorrectly consider #1 and #2 to be unordered. This is because

  1. we don't propagate the flag to recursive template_args_equal calls
  2. we don't use structural equality for class template specializations written in terms of dependent alias template specializations

This patch fixes the first issue by turning the partial_order flag into a global. This patch fixes the second issue by making us propagate structural equality appropriately when building a class template specialization. In passing this patch also improves hashing of specializations that use structural equality.

	PR c++/90679

gcc/cp/ChangeLog:

	* cp-tree.h (comp_template_args): Remove partial_order parameter.
	(template_args_equal): Likewise.
	* pt.cc (comparing_for_partial_ordering): New global flag.
	(iterative_hash_template_arg) <case tcc_type>: Hash the template and arguments for specializations that use structural equality.
	(template_args_equal): Remove partial order parameter and use comparing_for_partial_ordering instead.
	(comp_template_args): Likewise.
	(comp_template_args_porder): Set comparing_for_partial_ordering instead. Make static.
	(any_template_arguments_need_structural_equality_p): Return true for an argument that's a dependent alias template specialization or a class template specialization that itself needs structural equality.
	* tree.cc (cp_tree_equal) <case TREE_VEC>: Adjust call to comp_template_args.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/alias-decl-75a.C: New test.
	* g++.dg/cpp0x/alias-decl-75b.C: New test.

This patch adjusts the costs so that we treat REG and SUBREG expressions the same for costing.

This was motivated by bt_skip_func and bt_find_func in xz and results in nearly a 5% improvement in the dynamic instruction count for input #2 and smaller, but definitely visible improvements pretty much across the board. Exceptions would be perlbench input #1 and exchange2 which showed very small regressions.

In the bt_find_func and bt_skip_func cases we have something like this:

    > (insn 10 7 11 2 (set (reg/v:DI 136 [ x ])
    >         (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0))) "zz.c":6:21 387 {*zero_extendsidi2_bitmanip}
    >      (nil))
    > (insn 11 10 12 2 (set (reg:DI 142 [ _1 ])
    >         (plus:DI (reg/v:DI 136 [ x ])
    >             (reg/v:DI 139 [ b ]))) "zz.c":7:23 5 {adddi3}
    >      (nil))
    [ ... ]
    > (insn 13 12 14 2 (set (reg:DI 143 [ _2 ])
    >         (plus:DI (reg/v:DI 136 [ x ])
    >             (reg/v:DI 141 [ c ]))) "zz.c":8:23 5 {adddi3}
    >      (nil))

Note the two uses of (reg 136). The best way to handle that in combine might be a 3->2 split. But there's a much better approach if we look at fwprop...

    (set (reg:DI 142 [ _1 ])
        (plus:DI (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0))
            (reg/v:DI 139 [ b ])))
    change not profitable (cost 4 -> cost 8)

So that should be the same cost as a regular DImode addition when the ZBA extension is enabled. But it ends up costing more because the clause to cost this variant isn't prepared to handle a SUBREG. That results in the RTL above having too high a cost and fwprop gives up.

One approach would be to replace the REG_P with REG_P || SUBREG_P in the costing code. I ultimately decided against that and instead check if the operand in question passes register_operand.

By far the most important case to handle is the DImode PLUS. But for the sake of consistency, I changed the other instances in riscv_rtx_costs as well. For those other cases we're talking about improvements in the .000001% range.

While we are into stage4, this just hits cost modeling which we've generally agreed is still appropriate (though we were mostly talking about vector). So I'm going to extend that general agreement ever so slightly and include scalar cost modeling :-)

gcc/

	* config/riscv/riscv.cc (riscv_rtx_costs): Handle SUBREG and REG similarly.

gcc/testsuite/

	* gcc.target/riscv/reg_subreg_costs.c: New test.

Co-authored-by: Jivan Hakobyan <jivanhakobyan9@gmail.com>

Given the example below for VLS mode:

    void
    test (vl_t *u)
    {
      vl_t t;
      long long *p = (long long *)&t;
      p[0] = p[1] = 2;
      *u = t;
    }

The vec_set will simplify the insn to vmv.s.x when index is 0, without a merged operand. That will result in some problems in DCE, aka:

    1:  137[DI] = a0
    2:  138[V2DI] = 134[V2DI]                              // deleted by DCE
    3:  139[DI] = #2                                       // deleted by DCE
    4:  140[DI] = #2                                       // deleted by DCE
    5:  141[V2DI] = vec_dup:V2DI (139[DI])                 // deleted by DCE
    6:  138[V2DI] = vslideup_imm (138[V2DI], 141[V2DI], 1) // deleted by DCE
    7:  135[V2DI] = 138[V2DI]                              // deleted by DCE
    8:  142[V2DI] = 135[V2DI]                              // deleted by DCE
    9:  143[DI] = #2
    10: 142[V2DI] = vec_dup:V2DI (143[DI])
    11: (137[DI]) = 142[V2DI]

The higher 64 bits of 142[V2DI] are unknown here, which generates incorrect code when the value is stored back to memory. This patch fixes the issue by adding a new SCALAR_MOVE_MERGED_OP for vec_set.

Please note this patch doesn't enable VLS for vec_set; the underlying patches will support this soon.

gcc/ChangeLog:

	* config/riscv/autovec.md: Bugfix.
	* config/riscv/riscv-protos.h (SCALAR_MOVE_MERGED_OP): New enum.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

We evaluate constexpr functions on the original, pre-genericization bodies. That means that the function body we're evaluating will not have gone through cp_genericize_r's "Map block scope extern declarations to visible declarations with the same name and type in outer scopes if any". Here:

    constexpr bool bar() { return true; } // #1
    constexpr bool foo() {
      constexpr bool bar(void); // #2
      return bar();
    }

it means that we:

1) register_constexpr_fundef (#1)
2) cp_genericize (#1)
   nothing interesting happens
3) register_constexpr_fundef (foo)
   does copy_fn, so we have two copies of the BIND_EXPR
4) cp_genericize (foo)
   this remaps #2 to #1, but only on one copy of the BIND_EXPR
5) retrieve_constexpr_fundef (foo)
   we find it, no problem
6) retrieve_constexpr_fundef (#2)
   and here #2 isn't found in constexpr_fundef_table, because we're working on the BIND_EXPR copy where #2 wasn't mapped to #1 so we fail. We've only registered #1.

It should work to use DECL_LOCAL_DECL_ALIAS (which used to be extern_decl_map). We evaluate constexpr functions on pre-cp_fold bodies to avoid diagnostic problems, but the remapping I'm proposing should not interfere with diagnostics.

This is not a problem for a global scope redeclaration; there we go through duplicate_decls which keeps the DECL_UID:

    DECL_UID (olddecl) = olddecl_uid;

and DECL_UID is what constexpr_fundef_hasher::hash uses.

	PR c++/111132

gcc/cp/ChangeLog:

	* constexpr.cc (get_function_named_in_call): Use cp_get_fndecl_from_callee.
	* cvt.cc (cp_get_fndecl_from_callee): If there's a DECL_LOCAL_DECL_ALIAS, use it.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-redeclaration3.C: New test.
	* g++.dg/cpp0x/constexpr-redeclaration4.C: New test.
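For anyone who wants to try the pattern from that commit message, here is a minimal sketch built around the same two declarations. `foo_at_runtime` is a hypothetical helper added for illustration; it calls `foo()` in an ordinary, non-constant context, so the sketch does not depend on the compiler having the fix. It assumes C++14 or later, since `foo` contains a declaration in its body.

```cpp
#include <cassert>

constexpr bool bar() { return true; }  // #1: the definition

constexpr bool foo() {
  constexpr bool bar(void);  // #2: block-scope redeclaration of #1
  return bar();              // names #2, which denotes the same function as #1
}

// Hypothetical helper: evaluates foo() in an ordinary (non-constant) context.
bool foo_at_runtime() {
  const bool result = foo();
  assert(result);  // bar() returns true, so foo() does as well
  return result;
}
```

Adding `static_assert(foo(), "");` after `foo` would force constant evaluation instead, which is the path the message describes as failing at step 6; whether that line compiles depends on whether the compiler in use includes the fix.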

philberty closed this as completed Dec 21, 2020

philberty pushed a commit that referenced this issue Mar 1, 2021

Remove the last trace of the Operator enum #2

5625e07

Missed a few lines in the last attempt. Whoops.

philberty mentioned this issue Jul 26, 2021

Can't call extern functions #421

Closed
