Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Join forces #2

Closed
michalfita opened this issue Dec 9, 2019 · 19 comments
Closed

Join forces #2

michalfita opened this issue Dec 9, 2019 · 19 comments

Comments

@michalfita
Copy link

Please join forces with https://github.com/sapir/gcc-rust and don't try to do that by yourself. We need one good Rust frontend for GCC not multiple projects that both fail.

Thank you.

@SimplyTheOther
Copy link
Member

Please join forces with https://github.com/sapir/gcc-rust and don't try to do that by yourself. We need one good Rust frontend for GCC not multiple projects that both fail.

Thank you.

The projects seem to have completely different approaches to implementing a Rust front end. gcc-rust looks like it’s a wrapper for rustc itself to be used in GCC (and is written in Rust itself and so suffers the same bootstrapping problems as the current rustc, as well as the “trusting trust” issue) while gccrs is a full alternative implementation written in C++. I don’t think that there would be much overlap between the actual code required with each project as they currently stand.

@michalfita
Copy link
Author

michalfita commented Dec 9, 2019

What about joining forces with with https://github.com/thepowersgang/mrustc some way?

(I'd help myself, but I'm not a compiler type of guy, too much other work to do - however grustc would be huge benefit at convincing employers to use Rust 😄)

@philberty
Copy link
Member

Hi thank you for pointing this out

I dropped working on this project for a few years because for 2 reasons:

  1. How on earth do i approach this and why.
  2. Personal Life job was too busy to do anything

It has taken along time to decide what direction i wanted to go given the size of the task.

Option 1 - Do what @sapir doing by reusing MIR from rust

     - Pros:
       a. No need to implement Lexer or Parser
       b. No need to implement analysis
       c. Get the same diagnostics
       d. unifyies the implementation to simply be a new backend

     - Con:
       a. Relies on rust
       b. possible license issues

Option 2 - Use rust GCC bindings to reuse GCC JIT interface ontop of the rust compiler

     - Pros:
       a. No need to implement Lexer or Parser
       b. No need to implement analysis
       c. Get the same diagnostics
       d. unifyies the implementation to simply be a new backend
       e. Gives rise of possibly doing full JIT on the fly

     - Cons:
       a. Lacks clarity of direction

Option 3 - Full GCC C++ implemtation

     - Pros:
       a. Clear vision to have an independant c++ implementation
       b. Can follow the gccgo pattern for making it independant front-end code to retarget GCC or LLVM or Mono or JVM
       c. No license issues it is a full reimplemtation
     - Cons:
       a. Huge task

There is no clear winner in terms of a pragmatic goal of implemting rust on GCC as each of them will do it and in a contest of fastest/reliable to the post it will be Option 1 and hats of to @sapir this is looks like its shaping up very well and i wish him all the sucess.

Option 3 appeals to me as there is a vision of what could be possible with this front-end code which exicites me. Given the size of the task i have taken the last few years to decide this is what my open-source focus will be for at least the next 2 years. I hope i can stick to this goal to see where i get to with it as i feel its very much a worthwhile goal for rust.

@NalaGinrut
Copy link
Contributor

NalaGinrut commented Dec 10, 2019

@redbrain It's so sad to hear this project was stopped. What if I submit some patches in my casual time? Do you think it's fine to maintain it and accept the patches, or do you prefer another fork?

@ArneBab
Copy link

ArneBab commented Apr 3, 2020

Is there a place where you provide updates (i.e. small news snippets)?

(I’m happy to hear that you’re working on this again!)

@NalaGinrut
Copy link
Contributor

NalaGinrut commented Apr 4, 2020

I've made little research around GCC frontend and the current Rust upstream implementation. I'm interested in making some progress in Rust on GCC. But I have to finish another compiler project of mine first. So it may not be very soon to show something. Anyway, I hope we can use Rust seriously with GCC.
BTW, I prefer option-3.

@philberty
Copy link
Member

Is there a place where you provide updates (i.e. small news snippets)?

(I’m happy to hear that you’re working on this again!)

Thanks for this i have several patches in progress to get this to a working state where people can jump in. As for news i have just created a blog over at:

https://thephilbert.io/

nothing there yet but i will be putting up some stuff next week on the status and my plan.

@ArneBab
Copy link

ArneBab commented Apr 7, 2020

You’re awesome — thank you!

@SimplyTheOther
Copy link
Member

I've written a fully-operational and working lexer, parser, and AST for a potential Rust GCC frontend that I can provide as patches. The code has been targeted against stable Rust as of December 2019 and is hand-coded recursive-descent (as opposed to generated). It doesn't yet complete macro expansion, name resolution, type checking, conversion to GCC IR, etc., but should be useful as a starting point if you haven't already written the patches. It's at https://github.com/SimplyTheOther/gccrs/tree/master/gcc/rust/test3 if anyone wants to look at it.

@philberty
Copy link
Member

@SimplyTheOther Thanks for this, the work looks really good here. I reckon it would be a fairly hard project to get that merged into my stuff so how about i do it. I will rebase from your work to just the test3 folder as the front-end and merge in my work so you still have the git history of your work

@philberty
Copy link
Member

@SimplyTheOther I have completed a rebase with your work into this branch https://github.com/philberty/gccrs/tree/simply-philbert-rebase/gcc/rust

If your happy with me using this as a base i would like to push to master and continue my work ontop of this. We could also change this into an organisation so you have ownership too.

@NalaGinrut
Copy link
Contributor

@SimplyTheOther Thanks for the work! I've taken a look at your code. Now that you've finished the time-consuming parser part, I'd like to continue the work based on your existing code. And if you create an organization, I think it's better to choose a branch-based model, say, lock the master branch, and allow members to create branches, so that the contributions can be reviewed in PR.

@SimplyTheOther
Copy link
Member

@philberty Using this as a base would be absolutely great. I'd be happy to contribute further too or answer any questions about design and implementation choices if you have any.

@NalaGinrut No problem. However, I'm not so sure that the parser is the time-consuming part - I imagine the type checking (since rust heavily uses type inference) and name resolution, as well as more advanced stuff like borrow checking if planned to be implemented, would take up quite a lot of time too.

@NalaGinrut
Copy link
Contributor

@SimplyTheOther Agreed, type checking would be the hard work. I think you've roughly finished the architecture without filling in some functions. But I need time to understand your idea in the code. To organize a community efficiently, maybe we need discussion somewhere. IRC or slack are both fines for me.

I don't think gcc-rust is a quick job, and I can only work on it in my part time. But I hope it can continue.

@philberty
Copy link
Member

I have done some more code cleanup there today i have 1 more commit to push in and then i will enforce PR's only to master including on myself to keep myself honest. I also setup github workflow to check build status etc. The next part is to share my plan somehow either via blog or on this github project.

@philberty
Copy link
Member

philberty commented Jul 11, 2020

Posted my first hello world blog post: https://thephilbert.io/?p=11

Are people interested in a youtube style content similar as https://github.com/SerenityOS/serenity https://www.youtube.com/channel/UC3ts8coMP645hZw9JSD3pqQ

@glaubitz
Copy link
Contributor

@philberty It may not be much, but I have created a Bountysource campaign to support your effort. It has not gained that much attraction yet, but it might do in the future.

@sapir
Copy link

sapir commented Jul 13, 2020

@glaubitz
Copy link
Contributor

That's the one I created :-).

philberty pushed a commit that referenced this issue Mar 1, 2021
Missed a few lines in the last attempt. Whoops.
philberty pushed a commit that referenced this issue Mar 3, 2021
Handle constant 0 passed to the QMATH DImode add/sub handler such as
with:

#2  0x0000000011d409b0 in gen_adddi3 (operand0=0x7ffff5c0a128,
    operand1=0x7ffff5c60480, operand2=0x7ffff5c60470)
    at .../gcc/config/vax/vax.md:755
755	  "vax_expand_addsub_di_operands (operands, PLUS); DONE;")
(gdb) pr operand0
(reg:DI 31)
(gdb) pr operand1
(const_int 0 [0])
(gdb) pr operand2
(const_int -1 [0xffffffffffffffff])
(gdb)

causing an assertion in `vax_expand_addsub_di_operands':

      gcc_assert (operands[1] != const0_rtx || code == MINUS);

to trigger:

during RTL pass: expand
.../gcc/testsuite/gcc.c-torture/compile/sync-1.c: In function 'test_op_ignore':
.../gcc/testsuite/gcc.c-torture/compile/sync-1.c:33:10: internal compiler error: in vax_expand_addsub_di_operands, at config/vax/vax.c:2080
0x11815003 vax_expand_addsub_di_operands(rtx_def**, rtx_code)
	.../gcc/config/vax/vax.c:2080
0x11d409af gen_adddi3(rtx_def*, rtx_def*, rtx_def*)
	.../gcc/config/vax/vax.md:755
0x10ea2763 rtx_insn* insn_gen_fn::operator()<rtx_def*, rtx_def*, rtx_def*>(rtx_def*, rtx_def*, rtx_def*) const
	.../gcc/recog.h:304
0x10f7fc8f maybe_gen_insn(insn_code, unsigned int, expand_operand*)
	.../gcc/optabs.c:7402
0x10f67f8b expand_binop_directly
	.../gcc/optabs.c:1122
0x10f684cf expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*, rtx_def*, int, optab_methods)
	.../gcc/optabs.c:1209
0x10f6fb4f expand_unop(machine_mode, optab_tag, rtx_def*, rtx_def*, int)
	.../gcc/optabs.c:3013
0x10f6c493 expand_simple_unop(machine_mode, rtx_code, rtx_def*, rtx_def*, int)
	.../gcc/optabs.c:2200
0x10f7e2f3 expand_atomic_fetch_op(rtx_def*, rtx_def*, rtx_def*, rtx_code, memmodel, bool)
	.../gcc/optabs.c:7021
0x107f7523 expand_builtin_sync_operation
	.../gcc/builtins.c:7605
0x107ff547 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
	.../gcc/builtins.c:9430
0x10acda63 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
	.../gcc/expr.c:11249
0x10abeb9f expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
	.../gcc/expr.c:8486
0x1085606b expand_expr
	.../gcc/expr.h:282
0x1086157f expand_call_stmt
	.../gcc/cfgexpand.c:2709
0x10865ab7 expand_gimple_stmt_1
	.../gcc/cfgexpand.c:3713
0x108662fb expand_gimple_stmt
	.../gcc/cfgexpand.c:3877
0x10870387 expand_gimple_basic_block
	.../gcc/cfgexpand.c:5918
0x10872b6b execute
	.../gcc/cfgexpand.c:6602
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1
FAIL: gcc.c-torture/compile/sync-1.c   -O0  (internal compiler error)

causing numerous failures in regression testing.

While requesting an addition operation to be produced for the constant
operands of 0 and -1 may seem silly, technically there is nothing wrong
with it, and non-QMATH code (as with the `-mno-qmath' option) has no
issues with that, so neither should QMATH code.  This operation will
normally be folded in later passes anyway.

Observe then, that adding or subtracting constant 0 amounts to a move
(and we even have a machine instruction available to do that with a
single operation) so handle the case explicitly, swapping the addends if
so required, removing the assertion failure and along with that 70 test
suite failures like:

FAIL: gcc.c-torture/compile/sync-1.c   -O0  (internal compiler error)
FAIL: gcc.c-torture/compile/sync-1.c   -O0  fetch_and_nand (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-1.c   -O0  nand_and_fetch (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-1.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/sync-2.c   -O0  (internal compiler error)
FAIL: gcc.c-torture/compile/sync-2.c   -O0   (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-2.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/sync-3.c   -O0  (internal compiler error)
FAIL: gcc.c-torture/compile/sync-3.c   -O0   (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-3.c   -O0  (test for excess errors)

and similarly across all the other optimization levels and compilation
options covered.

	gcc/
	* config/vax/vax.c (vax_expand_addsub_di_operands): Handle the
	addition or subtraction of 0.
philberty pushed a commit that referenced this issue Mar 3, 2021
/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128'
    #0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77
    #1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190
    #2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338
    #3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370
    #4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f)
    #5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24)
    #6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd)

Differential Revision: https://reviews.llvm.org/D97263

Cherry-pick from 16ede0956cb1f4b692dfa619ccfa6ab1de28e19b.
bors bot pushed a commit that referenced this issue Jan 25, 2022
…imize or target pragmas [PR103012]

The following testcases ICE when an optimize or target pragma
is followed by a long line (4096+ chars).
This is because on such long lines we can't use columns anymore,
but the cpp_define calls performed by c_cpp_builtins_optimize_pragma
or from the backend hooks for target pragma are done on temporary
buffers and expect to get columns from whatever line they appear on
(which happens to be the long line after optimize/target pragma),
and we run into:
 #0  fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986
 #1  0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502
 #2  0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827
 #3  0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898
 #4  0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592
 #5  0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394
 #6  0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601
 #7  0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639
 #8  0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589
 #9  0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513
 #10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522
 #11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>)
     at ../../gcc/c-family/c-cppbuiltin.c:600
assertion that LC_RENAME doesn't happen first.

I think the right fix is emit those predefined macros upon
optimize/target pragmas with BUILTINS_LOCATION, like we already do
for those macros at the start of the TU, they don't appear in columns
of the next line after it.  Another possibility would be to force them
at the location of the pragma.

2021-12-30  Jakub Jelinek  <jakub@redhat.com>

	PR c++/103012
gcc/
	* config/i386/i386-c.c (ix86_pragma_target_parse): Perform
	cpp_define/cpp_undef calls with forced token locations
	BUILTINS_LOCATION.
	* config/arm/arm-c.c (arm_pragma_target_parse): Likewise.
	* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise.
	* config/s390/s390-c.c (s390_pragma_target_parse): Likewise.
gcc/c-family/
	* c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform
	cpp_define_unused/cpp_undef calls with forced token locations
	BUILTINS_LOCATION.
gcc/testsuite/
	PR c++/103012
	* g++.dg/cpp/pr103012.C: New test.
	* g++.target/i386/pr103012.C: New test.
bors bot pushed a commit that referenced this issue Aug 24, 2022
This patch implements C++23 P2255R2, which adds two new type traits to
detect reference binding to a temporary.  They can be used to detect code
like

  std::tuple<const std::string&> t("meow");

which is incorrect because it always creates a dangling reference, because
the std::string temporary is created inside the selected constructor of
std::tuple, and not outside it.

There are two new compiler builtins, __reference_constructs_from_temporary
and __reference_converts_from_temporary.  The former is used to simulate
direct- and the latter copy-initialization context.  But I had a hard time
finding a test where there's actually a difference.  Under DR 2267, both
of these are invalid:

  struct A { } a;
  struct B { explicit B(const A&); };
  const B &b1{a};
  const B &b2(a);

so I had to peruse [over.match.ref], and eventually realized that the
difference can be seen here:

  struct G {
    operator int(); // #1
    explicit operator int&&(); // #2
  };

int&& r1(G{}); // use #2 (no temporary)
int&& r2 = G{}; // use #1 (a temporary is created to be bound to int&&)

The implementation itself was rather straightforward because we already
have the conv_binds_ref_to_prvalue function.  The main function here is
ref_xes_from_temporary.
I've changed the return type of ref_conv_binds_directly to tristate, because
previously the function didn't distinguish between an invalid conversion and
one that binds to a prvalue.  Since it no longer returns a bool, I removed
the _p suffix.

The patch also adds the relevant class and variable templates to <type_traits>.

	PR c++/104477

gcc/c-family/ChangeLog:

	* c-common.cc (c_common_reswords): Add
	__reference_constructs_from_temporary and
	__reference_converts_from_temporary.
	* c-common.h (enum rid): Add RID_REF_CONSTRUCTS_FROM_TEMPORARY and
	RID_REF_CONVERTS_FROM_TEMPORARY.

gcc/cp/ChangeLog:

	* call.cc (ref_conv_binds_directly_p): Rename to ...
	(ref_conv_binds_directly): ... this.  Add a new bool parameter.  Change
	the return type to tristate.
	* constraint.cc (diagnose_trait_expr): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	* cp-tree.h: Include "tristate.h".
	(enum cp_trait_kind): Add CPTK_REF_CONSTRUCTS_FROM_TEMPORARY
	and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	(ref_conv_binds_directly_p): Rename to ...
	(ref_conv_binds_directly): ... this.
	(ref_xes_from_temporary): Declare.
	* cxx-pretty-print.cc (pp_cxx_trait_expression): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	* method.cc (ref_xes_from_temporary): New.
	* parser.cc (cp_parser_primary_expression): Handle
	RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY.
	(cp_parser_trait_expr): Likewise.
	(warn_for_range_copy): Adjust to call ref_conv_binds_directly.
	* semantics.cc (trait_expr_value): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	(finish_trait_expr): Likewise.

libstdc++-v3/ChangeLog:

	* include/std/type_traits (reference_constructs_from_temporary,
	reference_converts_from_temporary): New class templates.
	(reference_constructs_from_temporary_v,
	reference_converts_from_temporary_v): New variable templates.
	(__cpp_lib_reference_from_temporary): Define for C++23.
	* include/std/version (__cpp_lib_reference_from_temporary): Define for
	C++23.
	* testsuite/20_util/variable_templates_for_traits.cc: Test
	reference_constructs_from_temporary_v and
	reference_converts_from_temporary_v.
	* testsuite/20_util/reference_from_temporary/value.cc: New test.
	* testsuite/20_util/reference_from_temporary/value2.cc: New test.
	* testsuite/20_util/reference_from_temporary/version.cc: New test.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/reference_constructs_from_temporary1.C: New test.
	* g++.dg/ext/reference_converts_from_temporary1.C: New test.
bors bot pushed a commit that referenced this issue Aug 24, 2022
This patch implements some additional zero-extension and sign-extension
related optimizations in simplify-rtx.cc.  The original motivation comes
from PR rtl-optimization/71775, where in comment #2 Andrew Pinksi sees:

Failed to match this instruction:
(set (reg:DI 88 [ _1 ])
    (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))

On many platforms the result of DImode CTZ is constrained to be a
small unsigned integer (between 0 and 64), hence the truncation to
32-bits (using a SUBREG) and the following sign extension back to
64-bits are effectively a no-op, so the above should ideally (often)
be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".

To implement this, and some closely related transformations, we build
upon the existing val_signbit_known_clear_p predicate.  In the first
chunk, nonzero_bits knows that FFS and ABS can't leave the sign-bit
bit set, so the simplification of of ABS (ABS (x)) and ABS (FFS (x))
can itself be simplified.  The second transformation is that we can
canonicalized SIGN_EXTEND to ZERO_EXTEND (as in the PR 71775 case above)
when the operand's sign-bit is known to be clear.  The final two chunks
are for SIGN_EXTEND of a truncating SUBREG, and ZERO_EXTEND of a
truncating SUBREG respectively.  The nonzero_bits of a truncating
SUBREG pessimistically thinks that the upper bits may have an
arbitrary value (by taking the SUBREG), so we need look deeper at the
SUBREG's operand to confirm that the high bits are known to be zero.

Unfortunately, for PR rtl-optimization/71775, ctz:DI on x86_64 with
default architecture options is undefined at zero, so we can't be sure
the upper bits of reg:DI 88 will be sign extended (all zeros or all ones).
nonzero_bits knows this, so the above transformations don't trigger,
but the transformations themselves are perfectly valid for other
operations such as FFS, POPCOUNT and PARITY, and on other targets/-march
settings where CTZ is defined at zero.

2022-08-03  Roger Sayle  <roger@nextmovesoftware.com>
	    Segher Boessenkool  <segher@kernel.crashing.org>
	    Richard Sandiford  <richard.sandiford@arm.com>

gcc/ChangeLog
	* simplify-rtx.cc (simplify_unary_operation_1) <ABS>: Add
	optimizations for CLRSB, PARITY, POPCOUNT, SS_ABS and LSHIFTRT
	that are all positive to complement the existing FFS and
	idempotent ABS simplifications.
	<SIGN_EXTEND>: Canonicalize SIGN_EXTEND to ZERO_EXTEND when
	val_signbit_known_clear_p is true of the operand.
	Simplify sign extensions of SUBREG truncations of operands
	that are already suitably (zero) extended.
	<ZERO_EXTEND>: Simplify zero extensions of SUBREG truncations
	of operands that are already suitably zero extended.
bors bot pushed a commit that referenced this issue Aug 24, 2022
In my previous patches I've been extending our std::move warnings,
but this tweak actually dials it down a little bit.  As reported in
bug 89780, it's questionable to warn about expressions in templates
that were type-dependent, but aren't anymore because we're instantiating
the template.  As in,

  template <typename T>
  Dest withMove() {
    T x;
    return std::move(x);
  }

  template Dest withMove<Dest>(); // #1
  template Dest withMove<Source>(); // #2

Saying that the std::move is pessimizing for #1 is not incorrect, but
it's not useful, because removing the std::move would then pessimize #2.
So the user can't really win.  At the same time, disabling the warning
just because we're in a template would be going too far, I still want to
warn for

  template <typename>
  Dest withMove() {
    Dest x;
    return std::move(x);
  }

because the std::move therein will be pessimizing for any instantiation.

So I'm using the suppress_warning machinery to that effect.
Problem: I had to add a new group to nowarn_spec_t, otherwise
suppressing the -Wpessimizing-move warning would disable a whole bunch
of other warnings, which we really don't want.

	PR c++/89780

gcc/cp/ChangeLog:

	* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Maybe suppress
	-Wpessimizing-move.
	* typeck.cc (maybe_warn_pessimizing_move): Don't issue warnings
	if they are suppressed.
	(check_return_expr): Disable -Wpessimizing-move when returning
	a dependent expression.

gcc/ChangeLog:

	* diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Handle
	OPT_Wpessimizing_move and OPT_Wredundant_move.
	* diagnostic-spec.h (nowarn_spec_t): Add NW_REDUNDANT enumerator.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/Wpessimizing-move3.C: Remove dg-warning.
	* g++.dg/cpp0x/Wredundant-move2.C: Likewise.
ibuclaw pushed a commit to ibuclaw/gccrs that referenced this issue Nov 12, 2022
The eliminate reg-reg move by inverting the condition of
a cmove Rust-GCC#2 peephole2 converts the following sequence:

  473: bx:DI=[r14:DI*0x8+r12:DI]
  960: r15:DI=r8:DI
  485: {flags:CCC=cmp(r15:DI+bx:DI,bx:DI);r15:DI=r15:DI+bx:DI;}
  737: r15:DI={(geu(flags:CCC,0))?r15:DI:bx:DI}

to:

 1110: {flags:CCC=cmp(r8:DI+bx:DI,bx:DI);r8:DI=r8:DI+bx:DI;}
 1111: r15:DI=[r14:DI*0x8+r12:DI]
 1112: r15:DI={(geu(flags:CCC,0))?r8:DI:r15:DI}

Please note that(insn 1110) uses register BX, but its
initialization was eliminated.

Avoid conversion if eliminated move intialized a register, used
in the moved instruction.

2022-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

	PR target/107404
	* config/i386/i386.md (eliminate reg-reg move by inverting the
	condition of a cmove Rust-GCC#2 peephole2): Check if eliminated move
	initialized a register, used in the moved instruction.

gcc/testsuite/ChangeLog:

	PR target/107404
	* g++.target/i386/pr107404.C: New test.
CohenArthur pushed a commit that referenced this issue Jan 16, 2023
While looking at PR 105549, which is about fixing the ABI break
introduced in GCC 9.1 in parameter alignment with bit-fields, we
noticed that the GCC 9.1 warning is not emitted in all the cases where
it should be.  This patch fixes that and the next patch in the series
fixes the GCC 9.1 break.

We split this into two patches since patch #2 introduces a new ABI
break starting with GCC 13.1.  This way, patch #1 can be back-ported
to release branches if needed to fix the GCC 9.1 warning issue.

The main idea is to add a new global boolean that indicates whether
we're expanding the start of a function, so that aarch64_layout_arg
can emit warnings for callees as well as callers.  This removes the
need for aarch64_function_arg_boundary to warn (with its incomplete
information).  However, in the first patch there are still cases where
we emit warnings were we should not; this is fixed in patch #2 where
we can distinguish between GCC 9.1 and GCC.13.1 ABI breaks properly.

The fix in aarch64_function_arg_boundary (replacing & with &&) looks
like an oversight of a previous commit in this area which changed
'abi_break' from a boolean to an integer.

We also take the opportunity to fix the comment above
aarch64_function_arg_alignment since the value of the abi_break
parameter was changed in a previous commit, no longer matching the
description.

2022-11-28  Christophe Lyon  <christophe.lyon@arm.com>
	    Richard Sandiford  <richard.sandiford@arm.com>

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix
	comment.
	(aarch64_layout_arg): Factorize warning conditions.
	(aarch64_function_arg_boundary): Fix typo.
	* function.cc (currently_expanding_function_start): New variable.
	(expand_function_start): Handle
	currently_expanding_function_start.
	* function.h (currently_expanding_function_start): Declare.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
	* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New
	test.
	* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
	* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New
	test.
	* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
	* gcc.target/aarch64/bitfield-abi-warning.h: New test.
	* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
	* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New
	test.
	* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
	* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New
	test.
	* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
	* g++.target/aarch64/bitfield-abi-warning.h: New test.
tschwinge pushed a commit that referenced this issue Feb 18, 2023
Here the ahead-of-time overload set pruning in finish_call_expr is
unintentionally returning a CALL_EXPR whose (pruned) callee is wrapped
in an ADDR_EXPR, despite the original callee not being wrapped in an
ADDR_EXPR.  This ends up causing a bogus declaration mismatch error in
the below testcase because the call to min in #1 gets expressed as a
CALL_EXPR of ADDR_EXPR of FUNCTION_DECL, whereas the level-lowered call
to min in #2 gets expressed instead as a CALL_EXPR of FUNCTION_DECL.

This patch fixes this by stripping the spurious ADDR_EXPR appropriately.
Thus the first call to min now also gets expressed as a CALL_EXPR of
FUNCTION_DECL, matching the behavior before r12-6075-g2decd2cabe5a4f.

	PR c++/107461

gcc/cp/ChangeLog:

	* semantics.cc (finish_call_expr): Strip ADDR_EXPR from
	the selected callee during overload set pruning.

gcc/testsuite/ChangeLog:

	* g++.dg/template/call9.C: New test.
tschwinge pushed a commit that referenced this issue Feb 18, 2023
After r13-5684-g59e0376f607805 the (pruned) callee of a non-dependent
CALL_EXPR is a bare FUNCTION_DECL rather than ADDR_EXPR of FUNCTION_DECL.
This innocent change revealed that cp_tree_equal doesn't first check
dependence of a CALL_EXPR before treating a FUNCTION_DECL callee as a
dependent name, which leads to us incorrectly accepting the first two
testcases below and rejecting the third:

 * In the first testcase, cp_tree_equal incorrectly returns true for
   the two non-dependent CALL_EXPRs f(0) and f(0) (whose CALL_EXPR_FN
   are different FUNCTION_DECLs) which causes us to treat #2 as a
   redeclaration of #1.

 * Same issue in the second testcase, for f<int*>() and f<char>().

 * In the third testcase, cp_tree_equal incorrectly returns true for
   f<int>() and f<void(*)(int)>() which causes us to conflate the two
   dependent specializations A<decltype(f<int>()(U()))> and
   A<decltype(f<void(*)(int)>()(U()))>.

This patch fixes this by making called_fns_equal treat two callees as
dependent names only if the overall CALL_EXPRs are dependent, via a new
convenience function call_expr_dependent_name that is like dependent_name
but also checks dependence of the overall CALL_EXPR.

	PR c++/107461

gcc/cp/ChangeLog:

	* cp-tree.h (call_expr_dependent_name): Declare.
	* pt.cc (iterative_hash_template_arg) <case CALL_EXPR>: Use
	call_expr_dependent_name instead of dependent_name.
	* tree.cc (call_expr_dependent_name): Define.
	(called_fns_equal): Adjust to take two CALL_EXPRs instead of
	CALL_EXPR_FNs thereof.  Use call_expr_dependent_name instead
	of dependent_name.
	(cp_tree_equal) <case CALL_EXPR>: Adjust call to called_fns_equal.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/overload5.C: New test.
	* g++.dg/cpp0x/overload5a.C: New test.
	* g++.dg/cpp0x/overload6.C: New test.
CohenArthur pushed a commit that referenced this issue Dec 12, 2023
Improve stack protector patterns and peephole2s even more:

a. Use unrelated register clears with integer mode size <= word
   mode size to clear stack protector scratch register.

b. Use unrelated register initializations in front of stack
   protector sequence to clear stack protector scratch register.

c. Use unrelated register initializations using LEA instructions
   to clear stack protector scratch register.

These stack protector improvements reuse 6914 unrelated register
initializations to substitute the clear of stack protector scratch
register in 12034 instances of stack protector sequence in recent linux
defconfig build.

gcc/ChangeLog:

	* config/i386/i386.md (@stack_protect_set_1_<PTR:mode>_<W:mode>):
	Use W mode iterator instead of SWI48.  Output MOV instead of XOR
	for TARGET_USE_MOV0.
	(stack_protect_set_1 peephole2): Use integer modes with
	mode size <= word mode size for operand 3.
	(stack_protect_set_1 peephole2 #2): New peephole2 pattern to
	substitute stack protector scratch register clear with unrelated
	register initialization, originally in front of stack
	protector sequence.
	(*stack_protect_set_3_<PTR:mode>_<SWI48:mode>): New insn pattern.
	(stack_protect_set_1 peephole2): New peephole2 pattern to
	substitute stack protector scratch register clear with unrelated
	register initialization involving LEA instruction.
CohenArthur pushed a commit that referenced this issue Dec 12, 2023
Use unrelated register initializations using zero/sign-extend instructions
to clear stack protector scratch register.

Hanlde only SI -> DImode extensions for 64-bit targets, as this is the
only extension that triggers the peephole in a non-negligible number.

Also use explicit check for word_mode instead of mode iterator in peephole2
patterns to avoid pattern explosion.

gcc/ChangeLog:

	* config/i386/i386.md (stack_protect_set_1 peephole2):
	Explicitly check operand 2 for word_mode.
	(stack_protect_set_1 peephole2 #2): Ditto.
	(stack_protect_set_2 peephole2): Ditto.
	(stack_protect_set_3 peephole2): Ditto.
	(*stack_protect_set_4z_<mode>_di): New insn patter.
	(*stack_protect_set_4s_<mode>_di): Ditto.
	(stack_protect_set_4 peephole2): New peephole2 pattern to
	substitute stack protector scratch register clear with unrelated
	register initialization involving zero/sign-extend instruction.
CohenArthur pushed a commit that referenced this issue Dec 12, 2023
Since the last import from upstream libsanitizer, the output has changed
and now looks more like this:

READ of size 6 at 0x7ff7beb2a144 thread T0
    #0 0x101cf7796 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) sanitizer_common_interceptors.inc:813
    #1 0x101cf7b99 in memcmp sanitizer_common_interceptors.inc:840
    #2 0x108a0c39f in __stack_chk_guard+0xf (dyld:x86_64+0x8039f)

so let's adjust the pattern accordingly.

gcc/testsuite/ChangeLog:

	* c-c++-common/asan/memcmp-1.c: Adjust pattern on darwin.
CohenArthur pushed a commit that referenced this issue Dec 12, 2023
…-int (PR target/112413)

On m68k the compiler assumes that the PC-relative jump-via-jump-table
instruction and the jump table are adjacent with no padding in between.

When -mlong-jump-table-offsets is combined with -malign-int, a 2-byte
nop may be inserted before the jump table, causing the jump to add the
fetched offset to the wrong PC base and thus jump to the wrong address.

Fixed by referencing the jump table via its label. On the test case
in the PR the object code change is (the moveal at 16 is the nop):

    a:  6536            bcss 42 <f+0x42>
    c:  e588            lsll #2,%d0
    e:  203b 0808       movel %pc@(18 <f+0x18>,%d0:l),%d0
-  12:  4efb 0802       jmp %pc@(16 <f+0x16>,%d0:l)
+  12:  4efb 0804       jmp %pc@(18 <f+0x18>,%d0:l)
   16:  284c            moveal %a4,%a4
   18:  0000 0020       orib #32,%d0
   1c:  0000 002c       orib #44,%d0

Bootstrapped and tested on m68k-linux-gnu, no regressions.

Note: I don't have commit rights to I would need assistance applying this.

	PR target/112413
gcc/

	* config/m68k/linux.h (ASM_RETURN_CASE_JUMP): For
	TARGET_LONG_JUMP_TABLE_OFFSETS, reference the jump table
	via its label.
	* config/m68k/m68kelf.h (ASM_RETURN_CASE_JUMP): Likewise.
	* config/m68k/netbsd-elf.h (ASM_RETURN_CASE_JUMP): Likewise.
CohenArthur pushed a commit that referenced this issue Jan 10, 2024
During partial ordering, we want to look through dependent alias
template specializations within template arguments and otherwise
treat them as opaque in other contexts (see e.g. r7-7116-g0c942f3edab108
and r11-7011-g6e0a231a4aa240).  To that end template_args_equal was
given a partial_order flag that controls this behavior.  This flag
does the right thing when a dependent alias template specialization
appears as template argument of the partial specialization, e.g. in

  template<class T, class...> using first_t = T;
  template<class T> struct traits;
  template<class T> struct traits<first_t<T, T&>> { }; // #1
  template<class T> struct traits<first_t<const T, T&>> { }; // #2

we correctly consider #2 to be more specialized than #1.  But if the
alias specialization appears as a nested template argument of another
class template specialization, e.g. in

  template<class T> struct traits<A<first_t<T, T&>>> { }; // #1
  template<class T> struct traits<A<first_t<const T, T&>>> { }; // #2

then we incorrectly consider #1 and #2 to be unordered.  This is because

  1. we don't propagate the flag to recursive template_args_equal calls
  2. we don't use structural equality for class template specializations
     written in terms of dependent alias template specializations

This patch fixes the first issue by turning the partial_order flag into
a global.  This patch fixes the second issue by making us propagate
structural equality appropriately when building a class template
specialization.  In passing this patch also improves hashing of
specializations that use structural equality.

	PR c++/90679

gcc/cp/ChangeLog:

	* cp-tree.h (comp_template_args): Remove partial_order parameter.
	(template_args_equal): Likewise.
	* pt.cc (comparing_for_partial_ordering): New global flag.
	(iterative_hash_template_arg) <case tcc_type>: Hash the template
	and arguments for specializations that use structural equality.
	(template_args_equal): Remove partial order parameter and
	use comparing_for_partial_ordering instead.
	(comp_template_args): Likewise.
	(comp_template_args_porder): Set comparing_for_partial_ordering
	instead.  Make static.
	(any_template_arguments_need_structural_equality_p): Return true
	for an argument that's a dependent alias template specialization
	or a class template specialization that itself needs structural
	equality.
	* tree.cc (cp_tree_equal) <case TREE_VEC>: Adjust call to
	comp_template_args.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/alias-decl-75a.C: New test.
	* g++.dg/cpp0x/alias-decl-75b.C: New test.
CohenArthur pushed a commit that referenced this issue Feb 6, 2024
This patch adjusts the costs so that we treat REG and SUBREG expressions the
same for costing.

This was motivated by bt_skip_func and bt_find_func in xz and results in nearly
a 5% improvement in the dynamic instruction count for input #2 and smaller, but
definitely visible improvements pretty much across the board.  Exceptions would
be perlbench input #1 and exchange2 which showed very small regressions.

In the bt_find_func and bt_skip_func cases we have  something like this:

> (insn 10 7 11 2 (set (reg/v:DI 136 [ x ])
>         (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0))) "zz.c":6:21 387 {*zero_extendsidi2_bitmanip}
>      (nil))
> (insn 11 10 12 2 (set (reg:DI 142 [ _1 ])
>         (plus:DI (reg/v:DI 136 [ x ])
>             (reg/v:DI 139 [ b ]))) "zz.c":7:23 5 {adddi3}
>      (nil))

[ ... ]> (insn 13 12 14 2 (set (reg:DI 143 [ _2 ])
>         (plus:DI (reg/v:DI 136 [ x ])
>             (reg/v:DI 141 [ c ]))) "zz.c":8:23 5 {adddi3}
>      (nil))

Note the two uses of (reg 136). The best way to handle that in combine might be
a 3->2 split.  But there's a much better approach if we look at fwprop...

(set (reg:DI 142 [ _1 ])
    (plus:DI (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0))
        (reg/v:DI 139 [ b ])))
change not profitable (cost 4 -> cost 8)

So that should be the same cost as a regular DImode addition when the ZBA
extension is enabled.  But it ends up costing more because the clause to cost
this variant isn't prepared to handle a SUBREG.  That results in the RTL above
having too high a cost and fwprop gives up.

One approach would be to replace the REG_P  with REG_P || SUBREG_P in the
costing code.  I ultimately decided against that and instead check if the
operand in question passes register_operand.

By far the most important case to handle is the DImode PLUS.  But for the sake
of consistency, I changed the other instances in riscv_rtx_costs as well.  For
those other cases we're talking about improvements in the .000001% range.

While we are into stage4, this just hits cost modeling which we've generally
agreed is still appropriate (though we were mostly talking about vector).  So
I'm going to extend that general agreement ever so slightly and include scalar
cost modeling :-)

gcc/
	* config/riscv/riscv.cc (riscv_rtx_costs): Handle SUBREG and REG
	similarly.

gcc/testsuite/

	* gcc.target/riscv/reg_subreg_costs.c: New test.

	Co-authored-by: Jivan Hakobyan <jivanhakobyan9@gmail.com>
tschwinge pushed a commit that referenced this issue Mar 11, 2024
Given below example for VLS mode

void
test (vl_t *u)
{
  vl_t t;
  long long *p = (long long *)&t;

  p[0] = p[1] = 2;

  *u = t;
}

The vec_set will simplify the insn to vmv.s.x when index is 0, without
merged operand. That will result in some problems in DCE, aka:

1:  137[DI] = a0
2:  138[V2DI] = 134[V2DI]                              // deleted by DCE
3:  139[DI] = #2                                       // deleted by DCE
4:  140[DI] = #2                                       // deleted by DCE
5:  141[V2DI] = vec_dup:V2DI (139[DI])                 // deleted by DCE
6:  138[V2DI] = vslideup_imm (138[V2DI], 141[V2DI], 1) // deleted by DCE
7:  135[V2DI] = 138[V2DI]                              // deleted by DCE
8:  142[V2DI] = 135[V2DI]                              // deleted by DCE
9:  143[DI] = #2
10: 142[V2DI] = vec_dup:V2DI (143[DI])
11: (137[DI]) = 142[V2DI]

The higher 64 bits of 142[V2DI] is unknown here and it generated incorrect
code when store back to memory. This patch would like to fix this issue
by adding a new SCALAR_MOVE_MERGED_OP for vec_set.

Please note this patch doesn't enable VLS for vec_set, the underlying
patches will support this soon.

gcc/ChangeLog:

	* config/riscv/autovec.md: Bugfix.
	* config/riscv/riscv-protos.h (SCALAR_MOVE_MERGED_OP): New enum.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
CohenArthur pushed a commit that referenced this issue Apr 23, 2024
We evaluate constexpr functions on the original, pre-genericization bodies.
That means that the function body we're evaluating will not have gone
through cp_genericize_r's "Map block scope extern declarations to visible
declarations with the same name and type in outer scopes if any".  Here:

  constexpr bool bar() { return true; } // #1
  constexpr bool foo() {
    constexpr bool bar(void); // #2
    return bar();
  }

it means that we:
1) register_constexpr_fundef (#1)
2) cp_genericize (#1)
   nothing interesting happens
3) register_constexpr_fundef (foo)
   does copy_fn, so we have two copies of the BIND_EXPR
4) cp_genericize (foo)
   this remaps #2 to #1, but only on one copy of the BIND_EXPR
5) retrieve_constexpr_fundef (foo)
   we find it, no problem
6) retrieve_constexpr_fundef (#2)
   and here #2 isn't found in constexpr_fundef_table, because
   we're working on the BIND_EXPR copy where #2 wasn't mapped to #1
   so we fail.  We've only registered #1.

It should work to use DECL_LOCAL_DECL_ALIAS (which used to be
extern_decl_map).  We evaluate constexpr functions on pre-cp_fold
bodies to avoid diagnostic problems, but the remapping I'm proposing
should not interfere with diagnostics.

This is not a problem for a global scope redeclaration; there we go
through duplicate_decls which keeps the DECL_UID:
  DECL_UID (olddecl) = olddecl_uid;
and DECL_UID is what constexpr_fundef_hasher::hash uses.

	PR c++/111132

gcc/cp/ChangeLog:

	* constexpr.cc (get_function_named_in_call): Use
	cp_get_fndecl_from_callee.
	* cvt.cc (cp_get_fndecl_from_callee): If there's a
	DECL_LOCAL_DECL_ALIAS, use it.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-redeclaration3.C: New test.
	* g++.dg/cpp0x/constexpr-redeclaration4.C: New test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants