The IR design, type checking, and pre-optimizing #11
Comments
We have HIR with @SimplyTheOther's AST classes, which are expressive enough for all the resolution and static analysis we need. That ticks off the top 3. For HIR->MIR, I think using the Backend.h wrapper over GCC's GENERIC trees will work for now. I am not concerned about extra optimizations at this stage, but there is borrow checking, and gccgo does its own escape analysis at this level, so we will have to do that too. I just want to avoid any other IRs, because I think the AST and the Backend IR are enough for the front end, at least for now.
OK, then we may change the first item to "name resolution".
My only concern is that once we bring things down to the Backend abstraction over GENERIC (which is what we feed GCC to get output), I think we get a lot of concepts similar to MIR. It is not quite the same, and it is still fairly high level, but I would rather get this first project out of the way and then look at it again; another IR could very well fit in at that point.
Agreed.
According to the rustc dev guide and associated links, rustc used to have AST-based borrow checking (and presumably type checking, since it comes before borrow checking), but they abandoned that approach due to implementation difficulties (and, for the borrow checker, to allow "non-lexical lifetime" borrow checking, though that may not be a problem at this point with no borrow checker). For now though, without complex features like borrow checking, I think the AST alone could suffice as the IR (though it may make some things difficult).
@philberty @SimplyTheOther OK, let's try to do everything with the AST. If we find something too difficult, that's even better for us, since it tells us whether we have to introduce a special IR later.
I've been thinking a lot about what @bjorn3 mentioned regarding the end architecture of the compiler. Working with the AST alone, we could in theory squeeze out GIMPLE, but I fear the compiler would be hard to maintain at that point, given all the glue we would need to generate for everything to work correctly without MIR. Even now, doing type resolution on the AST, I have butchered some of the AST classes with extra fields to hold the data we need, and created duplicated scope classes for lookups. I am starting to look at implementing HIR, which seems to map very closely to what the AST looks like after type resolution right now. It would also help clean the code up a lot and create a common reference point.
Rustc does translation to HIR before typechecking. It stores the typecheck results in a side-table (or rather query result) as the HIR is immutable.
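To make the side-table idea concrete, here is a minimal C++ sketch. It is not rustc's or gccrs's actual data structure: `NodeId`, `Type`, `HirExpr`, and `TypeTable` are hypothetical names invented for illustration. The point is only that the IR node stays immutable and carries no type field, while type-check results live in a separate map keyed by node id.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// All names below are hypothetical, chosen only for this sketch.
using NodeId = std::uint32_t;

struct Type {
  std::string name;  // e.g. "i32" or "bool"
};

// An immutable IR node: it has an id, but no slot for type information.
struct HirExpr {
  const NodeId id;
  const std::string text;
};

// Side table: type-check results stored outside the IR, keyed by node id.
class TypeTable {
public:
  void record(NodeId id, Type ty) { types_.insert_or_assign(id, std::move(ty)); }

  // Returns the recorded type, or std::nullopt if the node is not yet typed.
  std::optional<Type> lookup(NodeId id) const {
    auto it = types_.find(id);
    if (it == types_.end()) return std::nullopt;
    return it->second;
  }

private:
  std::unordered_map<NodeId, Type> types_;
};
```

With this shape, the type checker calls `record(expr.id, ...)` and later passes call `lookup(expr.id)`, so no pass needs to add fields to the node classes themselves.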
The current Rust compiler contains two IRs before GENERIC:
My previous experience is mostly with compilers for functional programming languages. Those are relatively easier to write, since there is pattern matching and there are few (or even no) side effects, so optimizing is pretty easy: find the matching pattern, inline the function or closure, then apply the rewriting rules. This process covers many common optimizations, such as constant folding, dead-variable elimination, and dead-function elimination. For Rust, I think they do similar rewriting, but I need to do more research.
The Rust compiler is written in Rust, so it has pattern matching. I guess we will have to write more code for tree-node matching. After all, pattern matching is just syntactic sugar; it expands to code that we will have to write by hand in C++.
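As a rough illustration of what that hand-written matching could look like, here is a small constant-folding rewrite in C++17 that uses `std::variant` and `std::visit` as a stand-in for pattern matching. The node types (`Lit`, `Var`, `BinOp`), the `overloaded` helper, and the functions `bin` and `fold` are all hypothetical names for this sketch, not part of gccrs or GCC, and the sketch ignores integer overflow.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Hypothetical expression tree for this sketch only.
struct Lit { long value; };
struct Var { std::string name; };
struct BinOp;
using Expr = std::variant<Lit, Var, std::shared_ptr<BinOp>>;
struct BinOp { char op; Expr lhs; Expr rhs; };

// Helper that merges several lambdas into one visitor (a common C++17 idiom).
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// Convenience constructor for binary nodes.
Expr bin(char op, Expr l, Expr r) {
  return std::make_shared<BinOp>(BinOp{op, std::move(l), std::move(r)});
}

// Rewrite rule: a '+' or '*' node whose operands both fold to literals
// becomes a single literal. Everything else is rebuilt unchanged.
Expr fold(const Expr &e) {
  return std::visit(
      overloaded{
          [](const Lit &l) -> Expr { return l; },
          [](const Var &v) -> Expr { return v; },
          [](const std::shared_ptr<BinOp> &b) -> Expr {
            Expr lhs = fold(b->lhs);
            Expr rhs = fold(b->rhs);
            const Lit *l = std::get_if<Lit>(&lhs);
            const Lit *r = std::get_if<Lit>(&rhs);
            if (l && r) {
              if (b->op == '+') return Lit{l->value + r->value};
              if (b->op == '*') return Lit{l->value * r->value};
            }
            return bin(b->op, std::move(lhs), std::move(rhs));
          }},
      e);
}
```

For example, `fold(bin('+', Lit{1}, bin('*', Lit{2}, Lit{3})))` yields a single `Lit`, while a tree containing a `Var` keeps its `BinOp` node and only the literal subtree is folded. Each extra rule is one more branch or lambda, which is the "more code" cost mentioned above compared with a `match` expression in Rust.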
I'm not sure we can follow the exact design of HIR and MIR, since C++ may not be able to match that expressiveness exactly, so it may be better to design a similar IR that takes advantage of C++ features. I'm just guessing, and I need to do more research before drawing a conclusion.
So I think the plan could be:
That's a rough plan. There are more things to consider, including memory management, library interfaces, and exception handling, but I'm not sure where to put them in the pipeline, so I have just listed them.
Comments?