Porting 8086-toolchain and cross-compiling for ELKS #2159
Comments
Thanks @ghaerr ! Also a binary release in the form of zip file should be released. This way anyone can start compiling. |
I stumbled over that one as well. Seems the
|
@toncho11 and @floriangit, thanks for your comments. I've fixed "make clean" and added a note about add_path in #2160.
I hadn't worried about -O3 for host builds; have you had issues with -O3 with various software? It can nonetheless be easily changed.
I'll leave it to @rafael2k to continue building binary dev kits with the binary libc86.a and more testing on ELKS itself, but am considering an option to build an HD image that contains everything prebuilt for ELKS, discussed in #2157. We won't be able to easily build any host binary distribution of the cross-compiler(s) though. |
Added https://github.com/ghaerr/elks/wiki/Setting-up-the-8086-toolchain-(C86-compiler-and-tools) to Wiki. |
Years ago I had issues with -O3 unexpectedly doing (a) a re-ordering of code and (b) optimizing out instructions that were not supposed to be deleted. Linus sticks with -O2 for various reasons for his kernel, too ;-) I would only use -O3 for contained modules that really need that hot path optimized, and look at the assembler afterwards; with the 8086 toolchain it looks like it's set in a top Makefile. Anyhow, that may only be my experience... |
@ghaerr Can you make a binary native release please? Just a tar or zip file with everything needed as Rafael did with the elks-devdisk? For example I am not sure which header files it should contain on top of the bin and examples folder. Also there should be .sh file that configures some variables? A binary release will help with testing. I will try the Makefile.elks in the examples and check if it works. |
It's going to be a lot of work to produce that, as we're now talking about the entire C library header files, which also include the ELKS kernel header files, in both linuxmt/ and arch/, etc, etc. I don't think it has much chance to fit on a floppy, and I don't have a native ELKS machine set up to even test or try out such a thing. That's why I brought up the subject of how to best think of distributing all the stuff we have now in #2157. Since @rafael2k's last dev kit, the number of files has increased enormously, since we're now providing the ability to compile anything with the full ELKS C library, which isn't small. If you just want to play around with c86, you can copy the 8086-toolchain/examples and elks-bin/ folders to your hard drive. But the bigger problem is that as soon as you include something as seemingly simple as <stdio.h>, the tangled mess of include files means lots of directory organization on the target in order to work. We don't yet have a script in ELKS that can easily create a floppy or tar file from a list of other files like we do in elkscmd/Applications. Since I'm still very knee deep in getting the tools themselves working, with a huge list of enhancements needed, I will probably leave a binary distribution to @rafael2k for when he returns. I'm not sure whether he was manually creating the dev disk or had a script to do it, but it's not checked in. In the meantime, I appreciate your desire to test and play with things - perhaps more cross-compilation of programs can be tried with the existing tools by adding more sample C programs of your own choosing. This would be easily done by just copying them into 8086-toolchain/examples and updating 8086-toolchain/examples/Makefile. Then run "make" and continue testing with the cross-compilation environment. We still need to run lots more programs through the process to see if they'll work, even with host-based cross-compilation.
I'm sorry but things are getting very complicated and time consuming and I'm doing the best I can! |
Yes, I understand what you're saying. With ia16-elf-gcc and -Os, it's a total optimizing compiler with code re-ordering and unused code deletion, in the ELKS kernel now. So I've gotten quite used to it. But yes how simple it is to watch C86 spit out decent but definitely non-optimized code, and you can almost see the ASM output per C statement. But nowadays I view it as a balance, that is, it's a good idea to turn on heavy code optimization and see whether your program still works, as in almost all cases, the optimization is actually correct and matches the C virtual execution machine specification, it's just your expectation that isn't matching. This is especially true of asserts and "noreturn"-marked functions. This kind of thing has also forced me to jump much deeper into the "undefined behavior" compiler and execution issues lots of people have been talking about these days.
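To make the "expectation vs. specification" point concrete, here is a minimal sketch. The function names are made up for illustration and do not come from ELKS or C86. The first function relies on signed overflow, which is undefined behavior in C, so an optimizing compiler is allowed to assume it never happens and fold the test to 0; the second expresses the same intent in a well-defined way.

```c
#include <limits.h>

/* Relies on signed overflow, which is undefined behavior in C.
 * An optimizing compiler may assume x + 1 never wraps and
 * reduce this whole function to "return 0". */
int next_wraps_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined version: compare against the limit instead of
 * performing the overflowing addition. */
int next_wraps(int x)
{
    return x == INT_MAX;
}
```

Only next_wraps() gives the same answer at every optimization level; next_wraps_ub() may "work" without optimization and silently change once the optimizer is turned on.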
Definitely a good idea! I have object file disassemblers for GCC (ia16-elf-objdump) and OWC (wdis), but not yet AS86 (objdump86 will dump hex .text/.data w/relocations but not actual disassembly). I'm working on getting objdump86 to disassemble AS86 .o files right now for the very reason you are describing. It's been a learning process with no documentation on AS86 .o format, but pretty much have that now figured out. |
Thanks for the -O3/-Os clear-up, I understand. Now, I poked a bit more after following the instructions from the WIKI. |
I'm not sure, but the ones reported as OS/2 are probably compiled with OWC, and this might be normal. It says OS/2 but these are still ELKS binaries.
So you need to copy all these headers (from the 3 folders specified above) in one or several folders and adjust the above in the Makefile.elks to point to them. TOPDIR is elks base folder when you git clone ELKS. Also the C86LIB must contain the libc86.a that you previously compiled on your Linux host. And you do:
|
Are you also using the latest source of ELKS? @ghaerr has done memory allocation optimization recently. |
Thanks toncho11, That said, your 2nd comment may be spot on! D'oh, I have copied some things around, but might as well be on 0.8.0 kernel-wise, lol. Let me check! thanks. |
There are a number of problems here. See that "MZ" on the 2nd screenshot? That's the OS/2 (DOS) MZ executable header. The shell thinks the executable is a shell script since the kernel isn't configured to run OS/2 binaries. Set CONFIG_EXEC_OS2=y to fix that. The second problem is that the toolchain "make" doesn't support ifdef, ifndef. That may be your problem. I think the supplied examples/Makefile.elks works, but I've been testing with another one, but I haven't actually been testing with that one.
|
The above Makefile requires that libc/include/c86/stdarg.h and stddef.h be copied to /root also. It is not setup for the devdisk, sorry.
[EDIT: Also, libc/libc86.a must be copied to /root.] |
OK, I can confirm the executables (c86, as86) are now executing normally after I added CONFIG_EXEC_OS2 (how much other-OS history you wanna put into ELKS, Greg? :-P) and installed 0.9.0-dev on my HD. As for Without any header it's quite entertaining to see the poor compiler cope, or not: |
Nice! I'm pretty sure that CONFIG_EXEC_OS2 is set default ON, but I'll double check. It probably picks up the settings from your existing .config so didn't get it.
Well we needed a new executable format that allowed for any number of code and data segments, since ELKS and MINIX a.out didn't do that... and whaddya know, Open Watcom supported the format along with large model, so ... "a child was (re)born".
I really like our new C86 compiler. It's very well engineered internally and has lots of potential, I'm very surprised I had never heard of it until @rafael2k found it. But yes, its error handling isn't the greatest... It is strange to see the ".byte ..." output after max error count termination - that's supposed to stay in the output file (unless you were running c86 without a second argument, in which case it just writes ASM output to stdout which I bet is happening). You are aware that you have to run cpp86 before running C86, right? C86 doesn't have an internal preprocessor. |
I was not aware of both your inputs, since I was merely tinkering a bit. But (having to) provide two arguments to a C compiler and thinking that CPP means a C-preprocessor (I thought of course it's C++) surely makes me feel a bit younger today! :-) I grew up with gcc already, lol (where pre-processing, checking the C language conformance, optimizing, linking and all in-between is done by one executable hiding all those executables doing the grunt work). |
LOLOL!!! We're talking about running on ELKS here! Amazing to even get a C compiler going :)
Well, that's the plan here too - writing 'cc' to run 'em all... but it's further down on the huge list of things to be done. On the host, you can use [EDIT: I'll update the examples/Makefile.elks so that it runs on ELKS with ELKS make]. |
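For readers wondering what a 'cc' driver that "runs them all" boils down to, here is a minimal sketch, assuming a POSIX-like host where the standard C system() call is available. The function name run_stages is hypothetical, and this is not the planned ELKS 'cc'. It runs each stage command in order and stops at the first one that fails, so only one tool needs to be in memory at a time.

```c
#include <stdlib.h>

/* Run each stage command line in order.
 * Returns -1 if every stage succeeded, otherwise the index
 * of the first stage that reported failure. */
int run_stages(const char *const stages[], int count)
{
    for (int i = 0; i < count; i++) {
        /* system() returns nonzero when the command fails */
        if (system(stages[i]) != 0)
            return i;
    }
    return -1;
}
```

In a real driver the stage strings would be the cpp86, c86, as86 and ld86 invocations, in that order; their exact arguments are deliberately left out here and should be taken from the toolchain's own Makefiles.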
@floriangit, you've come this far, we definitely need to have you try building the example chess program so you can actually run it on ELKS. I've updated examples/Makefile.elks in ghaerr/8086-toolchain#23. Pull that down, and copy examples/Makefile.elks to your ELKS /root directory as Makefile. Then type "./make". |
Disclaimer: I never learnt how to play chess. But maybe I learnt something else. :-D libc.a was already there, now I copied all libc/include headers into /root. Of course make (well: c86) complained that stdio.h lines 5 and 6 include other headers that get included system-wide. I stopped there (it's getting late to construct a fuller system like linux, haha) and commented the #include "/root/stdio.h" altogether in chess.c...And then again ./make: My system has a total of 1M, can the compiler diagnostic message be meaningful/trusted? edit: Yes, I pulled your Makefile.elks and put that into /root/Makefile |
It was not for me. And pretty sure I never touched that setting. |
I just fixed that.
Are you running networking? Perhaps turn that off. The message is saying that there's not enough memory in the 640k address space to compile. You can do that with 'net stop'.
If you have previously installed ELKS, the setting won't get updated unless you copy ibmpc-1440.config to .config |
I just created an automated way to both build C86 and copy it to /root in #2163. It has the stdio.h fix just mentioned above. This will allow users to (hopefully) build and test C86 more conveniently, although I can see we're already very tight on RAM. |
If you still get the out of memory message after turning off most things, including perhaps an "init 1" to stop multiuser mode, run "meminfo" and take a screenshot. That'll allow me to look at what's happening in more detail. Thanks! |
You can also try running "./make" without "time" which will run with slightly more RAM. Also, if chess.c is too big for reasons yet unknown on your system, try "./make test". [EDIT: On my system, neither "./make" or "./make test" will work with networking running. After "net stop", "./make" works]. |
Ok so in This is my ELKS image with the toolchain in root, but without the nanoprintf.h correction. You can load it here online: https://copy.sh/v86 |
Yes, exactly, I have around 20 floppies here, and around 10 have already died. Today I was greeted while booting the ELKS floppy (don't remember exactly):
Tried some more times, but then changed the floppy and that one is working again. You can buy those floppies "new", but they are 5-20 years old in the sense of production date. So there is luck involved. They got rather expensive, too! This is a 286 AMD/Intel and I think it's clocked at 8MHz (so maybe 4MHz w/o turbo?). I like the slow feeling and the grumbling HDD sound (see the only other repo on my github profile), but the power supply is so old and noisy that I keep the PC only running when fiddling with ELKS :-)
Thanks! If ELKS ever aims at POSIX compliance, we gotta play by the rules, lol! |
hdd_sound, huh?! Nice. Back when Doom was being ported to ELKS, I noticed a very cool source file that implements PC speaker sound effects for Doom WAV files. It's very basic but well written - I was thinking how nice it would be to be able to play WAV files on ELKS. The big problem is actually creating or finding any "WAV" files in this particular format (basically just time-sliced speaker frequencies). There's supposedly a program "Muse" that creates them, but I've been unable to find how that worked or whether it's still available. Of course, my other big problem is actually listening to the PC speaker for development, when all I've actually got is a MacBook Pro running QEMU! Totally off-topic, but hey, who wants to fix FAT filesystem problems right now anyways...
Let me know if you get near to running out. Are these 5 1/4" or 3 1/2"? I think I've got several boxes of old 3.5" floppies around here; I could send some of them to you since I'll probably never get a chance to use them. |
Thank you so much, but no worries, the shipping to Europe would not be worth it, I assume. Sorry, also Offtopic ;-) |
Happy New Year! Unfortunately Japan manufacturers also It is only for PC-98, but I have seen in X post |
Happy new year! I just tested on the 86Box emulator emulating an Amstrad 8086 at 8 MHz and it took 3m20s to compile test and chess. I did a clean each time. OK, so as86 is the main bottleneck, I think. If you do Also optimizations can be disabled by default to speed up the compile time. I mean commented out, so that they can be enabled easily. |
Ok so this is taking a lot of time in as:
|
@toncho11, how do you know, are you measuring between printf statements to learn that's where lots of time is being spent?
AS86 is much faster than NASM, and we are purposely testing with a "large" (at least for ELKS's purposes) chess.c file that produces a 39k .as ASM file. We also have had to turn on the "automatic" jump statement handling that requires multiple passes - this is probably taking quite a bit of time. (Jump handling is required because 8086 conditional jumps only allow a +/- 127 byte hop to another instruction; if the code in between is longer, then the jump can't assemble, so AS86 reverses the condition code and issues a direct (+/- 32k) jump instead). There is a possibility to turn on C86 jump reversal as standard. It isn't implemented yet for AS86 output, but would make the code quite a bit larger, but assembly time shorter. I will look into that. We can probably put in a display that shows how long AS86 is taking within each pass to get an idea of what's happening.
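The jump handling described above can be sketched in C as follows. This is an illustration of the technique only, not AS86's actual code, and the function name encode_jcc is made up. It uses the standard 8086 encodings: a short conditional jump is opcode 0x70..0x7F plus a signed 8-bit displacement (2 bytes), flipping the low opcode bit inverts the condition, and a near JMP is 0xE9 plus a 16-bit displacement (3 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Encode a conditional jump placed at address `at` that goes to `target`.
 * `opcode` is a short Jcc opcode in the range 0x70..0x7F.
 * Writes the bytes to `out` and returns how many were written (2 or 5). */
size_t encode_jcc(uint8_t opcode, uint16_t at, uint16_t target, uint8_t out[5])
{
    /* rel8 is measured from the end of the 2-byte short jump */
    int32_t rel8 = (int32_t)target - ((int32_t)at + 2);

    if (rel8 >= -128 && rel8 <= 127) {
        out[0] = opcode;
        out[1] = (uint8_t)rel8;
        return 2;
    }

    /* Out of range: invert the condition so it skips over a 3-byte
     * near JMP, and let the near JMP reach the real target. */
    uint16_t rel16 = (uint16_t)(target - (at + 5)); /* from end of the JMP */
    out[0] = opcode ^ 1;            /* inverted condition, e.g. JE -> JNE */
    out[1] = 3;                     /* hop over the JMP that follows */
    out[2] = 0xE9;                  /* JMP rel16 */
    out[3] = (uint8_t)(rel16 & 0xFF);
    out[4] = (uint8_t)(rel16 >> 8);
    return 5;
}
```

The size difference (2 bytes vs 5 bytes) is why an assembler cannot know final addresses until it has decided the form of every jump, which is where the extra passes come from.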
If you mean C86 optimizations, I've tried that - and the resulting code is terrible. So we really need to turn it on. If you're talking about the AS86 -O, it may be that can be turned off, but I think I tried that already and it needs to be on with -j (jump optimizations). |
I see, what you're saying is that AS86 is taking 21 seconds regardless of the input .as file size? Wow. I will look into that. |
Taking from your suggestions, I removed "-O -w-" from the ASFLAGS= line in the Makefile and that cut the AS86 build time down by 40%! We need the -j option for jump handling, but it seems the -O goes further and tries to reduce any long jumps to short jumps by running another compiler pass. The -w- option will normally produce a warning to that effect without -O, so that gets removed too. It looks like the code file size increase is very small without these options. I'll make your idea the standard, and remove -O -w- from the standard Makefile(s). I see that it is taking a bit of time on my speedy QEMU to run through the init statements in AS86. I'm still looking into why these are slow. Thank you! |
@toncho11, your comments and debugging were spot on, and helped to find a major bug in CPP86 that had been lurking inside the toolchain since the beginning. In addition, the speed issues you pointed out were caused by a bad bug in the debug malloc routine. Both have been fixed in #2169, and running "make" is several times faster than before. Thank you! |
Thank you @ghaerr! Now the This means that we can not go less than 26s whatever we compile. All tested on 86Box. And here is the ELKS image to test: |
If we were to remove all console messages we will probably gain a few seconds, but for now seeing what happens is more important. I think printing on the console is slow in general and on ELKS. This is just a thought, not a request to disable them. |
Does as86 have a code that verifies the input assembler? Is it possible to turn this on/off? |
Do you mean does AS86 verify its input to see if it is correct? That always occurs. I'm still looking for ways to speed the assembly up. It seems that most of the last speedup is actually occurring because of the removal of the -O option, although I am seeing some other strange speed-related behavior. For instance, when running "time make" vs "make", the speed of the entire operation is different. Specifically, the "Pass 1..." displays run quite a bit quicker sometimes. The really strange thing is that some builds happen faster with "time make", and others with just "make". I'm not sure if this is a QEMU issue, still tracking it down.
The console is nowhere near that slow, I don't think. Removing the -v option in CFLAGS will stop the display of the c86 version number and memory used. I plan on moving the AS86 "Pass" output to a -v option, but that actually ended up being more complicated. Eventually there won't be any extraneous messages.
Yes, we could remove cprintf but I've left it all in as this has been very useful for testing the CPP86 preprocessor (otherwise there aren't actually any #defines to process!). The chess.c program is useful since its a larger .c file and useful for our timing tests. When we move to having to preprocess more C library header files that may slow things down. I may try to add that to the examples/ dir to get a more realistic example of general compilation times. |
No problem with 86Box with or w/o time. 86Box is cycle accurate, while I think QEMU is not cycle accurate. So it is a QEMU issue I think. Also qemu is 386 and above? The whole idea of using two programs one that generates an assembly and the second that compiles it looks heavy. The first program needs to generate a specific text format, next this is saved to HDD (takes time), next it must be loaded from HDD (takes time) and then it needs to be parsed by as86 and validated. Here we have a toolchain, but I always imagined a compiler where one program generates the .o directly from .c files. |
QEMU is definitely not cycle accurate. It does emulate 386+, but that of course includes compatible real-mode 8086.
Yep. And don't forget the first program CPP86, which pre-processes the .c file before compiling it. But producing assembly output simplifies the compiler greatly, and separating each process into separate executables lowers the memory requirements greatly. This very reason is why most C compilers can't be made to run on ELKS - they do everything in one pass - way too large for us. If you can find a C compiler that produces compatible .o files for an ELKS-runnable linker, let us know! So this is what we have for now, and I'm pretty happy with it, since compilation speed on ELKS is a much smaller issue than having a toolchain that actually runs on ELKS in the first place. We're very tight on RAM and it's somewhat amazing we've even got this running. I was actually thinking of writing a "historical" post about all the things that have had to be done right in order to get to this point. There is an option to pipe the output into AS86, but that likely won't work well since that would require C86 and AS86 to be running at the same time, and we very likely don't have memory for that on any PC running real mode. There's an unbelievable amount of complexity under the hood to get all this working - and I actually like the ASM output, as it is very easy to see what the C86 compiler is generating (which brings up a whole other set of issues in itself). I'm working on a .o disassembler, since we don't actually have the capability (yet) to disassemble .o files.
Try to find one that will run on ELKS! That's been the whole problem for two years now until @rafael2k finally succeeded with IMO a great selection of tools.
There's no way to "turn off validation" since the whole function of AS86 is to turn assembly into object files. It has to recognize the large number of instruction names to do it. BTW, that's the reason why the "Init" startup took a while, which you previously pointed out - hundreds of mallocs then added to a hash table, which get increasingly slow to execute as the allocation list becomes large. The current memory allocator has to do a linear search of the entire allocation list in order to find best-fit each time an allocation is requested, which is taking too long. That's another item on the long list of enhancements for this project. |
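To show why a best-fit allocator slows down as the allocation list grows, here is a minimal sketch of a best-fit search over a singly linked list. The struct layout and names are hypothetical and are not the ELKS allocator's; the point is only that every request walks the whole list, so hundreds of mallocs at startup cost time roughly proportional to the square of their number.

```c
#include <stddef.h>

/* Hypothetical allocation-list entry, for illustration only. */
struct block {
    size_t size;          /* usable bytes in this block */
    int in_use;           /* nonzero if currently allocated */
    struct block *next;   /* next entry in the allocation list */
};

/* Best fit: visit every block and remember the smallest free one
 * that is still large enough. Cost grows with the list length. */
struct block *best_fit(struct block *head, size_t want)
{
    struct block *best = NULL;

    for (struct block *b = head; b != NULL; b = b->next) {
        if (b->in_use || b->size < want)
            continue;
        if (best == NULL || b->size < best->size)
            best = b;
    }
    return best;
}
```

A first-fit search or per-size free lists would avoid the full walk, at the price of more fragmentation or more bookkeeping.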
I see. There are tons of work done already and compilation speed would gradually improve in the next months to come. Thank you! |
Yes - the current approach is to get something that actually works, then try to optimize it. I'm still very interested in comments regarding speed though, your last testing and comments sparked finding some big previously unfound bugs. And the early comments about NASM proved so bad that we had to ditch NASM in favor of AS86. There might be some very fast assemblers out there, but we're still limited by having to produce Introl-format (AS86) .o files for our LD86 linker. The link phase isn't super fast, but in looking more at it, it now has to read the entire 77k C library during the process of linking, which makes things slower. That all said, I think for 8088 systems, most of the time is being spent in the actual calculation of C conversion to ASM, and then a lot more seemingly in ASM conversion to .o, with not so much time spent in reading and writing disk files. That's helped because in many cases, depending on your buffer settings, the output files may not even get written to disk, but instead stay inside system buffers. So there's a lot of variables and tuning that ultimately will affect real time throughput. I'm also hoping to find other ways to speed up the assembler, such as differing memory allocation algorithms. I'm still working on that. |
I missed this comment. 26s is very slow. I wonder which portion of this is just reading make, cpp86, c86, as86 and ld86 from disk? I don't have a cycle accurate emulator, but that's almost 250k of executables before doing anything. I also just found that the AS86 source says having a "very fast" memcmp function is needed for the hash table lookup of all the instructions. Ours is currently written in C rather than ASM, so it can be sped up. I might also be able to (temporarily) disable the 80386 and 8087 instructions from the AS86 opcode tables, that would probably help. |
@toncho11, I decreased the size of the AS86 initial hash table by removing 386+ and floating point instructions which aren't used in our toolchain, and then wrote a fast, inlined memcmp for AS86 to speed up hash comparisons in ghaerr/8086-toolchain#29. This seems to be working well, and the AS86 portion of the toolchain should be sped up again by quite a bit. Thanks for your comments! I agree that 2.5 minutes or 26 seconds is way too slow for the toolchain. Let me know how much it speeds up execution on 86Box. We may need to start looking at individual timings for CPP86, C86 and LD86 as well, to see what else is really slow and see what can be done about it. |
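For illustration, an instruction-name hash table with an inlined byte compare might look roughly like the sketch below. All names, the hash function, and the table size are assumptions made for this example and are not taken from the AS86 source. The length check before the byte compare and the small table are the two ideas mentioned above.

```c
#include <stddef.h>

#define NSLOTS 64   /* power of two; must stay larger than the entry count */

/* Hypothetical mnemonic entry, for illustration only. */
struct mnemonic {
    const char *name;
    size_t len;
    int id;
};

static const struct mnemonic *slots[NSLOTS];

static unsigned hash_name(const char *s, size_t len)
{
    unsigned h = 0;
    while (len--)
        h = h * 31 + (unsigned char)*s++;
    return h & (NSLOTS - 1);
}

/* Small inlined compare, avoiding a library call per probe. */
static inline int same_bytes(const char *a, const char *b, size_t n)
{
    while (n--)
        if (*a++ != *b++)
            return 0;
    return 1;
}

/* Insert with linear probing; assumes the table never fills up. */
void add_mnemonic(const struct mnemonic *m)
{
    unsigned i = hash_name(m->name, m->len);
    while (slots[i] != NULL)
        i = (i + 1) & (NSLOTS - 1);
    slots[i] = m;
}

/* Look up a name; returns NULL if it is not in the table. */
const struct mnemonic *find_mnemonic(const char *s, size_t len)
{
    unsigned i = hash_name(s, len);
    while (slots[i] != NULL) {
        if (slots[i]->len == len && same_bytes(slots[i]->name, s, len))
            return slots[i];
        i = (i + 1) & (NSLOTS - 1);
    }
    return NULL;
}
```

Fewer entries means fewer collisions and shorter probe sequences, which is one reason dropping unused 386+ and floating point mnemonics helps both startup and lookup time.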
@ghaerr 86Box has a Mac OS version https://github.com/86Box/86Box/releases/tag/v4.2.1 both Arm-64 and x86-64. I usually select "HD Controller" to "PC/XT XTIDE" and then I select an ELKS HDD image in "Hard disks". This is my config file: 86box.zip You also need to download https://github.com/86box/roms/releases and save it as "roms" in 86box folder. |
Time reduced from 2m20s to 1m56s. |
So chess has gone from 2m17 seconds initially to 1m30 seconds. This is a total of 47 seconds improvement. Very good. |
Thanks for your testing @toncho11, that has helped move from NASM to AS86, and now further decreasing the assembly time. I'm not sure what else can be done to increase speed on AS86. I have a few other ideas, but I don't think they'll amount to as big a savings as we have got recently.
Is this for just AS86, or for the entire make? Maybe we can pinpoint other tools that are running slowly and try to speed them up. |
Entire make. |
Currently I can "edit" chess.c, which is already very good. There is enough memory. I tried chess.as, which is a very big file, and it failed. Maybe "edit" can be compiled to support bigger files? It is just to avoid future problems where we have a toolchain, but we cannot edit the files needed to use it. It is just an idea for a future improvement. |
Good idea, but looking at it, it's already built with a max default heap. So recompiling using OWC and far data still won't solve its problem. I'm not sure what to do about it, other than we may need to find a WYSIWYG editor that uses disk for holding edited contents when there's not enough RAM. Our current |
This issue is a continuation of #2112, which was getting very long.
The topics covered are getting the 8086 toolchain compiled on the host, using it to cross-compile applications for ELKS, and native compiling of programs on ELKS itself.
For newcomers, getting just the first two done above, compiling the toolchain and then cross-compiling an example, requires some setup, as there are three compilers involved, each with an environment script to be sourced:
". env.sh" in ELKS root.
". wcenv.sh" in ELKS libc.
". c86env.sh" in ELKS libc.
First, the location of the OWC installation directory (WATCOM=) must be set by editing libc/wcenv.sh.
Second, the location of the C86 repo (C86=) must be set by editing libc/c86env.sh.
Then the following steps are used to build each piece in order:
After all this, in the 8086-toolchain directory, you will have the host C86 toolchain executables in 8086-toolchain/host-bin,
and the ELKS native C86 toolchain executables in 8086-toolchain/elks-bin. The native C library is in ELKS/libc/libc86.a.
After the three environment variables are set up and all repos have been made at least once, the update cycle is quite a bit simpler, since one doesn't need to bootstrap the process as above.
When either repo is updated and the three environment variables set, only the following needs to be done: