Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ADDFRR MVS assist #537

Merged
merged 4 commits into from
Jan 18, 2023
Merged

Add ADDFRR MVS assist #537

merged 4 commits into from
Jan 18, 2023

Conversation

jmaynard
Copy link
Contributor

This request adds the ADDFRR MVS assist from the 3033 Extensions feature.

Copy link
Member

@Fish-Git Fish-Git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea whether you change is good or not, Jay, since I have no fricking idea what the assist is supposed to do!   :(

Line 782 of your assist.c patch is a comment that says:

  /* Return with the FRRSPARM area address in r1 per the assist documentation  */

It would be nice if this documentation (manual # and title and URL if available) was mentioned somewhere so people like me could learn about the assist. Having the manual would certainly allow me to perform more proper (more thorough) review of your code.

It would also be nice if you'd use a more common indentation value too. It looks like the primary statements of your instruction function is using an indent of only 2 columns. It's not a requirement and you don't have to change it if you don't want to, but I'd personally prefer to see a more common 4 column indentation (which, oddly enough, appears to be what your e.g. if statements are using!) instead of the 2 column indentation you're currently using.

But like I said, if you want to use 2, that's fine! I can certainly live with it.   :)

But if you want a more proper/thorough review, I need to have the manual.

Thanks!

@jmaynard
Copy link
Contributor Author

jmaynard commented Jan 16, 2023 via email

@Fish-Git
Copy link
Member

Fish-Git commented Jan 16, 2023

The doc is very sadly lacking...a lot of trying and poking went into this. Still, I'll reference what little doc actually exists.

Thanks. Having something -- even if it's lacking -- is certainly better than not having any at all.

I can, however, guarantee the code works properly.

Of that I have zero doubt!   ;-)

Nevertheless, a you mentioned, having a second set of eyes look it over too certainly can't hurt!   :)

(Besides, looking over other's code is a good way to learn about things you didn't know before! That's how I gained 99% of my experience with Hercules over the years!)

@jmaynard jmaynard requested a review from Fish-Git January 18, 2023 06:54
@Fish-Git
Copy link
Member

Fish-Git commented Jan 18, 2023

FYI: I cloned your repository and have finished my review of your code. I see some problems and have some suggestions and a few questions. I will post it tomorrow. It's late and I need to get to bed.

Copy link
Member

@Fish-Git Fish-Git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, your PR (Pull Request) unfortunately cannot be accepted as is. It requires changes.

The first and most important required change is line 760. A simple typo:

        (vstore4) ( regs->GR_L(r2) | 0x00000001, frrnext, USE_PRIMARY_SPACE, regs );

It obviously occurred when you made the indentation changes I suggested earlier. It's no big deal since it's not really a bug per se (just a silly typo), but it does need to be fixed as it of course causes building of Hercules to fail due to the resulting compiler error.

The second required change is much more serious: a fix to what appears to be a rather potentially serious bug. While I'm sure it's unlikely to occur in actual practice (I'm sure MVS would never screw up so badly as to cause an invalid frrnext and frrsize combination to ever occur), we simply cannot afford to take that chance. Especially when in this particular case it could crash the emulator. Any/ALL values coming from the guest should be considered "suspect" by default and simply must be validated before being used. It is that thought that caused alarm bells to go off in my head when I came across the following:

        main = MADDRL (frrnext + 8, 4, 0, regs, ACCTYPE_WRITE, regs->psw.pkey);
        memset((char*)main, 0x00, frrsize);

Depending on the value of frrnext and frrsize, it is entirely possible for the above memset to clear memory beyond the end of guest storage thereby causing Hercules to crash, or worse, cause it to behave unpredictably or even create the condition for a security exploit. It is in essence a buffer overflow.

I have coded in my patch attached further below what I believe is one possible solution. You of course don't have to use it (nor any of the changes I've made in my patch), but you will need to fix this issue before I can accept your Pull Request.

Another change that I believe is needed is to the technique you're currently using to copy CR3/CR4 into the FRR stack entry. While the way you coded it would certainly work, vfetchc presumes the destination of the fetch is to a variable in emulator storage (not to somewhere in guest storage), and therefore (duh!) does not update storage keys. The data would certainly be properly copied, but unfortunately the storage keys of the guest storage where they were being copied to would not get updated. My patch fixes this oversight by doing the vfetchc into a temporary work variable instead and then immediately storing that temporary variable back into guest storage via vstorec, thereby causing the corresponding storage keys for the guest storage being modified to get properly updated.

I also changed to way the location of the store is calculated too. Please take a look at it and tell me what you think. I noticed the vfetch4 you were doing was coming from frrstak, so I simply fetched that value up front at the beginning of the function along with the rest of the values you were fetching, and then simply used that value in the calculation. (I called it frrparm, but I'm not sure if that's the right name to call it; I wish the FRR stack and this instruction were better documented!)

I also added a few & ADDRESS_MAXWRAP( regs ) here and there that may or may not be needed, but I figured it's better to be safe than sorry. Feel free to remove any you feel are not needed.

One question I have concerns your comment and the three if statements that check the entrycode value to determine which function to perform. Your comment says "Perform one of the following three functions...", but the tests to check for which function to perform are a series of stand-alone if statements, leaving open the logical possibility that all three functions could potentially be performed! I did not study your if statements that closely to see whether this was actually possible or not (I suspect it's not), but if the three possible functions that the instruction is meant to perform are truly mutually exclusive, I'd feel better if instead of using three stand-alone ifs like you are, that you use if, else if and else instead, and change you comment to say something like "Perform ONLY ONE of the following three functions...".

That's pretty much it. All other changes are minor whitespace and comment crap that you are free to ignore/discard.

So that's it. Your proposed changes to Hercules have now been officially reviewed. Not exactly what you were expecting I'll bet.   ;-)

I hope I haven't been too harsh too. I really do appreciate your effort, Jay! Truly! But you asked for my review so I gave it.

Make the suggested changes and I'll be happy to accept your Pull Request. But as it stands, I'm afraid I'm going to have to reject it at this time. Sorry!   :(

My suggested changes:

(Note: you may or may not want to run it through dos2unix first before applying it.)
 

@jmaynard
Copy link
Contributor Author

jmaynard commented Jan 18, 2023

No problem. Your comments are more than fair.

I examined the 3-way if, and the three really are mutually exclusive. Even so, your way is cleaner.

I wish the FRR stack and this instruction were better documented!

Me too, Fish, me too.

I've got no objections to the other changes; and they seem to work properly. Applied.

@jmaynard jmaynard requested a review from Fish-Git January 18, 2023 19:59
Copy link
Member

@Fish-Git Fish-Git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@Fish-Git Fish-Git merged commit 65d2e33 into SDL-Hercules-390:master Jan 18, 2023
@Fish-Git
Copy link
Member

Merged!

@Fish-Git
Copy link
Member

Merged!

DAMMIT!!!

It got merged into the wrong fricking branch!!!

It got merged into "master"! It was _supposed to be merged into "develop"!!

(sigh)

Let me see if I can fix it...

(revert the commit made to master and manually commit it to develop instead)

@jmaynard
Copy link
Contributor Author

jmaynard commented Jan 18, 2023 via email

@wrljet
Copy link
Member

wrljet commented Jan 18, 2023

I hate git and GitHub.

Fish-Git pushed a commit that referenced this pull request Jan 18, 2023
* Add ADDFRR MVS assist

* Add ADDFRR MVS assist

* Incorporate review comments

Co-authored-by: Jay Maynard <jmaynard@gmail.com>
@jmaynard
Copy link
Contributor Author

It doesn't help that my original fork of the repository on Github didn't include the develop branch at all. I've since corrected that, and will make sure to do my developing on the develop branch (or a separate branch which can then get merged into develop).

Sorry for the extra work.

@jmaynard
Copy link
Contributor Author

Well, that's annoying. Got everything re-forked/re-cloned, and now IPLing in ARCHMODE S/370 segfaults...

@Fish-Git
Copy link
Member

Jay wrote:

My bad...I added it to master, not develop, on my side.

Yeah, I completely missed that.   :(

 
Fish wrote:

Let me see if I can fix it...

Should be fixed now.

I first created/saved the accidental commit as a patch (so it could later be applied to the develop branch instead), and then did a hard reset of the master branch to take it back to where it was before (i.e. to immediately before the accidental commit), and then did a git forced push (as opposed to a regular git push) of that to the remote to fix it too (so that anyone cloning hyperion gets the correct official 4.5 master branch code).

Then I simply applied the patch to develop branch instead and pushed that.

Simple, really.   <shrug>

 
Bill said:

I hate git and GitHub.

I used to, but now that I'm used to it, I actually like it. Do you do your git stuff via the command-line? If so, then maybe that's why. I myself use TortoiseGit on Windows, which makes using git SO much easier!

 
Jay wrote:

It doesn't help that my original fork of the repository on Github didn't include the develop branch at all.

Actually I suspect it did. When you git clone a repository, you get everything: the default master branch and all other branches as well. It's just that "git clone", by default, sets your cloned repository to the master branch by default after doing the clone, so after doing your clone, you simply need to do a git checkout of the develop branch. But I understand it's very easy to forget to do that.

All new development should be done in the develop branch. This is to ensure the master branch is never changed and always remains at the current official release level. When we're ready to make a new release, we simply merge our develop branch into master, tag it, and then push. But once a new version is released, it (the master branch) should never change until it's time to do another release. The develop branch changes constantly as new development proceeds. But the master branch remains static, and doesn't ever change except from one release to the next.

ANYWAY...   everything is straightened out now.

 
Jay wrote:

Sorry for the extra work.

No big deal. It scared me at first, but fixing it wasn't really that much work. I just had to think about it and proceed carefully, that's all. When I panic and do things in a rush, I invariably fuck things up.

@Fish-Git
Copy link
Member

Fish-Git commented Jan 18, 2023

Well, that's annoying. Got everything re-forked/re-cloned, and now IPLing in ARCHMODE S/370 segfaults...

WHAT?!   That's not good!   :(

@Fish-Git
Copy link
Member

IPLing in ARCHMODE S/370 segfaults

Make sure you have crash dumps enabled and try again so we can analyze the dump to see WTF is going on.

Thanks.

@jmaynard
Copy link
Contributor Author

jmaynard commented Jan 18, 2023

I can't seem to get a core dump out of the doggone thing no matter what I do - and that includes going down a bunch of rabbit holes. (What systemd has to do with taking crash dumps, I haven't the foggiest idea!)

I can at least get a backtrace, though....

Thread 4 "Processor CP00" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f700efe1640 (LWP 1707878)]
s370_run_cpu (cpu=<optimized out>, oldregs=<optimized out>) at ../cpu.c:2081
2081	        UNROLLED_EXECUTE( current_opcode_table, regs );
(gdb) bt
#0  s370_run_cpu (cpu=<optimized out>, oldregs=<optimized out>)
    at ../cpu.c:2081
#1  0x00007f700fbb07bd in cpu_thread (ptr=0xd5dd840) at ../cpu.c:2364
#2  0x00007f70102c9ef2 in hthread_func (arg2=0x565500d406e0)
    at ../hthreads.c:1059
#3  0x00007f700f694b43 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:442
#4  0x00007f700f726a00 in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) 

This only affects 370 mode .Other modes still work fine.

@wrljet
Copy link
Member

wrljet commented Jan 18, 2023

It's not your new code, Jay.

It fails in testcase awsbsf S/370 AWS Tape BSF into Load Point.
Works in commit e4f10e3
Fails in commit c618bb8

I suspect it's in the new tracing feature from Jan 11, 2023.

As an aside, this explains why some people on our Discord have been asking about tests failing in recent days (that I hadn't gotten around to investigating).

@wrljet
Copy link
Member

wrljet commented Jan 18, 2023

This appears to be the commit 52b5df6 that causes the failure.

@wrljet
Copy link
Member

wrljet commented Jan 18, 2023

I can't seem to get a core dump out of the doggone thing no matter what I do - and that includes going down a bunch of rabbit holes. (What systemd has to do with taking crash dumps, I haven't the foggiest idea!)

Jay, you're on Pop!_OS ?

@jmaynard
Copy link
Contributor Author

Yes, I am. It's an Ubuntu derivative.

@Fish-Git
Copy link
Member

It fails in testcase awsbsf S/370 AWS Tape BSF into Load Point.

This appears to be the commit 52b5df6 that causes the failure.

I'm not seeing it.   :(

When I do a runtest on current git, I get:

Did 228 tests.  All OK.

Let me give DOS/VS a try...

@wrljet
Copy link
Member

wrljet commented Jan 18, 2023

Are you on Linux?

@Fish-Git
Copy link
Member

DOS/VS works just fine!

@Fish-Git
Copy link
Member

Are you on Linux?

No. Windows. But let me give Linux a try and I'll get back to you.

@jmaynard
Copy link
Contributor Author

VM/370 CE 1.1.1 blows up for me.

@Fish-Git
Copy link
Member

I can at least get a backtrace, though....

Thread 4 "Processor CP00" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f700efe1640 (LWP 1707878)]
s370_run_cpu (cpu=<optimized out>, oldregs=<optimized out>) at ../cpu.c:2081
2081	        UNROLLED_EXECUTE( current_opcode_table, regs );
(gdb) bt
#0  s370_run_cpu (cpu=<optimized out>, oldregs=<optimized out>)
    at ../cpu.c:2081
#1  0x00007f700fbb07bd in cpu_thread (ptr=0xd5dd840) at ../cpu.c:2364

THAT is very weird!

Because UNROLLED_EXECUTE of course essentially does the exact same thing as the statements that immediately precede that loop:

    ip = INSTRUCTION_FETCH( regs, 0 );
    PROCESS_TRACE( regs, ip, enter_fastest_no_txf_loop );
    EXECUTE_INSTRUCTION( current_opcode_table, ip, regs );
    regs->instcount++;
    UPDATE_SYSBLK_INSTCOUNT( 1 );

So if it was able to successfully execute the first instruction, why in the world would it suddenly blow up on the next one?! It doesn't make any sense!

@Fish-Git
Copy link
Member

I'm just getting started on my Linux test, but I think I may know why it might be crashing (and yes, it was my Trace File changes (commit 598c467) that probably introduced the problem):

I modified the Makefile.am file (to add the two new utilities to Hercules: tfprint and tfswap), but I forgot to commit a new updated configure script! (Doh!)

Doing an autogen.sh before doing your ./configure might fix the problem.

Let me test my theory and get back to you...

@jmaynard
Copy link
Contributor Author

No, autogen.sh before ./configure didn't help...well, I ran it before hercules-helper, but that with the no-gitclone option, so it shouldn't have made a difference.

@Fish-Git
Copy link
Member

Okay, I'm getting a SEGFAULT when I try to IPL DOS/VS on my Linux system. I don't have crash dumps enabled yet, but I will do that and try again. But this was a test without doing an autogen beforehand. Let me try the same thing again but doing an autogen beforehand, to see if I get the same thing you get, Jay....

@wrljet
Copy link
Member

wrljet commented Jan 19, 2023

I'm just getting started on my Linux test, but I think I may know why it might be crashing (and yes, it was my Trace File changes (commit 598c467) that probably introduced the problem):

It's 52b5df6

@Fish-Git
Copy link
Member

Okay, it still crashes even after doing autogen beforehand. Let me enable crash dumps so I can see what the FRICK is going on.

This whole issue is perplexing as hell. There should be nothing significantly different between the Linux and Windows builds. There's obviously something I'm missing (not doing right) with respect to Linux though. I apologize sincerely for this mess, Jay, whatever the frick it is. I'll get to the bottom of it eventually.

@jmaynard
Copy link
Contributor Author

jmaynard commented Jan 19, 2023 via email

@Fish-Git
Copy link
Member

Fish-Git commented Jan 19, 2023

It's 52b5df6

You may well be right, I don't know. I haven't gotten that far yet.

But if it is, the question is WHY?!

What is it about that particular commit that breaks things so severely on Linux? That's what I don't understand!   :(

(Still working on trying to get crash dumps enabled on my system...)

@wrljet
Copy link
Member

wrljet commented Jan 19, 2023

Jay, to enable core dumps on Pop!_OS:

ulimit -S -c unlimited
sudo sysctl -w kernel.core_pattern="%s.core"

@wrljet
Copy link
Member

wrljet commented Jan 19, 2023

Sorry for pasting this as an image, but I can't copy text out of this Pop!_OS VM yet.
This is with the current latest develop checkout.

image

@Fish-Git
Copy link
Member

Fish-Git commented Jan 19, 2023

Bill, try the same thing but with a Debug build of Herc instead of a normal Release build. It's been a long time since I've done it, but I believe all you have to do is specify --enable-debug when you do your ./configure, but you may also need to specify --enable-optimization=no too. I'm not sure.

But using an unoptimized Debug build should allow gdb to display actual variable values instead of displaying <optimized out>. Doing so should also allow us to determine whether the problem is a compiler optimization bug or not too. If the Debug build runs fine (albeit much more slowly of course) but the normal optimized Release build crashes, then we know it's not anything Hercules is doing. It's the compiler.

When chasing a bug it's best to do so with Debug builds of Hercules. Then once the bug is found and confirmed to be fixed (by virtue of the unoptimized Debug build working correctly), then the optimized Release build should work too. And if it doesn't, then someone more skilled at tracking down such compiler optimization bugs needs to get involved because such a situation would be largely out of our hands and well beyond our normal debugging skill level. (Or least way beyond MY skill level that's for darn sure!)

(Me: still working on getting crash dumps enabled; VMware temporarily got in the way, but I seem to be past that now...)

@Fish-Git
Copy link
Member

Fish-Git commented Jan 19, 2023

p.s. to both Bill (@wrljet) and Jay (@jmaynard) too:

I think it would be best to create a new GitHub Issue for this so we can better look into it (and get the input/advice of others in our user community who are quite skilled at debugging such issues) rather than to continue discussing it here, in a Pull Request thread for a Pull Request that has already been Merged and Closed.

Besides, I think it's clear by now that the issue has NOTHING to do with Jay's ADDFRR change. So continuing to discuss it here is silly. Someone needs to create a new GitHub Issue for this and we need to move our conversation over to there.

Yes?

@jmaynard
Copy link
Contributor Author

OK, works for me.

@Fish-Git
Copy link
Member

Okay, I can't seem to cause a core file to get created either, even though it's enabled:

fish@kubuntu2110-vm:~/hercules/hercules-0$ ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 31376
max locked memory           (kbytes, -l) 1013152
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 31376
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
fish@kubuntu2110-vm:~/hercules/hercules-0$ ls -al core*
ls: cannot access 'core*': No such file or directory
fish@kubuntu2110-vm:~/hercules/hercules-0$ 

I guess my only choice is to try executing Herc under gdb...

@Fish-Git
Copy link
Member

OK, works for me.

Since it was YOU that originally discovered and reported the problem, would you do us the honors please and create the GH Issue for us? Thanks! If I need to make another post I'll do so in that thread.

@jmaynard
Copy link
Contributor Author

Already did. #538

@wrljet
Copy link
Member

wrljet commented Jan 20, 2023

Bill said:

I hate git and GitHub.

I used to, but now that I'm used to it, I actually like it. Do you do your git stuff via the command-line? If so, then maybe that's why. I myself use TortoiseGit on Windows, which makes using git SO much easier!

Fish,

I on Windows, I use Github Desktop for committing stuff. And TortoiseGIT for looking at stuff, changes, etc.
On Linux and macOS I use git command line, and try to be careful.

Bill

@Fish-Git
Copy link
Member

I on Windows, I use Github Desktop for committing stuff. And TortoiseGIT for looking at stuff, changes, etc.

I've never used Github Desktop because I really don't see the need to have it on my system since I already have TortoiseGIT.

Why would you use Github Desktop (yet another completely unneeded tool on your system!) when you already have TortoiseGIT? It's so easy! Just right-click and select: Git Commit -> "develop"...

Sheesh! What could be easier?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants