[6.6]Hygon: Some enhancement and bugfixes for HYGON SME/CSV/CSV2 #372

Merged

wojiaohanliyang

Description:

  1. Fix the ASID range for CSV2 guests
  2. Keep CSV2 #VC handling for user-space NAE events in atomic context
  3. Fix unexpected hypercall warning messages
  4. Fix a cache coherency issue in CSV/CSV2 guests
  5. Fix the return value and emulated MSR info returned to user space
  6. Fix a memory leak of mapped GHCB pages

hanliyang and others added 6 commits August 16, 2024 18:07
hygon inclusion
category: feature
CVE: NA

---------------------------

All ASIDs in the range [1, max_sev_asid] are available to CSV2 guests,
regardless of the value of min_sev_asid.
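
The ASID rule above can be sketched as a small user-space model. This is
only an illustration, not the patch itself: `struct asid_range`,
`pick_asid_range()` and the `is_hygon` flag are hypothetical names, and the
SEV and SEV-ES branches assume the usual upstream KVM split of the ASID
space at min_sev_asid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the ASID window handed to the allocator. */
struct asid_range {
    unsigned int min;
    unsigned int max;   /* max < min means "no ASID available" */
};

/*
 * Pick the ASID window for a new guest.
 *   es_guest: the guest runs with encrypted state (SEV-ES or CSV2)
 *   is_hygon: hypothetical stand-in for the vendor check
 */
static struct asid_range pick_asid_range(bool es_guest, bool is_hygon,
                                         unsigned int min_sev_asid,
                                         unsigned int max_sev_asid)
{
    struct asid_range r;

    assert(min_sev_asid >= 1 && min_sev_asid <= max_sev_asid);

    if (!es_guest) {
        /* Assumed: guests without encrypted state use the upper window. */
        r.min = min_sev_asid;
        r.max = max_sev_asid;
    } else if (is_hygon) {
        /* CSV2: the whole range, regardless of min_sev_asid. */
        r.min = 1;
        r.max = max_sev_asid;
    } else {
        /* Assumed: SEV-ES is limited to the ASIDs below min_sev_asid. */
        r.min = 1;
        r.max = min_sev_asid - 1;
    }
    return r;
}
```

With min_sev_asid = 5 and max_sev_asid = 15, the model gives a CSV2 guest
the window [1, 15], while the assumed SEV-ES branch would stop at 4.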

Signed-off-by: hanliyang <hanliyang@hygon.cn>
…mes from userspace

hygon inclusion
category: bugfix
CVE: NA

---------------------------

In vc_raw_handle_exception(), the handler holds the GHCB page and calls

    __sev_get_ghcb()    <- holding ghcb page to communicate with host
    vc_init_em_ctxt()
    vc_handle_exitcode()
    __sev_put_ghcb()    <- no longer holding ghcb page after the
                           communication

to emulate the instruction that caused the #VC.

When the #VC comes from userspace, the code path

    user_exc_vmm_communication()
        vc_raw_handle_exception()

does not keep its memory accesses in atomic context. If the emulation
accesses a userspace address that is not resident in memory, the page
fault is handled directly. When a userspace page fault is handled outside
atomic context and the caller has not called pagefault_disable(), IRQs
may be enabled, and there is a risk of generating further #VC exceptions.
It is therefore necessary to switch to atomic context before emulating
the instruction that caused the #VC.

Add a __preempt_count_{add,sub}() pair to keep the code between
__sev_get_ghcb() and __sev_put_ghcb() in atomic context if the #VC comes
from userspace. If a memory access fails during emulation, the caller
constructs the page fault information and forwards a page fault later.
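
The pairing described above can be illustrated with a toy user-space model.
Everything here is hypothetical (`preempt_count`, `ghcb_held`,
`emulate_user_access()`, `handle_vc()`, the `use_fix` switch); the increments
only stand in for __preempt_count_add()/__preempt_count_sub(), and the model
merely counts the hazardous case in which a fault is handled in place while
the GHCB is held.

```c
#include <assert.h>
#include <stdbool.h>

enum em_result { EM_OK, EM_FORWARD_PF };

static int preempt_count;     /* stand-in for the per-CPU preempt counter */
static bool ghcb_held;        /* true between "get" and "put" of the GHCB */
static int hazardous_faults;  /* faults handled in place while GHCB is held */

static bool in_atomic_ctx(void)
{
    return preempt_count != 0;
}

/* Model one userspace memory access made by the instruction emulation. */
static enum em_result emulate_user_access(bool page_present)
{
    if (page_present)
        return EM_OK;

    if (!in_atomic_ctx()) {
        /* Fault is handled directly; IRQs may be enabled meanwhile. */
        if (ghcb_held)
            hazardous_faults++;
        return EM_OK;
    }

    /* Atomic context: the access fails and the caller forwards a #PF. */
    return EM_FORWARD_PF;
}

static enum em_result handle_vc(bool from_user, bool page_present,
                                bool use_fix)
{
    enum em_result ret;

    assert(!ghcb_held);              /* no nesting in this toy model */

    if (from_user && use_fix)
        preempt_count++;             /* models __preempt_count_add(1) */

    ghcb_held = true;                /* models __sev_get_ghcb() */
    ret = emulate_user_access(page_present);
    ghcb_held = false;               /* models __sev_put_ghcb() */

    if (from_user && use_fix)
        preempt_count--;             /* models __preempt_count_sub(1) */

    return ret;
}
```

In this model, a userspace #VC touching a non-resident page is counted as
hazardous without the pair, and is reported as EM_FORWARD_PF with it.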

Fixes: be1a540 ("x86/sev: Split up runtime #VC handler for correct state tracking")
Signed-off-by: hanliyang <hanliyang@hygon.cn>
…complete_hypercall_exit()

hygon inclusion
category: bugfix
CVE: NA

---------------------------

Since commit b5aead0 ("KVM: x86: Assume a 64-bit hypercall for
guests with protected state"), is_64_bit_mode() triggers a warning for
SEV-ES or CSV2 guests, as shown in the following messages.

[85350.053201] ------------[ cut here ]------------
[85350.053206] WARNING: CPU: 2 PID: 68989 at arch/x86/kvm/x86.h:156 complete_hypercall_exit+0x6a/0x70 [kvm]
[85350.053299] Modules linked in: kvm_amd(OE) kvm(OE) ccp(E) irqbypass(E) vhost_net(E) vhost(E) vhost_iotlb(E) tap(E) fuse(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) ip6table_mangle(E) ip6table_nat(E) iptable_mangle(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) libcrc32c(E) nfnetlink(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) tun(E) bridge(E) stp(E) llc(E) rfkill(E) vfat(E) fat(E) binfmt_misc(E) intel_rapl_msr(E) intel_rapl_common(E) amd64_edac(E) edac_mce_amd(E) crct10dif_pclmul(E) crc32_pclmul(E) acpi_ipmi(E) ipmi_ssif(E) ipmi_si(E) ast(E) joydev(E) mousedev(E) ghash_clmulni_intel(E) rapl(E) ipmi_devintf(E) drm_shmem_helper(E) drm_kms_helper(E) ipmi_msghandler(E) sg(E) k10temp(E) acpi_cpufreq(E) squashfs(E) loop(E) parport_pc(E) ppdev(E) lp(E) parport(E) drm(E) ip_tables(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) ahci(E) igb(E) i2c_designware_platform(E) libahci(E) i2c_algo_bit(E) dca(E) i2c_piix4(E)
[85350.053421]  i2c_designware_core(E) crc32c_intel(E) libata(E) i2c_core(E) [last unloaded: kvm(OE)]
[85350.053432] CPU: 2 PID: 68989 Comm: qemu-system-x86 Tainted: GF       W  OE      6.6.7-for-openanolis deepin-community#5
[85350.053438] Hardware name: HYGON HongHaiA1b/HongHaiA1, BIOS A1633050 02/02/2023
[85350.053441] RIP: 0010:complete_hypercall_exit+0x6a/0x70 [kvm]
[85350.053511] Code: e8 9b fb ff ff 48 83 c4 08 5b 5d e9 60 68 68 d8 48 8d 54 24 04 48 89 e6 48 89 ef e8 40 db 12 00 8b 44 24 04 85 c0 74 c4 eb c4 <0f> 0b eb b5 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
[85350.053514] RSP: 0018:ffffc90000ea3e28 EFLAGS: 00010202
[85350.053519] RAX: ffff8881419f0000 RBX: 0000000000000000 RCX: ffff8881003ad780
[85350.053522] RDX: 0000606fc0a29bc0 RSI: 00000000fffffe01 RDI: ffff888b5dc20000
[85350.053525] RBP: ffff888b5dc20000 R08: 0000000000000001 R09: 0000000000000000
[85350.053527] R10: ffffc90000ea3ee8 R11: 0000000000000000 R12: ffff88810fe1ea00
[85350.053530] R13: ffff888b5dc20000 R14: ffff888b5dc20048 R15: 0000000000000000
[85350.053532] FS:  00007eff45528700(0000) GS:ffff88903f080000(0000) knlGS:0000000000000000
[85350.053536] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[85350.053539] CR2: 0000000000000000 CR3: 00000001415d2000 CR4: 00000000003506e0
[85350.053541] Call Trace:
[85350.053545]  <TASK>
[85350.053550]  ? __warn+0x84/0x140
[85350.053558]  ? complete_hypercall_exit+0x6a/0x70 [kvm]
[85350.053627]  ? report_bug+0x1bd/0x1d0
[85350.053635]  ? handle_bug+0x3c/0x70
[85350.053640]  ? exc_invalid_op+0x18/0x70
[85350.053645]  ? asm_exc_invalid_op+0x1a/0x20
[85350.053655]  ? complete_hypercall_exit+0x6a/0x70 [kvm]
[85350.053724]  kvm_arch_vcpu_ioctl_run+0x3dd/0x410 [kvm]
[85350.053796]  kvm_vcpu_ioctl+0x277/0x6c0 [kvm]
[85350.053855]  __x64_sys_ioctl+0x92/0xd0
[85350.053864]  do_syscall_64+0x3f/0x90
[85350.053868]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[85350.053874] RIP: 0033:0x7eff486c33ab
[85350.053878] Code: 0f 1e fa 48 8b 05 e5 7a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b5 7a 0d 00 f7 d8 64 89 01 48
[85350.053881] RSP: 002b:00007eff45527848 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[85350.053886] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007eff486c33ab
[85350.053888] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000010
[85350.053891] RBP: 0000563586e32430 R08: 0000563584ff1d30 R09: 00007eff455276a4
[85350.053893] R10: 00007eff4552769c R11: 0000000000000246 R12: 0000000000000000
[85350.053896] R13: 00005635856bcd60 R14: 0000000000000000 R15: 0000000000000000
[85350.053904]  </TASK>
[85350.053906] ---[ end trace 0000000000000000 ]---

Use is_64_bit_hypercall() instead of is_64_bit_mode() in
complete_hypercall_exit() to avoid the warning when an SEV-ES or CSV2
guest invokes the KVM_HC_MAP_GPA_RANGE hypercall.
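
The difference between the two helpers can be modelled in user space as
follows. This is a simplified sketch, not the kernel code: `struct
vcpu_model`, the `model_` prefixes and `warn_count` are hypothetical, and the
counter only stands in for the kernel's one-shot warning on vCPUs with
protected guest state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced view of a vCPU. */
struct vcpu_model {
    bool guest_state_protected;  /* SEV-ES / CSV2: registers not readable */
    bool long_mode;
    bool cs_l;                   /* CS.L bit, only valid if readable */
};

static int warn_count;           /* stands in for the kernel warning */

static bool model_is_64_bit_mode(const struct vcpu_model *v)
{
    assert(v);

    /* Reading CS of a protected guest is a bug, so the helper warns. */
    if (v->guest_state_protected)
        warn_count++;

    if (!v->long_mode)
        return false;
    return v->cs_l;
}

static bool model_is_64_bit_hypercall(const struct vcpu_model *v)
{
    assert(v);

    /*
     * With protected state the hypercall registers must have been
     * provided in 64-bit mode, so assume 64-bit and never touch CS.
     */
    return v->guest_state_protected || model_is_64_bit_mode(v);
}
```

For a protected vCPU the hypercall variant short-circuits before the mode
check, so the warning path is never reached.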

Fixes: b5aead0 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
Signed-off-by: hanliyang <hanliyang@hygon.cn>
…hes to early_top_pgt

hygon inclusion
category: bugfix
CVE: NA

---------------------------

The memory region of the .bss..decrypted section may be mapped as
encrypted before the early boot stage of Linux. If the corresponding
stale cache lines from that earlier stage are not flushed before the
region is accessed in later stages, Linux will crash because the stale
cache lines corrupt the memory.

Fix this issue by flushing the caches through the encrypted mapping
before accessing the .bss..decrypted section.

Fixes: b3f0907 ("x86/mm: Add .bss..decrypted section to hold shared variables")
Signed-off-by: hanliyang <hanliyang@hygon.cn>
mainline inclusion
from mainline-v6.8-rc5
commit 3376ca3
category: bugfix

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3376ca3f1a2075eaa23c5576c47d04d7e8a4adda

---------------------------

Commit 6abe9c1 ("KVM: X86: Move ignore_msrs handling upper the
stack") changed the 'ignore_msrs' handling, including sanitizing return
values to the caller. This was fine until commit 12bc213 ("KVM:
X86: Do the same ignore_msrs check for feature msrs") which allowed
non-existing feature MSRs to be ignored, i.e. to not generate an error
on the ioctl() level. It even tried to preserve the sanitization of the
return value. However, the logic is flawed, as '*data' will be
overwritten again with the uninitialized stack value of msr.data.

Fix this by simplifying the logic and always initializing msr.data,
vanishing the need for an additional error exit path.
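
The simplified logic described above can be sketched in user space like
this. The names are stand-ins rather than the kernel's own: `struct
msr_entry`, `get_msr_feature()`, `MSR_RET_INVALID` and the `ignore_msrs`
parameter are hypothetical, and the single known MSR index is chosen purely
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_RET_INVALID 2   /* stand-in for "unknown MSR" */

struct msr_entry {
    uint32_t index;
    uint64_t data;
};

/*
 * Hypothetical backend: knows exactly one feature MSR and leaves
 * msr->data untouched when the index is unknown.
 */
static int get_msr_feature(struct msr_entry *msr)
{
    if (msr->index == 0x10a) {
        msr->data = 0x1;
        return 0;
    }
    return MSR_RET_INVALID;
}

static int do_get_msr_feature(uint32_t index, uint64_t *data,
                              bool ignore_msrs)
{
    struct msr_entry msr;
    int r;

    assert(data);

    /* Always initialise the output so no stack garbage can leak. */
    msr.data = 0;
    msr.index = index;
    r = get_msr_feature(&msr);

    /* An ignored, non-existing MSR is reported as success. */
    if (r == MSR_RET_INVALID && ignore_msrs)
        r = 0;

    /* Single exit path: *data is either the real value or zero. */
    *data = msr.data;
    return r;
}
```

Because msr.data is set before the lookup, the final copy to *data is safe
on every path and no extra error exit is needed.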

Fixes: 12bc213 ("KVM: X86: Do the same ignore_msrs check for feature msrs")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20240203124522.592778-2-minipli@grsecurity.net
Signed-off-by: Sean Christopherson <seanjc@google.com>
hygon inclusion
category: bugfix
CVE: NA

---------------------------

The GHCB pages may be mapped while KVM handles VMGEXIT events, and they
are unmapped when KVM prepares to switch to guest mode. If the userspace
VMM (e.g. qemu) of a guest is killed, the mapped GHCB pages may never be
unmapped, which causes a memory leak. We exposed a serious memory leak by
frequently creating and killing multiple qemu processes for
state-encrypted guests.

To solve this issue, unmap the GHCB pages if they are still mapped when
the guest is destroyed.
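
The lifetime rule above can be modelled with a small user-space sketch. All
names are hypothetical (`struct ghcb_vcpu`, `map_ghcb()`, `unmap_ghcb()`,
`destroy_vcpu()`, `live_mappings`); the counter only tracks how many
mappings are outstanding, so a non-zero value after teardown would
correspond to the leak.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced vCPU: only tracks whether its GHCB is mapped. */
struct ghcb_vcpu {
    bool ghcb_mapped;
};

static int live_mappings;   /* outstanding GHCB mappings in the model */

/* Models mapping the GHCB when a VMGEXIT is handled. */
static void map_ghcb(struct ghcb_vcpu *v)
{
    assert(v);
    if (!v->ghcb_mapped) {
        v->ghcb_mapped = true;
        live_mappings++;
    }
}

/* Models the unmap normally done before re-entering the guest. */
static void unmap_ghcb(struct ghcb_vcpu *v)
{
    assert(v);
    if (!v->ghcb_mapped)
        return;
    v->ghcb_mapped = false;
    live_mappings--;
}

/*
 * Teardown path: the VMM may be killed between the VMGEXIT and the
 * next guest entry, so unmap here if the GHCB is still mapped.
 */
static void destroy_vcpu(struct ghcb_vcpu *v)
{
    unmap_ghcb(v);
}
```

Calling the unmap from the destroy path is idempotent in this model, so it
is harmless when the GHCB was already unmapped before guest entry.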

Fixes: ce7ea0c ("KVM: SVM: Move GHCB unmapping to fix RCU warning")
Fixes: 291bd20 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Signed-off-by: hanliyang <hanliyang@hygon.cn>
@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mingcongbai for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@deepin-ci-robot

Hi @wojiaohanliyang. Thanks for your PR.

I'm waiting for a deepin-community member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
