Inline TLS field access for linux/osx x64/arm64 #87082

Merged Jul 6, 2023 (88 commits; changes shown from 82 commits)

Commits
9387a59
wip
kunalspathak Jun 1, 2023
4b27a2d
add __tls_get_addr() code in jitinterface
kunalspathak Jun 2, 2023
909b8e6
working model
kunalspathak Jun 2, 2023
773eb84
linux rely on __tls_get_addr() value
kunalspathak Jun 2, 2023
05aaa68
Add fields for both max/threadSTaticBlocks, have separate for GC/non-gc
kunalspathak Jun 2, 2023
03d9c2c
code cleanup
kunalspathak Jun 2, 2023
a98c2cf
code cleanup
kunalspathak Jun 2, 2023
04b3cbc
add comments
kunalspathak Jun 2, 2023
46d8fc3
jit format
kunalspathak Jun 2, 2023
f44d745
update guid
kunalspathak Jun 3, 2023
67d33a9
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 3, 2023
7fb2f16
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 5, 2023
fcbebaa
review feedback
kunalspathak Jun 6, 2023
7364faa
fix the offset
kunalspathak Jun 6, 2023
f257987
arm64: wip
kunalspathak Jun 7, 2023
f85614b
linux arm64 model
kunalspathak Jun 7, 2023
5bdf881
arm64: offsetOfThreadStaticBlock adjustment
kunalspathak Jun 8, 2023
b6d2ef0
Add mrs and tpid0 register
kunalspathak Jun 8, 2023
75ec05a
arm64: use the new mrs/tpidr0
kunalspathak Jun 9, 2023
fd200b5
fix arm64 build and offset calculation:
kunalspathak Jun 9, 2023
4523864
arm64: working
kunalspathak Jun 9, 2023
4bef20a
arm64: move to struct model
kunalspathak Jun 9, 2023
1f437f6
arm64: fixed the struct model
kunalspathak Jun 9, 2023
b5394d7
x64: move to struct model
kunalspathak Jun 9, 2023
dce8d91
code refactoring
kunalspathak Jun 10, 2023
e96530a
#define for field access
kunalspathak Jun 10, 2023
c4db025
change mrs -> mrs_tpid0
kunalspathak Jun 10, 2023
ddc931f
fix a bug
kunalspathak Jun 10, 2023
a716411
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 10, 2023
1e14591
remove unwanted method
kunalspathak Jun 10, 2023
1632086
another fix
kunalspathak Jun 10, 2023
529c7f7
Add entries in CorInfoType.cs
kunalspathak Jun 10, 2023
90e091d
Update the #ifdef
kunalspathak Jun 10, 2023
694c9cc
fix the windows scenario:
kunalspathak Jun 12, 2023
e6044a8
review feedback
kunalspathak Jun 12, 2023
ae76829
fix the data-type
kunalspathak Jun 12, 2023
76db418
add osx-arm64 support
kunalspathak Jun 14, 2023
63ae9e8
fix osx-arm64 issues
kunalspathak Jun 16, 2023
e376109
fix build error
kunalspathak Jun 12, 2023
e74fcd2
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 16, 2023
0d255d2
fix build error after merge
kunalspathak Jun 16, 2023
ac982cb
add osx/x64 support
kunalspathak Jun 16, 2023
130bc14
fix errors
kunalspathak Jun 16, 2023
eeb1a7a
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 16, 2023
d3cdf77
fix the macos/x64
kunalspathak Jun 19, 2023
ec40932
disable for alpine linux
kunalspathak Jun 21, 2023
368e8c5
Disable for R2R
kunalspathak Jun 21, 2023
c35938c
review feedback
kunalspathak Jun 22, 2023
8a222d2
fix r2r check
kunalspathak Jun 22, 2023
5235d37
move windows to struct model
kunalspathak Jun 22, 2023
a273623
review feedback
kunalspathak Jun 22, 2023
6ef9e8d
fix the register clobbering in release bits
kunalspathak Jun 23, 2023
6d31ec6
Move the linux/x64 logic to .S file
kunalspathak Jun 26, 2023
ce44ced
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 26, 2023
9292e2a
Use TargetOS::IsMacOS
kunalspathak Jun 26, 2023
de5ada2
disable optimization for single file
kunalspathak Jun 27, 2023
3fcec56
working for linux/x64
kunalspathak Jun 27, 2023
c39cf0e
fix some errors for osx/x64
kunalspathak Jun 27, 2023
9906f4e
fix for osx x64/arm64
kunalspathak Jun 27, 2023
cf3b8c0
fix for arm64 linux/osx
kunalspathak Jun 27, 2023
ed0c6a7
try disable for musl/arm64
kunalspathak Jun 27, 2023
5801bbf
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jun 27, 2023
7dca821
rename variable
kunalspathak Jun 27, 2023
99dec18
Rename variable to tlsIndexObject
kunalspathak Jun 28, 2023
bdca9fe
Make offset variables as uint32_t
kunalspathak Jun 28, 2023
f1e5459
change the type of indexObj/ftnAddr to void*
kunalspathak Jun 28, 2023
cb409f5
replace ifdef(msc_ver) with ifdef(windows)
kunalspathak Jun 28, 2023
e3b7dc6
Revert to JIT_TO_EE_TRANSITION_LEAF
kunalspathak Jun 28, 2023
ab284c7
Move code to asmHelpers.S and rename method
kunalspathak Jun 28, 2023
bcc0a55
rename the methods per the platform
kunalspathak Jun 28, 2023
f475b6d
fix osx builds
kunalspathak Jun 28, 2023
4e0c211
fix build break
kunalspathak Jun 28, 2023
47e087d
fix some errors around osx
kunalspathak Jun 29, 2023
04420db
rename some more methods
kunalspathak Jun 29, 2023
94f4d43
review feedback
kunalspathak Jun 29, 2023
71d27eb
review feedback
kunalspathak Jun 29, 2023
09ec151
delete the comment
kunalspathak Jun 29, 2023
e28401c
make methods static
kunalspathak Jun 29, 2023
b89a55f
remove macos/x64 check
kunalspathak Jun 30, 2023
c6ae3e4
fix the check for linux/x64
kunalspathak Jun 30, 2023
d237f0b
detect early for single-file linux/x64
kunalspathak Jun 30, 2023
71b99dd
move the assert
kunalspathak Jun 30, 2023
de5b3b7
review feedback
kunalspathak Jul 3, 2023
c7864df
misc fixup
kunalspathak Jul 3, 2023
2429d75
use fgMorphArgs()
kunalspathak Jul 3, 2023
8fa1e6f
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jul 3, 2023
d645f2e
remove commented code
kunalspathak Jul 5, 2023
d42fdce
Merge remote-tracking branch 'origin/main' into tls_linux
kunalspathak Jul 6, 2023
7 changes: 5 additions & 2 deletions src/coreclr/inc/corinfo.h
@@ -1726,8 +1726,11 @@ struct CORINFO_FIELD_INFO

struct CORINFO_THREAD_STATIC_BLOCKS_INFO
{
CORINFO_CONST_LOOKUP tlsIndex;
uint32_t offsetOfThreadLocalStoragePointer;
CORINFO_CONST_LOOKUP tlsIndex; // windows specific
void* tlsGetAddrFtnPtr; // linux/x64 specific - address of __tls_get_addr() function
void* tlsIndexObject; // linux/x64 specific - address of tls_index object
void* threadVarsSection; // osx x64/arm64 specific - address of __thread_vars section of `t_ThreadStatics`
uint32_t offsetOfThreadLocalStoragePointer; // windows specific
uint32_t offsetOfMaxThreadStaticBlocks;
uint32_t offsetOfThreadStaticBlocks;
uint32_t offsetOfGCDataPointer;
10 changes: 5 additions & 5 deletions src/coreclr/inc/jiteeversionguid.h
@@ -43,11 +43,11 @@ typedef const GUID *LPCGUID;
#define GUID_DEFINED
#endif // !GUID_DEFINED

constexpr GUID JITEEVersionIdentifier = { /* 878d63a7-ffc9-421f-81f7-db4729f0ed5c */
0x878d63a7,
0xffc9,
0x421f,
{0x81, 0xf7, 0xdb, 0x47, 0x29, 0xf0, 0xed, 0x5c}
constexpr GUID JITEEVersionIdentifier = { /* 02e334af-4e6e-4a68-9feb-308d3d2661bc */
0x2e334af,
0x4e6e,
0x4a68,
{0x9f, 0xeb, 0x30, 0x8d, 0x3d, 0x26, 0x61, 0xbc}
};

//////////////////////////////////////////////////////////////////////////////////////////////////////////
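The new initializer writes the first field as `0x2e334af`, while the comment spells the GUID with a leading zero (`02e334af-...`). The two agree once the field is printed at its full 32-bit width. A minimal sketch of that check, assuming a hypothetical `SketchGuid` struct and `formatGuid` helper that are not part of the runtime sources:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the GUID layout used by JITEEVersionIdentifier.
struct SketchGuid
{
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
};

// Hypothetical helper: formats the fields in the 8-4-4-4-12 lowercase hex form
// used in the comment next to the initializer.
static std::string formatGuid(const SketchGuid& g)
{
    char buf[37]; // 36 characters plus the terminating NUL
    int  n = std::snprintf(buf, sizeof(buf), "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
                           static_cast<unsigned>(g.data1), static_cast<unsigned>(g.data2),
                           static_cast<unsigned>(g.data3), static_cast<unsigned>(g.data4[0]),
                           static_cast<unsigned>(g.data4[1]), static_cast<unsigned>(g.data4[2]),
                           static_cast<unsigned>(g.data4[3]), static_cast<unsigned>(g.data4[4]),
                           static_cast<unsigned>(g.data4[5]), static_cast<unsigned>(g.data4[6]),
                           static_cast<unsigned>(g.data4[7]));
    assert(n == 36);
    (void)n;
    return std::string(buf);
}
```

Feeding the new field values through `formatGuid` should reproduce the string in the comment, which is a cheap way to catch a GUID whose comment and initializer have drifted apart.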
7 changes: 7 additions & 0 deletions src/coreclr/jit/codegenarm64.cpp
@@ -2944,6 +2944,13 @@ void CodeGen::genCodeForStoreLclVar(GenTreeLclVar* lclNode)
inst_Mov_Extend(targetType, /* srcInReg */ true, targetReg, dataReg, /* canSkip */ true,
emitActualTypeSize(targetType));
}
else if (TargetOS::IsUnix && data->IsIconHandle(GTF_ICON_TLS_HDL))
{
assert(data->AsIntCon()->IconValue() == 0);
emitAttr attr = emitActualTypeSize(targetType);
// On non-Windows targets, load the thread pointer from the system register.
emit->emitIns_R(INS_mrs_tpid0, attr, targetReg);
}
else
{
inst_Mov(targetType, targetReg, dataReg, /* canSkip */ true);
7 changes: 2 additions & 5 deletions src/coreclr/jit/compiler.cpp
@@ -5026,11 +5026,8 @@ void Compiler::compCompile(void** methodCodePtr, uint32_t* methodCodeSize, JitFl
// Partially inline static initializations
DoPhase(this, PHASE_EXPAND_STATIC_INIT, &Compiler::fgExpandStaticInit);

if (TargetOS::IsWindows)
{
// Currently this is only applicable for Windows
DoPhase(this, PHASE_EXPAND_TLS, &Compiler::fgExpandThreadLocalAccess);
}
// Expand thread local access
DoPhase(this, PHASE_EXPAND_TLS, &Compiler::fgExpandThreadLocalAccess);

// Insert GC Polls
DoPhase(this, PHASE_INSERT_GC_POLLS, &Compiler::fgInsertGCPolls);
4 changes: 1 addition & 3 deletions src/coreclr/jit/emit.cpp
@@ -10205,9 +10205,7 @@ void emitter::emitRecordCallSite(ULONG instrOffset, /* IN */

if (callSig == nullptr)
{
assert(methodHandle != nullptr);

if (Compiler::eeGetHelperNum(methodHandle) == CORINFO_HELP_UNDEF)
if ((methodHandle != nullptr) && (Compiler::eeGetHelperNum(methodHandle) == CORINFO_HELP_UNDEF))
{
emitComp->eeGetMethodSig(methodHandle, &sigInfo);
callSig = &sigInfo;
22 changes: 17 additions & 5 deletions src/coreclr/jit/emitarm64.cpp
@@ -937,7 +937,7 @@ void emitter::emitInsSanityCheck(instrDesc* id)
case IF_SI_0B: // SI_0B ................ ....bbbb........ imm4 - barrier
break;

case IF_SR_1A: // SR_1A ................ ...........ttttt Rt (dc zva)
case IF_SR_1A: // SR_1A ................ ...........ttttt Rt (dc zva, mrs)
datasize = id->idOpSize();
assert(isGeneralRegister(id->idReg1()));
assert(datasize == EA_8BYTE);
@@ -3740,7 +3740,11 @@ void emitter::emitIns_R(instruction ins, emitAttr attr, regNumber reg)
id->idReg1(reg);
fmt = IF_SR_1A;
break;

case INS_mrs_tpid0:
id = emitNewInstrSmall(attr);
id->idReg1(reg);
fmt = IF_SR_1A;
break;
default:
unreached();
}
@@ -11793,7 +11797,7 @@ size_t emitter::emitOutputInstr(insGroup* ig, instrDesc* id, BYTE** dp)
dst += emitOutput_Instr(dst, code);
break;

case IF_SR_1A: // SR_1A ................ ...........ttttt Rt (dc zva)
case IF_SR_1A: // SR_1A ................ ...........ttttt Rt (dc zva, mrs)
assert(insOptsNone(id->idInsOpt()));
code = emitInsCode(ins, fmt);
code |= insEncodeReg_Rt(id->idReg1()); // ttttt
@@ -13921,8 +13925,16 @@ void emitter::emitDispInsHelp(
emitDispBarrier((insBarrier)emitGetInsSC(id));
break;

case IF_SR_1A: // SR_1A ................ ...........ttttt Rt (dc zva)
emitDispReg(id->idReg1(), size, false);
case IF_SR_1A: // SR_1A ................ ...........ttttt Rt (dc zva, mrs)
if (ins == INS_mrs_tpid0)
{
emitDispReg(id->idReg1(), size, true);
printf("tpidr_el0");
}
else
{
emitDispReg(id->idReg1(), size, false);
}
break;

default:
2 changes: 1 addition & 1 deletion src/coreclr/jit/emitfmtsarm64.h
@@ -227,7 +227,7 @@ IF_DEF(SN_0A, IS_NONE, NONE) // SN_0A ................ ................
IF_DEF(SI_0A, IS_NONE, NONE) // SI_0A ...........iiiii iiiiiiiiiii..... imm16
IF_DEF(SI_0B, IS_NONE, NONE) // SI_0B ................ ....bbbb........ imm4 - barrier

IF_DEF(SR_1A, IS_NONE, NONE) // SR_1A ................ ...........ttttt Rt (dc zva)
IF_DEF(SR_1A, IS_NONE, NONE) // SR_1A ................ ...........ttttt Rt (dc zva, mrs)

IF_DEF(INVALID, IS_NONE, NONE) //

176 changes: 134 additions & 42 deletions src/coreclr/jit/helperexpansion.cpp
@@ -478,36 +478,49 @@ bool Compiler::fgExpandThreadLocalAccessForCall(BasicBlock** pBlock, Statement*
return false;
}

assert(!opts.IsReadyToRun());

if (TargetOS::IsUnix)
{
#if defined(TARGET_ARM) || !defined(TARGET_64BIT)
// On Arm, Thread execution blocks are accessed using co-processor registers and instructions such
// as MRC and MCR are used to access them. We do not support them and so should never optimize the
// field access using TLS.
assert(!"Unsupported scenario of optimizing TLS access on Linux Arm32/x86");
#endif
}
else
{
#ifdef TARGET_ARM
// On Arm, Thread execution blocks are accessed using co-processor registers and instructions such
// as MRC and MCR are used to access them. We do not support them and so should never optimize the
// field access using TLS.
assert(!"Unsupported scenario of optimizing TLS access on Arm32");
// On Arm, Thread execution blocks are accessed using co-processor registers and instructions such
// as MRC and MCR are used to access them. We do not support them and so should never optimize the
// field access using TLS.
assert(!"Unsupported scenario of optimizing TLS access on Windows Arm32");
#endif
}

JITDUMP("Expanding thread static local access for [%06d] in " FMT_BB ":\n", dspTreeID(call), block->bbNum);
DISPTREE(call);
JITDUMP("\n");

bool isGCThreadStatic =
eeGetHelperNum(call->gtCallMethHnd) == CORINFO_HELP_GETSHARED_GCTHREADSTATIC_BASE_NOCTOR_OPTIMIZED;

CORINFO_THREAD_STATIC_BLOCKS_INFO threadStaticBlocksInfo;
info.compCompHnd->getThreadLocalStaticBlocksInfo(&threadStaticBlocksInfo, isGCThreadStatic);
memset(&threadStaticBlocksInfo, 0, sizeof(CORINFO_THREAD_STATIC_BLOCKS_INFO));

uint32_t offsetOfMaxThreadStaticBlocksVal = 0;
uint32_t offsetOfThreadStaticBlocksVal = 0;
info.compCompHnd->getThreadLocalStaticBlocksInfo(&threadStaticBlocksInfo, isGCThreadStatic);

JITDUMP("getThreadLocalStaticBlocksInfo (%s):\n", isGCThreadStatic ? "GC" : "Non-GC");
offsetOfMaxThreadStaticBlocksVal = threadStaticBlocksInfo.offsetOfMaxThreadStaticBlocks;
offsetOfThreadStaticBlocksVal = threadStaticBlocksInfo.offsetOfThreadStaticBlocks;

JITDUMP("tlsIndex= %u\n", (ssize_t)threadStaticBlocksInfo.tlsIndex.addr);
JITDUMP("offsetOfThreadLocalStoragePointer= %u\n", threadStaticBlocksInfo.offsetOfThreadLocalStoragePointer);
JITDUMP("offsetOfMaxThreadStaticBlocks= %u\n", offsetOfMaxThreadStaticBlocksVal);
JITDUMP("offsetOfThreadStaticBlocks= %u\n", offsetOfThreadStaticBlocksVal);
JITDUMP("tlsIndex= %u\n", (ssize_t)threadStaticBlocksInfo.tlsIndex.addr);
Review comment (Member): Dump threadStaticBlocksInfo.tlsIndex.accessType also?

Reply (Member, Author): Do you think it will be useful? I don't see we print accessType elsewhere.

Reply (Member): I don't know. Structurally, it seems odd to omit it. Up to you.

JITDUMP("tlsGetAddrFtnPtr= %u\n", threadStaticBlocksInfo.tlsGetAddrFtnPtr);
JITDUMP("tlsIndexObject= %u\n", (size_t)threadStaticBlocksInfo.tlsIndexObject);
JITDUMP("threadVarsSection= %u\n", (size_t)threadStaticBlocksInfo.threadVarsSection);
JITDUMP("offsetOfMaxThreadStaticBlocks= %u\n", threadStaticBlocksInfo.offsetOfMaxThreadStaticBlocks);
JITDUMP("offsetOfThreadStaticBlocks= %u\n", threadStaticBlocksInfo.offsetOfThreadStaticBlocks);
JITDUMP("offsetOfGCDataPointer= %u\n", threadStaticBlocksInfo.offsetOfGCDataPointer);

assert(threadStaticBlocksInfo.tlsIndex.accessType == IAT_VALUE);
assert((eeGetHelperNum(call->gtCallMethHnd) == CORINFO_HELP_GETSHARED_NONGCTHREADSTATIC_BASE_NOCTOR_OPTIMIZED) ||
(eeGetHelperNum(call->gtCallMethHnd) == CORINFO_HELP_GETSHARED_GCTHREADSTATIC_BASE_NOCTOR_OPTIMIZED));

@@ -546,56 +559,135 @@ bool Compiler::fgExpandThreadLocalAccessForCall(BasicBlock** pBlock, Statement*
gtUpdateStmtSideEffects(stmt);

GenTree* typeThreadStaticBlockIndexValue = call->gtArgs.GetArgByIndex(0)->GetNode();
GenTree* tlsValue = nullptr;
unsigned tlsLclNum = lvaGrabTemp(true DEBUGARG("TLS access"));
lvaTable[tlsLclNum].lvType = TYP_I_IMPL;
GenTree* maxThreadStaticBlocksValue = nullptr;
GenTree* threadStaticBlocksValue = nullptr;
GenTree* tlsValueDef = nullptr;

if (TargetOS::IsWindows)
{
size_t tlsIndexValue = (size_t)threadStaticBlocksInfo.tlsIndex.addr;
GenTree* dllRef = nullptr;

void** pIdAddr = nullptr;
if (tlsIndexValue != 0)
{
dllRef = gtNewIconHandleNode(tlsIndexValue * TARGET_POINTER_SIZE, GTF_ICON_TLS_HDL);
}

size_t tlsIndexValue = (size_t)threadStaticBlocksInfo.tlsIndex.addr;
GenTree* dllRef = nullptr;
// Mark this ICON as a TLS_HDL, codegen will use FS:[cns] or GS:[cns]
tlsValue = gtNewIconHandleNode(threadStaticBlocksInfo.offsetOfThreadLocalStoragePointer, GTF_ICON_TLS_HDL);
tlsValue = gtNewIndir(TYP_I_IMPL, tlsValue, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);

if (tlsIndexValue != 0)
{
dllRef = gtNewIconHandleNode(tlsIndexValue * TARGET_POINTER_SIZE, GTF_ICON_TLS_HDL);
if (dllRef != nullptr)
{
// Add the dllRef to produce thread local storage reference for coreclr
tlsValue = gtNewOperNode(GT_ADD, TYP_I_IMPL, tlsValue, dllRef);
}

// Base of coreclr's thread local storage
tlsValue = gtNewIndir(TYP_I_IMPL, tlsValue, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);
}
else if (TargetOS::IsMacOS)
{
// For OSX x64/arm64, we need the address of the relevant __thread_vars section of
// the thread local variable `t_ThreadStatics`. The address of `tlv_get_address` is stored
// in this entry; we dereference the entry and invoke the function, passing the
// __thread_vars address held in `threadVarsSection`.
//
// Code sequence to access thread local variable on osx/x64:
//
// mov rdi, threadVarsSection
// call [rdi]
//
// Code sequence to access thread local variable on osx/arm64:
//
// mov x0, threadVarsSection
// mov x1, [x0]
// blr x1
//
size_t threadVarsSectionVal = (size_t)threadStaticBlocksInfo.threadVarsSection;
GenTree* tls_get_addr_val = gtNewIconHandleNode(threadVarsSectionVal, GTF_ICON_FTN_ADDR);

tls_get_addr_val = gtNewIndir(TYP_I_IMPL, tls_get_addr_val, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);

tlsValue = gtNewIndCallNode(tls_get_addr_val, TYP_I_IMPL);
GenTreeCall* tlsRefCall = tlsValue->AsCall();

// Mark this ICON as a TLS_HDL, codegen will use FS:[cns] or GS:[cns]
GenTree* tlsRef = gtNewIconHandleNode(threadStaticBlocksInfo.offsetOfThreadLocalStoragePointer, GTF_ICON_TLS_HDL);
// This is a call which takes an argument.
// Populate and set the ABI appropriately.
assert(threadVarsSectionVal != 0);
GenTree* tlsArg = gtNewIconNode(threadVarsSectionVal, TYP_I_IMPL);
tlsRefCall->gtArgs.InsertAfterThisOrFirst(this, NewCallArg::Primitive(tlsArg));

tlsRef = gtNewIndir(TYP_I_IMPL, tlsRef, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);
CallArg* arg0 = tlsRefCall->gtArgs.GetArgByIndex(0);
arg0->AbiInfo = CallArgABIInformation();
arg0->AbiInfo.SetRegNum(0, REG_ARG_0);

if (dllRef != nullptr)
tlsRefCall->gtFlags |= GTF_EXCEPT | (tls_get_addr_val->gtFlags & GTF_GLOB_EFFECT);
}
else if (TargetOS::IsUnix)
{
// Add the dllRef to produce thread local storage reference for coreclr
tlsRef = gtNewOperNode(GT_ADD, TYP_I_IMPL, tlsRef, dllRef);
#if defined(TARGET_AMD64)
// Code sequence to access thread local variable on linux/x64:
//
// mov rdi, 0x7FE5C418CD28 ; tlsIndexObject
// mov rax, 0x7FE5C47AFDB0 ; __tls_get_addr
// call rax
//
GenTree* tls_get_addr_val =
gtNewIconHandleNode((size_t)threadStaticBlocksInfo.tlsGetAddrFtnPtr, GTF_ICON_FTN_ADDR);
tlsValue = gtNewIndCallNode(tls_get_addr_val, TYP_I_IMPL);
GenTreeCall* tlsRefCall = tlsValue->AsCall();

// This is an indirect call which takes an argument.
// Populate and set the ABI appropriately.
assert(threadStaticBlocksInfo.tlsIndexObject != 0);
GenTree* tlsArg = gtNewIconNode((size_t)threadStaticBlocksInfo.tlsIndexObject, TYP_I_IMPL);
tlsRefCall->gtArgs.InsertAfterThisOrFirst(this, NewCallArg::Primitive(tlsArg));

CallArg* arg0 = tlsRefCall->gtArgs.GetArgByIndex(0);
arg0->AbiInfo = CallArgABIInformation();
arg0->AbiInfo.SetRegNum(0, REG_ARG_0);

tlsRefCall->gtFlags |= GTF_EXCEPT | (tls_get_addr_val->gtFlags & GTF_GLOB_EFFECT);
#ifdef UNIX_X86_ABI
tlsRefCall->gtFlags &= ~GTF_CALL_POP_ARGS;
#endif // UNIX_X86_ABI
#elif defined(TARGET_ARM64)
// Code sequence to access thread local variable on linux/arm64:
//
// mrs xt, tpidr_el0
// mov xd, [xt+cns]
tlsValue = gtNewIconHandleNode(0, GTF_ICON_TLS_HDL);
#else
assert(!"Unsupported scenario of optimizing TLS access on Linux Arm32/x86");
#endif
}

// Base of coreclr's thread local storage
GenTree* tlsValue = gtNewIndir(TYP_I_IMPL, tlsRef, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);

// Cache the tls value
unsigned tlsLclNum = lvaGrabTemp(true DEBUGARG("TLS access"));
lvaTable[tlsLclNum].lvType = TYP_I_IMPL;
GenTree* tlsValueDef = gtNewStoreLclVarNode(tlsLclNum, tlsValue);
GenTree* tlsLclValueUse = gtNewLclVarNode(tlsLclNum);
tlsValueDef = gtNewStoreLclVarNode(tlsLclNum, tlsValue);
GenTree* tlsLclValueUse = gtNewLclVarNode(tlsLclNum);

size_t offsetOfThreadStaticBlocksVal = threadStaticBlocksInfo.offsetOfThreadStaticBlocks;
size_t offsetOfMaxThreadStaticBlocksVal = threadStaticBlocksInfo.offsetOfMaxThreadStaticBlocks;

// Create tree for "maxThreadStaticBlocks = tls[offsetOfMaxThreadStaticBlocks]"
GenTree* offsetOfMaxThreadStaticBlocks = gtNewIconNode(offsetOfMaxThreadStaticBlocksVal, TYP_I_IMPL);
GenTree* maxThreadStaticBlocksRef =
gtNewOperNode(GT_ADD, TYP_I_IMPL, gtCloneExpr(tlsLclValueUse), offsetOfMaxThreadStaticBlocks);
GenTree* maxThreadStaticBlocksValue =
gtNewIndir(TYP_INT, maxThreadStaticBlocksRef, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);
maxThreadStaticBlocksValue = gtNewIndir(TYP_INT, maxThreadStaticBlocksRef, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);

GenTree* threadStaticBlocksRef = gtNewOperNode(GT_ADD, TYP_I_IMPL, gtCloneExpr(tlsLclValueUse),
gtNewIconNode(offsetOfThreadStaticBlocksVal, TYP_I_IMPL));
threadStaticBlocksValue = gtNewIndir(TYP_I_IMPL, threadStaticBlocksRef, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);

// Create tree for "if (maxThreadStaticBlocks < typeIndex)"
GenTree* maxThreadStaticBlocksCond =
gtNewOperNode(GT_LT, TYP_INT, maxThreadStaticBlocksValue, gtCloneExpr(typeThreadStaticBlockIndexValue));
maxThreadStaticBlocksCond = gtNewOperNode(GT_JTRUE, TYP_VOID, maxThreadStaticBlocksCond);

// Create tree for "threadStaticBlockBase = tls[offsetOfThreadStaticBlocks]"
GenTree* offsetOfThreadStaticBlocks = gtNewIconNode(offsetOfThreadStaticBlocksVal, TYP_I_IMPL);
GenTree* threadStaticBlocksRef =
gtNewOperNode(GT_ADD, TYP_I_IMPL, gtCloneExpr(tlsLclValueUse), offsetOfThreadStaticBlocks);
GenTree* threadStaticBlocksValue =
gtNewIndir(TYP_I_IMPL, threadStaticBlocksRef, GTF_IND_NONFAULTING | GTF_IND_INVARIANT);

// Create tree to "threadStaticBlockValue = threadStaticBlockBase[typeIndex]"
typeThreadStaticBlockIndexValue = gtNewOperNode(GT_MUL, TYP_INT, gtCloneExpr(typeThreadStaticBlockIndexValue),
gtNewIconNode(TARGET_POINTER_SIZE, TYP_INT));
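The trees built in this hunk amount to a bounds check followed by an indexed load off the cached TLS base. A minimal C++ sketch of that fast path, assuming a hypothetical `FakeThreadStatics` layout and a hypothetical `slowPathHelper` standing in for the helper call. The null check on the loaded block is an assumption, since the remainder of the expansion is collapsed in this view:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical per-thread data reached through the TLS base pointer.
struct FakeThreadStatics
{
    int32_t maxThreadStaticBlocks; // read as TYP_INT in the expansion
    void**  threadStaticBlocks;    // array indexed by typeIndex
};

static int g_slowPathSentinel;
static int g_slowPathCalls = 0;

// Hypothetical slow path standing in for the original helper call.
static void* slowPathHelper(int32_t /*typeIndex*/)
{
    ++g_slowPathCalls;
    return &g_slowPathSentinel;
}

// tls:       cached TLS base (the value stored into the "TLS access" temp)
// offMax:    stands in for offsetOfMaxThreadStaticBlocks
// offBlocks: stands in for offsetOfThreadStaticBlocks
static void* getThreadStaticBase(const uint8_t* tls, size_t offMax, size_t offBlocks, int32_t typeIndex)
{
    assert(tls != nullptr);

    // maxThreadStaticBlocks = tls[offsetOfMaxThreadStaticBlocks]
    int32_t maxBlocks;
    std::memcpy(&maxBlocks, tls + offMax, sizeof(maxBlocks));

    // if (maxThreadStaticBlocks < typeIndex) take the slow path
    if (maxBlocks < typeIndex)
    {
        return slowPathHelper(typeIndex);
    }

    // threadStaticBlockBase = tls[offsetOfThreadStaticBlocks]
    void** blocks;
    std::memcpy(&blocks, tls + offBlocks, sizeof(blocks));

    // threadStaticBlockValue = threadStaticBlockBase[typeIndex]
    void* block = blocks[typeIndex];

    // Assumption: an unallocated (null) block also falls back to the helper.
    return (block != nullptr) ? block : slowPathHelper(typeIndex);
}
```

Only the first step, obtaining `tls`, differs per platform in the diff (segment-relative load on Windows, `__tls_get_addr` on linux/x64, `tlv_get_address` on macOS, `tpidr_el0` on linux/arm64); the checks after it are shared.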
3 changes: 3 additions & 0 deletions src/coreclr/jit/instrsarm64.h
@@ -1595,6 +1595,9 @@ INST1(isb, "isb", 0, IF_SI_0B, 0xD50330DF)
INST1(dczva, "dczva", 0, IF_SR_1A, 0xD50B7420)
// dc zva,Rt SR_1A 1101010100001011 01110100001ttttt D50B 7420 Rt

INST1(mrs_tpid0, "mrs", 0, IF_SR_1A, 0xD53BD040)
// mrs Rt,tpidr_el0 SR_1A 1101010100111011 11010000010ttttt D53B D040 Rt, tpidr_el0

INST1(umov, "umov", 0, IF_DV_2B, 0x0E003C00)
// umov Rd,Vn[] DV_2B 0Q001110000iiiii 001111nnnnnddddd 0E00 3C00 Rd,Vn[]
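The `mrs_tpid0` entry uses the SR_1A format, where the only variable field is the 5-bit `Rt` in bits [4:0]. A minimal sketch of how the final instruction word could be formed from the base opcode in this hunk; `encodeMrsTpidrEl0` is a hypothetical helper for illustration, not the emitter's actual function:

```cpp
#include <cassert>
#include <cstdint>

// Base opcode for `mrs Rt, tpidr_el0`, taken from the INST1 entry above.
constexpr uint32_t kMrsTpidrEl0Base = 0xD53BD040u;

// Hypothetical helper: ORs the register number into the ttttt field (bits [4:0]).
static uint32_t encodeMrsTpidrEl0(unsigned rt)
{
    assert(rt < 32); // Rt is a 5-bit field
    return kMrsTpidrEl0Base | (rt & 0x1Fu);
}
```

For example, `encodeMrsTpidrEl0(0)` yields the base word unchanged, which corresponds to `mrs x0, tpidr_el0`.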
