Abstract: As of now, the .NET 10 CLR has two notable updates: local escape analysis, and independent EH tables for inlined functions.
Preface
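This article looks at .NET 10 escape analysis at the IR level, continuing from the previous article, ".NET 10 escape analysis: struct local fields". Alongside escape analysis, it also examines another important piece of refactoring in .NET 10: independent EH tables for inlined methods, again down to the IR level.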
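    using System;

    class Program
    {
        // Assumed value: the original loop bound was lost when the page was extracted.
        const int WarmupIterations = 100;

        static void Main()
        {
            for (int i = 0; i < WarmupIterations; i++)
            {
                ABC(); // warm up
            }
            Console.WriteLine("Done. Press Enter to exit.");
            Console.ReadLine();
        }

        struct GCStruct
        {
            public int[] arr;
        }

        // Inlining is disabled here to make the method easier to analyse.
        [System.Runtime.CompilerServices.MethodImpl(System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
        static int ABC()
        {
            int[] x = new int[10];
            GCStruct y = new GCStruct { arr = x };
            return y.arr[0];
        }
    }

In .NET 9, ABC allocates the array on the heap: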
    ; Program.ABC
           sub       rsp,28
           mov       rcx,offset MT_System.Int32
           mov       edx,0A
           call      CORINFO_HELP_NEWARR_1_VC
           mov       eax,[rax+10]
           add       rsp,28
           ret

In .NET 10 Preview 4, the array is allocated on the stack:
    ; Program.ABC
           ; (prolog omitted for readability)
           mov       rax,offset MT_System.Int32
           mov       [rsp+28],rax
           lea       rax,[rsp+28]
           mov       dword ptr [rax+8],0A
           lea       rax,[rsp+28]
           mov       [rsp+60],rax
           cmp       dword ptr [rax+8],0
           jbe       short M00_L00
           mov       eax,[rax+10]
           add       rsp,68
           ret

To get this performance gain, .NET 10 runs an escape analysis pass in between. What does that pass actually do? Read on.
Broadly speaking, turning the CORINFO_HELP_NEWARR_1_VC call into a stack allocation requires manipulating the IR. When ABC is compiled at tier 0 (Tier0), the IR looks like this:
    ***** BB01 [0000]
    STMT00000 ( 0x000[E-] ... 0x007 )
    DACXG------            *  STORE_LCL_VAR ref    V03 tmp1
    --CXG------            \--*  CALL help ref    CORINFO_HELP_NEWARR_1_VC
    H           arg0          +--*  CNS_INT(h) long   0x7ffd35871130 class int
                arg1          \--*  CNS_INT   long   10

Note that this is only the STMT00000 fragment of ABC's basic block BB01. Showing the whole dump would be too verbose to read, so only part of it is reproduced here.
It describes allocating the array through the JIT helper CORINFO_HELP_NEWARR_1_VC, which takes two arguments: the array type and the array length.
After escape analysis, the IR becomes:

    ***** BB01 [0000]
    STMT00000 ( 0x000[E-] ... 0x007 )
    DACXG------            *  STORE_LCL_VAR long   V03 tmp1
    --CXG------            \--*  CALL help long   CORINFO_HELP_NEWARR_1_VC
    H           arg0          +--*  CNS_INT(h) long   0x7ffd35871130 class int
                arg1          +--*  CNS_INT   long   10
                &lcl arr      \--*  LCL_ADDR  long   V04 tmp2         [+0]

There is one extra argument, tmp2, and ref has become long: the result is no longer a GC-tracked reference but a plain native-sized integer value.
The final result now matches the escape example above.

    IN0007: 000000      sub      rsp, 56
    IN0008: 000004      vxorps   xmm4, xmm4, xmm4
    IN0009: 000008      vmovdqu  ymmword ptr [rsp], ymm4
    IN000a: 00000D      vmovdqa  xmmword ptr [rsp+0x20], xmm4
    IN000b: 000013      xor      eax, eax
    IN000c: 000015      mov      qword ptr [rsp+0x30], rax

    G_M37692_IG02:  ; offs=0x00001A, size=0x0020, bbWeight=1, PerfScore 5.25, gcrefRegs=0000 {}, byrefRegs=0000 {}, BB01 [0000], byref
    IN0001: 00001A      mov      rax, 0x7FFD35871130      ; int
    IN0002: 000024      mov      qword ptr [V04 rsp], rax
    IN0003: 000028      lea      rax, [V04 rsp]
    IN0004: 00002C      mov      dword ptr [rax+0x08], 10
    IN0005: 000033      lea      rax, [V04 rsp]
    IN0006: 000037      mov      eax, dword ptr [rax+0x10]

    G_M37692_IG03:  ; offs=0x00003A, size=0x0005, bbWeight=1, PerfScore 1.25, epilog, nogc, extend
    IN000d: 00003A      add      rsp, 56
    IN000e: 00003E      ret

A few points need explaining here.
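tmp2 is a stack-allocated local variable. The extra argument passes the address of the stack local V04 tmp2 to the JIT helper call, so that the JIT places the new array at that address instead of performing the original newarr heap allocation.
To place a struct, an array or an object on the stack, the JIT must know its exact field layout and memory size. So besides the extra tmp2 argument, the JIT also builds a stack structure (layout) that describes the array inside the compiler. Moving from the heap to the stack requires this new structure so that the GC and later analyses can use it.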
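The two points above can be illustrated with a small C++ sketch that initialises an array header inside caller-provided storage, mirroring what the disassembly does (type pointer at +0x00, length at +0x08, elements from +0x10). The names `ToyArrayHeader`, `initStackIntArray`, `loadLength` and `loadElement` are hypothetical and are not runtime APIs; the offsets are read off the listing above and assume x64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical mirror of the header seen in the listing (assumption: x64).
struct ToyArrayHeader {
    std::uint64_t methodTable; // the listing stores 0x7FFD35871130 here
    std::uint32_t length;      // the listing stores 10 here
    std::uint32_t padding;     // keeps the first element at +0x10
};
static_assert(sizeof(ToyArrayHeader) == 16, "elements are expected at +0x10");

// Build an int array inside caller-provided storage; nothing is heap-allocated.
inline void initStackIntArray(void* storage, std::size_t storageBytes,
                              std::uint64_t methodTable, std::uint32_t length) {
    assert(storageBytes >= sizeof(ToyArrayHeader) + length * sizeof(std::int32_t));
    std::memset(storage, 0, storageBytes); // plays the role of the prolog zeroing
    const ToyArrayHeader header{methodTable, length, 0};
    std::memcpy(storage, &header, sizeof header);
}

// Read the length field at +0x08.
inline std::uint32_t loadLength(const void* storage) {
    std::uint32_t length;
    std::memcpy(&length, static_cast<const unsigned char*>(storage) + 8, sizeof length);
    return length;
}

// Read element `index`, stored at +0x10 + 4 * index.
inline std::int32_t loadElement(const void* storage, std::uint32_t index) {
    std::int32_t value;
    std::memcpy(&value, static_cast<const unsigned char*>(storage) + 16 + index * sizeof value,
                sizeof value);
    return value;
}
```

With a 56-byte buffer, `initStackIntArray(buffer, 56, mt, 10)` leaves the length at +0x08 and a zeroed first element at +0x10, which is what `mov eax, dword ptr [rax+0x10]` reads in the listing.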
Let us look at both: first how tmp2 is built, then the stack layout.
tmp2 and layout
The stack local tmp2 replaces the heap allocation originally performed by the JIT helper. The code that does this is:
    unsigned int ObjectAllocator::MorphNewArrNodeIntoStackAlloc(GenTreeCall*         newArr,
                                                                CORINFO_CLASS_HANDLE clsHnd,
                                                                unsigned int         length,
                                                                unsigned int         blockSize,
                                                                BasicBlock*          block,
                                                                Statement*           stmt)
    {
        // (omitted for readability)
        comp->lvaSetStruct(lclNum, comp->typGetArrayLayout(clsHnd, length), /* unsafe */ false);

        // (middle part omitted for readability)
        GenTree* const stackLocalAddr = comp->gtNewLclAddrNode(lclNum, 0);
        newArr->gtArgs.PushBack(comp, NewCallArg::Primitive(stackLocalAddr).WellKnown(WellKnownArg::StackArrayLocal));
        newArr->gtCallMoreFlags |= GTF_CALL_M_STACK_ARRAY;
        newArr->ChangeType(TYP_I_IMPL);
        newArr->gtReturnType = TYP_I_IMPL;
        comp->setMethodHasStackAllocatedArray();

        return lclNum;
    }

typGetArrayLayout obtains the stack layout of the array.
Internally it calls ClassLayoutBuilder::BuildArray to build the JIT's internal structure, which later GC reporting and PGO analysis rely on.
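Once the layout of the stack local is available, lvaSetStruct uses it to establish the following properties of the local:
Total size (for example, 16 bytes?)
Whether it contains GC pointers (for example, is offset 8 a ref?)
Alignment (8 bytes? 16 bytes?)
Whether it can be passed by value, and whether it can be expanded
Whether it is an HFA (homogeneous floating-point aggregate, which matters mainly for the ARM calling conventions)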
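The properties listed above can be pictured with a minimal C++ sketch of a layout descriptor. `ToyArrayLayout` and `buildToyArrayLayout` are hypothetical names, not the JIT's ClassLayout API. The 16-byte header is an assumption taken from the disassembly earlier (type pointer, length, padding), and the rounding to 8-byte slots is an assumption for x64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified descriptor of a stack-allocated array.
struct ToyArrayLayout {
    std::uint32_t sizeBytes;     // total size of the stack local
    std::uint32_t alignment;     // required alignment in bytes
    bool          hasGcPointers; // true if any slot holds a managed reference
};

const std::uint32_t kToyHeaderBytes = 16; // assumption: MT pointer + length + padding
const std::uint32_t kToySlotBytes   = 8;  // assumption: pointer-sized slots on x64

// elemIsRef says whether the elements themselves are managed references.
inline ToyArrayLayout buildToyArrayLayout(std::uint32_t elemSize, std::uint32_t length,
                                          bool elemIsRef) {
    const std::uint32_t raw     = kToyHeaderBytes + elemSize * length;
    const std::uint32_t rounded = (raw + kToySlotBytes - 1) / kToySlotBytes * kToySlotBytes;
    return ToyArrayLayout{rounded, kToySlotBytes, elemIsRef};
}
```

For the int[10] of the example this gives 16 + 4 * 10 = 56 bytes with no GC pointers, which is consistent with the `sub rsp, 56` frame in the listing above.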
tmp2 is then built by the following code:

    GenTree* const stackLocalAddr = comp->gtNewLclAddrNode(lclNum, 0);
    newArr->gtArgs.PushBack(comp, NewCallArg::Primitive(stackLocalAddr).WellKnown(WellKnownArg::StackArrayLocal));

gtNewLclAddrNode builds an LCL_ADDR node, and PushBack appends that node as an extra argument under newArr.
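As a rough picture of this rewrite, the following C++ sketch models the helper call as a tiny toy IR node and applies the same three changes seen in the dumps: append an LCL_ADDR argument, mark the call as a stack array, and change the type from ref to long. All types and names here (`ToyHelperCall`, `morphToStackAlloc`, and so on) are hypothetical and far simpler than the real GenTree classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ToyType { Ref, Long };

struct ToyArg {
    std::string kind; // e.g. "CNS_INT" or "LCL_ADDR"
    std::string note; // free-form description of the operand
};

struct ToyHelperCall {
    std::string         helper;
    ToyType             returnType;
    std::vector<ToyArg> args;
    bool                stackArray; // stands in for GTF_CALL_M_STACK_ARRAY
};

// Format a local number the way the dump does (V03, V04, ...).
inline std::string toyLocalName(unsigned lclNum) {
    return (lclNum < 10 ? "V0" : "V") + std::to_string(lclNum);
}

// Rewrite the call so that it targets the stack local lclNum.
inline void morphToStackAlloc(ToyHelperCall& call, unsigned lclNum) {
    call.args.push_back(ToyArg{"LCL_ADDR", toyLocalName(lclNum)}); // extra argument
    call.stackArray = true;                                        // flag the call
    call.returnType = ToyType::Long;                               // no longer a GC ref
}
```

Applied to a call with the two original arguments (type handle and length), the sketch yields three arguments and a long result, matching the shape of the second IR dump.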
Here is the call stack of MorphNewArrNodeIntoStackAlloc:
    clrjit.dll!ObjectAllocator::MorphNewArrNodeIntoStackAlloc(GenTreeCall * newArr, CORINFO_CLASS_STRUCT_ * clsHnd, unsigned int length, unsigned int blockSize, BasicBlock * block, Statement * stmt) line 1626 C++
    clrjit.dll!ObjectAllocator::MorphAllocObjNodeHelperArr(ObjectAllocator::AllocationCandidate & candidate) line 1453 C++
    clrjit.dll!ObjectAllocator::MorphAllocObjNodeHelper(ObjectAllocator::AllocationCandidate & candidate) line 1373 C++
    clrjit.dll!ObjectAllocator::MorphAllocObjNode(ObjectAllocator::AllocationCandidate & candidate) line 1284 C++
    clrjit.dll!ObjectAllocator::MorphAllocObjNodes line 1265 C++
    clrjit.dll!ObjectAllocator::DoPhase line 234 C++
    clrjit.dll!Phase::Run line 61 C++
    clrjit.dll!Compiler::compCompile(void * * methodCodePtr, unsigned int * methodCodeSize, JitFlags * compileFlags) line 4418 C++
    clrjit.dll!Compiler::compCompileHelper(CORINFO_MODULE_STRUCT_ * classPtr, ICorJitInfo * compHnd, CORINFO_METHOD_INFO * methodInfo, void * * methodCodePtr, unsigned int * methodCodeSize, JitFlags * compileFlags) line 7077 C++
    clrjit.dll!`Compiler::compCompile'::`151'::__Body::Run(Compiler::compCompile::__l2::__JITParam * __JITpParam) line 6264 C++
    clrjit.dll!Compiler::compCompile(CORINFO_MODULE_STRUCT_ * classPtr, void * * methodCodePtr, unsigned int * methodCodeSize, JitFlags * compileFlags) line 6268 C++
    clrjit.dll!``jitNativeCode'::`8'::__Body::Run'::`6'::__Body::Run(jitNativeCode::__l8::__Body::Run::__l5::__JITParam * __JITpParam) line 7722 C++
    clrjit.dll!`jitNativeCode'::`8'::__Body::Run(jitNativeCode::__l2::__JITParam * __JITpParam) line 7725 C++
    clrjit.dll!jitNativeCode(CORINFO_METHOD_STRUCT_ * methodHnd, CORINFO_MODULE_STRUCT_ * classPtr, ICorJitInfo * compHnd, CORINFO_METHOD_INFO * methodInfo, void * * methodCodePtr, unsigned int * methodCodeSize, JitFlags * compileFlags, void * inlineInfoPtr) line 7749 C++
    clrjit.dll!CILJit::compileMethod(ICorJitInfo * compHnd, CORINFO_METHOD_INFO * methodInfo, unsigned int flags, unsigned char * * entryAddress, unsigned int * nativeSizeOfCode) line 302 C++
    coreclr.dll!invokeCompileMethod(EECodeGenManager * jitMgr, CEECodeGenInfo * comp, unsigned char * * nativeEntry, unsigned int * nativeSizeOfCode) line 12842 C++
    coreclr.dll!UnsafeJitFunctionWorker(EECodeGenManager * pJitMgr, CEECodeGenInfo * pJitInfo, NativeCodeVersion nativeCodeVersion, unsigned long * pSizeOfCode) line 13066 C++
    coreclr.dll!UnsafeJitFunction(PrepareCodeConfig * config, COR_ILMETHOD_DECODER * ILHeader, bool * isTier0, unsigned long * pSizeOfCode) line 13382 C++
    coreclr.dll!MethodDesc::JitCompileCodeLocked(PrepareCodeConfig * pConfig, COR_ILMETHOD_DECODER * pilHeader, ListLockEntryBase * pEntry, unsigned long * pSizeOfCode) line 925 C++
    coreclr.dll!MethodDesc::JitCompileCodeLockedEventWrapper(PrepareCodeConfig * pConfig, ListLockEntryBase * pEntry) line 834 C++
    coreclr.dll!MethodDesc::JitCompileCode(PrepareCodeConfig * pConfig) line 715 C++
    coreclr.dll!MethodDesc::PrepareILBasedCode(PrepareCodeConfig * pConfig) line 431 C++
    coreclr.dll!MethodDesc::PrepareCode(PrepareCodeConfig * pConfig) line 319 C++
    coreclr.dll!TieredCompilationManager::CompileCodeVersion(NativeCodeVersion nativeCodeVersion) line 953 C++
    coreclr.dll!TieredCompilationManager::OptimizeMethod(NativeCodeVersion nativeCodeVersion) line 930 C++
    coreclr.dll!TieredCompilationManager::DoBackgroundWork(__int64 * workDurationTicksRef, __int64 minWorkDurationTicks, __int64 maxWorkDurationTicks) line 817 C++
    coreclr.dll!TieredCompilationManager::BackgroundWorkerStart line 532 C++
    coreclr.dll!TieredCompilationManager::BackgroundWorkerBootstrapper1(void * __formal) line 483 C++
    coreclr.dll!ManagedThreadBase_DispatchInner(ManagedThreadCallState * pCallState) line 6917 C++
    coreclr.dll!ManagedThreadBase_DispatchMiddle(ManagedThreadCallState * pCallState) line 6961 C++
    coreclr.dll!``ManagedThreadBase_DispatchOuter'::`11'::__Body::Run'::`5'::__Body::Run(Param * pParam) line 7119 C++
    coreclr.dll!`ManagedThreadBase_DispatchOuter'::`11'::__Body::Run(ManagedThreadBase_DispatchOuter::__l2::TryArgs * pArgs) line 7121 C++
    coreclr.dll!ManagedThreadBase_DispatchOuter(ManagedThreadCallState * pCallState) line 7141 C++
    coreclr.dll!ManagedThreadBase::KickOff(void(*)(void *) pTarget, void * args) line 7159 C++
    coreclr.dll!TieredCompilationManager::BackgroundWorkerBootstrapper0(void * args) line 466 C++
    kernel32.dll!00007ffecc24e8d7 Unknown
    ntdll.dll!00007ffecd45c34c Unknown

The function MorphAllocObjNodeHelperArr contains:
CanAllocateLclVarOnStack, which decides whether a local variable (lclvar) can safely be converted from a heap allocation to a stack allocation, often referred to as "stack promotion".
The call to MorphNewArrNodeIntoStackAlloc (described above), as follows:

    bool ObjectAllocator::MorphAllocObjNodeHelperArr(AllocationCandidate& candidate)
    {
        // (declarations of clsHnd, len, blockSize and data are omitted in this excerpt)
        if (!CanAllocateLclVarOnStack(candidate.m_lclNum, clsHnd, candidate.m_allocType, len->AsIntCon()->IconValue(),
                                      &blockSize, &candidate.m_onHeapReason))
        {
            // reason set by the call
            return false;
        }

        JITDUMP("Allocating V%02u on the stack\n", candidate.m_lclNum);
        const unsigned int stackLclNum =
            MorphNewArrNodeIntoStackAlloc(data->AsCall(), clsHnd, (unsigned int)len->AsIntCon()->IconValue(), blockSize,
                                          candidate.m_block, candidate.m_statement);
        // (remainder omitted in this excerpt)
    }

If the local does not qualify for heap-to-stack conversion, the function returns immediately and the array stays heap-allocated. Next, the function MorphAllocObjNodeHelper:
    bool ObjectAllocator::MorphAllocObjNodeHelper(AllocationCandidate& candidate)
    {
        if (!IsObjectStackAllocationEnabled())
        {
            candidate.m_onHeapReason = "[object stack allocation disabled]";
            return false;
        }

        // (middle part omitted for readability)
        switch (candidate.m_allocType)
        {
            case OAT_NEWARR:
                return MorphAllocObjNodeHelperArr(candidate);
            case OAT_NEWOBJ:
                return MorphAllocObjNodeHelperObj(candidate);
            case OAT_NEWOBJ_HEAP:
                candidate.m_onHeapReason = "[runtime disallows]";
                return false;
            default:
                unreached();
        }
    }

IsObjectStackAllocationEnabled:

    inline bool ObjectAllocator::IsObjectStackAllocationEnabled() const
    {
        return m_IsObjectStackAllocationEnabled;
    }

    inline void ObjectAllocator::EnableObjectStackAllocation()
    {
        m_IsObjectStackAllocationEnabled = true;
    }

    // At the call site that sets up the phase:
    if (compObjectStackAllocation() && opts.OptimizationEnabled())
    {
        objectAllocator.EnableObjectStackAllocation();
    }

compObjectStackAllocation and OptimizationEnabled:

    bool compObjectStackAllocation()
    {
        if (compIsAsync())
        {
            // Object stack allocation takes the address of locals around
            // suspension points. Disable entirely for now.
            return false;
        }
        return (JitConfig.JitObjectStackAllocation() != 0);
    }

    bool OptimizationEnabled() const
    {
        assert(compMinOptsIsSet);
        return canUseAllOpts;
    }

Note that compObjectStackAllocation is controlled by the environment variable DOTNET_JitObjectStackAllocation. Because opts.OptimizationEnabled() must also hold, the transformation only applies to code compiled with optimizations enabled.
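The gating logic above boils down to three conditions. The following C++ sketch restates them as one function; `ToyCompilerState` and `objectStackAllocationEnabled` are hypothetical names, and the config value is passed in as a plain integer rather than read from the environment.

```cpp
#include <cassert>

// Hypothetical snapshot of the three inputs the excerpts consult.
struct ToyCompilerState {
    bool isAsync;                  // corresponds to compIsAsync()
    int  jitObjectStackAllocation; // value behind DOTNET_JitObjectStackAllocation
    bool optimizationEnabled;      // corresponds to opts.OptimizationEnabled()
};

// True only if all gates shown in the excerpts are open.
inline bool objectStackAllocationEnabled(const ToyCompilerState& s) {
    if (s.isAsync) {
        return false; // async methods are excluded entirely for now
    }
    return (s.jitObjectStackAllocation != 0) && s.optimizationEnabled;
}
```

Under this reading, a config value of 0 turns the feature off regardless of the other two inputs, and unoptimized code never gets stack-allocated arrays.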
The function MorphAllocObjNodes loops over the basic blocks (BB) and their statements (STMT).

    bool ObjectAllocator::MorphAllocObjNodes()
    {
        m_stackAllocationCount            = 0;
        m_PossiblyStackPointingPointers   = BitVecOps::MakeEmpty(&m_bitVecTraits);
        m_DefinitelyStackPointingPointers = BitVecOps::MakeEmpty(&m_bitVecTraits);

        for (BasicBlock* const block : comp->Blocks())
        {
            const bool basicBlockHasNewObj       = block->HasFlag(BBF_HAS_NEWOBJ);
            const bool basicBlockHasNewArr       = block->HasFlag(BBF_HAS_NEWARR);
            const bool basicBlockHasBackwardJump = block->HasFlag(BBF_BACKWARD_JUMP);

            if (!basicBlockHasNewObj && !basicBlockHasNewArr)
            {
                continue;
            }

            for (Statement* const stmt : block->Statements())
            {
                GenTree* const stmtExpr = stmt->GetRootNode();

                if (!stmtExpr->OperIs(GT_STORE_LCL_VAR) || !stmtExpr->TypeIs(TYP_REF))
                {
                    // We assume that GT_ALLOCOBJ nodes are always present in the canonical form.
                    assert(!comp->gtTreeContainsOper(stmtExpr, GT_ALLOCOBJ));
                    continue;
                }

                const unsigned int         lclNum    = stmtExpr->AsLclVar()->GetLclNum();
                GenTree* const             data      = stmtExpr->AsLclVar()->Data();
                ObjectAllocationType const allocType = AllocationKind(data);

                if (allocType == OAT_NONE)
                {
                    continue;
                }

                AllocationCandidate c(block, stmt, stmtExpr, lclNum, allocType);
                MorphAllocObjNode(c);
            }
        }

        return (m_stackAllocationCount > 0);
    }

Note these two lines:
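    const bool basicBlockHasNewArr = block->HasFlag(BBF_HAS_NEWARR);
    // (separator)
    ObjectAllocationType const allocType = AllocationKind(data);

basicBlockHasNewArr records whether the block contains an array allocation.
It relies on the BBF_HAS_NEWARR flag. When is that flag set?
While the JIT parses the IL, it checks whether the current IL opcode is CEE_NEWARR.
If it is, the JIT sets BBF_HAS_NEWARR.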
    void Compiler::impImportBlockCode(BasicBlock* block)
    {
        const BYTE* codeAddr = info.compCode + block->bbCodeOffs;
        const BYTE* codeEndp = info.compCode + block->bbCodeOffsEnd;

        while (codeAddr < codeEndp)
        {
            // (opcode decoding and the other cases are omitted)
            case CEE_NEWARR:
            {
                block->SetFlags(BBF_HAS_NEWARR);
            }
        }
    }

In memory, IL is simply an array of bytes, usually written in hexadecimal. For example, the ABC function from the example looks like this to the JIT (partial fragment):

    /* 0x00000290 1F0A       */ IL_0000: ldc.i4.s  10
    /* 0x00000292 8D11000001 */ IL_0002: newarr    [System.Runtime]System.Int32
    /* 0x00000297 0A         */ IL_0007: stloc.0

The line of interest is:

    /* 0x00000292 8D11000001 */ IL_0002: newarr    [System.Runtime]System.Int32

So in the byte sequence the JIT decodes,

    1f 0a 8d 11 00 00 01

the byte 8d is CEE_NEWARR, that is, the newarr instruction in IL.
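To make the byte-level view concrete, here is a small C++ sketch of a scanner that understands only the three opcodes in the fragment above (ldc.i4.s = 0x1F with a 1-byte operand, newarr = 0x8D with a 4-byte token, stloc.0 = 0x0A with no operand). `findNewArr` and `readToken` are hypothetical helpers, not the importer's real decoding loop, which handles the full opcode table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the IL offset of the first newarr, or -1 if there is none
// or an opcode outside this three-opcode sketch is encountered.
inline long findNewArr(const std::vector<std::uint8_t>& il) {
    std::size_t pc = 0;
    while (pc < il.size()) {
        switch (il[pc]) {
            case 0x1F: pc += 2; break;               // ldc.i4.s <int8>: skip the operand
            case 0x0A: pc += 1; break;               // stloc.0: no operand
            case 0x8D: return static_cast<long>(pc); // newarr <token>
            default:   return -1;                    // not handled by this sketch
        }
    }
    return -1;
}

// Read the little-endian 4-byte metadata token that follows an opcode at `pc`.
inline std::uint32_t readToken(const std::vector<std::uint8_t>& il, std::size_t pc) {
    assert(pc + 4 < il.size());
    return static_cast<std::uint32_t>(il[pc + 1]) |
           (static_cast<std::uint32_t>(il[pc + 2]) << 8) |
           (static_cast<std::uint32_t>(il[pc + 3]) << 16) |
           (static_cast<std::uint32_t>(il[pc + 4]) << 24);
}
```

The scanner has to skip operands properly: the 0x0A after 0x1F is the constant 10, not a stloc.0. On the bytes from the fragment it reports newarr at IL offset 2 with token 0x01000011.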
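That covers the important logic of MorphAllocObjNodes.
Only when basicBlockHasNewArr is set does the loop go on to walk the STMTs of the block and attempt the heap-to-stack conversion.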
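The filtering described above can be sketched in C++ as a two-level loop that skips blocks without an allocation flag and only then inspects statements. The flag constants, `ToyBlock`, `ToyStmtKind` and `countCandidates` are all hypothetical simplifications of the JIT's data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for BBF_HAS_NEWOBJ and BBF_HAS_NEWARR.
const std::uint32_t TOY_HAS_NEWOBJ = 1u << 0;
const std::uint32_t TOY_HAS_NEWARR = 1u << 1;

// A statement is either "store of a ref-typed allocation to a local" or anything else.
enum class ToyStmtKind { StoreRefAlloc, Other };

struct ToyBlock {
    std::uint32_t            flags;
    std::vector<ToyStmtKind> statements;
};

// Count the statements that would be handed to the morphing step.
inline unsigned countCandidates(const std::vector<ToyBlock>& blocks) {
    unsigned candidates = 0;
    for (const ToyBlock& block : blocks) {
        if ((block.flags & (TOY_HAS_NEWOBJ | TOY_HAS_NEWARR)) == 0) {
            continue; // cheap block-level filter: no allocation was imported here
        }
        for (ToyStmtKind stmt : block.statements) {
            if (stmt == ToyStmtKind::StoreRefAlloc) {
                ++candidates;
            }
        }
    }
    return candidates;
}
```

The point of the block flag is that an allocation-shaped statement in an unflagged block is never even examined, so most blocks cost one bit test.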
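Further up the call stack is DoPhase, a top-level JIT function that is of little interest here, so it is not covered.
To summarise: the JIT uses the IL to determine whether the current block contains an array allocation (newarr). If it does, it walks the STMTs and checks whether the allocation can live on the stack. If so, it builds the JIT-internal layout for the local, sizes the stack local, creates the stack-local node in the IR and attaches it to the helper call, and marks the result as no longer being a GC reference.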
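.NET 10 independent EH tables
.NET 10 partially refactors how exception tables are handled. Specifically: in .NET 9, if a function inlined another function that contained exception handling (try/catch; exception handling normally produces an exception table, EH), the callee's EH table was attached to the caller's EH table. The improvement in .NET 10 is that the EH table of the inlined function exists independently and is not attached to the caller's EH table.
The benefits of doing so:
A more robust precise exception model.
Inlining decisions become more local and controllable.
Less complexity in the caller's EH.
Of course, this also reflects the sheer size of the .NET 10 runtime: it needs a more engineered design and is moving towards high cohesion and low coupling.
Let us analyse it.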
    using System;
    using BenchmarkDotNet.Running;
    using System.Runtime.CompilerServices;

    public class Program
    {
        static void Main()
        {
            Bar();
            Console.ReadLine();
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        static void Bar()
        {
            try
            {
                Console.WriteLine("Bar");
            }
            finally
            {
                Console.WriteLine("Bar finally");
            }
        }
    }

In .NET 10, the independent EH table of the Bar function is:

    ****** START compiling Program:Bar (MethodHash=9959ef7b)
    IL to import:
    IL_0000  72 01 00 00 70    ldstr        0x70000001
    IL_0005  28 0d 00 00 0a    call         0xA00000D
    IL_000a  de 0b             leave.s      11 (IL_0017)
    IL_000c  72 09 00 00 70    ldstr        0x70000009
    IL_0011  28 0d 00 00 0a    call         0xA00000D
    IL_0016  dc                endfinally
    IL_0017  2a                ret

    EH clause #0:
      Flags:         0x2 (finally)
      TryOffset:     0x0
      TryLength:     0xc
      HandlerOffset: 0xc
      HandlerLength: 0xb
      ClassToken:    0x0

We can use a benchmark on .NET 10 to look at the inlining (note that .NET 9 does not inline Bar). The example:
    using System;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;
    using Microsoft.Diagnostics.Runtime.AbstractDac;
    using System.Runtime.CompilerServices;

    [DisassemblyDiagnoser]
    public class Program
    {
        public static void Main()
        {
            var summary = BenchmarkRunner.Run<Program>();
            //Console.ReadLine();
        }

        [Benchmark]
        public void ABC()
        {
            Bar();
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static void Bar()
        {
            try
            {
                Console.WriteLine("Bar");
            }
            finally
            {
                Console.WriteLine("Bar finally");
            }
        }
    }

.NET 9:

    ; Program.ABC
           jmp       qword ptr [7FFD3376E940]; Program.Bar
    ; Total bytes of code 6

.NET 10:

    ; Program.ABC
           push      rbp
           sub       rsp,20
           lea       rbp,[rsp+20]
           mov       rcx,20E00308AC0
           call      qword ptr [7FFDDB97F768]; System.Console.WriteLine(System.String)
           nop
           mov       rcx,20E00308AE0
           nop
           add       rsp,20
           pop       rbp
           ret
           push      rbp
           sub       rsp,20
           mov       rcx,20E00308AE0
           nop
           add       rsp,20
           pop       rbp
           ret
    ; Total bytes of code 78

Putting these two observations together, we can see that after inlining, the EH table in .NET 10 is indeed independent.
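Let us follow the flow through the EH clause output in the JIT log.

    // fgbasic.cpp
    for (XTnum = 0, HBtab = compHndBBtab; XTnum < compHndBBtabCount; XTnum++, HBtab++)
    {
        CORINFO_EH_CLAUSE clause;
        info.compCompHnd->getEHinfo(info.compMethodHnd, XTnum, &clause);
        noway_assert(clause.HandlerLength != (unsigned)-1); // @DEPRECATED

    #ifdef DEBUG
        if (verbose)
        {
            dispIncomingEHClause(XTnum, clause);
        }
    #endif // DEBUG

        // (rest omitted for readability)
    }

dispIncomingEHClause prints the EH table information. The for loop iterates over the EH table indices and calls getEHinfo to fetch each clause (that is, one EH table entry). getEHinfo is implemented as follows: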
    // filename: jitinterface.cpp
    void CEECodeGenInfo::getEHinfo(
        CORINFO_METHOD_HANDLE ftn,      /* IN  */
        unsigned              EHnumber, /* IN  */
        CORINFO_EH_CLAUSE*    clause)   /* OUT */
    {
        CONTRACTL {
            THROWS;
            GC_TRIGGERS;
            MODE_PREEMPTIVE;
        } CONTRACTL_END;

        JIT_TO_EE_TRANSITION();

        MethodDesc* pMD = GetMethod(ftn);
        if (IsDynamicMethodHandle(ftn))
        {
            pMD->AsDynamicMethodDesc()->GetResolver()->GetEHInfo(EHnumber, clause);
        }
        else if (pMD == m_pMethodBeingCompiled)
        {
            getEHinfoHelper(ftn, EHnumber, clause, m_ILHeader);
        }
        // (rest omitted for readability)
    }

As you can see, the EH table information is obtained through the m_ILHeader field. Its declaration:
    COR_ILMETHOD_DECODER* m_ILHeader;

    class COR_ILMETHOD_DECODER : public COR_ILMETHOD_FAT
    {
        const BYTE*                 Code;
        PCCOR_SIGNATURE             LocalVarSig;   // pointer to signature blob, or 0 if none
        DWORD                       cbLocalVarSig; // size of signature blob, or 0 if none
        const COR_ILMETHOD_SECT_EH* EH;            // eh table if any, 0 if none
        const COR_ILMETHOD_SECT*    Sect;          // additional sections, 0 if none
    };

    struct COR_ILMETHOD_SECT_EH : public COR_ILMETHOD_SECT
    {
    };

    struct COR_ILMETHOD_SECT
    {
        union
        {
            COR_ILMETHOD_SECT_EH_SMALL Small;
            COR_ILMETHOD_SECT_EH_FAT   Fat;
        };
    };

The COR_ILMETHOD_SECT_EH structure stores the EH table data:
TryOffset
TryLength
HandlerOffset
HandlerLength
When the CLR needs to read this data:

    // jitinterface.cpp
    static void getEHinfoHelper(
        CORINFO_METHOD_HANDLE ftnHnd,
        unsigned              EHnumber,
        CORINFO_EH_CLAUSE*    clause,
        COR_ILMETHOD_DECODER* pILHeader)
    {
        STANDARD_VM_CONTRACT;

        _ASSERTE(CheckPointer(pILHeader->EH));
        _ASSERTE(EHnumber < pILHeader->EH->EHCount());

        COR_ILMETHOD_SECT_EH_CLAUSE_FAT        ehClause;
        const COR_ILMETHOD_SECT_EH_CLAUSE_FAT* ehInfo;
        ehInfo = (COR_ILMETHOD_SECT_EH_CLAUSE_FAT*)pILHeader->EH->EHClause(EHnumber, &ehClause);

        clause->Flags         = (CORINFO_EH_CLAUSE_FLAGS)ehInfo->GetFlags();
        clause->TryOffset     = ehInfo->GetTryOffset();
        clause->TryLength     = ehInfo->GetTryLength();
        clause->HandlerOffset = ehInfo->GetHandlerOffset();
        clause->HandlerLength = ehInfo->GetHandlerLength();
        if ((clause->Flags & CORINFO_EH_CLAUSE_FILTER) == 0)
            clause->ClassToken = ehInfo->GetClassToken();
        else
            clause->FilterOffset = ehInfo->GetFilterOffset();
    }

So when does the JIT write out the EH table? It does so in genReportEH and then reports the table to the CLR, which uses it for later work such as exception dispatch, GC, safepoints, and the debugger. The code:
    // codegencommon.cpp
    void CodeGen::genReportEH()
    {
        // (omitted for readability)
        for (EHblkDsc* const HBtab : EHClauses(compiler))
        {
            UNATIVE_OFFSET tryBeg, tryEnd, hndBeg, hndEnd, hndTyp;

            tryBeg = compiler->ehCodeOffset(HBtab->ebdTryBeg);
            hndBeg = compiler->ehCodeOffset(HBtab->ebdHndBeg);

            tryEnd = (HBtab->ebdTryLast == compiler->fgLastBB) ? compiler->info.compNativeCodeSize
                                                               : compiler->ehCodeOffset(HBtab->ebdTryLast->Next());
            hndEnd = (HBtab->ebdHndLast == compiler->fgLastBB) ? compiler->info.compNativeCodeSize
                                                               : compiler->ehCodeOffset(HBtab->ebdHndLast->Next());

            if (HBtab->HasFilter())
            {
                hndTyp = compiler->ehCodeOffset(HBtab->ebdFilter);
            }
            else
            {
                hndTyp = HBtab->ebdTyp;
            }

            // Note that we reuse the CORINFO_EH_CLAUSE type, even though the names of
            // the fields aren't accurate.
            CORINFO_EH_CLAUSE clause;
            clause.ClassToken    = hndTyp; /* filter offset is passed back here for filter-based exception handlers */
            clause.Flags         = ToCORINFO_EH_CLAUSE_FLAGS(HBtab->ebdHandlerType);
            clause.TryOffset     = tryBeg;
            clause.TryLength     = tryEnd;
            clause.HandlerOffset = hndBeg;
            clause.HandlerLength = hndEnd;

            clauses[XTnum++] = {clause, HBtab};
        }
        // (omitted for readability)
    }

Together, these pieces form a complete picture.
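The way genReportEH picks the end of a region can be isolated in a few lines of C++: if the region's last block is the last block of the method, the end is the total native code size, otherwise it is the start offset of the following block. `regionEnd` is a hypothetical helper, and the offsets used in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// blockOffsets[i] is the native code offset at which block i starts.
// Returns the end offset of a region whose last block is `lastBlock`.
inline unsigned regionEnd(const std::vector<unsigned>& blockOffsets,
                          std::size_t lastBlock, unsigned nativeCodeSize) {
    assert(lastBlock < blockOffsets.size());
    const bool isLastBlockOfMethod = (lastBlock + 1 == blockOffsets.size());
    return isLastBlockOfMethod ? nativeCodeSize           // region runs to the end of the code
                               : blockOffsets[lastBlock + 1]; // region ends where the next block starts
}
```

For blocks starting at 0x00, 0x1A and 0x30 in a 0x4E-byte method, a region ending in block 0 ends at 0x1A, and a region ending in block 2 ends at 0x4E. This also shows why the excerpt's comment says the CORINFO_EH_CLAUSE field names are not accurate: TryLength and HandlerLength receive end offsets here, not lengths.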
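On the output side, the JIT uses genReportEH to fill in EH clause fields such as TryOffset (now expressed as native code offsets) and report them to the CLR.
On the input side, the CLR serves getEHinfo requests from the COR_ILMETHOD_SECT_EH structure behind the m_ILHeader field. Those clauses are what gets printed as the EH table, that is, the independent EH table of Bar in .NET 10 shown at the start of this section.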
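To make the clause fields tangible, the following C++ sketch takes the values from the EH clause #0 dump of Bar (TryOffset 0x0, TryLength 0xc, HandlerOffset 0xc, HandlerLength 0xb) and checks which IL offsets fall in the try region and which in the finally handler. `ToyEhClause`, `inTry` and `inHandler` are hypothetical names; only the numeric values come from the dump.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the IL-level clause fields printed in the dump.
struct ToyEhClause {
    std::uint32_t flags;
    std::uint32_t tryOffset;
    std::uint32_t tryLength;
    std::uint32_t handlerOffset;
    std::uint32_t handlerLength;
};

// True if ilOffset lies in [tryOffset, tryOffset + tryLength).
inline bool inTry(const ToyEhClause& c, std::uint32_t ilOffset) {
    return ilOffset >= c.tryOffset && ilOffset < c.tryOffset + c.tryLength;
}

// True if ilOffset lies in [handlerOffset, handlerOffset + handlerLength).
inline bool inHandler(const ToyEhClause& c, std::uint32_t ilOffset) {
    return ilOffset >= c.handlerOffset && ilOffset < c.handlerOffset + c.handlerLength;
}
```

With Bar's clause, the first call at IL_0005 is inside the try, ldstr at IL_000c and endfinally at IL_0016 are inside the handler, and ret at IL_0017 is outside both.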
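One question remains: .NET 10 uses an independent table per function, whereas .NET 9 attached the table to the parent EH table. How did that transition happen? Since .NET 9 has been superseded by .NET 10, the question is not very important and is left for a later study.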
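Conclusion
Local escape analysis in .NET 10, in summary: the JIT uses the IL to determine whether the current block contains an array allocation (newarr). If it does, it walks the STMTs and checks whether the allocation can live on the stack. If so, it builds the JIT-internal layout for the local, sizes the stack local, creates the stack-local node in the IR and attaches it to the helper call, and marks the result as no longer being a GC reference.
Independent EH tables in .NET 10, in summary: even when an inlined function contains exception handling, its EH table remains independent instead of being attached to the caller's EH table.
Escape analysis in .NET 10 allocates more aggressively on the stack, which reduces heap allocation and GC pressure and further improves performance. The partial EH refactoring in .NET 10 improves stability and robustness and makes the code easier to work with.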
Source: opendotnet