Teddy edited this page Nov 3, 2015 · 10 revisions

Introduction

NanoProfiler is a lightweight profiling library written in C# that requires .NET 4.0 or later. It was inspired by the MiniProfiler project, but it is designed for high performance and big-data analytics, and it is easy to use with both the sync and async programming models. It has been used in EF (Education First) projects that generate billions of profiling events per day with a trivial performance penalty.

NanoProfiler itself implements the core profiling feature and a simple implementation that persists results via slf4net. If you need better management of profiling results, you can implement the IProfilingStorage interface yourself.

NanoProfiler also provides a view-result Web UI that shows the latest profiling results in a tree-timeline view (simply visit ~/nanoprofiler/view in your web application).

How does NanoProfiler work internally?

Logical Profiling Session Context

To add NanoProfiler profiling steps to your code, you can always use the same simple code, using (ProfilingSession.Current.Step("name of profiling step")) { }, whether it runs in a web application event callback, a WCF behavior callback, a custom thread context, or an async method. But how does this work internally? For example, in a web application we profile per request, so ProfilingSession.Current cannot be a global static variable. It cannot be a ThreadStatic variable either, because different ASP.NET application event callbacks might be executed on different threads, even for a sync request. What about basing it on HttpContext.Current? That works in all ASP.NET application event callbacks, but not in WCF, custom thread contexts, or async methods.
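The thread-affinity problem with ThreadStatic can be seen in a minimal sketch. ThreadStaticDemo and its members are hypothetical names used only for illustration; they are not part of NanoProfiler. The sketch sets a ThreadStatic field on one thread and reads it from another, where it is still at its default value.

```csharp
using System;
using System.Threading;

public static class ThreadStaticDemo
{
    // Hypothetical stand-in for a per-request session; each thread gets its own copy.
    [ThreadStatic]
    private static string _currentSession;

    // Reads the field on the calling thread.
    public static string ReadFromSameThread() => _currentSession;

    // Sets the field on the calling thread, then reads it from a second thread.
    public static string ReadFromOtherThread(string sessionName)
    {
        _currentSession = sessionName;

        string seenByOtherThread = "unset";
        var worker = new Thread(() => seenByOtherThread = _currentSession);
        worker.Start();
        worker.Join();

        // The second thread never assigned its own copy, so it saw null.
        return seenByOtherThread;
    }
}
```

Calling ThreadStaticDemo.ReadFromOtherThread("request-1") returns null even though the calling thread still sees "request-1", which is why a per-request session cannot live in a ThreadStatic field once callbacks move between threads.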

So what else can we use? Luckily, there is the logical CallContext, a built-in feature of the .NET Framework that automatically passes context variables down across threads via the .NET ExecutionContext. Does it solve the problem? Yes, but only if you use it correctly, because the logical CallContext has some limitations:

  1. The variables stored in the logical CallContext should be immutable, because the logical CallContext behaves differently in .NET Framework versions before 4.5 and in 4.5 and later when a stored value is changed. See this link for the details: http://blog.stephencleary.com/2013/04/implicit-async-context-asynclocal.html
  2. The variables stored in the logical CallContext should be serializable and cheap to serialize. They need to be serializable because the .NET runtime itself might switch running threads by serializing and deserializing the threads' execution context (which includes the logical CallContext) to make efficient use of thread pool threads. If a variable is not serializable, its value becomes null after deserialization. If serialization and deserialization are not cheap, every thread switch incurs a large performance overhead.
  3. The variables stored in the logical CallContext are actually kept in thread-local storage. The .NET runtime only sets the values when passing them down across threads and never resets them to null, so on thread pool threads the stored variables stay there until they are manually set to null. This leaves two issues to handle carefully. First, a stored value should not be big, because in the worst case it stays on every thread pool thread and occupies memory forever. Second, we need a way to guarantee that the variables stored in the logical CallContext are released when no other code uses them any longer.
  4. In web applications, the begin request event handler (where we start a profiling session) is not guaranteed to run on the same thread as the request process handler. When that happens, for example under heavy concurrent load, the logical CallContext seen by code running on the request process handler's thread does not contain the current profiling session, while HttpContext.Current does.
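As a rough illustration of a context value that flows with the ExecutionContext, the sketch below uses AsyncLocal<T> (available from .NET Framework 4.6) as a stand-in for the logical CallContext; on .NET Framework 4.0 to 4.5 the corresponding calls would be CallContext.LogicalSetData and CallContext.LogicalGetData. FlowingContextDemo and its members are hypothetical names, not NanoProfiler code. In line with the limitations above, it stores only a small immutable id and offers an explicit Clear.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FlowingContextDemo
{
    // Stand-in for a logical CallContext slot; holds only a small immutable id.
    private static readonly AsyncLocal<Guid?> CurrentSessionId = new AsyncLocal<Guid?>();

    public static Guid? Current => CurrentSessionId.Value;

    public static void SetCurrent(Guid id) => CurrentSessionId.Value = id;

    // Explicit reset, so the value does not linger once the session is over.
    public static void Clear() => CurrentSessionId.Value = null;

    // The ExecutionContext is captured at Thread.Start, so the new thread sees the id.
    public static Guid? ReadFromNewThread()
    {
        Guid? seen = null;
        var worker = new Thread(() => seen = CurrentSessionId.Value);
        worker.Start();
        worker.Join();
        return seen;
    }

    // The continuation after await may run on a pool thread, yet still sees the id.
    public static async Task<Guid?> ReadAfterAwaitAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        return CurrentSessionId.Value;
    }
}
```

Unlike a ThreadStatic field, the id set by SetCurrent is visible both on a newly started thread and after an await, which is the property the profiling session context relies on.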

So, in NanoProfiler, we use the logical CallContext by default. We only store simple primitive-type variables in it (the ids of the big, mutable instances), which also ensures that once the values are set in the logical CallContext, the values themselves never change. To change an instance identified by an id, we use the id to look up the instance in a global lock-free cache and then make the change. The instances in the global cache are wrapped in WeakReference instances so that, once nothing else references them, they can be collected by the GC. In an ASP.NET web application or a WCF application, we always try HttpContext.Current or the WCF OperationContext first for storing the profiling session context, and fall back to the logical CallContext when neither is accessible. For issue #4 above, we also specifically repair the current session stored in the logical CallContext in that case.
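The id-plus-cache idea can be sketched as follows. This is an illustration under assumptions, not NanoProfiler's actual implementation: DemoSession and DemoSessionContext are hypothetical names, AsyncLocal<T> stands in for the logical CallContext, the generic WeakReference<T> (.NET 4.5+) replaces the non-generic WeakReference, and ConcurrentDictionary stands in for the global cache (its reads are lock-free, but its writes take fine-grained locks).

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical mutable session object; the real NanoProfiler types differ.
public sealed class DemoSession
{
    public Guid Id { get; } = Guid.NewGuid();
    public string Name { get; set; }
}

public static class DemoSessionContext
{
    // Global cache: id -> weakly referenced session, so the GC can still collect it.
    private static readonly ConcurrentDictionary<Guid, WeakReference<DemoSession>> Cache =
        new ConcurrentDictionary<Guid, WeakReference<DemoSession>>();

    // Only the immutable id flows with the execution context.
    private static readonly AsyncLocal<Guid?> CurrentId = new AsyncLocal<Guid?>();

    public static void SetCurrent(DemoSession session)
    {
        Cache[session.Id] = new WeakReference<DemoSession>(session);
        CurrentId.Value = session.Id;
    }

    // Resolves the flowing id back to the mutable instance, if it is still alive.
    public static DemoSession GetCurrent()
    {
        Guid? id = CurrentId.Value;
        if (id == null) return null;

        if (Cache.TryGetValue(id.Value, out var weak) && weak.TryGetTarget(out var session))
            return session;

        // The session was collected (or never cached): drop the stale entry.
        Cache.TryRemove(id.Value, out _);
        return null;
    }

    public static void ClearCurrent()
    {
        Guid? id = CurrentId.Value;
        if (id != null) Cache.TryRemove(id.Value, out _);
        CurrentId.Value = null;
    }
}
```

Mutating the object returned by GetCurrent changes the shared instance, while the value held in the flowing context (the id) never changes after it is set.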

Lock-free Profiling Data Storing & Processing

To ensure that profiling does not add performance overhead to the application code being profiled, storing and processing the profiling data must be fast. At runtime, the data of each profiling step is stored in a lock-free, append-only way. At the end of a profiling session, we have a list of raw profiling steps, and only when we need to persist and analyze the data do we process the raw data to build the profiling session hierarchy. The processing of the raw data happens only in an async, single-threaded queue worker, so it never affects the response time of the main threads being profiled.
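A minimal sketch of this split between cheap appends and deferred processing is shown below. The types RawStep, StepNode, and RawStepLog are hypothetical and do not mirror NanoProfiler's real classes; ConcurrentQueue<T> is used as a readily available thread-safe, append-only stand-in for the library's own store. In a real setup, BuildHierarchy would be called from the background queue worker rather than from a request thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat record of one finished profiling step.
public sealed class RawStep
{
    public RawStep(Guid id, Guid? parentId, string name, long durationMs)
    {
        Id = id; ParentId = parentId; Name = name; DurationMs = durationMs;
    }
    public Guid Id { get; }
    public Guid? ParentId { get; }
    public string Name { get; }
    public long DurationMs { get; }
}

// Tree node built only when the data is persisted or analyzed.
public sealed class StepNode
{
    public StepNode(RawStep step) { Step = step; }
    public RawStep Step { get; }
    public List<StepNode> Children { get; } = new List<StepNode>();
}

public sealed class RawStepLog
{
    private readonly ConcurrentQueue<RawStep> _steps = new ConcurrentQueue<RawStep>();

    // Hot path: append only, no hierarchy work on the profiled thread.
    public void Append(RawStep step) => _steps.Enqueue(step);

    // Deferred path: link each step to its parent by id.
    public List<StepNode> BuildHierarchy()
    {
        var nodes = _steps.ToArray().Select(s => new StepNode(s)).ToList();
        var byId = nodes.ToDictionary(n => n.Step.Id);
        var roots = new List<StepNode>();

        foreach (var node in nodes)
        {
            Guid? parentId = node.Step.ParentId;
            if (parentId.HasValue && byId.TryGetValue(parentId.Value, out var parent))
                parent.Children.Add(node);
            else
                roots.Add(node); // no known parent: treat as a top-level step
        }
        return roots;
    }
}
```

Because every node is indexed before linking, the order in which steps were appended does not matter; a child step that finished (and was appended) before its parent is still attached correctly.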