Fixing issues with CPU
Problem
Few weeks ago we noticied that CPU consumption is very hight in one of our prod servers. We have several servers with services written in .net core
. As temporary solution we increased size of Azure VM
and started the investigation.
Investigation
Since hight CPU consuption was found only in 1 of our servers, it was clear what service produce these problems. But unfortunatelly it was unclear what part of this service contains bottleneck.
We tried to identify the bottleneck with help of DataDog, but it didn’t show any useful information, it was obviuos that latency is hight, but it was not clear where is the problem. That was because part of the code with bottleneck
was not covered with metrics.
After DataDog we decided to come from the other side and received traces from our server. Really it is not good practise to get traces from production, but it is the fastest way and it was possible to use it since it was night, there were not so many clients online, and it was just only one of our services.
So, here are traces which we received from CPU
:
Performance-analysis showed that there were a lot of threads, each one spend ~1.9% of CPU. These traces say that the most of the time CPU spends on iterating IEnumerable
. In my practise i faced with a lot of situations when rough work with collections produces issues with CPU or RAM. But this case was a little bit wired - iteration of these IEnumerable
was caused by Automapper
.
First idea was - that the problem was in code for some complex mappings, where for example OrderModelA.Qty
mapped from OrderModelB.Items.Sum()
.
Second idea - the problem is in automapper configuration itself. To answer this question we covered both pieces of code with metrics, and started to wait. Soon we received our first metrics and now it was clear where is the root of the problem:
So the second guess was confirmed, the problem was not in mapping, but in the mapping configurations.
Fixing
Diving deep inside the code, found that automapper
was injected in .net core
application as AddScoped
. That means that automapper was initializing for each request, each time iterating all mappings, wasting CPU resources. Looks like this approach was implemented a long time ago, and a lot of developers passed by every day, and did not notice that. After changing DI
to AddSingleton
code was send to testing server. And here are results:
TraceView:
Latency:
The Latency was reduced from 5-10 sec to 30-50 msec, 100-333 times!
The picture for CPU consumption is also amazing from 23% to 10%:
Conclusion
This challenge learned us:
- Automapper configuration may have significant impact on the performance;
- it is important to choose a correct way of registering service in
.net core
application; - sometimes, if some mechanism works for a long time, it doesn’t mean it works well, may be there is a problem inside.
Read more
https://docs.microsoft.com/ru-ru/aspnet/core/fundamentals/dependency-injection?view=aspnetcore-5.0