Fixing issues with CPU
Few weeks ago we noticied that CPU consumption is very hight in one of our prod servers. We have several servers with services written in
.net core. As temporary solution we increased size of
Azure VM and started the investigation.
Since hight CPU consuption was found only in 1 of our servers, it was clear what service produce these problems. But unfortunatelly it was unclear what part of this service contains bottleneck.
We tried to identify the bottleneck with help of DataDog, but it didn’t show any useful information, it was obviuos that latency is hight, but it was not clear where is the problem. That was because part of the code with
bottleneck was not covered with metrics.
After DataDog we decided to come from the other side and received traces from our server. Really it is not good practise to get traces from production, but it is the fastest way and it was possible to use it since it was night, there were not so many clients online, and it was just only one of our services.
So, here are traces which we received from
Performance-analysis showed that there were a lot of threads, each one spend ~1.9% of CPU. These traces say that the most of the time CPU spends on iterating
IEnumerable. In my practise i faced with a lot of situations when rough work with collections produces issues with CPU or RAM. But this case was a little bit wired - iteration of these
IEnumerable was caused by
First idea was - that the problem was in code for some complex mappings, where for example
OrderModelA.Qty mapped from
Second idea - the problem is in automapper configuration itself. To answer this question we covered both pieces of code with metrics, and started to wait. Soon we received our first metrics and now it was clear where is the root of the problem:
So the second guess was confirmed, the problem was not in mapping, but in the mapping configurations.
Diving deep inside the code, found that
automapper was injected in
.net core application as
AddScoped. That means that automapper was initializing for each request, each time iterating all mappings, wasting CPU resources. Looks like this approach was implemented a long time ago, and a lot of developers passed by every day, and did not notice that. After changing
AddSingleton code was send to testing server. And here are results:
The Latency was reduced from 5-10 sec to 30-50 msec, 100-333 times!
The picture for CPU consumption is also amazing from 23% to 10%:
This challenge learned us:
- Automapper configuration may have significant impact on the performance;
- it is important to choose a correct way of registering service in
- sometimes, if some mechanism works for a long time, it doesn’t mean it works well, may be there is a problem inside.