Fixing issues with CPU

Problem

Few weeks ago we noticied that CPU consumption is very hight in one of our prod servers. We have several servers with services written in .net core. As temporary solution we increased size of Azure VM and started the investigation.

Investigation

Since hight CPU consuption was found only in 1 of our servers, it was clear what service produce these problems. But unfortunatelly it was unclear what part of this service contains bottleneck.

We tried to identify the bottleneck with help of DataDog, but it didn’t show any useful information, it was obviuos that latency is hight, but it was not clear where is the problem. That was because part of the code with bottleneck was not covered with metrics.

After DataDog we decided to come from the other side and received traces from our server. Really it is not good practise to get traces from production, but it is the fastest way and it was possible to use it since it was night, there were not so many clients online, and it was just only one of our services.

So, here are traces which we received from CPU:

_config.yml

Performance-analysis showed that there were a lot of threads, each one spend ~1.9% of CPU. These traces say that the most of the time CPU spends on iterating IEnumerable. In my practise i faced with a lot of situations when rough work with collections produces issues with CPU or RAM. But this case was a little bit wired - iteration of these IEnumerable was caused by Automapper.

First idea was - that the problem was in code for some complex mappings, where for example OrderModelA.Qty mapped from OrderModelB.Items.Sum().

Second idea - the problem is in automapper configuration itself. To answer this question we covered both pieces of code with metrics, and started to wait. Soon we received our first metrics and now it was clear where is the root of the problem:

_config.yml

So the second guess was confirmed, the problem was not in mapping, but in the mapping configurations.

Fixing

Diving deep inside the code, found that automapper was injected in .net core application as AddScoped. That means that automapper was initializing for each request, each time iterating all mappings, wasting CPU resources. Looks like this approach was implemented a long time ago, and a lot of developers passed by every day, and did not notice that. After changing DI to AddSingleton code was send to testing server. And here are results:

TraceView: _config.yml

Latency:

_config.yml

The Latency was reduced from 5-10 sec to 30-50 msec, 100-333 times!

The picture for CPU consumption is also amazing from 23% to 10%:

_config.yml

Conclusion

This challenge learned us:

  • Automapper configuration may have significant impact on the performance;
  • it is important to choose a correct way of registering service in .net core application;
  • sometimes, if some mechanism works for a long time, it doesn’t mean it works well, may be there is a problem inside.

Read more

https://docs.microsoft.com/ru-ru/aspnet/core/fundamentals/dependency-injection?view=aspnetcore-5.0

Written on July 7, 2021