How to make your APIs / Library code run ~10x faster.
Sometimes an app on your phone lags. What do you do? You restart it. The same goes for a laptop or any other electronic device. What does this restart do that makes the device start working again? It resets the working memory (RAM).
If we look at performance issues, many of them are related to memory. Therefore, reducing memory usage, or simply allocating less, will usually improve the performance of an API.
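Let’s brush up on a few concepts before moving ahead. What is a value type, and what is a reference type? A value type holds its data directly, and as a local variable it typically lives on the stack. A reference type is an object that lives on the managed heap and is accessed through a reference. E.g. a struct is a value type (a struct can end up on the heap as well, for example when it is boxed or is a field of a class, but I will address that in another post), while a class/object is a reference type.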
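To make the difference concrete, here is a minimal sketch. The `PointStruct` and `PointClass` types are made up for this illustration and are not from the benchmark. It shows that assigning a struct copies the whole value, while assigning a class instance copies only the reference to the same heap object.

```csharp
using System;

// Hypothetical types, for illustration only.
public struct PointStruct { public int X; public int Y; }   // value type
public class PointClass { public int X; public int Y; }     // reference type

public static class ValueVsReference
{
    // Returns the X of the original struct and of the original class instance
    // after their "copies" have been modified.
    public static (int structX, int classX) Demo()
    {
        var s1 = new PointStruct { X = 1, Y = 2 };
        var s2 = s1;   // copies the whole value
        s2.X = 99;     // s1 is not affected

        var c1 = new PointClass { X = 1, Y = 2 };
        var c2 = c1;   // copies only the reference; both point to one heap object
        c2.X = 99;     // c1 sees the change

        return (s1.X, c1.X);
    }
}
```

Calling `ValueVsReference.Demo()` gives back the unchanged struct field and the changed class field, which is the whole difference in one line.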
Okay Sumit, we got this. But why are you telling us about the stack and the managed heap? If you write C, you have to manage memory yourself (for example, allocate space for an array and free it once you are done), whereas languages like C# and Java have a garbage collector (GC) and a memory management system that take care of allocating and cleaning up memory. Such languages take that complexity away from programmers, and the GC does the heavy lifting.
The GC only operates on the managed heap, and while a collection runs it can pause the application's threads. Remember the lag I mentioned at the beginning of the post? That kind of lag shows up when memory is running short and the GC is working aggressively to free some so that the program can keep running. This is why we need to give more thought to every new object we create in a program.
Anyway, if the GC operates only on the managed heap, can we use the stack so that there is no GC work and more performance? YUSSS!!! We can exploit the stack to write high-performing code. The one limitation to take care of is that the stack is small (typically about 1 MB per thread by default in .NET on Windows), so if we put a lot of stuff on it we will get the classic stack overflow error!! 😁 To conclude: we can avoid GC and lag by using the stack, but we need to be careful about how we use it.
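One common way to stay careful is to use the stack only for small buffers and fall back to the heap for big ones. The sketch below assumes a made-up threshold (`MaxStackInts`) and a made-up helper (`SumFirstN`). It is one possible pattern, not the code from my benchmark.

```csharp
using System;

public static class StackBuffer
{
    // Assumed threshold for illustration: 256 ints = 1 KB of stack.
    private const int MaxStackInts = 256;

    // Fills a buffer with 0..count-1 and returns the sum.
    public static long SumFirstN(int count)
    {
        // Small buffers go on the stack; large ones fall back to a heap array
        // so that a big count cannot overflow the stack.
        Span<int> buffer = count <= MaxStackInts
            ? stackalloc int[count]
            : new int[count];

        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i;
        }

        long sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum;
    }
}
```

The calling code does not care where the memory came from, because both branches end up behind the same `Span<int>`.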
Let’s benchmark code and see the results.
Example: In this example, we will create 100,000 (1 lakh) records and process them. The example is simple, but it is representative: if you are building ERP systems, data pipelines or machine learning APIs, you will have collections of records and you will need to process them.
In the screenshot, we can see two methods: ProcessData() and ProcessDataStackVersion().
ProcessData() is the classic version we usually write: define a class, create a list of it and fill in the data. ProcessDataStackVersion() is a little different: we use stackalloc to allocate space on the stack, and we use a struct to define a data type that can live on the stack. ProcessData() allocates on the heap, and the GC will be triggered to clean up this memory, whereas ProcessDataStackVersion() allocates on the stack, so no GC is triggered.
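In case the screenshot is hard to read, here is a rough sketch of what two such methods could look like. It is not the exact benchmarked code: the record types, their fields and the `count` parameter are assumptions made for illustration. Only the method names and the list vs. stackalloc idea come from the description above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record types, for illustration only.
public class RecordClass { public int Id; public int Value; }
public struct RecordStruct { public int Id; public int Value; }

public static class ProcessingSketch
{
    // Classic version: one heap object per record, plus the list itself.
    public static long ProcessData(int count)
    {
        var records = new List<RecordClass>(count);
        for (int i = 0; i < count; i++)
        {
            records.Add(new RecordClass { Id = i, Value = i * 2 });
        }

        long total = 0;
        foreach (var record in records)
        {
            total += record.Value;
        }
        return total;
    }

    // Stack version: all records live in one stack-allocated block.
    // Each RecordStruct is 8 bytes, so 100,000 of them need about 800 KB,
    // which is close to a 1 MB stack. Keep count small on threads with
    // smaller stacks.
    public static long ProcessDataStackVersion(int count)
    {
        Span<RecordStruct> records = stackalloc RecordStruct[count];
        for (int i = 0; i < records.Length; i++)
        {
            records[i] = new RecordStruct { Id = i, Value = i * 2 };
        }

        long total = 0;
        foreach (var record in records)
        {
            total += record.Value;
        }
        return total;
    }
}
```

Both methods compute the same total, so the only thing that differs is where the records are stored.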
In the screenshot, you can see that ProcessData() takes 9106 microseconds while ProcessDataStackVersion() takes 784.3 microseconds (about 11.6x faster than the classic one). Gen 0, Gen 1 and Gen 2 (I will explain what these are in a separate post) and the allocated memory are all zero for ProcessDataStackVersion(), as everything is allocated on the stack and nothing on the heap. This is sometimes referred to as zero allocation, since there is no allocation on the heap. This way we allocated less and performance improved 🚀🚀🚀.
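If you want a quick check of heap allocations without a benchmarking tool, one rough option is to read the per-thread allocation counter before and after the code runs. The `AllocationProbe` helper below is a made-up name for this sketch. It relies on `GC.GetAllocatedBytesForCurrentThread()`, which is available in .NET Core and in recent .NET Framework versions, and it only counts managed heap allocations made on the calling thread.

```csharp
using System;

public static class AllocationProbe
{
    // Returns how many bytes the action allocated on the managed heap
    // (current thread only). Stack allocations are not counted.
    public static long Measure(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }
}

public static class AllocationProbeDemo
{
    public static void Run()
    {
        // Heap allocation: a 10,000-byte array, kept alive so it is really created.
        long heapBytes = AllocationProbe.Measure(() => GC.KeepAlive(new byte[10_000]));

        // Stack allocation: the buffer never touches the managed heap.
        long stackBytes = AllocationProbe.Measure(() =>
        {
            Span<byte> buffer = stackalloc byte[1_000];
            buffer[0] = 1;
        });

        Console.WriteLine($"heap version allocated:  {heapBytes} bytes");
        Console.WriteLine($"stack version allocated: {stackBytes} bytes");
    }
}
```

This is much cruder than a proper benchmark run, but it is enough to see whether a method allocates on the heap at all.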
These days, writing an API that works and scales on the cloud is trivial, but writing an efficient one that consumes fewer resources and does not run up the cloud bill is still an art 😁
Hope you enjoyed reading this article!!!