Recently, I was involved in performance testing for several projects at Caplin and for our customers. One thing I noticed is that when we do functional testing, we use different tiers of tests (unit, acceptance, integration and end to end). These tiers give developers fast feedback when something goes wrong, and they give the Product Manager and customers high visibility of which behaviour is covered by tests and which is not. However, when we start writing performance tests, we do not consider different tiers at all. So I would like to take this chance to discuss the different tiers, or test triangle, for performance testing.
Do we need different tiers of performance testing?
The first question is always whether it is necessary to have different tiers at all. In my experience, two issues come up when we get performance test results.
Issue 1: We notice a performance degradation in a performance test and tell the developers. They have two options: check all the code changes between the two test runs, or use a profiling tool to figure out which part of the code is causing the issue. Both options take time.
Issue 2: When customers ask for performance results, they don’t really care about the performance of a specific request or a specific piece of functionality. What they care about is the overall end to end performance and the user’s experience. Take a trading system as an example. When customers ask about performance, what we normally provide is the latency of price updates when we receive x updates per second, or the response time for performing a trade. Do customers want to know these results? Yes, they do, but I don’t believe that is what they care about most. What they care about most is whether the latency (response time) will be acceptable when x traders are using the system. We cannot really answer this question, because we only test each individual piece and we don’t know what will happen when they all work together. Some people will wonder why we don’t merge all these tests into one. Well, that would make the developers unhappy: when they find a performance degradation, they would also have to figure out which component is actually causing it.
From these two issues we can see that developers and customers look at performance tests from different angles. That is exactly why we need different tiers of performance tests: to meet the needs of both developers and customers.
What should these tiers be?
Unit Level
This level of performance test is mainly there to give developers fast feedback and to help them figure out which piece of code causes the problem when there is a performance degradation. The tests should therefore be fast, and each should target only a small amount of code. Unlike functional unit tests, where we want to cover most of the code, performance unit tests should only cover functions that are called a huge number of times per second or functions that implement complex algorithms.
Let’s use the trading system as an example again. Normally we have a large number of users subscribing to a list of currency pairs that update several times per second at peak. The functions that handle a price update and distribute it to all the clients should therefore be covered by unit level performance tests. Assume we have 3000 users, each subscribed to 40 currency pairs that update 4 times per second at peak. These functions will then be called 3000 * 40 * 4 = 480,000 times every second. If they take more than about 2 microseconds of CPU time per call (1 second / 480,000 ≈ 2.08 microseconds), you will run out of CPU if you are single threaded. So it is really useful to write a unit level performance test that runs these functions 480,000 times and checks how long it takes to finish all the calls. When you notice a performance degradation in these functions, you are likely to see a big performance degradation in your product.
We should also write performance unit tests for complex algorithms, such as functions that do filtering and sorting or perform complex calculations, because a good algorithm will perform much faster than a bad one. Take sorting as an example: a bad algorithm (one that needs about n^2 steps) will take 10,000,000,000 steps to sort a 100,000 row container, but a good one (about n * log2(n) steps) will take about 1,661,000 steps to do the same job. If each step takes 1 nanosecond, the bad algorithm will take 10 seconds while the good one will take less than 2 ms.
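The arithmetic above can be turned into a small unit level performance test. The following is only a minimal Python sketch under stated assumptions, not a definitive implementation: `handle_price_update` is a hypothetical stand-in for the real price handling function, and the helper names `time_calls`, `check_budget` and `sort_steps` are made up for illustration. It works out the per-call budget from the numbers above, times one simulated second of load, and reproduces the step counts from the sorting example.

```python
import math
import time

USERS = 3000
PAIRS_PER_USER = 40
UPDATES_PER_SECOND = 4  # per currency pair, at peak

CALLS_PER_SECOND = USERS * PAIRS_PER_USER * UPDATES_PER_SECOND  # 480,000
BUDGET_PER_CALL_US = 1_000_000 / CALLS_PER_SECOND  # about 2.08 microseconds


def handle_price_update(pair, price):
    """Hypothetical stand-in for the real price update handler."""
    return (pair, round(price, 5))


def time_calls(func, calls):
    """Return the wall-clock seconds taken to call func `calls` times."""
    start = time.perf_counter()
    for _ in range(calls):
        func("EURUSD", 1.23456)
    return time.perf_counter() - start


def check_budget(func, calls=CALLS_PER_SECOND, budget_seconds=1.0):
    """Return (elapsed_seconds, within_budget) for one simulated second of load."""
    elapsed = time_calls(func, calls)
    return elapsed, elapsed <= budget_seconds


def sort_steps(n):
    """Return rough step counts (n^2, n * log2(n)) for sorting n rows."""
    return n * n, n * math.log2(n)


if __name__ == "__main__":
    elapsed, ok = check_budget(handle_price_update)
    print(f"{CALLS_PER_SECOND} calls took {elapsed:.3f} s "
          f"(budget per call: {BUDGET_PER_CALL_US:.2f} us), within budget: {ok}")
    bad, good = sort_steps(100_000)
    print(f"bad sort: {bad:,} steps, good sort: {good:,.0f} steps")
```

Note that the sketch measures wall-clock time with `time.perf_counter`, and that the loop overhead of Python itself is included in the measurement. In a real test you would call the production function in its own language and compare the result against a threshold that suits your hardware.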
Component Level
This level of performance test helps both the developers and the customer. It focuses on the main functionality or behaviour of the product: for example, how many price updates we can handle per second, how many trades we can execute per second, how many orders we can submit per second, and so on. It is very similar to the performance tests most people write nowadays. For developers, when an issue shows up in a component level performance test but not in the unit level performance tests, it normally means that we have missed some functions that are called a huge number of times, so we only need to look at the functions that are not covered by unit level performance tests. We should fix the issue and add unit level performance tests for those functions as well. Normally, component level performance tests shouldn’t pick up many performance issues. If most performance issues are picked up at the component level, we need to review the coverage of the unit level performance tests. This level of performance test also gives customers a rough idea of the performance of each individual piece of functionality and helps them understand which areas need more investment to improve performance.
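A component level test of the "how many price updates can we handle per second" kind could look roughly like the sketch below. It is an illustration only: `PricingComponent` and `measure_throughput` are hypothetical names, and the component here just stores the latest price per currency pair, where a real one would do far more work.

```python
import time


class PricingComponent:
    """Hypothetical stand-in for a real component that handles price updates."""

    def __init__(self):
        self.latest = {}

    def on_update(self, pair, price):
        # Keep only the most recent price for each currency pair.
        self.latest[pair] = price


def measure_throughput(component, updates):
    """Return processed updates per second for a list of (pair, price) tuples."""
    start = time.perf_counter()
    for pair, price in updates:
        component.on_update(pair, price)
    elapsed = time.perf_counter() - start
    return len(updates) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    updates = [("EURUSD", 1.0 + i * 1e-5) for i in range(100_000)]
    rate = measure_throughput(PricingComponent(), updates)
    print(f"handled about {rate:,.0f} updates per second")
```

The same shape works for trades or orders per second: drive the component through its public entry point, count what it processed, and divide by the elapsed time.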
End to End Level
This level focuses on the needs of the customer. To design and implement end to end level performance tests, you need to think like a customer and talk with customers to understand what their typical and peak usage looks like. For a trading system, it is mainly the traders using the system. So you need to understand how many traders will be using the system and how each trader normally uses it: how many prices do they have on screen, and how many trades do they perform every hour? Remember, you need these numbers for both normal usage and peak usage. Then you can simulate a trading environment that helps customers understand what their user experience will be, performance-wise. Normally, your end to end level performance tests shouldn’t pick up any performance issue that was not picked up by the lower level performance tests, because figuring out where such an issue is in the code is a very difficult job for the developers.
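One way to capture those usage numbers is as a small workload profile that the end to end test can be driven from. The Python sketch below is an assumption-laden illustration: the `UsageProfile` class and its field names are hypothetical, and every number in `NORMAL` and `PEAK` is a placeholder (the peak price figures simply reuse the 3000 users, 40 currency pairs and 4 updates per second from the unit level example). Real values have to come from talking to the customer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageProfile:
    """Hypothetical description of how traders use the system."""

    traders: int
    prices_on_screen: int
    updates_per_price_per_second: float
    trades_per_trader_per_hour: float

    def price_updates_per_second(self):
        """Total price updates the system must deliver each second."""
        return (self.traders * self.prices_on_screen
                * self.updates_per_price_per_second)

    def trades_per_second(self):
        """Total trades the system must execute each second."""
        return self.traders * self.trades_per_trader_per_hour / 3600


# Placeholder numbers only; real values must come from the customer.
NORMAL = UsageProfile(traders=1000, prices_on_screen=20,
                      updates_per_price_per_second=1,
                      trades_per_trader_per_hour=6)
PEAK = UsageProfile(traders=3000, prices_on_screen=40,
                    updates_per_price_per_second=4,
                    trades_per_trader_per_hour=36)

if __name__ == "__main__":
    for name, profile in (("normal", NORMAL), ("peak", PEAK)):
        print(f"{name}: {profile.price_updates_per_second():,.0f} updates/s, "
              f"{profile.trades_per_second():.2f} trades/s")
```

Keeping the normal and peak profiles side by side makes it easy to run the same simulation twice and report both results to the customer.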