In the last few months, we have helped several clients migrate from on-premises infrastructure to Microsoft Azure. Many of them asked us what maximum user load their configuration can support.
Answering this question is not easy because the answer depends on many factors: how the application has been developed, how it manages user connections and memory, and how many external resources it calls.
Performance tests give us a more precise answer: they show how our application behaves under different user workloads.
Performance tests are further divided into multiple types; the most common are load tests and stress tests. The first type helps identify the maximum load of our application by simulating a high number of concurrent users and determining the application's operating capacity. The second type identifies the application's reliability under, or after, a high user workload.
Now, let's figure out how to implement these kinds of tests: today, many frameworks and services are available for this purpose.
I have used a few of them, including:
- Web Application Load and Performance Testing
- Cloud-Based Load Testing from Azure DevOps or Azure App Service
- Apache JMeter
For some of our clients, we used the Visual Studio Web Application Load and Performance Testing framework. Unfortunately, this is currently deprecated, and Visual Studio 2019 will be the last version to support it. Despite this bad news, the framework is handy and has an excellent customization level, including a web request recorder and a tool to create a test rig for our tests.
To investigate problems in the sample application, we will collect a few metrics available in the load test results and the Azure App Service metrics overview. Some fundamental metrics are:
- Time to First Byte (TTFB): it represents the time a client's web request waits to receive the first byte of content from the server. This metric includes the socket connection time, the HTTP request send time, and the first-byte receive time;
- Page Response Time: it represents the waiting time to receive the server's entire web response (including all related resources). This metric also includes the TTFB;
- Number of warnings and errors: during the test phase, we must always check the number of warnings and errors. For example, when we use an Azure App Service, we should watch for errors such as 503 Service Unavailable or SocketException.
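As an illustration of what TTFB captures, the following hedged Python sketch measures it with a raw socket against a tiny local server that delays its response before sending the first byte; the host, port, and delay are made up for the example.

```python
import socket
import threading
import time

def slow_server(port_holder, delay=0.05):
    """Toy HTTP server that waits `delay` seconds before answering."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))          # ephemeral port
    srv.listen(1)
    port_holder.append(srv.getsockname()[1])
    conn, _ = srv.accept()
    conn.recv(1024)                     # read the request
    time.sleep(delay)                   # simulate server-side work
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()
    srv.close()

port_holder = []
t = threading.Thread(target=slow_server, args=(port_holder,))
t.start()
while not port_holder:                  # wait for the server to publish its port
    time.sleep(0.001)

start = time.perf_counter()
s = socket.create_connection(("127.0.0.1", port_holder[0]))  # socket connect
s.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")      # send the request
first = s.recv(1)                       # block until the first response byte
ttfb = time.perf_counter() - start      # Time to First Byte
body = b""
while True:                             # drain the rest of the response
    chunk = s.recv(4096)
    if not chunk:
        break
    body += chunk
total = time.perf_counter() - start     # full response time, includes TTFB
s.close()
t.join()
print(f"TTFB: {ttfb * 1000:.1f} ms, total: {total * 1000:.1f} ms")
```

Because the server sleeps 50 ms before responding, the measured TTFB is at least that long, while the total response time is always greater than or equal to the TTFB, mirroring the relationship between the two metrics above.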
In this article, we will demonstrate:
- How to prepare an Azure App Service for testing purposes;
- How to record test scenarios;
- How to configure a test rig to execute scenarios;
- How to identify and solve bottlenecks.
The sample application is available here and is deployed to an Azure App Service with an S1 plan.
To run a load test, we need a set of test scenarios covering the pages our users visit most. Using Visual Studio 2019, we can create a new "Web Performance and Load Test Project [Deprecated]" project. As the name implies, this project type has been deprecated and does not appear among the available templates in the project creation wizard. To enable it, we should use the Visual Studio Installer, as in the following figure:
Let's create the project using the template suggested above; at the end of the creation, we have a web performance test ready to be recorded. Clicking the "Add Recording" button opens the browser with the "Web Test Recorder," which records all the requests. Once the scenario is complete, click the "Stop" button.
The next step is creating the load test. Right-click the web performance test project in Solution Explorer and add a new "Load Test" item. Once created, a configuration wizard appears with the following steps:
- Execution type: the available choices are “Cloud-based Load Test with Azure DevOps” (not available anymore) or “On-premises Load Test.” In this case, we need to select the second one;
- Test duration: we can choose a duration or a number of iterations, depending on how much we want to stress our application. Usually, a ten-minute test is enough to surface a few bottlenecks;
- Scenario name and think time (to simulate user interaction);
- Load pattern: constant or step;
- Test Mix Model: how the tests are distributed among the virtual users;
- Test Mix: the list of web performance tests to execute;
- Network Mix: to simulate different network types (e.g., LAN, 3G);
- Browser Mix: to simulate different browsers;
- Counter Sets: the performance counters to collect from the machines involved in the test.
Once these settings are in place, we are ready to execute our test; the remaining question is which machine will run it. We could start the test from Visual Studio on our own machine, but this approach is far from realistic because all web requests would come from the same source. Moreover, if the number of web requests is very high, it could push our machine to its hardware limits.
As an alternative, we can use a test rig: a group of dedicated machines that distributes the web request load.
Using the tools provided by Microsoft, creating a test rig is relatively simple. We need to install a Test Controller that manages one or more Test Agents. There is no predefined number of test agents; it depends on how much load we need to generate.
An example of a test rig topology is available in the following figure:
This executable will install a test controller or a test agent. The controller also requires an instance of SQL Server to store all the test results. Once everything is installed, it is possible to register agents to the controller.
In my case, I have five virtual machines created on Microsoft Azure. One of them has both Visual Studio and the Test Controller installed. The remaining ones are configured as Test Agents. The environment can then be linked to the load test created before using the “Manage Test Controller” button.
We are ready for our first execution.
Execution of the first load test
To execute the load test, we can hit the "Run Load Test" button. In my case, I configured the test to run for ten minutes, starting with one hundred users and adding more every five seconds until reaching four thousand concurrent users.
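The ramp-up described above can be sketched as a small function. Note that the step size of one hundred users is an assumption for the example: the configuration states the starting count, the interval, and the maximum, not the increment.

```python
def users_at(seconds, initial=100, step=100, interval=5, maximum=4000):
    """Concurrent users at a given second under a stepped ramp-up:
    start at `initial`, add `step` users every `interval` seconds,
    and cap the total at `maximum`."""
    return min(initial + (seconds // interval) * step, maximum)

assert users_at(0) == 100      # test start
assert users_at(5) == 200      # after the first step
assert users_at(600) == 4000   # by the ten-minute mark the ramp is capped
```

With these assumed values, the maximum of four thousand users is reached after about three minutes and held for the rest of the ten-minute run.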
At startup, Visual Studio opens a real-time monitoring page that includes all the performance counters, helping to identify resource usage. The next figure summarizes the web request trend during the ten-minute execution. It is interesting to see how the Page Response Time increases with the number of executed tests.
Once the test is completed, I have about one thousand errors (this is a configurable limit), with 20,391 failed tests out of a total of 23,482. The number of failed web requests is 146,725. Everything is detailed in both the Tables and Summary sections:
It is worth noticing that the failed requests are related to HTTP status code 503 Service Unavailable, which indicates that I have hit the quota limit of my App Service plan. I have found the limit of the current configuration. What can I do at this point?
Looking at the metrics from the App Service, we notice a high number of web requests:
Since the failed requests return a 503 HTTP status code, we know that increasing the number of App Service instances will help.
We can increase them using the Scale-Out section on the App Service page:
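The same scale-out can also be scripted instead of using the portal. As a hedged sketch with the Azure CLI (the resource group and plan names are placeholders, and the instance count is an example value):

```shell
# Scale the App Service plan out to three instances.
# "my-rg" and "my-plan" are placeholder names for this example.
az appservice plan update \
  --resource-group my-rg \
  --name my-plan \
  --number-of-workers 3
```

Scripting the change makes it easy to scale back down after the emergency, or to wire the same operation into an autoscale rule later.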
We are ready to test the updated configuration.
Second load test
Let's start the second test and see what happens. This time the situation is much better: we have just one failed test, with a 500 HTTP status code. At the same time, we executed a higher number of web requests because the page response times are lower. The total number of tests is 54,255.
Of course, this is a simple use case with an easy-to-identify problem. Many scenarios require more time and investigation than this one, such as collecting memory dumps and analyzing them with memory profilers like ANTS Profiler.
Azure App Service includes a set of diagnostic tools to investigate performance problems. For example, we can collect the memory dump and analyze it in PerfView, searching for memory leaks.
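For .NET applications, one possible way to capture such a dump outside the portal's diagnostic tools is the dotnet-dump global tool. This is a hedged sketch, and the process id is a placeholder:

```shell
# Install the dotnet-dump global tool, then capture a managed memory dump
# of the target process (1234 is a placeholder pid). The resulting .dmp
# file can be copied to a workstation and analyzed in PerfView.
dotnet tool install --global dotnet-dump
dotnet-dump collect --process-id 1234
```
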
Finally, identifying performance problems is an iterative process: we should monitor our applications continuously, both reactively and proactively.
In this article, we saw how to identify performance problems in our application using Microsoft tools. The solution adopted to improve the sample application's performance, scaling out the instances, is only the first step in managing an emergency. If the performance problems persist, a more accurate investigation of the application's resource usage is needed. But we will examine this aspect in a future article.