Performance Testing for Massive Scale with Anvesh Malhotra

By Test Guild

About this Episode:

How do you run performance testing at massive scale? In this episode, Anvesh Malhotra, a Senior Software Developer at Gannett | USA TODAY NETWORK, shares his experience using Artillery.io in a serverless environment. Discover how to implement a serverless architecture using Google Cloud Functions to execute performance tests at rates as high as 1 million requests per minute. Listen in to hear how this setup allowed Gannett to dominate the news cycle for the 2020 U.S. presidential elections.

TestGuild Performance Exclusive Sponsor

SmartBear is dedicated to helping you release great software, faster, so they made two great tools. Automate your UI performance testing with LoadNinja and ensure your API performance with LoadUI Pro. Try them both today.

About Anvesh Malhotra

Anvesh is a Senior Software Developer at Gannett | USA TODAY NETWORK

Full Transcript

    Hello! I'm Anvesh Malhotra and I work as a senior software developer at Gannett USA Today Network. Today I will talk about how we execute performance tests at massive scale. I will cover what performance testing is and the most commonly available tools for executing performance tests. My focus will be mainly on Artillery.io. I will talk about the infrastructure requirements for hosting an in-house performance testing tool. I will cover how we built a serverless tool around Artillery.io and how it helps us execute performance tests at massive scale, followed by our success story from the 2020 presidential elections. Lastly, I will cover our future roadmap with fuzz testing and BigQuery. So without any further ado, let's start.

    Gannett USA Today Network is a leading local-to-national media and marketing solutions company. We are the largest local media company in America, with 100-plus newsrooms spanning 46 states. Gannett has over 2,000 journalists across those newsrooms and a combined readership of over 150 million unique monthly users and one billion monthly page views.

    Performance testing is a process for testing the speed, response times, infrastructure stability, and responsiveness of an application under load, across multiple platforms. Performance really is in the eye of the beholder: a well-performing application is one that lets the end user carry out a given task without undue perceived delay. You can either be a firefighter and fix performance-related defects in your application code later, or you can take measures so there's no fire in the first place: consider performance during design, plan for application scalability, and execute performance tests to uncover issues as early as possible.

    Performance testing is critical to customer satisfaction. If your application's performance doesn't meet your customers' expectations, they will move on to your competitor.

    There are a wide variety of products and services that can help you build and run performance tests; I'm showing you only a small subset of the tools available. Firstly, Artillery.io is a modern, powerful, and easy-to-use load and functional testing toolkit. Apache JMeter, written in Java by the Apache Software Foundation, can measure performance for static and dynamic web applications. You can also use paid services such as BlazeMeter to execute your performance tests. WebPageTest is another tool, originally developed by AOL, that uses WebDriver to execute performance tests from multiple locations around the globe using real browsers; it offers free and paid services and the ability to host private instances of the tool. LoadRunner, another software testing tool from Micro Focus, can also help you execute your performance tests.

    Let's talk more about Artillery.io. It is an open-source application with opt-in premium services. It reports detailed performance metrics, including latency, requests per second, concurrency, and throughput, and supports peak-load testing to verify that your backend application handles maximum traffic with stability and reliability. You can write custom logic and pre- and post-test scenario hooks in JavaScript, with access to the wide variety of NPM modules out there. It supports multiple protocols, including HTTP, WebSocket, Socket.IO, Kinesis, and HLS.
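    For a sense of what those JavaScript hooks look like in practice, here is a minimal sketch of an Artillery script that wires in a custom processor function; the target, file names, and hook name are illustrative, not taken from the talk:

```yaml
# Minimal Artillery script with a custom JavaScript processor hook.
# Target, file names, and hook name are illustrative.
config:
  target: "https://example.com"
  processor: "./helpers.js"             # JS module exporting hook functions
  phases:
    - duration: 60                      # run for one minute
      arrivalRate: 5                    # 5 new virtual users per second
scenarios:
  - flow:
      - get:
          url: "/"
          beforeRequest: "addAuthHeader"  # hook exported by helpers.js
```

    Here helpers.js would export addAuthHeader(requestParams, context, ee, next), which is Artillery's standard processor function signature.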

    Running performance tests requires not only good test scenarios and a stable environment to test against, but also a stable runner infrastructure where the performance tests execute. When performance tests run, the runner needs heavy CPU and memory resources. These resources are used to mimic the behavior of a real user, including the geographical location from which requests reach the backend, which matters when the application under test sits behind a GeoIP-based load balancer. Memory and CPU are needed to create the virtual users that generate the desired load on the application. With today's modern applications there are many tools available to visualize the data in hand, so choose the right tool to visualize your test results. Controlling bandwidth is also needed when the behavior of the application is proportional to a client's bandwidth.

    To execute our performance tests, we built a tool around Artillery. We call it FaaS Artillery, short for Function as a Service. Yes, I agree we didn't find a good name for the tool. And unfortunately, it isn't an open-source application as yet. There is a similar open-source application from Nordstrom called serverless-artillery, which you can easily find on GitHub and use if you're on AWS. We use Google Cloud products and services to host our services. FaaS Artillery has two components: cloud functions and an engine to run the test.

    First, let's talk about the serverless side. A wrapper around Artillery is built and deployed to Cloud Functions, a serverless service available on GCP. This helps in quickly creating ephemeral Artillery instances on infrastructure that auto-scales when tests are executed, keeping costs low. Later I will show slides demonstrating that we are able to execute performance tests at a million requests per minute. The wrapper is deployed in two regions, us-central1 and us-east4, which lets us generate load from multiple locations against our backends, which usually sit behind load balancers. At Gannett we heavily use Docker and Kubernetes for hosting our application servers. When running performance tests, our backend applications sit behind a firewall, protected by Fastly API management. To establish network connectivity between the cloud functions and the application servers, we use a VPC connector. This allows the cloud functions to access the backend services through a secured gateway.
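    As a rough illustration of that setup, deploying such a wrapper function to both regions behind a VPC connector might look like the following; the function name, runtime, and connector name are assumptions, not Gannett's actual configuration:

```sh
# Hypothetical deployment of an Artillery wrapper function to two regions,
# attached to a VPC connector so it can reach firewalled backends.
for region in us-central1 us-east4; do
  gcloud functions deploy fas-artillery \
    --runtime nodejs12 \
    --trigger-http \
    --region "$region" \
    --vpc-connector my-vpc-connector \
    --no-allow-unauthenticated
done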

    Now let's talk about the engine behind FaaS Artillery. It is built in Golang and shipped as a binary inside a Docker container. The engine distributes load across the cloud functions by breaking down the test, then collects all the results back from the cloud functions and generates a single report. Since the cloud function is protected, a developer or tester must provide a service account with the invoker role on the cloud function. Finally, the engine forwards the test results to New Relic for visualization. This serverless architecture helps execute performance tests at massive scale while keeping costs very low.
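    Granting that invoker role amounts to a single IAM binding on the function; a sketch, with an illustrative project and service account name:

```sh
# Grant a service account permission to invoke the protected cloud function.
gcloud functions add-iam-policy-binding fas-artillery \
  --region us-central1 \
  --member "serviceAccount:perf-runner@my-project.iam.gserviceaccount.com" \
  --role "roles/cloudfunctions.invoker"
```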

    Let's take a look at a sample test and then see how we use FaaS Artillery to execute it. In this test, we create a performance test with the target endpoint foo.com. Inside the phases block, we provide the duration and arrival rate as 120 and 20 respectively. We also create a scenario called bar, which appends /bar to the endpoint. This means the script will run for two minutes at 20 requests per second, or 1,200 requests per minute. Since each cloud function has limited CPU and memory resources, it can only handle 20 requests per second. The script is reconstructed below.
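    Here is a reconstruction of that sample test as an Artillery script; the exact file layout is inferred from the talk rather than copied from Gannett's repository:

```yaml
# Reconstruction of the sample test described above.
config:
  target: "https://foo.com"
  phases:
    - duration: 120     # run for two minutes
      arrivalRate: 20   # 20 new requests per second (1,200 per minute)
scenarios:
  - name: "bar"
    flow:
      - get:
          url: "/bar"   # appends /bar to the target endpoint
```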

    Wait, hold on. Didn't I tell you that we are running performance tests at massive scale? Uh-huh. We're going to use the Docker engine to turn those 20 requests per second into 10,000 requests per second shortly. But before we dive into the load distribution, let's recap the base parameters for the container. First are the New Relic account ID and New Relic API key, which serve as the authentication for sending reports to New Relic. Next, we specify the Google application credentials, a service account used to invoke the cloud function. We also attach a volume mounted to the current directory so the container can read the test configuration. Next, we define the runtime parameters: enable New Relic reports, the name of the test (guildconferences), and the script to run (guildconferences.yaml). And finally, the number of workers is 500. This tells the engine to distribute the load across the serverless architecture, resulting in 10,000 requests per second (500 workers × 20 requests per second each). A hypothetical invocation is sketched below.
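    Since FaaS Artillery is not open source, the image name and flag spellings in this sketch are assumptions; only the parameters themselves come from the talk:

```sh
# Hedged sketch of the engine invocation described above.
docker run --rm \
  -e NEW_RELIC_ACCOUNT_ID="<account-id>" \
  -e NEW_RELIC_API_KEY="<api-key>" \
  -e GOOGLE_APPLICATION_CREDENTIALS="/work/service-account.json" \
  -v "$(pwd)":/work \
  fas-artillery-engine \
  --newrelic \
  --name guildconferences \
  --script /work/guildconferences.yaml \
  --workers 500
```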

    Although the engine can dynamically allocate the number of workers based on the requested load, these runtime parameters give the developer more control over the entire test process. Let's take a look at some of the charts. The first chart shows that 1.2 million virtual users were generated over the duration of the test (10,000 requests per second for 120 seconds). The second chart shows the average latency was 8.3 milliseconds. We can also see there were 500 cloud function instances running at the same time during the performance test. Once the tests complete, the instance count quickly drops to zero. Hence the cost of executing our tests is minimal, and it's pay-per-use only.

    Elections 2020. Elections are to us what Black Friday is to Amazon. Traffic spikes, and backend application performance is put to the test. We usually receive over one million requests per minute. Since the data is constantly changing, only 25 to 30 percent of requests are served from cache; the rest are served by the backend servers. To test our applications, we used test-generation techniques to quickly generate scenarios for 50 of our markets spanning over 600 endpoints. The first day we executed our performance tests, the pods inside the Kubernetes cluster started to crash as soon as they began receiving traffic. That issue was quickly identified and resolved by the developers. We also noticed the application was using legacy services for configuration management; it was changed to a new configuration served from a Google Cloud Storage bucket, which helped improve response times. With the help of the performance tests, we were also able to identify issues with Fastly caching that resulted in elevated 503 response codes; this was resolved by working with the SRE part of the engineering team and the Fastly support team. Lastly, the nodes supporting the application servers were moved from preemptible to reserved nodes for high availability and disaster recovery.

    Now we were finally ready for the elections. The response times and status codes were acceptable to the product and development teams. As a result, our system processed and published updates across 300 sites in just over three seconds. At one point we beat Google by about 10 seconds, and we were consistently ahead of our competitors using Associated Press source data. Our system's performance was the invincible power center underpinning quickly updating news coverage and results, feeds, and a seamless consumer-facing experience.

    What does our future roadmap look like? Fuzz testing. Fuzz testing is a technique that involves sending random data to an application while monitoring for crashes, failing built-in code assertions, or potential memory leaks. Artillery provides a plug-in, artillery-plugin-fuzzer, to execute such tests. It has wide coverage of random data that it uses to dynamically change the associated variables in the test script. We are currently building a reporting API using Google Cloud Run and Apache Beam on Dataflow. It will be used by our production applications to send logs, crash reports, intervention, deprecation, and content security reports. We used fuzz testing to ensure no bad data was coming into our system. Using it with FaaS Artillery, we also identified a hashing issue in the Dataflow pipeline, which was later resolved by reshuffling the data across multiple workers.
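    A fuzzed test might take roughly the following shape; this assumes the plugin's {{ naughtyString }} template variable, which it substitutes with random known-bad inputs, and the endpoint and payload structure here are illustrative:

```yaml
# Possible shape of a fuzzed test using artillery-plugin-fuzzer.
config:
  target: "https://foo.com"
  plugins:
    fuzzer: {}          # enables the fuzzer plugin
  phases:
    - duration: 60
      arrivalRate: 10
scenarios:
  - flow:
      - post:
          url: "/reports"
          json:
            payload: "{{ naughtyString }}"  # random fuzz input per request
```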

    We are also looking at storing performance test results in BigQuery. That will help us analyze performance test results against historical data. We will also use Data Studio, backed by BigQuery, to visualize our results.

    I would like to thank my wonderful team at Gannett, who worked together to build innovative tools and services that shape the next generation of testing. Thank you all for your time. If you have any further questions, you can connect with me on LinkedIn at linkedin.com/in/anveshmalhotra. Thank you.
