Handle.Net® Software v9.1.0 Performance Testing
CNRI has conducted tests to benchmark the performance of the Handle.Net server software version 9.1.0 configured to use Berkeley DB JE, which is the default storage software. The testing methodology, results, and testing software details are discussed below.
Methodology
The objective of CNRI's testing was to measure the throughput of the Handle.Net server software. Throughput, in this context, is the average number of successful operations performed by the server each second. In particular, tests were conducted to discover the upper limits of throughput while staying within an acceptable latency[1] range. Latency, in this context, is the elapsed time between the moment a request was sent by the client and the moment the response was received. The tests were not exhaustive enough to establish true peaks; nevertheless, we refer to our observations as peaks in this narrative because they are likely close to the true peaks.
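For illustration only, the following minimal Java sketch (not part of CNRI's testing software; all names are hypothetical) shows how these two metrics can be computed from per-request timestamps recorded by a client:

```java
import java.util.List;

// Hypothetical per-request record: when the request was sent, when the
// response arrived, and whether the response indicated success.
class Sample {
    long sentAtMillis;
    long receivedAtMillis;
    boolean success;
}

class Metrics {
    // Average latency (ms): mean of (receive time - send time) over all samples.
    static double averageLatencyMillis(List<Sample> samples) {
        return samples.stream()
                .mapToLong(s -> s.receivedAtMillis - s.sentAtMillis)
                .average()
                .orElse(0.0);
    }

    // Throughput (operations/second): successful responses divided by the
    // wall-clock duration of the test window, in seconds.
    static double throughputPerSecond(List<Sample> samples, long windowMillis) {
        long successes = samples.stream().filter(s -> s.success).count();
        return successes / (windowMillis / 1000.0);
    }
}
```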
In order to measure the throughput of the Handle.Net server software, CNRI developed a custom handle client application that when deployed across multiple machines creates sufficient load to determine the server performance peaks. The custom client application makes use of the Handle.Net Java client library.
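In its simplest form, such a client resolves a handle and times the round trip. The sketch below is illustrative only; it assumes the net.handle.hdllib package of the Handle.Net Java client library, and the handle name is a placeholder:

```java
import net.handle.hdllib.HandleException;
import net.handle.hdllib.HandleResolver;
import net.handle.hdllib.HandleValue;

public class ResolveOnce {
    public static void main(String[] args) throws HandleException {
        HandleResolver resolver = new HandleResolver();
        long start = System.currentTimeMillis();
        // Resolve all values of a (placeholder) handle; null type and index
        // filters request every value in the record.
        HandleValue[] values = resolver.resolveHandle("20.1000/EXAMPLE", null, null);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(values.length + " values resolved in " + elapsed + " ms");
    }
}
```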
Bare metal machines can be used for deploying the server and the custom client application, although for this testing virtual machines were used purely as a matter of convenience. The actual specifications of those virtual machines are discussed in the next section. The Handle.Net server software was deployed on a virtual machine separate from the client machines. All of the virtual machines were provisioned with Java 8.
This testing focused on handle resolution and handle administration performance as realized through the various interfaces provided by the Handle.Net server software, such as UDP, TCP, and HTTP. Only "create" operations were used to extract administration performance metrics, because an update operation is similar in nature to a create operation and handle delete operations are rarely used by the user community.
Prior to running tests, the handle server underwent a brief warmup period to fully load the Java code into memory: 250 create requests and 1000 resolution requests were sent for such warmup.
Tests were conducted with all client machines sending requests simultaneously to the Handle.Net server. Each client machine then recorded the response time as well as the response code. The response times from all clients were then aggregated to determine average latency at peak observed throughput.
Resolution tests were performed by repeatedly resolving the same handle and recording the response times. The client application makes authoritative requests, thereby bypassing client-side caching. The Handle.Net server software has no caching option of its own, although low-level (storage and disk) caching comes into play when the same handle record is requested repeatedly. As a result, the measurements reflect the latencies introduced by the Handle.Net server software rather than the performance of the underlying storage system. Tests were run against the endpoints running on TCP, UDP, and HTTP. The HTTP endpoint offers a JSON API interface as well as a native protocol tunnel; the tests measured both varieties.
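The sketch below illustrates one such resolution request. It is not CNRI's testing code; it assumes the ResolutionRequest/processRequest API of net.handle.hdllib, the handle name is a placeholder, and the authoritative flag is what bypasses any cached, non-authoritative answer:

```java
import net.handle.hdllib.AbstractMessage;
import net.handle.hdllib.AbstractResponse;
import net.handle.hdllib.HandleException;
import net.handle.hdllib.HandleResolver;
import net.handle.hdllib.ResolutionRequest;
import net.handle.hdllib.Util;

public class AuthoritativeResolveWorker {
    public static void main(String[] args) throws HandleException {
        HandleResolver resolver = new HandleResolver();
        // Build a resolution request for a placeholder handle, with no
        // type/index filtering and no authentication (anonymous resolution).
        ResolutionRequest req = new ResolutionRequest(
                Util.encodeString("20.1000/EXAMPLE"), null, null, null);
        // Ask the responsible server directly rather than accepting a cached answer.
        req.authoritative = true;

        long start = System.currentTimeMillis();
        AbstractResponse resp = resolver.processRequest(req);
        long latency = System.currentTimeMillis() - start;

        boolean success = resp.responseCode == AbstractMessage.RC_SUCCESS;
        System.out.println("success=" + success + " latency=" + latency + " ms");
    }
}
```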
Handle "create" tests were performed by each client machine by first establishing a secure session with the server, and then using that session to create handle records. This avoided the step to re-authenticate the client with each request. The values of each handle record used to create the record were the same, but the handles varied. Tests were run using the TCP and HTTP endpoints (again, both API and tunnel varieties were measured). UDP endpoint is not normally used for administration because special error handling is required on the client-side to distinguish between true server-side processing failures and delivery failures: non-idempotent requests, such as creates, cannot simply be re-requested without side effects. Creation tests, as such, were not performed against the UDP endpoint.
After running a few experiments to identify settings that provide peak observed throughput and acceptable latencies, we settled on the following configuration for each test scenario:
Request Type | Interface              | Threads per Client Machine | Number of Client Machines
Resolution   | TCP                    | 10                         | 200
             | UDP                    | 30                         | 200
             | HTTP (native protocol) | 10                         | 200
             | HTTP JSON API          | 10                         | 200
Creation     | TCP                    | 20                         | 50
             | HTTP (native protocol) | 20                         | 50
             | HTTP JSON API          | 20                         | 15
For each test, each thread sent 2000 requests, with a 10 millisecond (ms) delay between each request. Iterating 2000 times ensured the tests ran long enough to gather reliable averages. The delay helped to address TCP port exhaustion: clients use one network port per request to connect to the server, and the operating system (OS) supplies a limited number of ports. A 10 ms delay (in combination with a lower number of threads on TCP-based tests) was found to be sufficient to ensure that the OS reclaims the ports.
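The per-thread pacing described above can be pictured as follows; sendOneRequest() is a placeholder for whichever resolution or creation request the scenario uses and is not part of the actual testing software:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LoadDriver {
    static final int THREADS = 10;      // threads per client machine (e.g., the TCP resolution scenario)
    static final int REQUESTS = 2000;   // requests sent by each thread
    static final long DELAY_MS = 10;    // pause between requests so the OS can reclaim ports

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            pool.submit(() -> {
                for (int i = 0; i < REQUESTS; i++) {
                    sendOneRequest();   // placeholder: time one request, record latency and response code
                    try {
                        Thread.sleep(DELAY_MS);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    static void sendOneRequest() {
        // placeholder for a resolution or create request against the handle server
    }
}
```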
Overall, there are many variables to consider here, and not all combinations of those variables were tested. It is possible that some combination of the variables might result in better performance than was observed with the combination that was finally used.
System Specification
A cloud provider, specifically Amazon Web Services (AWS), was used for deploying the server and the client application, purely for convenience. It is likely that an on-premises (enclave) deployment would yield higher performance, as network and computational resources would not be shared with other customers.
The Handle.Net server software was run on an AWS virtual machine of type m5.large and then separately on type m5.2xlarge. The m5.large type has 2 vCPUs and 8 GB of memory, whereas the m5.2xlarge type has 8 vCPUs and 32 GB of memory. In both cases, the Handle.Net server was configured to use 4 GB of memory; the default of 200 MB produced inconsistent results, potentially due to Java garbage collection interruptions. Other memory values were not explored. Ubuntu 18.04 was installed on these virtual machines.
Test Results
Results from resolution and creation tests are shown below. The results show the average latency at peak observed throughput.
Resolution Test Results
Server     | Interface              | Peak Observed Throughput (resolutions/second) | Average Latency (ms)
m5.large   | TCP                    | 22,806 | 62
           | UDP                    | 58,545 | 57
           | HTTP (native protocol) | 16,194 | 104
           | HTTP JSON API          | 18,045 | 102
m5.2xlarge | TCP                    | 35,746 | 31
           | UDP                    | 89,602 | 39
           | HTTP (native protocol) | 31,544 | 41
           | HTTP JSON API          | 27,606 | 55
For the configurations that were put in place, the maximum throughput across all interfaces was 89,602 resolutions/second, observed on the UDP interface with an average latency of 39 ms. The throughput observed on the other interfaces was lower than that observed on UDP.
Creation Test Results
Server     | Interface              | Peak Observed Throughput (creates/second) | Average Latency (ms)
m5.large   | TCP                    | 7,820  | 109
           | HTTP (native protocol) | 6,324  | 136
           | HTTP JSON API          | 4,847  | 45
m5.2xlarge | TCP                    | 11,225 | 72
           | HTTP (native protocol) | 11,532 | 70
           | HTTP JSON API          | 10,744 | 77
Throughput over TCP and HTTP is roughly equivalent on the larger machine; on the smaller machine, TCP offers higher throughput. The maximum throughput of 11,532 creates/second was observed with the HTTP (native protocol) interface on the larger machine, with an average latency of 70 ms at that throughput.
Testing Software
CNRI's performance testing software is available for download here. Refer to the README for instructions on using that software to run performance tests in your environment.
The following points are worth noting when running performance tests against the Handle.Net server software:
[1] Note that the latency reported in our results is the latency observed at peak observed throughput. Because tests were not conducted to optimize for server-side latency, the throughput numbers were collected from observations made by remote network clients. If measurement of optimized server-side latency is the goal, network-introduced delays have to be considered.
March 1, 2019