Stress Testing openIMIS - HIB

The stress tests below were run against an MSSQL server with very limited memory. Results are subject to change once a production-level database is connected.

Scalability Test Report: OpenIMIS Claims Query (1k vs 2.5k Users)

Date: February 8, 2026

Target Host: http://imisbeta.hib.gov.np

Status: DEGRADED (Passed at 1k, Failed at 2.5k)

Command (Phase 1):

python -m locust -f locustfile.py \
--host=http://localhost \
--users 1000 \
--spawn-rate 10 \
--run-time 5m \
--headless \
--html=locust_1000_report.html \
--csv=locust_stats_1000
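
The `--csv=locust_stats_1000` flag makes Locust write summary files with that prefix, including `locust_stats_1000_stats.csv`. As a minimal sketch of how the headline figures in this report could be pulled from that file, the snippet below reads the `Aggregated` row. The column names used here (`Request Count`, `Failure Count`, `Requests/s`, `Median Response Time`, `Average Response Time`) are assumed from recent Locust releases and may differ by version. The helper name `read_aggregated` is hypothetical, and the inline sample contains only the Phase 2 numbers quoted in this report, not a real export.

```python
import csv
import io


def read_aggregated(lines):
    """Return key metrics from the 'Aggregated' row of a Locust *_stats.csv.

    `lines` is any iterable of CSV lines (an open file or a StringIO).
    Column names are an assumption based on recent Locust versions.
    """
    for row in csv.DictReader(lines):
        if row["Name"] == "Aggregated":
            requests = int(row["Request Count"])
            failures = int(row["Failure Count"])
            return {
                "requests": requests,
                "failures": failures,
                "error_rate_pct": 100.0 * failures / requests if requests else 0.0,
                "rps": float(row["Requests/s"]),
                "median_ms": float(row["Median Response Time"]),
                "avg_ms": float(row["Average Response Time"]),
            }
    raise ValueError("no 'Aggregated' row found")


# Inline sample built from the Phase 2 figures in this report (subset of columns).
SAMPLE = io.StringIO(
    "Type,Name,Request Count,Failure Count,Median Response Time,"
    "Average Response Time,Requests/s\n"
    ",Aggregated,12385,176,82,607,41.3\n"
)

summary = read_aggregated(SAMPLE)
print(summary)
```

For a real run, the same function could be called on the exported file, for example `with open("locust_stats_1000_stats.csv", newline="") as f: read_aggregated(f)`.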

1. Executive Summary

We conducted a two-phase load test to determine the scalability limits of the OpenIMIS GraphQL claims query.

  • Phase 1 (1,000 Users): PASSED. The system was stable with healthy throughput (266 RPS) and zero errors, though tail latency was high (95th percentile of 910 ms).

  • Phase 2 (2,500 Users): FAILED. The system collapsed under load. Throughput dropped by 85% (down to 41 RPS), average response time increased by 40x, and the server began forcibly closing connections (Connection Reset errors).

2. Comparative Analysis (1k vs 2.5k)

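| Metric            | Phase 1 (1k Users) | Phase 2 (2.5k Users) | Impact / Delta              |
|-------------------|--------------------|----------------------|-----------------------------|
| Throughput (RPS)  | 266.9 RPS          | 41.3 RPS             | -85% drop (system stalled)  |
| Error Rate        | 0%                 | ~1.42%               | Failures detected           |
| Avg Response Time | 15 ms              | 607 ms               | 40x slower                  |
| Median (50%)      | 24 ms              | 82 ms                | 3.4x slower                 |
| 95th Percentile   | 910 ms             | 3,100 ms             | 3.4x slower (> 3 seconds)   |
| Max Response      | 1,581 ms           | 9,600 ms             | 6.1x slower (~10 seconds)   |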
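
The Impact / Delta figures can be reproduced from the two result columns. The following is a small sketch of that arithmetic. The dictionary keys and helper names are illustrative only, and the values are copied from the comparison above.

```python
# Values copied from the Phase 1 vs Phase 2 comparison in this report.
phase1 = {"rps": 266.9, "avg_ms": 15, "median_ms": 24, "p95_ms": 910, "max_ms": 1581}
phase2 = {"rps": 41.3, "avg_ms": 607, "median_ms": 82, "p95_ms": 3100, "max_ms": 9600}


def pct_change(before, after):
    """Relative change in percent (negative means a drop)."""
    return 100.0 * (after - before) / before


def slowdown(before, after):
    """How many times larger the Phase 2 latency is than Phase 1."""
    return after / before


throughput_delta = pct_change(phase1["rps"], phase2["rps"])
factors = {k: slowdown(phase1[k], phase2[k]) for k in phase1 if k.endswith("_ms")}

print(f"Throughput change: {throughput_delta:.1f}%")
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x slower")
```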

3. Deep Dive: Phase 2 (2,500 User Stress Test)

A. Failure Analysis

The test recorded 176 failures out of 12,385 aggregated requests (about 1.4%). The two main error types suggest the server was overwhelmed and dropping connections:

  • ConnectionResetError (160): The server (or load balancer) forcibly closed the TCP connection, most likely because its connection queue was full.

  • RemoteDisconnected (15): The server accepted the connection but closed it before sending a response.
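
As a quick check on the failure figures, the sketch below recomputes the error rate and each error type's share of failures from the counts reported in this section. Variable names are illustrative. Recomputing from the raw counts gives an error rate of about 1.42%, and the two listed types together account for 175 of the 176 failures.

```python
# Failure counts as reported in section 3A.
total_requests = 12_385
total_failures = 176
by_type = {"ConnectionResetError": 160, "RemoteDisconnected": 15}

# Overall error rate and each type's share of all failures, in percent.
error_rate_pct = 100.0 * total_failures / total_requests
shares = {name: 100.0 * n / total_failures for name, n in by_type.items()}

# Failures that fall outside the two listed error types.
uncategorized = total_failures - sum(by_type.values())

print(f"Error rate: {error_rate_pct:.2f}%")
for name, share in shares.items():
    print(f"{name}: {share:.1f}% of failures")
print(f"Failures not covered by the two listed types: {uncategorized}")
```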

B. Throughput Collapse

In a healthy system, RPS should rise as users are added until the system saturates, and then level off. Here, throughput fell instead:

  • 1,000 Users: 266 RPS

  • 2,500 Users: 41 RPS

  • Conclusion: The system appears to have hit a hard resource ceiling (likely the DB connection pool or the application thread pool). Instead of processing more requests, threads blocked waiting for resources, causing throughput to plummet.

C. Latency Distribution

  • P50 (Median): 82 ms. Acceptable.

  • P90: 2,000 ms. The breaking point: 10% of requests took 2 seconds or longer.

  • P99: 5,200 ms. Unusable: 1% of requests took more than 5 seconds.

4. Test Methodology & Calculations

Phase 1: 1,000 User Load

  • Command: --users 1000 --spawn-rate 10 --run-time 5m

  • Ramp-Up: $1000 / 10 = 100 \text{ sec}$ (1m 40s).

  • Steady State: 3m 20s of full load.

  • Result: Sufficient time to measure stability.

Phase 2: 2,500 User Load

  • Command: --users 2500 --spawn-rate 10 --run-time 5m

  • Ramp-Up Calculation:

    • Ramp-Up Time: $2500 / 10 = 250 \text{ sec}$ (4m 10s) to reach the full 2,500 concurrent users.

    • Steady State: Once peak load was reached, the test ran for only 50 seconds before completing.

    • Conclusion: The test spent 83% of its time ramping up and only 17% at full scale.

  • Analysis: Because peak load was sustained for under a minute, the aggregate Phase 2 figures mostly reflect ramp-up behavior. The degradation likely began during the ramp-up phase, around the 1,500-2,000 user mark.
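
The ramp-up and steady-state figures for both phases follow from users, spawn rate, and run time. The sketch below reproduces them, assuming users are spawned linearly at exactly the configured rate (real runs may lag slightly under load). The function name `load_profile` is illustrative.

```python
def load_profile(users, spawn_rate, run_time_s):
    """Split a fixed-duration run into ramp-up and steady-state time.

    Assumes a linear ramp at exactly `spawn_rate` users per second.
    """
    ramp_up_s = users / spawn_rate
    steady_s = max(run_time_s - ramp_up_s, 0.0)
    return {
        "ramp_up_s": ramp_up_s,
        "steady_s": steady_s,
        "ramp_up_pct": 100.0 * min(ramp_up_s, run_time_s) / run_time_s,
        "steady_pct": 100.0 * steady_s / run_time_s,
    }


# Both phases used --spawn-rate 10 and --run-time 5m (300 seconds).
phase1 = load_profile(users=1000, spawn_rate=10, run_time_s=300)
phase2 = load_profile(users=2500, spawn_rate=10, run_time_s=300)

for label, p in (("Phase 1", phase1), ("Phase 2", phase2)):
    print(
        f"{label}: ramp-up {p['ramp_up_s']:.0f}s ({p['ramp_up_pct']:.0f}%), "
        f"steady state {p['steady_s']:.0f}s ({p['steady_pct']:.0f}%)"
    )
```

The same function can be used to check how much steady-state time a planned run would get before it is launched.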