Description
Problem Statement
When starting a payrun job via POST /tenants/{tenantId}/payruns/jobs with a large number of employees (500+), the HTTP request blocks until all payroll calculations are complete. This causes timeout issues when the backend is deployed behind API gateways or CDNs with request timeout limits (e.g., Cloudflare's 100s limit results in HTTP 524 errors).
Current Behavior
In PayrunJobController.StartPayrunJobAsync(), the request awaits the entire payrun before it responds:

    // Line 178 - Blocks until ALL employees are processed
    var payrunJob = await processor.Process(domainJobInvocation);

For 1,000 employees at ~0.5-1s each, that is 500-1,000 seconds (roughly 8-17 minutes) before the response is sent, far exceeding typical gateway timeouts.
Observation
The architecture already supports progress tracking and polling:
- Job is saved to DB before processing (lines 265-266 in PayrunProcessor.cs)
- Progress is updated after each employee (ProcessedEmployeeCount++ and UpdateJobAsync)
- GET endpoint exists to retrieve job status and progress
This suggests the system was designed with async polling in mind, but the HTTP request still blocks during processing.
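For illustration, here is a minimal sketch of the client-side polling loop this design implies. The `JobProgress` record, its field names, and the `fetchProgress` delegate are hypothetical stand-ins; the real API object and its status values may differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical progress shape; not the actual ApiObject.PayrunJob.
public record JobProgress(bool Finished, int ProcessedEmployeeCount, int TotalEmployeeCount);

public static class PayrunJobPoller
{
    // Calls fetchProgress repeatedly until it reports a finished job,
    // waiting pollInterval between attempts. Cancelling the token
    // stops the loop with an OperationCanceledException.
    public static async Task<JobProgress> WaitForCompletionAsync(
        Func<CancellationToken, Task<JobProgress>> fetchProgress,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        while (true)
        {
            var progress = await fetchProgress(cancellationToken);
            if (progress.Finished)
            {
                return progress;
            }
            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

In a real client, `fetchProgress` would presumably wrap an `HttpClient` GET against the existing job endpoint and map the response to whatever "finished" means for the job status.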
Proposed Solution
Decouple job creation from job processing:

    public virtual async Task<ActionResult<ApiObject.PayrunJob>> StartPayrunJobAsync(
        int tenantId, ApiObject.PayrunJobInvocation jobInvocation)
    {
        // ... validation code (unchanged) ...

        // Create job record immediately
        var payrunJob = await processor.CreateJobAsync(domainJobInvocation);

        // Process in background (fire-and-forget or via IHostedService/Hangfire)
        _ = Task.Run(async () =>
        {
            try
            {
                await processor.ProcessAsync(payrunJob.Id);
            }
            catch (Exception ex)
            {
                // Update job status to Failed/Abort
                await processor.AbortJobAsync(payrunJob.Id, ex.Message);
            }
        });

        // Return immediately with 202 Accepted
        return new AcceptedResult(Request.Path + "/" + payrunJob.Id, MapDomainToApi(payrunJob));
    }

Benefits
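- No gateway timeouts - The POST returns as soon as the job record is created, so it stays well under gateway/CDN request limits
- Scalable - Job size no longer affects the request duration, so jobs with thousands of employees are supported
- Largely backward compatible - Existing polling via the GET endpoint works unchanged; only clients that rely on the POST response containing the finished job would need to poll instead
- Minimal code changes - Architecture already supports this pattern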
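One caveat with the plain Task.Run variant: the host does not track that work, so an application shutdown can cut a job off, and any request-scoped service captured by the lambda may already be disposed once the request ends. As a rough sketch of the IHostedService direction mentioned in the code comment, the controller could enqueue the job id and let a single worker loop drain the queue. `PayrunJobQueue`, `processJob`, and `onFailed` are hypothetical names and do not exist in the current codebase.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical in-process job queue backed by System.Threading.Channels.
public sealed class PayrunJobQueue
{
    private readonly Channel<int> _jobIds = Channel.CreateUnbounded<int>();

    // Called by the controller right after the job record is created.
    public bool Enqueue(int jobId) => _jobIds.Writer.TryWrite(jobId);

    // Signals that no more jobs will be enqueued, so the worker can finish.
    public void Complete() => _jobIds.Writer.Complete();

    // Worker loop: processes one job at a time. A failing job is reported
    // through onFailed and does not stop the loop.
    public async Task RunWorkerAsync(
        Func<int, CancellationToken, Task> processJob,
        Func<int, Exception, Task> onFailed,
        CancellationToken stoppingToken = default)
    {
        await foreach (var jobId in _jobIds.Reader.ReadAllAsync(stoppingToken))
        {
            try
            {
                await processJob(jobId, stoppingToken);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                break; // shutdown requested while a job was running
            }
            catch (Exception ex)
            {
                await onFailed(jobId, ex);
            }
        }
    }
}
```

In ASP.NET Core, this loop would typically live in a BackgroundService's ExecuteAsync, with a fresh dependency-injection scope created per job via IServiceScopeFactory.CreateScope() so the processor and its database connection are not shared with the finished request. Because the queue is in memory, jobs still waiting at shutdown are lost unless they are re-read from the DB at startup; Hangfire or a similar persistent queue avoids that at the cost of an extra dependency.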
Would you be open to a PR implementing this feature? Happy to contribute.