Our client is an established fintech headquartered in Singapore, operating across payments and foreign exchange with a footprint spanning Asia and beyond, including a significant client base in Greater China. As the business scales towards its next stage of growth, they are building out the senior layer of their reliability function.
They are hiring for a newly created Lead Site Reliability Engineer (Quality Assurance) role, which sits within the SRE team as a key technical hire and co-lead with the Head of SRE.
Job responsibilities
- Own production reliability across the FX and payments platforms - monitoring, observability, alerting, and the definition and tracking of SLIs and SLOs.
- Lead incident response end to end, including war rooms, post-mortems, root-cause analysis and the upkeep of operational runbooks.
- Strengthen quality assurance across the platform - improving test coverage, raising release sign-off standards, and modernising legacy test automation toward AI-assisted workflows.
- Support client API integration from sandbox to production, including liquidity provider onboarding and conformance testing.
- Act as the technical escalation bridge between clients, internal users and engineering, supporting China-based clients directly on production and integration issues.
- Lead business continuity and disaster recovery testing, including failover, recovery and audit evidence preparation.
- Contribute to DevOps and tooling improvements across reliability, testing and support.
- Coach and uplift a team of junior SRE and QA engineers, setting standards and mentoring on best practice as the function matures.
Job requirements
- At least 6 years of experience in site reliability, production support, platform engineering or technical operations, ideally within fintech, payments, FX, trading systems or another high-availability environment.
- Strong hands-on troubleshooting across production systems, logs, APIs and application behaviour.
- Hands-on quality assurance exposure - test automation, release support or regression testing - as the role spans both reliability and quality.
- Working knowledge of API integration and comfort in client-facing or client-support situations; sandbox-to-production experience is an advantage.
- Solid fundamentals across cloud and containers (AWS, Docker, Kubernetes), monitoring and observability tooling (Grafana, Prometheus, OpenSearch, CloudWatch), and scripting (Python, Java, Bash).
- People management or team-lead experience is necessary.
- Sound grounding in incident management, RCA and BCP/DR practice.
- Professional spoken and written Mandarin proficiency is required to communicate directly with China-based clients and stakeholders, support production issues, and manage API integration activities.
- Suited to someone hands-on today who wants to grow into a broader leadership role. Candidates requiring relocation to Singapore are welcome to apply, but relocation expenses will not be provided for this position.
Why you should join them
- A newly created, high-visibility role as second-in-command within the reliability function, with a genuine path to grow into a deputy or co-leadership position through succession planning.
- Direct exposure to senior engineering leadership, reporting to the Head of Engineering and working alongside the infrastructure team.
- Broad ownership across reliability, quality, client integration and incident management, with room to shape how these run rather than inherit a fixed playbook.
- A modern technical environment with real appetite for AI-assisted tooling across testing, RCA and support, plus a hybrid working arrangement.
JL
Reg. No. R1766249
BeathChapman Pte Ltd
Licence no. 16S8112





