Lead Site Reliability Engineer AWS, Cloud

Văn phòng đại diện công ty TNHH tập đoàn GFT

Application deadline: 23/10/2025

Job details: "Lead Site Reliability Engineer AWS, Cloud"

Salary

Negotiable

Location

  • 29A Nguyen Dinh Chieu, District 1, Ho Chi Minh

Job description

Role Summary
We are seeking a highly skilled and motivated Lead Site Reliability Engineer (SRE) with strong AWS expertise to lead our Service Operations team. You will be responsible for driving SRE practices, ensuring the scalability, reliability, and performance of mission-critical systems for our digital banking clients. This role requires balancing technical depth with leadership capability - setting direction, mentoring engineers, and ensuring service reliability at scale across multiple teams and clients.
Sign-on Bonus: Eligible for candidates who are currently employed elsewhere and able to join GFT within 30 days of offer acceptance.
Key Responsibilities
Leadership & Mentorship: Lead a team of SREs, providing technical guidance, coaching, and fostering a culture of reliability and continuous improvement.
SRE Practices: Define and mature SRE practices, including SLIs/SLOs, error budgets, and incident response processes across production systems.
Architecture & Automation: Own the design and evolution of automated cloud operations, driving adoption of Infrastructure-as-Code (Terraform, CloudFormation) and CI/CD pipelines.
Incident Management: Lead major incident responses, ensuring rapid resolution, root cause analysis, and implementation of preventive measures.
Collaboration: Work closely with Development, DevOps, and Cloud Engineering teams to ensure reliability and resilience are built into every stage of delivery.
Operational Excellence: Establish and track key reliability metrics (availability, latency, error rates) and drive initiatives to continuously improve them.
Innovation & Tooling: Evaluate and implement AWS-native and third-party tools to improve monitoring, alerting, and automation.
Stakeholder Engagement: Act as the primary contact point for Service Reliability topics with clients, ensuring transparency and alignment on reliability goals.
Governance: Ensure compliance with industry standards and internal policies around security, audit, and operational risk.
HR benefits
Competitive salary
Salary bands per level are reviewed once per year
13th-month salary, pro rata according to the employee's length of service within the calendar year, paid with the December salary
Monthly lunch allowance: 700,000 VND/employee
Parking: GFT covers the monthly parking fee for employee motorbikes
Performance evaluation takes place once per year and determines two things: the performance bonus and salary increments.
Health care
Private health insurance: covers accident, outpatient, in-patient, maternity, and dental care for all permanent employees who have passed the 2-month probation.
Optical: expense claim for eyewear
Annual health check-ups.
Vacation
Up to 18 days of vacation leave per year (with the option to carry over 5 days until 31 March of the following year)
One additional annual leave day at each two-year anniversary.
Healthy lifestyle
Sports and hobby clubs: the company has an annual fund for fitness activities, allocated monthly according to each team's vote.
A range of healthy snacks, tea, coffee, milk, and beer on tap
Social
Company town hall: every 6 weeks
CSR activities: as per the company's CSR guidelines
Onsite tours and training courses at other GFT offices and at client locations overseas (where applicable).

Job requirements

Experience: 7-10 years in SRE/DevOps/Cloud Engineering, with at least 2-3 years in a lead or managerial capacity.
Cloud Expertise: Deep hands-on experience with AWS services (EC2, ECS/EKS, S3, RDS, IAM, VPC, CloudWatch).
Infrastructure as Code: Strong experience with Terraform, CloudFormation, and automated deployment pipelines (Harness, GitLab, Jenkins).
Containerization & Orchestration: Expertise in Kubernetes and container-based workloads in production.
Monitoring & Observability: Proficiency with monitoring, logging, and alerting tools (CloudWatch, Prometheus, Grafana, ELK).
Incident Leadership: Proven ability to lead high-pressure incident response and post-mortem processes.
Problem-Solving & Risk Management: Strong analytical skills with the ability to anticipate, assess, and mitigate technical risks.
Collaboration & Communication: Excellent stakeholder management skills; fluent English required, with good communication in Vietnamese for local collaboration.
Nice-to-Have Skills
Certifications such as AWS Certified DevOps Engineer - Professional or AWS Solutions Architect - Professional.
Experience in financial services or other highly regulated industries.
Knowledge of advanced security practices and compliance frameworks (PCI-DSS, ISO 27001, SOC2).
Multi-region/multi-AZ architecture design for high availability and disaster recovery.
Due to the high volume of applications we receive, we are unable to respond to every candidate individually. If you have not received a response from GFT regarding your application within 10 workdays, please consider that we have decided to proceed with other candidates. We truly appreciate your interest in GFT and thank you for your understanding.

How to apply

Candidates submit their application online by clicking "Apply" below.

Company information

Introduction

Văn phòng đại diện công ty TNHH tập đoàn GFT

Company size

26 - 100 employees

Address

275 Lạch Tray, Ngô Quyền, Hải Phòng
