Senior SRE/DevOps Engineer (SAP BTP) in Herndon
Job Description
Contract Details
Work Mode: 100% Remote (US-based)
Location: Herndon, VA
Employment Type: Contract (temp)
Schedule: 40 hours/week; on-call rotation required; occasional nights/weekends
Openings: 2
Security Clearance: Not required
About the Opportunity
This role focuses on ensuring high availability, reliability, and quality of service for cloud-hosted platforms. As a Senior SRE/DevOps Operations Engineer, you will operate and improve cloud services, drive incident response and root-cause analysis, and collaborate with global teams to enhance service reliability and operational excellence.
Key Responsibilities
- Provision, monitor, and operate cloud services in a globally distributed team.
- Analyze and resolve operational issues; own incident response and remediation.
- Maintain integrity and security of servers and systems; manage upgrades and hotfixes.
- Develop and operate monitoring policies, standards, and tooling.
- Optimize resource allocation across cloud environments.
- Conduct root-cause analysis and implement continuous improvements.
- Partner with product engineering to design and enhance service reliability.
- Develop and implement testing strategies; document results.
- Participate in an on-call rotation; support occasional after-hours/weekend work.
Required Qualifications
- 8 years of experience in SRE/DevOps/Cloud operations.
- Expertise with Git.
- Expertise with Concourse (pipeline setup, management, troubleshooting).
- Expertise with Linux (SUSE and Ubuntu).
- Expertise with Kafka, Zookeeper, and Big Data technologies.
- Expertise in automating the testing, deployment, scaling, and management of cloud services.
- Expertise with cloud monitoring/observability tools.
- Strong knowledge of cloud computing and databases.
- Strong understanding of web services, networking, virtualization, and internet protocols.
- Security fundamentals for multi-tenant SaaS applications.
- Excellent communication, prioritization, and customer service skills.
- Must be a U.S. citizen (no dual citizenship).
Preferred Qualifications
- Hands-on with AWS services: Route 53, EC2, S3, CloudWatch, DynamoDB, RDS, IAM, ACM, KMS, VPC.
- Experience with Cloud Foundry environments.
- Jenkins and/or Chef automation; Terraform.
- Expertise with Kubernetes operations, troubleshooting, and configuration.
- Experience troubleshooting IP networks and application stacks.
- Observability with Prometheus and Grafana.
Work Environment
- Collaborative, global team with cross-training opportunities.
- On-call rotation for P1 incidents required; flexible schedule may include weekends or after-hours.