Intermediate Site Reliability Engineer (SRE II)
ARCS
- Cape Town, Western Cape
- Permanent
- Full-time
- Kubernetes running on Google Kubernetes Engine (GKE)
- Prometheus, Grafana, Elastic, Kibana
- CI/CD with Jenkins
- Kong API Gateway
- LogDNA
- Falco
- MongoDB Atlas
- Microservice Architecture with Event Sourcing and CQRS
- Containers running Kotlin, Python, JavaScript (and a bit of Golang)
- Being part of their security incident response team
- Managing their identity platform and enabling enterprise user and system authentication and authorization using OAuth2
- Working effectively with the development team to plan and deploy required infrastructure changes or new capabilities ahead of time, and unblocking the team when unforeseen infrastructure blockers arise
- Performing high-quality, ego-free code reviews and driving visibility, testing, and improvement initiatives
- Writing operational tooling to automate otherwise manual processes (e.g., in Golang or Bash)
- Writing, testing, and executing change control plans for production changes with an eye for detail to spot potential issues
- Debugging production issues
- Being part of their on-call rotation. When on-call, you will work on repaying technical debt and deal with operational incidents as and when they occur. This will require you to have or acquire a good general knowledge of production operations for technical support.
ExecutivePlacements.com