Your mission, should you choose to accept it: 


Cloud Infrastructure Management: Design, implement, and manage scalable and resilient infrastructure using AWS services like EC2, Lambda, Aurora RDS PostgreSQL, and DynamoDB.
Container Orchestration: Deploy, manage, and scale applications in Kubernetes.
Monitoring & Observability: Set up and maintain comprehensive monitoring using Grafana Cloud, Mimir, Loki, Tempo, and OpenTelemetry.
CI/CD Integration: Automate deployments with robust CI/CD pipelines. Familiarity with tools like GitHub Actions and AWS CodeBuild is essential.
Log Management & Analysis: Utilize tools like OpenSearch/Elasticsearch and Loki for log analysis and troubleshooting.
Scripting & Automation: Develop scripts and tools using Python and Golang to automate tasks and processes.
Database Management: Manage and optimize data workflows across databases like Aurora RDS PostgreSQL and DynamoDB.
Stream Processing: Work with Kafka for real-time data processing and integration workflows.
Incident Management: Participate in on-call rotations, providing expertise in incident resolution and system troubleshooting.


Nice to have:


Experience with NoSQL, PostgreSQL, DynamoDB, Elasticsearch
Experience with common web stack applications (nginx, tornado, FastAPI)
Experience with messaging platforms (Kafka, Kinesis, SQS, SNS)
Experience with Google platforms (GCP, Firebase)


Qualifications & Experience: 


A Bachelor’s Degree or Advanced Diploma in Information Systems, Computer Science, Mathematics, or Engineering, together with 3 years of hands-on experience in a DevOps or Site Reliability Engineering role, is required.
Candidates without a Bachelor’s Degree or Advanced Diploma (in Information Systems, Computer Science, Mathematics, or Engineering) must instead have a minimum of 6 years of experience in a software/technology environment.
Certifications in AWS or Kubernetes are advantageous.
3-5 years of hands-on experience in a DevOps or Site Reliability Engineering role.
2-5 years of experience in Python and Golang for automation, scripting, and development.
AWS Expertise: 3 years of comprehensive experience with AWS services, including EC2, Lambda, DynamoDB, Aurora RDS PostgreSQL, and AWS OpenSearch.

Ability to design and manage scalable and resilient cloud architectures.
Kubernetes Proficiency: 3 years of hands-on experience deploying, managing, and scaling applications in Kubernetes environments. Practical experience with Helm and ArgoCD.

Understanding of containerization concepts and tools like Docker/Podman.
Infrastructure as Code (IaC): 3 years of experience with IaC tools like Terraform or CloudFormation to manage and automate cloud resources effectively. CloudFormation is preferred.
  • Cape Town