Peoplefy
Site Reliability Engineer - Terraform/Ansible
Job Location
Pune, India
Job Description
Responsibilities:
- Design, build, and maintain highly available and scalable infrastructure on cloud platforms (AWS, Azure, GCP).
- Implement and manage CI/CD pipelines using Jenkins, GitLab CI/CD, or other relevant tools.
- Automate infrastructure provisioning and management using Terraform, Ansible, or Puppet.
- Monitor system performance and proactively identify and resolve issues using tools like Prometheus, Grafana, and the ELK stack.
- Troubleshoot and resolve production issues quickly and effectively.
- Participate in on-call rotations and provide 24/7 support for critical systems.
- Collaborate with software engineers to improve the reliability and performance of applications.
- Implement and maintain monitoring and alerting systems.
- Implement and maintain security best practices and controls.
- Ensure compliance with security and regulatory requirements.
- Conduct security audits and penetration testing.
- Contribute to the development and improvement of SRE best practices and processes.
- Automate routine tasks and improve operational efficiency.
- Participate in incident response and post-mortem analysis.
- Stay up to date with the latest technologies and trends in the field of Site Reliability Engineering.
- Research and implement new technologies and tools to improve system reliability and performance.

Required Skills:
- Strong experience with at least one major cloud provider (AWS, Azure, GCP).
- Proficiency in infrastructure-as-code tools like Terraform, Ansible, or Puppet.
- Experience with CI/CD pipelines and tools (Jenkins, GitLab CI/CD, etc.).
- Experience with monitoring and alerting systems (Prometheus, Grafana, ELK stack, etc.).
- Proficiency in scripting languages like Python, Bash, or Ruby.
- Strong understanding of Linux/Unix systems administration.
- Solid understanding of networking concepts (TCP/IP, DNS, routing).
- Understanding of security best practices and common security threats.
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration skills.

Nice to Have:
- Experience with containerization technologies (Docker, Kubernetes).
- Experience with serverless computing (AWS Lambda, Azure Functions).
- Experience with service mesh technologies (Istio, Linkerd).
- Experience with chaos engineering.
- Experience with SRE principles and practices (Google SRE book).

(ref:hirist.tech)
Location: Pune, IN
Posted Date: 1/22/2025
Contact Information
Contact: Human Resources, Peoplefy