ROSEMALLOW TECHNOLOGIES PRIVATE LIMITED
Rosemallow Technologies - Chaos Engineer - Prometheus/Grafana
Job Location
Pune, India
Job Description
Key Responsibilities:

Chaos Engineering:
- Design and implement chaos engineering experiments to identify weaknesses in systems and applications.
- Develop and execute strategies to improve system resilience and reliability.
- Analyze experiment results, provide actionable insights, and drive remediation efforts.
- Collaborate with development, operations, and infrastructure teams to integrate chaos engineering practices.

Operational Acceptance:
- Develop and maintain comprehensive operational acceptance criteria for new and existing systems.
- Conduct thorough operational acceptance testing, ensuring systems meet all predefined criteria before go-live.
- Work closely with project managers, developers, and QA teams to align operational acceptance processes with project timelines and objectives.
- Document and communicate operational readiness findings, providing recommendations for improvement.

System Resilience and Reliability:
- Implement and manage strategies for continuous improvement of system resilience and reliability.
- Monitor and assess system performance, identifying potential risks and areas for enhancement.
- Lead initiatives to improve disaster recovery and business continuity plans.
- Stay updated with the latest industry trends and best practices in chaos engineering and operational acceptance.

Collaboration and Training:
- Educate and mentor team members on chaos engineering and operational acceptance methodologies.
- Foster a culture of resilience and reliability within the organization.
- Engage with external communities, attending conferences and participating in knowledge-sharing events.

Requirements:
- Extensive experience in chaos engineering, operational acceptance testing, and system resilience.
- Strong understanding of cloud platforms (AWS, Azure, GCP) and their resilience features.
- Proficiency in scripting and automation tools (Python, Bash, Terraform, etc.).
- Experience with monitoring and observability tools (Prometheus, Grafana, Splunk, etc.).
- Experience with chaos engineering tools such as Gremlin and Chaos Monkey.
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Certifications in relevant fields (e.g., AWS Certified Solutions Architect, Azure DevOps Engineer) are a plus.

(ref:hirist.tech)
Location: Pune, IN
Posted Date: 4/19/2025
Contact Information
Contact: Human Resources, ROSEMALLOW TECHNOLOGIES PRIVATE LIMITED