As a Site Reliability Engineer, you'll ensure the reliability and performance of systems for a leading airline. You'll collaborate with development and operations teams to implement best practices and drive continuous improvement.
Innovative and collaborative, with a focus on reliability and performance.
As a Site Reliability Engineer, you will maintain the reliability and performance of systems for a prominent EU airline. Day to day, you will work closely with development and operations teams to implement effective standards and methodologies. You will also drive continuous improvement and innovation across the organization, ensuring that systems are not only functional but also optimized for performance.
Your key responsibilities will include creating clear standards and processes, leading incident management, and conducting post-mortem analyses to learn from issues. You will also review architectural decisions and recommend improvements, while developing high-quality tools to tackle operational challenges. Additionally, you will gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
This role is ideal for someone with a solid background in AWS and Linux, along with strong debugging and troubleshooting skills. You should have a robust understanding of software development principles and system administration. Experience with monitoring and observability tools is also crucial. If you enjoy working in a dynamic environment where you can mentor junior staff and drive improvements, this position could be a great fit for you.
Overall, you will balance feature development with reliability, keeping the systems you manage sustainable and efficient. Your contributions will directly affect the airline's ability to deliver high-quality services to its customers.