Test Environment Manager
Posted 1 day 2 hours ago by isepglobal
£80,000 - £100,000 Annual
Permanent
Full Time
I.T. & Communications Jobs
London, United Kingdom
Job Description
Role: Test Environment Manager
Location: London
Work Mode: Hybrid (3 days from office)
Contract role
Experience Level: 15+ Years.
The Test Environment Manager (TEM) is an engineering-focused role responsible for transforming SDLC environments, with an emphasis on system reliability, automation, and performance in non-production settings.
Mandatory Skills
The primary technical skills required are observability, management of cloud and on-prem environments, and IaC automation with DevOps exposure, along with supporting soft skills.
Operational Responsibilities
- Automate environment lifecycle: Develop Infrastructure as Code (IaC) to automate the provisioning, configuration, and teardown of test environments, integrating them with the CI/CD pipeline.
- Establish service level objectives (SLOs): Define service level indicators (SLIs) for test environments, such as availability and provisioning time, and measure them against agreed targets to ensure the environments meet the needs of development and testing teams.
- Monitor environment health and performance: Use observability tools like Prometheus and Grafana to track the health of test environments, identify bottlenecks, and resolve issues proactively, not reactively.
- Manage incident response: Lead the incident management process for test environment issues, conducting blameless post mortems to understand the root causes and implement lasting fixes.
- Minimize toil: Automate manual, repetitive tasks associated with test environments to free up engineering time for more strategic work.
- Drive continuous improvement: Analyze environment performance data, incident reports, and post mortems to identify opportunities for continuous improvement and innovation.
- Balance reliability and speed: Use an "error budget" for test environments. While environments stay within their reliability targets, teams can spend the remaining budget on quicker feature development; when the budget is exhausted, the focus shifts to improving stability.
- Instill a reliability culture: Promote a blameless culture around test environment incidents, encouraging shared ownership and collaboration between development, QA, and SRE teams.
- Capacity planning: Anticipate future resource needs of test environments by analysing usage patterns and project forecasts. Ensure the infrastructure can scale to meet demand.
- Advance test data management: Work with Test Data Managers to ensure that test data is not only readily available but also consistent, compliant, and automatically provisioned with the environments.
- Expertise in tooling: Proficiency with monitoring and logging tools (e.g., Prometheus, Splunk, Grafana), CI/CD platforms (e.g., Jenkins, GitLab CI), and configuration management and IaC tools (e.g., Ansible, Terraform).
- Cloud infrastructure knowledge: Deep understanding of cloud platforms like AWS, including experience with containerization technologies (Docker, Kubernetes) and serverless computing.
- Scripting and programming: Strong scripting skills in languages such as Python or Bash to automate environment management tasks.
- Systems and networking knowledge: Solid understanding of Linux systems, networking concepts, and database management.
- Leadership and influence: The ability to champion SRE practices and influence technical and business stakeholders across different teams.
- Problem solving: Strong analytical and debugging skills for investigating and resolving complex environment issues under pressure.
- Communication: Excellent communication and collaboration skills to bridge the gap between development, QA, and operations teams.
- Adaptability: A proactive and adaptable mindset to keep pace with evolving technology and development methodologies.