Capacity and Performance Engineer
Salary £40,000 - £45,000 per annum
Consultant Lorraine Carroll
Date posted 08 August 2019
This investment organisation is looking for a Capacity and Performance Engineer.
Capacity and Performance Engineer - This award-winning online investment service is looking for a Capacity and Performance Engineer to perform a critical support function for the Commercial IT business area, specifically the analysis and investigation of capacity and performance on its customer-facing platforms.
Reporting into the DevOps Operational Support Lead, you will work towards the resolution of known capacity/performance issues, as well as using tools and analysis to highlight and deliver preventative changes.
You will focus on investigation into the following areas:
- Service-affecting issues that have come to light through customer reports or areas highlighted on the monitoring platform, specifically service outages or issues that prevent normal functioning of the platform.
- Performance/capacity issues that slow down customer-facing platforms, leaving them in a degraded state, but that may not yet have materialised into service-affecting issues or outages.
- Preventative activities to highlight where potential performance bottlenecks or underlying issues could result in future degradation or outages.
Working as part of the Commercial IT team, this role involves close collaboration with both internal teams (Development and Support), and also third parties to investigate, replicate and ultimately work towards resolution of existing and potential issues.
- Ability to perform detailed Impact Analysis of complex incidents spanning multiple platforms
- Application log analysis using common Linux/Unix CLI tools such as grep, together with regular expressions, to extract key details quickly and efficiently.
- Ability to configure and administer monitoring tools such as Splunk, Nagios, Grafana (Prometheus) and AppDynamics in order to provide clear triggers/alerts for both service-affecting issues requiring immediate resolution and underlying performance issues.
- Ability to recreate issues using JMeter / Gatling / LoadRunner, and to help define performance validation criteria that highlight differences between current and N+1 environmental changes.
- Ability to understand (Java) code to the level where it is possible to collaborate on defining metrics for Prometheus and Grafana.
KNOWLEDGE AND EXPERIENCE REQUIRED:
- A strong understanding of a full stack ecosystem including:
- Application Servers and APIs
- Third party services and data feeds
- Infrastructure, Networks and Operating systems
- Cloud-based infrastructure and services such as AWS
- Experience using tools, including the following:
- AppDynamics
- JMX and JVisualVM