Description
As a Software Engineer III at JPMorgan Chase within Corporate Technology, you serve as a seasoned member of an agile team to design and deliver trusted market-leading technology products in a secure, stable, and scalable way. You are responsible for carrying out critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.
Job responsibilities
• Consistently model and champion site reliability culture and practices, and exert technical influence throughout your team.
• Take the initiative to improve the reliability and stability of your team’s applications and platforms, using data-driven analytics to improve service levels.
• Collaborate with your team to identify comprehensive service level indicators, and with stakeholder partners to establish reasonable service level objectives and error budgets with your customers.
• Offer a high level of technical expertise within one or more technical domains, and proactively identify and resolve technology-related bottlenecks in your areas of expertise.
• Develop and maintain automation tools and scripts for deployment, monitoring, and incident response to enhance operational efficiency and reduce manual intervention (toil).
• Implement robust monitoring and alerting systems to proactively identify and resolve issues, ensuring optimal performance and availability of services.
• Lead incident response efforts, investigating and troubleshooting system outages, performance degradation, and other operational issues to ensure timely resolution and minimal impact on users.
• Create and maintain detailed documentation, runbooks, and knowledge base articles to facilitate effective collaboration and knowledge sharing across teams.
Required qualifications, capabilities, and skills
• Formal training or certification on software engineering concepts and 3+ years of applied experience.
• 7+ years of experience, ideally working with data/Python applications in a production environment.
• Experience in a programming or scripting language (Python).
• Experience with automation tools or solutions such as Ansible, Autosys, or Control-M.
• Experience setting up infrastructure on a cloud platform (preferably AWS) using Terraform. A cloud certification is good to have.
• Demonstrated proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices.
• Proficient knowledge of and experience in service level objective alerting, data visualization tools (such as Tableau or Grafana), and data analytics and monitoring tools (such as Apache Spark, Splunk, Datadog, or Dynatrace).
• Experience supporting applications on platforms such as Databricks, Snowflake, or AWS EMR is a significant advantage.
• Proficient with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform.
• Good experience in Terraform, familiarity with automation in Python, and an understanding of network basics and Kubernetes.
• Proven track record of leading incident response efforts and driving post-incident reviews to prevent recurrence.
• Proficient with containers and container orchestration (ECS, Kubernetes, Docker).
Preferred qualifications, capabilities, and skills
• Good knowledge of cloud technology.
• Actively self-educates, evaluates new technologies, and recommends suitable ones.
• Knowledge of virtualization, cloud architecture, services, and automated deployments.