American Express Careers
CTO SRE Engineer
· Conceptualize and implement an Artificial Intelligence driven Site Reliability Engineering framework and components to improve predictive monitoring and drive the SRE team’s journey towards a “Robotics First” approach.
· Drive end-to-end Performance Engineering, Capacity Management, A/B Testing, and Chaos Testing to maintain and improve application performance in all phases of the SDLC.
· Research the latest technologies and concepts, conceptualize solutions, and develop proofs of concept that improve the resiliency and performance of the production infrastructure. Design and implement innovative solutions and frameworks that improve software engineering velocity, infrastructure resiliency and security, and data availability.
· Develop common framework components (to be leveraged by enterprise applications) and define standards for configuration, monitoring, reliability, and performance engineering.
· Work with the operations team to resolve major incidents. Continuously improve automated remediation tasks to ensure the highest
· A BS degree in Computer Science, Computer Engineering, or another technical discipline, or equivalent work experience.
· 10+ years of hands-on technical experience with systems analysis, incorporating design methodology, production support and engineering, and enterprise-level technologies including, but not limited to, OpenShift, WebSphere Administration, JEE (JSP, Servlets, XML, Java), and internet-related technologies to deliver complex Internet-facing solutions.
· Broad exposure to the technical field, with preference for the following skills: cloud infrastructure, VMs, load balancing, containers, Kubernetes, JVMs, web servers, application debugging, queuing technologies, caching technologies, databases, routing and switching, etc.
· Experience in designing mission-critical, highly available enterprise applications.
· Hands-on experience in designing and implementing a predictive monitoring framework using Artificial Intelligence, a chaos testing framework, and an A/B testing framework.
· Hands-on experience with performance testing framework design and with tuning Java and C applications.
· Experience managing relational and NoSQL databases such as Oracle RAC, Cassandra, and Redis.
· Strong knowledge of Linux internals and experience managing Linux systems in high-traffic environments.
· Fluent in at least one of the following programming languages: Java, Python, Go,
· Strong interpersonal communication skills and the ability to work well in a diverse team-focused environment.
· Experience with Splunk and/or ELK.
· Familiarity with financial services and authorization systems.
· Understanding of Agile practices in operations teams.
Schedule (Full-Time/Part-Time): Full-time
Date Posted: Apr 15, 2019, 12:28:48 PM