American Express Careers
Senior Engineer - Site Reliability
Why American Express?
There’s a difference between having a job and making a difference.
American Express has been making a difference in people’s lives for over 160 years, backing them in moments big and small, granting access, tools, and resources to take on their biggest challenges and reap the greatest rewards.
We’ve also made a difference in the lives of our people, providing a culture of learning and collaboration, and helping them with what they need to succeed and thrive. We have their backs as they grow their skills, conquer new challenges, or even take time to spend with their family or community. And when they’re ready to take on a new career path, we’re right there with them, giving them the guidance and momentum into the best future they envision.
Because we believe that the best way to back our customers is to back our people.
The powerful backing of American Express.
Don’t make a difference without it.
Don’t live life without it.
Talk to our people and you’ll find out what we’re really all about. Open, creative, risk-taking, collaborative and innovative are just some of the expressions you’ll hear. It’s our culture that makes American Express an extraordinary place to work, and a huge part of why we regularly win best workplace awards all over the world. If you’re ready to tackle a challenge and make an impact, American Express is a great place to launch or grow your career.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
At American Express, we build customer-focused applications where user experience is the primary focus. Our customer-focused brand drives us to ask "what can we do better for our customers?" as we build our platform in the Customer Experience Platform Excellence (CXPX) group.
As a Site Reliability Engineer (SRE), you will work closely with application development teams to build standards that drive the highest levels of availability across the Digital Acquisitions channel. You will join a team that provides 24/7 support, and you will be expected to develop solutions that improve production support and monitoring services while responding to incidents to ensure a high level of application availability. You can expect to spend about 50% of your time on engineering work, such as infrastructure automation, designing and building tools, and writing code to support our application teams.
If you were to join our team, you would be expected to:
- Work closely with our application engineering teams to launch and maintain applications both on-premises and in hybrid-cloud environments.
- Act as the primary escalation point for the L1 support team, helping to make decisions that restore service and minimize impact on availability.
- Provide production support and respond to production incidents as the first line of defense for the organization.
- Facilitate the resolution of non-application issues (third-party upstream issues, infrastructure issues, storage, database, network, file transfer, etc.).
- Drive monitoring requirements to ensure business-service-level visibility for all support teams.
- Debug network and performance issues in large-scale distributed systems.
- Provide consultation and strategic recommendations by quickly assessing and remediating complex availability issues.
- Introduce new and impactful technologies to the production support toolchain that minimize friction for production releases and enable quick diagnosis of, and recovery from, production incidents.
Qualifications:
- 6 to 8 years of work experience in a DevOps environment with Java/J2EE/React JS applications.
- Experience managing a team of engineers, including conducting 1-on-1s, developing career paths, and mentoring engineers to improve overall technical capabilities.
- A BS degree in Computer Science, Computer Engineering, or another technical discipline, or equivalent work experience.
- Experience in Java, Python, Go, React, or Ruby
- Experience supporting three-tier architectures, including exposure to at least two of: IBM DB2, Couchbase, MongoDB, Redis.
- Hands-on experience with enterprise tools such as Grafana, Dynatrace, AppDynamics, Jenkins, and Splunk.
- Experience working in a distributed team model with daily handoff of issues during shift changes.
- Broad technical exposure, with a preference for the following skills: cloud infrastructure, VMs, load balancing, containers, OpenShift, Kubernetes, JVMs, web servers, application debugging, queuing technologies, caching technologies, databases, routing and switching, and monitoring tools such as Prometheus.
Bonus Points if you:
- Are familiar with financial services and authorization systems.
- Understand how to apply Agile practices in operations teams.
Schedule (Full-Time/Part-Time): Full-time
Date Posted: May 28, 2019, 12:33:43 PM