You Lead the Way. We’ve Got Your Back.
At American Express, we know that with the right backing, people and businesses have the power to progress in incredible ways. Whether we’re supporting our customers’ financial confidence to move ahead, taking commerce to new heights, or encouraging people to explore the world, our colleagues are constantly redefining what’s possible — and we’re proud to back each other every step of the way. When you join #TeamAmex, you become part of a diverse community of over 60,000 colleagues, all with a common goal to deliver an exceptional customer experience every day.
This position focuses on providing technical expertise, education, and tooling to ensure the highest level of reliability and availability for critical applications. The individual will apply outstanding technical and interpersonal skills while providing consultation and strategic recommendations to quickly assess and remediate complex availability issues. This individual will also drive automation and efficiencies to increase quality, availability, and security.
This is a technical individual contributor role that works seamlessly with technology and business partners to increase the quality, availability, and security of technical platforms. The role works individually and with teams to drive reliability goals and objectives across platforms.
- Ensure application data flows are accurate and up to date with the objective to increase the knowledge base of all support teams and drive reliability.
- Facilitate the resolution of application issues across the full stack of application tools and monitoring (application or service unavailability, communication failures, expected values outside of SLA, etc.)
- Build, implement and advise on recovery tooling to adhere to enterprise standards and/or frameworks
- Introduce new and impactful technologies to the production support tool set that help minimize friction for production releases and support, and more quickly diagnose and recover from production incidents
- Own availability, proactive monitoring and alerting, capacity planning, and performance (reducing latency and increasing efficiency), including testing, for technical platforms
- Partner with appropriate supporting teams to ensure operational readiness throughout the application lifecycle
Scope of Impact/Influence
- Consults with teams to build standards that drive the highest levels of availability
- Mentors teams through ongoing development efforts
- Center of Enablement – coach and advise on the SRE function, working with various teams and providing real-life examples when necessary
- Bachelor’s degree in a related field preferred; relevant industry experience can substitute
- 6+ years of engineering and/or architecture experience in a complex environment, such as a large-scale web infrastructure or development team
- Experience supporting a 24/7 enterprise environment with on-call responsibilities for production support
- Experience in a broad range of software development and operations technologies such as infrastructure, virtualization, load balancing, containers, JVMs, web servers, application debugging, queueing technologies, caching technologies, databases, routing and switching, etc. Examples include, but are not limited to:
- Kubernetes, Redis, Spark, Cassandra, etc.
- Experience in high transaction volume OLTP sites or the Financial Services industry is preferred
- Ability to guide and implement the scripting or development of production support tooling that can be leveraged by your team and others.
- Understanding of multi-tier application architectures and related development technologies in support of service virtualization and API implementation/support
- Ability to write and build code and/or interpret and understand code
- Ability to assess logging and understand its value
- Working knowledge of operations, including certificate management, firewall rules, websites, XaaS, load balancer configuration, website virtualization (VMs and containers), etc.
- Deep understanding of *nix technologies, e.g. AIX, Red Hat, and CentOS
- Exposure to web frameworks
- Experience working with platform enablement technologies: Consul, Vault, Kafka, etcd, Elasticsearch, Prometheus, Edge Envoys
- Understanding of monitoring technologies, focused on logging, time-series, or machine-learning products, from a product owner’s perspective
- Has an “Automation First” mindset: fundamentally will not accept doing things over and over by hand
- Combines deep technical expertise, a continuous improvement and automation mindset, and systematic and rational root cause analysis to identify opportunities to make things faster and better
- Highly influential at all levels, including peers, leaders, and key stakeholders, distilling complex ideas and concepts into clear, structured, easy-to-understand language
- Adapts to change quickly and easily and helps others adjust to changes through effective communication
- Ability to discern when escalation is appropriate, and act as needed
American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability status, age, or any other status protected by law.
Offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.
Schedule (Full-Time/Part-Time): Full-time
Date Posted: Mar 30, 2021, 2:40:21 AM