Purpose of the Role: We’re looking for a Site Reliability Engineer responsible for web application performance, availability, and reliability. The candidate will provide consultation and strategic recommendations by quickly assessing and remediating complex platform availability issues.
Manage Your Card Account SRE is a continuous engineering discipline that combines software development and systems engineering to build and run scalable, distributed, fault-tolerant systems. This role will ensure that American Express internal and external services have reliability and uptime appropriate to users' needs. We also ensure continuous improvement while keeping an ever-watchful, automated eye on capacity and performance.
This role will drive the DevOps mindset, which strives to use software engineering to build and run better production systems. You will write software to optimize day-to-day work through better automation, monitoring, alerting, testing, and deployment.
You’ll be expected to work with several Technology partners to identify areas of opportunity within the availability platform, build solutions that automate monitoring for the next-generation platform, and keep innovating to drive efficiencies. You will be responsible for implementing tracing, monitoring, and tooling solutions to maximize the performance and availability of our web applications.
This is an opportunity to work in one of the best Technology units to help improve customer experience for American Express digital assets and influence how millions of people interact with their cards, their merchants and their money.
2+ years of hands-on experience configuring Splunk dashboards and setting up alerts
Good understanding of cloud technologies such as Kubernetes and OpenShift
Knowledge of server-side technologies such as WebSphere, JBoss, and Node.js
Experience with building REST APIs, API integration, and web services is preferred
Monitoring and analyzing PMI data
Hands-on experience with enterprise tool sets such as Grafana, Dynatrace, AppDynamics, BMC, etc.
Knowledge of Unix shell scripting and Perl or Python programming is preferred
Experience handling DDoS/bot attacks
Working experience with network rule creation, load balancer configuration, and network packet analysis
Analytical skills and exposure to root cause identification using analyzer tools such as IBM Support Assistant, Splunk, etc.
Good understanding of Linux OS commands
Experience supporting three-tier architecture, including exposure to databases such as IBM DB2, Couchbase, MongoDB, Redis, etc.
Certificate Management automation - Message signing, SSL, etc.
Exposure to ITIL processes is preferred
Exposure to enterprise platform migration from dedicated to cloud environments is preferred
Academic Background: BS or MS degree in computer science, computer engineering, or other technical discipline, or equivalent 3-6 years of work experience in DevOps (Java/J2EE/React JS applications)
Enterprise Leadership Behaviors
• Set The Agenda: Define What Winning Looks Like, Put Enterprise Thinking First, Lead with an External Perspective
• Bring Others With You: Build the Best Team, Seek & Provide Coaching Feedback, Make Collaboration Essential
• Do It The Right Way: Communicate Frequently, Candidly & Clearly, Make Decisions Quickly & Effectively, Live the Blue Box Values, Great Leadership Demands Courage
Schedule (Full-Time/Part-Time): Full-time
Date Posted: Jul 24, 2019, 3:17:03 PM