American Express Careers
Infrastructure Engineer - Splunk / Elastic
EMTE seeks an Infrastructure Engineer with the ideas, knowledge, and strengths to enhance how American Express Technology uses log monitoring (Splunk/Elastic) and machine data to gain greater Operational Intelligence and Predictive Monitoring. This Senior Infrastructure Engineer will align all engineering designs with American Express’ architectural enterprise standards and promote the adoption of best practices offered through the Enterprise Logging Center of Excellence. Success in this role will be measured, in part, by the engineer’s ability to lead and collaborate with team members, technology partners, and other stakeholders to create innovative solutions that achieve personal goals and those set by organizational leaders and the team.
- Maintain and enhance multiple Log Monitoring tools and environments (e.g., Splunk 6.x, Elastic 5.x, Red Hat Linux)
- Assist various teams with data onboarding into Splunk and/or Elastic
- Assist users with Log Monitoring queries, dashboards, and applications for use by Operations, Development, and Management personnel
- Mentor Splunk/Elastic users and administrators
- Work closely, at a deep technical level, with engineering teams to ensure solution designs are consistent with American Express Technology’s architectural vision, platform/product roadmaps, enterprise standards, guidelines, and principles
- Ensure compliance with security standards, and assist in audit preparations.
- Develop, document and implement enterprise standards and procedures
- Monitor environment and computing resources for reporting and capacity planning.
- Maintain systems documentation
- Assist with the administration/support of other EMTE platforms as necessary
- Participate in 24x7 on-call support rotations for monitoring and automation tools during business hours, nights and weekends
- Function as an active member of an agile DevOps team, consistently contributing to the team and its Agile practices (tools, common components, and documentation)
- Adopt DevOps methods and roles in support of monitoring and automation tools/services
- Assist in troubleshooting various system, network, and application issues using Splunk data
- Follow Incident/Problem/Change Management, SOX and PCI processes
- Perform all activities in a timely manner, as required, to contribute toward Enterprise-level compliance of internal/external processes, standards and regulatory controls.
- Perform other duties as assigned
- Strong knowledge of Splunk and/or Elastic administration and maintenance.
- Strong knowledge of Splunk and/or Elastic cluster construction and administration.
- Strong knowledge of Splunk and/or Elastic application development and optimization.
- Ability to write scripts in one or more languages (shell, Perl, Python, Ruby, etc.)
- Familiarity with Red Hat Enterprise Linux 6 and 7.
- Experience creating and supporting highly available enterprise production environments.
- Fundamental knowledge of TCP/IP networking, subnetting and routing concepts, and distributed computing concepts
- Ability to self-direct personal activities to achieve goals and meet commitments
- Self-motivated leader with strong interpersonal skills and ability to work in cross-functional and inter-organizational teams.
- Ability to persuade and influence others without direct control.
- Able to manage multiple project tasks, and those of supporting team members, to meet multiple demands in a dynamic, fast-paced environment.
- Strong analytical and logical reasoning skills
- Strong troubleshooting skills and experience working within a heterogeneous environment.
- Ability to solve problems quickly and independently
- Ability to automate processes
- Strong written and verbal communication skills, with the ability to influence cross-functional teams, business and/or vendor partners, and technology leaders
- Able to develop/make presentations, facilitate discussions and provide technical demonstrations in 1:1, small group and large group settings.
- 3+ years’ experience in Splunk production administration
- Experience managing team/workgroup activities
- Bachelor’s degree in computer science or computer engineering preferred; otherwise, experience in a related field required.
- Working knowledge of Linux administration: configuration and tuning, networking, logging, storage, and installation and integration of third-party software.
- Programming background in any applicable language
- Familiarity with SOX, PCI DSS, and other regulatory standards helpful
- Ability to design and present training to users at varying levels.
- Experience with any of the following: Ansible, Elastic, Graphite, InfluxDB, Kafka
- Prior experience in a DevOps or DevOps-like environment (practices that emphasize collaboration and communication between software developers and operations engineers)
- Working knowledge of Application Development workflow and Agile Methods
- Experience working with Scrum or Kanban-related tools and concepts (e.g., Jira, Rally, Epics, Stories, estimating story points)
Schedule (Full-Time/Part-Time): Full-time