Site Reliability Engineering Lead (R-00062579)
Leidos is looking for a Site Reliability Engineering Lead to support the Defense Health Agency (DHA) Military Health Services (MHS) Enterprise Information Technology Services Integrator (EITSI) program.
The DHA MHS EITSI program is a critical program to modernize, standardize and optimize the Defense Health Agency’s IT infrastructure by establishing an Enterprise IT Services (EITS) environment using a Multisourcing Services Integrator (MSI) approach.
The services integrator will manage the EITS environment and assist the government in providing oversight of the Service Providers (SPs) for the benefit of the DHA, MHS, and mission partners. The EITSI will provide independent end-to-end accountability through coordination and validation of IT services delivered by separately contracted SPs. The EITSI will also deliver Service Desk and Global Operations (Network Operations Center, Performance Management) services to DHA.
The IT infrastructure and operations framework will support a global data communications network and an enterprise services infrastructure, including data center(s), server hosting, and end-user platform capabilities, enabling more than 150K military personnel to support the continuum of health services.
- Make decisions about DHA’s site reliability and performance strategy/roadmap.
- Own live monitoring systems across the entire infrastructure - maintaining existing tools and implementing new systems.
- Lead advance planning to prepare our services for handling 10x seasonal traffic (setting scaling policies, provisioning resources, doing load testing, etc.)
- Manage processes and automated stability/performance checks that the team uses to develop fast, reliable software.
- Design, test, upgrade and integrate new or existing infrastructure platforms in accordance with the organization’s business requirements.
- Document all aspects of the engineering process as it relates to build and configuration, testing, and deployment of infrastructure platforms.
- Coordinate changes such as implementation of new infrastructure platforms, upgrades or version increments to existing platforms with system administrators and stakeholders.
- Plans, coordinates, and implements security measures to safeguard infrastructure platforms.
- Performs infrastructure platform modeling, analysis, and planning.
- Coordinates with architecture, evaluates new and emerging infrastructure platform technologies, and makes recommendations for selection of these products.
- Assists system administrators with complex troubleshooting issues and, if needed, coordinates vendor engagements to assist in troubleshooting efforts.
- May provide day-to-day operational or administrative duties for deployed infrastructure platforms in lab or production environments.
- Ability to work with other senior technical and user staff to complete projects.
- Operates with appreciable latitude in developing methodology and presenting solutions to problems.
- Develop new system design plans and engineering products including documentation, models and relevant engineering/architecture artifacts.
- Establish system design and functionality models and documentation.
- Be accountable for the development and maintenance of the organization’s technical architecture and related solution engineering work.
- Be self-motivated to discover new industry-related technologies and solution models.
- Responsible for detailing the organization’s cloud technical transition and evolutionary change.
- Evaluate and support the installation of developed or COTS software during various phases of testing.
- Support preventive maintenance of applications and operating systems, as well as assessment of database and operating system problems.
- Review and prepare documentation for system test and installation of software.
- Collaborate and coordinate with teammates, subcontractors, vendors, and other disciplines.
- Level III performs more independent thinking and difficult tasks compared to Level II and may supervise others.
- Bachelor’s Degree and/or equivalency
- Six (6) years of progressive experience demonstrating the required proficiency.
- Possesses and applies expertise on multiple complex work assignments.
- Requires deep understanding of and ability to apply principles, theories, and concepts of technical domain and has broad understanding of other related specialty areas.
- Ability to achieve Public Trust clearance
- ITIL Foundations v3/4 Certification
- Experience with ServiceNow or a similar ITSM platform