Data Engineer (R-00070818)
Leidos is seeking a Data Engineer to support the development of mission-critical applications involving AI/ML capabilities. The Data Engineer will create ETL processes, develop cloud data pipelines, conduct data analysis, and create data visualizations. This role is responsible for maintaining, improving, cleaning, and manipulating data in the business operational and analytics databases. The Data Engineer will work with the business’s software engineers, data analytics teams, data scientists, and data warehouse engineers to understand and help implement database requirements, analyze performance, and troubleshoot existing issues. Due to the nature of the role, only U.S. Citizens may be considered.
This position may initially be performed remotely during the contract transition. Candidates will need to attend in-person meetings at Joint Base Anacostia-Bolling AFB and/or Reston, VA.
- Design and develop reusable data pipelines to support batch and stream processing of large volumes of structured and unstructured data
- Collaborate with team members, including Business Analysts, Data Scientists, Machine Learning (ML) Engineers, and Application Developers, to solve complex problems and fulfill business requirements
- Develop code to automate ETL, cleansing, and other processes to support exploratory data science analysis and model development
- Utilize open-source and secure cloud-native services to implement data pipelines and orchestrate complex process flows
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Instrument controls to measure and monitor pipelines and report metrics on data quality and performance
- Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Implement security controls to secure and protect data “at rest” and “in transit” and restrict access to only users and systems with required privileges
- Create data tools that help analytics and data science team members build and optimize our products into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- 5+ years of experience as a Data Engineer implementing data processing pipelines
- Experience working in model development teams composed of Data Scientists and Machine Learning (ML) Engineers
- Graduate degree in Data Science, Computer Science, Statistics, or related field
- Experience with relational SQL and NoSQL databases, including Postgres and MongoDB
- Experience with data pipeline and workflow management tools such as Airflow and MLflow
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with Infrastructure as Code tools: CloudFormation, Terraform, or similar
- Experience with object-oriented and functional programming languages: Python, Java, Scala
- Fluent in at least one major big data/stream processing tool: Hadoop, Spark, Kafka, or similar
- US Citizenship is required due to the nature of the government contracts we support.
- Candidates must already possess an active TS/SCI clearance.
- Candidates must be willing to obtain and maintain a polygraph.
- Experience supporting intelligence mission-based projects
- Expertise in Data Science tools and ML frameworks, including Jupyter notebooks, TensorFlow, PyTorch, AWS SageMaker, and similar