Data Scientist (R-00069762)
Leidos has an immediate opening for a Data Scientist to contribute to public health projects as part of a cross-functional team supporting the National Center for Immunization and Respiratory Diseases (NCIRD) at the Centers for Disease Control and Prevention (CDC). The Data Scientist will analyze structured and unstructured datasets to determine data relationships, model datasets for loading into and storage in relational databases, and create new data pipelines and ETL processes. These pipelines use the R environment and the Azure cloud stack of resources and services to migrate data from traditional on-premises SQL databases to the Azure Data Warehouse and Azure Synapse. The Data Scientist may also contribute to data visualization projects, working with the team to connect data streams and pipelines to Power BI, Tableau, and R Shiny. This role works closely with the Data Science team and CDC staff. The position requires an entrepreneurial mindset and strong communication skills to meet with customers and translate their requirements into working data solutions while operating within the government’s guidelines and mandates.
Day to Day:
- Use R to clean, manipulate, and transform data in preparation for visualization and analysis.
- Design, develop, and implement production data pipelines using R and the Azure stack of tools and services, working collaboratively within a cross-functional team.
- Migrate complex on-premises data pipelines to production-ready pipelines in the Azure cloud for use in data visualization, analysis, and reporting.
- Build and configure data workflows using Azure Data Factory and Azure SQL Database/Warehouse.
- Create complex stored procedures, high-performance views, triggers, queries, tables, and functions in SQL Server.
- Attend team planning meetings, backlog refinement, daily stand-ups, and customer demos.
Required:
- Bachelor's degree and 3+ years of experience designing, developing, and implementing data pipelines and performing analysis using RStudio and Microsoft Azure.
- 1-2+ years of demonstrated work experience with Azure Data Factory and Azure SQL Database/Warehouse.
- 1-2+ years of demonstrated work experience using Azure Databricks, Delta Lake, Lakehouse, PySpark, and Scala.
- Hands-on experience migrating complex on-premises data pipelines to production-ready pipelines in the Azure cloud for use in data visualization, analysis, and reporting.
- Experience developing data pipelines and data flow tasks applying complex data manipulation and integration from a variety of data sources and destinations.
- Experience building and configuring data workflows using Azure Data Factory, Azure Stream Analytics, and Azure SQL Database/Warehouse.
- Experience with any of the following data visualization tools: Power BI, Tableau, or R Shiny.
- Knowledge of and interest in machine learning techniques.
- Ability to multitask according to project priorities and deliver high-quality solutions on time.
- Understanding of enterprise data architecture and data quality controls.
- Experience with version control tools (Git).
- Team player who thrives in a dynamic and fast-paced environment.
- Ability to write technical documentation and document system architectures.
- Knowledge of Agile Development methodologies and the Software Development Lifecycle (SDLC).
Plus, but not required:
- Any of the following certifications: Azure Data Engineer, Azure Solutions Architect, Microsoft Certified Solutions Associate, Solutions Expert, or Database Administrator.
- Knowledge/experience with Azure Stream Analytics.