Leidos has an immediate opening for a Data Scientist to contribute to public health projects as part of a cross-functional team for the National Center for Immunization and Respiratory Diseases (NCIRD) at the Centers for Disease Control and Prevention (CDC).
The Data Scientist will work across the full Data Science lifecycle: onboarding and preparing data, modeling datasets for loading into and storage in relational databases, developing data pipelines and ETL processes using R programming and the Azure Cloud stack of services, analyzing structured datasets to determine data relationships, and contributing to data visualization projects by working with the team to connect data streams and pipelines to Power BI, Tableau, and R Shiny. The Data Scientist may also work with the team on data migration projects from on-premises SQL databases to Azure SQL databases or Azure Synapse databases.
This role will work closely and collaboratively with the Leidos Data Science team, CDC staff, and external data partners. The position requires an entrepreneurial mindset and strong communication skills to meet with customers and translate their requirements into working data solutions while operating within the government’s guidelines and mandates. It provides the opportunity to work across a number of technologies alongside CDC Public Health experts within an exciting and growing Data Science discipline at the CDC to help solve tomorrow’s public health challenges. This position offers a flexible schedule with “meeting free” Fridays and a remote work location.
- Designs, develops, tests, and implements fully automated, event-triggered, or scheduled production data pipelines using R programming and/or the Azure stack of tools and services, working collaboratively within a cross-functional team.
- Writes, tests, and implements R code for cleaning, wrangling, manipulating, and transforming data to prepare it for downstream storage (such as inserting into or updating databases) and for downstream analytical processing, including statistical computation, visualization, and standardized reporting.
- Develops data visualizations for public-facing Power BI dashboards.
- Independently meets and clearly communicates with CDC subject-matter experts, CDC technical staff, and fellow NCIRD Data Science Team members to extract project requirements, translate them into technical implementation plans, and develop solutions that meet the requirements.
- Creates generalized functions and incorporates them into an R package maintained by the team to perform common data tasks.
- Attends team planning meetings, backlog refinement, daily stand-ups, and customer demos.
- Bachelor's Degree in Statistics, Biostatistics, Data Science, Analytics, Mathematics, Computer Science or Computer Engineering, Computer or Management Information Systems (or a similar scientific degree), and 3+ years of experience designing, developing, and implementing data pipelines or analytical processes using RStudio and/or MS Azure.
- High proficiency in R programming, including data cleaning, wrangling, manipulation, and transformation to prepare data for visualization and analysis.
- Expertise in writing R functions and in developing and maintaining in-house R packages.
- Expertise in R packages, especially dplyr, tidyverse, Shiny, plotly, knitr, and ggplot2.
- Skilled in developing dynamic documents, slide decks, and other deliverables using R Markdown.
- Skilled in applying statistical methods to extract meaning from data.
- Experience merging and integrating disparate datasets and outputting to various target destination databases or systems.
- Experience using Application Programming Interfaces (APIs) to retrieve data or submit requests to online data systems.
- Experience with Power BI and R Shiny dashboards.
- Knowledge of and interest in Statistics, Machine Learning, and/or AI techniques.
- Ability to multitask according to project priorities and deliver high-quality solutions on time.
- Familiarity with Azure DevOps (or GitHub) as both a version control repository and a project management tracking system.
- Understanding of enterprise data architecture and data quality controls.
- Team player who thrives in a dynamic and sometimes fast-paced environment.
- Ability to write technical documentation and create system architecture diagrams.
- Knowledge of Agile Development methodologies and the Software Development Lifecycle (SDLC).
- University coursework in R, Statistical Computing, and/or Database Design and Administration.
- Previous experience working in the Public Health domain.
- Programming experience in Python, SAS, and/or SUDAAN.
- Familiarity with database stored procedures, tables, views, triggers, and queries.
- Experience using data management software and utilities such as Stat-Transfer and DBMS Copy.
- Experience with the O365 application Power Automate.
- Work experience with Azure Data Factory, SQL Server, or other Azure Databases.
- Hands-on experience migrating data pipelines from on-premises environments into Azure cloud environments.
- Any of the following relevant certifications: Azure Data Engineer Certification, Azure Solutions Architect, Microsoft Certified Solutions Associate, Solutions Expert, or Database Administrator.
- Experience using MS SQL Server Management Studio to write SQL and T-SQL code for interacting with project databases.
Ability to obtain a NACI clearance is required.
Pay Range: $60,450.00 – $93,000.00 – $125,550.00
The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.