We are seeking a highly experienced Principal Data Scientist in Washington, DC. This senior role demands extensive expertise in data science, machine learning, and industry-specific applications, particularly within the railroad sector. The ideal candidate will have a proven track record of developing, validating, deploying, and monitoring complex data models, and of using cutting-edge technologies to support data-driven decision-making and innovation.
Key Responsibilities
- Develop and implement data science models across the full lifecycle: data ingestion, preparation, and visualization; model selection, validation, and accuracy measurement; and deployment and monitoring.
- Utilize Python libraries such as Pandas, NumPy, SciPy, Scikit-Learn, Seaborn, and Keras for advanced data analysis and modeling.
- Employ AWS SageMaker for building, training, and deploying machine learning models.
- Apply a range of techniques and technologies, including machine learning (classification, multiclass classification, regression), time series forecasting, computer vision, natural language processing, text analytics, graph databases, and IoT sensor data analysis.
- Leverage extensive experience in the railroad industry to deliver innovative data solutions and insights.
Required Qualifications
- 10+ years of experience working with large data sets or doing large-scale quantitative analysis.
- Expert-level SQL scripting skills.
- Development experience in at least one of the following: Scala, Java, Python, Perl, PHP, C++, or C#.
- Experience working with Hadoop, Pig/Hive, Spark, and MapReduce.
- Basic understanding of statistics: hypothesis testing, p-values, confidence intervals, regression, classification, and optimization should all be familiar vocabulary.
- Experience manipulating large data sets through statistical software or other methods.
- Experimentation design or A/B testing experience is preferred.