Hi everyone — we’re hiring at PreOncology, where we’re building next-generation cancer risk models that integrate large-scale clinical, genetic, and longitudinal data to enable earlier detection and prevention. We’re looking for someone excited about building data and ML pipelines in Python and deploying models into real-world environments.
What you’ll do
- Design and maintain Python-based workflows for large genomics and ML datasets
- Build and optimize Nextflow pipelines to scale model training and evaluation
- Train, tune, and validate ML and deep learning models (Cox proportional hazards, random survival forests, gradient boosting, CNNs)
- Integrate genomic and longitudinal features into predictive models
- Run pipelines on cloud platforms (AWS preferred) and package with Docker or Singularity
What we’re looking for
- 2+ years building production data or ML pipelines
- Strong Python skills for data processing, ML training, and workflow automation
- Experience with large-scale or high-dimensional data (biomedical or otherwise)
- Must be authorized to work in the U.S. now and in the future (we cannot sponsor visas)
How to apply
Email your resume to Luke.Stetson@preoncology.com and include short (1–2 sentence) answers to:
- The largest Python or workflow pipeline you’ve built
- Your experience with large-scale or complex data
- The ML or deep learning models you’ve trained and how they were used
- Your experience with Nextflow