Description
In this role, you will take ownership of the training and calibration infrastructure for advanced machine learning models. You will work closely with data scientists and engineers to scale models from research prototypes into production systems that operate on large datasets and deliver statistically sound outputs.
A key part of the role is ensuring that model predictions are not only accurate but also well-calibrated and externally defensible, meeting high standards for validation and reproducibility.
You will operate across the full ML lifecycle—from training pipelines and experimentation frameworks to model evaluation, calibration, and deployment readiness—within a modern cloud and HPC-enabled environment.
Start: ASAP / negotiable
Work model: hybrid; on-site 3 days per week preferred (Espoo)
Requirements
- Master’s degree (or higher) in Computer Science, Machine Learning, Statistics, Applied Mathematics, or a related field
- 5+ years of hands-on experience training and deploying ML models in production environments
- Strong experience with large-scale datasets and distributed or GPU-based training
- Proven expertise in probability calibration (e.g. isotonic regression, Platt scaling) and ability to diagnose calibration issues (a minimal sketch follows this list)
- Deep understanding of evaluation for imbalanced classification, including calibration and ranking metrics
- Experience with hyperparameter optimization at scale (e.g. Optuna, Ray Tune; see the second sketch after this list)
- Solid background in ML/ModelOps practices, including:
  - Experiment tracking
  - Model versioning
  - Reproducibility and artifact management
- Strong Python skills (pandas, NumPy, scikit-learn)
- Experience with cloud environments (AWS) and distributed workloads
- Ability to communicate technical results clearly to non-technical stakeholders
- Experience collaborating cross-functionally with Data Engineers and Data Scientists
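To illustrate the calibration techniques named above, here is a minimal sketch using scikit-learn's CalibratedClassifierCV, which wraps both Platt scaling ("sigmoid") and isotonic regression. The synthetic data, base model, and metric are illustrative assumptions, not project specifics.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Illustrative imbalanced synthetic data (an assumption for this sketch).
X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = RandomForestClassifier(random_state=0)

# "sigmoid" is Platt scaling; "isotonic" fits an isotonic regression.
for method in ("sigmoid", "isotonic"):
    calibrated = CalibratedClassifierCV(base, method=method, cv=5)
    calibrated.fit(X_train, y_train)
    probs = calibrated.predict_proba(X_test)[:, 1]
    # Brier score: lower is better-calibrated probability estimates.
    print(method, "Brier score:", brier_score_loss(y_test, probs))
```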
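Similarly, a minimal hyperparameter optimization sketch with Optuna; the search space, model, and trial budget are hypothetical assumptions chosen only to show the pattern.

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, random_state=0)

def objective(trial):
    # Hypothetical search space for the sketch.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 12),
    }
    model = RandomForestClassifier(**params, random_state=0)
    # Cross-validated ROC AUC is the optimization target here.
    return cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```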
Nice to have
- Experience delivering ML-powered features in production (CI/CD, monitoring)
- Understanding of geospatial data challenges (e.g. spatial cross-validation)
- Experience with uncertainty quantification methods (e.g. conformal prediction, quantile regression; see the sketch at the end of this section)
- Familiarity with HPC environments (e.g. SLURM-based clusters)
- Experience with Databricks, PostGIS, or AWS RDS/Aurora
- Background in insurance, catastrophe modeling, or climate risk
- Familiarity with tabular deep learning approaches (e.g. TabNet, FT-Transformer)
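As a pointer to the uncertainty quantification methods listed above, a minimal split-conformal prediction sketch for regression; the data, base model, and coverage target are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Illustrative data and model, assumed only for this sketch.
X, y = make_regression(n_samples=3000, noise=10.0, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0
)
X_calib, X_test, y_calib, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Split conformal: absolute residuals on a held-out calibration set yield a
# quantile q such that [pred - q, pred + q] covers ~(1 - alpha) of new points.
alpha = 0.1
residuals = np.abs(y_calib - model.predict(X_calib))
n = len(residuals)
q = np.quantile(residuals, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

preds = model.predict(X_test)
coverage = np.mean((y_test >= preds - q) & (y_test <= preds + q))
print(f"Empirical coverage: {coverage:.3f} (target {1 - alpha:.2f})")
```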