In this role, you will work as part of a Machine Learning team developing and deploying scalable ML solutions in cloud environments. Your work will include data processing and feature engineering, model training and optimization, and putting models into production. You will also contribute to building ML infrastructure, automation, and monitoring systems to ensure model performance and reliability.
Required Skills:
- Strong experience in Python and data processing (e.g., Databricks, Spark, Redis, feature stores such as Tecton)
- Experience with cloud platforms (AWS and/or GCP)
- Solid understanding of ML frameworks (PyTorch, TensorFlow, ONNX)
- Experience in model training and tuning (Databricks, Spark, Ray)
- Experience in model serving and API development (Ray Serve, FastAPI, Triton Inference Server)
- Knowledge of containerization and orchestration (Docker, Kubernetes)
- Experience with infrastructure as code and CI/CD (Terraform, Kubeflow, GitHub Actions, Jenkins)
- Experience in monitoring and experiment tracking (Grafana, Prometheus, MLflow, Weights & Biases)
- Familiarity with AI-assisted coding tools (e.g., Claude or similar)