Data Scientist, Reinforcement Learning

Spring

Thursday, 30 April 2026

Design, develop, and deploy reinforcement learning solutions for real-world energy applications such as production optimization, process control, supply chain scheduling, drilling optimization, and resource allocation.
Formulate sequential decision problems by defining state spaces, action spaces, reward structures, transition dynamics, and operational constraints with domain experts.
Develop RL agents using model-free methods (e.g., PPO, SAC, TD3, DQN where appropriate) and model-based approaches, selecting methods based on problem requirements, safety, and data availability.
Build and use simulation environments and digital twins for offline training, policy evaluation, and validation before real-world deployment.
Apply safe and constrained RL techniques to ensure agents operate within operational and safety limits.
Integrate RL solutions with existing optimization, simulation, and control systems across real-time and planning use cases.
Partner with data scientists and ML engineers to operationalize solutions, including training pipelines, monitoring, retraining, and performance tracking.
Benchmark RL against traditional methods such as LP, MIP, heuristic search, MPC, and stochastic optimization to identify best-fit approaches.
Stay current with advances in offline RL, safe RL, multi-agent RL, hierarchical RL, and model-based RL.
Share knowledge, publish findings where appropriate, and mentor peers on RL best practices.

About you

Desired Skills:
Experienced AI/ML professional with strong expertise in reinforcement learning, sequential decision-making, optimization, and real-world deployment.
5 years of experience in AI/ML, optimization, or related fields, including at least 2 years in reinforcement learning, sequential decision-making, or optimal control.
Master's or PhD in Computer Science, Machine Learning, Operations Research, Control Theory, Robotics, Applied Mathematics, Engineering, or a related quantitative field.
Deep understanding of RL fundamentals, including MDPs, dynamic programming, temporal-difference learning, policy gradients, and actor-critic methods.
Proven experience building RL systems end-to-end, from environment and reward design through training, evaluation, and deployment.
Experience with simulation environments, digital twins, or system models.
Strong background in statistics, probability, optimization, control theory, and algorithm design.
Proficiency in Python, PyTorch and/or TensorFlow, plus RL tools such as Stable Baselines3, RLlib, and Gymnasium.
Strong communication and collaboration skills, including the ability to explain technical concepts to non-technical stakeholders.

Preferred Skills:
Experience applying RL or decision optimization in industrial domains such as process control, robotics, autonomous systems, supply chain, energy systems, or operations research.
Familiarity with offline (batch) RL, safe RL, and multi-agent RL.
Knowledge of model-based RL, MPC, and hybrid RL-control approaches.
Understanding of classical optimization methods and how RL complements them.
Experience with physics-informed or hybrid mechanistic/ML modeling and domain-informed reward or constraint design.
Familiarity with platforms such as Azure ML, Azure OpenAI, and Databricks, and with MLOps tools such as MLflow or Weights & Biases.
Experience in the energy industry or other asset-intensive, safety-critical sectors.

