Learning to Stock Smarter: How Tailored Algorithms Strengthen Supply Chains
Tailored deep reinforcement learning methods help companies manage uncertainty in inventory systems, reducing costs and improving reliability across complex supply chains.
Tarkan Temizoz, a PhD researcher in the Operations Planning Accounting & Control (OPAC) group at the Department of Industrial Engineering and Innovation Sciences, defended his PhD on April 2, 2026. His work explores how adaptive machine learning can support companies facing rising uncertainty in supply chains.
Growing Pressures
Modern supply chains experience a level of volatility that forces entrepreneurs and organizations to make difficult choices. Demand can change quickly, suppliers may not deliver on time, and shortages can ripple through entire production systems. Temizoz shows how inventory decisions become more resilient when algorithms learn to anticipate such uncertainty rather than applying a single fixed rule. His research demonstrates how tailored learning methods help companies maintain service levels without tying up unnecessary capital.
New Thinking
Deep reinforcement learning is promising because it captures how inventories evolve over time and how today's choices affect future costs and risks. However, standard versions of these methods struggle when randomness is high. Temizoz addresses this by designing learning approaches that reduce the instability caused by fluctuating demand and lead times, allowing algorithms to improve their decisions even when conditions are unpredictable.
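To make the idea of inventories evolving over time concrete, the following is a minimal sketch of a single-item lost-sales system with a delivery lead time. It is an illustration only, not the method from the thesis: the function name `simulate_lost_sales`, the simple order-up-to rule, and the cost parameters are all assumptions chosen for the example. A learning agent would replace the fixed ordering rule with a learned one.

```python
from collections import deque


def simulate_lost_sales(demands, base_stock, lead_time,
                        holding_cost=1.0, penalty_cost=9.0,
                        initial_on_hand=0):
    """Return the total cost of an order-up-to rule in a lost-sales system.

    All names and cost values here are illustrative assumptions.
    """
    on_hand = initial_on_hand
    # Orders in transit, oldest first; an order placed now arrives
    # `lead_time` periods later.
    pipeline = deque([0] * lead_time)
    total_cost = 0.0

    for demand in demands:
        # Today's decision: raise the inventory position to the target level.
        position = on_hand + sum(pipeline)
        order = max(0, base_stock - position)
        pipeline.append(order)

        # Receive the order that is due in this period.
        on_hand += pipeline.popleft()

        # Demand that cannot be met from stock is lost, not backordered.
        sales = min(on_hand, demand)
        lost = demand - sales
        on_hand -= sales

        total_cost += holding_cost * on_hand + penalty_cost * lost

    return total_cost


if __name__ == "__main__":
    demands = [3, 3, 3, 3]
    for level in (3, 6):
        cost = simulate_lost_sales(demands, base_stock=level, lead_time=1)
        print(f"order-up-to level {level}: total cost {cost}")
```

Because each order arrives only after the lead time, the amount ordered today determines the shortages and holding costs of later periods. This delayed effect is what a reinforcement learning agent has to account for.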
Reliable Learning
Temizoz introduces a method that transforms learning from experience into a supervised learning task, limiting the influence of noise and controlling the way decisions are tested. As a result, the algorithm can handle the erratic nature of lost sales situations, perishable goods, and variable lead times. This structured approach keeps training efficient while still achieving high decision quality.
Adapting on the Fly
Temizoz also examines how learning systems can operate effectively when key parameters are unknown. Instead of retraining models whenever conditions shift, he proposes training agents on broad families of inventory problems and deploying them directly in new situations. These agents combine deep reinforcement learning with statistical estimation during operation, allowing them to make sound decisions even when demand patterns or lead time distributions must be learned along the way.
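As a rough illustration of statistical estimation during operation, the sketch below keeps a running estimate of the demand mean and variance that is updated after every period. The class name `RunningDemandEstimate` is hypothetical, and the sketch assumes that demand is fully observed. In a lost-sales setting, only sales are typically observed, so a real system would need an estimator that accounts for that censoring. How such estimates are passed to a trained agent is not specified in this article and is not shown here.

```python
class RunningDemandEstimate:
    """Online mean and sample variance of observed demand (Welford's method).

    Hypothetical helper for illustration; assumes uncensored demand data.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, demand):
        """Incorporate one new demand observation."""
        self.count += 1
        delta = demand - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (demand - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two observations exist."""
        if self.count < 2:
            return 0.0
        return self._m2 / (self.count - 1)


if __name__ == "__main__":
    estimate = RunningDemandEstimate()
    for observed in [2, 4, 6]:
        estimate.update(observed)
    print(estimate.mean, estimate.variance)
```

The estimate is refined with each observation, so nothing has to be retrained when new data arrives. Only the inputs describing the current demand pattern change.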
Contract Realities
Many companies work with service level agreements that specify performance over clearly defined time horizons, where every shortage can lead to penalties. Temizoz studies how inventory policies can be designed to meet these contractual requirements in systems with many items. His research shows how companies can combine static policies with dynamic adjustments that select among predefined strategies. By learning when to switch policies in real time, these dynamic approaches reduce costs while maintaining the service levels required by the agreement. The findings also highlight that longer review horizons can lead to lower overall costs, offering suppliers insights for negotiation.
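One way to see why the length of the review horizon matters is to compute the fill rate (the fraction of demand met from stock) separately for each review window. The sketch below is an assumption-laden illustration, not the analysis from the thesis: the function name `horizon_fill_rates`, the example numbers, and the 95% target are all made up for the example.

```python
def horizon_fill_rates(demands, fulfilled, horizon):
    """Return the fill rate of each complete review window of length `horizon`.

    Hypothetical helper: `demands` and `fulfilled` are per-period quantities.
    """
    rates = []
    for start in range(0, len(demands) - horizon + 1, horizon):
        window_demand = sum(demands[start:start + horizon])
        window_fulfilled = sum(fulfilled[start:start + horizon])
        # A window without demand counts as fully served.
        rates.append(window_fulfilled / window_demand if window_demand > 0 else 1.0)
    return rates


if __name__ == "__main__":
    demands = [10, 10, 10, 10]
    fulfilled = [10, 8, 10, 10]  # one shortage in the second period
    target = 0.95  # illustrative contractual fill-rate target

    for horizon in (2, 4):
        rates = horizon_fill_rates(demands, fulfilled, horizon)
        misses = sum(rate < target for rate in rates)
        print(f"horizon {horizon}: fill rates {rates}, windows below target: {misses}")
```

In this made-up example, the same shortage breaches the target when performance is reviewed every two periods, but not when it is reviewed over all four periods, because the longer window averages out the bad period.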
Future Outlook
Together these contributions form a coherent toolkit for researchers and practitioners who want to apply machine learning to real inventory systems. Temizoz illustrates that the strength of such methods lies not in applying generic algorithms, but in tailoring them to the uncertainties, contracts, and constraints that shape modern supply chains.
Tarkan Temizoz defended his thesis on April 2, 2026.
Title of the thesis: .
Supervisors: Willem van Jaarsveld, Remco Dijkman, Christina Imdahl, Douniel Lamghari-Idrissi