Teaching machines to make better decisions when they can't see the full picture

2 June 2026

Roy van Zuijlen. Photo: Angelique Swinkels

In the real world, engineering systems rarely have access to perfect information. Robots, industrial machines, and autonomous systems must often act based on sensor readings that are incomplete, noisy, or indirect. This makes it difficult to determine whether a change in performance is caused by an improved decision strategy or simply by uncertainty in the available data.

As a result, developing reliable methods for improving decision-making remains a major challenge.

Learning directly from observations

The thesis focuses on a technique known as policy iteration, a method for gradually improving decision-making strategies. Traditional approaches often require detailed knowledge of the underlying system, something that is not always available in practice.

The new methods developed in this research allow policies to be learned and refined directly from observations. This means systems can continue to improve even when their internal dynamics are only partially understood.
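For readers unfamiliar with the term, the sketch below shows the classic textbook form of policy iteration on a tiny, made-up decision problem. It is the traditional, model-based version that needs full knowledge of the system, not the data-driven methods developed in the thesis. The two-state problem, the function name `policy_iteration`, and the arrays `P` and `R` are all illustrative assumptions.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Textbook policy iteration for a small, fully known decision problem.

    P[a, s, t]: probability of moving from state s to state t under action a.
    R[s, a]:    expected immediate reward for taking action a in state s.
    """
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)  # start by always choosing action 0
    states = np.arange(n_states)
    for _ in range(max_iters):
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, states]   # transition matrix under the current policy
        r_pi = R[states, policy]   # reward vector under the current policy
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to v.
        q = R + gamma * np.einsum("ast,t->sa", P, v)
        new_policy = np.argmax(q, axis=1)
        if np.array_equal(new_policy, policy):
            break                  # no change, so the policy is stable
        policy = new_policy
    return policy, v

# Hypothetical two-state example: action 0 stays put, action 1 switches state.
# Staying in state 1 is the only rewarded choice.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],    # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])   # action 1: switch
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])
policy, v = policy_iteration(P, R)
```

In this toy problem the strategy settles on switching out of state 0 and staying in state 1. The evaluation step relies on the transition model `P`, which is exactly the detailed system knowledge that is often missing in practice.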

Better results with less data

One of the key findings is that learning can become more efficient when prior knowledge is combined with limited experimental data. This is particularly important in applications where collecting large amounts of data is expensive, time-consuming, or impractical.

By making smarter use of available information, the proposed methods can achieve strong performance while reducing both data requirements and computational costs.

Revealing hidden states

Many systems depend on information that cannot be measured directly. To address this, the thesis introduces advanced techniques for estimating hidden system states from complex sensor data, including images.

The researcher combines ideas from Bayesian statistics, machine learning, and signal processing to extract meaningful information from observations and improve decision-making under uncertainty.
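As a rough illustration of the Bayesian idea behind hidden-state estimation, the sketch below shows one step of a standard discrete Bayes filter. It is a generic textbook technique, not the thesis's own method (which also handles complex data such as images), and the two-state system, the matrix `T`, and the sensor likelihoods are invented for the example.

```python
import numpy as np

def bayes_filter_step(belief, T, likelihood):
    """One predict-update step of a discrete Bayes filter.

    belief[i]:     current probability that the hidden state is i.
    T[i, j]:       probability of moving from state i to state j.
    likelihood[j]: probability of the latest sensor reading given state j.
    """
    predicted = belief @ T                # predict: push belief through dynamics
    posterior = predicted * likelihood    # update: weight by the sensor evidence
    return posterior / posterior.sum()    # normalise to a probability vector

# Hypothetical system: two hidden states that rarely switch, and a noisy
# sensor whose reading is four times more likely in state 0 than in state 1.
T = np.array([[0.9, 0.1],
              [0.1, 0.9]])
likelihood = np.array([0.8, 0.2])

belief0 = np.array([0.5, 0.5])            # start with no idea which state holds
belief1 = bayes_filter_step(belief0, T, likelihood)
belief2 = bayes_filter_step(belief1, T, likelihood)
```

Each repeated reading makes the estimate more confident that the system is in state 0, even though the state itself is never measured directly.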

From theory to real-world impact

The work provides both theoretical foundations and practical tools for designing smarter control systems. By enabling reliable learning and policy improvement under partial observability, the research opens new possibilities for robotics, manufacturing, and other data-driven engineering applications.

Ultimately, the thesis shows that even when information is incomplete and data are scarce, intelligent systems can still learn to make better decisions.

Title of PhD thesis: . Supervisors: Dr. Duarte Antunes and Prof. Maurice Heemels.

Media Contact

Rianne Sanders

(Communications Advisor ME/EE)

J.J.M.Sanders@tue.nl
