Balancing data compression and accuracy in decision-making systems

November 25, 2025

Han Wu defended his PhD thesis at the Department of Electrical Engineering on November 25th.

Modern technologies generate large volumes of digital data, such as images, videos and sensor readings. To process these data efficiently, systems often rely on lossy compression, a technique that reduces data size by discarding less important details. For example, facial recognition systems extract key features like the positions of the eyes, nose, and mouth instead of storing an entire image. This saves computing power, but introduces a challenge: fewer features mean less information, which can increase the risk of mistakes. The trade-off between compression efficiency and decision accuracy is critical for systems that rely on compressed data for decision-making. In his PhD research, Han Wu explores this trade-off across three different systems and uncovers deeper connections between them.

Before analyzing trade-offs, it's important to define metrics for both compression efficiency and decision accuracy. The information bottleneck method provides a framework in which both are measured by mutual information, a metric that quantifies the correlation between a system's inputs and outputs. However, correlation alone does not guarantee maximum reliability. In many cases, decision error probability provides a more direct measure of reliability. Han Wu investigated three systems related to the information bottleneck method and derived upper and lower bounds for their decision error probabilities. He focused particularly on the exponential regime, that is, the rate at which error probability decreases exponentially. This rate indicates the highest achievable reliability and serves as a benchmark for evaluating real-world system performance.
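To make the notion of mutual information concrete, here is a minimal Python sketch that computes it for a small discrete joint distribution. The function name `mutual_information` and the two toy distributions are illustrative assumptions and are not taken from the thesis.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint pmf given as a list of rows.

    joint[i][j] is P(X = i, Y = j); entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]          # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability pairs contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


# Toy example (assumption): the output copies a uniform binary input exactly.
perfect = [[0.5, 0.0], [0.0, 0.5]]
# Toy example (assumption): the output is independent of the input.
independent = [[0.25, 0.25], [0.25, 0.25]]

print(mutual_information(perfect))
print(mutual_information(independent))
```

In the first case the output reveals the input completely (one bit of mutual information); in the second it reveals nothing (zero bits). Compression schemes studied under the information bottleneck method sit between these two extremes.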

Remote lossy source coding under logarithmic loss

The first system examined in Wu's research is remote lossy source coding under logarithmic loss, where a data source is corrupted by noise before compression and reconstruction. Instead of making a hard reconstruction, the system outputs a likelihood score that indicates confidence in its estimate. These soft reconstructions are widely used in inference tasks because they convey richer information. However, if the likelihood score is too low, it is considered erroneous. Wu derived the exact exponential convergence rates for reconstruction error probability in this setting.
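The idea of a soft reconstruction scored by logarithmic loss can be illustrated with a short Python sketch. It handles a single symbol only, and the names `log_loss`, `is_error` and the threshold `max_loss` are hypothetical; the thesis itself works with long blocks of symbols and its exact error criterion is not reproduced here.

```python
import math


def log_loss(soft, true_symbol):
    """Logarithmic loss in bits: -log2 of the likelihood assigned to the truth."""
    q = soft.get(true_symbol, 0.0)
    if q <= 0.0:
        return math.inf  # the true symbol was ruled out entirely
    return -math.log2(q)


def is_error(soft, true_symbol, max_loss):
    """Declare an error when the likelihood is too low (loss above max_loss)."""
    return log_loss(soft, true_symbol) > max_loss


# Hypothetical soft reconstruction: a probability for each candidate symbol.
soft = {"a": 0.5, "b": 0.25, "c": 0.25}

print(log_loss(soft, "a"), is_error(soft, "a", max_loss=1.5))
print(log_loss(soft, "b"), is_error(soft, "b", max_loss=1.5))
```

A confident and correct estimate yields a small loss, while a low likelihood on the true symbol yields a large loss and is flagged as an error under the chosen threshold.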

Oblivious relaying

The second system studied is oblivious relaying. In this setup, mobile users send encoded signals to a radio receiver, which compresses the signals and forwards them to a cloud server for decoding. Han Wu analyzed the decoding error probability and established upper and lower exponential error bounds, offering insights into the reliability of such communication channels.

Source coding with a helper

The third system investigated in Wu's research is source coding with a helper. Video compression illustrates this concept well: instead of storing every frame, only the differences between consecutive frames are saved. The question becomes: given the previous encoded frame, how much space can we save when compressing the next one? Wu demonstrates that this system is deeply connected to the previous two at the coding level. In fact, exponential error bounds derived for the first two systems can be applied here, revealing structural similarities and enabling cross-system optimization. These insights can serve as a guide for designing future decision-making systems.

Title of PhD thesis: . Supervisors: Dr. Hamdi Joudeh and Prof. Alex Alvarado.

Media Contact

Linda Milder
(Communications Officer)