Measuring the hay in the haystack: quantifying hidden variables using Bayesian Inference

Monitoring performance metrics in network traffic is important in technology-driven trading, but not every metric can be measured directly. Here I discuss how to tackle this problem via Bayesian inference using the PyMC3 package.

Tags: Data Science, Networks

Scheduled on Thursday 11:55 in room lounge


Omer Yuksel

Omer Yuksel is a Data Scientist at IMC, responsible for developing models and Python tools for monitoring and predicting network performance metrics. He has a background in applied mathematics and computer engineering, and has worked on research projects in social network analysis and data-driven methods for network security.


Technology-driven trading is a field with many challenges, and the performance and availability of network communication are essential to the business. To understand both, we monitor certain metrics. However, not every interesting metric is readily available to measure. Some have to be inferred from the data we see in production, combined with our own knowledge of the system. What complicates this further is that the relationship between the hidden variables and the output data is not deterministic, as we are often dealing with a stochastic system.

Bayesian inference is well suited to this problem: it lets us encode our knowledge as a prior distribution over the model parameters and then update that distribution with the observed data. Here we will go through real-world uses of Bayesian inference at IMC, using PyMC3 to estimate hidden metrics in the network traffic.
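To make the idea of a prior concrete, here is a minimal sketch in plain Python rather than PyMC3. It is not the model used at IMC. It assumes a hypothetical hidden quantity, a packet-loss probability `p`, with a Beta prior expressing the belief that loss is rare, and a binomial likelihood for the number of lost packets observed. All names and numbers are made up for illustration, and the posterior is approximated on a grid instead of being sampled.

```python
import math


def grid_posterior(k, n, prior_a, prior_b, grid_size=20000):
    """Grid-approximate the posterior over a hidden loss probability p.

    Assumed model (illustrative only):
      prior:      p ~ Beta(prior_a, prior_b)
      likelihood: k lost packets out of n ~ Binomial(n, p)
    """
    # Use cell midpoints so that log(p) and log(1 - p) are always finite.
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]

    # Unnormalised log posterior = log prior + log likelihood (constants dropped).
    log_post = [
        (prior_a - 1 + k) * math.log(p) + (prior_b - 1 + n - k) * math.log(1 - p)
        for p in grid
    ]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


def posterior_mean(grid, probs):
    """Expected value of p under the discretised posterior."""
    return sum(p * w for p, w in zip(grid, probs))


# Hypothetical data: 3 lost packets out of 1000, with a Beta(1, 99) prior
# (prior mean 1%) standing in for "we believe loss is rare".
grid, probs = grid_posterior(k=3, n=1000, prior_a=1, prior_b=99)
mean = posterior_mean(grid, probs)
print(f"posterior mean loss probability: {mean:.5f}")
```

For this particular prior and likelihood the posterior is also available in closed form as Beta(prior_a + k, prior_b + n - k), which gives a handy check on the grid result. A probabilistic programming library such as PyMC3 becomes useful when the model has no such closed form and a grid is impractical.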

Knowledge: No prior knowledge of PyMC3 is required. Since this is a short presentation, the talk will approach the problem and the solution at a high level rather than going into implementation details.