Standard tools cannot address every time-series problem, especially in extrapolation, where even the applicability of many standard machine-learning tools is not obvious. We will describe a problem we recently encountered in time-series extrapolation, where the underlying time series were (i) non-stationary, (ii) subject to secular shocks, (iii) short, and (iv) sometimes only indirectly visible. The challenge was to extrapolate the next nine points for each instance.

Our approach consisted of two stages: an initial analytic stage, in which we used Gaussian process regression to identify intrinsic structure, and a final extrapolation stage, in which we used a custom state-space model motivated by the specific structure we had identified with the GP. We implemented all of our models in the Stan probabilistic programming language, which we accessed through the PyStan Python package.
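To make the first stage concrete, the following is a minimal sketch of Gaussian process regression used as a smoother on a short, noisy series. It is not the Stan model from the talk: it uses plain NumPy, the closed-form (algebraic) posterior, and fixed hyperparameters. The squared-exponential kernel, the values of `length_scale`, `amplitude`, and `noise_sd`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, length_scale=3.0, amplitude=1.0):
    """Squared-exponential covariance between two 1-D arrays of time points."""
    diff = x1[:, None] - x2[None, :]
    return amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(t_obs, y_obs, t_new, noise_sd=0.3, **kernel_args):
    """Closed-form GP posterior mean and sd of the latent function at t_new.

    Hyperparameters are held fixed here (an assumption); a full Bayesian
    treatment would place priors on them and integrate over them.
    """
    K = rbf_kernel(t_obs, t_obs, **kernel_args) + noise_sd**2 * np.eye(len(t_obs))
    K_s = rbf_kernel(t_obs, t_new, **kernel_args)
    K_ss = rbf_kernel(t_new, t_new, **kernel_args)

    # Cholesky factorisation is the numerically stable way to apply K^{-1}.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, sd


# Synthetic short series (illustrative only): a slow wave plus noise.
rng = np.random.default_rng(0)
t_obs = np.arange(20, dtype=float)
y_obs = np.sin(t_obs / 4.0) + rng.normal(scale=0.3, size=t_obs.size)

# Evaluate the smoothed latent function on a fine grid over the observed range.
t_grid = np.linspace(0.0, 19.0, 50)
mean, sd = gp_posterior(t_obs, y_obs, t_grid)
```

The posterior mean plays the role of a smoothed trend, and the posterior standard deviation indicates how well each part of that trend is determined by the data, which is one thing a LOESS fit does not provide directly.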

In our presentation, we will describe the details and motivation of each step (what advantage does a Gaussian process offer in comparison to, e.g., LOESS regression? Why a state-space model, and not an ARMA model?) and the results. We will also compare and contrast our approach with alternatives: e.g., why we do not use a Kalman filter to analyze our model, but instead implement it in Stan, and why we perform a full marginal analysis of the Gaussian process rather than taking the better-known (older) algebraic approach. We will also describe the supporting processes for the project and how these interacted with the technical work.
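For readers unfamiliar with the Kalman-filter alternative mentioned above, here is a minimal sketch of what it looks like for the simplest linear-Gaussian state-space model, the local-level model. This is not the custom model from the talk, whose structure is not given here; the model, the function name `local_level_forecast`, and the variance values are assumptions chosen for illustration. The sketch filters a short series and then extrapolates nine points ahead.

```python
import numpy as np


def local_level_forecast(y, obs_var, level_var, horizon=9, m0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t.

    Returns the forecast mean and variance of the next `horizon` observations.
    obs_var and level_var are treated as known (an assumption).
    """
    m, p = m0, p0  # near-diffuse prior on the initial level
    for obs in y:
        p = p + level_var              # predict: the level drifts as a random walk
        k = p / (p + obs_var)          # Kalman gain
        m = m + k * (obs - m)          # update the level estimate
        p = (1.0 - k) * p              # update its variance

    steps = np.arange(1, horizon + 1)
    f_mean = np.full(horizon, m)                 # random walk: flat forecast
    f_var = p + steps * level_var + obs_var      # uncertainty grows linearly
    return f_mean, f_var


# Illustrative input: a short, constant series of twelve points.
y = np.full(12, 5.0)
f_mean, f_var = local_level_forecast(y, obs_var=1.0, level_var=0.1)
```

The filter is exact only when the model is linear and Gaussian with known variances. A model written in a probabilistic programming language is not bound by those restrictions, at the cost of sampling rather than closed-form recursion.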

Marion Cornelia Schwärzler

Affiliation: Deloitte Analytics Institute

Member of the Mathematical analytics group, Deloitte Analytics Institute.
