This is not white noise! – Ceci n’est pas bruit blanc!

Disclaimer: This is an informal summary of Goerg (2012), Testing for white noise against locally stationary alternatives, Statistical Analysis & Data Mining, 5, (6). You can find a draft version on By having come to this site I assume you know about time series modeling and prediction – so I spare you technical details and formal definitions here.


This is not white noise!

Figure: “This is not white noise.” Rene Magritte’s 2 cents on time series residual checks (see MagrittePipe).

I am sure you have seen many of those residual check figures in text books, papers, etc. and also have produced them yourselves. However, these plots are misleading because they do not necessarily mean that the series is indeed white noise; in fact, it just shows you that on average it is white noise.

What does that mean? Well, what actually happens is that sample estimates of the autocorrelation function (ACF) \widehat{\rho}(k) from the entire sequences x_1, \ldots, x_T converge to the temporal average of a possibly time-varying correlation structure \rho(k; u), where u = t/T \in [0,1] is time rescaled (from [0, T]) to the unit-interval.

This means that when you fit a model, plot the sample ACF of the residuals and all lags k \geq 1 are below the significance line, this just shows that your model can remove auto-correlation on average, but you still have to check if your data has time varying dynamics.


Let’s take an example from the paper. Consider the locally-stationary moving-average process of order 1 (LS-MA(1)):

    \[x_{t/T} = \varepsilon_{t/T} + \theta_1(u - 1/T) \varepsilon_{(t-1)/T}, \quad \varepsilon_{t/T} \sim N(0, \sigma(u)^2)\]


and \theta_1(u) has a non-linear time-varying structure over normalized time u = t/T \in [0,1], and \sigma(u)^2 = (1 + \theta_1(u)^2)^{-1} is the inverse of the usual MA(1) variance of a process. This choice of variance ensures that the process x_t has unit variance over time (if you want another variance, just multiply the result with the square root of this target variance).

yy <- lsarma11.sim(n = 1000, theta = c(0.5, -1), sigma = "constant")
layout(matrix(1:2, ncol = 2))

Figure: Simulation of an LS-MA(1) with such a time-varying heta_1(u) that it becomes a spurious white noise process.

The functional form of \theta(u) was parametrized by (0.5, -1), which means that it is a linear polynomial of u with coefficients

    \[\theta_1(u) = 0.5 - 1 \cdot u\]


Recall that the autocovariance function of an MA(1) process has only a non-zero value for lag k=1 with \gamma(k = 1) = \sigma^2 \theta_1. Now let’s add in the normalized time in again and get \gamma(k, u) = \sigma^2(u) \cdot \theta_1(u).

The important part here is that \sigma^2(u) \theta_1(u) integrates to 0 over time u \in [0, 1]: \int_0^1 \sigma^2(u) \theta_1(u) du = \int_0^1 \theta_1(u) du = 0, where the first equaliy follows since \sigma(u)^2 is an even function (around 0.5) and the second because \theta_1(u) is odd (around 0.5). This means that this particular LS-MA(1) process has an autocovariance function that averages out to 0 for all k > 0. Using the terminology of Goerg (2012) this particular LS-MA(1) process is a spurious white noise process.

Testing for white noise and proposal for a new white noise test

Why does that matter in practice? Well, it is spurious in the sense that the sample autocorrelation function (ACF) \widehat{\rho}(k) tends to the the ACF of white noise, \gamma_{WN}(k) = \mathbf{1}(k = 0), for T \rightarrow \infty. This means for a large enough sample the sample ACF is indistinguishable from white noise. This is bad!! Especially if we use Portmanteau style white noise tests like the Box-Ljung test.

Box.test(yy, lag = 5, type = "Ljung-Box")
##  Box-Ljung test
## data:  yy
## X-squared = 5.5603, df = 5, p-value = 0.3514

Using the popular white noise test we wouldn’t reject white noise (p-value: 0.3513767) – even though we know it is not white noise. This shows you that the Ljung-Box test has very low power against locally stationary alternatives.

In the paper I propose a simple windowing approach to improve the power against such locally stationary alternatives – locally stationary Lung Box test., lag = 5, K = 2, type = "Ljung-Box")
##  LS Ljung-Box test; Number of windows = 2; non-overlapping window
##  size = 500
## data:  yy
## X-squared = 47.813, df = 10, p-value = 6.715e-07

Here even with already two windows (K = 2) we can detect the non-stationarity.

See the paper for details and from now on beware when someone shows you an ACF plot with a Ljung-Box test to justify that their models produce white noise residuals.

Note: I do note that the locally stationary deviation from white noise might not always be important for the application at hand. However, in principle to understand the dynamics of the process – and to get even better forecasts – you should care about it.

Update: It is true that this might look just like a theoretically interesting case, but not seen in practice. I agree it is unlikely to see such spurious white noise in a real world dataset directly, but you will see this in almost any residuals of a stationnary model. Without going into too much theoretical detail, just imagine a general time-varying model \theta(u) with a time varying covariance function \gamma_{\theta}(k, u). Well if you fit a constant model to this with parameter \widehat{\theta} then this estimate will be close to the time average of the true model \theta(u); and naturally the residuals will now be close to \theta(u) - \widehat{\theta} which by construction of \widehat{\theta} will integrate out to zero (approximately).

Let’s look at an example

tree.ring <- ts(fread(file.path(data.path, "tree-rings.txt"))[, V1])
layout(matrix(1:4, ncol = 2))

mod.arfima <- fracdiff(tree.ring)
mod.arfimad  ## [1] 0.236507 arfima.res <- diffseries(tree.ring, mod.arfimad)

Looks good right? Well, the issue is that the residuals are spurious white noise. How do I know? First, I can test it

Box.test(arfima.res, type = "Ljung-Box")
##  Box-Ljung test
## data:  arfima.res
## X-squared = 1.8757, df = 1, p-value = 0.1708, K = 4, type = "Ljung-Box")
##  LS Ljung-Box test; Number of windows = 4; non-overlapping window
##  size = 497
## data:  arfima.res
## X-squared = 39.361, df = 4, p-value = 5.867e-08

and second, we know from literature that the tree ring data is in fact locally stationary fractional noise: see Goerg (2012) and Ferreira, Olea, and Palma (2013).



Simulating a locally stationary ARMA(1, 1) process:

function (n, phi = 0, theta = 0, sigma = 1, rand.gen = rnorm, 
    innov = rand.gen(n)) 
    u.d <- seq_len(n)/n
    phi.u <- as.function(polynomial(phi))
    theta.u <- as.function(polynomial(theta))
    phi.t <- phi.u(u.d)
    theta.t <- theta.u(u.d)
    if (sigma == "constant") {
        var.arma11 <- (1 + theta.t^2 - 2 * theta.t * phi.t)/(1 - 
        sigma.t <- sqrt(1/var.arma11)
    else {
        sigma.u <- as.function(polynomial(sigma))
        sigma.t <- sigma.u(u.d)
    xx <- innov
    for (ii in 2:n) {
        xx[ii] <- phi.t[ii] * xx[ii - 1] + sigma.t[ii] * innov[ii] + 
            theta.t[ii - 1] * sigma.t[ii - 1] * innov[ii - 1]

The locally stationary Box test:
function (x, lag = 1, type = c("Ljung-Box", "Box-Pierce"), fitdf = 0, 
    K = 1, plot = FALSE, ...) 
    stopifnot(is.numeric(x), K >= 1, K == round(K), is.logical(plot))
    type <- match.arg(type)
    if (K >= length(x)) {
        stop("Number of windows must be smaller than the length of the time series.")
    N.k <- floor(length(x)/K)
    if (K == 1) {
        Box.test(x, type, lag = lag)
    else {
        METHOD <- paste0("LS ", type, " test; \n Number of windows = ", 
            K, "; non-overlapping window size = ", N.k)
        stats.pvals <- rollapply(yy, width = N.k, by = N.k, FUN = function(x) {
            out <- Box.test(x, lag = lag, type)
            return(c(outstatistic, outp.value))
        }, fill = NA)
        stats.pvals <- stats.pvals[![, 1]), 
        colnames(stats.pvals) <- c("statistic", "p-value")
        if (plot) {
            plot(stats.pvals[, "p-value"], ylim = c(0, max(stats.pvals[, 
                "p-value"])), type = "h", lwd = 2, ylab = "p-value", 
                xlab = "Window")
            abline(0.05, 0, col = 4, lty = 4)
            abline(0, 0, lty = 1, col = "gray")
        STATISTIC <- sum(stats.pvals[, "statistic"])
        names(STATISTIC) <- "X-squared" <- lag * K
        names( <- "df"
        PVAL <- 1 - pchisq(STATISTIC,
        return(structure(list(statistic = STATISTIC, parameter =, 
            p.value = PVAL, method = METHOD, = deparse(substitute(x))), 
            class = "htest"))