Bias Correction Tutorial

In this tutorial, we will explore how to work with the topt module from xtremes to apply bias correction techniques in extreme value analysis. Specifically, we will cover:

Generating index ranges for sliding and disjoint block methods (I_sb and D_n).
Computing exceedances using exceedances.
Estimating cluster size probability using hat_pi0.
Utilizing the Upsilon, Pi, and Psi functions for bias correction.
Solving for bias-corrected parameters with varpi and a1_asy.

Each section includes code examples for clarity.

Step 1: Generating Block Indices

In time series analysis, blocks of data are used to analyze cluster sizes and exceedances. We define two key functions:

I_sb(j, bs): Generate indices for a sliding block

import numpy as np

def I_sb(j, bs):
    return np.arange(j, j + bs)

# Example usage
print(I_sb(3, 5))  # Output: [3, 4, 5, 6, 7]

D_n(n, bs): Generate index pairs for disjoint blocks

def D_n(n, bs):
    idx = np.arange(1, n - bs + 2)
    pairs = []
    for i in idx:
        left_js = idx[idx + bs <= i]
        right_js = idx[idx >= i + bs]
        for j in left_js:
            pairs.append((i, j))
        for j in right_js:
            pairs.append((i, j))
    return pairs

# Example usage
print(D_n(5, 2))  # Output: [(1, 3), (1, 4), (2, 4), (3, 1), (3, 2), (4, 1), (4, 2)]

Step 2: Computing Exceedances

The exceedances function counts values exceeding a given threshold in a block.

def exceedances(data, maxima, bs, i, j, stride='DBM'):
    if stride == 'DBM':
        return np.sum(data[(j-1)*bs:j*bs] > maxima[i-1])
    if stride == 'SBM':
        return np.sum(data[j:j+bs] > maxima[i-1])

# Example usage
data = np.array([1, 3, 5, 2, 6, 4, 7, 9])
maxima = np.array([5, 7])
print(exceedances(data, maxima, bs=2, i=1, j=3, stride='DBM'))  # Output: 1

Step 3: Estimating Cluster Size Probability

The function hat_pi0 estimates the probability of cluster size 1.

import xtremes.topt as topt

def hat_pi0(data, maxima=None, bs=None, stride='DBM'):
    if maxima is not None:
        bs = len(data) // len(maxima)
    elif bs is not None:
        maxima = topt.extract_BM(data, bs, stride=stride)
    else:
        raise ValueError('Either maxima or block size must be provided')
    k = len(maxima)
    if stride == 'DBM':
        s = np.sum([exceedances(data, maxima, bs, i, j, stride='DBM') == 1
                    for i in range(1, 1 + k)
                    for j in np.delete(np.arange(1, 1 + k), i-1)])
        return 4 * s / (k * (k-1))

# Example usage
print(hat_pi0(data, maxima, bs=2, stride='DBM'))

Step 4: Using Bias Correction Functions

We define functions for the Upsilon, Pi, and Psi functions used in bias correction.

from scipy.special import gamma, digamma, polygamma
from scipy.optimize import root_scalar

def Upsilon(x, rho0):
    return rho0 * gamma(x+2) + (1-rho0) * gamma(x+1)

def Upsilon_derivative(x, rho0):
    return rho0 * gamma(x+2) * digamma(x+2) + (1-rho0) * gamma(x+1) * digamma(x+1)

def Pi(x, rho0):
    return 1/x - Upsilon_derivative(x, rho0)/Upsilon(x, rho0) + rho0/2 - np.euler_gamma

def Psi(a, a_true, rho0):
    vp = a / a_true
    term = 1 / vp - Upsilon_derivative(vp, rho0) / Upsilon(vp, rho0) + rho0/2 - np.euler_gamma
    return 2 / a_true * term

Step 5: Solving for Bias-Corrected Parameters

To obtain bias-corrected parameters, we use numerical root-finding methods.

def varpi(rho0):
    sol = root_scalar(Pi, args=(rho0,), bracket=[0.01, 10])
    return sol.root

def a1_asy(a_true, rho0):
    sol = root_scalar(Psi, args=(a_true, rho0), bracket=[1e-2, 100])
    return sol.root

# Example usage
rho0 = 0.5
print(varpi(rho0))  # Computes bias-corrected root

Conclusion

In this tutorial, we explored: - How to generate block indices. - Compute exceedances within blocks. - Estimate cluster size probability using hat_pi0. - Use the Upsilon, Pi, and Psi functions for bias correction. - Solve for bias-corrected parameters using numerical root-finding.

These methods are essential for extreme value analysis in time series data and improving MLE estimates by accounting for bias.