=============================== Bias Correction Tutorial =============================== In this tutorial, we will explore how to work with the `topt` module from `xtremes` to apply bias correction techniques in extreme value analysis. Specifically, we will cover: - Generating index ranges for sliding and disjoint block methods (`I_sb` and `D_n`). - Computing exceedances using `exceedances`. - Estimating cluster size probability using `hat_pi0`. - Utilizing the `Upsilon`, `Pi`, and `Psi` functions for bias correction. - Solving for bias-corrected parameters with `varpi` and `a1_asy`. Each section includes code examples for clarity. Step 1: Generating Block Indices ================================ In time series analysis, blocks of data are used to analyze cluster sizes and exceedances. We define two key functions: `I_sb(j, bs)`: Generate indices for a sliding block ---------------------------------------------------- .. code-block:: python import numpy as np def I_sb(j, bs): return np.arange(j, j + bs) # Example usage print(I_sb(3, 5)) # Output: [3, 4, 5, 6, 7] `D_n(n, bs)`: Generate index pairs for disjoint blocks ----------------------------------------------------------- .. code-block:: python def D_n(n, bs): idx = np.arange(1, n - bs + 2) pairs = [] for i in idx: left_js = idx[idx + bs <= i] right_js = idx[idx >= i + bs] for j in left_js: pairs.append((i, j)) for j in right_js: pairs.append((i, j)) return pairs # Example usage print(D_n(5, 2)) # Output: [(1, 3), (1, 4), (2, 4), (3, 1), (3, 2), (4, 1), (4, 2)] Step 2: Computing Exceedances ============================== The `exceedances` function counts values exceeding a given threshold in a block. .. code-block:: python def exceedances(data, maxima, bs, i, j, stride='DBM'): if stride == 'DBM': return np.sum(data[(j-1)*bs:j*bs] > maxima[i-1]) if stride == 'SBM': return np.sum(data[j:j+bs] > maxima[i-1]) # Example usage data = np.array([1, 3, 5, 2, 6, 4, 7, 9]) maxima = np.array([5, 7]) print(exceedances(data, maxima, bs=2, i=1, j=3, stride='DBM')) # Output: 1 Step 3: Estimating Cluster Size Probability =========================================== The function `hat_pi0` estimates the probability of cluster size 1. .. code-block:: python import xtremes.topt as topt def hat_pi0(data, maxima=None, bs=None, stride='DBM'): if maxima is not None: bs = len(data) // len(maxima) elif bs is not None: maxima = topt.extract_BM(data, bs, stride=stride) else: raise ValueError('Either maxima or block size must be provided') k = len(maxima) if stride == 'DBM': s = np.sum([exceedances(data, maxima, bs, i, j, stride='DBM') == 1 for i in range(1, 1 + k) for j in np.delete(np.arange(1, 1 + k), i-1)]) return 4 * s / (k * (k-1)) # Example usage print(hat_pi0(data, maxima, bs=2, stride='DBM')) Step 4: Using Bias Correction Functions ======================================= We define functions for the `Upsilon`, `Pi`, and `Psi` functions used in bias correction. .. code-block:: python from scipy.special import gamma, digamma, polygamma from scipy.optimize import root_scalar def Upsilon(x, rho0): return rho0 * gamma(x+2) + (1-rho0) * gamma(x+1) def Upsilon_derivative(x, rho0): return rho0 * gamma(x+2) * digamma(x+2) + (1-rho0) * gamma(x+1) * digamma(x+1) def Pi(x, rho0): return 1/x - Upsilon_derivative(x, rho0)/Upsilon(x, rho0) + rho0/2 - np.euler_gamma def Psi(a, a_true, rho0): vp = a / a_true term = 1 / vp - Upsilon_derivative(vp, rho0) / Upsilon(vp, rho0) + rho0/2 - np.euler_gamma return 2 / a_true * term Step 5: Solving for Bias-Corrected Parameters ============================================= To obtain bias-corrected parameters, we use numerical root-finding methods. .. code-block:: python def varpi(rho0): sol = root_scalar(Pi, args=(rho0,), bracket=[0.01, 10]) return sol.root def a1_asy(a_true, rho0): sol = root_scalar(Psi, args=(a_true, rho0), bracket=[1e-2, 100]) return sol.root # Example usage rho0 = 0.5 print(varpi(rho0)) # Computes bias-corrected root Conclusion ========== In this tutorial, we explored: - How to generate block indices. - Compute exceedances within blocks. - Estimate cluster size probability using `hat_pi0`. - Use the `Upsilon`, `Pi`, and `Psi` functions for bias correction. - Solve for bias-corrected parameters using numerical root-finding. These methods are essential for extreme value analysis in time series data and improving MLE estimates by accounting for bias.