Bias Correction

This module computes bias corrections for the Maximum Likelihood Estimation (MLE) of cluster size distributions in stationary time series. So far, it only works for $t=2$.

Overview

The biascorrection module provides tools for estimating probabilities of cluster sizes, calculating related functions like ( Upsilon ) and ( Pi ), and finding roots for bias correction.

Functions

The `I_sb` Function

xtremes.biascorrection.I_sb(j, bs)

Generate a range of integers representing indices for a sliding block method.

This function is often used in time series analysis or statistical modeling contexts where data is divided into overlapping or non-overlapping blocks.

Parameters:

jint: The starting index for the block. This typically represents the position in the time series or dataset where the block begins.
bsint: The block size, which determines the number of elements in the block and the range of indices generated.

Returns:

numpy.ndarray: An array of consecutive integers starting from j and ending at j + bs - 1. This array can be used to reference indices of elements within a specific block.

Example:

>>> import numpy as np
>>> I_sb(3, 5)
array([3, 4, 5, 6, 7])

References:

Bücher, A., & Jennessen, T. (2022). Statistical analysis for stationary time series at extreme levels: New estimators for the limiting cluster size distribution. Stochastic Processes and their Applications, 149, 75-106.

The `D_n` Function

xtremes.biascorrection.D_n(n, bs)

Generate all pairs of indices representing disjoint blocks in a time series.

This function identifies pairs of block indices (i, j) such that the blocks starting at indices i and j, each of size bs, do not overlap. It is useful in time series analysis for constructing disjoint block structures, particularly in statistical estimators for cluster sizes.

Parameters:

nint: The length of the time series or dataset. This determines the range of valid indices for block positions.
bsint: The block size, which specifies the number of elements in each block. Blocks are constructed as contiguous subsets of the time series.

Returns:

list of tuples: A list of tuples (i, j), where i and j are the starting indices of two disjoint blocks. The blocks I_sb(i, bs) and I_sb(j, bs) have no overlap, i.e., their intersection is empty.

Example:

>>> import numpy as np
>>> def I_sb(j, bs):
>>>     return np.arange(j, j + bs)
>>> D_n(5, 2)
[(1, 3), (1, 4), (2, 4), (3, 1), (3, 2), (4, 1), (4, 2)]

Notes:

The function uses the helper function I_sb to define the range of indices covered by a block starting at a given index.
It checks for disjointness using np.intersect1d, which computes the intersection of two arrays.
This function is critical in methods involving disjoint blocks, such as certain statistical estimators that assume independence between blocks.

The `exceedances` Function

xtremes.biascorrection.exceedances(data, maxima, bs, i, j, stride='DBM')

Calculate the number of exceedances of a given maximum within a specified block. This function computes the number of values in the j-th block of the data that exceed the i-th maximum. The block can be defined using either a disjoint block method (DBM) or a sliding block method (SBM).

Parameters:

dataarray-like

The time series or dataset from which exceedances are calculated.

maximaarray-like

An array of maxima values, where maxima[i-1] is the reference maximum for which exceedances are counted.

bsint

The block size, specifying the number of consecutive elements in each block.

iint

The index (1-based) of the maximum in maxima used as the reference for exceedances.

jint

The index (1-based) of the block in the dataset to examine for exceedances.

stride{‘DBM’, ‘SBM’}, optional

Specifies the block method used: - ‘DBM’ (Disjoint Block Method): The blocks are non-overlapping and start at (j-1)*bs and end at j*bs. - ‘SBM’ (Sliding Block Method): The blocks can overlap and are defined to start at j and end at j+bs.

Default is ‘DBM’.

Returns:

int: The number of elements in the specified block that exceed the given maximum.

Example:

>>> import numpy as np
>>> data = np.array([1, 3, 5, 2, 6, 4, 7, 9])
>>> maxima = np.array([5, 7])
>>> exceedances(data, maxima, bs=2, i=1, j=3, stride='DBM')
1  # One value in the 3rd disjoint block exceeds maxima[0] = 5
>>> exceedances(data, maxima, bs=3, i=2, j=2, stride='SBM')
2  # Two values in the sliding block exceed maxima[1] = 7

Notes:

The stride parameter allows flexibility in block definition: - ‘DBM’ is suited for non-overlapping blocks, often used in classical block maxima methods. - ‘SBM’ is suited for sliding windows, offering finer granularity for overlap-based analysis.
Indexing of blocks and maxima is 1-based for user-friendliness but is internally converted to Python’s 0-based indexing.

The `hat_pi0` Function

xtremes.biascorrection.hat_pi0(data, maxima=None, bs=None, stride='DBM')

Compute estimators ( hat{pi}(1) ) for the limiting cluster size probability ( pi(1) ).

This function estimates the first value of ( pi(m) ), which represent probabilities associated with cluster sizes in stationary time series.

Parameters:

dataarray-like: The dataset or time series used to compute probabilities and estimators.
maximaarray-like, optional: A pre-computed array of maxima (e.g., block maxima). If not provided, block maxima are computed internally using the specified bs.
bsint, optional: The block size, specifying the number of elements in each block. If not provided, it is inferred from the length of data divided by the number of maxima, if maxima are provided.
stride{‘DBM’, ‘SBM’}, optional: Defines the block method: - ‘DBM’ (Disjoint Block Method): Non-overlapping blocks are used. - ‘SBM’ (Sliding Block Method): Overlapping sliding blocks are used. Default is ‘DBM’.

Returns:

float: Estimated ( hat{pi}(1) ) value.

Raises:

ValueError: If neither maxima nor bs is provided.

Example:

>>> import numpy as np
>>> data = np.array([1, 3, 5, 2, 6, 4, 7, 9, 8, 10])
>>> maxima = np.array([5, 7, 10])
>>> hat_pis(1, data, bs=3, stride='SBM')
[1.5, 1.0]  # Using SBM and calculated block maxima

Notes:

The recursive formula for ( hat{pi}(m) ) is given by:

\[\hat{\pi}(1) = 4 \bar{p}(1) \hat{\pi}(m) = 4 \bar{p}(m) - 2 \sum_{k=1}^{m-1} \hat{\pi}(m-k) \bar{p}(k)\]

This ensures that each ( hat{pi}(m) ) is built upon the values of smaller ( m ).
If pbars is not provided, the function computes ( bar{p}(m) ) using pbar_dbm_fast for the DBM stride. Future extensions can integrate SBM support dynamically.
The function is optimized for the DBM stride but can handle SBM if provided with pbars.

References:

Bücher, A., & Jennessen, T. (2022). Statistical analysis for stationary time series at extreme levels: New estimators for the limiting cluster size distribution. Stochastic Processes and their Applications, 149, 75-106.

The `Upsilon` Function

xtremes.biascorrection.Upsilon(x, rho0)

Compute the ( Upsilon(x, rho_0) ) function specified in bias correction.

Parameters:

xfloat: The parameter ( x ) in the ( Upsilon(x, rho_0) ) function. Typically, ( x ) relates to the scaling or shape parameter in the analysis.

rho0 : float

Returns:

float: The computed value of ( Upsilon(x, rho_0) ).

The `Upsilon_derivative` Function

xtremes.biascorrection.Upsilon_derivative(x, rho0)

Compute the derivative of ( Upsilon(x, rho_0) ) function specified in bias correction.

Parameters:

xfloat: The parameter ( x ) in the ( Upsilon(x, rho_0) ) function. Typically, ( x ) relates to the scaling or shape parameter in the analysis.

rho0 : float

Returns:

float: The computed value of ( Upsilon’(x, rho_0) ).

The `Upsilon_2ndderivative` Function

xtremes.biascorrection.Upsilon_2ndderivative(x, rho0)

Compute the second derivative ( Upsilon(x, rho_0) ) function specified in bias correction.

Parameters:

xfloat: The parameter ( x ) in the ( Upsilon(x, rho_0) ) function. Typically, ( x ) relates to the scaling or shape parameter in the analysis.

rho0 : float

Returns:

float: The computed value of ( Upsilon’’(x, rho_0) ).

The `Psi` Function

xtremes.biascorrection.Psi(a, a_true, rho0)

Compute the ( Psi(x, rho_0) ) function specified in bias correction, closely related to (Upsilon).

Parameters:

xfloat: The parameter ( x ) in the ( Upsilon(x, rho_0) ) function. Typically, ( x ) relates to the scaling or shape parameter in the analysis.

rho0 : float

Returns:

float: The computed value of ( Psi(x, rho_0) ).

The `a1_asy` Function

xtremes.biascorrection.a1_asy(a_true, rho0)

Compute the numerical root of ( Psi(x, rho_0) ).

Parameters:

a_true : float

rho0 : float

Returns:

float: The computed root of ( Psi(x, rho_0) ).

The `varpi` Function

xtremes.biascorrection.varpi(rho0)

Compute the numerical root of ( Pi(x, rho_0) ).

Parameters:

rho0 : float

Returns:

float: The computed root of ( Pi(x, rho_0) ).

The `z0` Function

xtremes.biascorrection.z0(rho0)

Compute (z_0), a special quantity related to bias-correcting (sigma) ( Pi(x, rho_0) ).

Parameters:

rho0 : float

Returns:

float: The computed root of ( Pi(x, rho_0) ).

Bias Correction

Overview

Functions

The I_sb Function

Parameters:

Returns:

Example:

References:

The D_n Function

Parameters:

Returns:

Example:

Notes:

The exceedances Function

Parameters:

Returns:

Example:

Notes:

The hat_pi0 Function

Parameters:

Returns:

Raises:

Example:

Notes:

References:

The Upsilon Function

Parameters:

Returns:

The Upsilon_derivative Function

Parameters:

Returns:

The Upsilon_2ndderivative Function

Parameters:

Returns:

The Psi Function

Parameters:

Returns:

The a1_asy Function

Parameters:

Returns:

The varpi Function

Parameters:

Returns:

The z0 Function

Parameters:

Returns:

The `I_sb` Function

The `D_n` Function

The `exceedances` Function

The `hat_pi0` Function

The `Upsilon` Function

The `Upsilon_derivative` Function

The `Upsilon_2ndderivative` Function

The `Psi` Function

The `a1_asy` Function

The `varpi` Function

The `z0` Function