Reference: topt

This module is specialized in implementing Top-t based Maximum Likelihood Estimators.

Overview

The xtremes.topt module provides tools for analyzing higher order statistics and their influence on Maximum Likelihood estimations. It includes classes and functions for handling time series data, extracting block maxima, and performing statistical analysis.

Classes

The `TimeSeries` Class and its Functionalities

The TimeSeries class is used to handle and manipulate time series data. It provides methods for extracting block maxima and high order statistics.

class xtremes.topt.TimeSeries(n, distr='GEV', correlation='IID', modelparams=[0], ts=0)

Bases: object

TimeSeries class for simulating and analyzing time series data with optional correlation structures.

This class is designed to simulate time series data based on specified distributions and correlation types. It also provides methods to extract block maxima and high order statistics from the simulated data.

Parameters

nint: The length of the time series.
distrstr, optional: The distribution to simulate the time series data from. Default is ‘GEV’ (Generalized Extreme Value).
correlationstr, optional: The type of correlation structure. Options include [‘IID’, ‘ARMAX’, ‘AR’]. Default is ‘IID’ (independent and identically distributed).
modelparamslist, optional: The parameters of the specified distribution. Default is [0].
tsfloat, optional: A parameter for controlling the time series characteristics, particularly for correlated series (e.g., in AR models). Must be in the range [0, 1]. Default is 0.

Attributes

valueslist: A list to store the simulated time series data.
distrstr: The type of distribution used for generating the time series data.
corrstr: The type of correlation structure applied to the time series.
modelparamslist: The parameters of the chosen distribution model.
tsfloat: A parameter controlling the correlation or time series structure, if applicable.
lenint: The length of the time series.
blockmaximalist: A list storing block maxima extracted from the simulated data.
high_order_statslist: A list storing the high order statistics extracted from the simulated data.

Methods

simulate(rep=10, seeds=’default’):: Simulates time series data based on the given distribution and correlation type.
get_blockmaxima(block_size=2, stride=’DBM’, rep=10):: Extracts block maxima from the simulated time series data.
get_HOS(orderstats=2, block_size=2, stride=’DBM’, rep=10):: Extracts high order statistics from the simulated time series data.

Examples

>>> # Create a TimeSeries object
>>> ts = TimeSeries(n=100, distr='GEV', correlation='ARMAX', modelparams=[0.5], ts=0.6)
>>> # Simulate time series data with 5 repetitions and specific seeds
>>> ts.simulate(rep=5, seeds=[42, 123, 456, 789, 1011])
>>> # Extract block maxima using a block size of 5
>>> ts.get_blockmaxima(block_size=5, stride='DBM', rep=5)
>>> # Extract high order statistics (order 3) using the same block size and stride
>>> ts.get_HOS(orderstats=3, block_size=5, stride='DBM', rep=5)

get_ABM_weights(recursively=True): “Computes weights for MLE with ABM stride

get_HOS(orderstats=2, block_size=2, stride='DBM', rep=10)

Extract high order statistics from simulated time series data.

High order statistics include values such as the second-largest value within each block of the time series.

Parameters

orderstatsint, optional: The order of statistics to extract. Default is 2 (i.e., second-highest value).
block_sizeint, optional: The size of blocks from which to extract the statistics. Default is 2.
stridestr, optional: The stride or step type used for block extraction. Choose from [‘SBM’, ‘DBM’]. Default is ‘DBM’.
repint, optional: The number of repetitions for extracting high order statistics. Should match the number of simulations. Default is 10.

get_blockmaxima(block_size=2, stride='DBM', rep=10)

Extract block maxima from simulated time series data.

Block maxima are the maximum values extracted from blocks of the time series data.

Parameters

block_sizeint, optional: The size of blocks from which to extract maxima. Default is 2.
stride{str, int}, optional: The stride or step type used for block extraction. Choose from [‘SBM’ (Sliding Block Maxima), ‘DBM’ (Disjoint Block Maxima)] or specify an integer as a step size. Default is ‘DBM’.
repint, optional: The number of repetitions for maxima extraction. This should match the number of simulations. Default is 10.

plot(rep=1, filename=None)

Plot the simulated time series data.

This method generates a plot showing the simulated time series data. The user can choose to display data from specific repetitions or all repetitions.

Parameters

repint or list, optional: The repetition number(s) to plot. If 0, all repetitions are plotted. If a list is provided, only the specified repetitions are plotted. Default is 1.
filenamestr, optional: The name of the PNG file to save the plot. If None, the plot is displayed but not saved. Default is None.

plot_blockmaxima(rep=1, filename=None, plot_type='line')

Plot the extracted block maxima.

This method generates a plot showing the extracted block maxima from the simulated time series data. The user can choose to display block maxima from specific repetitions or all repetitions. The plot can be either a line plot or a histogram.

Parameters

repint or list, optional: The repetition number(s) to plot. If 0, all repetitions are plotted. If a list is provided, only the specified repetitions are plotted. Default is 1.
filenamestr, optional: The name of the PNG file to save the plot. If None, the plot is displayed but not saved. Default is None.
plot_typestr, optional: The type of plot to generate. Options are ‘line’ for a line plot and ‘hist’ for a histogram. Default is ‘line’.

plot_with_blockmaxima(rep=1, plotlim=300, filename=None)

Plot the simulated time series data along with block maxima.

This method generates a plot showing the simulated time series data along with the block maxima. The user can choose to display data from specific repetitions or all repetitions.

Parameters

repint, optional: The repetition number to plot. Default is 1.
plotlimint, optional: The limit for the number of data points to plot. Default is 300.
filenamestr, optional: The name of the PNG file to save the plot. If None, the plot is displayed but not saved. Default is None.

simulate(rep=10, seeds='default')

Simulate time series data.

This method generates time series data based on the specified distribution, correlation type, and other parameters.

Parameters

repint, optional: The number of repetitions (simulations) to run. Default is 10.
seeds{str, list}, optional: Seed(s) for random number generation. If ‘default’, a sequence of seeds is automatically generated based on the number of repetitions. If a list of seeds is provided, it must match the number of repetitions (rep). Default is ‘default’.

Raises

ValueError: If the number of provided seeds does not match the number of repetitions.

Examples

Extract Block Maxima:

from xtremes.topt import TimeSeries
import numpy as np

ts_data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
ts = TimeSeries(ts_data)
block_maxima = ts.extract_BM(block_size=5, stride='DBM')
print("Block Maxima:", block_maxima)

Extract High Order Statistics:

from xtremes.topt import TimeSeries
import numpy as np

ts_data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
ts = TimeSeries(ts_data)
high_order_stats = ts.extract_HOS(orderstats=3, block_size=5, stride='DBM')
print("High Order Statistics:", high_order_stats)

The `HighOrderStats` Class and its Functionalities

The HighOrderStats class is used to compute and analyze higher order statistics from the time series data. It provides methods for calculating log-likelihoods and performing Maximum Likelihood Estimation (MLE).

class xtremes.topt.HighOrderStats(TimeSeries)

Bases: object

HighOrderStats class for calculating and analyzing high-order statistics of time series data.

Notes

This class provides functionality for calculating Probability Weighted Moment (PWM) estimators and Maximum Likelihood (ML) estimators from a given TimeSeries object.

Methods

get_PWM_estimation():: Calculate the PWM estimators for the time series data.
get_ML_estimation(initParams=’auto’, FrechetOrGEV = ‘Frechet’, option=1, estimate_pi=False):: Calculate the ML estimators for the time series data.

Attributes

TimeSeriesTimeSeries: The TimeSeries object containing the time series data.
high_order_statslist: List of high-order statistics extracted from the TimeSeries object.
blockmaximalist: List of block maxima derived from the high-order statistics.
gamma_truefloat: True gamma parameter of the GEV distribution derived from the TimeSeries object.
PWM_estimatorsPWM_estimators: Instance of PWM_estimators class for calculating PWM estimators.
ML_estimatorsML_estimators: Instance of ML_estimators class for calculating ML estimators.

Example

>>> # Create a TimeSeries object
>>> ts = TimeSeries(n=100, distr='GEV', correlation='ARMAX', modelparams=[0.5], ts=0.6)
>>> # Simulate time series data
>>> ts.simulate(rep=5, seeds=[42, 123, 456, 789, 1011])
>>> # Initialize HighOrderStats object
>>> hos = HighOrderStats(ts)
>>> # Calculate PWM estimators
>>> hos.get_PWM_estimation()
>>> # Calculate ML estimators
>>> hos.get_ML_estimation(initParams='auto', r=1)

get_ML_estimation(initParams='auto', r=None, FrechetOrGEV='Frechet')

Calculate the Maximum Likelihood (ML) estimators.

Parameters

initParamsstr or array-like, optional: Method for initializing parameters. Default is ‘auto’, which uses automatic parameter initialization.
rint, optional: Number of order statistics to calculate the log-likelihood on. If not specified, use all provided.
FrechetOrGEVstr, optional: Whether to fit the Frechet or GEV distribution.

Notes

This function performs maximum likelihood estimation based on either the Frechet or GEV distribution.

get_PWM_estimation(): Calculate the Probability Weighted Moment (PWM) estimators.

Examples

Log Likelihood:

from xtremes.topt import HighOrderStats
import numpy as np

hos_data = np.array([[0.1, 0.2], [0.3, 0.4], [0.2, 0.5], [0.4, 0.6]])
hos = HighOrderStats(hos_data)
log_likelihood = hos.log_likelihood(gamma=0.5, mu=0, sigma=2, r=2)
print("Log Likelihood:", log_likelihood)

Frechet Log Likelihood:

from xtremes.topt import HighOrderStats
import numpy as np

hos_data = np.array([[0.5, 1.0], [1.5, 2.0], [1.2, 2.2], [2.0, 3.0]])
hos = HighOrderStats(hos_data)
frechet_log_likelihood = hos.Frechet_log_likelihood(alpha=2, sigma=1.5, r=2)
print("Frechet Log Likelihood:", frechet_log_likelihood)

The `Data` Class and its Functionalities

The Data class is used to handle and manipulate real data for analysis.

class xtremes.topt.Data(values)

Bases: object

A class for analyzing data with block maxima and high order statistics.

This class provides methods to extract block maxima and high order statistics from a given dataset. It also supports Maximum Likelihood (ML) estimation for the parameters of either the Frechet or GEV distributions.

Parameters

valueslist or numpy.ndarray: The dataset from which block maxima and high order statistics are extracted. This should be a 1D array or list of values representing the time series or data.

Attributes

valueslist or numpy.ndarray: The dataset on which operations are performed.
lenint: The length of the dataset.
blockmaximalist: List to store the block maxima extracted from the dataset.
bm_indiceslist: List of indices corresponding to the positions of the block maxima in the original dataset.
high_order_statslist: List to store the high order statistics extracted from the dataset.
ML_estimatorsML_estimators_data: Object that stores and handles the ML estimation results for the Frechet or GEV parameters.

Methods

get_blockmaxima(block_size=2, stride=’DBM’):: Extracts block maxima from the dataset.
get_HOS(orderstats=2, block_size=2, stride=’DBM’):: Extracts high order statistics from the dataset.
get_ML_estimation(r=None, FrechetOrGEV=’GEV’):: Computes ML estimations for the Frechet or GEV parameters.

Example

>>> # Initialize the Data object with a dataset
>>> data = Data([2.5, 3.1, 1.7, 4.6, 5.3, 2.2, 6.0])
>>> # Extract block maxima using a block size of 2
>>> data.get_blockmaxima(block_size=2, stride='DBM')
>>> print(data.blockmaxima)
>>> # Extract second-highest order statistics (HOS) with the same block size and stride
>>> data.get_HOS(orderstats=2, block_size=2, stride='DBM')
>>> print(data.high_order_stats)
>>> # Perform ML estimation using the extracted HOS, choosing between Frechet and GEV
>>> data.get_ML_estimation(FrechetOrGEV='GEV')
>>> print(data.ML_estimators.values)

get_ABM_weights(recursively=True): “Computes weights for MLE with ABM stride

get_HOS(orderstats=2, block_size=2, stride='DBM')

Extract high order statistics (HOS) from the dataset.

High order statistics refer to statistics of interest beyond the maximum (e.g., second-highest, third-highest). This method extracts these statistics from blocks of the dataset.

Parameters

orderstatsint, optional: The order of the statistic to extract (e.g., 2 for second-highest). Default is 2.
block_sizeint, optional: The size of blocks from which to extract the statistics. Default is 2.
stridestr or int, optional: The type of stride to use when extracting blocks. Choose from: - ‘SBM’: Sliding Block Maxima (step size 1) - ‘DBM’: Disjoint Block Maxima (non-overlapping blocks) - int: Specifies the step size directly. Default is ‘DBM’.

Returns

None: The high order statistics are stored in the high_order_stats attribute.

get_ML_estimation(r=None, FrechetOrGEV='GEV', bounds=None)

Compute Maximum Likelihood (ML) estimations for the Frechet or GEV parameters.

This method computes ML estimators for the parameters of either the Frechet or GEV distribution based on the high order statistics extracted from the data. If no high order statistics are available, it will first extract them.

Parameters

rint, optional: The number of order statistics to use in the ML estimation. If not specified, it uses all the extracted statistics.
FrechetOrGEVstr, optional: The type of distribution to use for the ML estimation. Choose between ‘Frechet’ and ‘GEV’. Default is ‘GEV’.

Returns

None: The ML estimators are stored in the ML_estimators attribute.

get_blockmaxima(block_size=2, stride='DBM')

Extract block maxima from the dataset.

Block maxima are the largest values in each block of the dataset, where the block size and the stride (step size) determine how the blocks are divided.

Parameters

block_sizeint, optional: The size of blocks for maxima extraction. Default is 2.
stridestr or int, optional: The type of stride to use when extracting blocks. Choose from: - ‘SBM’: Sliding Block Maxima (step size 1) - ‘DBM’: Disjoint Block Maxima (non-overlapping blocks) - int: Specifies the step size directly. Default is ‘DBM’.

Returns

None: The block maxima and their corresponding indices are stored in the blockmaxima and bm_indices attributes.

The `PWM_estimators` Class

The PWM_estimators class is used to compute Probability Weighted Moment (PWM) estimators.

class xtremes.topt.PWM_estimators(TimeSeries)

Bases: object

Calculates Probability Weighted Moment (PWM) estimators from block maxima and computes statistics and confidence intervals for the GEV parameters.

Notes

This class provides methods to compute the Probability Weighted Moment (PWM) estimators and convert them into Generalized Extreme Value (GEV) distribution parameters (gamma, mu, sigma). It also allows users to compute confidence intervals for the estimated parameters and assess the statistical properties (mean, variance, bias, mean squared error) of the estimators compared to a true gamma value. Visualization options are provided to plot the estimators and confidence intervals.

Parameters

param TimeSeries:: TimeSeries object TimeSeries object containing block maxima or high order statistics.

Attributes

attribute blockmaxima:: numpy.ndarray Array of block maxima extracted from the TimeSeries object.
attribute values:: numpy.ndarray Array containing the PWM estimators (\(\gamma, \mu, \sigma\)) for each block maxima series.
attribute statistics:: dict Dictionary containing statistics (mean, variance, bias, mse) and confidence intervals of the PWM estimators.

Methods

method get_PWM_estimation():: Compute PWM estimators for each block maxima series and convert them into GEV parameters.
method get_statistics(gamma_true):: Compute statistics of the PWM estimators using a true gamma value.
method get_CIs(alpha=0.05, method=’symmetric’):: Compute confidence intervals (CIs) for the GEV parameters using either symmetric quantiles or minimal width intervals.
method plot(param=’gamma’, show_CI=True, show_true=True, filename=None):: Plot the PWM estimators for the GEV parameters (gamma, mu, sigma) with options to display confidence intervals and save the plot as an image.

Raises

raises ValueError:: If block maxima or high order statistics are not available in the TimeSeries object.

Examples

>>> from TimeSeries import TimeSeries
>>> from PWM_estimators import PWM_estimators
>>> ts = TimeSeries(data)  # Initialize TimeSeries object with data
>>> ts.get_blockmaxima(block_size=10, stride='SBM')  # Extract block maxima
>>> pwm = PWM_estimators(ts)  # Initialize PWM_estimators object
>>> pwm.get_PWM_estimation()  # Compute PWM estimators
>>> pwm.get_statistics(0.1)  # Compute statistics with true gamma value 0.1
>>> pwm.get_CIs(alpha=0.05, method='minimal_width')  # Compute confidence intervals
>>> pwm.plot(param='gamma', show_CI=True, show_true=True, filename='PWM_plot.png')  # Plot the results and save as image
>>> print(pwm.statistics)  # Print computed statistics
{'gamma_mean': 0.25, 'gamma_variance': 0.005, 'gamma_bias': 0.02, 'gamma_mse': 0.01,
 'mu_mean': 1.15, 'mu_variance': 0.04, 'sigma_mean': 0.9, 'sigma_variance': 0.02,
 'gamma_CI': (0.2, 0.5), 'mu_CI': (1.1, 1.5), 'sigma_CI': (0.8, 1.0)}

get_CIs(alpha=0.05, method='symmetric')

Compute confidence intervals (CIs) for the GEV parameters using different methods.

Notes

This function calculates confidence intervals (CIs) for the Generalized Extreme Value (GEV) parameters (gamma, mu, and sigma) estimated from Probability-Weighted Moments (PWM). The user can choose between two methods for computing the confidence intervals:

‘symmetric’: This method uses the quantiles of the distribution of parameter estimates to compute symmetric confidence intervals.
‘minimal_width’: This method finds the interval with minimal width that contains the desired proportion (1 - alpha) of the sorted parameter estimates.

For each block maxima series, confidence intervals for the GEV shape (gamma), location (mu), and scale (sigma) parameters are calculated. The results are stored in the self.statistics dictionary, with keys ‘gamma_CI’, ‘mu_CI’, and ‘sigma_CI’ corresponding to the computed confidence intervals.

Parameters

param alpha:: float, optional Significance level for the confidence intervals (default is 0.05, for a 95% CI).
param method:: str, optional Method for computing the confidence intervals. Options are: - ‘symmetric’: Uses quantiles to compute symmetric CIs (default). - ‘minimal_width’: Computes the minimal width interval containing (1 - alpha) of the estimates.

Example

>>> estimator = PWM_Estimators(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> estimator.get_CIs(alpha=0.05, method='minimal_width')
>>> print(estimator.statistics)
{'gamma_CI': (0.2, 0.5), 'mu_CI': (1.1, 1.5), 'sigma_CI': (0.8, 1.0)}

Returns

None: The results are stored in self.statistics, which contains the confidence intervals for each GEV parameter.

get_PWM_estimation()

Compute Probability-Weighted Moment (PWM) estimators and convert them into GEV parameters.

Notes

This function iterates over each block maxima series, computes the Probability-Weighted Moments (PWMs), and converts the PWMs into the Generalized Extreme Value (GEV) distribution parameters: shape (gamma), location (mu), and scale (sigma). The function utilizes the misc.PWM_estimation function to calculate the PWMs and then applies misc.PWM2GEV to convert these moments into GEV parameters.

The results for each block maxima series are stored in the self.values attribute, which is a NumPy array where each row corresponds to the GEV parameters [gamma, mu, sigma] for a specific block maxima series.

This function clears any previously stored values in self.values before appending new estimates.

Parameters

None

Example

>>> estimator = PWM_estimators(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> print(estimator.values)
array([[0.2, 1.1, 0.8],
    [0.3, 1.2, 0.9]])

Returns

None: The results are stored in self.values, a NumPy array containing the estimated GEV parameters for each block maxima series.

get_statistics(gamma_true)

Compute statistics of the PWM estimators using the true gamma value.

Notes

This function calculates various statistical measures (mean, variance, bias, and mean squared error) for the estimated Generalized Extreme Value (GEV) shape parameter (gamma) compared to a provided true value (gamma_true). It also computes the mean and variance for the location (mu) and scale (sigma) parameters across all block maxima series.

Mean Squared Error (MSE): Measures the average of the squares of the differences between the estimated and true gamma values.
Bias: Represents the systematic deviation of the estimated gamma values from the true gamma value.
Variance: Describes the spread of the estimated gamma values around their mean.

If only one block maxima series is available, a warning is raised since variance cannot be computed.

The computed statistics are stored in the self.statistics dictionary with the following keys: - ‘gamma_mean’, ‘gamma_variance’, ‘gamma_bias’, ‘gamma_mse’ - ‘mu_mean’, ‘mu_variance’ - ‘sigma_mean’, ‘sigma_variance’

Parameters

param gamma_true:: float The true value of the GEV shape parameter (gamma) used to compute bias and MSE.

Example

>>> estimator = PWM_estimators(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> estimator.get_statistics(gamma_true=0.2)
>>> print(estimator.statistics)
{'gamma_mean': 0.25, 'gamma_variance': 0.005, 'gamma_bias': 0.02, 'gamma_mse': 0.01,
'mu_mean': 1.15, 'mu_variance': 0.04, 'sigma_mean': 0.9, 'sigma_variance': 0.02}

Returns

None: The results are stored in self.statistics, containing the calculated statistics for gamma, mu, and sigma.

plot(param='gamma', show_CI=True, show_true=True, filename=None)

Plot the PWM estimators and confidence intervals for the GEV parameters.

Notes

This function generates a plot showing the Probability-Weighted Moment (PWM) estimators for the Generalized Extreme Value (GEV) parameters (gamma, mu, sigma) computed from block maxima. The user can choose to display confidence intervals (CIs) for each parameter.

The plot is saved as a PNG image if the save parameter is set to True.

Parameters

param param:: str, optional GEV parameter to plot (default is ‘gamma’).
param show_CI:: bool, optional Flag indicating whether to display confidence intervals (default is True).
param show:: bool, optional Flag indicating whether to display the plot.
param save:: bool, optional Flag indicating whether to save the plot as a PNG image (default is False).
param filename:: str, optional Name of the PNG file to save the plot (default is None).

Example

>>> estimator = PWM_estimators(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> estimator.get_CIs(alpha=0.05, method='minimal_width')
>>> estimator.plot(show_CI=True, show_true=True, save=True, filename='PWM_estimation.png')

Returns

None: The plot is displayed in the console and saved as a PNG image if the save parameter is set to True.

xtremes.topt.automatic_parameter_initialization(PWM_estimators, corr, ts=0.5)

Automatic parameter initialization for ML estimation.

Notes

This function is designed for initializing parameters for maximum likelihood estimation (ML) in statistical models. It automatically computes the probability weighted moment (PWM) estimators and adjusts them based on the specified correlation type (‘ARMAX’, ‘IID’, etc.). The ‘ts’ parameter is used to control the strength of temporal dependence in the model.

Parameters

param PWM_estimators:: list or numpy.array Probability weighted moment estimators.
param corr:: str Correlation type for the model. Supported values are ‘ARMAX’, ‘IID’, etc.
param ts:: float, optional Time series parameter controlling the strength of temporal dependence (default is 0.5).
return:: numpy.ndarray Initial parameters for ML estimation.

Examples

>>> from test_xtremes import misc
>>> PWM_est = misc.PWM_estimators(data)
>>> init_params = automatic_parameter_initialization(PWM_est, 'ARMAX', ts=0.8)

The `ML_estimators`, `Frechet_ML_estimators` and `ML_estimators_data` Classes

The ML_estimators, Frechet_ML_estimators, and ML_estimators_data classes are used for performing Maximum Likelihood Estimation (MLE) and handling the results.

class xtremes.topt.ML_estimators(TimeSeries)

Bases: object

Maximum Likelihood Estimators (MLE) for GEV parameters.

This class calculates Maximum Likelihood Estimators (MLE) for the parameters of the Generalized Extreme Value (GEV) distribution using the method of maximum likelihood estimation.

Parameters

TimeSeriesTimeSeries: The TimeSeries object containing the data for which MLE estimators will be calculated.

Attributes

valuesnumpy.ndarray: An array containing the MLE estimators for each set of high order statistics.
statisticsdict: A dictionary containing statistics computed from the MLE estimators.

Methods

__len__(): Returns the number of MLE estimators calculated.
get_ML_estimation(PWM_estimators=None, initParams=’auto’, option=1, estimate_pi=False): Computes the MLE estimators for the GEV parameters.
get_statistics(gamma_true): Computes statistics from the MLE estimators.

Examples

>>> import numpy as np
>>> from hos import TimeSeries, ML_estimators, PWM_estimators
>>> blockmaxima_data = np.random.normal(loc=10, scale=2, size=100)
>>> high_order_stats_data = np.random.normal(loc=5, scale=1, size=(100, 3))
>>> ts = TimeSeries(blockmaxima=blockmaxima_data, high_order_stats=high_order_stats_data, corr='IID', ts=0.5)
>>> pwm = PWM_estimators(ts)
>>> pwm.get_PWM_estimation()
>>> ml = ML_estimators(ts)
>>> ml.get_ML_estimation(PWM_estimators=pwm)
>>> ml.get_statistics(gamma_true=0.1)
>>> print("ML Estimators:")
>>> print(ml.values)
>>> print("\nStatistics:")
>>> print(ml.statistics)

get_CIs(alpha=0.05, method='symmetric')

Compute confidence intervals (CIs) for the GEV parameters using different methods.

Notes

This function calculates confidence intervals (CIs) for the Generalized Extreme Value (GEV) parameters (gamma, mu, and sigma) estimated from Maximum Likelihood Estimators (MLE). The user can choose between two methods for computing the confidence intervals:

‘symmetric’: This method uses the quantiles of the distribution of parameter estimates to compute symmetric confidence intervals.
‘minimal_width’: This method finds the interval with minimal width that contains the desired proportion (1 - alpha) of the sorted parameter estimates.

For each block maxima series, confidence intervals for the GEV shape (gamma), location (mu), and scale (sigma) parameters are calculated. The results are stored in the self.statistics dictionary, with keys ‘gamma_CI’, ‘mu_CI’, and ‘sigma_CI’ corresponding to the computed confidence intervals.

Parameters

param alpha:: float, optional Significance level for the confidence intervals (default is 0.05, for a 95% CI).
param method:: str, optional Method for computing the confidence intervals. Options are: - ‘symmetric’: Uses quantiles to compute symmetric CIs (default). - ‘minimal_width’: Computes the minimal width interval containing (1 - alpha) of the estimates.

Example

>>> estimator = ML_stimator(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> estimator.get_CIs(alpha=0.05, method='minimal_width')
>>> print(estimator.statistics)
{'gamma_CI': (0.2, 0.5), 'mu_CI': (1.1, 1.5), 'sigma_CI': (0.8, 1.0)}

Returns

None: The results are stored in self.statistics, which contains the confidence intervals for each GEV parameter.

get_ML_estimation(PWM_estimators=None, initParams='auto', r=None)

Compute Maximum Likelihood (ML) estimators for each series of high order statistics within the ML_estimators class.

This method fits the Generalized Extreme Value (GEV) distribution to each series of high order statistics using Maximum Likelihood Estimation (MLE) by optimizing the log-likelihood function.

Parameters

PWM_estimatorsPWM_estimators object, optional: An object containing PWM (Probability Weighted Moments) estimators. This is used for initializing the parameters in the ML estimation if initParams is set to ‘auto’. Required if initParams is ‘auto’.
initParamsstr or numpy.ndarray, optional: Initial parameters for the ML estimation. If ‘auto’, the initial parameters will be computed automatically using the PWM_estimators object. If a NumPy array is provided, these will be used as initial parameter values. Default is ‘auto’.
rint, optional: The number of order statistics to calculate the log-likelihood on. If not specified, all provided order statistics will be used.

Returns

None: This method updates the self.values attribute of the ML_estimators object with the estimated parameters (gamma, mu, sigma) for each series of high order statistics.

Raises

ValueError: If initParams is set to ‘auto’ and no PWM_estimators are provided, a ValueError is raised.

Notes

This method performs Maximum Likelihood Estimation (MLE) to fit the Generalized Extreme Value (GEV) distribution to the high order statistics within the ML_estimators class.
The method uses optimization techniques such as Nelder-Mead (and optionally COBYLA) to minimize the negative log-likelihood.
If initParams is set to ‘auto’, the initial parameters for the optimization are derived using the PWM_estimators object.
The optimization results (gamma, mu, sigma) are stored in the self.values list for each series of high order statistics.

get_statistics(gamma_true)

Compute statistics of the ML estimators using a true \(\gamma\) value.

Parameters

param gamma_true:: float True \(\gamma\) value for calculating statistics.

plot(param='gamma', show_CI=True, show_true=True, filename=None)

Plot the ML estimators and confidence intervals for the GEV parameters.

Notes

This function generates a plot showing the Maximum Likelihood estimators for the Generalized Extreme Value (GEV) parameters (gamma, mu, sigma) computed from block maxima. The user can choose to display confidence intervals (CIs) for each parameter

The plot is saved as a PNG image if the save parameter is set to True.

Parameters

param param:: str, optional GEV parameter to plot (default is ‘gamma’).
param show_CI:: bool, optional Flag indicating whether to display confidence intervals (default is True).
param show:: bool, optional Flag indicating whether to display the plot.
param save:: bool, optional Flag indicating whether to save the plot as a PNG image (default is False).
param filename:: str, optional Name of the PNG file to save the plot (default is None).

Example

>>> estimator = ML_estimators(timeseries_data)
>>> estimator.get_ML_estimation()
>>> estimator.get_CIs(alpha=0.05, method='minimal_width')
>>> estimator.plot(show_CI=True, show_true=True, save=True, filename='PWM_estimation.png')

Returns

None: The plot is displayed in the console and saved as a PNG image if the save parameter is set to True.

class xtremes.topt.Frechet_ML_estimators(TimeSeries)

Bases: object

Maximum Likelihood Estimators (MLE) for Frechet parameters.

This class calculates Maximum Likelihood Estimators (MLE) for the parameters of the 2-parameter Frechet distribution using the method of maximum likelihood estimation on a series of high order statistics.

Parameters

TimeSeriesTimeSeries: The TimeSeries object containing the data (high order statistics) for which MLE estimators will be calculated.

Attributes

valuesnumpy.ndarray: An array containing the MLE estimators (alpha, sigma) for each set of high order statistics.
statisticsdict: A dictionary containing computed statistics from the MLE estimators such as mean, variance, bias, and MSE.

Methods

__len__(): Returns the number of MLE estimators calculated.
get_ML_estimation(PWM_estimators=None, initParams=’auto’, r=None): Computes the MLE estimators for the Frechet parameters (alpha, sigma).
get_statistics(alpha_true): Computes statistics (mean, variance, bias, and MSE) of the MLE estimators using a true alpha value.

Examples

>>> import numpy as np
>>> from hos import TimeSeries, Frechet_ML_estimators, PWM_estimators
>>> blockmaxima_data = np.random.normal(loc=10, scale=2, size=100)
>>> high_order_stats_data = np.random.normal(loc=5, scale=1, size=(100, 3))
>>> ts = TimeSeries(blockmaxima=blockmaxima_data, high_order_stats=high_order_stats_data, corr='IID', ts=0.5)
>>> pwm = PWM_estimators(ts)
>>> pwm.get_PWM_estimation()
>>> ml = Frechet_ML_estimators(ts)
>>> ml.get_ML_estimation(PWM_estimators=pwm)
>>> ml.get_statistics(alpha_true=0.1)
>>> print("ML Estimators:")
>>> print(ml.values)
>>> print("\nStatistics:")
>>> print(ml.statistics)

get_CIs(alpha=0.05, method='symmetric')

Compute confidence intervals (CIs) for the Frechet parameters using different methods.

Notes

This function calculates confidence intervals (CIs) for the Frechet parameters (alpha and sigma) estimated from Maximum Likelihood Estimators (MLE). The user can choose between two methods for computing the confidence intervals:

‘symmetric’: This method uses the quantiles of the distribution of parameter estimates to compute symmetric confidence intervals.
‘minimal_width’: This method finds the interval with minimal width that contains the desired proportion (1 - alpha) of the sorted parameter estimates.

For each block maxima series, confidence intervals for the shape (alpha) and scale (sigma) parameters are calculated. The results are stored in the self.statistics dictionary, with keys ‘gamma_CI’, ‘mu_CI’, and ‘sigma_CI’ corresponding to the computed confidence intervals.

Do not get confused with alpha being the significance level as well as the shape parameter of the Frechet distribution.

Parameters

param alpha:: float, optional Significance level for the confidence intervals (default is 0.05, for a 95% CI).
param method:: str, optional Method for computing the confidence intervals. Options are: - ‘symmetric’: Uses quantiles to compute symmetric CIs (default). - ‘minimal_width’: Computes the minimal width interval containing (1 - alpha) of the estimates.

Example

>>> estimator = ML_stimator(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> estimator.get_CIs(alpha=0.05, method='minimal_width')
>>> print(estimator.statistics)

Returns

None: The results are stored in self.statistics, which contains the confidence intervals for each GEV parameter.

get_ML_estimation(PWM_estimators=None, initParams='auto', r=None, weights=None)

Compute ML estimators (alpha, sigma) for each high order statistics series using Frechet distribution.

Parameters

PWM_estimatorsPWM_estimators object, optional: PWM_estimators object containing PWM estimators for initializing parameters. Required if initParams is set to ‘auto’.
initParamsstr or numpy.ndarray, optional: Initial parameters for ML estimation. ‘auto’ to calculate automatically using PWM estimators. If a numpy array is provided, these will be used as initial parameter values. Default is ‘auto’.
rint, optional: Number of order statistics to calculate the log-likelihood on. If not specified, all provided order statistics will be used.

Returns

None: Updates the self.values attribute with the estimated parameters (alpha, sigma) for each series of high order statistics.

Raises

ValueError: If initParams is set to ‘auto’ and no PWM_estimators are provided.

get_statistics(alpha_true)

Compute statistics (mean, variance, bias, and MSE) of the ML estimators using the true \(\alpha\) value.

Parameters

alpha_truefloat: The true value of \(\alpha\) to calculate bias, MSE, and other statistics.

Returns

None: Updates the self.statistics dictionary with calculated statistics such as mean, variance, bias, and MSE for both \(\alpha\) and \(\sigma\).

Notes

The statistics include: - Mean and variance for the estimated alpha and sigma values. - Bias and mean squared error (MSE) for the alpha estimates.

plot(param='alpha', show_CI=True, show_true=True, filename=None)

Plot the ML estimators and confidence intervals for the GEV parameters.

Notes

This function generates a plot showing the Maximum Likelihood estimators for the Generalized Extreme Value (GEV) parameters (gamma, mu, sigma) computed from block maxima. The user can choose to display confidence intervals (CIs) for each parameter

The plot is saved as a PNG image if the save parameter is set to True.

Parameters

param param:: str, optional GEV parameter to plot (default is ‘gamma’).
param show_CI:: bool, optional Flag indicating whether to display confidence intervals (default is True).
param show:: bool, optional Flag indicating whether to display the plot.
param save:: bool, optional Flag indicating whether to save the plot as a PNG image (default is False).
param filename:: str, optional Name of the PNG file to save the plot (default is None).

Example

>>> estimator = PWM_estimators(timeseries_data)
>>> estimator.get_PWM_estimation()
>>> estimator.get_CIs(alpha=0.05, method='minimal_width')
>>> estimator.plot(show_CI=True, show_true=True, save=True, filename='PWM_estimation.png')

Returns

None: The plot is displayed in the console and saved as a PNG image if the save parameter is set to True.

xtremes.topt.log_likelihood(high_order_statistics, gamma=0, mu=0, sigma=1, r=None)

Calculate the GEV log likelihood based on the two highest order statistics.

Parameters

high_order_statisticsnumpy.ndarray: A 2D array where each row contains the two highest order statistics for each observation.
gammafloat, optional: The shape parameter (γ) for the Generalized Extreme Value (GEV) distribution. Default is 0.
mufloat, optional: The location parameter (μ) for the GEV distribution. Default is 0.
sigmafloat, optional: The scale parameter (σ) for the GEV distribution. Must be positive. Default is 1.
rint, optional: The number of order statistics to calculate the log-likelihood on. If not specified, it uses all provided statistics.

Returns

float: The calculated log likelihood.

Notes

This function computes the log likelihood using the two highest order statistics and supports both the classical Gumbel case (γ = 0) and the generalized case (γ ≠ 0).
The high_order_statistics array should be structured with the two highest order statistics per observation as rows.
The shape parameter γ controls the tail behavior of the distribution. When γ = 0, the distribution becomes the Gumbel type.
The r parameter controls how many order statistics are used for the likelihood calculation, typically r=2 for two order statistics.

Example

>>> hos = np.array([[0.1, 0.2], [0.3, 0.4], [0.2, 0.5], [0.4, 0.6]])
>>> log_likelihood(hos, gamma=0.5, mu=0, sigma=2, r=2)
-7.494890426732856

xtremes.topt.Frechet_log_likelihood(high_order_statistics, alpha=1, sigma=1, r=None, weights=None)

Calculate the 2-parameter Frechet log likelihood based on the highest order statistics. The calculation can be done using either the joint likelihood of the top two order statistics or the product of their marginals.

Parameters

high_order_statisticsnumpy.ndarray: A 2D array where each row contains the two highest order statistics for each observation.
alphafloat, optional: The shape parameter (α) for the Frechet distribution. Default is 1. Controls the tail behavior of the distribution.
sigmafloat, optional: The scale parameter (σ) for the Frechet distribution. Must be positive. Default is 1.
rint, optional: The number of order statistics to calculate the log-likelihood on. If not specified, it uses all provided statistics.
weightsnumpy.ndarray, optional: ABM as stride provides weights. If they are provided, use them instead of counts

Returns

float: The calculated log likelihood for the given data under the Frechet distribution.

Notes

This function computes the log likelihood using the two highest order statistics from each observation, with a focus on the Frechet distribution.
The shape parameter alpha determines the heaviness of the tail in the distribution, and the scale parameter sigma must be strictly positive.
The high_order_statistics array should be structured such that each row represents the two highest order statistics for an observation.
The function can either calculate the joint likelihood of the top two order statistics or consider the product of their marginals, depending on the values used.

Example

>>> hos = np.array([[0.5, 1.0], [1.5, 2.0], [1.2, 2.2], [2.0, 3.0]])
>>> Frechet_log_likelihood(hos, alpha=2, sigma=1.5, r=2)
-15.78467219003245

Running Extensive Simulations

The xtremes.topt module also provides functions for running extensive simulations and performing multiple MLEs.

xtremes.topt.run_ML_estimation(file, corr='IID', gamma_true=0, block_sizes=[5, 10, 15, 20, 25, 30, 35, 40, 45, 50], stride='SBM', option=1, estimate_pi=False)

Run maximum likelihood estimation (ML) for a given time series file.

Notes

This function reads a time series from a file and performs ML estimation for the specified correlation type, GEV shape parameter, block sizes, and stride type. It iterates over each block size, extracts block maxima, computes high order statistics, and performs ML estimation. The results are stored in a dictionary with the GEV shape parameter as the key and ML estimation results for each block size as the value.

Parameters

param file:: str Path to the file containing the time series data.
param corr:: str, optional Correlation type for the model (default is ‘IID’).
param gamma_true:: float, optional True value of the GEV shape parameter (default is 0).
param block_sizes:: list, optional List of block sizes for extracting block maxima (default is [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]).
param stride:: str, optional Stride type for block maxima extraction. Options are ‘SBM’ (sliding block maxima) or ‘DBM’ (default block maxima) (default is ‘SBM’).
param r:: int, optional Number of orderstatistics to calculate the log-likelihood on. If not specified, use all provided
param estimate_pi:: bool, optional Flag indicating whether to estimate the pi parameter (default is False).

Example

>>> result = run_ML_estimation("timeseries_data.pkl", corr='ARMAX', gamma_true=0.5, block_sizes=[10, 20, 30], stride='DBM', option=2, estimate_pi=True)
>>> print(result)
{0.5: {10: HighOrderStats_object, 20: HighOrderStats_object, 30: HighOrderStats_object}}

Returns

return:: dict Dictionary containing ML estimation results for each block size.

xtremes.topt.run_multiple_ML_estimations(file, corr='IID', gamma_trues=array([-0.4, -0.3, -0.2, -0.1, 0., 0.1, 0.2, 0.3, 0.4]), block_sizes=[5, 10, 15, 20, 25, 30, 35, 40, 45, 50], stride='SBM', option=1, estimate_pi=False, parallelize=False)

Run multiple maximum likelihood estimations (ML) for a range of GEV shape parameter values.

Notes

This function performs ML estimation for a range of GEV shape parameter values specified in ‘gamma_trues’ and aggregates the results into a single dictionary. It iterates over each gamma value, calls the ‘run_ML_estimation’ function, and collects the results. If ‘parallelize’ is set to True, it runs the estimations concurrently using asyncio.

Parameters

param file:: str Path to the file containing the time series data.
param corr:: str, optional Correlation type for the model (default is ‘IID’).
param gamma_trues:: numpy.ndarray, optional Array of GEV shape parameter values to perform ML estimation for (default is np.arange(-4, 5, 1)/10).
param block_sizes:: list, optional List of block sizes for extracting block maxima (default is [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]).
param stride:: str, optional Stride type for block maxima extraction. Options are ‘SBM’ (sliding block maxima) or ‘DBM’ (default block maxima) (default is ‘SBM’).
param r:: int, optional Number of orderstatistics to calculate the log-likelihood on. If not specified, use all provided
param estimate_pi:: bool, optional Flag indicating whether to estimate the pi parameter (default is False).
param parallelize:: bool, optional Flag indicating whether to parallelize the ML estimations using asyncio (default is False).

Returns

return:: dict Dictionary containing ML estimation results for each gamma value and block size.

Examples

Run ML Estimation:

from xtremes.topt import run_ML_estimation

result = run_ML_estimation("timeseries_data.pkl", corr='ARMAX', gamma_true=0.5, block_sizes=[10, 20, 30], stride='DBM', option=2, estimate_pi=True)
print(result)

Run Multiple ML Estimations:

from xtremes.topt import run_multiple_ML_estimations
import numpy as np

result = run_multiple_ML_estimations("timeseries_data.pkl", corr='IID', gamma_trues=np.arange(-4, 5, 1)/10, block_sizes=[10, 20, 30], stride='SBM', option=1, estimate_pi=False, parallelize=True)
print(result)

Reference: topt

Overview

Classes

The TimeSeries Class and its Functionalities

Parameters

Attributes

Methods

Examples

Parameters

Parameters

Parameters

Parameters

Parameters

Parameters

Raises

Examples

The HighOrderStats Class and its Functionalities

Notes

Methods

Attributes

Example

Parameters

Notes

Examples

The Data Class and its Functionalities

Parameters

Attributes

Methods

Example

Parameters

Returns

Parameters

Returns

Parameters

Returns

The PWM_estimators Class

Notes

Parameters

Attributes

Methods

Raises

Examples

Notes

Parameters

Example

Returns

Notes

Parameters

Example

Returns

Notes

Parameters

Example

Returns

Notes

Parameters

Example

Returns

Notes

Parameters

See also

Examples

The ML_estimators, Frechet_ML_estimators and ML_estimators_data Classes

Parameters

Attributes

Methods

Examples

Notes

Parameters

Example

Returns

Parameters

Returns

Raises

Notes

Parameters

Notes

Parameters

Example

Returns

The `TimeSeries` Class and its Functionalities

The `HighOrderStats` Class and its Functionalities

The `Data` Class and its Functionalities

The `PWM_estimators` Class

The `ML_estimators`, `Frechet_ML_estimators` and `ML_estimators_data` Classes