pyuncertainnumber.calibration.tmcmc
===================================

.. py:module:: pyuncertainnumber.calibration.tmcmc


Classes
-------

.. autoapisummary::

   pyuncertainnumber.calibration.tmcmc.TMCMC
   pyuncertainnumber.calibration.tmcmc.Stage


Functions
---------

.. autoapisummary::

   pyuncertainnumber.calibration.tmcmc.recover_trace_samples
   pyuncertainnumber.calibration.tmcmc.plot_updated_distribution
   pyuncertainnumber.calibration.tmcmc.initial_population
   pyuncertainnumber.calibration.tmcmc.log_prior
   pyuncertainnumber.calibration.tmcmc.compute_beta_update_evidence
   pyuncertainnumber.calibration.tmcmc.propose
   pyuncertainnumber.calibration.tmcmc.MCMC_MH
   pyuncertainnumber.calibration.tmcmc.run_tmcmc_vec
   pyuncertainnumber.calibration.tmcmc.run_tmcmc_updated
   pyuncertainnumber.calibration.tmcmc.run_tmcmc
   pyuncertainnumber.calibration.tmcmc.log_likelihood
   pyuncertainnumber.calibration.tmcmc.gaussian_likelihood_fun
   pyuncertainnumber.calibration.tmcmc.get_top_samples
   pyuncertainnumber.calibration.tmcmc.get_posterior_samples
   pyuncertainnumber.calibration.tmcmc.hdi_1d
   pyuncertainnumber.calibration.tmcmc.get_hdi_bounds
   pyuncertainnumber.calibration.tmcmc.filter_hdi_bounds
   pyuncertainnumber.calibration.tmcmc.transform_old_trace_to_new


Module Contents
---------------

.. py:class:: TMCMC(N: int = None, parameters: list = None, names: list[str] = None, log_likelihood: callable = None, mutation_steps: int = 5, status_file_name: str = None, trace=None)

   Class for TMCMC implementation

   :param N: number of particles to be sampled from posterior
   :type N: int
   :param parameters: list of (size Nop) prior distributions instances
   :type parameters: list
   :param log_likelihood: log likelihood function on :math:`\theta` to be defined which is problem specific
   :type log_likelihood: callable
   :param status_file_name: name of the status file to store status of the tmcmc sampling
   :type status_file_name: str

   :returns:

             returns trace file of all samples of all tmcmc stages.
                 At stage m: it contains [Sm, Lm, Wm_n, ESS, beta, Smcap]
   :rtype: mytrace

   .. rubric:: Example

   >>> from pyuncertainnumber.calibration.tmcmc import TMCMC
   >>> from pyuncertainnumber import pba
   >>> # create an instance of TMCMC class and run tmcmc
   >>> parameters = [pba.D("uniform", (0.8, 2.2)), pba.D("uniform", (0.4, 1.2))] # dummy prior distributions

   >>> t = TMCMC(
   >>>     N=1000,                                             # number of particles
   >>>     parameters=parameters,                              # prior distributions
   >>>     names =['k1', 'k2'],                                # parameter names
   >>>     log_likelihood=log_likelihood_function,             # user defined log likelihood function
   >>>     mutation_steps=1,                                   # number of MCMC steps for perturbation
   >>>     status_file_name='tmcmc_progressing_status.txt',    # status log file
   >>> )
   >>> mytrace = t.run()

   >>> # for loading trace file
   >>> import pickle
   >>> with open('tmcmc_trace.pkl', 'rb') as f:
   >>>     mytrace = pickle.load(f)
   >>> re = TMCMC(trace=mytrace, names=names)                  # create TMCMC instance with loaded trace and names

   >>> re.plot_updated_distribution(save=False)                # plot prior and posterior distribution
   >>> prior_samples = re.load_prior_samples()                 # load prior samples as DataFrame
   >>> posterior_samples = re.load_posterior_samples()         # load posterior samples as DataFrame

   .. figure:: /_static/tmcmc_2dof.png
       :alt: 2DOF TMCMC example
       :align: center
       :width: 70%

       Example of TMCMC calibration of a 2-DOF system


   .. py:attribute:: N
      :value: None


   .. py:attribute:: parameters
      :value: None


   .. py:attribute:: names
      :value: None


   .. py:attribute:: log_likelihood
      :value: None


   .. py:attribute:: mutation_steps
      :value: 5


   .. py:attribute:: status_file_name
      :value: None


   .. py:attribute:: trace
      :value: None


   .. py:method:: check_names()


   .. py:method:: run() -> list[Stage]

      Run the TMCMC algorithm

      :returns: returns trace file of all samples of all tmcmc stages.
      :rtype: mytrace


   .. py:method:: check_trace()


   .. py:method:: save_trace(mytrace, file_name: str, save_dir=None)

      Save the trace to a file

      :param mytrace: trace file of all samples of all tmcmc stages.
      :param file_name: name of the file to save the trace
      :type file_name: str


   .. py:method:: load_prior_samples(trace=None) -> pandas.DataFrame


   .. py:method:: load_posterior_samples(trace=None) -> pandas.DataFrame


   .. py:method:: plot_updated_distribution(trace=None, save=False)

      Plot the prior and posterior distribution of the parameters

      :param trace: trace file of all samples of all tmcmc stages.
      :param save: if True, save the figure
      :type save: bool


.. py:class:: Stage

   Container for one TMCMC stage.

   :param Sm: samples before resampling (N x Np)
   :type Sm: NDArray
   :param Lm: log-likelihoods at Sm (N,)
   :type Lm: NDArray
   :param Wm_n: normalized incremental weights (N,)
   :type Wm_n: NDArray
   :param ESS: effective sample size the next stage (can be None in terminal stage)
   :type ESS: Optional[float]
   :param beta: tempering parameter next stage beta
   :type beta: float
   :param Smcap: resampled samples (N x Np), None in terminal stage
   :type Smcap: Optional[NDArray]


   .. py:attribute:: Sm
      :type:  numpy.typing.NDArray


   .. py:attribute:: Lm
      :type:  numpy.typing.NDArray


   .. py:attribute:: Wm_n
      :type:  numpy.typing.NDArray


   .. py:attribute:: ESS
      :type:  Optional[float]


   .. py:attribute:: beta
      :type:  float


   .. py:attribute:: Smcap
      :type:  Optional[numpy.typing.NDArray]


   .. py:method:: __repr__() -> str


   .. py:property:: samples
      :type: numpy.typing.NDArray


      Samples before resampling (from previous stage).


   .. py:property:: log_likelihoods
      :type: numpy.typing.NDArray


      Log-likelihood evaluated at `samples`.


   .. py:property:: weights
      :type: numpy.typing.NDArray


      Normalized incremental importance weights.


   .. py:property:: effective_sample_size
      :type: Optional[float]


      Effective sample size after reweighting the next stage.


   .. py:property:: tempering_parameter
      :type: float


      Tempering parameter beta next stage beta.


   .. py:property:: resampled_samples
      :type: Optional[numpy.typing.NDArray]


      Samples after importance resampling (before MH mutation).


.. py:function:: recover_trace_samples(mytrace: list[Stage], names: list[str]) -> pandas.DataFrame

   Load back the Prior and Posterior samples in the form of a DataFrame

   :param mytrace: trace file of all samples of all tmcmc stages.
   :type mytrace: list[Stage]
   :param names: list of parameter names
   :type names: list[str]

   :returns: Prior and Posterior in the trace as a DataFrame
   :rtype: (pd.DataFrame)


.. py:function:: plot_updated_distribution(mytrace, names: list[str], save=False)

   Plot the prior and posterior distribution of the parameters

   :param mytrace: trace file of all samples of all tmcmc stages.
   :param names: list of parameter names
   :type names: list[str]
   :param save: if True, save the figure
   :type save: bool


.. py:function:: initial_population(N: float, all_pars: list) -> numpy.typing.NDArray

   Generates initial population from prior distribution

   :param N: number of particles.
   :type N: float
   :param all_pars: All the parameters, of size Np, to be updated. `all_pars[i]` is object of type `pdfs`.
   :type all_pars: list

   Returns
       ini_pop (NDArray): the initial population which is a numpy array of size `N x Np`.

   .. note:: `Np` is number of parameters


.. py:function:: log_prior(s: numpy.typing.NDArray, all_pars: list[pyuncertainnumber.calibration.pdfs.ProbabilityDensityFun | pyuncertainnumber.Distribution]) -> float

   Computes log_prior value at all particles

   :param s: numpy array of size (N x Np) of all particles.
   :type s: NDArray
   :param all_pars: All the parameters, of size Np, to be updated. `all_pars[i]` is object of type `pdfs`.
   :type all_pars: list[ProbabilityDensityFun | Distribution]

   Returns
       log_p (NDArray): log prior at all N particles which is a numpy array of size N

   .. note:: `Np` denotes number of parameters


.. py:function:: compute_beta_update_evidence(beta: float, log_likelihoods: numpy.typing.NDArray, log_evidence: float, prev_ESS: int)

   Computes beta for the next stage and updated model evidence

   :param beta: stage parameter.
   :type beta: float
   :param log_likelihoods: log likelihood values at all particles which is numpy array of size N
   :type log_likelihoods: NDArray
   :param log_evidence: log of evidence.
   :type log_evidence: float
   :param prev_ESS: effective sample size of previous stage
   :type prev_ESS: int

   :returns: A 4-tuple ``(new_beta, log_evidence, Wm_n, ESS)`` where:
       - **new_beta** (*float*): stage parameter for next stage.
       - **log_evidence** (*float*): updated log evidence.
       - **Wm_n** (*NDArray*): weights of particles for the next stage.
       - **ESS** (*float*): effective sample size of new stage.
   :rtype: tuple[float, float, NDArray, float]


.. py:function:: propose(current: numpy.typing.NDArray, covariance: numpy.typing.NDArray, n: int) -> numpy.typing.NDArray

   Proposal distribution for MCMC in pertubation stage

   :param current: numpy array of size Np
                   current particle location
   :type current: NDArray
   :param covariance: numpy array of size Np x Np
                      proposal covariance matrix
   :type covariance: NDArray
   :param n: number of proposals.
   :type n: int

   :returns: n proposals
   :rtype: NDArray


.. py:function:: MCMC_MH(particle_num: int, Em: numpy.typing.NDArray, Nm_steps: int, current: numpy.typing.NDArray, likelihood_current: float, posterior_current: float, beta: float, numAccepts: int, all_pars: list, log_likelihood: callable) -> tuple[numpy.typing.NDArray, float, float, int]

   Markov chain Monte Carlo using Metropolis-Hastings which "perturbs" each particle using MCMC-MH

   Conduct `Nm_steps` steps of MH for each particle.

   :param particle_num: The index of the current particle/parameter vector being evaluated.
   :type particle_num: int
   :param Em: proposal covarince matrix which is numpy array of size Nop x Nop
   :type Em: NDArray
   :param Nm_steps: number of perturbation steps.
   :type Nm_steps: int
   :param current: numpy array of size Nop. current particle location
   :type current: NDArray
   :param likelihood_current: log likelihood value at current particle
   :type likelihood_current: float
   :param posterior_current: log posterior value at current particle
   :type posterior_current: float
   :param beta: stage parameter.
   :type beta: float
   :param numAccepts: total number of accepts
   :type numAccepts: int
   :param all_pars: list of size Nop
                    Nop is number of parameters
                    all_pars[i] is object of type pdfs
                    all parameters to be inferred.
   :param log_likelihood: log likelihood function to be defined in main.py.
   :type log_likelihood: callable

   :returns: A 4-tuple ``(current, likelihood_current, posterior_current, numAccepts)`` where:

       - **current** (*NDArray*): Perturbed particle location of shape (Nop,).
       - **likelihood_current** (*float*): Log-likelihood at the perturbed particle.
       - **posterior_current** (*float*): Log-posterior at the perturbed particle.
       - **numAccepts** (*int*): Updated total number of accepts.
   :rtype: tuple[NDArray, float, float, int]


.. py:function:: run_tmcmc_vec()

   vectorised version of run_tmcmc to be implemented later


.. py:function:: run_tmcmc_updated(N: int, all_pars: list, log_likelihood: callable, status_file_name: str, Nm_steps_max: int = 5, Nm_steps_maxmax: int = 5) -> list[Stage]

   Updated main workflow of running Transitional MCMC

   Solves the "At this point, beta has been updated to the new value, but Postm is still the log-target from the previous beta (except at the very first stage).
   So Postmcap is inconsistent with the beta you pass into MCMC_MH."

   :param N: int
             number of particles to be sampled from posterior
   :type N: int
   :param all_pars: All the parameters, of size Nop, to be updated. `all_pars[i]` is object of type `pdfs`.
   :type all_pars: list
   :param log_likelihood: log likelihood function to be defined in main.py as is problem specific
   :type log_likelihood: callable
   :param status_file_name: name of the status file to store status of the tmcmc sampling
   :type status_file_name: str
   :param Nm_steps_max: Numbers of MCMC steps for perturbation. The default is 5.
   :type Nm_steps_max: int, optional
   :param Nm_steps_maxmax: Numbers of MCMC steps for perturbation. The default is 5.
   :type Nm_steps_maxmax: int, optional

   :returns: the trace that contains all iterations of the Stage instances.
   :rtype: list[Stage]


.. py:function:: run_tmcmc(N: int, all_pars: list, log_likelihood: callable, status_file_name: str, Nm_steps_max: int = 5, Nm_steps_maxmax: int = 5)

   Main workflow of running Transitional MCMC

   :param N: int
             number of particles to be sampled from posterior
   :type N: int
   :param all_pars: All the parameters, of size Nop, to be updated. `all_pars[i]` is object of type `pdfs`.
   :type all_pars: list
   :param log_likelihood: log likelihood function to be defined in main.py as is problem specific
   :type log_likelihood: callable
   :param status_file_name: name of the status file to store status of the tmcmc sampling
   :type status_file_name: str
   :param Nm_steps_max: Numbers of MCMC steps for perturbation. The default is 5.
   :type Nm_steps_max: int, optional
   :param Nm_steps_maxmax: Numbers of MCMC steps for perturbation. The default is 5.
   :type Nm_steps_maxmax: int, optional
   :param parallel_processing: should be either 'multiprocessing' or 'mpi'
   :type parallel_processing: str

   :returns:

                 mytrace: returns trace file of all samples of all tmcmc stages.
                     at stage m: it contains [Sm, Lm, Wm_n, ESS, beta, Smcap]
   :rtype: a tuple containing


.. py:function:: log_likelihood(sam_id: int, param_vec: numpy.typing.NDArray, data_empirical, n_xa_samples, simulator: callable, which_likelihood_calculator='gaussian') -> numpy.ndarray

   Picklable function at module level.

   :param sam_id: (int) a dummy sample index  which is required in the main `run_tmcmc` workflow
   :param param_vec: indeed a 1d epistemic_param_vec, the parameter vector, i.e. epistemic_domain
   :type param_vec: array-like

   .. note::

      Gaussian approximate likelihood function is implemented here. Other likelihood functions such as
      "pseudo" and "vae" are underway. `sam_id` is not used here, but it is required in the main `run_tmcmc` workflow


.. py:function:: gaussian_likelihood_fun(params, data, simulator, n_xa_samples, dissimilarity_metric='wasserstein')

   Computes the Gaussian log-likelihood of the data given the model and parameters.

   :param params: Parameters for the black-box model.
   :param data: Empirical data (numpy array of shape (time_steps, features, samples)).
   :param simulator: A function that generates model response samples given parameters.
                     Here response may be time series data of shape (time_steps, features, n_xa_samples), or
                     maybe some performance metrics extracted from the time series data.
   :param n_xa_samples: Number of samples to generate from the simulator.

   :returns: The Gaussian log-likelihood of the data given the model.
   :rtype: log_likelihood


.. py:function:: get_top_samples(trace, top_num=20)

   Get the top posterior samples based on likelihood values.


.. py:function:: get_posterior_samples(trace: list[Stage]) -> numpy.typing.NDArray

   Get the posterior particles of the last stage from the trace.

   :param trace: trace object.
   :type trace: list[Stage]


.. py:function:: hdi_1d(x: numpy.typing.NDArray, cred_mass=0.95) -> tuple[float, float]

   Obtain the highest density interval (HDI) from particles

   :param x: 1D array of samples.
   :type x: NDArray
   :param cred_mass: Credible mass for the HDI.
   :type cred_mass: float

   :returns: Lower and upper bounds of the HDI.
   :rtype: tuple


.. py:function:: get_hdi_bounds(data: pandas.DataFrame | numpy.typing.NDArray, levels=(0.99, 0.95, 0.2, 0.1, 0.05)) -> pandas.DataFrame

   Compute HDI bounds for each column in `data`

   :param data: posterior samples (shape: n_samples x n_vars).
   :type data: pd.DataFrame | NDArray
   :param levels: A tuple of high density intervalc levels.
   :type levels: tuple

   :returns: DataFrame with HDI bounds for each column and level.
   :rtype: pd.DataFrame


.. py:function:: filter_hdi_bounds(df, level: int = 95) -> pandas.DataFrame

   Extract HDI bounds for a given credibility level.

   :param df: HDI dataframe containing columns like `hdi_95_low`, `hdi_95_high`
   :type df: pd.DataFrame
   :param level: significance level (e.g. 95, 90, 20)
   :type level: int | str

   :returns: DataFrame with columns [hdi_<level>_low, hdi_<level>_high]
   :rtype: (pd.DataFrame)


.. py:function:: transform_old_trace_to_new(trace: list[list]) -> list[Stage]

   Transform old trace format to new Stage class format.

   :param trace: Old trace instance which is list of lists, where each inner list contains:
                 [Sm, Lm, Wm_n, ESS, beta, Smcap]
   :type trace: list[list]

   :returns: New trace instance which is list of Stage class instances.
   :rtype: list[Stage]