pyuncertainnumber.calibration.tmcmc =================================== .. py:module:: pyuncertainnumber.calibration.tmcmc Classes ------- .. autoapisummary:: pyuncertainnumber.calibration.tmcmc.TMCMC pyuncertainnumber.calibration.tmcmc.Stage Functions --------- .. autoapisummary:: pyuncertainnumber.calibration.tmcmc.recover_trace_samples pyuncertainnumber.calibration.tmcmc.plot_updated_distribution pyuncertainnumber.calibration.tmcmc.initial_population pyuncertainnumber.calibration.tmcmc.log_prior pyuncertainnumber.calibration.tmcmc.compute_beta_update_evidence pyuncertainnumber.calibration.tmcmc.propose pyuncertainnumber.calibration.tmcmc.MCMC_MH pyuncertainnumber.calibration.tmcmc.run_tmcmc_vec pyuncertainnumber.calibration.tmcmc.run_tmcmc_updated pyuncertainnumber.calibration.tmcmc.run_tmcmc pyuncertainnumber.calibration.tmcmc.log_likelihood pyuncertainnumber.calibration.tmcmc.gaussian_likelihood_fun pyuncertainnumber.calibration.tmcmc.get_top_samples pyuncertainnumber.calibration.tmcmc.get_posterior_samples pyuncertainnumber.calibration.tmcmc.hdi_1d pyuncertainnumber.calibration.tmcmc.get_hdi_bounds pyuncertainnumber.calibration.tmcmc.filter_hdi_bounds pyuncertainnumber.calibration.tmcmc.transform_old_trace_to_new Module Contents --------------- .. py:class:: TMCMC(N: int = None, parameters: list = None, names: list[str] = None, log_likelihood: callable = None, mutation_steps: int = 5, status_file_name: str = None, trace=None) Class for TMCMC implementation :param N: number of particles to be sampled from posterior :type N: int :param parameters: list of (size Nop) prior distributions instances :type parameters: list :param log_likelihood: log likelihood function on :math:`\theta` to be defined which is problem specific :type log_likelihood: callable :param status_file_name: name of the status file to store status of the tmcmc sampling :type status_file_name: str :returns: returns trace file of all samples of all tmcmc stages. At stage m: it contains [Sm, Lm, Wm_n, ESS, beta, Smcap] :rtype: mytrace .. rubric:: Example >>> from pyuncertainnumber.calibration.tmcmc import TMCMC >>> from pyuncertainnumber import pba >>> # create an instance of TMCMC class and run tmcmc >>> parameters = [pba.D("uniform", (0.8, 2.2)), pba.D("uniform", (0.4, 1.2))] # dummy prior distributions >>> t = TMCMC( >>> N=1000, # number of particles >>> parameters=parameters, # prior distributions >>> names =['k1', 'k2'], # parameter names >>> log_likelihood=log_likelihood_function, # user defined log likelihood function >>> mutation_steps=1, # number of MCMC steps for perturbation >>> status_file_name='tmcmc_progressing_status.txt', # status log file >>> ) >>> mytrace = t.run() >>> # for loading trace file >>> import pickle >>> with open('tmcmc_trace.pkl', 'rb') as f: >>> mytrace = pickle.load(f) >>> re = TMCMC(trace=mytrace, names=names) # create TMCMC instance with loaded trace and names >>> re.plot_updated_distribution(save=False) # plot prior and posterior distribution >>> prior_samples = re.load_prior_samples() # load prior samples as DataFrame >>> posterior_samples = re.load_posterior_samples() # load posterior samples as DataFrame .. figure:: /_static/tmcmc_2dof.png :alt: 2DOF TMCMC example :align: center :width: 70% Example of TMCMC calibration of a 2-DOF system .. py:attribute:: N :value: None .. py:attribute:: parameters :value: None .. py:attribute:: names :value: None .. py:attribute:: log_likelihood :value: None .. py:attribute:: mutation_steps :value: 5 .. py:attribute:: status_file_name :value: None .. py:attribute:: trace :value: None .. py:method:: check_names() .. py:method:: run() -> list[Stage] Run the TMCMC algorithm :returns: returns trace file of all samples of all tmcmc stages. :rtype: mytrace .. py:method:: check_trace() .. py:method:: save_trace(mytrace, file_name: str, save_dir=None) Save the trace to a file :param mytrace: trace file of all samples of all tmcmc stages. :param file_name: name of the file to save the trace :type file_name: str .. py:method:: load_prior_samples(trace=None) -> pandas.DataFrame .. py:method:: load_posterior_samples(trace=None) -> pandas.DataFrame .. py:method:: plot_updated_distribution(trace=None, save=False) Plot the prior and posterior distribution of the parameters :param trace: trace file of all samples of all tmcmc stages. :param save: if True, save the figure :type save: bool .. py:class:: Stage Container for one TMCMC stage. :param Sm: samples before resampling (N x Np) :type Sm: NDArray :param Lm: log-likelihoods at Sm (N,) :type Lm: NDArray :param Wm_n: normalized incremental weights (N,) :type Wm_n: NDArray :param ESS: effective sample size the next stage (can be None in terminal stage) :type ESS: Optional[float] :param beta: tempering parameter next stage beta :type beta: float :param Smcap: resampled samples (N x Np), None in terminal stage :type Smcap: Optional[NDArray] .. py:attribute:: Sm :type: numpy.typing.NDArray .. py:attribute:: Lm :type: numpy.typing.NDArray .. py:attribute:: Wm_n :type: numpy.typing.NDArray .. py:attribute:: ESS :type: Optional[float] .. py:attribute:: beta :type: float .. py:attribute:: Smcap :type: Optional[numpy.typing.NDArray] .. py:method:: __repr__() -> str .. py:property:: samples :type: numpy.typing.NDArray Samples before resampling (from previous stage). .. py:property:: log_likelihoods :type: numpy.typing.NDArray Log-likelihood evaluated at `samples`. .. py:property:: weights :type: numpy.typing.NDArray Normalized incremental importance weights. .. py:property:: effective_sample_size :type: Optional[float] Effective sample size after reweighting the next stage. .. py:property:: tempering_parameter :type: float Tempering parameter beta next stage beta. .. py:property:: resampled_samples :type: Optional[numpy.typing.NDArray] Samples after importance resampling (before MH mutation). .. py:function:: recover_trace_samples(mytrace: list[Stage], names: list[str]) -> pandas.DataFrame Load back the Prior and Posterior samples in the form of a DataFrame :param mytrace: trace file of all samples of all tmcmc stages. :type mytrace: list[Stage] :param names: list of parameter names :type names: list[str] :returns: Prior and Posterior in the trace as a DataFrame :rtype: (pd.DataFrame) .. py:function:: plot_updated_distribution(mytrace, names: list[str], save=False) Plot the prior and posterior distribution of the parameters :param mytrace: trace file of all samples of all tmcmc stages. :param names: list of parameter names :type names: list[str] :param save: if True, save the figure :type save: bool .. py:function:: initial_population(N: float, all_pars: list) -> numpy.typing.NDArray Generates initial population from prior distribution :param N: number of particles. :type N: float :param all_pars: All the parameters, of size Np, to be updated. `all_pars[i]` is object of type `pdfs`. :type all_pars: list Returns ini_pop (NDArray): the initial population which is a numpy array of size `N x Np`. .. note:: `Np` is number of parameters .. py:function:: log_prior(s: numpy.typing.NDArray, all_pars: list[pyuncertainnumber.calibration.pdfs.ProbabilityDensityFun | pyuncertainnumber.Distribution]) -> float Computes log_prior value at all particles :param s: numpy array of size (N x Np) of all particles. :type s: NDArray :param all_pars: All the parameters, of size Np, to be updated. `all_pars[i]` is object of type `pdfs`. :type all_pars: list[ProbabilityDensityFun | Distribution] Returns log_p (NDArray): log prior at all N particles which is a numpy array of size N .. note:: `Np` denotes number of parameters .. py:function:: compute_beta_update_evidence(beta: float, log_likelihoods: numpy.typing.NDArray, log_evidence: float, prev_ESS: int) Computes beta for the next stage and updated model evidence :param beta: stage parameter. :type beta: float :param log_likelihoods: log likelihood values at all particles which is numpy array of size N :type log_likelihoods: NDArray :param log_evidence: log of evidence. :type log_evidence: float :param prev_ESS: effective sample size of previous stage :type prev_ESS: int :returns: A 4-tuple ``(new_beta, log_evidence, Wm_n, ESS)`` where: - **new_beta** (*float*): stage parameter for next stage. - **log_evidence** (*float*): updated log evidence. - **Wm_n** (*NDArray*): weights of particles for the next stage. - **ESS** (*float*): effective sample size of new stage. :rtype: tuple[float, float, NDArray, float] .. py:function:: propose(current: numpy.typing.NDArray, covariance: numpy.typing.NDArray, n: int) -> numpy.typing.NDArray Proposal distribution for MCMC in pertubation stage :param current: numpy array of size Np current particle location :type current: NDArray :param covariance: numpy array of size Np x Np proposal covariance matrix :type covariance: NDArray :param n: number of proposals. :type n: int :returns: n proposals :rtype: NDArray .. py:function:: MCMC_MH(particle_num: int, Em: numpy.typing.NDArray, Nm_steps: int, current: numpy.typing.NDArray, likelihood_current: float, posterior_current: float, beta: float, numAccepts: int, all_pars: list, log_likelihood: callable) -> tuple[numpy.typing.NDArray, float, float, int] Markov chain Monte Carlo using Metropolis-Hastings which "perturbs" each particle using MCMC-MH Conduct `Nm_steps` steps of MH for each particle. :param particle_num: The index of the current particle/parameter vector being evaluated. :type particle_num: int :param Em: proposal covarince matrix which is numpy array of size Nop x Nop :type Em: NDArray :param Nm_steps: number of perturbation steps. :type Nm_steps: int :param current: numpy array of size Nop. current particle location :type current: NDArray :param likelihood_current: log likelihood value at current particle :type likelihood_current: float :param posterior_current: log posterior value at current particle :type posterior_current: float :param beta: stage parameter. :type beta: float :param numAccepts: total number of accepts :type numAccepts: int :param all_pars: list of size Nop Nop is number of parameters all_pars[i] is object of type pdfs all parameters to be inferred. :param log_likelihood: log likelihood function to be defined in main.py. :type log_likelihood: callable :returns: A 4-tuple ``(current, likelihood_current, posterior_current, numAccepts)`` where: - **current** (*NDArray*): Perturbed particle location of shape (Nop,). - **likelihood_current** (*float*): Log-likelihood at the perturbed particle. - **posterior_current** (*float*): Log-posterior at the perturbed particle. - **numAccepts** (*int*): Updated total number of accepts. :rtype: tuple[NDArray, float, float, int] .. py:function:: run_tmcmc_vec() vectorised version of run_tmcmc to be implemented later .. py:function:: run_tmcmc_updated(N: int, all_pars: list, log_likelihood: callable, status_file_name: str, Nm_steps_max: int = 5, Nm_steps_maxmax: int = 5) -> list[Stage] Updated main workflow of running Transitional MCMC Solves the "At this point, beta has been updated to the new value, but Postm is still the log-target from the previous beta (except at the very first stage). So Postmcap is inconsistent with the beta you pass into MCMC_MH." :param N: int number of particles to be sampled from posterior :type N: int :param all_pars: All the parameters, of size Nop, to be updated. `all_pars[i]` is object of type `pdfs`. :type all_pars: list :param log_likelihood: log likelihood function to be defined in main.py as is problem specific :type log_likelihood: callable :param status_file_name: name of the status file to store status of the tmcmc sampling :type status_file_name: str :param Nm_steps_max: Numbers of MCMC steps for perturbation. The default is 5. :type Nm_steps_max: int, optional :param Nm_steps_maxmax: Numbers of MCMC steps for perturbation. The default is 5. :type Nm_steps_maxmax: int, optional :returns: the trace that contains all iterations of the Stage instances. :rtype: list[Stage] .. py:function:: run_tmcmc(N: int, all_pars: list, log_likelihood: callable, status_file_name: str, Nm_steps_max: int = 5, Nm_steps_maxmax: int = 5) Main workflow of running Transitional MCMC :param N: int number of particles to be sampled from posterior :type N: int :param all_pars: All the parameters, of size Nop, to be updated. `all_pars[i]` is object of type `pdfs`. :type all_pars: list :param log_likelihood: log likelihood function to be defined in main.py as is problem specific :type log_likelihood: callable :param status_file_name: name of the status file to store status of the tmcmc sampling :type status_file_name: str :param Nm_steps_max: Numbers of MCMC steps for perturbation. The default is 5. :type Nm_steps_max: int, optional :param Nm_steps_maxmax: Numbers of MCMC steps for perturbation. The default is 5. :type Nm_steps_maxmax: int, optional :param parallel_processing: should be either 'multiprocessing' or 'mpi' :type parallel_processing: str :returns: mytrace: returns trace file of all samples of all tmcmc stages. at stage m: it contains [Sm, Lm, Wm_n, ESS, beta, Smcap] :rtype: a tuple containing .. py:function:: log_likelihood(sam_id: int, param_vec: numpy.typing.NDArray, data_empirical, n_xa_samples, simulator: callable, which_likelihood_calculator='gaussian') -> numpy.ndarray Picklable function at module level. :param sam_id: (int) a dummy sample index which is required in the main `run_tmcmc` workflow :param param_vec: indeed a 1d epistemic_param_vec, the parameter vector, i.e. epistemic_domain :type param_vec: array-like .. note:: Gaussian approximate likelihood function is implemented here. Other likelihood functions such as "pseudo" and "vae" are underway. `sam_id` is not used here, but it is required in the main `run_tmcmc` workflow .. py:function:: gaussian_likelihood_fun(params, data, simulator, n_xa_samples, dissimilarity_metric='wasserstein') Computes the Gaussian log-likelihood of the data given the model and parameters. :param params: Parameters for the black-box model. :param data: Empirical data (numpy array of shape (time_steps, features, samples)). :param simulator: A function that generates model response samples given parameters. Here response may be time series data of shape (time_steps, features, n_xa_samples), or maybe some performance metrics extracted from the time series data. :param n_xa_samples: Number of samples to generate from the simulator. :returns: The Gaussian log-likelihood of the data given the model. :rtype: log_likelihood .. py:function:: get_top_samples(trace, top_num=20) Get the top posterior samples based on likelihood values. .. py:function:: get_posterior_samples(trace: list[Stage]) -> numpy.typing.NDArray Get the posterior particles of the last stage from the trace. :param trace: trace object. :type trace: list[Stage] .. py:function:: hdi_1d(x: numpy.typing.NDArray, cred_mass=0.95) -> tuple[float, float] Obtain the highest density interval (HDI) from particles :param x: 1D array of samples. :type x: NDArray :param cred_mass: Credible mass for the HDI. :type cred_mass: float :returns: Lower and upper bounds of the HDI. :rtype: tuple .. py:function:: get_hdi_bounds(data: pandas.DataFrame | numpy.typing.NDArray, levels=(0.99, 0.95, 0.2, 0.1, 0.05)) -> pandas.DataFrame Compute HDI bounds for each column in `data` :param data: posterior samples (shape: n_samples x n_vars). :type data: pd.DataFrame | NDArray :param levels: A tuple of high density intervalc levels. :type levels: tuple :returns: DataFrame with HDI bounds for each column and level. :rtype: pd.DataFrame .. py:function:: filter_hdi_bounds(df, level: int = 95) -> pandas.DataFrame Extract HDI bounds for a given credibility level. :param df: HDI dataframe containing columns like `hdi_95_low`, `hdi_95_high` :type df: pd.DataFrame :param level: significance level (e.g. 95, 90, 20) :type level: int | str :returns: DataFrame with columns [hdi__low, hdi__high] :rtype: (pd.DataFrame) .. py:function:: transform_old_trace_to_new(trace: list[list]) -> list[Stage] Transform old trace format to new Stage class format. :param trace: Old trace instance which is list of lists, where each inner list contains: [Sm, Lm, Wm_n, ESS, beta, Smcap] :type trace: list[list] :returns: New trace instance which is list of Stage class instances. :rtype: list[Stage]