gcfit.analysis.NestedRun#

class gcfit.analysis.NestedRun(filename, observations=None, group='nested', name=None, *args, **kwargs)#

Analysis and visualization of a nested sampling cluster fitting run.

Provides a number of flexible plotting, output and summary methods useful for the analysis of both the procedure and results of a nested sampling fitting run, based on the output file generated by the fitting.

Parameters:
filenamepathlib.Path or str

Path to the run output HDF5 file.

observationsgcfit.Observations or None, optional

The Observations instance corresponding to this cluster. If None (default), an educated guess will be made on the source location (i.e. the restrict_to argument) and the observations will be created based on the cluster name stored in the output.

groupstr, optional

Name of the root group in the HDF5 file. Defaults to “nested”, the most likely name used by gcfit.nested_fit.

namestr, optional

Custom name for this run.

*args, **kwargsdict

All other arguments are passed to _SingleRunAnalysis.

Methods

__init__(filename[, observations, group, name])

add_residuals(ax, y1, y2, e1, e2[, clrs, ...])

Append an extra axis to ax for plotting residuals.

get_CImodel([N, Nprocesses, add_errors, ...])

Return a CIModelVisualizer instance corresponding to this run.

get_model([method, add_errors])

Return a single ModelVisualizer instance corresponding to this run.

parameter_means([Nruns, sim_runs, ...])

Compute the mean of each parameter with corresponding errors.

parameter_summary(*[, N_simruns, label_fixed])

Compute the mean and std.dev. on each parameter.

parameter_vars([Nruns, sim_runs, return_samples])

Compute the variance of each parameter with corresponding errors.

plot_H([fig, ax])

Plot the information integral H.

plot_HN([fig, ax])

Plot the information H by the number of live points N.

plot_IMF([fig, ax, show_canonical, ci])

Plot the IMF, based on the alpha exponents.

plot_KL_divergence([fig, ax, Nruns, kl_kwargs])

plot_bounds(iteration[, fig, show_live])

Plot a "corner plot" approximating the bounds at some iteration.

plot_evidence([fig, ax, error])

Plot the estimated evidence.

plot_marginals([fig, full_volume])

Plot a "corner plot" showcasing the relationships between parameters.

plot_ncall([fig, ax])

Plot the number of likelihood calls.

plot_nlive([fig, ax])

Plot the number of live points.

plot_params([fig, params, posterior_color, ...])

Plot a diagnostic figure of the distributions of parameter samples.

plot_posterior(param[, fig, ax, chain, ...])

Plot a smoothed posterior distribution of a single parameter.

plot_probability([fig, ax])

Plot the posterior probability.

plot_weights([fig, ax, show_bounds, ...])

Plot the sample weights as a function of the prior volume.

print_summary([out, content, N_simruns])

Write a short summary of the run results and metadata.

reset_mask()

Reset the mask.

slice_on_param(param, lower_lim, upper_lim)

Set mask based on the value of a certain parameter.

Attributes

AIC

Akaike information criterion.

BIC

Bayesian information criterion.

ESS

The effective sample size.

chains

cmap

mask

Mask out certain samples, removing them from all analysis.

results

state

Return a flag representing how far through the fitting this run progressed.

weights

Array of weights, based on the default dynesty.weight_function.

property AIC#

Akaike information criterion.

property BIC#

Bayesian information criterion.

property ESS#

The effective sample size.
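The three properties above correspond to standard statistics, whose textbook definitions can be sketched in a few lines. This is a minimal sketch, not gcfit's implementation: the function names and arguments (`max_loglike`, `n_params`, `n_data`) are hypothetical, and the ESS shown is the Kish estimate, which is assumed here rather than confirmed by this page.

```python
import math


def aic(max_loglike, n_params):
    """Akaike information criterion: 2k - 2 ln(L_max)."""
    return 2 * n_params - 2 * max_loglike


def bic(max_loglike, n_params, n_data):
    """Bayesian information criterion: k ln(n) - 2 ln(L_max)."""
    return n_params * math.log(n_data) - 2 * max_loglike


def kish_ess(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)
```

With equal weights the Kish estimate equals the number of samples, and it drops towards 1 as a single sample dominates the weights.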

add_residuals(ax, y1, y2, e1, e2, clrs=None, res_ax=None, loc='bottom', size='15%', pad=0.1)#

Append an extra axis to ax for plotting residuals.

Automatically appends a new axis to the given ax (by default at the bottom), and plots the residuals between the two given quantities (and their errors) on it, as a percentage.

Parameters:
axmatplotlib.axes.Axes

An axes instance on which to plot this observational data.

y1, y2np.ndarray

Arrays of data values to plot the residual between. Residuals are computed as (y2 - y1) / y1.

e1, e2np.ndarray

Arrays of errors on each datapoint.

clrscolor, optional

Colour used for all datapoints, passed to errorbar and scatter.

res_axmatplotlib.axes.Axes, optional

Optionally provide an already created axis to plot residuals on. This is useful for overplotting multiple residuals (i.e. for multiple datasets).

loc{“left”, “right”, “bottom”, “top”}, optional

Where the new axes is positioned relative to the main axes.

sizestr or float, optional

The size of the appended residuals axes, with respect to the primary axes. See mpl_toolkits.axes_grid1.axes_divider.AxesDivider.append_axes for more information. Defaults to “15%”.

padfloat, optional

Padding between the axes. Defaults to 0.1.

Returns:
matplotlib.axes.Axes

The created axes instance containing the residuals plot.
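The residual quantity described above, (y2 - y1) / y1 expressed as a percentage, can be sketched as follows. How gcfit combines e1 and e2 into an error on the residual is not stated on this page, so the first-order propagation used here (assuming independent errors) is an assumption, and `percent_residuals` is a hypothetical helper name.

```python
import math


def percent_residuals(y1, y2, e1, e2):
    """Return percentage residuals 100 * (y2 - y1) / y1 and their errors.

    Errors use first-order propagation for the ratio y2 / y1, assuming
    e1 and e2 are independent (an assumption of this sketch).
    Every value in y1 must be nonzero.
    """
    res, err = [], []
    for a, b, ea, eb in zip(y1, y2, e1, e2):
        res.append(100.0 * (b - a) / a)
        # d(b/a)/db = 1/a and d(b/a)/da = -b/a^2, added in quadrature
        err.append(100.0 * math.hypot(eb / a, b * ea / a**2))
    return res, err
```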

get_CImodel(N=100, Nprocesses=1, add_errors=False, shuffle=True, load=False)#

Return a CIModelVisualizer instance corresponding to this run.

If load is False, the visualizer is initialized through the CIModelVisualizer.from_chain classmethod, using the chain from this run and N samples. Otherwise, the CIModelVisualizer.load classmethod is used, assuming a CI model has already been created and saved to this same file, under the model group.

Parameters:
Nint, optional

The number of samples to use in computing the confidence intervals.

Nprocessesint, optional

The number of processes to use in a multiprocessing.Pool passed to the CI model initializer. Defaults to only 1 cpu.

add_errorsbool, optional

Optionally add the statistical and sampling errors, not normally accounted for, to the chain of samples used (using self._sim_errors(1)).

shufflebool, optional

Optionally shuffle the chains. This may be useful if N is too small to be representative of the full (reweighted) posteriors, and the final samples in the chain are nearly equal (due to their high weights).

loadbool, optional

If True, will attempt to load a CI model, rather than creating a new one.

Returns:
CIModelVisualizer

The created model visualization (with confidence intervals) object.

get_model(method='mean', add_errors=False)#

Return a single ModelVisualizer instance corresponding to this run.

The visualizer is initialized through the ModelVisualizer.from_chain classmethod, with the chain from this run and the method given here.

Parameters:
method{‘median’, ‘mean’, ‘final’}, optional

The method used to compute a single theta set from the chain. Defaults to ‘mean’.

add_errorsbool, optional

Optionally add the statistical and sampling errors, not normally accounted for, to the chain of samples used (using self._sim_errors(1)).

Returns:
ModelVisualizer

The created model visualization object.
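The three method options reduce a chain of samples to a single parameter set. A minimal sketch of that reduction is shown below, assuming an unweighted chain stored as a list of parameter rows; `reduce_chain` is a hypothetical name and the real reduction inside ModelVisualizer.from_chain may differ (for example by accounting for sample weights).

```python
import statistics


def reduce_chain(chain, method="mean"):
    """Collapse a chain (list of equal-length parameter rows) to one row."""
    if method == "final":
        return list(chain[-1])
    columns = list(zip(*chain))  # one tuple of samples per parameter
    if method == "mean":
        return [statistics.fmean(col) for col in columns]
    if method == "median":
        return [statistics.median(col) for col in columns]
    raise ValueError(f"unknown method: {method!r}")
```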

property mask#

Mask out certain samples, removing them from all analysis.

parameter_means(Nruns=250, sim_runs=None, return_samples=True)#

Compute the mean of each parameter with corresponding errors.

Returns the means of each parameter posterior estimation, with the corresponding error on this statistic. The uncertainties come from the two main sources of errors in nested sampling: statistical errors associated with the uncertainties surrounding the prior volume, and sampling errors associated with the integral over the parameters of interest.

These errors can be computed using the standard deviation of the mean from a number of “simulated” (resampled and jittered) runs based on this run. See https://dynesty.readthedocs.io/en/latest/errors.html for a more thorough description.

Parameters:
Nrunsint, optional

The number of simulated runs to use to estimate the uncertainties.

sim_runsNone or list of dynesty.Results

A list of simulated runs to use. A precomputed list of runs may be provided, otherwise they will be computed using _sim_errors.

return_samplesbool, optional

Optionally also return the full array of parameter means from each simulated run.

Returns:
meannp.ndarray[Nparams]

Mean values of each parameter.

errnp.ndarray[Nparams]

Errors on the mean of each parameter.

means_arrnp.ndarray[Nruns, Nparams]

The mean values of each parameter for each simulated run.
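The procedure described above, taking the scatter of the per-run means over a set of simulated runs, can be sketched as follows. This is only an illustration under stated assumptions: each simulated run is represented as a hypothetical (samples, weights) pair rather than a dynesty.Results object, the resampling and jittering that produce the runs are not shown, and the sample standard deviation is used for the error.

```python
import statistics


def weighted_mean(samples, weights):
    """Weighted mean of each parameter (column) of `samples`."""
    total = sum(weights)
    n_params = len(samples[0])
    return [
        sum(w * row[j] for row, w in zip(samples, weights)) / total
        for j in range(n_params)
    ]


def means_with_errors(sim_runs):
    """Mean and scatter of the per-run weighted means.

    `sim_runs` is a list of (samples, weights) pairs, one per simulated run.
    """
    means_arr = [weighted_mean(s, w) for s, w in sim_runs]
    columns = list(zip(*means_arr))  # one tuple of run means per parameter
    mean = [statistics.fmean(col) for col in columns]
    err = [statistics.stdev(col) for col in columns]
    return mean, err, means_arr
```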

parameter_summary(*, N_simruns=100, label_fixed=False)#

Compute the mean and std.dev. on each parameter.

Computes and returns a dictionary with the mean and standard deviation of each parameter, as given by parameter_means and np.sqrt(np.diag(parameter_vars)).

Parameters:
N_simrunsint, optional

The number of simulated runs used to compute the means and errors on each parameter, through parameter_{means,vars}.

label_fixedbool, optional

If True, adds “ (fixed)” to the end of any parameters which were fixed during fitting.

Returns:
dict

Dictionary of parameter labels and 2-tuples of mean and standard deviations.
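The shape of the returned dictionary can be illustrated with a small sketch that takes a mean vector and a covariance matrix and reads the standard deviations off the diagonal, as described above. The helper name `summarize`, its `fixed` argument, and the parameter labels used in the usage note are hypothetical.

```python
import math


def summarize(labels, mean, cov, fixed=(), label_fixed=False):
    """Map each parameter label to a (mean, std) tuple."""
    summary = {}
    for i, label in enumerate(labels):
        std = math.sqrt(cov[i][i])  # standard deviation from the diagonal
        if label_fixed and label in fixed:
            label = f"{label} (fixed)"
        summary[label] = (mean[i], std)
    return summary
```

For example, `summarize(["a", "b"], [6.0, 1.2], [[0.04, 0.0], [0.0, 0.0]], fixed={"b"}, label_fixed=True)` yields one entry keyed `"a"` and one keyed `"b (fixed)"`.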

parameter_vars(Nruns=250, sim_runs=None, return_samples=True)#

Compute the variance of each parameter with corresponding errors.

Returns the covariance array for each parameter posterior estimation, with the corresponding error on this statistic. The uncertainties come from the two main sources of errors in nested sampling: statistical errors associated with the uncertainties surrounding the prior volume, and sampling errors associated with the integral over the parameters of interest.

These errors can be computed using the standard deviation of the variance from a number of “simulated” (resampled and jittered) runs based on this run. See https://dynesty.readthedocs.io/en/latest/errors.html for a more thorough description.

Parameters:
Nrunsint, optional

The number of simulated runs to use to estimate the uncertainties.

sim_runsNone or list of dynesty.Results

A list of simulated runs to use. A precomputed list of runs may be provided, otherwise they will be computed using _sim_errors.

return_samplesbool, optional

Optionally also return the full array of parameter variances from each simulated run.

Returns:
varsnp.ndarray[Nparams, Nparams]

Covariance matrix for all parameters.

errnp.ndarray[Nparams, Nparams]

Errors on the covariance matrix for all parameters.

vars_arrnp.ndarray[Nruns, Nparams, Nparams]

The covariance matrix for each simulated run.

plot_H(fig=None, ax=None, **kw)#

Plot the information integral H.

Plots the “information” gain (H) provided by the updating of a given prior, as characterized by the Kullback-Leibler divergence, as a function of the (log) prior volume.

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this information. Should be a part of the given fig.

**kwdict

All other arguments are passed to ax.plot.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

Notes

\[H \equiv \int_{\Omega_{\boldsymbol{\Theta}}} P(\boldsymbol{\Theta}) \ln\frac{P(\boldsymbol{\Theta})}{\pi(\boldsymbol{\Theta})}\, d\boldsymbol{\Theta}\]
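In practice this integral is estimated from the dead points of the run. A minimal sketch of the usual discrete estimate, H ≈ Σ (w_i / Z) ln(L_i / Z), is given below. It assumes a static run with a fixed number of live points, a deterministic prior-volume shrinkage X_i = exp(-i / n_live), and simple rectangle weights; `information_gain` is a hypothetical name, and the values plotted by gcfit come from the stored run results rather than from this function.

```python
import math


def information_gain(loglikes, n_live):
    """Estimate (Z, H) from the log-likelihoods of the dead points.

    Assumes the prior volume shrinks as X_i = exp(-i / n_live) and uses
    the simple weights w_i = L_i * (X_{i-1} - X_i).
    """
    weights = []
    x_prev = 1.0
    for i, logl in enumerate(loglikes, start=1):
        x_i = math.exp(-i / n_live)
        weights.append(math.exp(logl) * (x_prev - x_i))
        x_prev = x_i
    z = sum(weights)
    log_z = math.log(z)
    # H = sum_i (w_i / Z) * ln(L_i / Z)
    h = sum(w / z * (logl - log_z) for w, logl in zip(weights, loglikes))
    return z, h
```

For a toy likelihood L(X) = 1 - X the exact values are Z = 1/2 and H = ln 2 - 1/2, which the sketch reproduces to within the discretization error.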

plot_HN(fig=None, ax=None, **kw)#

Plot the information H by the number of live points N.

Plots the “information” gain (H) multiplied by the current number of live points, as a function of run iteration. Intended for comparison against one of the termination conditions described by Skilling (2006).

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this information. Should be a part of the given fig.

**kwdict

All other arguments are passed to ax.plot.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_IMF(fig=None, ax=None, show_canonical='all', ci=True)#

Plot the IMF, based on the alpha exponents.

plot_bounds(iteration, fig=None, show_live=False, **kw)#

Plot a “corner plot” approximating the bounds at some iteration.

Plots an Nparam-by-Nparam lower-triangular “corner” plot showing the approximate extent of the bounding distributions of each parameter at a given iteration. Uses the plotting.cornerbound function built into dynesty.

Parameters:
iterationint or list of int

The iterations of the nested sampling run to show the bounding distributions for. If multiple iterations are given, they will be overplotted, in order, in different colours.

figNone or matplotlib.figure.Figure, optional

Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.

show_livebool, optional

Also show the live points at this given iteration. Note that this option does not currently appear to behave correctly.

**kwdict

All other arguments are passed to dynesty.plotting.cornerbound.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

See also

dynesty.plotting.cornerbound

Base dynesty function used for plotting of each iteration.

plot_evidence(fig=None, ax=None, error=False, **kw)#

Plot the estimated evidence.

Plots the estimated (log) Bayesian evidence as a function of the (log) prior volume.

Nested sampling provides a continuous estimate of the Bayesian evidence based on the integral over the prior volume contained within a given iso-likelihood contour.

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this evidence. Should be a part of the given fig.

errorbool, optional

Optionally also show the error on the evidence estimation as contours on the plot.

**kwdict

All other arguments are passed to ax.plot.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

Notes

In nested sampling the evidence integral can be numerically approximated using a given set of dead points as:

\[\mathcal{Z} = \int_{0}^{1} \mathcal{L}(X)\,dX \approx \sum_{i=1}^{N}\,f(\mathcal{L}_i)\,f(\Delta X_i) \equiv \sum_{i=1}^{N}\,\hat{w}_i\]
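The sum above can be accumulated step by step to give the running evidence estimate as a function of ln X. The sketch below is one possible choice of the generic functions f, assuming a static run with X_i = exp(-i / n_live) and trapezoidal weights with L_0 = 0; `running_log_evidence` is a hypothetical name. It works in linear space for readability, whereas realistic likelihood values would require log-space accumulation to avoid underflow.

```python
import math


def running_log_evidence(loglikes, n_live):
    """Return (ln X_i, running ln Z_i) for a static nested sampling run.

    Assumes X_i = exp(-i / n_live) and trapezoidal weights
    w_i = 0.5 * (L_{i-1} + L_i) * (X_{i-1} - X_i), with L_0 = 0.
    """
    log_x, log_z = [], []
    z = 0.0
    like_prev, x_prev = 0.0, 1.0
    for i, logl in enumerate(loglikes, start=1):
        x_i = math.exp(-i / n_live)
        like = math.exp(logl)
        z += 0.5 * (like_prev + like) * (x_prev - x_i)
        log_x.append(-i / n_live)
        log_z.append(math.log(z))
        like_prev, x_prev = like, x_i
    return log_x, log_z
```

The running estimate rises as the sampler moves to smaller prior volumes and flattens once the remaining volume contributes negligibly to the integral.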

plot_marginals(fig=None, full_volume=False, **corner_kw)#

Plot a “corner plot” showcasing the relationships between parameters.

Plots an Nparam-by-Nparam lower-triangular “corner” marginal plot showing the projections of all sampled parameter values, using the corner.py package.

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.

full_volumebool, optional

Use the entire raw chains, not resampled based on the weights. This will not show correct posteriors.

**corner_kwdict

All other arguments are passed to corner.corner.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_ncall(fig=None, ax=None, **kw)#

Plot the number of likelihood calls.

Plots the total number of likelihood function calls made at each step as a function of the (log) prior volume.

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this number. Should be a part of the given fig.

**kwdict

All other arguments are passed to ax.step.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_nlive(fig=None, ax=None, **kw)#

Plot the number of live points.

Plots the current number of live points, as a function of the (log) prior volume. This should remain constant until dynamic sampling begins, increase incrementally, and then decrease smoothly until all live points are removed.

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this number. Should be a part of the given fig.

**kwdict

All other arguments are passed to ax.plot.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_params(fig=None, params=None, *, posterior_color='tab:blue', posterior_border=True, show_weight=True, fill_type='weights', ylims=None, truths=None, **kw)#

Plot a diagnostic figure of the distributions of parameter samples.

Plots an Nparam-panel figure showcasing the parameter values of all samples, over the iteration domain, as well as a KDE-based smoothed posterior distribution for each parameter.

Provides a diagnostic figure for examining the parameter estimation. This is a modified version of the diagnostic plot first introduced in Higson et al. (2018).

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.

paramsNone or list of str, optional

The parameters to show on this figure. If None (default) all parameters (including fixed params) will be shown.

posterior_colorcolor, optional

The colour of the smoothed posterior distributions.

posterior_borderbool, optional

If False, will remove the axis frame around the smooth posterior distribution to the right of each panel.

show_weightbool, optional

Plot the (resampled) weights above the parameter samples columns, using the plot_weights method.

fill_type{weights, iters, id, batch, bound}, optional

The mapping used to colour all points within the samples axes. Defaults to ‘weights’.

ylimslist[Nparam] of 2-tuples, optional

Used to set the upper and lower y axis-limits on each parameter.

truthsnp.ndarray[Nparam] or np.ndarray[Nparam, 3], optional

Optionally indicate the “true” values as horizontal lines on the posterior frames. If [Nparam, 3], the values in each row will be taken as the median, lower limit and upper limit.

**kwdict

All other arguments are passed to ax.scatter.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_posterior(param, fig=None, ax=None, chain=None, flipped=True, truth=None, truth_ci=None, *args, **kwargs)#

Plot a smoothed posterior distribution of a single parameter.

Plots a Gaussian-KDE smoothed posterior probability distribution of a given parameter. Designed mainly to be used within the plot_params method, but can be used on its own.

Parameters:
paramstr

Name of the parameter to plot.

figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this posterior. Should be a part of the given fig.

chainnp.ndarray, optional

Optionally supply a (flattened) array of samples to create the posterior from. By default, will load the full chain using self._get_chains.

flippedbool, optional

If True (default), the posterior will be flipped on its side, attached to the left axis.

truthfloat, optional

Optionally indicate the “true” value as a horizontal line on the posterior.

truth_ci2-tuple of float, optional

Optionally shade between the lower and upper limits of the “truth” values, using plt.axhspan.

**kwargsdict

All other arguments are passed to the ax.fill_between function.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_probability(fig=None, ax=None, **kw)#

Plot the posterior probability.

Plots the total (sum of all components) logged posterior probability of the nested sampler as a function of (log) prior volume.

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot this probability. Should be a part of the given fig.

**kwdict

All other arguments are passed to ax.plot.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.

plot_weights(fig=None, ax=None, show_bounds=False, resampled=False, filled=False, **kw)#

Plot the sample weights as a function of the prior volume.

Plots the importance weights \(\hat{w}_i\) of all samples as a function of the (log) prior volume \(\ln(X)\).

Parameters:
figNone or matplotlib.figure.Figure, optional

Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.

axNone or matplotlib.axes.Axes, optional

An axes instance on which to plot the weights. Should be a part of the given fig.

show_boundsbool, optional

Display the location of the weight-bounds used in the computation of the (first) dynamical batch likelihood boundaries, as a horizontal line across max(weights) * maxfrac.

resampledbool, optional

Plot the weights after (equally-weighted) resampling, effectively smoothing the weights.

filledbool, optional

Fill between the plotted weights and the x-axis.

**kwdict

All other arguments are passed to ax.plot.

Returns:
matplotlib.figure.Figure

The corresponding figure, containing all axes and plot artists.
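The "equally-weighted resampling" mentioned for the resampled option refers to drawing a new set of samples in which each original sample appears in proportion to its importance weight. One common way to do this is systematic resampling, sketched below. This is an independent illustration, not the routine used by gcfit or dynesty; `resample_equal` here is a local helper and `rng` is a hypothetical argument added for reproducibility.

```python
import random


def resample_equal(samples, weights, rng=None):
    """Systematic resampling of weighted samples into equal-weight ones."""
    rng = rng or random.Random()
    n = len(samples)
    total = sum(weights)
    # Evenly spaced positions in [0, 1) sharing one random offset.
    offset = rng.random()
    positions = [(offset + j) / n for j in range(n)]
    out, cumulative, idx = [], weights[0] / total, 0
    for pos in positions:
        # Advance to the sample whose cumulative weight covers `pos`.
        while pos >= cumulative and idx < n - 1:
            idx += 1
            cumulative += weights[idx] / total
        out.append(samples[idx])
    return out
```

Samples with zero weight never appear in the output, and each remaining sample is repeated roughly n times its normalized weight.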

print_summary(out=None, content='all', *, N_simruns=100)#

Write a short summary of the run results and metadata.

Write out (to a file or stdout) a short summary of the final median and 1σ parameter values, as well as some metadata surrounding the fitting run setup, such as fixed parameters, and statistics on the run progression, like the effective sample size and efficiency.

Parameters:
outNone or file-like object, str, or pathlib.Path, optional

The file to write out the summary to. If None (default) will be printed to stdout.

content{‘all’, ‘results’, ‘metadata’}

Which parts of the summary to write. If “results”, will print only the parameter values. If “metadata”, will print only the run metadata. If “all” (default), prints both.

N_simrunsint, optional

The number of simulated runs used to compute the means and errors on each parameter, through parameter_{means,vars}.

reset_mask()#

Reset the mask.

slice_on_param(param, lower_lim, upper_lim)#

Set mask based on the value of a certain parameter. Already present masks will be combined. If that’s not desired, masks should be reset (self.reset_mask()) first.
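The combining behaviour can be pictured with plain boolean lists. The sketch below assumes that a True entry marks a sample as excluded, that the limits are inclusive, and that an existing mask is merged with a logical OR. None of these conventions is confirmed by this page, and `slice_mask` is a hypothetical helper rather than part of gcfit.

```python
def slice_mask(values, lower_lim, upper_lim, existing=None):
    """Return a mask that is True for samples to be excluded.

    A sample is excluded when its value falls outside
    [lower_lim, upper_lim], or when `existing` already excludes it.
    """
    new = [not (lower_lim <= v <= upper_lim) for v in values]
    if existing is None:
        return new
    return [a or b for a, b in zip(existing, new)]
```

Applying two slices in a row therefore excludes any sample that fails either cut, which is why the mask should be reset first when only the second cut is wanted.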

property state#

Return a flag representing how far through the fitting this run progressed.

property weights#

Array of weights, based on the default dynesty.weight_function.