gcfit.analysis.NestedRun

class gcfit.analysis.NestedRun(filename, observations=None, group='nested', name=None, *args, **kwargs)

Analysis and visualization of a nested sampling cluster fitting run.

Provides a number of flexible plotting, output and summary methods useful for the analysis of both the procedure and results of a nested sampling fitting run, based on the output file generated by the fitting.

Parameters:

filename (pathlib.Path or str): Path to the run output HDF5 file.
observations (gcfit.Observations or None, optional): The Observations instance corresponding to this cluster. If None (default), an educated guess will be made on the source location (i.e. the restrict_to argument) and the observations will be created based on the cluster name stored in the output.
group (str, optional): Name of the root group in the HDF5 file. Defaults to “nested”, the most likely name used by gcfit.nested_fit.
name (str, optional): Custom name for this run.
*args, **kwargs (dict): All other arguments are passed to _SingleRunAnalysis.

Methods

`__init__`(filename[, observations, group, name])
`add_residuals`(ax, y1, y2, e1, e2[, clrs, ...])	Append an extra axis to ax for plotting residuals.
`get_CImodel`([N, Nprocesses, add_errors, ...])	Return a CIModelVisualizer instance corresponding to this run.
`get_model`([method, add_errors])	Return a single ModelVisualizer instance corresponding to this run.
`parameter_means`([Nruns, sim_runs, ...])	Compute the mean of each parameter with corresponding errors.
`parameter_summary`(*[, N_simruns])	Compute the mean and standard deviation of each parameter.
`parameter_vars`([Nruns, sim_runs, return_samples])	Compute the variance of each parameter with corresponding errors.
`plot_H`([fig, ax])	Plot the information integral H.
`plot_HN`([fig, ax])	Plot the information H by the number of live points N.
`plot_IMF`([fig, ax, show_canonical, ci])	Plot the IMF, based on the alpha exponents.
`plot_KL_divergence`([fig, ax, Nruns, kl_kwargs])
`plot_bounds`(iteration[, fig, show_live])	Plot a "corner plot" approximating the bounds at some iteration.
`plot_evidence`([fig, ax, error])	Plot the estimated evidence.
`plot_marginals`([fig, full_volume, params, ...])	Plot a "corner plot" showcasing the relationships between parameters.
`plot_ncall`([fig, ax])	Plot the number of likelihood calls.
`plot_nlive`([fig, ax])	Plot the number of live points.
`plot_params`([fig, params, posterior_color, ...])	Plot a diagnostic figure of the distributions of parameter samples.
`plot_posterior`(param[, fig, ax, chain, ...])	Plot a smoothed posterior distribution of a single parameter.
`plot_probability`([fig, ax])	Plot the posterior probability.
`plot_weights`([fig, ax, show_bounds, ...])	Plot the sample weights as a function of the prior volume.
`print_summary`([out, content, N_simruns])	Write a short summary of the run results and metadata.
`reset_mask`()	Reset the mask.
`slice_on_param`(param, lower_lim, upper_lim)	Set mask based on the value of a certain parameter.

Attributes

`AIC`	Akaike information criterion.
`BIC`	Bayesian information criterion.
`ESS`	The effective sample size.
`chains`
`cmap`
`mask`	Mask out certain samples, removing them from all analysis.
`results`
`state`	Return a flag representing where in the fitting this run reached.
`weights`	Array of weights, based on the default dynesty.weight_function.

property AIC: Akaike information criterion.

property BIC: Bayesian information criterion.
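
For reference, both criteria have standard textbook definitions in terms of the maximum log-likelihood, the number of free parameters k and the number of data points n. The sketch below implements those textbook formulas only; the function names and input values are hypothetical, and it is an assumption that the AIC and BIC properties follow exactly these conventions.

```python
import math


def aic(max_log_likelihood, n_params):
    """Textbook Akaike information criterion: 2k - 2 ln(L_max)."""
    return 2 * n_params - 2 * max_log_likelihood


def bic(max_log_likelihood, n_params, n_data):
    """Textbook Bayesian information criterion: k ln(n) - 2 ln(L_max)."""
    return n_params * math.log(n_data) - 2 * max_log_likelihood


# Hypothetical values, for illustration only.
print(aic(-120.0, 13))
print(bic(-120.0, 13, 500))
```

Lower values indicate a preferred model under either criterion; the BIC penalizes additional parameters more strongly once n exceeds about 7 (where ln n > 2).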

property ESS: The effective sample size.
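
A common way to define the effective sample size of a set of weighted samples is Kish's estimator, (Σw)² / Σw². The sketch below shows that estimator in plain Python; whether the ESS property uses precisely this definition is an assumption, and the function name is hypothetical.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Equal weights: every sample counts fully.
print(effective_sample_size([1.0, 1.0, 1.0, 1.0]))
# One dominant weight: the sample behaves like a single point.
print(effective_sample_size([1.0, 0.0, 0.0, 0.0]))
```

The estimator is insensitive to the overall normalization of the weights, so normalized and unnormalized weights give the same result.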

add_residuals(ax, y1, y2, e1, e2, clrs=None, res_ax=None, loc='bottom', size='15%', pad=0.1)

Append an extra axis to ax for plotting residuals.

Automatically appends a new axis to the bottom of the given ax, and plots the residuals between the two given quantities (and their errors) on it, as a percentage.

Parameters:

ax (matplotlib.axes.Axes): An axes instance on which to plot this observational data.
y1, y2 (np.ndarray): Arrays of data values to plot the residual between. Residuals are computed as (y2 - y1) / y1.
e1, e2 (np.ndarray): Arrays of errors on each datapoint.
clrs (color, optional): Colour used for all datapoints, passed to errorbar and scatter.
res_ax (matplotlib.axes.Axes, optional): Optionally provide an already created axis to plot residuals on. This is useful for overplotting multiple residuals (i.e. for multiple datasets).
loc ({“left”, “right”, “bottom”, “top”}, optional): Where the new axes is positioned relative to the main axes.
size (str or float, optional): The size of the appended residuals axes, with respect to the primary axes. See mpl_toolkits.axes_grid1.axes_divider.AxesDivider.append_axes for more information. Defaults to “15%”.
pad (float, optional): Padding between the axes. Defaults to 0.1.

Returns:

matplotlib.axes.Axes: The created axes instance containing the residuals plot.
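
The percentage residual described above, (y2 - y1) / y1, can be sketched without any plotting as follows. The residual formula comes from the parameter description; the error propagation (first-order, assuming independent errors on y1 and y2) is an assumption and may differ from what add_residuals actually draws. The function name is hypothetical.

```python
import math


def percent_residuals(y1, y2, e1, e2):
    """Return percentage residuals 100 * (y2 - y1) / y1 and their errors.

    Errors use first-order propagation with independent e1 and e2
    (an assumption, not necessarily the library's choice).
    """
    res, err = [], []
    for a, b, ea, eb in zip(y1, y2, e1, e2):
        res.append(100.0 * (b - a) / a)
        # d/dy2 = 1 / y1 and d/dy1 = -y2 / y1**2
        err.append(100.0 * math.hypot(eb / a, b * ea / a**2))
    return res, err


res, err = percent_residuals([2.0, 4.0], [3.0, 3.0], [0.0, 0.0], [0.2, 0.4])
print(res, err)
```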

get_CImodel(N=100, Nprocesses=1, add_errors=False, shuffle=True, load=False, progress=False)

Return a CIModelVisualizer instance corresponding to this run.

If load is False, the visualizer is initialized through the CIModelVisualizer.from_chain classmethod, using the chain from this run and N samples. Otherwise, the CIModelVisualizer.load classmethod is used instead, on the assumption that a CI model has already been created and saved to this same file, under the model group.

Parameters:

N (int, optional): The number of samples to use in computing the confidence intervals.
Nprocesses (int, optional): The number of processes to use in a multiprocessing.Pool passed to the CI model initializer. Defaults to a single process.
add_errors (bool, optional): Optionally add the statistical and sampling errors, not normally accounted for, to the chain of samples used (using self._sim_errors(1)).
shuffle (bool, optional): Optionally shuffle the chains. This may be useful if N is too small to be representative of the full (reweighted) posteriors, and the final samples in the chain are nearly equal (due to their high weights).
load (bool, optional): If True, will attempt to load a CI model, rather than creating a new one.
progress (bool, optional): Optionally display a tqdm loading bar when creating CIs, if load is False. Passed to the model's from_chain method as the ‘verbose’ argument.

Returns:

CIModelVisualizer: The created model visualization (with confidence intervals) object.

get_model(method='mean', add_errors=False)

Return a single ModelVisualizer instance corresponding to this run.

The visualizer is initialized through the ModelVisualizer.from_chain classmethod, with the chain from this run and the method given here.

Parameters:

method ({‘median’, ‘mean’, ‘final’}, optional): The method used to compute a single theta set from the chain. Defaults to ‘mean’.
add_errors (bool, optional): Optionally add the statistical and sampling errors, not normally accounted for, to the chain of samples used (using self._sim_errors(1)).

Returns:

ModelVisualizer: The created model visualization object.

property mask: Mask out certain samples, removing them from all analysis.

parameter_means(Nruns=250, sim_runs=None, return_samples=True)

Compute the mean of each parameter with corresponding errors.

Returns the mean of each parameter's posterior, with the corresponding error on this statistic. The uncertainties come from the two main sources of error in nested sampling: statistical errors associated with the uncertainties surrounding the prior volume, and sampling errors associated with the integral over the parameters of interest.

These errors can be computed using the standard deviation of the mean from a number of “simulated” (resampled and jittered) runs based on this run. See https://dynesty.readthedocs.io/en/latest/errors.html for a more thorough description.

Parameters:

Nruns (int, optional): The number of simulated runs to use to estimate the uncertainties.
sim_runs (None or list of dynesty.Results, optional): A list of simulated runs to use. A precomputed list of runs may be provided, otherwise they will be computed using _sim_errors.
return_samples (bool, optional): Optionally also return the full array of parameter means from each simulated run.

Returns:

mean (np.ndarray[Nparams]): Mean values of each parameter.
err (np.ndarray[Nparams]): Errors on the mean of each parameter.
means_arr (np.ndarray[Nruns, Nparams]): The mean values of each parameter for each simulated run. Only returned if return_samples is True.
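
The idea of estimating the error on a posterior mean from a set of simulated runs can be sketched for a single parameter as below. This is a minimal illustration, not the library's implementation: each simulated run is represented here by a hypothetical (samples, weights) pair rather than a dynesty.Results object, and taking the central value as the average over the simulated runs is an assumption.

```python
import statistics


def weighted_mean(samples, weights):
    """Weighted mean of one parameter's samples."""
    total = sum(weights)
    return sum(w * x for x, w in zip(samples, weights)) / total


def mean_with_error(sim_runs):
    """Mean, error and per-run means from a list of (samples, weights) runs.

    The error is the standard deviation of the per-run means.
    """
    means = [weighted_mean(s, w) for s, w in sim_runs]
    return statistics.mean(means), statistics.stdev(means), means


# Two tiny hypothetical "simulated runs" of one parameter.
runs = [([1.0, 3.0], [1.0, 1.0]), ([2.0, 4.0], [1.0, 1.0])]
mean, err, means_arr = mean_with_error(runs)
print(mean, err, means_arr)
```

In practice many more runs are needed for a stable error estimate, which is why Nruns defaults to 250.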

parameter_summary(*, N_simruns=100)

Compute the mean and standard deviation of each parameter.

Computes and returns a dictionary with the mean and standard deviation of each parameter, as given by parameter_means and np.sqrt(np.diag(parameter_vars)).

Parameters:

N_simruns (int, optional): The number of simulated runs used to compute the means and errors on each parameter, through parameter_{means,vars}.
label_fixed (bool, optional): If True, adds “ (fixed)” to the end of any parameters which were fixed during fitting.

Returns:

dict: Dictionary of parameter labels and 2-tuples of mean and standard deviations.
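
The combination described above, a mean paired with the square root of the matching diagonal element of the covariance matrix, can be sketched as follows. The parameter labels and numbers are hypothetical, and plain nested lists stand in for the NumPy arrays returned by parameter_means and parameter_vars.

```python
import math


def summarize(labels, means, cov):
    """Map each label to (mean, std), with std = sqrt of the covariance diagonal."""
    return {
        label: (mean, math.sqrt(cov[i][i]))
        for i, (label, mean) in enumerate(zip(labels, means))
    }


# Hypothetical two-parameter example.
labels = ["W0", "M"]
means = [6.0, 1.2]
cov = [[0.25, 0.01], [0.01, 0.04]]
print(summarize(labels, means, cov))
```

Only the diagonal of the covariance matrix is used, so correlations between parameters are not reflected in the summary.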

parameter_vars(Nruns=250, sim_runs=None, return_samples=True)

Compute the variance of each parameter with corresponding errors.

Returns the covariance matrix of the parameter posteriors, with the corresponding error on this statistic. The uncertainties come from the two main sources of error in nested sampling: statistical errors associated with the uncertainties surrounding the prior volume, and sampling errors associated with the integral over the parameters of interest.

These errors can be computed using the standard deviation of the variance from a number of “simulated” (resampled and jittered) runs based on this run. See https://dynesty.readthedocs.io/en/latest/errors.html for a more thorough description.

Parameters:

Nruns (int, optional): The number of simulated runs to use to estimate the uncertainties.
sim_runs (None or list of dynesty.Results, optional): A list of simulated runs to use. A precomputed list of runs may be provided, otherwise they will be computed using _sim_errors.
return_samples (bool, optional): Optionally also return the full array of parameter variances from each simulated run.

Returns:

vars (np.ndarray[Nparams, Nparams]): Covariance matrix for all parameters.
err (np.ndarray[Nparams, Nparams]): Errors on the covariance matrix for all parameters.
vars_arr (np.ndarray[Nruns, Nparams, Nparams]): The covariance matrix for each simulated run.
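
For a single run, a weighted covariance matrix of the kind returned here can be sketched in plain Python as below. This is an illustrative, biased (no small-sample correction) estimator written with nested lists; the function name is hypothetical, and it is an assumption that parameter_vars computes its per-run covariance in the same way.

```python
def weighted_covariance(samples, weights):
    """Weighted mean vector and covariance matrix of parameter samples.

    samples: list of parameter vectors, one per sample.
    weights: one non-negative weight per sample.
    """
    total = sum(weights)
    ndim = len(samples[0])
    mean = [
        sum(w * s[j] for s, w in zip(samples, weights)) / total
        for j in range(ndim)
    ]
    cov = [[0.0] * ndim for _ in range(ndim)]
    for s, w in zip(samples, weights):
        for j in range(ndim):
            for k in range(ndim):
                cov[j][k] += w * (s[j] - mean[j]) * (s[k] - mean[k]) / total
    return mean, cov


# Two perfectly anti-correlated hypothetical parameters.
mean, cov = weighted_covariance([[0.0, 1.0], [2.0, -1.0]], [1.0, 1.0])
print(mean, cov)
```

Repeating this over each simulated run and taking the element-wise standard deviation of the resulting matrices gives an error estimate with the same [Nparams, Nparams] shape.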

plot_H(fig=None, ax=None, **kw)

Plot the information integral H.

Plots the “information” gain (H) provided by the updating of a given prior, as characterized by the Kullback-Leibler divergence, as a function of the (log) prior volume.

Parameters:

fig (None or matplotlib.figure.Figure, optional): Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
ax (None or matplotlib.axes.Axes, optional): An axes instance on which to plot this information. Should be a part of the given fig.
**kw (dict): All other arguments are passed to ax.plot.

Returns:

matplotlib.figure.Figure: The corresponding figure, containing all axes and plot artists.

Notes

\[H \equiv \int_{\Omega_{\boldsymbol{\Theta}}} P(\boldsymbol{\Theta}) \ln\frac{P(\boldsymbol{\Theta})}{\pi(\boldsymbol{\Theta})}\, d\boldsymbol{\Theta}\]
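
Since the posterior and prior are related through Bayes' theorem by P(Θ)/π(Θ) = L(Θ)/Z, the integral above can be approximated from a nested sampling run as a weighted sum over the dead points. The sketch below assumes each point has a likelihood L_i and an associated prior-volume width w_i; the function name and inputs are hypothetical, and this is not necessarily how the plotted H values are obtained.

```python
import math


def information(likelihoods, volume_widths):
    """Estimate the evidence Z and information H from a nested sampling run.

    Z ~ sum_i L_i * w_i
    H ~ sum_i (L_i * w_i / Z) * ln(L_i / Z)
    """
    z = sum(L * w for L, w in zip(likelihoods, volume_widths))
    h = sum(
        (L * w / z) * math.log(L / z)
        for L, w in zip(likelihoods, volume_widths)
        if L > 0  # points with zero likelihood contribute nothing
    )
    return z, h


# A flat likelihood carries no information beyond the prior.
print(information([1.0, 1.0], [0.5, 0.5]))
# A peaked likelihood gives H > 0.
print(information([1.5, 0.5], [0.5, 0.5]))
```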

plot_HN(fig=None, ax=None, **kw)

Plot the information H by the number of live points N.

Plots the “information” gain (H) multiplied by the current number of live points, as a function of run iteration. Intended to compare against one of the termination conditions described by Skilling (2006).

Parameters:

fig (None or matplotlib.figure.Figure, optional): Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
ax (None or matplotlib.axes.Axes, optional): An axes instance on which to plot this information. Should be a part of the given fig.
**kw (dict): All other arguments are passed to ax.plot.

Returns:

matplotlib.figure.Figure: The corresponding figure, containing all axes and plot artists.
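
The comparison this plot is meant to support can be sketched numerically. Each iteration shrinks the log prior volume by roughly 1/N, and the bulk of the posterior mass sits near ln X ≈ -H, so about N·H iterations are needed to reach it. One hedged reading of the Skilling (2006) criterion is therefore to continue until the iteration count clearly exceeds N·H. The function name and the safety factor below are hypothetical choices for illustration, not values taken from gcfit or dynesty.

```python
def exceeds_information_scale(iteration, nlive, information, factor=2.0):
    """Return True once the iteration count clearly exceeds N * H.

    factor is a hypothetical safety margin; 'clearly exceeds' has no
    single agreed-upon value.
    """
    return iteration > factor * nlive * information


# With 500 live points and H = 4, N * H = 2000 iterations.
print(exceeds_information_scale(3000, 500, 4.0))
print(exceeds_information_scale(5000, 500, 4.0))
```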

plot_IMF(fig=None, ax=None, show_canonical='all', ci=True): Plot the initial mass function (IMF), based on the alpha exponents.

plot_bounds(iteration, fig=None, show_live=False, **kw)

Plot a “corner plot” approximating the bounds at some iteration.

Plots an Nparam-by-Nparam lower-triangular “corner” plot showing the approximate extent of the bounding distributions of each parameter at a given iteration. Uses the plotting.cornerbound function built into dynesty.

Parameters:

iteration (int or list of int): The iterations of the nested sampling run to show the bounding distributions for. If multiple iterations are given, they will be overplotted, in order, in different colours.
fig (None or matplotlib.figure.Figure, optional): Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.
show_live (bool, optional): Also show the live points at this given iteration. Note that this option does not currently appear to behave correctly.
**kw (dict): All other arguments are passed to dynesty.plotting.cornerbound.

Returns:

matplotlib.figure.Figure: The corresponding figure, containing all axes and plot artists.