gcfit.analysis.NestedRun#
- class gcfit.analysis.NestedRun(filename, observations=None, group='nested', name=None, *args, **kwargs)#
Analysis and visualization of a nested sampling cluster fitting run.
Provides a number of flexible plotting, output and summary methods useful for the analysis of both the procedure and results of a nested sampling fitting run, based on the output file generated by the fitting.
- Parameters:
- filenamepathlib.Path or str
Path to the run output HDF5 file.
- observationsgcfit.Observations or None, optional
The Observations instance corresponding to this cluster. If None (default), an educated guess will be made on the source location (i.e. the restrict_to argument) and the observations will be created based on the cluster name stored in the output.
- groupstr, optional
Name of the root group in the HDF5 file. Defaults to “nested”, the most likely name used by gcfit.nested_fit.
- namestr, optional
Custom name for this run.
- *args, **kwargsdict
All other arguments are passed to _SingleRunAnalysis.
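A minimal usage sketch, based only on the signatures documented on this page. The file name run_output.hdf and the helper summarize_run are hypothetical, and the import is deferred so that the helper can be defined without gcfit installed.

```python
def summarize_run(filename="run_output.hdf"):
    """Load a nested sampling run and summarize it (hypothetical helper)."""
    # Deferred import: gcfit is only needed once the helper is called.
    from gcfit.analysis import NestedRun

    # observations is left as None, so it is guessed from the stored cluster name.
    run = NestedRun(filename, group="nested")

    # Print both the parameter values and the run metadata to stdout.
    run.print_summary(content="all")

    # Dictionary of parameter labels mapped to (mean, std. dev.) 2-tuples.
    return run.parameter_summary(label_fixed=True)
```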
Methods
__init__(filename[, observations, group, name])
add_residuals(ax, y1, y2, e1, e2[, clrs, ...])Append an extra axis to ax for plotting residuals.
get_CImodel([N, Nprocesses, add_errors, ...])Return a CIModelVisualizer instance corresponding to this run.
get_model([method, add_errors])Return a single ModelVisualizer instance corresponding to this run.
parameter_means([Nruns, sim_runs, ...])Compute the mean of each parameter with corresponding errors.
parameter_summary(*[, N_simruns, label_fixed])Compute the mean and std.dev. on each parameter.
parameter_vars([Nruns, sim_runs, return_samples])Compute the variance of each parameter with corresponding errors.
plot_H([fig, ax])Plot the information integral H.
plot_HN([fig, ax])Plot the information H by the number of live points N.
plot_IMF([fig, ax, show_canonical, ci])Plot the IMF, based on the alpha exponents.
plot_KL_divergence([fig, ax, Nruns, kl_kwargs])
plot_bounds(iteration[, fig, show_live])Plot a "corner plot" approximating the bounds at some iteration.
plot_evidence([fig, ax, error])Plot the estimated evidence.
plot_marginals([fig, full_volume])Plot a "corner plot" showcasing the relationships between parameters.
plot_ncall([fig, ax])Plot the number of likelihood calls.
plot_nlive([fig, ax])Plot the number of live points.
plot_params([fig, params, posterior_color, ...])Plot a diagnostic figure of the distributions of parameter samples.
plot_posterior(param[, fig, ax, chain, ...])Plot a smoothed posterior distribution of a single parameter.
plot_probability([fig, ax])Plot the posterior probability.
plot_weights([fig, ax, show_bounds, ...])Plot the sample weights as a function of the prior volume.
print_summary([out, content, N_simruns])Write a short summary of the run results and metadata.
reset_mask()Reset the mask.
slice_on_param(param, lower_lim, upper_lim)Set mask based on the value of a certain parameter.
Attributes
AICAkaike information criterion.
BICBayesian information criterion.
ESSThe effective sample size.
chains
cmap
maskMask out certain samples, removing them from all analysis.
results
stateReturn a flag representing where in the fitting this run reached.
weightsArray of weights, based on the default dynesty.weight_function.
- property AIC#
Akaike information criterion.
- property BIC#
Bayesian information criterion.
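This page gives no formulas for the two criteria. The textbook definitions are sketched below for orientation only. How gcfit obtains the maximum log-likelihood, the number of free parameters and the number of data points is not stated here, so the function names and arguments are illustrative.

```python
import math


def aic(max_loglike, n_params):
    """Akaike information criterion: 2k - 2 ln(L_max)."""
    return 2 * n_params - 2 * max_loglike


def bic(max_loglike, n_params, n_data):
    """Bayesian information criterion: k ln(n) - 2 ln(L_max)."""
    return n_params * math.log(n_data) - 2 * max_loglike
```

Lower values indicate a preferred model in both cases. BIC penalizes additional parameters more strongly than AIC once n exceeds about 7 data points, since ln(n) is then greater than 2.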
- property ESS#
The effective sample size.
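The estimator behind this property is not described on this page. One common definition for importance-weighted samples is Kish's effective sample size, sketched here as an assumption rather than as gcfit's implementation.

```python
def kish_ess(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq
```

Equal weights give an ESS equal to the number of samples, while a single dominant weight drives the ESS towards 1.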
- add_residuals(ax, y1, y2, e1, e2, clrs=None, res_ax=None, loc='bottom', size='15%', pad=0.1)#
Append an extra axis to ax for plotting residuals.
Automatically appends a new axis to the bottom of the given ax, and plots the residuals between the two given quantities (and their errors) on it, as a percentage.
- Parameters:
- axmatplotlib.axes.Axes
An axes instance on which to plot this observational data.
- y1, y2np.ndarray
Arrays of data values to plot the residual between. Residuals are computed as (y2 - y1) / y1.
- e1, e2np.ndarray
Arrays of errors on each datapoint.
- clrscolor, optional
Colour used for all datapoints, passed to errorbar and scatter.
- res_axmatplotlib.axes.Axes, optional
Optionally provide an already created axis to plot residuals on. This is useful for overplotting multiple residuals (e.g. for multiple datasets).
- loc{“left”, “right”, “bottom”, “top”}, optional
Where the new axes is positioned relative to the main axes.
- sizestr or float, optional
The size of the appended residuals axes, with respect to the primary axes. See mpl_toolkits.axes_grid1.axes_divider.AxesDivider.append_axes for more information. Defaults to “15%”.
- padfloat, optional
Padding between the axes. Defaults to 0.1.
- Returns:
- matplotlib.axes.Axes
The created axes instance containing the residuals plot.
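The residual definition above, (y2 - y1) / y1 expressed as a percentage, can be sketched in plain Python. How the method combines e1 and e2 is not stated on this page, so the first-order error propagation used here is an assumption, and percent_residuals is a hypothetical helper name.

```python
import math


def percent_residuals(y1, y2, e1, e2):
    """Return percentage residuals of y2 relative to y1, with propagated errors."""
    res, err = [], []
    for a, b, ea, eb in zip(y1, y2, e1, e2):
        # Residual as documented: (y2 - y1) / y1, scaled to percent.
        res.append(100.0 * (b - a) / a)
        # Assumed first-order propagation for r = y2 / y1 - 1.
        err.append(100.0 * math.sqrt((eb / a) ** 2 + (b * ea / a**2) ** 2))
    return res, err
```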
- get_CImodel(N=100, Nprocesses=1, add_errors=False, shuffle=True, load=False)#
Return a CIModelVisualizer instance corresponding to this run.
If load is False, the visualizer is initialized through the CIModelVisualizer.from_chain classmethod, with the chain from this run and using N samples. Otherwise, the CIModelVisualizer.load classmethod is used, assuming a CI model has already been created and saved to this same file, under the model group.
- Parameters:
- Nint, optional
The number of samples to use in computing the confidence intervals.
- Nprocessesint, optional
The number of processes to use in a multiprocessing.Pool passed to the CI model initializer. Defaults to only 1 cpu.
- add_errorsbool, optional
Optionally add the statistical and sampling errors, not normally accounted for, to the chain of samples used (using self._sim_errors(1)).
- shufflebool, optional
Optionally shuffle the chains. This may be useful if N is too small to be representative of the full (reweighted) posteriors, and the final samples in the chain are nearly equal (due to their high weights).
- loadbool, optional
If True, will attempt to load a CI model, rather than creating a new one.
- Returns:
- CIModelVisualizer
The created model visualization (with confidence intervals) object.
- get_model(method='mean', add_errors=False)#
Return a single ModelVisualizer instance corresponding to this run.
The visualizer is initialized through the ModelVisualizer.from_chain classmethod, with the chain from this run and the method given here.
- Parameters:
- method{‘median’, ‘mean’, ‘final’}, optional
The method used to compute a single theta set from the chain. Defaults to ‘mean’.
- add_errorsbool, optional
Optionally add the statistical and sampling errors, not normally accounted for, to the chain of samples used (using self._sim_errors(1)).
- Returns:
- ModelVisualizer
The created model visualization object.
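The three method options each reduce a chain of samples to a single theta set. A rough sketch of that reduction follows, assuming an unweighted chain stored as a list of samples. The helper reduce_chain is hypothetical, and whether gcfit applies sample weights at this step is not stated here.

```python
import statistics


def reduce_chain(chain, method="mean"):
    """Collapse a chain (list of samples, each a list of parameters) to one theta."""
    # Transpose so that each entry holds all samples of one parameter.
    columns = list(zip(*chain))
    if method == "mean":
        return [statistics.fmean(col) for col in columns]
    if method == "median":
        return [statistics.median(col) for col in columns]
    if method == "final":
        # The last sample in the chain, taken as-is.
        return list(chain[-1])
    raise ValueError(f"unknown method: {method!r}")
```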
- property mask#
Mask out certain samples, removing them from all analysis.
- parameter_means(Nruns=250, sim_runs=None, return_samples=True)#
Compute the mean of each parameter with corresponding errors.
Returns the means of each parameter posterior estimation, with the corresponding error on this statistic. The uncertainties come from the two main sources of errors in nested sampling: statistical errors associated with the uncertainties surrounding the prior volume, and sampling errors associated with the integral over the parameters of interest.
These errors can be computed using the standard deviation of the mean from a number of “simulated” (resampled and jittered) runs based on this run. See https://dynesty.readthedocs.io/en/latest/errors.html for a more thorough description.
- Parameters:
- Nrunsint, optional
The number of simulated runs to use to estimate the uncertainties.
- sim_runsNone or list of dynesty.Results
A list of simulated runs to use. A precomputed list of runs may be provided, otherwise they will be computed using _sim_errors.
- return_samplesbool, optional
Optionally also return the full array of parameter means from each simulated run.
- Returns:
- meannp.ndarray[Nparams]
Mean values of each parameter.
- errnp.ndarray[Nparams]
Errors on the mean of each parameter.
- means_arrnp.ndarray[Nruns, Nparams]
The mean values of each parameter for each simulated run.
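The procedure described above can be sketched in two steps: a weighted mean per run, then the scatter of those means across the simulated runs. This is an illustration under assumptions, not gcfit's code. The helper names are hypothetical, the simulated runs are taken as given, and the population standard deviation is assumed for the error.

```python
import statistics


def weighted_mean(samples, weights):
    """Weighted mean of each parameter; samples is [Nsamples][Nparams]."""
    total = sum(weights)
    n_params = len(samples[0])
    return [
        sum(w * s[j] for s, w in zip(samples, weights)) / total
        for j in range(n_params)
    ]


def mean_with_error(means_arr):
    """Combine per-run means ([Nruns][Nparams]) into a mean and its error."""
    columns = list(zip(*means_arr))
    mean = [statistics.fmean(col) for col in columns]
    # Error on the mean estimated from the scatter between simulated runs.
    err = [statistics.pstdev(col) for col in columns]
    return mean, err
```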
- parameter_summary(*, N_simruns=100, label_fixed=False)#
Compute the mean and std.dev. on each parameter.
Computes and returns a dictionary with the mean and standard deviation of each parameter, as given by parameter_means and np.sqrt(np.diag(parameter_vars)).
- Parameters:
- N_simrunsint, optional
The number of simulated runs used to compute the means and errors on each parameter, through parameter_{means,vars}.
- label_fixedbool, optional
If True, adds “ (fixed)” to the end of any parameters which were fixed during fitting.
- Returns:
- dict
Dictionary of parameter labels and 2-tuples of mean and standard deviations.
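The returned dictionary pairs each mean with the square root of the matching diagonal element of the covariance matrix. A small sketch of that assembly is shown below. The helper build_summary and the labels p0 and p1 are hypothetical, and the way fixed parameters are identified is an assumption.

```python
import math


def build_summary(labels, means, cov, fixed=(), label_fixed=False):
    """Map each label to (mean, std. dev.), taking std. dev. from diag(cov)."""
    summary = {}
    for i, label in enumerate(labels):
        # Optionally tag parameters that were held fixed during fitting.
        key = f"{label} (fixed)" if (label_fixed and label in fixed) else label
        summary[key] = (means[i], math.sqrt(cov[i][i]))
    return summary
```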
- parameter_vars(Nruns=250, sim_runs=None, return_samples=True)#
Compute the variance of each parameter with corresponding errors.
Returns the covariance array for each parameter posterior estimation, with the corresponding error on this statistic. The uncertainties come from the two main sources of errors in nested sampling: statistical errors associated with the uncertainties surrounding the prior volume, and sampling errors associated with the integral over the parameters of interest.
These errors can be computed using the standard deviation of the variance from a number of “simulated” (resampled and jittered) runs based on this run. See https://dynesty.readthedocs.io/en/latest/errors.html for a more thorough description.
- Parameters:
- Nrunsint, optional
The number of simulated runs to use to estimate the uncertainties.
- sim_runsNone or list of dynesty.Results
A list of simulated runs to use. A precomputed list of runs may be provided, otherwise they will be computed using _sim_errors.
- return_samplesbool, optional
Optionally also return the full array of parameter variances from each simulated run.
- Returns:
- varsnp.ndarray[Nparams, Nparams]
Covariance matrix for all parameters.
- errnp.ndarray[Nparams, Nparams]
Errors on the covariance matrix for all parameters.
- vars_arrnp.ndarray[Nruns, Nparams, Nparams]
The covariance matrix for each simulated run.
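For a single set of weighted samples, a covariance matrix of shape [Nparams, Nparams] can be computed as sketched below. The plain weighted (biased) estimator is used here for brevity. The estimator actually used for each simulated run is not stated on this page, and weighted_covariance is a hypothetical helper.

```python
def weighted_covariance(samples, weights):
    """Weighted covariance matrix; samples is [Nsamples][Nparams]."""
    total = sum(weights)
    n = len(samples[0])
    mean = [
        sum(w * s[j] for s, w in zip(samples, weights)) / total
        for j in range(n)
    ]
    cov = [[0.0] * n for _ in range(n)]
    for s, w in zip(samples, weights):
        # Deviation of this sample from the weighted mean.
        d = [s[j] - mean[j] for j in range(n)]
        for j in range(n):
            for k in range(n):
                cov[j][k] += w * d[j] * d[k] / total
    return cov
```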
- plot_H(fig=None, ax=None, **kw)#
Plot the information integral H.
Plots the “information” gain (H) provided by the updating of a given prior, as characterized by the Kullback-Leibler divergence, as a function of the (log) prior volume.
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this information. Should be a part of the given fig.
- **kwdict
All other arguments are passed to ax.plot.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
Notes
\[H \equiv \int_{\Omega_{\boldsymbol{\Theta}}} P(\boldsymbol{\Theta}) \ln\frac{P(\boldsymbol{\Theta})}{\pi(\boldsymbol{\Theta})}\, d\boldsymbol{\Theta}\]
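Substituting P(Θ) = L(Θ)π(Θ)/Z and dX = π(Θ)dΘ into the integral above gives H = (1/Z) ∫ L ln L dX − ln Z, which can be evaluated as a sum over samples. The sketch below assumes likelihood values and prior-volume widths are already available, and the helper name is hypothetical.

```python
import math


def evidence_and_information(likelihoods, dX):
    """Return (Z, H) from likelihood values L_i and prior-volume widths dX_i."""
    Z = sum(L * dx for L, dx in zip(likelihoods, dX))
    # Sum of L_i ln(L_i) dX_i, with 0 * ln(0) taken as 0.
    integral = sum(
        L * math.log(L) * dx for L, dx in zip(likelihoods, dX) if L > 0
    )
    H = integral / Z - math.log(Z)
    return Z, H
```

A flat likelihood yields H = 0 (the posterior equals the prior), while a likelihood confined to half of the prior volume yields H = ln 2.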
- plot_HN(fig=None, ax=None, **kw)#
Plot the information H by the number of live points N.
Plots the “information” gain (H) multiplied by the current number of live points, as a function of run iteration. Intended to compare against one of the termination conditions described by Skilling (2006).
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this information. Should be a part of the given fig.
- **kwdict
All other arguments are passed to ax.plot.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
- plot_IMF(fig=None, ax=None, show_canonical='all', ci=True)#
Plot the IMF, based on the alpha exponents.
- plot_bounds(iteration, fig=None, show_live=False, **kw)#
Plot a “corner plot” approximating the bounds at some iteration.
Plots a Nparam-Nparam lower-triangular “corner” plot showing the approximate extent of the bounding distributions of each parameter at a given iteration. Uses the plotting.cornerbound function built into dynesty.
- Parameters:
- iterationint or list of int
The iterations of the nested sampling run to show the bounding distributions for. If multiple iterations are given, they will be overplotted, in order, in different colours.
- figNone or matplotlib.figure.Figure, optional
Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.
- show_livebool, optional
Also show the live points at this given iteration. This option does not currently appear to behave correctly.
- **kwdict
All other arguments are passed to dynesty.plotting.cornerbound.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
See also
dynesty.plotting.cornerboundBase dynesty function used for plotting of each iteration.
- plot_evidence(fig=None, ax=None, error=False, **kw)#
Plot the estimated evidence.
Plots the estimated (log) Bayesian evidence as a function of the (log) prior volume.
Nested sampling provides a continuous estimate of the Bayesian evidence based on the integral over the prior volume contained within a given iso-likelihood contour.
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this evidence. Should be a part of the given fig.
- errorbool, optional
Optionally also show the error on the evidence estimation as contours on the plot.
- **kwdict
All other arguments are passed to ax.plot.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
Notes
In nested sampling the evidence integral can be numerically approximated using a given set of dead points as:
\[\mathcal{Z} = \int_{0}^{1} \mathcal{L}(X)\,dX \approx \sum_{i=1}^{N}\,f(\mathcal{L}_i)\,f(\Delta X_i) \equiv \sum_{i=1}^{N}\,\hat{w}_i\]
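The sum above can be sketched in log-space to avoid underflow. The simplest choices are assumed here: f(L_i) = L_i and ΔX_i = X_{i-1} − X_i with X_0 = 1. Other quadrature rules (such as the trapezoid rule) are possible, the contribution of any remaining live points is ignored, and log_evidence is a hypothetical helper.

```python
import math


def log_evidence(logl, logvol):
    """Return (ln Z, ln w_i) from ln L_i and ln X_i of the dead points, in order."""
    log_weights = []
    prev_x = 1.0  # X_0 = 1: the full prior volume
    for ll, lv in zip(logl, logvol):
        x = math.exp(lv)
        # ln w_i = ln L_i + ln(X_{i-1} - X_i)
        log_weights.append(ll + math.log(prev_x - x))
        prev_x = x
    # Log-sum-exp over the weights for numerical stability.
    peak = max(log_weights)
    logz = peak + math.log(sum(math.exp(lw - peak) for lw in log_weights))
    return logz, log_weights
```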
- plot_marginals(fig=None, full_volume=False, **corner_kw)#
Plot a “corner plot” showcasing the relationships between parameters.
Plots a Nparam-Nparam lower-triangular “corner” marginal plot showing the projections of all sampled parameter values, using the corner.py package.
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.
- full_volumebool, optional
Use the entire raw chains, not resampled based on the weights. This will not show correct posteriors.
- **corner_kwdict
All other arguments are passed to corner.corner.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
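Unless full_volume is set, the chains are resampled according to their weights before plotting. One standard way to draw equally weighted samples is systematic resampling, sketched here. Whether gcfit uses exactly this scheme is not stated on this page, and the helper and its seed argument are hypothetical.

```python
import random


def resample_equal(samples, weights, seed=None):
    """Draw len(samples) equally weighted samples by systematic resampling."""
    rng = random.Random(seed)
    n = len(samples)
    total = sum(weights)
    # Cumulative distribution of the normalized weights.
    cumulative, running = [], 0.0
    for w in weights:
        running += w / total
        cumulative.append(running)
    cumulative[-1] = 1.0  # guard against round-off in the final entry
    # Evenly spaced positions in [0, 1) sharing one random offset.
    offset = rng.random()
    out, j = [], 0
    for i in range(n):
        pos = (offset + i) / n
        while cumulative[j] <= pos:
            j += 1
        out.append(samples[j])
    return out
```

Samples with large weights are repeated and samples with negligible weights are dropped, which is why the raw chains shown by full_volume=True do not represent the posterior.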
- plot_ncall(fig=None, ax=None, **kw)#
Plot the number of likelihood calls.
Plots the total number of likelihood function calls made at each step as a function of the (log) prior volume.
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this number. Should be a part of the given fig.
- **kwdict
All other arguments are passed to ax.step.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
- plot_nlive(fig=None, ax=None, **kw)#
Plot the number of live points.
Plots the current number of live points, as a function of the (log) prior volume. This should remain constant until dynamic sampling begins, increase incrementally, and then decrease smoothly until all live points are removed.
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this number. Should be a part of the given fig.
- **kwdict
All other arguments are passed to ax.plot.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
- plot_params(fig=None, params=None, *, posterior_color='tab:blue', posterior_border=True, show_weight=True, fill_type='weights', ylims=None, truths=None, **kw)#
Plot a diagnostic figure of the distributions of parameter samples.
Plots an Nparam-panel figure showcasing the parameter values of all samples, over the iteration domain, as well as a KDE-based smoothed posterior distribution for each parameter.
Provides a diagnostic figure for examining the parameter estimation. This is a modified version of the diagnostic plot first introduced in Higson et al. (2018).
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.
- paramsNone or list of str, optional
The parameters to show on this figure. If None (default) all parameters (including fixed params) will be shown.
- posterior_colorcolor, optional
The colour of the smoothed posterior distributions.
- posterior_borderbool, optional
If False, will remove the axis frame around the smooth posterior distribution to the right of each panel.
- show_weightbool, optional
Plot the (resampled) weights above the parameter samples columns, using the plot_weights method.
- fill_type{weights, iters, id, batch, bound}, optional
The mapping used to colour all points within the samples axes. Defaults to ‘weights’.
- ylimslist[Nparam] of 2-tuples, optional
Used to set the upper and lower y axis-limits on each parameter.
- truthsnp.ndarray[Nparam] or np.ndarray[Nparam, 3], optional
Optionally indicate the “true” values as horizontal lines on the posterior frames. If [Nparam, 3], the values in each row will be taken as the median, lower limit and upper limit.
- **kwdict
All other arguments are passed to ax.scatter.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
- plot_posterior(param, fig=None, ax=None, chain=None, flipped=True, truth=None, truth_ci=None, *args, **kwargs)#
Plot a smoothed posterior distribution of a single parameter.
Plots a Gaussian-KDE smoothed posterior probability distribution of a given parameter. Designed mainly to be used within the plot_params method, but can be used on its own.
- Parameters:
- paramstr
Name of the parameter to plot.
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this posterior. Should be a part of the given fig.
- chainnp.ndarray, optional
Optionally supply a (flattened) array of samples to create the posterior from. By default, will load the full chain using self._get_chains.
- flippedbool, optional
If True (default) the posterior will be flipped on its side, attached to the left-axis.
- truthfloat, optional
Optionally indicate the “true” value as horizontal lines on the posterior.
- truth_ci2-tuple of float, optional
Optionally shade between the lower and upper limits of the “truth” values, using plt.axhspan.
- **kwargsdict
All other arguments are passed to the ax.fill_between function.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
- plot_probability(fig=None, ax=None, **kw)#
Plot the posterior probability.
Plots the total (sum of all components) logged posterior probability of the nested sampler as a function of (log) prior volume.
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place the ax on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot this probability. Should be a part of the given fig.
- **kwdict
All other arguments are passed to ax.plot.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
- plot_weights(fig=None, ax=None, show_bounds=False, resampled=False, filled=False, **kw)#
Plot the sample weights as a function of the prior volume.
Plots the importance weights \(\hat{w}_i\) of all samples as a function of the (log) prior volume \(\ln(X)\).
- Parameters:
- figNone or matplotlib.figure.Figure, optional
Figure to place all axes on. If None (default), a new figure will be created, otherwise the given figure should be empty, or already have the correct number of axes. See _RunAnalysis._setup_multi_artist for more details.
- axNone or matplotlib.axes.Axes, optional
An axes instance on which to plot the weights. Should be a part of the given fig.
- show_boundsbool, optional
Display the location of the weight-bounds used in the computation of the (first) dynamical batch likelihood boundaries, as a horizontal line across max(weights) * maxfrac.
- resampledbool, optional
Plot the weights after (equally-weighted) resampling, effectively smoothing the weights.
- filledbool, optional
Fill between the plotted weights and the x-axis.
- **kwdict
All other arguments are passed to ax.plot.
- Returns:
- matplotlib.figure.Figure
The corresponding figure, containing all axes and plot artists.
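The line drawn by show_bounds sits at max(weights) * maxfrac. The sketch below computes that threshold and the first and last samples whose weights exceed it. The value maxfrac=0.8 is an assumption made for illustration, and weight_bounds is a hypothetical helper.

```python
def weight_bounds(weights, maxfrac=0.8):
    """Return the weight threshold and the first/last indices above it."""
    threshold = max(weights) * maxfrac
    above = [i for i, w in enumerate(weights) if w > threshold]
    return threshold, above[0], above[-1]
```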
- print_summary(out=None, content='all', *, N_simruns=100)#
Write a short summary of the run results and metadata.
Write out (to a file or stdout) a short summary of the final median and 1σ parameter values, as well as some metadata surrounding the fitting run setup, such as fixed parameters, and statistics on the run progression, like the effective sample size and efficiency.
- Parameters:
- outNone or file-like object, str, or pathlib.Path, optional
The file to write out the summary to. If None (default) will be printed to stdout.
- content{‘all’, ‘results’, ‘metadata’}
Which parts of the summary to write. If “results”, will print only the parameter values. If “metadata”, will print only the run metadata. If “all” (default), prints both.
- N_simrunsint, optional
The number of simulated runs used to compute the means and errors on each parameter, through parameter_{means,vars}.
- reset_mask()#
Reset the mask.
- slice_on_param(param, lower_lim, upper_lim)#
Set mask based on the value of a certain parameter. Already present masks will be combined. If that’s not desired, masks should be reset (self.reset_mask()) first.
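The combining behaviour can be pictured with plain boolean lists. This sketch assumes that True marks a sample as masked out, that samples outside [lower_lim, upper_lim] are the ones masked, and that masks combine with a logical OR. None of these conventions is confirmed on this page, and slice_mask is a hypothetical helper.

```python
def slice_mask(values, lower_lim, upper_lim, existing=None):
    """Mask samples outside [lower_lim, upper_lim], merging with any existing mask."""
    new = [not (lower_lim <= v <= upper_lim) for v in values]
    if existing is None:
        return new
    # A sample stays masked if either the old or the new mask excludes it.
    return [a or b for a, b in zip(existing, new)]
```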
- property state#
Return a flag representing where in the fitting this run reached.
- property weights#
Array of weights, based on the default dynesty.weight_function.