gcfit.core.nested_fit
- gcfit.core.nested_fit(cluster, *, bound_type='multi', sample_type='auto', initial_kwargs=None, batch_kwargs=None, pfrac=1.0, maxfrac=0.8, eff_samples=5000, plat_wt_func=False, Ncpu=2, mpi=False, initials=None, param_priors=None, fixed_params=None, excluded_likelihoods=None, hyperparams=False, savedir=PosixPath('.'), restrict_to=None, compress=False, verbose=False)
Main nested sampling fitting pipeline.
Execute the full nested sampling cluster fitting algorithm.
Based on the given cluster's Observations, determines the relevant likelihoods used to construct a dynamic nested sampler (dynesty).
All sampler results are stored using an HDF file backend, within the savedir directory under the filename “{cluster}_sampler.hdf”. Also stored within this file are various statistics and metadata describing the fitter run.
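As a minimal sketch of the naming convention described above, the output path could be built as follows. The helper name `sampler_output_path` and the cluster name "NGC0104" are illustrative assumptions, not part of gcfit.

```python
from pathlib import Path


def sampler_output_path(cluster, savedir="."):
    """Return the expected HDF output path for a cluster (hypothetical helper)."""
    # Filename pattern taken from the description: "{cluster}_sampler.hdf"
    return Path(savedir) / f"{cluster}_sampler.hdf"


# "NGC0104" and "results" are placeholder values chosen for illustration
print(sampler_output_path("NGC0104", "results"))
```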
The nested sampler begins by sampling an “initial_batch” over the entire prior volume, up to a stopping condition defined in initial_kwargs. It then samples in batches, with each batch adding Nlive_per_batch live points, until reaching a “(Kish) effective sample size” of eff_samples. Each batch samples only between the log-likelihood bounds spanned by the samples whose importance weight exceeds a fraction maxfrac of the peak weight.
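The two quantities in this paragraph can be illustrated with a short, pure-Python sketch. It is not the dynesty implementation: the function names and the sample data are assumptions made for illustration, and dynesty itself may treat the bounds differently (for example by padding them).

```python
def kish_ess(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq if total_sq > 0 else 0.0


def batch_logl_bounds(logl, weights, maxfrac=0.8):
    """Log-likelihood range of samples with weight >= maxfrac * max(weights)."""
    threshold = maxfrac * max(weights)
    selected = [l for l, w in zip(logl, weights) if w >= threshold]
    return min(selected), max(selected)


# Illustrative (made-up) log-likelihoods and importance weights
logl = [-10.0, -5.0, -2.0, -1.0, -0.5, -0.1]
weights = [0.01, 0.1, 0.5, 1.0, 0.9, 0.2]

low, high = batch_logl_bounds(logl, weights, maxfrac=0.8)
print("batch bounds:", low, high)
print("effective samples:", kish_ess(weights))
```

With equal weights the Kish estimate equals the number of samples, and it shrinks toward 1 as a single weight dominates, which is why it serves as a stopping criterion against eff_samples.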
The sampling is parallelized over Ncpu or using mpi, with calls to gcfit.posterior defined based on a uniform sampling of the PriorTransforms.
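The idea of uniformly sampling through a prior transform can be sketched as a mapping from the unit cube onto each parameter's bounds. This is only an illustration of the concept for uniform priors; the function name, the parameter names "W0" and "M", and their bounds are assumptions, and gcfit's own PriorTransforms may support other prior shapes.

```python
def make_uniform_prior_transform(bounds):
    """Build a function mapping unit-cube samples onto uniform prior ranges."""
    names = list(bounds)

    def prior_transform(u):
        # Each u_i in [0, 1] is stretched linearly onto [lower, upper]
        return [
            bounds[name][0] + ui * (bounds[name][1] - bounds[name][0])
            for name, ui in zip(names, u)
        ]

    return prior_transform


# Hypothetical parameters and bounds, for illustration only
transform = make_uniform_prior_transform({"W0": (3.0, 20.0), "M": (0.0, 5.0)})
print(transform([0.5, 0.5]))
```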
- Parameters:
- cluster : str
Cluster common name, as used to load gcfit.Observations.
- bound_type : {‘none’, ‘single’, ‘multi’, ‘balls’, ‘cubes’}, optional
Method used to approximately bound the prior using the current set of live points. Conditions the sampling methods used to propose new live points.
- sample_type : {‘unif’, ‘rwalk’, ‘rstagger’, ‘slice’, ‘rslice’, ‘hslice’}, optional
Method used to sample uniformly within the likelihood constraint.
- initial_kwargs : dict, optional
Kwargs to be passed to the dynesty.DynamicNestedSampler.sample_initial initial baseline sampling function. Defaults include dlogz of 0.25 and nlive of 100. See dynesty for more info and all other defaults.
- batch_kwargs : dict, optional
Kwargs to be passed to the dynesty.DynamicNestedSampler.sample_batch batch sampling function. Defaults include nlive_new of 100. See dynesty for more info and all other defaults.
- pfrac : float, optional
Fractional weight of the posterior (versus evidence) in the stop function. Between 0.0 and 1.0, defaults to 1.0 (i.e. 100% posterior).
- maxfrac : float, optional
Fraction of the peak importance weight used as the threshold for determining the likelihood bounds of dynamic sampling batches. Between 0.0 and 1.0, defaults to 0.8 (i.e. 80% of maximum weight).
- eff_samples : int, optional
The desired number of “effective posterior samples”, used as the stopping condition of dynamic nested sampling. Uses the Kish ESS estimate; see dynesty.dynamicsampler.stopping_function. Defaults to 5000.
- Ncpu : int, optional
Number of CPUs to parallelize the sampling computation over. Ignored if mpi is True.
- mpi : bool, optional
Parallelize sampling computation using MPI rather than multiprocessing. Parallelization is handled by schwimmbad.
- initials : dict, optional
Dictionary of initial parameter values. There is no concept of “initial positions” in nested sampling, and this argument is only used in the case of fixed parameters.
- param_priors : dict, optional
Dictionary of prior bounds/args for each parameter. See probabilities.priors for formatting of args and defaults.
- fixed_params : list of str, optional
List of parameters to fix to their initial values and exclude from variation by the sampler.
- excluded_likelihoods : list of str, optional
List of component likelihoods to exclude from the posterior probability function. Each likelihood can be specified using either the name of the function (as given by __name__) or the name of the relevant dataset.
- hyperparams : bool, optional
Whether to include Bayesian hyperparameters (see Hobson et al., 2002) in all likelihood functions.
- savedir : path-like, optional
The directory within which the HDF output file is stored. Defaults to the current working directory.
- restrict_to : {None, ‘local’, ‘core’}
Where to search for the cluster data file; see gcfit.util.get_cluster_path for more information.
- compress : bool, optional
If True, applies “gzip” compression to all datasets stored in the output file. Defaults to no compression.
- verbose : bool, optional
Increase verbosity (currently only affects the output of the final run summary).
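The interplay between initials and fixed_params described in the parameter list can be sketched as follows: fixed parameters take their values from initials and are left out of the sampled dimensions, then merged back in before the posterior is evaluated. This is an assumed illustration of the idea, not gcfit's internal code; the helper names and the parameter names "W0", "M" and "rh" are hypothetical.

```python
def split_parameters(all_params, initials, fixed_params):
    """Separate free parameter names from fixed name -> value pairs."""
    free = [p for p in all_params if p not in fixed_params]
    fixed = {p: initials[p] for p in fixed_params}
    return free, fixed


def assemble_theta(all_params, free, fixed, sampled):
    """Merge sampled free values with fixed values, in the original order."""
    values = dict(zip(free, sampled))
    values.update(fixed)
    return [values[p] for p in all_params]


# Hypothetical parameter set, for illustration only
all_params = ["W0", "M", "rh"]
initials = {"W0": 6.0, "M": 1.0, "rh": 3.0}

free, fixed = split_parameters(all_params, initials, fixed_params=["rh"])
theta = assemble_theta(all_params, free, fixed, sampled=[7.5, 0.8])
print(free, fixed, theta)
```

In this sketch the sampler only ever sees two dimensions, while the likelihood still receives all three values.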
See also
dynesty : Nested sampler implementation.
schwimmbad : Interface to parallel processing pools.