Calling CosmoSIS from Python

You can run CosmoSIS on the command line, but you can also call it from Python code. This is useful if you want to run multiple CosmoSIS instances in a loop, or if you want to integrate CosmoSIS into a larger Python application.

This page lists the main functions and classes you can use from scripts.

Top-Level Usage with run_cosmosis

The simplest way to run CosmoSIS from Python is to use the run_cosmosis function.

from cosmosis import run_cosmosis
# This is equivalent to running `cosmosis params.ini` on the command line:
run_cosmosis("params.ini")

The most useful additional arguments to run_cosmosis are override, which lets you replace parameters in the ini file, and variables, which lets you replace the contents of the values file.

You can also specify output="astropy" to get the results back as an Astropy Table.
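A minimal sketch of passing override and variables. It assumes both dictionaries are keyed by (section, name) tuples with string values, which is how the dict[str, str]->str notation in the parameter list reads. The file name params.ini, the section and parameter names, and the helper run_quick_test are illustrative only.

```python
# Keys are assumed to be (section, name) tuples; values are strings,
# written exactly as they would appear in the ini or values file.
override = {
    ("runtime", "sampler"): "test",  # assumed section/option names
}
variables = {
    # hypothetical parameter, given as "min start max" like a values-file entry
    ("cosmological_parameters", "omega_m"): "0.2 0.3 0.4",
}


def run_quick_test(ini_path="params.ini"):
    """Run CosmoSIS once with the replacements above (illustrative helper)."""
    # Imported lazily so the dictionaries can be inspected without CosmoSIS installed.
    from cosmosis import run_cosmosis

    return run_cosmosis(ini_path, override=override, variables=variables)
```

Calling run_quick_test() requires a working CosmoSIS installation and a valid parameter file.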

cosmosis.run_cosmosis(ini, pool=None, pipeline=None, values=None, priors=None, override=None, profile_mem=0, profile_cpu='', variables=None, only=None, output=None, train_cosmopower=False, overwrite_cosmopower=False)

Execute cosmosis.

Parameters

ini: str, cosmosis.Inifile, or None

The parameter file from which to build the cosmosis run. If set to a string the file is read from disc. If set to None, the other parameters must contain all the required CosmoSIS parameters.

pool: None, cosmosis.MPIPool, or cosmosis.process_pool.Pool

A pool object to enable multi-process parallel execution. If left as the default None then the code is run with a single process (though modules may still run using OpenMP parallelism).

pipeline: None or cosmosis.LikelihoodPipeline

If set, ignore the pipeline definition in the ini file and use this pipeline instead.

values: None or dict[str, str]->str

If set, ignore the numerical parameter values in the ini file and use these instead.

priors: None or dict[str, str]->str

If set, ignore the prior values in the ini file and use these instead.

override: None or dict[str, str]->str

If set, override parameter values in the ini file from the dictionary.

profile_mem: int

If changed from the default zero value, print a memory profile every profile_mem seconds.

profile_cpu: str

If changed from the default empty string, print CPU profile information and also save to the named file. If running in parallel, save to {profile_cpu}.{rank}.

variables: None or dict[str, str]->str

If set, override variable values in the ini file from the dictionary.

only: None or str

If set, fix all the variable values except the one supplied.

output: None or cosmosis.Output

If set, use this output object to save the results. If not set, create an output object from the ini file.

train_cosmopower: bool

If set, it will run a specific set of samplers and modules up to and including CAMB to train the CosmoPower emulator, which can then be used as a drop-in replacement to CAMB. Will refuse to re-train if emulator exists.

overwrite_cosmopower: bool

Force re-train even if emulator exists.

Classes for custom usage

Other classes that may be particularly useful:

  • LikelihoodPipeline, for building and running custom pipelines;

  • DataBlock, for storing and playing with theory predictions from pipelines.

class cosmosis.LikelihoodPipeline(arg=None, id='', override=None, modules=None, load=True, values=None, priors=None, only=None, training=False)

Very specialized pipeline designed specifically for the prototypical case of Bayes-computed posterior distributions.

The point of a statistical updating pipeline is that the parameters in the datablocks passed down the pipe, as well as having currently estimated values, also have allowable ranges and possibly other constraints which the user may want to tinker with before each run. Thus there is a specialized layout of initialization files in the file system, and there is a modified expectation on the modules to perform simulation, compute the Bayesian evidence, hence log-likelihood. The pipeline itself will aggregate the results and summarize the net effect of all the likelihood estimations, and thence compute the Bayesian posterior.

Because of the necessity of working with distributions of values for each parameter, rather than just a scalar, the extra information is stored in a shadow array of parameters to the DataBlock (another dictionary with the same keys but a complementary set of values to the original ones), which the base pipeline modifies (actually only a subset of them known as the varied_params: an array which references the interesting parameters in the full set). This shadow array (parameters) is often referred to simply as p, and the two arrays frequently need to be ‘zipped’ together and then ‘unzipped’ after computations have completed.

build_starting_block(p, check_ranges=False, all_params=False)

Assemble DataBlock data based on parameter values in p, and return it.

If check_ranges is indicated, the function will return None if any of our parameters are out of their indicated range.

If all_params is indicated, then the p run data will be assumed to match all the pipeline parameters, including fixed ones. Otherwise (the default) it should match the list ‘varied_params’, and all of our ‘fixed’ parameters are added to the run-set.
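The difference between the two conventions for p can be sketched without CosmoSIS. The parameter records below are plain dictionaries invented for illustration (the real pipeline holds its own parameter objects), and full_parameter_values is a hypothetical helper, not part of the API.

```python
def full_parameter_values(p, parameters, all_params=False):
    """Map a vector p onto every parameter name, filling in fixed values.

    parameters is a toy list of dicts with keys "name", "start" and "varied".
    """
    if all_params:
        # p already covers every parameter, fixed ones included.
        if len(p) != len(parameters):
            raise ValueError("p must match the full parameter list")
        return {q["name"]: x for q, x in zip(parameters, p)}

    n_varied = sum(1 for q in parameters if q["varied"])
    if len(p) != n_varied:
        raise ValueError("p must match the varied parameters")
    values = iter(p)
    # Varied parameters take the next entry of p; fixed ones keep their start value.
    return {q["name"]: (next(values) if q["varied"] else q["start"]) for q in parameters}


parameters = [
    {"name": "omega_m", "start": 0.3, "varied": True},
    {"name": "h0", "start": 0.7, "varied": False},
    {"name": "sigma_8", "start": 0.8, "varied": True},
]
block_values = full_parameter_values([0.25, 0.9], parameters)
```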

create_ini(p, filename)

Dump the parameters p as a new ini file at filename

denormalize_matrix(c, inverse=False)

Perform the inverse operation to normalize_matrix.

Note that if inverse is True the action is exactly the same as normalize_matrix, i.e. it normalizes the matrix.

denormalize_vector(p, raise_exception=True)

Convert an array of normalized parameter values, one for each varied parameter, in the range [0.0,1.0] into their original values using only the lower and upper limits of the parameter.

Use denormalize_vector_from_prior to convert according to the prior instead.

denormalize_vector_from_prior(p)

Convert an array of normalized parameter values, one for each varied parameter, in the range [0.0,1.0] into their original values according to the prior for each parameter.

i.e. v -> x such that $\int_{-\infty}^{x} p(x')\,dx' = v$
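The inverse-CDF mapping stated above can be illustrated for two common prior shapes. This is a sketch of the mathematics only, using the standard library rather than the CosmoSIS prior classes; the function names are hypothetical.

```python
from statistics import NormalDist


def uniform_from_unit(v, lower, upper):
    """Inverse CDF of a uniform prior on [lower, upper]."""
    return lower + v * (upper - lower)


def gaussian_from_unit(v, mu, sigma):
    """Inverse CDF of a Gaussian prior; v must lie strictly between 0 and 1."""
    return NormalDist(mu, sigma).inv_cdf(v)


x_uniform = uniform_from_unit(0.25, 0.0, 4.0)
x_gauss = gaussian_from_unit(0.5, 0.3, 0.1)
```

For the uniform case this coincides with the limit-based conversion done by denormalize_vector; the two differ only when the prior is not flat.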

classmethod from_likelihood_function(log_likelihood_function, param_ranges, priors=None, debug=False, derived=None)

Make a pipeline from a simple likelihood function.

Parameters

log_likelihood_function: function

A function that takes a list of parameters and returns either a single number (the log-likelihood) or a tuple of two things, the first being the log-likelihood and the second being a dictionary of extra derived parameters.

param_ranges: list of tuples

A list of tuples of the form (min, starting_point, max) for each parameter.

priors: list of tuples, optional

A dictionary of priors in the form name: prior (see the documentation for the prior format). If not specified then uniform priors are used.

debug: bool, optional

If True then exceptions in the likelihood function will be raised. If False then they will be ignored and the likelihood will be set to -inf.

derived: list of strings, optional

A list of names of derived parameters to save in the output.

Returns

pipeline: LikelihoodPipeline

A pipeline object that can be run.
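A sketch of inputs shaped as the parameter descriptions above require: a log-likelihood function returning a (log-likelihood, derived dictionary) pair, and (min, starting_point, max) tuples. The toy likelihood, the derived name "radius" and the helper build_pipeline are illustrative, and it is an assumption here that the keys of the derived dictionary should match the names passed as derived.

```python
import math


def log_like(params):
    """Toy 2D Gaussian log-likelihood with one derived quantity."""
    x, y = params
    like = -0.5 * (x**2 + y**2)
    return like, {"radius": math.hypot(x, y)}


# One (min, starting_point, max) tuple per parameter.
param_ranges = [(-5.0, 0.0, 5.0), (-5.0, 0.5, 5.0)]


def build_pipeline():
    """Wrap the toy likelihood in a pipeline (needs CosmoSIS installed)."""
    from cosmosis import LikelihoodPipeline

    return LikelihoodPipeline.from_likelihood_function(
        log_like, param_ranges, derived=["radius"]
    )
```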

is_out_of_range(p)

Determine if any parameter is not in its allowed range.

likelihood(p, return_data=False, all_params=False)

Run the simulation pipeline, computing any log-likelihoods in the pipeline for the given input parameter values, and return the sum of these.

The parameter vector p must match the length of self.varied_params, unless all_params is specified as True, in which case it must match self.parameters, i.e. must correspond to the complete parameter set.

If return_data is True, then the updated data block will be returned as the third return item.

The return will consist of two or three items, depending on return_data:

  • A scalar holding the sum of all computed log-likelihoods of the updated parameter value vector;

  • a vector (NumPy array) of updated parameter values as specified in self.extra_saves;

  • if return_data was specified, the updated data block.

If anything goes wrong in any of the computation which does not result in a run-time error being raised (which would include the case of a parameter going outside of its stipulated limits), then the returned log-likelihood will be -np.inf.

max_vector(all_params=False)

Return a NumPy array of upper limits for the parameters in the pipeline.

If all_params is specified as True then the return will include all parameters, including fixed ones. Otherwise it will just be the varying parameters.

min_vector(all_params=False)

Return a NumPy array of lower limits for the parameters in the pipeline.

If all_params is specified as True then the return will include all parameters, including fixed ones. Otherwise it will just be the varying parameters.

normalize_matrix(c)

Roughly, return a correlation matrix corresponding to the covariance matrix c, of varied_params values.

Except that the elements of c are not probabilities but dimensional values, and the ‘normalization’ is relative to the range of values the ‘covariance’s can take given the lower and upper limits on the variates.
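One way to read this description is that each element of c is divided by the product of the ranges of the two parameters involved. The sketch below implements that reading with nested lists; it is an assumption about the scaling, not a copy of the library code, and normalize_covariance is a hypothetical name.

```python
def normalize_covariance(c, limits):
    """Scale c[i][j] by the widths of the allowed ranges of parameters i and j.

    limits is a list of (lower, upper) pairs, one per varied parameter.
    """
    n = len(c)
    if any(len(row) != n for row in c) or len(limits) != n:
        raise ValueError("c must be square and match the number of limits")
    widths = [upper - lower for lower, upper in limits]
    return [[c[i][j] / (widths[i] * widths[j]) for j in range(n)] for i in range(n)]


normalized = normalize_covariance([[4.0, 2.0], [2.0, 1.0]], [(0.0, 2.0), (0.0, 1.0)])
```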

normalize_vector(p)

Convert an array of parameter values, one for each varied parameter, into a normalized form all in the range [0.0,1.0] using only the lower and upper limits for each parameter.
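The limit-based conversion and its inverse (denormalize_vector) amount to a linear rescaling per parameter. A minimal stand-alone sketch, with hypothetical function names and made-up limits:

```python
def normalize(p, limits):
    """Map each value onto [0, 1] using its (lower, upper) limits."""
    return [(x - lower) / (upper - lower) for x, (lower, upper) in zip(p, limits)]


def denormalize(v, limits):
    """Map each unit-interval value back onto its original range."""
    return [lower + u * (upper - lower) for u, (lower, upper) in zip(v, limits)]


limits = [(0.2, 0.4), (60.0, 80.0)]
unit = normalize([0.3, 70.0], limits)
back = denormalize(unit, limits)
```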

output_names()

Return a list of strings, each the name of a non-fixed parameter.

parameter_index(section, name)

Return the sequence number of the parameter name in section.

If the parameter is not found then ValueError will be raised.

posterior(p, return_data=False, all_params=False)

Use the above methods to obtain prior and updated log-likelihoods, sum together to get Bayesian posterior.

The argument p is a vector or list of the input parameters. If (as in the default) all_params is False, then it should be the same length and order as self.varied_params.

Otherwise, if all_params is True, then it should match the length and order of self.parameters. The method returns two or three values depending on return_data:

  • The posterior;

  • a vector (NumPy array) of updated parameter values as specified in self.extra_saves;

  • if return_data was specified, the updated data block.

If there is a problem anywhere in the computations which does not cause a run-time exception to be raised (including the case where a parameter goes outside of its allotted range), then -numpy.inf will be returned as the final posterior (i.e., zero probability of this set of parameter values being correct).
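The combination described for posterior can be sketched as a sum of log prior and log-likelihood, with -inf standing for zero probability. Skipping the likelihood when the prior is already -inf is an assumption made for this sketch; the toy prior and likelihood functions are invented for illustration.

```python
import math


def log_posterior(p, log_prior, log_likelihood):
    """Return log prior + log likelihood, or -inf if either part is unusable."""
    prior = log_prior(p)
    if not math.isfinite(prior):
        # Assumed shortcut: a zero-probability prior makes the likelihood irrelevant.
        return -math.inf
    like = log_likelihood(p)
    if math.isnan(like):
        return -math.inf
    return prior + like


def flat_prior(p):
    """Toy flat prior on [0, 1] for every parameter."""
    return 0.0 if all(0.0 <= x <= 1.0 for x in p) else -math.inf


def gaussian_like(p):
    """Toy Gaussian log-likelihood centred on 0.5."""
    return -0.5 * sum((x - 0.5) ** 2 for x in p)
```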

print_priors()

Pretty-print a table of priors for human inspection.

prior(p, all_params=False, total_only=True)

Compute the probability of all values in p based on their prior distributions.

The array p should match the length of all of our parameters if all_params is True, and our varied_params otherwise.

If total_only is True (the default), then the scalar sum of all the prior probabilities is returned. Otherwise a list of pairs is returned, with each element a stringified version of the parameter name, and the prior probability: [(name1, prior1), (name2,prior2), …]

randomized_start()

Give each varied parameter an independent random value distributed according to the parameter prior.

The return is a NumPy array of the random values.

reset_fixed_varied_parameters()

Identify the sub-set of parameters which are fixed, and those which are to be varied.

run_parameters(p, check_ranges=False, all_params=False)

Assemble DataBlock data based on parameter values in p, and run the pipeline on those data.

If check_ranges is indicated, the function will return None if any of our parameters are out of their indicated range.

If all_params is indicated, then the p run data will be assumed to match all the pipeline parameters, including fixed ones. Otherwise (the default) it should match the list ‘varied_params’, and all of our ‘fixed’ parameters are added to the run-set.

run_results(p, all_params=False)

Run the pipeline on the given parameters and get a results object.

The argument p is a vector or list of the input parameters. If (as in the default) all_params is False, then it should be the same length and order as self.varied_params.

Otherwise, if all_params is True, then it should match the length and order of self.parameters.

The method returns a PipelineResults object with the following attributes:

  • results.post, the posterior (float);

  • results.extra, the required additional output parameters (numpy array)

  • results.prior, the total prior (float)

  • results.block, the updated data block

If there is a problem anywhere in the computations which does not cause a run-time exception to be raised (including the case where a parameter goes outside of its allotted range), then -numpy.inf will be returned as the final posterior (i.e., zero probability of this set of parameter values being correct).

set_fixed(section, name, value)

Indicate that the parameter (section, name) must be held fixed at value.

set_varied(section, name, lower, upper)

Indicate that the parameter (section, name) is to be varied between the lower and upper bounds.

start_vector(all_params=False, as_array=True)

Return a vector of starting values for parameters.

If all_params is specified as True then the return will include all our parameters, otherwise only the varying ones are included.

If as_array is specified as False then a Python list is returned, otherwise, the default, a NumPy array is returned.

class cosmosis.datablock.DataBlock(ptr=None, own=None)

A map of (section,name)->value of parameters.

At the heart of CosmoSIS is a data-containing object which is passed down a pipeline of processing stages, which shape and massage those data as they go through. The DataBlock class is the realization of this object as seen by Python modules.

The main methods a CosmoSIS module programmer is interested in given one of these objects are the implicitly-called __getitem__ and __setitem__: these retrieve parameter values from the map, and put new ones in or replace existing ones, respectively.

Most of the implementation detail of this class is a complete, orthogonal set of methods which get, put and replace parameters with integer, boolean, string, floating-point and complex values, either as scalars, as one- or two-dimensional arrays, or as ‘grids’. These are then refined into generic get(), set() and replace() methods, and finally into the __getitem__() and __setitem__() methods themselves.

The grid concept is where a two-dimensional array is flanked by two one-dimensional ones giving labels to the ‘rows’ and ‘columns’; these labels are used to address the data directly.
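The grid layout can be pictured with plain lists: two label arrays and a two-dimensional array whose shape matches them. The sketch assumes the first index of z runs along x and the second along y; the label values and helper names are invented for illustration.

```python
x = [0.0, 0.5, 1.0]  # labels for the rows (e.g. redshift)
y = [0.1, 1.0]  # labels for the columns (e.g. wavenumber)
z = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape (len(x), len(y))


def check_grid(x, y, z):
    """Raise ValueError if z does not have one row per x and one column per y."""
    if len(z) != len(x) or any(len(row) != len(y) for row in z):
        raise ValueError("grid data do not match the axis labels")


def grid_value(x, y, z, x_label, y_label):
    """Address a grid element through its axis labels."""
    check_grid(x, y, z)
    return z[x.index(x_label)][y.index(y_label)]


value = grid_value(x, y, z, 0.5, 1.0)
```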

clone()

Make a brand-new, completely independent object, a deep copy of the existing one.

A new object will be returned from this method which has its own underlying implementation, a deep copy of the parameter map we are holding. This WILL entail the attempted requisition of enough new memory to hold the complete parameter structure.

get(section, name)

Get the value of parameter with name in section.

The type of the value returned from this method will reflect the type of value held in the underlying map implementation. In circumstances where this either cannot be ascertained or cannot be converted simply to a native Python type, then either a BlockError or ValueError will be raised.

get_all_parameter_use(params_of_interest)

Analyze the log and figure out which parameters are in use in each module.

get_bool(section, name, default=None)

Retrieve a boolean value from the parameter set.

The name parameter in the given section will be interpreted as a boolean and returned to the caller. If such parameter is not found in the map, then the default will be returned if it was given, or else a specialized BlockError (see errors.py) will be thrown. The BlockError may also be thrown if a variable is found, but is not of boolean type.

get_complex(section, name, default=None)

Retrieve a complex value from the parameter set.

The name parameter in the given section will be interpreted as a complex value and returned to the caller. If such parameter is not found in the map, then the default will be returned if it was given, or else a specialized BlockError (see errors.py) will be thrown. The BlockError may also be thrown if a variable is found, but is not of complex type.

get_double(section, name, default=None)

Retrieve a floating-point value from the parameter set.

The name parameter in the given section will be interpreted as a floating-point value and returned to the caller. If such parameter is not found in the map, then the default will be returned if it was given, or else a specialized BlockError (see errors.py) will be thrown. The BlockError may also be thrown if a variable is found, but is not of floating-point type.

get_double_array_1d(section, name)

Retrieve a floating-point array from the parameter set.

The name parameter in the given section will be understood as being of floating-point array type and returned to the caller as a NumPy array. If such a parameter is not found in the map, then a specialized BlockError (see errors.py) will be thrown.

get_double_array_nd(section, name)

Get a floating-point array of a priori unspecified shape.

Expect BlockError or ValueError to be raised if there are extenuating circumstances.

get_first_parameter_use(params_of_interest)

Analyze the log and figure out when each parameter is first used

get_grid(section, name_x, name_y, name_z)

Return a triple of arrays, representing a grid of data.

The strings name_x, name_y and name_z must be keys under section which index data making up a grid; they must be the same set used in a call to replace_grid() or put_grid() used to establish the grid in the first place (except that the x- and y-axes are allowed to be transposed).

The return is a triple of arrays: the first two elements hold the labels along the axes and the third element is a two-dimensional array holding the data deemed to be inside the grid itself.

If the name_* arguments do not correspond correctly with those of an established grid then a BlockError will be raised.

get_int(section, name, default=None)

Retrieve an integer value from the parameter set.

The name parameter in the given section will be interpreted as an integer and returned to the caller. If such parameter is not found in the map, then the default will be returned if it was given, or else a specialized BlockError (see errors.py) will be thrown. The BlockError may also be thrown if a variable is found, but is not of integer type.

get_int_array_1d(section, name)

Retrieve an integer array from the parameter set.

The name parameter in the given section will be understood as being of integer array type and returned to the caller as a NumPy array. If such a parameter is not found in the map, then a specialized BlockError (see errors.py) will be thrown.

get_int_array_nd(section, name)

Get an integer-valued array of a priori unspecified shape.

Expect BlockError or ValueError to be raised if there are extenuating circumstances.

get_log_count()

Return the number of entries in the log.

get_log_entry(i)

Get the iʼth log entry.

The return is a tuple of four strings indicating the verb (i.e., logged action), section and name of the parameter, and the data type held by the parameter.

get_metadata(section, name, key)

Get the metadata called key attached to parameter name under section.

If the data do not exist at the requested address, then a BlockError will be raised.

get_string(section, name, default=None)

Retrieve a string value from the parameter set.

The name parameter in the given section will be interpreted as a string value and returned to the caller. If such parameter is not found in the map, then the default will be returned if it was given, or else a specialized BlockError (see errors.py) will be thrown. The BlockError may also be thrown if a variable is found, but is not of string type.

get_string_array_1d(section, name)

Retrieve an array of strings from the datablock.

The name parameter in the given section will be understood as being of 1D string array type and returned to the caller as a numpy array. If such a parameter is not found in the map, then a specialized BlockError (see errors.py) will be thrown.

has_section(section)

Indicate whether or not there is a given section in the data set.

The section should be a string holding the name of the section.

has_value(section, name)

Indicate whether or not a parameter is in the map.

Both section and name should be strings.

keys(section=None)

Return all keys in the collection, or, if section is specified, all keys under that section.

If section is specified, it must be a string naming the section whose keys are requested.

In all cases a list of pairs of strings will be returned, the elements of each being the section and name of each parameter.

log_access(log_type, section, name)

Add an entry to the end of this DataBlock access log.

The log_type describes the action performed on the parameter at (section, name). It should be one of the strings displayed in datablock_logging.cc, viz: “READ-OK”, “WRITE-OK”, “READ-FAIL”, “WRITE-FAIL”, “READ-DEFAULT”, “REPLACE-OK”, “REPLACE-FAIL”, “CLEAR”, “DELETE”, or “MODULE-START”.

print_log()

Dump a human-readable list of log entries to standard output.

The entries appear one per line, with space-separated items corresponding to the verb, section and name, and data-type of the parameter.

put(section, name, value, **meta)

Add a parameter with value at (section, name) in the map.

The parameter stored in the map will have a type which reflects the type of value.

If provided, meta should be a map of key/value pairs, and these will be appended to the inserted parameter as meta-data, converted to string type.

It is an error to insert a parameter when there already is an entry at (section, name), in which case a BlockError specialization will be raised.

put_bool(section, name, value)

Add a boolean parameter to the map.

A new parameter will be added to the current map, at (section, name), and will have the value interpreted as a boolean type. It is an error to try to add a parameter which is already there, and in this case a BlockError will be raised.

put_complex(section, name, value)

Add a complex parameter to the map.

A new parameter will be added to the current map, at (section, name), and will have the value interpreted as a complex type. It is an error to try to add a parameter which is already there, and in this case a BlockError will be raised.

put_double(section, name, value)

Add a floating-point parameter to the map.

A new parameter will be added to the current map, at (section, name), and will have the value interpreted as a floating-point type. It is an error to try to add a parameter which is already there, and in this case a BlockError will be raised.

put_double_array_1d(section, name, value)

Add a one-dimensional floating-point array to the map.

A parameter called name is added to section, and holds value interpreted as a simple array of floating-point values. If this interpretation cannot be made then a BlockError will be raised.

put_double_array_nd(section, name, value)

Add a floating-point array parameter to the data set.

The value must be an array of values which can be interpreted as floating-point numbers, otherwise a ValueError will be raised. If the parameter already exists in the data set, a BlockError will be raised. The array can be any shape.

put_grid(section, name_x, x, name_y, y, name_z, z)

Put a grid into the map.

The grid is put into section, using keys name_x, name_y and name_z to locate the data. The data comprise the array x holding a set of ‘labels’ for the x-axis, an array y holding labels for the y-axis, and then a two-dimensional array z, whose sizes must correspond with the x- and y-sizes, which holds the actual data inside the grid.

If there are any problems, most notably with the sizes of the arrays not being compatible, then a ValueError will be raised.

put_int(section, name, value)

Add an integer parameter to the map.

A new parameter will be added to the current map, at (section, name), and will have the value interpreted as an integer type. It is an error to try to add a parameter which is already there, and in this case a specialized BlockError will be raised.

put_int_array_1d(section, name, value)

Add a one-dimensional integer array to the map.

A parameter called name is added to section, and holds value interpreted as a simple array of integers. If this interpretation cannot be made then a BlockError will be raised.

put_int_array_nd(section, name, value)

Add an integer array parameter to the data set.

The value must be an array of values which can be interpreted as integer numbers, otherwise a ValueError will be raised. If the parameter already exists in the data set, a BlockError will be raised. The array can be any shape.

put_metadata(section, name, key, value)

Associate value with the meta-key attached to parameter name under section.

If there is no parameter under (section, name) then a BlockError will be raised.

put_string(section, name, value)

Add a string parameter to the map.

A new parameter will be added to the current map, at (section, name), and will have the value interpreted as a string type. It is an error to try to add a parameter which is already there, and in this case a BlockError will be raised.

put_string_array_1d(section, name, value)

Add a one-dimensional string array to the map.

A parameter called name is added to section, and holds value interpreted as a simple array of strings. If this interpretation cannot be made then a BlockError will be raised.

static python_to_1d_c_array(value, numpy_type)

Create a C object equivalent to the value array, interpreted as numpy_type.

The object will be a contiguous list of the C type most appropriate to the representation of the Python numpy_type. This may entail that a value array with strides be copied to a compressed version.

static python_to_c_complex(value)

Interpret an arbitrary Python object as a lib.c_complex type.

This convenience function will take an actual lib.c_complex value (no-op), a Python complex value, the first two components of a Python tuple value, or a real scalar value and return the equivalent lib.c_complex (i.e. a type which can be passed to a C subroutine representing a complex number).

In the case of the scalar input, this is taken as the real part of the complex number and the imaginary part will be zero.
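The accepted input forms can be summarized in a small pure-Python sketch that returns a (real, imaginary) pair instead of a lib.c_complex. The helper to_complex_pair is hypothetical and only mirrors the cases listed above.

```python
def to_complex_pair(value):
    """Return (real, imag) for a complex, a tuple, or a real scalar."""
    if isinstance(value, complex):
        return (value.real, value.imag)
    if isinstance(value, tuple):
        # Only the first two components are used; any others are ignored.
        return (float(value[0]), float(value[1]))
    # A real scalar becomes the real part, with zero imaginary part.
    return (float(value), 0.0)
```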

replace(section, name, value)

Replace the value of a parameter at (section, name) in the map with value.

The parameter newly stored in the map will have a type which reflects the type of value.

It is an error to attempt to replace a parameter not already present in the map, in which case a BlockError specialization will be raised.
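The contract between put and replace can be shown with a toy stand-in for the block. ToyBlock is not the real DataBlock: it stores values in a dictionary, raises KeyError where the real class raises a BlockError, and its __setitem__ assumes put-or-replace behaviour as described in the class overview.

```python
class ToyBlock:
    """Dictionary-backed illustration of put/replace semantics."""

    def __init__(self):
        self._data = {}

    def put(self, section, name, value):
        # put only succeeds for a new (section, name) pair.
        if (section, name) in self._data:
            raise KeyError(f"{section}/{name} already exists")
        self._data[(section, name)] = value

    def replace(self, section, name, value):
        # replace only succeeds for an existing (section, name) pair.
        if (section, name) not in self._data:
            raise KeyError(f"{section}/{name} does not exist")
        self._data[(section, name)] = value

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        # Assumed behaviour: put new entries, replace existing ones.
        self._data[key] = value


block = ToyBlock()
block.put("cosmological_parameters", "omega_m", 0.3)
block.replace("cosmological_parameters", "omega_m", 0.31)
block["cosmological_parameters", "h0"] = 0.7
```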

replace_bool(section, name, value)

Change the value of a boolean parameter in the map.

The parameter at (section, name) will be given the new value. It is an error to attempt to replace a value which is not already in the map, and a BlockError will be raised in this case.

replace_complex(section, name, value)

Change the value of a complex parameter in the map.

The parameter at (section, name) will be given the new value. It is an error to attempt to replace a value which is not already in the map, and a BlockError will be raised in this case.

replace_double(section, name, value)

Change the value of a floating-point parameter in the map.

The parameter at (section, name) will be given the new value. It is an error to attempt to replace a value which is not already in the map, and a BlockError will be raised in this case.

replace_double_array_1d(section, name, value)

Replace the value of a parameter with a simple floating-point array.

The parameter at (section, name) is replaced with value, interpreted as a one-dimensional array.

If this cannot be done then a BlockError specialization will be raised.

replace_double_array_nd(section, name, value)

Replace a floating-point array parameter in the data set.

The value must be an array of values which can be interpreted as floating-point numbers, otherwise a ValueError will be raised. If the parameter does not already exist in the data set, a BlockError will be raised. The new array can be any shape, independent of the shape of the original value in this data set.

replace_grid(section, name_x, x, name_y, y, name_z, z)

Replace a grid in the map.

The grid is put into section, using keys name_x, name_y and name_z to locate the data. The data comprise the array x holding a set of ‘labels’ for the x-axis, an array y holding labels for the y-axis, and then a two-dimensional array z, whose sizes must correspond with the x- and y-sizes, which holds the actual data inside the grid.

If there are any problems, most notably with the sizes of the arrays not being compatible, then a ValueError will be raised.

replace_int(section, name, value)

Change the value of an integer parameter in the map.

The parameter at (section, name) will be given the new value. It is an error to attempt to replace a value which is not already in the map, and a BlockError will be raised in this case.

replace_int_array_1d(section, name, value)

Replace the value of a parameter with a simple integer array.

The parameter at (section, name) is replaced with value, interpreted as a one-dimensional array.

If this cannot be done then a BlockError specialization will be raised.

replace_int_array_nd(section, name, value)

Replace an integer array parameter in the data set.

The value must be an array of values which can be interpreted as integer numbers, otherwise a ValueError will be raised. If the parameter does not already exist in the data set, a BlockError will be raised. The new array can be any shape, independent of the shape of the original value in this data set.

replace_metadata(section, name, key, value)

Associate value with the meta-key attached to parameter name under section.

If there is no parameter under (section, name) then a BlockError will be raised.

replace_string(section, name, value)

Change the value of a string parameter in the map.

The parameter at (section, name) will be given the new value. It is an error to attempt to replace a value which is not already in the map, and a BlockError will be raised in this case.

replace_string_array_1d(section, name, value)

Replacing string arrays is not yet implemented

report_failures()

Dump a human-readable list of failed-action log entries to the standard error channel.

The entries appear one per line, with space-separated items corresponding to the verb, section and name, and data-type of the parameter.
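The log format described for print_log and report_failures can be mimicked with plain tuples. The entries and data-type strings below are made up, and selecting failures by the "-FAIL" suffix of the verb is an assumption based on the verb names listed under log_access.

```python
# Each entry: (verb, section, name, data type); all values here are illustrative.
log = [
    ("READ-OK", "cosmological_parameters", "omega_m", "double"),
    ("READ-FAIL", "cosmological_parameters", "w", "double"),
    ("WRITE-OK", "distances", "d_a", "double_array_1d"),
]


def format_entries(entries):
    """One line per entry, items separated by spaces."""
    return [" ".join(entry) for entry in entries]


def failed_entries(entries):
    """Keep only entries whose verb marks a failed action."""
    return [entry for entry in entries if entry[0].endswith("-FAIL")]


lines = format_entries(failed_entries(log))
```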

save_to_directory(dirname, clobber=False)

Save the entire contents of this parameter map in the filesystem under dirname.

The data are all written out long-hand in ASCII. Each unique section will go to its own sub-directory, in which all the scalar parameters in that section go into a single file (‘values.txt’), and each of the ‘composite’ data items goes into its own file, named after the parameter key.

The path, including dirname, will be created if necessary.

save_to_file(dirname, clobber=False)

Effectively save_to_directory() with the result tarʼd and compressed to a single file.

The dirname argument here is actually a file name without an extension; the path to the file will be created in the file system if necessary (ValueError will be raised if this cannot be accomplished), and “.tgz” will be appended to the file name.
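The naming and path-creation behaviour described here can be sketched with the standard library. This is not the CosmoSIS implementation: archive_directory is a hypothetical helper that compresses an existing directory and appends ".tgz" to the given name.

```python
import os
import tarfile
import tempfile


def archive_directory(src_dir, filename_without_ext):
    """Tar and gzip src_dir into filename_without_ext + '.tgz'."""
    target = filename_without_ext + ".tgz"
    parent = os.path.dirname(target)
    if parent:
        # Create the path to the output file if it does not exist yet.
        os.makedirs(parent, exist_ok=True)
    with tarfile.open(target, "w:gz") as tar:
        tar.add(src_dir, arcname=os.path.basename(src_dir))
    return target
```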

sections()

Return a list of strings with the names of all sections in the data set.