swiftgalaxy.iterator module
Iterate over SWIFTGalaxy objects efficiently.
Provides the SWIFTGalaxies class that enables efficient
iteration over SWIFTGalaxy objects for multiple objects of
interest within a single simulation snapshot.
Parallelization is not yet implemented but is prioritized for future release.
- class swiftgalaxy.iterator.SWIFTGalaxies(snapshot_filename: str, halo_catalogue: _HaloCatalogue, auto_recentre: bool = True, preload: Set[str] = {}, transforms_like_coordinates: Set[str] = {}, transforms_like_velocities: Set[str] = {}, id_particle_dataset_name: str = 'particle_ids', coordinates_dataset_name: str = 'coordinates', velocities_dataset_name: str = 'velocities', coordinate_frame_from: SWIFTGalaxy | None = None, optimize_iteration: str = 'auto')[source]
Bases: object
Facilitates efficiently iterating over many objects of interest from a simulation.
SWIFT simulation snapshots contain particles grouped by “top-level cells” that cover the simulation volume. The minimum number of particles that it makes sense to read is therefore those contained in one such top-level cell. If one wants to create many SWIFTGalaxy objects from one simulation snapshot, there is a risk that the same data are read many times, such as when multiple target objects lie within the same top-level cell. This class provides a convenient way to iterate over multiple target objects while minimizing the I/O overhead by managing the order of iteration to group together target objects that occupy common top-level cells and only reading the data once.
An important consequence to be aware of is that the iteration order is not controlled by the user because it must be chosen to group objects in the same top-level cell(s) together. The iteration order is available as the iteration_order attribute of a SWIFTGalaxies object. Alternatively, the map() method applies a function to every target object and returns the results in the same order as the input list.
There is an obvious opportunity to parallelize the iteration process by passing each region (potentially each containing multiple target objects) to worker processes as they become available, for example. This initial version of the SWIFTGalaxies class does not yet support parallel iteration, instead prioritizing the release of a working serial implementation. Support for parallelization will be added later as a high priority.
- Parameters:
snapshot_filename (str) – Name of file containing snapshot.
halo_catalogue (_HaloCatalogue) – A halo catalogue instance from swiftgalaxy.halo_catalogues, e.g. a swiftgalaxy.halo_catalogues.SOAP instance. It should specify more than one target object, e.g. by setting its soap_index=[0, 123, 456, ...].
auto_recentre (bool (optional), default: True) – If True, the coordinate system will be automatically recentred on the position and velocity centres defined by the halo_catalogue.
preload (set (optional), default: set()) – Deprecated and ignored.
transforms_like_coordinates (set (optional), default: set()) – Names of fields that behave as spatial coordinates. It is assumed that these exist for all present particle types. When the coordinate system is rotated or translated, the associated arrays will be transformed accordingly. The coordinates dataset (or its alternative name given in the coordinates_dataset_name parameter) is implicitly assumed to behave as spatial coordinates.
transforms_like_velocities (set (optional), default: set()) – Names of fields that behave as velocities. It is assumed that these exist for all present particle types. When the coordinate system is rotated or boosted, the associated arrays will be transformed accordingly. The velocities dataset (or its alternative name given in the velocities_dataset_name parameter) is implicitly assumed to behave as velocities.
id_particle_dataset_name (str (optional), default: "particle_ids") – Name of the dataset containing the particle IDs, assumed to be the same for all present particle types.
coordinates_dataset_name (str (optional), default: "coordinates") – Name of the dataset containing the particle spatial coordinates, assumed to be the same for all present particle types.
velocities_dataset_name (str (optional), default: "velocities") – Name of the dataset containing the particle velocities, assumed to be the same for all present particle types.
coordinate_frame_from (SWIFTGalaxy (optional), default: None) – Another SWIFTGalaxy to copy the coordinate frame (centre and rotation) and velocity coordinate frame (boost and rotation) from.
optimize_iteration (str (optional), default: "auto") – Can be "auto", "dense" or "sparse". See the docstrings of the methods _eval_sparse_optimized_solution() and _eval_dense_optimized_solution() for explanations of the optimization schemes. In most cases, leave this set to the default "auto" so that the optimal solution is determined automatically.
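The grouping idea from the class description can be illustrated with a toy model. This is only a minimal sketch and not the optimization that SWIFTGalaxies actually performs: it assumes a cubic box divided into a regular grid of cells, and that each target touches only the single cell containing its centre. The names group_targets_by_cell, box_size and cells_per_side are hypothetical and exist only for this illustration.

```python
import numpy as np


def group_targets_by_cell(centres, box_size, cells_per_side):
    """Group target indices by the (toy) top-level cell containing each centre."""
    cell_width = box_size / cells_per_side
    # Integer cell coordinates along each axis, clipped to stay inside the grid.
    ijk = np.clip((centres // cell_width).astype(int), 0, cells_per_side - 1)
    groups = {}
    for target_index, cell in enumerate(map(tuple, ijk.tolist())):
        groups.setdefault(cell, []).append(target_index)
    return groups


# Hypothetical target centres in a box of side 100 (arbitrary units).
centres = np.array(
    [
        [10.0, 10.0, 10.0],  # target 0
        [80.0, 80.0, 80.0],  # target 1
        [12.0, 14.0, 9.0],  # target 2, in the same cell as target 0
    ]
)
groups = group_targets_by_cell(centres, box_size=100.0, cells_per_side=4)

# Visit targets cell by cell, so that each cell would only need to be read once.
toy_iteration_order = [i for cell in sorted(groups) for i in groups[cell]]
print(groups)
print(toy_iteration_order)
```

In this toy model targets 0 and 2 are visited consecutively because they share a cell, which is why the resulting order differs from the input order.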
Examples
Using SWIFTGalaxies is almost the same as using the main SWIFTGalaxy class, except that (i) the halo catalogue is initialized with multiple target objects and (ii) the SWIFTGalaxies class provides an iteration method (__iter__), and determines its own iteration order. For example:

    from swiftgalaxy import SWIFTGalaxies, SOAP

    sgs = SWIFTGalaxies(
        "snapshot.hdf5",
        SOAP(
            "soap.hdf5",
            soap_index=[0, 123, 456],  # multiple target indices
        ),
    )
    iteration_order = sgs.iteration_order  # be aware of the order of iteration
    for sg in sgs:
        # some analysis involving the pre-loaded data fields goes here:
        sg.element_abundances.carbon
        sg.dark_matter.coordinates
        sg.stars.velocities

Alternatively, the map() method can be used to apply a function to all of the SWIFTGalaxy objects created by this class. For example:

    from swiftgalaxy import SWIFTGalaxies, SOAP

    sgs = SWIFTGalaxies(
        "snapshot.hdf5",
        SOAP(
            "soap.hdf5",
            soap_index=[0, 123, 456],  # multiple target indices
        ),
    )

    def analysis(sg):
        # this function can also have additional args & kwargs, if needed
        # it should only access the pre-loaded data fields
        sg.element_abundances.carbon
        sg.dark_matter.coordinates
        sg.stars.velocities
        return sg.element_abundances.carbon.mean()

    # map accepts arguments `args` and `kwargs`, passed through to function, if needed
    result = sgs.map(analysis)

- property iteration_order: ndarray
Property holding the order that the target objects will be iterated in.
The iteration order is likely not the same as the order in which the targets are provided, because the input order is probably not an optimal iteration order. This property provides the optimized iteration order evaluated by SWIFTGalaxies.
- Returns:
Array of indices specifying the iteration order.
- Return type:
ndarray
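When looping with a plain for statement, results are produced in iteration order rather than input order. The sketch below shows one way to put them back in input order. It assumes that element i of iteration_order is the position in the input target list of the i-th object visited; the array and the results used here are made-up stand-ins, not output from a real snapshot.

```python
import numpy as np

# Stand-in for sgs.iteration_order (assumed meaning: the i-th galaxy visited
# is the one at position iteration_order[i] of the input target list).
iteration_order = np.array([2, 0, 1])

# Stand-in for results collected inside `for sg in sgs:`, in visiting order.
results_in_iteration_order = [
    "result_for_input_2",
    "result_for_input_0",
    "result_for_input_1",
]

# Place each result back in the slot of the input target it belongs to.
results_in_input_order = [None] * len(iteration_order)
for visit_number, input_position in enumerate(iteration_order):
    results_in_input_order[input_position] = results_in_iteration_order[visit_number]

print(results_in_input_order)
```

The map() method described in this module returns results in input order directly, so this bookkeeping is only needed when iterating manually.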
- map(func: Callable, args: List[Tuple] | None = None, kwargs: List[Dict] | None = None) List[Any][source]
Apply a function to each object of interest and return a list of results.
The iteration order of SWIFTGalaxies is not necessarily the order in which the objects of interest are provided by the user, because the class determines an efficient iteration order to minimize I/O operations. This method applies a provided function to each object of interest in an efficient order, then returns the results in a list ordered in the same way as the objects of interest were input.
The function to be evaluated should expect a SWIFTGalaxy (from those to be iterated over) as its first argument. It may accept additional arguments and/or keyword arguments. These are passed to map as a list of tuples of arguments and a list of dicts of keyword arguments, with each list element corresponding to one entry in the list of target objects.
Currently this function only executes serially, but adding a parallel execution option, and further support for parallelization in analysis, is a high priority.
- Parameters:
func (callable) – The function to be evaluated.
args (list (optional), default: None) – List of additional arguments to the function to be evaluated (the first argument is always the current SWIFTGalaxy in the iteration). Each item in the list should be a tuple of arguments, with one tuple for each galaxy being iterated over. See examples section for further details.
kwargs (list (optional), default: None) – List of additional keyword arguments to pass to the function to be evaluated. Each item in the list should be a dict of keyword arguments, with one dict for each galaxy being iterated over. Dictionary keys are the names of the keyword arguments and the corresponding dictionary values are the values of the keyword arguments. See examples section for further details.
- Returns:
A list containing the return value(s) of the function applied to each object of interest, in the same order as the objects of interest were passed to the halo finder interface.
- Return type:
list
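The pairing of args and kwargs entries with targets can be modelled in plain Python. The following is a minimal sketch under stated assumptions, not the library's implementation: toy_map and describe are hypothetical names, plain integers stand in for SWIFTGalaxy objects, and the visiting order is supplied by hand.

```python
def toy_map(func, targets, visiting_order, args=None, kwargs=None):
    """Toy stand-in for map(): call func once per target, return input-ordered results."""
    n = len(targets)
    args = [()] * n if args is None else args
    kwargs = [{}] * n if kwargs is None else kwargs
    results = [None] * n
    for i in visiting_order:
        # Entry i of args/kwargs belongs to target i, whatever the visiting order.
        results[i] = func(targets[i], *args[i], **kwargs[i])
    return results


def describe(target, scale, label="galaxy"):
    return f"{label} {target} x{scale}"


out = toy_map(
    describe,
    targets=[11, 22, 33],
    visiting_order=[2, 0, 1],  # deliberately different from the input order
    args=[(1,), (2,), (3,)],
    kwargs=[{"label": "a"}, {"label": "b"}, {"label": "c"}],
)
print(out)
```

Because each result is stored at the index of its target, the returned list follows the input order even though the calls happen in a different order.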
Examples
A simple example that applies a function dm_median_position to each galaxy in a list of targets [11, 22, 33]:

    import numpy as np
    from swiftgalaxy import SWIFTGalaxies, SOAP

    # define the function that we will apply to each SWIFTGalaxy object:
    def dm_median_position(sg):
        return np.median(sg.dark_matter.coordinates, axis=0)

    sgs = SWIFTGalaxies(
        "my_snapshot.hdf5",
        SOAP(
            "my_soap.hdf5",
            soap_index=[11, 22, 33],
        ),
    )
    my_result = sgs.map(dm_median_position)

The result stored in my_result contains the result of the function for the galaxies at index 11, 22 and 33, in the same order as they are given in the soap_index list.
This second example shows how to pass extra arguments and/or keyword arguments to the function given to map:

    import numpy as np
    from swiftgalaxy import SWIFTGalaxies, SOAP

    # define the function that we will apply to each SWIFTGalaxy object:
    def dm_median_position(
        sg,  # the first argument is always a SWIFTGalaxy from the iteration
        extra_argument_1,
        extra_argument_2,
        extra_kwarg_1=None,
        extra_kwarg_2=None,
    ):
        # presumably make use of the extra arguments and/or kwargs here...
        return np.median(sg.dark_matter.coordinates, axis=0)

    sgs = SWIFTGalaxies(
        "my_snapshot.hdf5",
        SOAP("my_soap.hdf5", soap_index=[11, 22, 33]),
    )
    my_result = sgs.map(
        dm_median_position,
        args=[
            (my_extra_arg_1_for_galaxy_11, my_extra_arg_2_for_galaxy_11),
            (my_extra_arg_1_for_galaxy_22, my_extra_arg_2_for_galaxy_22),
            (my_extra_arg_1_for_galaxy_33, my_extra_arg_2_for_galaxy_33),
        ],
        kwargs=[
            dict(
                extra_kwarg_1=my_extra_kwarg_1_for_galaxy_11,
                extra_kwarg_2=my_extra_kwarg_2_for_galaxy_11,
            ),
            dict(
                extra_kwarg_1=my_extra_kwarg_1_for_galaxy_22,
                extra_kwarg_2=my_extra_kwarg_2_for_galaxy_22,
            ),
            dict(
                extra_kwarg_1=my_extra_kwarg_1_for_galaxy_33,
                extra_kwarg_2=my_extra_kwarg_2_for_galaxy_33,
            ),
        ],
    )

Note that if you have only a single extra argument it must still be packaged as a tuple, for instance:

    args=[
        (my_extra_arg_for_galaxy_11, ),
        (my_extra_arg_for_galaxy_22, ),
        (my_extra_arg_for_galaxy_33, ),
    ]

The commas inside the parentheses are not optional!
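The reason the commas matter is standard Python behaviour, independent of this library: parentheses alone do not create a tuple, the trailing comma does. The short sketch below demonstrates this with a hypothetical helper named show; how map itself reacts to a missing comma is not shown here and may differ in its error message.

```python
single_arg = 3.0

not_a_tuple = (single_arg)  # just parentheses around a float
one_tuple = (single_arg,)  # the trailing comma makes a 1-element tuple

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)


def show(first, *extra):
    """Return the extra positional arguments that were received."""
    return extra


# Unpacking works for the 1-element tuple...
print(show("sg", *one_tuple))

# ...but unpacking a bare float raises TypeError, because a float is not iterable.
try:
    show("sg", *not_a_tuple)
except TypeError as error:
    print("TypeError:", error)
```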