Halo finders

swiftgalaxy uses a helper class to create a uniform interface to outputs from different halo finders. Provided a python library to read the halo finder outputs already exists, this helper class is usually lightweight and easy to create. Currently, the Velociraptor halo finder and Caesar catalogue format have built-in support, and support for SOAP is planned. Other halo finders may be supported on request – pull requests to the repository are also welcome.

The second argument to create a SWIFTGalaxy is an instance of a class derived from the base helper class _HaloFinder, such as Velociraptor. This object has multiple roles. It will be aware of:

  • the location of the halo finder output files;

  • how to extract the properties of a galaxy of interest from those files;

  • how to create a mask specifying the particles belonging to a galaxy of interest.

SOAP

Future support for SOAP is planned as soon as a stable python interface to SOAP catalogues becomes available. Such an interface is currently in development.

Caesar

The Caesar catalogue format is popular in the Simba simulation community and lives within the yt ecosystem. The Caesar helper class relies on the Group interface to the halo finder outputs. Setting up an instance of the helper class is straightforward. We’ll assume a Caesar output called caesar_catalogue.hdf5. There are two types of groups compatible with SWIFTGalaxy that are defined in these catalogues: halos and galaxies (refer to the Caesar documentation for details). The type of object is specified in a group_type argument (valid values are "halo: or "galaxy"). The position of the object of interest in the corresponding Caesar list is specified in the group_index argument. For example, to choose the 4th entry in the halo list (halo_index=3, since the list is indexed from 0):

cat = Caesar(
    caesar_file="caesar_catalogue.hdf5",
    group_type="halo",
    group_index=3,
)

The first argument could also include a path if needed, e.g. "/path/to/caesar_catalogue.hdf5".

The properties of the object of interest are made conveniently available with the __getattr__() (dot) syntax, which exposes the interface provided by the Group class. For example, the info() function familiar to Caesar users (e.g. using the caesar tools load("caesar_catalogue.hdf5").galaxies[3].info()) is available as:

cat.info()

This lists available integrated properties of the object of interest, for example the virial mass (if available) would be accessed as:

cat.virial_quantities["m200c"]

Caesar is compatible with yt and returns values with units specified with yt that unyt understands by default.

Warning

Caesar defines its own unit registry that specifies how some customised units convert to units provided by yt. For example, a Mpccm (co-moving Mpc) unit is defined. Because swiftsimio provides its own custom implementation of co-moving units that is not explicitly aware of the CAESAR implementation, but both are compatible with yt, some issues can arise. The cosmo_array provided by swiftsimio does produce a warning when potentially ambiguous calculations are attempted (e.g. where its doesn’t know that both argument are co-moving, or that both are physical), and this will trigger on calculations mixing incompatible CAESAR-style and cosmo_array units. However, occasionally swiftsimio uses bare unyt quantities or arrays, and if a CAESAR-style quantity collides with one of these in a calculation silent and incorrect conversion from comoving to physical units (or any other redshift-dependent units) can occur. It is therefore recommended that users convert CAESAR-style quantities to use cosmo_array before they are passed to swiftsimio or swiftgalaxy functions. For example:

import unyt as u
from swiftsimio.objects import cosmo_array, cosmo_factor
scale_factor = ...  # retrieve scale factor from snapshot or catalogue file
cosmo_array(
    cat.virial_quantities["r200c"].to(u.kpc),  # ensures physical units
    comoving=False,
    cosmo_factor=cosmo_factor(a**1, scale_factor=scale_factor)
).to_comoving()  # or leave in physical units if desired

Usually the Caesar object is used to create a SWIFTGalaxy object. In this case the interface is exposed through the halo_finder attribute, for example:

sg = SWIFTGalaxy(
    ...,
    Caesar(...),
)
sg.halo_finder.info()
sg.halo_finder.virial_quantities["m200c"]

By default, the Caesar class will identify the particles that the halo finder deems bound to the object as belonging to the galaxy. This is controlled by the argument:

Caesar(
    ...,
    extra_mask="bound_only"
)

This behaviour can be adjusted. If None is passed instead, then only the spatial masking provided by _get_spatial_mask() is used. This means that all particles in the set of (probably cubic) subvolumes of the simulation that overlap with the region of interest will be read in. Alternatively, a MaskCollection can be provided. This will be used to select particles from those already selected spatially.

Warning

Older CAESAR outputs (prior to updates to the package in October 2023) do not contain enough information to define a spatial sub-region to take advantage of swiftsimio’s spatial masking. swiftgalaxy is still compatible with these older output files but properties of all particles in the box will be read and then masked down to the object of interest, which is very inefficient. When swiftgalaxy doesn’t find the information needed for spatial masking in a CAESAR output file, it will produce a warning at runtime before proceeding (very inefficiently).

If a different subset of particles is desired, often the most practical option is to first set up the SWIFTGalaxy with either extra_mask="bound_only" or extra_mask=None and then use the loaded particles to compute a new mask that can then be applied, perhaps permanently. Since all particles within the spatially masked region will always be read in any case, this does not imply any loss of efficiency.

The Caesar catalogue lists two centres for halos and galaxies. By default, the location of the gravitational potential minimum is assumed as the centre of the objet (and will be used to set the coordinate system, unless the argument auto_recentre=False is passed to SWIFTGalaxy). Usually the available centring options are:

  • "minpot" – potential minimum

  • "" – centre of mass

These can be used as, for example:

Caesar(
    ...,
    centre_type="",  # centre of mass (no suffix in Caesar catalogue)
)

Velociraptor

Velociraptor is a widely-used halo finder. Some SWIFT-based simulations projects have used it, but are largely moving to a model where particles are assigned to halos with Velociraptor (or another finder) and a catalogue is produced with the SOAP tool. The Velociraptor catalogue format is therefore falling somewhat out of fashion in the SWIFT community. It is supported for use with SWIFTGalaxy, but is unlikely to be further developed or maintained. The Velociraptor helper class relies on the velociraptor interface to the halo finder outputs. Setting up an instance of the helper class is straightforward. If the halo finder outputs are named, for example, halos.properties, halos.catalog_groups, etc., and the galaxy of interest occupies the 4th row in the catalogue (halo_index=3, since rows are indexed from 0), then:

cat = Velociraptor(
    "halos",
    halo_index=3
)

The first argument could also include a path if needed, e.g. "/path/to/halos".

Warning

Currently the velociraptor module does not support selecting galaxies by a unique identifier, e.g. cat.ids.id. Users are advised to check this identifier for their selected galaxy to ensure that they obtain the object that they expected.

The properties of the galaxy of interest as calculated by Velociraptor are made conveniently available with the __getattr__() (dot) syntax, which exposes the interface provided by the velociraptor module. For example, the virial mass can be accessed as cat.masses.mvir. Lists of available properties can be printed interactively using print(cat) (or simply cat at the python prompt), or print(cat.masses), etc. When a Velociraptor instance is used to initialize a SWIFTGalaxy, it is made available through the halo_finder attribute. For example, to access the virial mass:

sg = SWIFTGalaxy(
    ...,
    Velociraptor(
        ...
    )
)
sg.halo_finder.masses.mvir

By default, the Velociraptor class will identify the particles that the halo finder deems bound to the object as belonging to the galaxy. This is controlled by the argument:

Velociraptor(
    ...,
    extra_mask="bound_only"
)

This behaviour can be adjusted. If None is passed instead, then only the spatial masking provided by velociraptor.swift.swift.generate_spatial_mask() is used. This means that all particles in the set of (probably cubic) subvolumes of the simulation that overlap with the region of interest will be read in. Alternatively, a MaskCollection can be provided. This will be used to select particles from those already selected using generate_spatial_mask().

If a different subset of particles is desired, often the most practical option is to first set up the SWIFTGalaxy with either extra_mask="bound_only" or extra_mask=None and then use the loaded particles to compute a new mask that can then be applied, perhaps permanently. Since all particles in the spatial region defined by generate_spatial_mask() will always be read in any case, this does not imply any loss of efficiency.

The Velociraptor halo finder computes several centres for halos. By default, the location of the gravitational potential minimum is assumed as the centre of the galaxy (and will be used to set the coordinate system, unless the argument auto_recentre=False is passed to SWIFTGalaxy). Usually the available centring options are:

  • "minpot" – potential minimum

  • "" – centre of mass (?)

  • "_gas" – gas centre of mass (?)

  • "_star" – stellar centre of mass (?)

  • "mbp" – most bound particle

These can be used as, for example:

Velociraptor(
    ...,
    centre_type="mbp"
)

Other halo finders

Support for other halo finders will be considered on request.

Entrepreneurial users may also create their own helper class inheriting from swiftgalaxy.halo_finders._HaloFinder. In this case, the following methods should be implemented:

  • _load(): called during __init__(), implement any initialisation tasks here.

  • _get_spatial_mask(): return a SWIFTMask defining the spatial region to be loaded for the galaxy of interest.

  • _get_extra_mask(): return a MaskCollection defining the subset of particles from the loaded spatial region that belong to the galaxy of interest.

  • centre(): return the coordinates (as a cosmo_array) to be used as the centre of the galaxy of interest (implemented with the @property decorator).

  • velocity_centre(): return the coordinates (as a cosmo_array) to be used as the bulk velocity of the galaxy of interest (implemented with the @property decorator).

In addition, it is recommended to expose the properties computed by the halo finder, masked to the values corresponding to the object of interest. To make this intuitive for users, the syntax to access attributes of the galaxy of interest should preferably match the syntax used for the library conventionally used to read outputs of that halo finder, if it exists. For instance, for Velociraptor this is implemented via __getattr__ (dot syntax), which simply exposes the usual interface (with a mask to pick out the galaxy of interest).

Using swiftgalaxy without a halo catalogue

A helper class called swiftgalaxy.halo_finders.Standalone is provided so that the features of swiftgalaxy that aren’t directly tied to a halo catalogue (e.g. spherical and cylindrical coordinates, consistent coordinate frame, etc.) can be used when no supported halo catalogue is available.

Often the most pragmatic way to create a selection of particles using Standalone is to first select a spatial region guaranteed to contain the particles of interest and then create the final mask programatically using SWIFTGalaxy’s masking features. For example, suppose that you know that there is a galaxy with its centre at (2, 2, 2) Mpc and that you eventually want to select all particles in a spherical aperture 1 Mpc in radius around this point. Start with a cubic spatial mask enclosing this region:

from swiftgalaxy import SWIFTGalaxy, Standalone, MaskCollection
from swiftsimio import cosmo_array
import unyt as u

sg = SWIFTGalaxy(
    "my_snapshot.hdf5",
    Standalone(
        centre=cosmo_array([2.0, 2.0, 2.0], u.Mpc),
        velocity_centre=cosmo_array([0.0, 0.0, 0.0], u.km / u.s),
        spatial_offsets=cosmo_array([[-1.0, 1.0], [-1.0, 1.0], [-1.0, 1.0]], u.Mpc),
        extra_mask=None,  # we'll define the exact set of particles later
    )
)

You can next define the masks selecting particles in your desired spherical aperture, using SWIFTGalaxy’s convenient spherical coordinates feature, and store them in a MaskCollection:

mask_collection = MaskCollection(
    gas=sg.gas.spherical_coordinates.r < 1 * u.Mpc,
    dark_matter=sg.dark_matter.spherical_coordinates.r < 1 * u.Mpc,
    stars=sg.stars.spherical_coordinates.r < 1 * u.Mpc,
    black_holes=sg.black_holes.spherical_coordinates.r < 1 * u.Mpc,
)

Finally, apply the mask to the sg object:

sg.mask_particles(mask_collection)

You’re now ready to proceed with analysis of the particles in the 1 Mpc spherical aperture using this sg object.

Note

mask_particles() applies the masks in-place. The mask could also be applied with the __getattr__() method (i.e. in square brackets), but this returns a copy of the SWIFTGalaxy object. If memory efficiency is a concern, prefer the mask_particles() approach.