Cookbook¶
Not yet fully vetted
These recipes are illustrative and have not all been verified end-to-end against live catalogs — some may need adjusting (column names, catalog names, exact verb arguments) for your data. Treat them as starting points rather than guaranteed-runnable scripts, and check the linked guides for the authoritative API.
Runnable end-to-end recipes for the science workflows ACID was built for. Each entry leads with the science question, then gives a single self-contained snippet you can drop into a script or notebook.
Every recipe assumes import acid and import astropy.units as u.
acid is module-level and singleton-by-default: the recipes call
acid.open(...) / acid.sql.query(...) directly on the process-wide default
connection — the first call lazy-initializes it, reading the catalog
source path from your configuration (the ACID_PATH environment variable
or ~/.config/acid). Catalog names used in the snippets follow
the real ones (gaia_dr3, twomass_psc, ztf, panstarrs1,
rubin_object) when realistic; the synthetic fixtures (a, b, c,
lc) appear when a snippet is meant to run against the test suite.
Substitute the catalogs you actually have registered.
For task-shaped explanations of the underlying verbs, see the user guide; the recipes below stitch those verbs together.
Gaia × Rubin counterparts in one science field¶
Find the Gaia DR3 source associated with every Rubin DP1 object in a science field, with the separation in arcseconds and the Gaia photometry attached. A textbook position-by-position counterpart match.
import acid
import astropy.units as u
gaia = acid.open("gaia_dr3")
rubin = acid.open("rubin_object")
# 1° science field. The cone scopes both catalogs uniformly;
# the radius-vs-margin rule is enforced at analyze time.
with acid.in_cone((150.5, 2.2), radius=0.5 * u.deg):
counterparts = (rubin
.crossmatch(gaia, radius=1 * u.arcsec, dist_col="d_arcsec")
.select("objectId, source_id, d_arcsec, "
"g_psfMag, phot_g_mean_mag, parallax")
.to_polars())
print(counterparts.shape)
print(counterparts["d_arcsec"].describe())
maxmatch=1 (the default) gives one Gaia source per Rubin object —
the closest one within 1″. For "every Gaia source within 1″", pass
maxmatch=-1.
If you're at full-sky scale, drop the with acid.in_cone(...): block;
the query is the same. Develop with the cone, run without it
(see Debug small, run big).
ZTF transient candidates → Pan-STARRS host photometry¶
For every ZTF alert candidate near a target field, find the Pan-STARRS host (the nearest PS1 source within 1.5″) and pull its multi-band photometry.
import acid
import astropy.units as u
ztf = acid.open("ztf_alerts")
ps1 = acid.open("panstarrs1")
candidates = (ztf
.where("classtar > 0.5 AND magpsf < 19.5")
.crossmatch(ps1, radius=1.5 * u.arcsec, dist_col="d_arcsec",
how="left")
.select("candid, ra, dec, magpsf, "
"objID, gMeanPSFMag, rMeanPSFMag, iMeanPSFMag, "
"d_arcsec")
.to_polars())
print(f"{len(candidates):,} candidates")
print(f" with PS1 host: {candidates['objID'].is_not_null().sum():,}")
print(f" hostless: {candidates['objID'].is_null().sum():,}")
how="left" keeps every candidate (matched or not). The columns
from PS1 are NULL for hostless candidates — useful for separating
extragalactic transients (host present) from genuinely
hostless ones.
If ZTF positions are at a different epoch than PS1, propagate before registering (see Crossmatch guide §1).
Sky-binned source counts (a HEALPix density map)¶
Produce a HEALPix histogram of Gaia source counts at order N, for making a sky-density figure.
import acid
from acid import agg
gaia = acid.open("gaia_dr3")
# Bin to Norder=8 (~13 arcmin pixels). _healpix_29 is the
# order-29 pixel ID present in every HATS catalog; right-
# shifting by 2 × (29 - 8) gives the order-8 pixel.
density = (gaia
.where("phot_g_mean_mag BETWEEN 14 AND 18")
.group_by("_healpix_29 / 4398046511104 AS hpix_8") # 4398046511104 = 4^(29 - 8)
.aggregate(n=agg.count())
.to_polars())
print(density.head())
The _healpix_29 / 4^(29-8) divisor is integer arithmetic — push the
division into the GROUP BY rather than computing a separate column,
and Polars folds it cleanly. Result is one row per non-empty pixel.
This pattern generalises to any sky-binned aggregate: replace
agg.count() with agg.mean("phot_g_mean_mag") for a mean-magnitude
map, or agg.std("parallax") for a parallax-scatter map.
Light curves for a list of targets¶
Given a CSV of target coordinates, fetch the Rubin DP1 forced-source light curve for each — one row per target, the detections as time-ordered lists.
The target list goes straight in as a virtual catalog (no offline HATS import). The shape is crossmatch by position to anchor each target to a Rubin object, then a nested equi-join on the object ID so each target's detections fold into per-object lists.
import acid
import astropy.units as u
# Your target list: a CSV with RA/DEC columns (degrees, ICRS).
targets = acid.open("targets.csv", ra="RA", dec="DEC")
rubin_obj = acid.open("rubin_object") # Rubin DP1 object table
rubin_fs = acid.open("rubin_forcedsource") # Rubin DP1 forced-source light curves
# Position match → Rubin object, then nested equi-join by objectId.
lc = (targets
.crossmatch(rubin_obj, radius=1 * u.arcsec, dist_col="d_arcsec")
.join(rubin_fs, on="objectId", how="left",
nested=True, order_by="midpointMjdTai"))
lc.save("/scratch/target_lightcurves", name="target_lc") # stays queryable
# one row per matched object; midpointMjdTai / band / psfFlux are lists.
# Or, for one flat file to hand to a notebook / collaborator:
lc.export("target_lightcurves.parquet") # leaves the system
save writes a queryable HATS catalog (streaming — any size); export
gathers the full result in memory and writes one flat file (right for a
modest target-list output, not a full-sky one). For an in-memory frame
(pandas / Astropy) instead of a file, pass it to
acid.open(frame, ra=..., dec=...) the same way. See
bring your own target list
and the light-curves task guide for the full
treatment (nested joins, column pruning, per-object UDF reductions).
Anti-join: Rubin objects without a Gaia counterpart¶
Find every Rubin object inside a field that has no Gaia source within 1″ — a candidate list for "Rubin-only" detections (potential variables not in Gaia, faint sources, transients, artifacts).
import acid
import astropy.units as u
rubin = acid.open("rubin_object")
gaia = acid.open("gaia_dr3")
with acid.in_cone((150.5, 2.2), radius=0.5 * u.deg):
anti = (rubin
.crossmatch(gaia, radius=1 * u.arcsec, how="left", dist_col="d")
.where("d IS NULL") # the anti-join key
.select("objectId, ra, dec, g_psfMag, r_psfMag")
.to_polars())
print(f"{len(anti):,} Rubin objects without a Gaia counterpart within 1\"")
how="left" + WHERE d IS NULL is the canonical anti-join pattern in
ACID. The pattern works the same in the SQL surface:
LEFT JOIN gaia ON XMATCH(radius_arcsec => 1.0, dist_col => 'd')
WHERE d IS NULL.
Restrict to a survey footprint¶
Count Gaia sources inside DELVE's published footprint. Footprints
ship as point_map.fits in most HATS catalogs; ACID picks them up
automatically when you name a catalog in IN_MOC(...) /
Catalog.in_region(<catalog_name>).
The footprint restriction prunes whole partitions before any worker reads data; this is one of the fastest queries you can run over a billion-row catalog. See Sky regions & footprints.
Materialize an intermediate, then iterate¶
A heavy crossmatch you want to run once, then explore many times.
Catalog.save(path, name=...) writes a HATS tree and registers it on
the connection. Subsequent queries against the saved name skip the
matcher entirely.
import acid
import astropy.units as u
# Heavy: full-sky Gaia × 2MASS, one-time cost.
bright = (acid.open("gaia_dr3")
.where("phot_g_mean_mag < 16")
.crossmatch(acid.open("twomass_psc"),
radius=1 * u.arcsec, dist_col="d_arcsec")
.save("/scratch/bright_gaia_2mass",
name="bright_x_2mass"))
# Light: iterate against the cached catalog.
bright.where("d_arcsec < 0.5").head(20).show()
bright.where("j_m < 14").head(20).show()
acid.sql.query("""
SELECT COUNT(*) AS n
FROM bright_x_2mass
WHERE d_arcsec < 0.3
""").show()
save returns a fresh Catalog pointing at the saved tree; the same
name is also reachable through acid.open("bright_x_2mass") and from
SQL. The output is a valid HATS tree that LSDB and hats.read_hats(...)
read without modification.
Top-K closest matches¶
The 100 closest Gaia/2MASS pairs in a field. .sort(...) + .limit(K)
pushes the LIMIT to each partition, so the work scales with K, not
with the total match count.
import acid
import astropy.units as u
gaia = acid.open("gaia_dr3")
twomass = acid.open("twomass_psc")
closest = (gaia.crossmatch(twomass, radius=5 * u.arcsec, dist_col="d")
.sort("d")
.limit(100)
.to_polars())
print(closest)
Self-crossmatch (pairs within one catalog)¶
Find close stellar pairs in Gaia — sources within 5″ of each other.
Open the same catalog twice with different aliases; the
source_id < source_id_b clause keeps each pair exactly once.
import acid
r = acid.sql.query("""
SELECT a.source_id AS id1,
b.source_id AS id2,
d_arcsec
FROM gaia_dr3 AS a
JOIN gaia_dr3 AS b
ON XMATCH(radius_arcsec => 5.0, mode => 'all', dist_col => 'd_arcsec')
WHERE a.source_id < b.source_id
ORDER BY d_arcsec
LIMIT 100
""")
r.show()
Run a recipe from the CLI¶
Each Python recipe maps to a one-liner acid query:
# Pretty-print to stdout.
acid query "SELECT COUNT(*) FROM gaia_dr3 WHERE IN_MOC(gaia_dr3, 'delve_dr3')" \
--db /data/hats
# Run a longer query from a file, write a HATS tree.
acid query -f /scripts/gaia_x_rubin.sql --db /data/hats \
--output /scratch/gaia_x_rubin/ --workers 32
# Restrict to a debug cone right from the shell.
acid query -f /scripts/gaia_x_rubin.sql --db /data/hats \
--cone 150.5,2.2,0.5
See the CLI reference for the full flag list.
See also¶
- User guide — Crossmatching catalogs — the full reference for the verb and the four astronomy-correctness checks.
- User guide — Light curves for a list of targets — the longer task-shaped version of the target-list recipe above.
- User guide — Debug small, run big — the cone-then-full-sky pattern that underlies every recipe here.
- Reference — SQL features — the canonical list of supported and rejected query shapes.