Sky regions & footprints

For row-by-row cuts (magnitude, color, flag, e.g. where("g_mag < 18")), see filtering rows. This page covers the spatial restriction surface: scoping a query to a sky region (a cone, a MOC, a survey footprint, another catalog's footprint). Three entry points, same underlying machinery:

  • Catalog.in_region(...) — the fluent verb. Accepts a registered MOC name, a peer Catalog's footprint, a FITS path, a mocpy.MOC object, or a HATS directory with a point_map.fits.
  • IN_MOC(<alias>, '<name>') — the SQL predicate. Per-row boolean, restricted to a top-level conjunctive WHERE position.
  • acid.in_cone(center, radius=...) — a context manager that scopes a circular cone to every query executed inside a with block; both surfaces respect it.

ACID uses the HEALPix _healpix_29 column to push the region down to parquet row-group statistics (the same fast path as _healpix_29 row-group pushdown), so a region restriction prunes whole row groups before any rows are decoded.
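To make the pruning idea concrete, here is a minimal sketch of how per-row-group min/max statistics on a sorted pixel column can rule out whole row groups. It is an illustration, not ACID's implementation: RowGroup, overlaps, and prune are hypothetical names, and the region is assumed to be already flattened into half-open ranges of _healpix_29 values.

```python
from typing import NamedTuple


class RowGroup(NamedTuple):
    """Hypothetical stand-in for the statistics parquet keeps per row group."""
    index: int
    hp_min: int  # smallest _healpix_29 value in the row group
    hp_max: int  # largest _healpix_29 value in the row group


def overlaps(group, ranges):
    """True if [hp_min, hp_max] intersects any half-open [lo, hi) range."""
    return any(group.hp_min < hi and group.hp_max >= lo for lo, hi in ranges)


def prune(groups, ranges):
    """Return the indices of row groups that may contain rows in the region."""
    return [g.index for g in groups if overlaps(g, ranges)]


# Toy region and row-group statistics (made-up numbers, for illustration only).
region = [(1000, 2000), (5000, 5500)]
groups = [
    RowGroup(0, 0, 999),      # entirely below the region
    RowGroup(1, 900, 1200),   # straddles the first range
    RowGroup(2, 2000, 4999),  # falls in the gap between the two ranges
    RowGroup(3, 5400, 9000),  # straddles the second range
]
survivors = prune(groups, region)
print(survivors)
```

Only the surviving row groups need to be decoded; the per-row test then runs on those rows alone.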

What a MOC is

A MOC (Multi-Order Coverage map) is a region of sky encoded as a set of HEALPix pixels at multiple Norder resolutions — coarse where the region is solid, fine along boundaries. The IVOA specifies the FITS encoding; the major surveys publish their footprints as MOC FITS files (DES DR2, 2MASS, DELVE, …). See the glossary for the formal definition.

For ACID's purposes, you do not need to know the encoding. All that matters is that a MOC is a sky region you point ACID at, and ACID restricts the query to rows inside it.
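If you are curious anyway, the multi-order idea fits in a few lines. The sketch below assumes the HEALPix NESTED numbering, where a cell at a given order covers a contiguous block of order-29 indices, so membership reduces to a bit shift. The names cell_to_range, contains, and toy_moc are hypothetical and not part of ACID or mocpy.

```python
MAX_ORDER = 29  # the order of the _healpix_29 column


def cell_to_range(order, ipix):
    """Half-open range of order-29 indices covered by one (order, ipix) cell."""
    shift = 2 * (MAX_ORDER - order)  # each order splits a cell into 4 children
    return ipix << shift, (ipix + 1) << shift


def contains(moc_cells, healpix_29):
    """True if the order-29 index falls inside any cell of the toy MOC."""
    return any(
        (healpix_29 >> (2 * (MAX_ORDER - order))) == ipix
        for order, ipix in moc_cells
    )


# One coarse cell (order 0) plus one finer cell (order 2) elsewhere on the sky.
toy_moc = {(0, 3), (2, 17)}

lo, hi = cell_to_range(0, 3)
print(contains(toy_moc, lo), contains(toy_moc, hi))
```

A coarse cell stands in for all of its descendants, which is why a solid interior is cheap to encode and only the boundary needs fine cells.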

Registering a MOC

Three ways to register one:

register_moc.py
import acid

acid.init("catalogs.yaml")

# 1. From a FITS file on disk.
acid.register_moc("des_dr2", "/data/mocs/des_dr2.fits")

# 2. From a mocpy.MOC object you already have in memory.
from mocpy import MOC
m = MOC.from_string("4/0-100", "ascii")
acid.register_moc("custom", m)

# 3. From a YAML config (set up at init time, not at runtime).

The YAML form lives in your registry file:

catalogs.yaml
catalogs:
  gaia_dr3: { path: /data/gaia_dr3 }
  twomass:  { path: /data/twomass_psc }

mocs:
  des_dr2:          /data/mocs/des_dr2.fits
  known_artifacts:  /data/mocs/artifacts.fits

Catalogs that ship with a point_map.fits (the HATS standard footprint file) also expose that footprint by their catalog name. If you don't register "twomass" as a MOC but a registered catalog named twomass has a point_map.fits, IN_MOC(a, 'twomass') and a.in_region("twomass") both lazy-load it — useful for cross-survey footprint intersection without a separate MOC file.

When both an explicit MOC and an auto-loaded catalog footprint share a name, the explicit registration wins.

Filtering by a registered MOC

Fluent — in_region(...)

Catalog.in_region(region) is the fluent verb. The region argument accepts five things:

gaia.in_region("des_dr2")                          # registered MOC name
gaia.in_region(twomass)                             # peer Catalog -> its footprint
gaia.in_region("/data/mocs/des_dr2.fits")          # FITS path
gaia.in_region(some_mocpy_moc)                     # mocpy.MOC object
gaia.in_region("/data/twomass_hats")                # HATS dir -> point_map.fits

A registered name (the first form, whether an explicit MOC or an auto-resolved catalog footprint) takes precedence over a path on disk, so gaia.in_region("twomass") resolves through the registry the same way the SQL form does, and the compiled SQL uses the same MOC name.

For the catalog-handle form (in_region(twomass)), the catalog must come from the same Connection; cross-connection handles are rejected with ValueError. The catalog must have a point_map.fits at its root, otherwise it is a RegistryError (with a hint to pass a FITS path or a mocpy.MOC instead).

SQL — IN_MOC(<alias>, '<name>')

The SQL predicate is per-row boolean. The first argument is the table alias whose (ra, dec) is being tested; the second is the registered MOC name (or auto-resolvable catalog name):

SELECT a.source_id, a.ra, a.dec
FROM   gaia_dr3 AS a
WHERE  IN_MOC(a, 'des_dr2')
  AND  NOT IN_MOC(a, 'known_artifacts')

Multiple IN_MOC predicates AND-combined in a WHERE clause are folded into one effective MOC by the analyzer — the predicate fires once per row, with the intersection and exclusions baked in. That is why the AND-chain above is cheap even with two MOCs.
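The folding step is ordinary set algebra. As a rough sketch only (the analyzer's real data structures are not shown on this page), treat each MOC as a set of pixel indices at one fixed order: positive terms intersect, negated terms subtract. The fold function, the registry dict, and the pixel numbers are all hypothetical.

```python
def fold(terms, registry, full_sky):
    """Fold AND-ed (moc_name, negated) terms into one effective pixel set."""
    effective = set(full_sky)
    for name, negated in terms:
        if negated:
            effective -= registry[name]   # NOT IN_MOC(...) removes pixels
        else:
            effective &= registry[name]   # IN_MOC(...) keeps only shared pixels
    return effective


# Toy registry at HEALPix order 1 (12 * 4**1 = 48 pixels on the whole sky).
full_sky = set(range(48))
registry = {
    "des_dr2": {1, 2, 3, 4},
    "known_artifacts": {3, 4, 5},
}

# WHERE IN_MOC(a, 'des_dr2') AND NOT IN_MOC(a, 'known_artifacts')
effective = fold([("des_dr2", False), ("known_artifacts", True)], registry, full_sky)
print(sorted(effective))
```

Because the result is a single region, the per-row test and the row-group pruning each run once, no matter how many IN_MOC terms the WHERE clause listed.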

IN_MOC is also accepted in SELECT as a per-row boolean, and in CASE and ORDER BY, as long as the same MOC is also AND-ed into the top-level WHERE. The analyzer rejects naked positions outside the WHERE (see Where IN_MOC is rejected below). For example:

SELECT source_id,
       IN_MOC(a, 'des_dr2') AS in_des
FROM   gaia_dr3 AS a
WHERE  IN_MOC(a, 'des_dr2')         -- required for the SELECT use to fold

Cone restriction — in_cone(...)

A cone (one center + a radius) is the most common region in interactive analysis: "let me iterate on a 1° patch around (180°, 0°) before I run the whole sky". ACID exposes it as a context manager that applies to every query inside the with block — fluent or SQL — and lifts when the block ends:

in_cone.py
import acid
import astropy.units as u

acid.init("catalogs.yaml", workers=8)

gaia = acid.open("gaia_dr3")
bright = gaia.where("phot_g_mean_mag < 16")

with acid.in_cone((180.0, 0.0), radius=1 * u.deg):
    small = bright.to_polars()
    r     = acid.sql.query("SELECT COUNT(*) FROM gaia_dr3")
    # both are restricted to the cone

# back outside the block — full sky, same query object:
big = bright.to_polars()

The cone is execution-time: it applies to whichever queries run inside the with block, not to the Catalog you built. The bright handle above is reused both scoped and full-sky.

center is (ra_deg, dec_deg) (ICRS) or an astropy.SkyCoord; radius is either a plain number (degrees) or an astropy.units.Quantity.

The cone is exact — distances are computed with a haversine expression — and is pushed down to parquet row groups via the cone's MOC representation (the same path as IN_MOC).
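For reference, the haversine test itself is short. The following is a plain-Python sketch of the standard formula, not the expression ACID generates; angular_separation_deg and point_in_cone are hypothetical helper names, and all angles are in degrees.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation between two sky positions via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def point_in_cone(ra, dec, center, radius_deg):
    """True if (ra, dec) lies within radius_deg of center = (ra_deg, dec_deg)."""
    return angular_separation_deg(ra, dec, center[0], center[1]) <= radius_deg


center = (180.0, 0.0)
print(point_in_cone(180.5, 0.5, center, 1.0))  # inside the 1 degree cone
print(point_in_cone(181.0, 1.0, center, 1.0))  # outside: about 1.41 degrees away
```

The MOC representation of the cone only decides which row groups to read; a row-level test like this one is what makes the final result exact at the boundary.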

Cones do not nest

Only one cone is active at a time. Entering a second in_cone block while one is on the stack is rejected with ValidationError. The "true intersection of two non-concentric cones" is a lens, not a cone, and ACID refuses to silently approximate. Compose the two regions into a single cone before entering, or use in_region(...) with a MOC that represents the intersection.
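The no-nesting rule is easy to picture as a guard around a single "active cone" slot. This is a sketch of the pattern, assuming a contextvars-based scope; cone_scope, current_cone, and NestedConeError are hypothetical stand-ins, and ACID itself raises ValidationError rather than the error class used here.

```python
import contextvars
from contextlib import contextmanager

# Holds the active cone as (center, radius_deg), or None when no cone is set.
_active_cone = contextvars.ContextVar("active_cone", default=None)


class NestedConeError(ValueError):
    """Raised by this sketch when a second cone is entered inside the first."""


@contextmanager
def cone_scope(center, radius_deg):
    """Scope a cone to the enclosed block, refusing to nest."""
    if _active_cone.get() is not None:
        raise NestedConeError("a cone is already active; cones do not nest")
    token = _active_cone.set((center, radius_deg))
    try:
        yield
    finally:
        _active_cone.reset(token)  # lift the cone even if the block raised


def current_cone():
    """Return the active (center, radius_deg), or None outside any scope."""
    return _active_cone.get()


with cone_scope((180.0, 0.0), 1.0):
    print(current_cone())
print(current_cone())
```

Rejecting the second entry up front keeps the failure loud and local, instead of letting an inner block silently replace or approximate the outer region.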

Where IN_MOC is rejected

IN_MOC is a footprint restriction. Inside an AND-ed top-level WHERE (optionally NOT-negated), the analyzer can fold it into the effective MOC, push it down to row-group statistics, and prune work tuples at enumeration time. Outside that position, none of those fast paths apply — and a silent perf cliff is worse than a hard rejection. The analyzer rejects IN_MOC in the following positions with ValidationError:

  • Inside a disjunction (IN_MOC(...) OR ...).
  • In a JOIN ON clause.
  • In a SELECT projection without a matching WHERE term.
  • In ORDER BY, HAVING, or a CASE branch on its own.

The fix in every case is the same: move the region restriction to a top-level conjunctive WHERE. If your region truly is an OR-of-MOCs, union them in mocpy first and register the result as one MOC.
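The positional rule can be sketched as a walk over the WHERE expression tree: only terms reachable through top-level ANDs are collected, and an IN_MOC found anywhere else is an error. The node classes and collect_region_terms below are hypothetical, cover only the WHERE clause, and raise ValueError where ACID raises ValidationError.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InMoc:
    name: str


@dataclass(frozen=True)
class RowFilter:
    sql: str


@dataclass(frozen=True)
class Not:
    operand: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def mentions_in_moc(node):
    """True if an IN_MOC appears anywhere in this subtree."""
    if isinstance(node, InMoc):
        return True
    if isinstance(node, Not):
        return mentions_in_moc(node.operand)
    if isinstance(node, (And, Or)):
        return mentions_in_moc(node.left) or mentions_in_moc(node.right)
    return False


def top_level_conjuncts(node):
    """Yield the terms joined by ANDs at the top of the WHERE clause."""
    if isinstance(node, And):
        yield from top_level_conjuncts(node.left)
        yield from top_level_conjuncts(node.right)
    else:
        yield node


def collect_region_terms(where):
    """Split top-level IN_MOC terms into (included, excluded) MOC names."""
    include, exclude = [], []
    for term in top_level_conjuncts(where):
        if isinstance(term, InMoc):
            include.append(term.name)
        elif isinstance(term, Not) and isinstance(term.operand, InMoc):
            exclude.append(term.operand.name)
        elif mentions_in_moc(term):
            raise ValueError("IN_MOC must be a top-level AND-ed WHERE term")
    return include, exclude


ok = And(And(InMoc("des_dr2"), Not(InMoc("known_artifacts"))),
         RowFilter("phot_g_mean_mag < 16"))
print(collect_region_terms(ok))
```

An OR-of-MOCs such as Or(InMoc("des_dr2"), InMoc("twomass")) fails this check, which is the case the paragraph above resolves by building the union ahead of time.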

Composing region filters

You can combine in_region / IN_MOC with row filters and with each other. The pieces stack — ACID combines them into one effective region before pushing down:

bright_in_des = (acid.open("gaia_dr3")
                     .where("phot_g_mean_mag < 16")
                     .in_region("des_dr2")
                     .in_region("known_artifacts"))   # intersection

Subtracting a region — there is no direct fluent verb for NOT IN_MOC; the supported workaround is to drop into SQL for the negated leg:

in_des_not_artifacts = acid.sql.query("""
    SELECT * FROM gaia_dr3 AS a
    WHERE  IN_MOC(a, 'des_dr2')
      AND  NOT IN_MOC(a, 'known_artifacts')
""")

The same subtraction as a standalone SQL statement, with a row filter added:

SELECT *
FROM   gaia_dr3 AS a
WHERE  IN_MOC(a, 'des_dr2')
  AND  NOT IN_MOC(a, 'known_artifacts')
  AND  phot_g_mean_mag < 16

When to use which surface

  • A cone for interactive iteration → acid.in_cone(...). It applies to every query executed inside the block, including SQL.
  • A fixed survey footprint → register the MOC once via register_moc(...) (or the YAML mocs: block) and use in_region("name") or IN_MOC(a, 'name').
  • A catalog's published footprint → either register it explicitly or rely on the auto-load: in_region("twomass") resolves to twomass's point_map.fits if no explicit twomass MOC is registered.

See also

  • Filtering rows — non-spatial cuts (magnitude, color, flags). Region filters compose with row filters.
  • Crossmatching catalogs — restricting one or both sides of a crossmatch to a region.
  • SQL escape hatch — the inline-subquery / CTE pre-filter form, useful for narrowing a side that already has a region restriction.
  • Concepts: MOC footprints — what a MOC is and why it's the right encoding for sky regions.