Algorithms: Target Selection

Detailed descriptions of the selection algorithms for the different categories of SDSS targets are provided in the series of papers noted below under Target Selection References. Here we provide short summaries of the various target selection algorithms. Also compare the target selection quality page.

In the SDSS imaging data output tsObj files, the result of target selection for each object is recorded in the 32-bit primTarget flag, as defined in Table 27 of Stoughton et al. (2002). For details, see the Target Selection References.

Note the following subtleties:

An object can be targeted simultaneously by more than one algorithm.
The photometric catalogs contain a target selection flag for every single object, but not all objects which are flagged as a spectroscopic target will actually be observed with the spectrograph. The assignment of spectrograph fibers to targets from the photometry catalogs is called tiling.
Perhaps most importantly, the target selection flags used in order to create the spectroscopic plates were (necessarily) based on an earlier processing of the data. Thus, objects that were targets in the original rerun may not be targets now, and vice versa. For the Main Galaxy Sample, this amounts to changes in the r band flux limit; for Quasars it means wholesale changes in the algorithms; for Luminous Red Galaxies, it means that the effective color selection differs from place to place on the sky.
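
Because several algorithms can set bits in the same primTarget value, reading the flag amounts to testing individual bits. The following is a minimal sketch of that test. The function name and, in particular, the bit positions in EXAMPLE_BITS are hypothetical placeholders chosen for illustration; the actual bit assignments must be taken from Table 27 of Stoughton et al. (2002).

```python
def decode_flags(value, bit_table):
    """Return the names of all bits in bit_table that are set in value."""
    return [name for name, mask in bit_table.items() if value & mask]


# HYPOTHETICAL bit positions, for illustration only.
# Replace with the real values from Table 27 of Stoughton et al. (2002).
EXAMPLE_BITS = {
    "GALAXY": 1 << 0,
    "GALAXY_RED": 1 << 1,
    "QSO_CAP": 1 << 2,
}

# An object targeted by two algorithms at once has both bits set.
prim_target = EXAMPLE_BITS["GALAXY"] | EXAMPLE_BITS["GALAXY_RED"]
print(decode_flags(prim_target, EXAMPLE_BITS))
```

Keeping the bit table as an argument means the same helper works unchanged once the real table is supplied.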

The following samples are targeted:

Main Galaxy Sample
Luminous Red Galaxies (LRG)
Quasars
Stars
ROSAT All-Sky Survey sources
Serendipity targets
SEGUE targets

Main Galaxy Sample

The main galaxy sample target selection algorithm is detailed in Strauss et al. (2002) and is summarized in this schematic flowchart.

Galaxy targets are selected starting from objects which are detected in the r band (i.e. those objects which are more than 5σ above sky after smoothing with a PSF filter). The photometry is corrected for Galactic extinction using the reddening maps of Schlegel, Finkbeiner, and Davis (1998). Galaxies targeted from DR2 and later data are separated from stars using the following cut on the difference between the r-band PSF and cmodel magnitudes:

r_PSF - r_cmodel >= 0.24

Note that this cut is more conservative for galaxies than the star-galaxy separation cut used by Photo (galaxies targeted off DR1 data used a slightly different cut, r_PSF - r_model >= 0.3). Potential targets are then rejected if they have been flagged by Photo as SATURATED, BRIGHT, or (BLENDED and not NODEBLEND) - see the descriptions of flags. The Petrosian magnitude limit r_P = 17.77 is then applied, which results in a main galaxy sample surface density of about 90 per deg².
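
The first steps of the main galaxy selection (star-galaxy separation, flag rejection, and the Petrosian magnitude limit) can be sketched as follows for DR2 and later data. This is an illustration only: the function and argument names are invented here, the magnitudes are assumed to be already corrected for Galactic extinction, the Photo flags are assumed to have been unpacked into booleans, and the magnitude limit is assumed to be inclusive.

```python
STAR_GALAXY_CUT = 0.24   # r_PSF - r_cmodel threshold (DR2 and later)
R_PETRO_LIMIT = 17.77    # Petrosian magnitude limit in r


def is_galaxy_candidate(r_psf, r_cmodel, r_petro,
                        saturated=False, bright=False,
                        blended=False, nodeblend=False):
    """Apply the star-galaxy cut, the flag rejection, and the flux limit.

    Surface brightness cuts and bright-object rejection are NOT included.
    """
    # Extended objects have PSF magnitudes fainter than cmodel magnitudes.
    if r_psf - r_cmodel < STAR_GALAXY_CUT:
        return False
    # Reject SATURATED, BRIGHT, or (BLENDED and not NODEBLEND).
    if saturated or bright or (blended and not nodeblend):
        return False
    # Assumption: the limit itself is included.
    return r_petro <= R_PETRO_LIMIT


print(is_galaxy_candidate(r_psf=17.9, r_cmodel=17.5, r_petro=17.4))
```

An object passing this function is only a candidate; the surface brightness and bright-object criteria still apply.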

A number of surface brightness cuts are then applied, based on mu₅₀, the mean surface brightness within the Petrosian half-light radius petroR50. The most significant cut is mu₅₀ <= 23.0 mag arcsec^-2 in r, which already includes 99% of the galaxies brighter than the Petrosian magnitude limit. At surface brightnesses in the range 23.0 <= mu₅₀ <= 24.5 mag arcsec^-2, several other criteria are applied in order to reject most spurious targets, as shown in the flowchart. Please see the detailed discussion of these surface brightness cuts, including consideration of selection effects, in Section 4.4 of Strauss et al. (2002). Finally, in order to reject very bright objects which will cause contamination of the spectra of adjacent fibers and/or saturation of the spectroscopic CCDs, objects are rejected if they have (1) fiber magnitudes brighter than 15.0 in g or r, or 14.5 in i; or (2) Petrosian magnitude r_P < 15.0 and Petrosian half-light radius petroR50 < 2 arcsec.
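
The surface brightness and bright-object criteria above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are invented here, and the expression for mu₅₀ assumes the usual definition in which half of the Petrosian flux falls within a circle of radius petroR50 (in arcsec). The additional criteria applied between 23.0 and 24.5 mag arcsec^-2 are not reproduced.

```python
import math


def mean_surface_brightness(r_petro, petro_r50):
    """mu_50 in mag arcsec^-2, assuming half the Petrosian flux
    lies within a circle of radius petro_r50 (arcsec)."""
    return r_petro + 2.5 * math.log10(2.0 * math.pi * petro_r50 ** 2)


def too_bright(fiber_g, fiber_r, fiber_i, r_petro, petro_r50):
    """True if the object is rejected as too bright for the spectrograph."""
    # (1) fiber magnitudes brighter than 15.0 in g or r, or 14.5 in i
    fiber_reject = fiber_g < 15.0 or fiber_r < 15.0 or fiber_i < 14.5
    # (2) bright and compact: r_P < 15.0 and petroR50 < 2 arcsec
    compact_reject = r_petro < 15.0 and petro_r50 < 2.0
    return fiber_reject or compact_reject


mu50 = mean_surface_brightness(r_petro=17.0, petro_r50=3.0)
print(round(mu50, 2), mu50 <= 23.0)
print(too_bright(fiber_g=16.0, fiber_r=16.0, fiber_i=16.0,
                 r_petro=14.5, petro_r50=1.5))
```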

Main galaxy targets satisfying all of the above criteria have the GALAXY bit set in their primTarget flag. Among those, the ones with mu₅₀ >= 23.0 mag arcsec^-2 have the GALAXY_BIG bit set. Galaxy targets that fail all the surface brightness selection limits but have r-band fiber magnitudes brighter than 19 are accepted anyway (since they are likely to yield a good spectrum) and have the GALAXY_BRIGHT_CORE bit set.

Luminous Red Galaxies (LRG)

SDSS luminous red galaxies (LRGs) are selected on the basis of color and magnitude to yield a sample of luminous intrinsically red galaxies that extends fainter and farther than the SDSS main galaxy sample. Please see Eisenstein et al. (2001) for detailed discussions of sample selection, efficiency, use, and caveats.

LRGs are selected using a variant of the photometric redshift technique and are meant to comprise a uniform, approximately volume-limited sample of objects with the reddest colors in the rest frame. The sample is selected via cuts in the (g-r, r-i, r) color-color-magnitude cube. Note that all colors are measured using model magnitudes, and all quantities are corrected for Galactic extinction following Schlegel, Finkbeiner, and Davis (1998).

Objects must be detected by Photo as BINNED1 OR BINNED2 OR BINNED4 (see flag descriptions) in both r and i, but not necessarily in g, and objects flagged by Photo as BRIGHT or SATURATED in g, r, or i are excluded.

The galaxy model colors are rotated first to a basis that is aligned with the galaxy locus in the (g-r, r-i) plane according to:

c_⊥ = (r-i) - (g-r)/4 - 0.18
c_|| = 0.7(g-r) + 1.2[(r-i) - 0.18]
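
As a small worked example, the rotation above can be written directly in code. The function name and the example magnitudes are invented for illustration; the inputs are assumed to be extinction-corrected model magnitudes.

```python
def lrg_colors(g, r, i, offset=0.18):
    """Return (c_perp, c_par) from extinction-corrected model magnitudes."""
    gr = g - r
    ri = r - i
    c_perp = ri - gr / 4.0 - offset
    c_par = 0.7 * gr + 1.2 * (ri - offset)
    return c_perp, c_par


# Example: g - r = 1.5 and r - i = 0.5
c_perp, c_par = lrg_colors(g=20.0, r=18.5, i=18.0)
print(round(c_perp, 3), round(c_par, 3))
```

For these colors the sketch gives c_⊥ = 0.5 - 0.375 - 0.18 = -0.055 and c_|| = 1.05 + 1.2(0.32) = 1.434.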

Please note that some earlier versions of SDSS documentation (notably this page and the print version of the EDR paper) had incorrect signs in the definition of c_⊥; the definition above, in which both the (g-r)/4 term and the constant are subtracted, is correct, as is the one in the LRG target selection paper (referenced below).

Because the 4000 Angstrom break moves from the g band to the r band at a redshift z ~ 0.4, two separate sets of selection criteria are needed to target LRGs below and above that redshift:

Cut I for z <~ 0.4

r_Petro < 13.1 + c_|| / 0.3
r_Petro < 19.2
|c_⊥| < 0.2
mu₅₀ < 24.2 mag arcsec^-2
r_PSF - r_model > 0.3

Cut II for z >~ 0.4

r_Petro < 19.5
c_⊥ > 0.45 - (g-r)/6
g-r > 1.30 + 0.25(r-i)
mu₅₀ < 24.2 mag arcsec^-2
r_PSF - r_model > 0.5
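
The two sets of criteria translate into simple boolean tests. The sketch below is illustrative only: the function and argument names are invented here, all inputs are assumed to be extinction-corrected, colors are assumed to come from model magnitudes, and c_⊥ and c_|| are assumed to have been computed beforehand from the definitions above. The detection and flag requirements are not included.

```python
def passes_cut_i(r_petro, c_par, c_perp, mu50, r_psf, r_model):
    """LRG Cut I (z <~ 0.4), original thresholds."""
    return (r_petro < 13.1 + c_par / 0.3
            and r_petro < 19.2
            and abs(c_perp) < 0.2
            and mu50 < 24.2
            and r_psf - r_model > 0.3)


def passes_cut_ii(r_petro, gr, ri, c_perp, mu50, r_psf, r_model):
    """LRG Cut II (z >~ 0.4), original thresholds."""
    return (r_petro < 19.5
            and c_perp > 0.45 - gr / 6.0
            and gr > 1.30 + 0.25 * ri
            and mu50 < 24.2
            and r_psf - r_model > 0.5)


# Hypothetical low-redshift candidate (g-r = 1.5, r-i = 0.5)
print(passes_cut_i(r_petro=17.5, c_par=1.434, c_perp=-0.055,
                   mu50=22.0, r_psf=18.0, r_model=17.5))
# Hypothetical high-redshift candidate (g-r = 1.8, r-i = 1.0)
print(passes_cut_ii(r_petro=19.3, gr=1.8, ri=1.0, c_perp=0.37,
                    mu50=23.0, r_psf=20.0, r_model=19.4))
```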

Cut I selection results in an approximately volume-limited LRG sample to z=0.38, with additional galaxies to z ~ 0.45. Cut II selection adds yet more luminous red galaxies to z ~ 0.55. The two cuts together result in about 12 LRG targets per deg² that are not already in the main galaxy sample (about 10 in Cut I, 2 in Cut II).

In primTarget, GALAXY_RED is set if the LRG passes either Cut I or Cut II. GALAXY_RED_II is set if the object passes Cut II but not Cut I. However, neither of these flags is set if the LRG is brighter than the main galaxy sample flux limit but failed to enter the main sample (e.g., because of the main sample surface brightness cuts). Thus LRG target selection never overrules main sample target selection on bright objects.
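
The flag logic in this paragraph can be sketched as below. The function name and arguments are invented for illustration; the results of Cut I and Cut II and the main sample membership are assumed to be known already, and "brighter than the main galaxy sample flux limit" is read here as r_Petro < 17.77.

```python
MAIN_SAMPLE_LIMIT = 17.77  # main galaxy sample Petrosian limit in r


def lrg_flags(cut_i, cut_ii, r_petro, in_main_sample):
    """Return the set of LRG flag names that would be set."""
    flags = set()
    # LRG selection never overrules the main sample on bright objects.
    if r_petro < MAIN_SAMPLE_LIMIT and not in_main_sample:
        return flags
    if cut_i or cut_ii:
        flags.add("GALAXY_RED")
    if cut_ii and not cut_i:
        flags.add("GALAXY_RED_II")
    return flags


print(sorted(lrg_flags(cut_i=False, cut_ii=True, r_petro=19.3,
                       in_main_sample=False)))
```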

Changes in LRG target selection for DR2 and later data

With the change in the model magnitude code between DR1 and DR2/DR3 data (see improvements to image processing in DR2), the mean g-r and r-i model colors of galaxies have shifted by about 0.005 magnitudes. Because the LRG selection is very sensitive to color, this would have increased the number density of targets by about 10%. Instead, we shifted the LRG color cuts to compensate; in addition, improved star-galaxy separation allows tighter cuts on the model-PSF quantity by which stars are rejected. Here we give the updated criteria applied to data reduced with the DR2/DR3 version of the photometric pipeline (5_4; criteria not listed here are taken over unchanged from above):

The definition of the new color basis changes as follows:

c_⊥ = (r-i) - (g-r)/4 - 0.177
c_|| = 0.7(g-r) + 1.2[(r-i) - 0.177]

Cut I for z <~ 0.4 becomes

r_Petro < 13.116 + c_|| / 0.3
r_PSF - r_model >= 0.24

Cut II for z >~ 0.4 becomes

c_⊥ > 0.449 - (g-r)/6
g-r > 1.296 + 0.25 (r-i)
r_PSF - r_model >= 0.4
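
To see how small the shifts are in practice, the sketch below evaluates the Cut I sliding magnitude limit for one set of colors under both sets of constants. The dictionary keys "DR1" and "DR2", the function name, and the example magnitudes are labels invented for this illustration; only the color offset and the Cut I zero point are compared, not the star-galaxy thresholds.

```python
# Constants for the original and the DR2/DR3 versions of the Cut I limit.
LRG_PARAMS = {
    "DR1": {"offset": 0.18, "cut1_zero": 13.1},
    "DR2": {"offset": 0.177, "cut1_zero": 13.116},
}


def cut_i_mag_limit(g, r, i, version):
    """Faintest r_Petro allowed by the sliding Cut I limit."""
    p = LRG_PARAMS[version]
    c_par = 0.7 * (g - r) + 1.2 * ((r - i) - p["offset"])
    return p["cut1_zero"] + c_par / 0.3


old = cut_i_mag_limit(20.0, 18.5, 18.0, "DR1")
new = cut_i_mag_limit(20.0, 18.5, 18.0, "DR2")
print(round(old, 3), round(new, 3), round(new - old, 3))
```

For these example colors the limit moves from 17.880 to 17.908, a shift of 0.028 mag.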

This new version of LRG target selection is applied to the best region of sky reduced with the latest version of the imaging pipeline. It is of course not applied retroactively to the target version of the sky, which used older versions of the pipeline.

Due to other subtle differences in the photometric pipeline and the calibration, these changes will not exactly reproduce the selection criteria actually used when spectroscopy was carried out. Indeed, defining an LRG sample based on the best reductions will result in large spectroscopic incompleteness because so many objects are close to the boundaries. Instead, one should use the target photometry and adjust the calibrations of that relative to the best calibration. Of course, if one is interested in photometric properties of single objects, then we recommend the best photometry.

Quasars

The final adopted SDSS quasar target selection algorithm is described in Richards et al. (2002). However, it should be noted that the implementation of this algorithm came after the last date of DR1 spectroscopy. Thus this paper does not technically describe the DR1 quasar sample, and the DR1 quasar sample is not intended to be used for statistical purposes (but see below). Interested parties are instead encouraged to use the catalog of DR1 quasars that is being prepared by Schneider et al. (2003, in prep.), which will include an indication of which quasars were also selected by the Richards et al. (2002) algorithm. At some later time, we will also perform an analysis of those objects selected by the new algorithm but for which we do not currently have spectroscopy and will produce a new sample that is suitable for statistical analysis.

Though the DR1 quasars were not technically selected with the Richards et al. (2002) algorithm, the algorithms used since the EDR are quite similar to this algorithm and this paper suffices to describe the general considerations that were made in selecting quasars. Thus it is worth describing the algorithm in more detail.

The quasar target selection algorithms are summarized in this schematic flowchart. Because the quasar selection cuts are fairly numerous and detailed, readers are strongly encouraged to consult Richards et al. (2002) for the full discussion of the sample selection criteria, completeness, target efficiency, and caveats.

The quasar target selection algorithm primarily identifies quasars as outliers from the stellar locus, modelled following Newberg & Yanny (1997) as elongated tubes in the (u-g, g-r, r-i) (denoted ugri) and (g-r, r-i, i-z) (denoted griz) color cubes. In addition, targets are also selected by matches to the FIRST catalog of radio sources (Becker, White, & Helfand 1995). All magnitudes and colors are measured using PSF magnitudes, and all quantities are corrected for Galactic extinction following Schlegel, Finkbeiner, and Davis (1998).

Objects flagged by Photo as having either "fatal" errors (primarily those flagged BRIGHT, SATURATED, EDGE, or BLENDED; see flag descriptions) or "nonfatal" errors (primarily related to deblending or interpolation problems) are rejected from the color selection, but only objects with fatal errors are rejected from the FIRST radio selection. See Section 3.2 of Richards et al. (2002) for the full details. Objects are also rejected (from the color selection, but not the radio selection) if they lie in any of 3 color-defined exclusion regions which are dominated by white dwarfs, A stars, and M star+white dwarf pairs; see Section 3.5.1 of Richards et al. (2002) for the specific exclusion region color boundaries. Such objects are flagged as QSO_REJECT. Quasar targets are further restricted to objects with i_PSF > 15.0 in order to exclude bright objects which will cause contamination of the spectra from adjacent fibers.

Objects which pass the above tests are then selected to be quasar targets if they lie more than 4σ from either the ugri or griz stellar locus. The detailed specification of the stellar loci and of the outlier rejection algorithm are provided in Appendices A and B of Richards et al. (2002). These color-selected quasar targets are divided into main (or low-redshift) and high-redshift samples, as follows:

Main Quasar Sample (QSO_CAP, QSO_SKIRT)

These are outliers from the ugri stellar locus and are selected in the magnitude range 15.0 < i_PSF < 19.1. Both point sources and extended objects are included, except that extended objects must have colors that are far from the colors of the main galaxy distribution and that are consistent with the colors of AGNs; these additional color cuts for extended objects are specified in Section 3.4.4 of Richards et al. (2002).

Even if an object is not a ugri stellar locus outlier, it may be selected as a main quasar sample target if it lies in either of these 2 "inclusion" regions: (1) "mid-z", used to select 2.5 < z < 3 quasars whose colors cross the stellar locus in SDSS color space; and (2) "UVX", used to duplicate selection of z <= 2.2 UV-excess quasars in previous surveys. These inclusion boxes are specified in Section 3.5.2 of Richards et al. (2002).

Note that the QSO_CAP and QSO_SKIRT distinction is kept for historical reasons (as some data that are already public use this notation) and results from an original intent to use separate selection criteria in regions of low ("cap") and high ("skirt") stellar density. It turns out that the selection efficiency is indistinguishable in the cap and skirt regions, so that the target selection used is in fact identical in the 2 regions (similarly for QSO_FIRST_CAP and QSO_FIRST_SKIRT, below).

High-Redshift Quasar Sample (QSO_HIZ)

These are outliers from the griz stellar locus and are selected in the magnitude range 15.0 < i_PSF < 20.2. Only point sources are selected, as these quasars will lie at redshifts above z~3.5 and are expected to be classified as stellar at SDSS resolution. Also, to avoid contamination from faint low-redshift quasars which are also griz stellar locus outliers, blue objects are rejected according to eq. (1) in Section 3.4.5 of Richards et al. (2002).

Moreover, several additional color cuts are used in order to recover more high-redshift quasars than would be possible using only griz stellar locus outliers. So an object will be selected as a high-redshift quasar target if it lies in any of these 3 "inclusion" regions: (1) "gri high-z", for z >= 3.6 quasars; (2) "riz high-z", for z >= 4.5 quasars; and (3) "ugr red outlier", for z >= 3.0 quasars. The specifics are given in eqs. (6-8) in Section 3.5.2 of Richards et al. (2002).

FIRST Sources (QSO_FIRST_CAP, QSO_FIRST_SKIRT)

Irrespective of the various color selection criteria above, SDSS stellar objects are selected as quasar targets if they have 15.0 < i_PSF < 19.1 and are matched to within 2 arcsec of a counterpart in the FIRST radio catalog.
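
The radio selection combines a positional match with a magnitude window. The sketch below is one way to express it, not the survey's actual matching code: the function names are invented, coordinates are assumed to be in decimal degrees, the separation is computed with the haversine formula, and "within 2 arcsec" is assumed to include exactly 2 arcsec.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec; inputs in decimal degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0


def is_first_quasar_target(i_psf, sep_arcsec, is_stellar):
    """Stellar object, 15.0 < i_PSF < 19.1, FIRST match within 2 arcsec."""
    return is_stellar and 15.0 < i_psf < 19.1 and sep_arcsec <= 2.0


# Hypothetical SDSS object and FIRST source offset by 1 arcsec in declination
sep = angular_separation_arcsec(180.0, 0.0, 180.0, 1.0 / 3600.0)
print(round(sep, 3), is_first_quasar_target(18.2, sep, is_stellar=True))
```

The haversine form is used because it stays numerically stable at the arcsecond-scale separations relevant here.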

Finally, those targets which otherwise meet the color selection or radio selection criteria described above, but fail the cuts on i_PSF, will be flagged as QSO_MAG_OUTLIER (also called QSO_FAINT). Such objects may be of interest for follow-up studies, but are not otherwise targeted for spectroscopy under routine operations (unless another "good" quasar target flag is set).
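
The magnitude windows quoted above for the three quasar categories can be collected in one place. In this sketch the sample keys ("main", "hiz", "first"), the function name, and the "in_range" return value are labels invented for illustration, and the object is assumed to have already passed the relevant color or radio criteria.

```python
# Faint i_PSF limits for each quasar category; the bright limit is 15.0.
FAINT_LIMIT = {"main": 19.1, "hiz": 20.2, "first": 19.1}
BRIGHT_LIMIT = 15.0


def quasar_magnitude_status(i_psf, sample):
    """Return 'in_range' or 'QSO_MAG_OUTLIER' for an object that already
    meets the color or radio criteria of the given sample."""
    if BRIGHT_LIMIT < i_psf < FAINT_LIMIT[sample]:
        return "in_range"
    return "QSO_MAG_OUTLIER"


for mag, sample in [(18.0, "main"), (19.5, "main"), (19.5, "hiz")]:
    print(mag, sample, quasar_magnitude_status(mag, sample))
```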

Other Science Targets

A variety of other science targets are also selected; see also Section 4.8.4 of Stoughton et al. (2002). With the exception of brown dwarfs, these samples are not complete, but are assigned to excess fibers left over after the main samples of galaxies, LRGs, and quasars have been tiled.

Stars

A variety of stars are also targeted using color selection criteria, as follows:

blue horizontal-branch stars (STAR_BHB)
both dwarf and giant carbon stars (STAR_CARBON)
brown dwarfs (STAR_BROWN_DWARF) - this is the only tiled sample of stars
low-luminosity subdwarfs (STAR_SUB_DWARF)
cataclysmic variables (STAR_CATY_VAR)
red dwarfs (STAR_RED_DWARF)
hot white dwarfs (STAR_WHITE_DWARF)
central stars of planetary nebulae (STAR_PN)

ROSAT Sources

ROSAT targeting is described in Anderson, S., et al. 2003, AJ, 126, 2209. SDSS objects are positionally matched against X-ray sources from the ROSAT All-Sky Survey (RASS; Voges et al. 1999), and SDSS objects within the RASS error circles (commonly 10-20 arcsec) are targeted using algorithms tuned to select likely optical counterparts to the X-ray sources. Objects are targeted which:

are also radio sources (ROSAT_A)
have SDSS colors of AGNs or quasars (ROSAT_B)
fall in a broad intermediate category that includes stars that are bright, moderately blue, or both (ROSAT_C)
are otherwise bright enough for SDSS spectroscopy (ROSAT_D)

Objects are flagged ROSAT_E if they fall within the RASS error circle but are either too faint or too bright for SDSS spectroscopy.

Serendipity

This is an open category of targets whose selection criteria may change as different regions of parameter space are explored. These consist of:

objects lying outside the stellar locus in color space (SERENDIP_RED, SERENDIP_BLUE, SERENDIP_DISTANT)
objects coincident with FIRST sources but fainter than the equivalent in quasar target selection; also not restricted to point sources (SERENDIP_FIRST)
hand-selected targets (SERENDIP_MANUAL)

SEGUE

There is a terse but complete description of SEGUE target selection on the SEGUE web page. A more verbose description will be linked from here when it is available.

Target Selection References

Sloan Digital Sky Survey: Early Data Release, Stoughton, C., et al. 2002, AJ, 123, 485
Spectroscopic Target Selection in the Sloan Digital Sky Survey: The Main Galaxy Sample, Strauss, M., et al. 2002, AJ, 124, 1810
Spectroscopic Target Selection for the Sloan Digital Sky Survey: The Luminous Red Galaxy Sample, Eisenstein, D., et al. 2001, AJ, 122, 2267
Spectroscopic Target Selection in the Sloan Digital Sky Survey: The Quasar Sample, Richards, G., et al. 2002, AJ, 123, 2945
A Large, Uniform Sample of X-Ray-Emitting AGNs: Selection Approach and an Initial Catalog from the ROSAT All-Sky and Sloan Digital Sky Surveys, Anderson, S., et al. 2003, AJ, 126, 2209

Last modified: Mon Jun 25 22:12:36 CEST 2007
