REQUIREMENTS FOR SDSS DATA MANAGEMENT SYSTEMS
June 5, 1995
[All requirements are subject to budgetary,
manpower, and technical limitations]
Revision 1 - February 26, 2004
The data management systems shall maintain all processed data from the SDSS
and provide access by SDSS scientists and operators in order to maximize the
ease of the following:
- Operate the SDSS survey so as to maximize the efficiency of operations.
- Perform Quality Analysis operations on the data so as to ensure its
integrity. The operations will verify calibrations, target selection criteria
and classifications, and completeness & accuracy.
- Provide SDSS scientists with access to the data and tools to permit
selection of spectroscopic targets for certain categories (serendipity;
- Provide SDSS scientists with access to the data so as to enable scientific
Requirements for the Science Archive
A public version of the science database shall be a snapshot of the Science
Archive. The public version will include two versions of the sky: Best and
Target. It will not include all runs obtained in the course of survey operations.
The science archive shall consist of:
- A science database that shall:
- a. Retain resolved calibrated object catalogs (photometric
CCD output) for two sky versions of the data: Best and Target.
- b. Retain parameters from spectroscopic pipeline
- c. Retain references to atlas images and extracted spectra
- d. Provide ability to carry out manual target selection for
certain target categories
- e. Provide ability for SDSS scientists to extract subsets of
- f. Provide smooth transition to public distribution system.
- A set of files tracked by the science database.
- A set of files not tracked by the science database.
An enhanced goal is to create a "Runs" database in addition to Target and Best
versions. A Runs database would contain every imaging scan obtained over the
course of operations.
I. Input to Science Archive
- Survey Definition
- a. A description of the North Imaging survey area
- b. Survey progress: A description of sky inserted into database
- Final Astrometric Calibration
- a. List of calibration coefficients on a frame-by-frame basis.
- b. Position errors stored on an object-by-object basis.
- Final Photometric Calibration
- a. List of photometric calibration coefficients on a
- Merged Object Lists
- a. A list of calibrated objects and parameters from the Frames
pipeline of photo
- b. A list of masks derived from object masks from the
Frames pipeline of photo.
- c. Run, Rerun and Field information.
- d. Star/Galaxy classifications
- e. Target selection flags
- f. Status flags
- g. Cross-identifications to other
- Target Selection
- a. A list of all targetable objects with target selection
- b. A list of all objects from (5a) selected as targets with
- c. Tiling flags for all objects in b.
- Spectroscopic Pipeline
- a. Redshifts and parameters of all targeted objects
- b. Tile and plate information.
- c. Primary target designation, to identify primary
measurements of targets for which multiple spectra have been
- Enhanced goal: Scientist derived catalogs
- Enhanced goal: Other input catalogs
- Separate files tracked by Science Database
- a. Atlas Images
- b. 1-D spectra
- c. Corrected frames
- d. Masks
- e. Binned frames
- f. fpFieldStat
- g. psField
- TBD: Southern Survey
II. Functional Goals
- User will be able to carry out efficient queries to locate objects over
one or more ranges of the following attributes:
- a. Longitude or latitude in several spherical coordinate systems
- i) J2000 RA and Dec
- ii) Galactic coordinates
- iii) Survey Coordinates
- iv) Any linear combination of the two coordinates
- b. Radius around a given point on the sky
- c. u' g' r' i' z' (One set of magnitudes per object)
- d. Any linear combination of c.
- e. Object radius
- f. Surface brightness formed by c and d.
- g. Star/Galaxy classification flag
- h. Target Selection Category
- i. Spectrum available flag
- j. Status and photo flags
- User will be able to carry out queries on any retained object parameter.
- Enhanced Goal: All calibrated quantities can be recomputed using
improved astrometric and photometric calibrations. Queries can be performed on
the recalibrated quantities.
- For all efficient queries, return an estimated number of objects to be
- For all located objects, users shall be able to specify an arbitrary
subset of stored parameters to be returned plus the number of located
- Users shall be able to perform the following functions:
- a. Proxy queries [e.g., get all objects within each
of 10,000 QSOs in my favorite catalog].
- b. Formulate new queries based on results of previous queries.
- Users shall be able to query for database metadata:
- a. List of tables
- b. List of attributes
- c. List of enumerated constants with text descriptors.
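
As an illustration of the functional goals above, the following Python sketch
shows a radius search around a point on the sky, a cut on a linear combination
of magnitudes, a result that returns a user-chosen subset of parameters plus
the number of located objects, and a simple proxy query. It is only a sketch
under stated assumptions: the in-memory object list, the field names (id, ra,
dec, g, r), the function names, and the toy values are all hypothetical and
are not part of any Science Archive interface. A real archive would use
spatial indexing rather than the linear scan shown here.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(objects, ra, dec, radius_deg):
    """Return all objects within radius_deg of the point (ra, dec)."""
    return [o for o in objects
            if angular_separation_deg(o["ra"], o["dec"], ra, dec) <= radius_deg]


def linear_combination(obj, coeffs):
    """Weighted sum of stored magnitudes, e.g. {"g": 1, "r": -1} gives g - r."""
    return sum(c * obj[band] for band, c in coeffs.items())


def select(objects, predicate, columns):
    """Return (number of located objects, rows restricted to the chosen columns)."""
    hits = [o for o in objects if predicate(o)]
    return len(hits), [{c: o[c] for c in columns} for o in hits]


def proxy_query(objects, catalog, radius_deg):
    """For each (name, ra, dec) entry of a user catalog, run a cone search."""
    return {name: cone_search(objects, ra, dec, radius_deg)
            for name, ra, dec in catalog}


# Hypothetical toy data standing in for the object catalog.
OBJECTS = [
    {"id": 1, "ra": 180.00, "dec": 0.000, "g": 18.2, "r": 17.5},
    {"id": 2, "ra": 180.01, "dec": 0.005, "g": 19.0, "r": 18.9},
    {"id": 3, "ra": 185.00, "dec": 2.000, "g": 20.1, "r": 19.0},
]

# Radius search: objects within 0.02 degrees of (180, 0).
nearby = cone_search(OBJECTS, 180.0, 0.0, 0.02)

# Query on a linear combination of magnitudes: g - r > 0.5.
count, rows = select(
    OBJECTS,
    lambda o: linear_combination(o, {"g": 1.0, "r": -1.0}) > 0.5,
    ["id", "ra", "dec"],
)

# Proxy query: one cone search per entry of a user-supplied catalog.
matches = proxy_query(
    OBJECTS, [("qso_a", 180.0, 0.0), ("qso_b", 185.0, 2.0)], 0.02
)
```

Returning the count alongside the rows mirrors the goal that users receive the
number of located objects together with the parameters they asked for.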
III. Technical Goals
- User interface
- a. User interface shall be http-based.
- b. User interface shall communicate with a query support layer
via ASCII interface protocol.
- c. Data shall be returnable on the sockets in ASCII, HTML,
or XML format.
- d. User interface shall be documented.
- Data shall be stored in a system providing an industry-standard OSQL-like
interface to enable use of commercial products to provide alternative views
of the database.
- a. A master copy of all data shall be maintained (the Master
Science Archive).
- b. Capability shall be present to replicate all or part of the
Master Science Archive as local databases at SDSS institutions. Replication
may consist of:
- i) Science Database in its entirety
- ii) All or part of separate files tracked by Science Database
- iii) No capability shall be present to replicate an
arbitrarily selected subset of the science database beyond that described
by section 1.c of USER INTERFACE.
- iv) The institution requesting replication shall be responsible
for providing the hardware that the database and/or files will be copied
- d. No capability is required to be present to replicate all or
part of separate files not tracked by Science Database
- a. Master Science Archive shall be protected against corruption
by SDSS participant users
- b. Master Science Archive shall be protected against
unauthorized access by non-SDSS participants.
- c. Computer security policies and procedures of the institution
hosting the Master Science Archive shall be followed.
- Version Retention (NEW)
- a. Two prior data release versions of the Science Database,
in addition to the current release, should be maintained on-line.
- System Availability (NEW)
- a. 99% system availability to the end user of the public
version of the current data release.
- b. 95% system availability to the end user of the current
collaboration version of the Science Database.
- c. 95% system availability to the end user of prior
release versions of the Science Database.
- d. For 95% uptime systems:
- i) Fault response time should be within 16 hours.
- ii) Fault recovery time should be within 48 hours.
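
To make the availability targets concrete, the Python sketch below converts an
availability percentage into a downtime budget. It assumes availability is
measured over a 365-day year, which the requirements above do not specify, and
the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # assumed measurement period: one 365-day year


def allowed_downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Downtime budget in hours for a fractional availability target."""
    return (1.0 - availability) * period_hours


def max_worst_case_faults(availability, recovery_hours,
                          period_hours=HOURS_PER_YEAR):
    """Number of faults that fit in the budget if each takes the full recovery time."""
    return int(allowed_downtime_hours(availability, period_hours) // recovery_hours)


public_budget = allowed_downtime_hours(0.99)         # public version target
collab_budget = allowed_downtime_hours(0.95)         # collaboration and prior releases
collab_faults = max_worst_case_faults(0.95, 48)      # 48-hour recovery limit

print(f"99% target: about {public_budget:.1f} hours of downtime per year")
print(f"95% target: about {collab_budget:.1f} hours of downtime per year")
print(f"95% target: at most {collab_faults} faults at the full 48-hour recovery time")
```

Under these assumptions, a 95% system that always used the full 48-hour
recovery window could absorb only a handful of faults per year before missing
its target, which is why the response and recovery limits matter.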