24 December 2002 Data Processing Factory for the Sloan Digital Sky Survey
Christopher Stoughton, Jennifer Adelman, James T. Annis, John Hendry, John Inkmann, Sebastian Jester, Steven M. Kent, Nickolai Kuropatkin, Brian Lee, Huan Lin, John Peoples Jr., Robert Sparks, Douglas Tucker, Dan Vanden Berk, Brian Yanny, Dan Yocum
Abstract
The Sloan Digital Sky Survey (SDSS) data handling presents two challenges: large data volume and timely production of spectroscopic plates from imaging data. A data processing factory, using technologies both old and new, handles this flow. Distribution to end users is via disk farms, to serve corrected images and calibrated spectra, and a database, to efficiently process catalog queries. For distribution of modest amounts of data from Apache Point Observatory to Fermilab, scripts use rsync to update files, while larger data transfers are accomplished by shipping magnetic tapes commercially. All data processing pipelines are wrapped in scripts to address consecutive phases: preparation, submission, checking, and quality control. We constructed the factory by chaining these pipelines together while using an operational database to hold processed imaging catalogs. The science database catalogs all imaging and spectroscopic objects, with pointers to the various external files associated with them. Diverse computing systems address particular processing phases. UNIX computers handle tape reading and writing, as well as calibration steps that require access to a large amount of data with relatively modest computational demands. Commodity CPUs process steps that require access to a limited amount of data with more demanding computational requirements. Disk servers optimized for cost per Gbyte serve terabytes of processed data, while servers optimized for disk read speed run SQLServer software to process queries on the catalogs. This factory produced data for the SDSS Early Data Release in June 2001, and it is currently producing Data Release One, scheduled for January 2003.
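The wrapping of each pipeline in four consecutive phases, and the chaining of wrapped pipelines into a factory, can be illustrated with a minimal Python sketch. It is not the SDSS code: the names `Phase`, `PipelineWrapper`, and `run_factory` are hypothetical, each phase is assumed to be a callable that returns True on success, and the choice to stop the chain at the first failed phase is an assumption made for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Phase(Enum):
    """The four consecutive phases named in the abstract."""
    PREPARATION = "preparation"
    SUBMISSION = "submission"
    CHECKING = "checking"
    QUALITY_CONTROL = "quality control"


# Enum members iterate in definition order, which is the phase order.
PHASE_ORDER: List[Phase] = list(Phase)


@dataclass
class PipelineWrapper:
    """Hypothetical wrapper: one callable per phase, each returning True on success."""
    name: str
    steps: Dict[Phase, Callable[[dict], bool]]

    def run(self, context: dict) -> List[Phase]:
        """Run the phases in order and return those that completed."""
        completed: List[Phase] = []
        for phase in PHASE_ORDER:
            if not self.steps[phase](context):
                break  # assumption: a failed phase halts this pipeline
            completed.append(phase)
        return completed


def run_factory(pipelines: List[PipelineWrapper], context: dict) -> Dict[str, List[Phase]]:
    """Chain wrapped pipelines; stop at the first one that does not finish all phases."""
    report: Dict[str, List[Phase]] = {}
    for pipeline in pipelines:
        done = pipeline.run(context)
        report[pipeline.name] = done
        if len(done) < len(PHASE_ORDER):
            break  # assumption: downstream pipelines need the upstream output
    return report
```

In this sketch the shared `context` dictionary stands in for the operational database that the abstract describes as holding processed imaging catalogs between pipelines.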
© (2002) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Christopher Stoughton, Jennifer Adelman, James T. Annis, John Hendry, John Inkmann, Sebastian Jester, Steven M. Kent, Nickolai Kuropatkin, Brian Lee, Huan Lin, John Peoples Jr., Robert Sparks, Douglas Tucker, Dan Vanden Berk, Brian Yanny, and Dan Yocum "Data Processing Factory for the Sloan Digital Sky Survey", Proc. SPIE 4836, Survey and Other Telescope Technologies and Discoveries, (24 December 2002); https://doi.org/10.1117/12.457014
KEYWORDS
Spectroscopy
Data processing
Calibration
Imaging spectroscopy
Databases
Data archive systems
Telescopes
