Astro Seminar: "SciServer - A Collaborative Research Environment for Large-scale Data-driven Science"

Gerard Lemson (Johns Hopkins)
David Rittenhouse Laboratory, A4

SciServer is a Big Data infrastructure project developed at Johns Hopkins University that provides a common environment for sharable, computationally intensive research.

SciServer extends the capabilities of the successful SkyServer and CASJobs services, originally developed to provide access to the results of the Sloan Digital Sky Survey. An important extension is SciServer Compute, which runs Jupyter notebooks in Docker containers to bring advanced analysis capabilities close to terabyte-scale relational databases and petabyte-scale file storage systems. In addition to real-time analysis in Jupyter notebooks with Python, R, and MATLAB, SciServer Compute delivers an API for asynchronous tasks in persistent Docker containers. Compute adds new libraries for CASJobs, an asynchronous free-form database querying tool, as well as libraries to access data on hosted and local file storage systems. SciServer's MyScratch provides terabytes of temporary storage space, while SciDrive offers a Dropbox-like interface for long-term storage of scientific results. These components are accessible through the single sign-on Login Portal. SciServer supports many scientific disciplines, incorporating large databases and file collections from astronomy, cosmology, turbulence, genomics, oceanography, and materials science. I will describe SciServer in some detail and demonstrate some of its capabilities with use cases drawn from these science domains.