Data Exploration at the Exascale
Recent workshop reports have found that exascale computing would revolutionize science in many important areas (high energy physics, climate, nuclear physics, fusion, nuclear energy, basic energy sciences, biology, and security). The most daunting challenge to achieving exascale computing is power: three orders of magnitude more floating point operations must come from only one order of magnitude more power, which implies roughly a hundredfold gain in operations per watt. This power constraint will lead to profound changes in how simulations are performed. From the perspective of visualization and analysis, the most significant change will be that data must be processed in situ: it will no longer be possible to employ the traditional method of writing full resolution data to disk and post-processing it with dedicated analysis applications.
Exploration is arguably the most important use case for visualization and analysis. It is an iterative process: an analyst forms a hypothesis, poses a question to analysis software, interprets the result, and then forms new hypotheses and/or additional questions. It is important because this is the time when new insights are made, when "new science" is discovered. But exploration is a slow process; analysts form theories on time scales ranging from seconds to days.
Exploratory analysis and in situ processing seem to be fundamentally incongruent. Traditionally, in situ processing is used when the techniques to employ are known a priori, which is not the case for exploration. And exploratory analysis occurs on time scales that are too long for in situ processing: waiting on an analyst would hold the exascale machine "hostage." Yet exploration is important, since it enables insights, so the activity cannot be dropped as computing moves towards exascale.
With this project, we are studying techniques that will enable exploratory analysis on exascale simulations. Our strategy resembles the traditional post-processing model. The simulation will write data to disk and stand-alone programs will visualize and analyze this data by reading it from disk. However, we introduce a key new step to this model: we will use in situ processing to substantially reduce the data before writing it to disk.
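As a rough illustration of this workflow, the sketch below reduces each time step while it is still in memory and writes only the reduced data to disk; a separate reader then explores the reduced files. Everything here is an assumption made for illustration: the function names (simulate_step, reduce_in_situ, run_simulation, explore), the synthetic 1D field, and the block-averaging reduction operator are hypothetical and are not the project's actual techniques or data formats.

```python
import json
import math
import os
import tempfile


def simulate_step(step, n=64):
    """Hypothetical stand-in for one simulation time step: a 1D field of n samples."""
    return [math.sin(2 * math.pi * (i / n) + 0.1 * step) for i in range(n)]


def reduce_in_situ(field, factor=8):
    """Assumed reduction operator: replace each block of `factor` samples by its mean."""
    reduced = []
    for i in range(0, len(field), factor):
        block = field[i:i + factor]
        reduced.append(sum(block) / len(block))
    return reduced


def run_simulation(out_dir, steps=3, n=64, factor=8):
    """Reduce every time step in memory, then write only the reduced data to disk."""
    paths = []
    for step in range(steps):
        field = simulate_step(step, n)            # full resolution, never written
        reduced = reduce_in_situ(field, factor)   # in situ reduction
        path = os.path.join(out_dir, f"step_{step:04d}.json")
        with open(path, "w") as f:
            json.dump(reduced, f)
        paths.append(path)
    return paths


def explore(paths):
    """Stand-alone post hoc analysis that reads only the reduced files."""
    maxima = []
    for path in paths:
        with open(path) as f:
            maxima.append(max(json.load(f)))
    return maxima


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        written = run_simulation(out_dir)
        print(explore(written))
```

The point of the split is that explore can be rerun as often as the analyst likes, on any time scale, without involving the simulation; the price is that it sees only what the reduction operator preserved.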
This approach requires research in three distinct areas:
This project is attempting to answer these questions. We aim to develop new solutions, to evaluate existing solutions, and to catalogue the results for exascale simulation scientists. Most importantly, the project aims to enable exascale simulation scientists to do exploratory analysis. This is key because exploration is where new science is discovered; this project can help exascale computing realize its value.
This project is funded by a Department of Energy Career award for Hank Childs, which runs through 2017.
This project covers multiple distinct areas. Some of the areas are listed as their own research pages:
Shaomeng (Samuel) Li
Ph.D. Student (alum)