Parallel Connected Components

Problem Overview

Parallel scientific simulations running on today's state of the art petascale computing platforms generate massive quantities of high resolution mesh-based data. Scientists often analyze this data by eliminating portions and visualizing what remains, through operations such as isosurfacing, selecting certain materials and discarding the others, isolating hot spots, etc. These approaches can generate complex derived geometry with intricate structures that require further techniques for effective analysis, especially in the context of massive data.

In these instances, representations of the topological structure of a mesh is often helpful. Specifically, a labeling of the connected components in a mesh provides a simple and intuitive topological characterization of which parts of the mesh are connected to each other. These unique sub-meshes contain a subset of cells that are directly or indirectly connected via series of cell abutments.

The global nature of connectivity poses a challenge in distributed-memory parallel environments, which are the most common setting for analyzing massive data. This is because massive data sets are typically too large to fit into the memory of a single processor, so pieces of the mesh are distributed across processors. Cells comprising connected sub-meshes may span any of the processors, but the relationships of how cells abut across processors frequently has to be derived. To deal with this problem, sophisticated techniques to resolve connectivity are necessary.

Results

We have published several papers on this topic:

An efficient algorithm for distributed-memory parallelism provided ghost data is present (Harrison/EGPGV11).
An extended algorithm that works on all data, whether or not ghost data is present (Harrison/TSMCD14).
An applications paper studying turbulent flow that depended on connected component calculation (Gaither/CGA12).

CDUX People

Ryan Bleile
Ph.D. Student (alum)

Hank Childs
CDUX Director

External Collaborators

Cyrus Harrison, LLNL
Jordan Weiler, CDUX Alum
Kelly Gaither, TACC

Publications

A Distributed-Memory Algorithm for Connected Components Labeling of Simulation Data
Cyrus Harrison, Jordan Weiler, Ryan Bleile, Kelly Gaither, and Hank Childs
Topological and Statistical Methods for Complex Data, December 2014

[LINK] [BIB]

Using Visualization and Data Analysis to Understand Critical Structures in Massive Time Varying Turbulent Flow Simulations
Kelly Gaither, Hank Childs, Karl Schulz, Cyrus Harrison, Bill Barth, Diego Donzis, and P.K. Yeung
IEEE Computer Graphics and Applications (CG&A), July 2012

[PDF] [BIB]

Data-Parallel Mesh Connected Components Labeling and Analysis
Cyrus Harrison, Hank Childs, and Kelly Gaither
EGPGV, Llandudno, Wales, April 2011

[PDF] [BIB]