
Research Synopsis: Dr. Arifuzzaman develops performance‑portable algorithms and software for large‑scale data analytics on heterogeneous architectures (including multicore/many-core CPUs and modern GPUs). Core methods include communication‑ and memory‑efficient algorithm designs, cache‑/NUMA‑aware layouts, machine learning and autotuning, graph partitioning and load balancing, data reduction and approximation methods, and dynamic/streaming graph methods.
Current thrusts: (i) Dynamic graphs at scale: parallel community detection, temporal motif counting (e.g., triangles and squares), and streaming updates with accuracy–throughput trade‑offs; (ii) Performance portability and autotuning: learned cost models and ML/GNN‑based predictors for kernel/block sizes, execution policies, and data layouts across CPU/GPU backends; (iii) Trustworthy and efficient AI: topology‑aware GNNs for intrusion detection (security) and scientific data analysis (e.g., neuroscience), interpretable models, and low‑latency inference on edge/IoT and HPC systems, as well as NLP‑driven social network mining (social media); and (iv) AI‑powered tools for STEM education.
Shubhashish Kar
PhD Student
Graphs, HPC, AI
Ming Chen
PhD Student
AI, Time Series, Graphs
Chan Lee
MS Student
Knowledge Graphs for Health Data
Swarna Latha Boya
MS Student
Dynamic Graphs
Hasan Arikan
BS Student
AI Performance Modeling
Aleisha Ortiz
BS Student
Algorithmic Complexity
Nitika Pathania
MS/CS Sp25
Graph Education Tools
Rishabh Kimothi
MS/CS Sm25
Graph Convolutional Networks
Dr. M. Abdul Motaleb Faysal
Postdoc: Texas State University
Currently faculty at Idaho State University
Dr. Naw Safrin Sattar
Postdoc: Oak Ridge National Lab
Currently faculty at Meharry Medical College, TN
Prashant Rathore, MS/CS Sp25
Likhitha Pathpi, MS/CS Sp25
Abhishek A. Hossur, MS/CS Sm25
Kiran Yadav, MS/CS Sp25
Yashwanth Reddy, MS/CS Sp25
Soham Dhore, Intl. Intern/UG
Rajnandini Jadhav, Intl. Intern/UG
Unmesh Chakravarty, HS Ind. Study (Currently at Carnegie Mellon)
The AI-Orchestrated Performance Engineering (AI4Perf) project sits at the intersection of artificial intelligence and high-performance computing, tackling performance portability across diverse architectures. Using performance data generated by running a variety of graph algorithms, the project develops data-intensive machine-learning models that tune configuration parameters for both the algorithms and the underlying hardware.
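As an illustrative sketch only (not the project's actual tuner), a nearest-neighbor cost model over hand-picked graph features can choose a kernel block size from previously profiled runs; the features (vertex count, average degree) and all timings below are hypothetical:

```python
# Minimal sketch of ML-guided autotuning: a 1-nearest-neighbor model
# picks the block size that was fastest on the most similar previously
# profiled graph. Feature choice and data are hypothetical.

def nearest_config(features, history):
    """history: list of (feature_vector, best_block_size) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(history, key=lambda rec: dist(rec[0], features))[1]

# Profiled runs: (num_vertices, avg_degree) -> best block size.
history = [
    ((1e4, 8.0), 128),
    ((1e6, 3.5), 256),
    ((1e7, 50.0), 512),
]

print(nearest_config((9e5, 4.0), history))  # closest profile is (1e6, 3.5): 256
```

In practice a learned cost model (e.g., a gradient-boosted tree or GNN) would replace the lookup, but the interface is the same: map graph features to a predicted-best configuration.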
Real complex systems are inherently time-varying and can be modeled as temporal graphs (networks); examples include social, transportation, and many biological networks. Standard graph metrics from complex network theory are mainly suited to static graphs, i.e., graphs whose links do not change over time. In this work, we design scalable parallel algorithms for mining large time-varying networks. We thank our collaborators in the Performance and Algorithms Group at Berkeley Lab.
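To make the temporal-motif idea concrete, here is a small sequential sketch of temporal triangle counting, where a triangle counts only if its three edge timestamps span at most a window delta. The graph and timestamps are invented, the brute-force enumeration is for clarity only, and a scalable version would parallelize over vertices or edges:

```python
from itertools import combinations

def temporal_triangles(edges, delta):
    """Count triangles whose three edge timestamps span at most delta.
    edges: dict {(u, v): t} with u < v and one timestamp per edge."""
    adj = {}
    for (u, v) in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    count = 0
    for u, v, w in combinations(sorted(adj), 3):
        if v in adj[u] and w in adj[u] and w in adj[v]:
            ts = [edges[(min(a, b), max(a, b))]
                  for a, b in ((u, v), (u, w), (v, w))]
            if max(ts) - min(ts) <= delta:
                count += 1
    return count

# Toy temporal graph: one "fast" triangle (span 2) and one "slow" one (span 11).
edges = {(1, 2): 0, (1, 3): 1, (2, 3): 2, (1, 4): 10, (2, 4): 11}
print(temporal_triangles(edges, 5))  # only the fast triangle qualifies: 1
```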
The Parallel and Resilient Subgraph Matching (PRISM) project tackles one of the fundamental challenges in graph analysis: efficiently finding and counting subgraphs in massive networks. This problem has a broad range of applications, from detecting social network communities to analyzing biological pathways. Our approach focuses on developing innovative, high-performance algorithms that leverage parallel and distributed computing to make subgraph matching and enumeration feasible at large scales. By exploiting data-intensive computing techniques, PRISM aims to push the boundaries of complex network analysis.
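A minimal sequential sketch of the underlying primitive, backtracking subgraph matching, is shown below. This is not the PRISM implementation, just the textbook recursion that parallel matchers decompose and distribute; the triangle-in-K4 example is illustrative:

```python
def count_embeddings(pattern, graph):
    """Backtracking subgraph matching: count injective mappings of the
    pattern's vertices onto graph vertices that preserve every pattern
    edge. This counts labeled embeddings; divide by the pattern's
    automorphism count for unlabeled occurrences.
    pattern, graph: dict vertex -> set of neighbors (undirected)."""
    order = list(pattern)

    def extend(mapping):
        if len(mapping) == len(order):
            return 1  # every pattern vertex is mapped
        u = order[len(mapping)]
        used = set(mapping.values())
        total = 0
        for cand in graph:
            if cand in used:
                continue
            # every already-mapped pattern neighbor of u must be
            # adjacent to the candidate in the data graph
            if all(mapping[p] in graph[cand]
                   for p in pattern[u] if p in mapping):
                total += extend({**mapping, u: cand})
        return total

    return extend({})

# Triangle pattern matched against the complete graph K4.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(count_embeddings(triangle, k4) // 6)  # 24 embeddings / 3! symmetries = 4
```

Parallel versions typically partition the search tree at the first one or two recursion levels, which is what makes the problem amenable to distributed execution.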
The Parallel and Scalable Path-Related Techniques (PaSPrT) project addresses key challenges in graph optimization, such as finding the shortest paths, computing minimum spanning trees, and solving other path-related problems in large-scale networks. These problems have significant applications in areas like transportation networks, communication systems, and infrastructure planning. Our work focuses on developing cutting-edge algorithms that utilize parallel and distributed computing to efficiently solve these path-related problems, making them tractable even in massive graphs. With PaSPrT, we aim to advance the state-of-the-art in scalable path-related techniques for graph algorithm optimization.
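As a small illustration of one such path problem, here is a textbook heap-based Dijkstra in Python. The project's algorithms are parallel and far more sophisticated; the tiny road network and weights below are hypothetical:

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest paths with a binary heap.
    adj: dict u -> list of (v, weight) with non-negative weights.
    Returns a dict of shortest distances from src to reachable vertices."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, already settled with a shorter path
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Tiny road-style network (hypothetical weights).
roads = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(dijkstra(roads, "a"))  # {'a': 0, 'b': 1, 'c': 3}
```

Scalable variants (e.g., delta-stepping) relax edges in buckets to expose parallelism that Dijkstra's strictly ordered settling does not.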
Complex systems are organized into clusters or communities, each with a distinct role or function. In the corresponding network representation, each functional unit (community) appears as a tightly knit set of nodes with denser connections inside the set than outside it. Finding communities can reveal the organization of complex systems and their function. We are currently designing scalable parallel algorithms for detecting communities in large-scale networks.
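A minimal sequential sketch of one classic community detection heuristic, label propagation, is given below. This is not our parallel algorithm; the two-clique test graph is invented, and the deterministic sweep order and tie-break are simplifications for readability:

```python
def label_propagation(adj, max_iters=100):
    """Asynchronous label propagation: each node repeatedly adopts the
    most frequent label among its neighbors (ties broken toward the
    larger label to keep the sweep deterministic). Stops when no label
    changes. adj: dict node -> set of neighbors (undirected)."""
    labels = {v: v for v in adj}
    for _ in range(max_iters):
        changed = False
        for v in sorted(adj):
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            if not counts:
                continue  # isolated node keeps its own label
            best = max(counts, key=lambda l: (counts[l], l))
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels

# Two 4-cliques joined by a single bridge edge (3-4).
cliques = {v: {u for u in range(4) if u != v} for v in range(4)}
cliques.update({v: {u for u in range(4, 8) if u != v} for v in range(4, 8)})
cliques[3].add(4)
cliques[4].add(3)
labels = label_propagation(cliques)
print(labels)  # nodes 0-3 converge to one label, nodes 4-7 to another
```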
Some of these works include: Twitter sentiment analysis and COVID-19 vaccination-rate prediction; semi-supervised community detection using graph convolutional networks; data-parallel training of large deep neural networks on GPUs; and a predictive model for web spam detection.
We are investigating hardware design implications for several scientific data analytics kernels. Thanks to our collaborators from the Computer Architecture Group of LBNL.
This project aims to unravel the complex dynamics of epidemics in urban settings by leveraging data-driven social network analysis. By identifying key hubs within urban social networks, Epi-HUB seeks to pinpoint critical nodes that contribute to the spread of infectious diseases. This approach helps guide effective mitigation efforts, providing valuable insights into how targeted interventions can curb epidemic outbreaks and protect urban populations.
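As a toy illustration of hub identification (Epi-HUB itself uses richer data and methods), even simple degree centrality surfaces the most-connected individuals in a contact network; the network below is hypothetical:

```python
def top_hubs(adj, k):
    """Return the k highest-degree nodes, a simple centrality proxy for
    the 'hub' individuals that drive spreading in a contact network.
    Ties are broken by smaller node id. adj: dict node -> set of contacts."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]

# Hypothetical contact network: node 0 touches everyone else.
contacts = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
print(top_hubs(contacts, 2))  # [0, 1]
```

Immunizing or monitoring the returned nodes first is the intuition behind targeted interventions; real analyses would use betweenness or spreading-model centralities rather than raw degree.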
In this project, we identify several popular network visualization tools and provide a comparative analysis based on the features and operations they support. We demonstrate empirically how these tools scale to large networks, present several case studies of visual analytics on large network data, and assess the tools' performance.