
We are delighted to announce the 2nd Undergraduate and High School Symposium, to be held as part of the IEEE ICDM 2025 Conference. This symposium aims to provide a platform for young researchers to showcase their innovative work in the field of data mining and related disciplines.
Why Participate in this Symposium?
- Present your research to an international audience at one of the premier data mining conferences.
- Compete for Outstanding Paper Awards, Runner-Up Awards, and Rising Star Awards honoring exceptional student contributions.
- Interact directly with top AI researchers, industry innovators, and NSF Program Directors.
Program Agenda
Nov 14: conference banquet (evening). Nov 15: full-day UGHS program.
November 14, 2025: Conference Banquet
Time | Event | Location |
---|---|---|
18:00–21:00 | Banquet | - |
November 15, 2025 (Location: 1 main room + 5 additional parallel session rooms)
Time | Event | Speaker |
---|---|---|
09:00–10:00 | Keynote (50 mins). Location: Senate Room | NSF Program Director (TBD) |
10:00–10:30 | Coffee Break. Location: Capital Terrace, Congressional | - |
10:30–11:25 | Fireside Chat (50 mins) with Dr. Jure Leskovec (Stanford). Location: Senate Room | Moderator: TBD |
11:30–12:25 | Panel with Parents (50 mins): Next Generation AI Education. Location: Senate Room | Panelists: see the Parent Panel section |
12:25–12:30 | Poster Preparation | - |
12:30–14:00 | Poster Session (Lunch). Location: Capital Terrace, Congressional | All Authors |
14:00–15:30 | Oral Presentations (12 mins × 7 papers × 6 rooms), UH1–UH6. Locations: see the session tables | Oral Sessions: see the session tables |
15:30–16:00 | Coffee Break. Location: Capital Terrace, Congressional | - |
16:00–17:30 | Oral Presentations (12 mins × 7 papers × 6 rooms), UH7–UH12. Locations: see the session tables | Oral Sessions: see the session tables |
17:30 | Closing (Award Announcements). Location: Presidential Room | - |
Oral Presentations · 14:00–15:30
14:00–15:30 · Oral Presentations (12 mins × 7 papers × 6 rooms)
Session Chairs — UH1–UH6
UH1 (Room: Federal A, Session Chair: Sanjay Madria): S01287, S01286, S01285, S01242, S01241, S01282, S01281
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 14:00–14:12 | S01287 | Can Small Quantized VLMs Drive? An Experimental Evaluation of Small Quantized VLMs for Autonomous Driving | Samson Mathew |
2 | 14:12–14:24 | S01286 | Data Insights into Teen Consumer Trends: From Kaggle to Knoxville | Chuanren Liu |
3 | 14:24–14:36 | S01285 | An Empirical Approach Toward Understanding the Impact of Essential Oils on Alzheimer's Disease Progression | Khalil Al-Hussaeni |
4 | 14:36–14:48 | S01242 | Assessing Bias Within Diabetes Risk Prediction in Machine Learning Techniques | Ayesha Faruki |
5 | 14:48–15:00 | S01241 | Heatwaves and Health Risks in New York City | Jie Yang |
6 | 15:00–15:12 | S01282 | Interpretable Feature Mining for AI Product Design | Vivian Foutz |
7 | 15:12–15:24 | S01281 | QuizWhiz: An End-to-End AI-Powered Educational Platform for K-12 Intelligent Tutoring and Teaching Analytics | Ming Zhang |
UH2 (Room: Federal B, Session Chair: Chen Chen): S01280, S01279, S01278, S01277, S01276, S01275, S01274
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 14:00–14:12 | S01280 | Early Wildfire Detection with UAVs using a Frame Difference Method | Brian Hong |
2 | 14:12–14:24 | S01279 | FS-PREM: A Physics-Aware Framework for Predicting Port Disruption | Shriraghav Ashok |
3 | 14:24–14:36 | S01278 | Privacy-First Triage Classification with Open-Weight LLMs: A Chain-of-Thought Distillation Approach | Zeyuan Zhao |
4 | 14:36–14:48 | S01277 | DETECT: Data-Driven Evaluation of Treatments Enabled by Classification Transformers | Yuanheng Mao |
5 | 14:48–15:00 | S01276 | SmartPharynx: A Camera-Based Smartphone System for Screening of Bacterial Pharyngitis with a Low-Shot CycleGAN and Custom CNN | Srikar Kovvali |
6 | 15:00–15:12 | S01275 | An Analysis of Gender-Based Differential Item Functioning in the PISA 2018 and 2022 Cycles | Isabel Xiong |
7 | 15:12–15:24 | S01274 | A Critical Analysis of a Multi-Input CNN Architecture for Quantum-Enhanced Forecasting | Ateef Mahmud |
UH3 (Room: South American B, Session Chair: Fang Jin): S01273, S01271, S01269, S01267, S01264, S01263
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 14:00–14:12 | S01273 | Towards Robust Anomaly Detection in Fish Behavior: Hybrid LLM–ML Ensembles and Federated Learning | Qiong Cheng |
2 | 14:12–14:24 | S01271 | Transforming Color Correction for Colorblindness with Hydrodynamic Modeling and Deep Learning-Based Validation | Lucas Yang |
3 | 14:24–14:36 | S01269 | Comparative Analysis of GraphCast and the Global Forecast System Using Real-Time Mesoscale Analysis | Jonathan Yu |
4 | 14:36–14:48 | S01267 | Learning for Inflation Forecasting with Dynamic Feature Spaces | Zakariyya Scavotto |
5 | 14:48–15:00 | S01264 | Predicting Residual Cognitive Deficit Post-Ischemic Stroke: An Imbalance-Aware Machine Learning Pipeline on EHR Data | Sirichandana Yakkala |
6 | 15:00–15:12 | S01263 | A ResNet and ViT-U-Net Hybrid Model for Accurate FVM Flooding Simulations | Isabella Cho |
UH4 (Room: California, Session Chair: Haibing Lu): S01262, S01261, S01260, S01258, S01257, S01256
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 14:00–14:12 | S01262 | AI-Assisted Safe Drop Zone Identification for Human-Guided Drone Deliveries | Atharva Kakatkar |
2 | 14:12–14:24 | S01261 | Adaptive Execution Scheduler for DataDios SmartDiff | Aryan Poduri |
3 | 14:24–14:36 | S01260 | Contrastive Retrieval Augmented In-Context Learning for Medical Classification Tasks with Imbalanced Data | Swarnika Joshi |
4 | 14:36–14:48 | S01258 | Graph Perspective on Multi-modal Mouse Neural Data and Behavior Analysis | Wenhao Hu |
5 | 14:48–15:00 | S01257 | Efficient Semantic-based Video Segment Querying | Ziqi Zhou |
6 | 15:00–15:12 | S01256 | Enhancing Radiographic Disease Detection with MetaCheX, a Context-Aware Multimodal Model | Nathan He |
UH5 (Room: New York, Session Chair: Mohammad Ali Javidian): S01255, S01254, S01253, S01252, S01251, S01250
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 14:00–14:12 | S01255 | Statistical Mining of Patient Reviews for Geographic Insights on Quality of Urological Care: A Pilot Study | Max Yu |
2 | 14:12–14:24 | S01254 | Enhancing On-Chip Learning for RRAM Devices Through Evolutionary Theories | Xinghui Zhao |
3 | 14:24–14:36 | S01253 | Application of Object Segmentation Model in Contact Angle Measurement for Hydrophobicity Determination | Joann Xie |
4 | 14:36–14:48 | S01252 | Hybrid BiLSTM-RF Framework for Lithium-ion Battery State of Health and RUL Prediction | Irene Lu |
5 | 14:48–15:00 | S01251 | Uncertainty Quantification in Deep Learning based Breast Cancer Diagnosis using DCE-MRI and TRAMs | Jerry Wang |
6 | 15:00–15:12 | S01250 | MetaRef: A Generalizable Physics-Aware Refinement Framework for Metamaterial Design | Alexander Lu |
UH6 (Room: Massachusetts, Session Chair: Long Nguyen): S01249, S01248, S01246, S01245, S01244, S01243, DM556
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 14:00–14:12 | S01249 | Few-Shot Learning Meets Large Language Models: Mining Medicine Interventions From Reddit | Fang Jin |
2 | 14:12–14:24 | S01248 | Assessing Cognitive Biases in LLMs for Judicial Decision Support: Virtuous Victim and Halo Effects | Sierra Liu |
3 | 14:24–14:36 | S01246 | FinFraud-LLM: Exploring Large Language Models for Financial Fraud Detection | Johnson Chen |
4 | 14:36–14:48 | S01245 | Can Reasoning LLMs Eliminate Conformity in Multi-Agent Systems? | Alina Liu |
5 | 14:48–15:00 | S01244 | Echo State Networks in Reservoir Computing: Foundations, Benchmarks, and Applications to Next-G Wireless Communication | Andrew Liu |
6 | 15:00–15:12 | S01243 | Interpretable Deep Learning Framework for the Diagnosis of Age-Related Macular Degeneration | Anvitaa Rudharraju |
7 | 15:12–15:24 | DM556 | Explainable Skill Acquisition over Time via GraphRAG-Augmented Mastery Features, Fuzzy Clustering, and Hybrid Deep Models | Qiong Cheng |
Oral Presentations · 16:00–17:30
16:00–17:30 · Oral Presentations (12 mins × 7 papers × 6 rooms)
Session Chairs — UH7–UH12
UH7 (Room: Federal A, Session Chair: Yi He): S01284, S01283, S01240, S01239, S01238, S01237, S01236
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 16:00–16:12 | S01284 | Multi-optimizer Deep&Cross at Industrial Scale | Mark Znidar |
2 | 16:12–16:24 | S01283 | Multi-Modal Embedding Fusion for Scalable Context-First CTR | Mark Znidar |
3 | 16:24–16:36 | S01240 | A Deep Learning Approach for Reaction-Diffusion-Advection Modeling of Vegetation-Desertification Patterns | Aritro Chatterjee |
4 | 16:36–16:48 | S01239 | Dynamics of Fencer Rating Progression | Ethan Xu |
5 | 16:48–17:00 | S01238 | ESN-DAGMM: A Lightweight Framework for Unsupervised Time-Series Data Monitoring in 5G O-RAN Networks | Raymond Zhao |
6 | 17:00–17:12 | S01237 | Deep Gaussian Fusion Network for Traffic Prediction | Zhiqian Chen |
7 | 17:12–17:24 | S01236 | Quantifying Biopharma Alliance Fragility Using a Strategic Shock Risk Index (SSRI) | Rhea Zhou |
UH8 (Room: Federal B, Session Chair: Kunpeng Liu): S01235, S01234, S01233, S01232, S01231, S01230, S01229
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 16:00–16:12 | S01235 | Do You Know What I Mean? Testing the Prompt Robustness of an LLM-Powered IoT System | Xingguo Ding |
2 | 16:12–16:24 | S01234 | Interactive 3D Spine Modeling for Enhanced Doctor-Patient Communication and Health Literacy | Christian Jin |
3 | 16:24–16:36 | S01233 | AI or humans: Who understands online emotions Better? | Victor Tang |
4 | 16:36–16:48 | S01232 | Mining Mobile Point-of-Interest Visit Data for Socioeconomic Insights | Amy Ma |
5 | 16:48–17:00 | S01231 | An Agentic Framework for Social Event Forecasting: Approaches using Causality Contextualized Chain of Thought | Avani Thakur |
6 | 17:00–17:12 | S01230 | Data-Driven Weakly-Supervised Methods Successfully Denoise Diverse Biomedical Imaging Modalities | Reeti Rout |
7 | 17:12–17:24 | S01229 | Lost in Transcription: Influence of Dialect on Whisper’s Performance | Helen Qin |
UH9 (Room: South American B, Session Chair: Kanthi K Sarpatwar): DM314, S01228, S01227, S01226, S01224, S01223, S01222
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 16:00–16:12 | DM314 | Benchmarking LLMs and Distributed Approaches for Anomaly Detection | Qiong Cheng |
2 | 16:12–16:24 | S01228 | Grounded Chest X-Ray Reasoning: Leveraging Visual Tools to Improve Medical Multimodal LLMs | Huaxiu Yao |
3 | 16:24–16:36 | S01227 | Bringing Optimization to Everyone: Exploring LLMs as a Tool for Non-Experts and Students | Winston Zhang |
4 | 16:36–16:48 | S01226 | Deep Learning to Denoise and Segment Air Pollutant Plumes | Daniel Li |
5 | 16:48–17:00 | S01224 | EEG EyeNet: Strong, Insightful Baselines for Eye-Movement Prediction from EEG | Christian Jin |
6 | 17:00–17:12 | S01223 | Automated Analysis of Astrocyte Cell Connectivity after Laser Ablation Using Machine Learning and Path-Finding Algorithms | Connor Lee |
7 | 17:12–17:24 | S01222 | Predicting Vertical Cloud Type Structure with GOES-ABI Multi-Channel Data (Deep Learning & Foundation Model) | Sidh Jaddu |
UH10 (Room: California, Session Chair: Zhou Yang): S01221, S01220, S01219, S01218, S01217, S01216
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 16:00–16:12 | S01221 | Multimodal Foundation Models as Router Models for High-Resolution Aerial Image Segmentation | Cooper Li |
2 | 16:12–16:24 | S01220 | Forecasting U.S. Recessions with Machine Learning: Evidence from Ten Economic Indicators, 1978–2025 | Neel Dhuruva |
3 | 16:24–16:36 | S01219 | C-Reactive Protein Induces Endothelial Cell Dysfunction and Replication Stress | Jay Peng |
4 | 16:36–16:48 | S01218 | Systematic Comparison of Artifact Removal Techniques for Reliable Feature Extraction from scTS-Contaminated EMG | Vivian Li |
5 | 16:48–17:00 | S01217 | Graph-LLM for EHRs: Combining Temporal Graph Representations and LLM-Based Note Imputation for Clinical Predictions | Michael Liu |
6 | 17:00–17:12 | S01216 | LLMSeqRec: LLM Enhanced Contextual Sequential Recommender | Connor Lee |
UH11 (Room: New York, Session Chair: Senjuti Basu Roy): S01215, S01214, S01213, S01212, S01211, S01210
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 16:00–16:12 | S01215 | Benchmarking the Code Generation Capabilities of Popular Large Language Models for Front-End Web Development | Dron Datta |
2 | 16:12–16:24 | S01214 | Persona-Driven LLM Interaction in Stock Market Simulations | Medhashree Parhy |
3 | 16:24–16:36 | S01213 | Evaluating the Effectiveness of Persona Simulation in Opinion Prediction with GPT-4.1 | Sarah Li |
4 | 16:36–16:48 | S01212 | AI-Powered Trait Analysis for Poisonous Mushroom Classification | Srikar Akundi |
5 | 16:48–17:00 | S01211 | Machine Learning-Based Classification of Transcriptional Gene Groups for Cancer Prognosis | Annie Wu |
6 | 17:00–17:12 | S01210 | Performance Evaluation of Convolutional Neural Networks in Image-Based Malware Classification | Raymond Jiang |
UH12 (Room: Massachusetts, Session Chair: Meikang Qiu): S01208, S01207, S01206, S01204, S01203, S01201
# | Time | Paper ID | Title | Contact |
---|---|---|---|---|
1 | 16:00–16:12 | S01208 | Signature vs. Substance: Evaluating the Balance of Adversarial Resistance and Linguistic Quality in Watermarking Large Language Models | William Guo |
2 | 16:12–16:24 | S01207 | Hyperspectral Band Selection with Learnable Weights for Efficient Glioblastoma Detection | Albert Li |
3 | 16:24–16:36 | S01206 | Dimension Reduction Enhanced Boosting for Imbalanced Data Classification | Eric Wang |
4 | 16:36–16:48 | S01204 | Probabilistic Prompts for Zero-shot and Few-shot Large Language Models: An Empirical Study of Patient-reported Outcomes | Zhong Chen |
5 | 16:48–17:00 | S01203 | Quantitative Assessment on the Impact of Music on Athletic Performance | Angela Du |
6 | 17:00–17:12 | S01201 | Predicting Relationship Stability Using Communication Patterns | Deepak Gahalot |
Parent Panel — Panelists
This panel appears on Nov 15, 11:30–12:25 (see agenda above).
Name | Email | Affiliation | Title / Role |
---|---|---|---|
Min Wu | minwu@umd.edu | University of Maryland – College Park | Distinguished University Professor; Associate Dean for Graduate Programs; Christine Yurie Kim Eminent Professor in Information Technology, Electrical and Computer Engineering |
Ambreen Hasan | ambihasan@gmail.com | Youngstown State University | Director of Institutional Research and Analytics |
Jianjun Xie | jianjunxie@gmail.com | FICO | Principal Scientist |
Dhuruva Badri | drubadri@gmail.com | Florida Department of Transportation | Assistant District Construction Engineer |
Tarun Thakur | tarun.thakur@alumni.duke.edu | Veza | Co-Founder and CEO |
Poster Presentation
- The recommended poster size is A0 (33.1″ × 46.8″ or 841 mm × 1189 mm). Posters may be prepared in either portrait or landscape orientation. Authors are free to design their posters as they see fit, as long as the poster fits within the allotted display area.
  Each poster board measures 8 feet wide by 4 feet high and is double-sided. Two posters will be displayed on each side, so each presenter has a maximum space of 4 feet × 4 feet (1.22 m × 1.22 m), including margins.
  Illustration: one 8’ × 4’ board (one side) with two 4’ × 4’ poster areas.
- Presentation materials may be attached using pushpins or tape, which will be provided on site.
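As a quick sanity check of the size guidance above, the following minimal sketch tests whether a poster fits the 4 ft × 4 ft (48″ × 48″) per-presenter area in each orientation. The function and constant names are illustrative assumptions, not part of any official tool; only the A0 and display-area dimensions come from the text.

```python
# Dimensions in inches; the numbers are taken from the poster guidance above.
A0_POSTER = (33.1, 46.8)      # recommended A0 size (short side, long side)
DISPLAY_AREA = (48.0, 48.0)   # 4 ft x 4 ft per presenter, margins included


def fits(poster, area=DISPLAY_AREA):
    """Return True if a (width, height) poster fits inside the (width, height) area."""
    width, height = poster
    max_width, max_height = area
    return width <= max_width and height <= max_height


def fitting_orientations(poster, area=DISPLAY_AREA):
    """List the orientations ("portrait", "landscape") in which the poster fits."""
    short_side, long_side = sorted(poster)
    result = []
    if fits((short_side, long_side), area):   # portrait: long side is vertical
        result.append("portrait")
    if fits((long_side, short_side), area):   # landscape: long side is horizontal
        result.append("landscape")
    return result


if __name__ == "__main__":
    print(fitting_orientations(A0_POSTER))
```

Because the display area is square, a poster that fits in one orientation also fits in the other; a custom size only needs its longer side to stay within 48 inches.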
Scope of Topics
We invite submissions of original research papers from undergraduate and high school students on topics related to data mining, including but not limited to:
- Foundations, algorithms, models, and theory of data mining, including big data mining
- Machine learning, deep learning, and statistical methods for big data
- Mining heterogeneous data sources, including text, semi-structured, spatio-temporal, streaming, graph, web, and multimedia data
- Data mining systems and platforms for analyzing big data, including methods for parallel and distributed data mining, federated learning, and their efficiency, scalability, security, and privacy
- Data mining for modeling, visualization, personalization, and recommendation
- Data mining for cyber-physical systems and complex, time-evolving networks
- Data mining with large language models
- Novel applications of data mining in data science, including big data analysis in social sciences, physical sciences, engineering, life sciences, climate science, web, marketing, finance, precision medicine, health informatics, and other domains
Submitted papers should present novel ideas, methodologies, algorithms, or applications in the realm of data mining. Papers will be evaluated based on their technical quality, novelty, relevance, and clarity of presentation.
Eligibility
- Undergraduate students and high school students pursuing an academic degree at the time of submission are eligible to submit papers as first authors.
- Each submission must have at least one student author, who should be the presenter if the paper is accepted.
- Co-authorship with faculty members or researchers is allowed, but the student must be the primary contributor to the work.
In-Person Policy (UGHS)
The UGHS Symposium will be run as an in-person session. At least one student author must attend and present the paper in person; otherwise, the paper will not be included in the IEEE proceedings.
Additional requirements for high school student authors:
- At least one parent or legal guardian must accompany the student to the conference.
- A signed release form (download here).
- An information document signed by a parent/guardian, along with an optional signature from the high school principal. The document should include the paper ID and title; contact information for the parent (required) and the principal (optional); a statement of permission from the high school (optional); and confirmation that a parent will accompany the student. Template: Information Document (DOCX)
Note: School approvals are optional. However, some schools require their permission.
The conference policy requires that each student participant be accompanied by their own parent or legal guardian. One adult chaperone cannot supervise multiple students.
Where to send: Please email the signed release form and the optional information document to ieee-icdm-2025-undergraduate-and-high-school-symposium-g@vt.edu before November 7, 2025.
Submission Format Requirements
- Submissions must adhere to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (formatting instructions: https://www.ieee.org/conferences/publishing/templates.html).
- Undergraduate student research papers should not exceed 6 pages, including all figures, tables, and references.
- High school student research papers should not exceed 5 pages, including all figures, tables, and references.
- Please highlight whether the first author is a high school or an undergraduate student in the author affiliation of your submitted paper.
Publication Ethics and Dual Submission Policy
We uphold the highest standards of academic integrity and ethical research conduct. All authors must ensure that their submissions fully comply with the following guidelines.
Originality and Dual Submission: Submitted manuscripts must represent original work that has not been submitted or published elsewhere. Dual submission (submitting the same or substantially similar content to multiple venues concurrently) is strictly prohibited and constitutes a violation of publication ethics and integrity.
Subsequent Journal Submission: If authors wish to submit an extended version of their published work to a journal after conference publication (e.g., IEEE ICDM proceedings), they must clearly disclose that a preliminary version has been published. Transparency with journal editors and reviewers is required. Any subsequent submission to another venue must include significant new contributions, such as expanded experiments, novel theoretical insights, or additional methodological developments. Minor revisions or incremental changes are not sufficient to qualify as a new work.
Ethical Responsibility: Authors are responsible for maintaining ethical standards in all aspects of the publication process, including authorship, data integrity, and proper citation of prior work. Violations may lead to retraction, notification of the authors’ institutions, and disqualification from future submissions.
Important Dates
- Paper Submission Deadline: Sep 1, 2025 11:59 PM AoE
- Paper Submission Deadline (Deadline Extended): Sep 10, 2025 11:59 PM AoE
- Notification of Acceptance: Oct 1, 2025
- Camera-Ready Paper Submission: Oct 15, 2025
- Symposium Date: Nov 15, 2025
Venue & Hotels
IEEE ICDM 2025 will be held in person in Washington, D.C. Discounted hotel rooms are available. Please check: https://www3.cs.stonybrook.edu/~icdm2025/venue.html
How to Submit
- The review process is single-blind (reviewers anonymous; authors visible to reviewers).
- Accepted papers will appear in the ICDM Workshop Proceedings (IEEE Computer Society Press) and be available at the conference.
- Each accepted paper must have at least one author registered and present in person during the symposium day.
Camera-Ready Submission
Final deadline: October 15, 2025.
- The camera-ready paper must follow all formatting requirements; it is the final version of the paper, not a response to reviewer comments. Ensure IEEE formatting and page limits per the ICDM 2025 website, use high-clarity figures, and keep the layout clean and consistent.
- Submission site: IEEE CPS Login
- PDF eXpress validation: Conference Record 69685.
- Paper ID on the registration form: use your conference submission ID (e.g., S01217) from the submission emails, not the IEEE PDF eXpress ID (e.g., 2025357692).
- Issues with copyright or submission? Contact Martha Nunez.
Registration
The discount registration code has already been sent to all authors in the camera-ready email. The discounted fee is the one-day symposium fee, not the full conference fee. It covers one student and one accompanying parent, and includes the Friday banquet.
Program Co-Chairs
Join us in shaping the future of data mining research by sharing your insights and discoveries at the Undergraduate and High School Symposium. For questions, please contact the symposium co-chairs.


For general inquiries: ieee-icdm-2025-undergraduate-and-high-school-symposium-g@vt.edu

