Beyond Numbers: Creating Analogies to Enhance Data Comprehension and Communication with Generative AI

Xuechen Li
Tongji University
Abstract: Unfamiliar units of measurement often hinder readers from grasping the scale of numerical data, comprehending its content, and integrating it into context. To improve data understanding and communication effectiveness, we use analogies to bridge the gap between abstract data and familiar measurements. In this work, we first conducted semi-structured interviews with design experts to identify design issues and summarize design considerations. We then collected a dataset of 138 analogy cases from various online sources and, based on it, depicted a design space for creating data analogies. Finally, we built a prototype system, AnalogyMate, that automatically recommends data analogies, corresponding design solutions, and visualization representations generated by generative AI. Our results show that AnalogyMate is useful in assisting the data analogy creation process and that data analogies are effective in improving data understanding and communication.
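To make the core computation concrete, here is a minimal sketch of the unit arithmetic behind a data analogy: re-expressing an abstract quantity as a multiple of a familiar reference object. The reference table and the make_analogy helper are illustrative assumptions, not part of AnalogyMate.

```python
# Illustrative sketch: map a raw number onto a familiar reference object.
# The reference objects and magnitudes are rough, assumed values.

REFERENCES = {
    # quantity kind -> (reference object, approximate magnitude in SI units)
    "volume_m3": ("an Olympic swimming pool", 2_500.0),   # ~2,500 m^3
    "mass_kg": ("an adult African elephant", 6_000.0),    # ~6,000 kg
    "length_m": ("a soccer pitch", 105.0),                # ~105 m long
}

def make_analogy(value: float, kind: str) -> str:
    """Re-express `value` as a multiple of a familiar reference."""
    obj, magnitude = REFERENCES[kind]
    ratio = value / magnitude
    return f"{value:,.0f} is about {ratio:,.1f}x {obj}"

if __name__ == "__main__":
    # e.g., 1.2 million cubic meters of floodwater
    print(make_analogy(1_200_000, "volume_m3"))
    # -> 1,200,000 is about 480.0x an Olympic swimming pool
```

In a system like AnalogyMate, a generative model would additionally choose which reference best fits the audience and render a supporting visual; the arithmetic above is only the scaffolding.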
Speaker Bio: Graduated from Tongji University with a bachelor's degree in Media and Communication Design; currently a graduate student at the Intelligent Big Data Visualization (iDVx) Lab of the College of Design and Innovation, Tongji University, supervised by Associate Professor Chen Qing. Research interests include artificial intelligence and data design, as well as creativity assistance tools.
DKMap: Interactive Exploration of Vision-Language Alignment in Multimodal Embeddings via Dynamic Kernel Enhanced Projection

Yilin Ye
Hong Kong University of Science and Technology
Abstract: Studying vision-language alignment in multimodal embeddings is crucial for various tasks, such as evaluating generative models and filtering multimodal pre-training data. Due to the complexity of high-dimensional embedding features, dimensionality reduction (DR) methods must be employed to explore alignment relationships in multimodal embeddings. However, existing DR visual analysis methods fail to consider cross-modal alignment metrics, leading to three problems: occlusion by noise points in dense regions, inaccurate metric heatmaps, and insufficient support for multi-scale interactive exploration. To address these issues, this paper proposes DKMap, a novel dimensionality reduction visualization system that enables interactive exploration of multimodal embeddings through Dynamic Kernel Enhanced Projection. First, we propose a parameterized supervised t-SNE that integrates post-projection metric heatmap estimation into the projection learning process, thereby improving the accuracy of multimodal mapping. Second, to support multi-scale exploration with dynamic zooming and progressive enhancement of local details, we combine generalized t-kernel α-parameter optimization under validation constraints with quadtree-based multi-resolution techniques, achieving reliable kernel parameter adjustment with low overfitting. DKMap is a cross-platform visualization tool comprising a web system for interactive exploration and a Python library for computational notebook analysis. We validated DKMap's versatility and scalability through three usage scenarios: visualizing the million-scale text-to-image corpus DiffusionDB, comparatively evaluating generative models with different architectures (Unet-Diffusion and Diffusion Transformer), and exploring the billion-scale pre-training dataset LAION-400M.
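As a rough illustration of the metric-heatmap idea, the sketch below runs an off-the-shelf t-SNE and then estimates a post-projection metric heatmap with Nadaraya-Watson kernel regression. DKMap instead folds this estimation into the projection learning itself; the embeddings and per-point scores here are synthetic stand-ins for multimodal features and a cross-modal alignment metric.

```python
# Sketch (not DKMap's parameterized supervised t-SNE): project embeddings
# with plain t-SNE, then smooth a per-point alignment score onto a grid.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))   # stand-in multimodal embeddings
score = rng.uniform(size=500)    # stand-in alignment metric per point

Y = TSNE(n_components=2, random_state=0).fit_transform(X)

def metric_heatmap(Y, score, grid=64, bandwidth=2.0):
    """Nadaraya-Watson kernel regression of the metric on a 2D grid."""
    xs = np.linspace(Y[:, 0].min(), Y[:, 0].max(), grid)
    ys = np.linspace(Y[:, 1].min(), Y[:, 1].max(), grid)
    gx, gy = np.meshgrid(xs, ys)
    centers = np.stack([gx.ravel(), gy.ravel()], axis=1)
    d2 = ((centers[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth**2))              # Gaussian kernel weights
    heat = (w @ score) / np.clip(w.sum(axis=1), 1e-12, None)
    return heat.reshape(grid, grid)

H = metric_heatmap(Y, score)
print(H.shape)  # (64, 64), ready to render as a heatmap layer
```

Estimating the heatmap only after projection, as here, is exactly what can make it inaccurate in dense regions; supervising the projection with this estimate is DKMap's contribution.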
Speaker Bio: PhD candidate in Computational Media and Arts, Division of Interdisciplinary Studies, Hong Kong University of Science and Technology, supervised by Professor Wei Zeng and Professor Kang Zhang. Research focuses on the intersection of data visualization, human-computer interaction, and artificial intelligence, with particular emphasis on human-AI interaction based on high-dimensional embeddings, developing dimensionality reduction and retrieval methods and systems that support user interaction with multimodal data and generative models. During doctoral studies, has published 10 CCF-A papers, including 5 first-author papers in VIS, ICML, CSCW, TVCG, and other top conferences and journals in visualization, human-computer interaction, and artificial intelligence, and received a CHI Best Paper Nomination as corresponding author. Serves as a reviewer for VIS, CHI, TVCG, PacificVis, etc.
Reviving Static Charts into Live Charts

Lu Ying
Zhejiang University
Abstract: Data charts are prevalent across various fields due to their efficacy in conveying complex data relationships. However, static charts may sometimes struggle to engage readers and efficiently present intricate information, potentially resulting in limited understanding. We introduce "Live Charts," a new presentation format that decomposes the complex information within a chart and explains the pieces sequentially through rich animations and accompanying audio narration. We propose an automated approach to revive static charts into Live Charts: GNN-based techniques analyze the chart components and extract the underlying data, and large language models then generate appropriate animated visuals along with a voice-over to produce Live Charts from static ones. We conducted a thorough evaluation of our approach, covering model performance, use cases, a crowd-sourced user study, and expert interviews. The results demonstrate that Live Charts offer a multi-sensory experience in which readers can follow the information and understand the data insights better. We also analyze the benefits and drawbacks of Live Charts over static charts as a new information consumption experience.
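A minimal sketch of the planning step is shown below, assuming chart components and data have already been extracted (the role of the paper's GNN stage) and that a chat-completion client is available. The call_llm parameter and the plan_live_chart helper are hypothetical placeholders, not the paper's API.

```python
# Sketch: ask an LLM to turn an extracted chart specification into an
# ordered list of animation + narration steps. `call_llm` is a placeholder
# for any text-in/text-out chat client.
import json

def plan_live_chart(chart_spec: dict, call_llm) -> list[dict]:
    prompt = (
        "Given this chart specification, produce a JSON list of steps, "
        "each with 'target' (component id), 'animation' (e.g., fade-in, "
        "highlight), and 'narration' (one spoken sentence):\n"
        + json.dumps(chart_spec)
    )
    return json.loads(call_llm(prompt))

example_spec = {
    "type": "bar",
    "title": "Annual revenue (M$)",
    "components": [
        {"id": "bar-2019", "value": 41},
        {"id": "bar-2020", "value": 58},
    ],
}
# steps = plan_live_chart(example_spec, call_llm=my_client)
# Each step would then drive an SVG animation plus text-to-speech narration.
```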
Speaker Bio: PhD candidate at the School of Computer Science and Technology, Zhejiang University, supervised by Professor Wu Yingcai. Research focuses on data narrative and pictographic visualization, dedicated to integrating artificial intelligence technology into visualization design to improve the intelligence and efficiency of visualization creation. Her current research covers intelligent visualization generation, information visualization narrative, and basic theories and key technologies in the field of visual analytics. She has published over ten papers on visualization, visual analytics, and artificial intelligence in conferences and journals such as IEEE VIS, IEEE TVCG, ACM CHI, and IEEE TPAMI, including eight CCF-A papers, four of them as first author.
ProactiveVA: Proactive Visual Analytics with LLM-Based UI Agent

Yuheng Zhao
Fudan University
Abstract: Traditional visual analytics often requires significant human effort and lacks proactive assistance during the exploration process. Even the latest reactive LLM-assisted systems only provide help when explicitly requested by the user, making them insufficiently intelligent to offer suggestions when analysts need them the most. To support deeper and broader insight exploration, we propose a novel proactive visual analytics framework in which an LLM-based UI agent autonomously anticipates users' difficulties and offers timely assistance. To determine when users need proactive help, what assistance they require, and how the agent should intervene, we first conducted a formative study analyzing help-seeking behaviors in interaction logs and distilled key design requirements for proactive agents in visual analytics systems. We then developed a three-stage UI agent pipeline comprising perception, reasoning, and acting: the agent autonomously perceives users' needs from interaction logs and provides tailored suggestions and intuitive guidance during interactive exploration of the system. We implemented the framework in two representative types of visual analytics systems to demonstrate its generalizability, and evaluated its effectiveness through an algorithm evaluation, case and expert studies, and a user study. We also discuss current design trade-offs of proactive visual analytics and areas for further exploration.
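The sketch below illustrates one way a perceive-reason-act loop over interaction logs could look. The idle and repetition trigger heuristics and the call_llm placeholder are assumptions for illustration; ProactiveVA's agent derives its signals from the formative study rather than fixed thresholds.

```python
# Sketch of a perceive-reason-act loop over interaction logs.
# Log entries are assumed to be dicts like {"t": epoch_seconds, "action": str}.
import time
from collections import Counter

IDLE_SECONDS = 30   # assumed threshold: user may be stuck
REPEAT_LIMIT = 5    # assumed threshold: user is thrashing

def perceive(log: list[dict]) -> str | None:
    """Detect moments where proactive help may be warranted."""
    if not log:
        return None
    if time.time() - log[-1]["t"] > IDLE_SECONDS:
        return "idle"
    recent = Counter(e["action"] for e in log[-10:])
    if recent.most_common(1)[0][1] >= REPEAT_LIMIT:
        return "repeated_action"
    return None

def reason_and_act(log: list[dict], call_llm):
    trigger = perceive(log)
    if trigger is None:
        return None  # stay quiet; do not interrupt fluent exploration
    prompt = (
        f"The analyst shows a '{trigger}' pattern. Recent events: "
        f"{log[-10:]}. Suggest one concrete next analysis step."
    )
    return call_llm(prompt)  # surfaced as a suggestion in the UI
```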
Speaker Bio: PhD student at Fudan University, supervised by Researcher Chen Siming. His research direction is large-model-driven intelligent visual analytics, including a large-model-enhanced full-process visual analytics framework, agent-driven visual analytics for automated task decomposition, and visual analytics methods for social media and text. Research results have been published in top journals and conferences such as IEEE TVCG, IEEE VIS, and ACM CSCW. He has won the IEEE PacificVis Best Paper Nomination Award and First Prize in the IEEE VAST Challenge, among other honors.
Deep Learning Model Visualization Based on Generated Data

Yang Zhang
Tianjin University
Abstract: With the continuous development of generative models, generated samples have become an important supplement to original samples and are widely used in downstream tasks such as product design, model training, and optimization. Generative models learn high-dimensional latent spaces from which an unlimited number of samples can be drawn. However, effectively guiding users to find ideal generated samples in the latent space and apply them to downstream tasks remains a challenge. To address this problem, we propose the Latent Space Map, which projects the latent space onto a two-dimensional plane while preserving its key characteristics, helping users locate samples with desired properties through visual guidance. We applied the map in model diagnosis and product design scenarios and verified its effectiveness. Building on this idea, we further propose RobustMap, a visual exploration method for the adversarial robustness of deep neural networks based on the generative latent space. By using generated samples to visualize the robustness of deep neural networks, it significantly increases the value of generated samples in deep learning model analysis.
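A minimal sketch of the workflow follows, with an identity decoder and a norm-based property score standing in for a real generative model and downstream metric. Plain PCA is used for layout here, whereas the Latent Space Map uses a projection designed to preserve the latent space's key characteristics.

```python
# Sketch: sample latents, score each decoded sample on a property of
# interest, and lay the samples out in 2D so users can steer toward
# desirable regions. Decoder and score are assumed stand-ins.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

def decode(z):           # placeholder for a generative model's decoder
    return z             # assumption: identity stand-in

def property_score(x):   # placeholder for a downstream-task score
    return float(np.linalg.norm(x))

Z = rng.normal(size=(1000, 128))                 # latent samples
scores = np.array([property_score(decode(z)) for z in Z])
Y = PCA(n_components=2).fit_transform(Z)         # 2D map coordinates

# Plotting Y colored by `scores` gives a navigable map: picking a 2D
# point, mapping it back to a latent z, and decoding yields a sample.
best = Z[scores.argmax()]
print(Y.shape, scores.max())
```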
Speaker Bio: Master's student at the School of Intelligence and Computing, Tianjin University, supervised by Professor Li Jie. Research direction is generative model latent space visualization.
Leveraging Foundation Models for Crafting Narrative Visualization: A Survey

Abstract: Narrative visualization transforms data into engaging stories, making complex information accessible to a broad audience. Foundation models, with their advanced capabilities such as natural language processing, content generation, and multimodal integration, hold substantial potential for enriching narrative visualization. Recently, a collection of techniques has been introduced for crafting narrative visualizations based on foundation models from different aspects. We build our survey upon 66 papers to study how foundation models can progressively engage in this process, and we propose a reference model categorizing the reviewed literature into four essential phases: Analysis, Narration, Visualization, and Interaction. Furthermore, we identify eight specific tasks (e.g., Insight Extraction and Authoring) where foundation models are applied across these stages to facilitate the creation of visual narratives. Detailed descriptions, related literature, and reflections are presented for each task. To make the survey more impactful and informative for diverse readers, we discuss key research problems and summarize the strengths and weaknesses of foundation models in each task, guiding readers in identifying and seizing opportunities while navigating challenges in this field.
Speaker Bio: Currently pursuing a PhD at the Intelligent Big Data Visualization (iDVx) Lab at Tongji University, supervised by Professor Cao Nan. Main research directions include agent design, human-computer interaction, and narrative visualization. Has published three academic papers, including one SCI paper as first author. Graduated from the Department of Digital Media at Beijing University of Posts and Telecommunications.
Versatile Ordering Network: An Attention-based Neural Network for Ordering Across Scales and Quality Metrics

Zehua Yu
Sun Yat-Sen University
Abstract: Ordering has been extensively studied in many visualization applications, such as axis and matrix reordering, for the simple reason that the order greatly impacts the perceived patterns in data. Many quality metrics concerning data patterns, perception, and aesthetics have been proposed, along with corresponding optimization algorithms. However, the optimization problems related to ordering are often hard to solve (e.g., the TSP is NP-hard), and developing specialized optimization algorithms is costly. In this paper, we propose the Versatile Ordering Network (VON), which automatically learns an ordering strategy for a given quality metric. VON uses the quality metric to evaluate its solutions and leverages reinforcement learning with a greedy rollout baseline to improve itself. This keeps the metric transparent and allows VON to optimize over different metrics. Additionally, VON uses the attention mechanism to collect information across scales and reposition the data points with respect to the current context, which allows it to deal with data points following different distributions. We examine the effectiveness of VON under different usage scenarios and metrics. The results demonstrate that VON can produce results comparable to specialized solvers.
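The training signal, REINFORCE with a greedy rollout of the same policy as the baseline, evaluated against a black-box quality metric, can be sketched with a toy pointer-style policy. The tiny model below is illustrative only and is not VON's attention architecture; the metric here is tour length, as in TSP-style ordering.

```python
# Sketch: REINFORCE with a greedy rollout baseline for learned ordering.
import torch

def tour_length(points, order):
    """Black-box quality metric to minimize (closed tour length)."""
    p = points[order]
    return (p - p.roll(-1, dims=0)).norm(dim=1).sum()

class TinyPointer(torch.nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.enc = torch.nn.Linear(2, dim)
        self.score = torch.nn.Bilinear(dim, dim, 1)

    def decode(self, points, greedy=False):
        """Build an order one point at a time from attention-like scores."""
        h = self.enc(points)                     # (n, dim) embeddings
        n = points.shape[0]
        visited = torch.zeros(n, dtype=torch.bool)
        cur, order, logp = 0, [0], 0.0
        visited[0] = True
        for _ in range(n - 1):
            s = self.score(h[cur].expand(n, -1), h).squeeze(-1)
            s = s.masked_fill(visited, float("-inf"))
            probs = torch.softmax(s, dim=0)
            nxt = probs.argmax() if greedy else torch.multinomial(probs, 1)[0]
            logp = logp + torch.log(probs[nxt])
            visited[nxt] = True
            order.append(int(nxt))
            cur = int(nxt)
        return torch.tensor(order), logp

model = TinyPointer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    pts = torch.rand(20, 2)
    order_s, logp = model.decode(pts)                 # sampled rollout
    with torch.no_grad():
        order_g, _ = model.decode(pts, greedy=True)   # greedy baseline
    advantage = tour_length(pts, order_s) - tour_length(pts, order_g)
    loss = advantage * logp      # REINFORCE: push below-baseline rollouts up
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the metric is only ever called to score complete orderings, it stays transparent and swappable, which is the property the abstract highlights.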
Speaker Bio: PhD student at the School of Computer Science, Sun Yat-Sen University, supervised by Associate Professor Tao Jun. Research directions include neural network-oriented visual analytics methods, machine learning, and event sequence analysis, with a focus on general tools and algorithms for VIS4AI.
VisGuard: Securing Visualization Dissemination through Tampering-Resistant Data Retrieval

Huayuan Ye
East China Normal University
Abstract: Visualizations are disseminated primarily as raster images, which often results in the loss of critical information such as source code, interactive features, and metadata. While previous methods have proposed embedding metadata into images to facilitate Visualization Image Data Retrieval (VIDR), most lack practicability because they are fragile to the image tampering common in online distribution, such as cropping and editing. To address this issue, we propose VisGuard, a tampering-resistant VIDR framework that reliably embeds a metadata link into visualization images. The embedded link remains recoverable even after substantial image tampering. VisGuard enables various applications, including interactive chart reconstruction, tampering detection, and copyright protection. Comprehensive experiments demonstrate VisGuard's superior performance in data retrieval accuracy, embedding capacity, and security against tampering and steganalysis, showing its competence in facilitating and safeguarding visualization dissemination and information conveyance.
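The sketch below illustrates only the redundancy principle that lets recovery survive cropping and local edits: a short payload is replicated across image blocks and decoded by majority vote. Naive LSB coding is used purely for exposition; VisGuard's learned embedding is far more robust and is not reproduced here.

```python
# Sketch: replicate a payload across blocks, recover by majority vote.
import numpy as np

def embed(img: np.ndarray, payload: bytes, block=32) -> np.ndarray:
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    out = img.copy()
    h, w = img.shape[:2]
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            tile = out[by:by + block, bx:bx + block, 0].ravel()
            tile[:bits.size] = (tile[:bits.size] & 0xFE) | bits  # LSB write
            out[by:by + block, bx:bx + block, 0] = tile.reshape(block, block)
    return out

def recover(img: np.ndarray, nbytes: int, block=32) -> bytes:
    votes = []
    h, w = img.shape[:2]
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            tile = img[by:by + block, bx:bx + block, 0].ravel()
            votes.append(tile[:nbytes * 8] & 1)
    bits = (np.mean(votes, axis=0) > 0.5).astype(np.uint8)  # majority vote
    return np.packbits(bits).tobytes()

payload = b"https://vis.example/c42"   # hypothetical metadata link
img = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
stego = embed(img, payload)
stego[:96] = 0                         # simulate tampering: wipe top 96 rows
print(recover(stego, len(payload)))    # -> b'https://vis.example/c42'
```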
Speaker Bio: Huayuan Ye, Master's student at East China Normal University, supervised by Associate Professor Li Chenhui. His research areas include AI4VIS, human-computer interaction, computer vision, etc. He has published several papers in journals/conferences such as VIS and CHI.
NeuroSync: Intent-Aware Code-Based Problem Solving via Direct LLM Understanding Modification

Wenshuo Zhang
Hong Kong University of Science and Technology
Abstract: The use of conversational large language models for programming tasks by non-experts is on the rise. However, a significant hurdle remains in effectively communicating complex user intentions to these models, often resulting in generated code that fails to meet user needs. This research investigates this communication breakdown and proposes a new direction for human-LLM interaction. Our approach aims to create a more transparent and collaborative process where the user's intent can be more clearly understood and verified by the system. We have developed a prototype to explore this new paradigm. Results from our evaluations are promising and suggest that our methods can enhance the synergy between users and AI, leading to more effective and efficient outcomes in programming-related tasks.
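One way to read the proposed direction is as an intent-verification loop, sketched below as a hypothetical outline. The call_llm and ask_user callables are placeholders, and the staging is an assumption based on the abstract rather than the paper's actual pipeline.

```python
# Sketch: surface the model's understanding of a request as an editable
# artifact before any code is generated.
def solve_with_verified_intent(request: str, call_llm, ask_user) -> str:
    # Stage 1: have the model state its understanding, not code.
    intent = call_llm(
        "Restate this programming request as a numbered list of concrete "
        "requirements. Do not write code yet:\n" + request
    )
    # Stage 2: let the user directly edit the model's understanding.
    intent = ask_user(f"Edit if anything is wrong:\n{intent}")
    # Stage 3: generate code from the verified intent only.
    return call_llm("Write code satisfying exactly these requirements:\n" + intent)

# ask_user could be as simple as:
#   lambda text: input(text + "\n> ") or text
```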
Speaker Bio: Wenshuo Zhang, PhD student at Hong Kong University of Science and Technology, supervised by Professor Qu Huamin. His research direction is human-AI alignment in conversational and agent systems, with particular focus on improving the performance of large language models in programming and other tasks.