Improving Interaction in Visual Analytics using Machine Learning
Abstract
Interaction is one of the most fundamental components of visual analytics systems: it transforms people from mere viewers into active participants in the process of analyzing and understanding data. Fast and accurate interaction techniques are therefore key to establishing a successful human-computer dialogue and enabling smooth visual data exploration. Machine learning is a branch of artificial intelligence that gives systems the ability to automatically learn and improve from experience without being explicitly programmed. It has been used in a wide variety of fields in which it is not straightforward to develop a conventional algorithm that performs a task effectively. Inspired by this, we see an opportunity to improve current interactions in visual analytics by using machine learning methods.
In this thesis, we address the need for interaction techniques that are both fast, enabling fluid interaction in visual data exploration and analysis, and accurate, i.e., enabling the user to effectively select specific data subsets. First, we present a new, fast and accurate brushing technique for scatterplots, based on the Mahalanobis brush, which we have optimized using data from a user study. Second, we present a new solution for near-perfect sketch-based brushing, in which a convolutional neural network (CNN) estimates the intended data selection from a fast and simple click-and-drag interaction and from the data distribution in the visualization. Next, we propose a framework that lets the user improve the brushing technique while using it. We tested this framework with CNN-based brushing, and the results show that the underlying model can be refined (i.e., made more accurate) and personalized with very little retraining time. Furthermore, to investigate to what degree the human should be involved in the model design, and how good an empirical model can become with more careful design, we extended our Mahalanobis brush (the best current empirical model in terms of accuracy for brushing points in a scatterplot) by additionally incorporating data distribution information, captured by kernel density estimation (KDE). Based on this work, we then provide a detailed comparison between empirical modeling and implicit modeling by machine learning (deep learning). Lastly, we introduce a new, machine-learning-based approach that enables fast and accurate querying of time series data based on a swift sketching interaction. To achieve this, we build upon existing long short-term memory (LSTM) technology to encode both the sketch and the time series data in two networks with shared parameters.
All the proposed interaction techniques in this thesis were demonstrated by application examples and evaluated via user studies. The integration of machine learning knowledge into visualization opens further possible research directions.
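To make the idea of distance-based brushing in the abstract more concrete, the following is a minimal sketch of selecting scatterplot points by Mahalanobis distance. It is an illustration under stated assumptions, not the technique from the thesis papers: the function name `mahalanobis_select`, the `radius` used to pick the local neighbourhood, and the `threshold` of 2.0 are all hypothetical choices made for this example.

```python
import math

def mahalanobis_select(points, center, radius, threshold=2.0):
    """Return indices of points statistically close to a brush position.

    Hypothetical helper for illustration only. The covariance is estimated
    from the points within `radius` (Euclidean) of `center`; every point whose
    Mahalanobis distance to the local mean is at most `threshold` is selected.
    """
    cx, cy = center
    local = [(x, y) for x, y in points if math.hypot(x - cx, y - cy) <= radius]
    n = len(local)
    if n < 3:
        return []  # too few points for a meaningful covariance estimate

    # Local mean and sample covariance (2x2, symmetric).
    mx = sum(x for x, _ in local) / n
    my = sum(y for _, y in local) / n
    sxx = sum((x - mx) ** 2 for x, _ in local) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in local) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in local) / (n - 1)

    det = sxx * syy - sxy * sxy
    if det <= 1e-12:
        return []  # degenerate (collinear) neighbourhood, covariance not invertible

    # Closed-form inverse of the 2x2 covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    selected = []
    for i, (x, y) in enumerate(points):
        dx, dy = x - mx, y - my
        d2 = ixx * dx * dx + 2.0 * ixy * dx * dy + iyy * dy * dy  # squared distance
        if d2 <= threshold ** 2:
            selected.append(i)
    return selected


# Usage: an elongated diagonal cluster plus one far-away point.
points = [(0, 0), (1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (10, -10)]
selection = mahalanobis_select(points, center=(2, 2), radius=3.0)
```

Because the distance is measured relative to the local covariance, the selection region is an ellipse aligned with the local point distribution rather than a circle, which is the property that makes this kind of brush follow elongated clusters.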
Has parts
Paper A: Chaoran Fan and Helwig Hauser. User-study based optimization of fast and accurate Mahalanobis brushing in scatterplots. In Proc. Vision, Modeling, and Visualization (VMV 2017), pages 77–84, 2017. The article is available in the thesis file. The article is also available at: https://doi.org/10.2312/vmv.20171262
Paper B: Chaoran Fan and Helwig Hauser. Fast and Accurate CNN-based Brushing in Scatterplots. Computer Graphics Forum (Eurovis 2018), 37 (3): 111–120, 2018. The article is available in the thesis file. The article is also available at: https://doi.org/10.1111/cgf.13405
Paper C: Chaoran Fan and Helwig Hauser. Personalized Sketch-Based Brushing in Scatterplots. IEEE Computer Graphics and Applications, 39 (4): 28–39, 2019. The article is available at: https://hdl.handle.net/11250/2722776
Paper D: Chaoran Fan and Helwig Hauser. On sketch-based selections from scatterplots using KDE, compared to Mahalanobis and CNN brushing. IEEE Computer Graphics and Applications, 41 (5): 67–78, 2021. The article is available in the thesis file. The article is also available at: https://doi.org/10.1109/MCG.2021.3097889
Paper E: Chaoran Fan, Krešimir Matković and Helwig Hauser. Sketch-based fast and accurate querying of time series using parameter-sharing LSTM networks. IEEE Transactions on Visualization and Computer Graphics, Early Access, 2020. The article is available in the thesis file. The article is also available at: https://doi.org/10.1109/TVCG.2020.3002950