Visual Analytics

In Visual Analytics, we research effective systems for exploration and retrieval in large and complex data sets. Our aim is to combine scalable visual data representations with appropriate automatic data analysis methods. A tight integration of visualization and data analysis in interactive systems can help to find patterns and details of interest in large data. Our work falls into the following areas:

Surveys and Foundations of Visual Analytics

Visual Analytics builds on foundations in visualization, interaction, and data analysis, among others. The number of available components and possible system designs is large, hence it is instructive to survey existing systems and define research perspectives. In [vLKS+11], we surveyed a large number of graph visualization approaches and defined promising research directions for visual analytics of graph data. An application that has received a lot of research interest is the visual analysis of social media data. In [SK13], we survey visual analytics systems in this area. We have recently provided a survey of text highlighting techniques [SOK+15]. In this survey, based on crowd-sourced experiments, we also compared the relative performance of different methods on term identification tasks. Furthermore, there is an increasing interest in leveraging visual analysis systems in corporate settings. In [ZSB+12], we compared a number of existing data management and analysis systems with respect to their scope and comparative advantages. In addition, in [vLSFK12] we explored the specific challenges of searching and analyzing data composed of multiple compound data types.

Visual Analysis of High-Dimensional and Relational Data

Figure 1: Exploration of complementary and redundant subspaces [TMF+12] (left), and 1D MDS techniques [JFSK15] (right).

Figure 2: Visual analysis of large relational data by semantic zoom of a matrix view [BDF+14a] (left) and comparison of multiple hierarchies for analysis of phylogenetic trees [BvLH+11] (right).

High-dimensional visual data analysis is challenging, as typically there are not enough visual variables available to map many different data dimensions. Data patterns of interest may be hidden in subspaces, which implies a need for data reduction and novel analysis paradigms in general. In several projects, we have explored the application of data mining methods from subspace analysis. In [TMF+12], we proposed a system for interactive exploration of data subspaces, based on grouping subspaces by similarity and on their MDS layouts. Our ClustNails system [TZB+12] is an approach for visual comparison of clusters in subspaces. Recently, we have introduced 1D MDS plots [JFSK15] for the visual exploration of time-dependent multivariate data sets, allowing analysts to visually detect patterns across time and dimensions.

In the exploration of high-dimensional data, there exist exponentially many possible data views (subspaces), and it is often not possible to specify a priori which subspaces or views are interesting to a user or within a given task. In [BLBS11], we proposed a scalable approach for visual comparison of alternative descriptor spaces to identify useful spaces for analysis. The approach is based on data projection and appropriate visual-interactive comparison facilities. Furthermore, it may be possible to introduce a relevance-feedback stage into high-dimensional data exploration [BKSS14]. By means of a classifier trained on user feedback, the system can learn to discriminate between relevant and irrelevant views, and adapt to the task at hand.

High-dimensional data analysis often relies on data reduction for visualization. In [PZS+15], we proposed measures to judge the quality of projections, which in turn can be used to search for dimension weights that improve projection relevance. Still, projections often introduce distortions and misleading views. To address this, we proposed several visual mappings that include projection quality measures in the projection [SvLB10], making the analyst aware of certain and uncertain data areas.

We also investigate the visual analysis of relational data. Matrix visualization can help to show large network data; however, the displays typically depend on finding an appropriate sorting to visually detect patterns. In [BDF+14a], we proposed a semantic zoom approach for exploration and comparison of sets of matrices, where the display can scale from adjacency matrix to node-link view. Furthermore, we considered visual comparison of sets of hierarchies in a small-multiple approach, where a custom similarity function allows identifying similar or different subtrees [BvLH+11]. Besides trees, we also considered visual comparison of sets of graphs using a Self-Organizing Map display to show clusters of node-link diagrams [vLGS09].
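The notion of a projection quality measure, mentioned above in connection with [PZS+15] and [SvLB10], can be illustrated with a small sketch. It computes a normalized stress value, a commonly used measure that compares pairwise distances before and after projection, plus a per-point error that could be mapped to a visual variable to flag uncertain areas. This is an assumption chosen for illustration; the measures proposed in the cited papers may differ, and the function names `normalized_stress` and `point_errors` are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def normalized_stress(high_dim, low_dim):
    """Global distortion: sqrt(sum (d_high - d_low)^2 / sum d_high^2) over i < j.

    0.0 means all pairwise distances are preserved exactly.
    Illustrative measure only, not necessarily the one used in the cited work.
    """
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    num = 0.0
    den = 0.0
    n = len(high_dim)
    for i in range(n):
        for j in range(i + 1, n):
            num += (d_high[i][j] - d_low[i][j]) ** 2
            den += d_high[i][j] ** 2
    return math.sqrt(num / den) if den > 0 else 0.0


def point_errors(high_dim, low_dim):
    """Per-point distortion: summed squared distance error to all other points."""
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    n = len(high_dim)
    return [
        sum((d_high[i][j] - d_low[i][j]) ** 2 for j in range(n) if j != i)
        for i in range(n)
    ]


# Toy data: four 3D points projected to 2D by dropping the third coordinate.
data_3d = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
data_2d = [(x, y) for x, y, _ in data_3d]

stress = normalized_stress(data_3d, data_2d)
errors = point_errors(data_3d, data_2d)
most_distorted = errors.index(max(errors))
```

In this toy example, the point that differs from the origin only in the dropped coordinate collapses onto the origin, so it receives the largest per-point error. Such a value could drive colour or opacity in a scatter plot to mark less trustworthy regions.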


Visual Analysis of Spatial and Temporal Data

Figure 1: Visual analysis of spatio-temporal data: Interactive Self-Organizing trajectory map [SBTK09] (left) and visual analysis of movement data during a soccer match [JSS+14] (right).

Figure 2: Two approaches for scalable time series visualization: Importance-driven layouts [DHKS05] (left) and Growth Matrix display [KNS+06] (right).

Spatial and temporal data are basic data forms of paramount importance in data analysis. A number of our works concern trajectory data. In [SBTK09], we provided a visual analytics approach for interactive clustering of trajectory data. A set of controls allows the analyst to interactively specify a number of example trajectories, which are used to initialize the training of a Self-Organizing cluster map. Our approach allows the user to visually monitor the training process and, if needed, steer the process by visual adaptation of parameters and cluster prototypes. In many cases, trajectories of interest are very long and need to be segmented into smaller chunks. In [vLBSF14], we applied an interest point detector to temporal features of trajectories, allowing analysts to identify and compare segments (time intervals) of interest, based on features of single or multiple trajectories.

Motion also occurs naturally in a number of applications, which we support by custom systems. In [JSS+14], we proposed feature-based visual analysis of soccer match data, based on player and ball trajectories. The system includes a classification engine, which the user can train to adaptively find segments of interest. In [BWK+13], we developed the Motion Explorer system, which allows search and comparison of movement patterns in motion capture data, based on transition and cluster diagrams together with a custom movement glyph.

Spatial analysis also plays an important role in social media analysis applications. For example, in [SBS13a] we assessed the credibility of voluntarily contributed, spatially localized image data for estimation of location and content correctness. The latter is important if one wants to rely on high quality of spatial information, e.g., in crisis management scenarios. It is also interesting to search for spatially trending patterns in microblog services. In [SBSL14], we analyzed microblog data for spatial transition patterns, e.g., linear or circular trends. We showed how this can be used to track the sentiment of an audience during a band tour across the country.

Visual analysis of temporal data is often confronted with many long time series. Existing visualizations often have a scalability problem; to this end, we worked on a number of techniques for effective compression and abstraction of large time series data. In our Space-in-Time maps [AAB+10], we provided explorative overviews of long time series with geo-references, based on visual cluster analysis. Several scalable layouts were proposed for long time series. In [DHKS05], we proposed a TreeMap-type layout for sets of time series, scaling and placing them according to a given importance measure. In [HKDS07], we considered a time series display which also adapts and scales according to a given interest measure, allowing focus-and-context analysis in long time series. As another example, it is also possible to represent time series in a matrix-oriented display. In our Growth Matrix representation, the spectrum of all possible change ratios in a given time series is shown by a matrix display [KNS+06], an approach useful, e.g., in financial data analysis.
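The idea behind a matrix of all possible change ratios can be sketched in a few lines. The sketch assumes that the entry for a start time i and a later end time j is the relative change series[j] / series[i] - 1; the exact definition and visual encoding in [KNS+06] may differ, and the function names `growth_matrix` and `best_interval` are hypothetical.

```python
def growth_matrix(series):
    """Upper-triangular matrix of relative changes between all time pairs.

    Entry [i][j] with i < j is series[j] / series[i] - 1, i.e. the change
    obtained when entering at time i and leaving at time j. All other
    entries are None, as are entries whose start value is zero.
    """
    n = len(series)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        if series[i] == 0:
            continue  # ratio is undefined for a zero start value
        for j in range(i + 1, n):
            matrix[i][j] = series[j] / series[i] - 1.0
    return matrix


def best_interval(matrix):
    """Return (ratio, start, end) of the largest change, or None if empty."""
    best = None
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None and (best is None or value > best[0]):
                best = (value, i, j)
    return best


# Toy price series (assumed values for illustration only).
prices = [100.0, 110.0, 99.0, 121.0]
gm = growth_matrix(prices)
top = best_interval(gm)
```

Mapping each matrix entry to a colour yields a triangular display in which every possible interval of the series is visible at once, which is what makes the representation attractive for questions such as "when would it have been best to enter and leave?".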


Visual Search and Digital Libraries

Digital Libraries aim to provide user access to archived contents. Increasingly, non-textual documents are relevant in addition to textual documents. We focus particularly on user search and access to research data sets, where the goal is to search for data patterns of interest in a large data repository. In the VisInfo project, we proposed and evaluated a methodology for visual search in time series data sets [BDF+14b]. Using a baseline similarity function, users can search for patterns of interest, with cluster-based overviews allowing navigation. In addition to time series, we have also worked on similarity models and visual representations for bivariate data sets. We showed that so-called regressional features can form an effective similarity model to search in bivariate data [SBS11]. In [SBS+14], we proposed an approach to assist the query specification for scatter plot patterns by search previews based on shadow drawings. An extension using a bag-of-words model [SvLS13] also supports retrieval in multivariate data. To evaluate and compare the performance of alternative similarity models for scatter plot retrieval, in [SvLS12] we defined a benchmark data set. Comparing a number of similarity models, we found that features based on density and edge orientation perform well on average, and that the regression feature model has advantages in terms of user interpretability. While these models consider global data properties, we recently also explored local approaches for the similarity computation. In [SSB+15], based on appropriate segmentation, a weighted distance function can compare scatter plots for the similarity of local patterns.
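To make the idea of a regression-based similarity model for bivariate data more concrete, the following minimal sketch describes each scatter plot by the slope, intercept, and R² of a least-squares line and ranks candidates by the Euclidean distance between these feature vectors. This is a simplified assumption for illustration: the regressional features of [SBS11] may use different or additional regression models, and the names `regression_features` and `rank_by_similarity` are hypothetical.

```python
import math


def regression_features(points):
    """Describe a scatter plot by (slope, intercept, r_squared) of a fitted line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    # R^2 of a simple linear fit equals the squared correlation coefficient.
    r_squared = (sxy * sxy) / (sxx * syy) if sxx and syy else 0.0
    return (slope, intercept, r_squared)


def rank_by_similarity(query, candidates):
    """Return candidate names ordered from most to least similar to the query."""
    query_features = regression_features(query)
    scored = [
        (math.dist(query_features, regression_features(points)), name)
        for name, points in candidates.items()
    ]
    return [name for _, name in sorted(scored)]


# Toy repository of scatter plots (assumed data for illustration only).
query_plot = [(0, 0), (1, 1), (2, 2), (3, 3)]
repository = {
    "rising": [(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9)],
    "falling": [(0, 3), (1, 2), (2, 1), (3, 0)],
    "flat": [(0, 1), (1, 1), (2, 1), (3, 1)],
}
ranking = rank_by_similarity(query_plot, repository)
```

Because each feature has a direct meaning (direction, offset, goodness of fit), a user can see why two plots are considered similar, which reflects the interpretability advantage noted above. In practice the features would likely need normalization so that no single one dominates the distance.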


Immersive Analytics

Fig. 1: Movement visualisation and analysis of movement (left) for example for assembly simulation (middle, right) [KSSWF+20][KSSWS+20].

Fig. 2: Multidimensional time series visualisation via spatially stacked line charts for anomaly detection [KSSSW+20].

Fig. 3: Touch display: Collaborative analysis in data modelling by a team of analysts [CSL+17] (left). Eyetracking: User eye tracking for reordering of a Parallel Coordinate Plot [CASS19] (right).

In Immersive Analytics (IA), we explore applications of visual analytics using immersive displays and user input facilities, extending beyond more conventional display and interaction systems like desktops. Novel environments can range from touch screens and situated analytics in the real world, to eye trackers capturing visual attention, to mixed and virtual reality approaches. They offer new possibilities, giving different perspectives on the data to explore and new ways for analysts to interact with the visualization environment. Research in IA evaluates these possibilities and the potential advantages of immersive technologies in the visual analysis process.

In our work, we focus on exploring the ways in which we can exploit the improved immersion and spatial understanding of Virtual Reality (VR) for data visualisation and analysis. For example, VR allows us to provide spatial data analysis of user VR behaviour and interaction data, in the same environment the data was captured in. In addition, it can provide novel interaction methods compared to traditional desktop applications.

Virtual environments are becoming more and more ubiquitous and allow us to measure every facet of user interaction. Within the FFG VR4CPPS project, we have created an immersive analysis system that allows users to view and analyse user motion in virtual reality [KSSWF+20]. Movement is visualised as trajectories that show head and hand movement of users, while providing several tools for gaining an overview or for looking at the movement in more detail. Fig. 1 (left) shows our storyboard visualisation, which gives an overview by finding important key-frames in the data and visualising the user's stance at the corresponding timestamps. The middle of Fig. 1 shows a collaborative assembly simulation [KSSWS+20], where users can analyse their movement with the implemented system. The right of Fig. 1 shows such a situation, with a visualisation of the head movement of a user performing an assembly.

Virtual Reality also allows us to use spatial cues for data that is not inherently spatial. One example is multidimensional time series data from measurements over long periods of time, where analysts want to find anomalies in the measured parts. We have created an immersive analytics system that allows users to view the time series in a stacked, spatial arrangement, which provides a fast way of visually detecting anomalies [KSSSW+20]. The system is designed to offer natural interaction metaphors and allows users to stay still while interacting with the data. Analysis and navigation of time series data in VR is often a challenge, due to user navigation, orientation and occlusion issues. Through a novel proxy interaction metaphor, the displayed data can be filtered and transformed in the virtual space, supporting user navigation (Fig. 2 left). Furthermore, users can employ the proxy widget either to sort the data according to an anomaly score and view highly anomalous regions in the data (Fig. 2 middle), or to search for repeated behaviour in the data (Fig. 2 right).

We also research the use of eye tracking devices in the Visual Analytics process. Eye trackers allow capturing user gaze behaviour on a visual display (see Fig. 3 right). This data can potentially be valuable for supporting the user exploration process, e.g., by recognizing viewed visual patterns and recommending unseen visual patterns [SBJ+19]. For example, in [SSES17] we proposed capturing user gazes in exploration of scatter plot matrices, for guiding the exploration of large data. Furthermore, we explore collaborative Visual Analytics on large-scale touch displays, for joint data modelling, e.g., regression analysis and pattern search [CSL+17] (see Fig. 3 left).

[KSSWF+20] S. Kloiber, V. Settgast, C. Schinko, M. Weinzerl, J. Fritz, T. Schreck, and R. Preiner. Immersive Analysis of User Motion in VR Applications. The Visual Computer 36(10–12): 1937–49, 2020. https://doi.org/10.1007/s00371-020-01942-1.

[KSSWS+20] S. Kloiber, C. Schinko, V. Settgast, M. Weinzerl, T. Schreck, and R. Preiner. Integrating Assembly Process Design and VR-based Evaluation using the Unreal Engine. In Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (Vol. 1, pp. 271–278), 2020. SCITEPRESS - Science and Technology Publications. https://doi.org/10.5220/0008965002710278.

[KSSSW+20] S. Kloiber, J. Suschnigg, V. Settgast, C. Schinko, M. Weinzerl, T. Schreck, and R. Preiner. Immersive Analytics of Anomalies in Multivariate Time Series Data with Proxy Interaction. In Proc. 2020 International Conference on Cyberworlds (CW), 94–101, 2020. https://doi.org/10.1109/CW49994.2020.00021.

[SBJ+19] N. Silva, T. Blaschek, R. Jianu, N. Rodrigues, D. Weiskopf, M. Raubal, and T. Schreck. Eye tracking support for visual analytics systems: Foundations, current applications, and research challenges. In Proc. ACM Symposium on Eye Tracking Research & Applications, 2019. https://doi.org/10.1145/3314111.3319919.

[CASS19] M. Chegini, K. Andrews, T. Schreck, and A. Sourin. Eyetracking based adaptive parallel coordinates. In SIGGRAPH Asia 2019 Posters, SA 2019, Brisbane, QLD, Australia, November 17-20, 2019, pages 44:1–44:2. ACM, 2019. https://doi.org/10.1145/3355056.3364563.

[SSES17] L. Shao, N. Silva, E. Eggeling, and T. Schreck. Visual exploration of large scatter plot matrices by pattern recommendation based on eye tracking. In Proc. ACM IUI Workshop on Exploratory Search and Interactive Data Analytics, 2017. https://doi.org/10.1145/3038462.3038463.

[CSL+17] M. Chegini, L. Shao, D. Lehmann, K. Andrews, and T. Schreck. Interaction concepts for collaborative visual analysis of scatterplots on large vertically-mounted high-resolution multi-touch displays. In Proc. 10th Forum Media Technology FH St. Poelten, 2017.

VAST Challenge Participation

Figure 1: The VisInfo Digital Library system for time series retrieval [BDF+14b] (left), and visual search for scatter plot data using guided query sketching [SBS+14] (right).

Figure 2: Results from successful entries to the VAST Challenge on epidemic spread analysis [BBF+11] (left) and visual-interactive prediction [AJS+14] (right).

Evaluation of visual analysis systems is not easy, as these typically comprise a combination of visualization, interaction, and data analysis algorithms. Furthermore, traditional evaluation metrics like time and error are not directly applicable. This is because visual analytics solutions ultimately aim to provide exploration and insights, which are harder to measure than more precisely predefined, operational tasks. Contest-based evaluation is a viable approach to assess and compare visual analysis systems. The VAST Challenge is an international, yearly contest in which the community is asked to solve challenging data analysis tasks on specifically prepared, representative data sets. The entries are peer-reviewed by researchers and professional analysts for effectiveness and novelty.

We participated successfully in the VAST Challenge in previous years. In 2011, we received the Grand Challenge award with an approach for integrated analysis of spatial-temporal microblog, news, and network-oriented data [BBF+11]. In 2013, we won two awards for visual-interactive prediction systems, based on a tree-oriented visual gathering of training data and a visual combination of SVM prediction models. The results are generalized in [AJS+14].
