Exploring Multi-dimensional Data via Subset Embedding

dc.contributor.author: Xie, Peng [en_US]
dc.contributor.author: Tao, Wenyuan [en_US]
dc.contributor.author: Li, Jie [en_US]
dc.contributor.author: Huang, Wentao [en_US]
dc.contributor.author: Chen, Siming [en_US]
dc.contributor.editor: Borgo, Rita and Marai, G. Elisabeta and Landesberger, Tatiana von [en_US]
dc.date.accessioned: 2021-06-12T11:01:22Z
dc.date.available: 2021-06-12T11:01:22Z
dc.date.issued: 2021
dc.description.abstract: Multi-dimensional data exploration is a classic research topic in visualization. Most existing approaches are designed for identifying record patterns in dimensional space or subspace. In this paper, we propose a visual analytics approach to exploring subset patterns. The core of the approach is a subset embedding network (SEN) that represents a group of subsets as uniformly formatted embeddings. We implement the SEN as multiple subnets with separate loss functions. This design enables the SEN to handle arbitrary subsets and capture the similarity of subsets on single features, thus achieving accurate pattern exploration, which in most cases means searching for subsets that have similar values on a few features. Moreover, each subnet is a fully-connected neural network with one hidden layer. This simple structure brings high training efficiency. We integrate the SEN into a visualization system that supports a 3-step workflow. Specifically, analysts (1) partition the given dataset into subsets, (2) select portions in a projected latent space created using the SEN, and (3) determine the existence of patterns within selected subsets. Overall, the system combines visualizations, interactions, automatic methods, and quantitative measures to balance exploration flexibility and operation efficiency, and to improve the interpretability and faithfulness of the identified patterns. Case studies and quantitative experiments on multiple open datasets demonstrate the general applicability and effectiveness of our approach. [en_US]
dc.description.number: 3
dc.description.sectionheaders: Multivariate Data and Dimension Reduction
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 40
dc.identifier.doi: 10.1111/cgf.14290
dc.identifier.issn: 1467-8659
dc.identifier.pages: 75-86
dc.identifier.uri: https://doi.org/10.1111/cgf.14290
dc.identifier.uri: https://diglib.eg.org:443/handle/10.1111/cgf14290
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd. [en_US]
dc.subject: Human centered computing
dc.subject: Visual analytics
dc.subject: Visualization systems and tools
dc.title: Exploring Multi-dimensional Data via Subset Embedding [en_US]
Files
Original bundle

Name: v40i3pp075-086.pdf
Size: 5.44 MB
Format: Adobe Portable Document Format

Name: 1199-file1.zip
Size: 39.14 MB
Format: Zip file

Name: demonstrationvideo.mp4
Size: 60.6 MB
Format: Unknown data format