DimVis: Interpreting Visual Clusters in Dimensionality Reduction With Explainable Boosting Machine

Salmanian, Parisa; Chatzimparmpas, Angelos; Karaca, Ali Can; Martins, Rafael M.

dc.contributor.author	Salmanian, Parisa	en_US
dc.contributor.author	Chatzimparmpas, Angelos	en_US
dc.contributor.author	Karaca, Ali Can	en_US
dc.contributor.author	Martins, Rafael M.	en_US
dc.contributor.editor	Archambault, Daniel	en_US
dc.contributor.editor	Nabney, Ian	en_US
dc.contributor.editor	Peltonen, Jaakko	en_US
dc.date.accessioned	2024-05-21T08:51:16Z
dc.date.available	2024-05-21T08:51:16Z
dc.date.issued	2024
dc.description.abstract	Dimensionality Reduction (DR) techniques such as t-SNE and UMAP are popular for transforming complex datasets into simpler visual representations. However, while effective in uncovering general dataset patterns, these methods may introduce artifacts and suffer from interpretability issues. This paper presents DimVis, a visualization tool that employs supervised Explainable Boosting Machine (EBM) models (trained on user-selected data of interest) as an interpretation assistant for DR projections. Our tool facilitates high-dimensional data analysis by providing an interpretation of feature relevance in visual clusters through interactive exploration of UMAP projections. Specifically, DimVis uses a contrastive EBM model that is trained in real time to differentiate between the data inside and outside a cluster of interest. Taking advantage of the inherently explainable nature of the EBM, we then use this model to interpret the cluster itself via single and pairwise feature comparisons, ranked by the EBM model's feature importance. The applicability and effectiveness of DimVis are demonstrated via a use case and a usage scenario with real-world data. We also discuss the limitations and potential directions for future research.	en_US
dc.description.sectionheaders	Papers
dc.description.seriesinformation	Machine Learning Methods in Visualisation for Big Data
dc.identifier.doi	10.2312/mlvis.20241125
dc.identifier.isbn	978-3-03868-256-1
dc.identifier.pages	5 pages
dc.identifier.uri	https://doi.org/10.2312/mlvis.20241125
dc.identifier.uri	https://diglib.eg.org/handle/10.2312/mlvis20241125
dc.publisher	The Eurographics Association	en_US
dc.rights	Attribution 4.0 International License
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	CCS Concepts: Human-centered computing→Visualization; Visual analytics; Machine learning→Unsupervised learning
dc.subject	Human-centered computing→Visualization
dc.subject	Visual analytics
dc.subject	Machine learning→Unsupervised learning
dc.title	DimVis: Interpreting Visual Clusters in Dimensionality Reduction With Explainable Boosting Machine	en_US
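
The contrastive setup described in the abstract (points inside a selected cluster versus everything outside it, followed by a feature ranking) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy data, and above all the scoring rule, which is a plain standardized mean difference used as a stand-in and is not the EBM feature importance that DimVis relies on.

```python
from statistics import mean, pstdev


def contrastive_labels(n_points, selected):
    """Label each point 1 if it is inside the selected cluster, else 0."""
    chosen = set(selected)
    return [1 if i in chosen else 0 for i in range(n_points)]


def rank_features(rows, labels, feature_names):
    """Rank features by absolute standardized mean difference (inside vs. outside).

    Stand-in score only: DimVis ranks features by EBM importance instead.
    Assumes both groups are non-empty; otherwise mean() raises StatisticsError.
    """
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in rows]
        inside = [v for v, y in zip(column, labels) if y == 1]
        outside = [v for v, y in zip(column, labels) if y == 0]
        spread = pstdev(column) or 1.0  # guard against constant features
        scores[name] = abs(mean(inside) - mean(outside)) / spread
    # Highest score first: the feature that best separates the cluster.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four points, two features; points 2 and 3 are "selected".
rows = [[1.0, 5.0], [1.2, 1.0], [9.0, 5.1], [9.3, 0.9]]
labels = contrastive_labels(len(rows), selected=[2, 3])
ranking = rank_features(rows, labels, ["f0", "f1"])
```

In this toy example `f0` separates the selected points from the rest while `f1` does not, so `f0` comes first in `ranking`. In a setting closer to the paper, the labels would come from a user selection on a UMAP projection and the ranking from a trained EBM; the record does not state which implementation the authors used.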

Files

Original bundle

Name: 03_mlvis20241125.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format

Collections

Machine Learning Methods in Visualisation for Big Data 2024