Comparative Exploration of Document Collections: a Visual Analytics Approach

dc.contributor.author: Oelke, Daniela [en_US]
dc.contributor.author: Strobelt, Hendrik [en_US]
dc.contributor.author: Rohrdantz, Christian [en_US]
dc.contributor.author: Gurevych, Iryna [en_US]
dc.contributor.author: Deussen, Oliver [en_US]
dc.contributor.editor: H. Carr, P. Rheingans, and H. Schumann [en_US]
dc.date.accessioned: 2015-03-03T12:34:46Z
dc.date.available: 2015-03-03T12:34:46Z
dc.date.issued: 2014 [en_US]
dc.description.abstract: We present an analysis and visualization method for computing what distinguishes a given document collection from others. We determine topics that discriminate a subset of collections from the remaining ones by applying probabilistic topic modeling and subsequently approximating the two relevant criteria distinctiveness and characteristicness algorithmically through a set of heuristics. Furthermore, we suggest a novel visualization method called DiTop-View, in which topics are represented by glyphs (topic coins) that are arranged on a 2D plane. Topic coins are designed to encode all information necessary for performing comparative analyses such as the class membership of a topic, its most probable terms and the discriminative relations. We evaluate our topic analysis using statistical measures and a small user experiment and present an expert case study with researchers from political sciences analyzing two real-world datasets. [en_US]
dc.description.seriesinformation: Computer Graphics Forum [en_US]
dc.identifier.issn: 1467-8659 [en_US]
dc.identifier.uri: https://doi.org/10.1111/cgf.12376 [en_US]
dc.publisher: The Eurographics Association and John Wiley and Sons Ltd. [en_US]
dc.title: Comparative Exploration of Document Collections: a Visual Analytics Approach [en_US]