Please use this identifier to cite or link to this item: http://hdl.handle.net/UCSP/15843
Title: Paradigmatic Clustering for NLP
Authors: Santisteban Pablo, Julio Omar
Tejada Cárcamo, Javier
Keywords: Cluster analysis; Data mining; Graph theory; Natural language processing systems; asymmetric similarity; clustering; Clustering techniques; paradigmatic; Similarity measure; Synthetic and real data; Traditional approaches; Word Sense Disambiguation; Clustering algorithms
Issue Date: 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Relation URI: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84964770393&doi=10.1109%2fICDMW.2015.233&partnerID=40&md5=26ff37a5a3402b53a73baf00f81bd862
Abstract: How can we retrieve meaningful information from a large and sparse graph? Traditional approaches focus on generic clustering techniques and on discovering dense clusters in a network graph; however, they tend to omit interesting patterns such as paradigmatic relations. In this paper, we propose a novel graph clustering technique that models the relations of a node using paradigmatic analysis. We exploit a node's relations to extract its existing sets of signifiers. The newly found clusters represent a different view of a graph, which provides interesting insights into the structure of a sparse network graph. Our proposed algorithm, PaC (Paradigmatic Clustering), clusters graphs using paradigmatic analysis supported by an asymmetric similarity. In contrast to traditional graph clustering methods, our algorithm yields worthwhile results in word-sense disambiguation tasks. In addition, we propose a novel paradigmatic similarity measure. Extensive experiments and empirical analysis are used to evaluate our algorithm on synthetic and real data. © 2015 IEEE.
URI: http://repositorio.ucsp.edu.pe/handle/UCSP/15843
ISBN: 9781467384926
Appears in Collections: Artículos de investigación
