Conference proceeding
Mining Hidden Knowledge from the Counterterrorism Dataset Using Graph-Based Approach
Natural Language Processing and Information Systems, NLDB 2016, Vol.9612, pp.310-317
Lecture Notes in Computer Science
01/01/2016
DOI: 10.1007/978-3-319-41754-7_29
Abstract
Information overload is now a matter of fact. This enormous stack of information holds huge potential for discovering previously uncharted knowledge. In this paper, we propose a graph-based approach integrated with a statistical correlation measure to discover latent but valuable information buried in huge corpora. Given two concepts, C_i and C_j (e.g. bush and bin ladin), we find the best set of intermediate concepts linking them by gleaning across multiple documents. We perform query enrichment on the input concepts using the Longest Common Substring (LCSubstr) algorithm to enhance the level of granularity. Moreover, we use the Kulczynski correlation measure to determine the strength of interdependence between concepts and to demote associations with relatively meager statistical significance. Finally, we present users with ranked paths, along with sentence-level evidence, to facilitate better interpretation of the underlying context. A counterterrorism dataset is used to demonstrate the effectiveness and applicability of our technique.
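The two named building blocks in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the document-count inputs, and the example strings are assumptions made for illustration. It shows a standard dynamic-programming longest common substring and the usual definition of the Kulczynski measure as the average of the two conditional probabilities P(A|B) and P(B|A).

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the first longest contiguous substring shared by a and b."""
    best_len = 0   # length of the best match found so far
    best_end = 0   # end index (exclusive) of that match in a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at (i-1, j-1).
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def kulczynski(support_a: int, support_b: int, support_ab: int) -> float:
    """Kulczynski measure: 0.5 * (P(B|A) + P(A|B)).

    The arguments are hypothetical document counts: documents containing
    concept A, concept B, and both concepts together.
    """
    if support_a == 0 or support_b == 0:
        return 0.0
    return 0.5 * (support_ab / support_a + support_ab / support_b)


# Illustrative inputs only; these are not figures from the paper.
print(longest_common_substring("bin ladin", "osama bin laden"))
print(kulczynski(support_a=10, support_b=40, support_ab=5))
```

A score near 1 indicates that the two concepts almost always co-occur, while a score near 0 indicates a weak association that a ranking step could demote.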
Details
- Title: Subtitle
- Mining Hidden Knowledge from the Counterterrorism Dataset Using Graph-Based Approach
- Creators
- Kishlay Jha - North Dakota State University; Wei Jin - North Dakota State University
- Contributors
- E Metais (Editor); F Meziane (Editor); M Saraee (Editor); Vijayan Sugumaran (Editor); S Vadera (Editor)
- Resource Type
- Conference proceeding
- Publication Details
- Natural Language Processing and Information Systems, NLDB 2016, Vol.9612, pp.310-317
- Publisher
- Springer Nature
- Series
- Lecture Notes in Computer Science
- DOI
- 10.1007/978-3-319-41754-7_29
- ISSN
- 0302-9743
- eISSN
- 1611-3349
- Number of pages
- 8
- Language
- English
- Date published
- 01/01/2016
- Academic Unit
- Electrical and Computer Engineering
- Record Identifier
- 9984294925402771