TaxoDiff: Improving Taxonomy Completion with Diffusion Guided Dynamic Negative Sampling
Conference proceeding   Open access

Shailesh Dahal, Ratri Mukherjee, Nicholas Mathews and Kishlay Jha
Proceedings of the Nineteenth ACM International Conference on Web Search and Data Mining, pp. 122-131
ACM Conferences
WSDM '26: The Nineteenth ACM International Conference on Web Search and Data Mining
02/22/2026
DOI: 10.1145/3773966.3778002
URL: https://doi.org/10.1145/3773966.3778002
Published (Version of record) Open Access

Abstract

Taxonomies offer a powerful data structure for organizing information hierarchically and support practical web applications. However, the rapid emergence of new entities makes manual taxonomy curation both time-consuming and costly. Research efforts have therefore focused on automatically integrating new entities into the existing taxonomy at an appropriate hypernym-hyponym pair. Most recent approaches formulate taxonomy completion as a plausibility-scoring task over candidate query–position pairs, optimized via contrastive learning (CL). However, these methods typically rely on static or randomly sampled negatives that are semantically trivial. To address this, we propose TaxoDiff, a diffusion-guided dynamic negative sampling approach that reformulates negative sampling as a dynamic generative process in the latent space and enables flexible control over the hardness of negative examples during training. Specifically, TaxoDiff leverages a conditional denoising diffusion model to synthesize negative examples with adjustable semantic hardness, generating both easy and hard negatives at different timesteps. Together, this dynamic mixture of easy and hard negatives enables the model to learn the robust feature representations needed for accurate taxonomy completion. Experimental results on three benchmark datasets show that TaxoDiff achieves state-of-the-art taxonomy completion performance.
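
The idea of timestep-controlled negative hardness can be illustrated with a minimal sketch. It is not the TaxoDiff implementation: it uses only the standard forward (noising) step of a diffusion process rather than a learned conditional denoiser, and it assumes that negatives drawn at small timesteps stay close to the positive embedding (hard) while negatives drawn at large timesteps approach pure noise (easy). The function names, the linear noise schedule, and the InfoNCE-style loss are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention coefficients for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def noised_negative(positive, t, alpha_bars, rng):
    """Forward-diffuse a positive embedding to timestep t.

    Assumption: small t yields a hard negative (near the positive),
    large t yields an easy negative (near Gaussian noise).
    """
    eps = rng.standard_normal(positive.shape)
    return np.sqrt(alpha_bars[t]) * positive + np.sqrt(1.0 - alpha_bars[t]) * eps


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for one query, one positive, many negatives."""
    sims = [cosine(query, positive)] + [cosine(query, n) for n in negatives]
    logits = np.array(sims) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))


def mixed_negatives(positive, timesteps, alpha_bars, rng):
    """Build a mixture of negatives, one per requested timestep."""
    return [noised_negative(positive, t, alpha_bars, rng) for t in timesteps]
```

In this sketch, passing a list such as `[10, 10, 999, 999]` to `mixed_negatives` would produce two hard and two easy negatives; varying that list over the course of training is one hypothetical way to realize a dynamic mixture.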
Information systems -- Information extraction
