As engineering librarians, we recognize that understanding our faculty's research needs is an ongoing endeavor, one that continues for as long as we serve engineering faculty with diverse research interests. However, this learning process is time-intensive and does not help librarians quickly form an overall view of changing and evolving departments. It is especially challenging for early-career librarians who are new to engineering librarianship or who lack a relevant subject background.
To tackle this problem, the authors explored the research topics in our faculty's work using Latent Dirichlet Allocation (LDA), a statistical topic model and machine learning algorithm for discovering topics in text data. As the text data, we retrieved thousands of bibliographic records of faculty publications from Web of Science, specifically the title, abstract, and keywords of each record, then removed duplicates and cleaned the data. Next, we ran the data through the LDA model, which sorted it into groups of related words, each group forming a research topic. We determined that the optimal number of research topics was 25 and interpreted the topics based on a visualization of the LDA results.
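The workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the record fields (`title`, `abstract`, `keywords`), the toy records, the stopword list, and the helper names `dedupe`, `tokenize`, `fit_lda`, and `top_words` are all hypothetical, and the model is fitted with collapsed Gibbs sampling, which is one common way to estimate LDA and may differ from the method the authors used. The toy example asks for 2 topics only because it has four records; it does not show how the figure of 25 topics was chosen.

```python
import random
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"and", "for", "the", "with", "under", "using"}

def dedupe(records):
    """Keep the first record seen for each normalised title."""
    seen, unique = set(), []
    for rec in records:
        key = re.sub(r"\W+", " ", rec["title"].lower()).strip()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def tokenize(rec):
    """Join title, abstract and keywords; lowercase; drop stopwords and short tokens."""
    text = " ".join([rec["title"], rec["abstract"], " ".join(rec["keywords"])])
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if len(w) > 2 and w not in STOPWORDS]

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns count matrices and the vocabulary."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_vocab for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, z = index[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha) * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab

def top_words(topic_word, vocab, n=5):
    """Return the n highest-count words for each topic."""
    return [[vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
            for row in topic_word]

# Invented toy records; the second is a duplicate of the first.
records = [
    {"title": "Fatigue of welded steel joints",
     "abstract": "Fatigue crack growth in welded steel joints under cyclic load.",
     "keywords": ["fatigue", "steel", "weld"]},
    {"title": "Fatigue of Welded Steel Joints.",
     "abstract": "Fatigue crack growth in welded steel joints under cyclic load.",
     "keywords": ["fatigue", "steel", "weld"]},
    {"title": "Deep learning for image segmentation",
     "abstract": "A neural network model for medical image segmentation.",
     "keywords": ["neural network", "image"]},
    {"title": "Crack growth in steel beams",
     "abstract": "Crack growth and fatigue life of steel beams.",
     "keywords": ["crack", "steel", "fatigue"]},
    {"title": "Neural network training for image classification",
     "abstract": "Training a deep neural network model for image classification.",
     "keywords": ["neural network", "image", "classification"]},
]

unique = dedupe(records)
docs = [tokenize(rec) for rec in unique]
doc_topic, topic_word, vocab = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words(topic_word, vocab)):
    print(f"Topic {k}: {', '.join(words)}")
```

Each printed topic is a list of its most frequent words, which is the raw material a librarian would then interpret and label. On a corpus of thousands of records, an established topic modeling library would be a more practical choice than this hand-written sampler.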
In conclusion, our experiment with the LDA approach helped us quickly develop an understanding of faculty research interests. The results could provide evidence to inform decisions on collection management, reference, and library instruction, and they show that academic libraries can make use of data and data science techniques in the era of big data.