Deep learning applications for the design of novel molecules : an active learning deep learning platform for de novo generation of UV absorbing molecules
Dissertation

Umesh Arampath
University of Iowa
Doctor of Philosophy (PhD), University of Iowa
Autumn 2025
PDF: Deep_Learning_Applications_for_the_design_of_novel_molecules___Umesh_Arampath (1.54 MB)
Embargoed Access, Embargo ends: 01/23/2027

Abstract

This thesis focuses on applying deep learning methods to predict molecular properties and to generate novel bio-based UV-absorbing molecules with optimized properties. Generalized property prediction models trained on molecular string representations and generative models based on conditional variational autoencoders are the two main building blocks of the active learning deep learning platform for de novo generation of UV-absorbing molecules. The most challenging task in developing a machine learning solution is gathering and preparing relevant datasets. Several domain-specific properties were therefore explored and, because no proprietary dataset was available, publicly available datasets were searched for the relevant properties. A dataset containing UV absorption data for several thousand small molecules was identified as suitable for building a machine learning model that predicts the UV absorption properties of a given molecule. Several deep learning-based algorithms were examined, and testing yielded promising prediction results that could be further optimized and carried over into generative models for producing novel molecules. In silico and lab experiments were conducted to verify the feasibility and practical applicability of the proposed methods, and a framework was established for using functional property prediction and generative processes in this domain-specific area. We propose a recurrent neural network (RNN) for predicting the UV absorption maximum with high accuracy and have verified its performance through wet-lab and in silico tests. The process used to develop the optimized RNN-based property prediction architecture allows the model to be generalized to other property datasets.
We further study conditional variational autoencoders as generative models for producing optimized molecules from a given starting molecule; their effectiveness is confirmed by screening the generated molecules against the starting molecule.
Machine Learning, Bioinformatics, AI, Deep Learning
