Chemical syntheses, the routes connecting commercially available starting materials to a molecule of interest, are generally designed by expert chemists with years of advanced training and experience. They are usually carried out experimentally in a labor-intensive, trial-and-error fashion and frequently require repeated re-optimization or redevelopment along the way. The overarching goal of this project is to create a fully automated system that rapidly and efficiently produces any specified organic target molecule. A key component is identifying promising synthetic routes to target molecules with minimal risk of failure, so that an automated reactor system can execute them.
Building upon recent advances in machine learning, cheminformatics, and computational chemistry, our group is developing a knowledge-based, computational synthesis route design platform for reaction pathway identification, scoring, and selection. Taking advantage of the large corpus of known reactions in the literature, we have developed a machine learning model that predicts the outcomes of organic reactions and a nearest-neighbor algorithm that recommends reaction conditions. In addition to knowledge extracted from the literature, we have applied computational chemistry approaches to predict thermophysical properties (e.g., molecular solubility) that are important to synthesis planning.
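To illustrate the general idea behind a nearest-neighbor recommendation of reaction conditions, the following is a minimal sketch and not the platform's actual implementation. It assumes each known reaction is represented by a hypothetical fingerprint (a set of integer feature indices) and a placeholder conditions label, and it assumes Tanimoto similarity as the distance measure; the names `KnownReaction`, `tanimoto`, and `recommend_conditions` are introduced here for illustration only.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class KnownReaction(NamedTuple):
    """A literature reaction (hypothetical representation for this sketch)."""
    fingerprint: FrozenSet[int]  # assumed: indices of features present in the reaction
    conditions: str              # assumed: placeholder label for solvent, catalyst, etc.


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of feature indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend_conditions(
    query: FrozenSet[int], library: List[KnownReaction], k: int = 2
) -> List[Tuple[float, str]]:
    """Return (similarity, conditions) for the k known reactions most similar to the query."""
    scored = [(tanimoto(query, r.fingerprint), r.conditions) for r in library]
    # Sort by similarity only, highest first; ties keep library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy library with made-up fingerprints and placeholder condition labels.
library = [
    KnownReaction(frozenset({1, 2, 3, 4}), "conditions A"),
    KnownReaction(frozenset({1, 2, 7}), "conditions B"),
    KnownReaction(frozenset({8, 9}), "conditions C"),
]

query = frozenset({1, 2, 3})
top = recommend_conditions(query, library, k=2)
for similarity, conditions in top:
    print(f"{similarity:.2f}  {conditions}")
```

In this sketch the conditions of the most similar known reactions are simply returned in ranked order; a practical system would likely use chemically meaningful fingerprints and some way of reconciling conflicting conditions among neighbors.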
As we work towards a fully automated organic synthesis system, we aim to refine the planning process using feedback from the experimental evaluation and optimization conducted by our collaborators, and to further develop computational tools that guide scale-up and process optimization.