Los Alamos National Laboratory, Los Alamos, NM
Background: Mechanically robust, single-use plastics resist natural degradation in soil and marine environments, leading to plastic buildup and long-term harm to the surrounding ecosystems. To improve the environmental performance of degradable plastics, poly(hydroxyalkanoates) (PHAs) have been proposed and extensively studied as a sustainable, environmentally friendly, and non-toxic alternative to conventional plastic materials. However, the degradation rate of PHAs in natural environments is neither fully understood nor extensively modeled. The main objective of this study is therefore to use machine learning (ML) to develop a platform that assists in predicting degradation for a diverse suite of PHAs. Methods: A comprehensive literature review of various PHA compositions in distinct degradation environments was conducted to develop a curated database. The individual data entries were collected from peer-reviewed journal articles, published book chapters, and other open-source formats. Where data were available only as plots, we used PlotDigitizer software to convert the graphical results into numerical form. The dataset was then augmented with a carefully selected set of cheminformatics-based features and used to train and validate a random forest (RF) regression model. Results: Over 1,100 unique data points from more than 50 different literature sources were compiled in the final dataset. This dataset included 23 varieties of monomers and polymers and a 3D visualization of the structures using SMARTS representations. Key variables in the database included both quantitative and qualitative descriptors, such as polymer composition, molecular weight, pH, temperature, and percent weight loss of the polymer over time.
While a careful analysis of the curated database provided useful qualitative insights and chemical trends for the degradation behavior of PHAs, we used the ML-based RF regression model for semi-quantitative predictions. In particular, the analysis focused on (1) learning to predict degradation performance and (2) identifying and quantifying the relative importance of the chemical and environmental factors that govern degradation behavior. Conclusions: Built on the first comprehensive, curated database of degradation performance across a wide range of PHA chemistries, our preliminary ML analysis establishes the promise of data-enabled techniques as an alternative route toward the design and development of sustainable, functional biopolymers.
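The two analysis tasks described above, predicting degradation and ranking the factors that drive it, can be illustrated with a minimal sketch. It is not the study's actual pipeline: it assumes scikit-learn's RandomForestRegressor, and the data are synthetic. The feature names (temperature_C, pH, log10_mol_weight, time_days) and the formula generating the weight-loss target are hypothetical stand-ins for the curated database and its cheminformatics features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the curated database (assumption: the real
# dataset and its descriptors are not reproduced here).
rng = np.random.default_rng(0)
n_samples = 400
feature_names = ["temperature_C", "pH", "log10_mol_weight", "time_days"]
X = np.column_stack([
    rng.uniform(4, 60, n_samples),    # temperature in deg C
    rng.uniform(3, 10, n_samples),    # pH
    rng.uniform(4, 6, n_samples),     # log10 of molecular weight
    rng.uniform(0, 365, n_samples),   # exposure time in days
])

# Hypothetical target: percent weight loss, made up for illustration only.
y = np.clip(
    0.15 * X[:, 3] + 0.3 * X[:, 0] - 5.0 * (X[:, 2] - 4.0)
    + rng.normal(0.0, 2.0, n_samples),
    0.0,
    100.0,
)

# Task (1): learn degradation performance and check it on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # coefficient of determination (R^2)

# Task (2): relative importance of each factor (impurity-based, sums to 1).
importances = dict(zip(feature_names, model.feature_importances_))

print(f"held-out R^2: {r2:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:>18s}: {value:.3f}")
```

Impurity-based importances are only one way to rank factors. Permutation importance on held-out data is a common alternative when features are correlated or differ widely in cardinality.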