Machine learning-based discovery of molecules, crystals, and composites: A perspective review

  • Invited Review Paper
Korean Journal of Chemical Engineering

Abstract

Machine learning-based approaches to materials discovery are reviewed with the aim of providing a perspective on the current state of the art and its potential. Various models used to represent molecules and crystals are introduced, along with how such representations can be used within neural networks to generate materials that satisfy specified physical features and properties. For problems where a large database for the structure-property map cannot be created, active learning approaches based on Bayesian optimization, which aim to maximize the efficiency of a search, are reviewed. Successful applications of these machine learning-based materials discovery approaches are beginning to appear, and some of the notable ones are reviewed.
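As a rough illustration of the active learning idea mentioned in the abstract, the following is a minimal sketch of a Bayesian optimization loop over a fixed pool of candidates. It is not taken from the review: the one-dimensional descriptor, the `toy_property` function standing in for an expensive simulation or experiment, the squared-exponential kernel with its length scale, and the expected-improvement acquisition are all assumptions chosen for brevity.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar descriptors (assumed form)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X, y, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))
    k_star = [rbf(a, x_new) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    if std < 1e-12:
        return 0.0
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def toy_property(x):
    """Hypothetical stand-in for an expensive simulation or experiment."""
    return math.sin(3.0 * x) * (1.0 - x) + 0.5 * x

def bayesian_search(candidates, n_init=3, n_iter=8):
    """Evaluate candidates one at a time, picking each by maximum expected improvement."""
    step = max(len(candidates) // n_init, 1)
    X = candidates[::step][:n_init]          # evenly spread initial design
    y = [toy_property(x) for x in X]
    for _ in range(n_iter):
        pool = [c for c in candidates if c not in X]
        best = max(y)
        scores = [expected_improvement(*gp_posterior(X, y, c), best) for c in pool]
        pick = pool[scores.index(max(scores))]
        X.append(pick)
        y.append(toy_property(pick))
    return X, y

candidates = [i / 50 for i in range(51)]     # pool of 51 hypothetical candidates
X, y = bayesian_search(candidates)
best_found = max(y)
true_best = max(toy_property(c) for c in candidates)
```

In this sketch only 11 of the 51 candidates are evaluated; comparing `best_found` with `true_best` indicates how close the guided search came to the pool optimum. In a real setting the scalar descriptor would be replaced by a molecular or crystal representation, and `toy_property` by the actual measurement or simulation.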

Acknowledgements

This work was supported by the National Research Foundation of Korea funded by the Ministry of Science, ICT, & Future Planning under grant nos. 2021R1A2C2003583 and 2021R1A2C2006083.

Author information

Corresponding authors

Correspondence to Jihan Kim or Jay Hyung Lee.

Additional information

Jihan Kim obtained his B.S. degree in Electrical Engineering and Computer Sciences from UC Berkeley in 2001. He received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in 2004 and 2009, respectively. From 2009 to 2013, he was a postdoctoral researcher at Lawrence Berkeley National Laboratory. He joined KAIST in 2013 and is currently an associate professor in the Department of Chemical and Biomolecular Engineering. He has published more than 90 papers.

Jay Hyung Lee is currently a KEPCO Chair Professor at Korea Advanced Institute of Science and Technology (KAIST). He is also the director of the Saudi Aramco-KAIST CO2 Management Center. He received the AIChE CAST Computing in Chemical Engineering Award and was elected as an IEEE Fellow, an IFAC Fellow, and an AIChE Fellow. He was the 29th Roger Sargent Lecturer in 2016. He has published over 200 manuscripts in SCI journals, with more than 17,000 Google Scholar citations. His research interests are in the areas of state estimation, model predictive control, planning/scheduling, and reinforcement learning, with applications to energy systems and carbon management systems.

Cite this article

Lee, S., Byun, H., Cheon, M. et al. Machine learning-based discovery of molecules, crystals, and composites: A perspective review. Korean J. Chem. Eng. 38, 1971–1982 (2021). https://doi.org/10.1007/s11814-021-0869-2

  • DOI: https://doi.org/10.1007/s11814-021-0869-2
