Abstract
Computer-aided drug design (CADD) has progressed dramatically with the development of big data and AI-driven methodologies. Drug design is expensive and time-consuming, largely because of the complexity of the underlying biomedical systems. CADD offers effective and efficient strategies for overcoming these obstacles when designing and developing a new medicine. Data pre-processing methods are introduced to prepare raw data so that big data and AI methodologies can be applied consistently and reproducibly. Big data and AI technologies can support drug development in areas that include predicting absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, finding binding sites in target proteins, and conducting structure-based virtual screening. Together, data pre-processing and big data and AI methods make possible the accurate and thorough analysis of large volumes of biomedical data and the design of prediction models for drug design. In the era of biomedical big data, knowledge of the biological, chemical, or pharmacological structures of biomedical entities relevant to drug design should be analyzed with suitable big data and AI approaches.
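As a rough illustration of the kind of data pre-processing the abstract refers to, the sketch below fills missing values with the column median and then standardizes each column to zero mean and unit variance. It is not the chapter's own procedure: the function name `preprocess`, the choice of median imputation and z-scoring, and the toy descriptor values (molecular weight and logP) are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev


def preprocess(rows):
    """Median-impute missing values (None), then z-score each column.

    Hypothetical helper for illustration only; assumes every column
    has at least one observed value.
    """
    columns = list(zip(*rows))  # transpose: rows -> columns
    cleaned = []
    for col in columns:
        observed = [v for v in col if v is not None]
        fill = median(observed)  # imputation value for this column
        filled = [fill if v is None else v for v in col]
        mu, sigma = mean(filled), pstdev(filled)
        # A constant column has sigma == 0; map it to zeros instead of dividing.
        cleaned.append([0.0 if sigma == 0 else (v - mu) / sigma for v in filled])
    return [list(r) for r in zip(*cleaned)]  # transpose back to rows


# Made-up descriptor table: [molecular weight, logP]; None marks a missing value.
raw = [
    [180.2, 1.2],
    [None, 3.1],
    [342.4, None],
    [250.3, 2.0],
]
scaled = preprocess(raw)
```

Applying the same deterministic steps to every dataset is what makes downstream models comparable across runs; in practice the imputation and scaling parameters would be fitted on training data only and reused on held-out data.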
Acknowledgments
This article is financially supported by the 2023 College of Public Policy at Korea University.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Seo, S., Lee, J.W. (2024). Applications of Big Data and AI-Driven Technologies in CADD (Computer-Aided Drug Design). In: Gore, M., Jagtap, U.B. (eds) Computational Drug Discovery and Design. Methods in Molecular Biology, vol 2714. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3441-7_16
DOI: https://doi.org/10.1007/978-1-0716-3441-7_16
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3440-0
Online ISBN: 978-1-0716-3441-7
eBook Packages: Springer Protocols