Cheminformatics is the development of computational tools and methods to study complex chemical systems. Understanding and predicting behavior in chemical systems is critical for areas such as drug discovery and computational toxicology. Drug discovery is the process by which new candidate medications are discovered, while computational toxicology focuses on adverse health effects caused by chemicals such as environmental pollutants and pharmaceuticals.
Prediction models are at the heart of cheminformatics, and they typically address at least one of the following challenges: large predictor space, noisy data, rare events, high degree of interactions between predictors and balancing the need for interpretability with prediction ability. Data sets may fall anywhere in the spectrum from big data to very small data, where each extreme has its own demands.