
Assessment of computational approaches in the prediction of spectrogram and chromatogram behaviours of analytes in pharmaceutical analysis: assessment review

Abstract

Background

Today, artificial intelligence-based computational approaches are facilitating multitasking and interdisciplinary analytical research. For example, data gathered during an analytical research project, such as spectral and chromatographic data, can be used in predictive experimental research. Spectral and chromatographic information plays a crucial role in pharmaceutical research, but obtaining it with instrumental analytical approaches consumes time, manpower, and money. Hence, predictive analysis would be beneficial, especially in resource-limited settings.

Main body

Computational approaches can verify data at an early phase of the research process. Several in silico techniques for predicting an analyte’s spectral and chromatographic characteristics have recently been developed. Understanding these tools may help researchers accelerate their work with greater confidence and avoid being misled by incorrect analytical data. In this communication, the properties of chemical compounds and their relation to chromatographic retention are discussed, as well as prediction techniques for UV/IR/Raman/NMR spectrograms. This review uses reference data of chemical compounds to compare the predictive ability of in silico tools, along with their percentage error, limitations, and advantages.

Conclusion

The computational prediction of analytical characteristics offers a wide range of applications in academic research, bioanalytical method development, computational chemistry, analytical method development, data analysis approaches, material characterization, and validation process.

Background

The use of computational chemistry in research has been well acknowledged in recent years and has afforded significant research outcomes [1, 2]. There are literature reports on computer code for analysing models, replicating processes, predicting models, and interpreting chemical compounds [3]. Unlike in the drug discovery area, the validity of computational techniques as a comprehensive tool in analytical chemistry is yet to be explored [4,5,6]. The computational approach is important in analytical research because simulations of the chemical behaviour of an analyte are needed to model the analyte response relationship in instrumental methods. It can be viewed as a visual representation of the connection between the analytical experiment and theoretical prediction [4, 7].

In this era, research on new chemical entities is needed in the drug discovery process for treatment, diagnostic, and biomarker research. Here, spectroscopy and chromatography techniques play a vital role in the purification, identification, and characterization of the targeted chemical compound [8, 9]. In general, understanding and interpreting the spectrograms and chromatographic retention times of new compounds is quite difficult for beginners, particularly for non-chemists [10]. However, knowledge of spectrograms and chromatography is essential for researchers and plays a crucial role in the process of developing new drugs. Indeed, expertise in computational tools, and awareness of their accuracy, could help researchers speed up experiments with partial validation of the analytical data [11]. At present, there are still predatory journals that publish data sets that are unreliable unless verified [12]. Here, researchers may use computational tools to verify the data before citing them in their research [4, 7]. Prediction tools for various spectrograms, such as UV–visible, infrared (IR), Raman, nuclear magnetic resonance (NMR), and mass spectra, are now widely accessible to researchers. Similarly, in silico approaches exist to predict the chromatographic behaviour of an analyte in chromatographic techniques such as HPLC and GC [13, 14]. The prediction of retention time (tR) in chromatography is gaining importance in analytical method development research. Several computational prediction approaches have been reported, including artificial neural networks (ANNs), response surface methodology (RSM), analytical quality by design (AQbD) [15], design of experiments (DoE), chemometrics, and quantitative structure retention relationship (QSRR) methods [16].
Although knowledge of artificial intelligence software is limited, several artificial neural network-based programmes are widely available these days. Many researchers spend a significant amount of time on their experimental work, even though there are shortcomings in computational chemistry. The AQbD and QSRR approaches explore the scientific understanding of critical method variables and method response in chromatography [17, 18]. These methods are still recommended in pharmaceutical method development because they allow regulatory flexibility [19]. In the AQbD approach [20], the tool used for model development is DoE. In chromatographic research, QSRR is a reliable in silico method for predicting molecular systems [21, 22]; it can be used to evaluate complex physicochemical features of analytes in chromatographic analyses and to predict chromatographic retention parameters [23, 24].

Considering the above discussion, the present assessment review focuses on the various prediction tools that are available and accessible to resource-limited research setups. We have also explored the predictive ability of the different in silico tools with examples drawn from the reference spectral library. Thus, this review can assist researchers in assessing each tool’s reliability from case to case.

Main text

Problems involving the analytical methods

Despite advances in analytical technology, analytical laboratories today face the same difficulties they experienced in the past. These include the growth and preservation of expertise, the maintenance of equipment sensitivity, and the introduction of novel methodologies [25]. There are many reports on previous analytical issues with analytes, including method performance [26], a lack of regulatory flexibility [27], complex chemical processes [28], out-of-trend (OOT) results [29], and out-of-specification (OOS) results [30, 31]. These problems arise mainly in three stages: the pre-analytical, development, and post-analytical phases. They can be overcome by using modern, advanced computational methods.

Pre-analytical phase

The pre-analytical phase is one of the crucial stages in sample analysis; it includes literature gathering, sampling, sample preparation, transport, and storage. This entire process is the most time-consuming and may occasionally lead to errors [32]. It is widely acknowledged that a degraded sample cannot produce good results. It is always important to conduct a literature review before beginning any research on an analyte. There are many databases, books, journals, and websites, but in some instances information on a new analyte may not be available because the analyte has been little studied, the material is newly synthesized, or sources are lacking [33]. Next, for new analytical method development, sample preparation is a critical step. A sample processing method is unique to each type of sample, including biological matrices, food products, active compounds, excipients, and pesticides. A given procedure cannot be applied to a different type of analyte without complete revalidation of the method [34]. Unfortunately, this rule is regularly ignored. Finally, several issues affect the storage and transportation of analytes: temperature, humidity control, data storage maintenance, and a lack of advancement [35].

Development phase

The main problems that arise throughout the development phase are the selection of the method, procedure, principle, and technology, and of appropriate recommendations. Unfortunately, no method has yet been developed that satisfies all of these criteria and is appropriate for all classes of analytes. This always places restrictions on analytical chemists. It is also crucial to understand whether the objective of the analysis is merely screening or accurate quantification. In developing chromatographic methods, optimization covers temperature, flow rate, the choice of mobile and stationary phases, separation efficiency, internal standard selection, and validation. Thus, re-optimization is a difficult task if the method fails during method transfer [36]. In the last decade, new chromatographic techniques for the detection of bio-analytes have emerged. One of these is liquid chromatography coupled with tandem mass spectrometry (LC–MS/MS), which has advantages such as high selectivity and sensitivity but also disadvantages such as expensive equipment, the need for experienced operators, and more challenging method development [37, 38]. In the development of electrochemistry-supported instruments, the general settings for resolution, the path of the composite electrochemical response examination, and the optimal path of analysis of the multidimensional data are complicated [39].

Post-analytical phase

In this phase, the key challenge is the collection and interpretation of data from analytical techniques, particularly in clinical research, proteomics, and metabolomics. Certain sophisticated computations also raise problems in data analysis, and manual calculations can produce inaccurate findings. In pre- and post-data analyses of chromatographic methods, the common troubles to be addressed are unwanted background signals, baseline drift, unresolved peaks, shifting retention times, data comparison errors, and improper retention time alignment [40]. In spectroscopic analyses, specific mathematical transformations, often created for a particular experimental approach, are typically used to correct systematic undesirable signal changes. Baseline shifts (offsets), horizontal shifts, drifts (slope changes), and global intensity effects are some of these systematic signal variations. The significant alteration of signal profiles produced by the derivative transform can mislead the interpretation of final results [41]. Overall, the scheme of application of computational methods is shown in Fig. 1.

Fig. 1

Scheme of application of computational methods in analytical process
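The effect of a derivative transform on the baseline offsets and drifts mentioned for the post-analytical phase can be illustrated with a minimal sketch. It assumes a synthetic, evenly spaced signal with made-up integer values; the function names are ours and do not come from any cited tool.

```python
def first_derivative(signal):
    """Finite-difference first derivative (unit spacing between points)."""
    return [b - a for a, b in zip(signal, signal[1:])]


def second_derivative(signal):
    """Second derivative, obtained by differencing twice."""
    return first_derivative(first_derivative(signal))


# Synthetic band (illustrative values only, not measured data).
band = [0, 1, 4, 9, 4, 1, 0]

# The same band with a constant offset, and with offset plus linear drift.
offset = [y + 5 for y in band]
drift = [y + 5 + 2 * i for i, y in enumerate(band)]

d1_band = first_derivative(band)
d1_offset = first_derivative(offset)   # equals d1_band: the offset is removed
d2_band = second_derivative(band)
d2_drift = second_derivative(drift)    # equals d2_band: the drift is removed
```

The first derivative cancels the constant offset and the second derivative cancels the linear drift, but the single positive band also becomes a bipolar profile, which is the kind of altered signal shape that can mislead interpretation.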

Prediction of spectrograms

Prediction of C13-NMR and H1-NMR

NMR is a significant tool for detecting carbon and hydrogen atoms in organic compounds. In the pharmaceutical industry, C13-NMR and H1-NMR are used to assess drug purity, composition, and the chemical shifts of diverse organic molecules. NMR parameters are now calculated from chemical structures using computational methods. Several software tools (e.g. ChemDraw, Chemaxon, etc.) are now used to predict chemical shifts in H1-NMR and C13-NMR and provide net intensity, quality, and spectrograms.

Machine learning approach in NMR prediction

Machine learning (ML) approaches are more beneficial and, in most cases, faster than database-based prediction using, for example, HOSE codes. The database approach works by finding structural similarities and averaging the experimental data for chemical structures. The similarity between the new and known HOSE codes has little bearing on the accuracy of the prediction. The well-established structure determination approach formerly relied on quantum chemical calculation-based methods such as topical-based DFT calculation. This method is accurate for H1 and C13 chemical shift predictions, but considerably more time-consuming and expensive. Today, software tools have been designed to speed up the procedure. NMR signal characteristics can be visualized more accurately using a machine learning method called “automatic structure verification (ASV)”, which is based on variables that affect chemical shifts in laboratory studies, such as temperature, solvent, pH, salt content, and concentration. All of these parameters are considered so that the chemical shift can be predicted for an unknown structure. Certain other prediction algorithms take only some of them into account, and such systems still produce variable values. The ASV system is also capable of properly dealing with overlapping peaks. This is especially important when some of the compound’s relevant peaks are quite close to other signals, such as significant solvent peaks [42,43,44,45]. A few researchers have used this approach, including Jia et al. [46], who developed a method for extracting data from previously examined 13C and 1H NMR spectra in order to recognize the NMR spectrum. Min Lin and colleagues predicted chemical shifts based on cutting-edge machine learning [47].
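The database idea described above (match a structural environment, then average the stored experimental shifts) can be sketched as a simple lookup. This is only an illustration under stated assumptions: the keys are simplified stand-ins rather than real HOSE code syntax, the shift values are placeholders rather than measured data, and the fallback to a shorter code is one common design choice, not necessarily what any particular program does.

```python
from statistics import mean

# Hypothetical reference table: environment code -> experimental 13C shifts (ppm).
# Keys are simplified stand-ins for HOSE codes; spheres are separated by "/".
# The shift values are illustrative placeholders, not measured data.
REFERENCE_SHIFTS = {
    "C/OC": [60.0, 62.0],
    "C/OC/H": [61.0],
    "C/CC": [30.0, 32.0, 34.0],
}


def predict_shift(code, table=REFERENCE_SHIFTS):
    """Average the shifts stored for the most specific matching environment.

    If the full code is unknown, drop the outermost sphere and retry, so the
    prediction falls back to a less specific environment.
    Returns (predicted_shift, spheres_matched), or (None, 0) if nothing matches.
    """
    spheres = code.split("/")
    while spheres:
        key = "/".join(spheres)
        if key in table:
            return mean(table[key]), len(spheres)
        spheres.pop()
    return None, 0
```

Returning the number of matched spheres alongside the shift gives the user a rough indication of how specific the database match was.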

Software handling for NMR Signal prediction

The user can either draw the chemical structure of the test molecule in the software application or download the structure and paste it into the software. The predicted C13-NMR and H1-NMR spectra are available within 1–5 min of clicking the calculation button. After the prediction, the user can optionally alter the frequency range from 60.0 to 1000 MHz. Finally, a PDF document is generated that includes the substance’s chemical shift, peak intensity, peak quality, molecular location, and coupling constant values [48]. A typical H1-NMR signal for Zidovudine is shown in Fig. 2.

Fig. 2

Typical predicted H1-NMR signals for Zidovudine

Prediction of UV–Visible Spectra

The UV–Vis absorption spectrum of an organic substance is a key component of its physical makeup. Predicting UV–Vis spectra from molecular structural formulas is of considerable interest for designing new materials, finding potentially phototoxic chemicals, and estimating missing spectroscopic data for known molecules [49]. In a recent study, Chan et al. [50] used a TD-DFT computation approach for rapid ultraviolet–visible spectrum prediction. Urbina et al. [51] developed a method that uses neural network-based computation to predict UV–visible spectrograms.

Time-dependent density functional theory (TD-DFT)

For a TD-DFT calculation, the software must be able to analyse the energy of the chemical structure in its excited states and the probability of transition between the energy levels of the molecule. For example, the ORCA programme contains several methods for accurately determining excited state properties. The TD-DFT technique is the most effective of all the approaches. For precise results with this method, an optimized geometry file of the chemical structure is required. To optimize the structure, the user might use the “IQmol” software package or another tool. After that, the user can use Notepad++ to create the input file, with the functional and basis-set keywords “! B3LYP def2-TZVP”, the “RIJCOSX” keyword to speed up the process, the “%TDDFT” block to request the excited state calculation, the “NROOTS” keyword to set how many excited states are computed, and “MAXDIM” to set the maximum dimension of the expansion space. To simulate the analyte under the solvent conditions of the experiment, CPCM can serve as the solvation model for both the ground and excited states. On the coordinate line, the first number (“0”) denotes the charge, whereas the second number denotes the multiplicity. Finally, save this file in “inp” format (tddft.inp) in the same folder. The user may then go to the folder and enter the command “orca tddft.inp > tddft.out” followed by “Enter” to execute the computation on the command line (command prompt). Depending on the molecule involved, it might take some time (10 min–2 h). After the computation is completed, the programme creates an output file in the same folder that contains all of the data [52, 53].
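Instead of typing the input file by hand, the same text can be generated with a short script. The sketch below is a minimal example under stated assumptions: the function name, the file name optimized.xyz, the solvent, and the numeric values for the number of roots and the expansion-space dimension are placeholders chosen for illustration, and the keyword spellings should be checked against the manual of the installed ORCA version.

```python
def build_tddft_input(xyz_file, charge=0, multiplicity=1,
                      nroots=10, maxdim=5, solvent="water"):
    """Return the text of an ORCA TD-DFT input file as a string.

    All default values are illustrative placeholders, not recommendations.
    """
    lines = [
        # Functional, basis set, RIJCOSX speed-up, and CPCM solvation model.
        f"! B3LYP def2-TZVP RIJCOSX CPCM({solvent})",
        # Excited-state block: number of states and expansion-space dimension.
        "%tddft",
        f"  nroots {nroots}",
        f"  maxdim {maxdim}",
        "end",
        # Charge, multiplicity, and the optimized geometry file.
        f"* xyzfile {charge} {multiplicity} {xyz_file}",
    ]
    return "\n".join(lines) + "\n"


input_text = build_tddft_input("optimized.xyz")
```

Writing input_text to tddft.inp (for example with open("tddft.inp", "w")) gives a file that can then be run from the command prompt as described above.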

Visualization of UV–visible spectra

The UV–Visible spectrum of an unknown analyte can be obtained instantly using a graphical interface. The interface shows narrow spectral lines, so some line broadening is required to make the predicted spectrum match the experimental one. This is easily accomplished by selecting “Advanced >>” and then, on the “Infrared Spectra Settings” tab, adjusting the “Peak Width” to 10–30 cm−1 [54,55,56,57,58]. Figure 3 shows the generated spectrogram of the Zidovudine compound.

Fig. 3

Predicted CD and UV spectrum of zidovudine generated by Avogadro

IR/Raman predictions

Both infrared (IR) and Raman spectroscopy continue to be essential tools for chemical characterization and identification. Recently, McGill et al. [59] developed an IR spectrum prediction procedure using a neural network-based approach. IR and Raman spectra may also be predicted using the ORCA software, with “Avogadro” or “IQmol” used to prepare the molecule for the frequency calculation. The 3D structure of the analyte must first be analysed and optimized. The ORCA programme creates the output on its own. As for the UV–visible computations, the user must create a new folder containing the optimized geometry and the input file. “! B3LYP DEF2-SVP” specifies the functional and basis set, while “OPT FREQ” requests a geometry optimization followed by a frequency calculation. Finally, save the file in “inp” format in the same location, navigate to the folder, and execute “orca foscarnet.inp > foscarnet.out” followed by “Enter” to perform the computation. The output file is created in the same folder when the operation is finished [7, 54,55,56,57,58]. Figure 4 shows the predicted IR spectrum of foscarnet generated by Avogadro.

Fig. 4

Predicted IR spectrum of foscarnet generated by Avogadro

Plotting a spectrum

Using Avogadro as a graphical user interface, the IR spectrum can be generated rapidly. To view the spectrum in a new window, the user can open the saved output file and click “Show Spectra”. Although it displays narrow spectral lines, some line broadening is necessary to bring the predicted spectrum as close as possible to the observed one. This can be readily done by selecting “Advanced >>” and then changing the “Peak Width” to 30–130 cm−1 on the “Infrared Spectra Settings” tab [54,55,56].
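What the line broadening does to a computed stick spectrum can be sketched in a few lines. The example assumes Gaussian band shapes and treats the peak width as the full width at half maximum (FWHM); the line shape and width definition used by Avogadro itself may differ, and the stick positions in the usage note are placeholders.

```python
import math


def broaden(sticks, grid, fwhm):
    """Sum Gaussian bands centred on each computed line.

    sticks: list of (position_cm1, intensity) pairs
    grid:   wavenumbers (cm^-1) at which the broadened spectrum is evaluated
    fwhm:   full width at half maximum of every band, in cm^-1
    """
    # Convert FWHM to the Gaussian standard deviation.
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    spectrum = []
    for x in grid:
        y = sum(height * math.exp(-((x - pos) ** 2) / (2.0 * sigma ** 2))
                for pos, height in sticks)
        spectrum.append(y)
    return spectrum
```

For example, broaden([(1700.0, 1.0)], [1700.0, 1715.0], fwhm=30.0) evaluates a single band at its centre and at half of the peak width away from it, where the intensity has dropped to half of the maximum.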

Mass spectroscopy predictions

The molecular weight of an analyte in pharmaceutical studies is determined by mass spectrometry (MS). In electron ionization mass spectrometry (EI-MS), an electron beam positively ionizes and fragments the molecules [60]. The mass spectrum is the distribution of the frequency or intensity of each type of ion according to its mass-to-charge (m/z) ratio [61]. Prediction models calculate the chance of each bond breaking under ionization and the frequency of each ion fragment using quantum mechanics calculations [62] or machine learning [63]. A prediction can take a few minutes, depending on the molecule’s size. This is because these techniques must either use sophisticated computations to determine molecular orbital energies with high accuracy or stochastically simulate the fragmentation of the molecule. Jennifer N. Wei and colleagues studied a neural network termed neural electron ionization mass spectrometry (NEIMS), which predicts the electron ionization mass spectrum of a given small molecule. They found that the forward-only model fails to adequately capture the fragmentation events, whereas the bidirectional prediction mode does [64]. Because NEIMS directly predicts spectra rather than bond-breaking probabilities, it is significantly faster than previously reported methods.

Wang et al. utilized the recently developed quantum chemical programme QCEIMS (Quantum Chemical Electron Ionization Mass Spectrometry). QCEIMS can theoretically calculate the spectrum of any given chemical structure. However, to make quick predictions, approximations and parameter estimations are required, and these are important for the precision of QCEIMS predictions. QCEIMS calculates fragment ions from Born–Oppenheimer molecular dynamics (MD) trajectories over picosecond reaction times with femtosecond time steps. With this approach, they discovered that tweaking QCEIMS’s parameters was not a practical way to enhance simulation outcomes [65, 66]. One of the best tools for in silico mass-spectrum-to-compound identification is CFM-ID, which Wang et al. used to predict more accurate ESI–MS/MS spectra. They added a new method for modelling ring cleavage that treats the process as a series of straightforward chemical bond dissociations, and they expanded their handwritten rule-based predictor to cover more chemical classes of analytes [67]. They also listed parameters from molecular topological parameters.
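The description of a mass spectrum as a distribution of ion intensities over m/z, given at the start of this section, can be made concrete with a small sketch. It assumes a list of simulated fragment ions with made-up abundances, bins them to nominal (integer) m/z, and scales the most intense peak to 100, which is a common way of presenting EI spectra. The function name is ours and is not taken from NEIMS, QCEIMS, or CFM-ID.

```python
from collections import defaultdict


def to_stick_spectrum(fragments):
    """Bin (m/z, abundance) pairs to integer m/z and scale the base peak to 100.

    Raises ValueError if the fragment list is empty.
    """
    binned = defaultdict(float)
    for mz, abundance in fragments:
        binned[round(mz)] += abundance
    base = max(binned.values())
    return {mz: 100.0 * value / base for mz, value in sorted(binned.items())}


# Illustrative fragment list (placeholder values, not a real spectrum).
fragments = [(43.02, 50), (43.05, 150), (58.04, 100), (15.02, 20)]
spectrum = to_stick_spectrum(fragments)
```

Two spectra expressed in this form, for example a predicted and a reference spectrum, share the same m/z axis and can be compared peak by peak.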

Fluorescence spectroscopy predictions

Fluorescence spectroscopy measures the fluorescence of a target analyte upon excitation by a laser beam (often in the UV region) [68]. The prediction of an analyte’s fluorescence features, including the type of fluorescence and the emission and excitation wavelengths [69], can also be used to examine solvent effects. It has been used to predict the spectra of a variety of fluorescent compounds [70]. The majority of the compounds with predicted spectra have molecular masses of 228 or below. For compounds of larger molecular weight, the DFT technique can be used to calculate emission spectra with solvent effects.

The characterization of electronic excited states depends on accurate simulation of the molecular absorption and/or emission spectrum and on precise techniques such as equation-of-motion coupled cluster singles and doubles (EOM-CCSD) [71, 72]. To improve the quality of predicted emission spectra, Caricato et al. [73] combined EOM-CCSD with the polarizable continuum model (PCM) and reported that the predicted vertical emission energies are in good accord with the available experimental data. Later, Powell et al. [74] used DFT to demonstrate that predicted spectra can be used to generate libraries of fluorescence spectra in a digital format. Ye et al. concluded that the Lasso-RF (Random Forest descriptor) model satisfied the statistical requirements for the numerically predicted wavelength. The model found that four conjugated bonding-related characteristics contribute primarily to the predicted emission wavelength [75]. Furthermore, Shams-Nateri et al. [76] investigated the link between absorption and emission spectra using the PCA chemometric approach, and they found that the accuracy of emission spectra prediction improved with the addition of more principal components.

Electrochemistry predictions

Because of the growing interest in electrochemistry for potential drug core structures and for the development of organic photovoltaic materials, the field has recently experienced a huge comeback, and computation has provided valuable prediction, filtering, and active learning. This includes promising optimization of the electrochemical properties of analytes, investigation of intrinsic electron deficiency, and rendering of the connection between electronic characteristics and substituent effects [77]. Predicting the electrochemistry of compounds with quantum mechanical calculations provides a quick and accurate method for research. For instance, DFT is regarded as the “workhorse” of recent theoretical investigations in electrochemistry and physics [78].

Electrochemical systems are commonly studied using electrochemical impedance spectroscopy (EIS), although the significance of this characterization method is still constrained by several issues. EIS is extensively employed in the development of sensors [79, 80], in health care [81], drug release [82], testing, and biology [83] because it makes it possible to characterize such systems and helps in identifying crucial variables like conductivities [84], resistances [85], and capacitances [86]. The computational Gaussian processes (GPs) used with this method face significant challenges, including noise, impedance spectrum regression, polarization resistance, and probed frequencies that are not always ideal. A finite or infinite collection of random variables is referred to as a GP if the joint distribution of any finite subset is multivariate Gaussian. GPs can then regress and predict an observed unknown function using a prior distribution and a set of assumptions about its characteristics [87]. GPs can quantify regression and prediction uncertainty, and they have so far been used to filter data, predict parameters in diverse situations [88], and enhance experiments in the active learning domain. Liu and Ciucci [89, 90] used GPs to deconvolve the distribution of relaxation times (DRT), a novel approach for EIS analysis. Then, using a finite GP approximation, Maradesa et al. extended this framework to constrain the DRT to be non-negative. Additionally, Py et al. [91, 92] created and validated the method that Ciucci used to assess the quality of EIS spectra using GPs that complied with the Hilbert transform.
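The way a GP regresses an unknown function and reports its own uncertainty can be shown with a generic sketch. This is textbook GP regression on a scalar input with a squared-exponential kernel and fixed, assumed hyperparameters; it is not the GP-based DRT method of the cited works, and all function names are ours.

```python
import math


def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6, length=1.0):
    """Posterior mean and variance of a zero-mean GP at a single point x_new."""
    n = len(x_train)
    # Covariance of the training points, with noise variance on the diagonal.
    K = [[rbf(x_train[i], x_train[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new, length) for x in x_train]
    alpha = solve(K, y_train)   # K^-1 y
    beta = solve(K, k_star)     # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var
```

Called as gp_predict([0.0, 1.0, 2.0], [0.0, 1.0, 0.0], 1.0), the sketch returns a mean close to the observed value with a variance near zero, whereas far from the data the mean reverts to the prior (zero) and the variance returns to the prior variance. This is the measurable prediction uncertainty referred to above.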

Kiss et al. [93] predicted substituent effects on the electrochemical properties of the analyte and examined the influence of substituents on the character of the electronic transition and on the transition density matrices (TDMs). This procedure makes it possible to access the distribution of electrons and holes in the excited state and determine their delocalization, which in turn reveals electronic excitation processes such as charge transfer [94]. The imbalance in the TDMs is caused by electron-donating and electron-withdrawing groups interacting with the hole. The location of the hole is altered when an electron-donating moiety uses mesomeric effects to donate electron density to the hole. In this instance, the TDM can be described as mesomerically affected rather than merely inductively affected. On the other hand, the inductively dominated TDM lacks any localization because no major TDM elements are present on the analyte. The polarity difference has a significant impact on the mesomeric contribution to the TDM. This made it easier to identify the effects of charge transfer and substitution.

The next field of research addressed exciton binding energies, which reflect the Coulomb attraction between the exciton quasiparticles (electron and hole). The binding energy is a measure of how readily the exciton separates into free charges, and it has a direct impact on how an effective current is produced in optoelectronics [95]. More details of the effects on the electronic structure are revealed by analysing the HOMO and LUMO energies (EHOMO and ELUMO) [96]. To optimize the electrochemical characteristics of an analyte, Min et al. [97] developed and verified a machine learning (ML) approach for electrochemistry. Both output variables (such as initial capacity and cycle life) and a few input variables (synthesis parameters, ICP-MS data, and X-ray diffraction (XRD) results) were used to build several experimental datasets for the analyte [98]. When distributing these variables across the entire dataset while building the ML model, a number of primary variables were chosen to serve as suggestions for the optimal experimental parameters.

Prediction of chromatographic retention behaviour

Quantitative structure retention relationship (QSRR)

QSRR is a computational approach for linking chemical structural variables to retention behaviour on a chromatographic column. Here, Y-variables are the dependent variables used for predictive or explanatory purposes, whereas X-variables are the independent variables. In QSRR, the Y-variables are connected to solute chromatographic retention, whereas the X-variables encode solute molecular structure. QSRR was first used to characterize columns by quantitatively comparing their separation qualities or to supply knowledge for predicting retention mechanisms in various chromatographic settings [22]. A typical QSRR study includes building a retention database of compounds with known chemical structures, computing molecular descriptors for each structure, selecting descriptors, building a QSRR model, and validating it. Figure 5 illustrates the QSRR methodology and workflow.

Fig. 5

Scheme of QSRR methodology in chromatography
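The workflow described above (retention database, descriptors, model, validation) can be reduced to a minimal sketch with a single descriptor. Everything here is an assumption made for illustration: the data are synthetic and generated from a known straight line, the model is ordinary least squares with one X-variable, and validation is leave-one-out. Published QSRR models typically use many descriptors and more elaborate regression methods.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_errors(xs, ys):
    """Refit the model without each compound and predict the held-out one."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(ys[i] - (slope * xs[i] + intercept))
    return errors


# Synthetic training set: one descriptor value (X) and retention time in min (Y).
# The retention times were generated as tR = 2 * x + 1; they are not measurements.
descriptor = [0.5, 1.0, 1.5, 2.0, 3.0]
retention = [2.0, 3.0, 4.0, 5.0, 7.0]

slope, intercept = fit_line(descriptor, retention)
predicted_tr = slope * 2.5 + intercept          # prediction for a new solute
loo = leave_one_out_errors(descriptor, retention)
```

Because the synthetic data lie exactly on a line, the leave-one-out errors are essentially zero; with real retention data their spread gives a first estimate of the prediction error.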

The most popular ways of expressing chemical structures are 1D, 2D, and 3D molecular descriptors. 1D descriptors provide simple chemical information about a solute, such as the molecular weight or the number of oxygen atoms in the structure, whereas 2D descriptors are computed from a representation of the solute’s chemical structure as a connection table or molecular graph. A 3D molecular descriptor describes both the general surfaces and/or volumes of molecules and the 3D arrangement of structural attributes [23].
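The two 1D descriptors named above, molecular weight and the number of oxygen atoms, can be computed directly from a molecular formula. The sketch below is a minimal example that assumes a simple formula without brackets and uses a small table of rounded standard atomic weights covering only a few elements; the function names are ours.

```python
import re

# Standard atomic weights (g/mol), rounded; only a few elements are included.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
                 "P": 30.974, "S": 32.06}


def parse_formula(formula):
    """Return {element: count} for a simple formula without brackets."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def descriptors_1d(formula):
    """Compute two simple 1D descriptors from the molecular formula."""
    counts = parse_formula(formula)
    weight = sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items())
    return {"molecular_weight": weight, "n_oxygen": counts.get("O", 0)}


# Zidovudine, molecular formula C10H13N5O4.
zidovudine = descriptors_1d("C10H13N5O4")
```

For zidovudine this gives a molecular weight of about 267.2 g/mol and four oxygen atoms. 2D and 3D descriptors need the connectivity or the geometry of the molecule and cannot be obtained from the formula alone.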

Representing the molecular structure is one of the key concerns in QSRR modelling. Molecular descriptors that describe chemical structures are typically categorized as physicochemical, quantum chemical, topological, etc. [99]. Physicochemical descriptors correlate positively with solute retention on chromatographic columns. Quantum chemical descriptors shed light on the process of chromatographic retention at the molecular level, although the link to solute retention is frequently poor and the calculation is laborious. With today’s computational technologies, topological descriptors are easily constructed, but they are unrelated to retention phenomena [24]. There are two methods in the QSRR approach, viz., the direct mapping method and the direct comparison method.

Prediction of retention time by the direct mapping method

It is a simple method for predicting compound retention time on a chromatographic column. It is a web-based solution that allows users to predict retention by submitting their data and receiving expected retention values. Predict is one available database, and the procedure has four steps as follows.

First, the user creates a CSV file that includes the compound name, the compound identifier (PubChem CID or InChI), the measured retention time, and stereo-chemical parameters. The web interface is intended to let the user upload retention data and get new retention predictions easily. Second, on the website, the user is asked to create a new chromatographic system. Each system is described by six fields: (1) a name, (2) a column type (for example, RP or HILIC), (3) a column description (for example, Waters Symmetry C18 columns), (4) the eluent system (for instance, 95:5 methanol/water), (5) the eluent’s pH (for example, acidic or alkaline), and (6) eluent additives (for example, 0.1 per cent trifluoroacetic acid). Third, the user submits the CSV file containing retention times for chemicals obtained from their own studies or from Google Scholar. Finally, the user obtains the estimated retention time by clicking “get a prediction” [58].
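Preparing the upload file and the system description can be scripted. The sketch below is only an illustration: the column headings, the system name, and the records are hypothetical placeholders, and the format actually required by the chosen web service must be taken from its own instructions. The system fields mirror the six items listed above.

```python
import csv
import io

# Hypothetical column layout; check the chosen service for its required format.
FIELDS = ["compound_name", "identifier", "retention_time_min"]

# Chromatographic system description, following the six fields in the text.
# The name is a placeholder; the other values repeat the examples given above.
system = {
    "name": "my_rp_system",
    "column_type": "RP",
    "column_description": "Waters Symmetry C18",
    "eluent": "95:5 methanol/water",
    "eluent_ph": "acidic",
    "eluent_additives": "0.1% trifluoroacetic acid",
}


def build_retention_csv(rows):
    """Serialise retention records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder records: replace with real identifiers and measured retention times.
records = [
    {"compound_name": "compound_A", "identifier": "ID_PLACEHOLDER_1",
     "retention_time_min": 3.42},
    {"compound_name": "compound_B", "identifier": "ID_PLACEHOLDER_2",
     "retention_time_min": 7.85},
]
csv_text = build_retention_csv(records)
```

Saving csv_text to a file gives a table with one compound per row that can be inspected before it is uploaded.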

Prediction of retention time by direct comparison method

QSRR Automator, a Python-based software, can be used to predict retention using the direct comparison method. Mordred, a software package built on the RDKit package, can be used to calculate molecular descriptors. Machine learning operations may be performed with the scikit-learn package. The QSRR Automator workflow is as follows. The user creates the training data, which contain the name of each chemical, its structure as a simplified molecular input line entry system (SMILES) text string, and its retention time. The programme creates a template and simplifies the input file on its own. After that, the user submits the training data (chemical descriptions, SMILES, compound name, and actual retention time). Structural and electronic descriptors should be used. Structural properties include functional groups, hybridizations, the number of carbon atoms, and the ring system. Electronic properties include aromaticity and numerous electronegativity calculations. All of these calculations are simple; unlike more complex fingerprint feature combinations, they can all be done using the Mordred software package, which calculates over 1500 features [100]. Recent data on QSRR-based methods are listed in Table 1.

Table 1 Data on the analytical methods reported based on QSRR

Chemometrics in chromatography

The chemometric approach is widely used in separation science to predict peak asymmetry and peak overlap and to optimize peak separation. Co-elution of multiple analytes in chromatography significantly complicates quantification of the target analyte because of interference caused by inadequate method optimization. Chemometric methods such as principal component analysis (PCA) are therefore widely used in separation science and have now been extended to LC-HRMS analysis for proteomics and metabolomics. In addition, artificial neural networks (ANN), factorial design (FD), partial least squares (PLS), and cluster analysis (CA) are also in use [113, 114].

Chemometrics in one- and two-dimensional chromatography

In two-dimensional (2D) chromatography, the entire first-dimension effluent is divided into many fractions, each of which is subjected to a second-dimension separation. Comprehensive two-dimensional liquid chromatography (LC × LC) combines the results of these liquid chromatography separations. The positions of the spots provide qualitative information, while the intensities of the spots provide quantitative information. However, extracting information from extremely complex samples such as protein digests, metabolic extracts, and oil mixtures can be problematic. Even with modern high-resolution chromatography, extracting the entire information of a complex matrix remains a challenging task. Many researchers are constantly working to improve the efficiency of chemometric data processing strategies.

In chromatography, chemometrics is a valuable tool for pre- and post-acquisition data analysis to resolve undesired background signals, baseline drift, unresolved peaks, and shifting retention times. Chemometric-based data interpretation, information extraction, and data pre-processing can significantly increase the analytical performance of an existing technique. The chemometric approaches used in chromatography include penalized least squares approaches, multivariate curve resolution and orthogonal subspace projection for background correction, the local minimum value approach, baseline estimation and denoising using sparsity, retention-time-alignment strategies, peak clustering, and principal component analysis (PCA). These methods make chemometrics one of the fastest-progressing in silico approaches in 1D and 2D chromatography and spectroscopy [115].

Chemometrics in unsupervised and supervised techniques

To understand the dissimilarity or variance in the data matrix, PCA, independent component analysis (ICA), and cluster analysis (CA) are used. The loading vectors obtained from the “calibration set” can then be used to project unknown data. If the data do not cluster against any objective criterion, supervised procedures such as multivariate calibration methods are applied. A regression model may be built on the scores of a selected number of principal components; this approach is referred to as principal component regression (PCR). Because the components are chosen by PCA, PCR is based mainly on the variance of the data matrix. The partial least squares (PLS) method, also known as projection to latent structures, is a commonly used linear supervised method. It finds the direction in the data matrix that maximizes the covariance between the matrix and the predicted variable and then creates a regression model [116].

Software tools in chemometrics and their workflow

Chemometric software (for example, BWIQ) is available for on- and off-line quantitative and qualitative spectral measurements to identify principal components. The software assigns a sample to the group with the shortest calculated Mahalanobis distance (a measure of the distance between a point P and a distribution D). The workflow is described below.

After the software is started, clicking “file”, opening the data, and importing it into the software displays the complete spectrum on the screen. Spectral files in BWIQ can be designated in several ways: as calibration, validation, or ignored files. The designation is set manually with the drop-down button in the “usage” column. The algorithm parameters are chosen in the algorithm properties tab. In the property panel, the sampling method can be selected and the ratio of calibration files to validation files adjusted, for example, 60:40 (calibration:validation). Next, the pre-processing steps remove variation in the data sets that is unrelated to chemical differences and arises instead from scattering, instrumental fluctuations, spectral noise, or background differences. A model that analyses the full spectrum is more sensitive to contaminants or changes in the samples that add signals in other spectral regions; however, excluding non-informative or noisy regions from the analysis is also an advantage. A chemometric method such as PCA-Mahalanobis distance (MD) can then be applied. The scores plot illustrates the sample clusters in principal component space, and the result shows clusters matching the different classes of samples. Additional graphs, such as loading and variance plots, are also available [117].
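The classification rule behind the PCA-MD step can be sketched as follows. This is not BWIQ code: the two groups and their (PC1, PC2) score pairs are hypothetical, and the sketch is restricted to two principal components so that the covariance matrix can be inverted by hand.

```python
import math

def mean_and_cov(points):
    """Mean vector and 2x2 sample covariance of (pc1, pc2) score pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(point, mean, cov):
    """Distance between a point and a distribution given by mean and cov."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form with the inverse of the 2x2 covariance matrix
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

# Hypothetical calibration scores for two sample groups
groups = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
unknown = (1.0, 0.5)

distances = {}
for name, scores in groups.items():
    mean, cov = mean_and_cov(scores)
    distances[name] = mahalanobis(unknown, mean, cov)

# Assign the unknown to the group with the shortest distance
label = min(distances, key=distances.get)
```

Unlike the Euclidean distance, the Mahalanobis distance scales each direction by the spread of the group, so an elongated cluster tolerates larger deviations along its long axis.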

Different types of chemometrics approaches

Penalized least squares (PLS) approach

This method was initially developed by Whittaker in 1922 to address signal smoothing issues [118]. The goal of penalized least squares (PLS in this subsection) is to approximate the observed data with a fitted vector that balances two competing demands: fidelity to the original data and smoothness (low roughness) of the fit [119]. Fidelity and roughness are combined in a balanced way in Eq. (1):

$$Q = F + \lambda R = \mathop \sum \limits_{i = 1}^{n} \left( {v_{i} - z_{i} } \right)^{2} + \lambda \mathop \sum \limits_{i = 2}^{n} \left( {z_{i} - z_{i - 1} } \right)^{2} = \left\| {v - z} \right\|^{2} + \lambda \left\| {Dz} \right\|^{2}$$
(1)

where z is the fitted vector and v is a vector representing the analyte spectrum, both of length n, and D is the first-order difference matrix, so that Dz contains the differences between neighbouring elements of z. F, the fidelity to the analyte spectrum v, is the sum of squared differences between the elements of v and z; R, the roughness of the fitted vector z, is the sum of squared differences between each element of z and its neighbour. The fitted z should balance fidelity to v against roughness. The user-adjustable parameter λ sets this balance: a greater λ favours a smoother fitted vector.

Setting the partial derivatives of Q to zero \(\left( {{{\partial Q} \mathord{\left/ {\vphantom {{\partial Q} {\partial z}}} \right. \kern-0pt} {\partial z}} = 0} \right)\) solves the minimization problem of Eq. (1). The fit is then obtained from the matrix form of the resulting linear system (Eq. 2), where I is the identity matrix. To use PLS for background estimation, a weight vector w is added to the fidelity term; its element \(w_{i}\) can be thought of as a weight that represents the reliability of point i as part of the background, and in the weighted system of Eq. (3) w denotes the diagonal matrix of these weights.

$$\left( {I + \lambda D^{\prime}D} \right)z = v$$
(2)
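A minimal sketch of the unweighted smoother may help here. With first-order differences, setting ∂Q/∂z = 0 in Eq. (1) leads to the tridiagonal system (I + λD′D)z = v, which is solved below with the Thomas algorithm. The function names and the five-point signal are illustrative assumptions, not taken from the cited works.

```python
def whittaker_smooth(v, lam):
    """Solve (I + lam * D'D) z = v for first-order differences D."""
    n = len(v)
    # Tridiagonal coefficients of I + lam * D'D
    diag = [1.0 + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = -lam  # constant sub- and super-diagonal
    # Thomas algorithm: forward elimination, then back substitution
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = v[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (v[i] - off * d[i - 1]) / denom
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z

def objective(v, z, lam):
    """Q = F + lam * R from Eq. (1)."""
    fidelity = sum((vi - zi) ** 2 for vi, zi in zip(v, z))
    roughness = sum((z[i] - z[i - 1]) ** 2 for i in range(1, len(z)))
    return fidelity + lam * roughness

v = [1.0, 3.0, 2.0, 5.0, 4.0]   # illustrative signal
lam = 10.0
z = whittaker_smooth(v, lam)
```

With λ = 0 the fit reproduces v exactly; increasing λ trades fidelity for smoothness, and the fitted vector always attains a value of Q no larger than that of the raw signal.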

To use this PLS approach for baseline correction, as done by Zhang et al. and Cobas, one must first identify the locations of the chromatogram’s peaks. Once these peak points are known, a binary mask or weight matrix can be generated that marks whether each data point in the chromatogram belongs to the background or to a peak, and the fit is obtained from the weighted system of Eq. (3) [120, 121].

$$\left( {w + \lambda D^{\prime}D} \right)z = wv$$
(3)

Additionally, Eilers et al. [122] created asymmetric least squares (AsLS), which introduces an asymmetry parameter in an effort to remove the need to identify peak locations beforehand. The weights assigned to positive and negative deviations from the baseline can now be smaller and larger, respectively. However, AsLS still has limitations in handling the baseline, which motivated the introduction of adaptive iteratively reweighted penalized least squares (airPLS) [123], which allows some baseline regions to be penalized more than others. airPLS develops its weight vector by iteratively solving a weighted penalized least squares problem.

Once the difference between the signal and the fitted vector, \(\left| {{\text{d}}^{t} } \right|\), is less than one thousandth of the original signal, an accurate weight vector is assumed to have been established. The iteration therefore stops when the following termination criterion is satisfied.

$$\left| {{\text{d}}^{t} } \right| < 0.001 \times \left| v \right|$$
(4)
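The iterative reweighting can be sketched as follows. This is a simplified illustration in the spirit of airPLS, not the published implementation: it assumes NumPy, uses first-order differences and dense matrices, takes the negative residuals as the quantity tested in Eq. (4), and the synthetic signal, the value of λ, and the handling of the end points are all assumptions made for the example.

```python
import numpy as np

def airpls_baseline(v, lam=100.0, max_iter=15):
    """Simplified airPLS-style baseline estimate (first-order differences)."""
    v = np.asarray(v, dtype=float)
    n = v.size
    D = np.diff(np.eye(n), axis=0)            # (n-1) x n difference matrix
    penalty = lam * D.T @ D
    w = np.ones(n)
    z = v.copy()
    for t in range(1, max_iter + 1):
        # Weighted penalized least squares, Eq. (3)
        z = np.linalg.solve(np.diag(w) + penalty, w * v)
        d = v - z
        neg = d < 0                           # points lying below the fit
        d_t = np.abs(d[neg].sum())
        if d_t < 0.001 * np.abs(v).sum():     # termination criterion, Eq. (4)
            break
        w[~neg] = 0.0                         # ignore points above the fit (peaks)
        w[neg] = np.exp(t * np.abs(d[neg]) / d_t)
        w[0] = w[-1] = 1.0                    # simplification: keep the end points in the fit
    return z

# Synthetic chromatogram: sloping baseline plus one Gaussian peak
x = np.arange(50)
v = 2.0 + 0.05 * x + 10.0 * np.exp(-0.5 * ((x - 25) / 2.0) ** 2)
z = airpls_baseline(v, lam=100.0)
corrected = v - z
```

Points above the current fit are treated as peak and receive zero weight, so each pass pulls the estimate toward the lower envelope of the signal.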

In some situations, both approaches overestimate the baseline when a matrix is present. Baek et al. created the asymmetrically reweighted penalized least squares (arPLS) method as a solution [124]. MairPLS is another technique built on similar concepts. Compared with the prior techniques, the collaborative PLS of Long Chen et al. [125] gave better background correction of Raman spectra.

Multivariate curve resolution-alternating least squares (MCR-ALS)

MCR-ALS uses a bilinear decomposition to estimate, from mixed experimental data, the chemically meaningful profiles of the relevant chemical species [126]. Strategies for determining the optimal number of components usually require building several MCR-ALS models and examining the quality of fit and the interpretability of the resolved chemical information [127]. The data sets include complex, heterogeneous samples of unknown composition, from which MCR-ALS resolves spatially resolved chemical images and the associated spectra of the individual, pure chemical components. Specifically, MCR-ALS decomposes an experimental data matrix (DM) as follows [128]:

$${\text{DM}} = CS^{T} + E$$
(5)

where in Eq. (5), \(S^{T}\) is the resolved spectrum matrix, E is the residual error matrix, and C is the concentration profile matrix. Three-dimensional experimental data produced by spectroscopic techniques contain spectral (λ or v) and spatial (x and y) information. Before MCR-ALS, the three-dimensional data are unfolded into the 2D experimental data matrix, DM, which combines the spatial information (x and y together) with the spectral information (λ or v). This approach is applied for baseline correction and for quantitative purposes, and also for correcting local minima of the least-squares error obtained by other methods such as singular value decomposition (SVD) or PCA [129].
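The unfolding step and the bilinear model of Eq. (5) can be sketched with NumPy. The cube dimensions and the random profiles are hypothetical, and the loop is a bare alternating least-squares iteration without the constraints (for example, non-negativity) that practical MCR-ALS applies, so it illustrates the structure of the method rather than a usable resolution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hyperspectral cube: 4 x 5 pixels, 30 spectral channels,
# built from two pure components so that the data are exactly bilinear.
n_x, n_y, n_channels, n_comp = 4, 5, 30, 2
C_true = rng.random((n_x * n_y, n_comp))      # concentration profiles
S_true = rng.random((n_channels, n_comp))     # pure spectra (as columns)
cube = (C_true @ S_true.T).reshape(n_x, n_y, n_channels)

# Unfold the spatial modes (x and y together): DM is pixels x channels
DM = cube.reshape(n_x * n_y, n_channels)

# Bare alternating least squares from a random initial guess of S
S = rng.random((n_channels, n_comp))
for _ in range(20):
    # Given S, solve DM ~ C S^T for C
    C = np.linalg.lstsq(S, DM.T, rcond=None)[0].T
    # Given C, solve DM ~ C S^T for S
    S = np.linalg.lstsq(C, DM, rcond=None)[0].T
    # Practical MCR-ALS would apply constraints to C and S here

E = DM - C @ S.T                              # residual matrix of Eq. (5)
relative_residual = np.linalg.norm(E) / np.linalg.norm(DM)
```

Without constraints the recovered C and S reproduce DM but are only defined up to a rotation, which is why chemically motivated constraints matter in real applications.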

Principal component analysis (PCA)

Principal component analysis is a popular unsupervised learning technique for reducing the dimensionality of data. PCA was invented in 1901 by Pearson [130]. In chromatography, PCA is frequently used to examine the results for complicated samples [131], where uncorrelated variables (the principal components) are linearly fitted across the data set. The first component represents the largest variation in the data, the second component describes the second-largest variance, and so on. This chemometric tool is particularly helpful for interpreting high-dimensional data.

The PCA method may be used for interference factor removal, interference factor extraction, and data compression. The following equation illustrates the outcomes of using the singular value decomposition (SVD) method to carry out PCA analysis and get orthogonal principal components (PCs) [132].

$$D = U\Sigma V^{T}$$
(6)

where in Eq. (6), the three matrices \(U\), \(\Sigma\), and \(V^{T}\) denote scores, singular values, and loadings with sizes of \(m \times m\), \(m \times n\), and \(n \times n\), respectively. D stands for the raw data with a size of \(m \times n\) for decomposition [133].
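Eq. (6) can be checked numerically with a short NumPy sketch. The 6 × 4 data matrix is random placeholder data, and column-centring before the decomposition is an assumption that is conventional for PCA but is not stated in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
raw = rng.normal(size=(6, 4))             # hypothetical 6 samples x 4 variables
D = raw - raw.mean(axis=0)                # column-centre before PCA

# Full SVD: U is m x m, Vt is n x n, s holds the singular values
U, s, Vt = np.linalg.svd(D, full_matrices=True)
m, n = D.shape
Sigma = np.zeros((m, n))
Sigma[: len(s), : len(s)] = np.diag(s)    # m x n matrix of singular values

scores = U @ Sigma                        # sample coordinates on the PCs
loadings = Vt.T                           # one column per principal component
explained = s ** 2 / np.sum(s ** 2)       # fraction of variance per PC
```

Because the singular values are returned in descending order, the first principal component always carries the largest share of the variance.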

In chromatography, Soares et al. [134] applied PCA in combination with correlation optimized warping (COW); an interesting use is the comparison of columns. Before PCA is performed, the chromatograms are aligned with the COW technique to increase the probability (p) values. Whether chromatograms differ significantly can be determined by computing the Mahalanobis distances and converting them to p-values. Although this method decreases noise and raises the signal-to-noise ratio (S/N), there is a possibility that several components become convoluted and that chemical information is lost. According to a report, the ideal bin size depends on the sample [135]. This method may be used to classify samples in complicated or multidimensional data sets.

Parallel factor analysis (PARAFAC)

Like PCA, PARAFAC is a factor analysis method that reduces the dimensionality of the data. Whereas PCA is essentially a dimension reduction approach for two-way data, PARAFAC treats the data as trilinear, with three modes, namely spectra, chromatograms, and concentrations [136]. As a result, it discovers not only a subspace but also the vector orientations [137]. PARAFAC2, described by Khakimov et al. [138], can additionally handle slight changes in retention time. The PARAFAC decomposition of a three-way array X of dimensions I, J, and K is given by:

$$X_{ijk} = \mathop \sum \limits_{f = 1}^{F} a_{if} b_{jf} c_{kf} + e_{ijk}$$
(7)

In Eq. (7), F stands for the number of factors, while \(X_{ijk}\), \(a_{if}\), \(b_{jf}\), \(c_{kf}\), and \(e_{ijk}\) are, respectively, elements of X, A, B, C, and E. The loading matrices A, B, and C have dimensions of \(I \times F\), \(J \times F\), and \(K \times F\), respectively. The residual three-way array, of dimensions \(I \times J \times K\), is denoted E [139].
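The trilinear model of Eq. (7) can be written in a few lines of NumPy. The dimensions and the random loading matrices are placeholders chosen only to show how the three modes combine; fitting A, B, and C to measured data is a separate step that is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)
n_i, n_j, n_k, n_f = 3, 4, 5, 2          # I, J, K and number of factors F

A = rng.random((n_i, n_f))               # loadings of the first mode
B = rng.random((n_j, n_f))               # loadings of the second mode
C = rng.random((n_k, n_f))               # loadings of the third mode
E = np.zeros((n_i, n_j, n_k))            # residual array (zero for this toy case)

# X_ijk = sum_f a_if * b_jf * c_kf + e_ijk
X = np.einsum("if,jf,kf->ijk", A, B, C) + E

# The same element computed directly from the sum in Eq. (7)
x_123 = sum(A[1, f] * B[2, f] * C[3, f] for f in range(n_f))
```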

The PARAFAC model is unique in that it establishes not only the subspace but also the position of the axes defining it. Additionally, the PARAFAC model offers the second-order advantage of allowing chemical components to be analysed even in the presence of unidentified interferences [140]. Tatjana et al. and Na Peng et al. [141] both applied PARAFAC to fluorescence analysis and found that the fluorescence model could quantify fluorophores in analytes, assess their quality, and classify the various types of fluorophores. Another study recommended combining PARAFAC with fluorescence regional integration to better characterize analytes and understand their functionality [142].

Partial least squares (PLS)-based methods

Partial least squares discriminant analysis (PLS-DA), also known as discriminant partial least squares (D-PLS), is a classification method based on partial least squares. The technique was first developed by Barker and Rayens [143]. Dimension reduction and the construction of a predictive model are the two major components of PLS-DA modelling. It gives a linear decision boundary using partial least squares (PLS) regression, with binary class membership indices (e.g. 0 and 1) for each class as the response variables. The PLS-2 algorithm, which enables the prediction of a matrix of response variables, is used when more than two classes are involved.

In the ordinary variant of PLS-DA, the components must be orthogonal to one another. It can be formulated in terms of the eigenvectors with non-zero eigenvalues of the covariance matrix C [144].

$$C = \frac{1}{{\left( {n - 1} \right)^{2} }}X^{T} C_{n} yy^{T} C_{n} X$$
(8)

where in Eq. (8), \(y\) is the class label vector, \(C_{n}\) is the \(n \times n\) centring matrix, and \(X\) is the data matrix. The loading vectors \(a_{1}, \ldots, a_{d}\), which denote the relevance of each feature in that component, are computed iteratively. The objective for iteration h is as follows:

$$\max {\text{cov}} \left( {X_{h} a_{h} ,y_{h} b_{h} } \right)$$
(9)

where \(X_{1} = X\), \(y_{h}\) and \(X_{h}\) are the residual (error) matrices following transformation with the prior h − 1 components, and \(b_{h}\) is the loading for the label vector \(y_{h}\).
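For a single class-label vector, the first loading vector can be computed directly, as the sketch below shows. The matrix C of Eq. (8) is then of rank one, so its only eigenvector with a non-zero eigenvalue is proportional to \(X^{T} C_{n} y\), which is also the direction that maximizes the covariance in Eq. (9). The four-sample data set is invented for illustration: the first feature separates the classes and the second does not.

```python
import math

# Hypothetical data: 4 samples x 2 features, two classes
X = [[1.0, 5.0], [1.2, 4.0], [3.0, 5.0], [3.2, 4.0]]
y = [0.0, 0.0, 1.0, 1.0]

n = len(y)
y_mean = sum(y) / n
y_centred = [yi - y_mean for yi in y]          # C_n y

# u = X^T C_n y: covariance-like score of each feature with the class label
n_features = len(X[0])
u = [sum(X[i][j] * y_centred[i] for i in range(n)) for j in range(n_features)]

# Normalize to unit length to obtain the first loading vector a_1
norm = math.sqrt(sum(uj * uj for uj in u))
a1 = [uj / norm for uj in u]

# Scores of the samples on the first PLS-DA component
# (column-centring of X is skipped here because it does not change u)
scores = [sum(X[i][j] * a1[j] for j in range(n_features)) for i in range(n)]
```

The loading of the uninformative second feature is zero, so the component scores depend only on the feature that discriminates between the classes.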

PLS-DA has been used mostly in biomarker and drug discovery research, for example in an LC–MS/MS and NMR study of advanced-stage melanoma in blood [145]. Using LC–MS data, Lambrecht et al. [146] employed PLS-DA to classify black rice according to its place of origin. Eleni et al. [147] used PLS to predict the diffusion of substances through artificial membranes.

Additionally, orthogonal partial least squares discriminant analysis (OPLS-DA) is designed to separate the discriminating from the non-discriminating dimensions [148]. Using a set of metabolites identified by LC–MS/MS, Zhang et al. [149] applied OPLS-DA to confirm the authenticity of fruit juices. Shurui et al. [150] followed a similar strategy, applying OPLS-DA to an HRMS study for non-target metabolomics.

Support vector machines (SVM)

Support vector machines (SVM) are a set of pattern-recognition techniques developed to handle nonlinear data distributions effectively, and they are among the machine learning methods used in chemometrics. The fundamental component of SVM is the projection of data points into a space with added dimensions, in which linear functions capable of modelling the data can be identified [151]. Such modelling functions can be projected back into the space of the original predictors, where they are lower in dimension but higher in complexity (often nonlinear). SVM is conventionally used in discriminant classification. Nevertheless, several authors have proposed adaptations for class modelling. Among the most popular strategies is the support vector domain description (SVDD) method described by Songfeng Zheng [152], which uses hyperspheres to describe the class spaces. Numerous researchers have used this strategy in a variety of analytical studies, including laser-induced breakdown spectroscopy [153], ATR-FT-IR spectroscopy [154], tandem mass spectrometry (MS/MS) [155], and HPLC [156].
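The projection idea can be shown with a toy example. This is not an SVM solver: the four one-dimensional points, the feature map, and the hand-picked linear function are all assumptions made to show why adding a dimension can turn a nonlinear problem into a linear one.

```python
# Toy data: the outer points belong to class 1, the inner points to class 0.
x_values = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]

def separable_by_threshold(values, classes):
    """True if a single threshold on the value splits the two classes."""
    ordered = [c for _, c in sorted(zip(values, classes))]
    # Separable only if the sorted labels change class at most once
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return changes <= 1

def feature_map(x):
    """Project a 1D point into 2D by adding its square as a new dimension."""
    return (x, x * x)

mapped = [feature_map(x) for x in x_values]

# In the original space no threshold works; along the added dimension one does.
original_ok = separable_by_threshold(x_values, labels)
mapped_ok = separable_by_threshold([m[1] for m in mapped], labels)

# A linear function in the mapped space: f = 0 * x + 1 * x^2 - 2.5
def decision(point):
    return 1 if (0.0 * point[0] + 1.0 * point[1] - 2.5) > 0 else 0

predicted = [decision(m) for m in mapped]
```

Projected back to the original axis, the linear rule x² > 2.5 becomes the nonlinear rule |x| > 1.58, which is the behaviour the paragraph describes. A real SVM additionally chooses the separating function that maximizes the margin between classes.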

Artificial neural networks (ANNs)

ANNs are multilayer networks of linked mathematical operators (neurons). The feed-forward neural network is the most common ANN. Here, each neuron computes a weighted sum of the input data or of the outputs of the preceding layer, which is then modified by an activation function (typically a linear or logistic function). The resulting algorithms learn from a data set to predict event outcomes [157].
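The forward pass described above can be sketched in plain Python. The network size, weights, and inputs are arbitrary placeholders, and training (the step in which the weights are learned from data) is deliberately left out.

```python
import math

def logistic(t):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-t))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, modified by the activation function."""
    return logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)

def feed_forward(inputs, layers):
    """Each layer is a list of (weights, bias) pairs, one pair per neuron."""
    activations = inputs
    for layer in layers:
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron
layers = [
    [([0.5, -0.2], 0.1), ([-0.3, 0.8], 0.0)],   # hidden layer
    [([1.0, -1.0], 0.2)],                       # output layer
]
output = feed_forward([1.0, 2.0], layers)
```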

In the last decade, artificial neural networks (ANNs) have been developed to determine retention index or time for 1D-GC, 1D-LC, 2D-LC and 2D-GC separations [158, 159]. ANNs are computer programs that “learn” to carry out tasks by taking into account multiple cases. As long as enough input is given, an ANN can detect traits and patterns in data. Then, predictions are made in novel conditions using these traits and patterns. ANNs have been employed in a variety of analytical research studies, such as LC–MS/MS determination [160], GC–MS [161], and HPLC [162]. The chemometric methods used in analytical techniques are listed in Table 2.

Table 2 List of chemometric methods used in analytical techniques

Analytical quality by design (AQbD)

Analytical quality by design (AQbD) is an approach for developing robust analytical methods that supports regulatory flexibility in pharmaceutical submissions to the FDA. AQbD is widely used in the development of various analytical methods such as UV–visible, FT-IR, Raman, NIR, fluorimetric, HPLC, UHPLC, LC–MS, GC–MS, HPTLC, and SFC. In the pharmaceutical industry, the AQbD tool is integrated with process analytical technology (PAT) as a real-time process analyser to monitor any given process or material, which generates massive and complex data sets. There is growing interest in implementing AQbD in new analytical method development for wider applications, including assays, stability studies, and bioanalytical studies. Compared with the one-factor-at-a-time (OFAT) approach, AQbD-based analytical methods have demonstrated a high degree of robustness and method performance. Notably, these techniques reduce the likelihood of human error. The AQbD approach does not predict a chromatogram; instead, it builds scientific understanding across the sequence of method implementation, beginning with risk assessment in the choice of method, then the relationship between method parameters and expected method results, and finally a region in which the method is highly robust and cost-effective [186]. The design of experiments (DoE) is a part of the AQbD methodology and captures the interactions among the input factors that ultimately affect the method response and outcomes. Therefore, a typical AQbD methodology starts with an analytical target profile (ATP) and a risk and criticality evaluation, then uses DoE to optimize the method variables, creates a method operable design region (MODR), and implements a control plan [187,188,189]. Published works are compiled in Table 3, and the scheme of the methodology is illustrated in Fig. 6.
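The DoE step can be illustrated with a two-level full factorial design, one of the simplest designs that exposes interactions between factors. The factor names and levels below are hypothetical examples for an HPLC method and are not taken from the cited works.

```python
from itertools import product

# Hypothetical method factors, each at a low and a high level
factors = {
    "flow_rate_mL_min": (0.8, 1.2),
    "column_temp_C": (25, 35),
    "organic_percent": (40, 60),
}

# Full factorial design: every combination of factor levels is one run
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for number, run in enumerate(runs, start=1):
    print(number, run)
```

Three factors at two levels give 2³ = 8 runs; the measured responses of these runs (for example, resolution or tailing factor) are then modelled to locate the MODR.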

Table 3 Data on the analytical methods reported based on AQbD
Fig. 6 A typical AQbD approach in analytical method development

Assessments of prediction ability of prediction software

Assessment of the predictive ability of NMR prediction by Chemaxon

The predicted chemical shift values were verified for the chosen test compounds shown in Fig. 7. The experimental chemical shift values of ten structurally diverse compounds were compared with the predicted values. A per cent error (%) was obtained for each chemical shift value, and a regression analysis was performed. The per cent error ranged from − 26.52 to 35.98%. The correlation graphs in Figs. 8 and 9 show R2 values of 0.959 (H1-NMR) and 0.974 (C13 NMR), which indicates the accuracy of the NMR signal prediction. In H1-NMR, the aliphatic proton error ranged from − 26.52 to 35.98%, whereas the aromatic proton error ranged from − 25.47 to 9.21%. In C13 NMR, the aliphatic carbon error ranged from − 14.41 to 27.54%, whereas the aromatic carbon error ranged from − 14.95 to 6.49%. We conclude that the aliphatic error was greater than the aromatic error. All of these data are presented in Table 4.
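The two statistics used throughout this assessment can be sketched as follows. The text does not give the error formula, so the sketch assumes the signed error (predicted − experimental) / experimental × 100, and it takes R2 as the squared Pearson correlation of a simple linear regression. The three shift values are placeholders, not data from Table 4.

```python
def percent_errors(predicted, experimental):
    """Signed % error, assuming (predicted - experimental) / experimental * 100."""
    return [(p - e) / e * 100.0 for p, e in zip(predicted, experimental)]

def r_squared(predicted, experimental):
    """Squared Pearson correlation, i.e. R2 of a simple linear regression."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    s_pe = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_ee = sum((e - mean_e) ** 2 for e in experimental)
    return s_pe ** 2 / (s_pp * s_ee)

# Placeholder chemical shifts (ppm), for illustration only
experimental_ppm = [1.0, 2.0, 4.0]
predicted_ppm = [1.1, 1.8, 4.0]

errors = percent_errors(predicted_ppm, experimental_ppm)
r2 = r_squared(predicted_ppm, experimental_ppm)
```

Note that a high R2 can coexist with large individual per cent errors, particularly for small reference values, so the two statistics are best read together.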

Fig. 7 Chemical structures used for NMR signal prediction assessment

Fig. 8 Regression for H1-NMR predicted versus experimental signals

Fig. 9 Regression for C13-NMR predicted versus experimental signals

Table 4 Comparative data for H1-NMR and C13 NMR signals of predicted versus experimental signals

Assessment of the predictive ability of ORCA

For UV–Visible prediction

Here, the experimental wavelength maximum (λmax) values of fifteen structurally diverse compounds were compared with the predicted wavelength values. A per cent error (%) was obtained for each wavelength value, and a regression analysis was performed. The error was found to be between − 2.27 and 18.69%. The correlation graph in Fig. 10 shows an R2 value of 0.926, which demonstrates the accuracy of the UV–visible prediction. When methanol is used as the prediction solvent, the error ranges from 0.0 to 18.69%, whereas with water it ranges from − 2.27 to 11.73%. We conclude that more error is observed when methanol rather than water is used as the solvent for prediction. The resulting data are presented in Table 5.

Fig. 10 Regression plot for UV–visible of predicted versus experimental frequency

Table 5 Comparative data of UV–visible spectral data: predicted versus experimental

For Raman and infrared

The predicted Raman shifts and infrared absorption frequencies of the selected test substances were verified against experimental values. The predicted frequency values of ten structurally diverse compounds were compared with the experimental frequency values, and the % error for each frequency value and a regression analysis were calculated. The % error was between − 30.04 and 29.26%. The R2 value of the correlation (Figs. 11 and 12) was 0.946 for both the Raman shift and the infrared absorption frequency, which indicates the reliability of the prediction. The results reveal that the error ranged from − 20.02 to 29.26% for aliphatic single-bond compounds, from − 4.20 to 13.14% for double- and triple-bonded compounds, from − 21.01 to 3.55% for hydroxyl functional group compounds, and from − 11.43 to 18.07% for aromatic ring compounds. The error for aliphatic single-bond compounds was therefore greater than the other errors. The comparative data are presented in Table 6.

Fig. 11 Regression plot for Raman shift of predicted versus experimental frequency

Fig. 12 Regression plot for infrared absorption of predicted versus experimental frequency

Table 6 Comparative data on Raman shift and infrared absorption frequency: predicted versus experimental values

Furthermore, the predicted infrared absorption frequencies of lamivudine and zidovudine were verified for all functional group frequencies. The predicted functional group frequencies of lamivudine and zidovudine were compared with the experimental frequency values, and the % error for each frequency value and a regression analysis were calculated. The % error was between − 24.26 and 18.89%. The R2 value of the correlation graph in Fig. 13 was 0.970 for both lamivudine and zidovudine absorption frequencies. This also demonstrates the reliability of frequency prediction when a single compound is predicted with all of its functional groups. The data are presented in Table 7.

Fig. 13 Regression plot for infrared absorption of predicted versus experimental frequency of lamivudine and zidovudine

Table 7 Comparative infrared data of single compound with functional group frequency: predicted vs experimental values

Assessment of the predictive ability of QSRR Automator with reference data

QSRR retention predictions were carried out for antiviral drugs. Reference information was gathered from various research publications, and different antiviral drugs eluted on C18 columns were selected. The predicted retention time for the test set of drugs was compared with the published retention time data, and the % error for each retention time and the regression coefficient were calculated. The % error was in the range of − 20 to 20%, as can be seen in the histogram plot in Fig. 15, and the R2 value of the correlation (Fig. 14) was 0.947. This indicates the reliability of retention time prediction. Topological and 2D descriptors such as MW, AATS, MATS, GATS, Axp, n6aHRing, NsNH2, and SLogP were the most frequently contributing descriptors, and they depend on the chemical structure of the analyte. Table 8 presents the QSRR-predicted and experimental retention times of antiviral drugs.

Fig. 14 Regression plot for a retention time of predicted versus published

Fig. 15 Histogram plots of the % errors in the prediction study: A UV–visible (symmetric error), B Raman (right-skewed error), C IR (symmetric error), D H1-NMR (right-skewed error), E C13-NMR (left-skewed error), F QSRR (left-skewed error)

Table 8 Comparative data of predicted and experimental retention in HPLC using the QSRR approach [232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279]

Short-time Fourier transform (STFT) method for assessment

The short-time Fourier transform (STFT) was used as a further evaluation and was sufficient to confirm that the assessment results for C13-NMR, H1-NMR, UV–visible, IR, and Raman are accurate. In the power spectral density of the frequency spectrogram, yellow or red denotes high power and blue denotes low power. We concentrated on the highest frequency power of the predicted and experimental data sets: if both spectrograms show the same frequency power, the prediction is close to the experimental result and acceptable, whereas different frequency powers indicate that the prediction was inaccurate.
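A minimal STFT power spectrogram can be sketched with the standard library alone. How the input sequences were prepared for this assessment is not stated, so the sketch simply treats each data set as an ordered sequence; the sinusoidal test signals, the frame length, the hop size, and the Hann window are assumptions chosen for illustration.

```python
import cmath
import math

def stft_power(signal, frame_len, hop):
    """Power spectrogram: one list of |DFT|^2 values (bins 0..frame_len//2) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]          # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            spectrum.append(abs(coeff) ** 2)
        frames.append(spectrum)
    return frames

def peak_power(spectrogram):
    """Highest power found in any frame and frequency bin."""
    return max(max(frame) for frame in spectrogram)

# Hypothetical sequences: the "predicted" one is slightly weaker than the "experimental" one
experimental = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
predicted = [0.98 * value for value in experimental]

spec_exp = stft_power(experimental, frame_len=32, hop=16)
spec_pred = stft_power(predicted, frame_len=32, hop=16)

# Frequency bin with the highest power in each experimental frame
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec_exp]
power_ratio = peak_power(spec_pred) / peak_power(spec_exp)
```

Two data sets that agree well give spectrograms whose highest powers fall in the same bins with a power ratio close to one, which is the comparison the paragraph describes.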

For H1-NMR, the power frequencies of the predicted and experimental results are nearly identical, with highest power indices of 18.25 and 17.39, respectively (Fig. 16), which indicates that both outcomes are accurate. For C13-NMR, the power frequencies of the predicted and experimental data are depicted in Fig. 17; they exhibit nearly identical power index values of 44.17 and 44.01, respectively. This demonstrates the validity of the data, although the spectrogram reveals very slight frequency differences in the lower power range, visible as the blue peaks. For UV–visible (Fig. 18), the highest frequency indices are 44.89 and 44.82 for the predicted and experimental results, respectively. Finally, Figs. 19 and 20 show that the Raman and infrared predicted and experimental data share the same power frequency index, and Fig. 21 shows the 3D STFT spectrogram plot for the IR prediction results. All of these results confirm that the predictions were accurate.

Fig. 16 STFT spectrogram plot of H1-NMR for predicted and published results

Fig. 17 STFT spectrogram plot of C13-NMR for predicted and published results

Fig. 18 STFT spectrogram plot of UV–visible for predicted and published results

Fig. 19 STFT spectrogram plot of Raman for predicted and published results

Fig. 20 STFT spectrogram plot of IR for predicted and published results

Fig. 21 STFT spectrogram 3D plot for IR prediction results

Discussion

Our assessment afforded acceptable results; however, there were a few software-related constraints, particularly the time consumption of 5–20 h for TD-DFT calculations. Prediction errors are larger when the data set is small, and a dedicated prediction tool is required. A large data set is therefore necessary for successful prediction; in some cases, however, a large data set can also result in inaccurate prediction. For example, a complicated structure with multiple classes of variables takes longer to process, and the impact on the prediction process can ultimately lead to wrong results. Careful planning of the data set and systematic prediction are therefore required to produce reliable research findings. While collecting the reference data set, we also encountered cases in which data were not present in the reference library. In that situation, leaving the compound out or switching to another approach might be an option. Similarly, if two distinct spectra are found for the same chemical in the reference data, the data need to be optimized. Several online reference data sources are available for mass spectrometry; however, fewer are available for infrared and Raman spectra. All these problems and challenges related to spectrogram prediction are listed in Table 9.

Table 9 Spectrogram behaviour predictions’ limitations and benefits

In the QSRR approach, more than 50 compounds are needed for prediction of retention time. During operation, we noticed that the predicted values tended to lie close to the values in the supplied data set. This may be a limitation of the QSRR Automator programme; more sophisticated software for retention time prediction is already available and can be used as an alternative. Chemometric theory is entirely mathematical, so a sound understanding of AQbD and chemometrics is critical. A chemist who is not familiar with the mathematics will find it harder to develop a prediction process. Each approach in chemometrics has its own methodology; thus, experts are required for both planning and result evaluation. Additionally, we noted in the literature that there is less research on electrochemical and spectroscopic prediction with chemometrics. Generally, electrochemical prediction is employed in the technological field, but for only a few drugs during discovery and development. Because so many variables can influence the results, such as instrument settings, calibration, process, and model selection, some AQbD method failures will certainly occur when methods are replaced. However, this strategy is most effective at minimizing method transfer, OOS, and OOT failure rates. These points are presented in Table 10.

Table 10 Chromatography behaviour predictions’ limitations and benefits

The differences between physical and chemical data predictions are illustrated in Table 11. Physical data prediction is simpler than chemical data prediction because the latter requires a larger number of supporting techniques and programmes. Chemical data prediction also requires a larger number of descriptors and is more challenging for beginners and students.

Table 11 Differences in physical and chemical data predictions

Conclusions

In conclusion, the computational approaches to spectrum prediction assessed in this review gave acceptable accuracy with the least feasible variation. Overall, students and researchers make considerable use of in silico tools in computational chemistry, which indicates the reliability of such tools in research. The development and application of computational approaches in analytical research and development are our key objectives. As we observed, computational prediction of analytical behaviour offers a wide range of applications in academic research, bioanalytical method development, computational chemistry, analytical method development, data analysis approaches, material characterization, and validation. Still, the prediction error of these tools needs to be minimized for better accuracy, and this will be explored further in future research.

Availability of data and materials

Not applicable. The manuscript does not contain any data.

Abbreviations

UV: Ultraviolet
IR: Infrared
NMR: Nuclear magnetic resonance
GC: Gas chromatography
ANNs: Artificial neural networks
RSM: Response surface methodology
AQbD: Analytical quality by design
DoE: Design of experiments
QSRR: Quantitative structure–retention relationship
OOT: Out of trend
OOS: Out of specification
AI: Artificial intelligence
ASV: Automatic structure verification
TD-DFT: Time-dependent density functional theory
EI-MS: Electron ionization mass spectrometry
NEIMS: Neural electron ionization mass spectrometry
QCEIMS: Quantum chemical electron ionization mass spectrometry
MD: Molecular dynamics
ESI–MS/MS: Electrospray ionization tandem mass spectrometry
EOM-CCSD: Equation of motion coupled cluster singles and doubles
PCM: Polarizable continuum model
QSAR: Quantitative structure–activity relationship
Lasso-RF: Random forest descriptor
EIS: Electrochemical impedance spectroscopy
GPs: Gaussian processes
TDMs: Transition density matrices
ICP-MS: Inductively coupled plasma mass spectrometry
XRD: X-ray diffraction
CSV: Comma-separated values
SMILES: Simplified molecular input line entry system
RMSE: Root-mean-squared error
FD: Factorial design
PLS: Partial least squares
CA: Cluster analysis
PCR: Principal component regression
TLRC: Trilinear regression calibration
MLRC: Multi-linear regression calibration
PLSR: Partial least squares regression
ILS: Inverse least squares
PCA: Principal component analysis
OPA: Orthogonal projection approach
EFA: Evolving factor analysis
MCR-ALS: Multivariate curve resolution–alternating least squares
SFA: Sub-window factor analysis
PARAFAC: Parallel factor analysis
GRAM: Generalized rank annihilation method
PLS-DA: Partial least squares discriminant analysis
HCA: Hierarchical cluster analysis
SIMCA: Soft independent modelling of class analogy
RAFA: Rank annihilation factor analysis
LC–NMR: Liquid chromatography–nuclear magnetic resonance
LC–MS: Liquid chromatography–mass spectrometry
GC–MS: Gas chromatography–mass spectrometry
FT-IR: Fourier transform infrared
HPLC: High-performance liquid chromatography
UPLC: Ultra-performance liquid chromatography
HPTLC: High-performance thin-layer chromatography
SFC: Supercritical fluid chromatography
PAT: Process analytical technology
OFAT: One-factor-at-a-time
FR: Flow rate
CT: Column temperature
AQ: Aqueous
ACN: Acetonitrile
RSD: Relative standard deviation
TBAH: Tetrabutylammonium hydroxide
PDOP: Potassium dihydrogen orthophosphate
TEA: Triethylamine
STFT: Short-time Fourier transform
QSPR: Quantitative structure–property relationship

References

  1. Nova A, Maseras F (2013) Enantioselective synthesis. In: Comprehensive inorganic chemistry ii (second edition): from elements to applications, pp. 807–831

  2. Genheden S, Reymer A, Saenz-Méndez P, Eriksson LA (2017) Computational chemistry and molecular modelling basics

  3. Polanski J, Gasteiger J (2016) Computer representation of chemical compounds. J Puzyn T Eds. https://doi.org/10.1007/978-94-007-6169-8_50-1

    Article  Google Scholar 

  4. Gerlich M, Neumann S (2013) Metfusion: integration of compound identification strategies. J Mass Spectrom 48(3):291–298

    CAS  PubMed  Google Scholar 

  5. Wolf S, Schmidt S, Müller-Hannemann M, Neumann S (2010) In silico fragmentation for computer assisted identification of metabolite mass spectra. BMC Bioinf 11(1):1–12

    Google Scholar 

  6. Peironcely JE, Rojas-Chertó M, Tas A, Vreeken R, Reijmers T, Coulier L, Hankemeier T (2013) Automated pipeline for de novo metabolite identification using mass-spectrometry-based metabolomics. Anal Chem 85(7):3576–3583

    CAS  PubMed  Google Scholar 

  7. Snyder HD, Kucukkal TG (2021) Computational chemistry activities with avogadro and orca. J Chem Educ 98(4):1335–1341. https://doi.org/10.1021/acs.jchemed.0c00959

    Article  CAS  Google Scholar 

  8. Kotha RR, Natarajan S, Wang D, Luthria DL (2019) Compositional analysis of non-polar and polar metabolites in 14 soybeans using spectroscopy and chromatography tools. Foods 8(11):557

    CAS  PubMed  PubMed Central  Google Scholar 

  9. Kaleta M, Oklestkova J, Novák O, Strnad M (2021) Analytical methods for the determination of neuroactive steroids. Biomolecules 11(4):553

    CAS  PubMed  PubMed Central  Google Scholar 

  10. Vandierendonck A (2017) A comparison of methods to combine speed and accuracy measures of performance: a rejoinder on the binning procedure. Behav Res Methods 49(2):653–673

    PubMed  Google Scholar 

  11. Paul D, Sanap G, Shenoy S, Kalyane D, Kalia K, Tekade RK (2021) Artificial intelligence in drug discovery and development. Drug Discov Today 26(1):80

    CAS  PubMed  Google Scholar 

  12. Udayakumar V, Periandy S, Ramalingam S (2011) Experimental (ft-ir and ft-raman) and theoretical (hf and dft) investigation, ir intensity, raman activity and frequency estimation analyses on 1-bromo-4-chlorobenzene. Spectrochim Acta A Mol Biomol Spectrosc 79(5):920–927. https://doi.org/10.1016/j.saa.2011.03.049

    Article  CAS  PubMed  Google Scholar 

  13. Guideline, I. H. T. (2017). Technical and regulatory considerations for pharmaceutical product lifecycle management q12. Paper presented at the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use.

  14. FDA. (2011). Food drug administration. Pharmaceutical quality system (ich 10) conference. Accessed Jul 2021 from https://www.fda.gov/regulatory-information/search-fda-guidance-documents/q10-pharmaceutical-quality-system

  15. Patel KY, Dedania ZR, Dedania RR, Patel U (2021) Qbd approach to hplc method development and validation of ceftriaxone sodium. Future J Pharm Sci 7(1):141. https://doi.org/10.1186/s43094-021-00286-4

    Article  Google Scholar 

  16. Peraman R, Bhadraya K, Padmanabha Reddy Y (2015) Analytical quality by design: a tool for regulatory flexibility and robust analytics. Int J Anal Chem. https://doi.org/10.1155/2015/868727

    Article  PubMed  PubMed Central  Google Scholar 

  17. Agatonovic-Kustrin S, Zecevic M, Zivanovic L, Tucker I (1998) Application of artificial neural networks in HPLC method development. J Pharm Biomed Anal 17(1):69–76

    CAS  PubMed  Google Scholar 

  18. Webb R, Doble P, Dawson M (2009) Optimisation of hplc gradient separations using artificial neural networks (anns): application to benzodiazepines in post-mortem samples. J Chromatogr B 877(7):615–620

    CAS  Google Scholar 

  19. Chatterjee S (2013) QBD considerations for analytical methods—FDA perspective. Paper presented at the US IFPAC annual meeting

  20. Burnett K, Harrington B, Graul T, Fanalis S, Haddad P, Poole C (2013) Qbd in liquid chromatographic applications. Elsevier

    Google Scholar 

  21. Kaliszan R (2000) Chapter 11 recent advances in quantitative structure-retention relationships (QSRR). In: Valkó K (ed) Handbook of analytical separations. Elsevier Science, pp 503–534

    Google Scholar 

  22. Héberger K (2007) Quantitative structure–(chromatographic) retention relationships. J Chromatogr A 1158(1):273–305. https://doi.org/10.1016/j.chroma.2007.03.108

    Article  CAS  PubMed  Google Scholar 

  23. Amos RIJ, Haddad PR, Szucs R, Dolan JW, Pohl CA (2018) Molecular modeling and prediction accuracy in quantitative structure-retention relationship calculations for chromatography. TrAC, Trends Anal Chem 105:352–359. https://doi.org/10.1016/j.trac.2018.05.019

    Article  CAS  Google Scholar 

  24. De Matteis CI, Simpson DA, Doughty SW, Euerby MR, Shaw PN, Barrett DA (2010) Chromatographic retention behaviour of n-alkylbenzenes and pentylbenzene structural isomers on porous graphitic carbon and octadecyl-bonded silica studied using molecular modelling and QSRR. J Chromatogr A 1217(44):6987–6993

    PubMed  Google Scholar 

  25. MacNeil JD (2012) Analytical difficulties facing today’s regulatory laboratories: issues in method validation. Drug Test Anal 4(Suppl 1):17–24. https://doi.org/10.1002/dta.1358

    Article  CAS  PubMed  Google Scholar 

  26. Volta ESL, Gonçalves R, Menezes JC, Ramos A (2021) Analytical method lifecycle management in pharmaceutical industry: a review. AAPS PharmSciTech 22(3):128. https://doi.org/10.1208/s12249-021-01960-9

    Article  Google Scholar 

  27. Yang W, Qian W, Yuan Z, Chen B (2022) Perspectives on the flexibility analysis for continuous pharmaceutical manufacturing processes. Chin J Chem Eng 41:29–41. https://doi.org/10.1016/j.cjche.2021.12.005

    Article  CAS  PubMed  Google Scholar 

  28. Akash MSH, Rehman K (2020) Introduction to pharmaceutical analysis. In: Akash MSH, Rehman K (eds) Essentials of pharmaceutical analysis. Springer Nature Singapore, Singapore, pp 1–18

    Google Scholar 

  29. Cadinoska M, Popstefanova N, Ilievska M, Karadzinska E, Jovanoska M, Glavas Dodov M (2019) Trending and out-of-trend results in pharmaceutical industry. Maced Pharm Bull 65:39–60. https://doi.org/10.33320/maced.pharm.bull.2019.65.01.005

    Article  Google Scholar 

  30. Appleton T, Bryan P, Contos D et al (2012) Nonclinical dose formulation: out of specification investigations. Aaps J 14(3):523–529. https://doi.org/10.1208/s12248-012-9347-4

    Article  PubMed  PubMed Central  Google Scholar 

  31. Martinez Calatayud J (2005) Spectrophotometry | pharmaceutical applications. In: Worsfold P, Townshend A, Poole C (eds) Encyclopedia of analytical science, 2nd edn. Elsevier, Oxford, pp 373–383

    Google Scholar 

  32. Simundic AM, Lippi G (2012) Preanalytical phase–a continuous challenge for laboratory professionals. Biochem Med 22(2):145–149. https://doi.org/10.11613/bm.2012.017

    Article  Google Scholar 

  33. Paré GKS (2017) Handbook of ehealth evaluation: an evidence-based approach. University of Victoria, Victoria

    Google Scholar 

  34. Redrup MJ, Igarashi H, Schaefgen J et al (2016) Sample management: recommendation for best practices and harmonization from the global bioanalysis consortium harmonization team. Aaps J 18(2):290–293. https://doi.org/10.1208/s12248-016-9869-2

    Article  PubMed  PubMed Central  Google Scholar 

  35. Piskunov DP, Danilova LA, Pushkin AS, Rukavishnikova SA (2020) Influence of exogenous and endogenous factors on the quality of the preanalytical stage of laboratory tests (review of literature). Klin Lab Diagn 65(12):778–784. https://doi.org/10.18821/0869-2084-2020-65-12-778-784

    Article  CAS  PubMed  Google Scholar 

  36. Krčmová LK, Melichar B, Švec F (2020) Chromatographic methods development for clinical practice: requirements and limitations. Clin Chem Lab Med 58(11):1785–1793. https://doi.org/10.1515/cclm-2020-0517

    Article  CAS  PubMed  Google Scholar 

  37. Patil R, Bhaskar R, Ola M, Pingale D, Chalikwar SS (2019) Bioanalytical method development and method validation in human plasma by using LC MS/MS

  38. Khamis MM, Adamko DJ, El-Aneed A (2021) Strategies and challenges in method development and validation for the absolute quantification of endogenous biomarker metabolites using liquid chromatography-tandem mass spectrometry. Mass Spectrom Rev 40(1):31–52. https://doi.org/10.1002/mas.21607

    Article  CAS  PubMed  Google Scholar 

  39. Ragoisha G (2020) Challenge for electrochemical impedance spectroscopy in the dynamic world. J Solid State Electrochem 24:2171–2172

    CAS  Google Scholar 

  40. Pierce KM, Trinklein TJ, Nadeau JS, Synovec RE (2021) Chapter 20 - data analysis methods for gas chromatography. In: Poole CF (ed) Gas chromatography, 2nd edn. Elsevier, Amsterdam, pp 525–546

    Google Scholar 

  41. Oliveri P, Malegori C, Simonetti R, Casale M (2019) The impact of signal pre-processing on the final interpretation of analytical outcomes - a tutorial. Anal Chim Acta 1058:9–17. https://doi.org/10.1016/j.aca.2018.10.055

    Article  CAS  PubMed  Google Scholar 

  42. Cobas C (2020) NMR signal processing, prediction, and structure verification with machine learning techniques. Magn Reson Chem 58(6):512–519

    CAS  PubMed  Google Scholar 

  43. Ito K, Xu X, Kikuchi J (2021) Improved prediction of carbonless NMR spectra by the machine learning of theoretical and fragment descriptors for environmental mixture analysis. Anal Chem 93(18):6901–6906

    CAS  PubMed  Google Scholar 

  44. Kern S, Liehr S, Wander L, Bornemann-Pfeiffer M, Müller S, Maiwald M, Kowarik S (2020) Artificial neural networks for quantitative online NMR spectroscopy. Anal Bioanal Chem 412(18):4447–4459

    CAS  PubMed  PubMed Central  Google Scholar 

  45. Jonas E, Kuhn S (2019) Rapid prediction of NMR spectral properties with quantified uncertainty. J Cheminf 11(1):1–7

    CAS  Google Scholar 

  46. Jia W, Yang Z, Yang M, Cheng L, Lei Z, Wang X (2021) Machine learning enhanced spectrum recognition based on computer vision (SRCV) for intelligent NMR data extraction. J Chem Inf Model 61(1):21–25. https://doi.org/10.1021/acs.jcim.0c01046

    Article  CAS  PubMed  Google Scholar 

  47. Lin M, Xiong J, Su M et al (2022) A machine learning protocol for revealing ion transport mechanisms from dynamic NMR shifts in paramagnetic battery materials. Chem Sci 13(26):7863–7872. https://doi.org/10.1039/D2SC01306A

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Chemaxon (2021) Chemaxon software solution service for chemistry and biology. Accessed May 2021 from https://chemaxon.com/products/calculators-and-predictors

  49. Mamede R, Pereira F, Aires-de-Sousa J (2021) Machine learning prediction of UV–vis spectra features of organic compounds related to photoreactive potential. Sci Rep 11(1):23720. https://doi.org/10.1038/s41598-021-03070-9

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Chan B, Hirao K (2020) Rapid prediction of ultraviolet-visible spectra from conventional (non-time-dependent) density functional theory calculations. J Phys Chem Lett 11(18):7882–7885. https://doi.org/10.1021/acs.jpclett.0c02146

    Article  CAS  PubMed  Google Scholar 

  51. Urbina F, Batra K, Luebke KJ et al (2021) UV-advisor: attention-based recurrent neural networks to predict UV–vis spectra. Anal Chem 93(48):16076–16085. https://doi.org/10.1021/acs.analchem.1c03741

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. ORCA (2021) Orca. Basis sets. Orca input library. Accessed May 2021 from https://sites.google.com/site/orcainputlibrary/basis-sets

  53. (2021) UV-vis spectroscopy orca tutorials 5.0 documentation. Accessed May 2021 from https://www.orcasoftware.de/tutorials_orca/spec/uvvis.html

  54. Orca A (2021) Avogadro orca: an open-source molecular builder and visualization tool, version 4.2. Accessed May 2021 from https://avogadro.cc/

  55. Neese F (2012) The orca program system. Wiley Interdiscip Rev Comput Mol Sci 2(1):73–78

    CAS  Google Scholar 

  56. Avogadro (2021) Avogadro program manual. Accessed Jun 2021 from https://avogadro.cc/docs

  57. Neese F, Wennmohs F, Becker U, Riplinger C (2020) The orca quantum chemistry program package. J Chem Phys 152(22):224108. https://doi.org/10.1063/5.0004608

    Article  CAS  PubMed  Google Scholar 

  58. Stanstrup J, Neumann S, Vrhovšek U (2015) Predret: prediction of retention time by direct mapping between multiple chromatographic systems. Anal Chem 87(18):9421–9428. https://doi.org/10.1021/acs.analchem.5b02287

    Article  CAS  PubMed  Google Scholar 

  59. McGill C, Forsuelo M, Guan Y, Green WH (2021) Predicting infrared spectra with message passing neural networks. J Chem Inf Model 61(6):2594–2609. https://doi.org/10.1021/acs.jcim.1c00055

    Article  CAS  PubMed  Google Scholar 

  60. Maciel EVS, Pereira Dos Santos NG, Vargas Medina DA, Lanças FM (2022) Electron ionization mass spectrometry: Quo vadis? Electrophoresis 43(15):1587–1600. https://doi.org/10.1002/elps.202100392

    Article  CAS  PubMed  Google Scholar 

  61. Smith RW (2013) Mass spectrometry. In: Siegel JA, Saukko PJ, Houck MM (eds) Encyclopedia of forensic sciences, 2nd edn. Academic Press, Waltham, pp 603–608

    Google Scholar 

  62. Bauer CA, Grimme S (2016) How to compute electron ionization mass spectra from first principles. J Phys Chem A 120(21):3755–3766. https://doi.org/10.1021/acs.jpca.6b02907

    Article  CAS  PubMed  Google Scholar 

  63. Zhou Z, Zare RN (2017) Personal information from latent fingerprints using desorption electrospray ionization mass spectrometry and machine learning. Anal Chem 89(2):1369–1372. https://doi.org/10.1021/acs.analchem.6b04498

    Article  CAS  PubMed  Google Scholar 

  64. Wei JN, Belanger D, Adams RP, Sculley D (2019) Rapid prediction of electron-ionization mass spectrometry using neural networks. ACS Cent Sci 5(4):700–708. https://doi.org/10.1021/acscentsci.9b00085

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Wang S, Kind T, Tantillo DJ, Fiehn O (2020) Predicting in silico electron ionization mass spectra using quantum chemistry. J Cheminform 12(1):63. https://doi.org/10.1186/s13321-020-00470-3

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Allen F, Pon A, Greiner R, Wishart D (2016) Computational prediction of electron ionization mass spectra to assist in GC/MS compound identification. Anal Chem 88(15):7689–7697. https://doi.org/10.1021/acs.analchem.6b01622

    Article  CAS  PubMed  Google Scholar 

  67. Wang F, Liigand J, Tian S, Arndt D, Greiner R, Wishart DS (2021) Cfm-id 4.0: more accurate ESI-MS/MS spectral prediction and compound identification. Anal Chem 93(34):11692–11700. https://doi.org/10.1021/acs.analchem.1c01465

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  68. Kannan R, Solaimalai A, Jayakumar M, Surendran U (2022) Chapter 26 - advance molecular tools to detect plant pathogens. In: Rakshit A, Meena VS, Abhilash PC, Sarma BK, Singh HB, Fraceto L, Parihar M, Singh AK (eds) Biopesticides. Woodhead Publishing, pp 401–416

    Google Scholar 

  69. Zhu G, Bian Y, Hursthouse AS et al (2017) Application of 3-d fluorescence: characterization of natural organic matter in natural water and water purification systems. J Fluoresc 27(6):2069–2094. https://doi.org/10.1007/s10895-017-2146-7

    Article  CAS  PubMed  Google Scholar 

  70. Tomasi J, Mennucci B, Cammi R (2005) Quantum mechanical continuum solvation models. Chem Rev 105(8):2999–3093. https://doi.org/10.1021/cr9904009

    Article  CAS  PubMed  Google Scholar 

  71. Shavitt I, Bartlett RJ (2009) Many-body methods in chemistry and physics: Mbpt and coupled-cluster theory

  72. Pavošević F, Hammes-Schiffer S (2019) Multicomponent equation-of-motion coupled cluster singles and doubles: theory and calculation of excitation energies for positronium hydride. J Chem Phys 150(16):161102. https://doi.org/10.1063/1.5094035

    Article  CAS  PubMed  Google Scholar 

  73. Caricato M (2012) Absorption and emission spectra of solvated molecules with the eom–ccsd–pcm method. J Chem Theory Comput 8(11):4494–4502. https://doi.org/10.1021/ct3006997

    Article  CAS  PubMed  Google Scholar 

  74. Powell J, Heider EC, Campiglia A, Harper JK (2016) Predicting accurate fluorescent spectra for high molecular weight polycyclic aromatic hydrocarbons using density functional theory. J Mol Spectrosc 328:37–45. https://doi.org/10.1016/j.jms.2016.06.015

    Article  CAS  Google Scholar 

  75. Ye Z-R, Huang I-S, Chan Y-T et al (2020) Predicting the emission wavelength of organic molecules using a combinatorial qsar and machine learning approach. RSC Adv 10:23834–23841

    CAS  PubMed  PubMed Central  Google Scholar 

  76. Shams-Nateri A, Piri N (2016) Prediction of emission spectra of fluorescence materials using principal component analysis. Color Res Appl 41(1):16–21. https://doi.org/10.1002/col.21959

    Article  Google Scholar 

  77. Mai S, Atkins AJ, Plasser F, González L (2019) The influence of the electronic structure method on intersystem crossing dynamics. The case of thioformaldehyde. J Chem Theory Comput 15(6):3470–3480

    CAS  PubMed  Google Scholar 

  78. Ohto T, Dodia M, Xu J et al (2019) Accessing the accuracy of density functional theory through structure and dynamics of the water-air interface. J Phys Chem Lett 10(17):4914–4919. https://doi.org/10.1021/acs.jpclett.9b01983

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  79. Wang S, Zhang J, Gharbi O, Vivier V, Gao M, Orazem ME (2021) Electrochemical impedance spectroscopy. Nat Rev Methods Prim 1:41

    CAS  Google Scholar 

  80. Magar HS, Hassan RYA, Mulchandani A (2021) Electrochemical impedance spectroscopy (EIS): Principles, construction, and biosensing applications. Sensors. https://doi.org/10.3390/s21196578

    Article  PubMed  PubMed Central  Google Scholar 

  81. Krukiewicz K (2020) Electrochemical impedance spectroscopy as a versatile tool for the characterization of neural tissue: a mini review. Electrochem Commun 116:106742. https://doi.org/10.1016/j.elecom.2020.106742

    Article  CAS  Google Scholar 

  82. Pasqual JAR, Freisleben LC, Colpo JC, Egea JRJ, dos Santos LAL, de Sousa VC (2021) In situ drug release measuring in α-tcp cement by electrochemical impedance spectroscopy. J Mater Sci Mater Med 32(4):38. https://doi.org/10.1007/s10856-021-06507-9

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  83. Heijne A et al (2018) Quantification of bio-anode capacitance in bioelectrochemical systems using electrochemical impedance spectroscopy. J Power Sour 400:533

    CAS  Google Scholar 

  84. Vadhva P, Hu JX, Johnson MJ, Stocker R, Braglia M, Brett DJL, Rettie AJE (2021) Electrochemical impedance spectroscopy for all-solid-state batteries: theory, methods and future outlook. ChemElectroChem 8(11):1930–1947

    CAS  Google Scholar 

  85. Meyer Q, Zeng Y, Zhao C (2019) Electrochemical impedance spectroscopy of catalyst and carbon degradations in proton exchange membrane fuel cells. J Power Sourc 437:226922

    CAS  Google Scholar 

  86. Pajkossy T, Jurczakowski R (2017) Electrochemical impedance spectroscopy in interfacial studies. Curr Opin Electrochem 1(1):53–58. https://doi.org/10.1016/j.coelec.2017.01.006

    Article  CAS  Google Scholar 

  87. Maradesa A, Py B, Quattrocchi E, Ciucci F (2022) The probabilistic deconvolution of the distribution of relaxation times with finite gaussian processes. Electrochim Acta 413:140119. https://doi.org/10.1016/j.electacta.2022.140119

    Article  CAS  Google Scholar 

  88. Schulz E, Speekenbrink M, Krause A (2018) A tutorial on gaussian process regression: modelling, exploring, and exploiting functions. J Math Psychol 85:1–16. https://doi.org/10.1016/j.jmp.2018.03.001

    Article  Google Scholar 

  89. Liu J, Ciucci F (2020) The gaussian process distribution of relaxation times: a machine learning tool for the analysis and prediction of electrochemical impedance spectroscopy data. Electrochim Acta 331:135316. https://doi.org/10.1016/j.electacta.2019.135316

    Article  CAS  Google Scholar 

  90. Lu Y, Zhao CZ, Huang J, Zhang QK (2022) The timescale identification decoupling complicated kinetic processes in lithium batteries. Joule. 6:1172

    CAS  Google Scholar 

  91. Liu J, Wan TH, Ciucci F (2020) A bayesian view on the hilbert transform and the kramers-kronig transform of electrochemical impedance data: probabilistic estimates and quality scores. Electrochim Acta 357:136864

    CAS  Google Scholar 

  92. Ciucci F (2020) The gaussian process hilbert transform (gp-ht): TESTING the consistency of electrochemical impedance spectroscopy data. J Electrochem Soc 167:126503

    CAS  Google Scholar 

  93. Kiss FL, Corbet BP, Simeth NA, Feringa BL, Crespi S (2021) Predicting the substituent effects in the optical and electrochemical properties of n, n′-substituted isoindigos. Photochem Photobiol Sci 20(7):927–938. https://doi.org/10.1007/s43630-021-00071-5

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  94. Li Y, Ullrich CA (2011) Time-dependent transition density matrix. Chem Phys 391:157–163

    CAS  Google Scholar 

  95. Li H-W, Guan Z, Cheng Y, Lui T, Yang Q, Lee C-S, Tsang S-W (2016) On the study of exciton binding energy with direct charge generation in photovoltaic polymers. Adv Electr Mater. https://doi.org/10.1002/aelm.201600200

    Article  Google Scholar 

  96. Albuquerque LS, Arias JJR, Santos BPS, Marques M et al (2020) Synthesis and characterization of novel conjugated copolymers for application in third generation photovoltaic solar cells. J Market Res 9:7975–7988

    CAS  Google Scholar 

  97. Min K, Choi B, Park K, Cho E (2018) Machine learning assisted optimization of electrochemical properties for ni-rich cathode materials. Sci Rep 8(1):15778. https://doi.org/10.1038/s41598-018-34201-4

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Min K, Park K, Park SY, Seo S-W, Choi B, Cho E (2017) Improved electrochemical properties of lini0.91co0.06mn0.03o2 cathode material via li-reactive coating with metal phosphates. Sci Rep 7(1):7151. https://doi.org/10.1038/s41598-017-07375-6

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  99. Jalili-Jahani N, Zeraatkar E (2021) Fuzzy wavelet network based on extended kalman filter training algorithm combined with least square weight estimation: efficient and improved chromatographic QSRR/QSPR models. Chemom Intell Lab Syst 208:104191. https://doi.org/10.1016/j.chemolab.2020.104191

    Article  CAS  Google Scholar 

  100. Naylor BC, Catrow JL, Maschek JA, Cox JE (2020) Qsrr automator: a tool for automating retention time prediction in lipidomics and metabolomics. Metabolites 10(6):237

    CAS  PubMed  PubMed Central  Google Scholar 

  101. Wen Y, Amos RIJ, Talebi M, Szucs R, Dolan JW, Pohl CA, Haddad PR (2018) Retention index prediction using quantitative structure-retention relationships for improving structure identification in nontargeted metabolomics. Anal Chem 90(15):9434–9440. https://doi.org/10.1021/acs.analchem.8b02084

    Article  CAS  PubMed  Google Scholar 

  102. Daghir-Wojtkowiak E, Studzińska S, Buszewski B, Kaliszan R, Markuszewski MJ (2014) Quantitative structure–retention relationships of ionic liquid cations in characterization of stationary phases for hplc. Anal Methods 6(4):1189–1196. https://doi.org/10.1039/C3AY41805G

    Article  CAS  Google Scholar 

  103. Goryński K, Bojko B, Nowaczyk A, Buciński A, Pawliszyn J, Kaliszan R (2013) Quantitative structure-retention relationships models for prediction of high performance liquid chromatography retention time of small molecules: endogenous metabolites and banned compounds. Anal Chim Acta 797:13–19. https://doi.org/10.1016/j.aca.2013.08.025

    Article  CAS  PubMed  Google Scholar 

  104. Bodzioch K, Durand A, Kaliszan R, Baczek T, Vander Heyden Y (2010) Advanced QSRR modeling of peptides behavior in RPLC. Talanta 81(4–5):1711–1718. https://doi.org/10.1016/j.talanta.2010.03.028

    Article  CAS  PubMed  Google Scholar 

  105. Filipic S, Nikolic K, Krizman M, Danica A (2008) The quantitative structure–retention relationship (QSRR) analysis of some centrally acting antihypertensives and diuretics. QSAR Comb Sci 27:1036–1044. https://doi.org/10.1002/qsar.200710161

    Article  CAS  Google Scholar 

  106. Bahmani A, Saaidpour S, Rostami A (2017) Quantitative structure–retention relationship modeling of morphine and its derivatives on ov-1 column in gas–liquid chromatography using genetic algorithm. Chromatographia 80(4):629–636. https://doi.org/10.1007/s10337-017-3273-7

    Article  CAS  Google Scholar 

  107. Zhang DX, Si HZ, Liu X (2014) Quantitative structure-retention time relationship for retention time of coffee flavor compounds. Adv Mater Res 926–930:1010–1013. https://doi.org/10.4028/www.scientific.net/AMR.926-930.1010

    Article  Google Scholar 

  108. Paritala J, Peraman R, Kondreddy VK, Subrahmanyam CVS, Ravichandiran V (2021) Quantitative structure retention relationship (QSRR) approach for assessment of chromatographic behavior of antiviral drugs in the development of liquid chromatographic method. J Liq Chromatogr Relat Technol 44(13–14):637–648. https://doi.org/10.1080/10826076.2022.2025827

    Article  CAS  Google Scholar 

  109. Akbar J, Iqbal S, Batool F, Karim A, Chan KW (2012) Predicting retention times of naturally occurring phenolic compounds in reversed-phase liquid chromatography: a quantitative structure-retention relationship (qsrr) approach. Int J Mol Sci 13(11):15387–15400. https://doi.org/10.3390/ijms131115387

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  110. Parinet J (2021) Prediction of pesticide retention time in reversed-phase liquid chromatography using quantitative-structure retention relationship models: a comparative study of seven molecular descriptors datasets. Chemosphere 275:130036. https://doi.org/10.1016/j.chemosphere.2021.130036

    Article  CAS  PubMed  Google Scholar 

  111. Maljurić N, Golubović J, Otašević B, Zečević M, Protić A (2018) Quantitative structure –retention relationship modeling of selected antipsychotics and their impurities in green liquid chromatography using cyclodextrin mobile phases. Anal Bioanal Chem 410(10):2533–2550. https://doi.org/10.1007/s00216-018-0911-3

    Article  CAS  PubMed  Google Scholar 

  112. Ji C, Li Y et al (2009) Quantitative structure-retention relationships for mycotoxins and fungal metabolites in LC-MS/MS. J Sep Sci 32:3967–3979. https://doi.org/10.1002/jssc.200900441

    Article  CAS  PubMed  Google Scholar 

  113. Szucs R, Brown R, Brunelli C, Heaton JC, Hradski J (2021) Structure driven prediction of chromatographic retention times: applications to pharmaceutical analysis. Int J Mol Sci 22(8):3848

    CAS  PubMed  PubMed Central  Google Scholar 

  114. Duarte A, Capelo S (2006) Application of chemometrics in separation science. J Liq Chromatogr Relat Technol 29(7–8):1143–1176

    CAS  Google Scholar 

  115. Bos TS, Knol WC, Molenaar SR, Niezen LE, Schoenmakers PJ, Somsen GW, Pirok BW (2020) Recent applications of chemometrics in one-and two-dimensional chromatography. J Sep Sci 43(9–10):1678–1727

    CAS  PubMed  PubMed Central  Google Scholar 

  116. Komsta Ł (2012) Chemometrics in fingerprinting by means of thin layer chromatography. Chromatogr Res Int 2012:893246. https://doi.org/10.1155/2012/893246

    Article  CAS  Google Scholar 

  117. BWIQ (2021). Chemometric tool BWIQ-software package. Accessed Jun 2021 from https://bwtek.Com/support and https://bwtek.Com/videos-applications

  118. Whittaker ET (1922) On a new method of graduation. Proc Edinb Math Soc 41:63–75

  119. Suzuki T, Yoshida N (2020) Penalized least squares approximation methods and their applications to stochastic processes. Jpn J Stat Data Sci 3(2):513–541. https://doi.org/10.1007/s42081-019-00064-w

  120. Carlos Cobas J, Bernstein MA, Martín-Pastor M, Tahoces PG (2006) A new general-purpose fully automatic baseline-correction procedure for 1D and 2D NMR data. J Magn Reson 183(1):145–151. https://doi.org/10.1016/j.jmr.2006.07.013

  121. Zhang Z, Chen S, Liang Y et al (2010) An intelligent background-correction algorithm for highly fluorescent samples in Raman spectroscopy. J Raman Spectrosc 41:659–669

  122. Eilers PH (2003) A perfect smoother. Anal Chem 75(14):3631–3636. https://doi.org/10.1021/ac034173t

  123. Zhang ZM, Chen S, Liang YZ (2010) Baseline correction using adaptive iteratively reweighted penalized least squares. Analyst 135(5):1138–1146. https://doi.org/10.1039/b922045c

  124. Baek SJ, Park A, Ahn YJ, Choo J (2015) Baseline correction using asymmetrically reweighted penalized least squares smoothing. Analyst 140(1):250–257. https://doi.org/10.1039/c4an01061b

  125. Chen L, Wu Y, Li T, Chen Z (2018) Collaborative penalized least squares for background correction of multiple Raman spectra. J Anal Methods Chem 2018:9031356. https://doi.org/10.1155/2018/9031356

  126. Pérez Y, Casado M, Raldúa D et al (2020) MCR-ALS analysis of 1H NMR spectra by segments to study the zebrafish exposure to acrylamide. Anal Bioanal Chem 412(23):5695–5706. https://doi.org/10.1007/s00216-020-02789-0

  127. Felten J, Hall H, Jaumot J, Tauler R, de Juan A, Gorzsás A (2015) Vibrational spectroscopic image analysis of biological material using multivariate curve resolution-alternating least squares (MCR-ALS). Nat Protoc 10(2):217–240. https://doi.org/10.1038/nprot.2015.008

  128. Smith JP, Holahan EC, Smith FC, Marrero V, Booksh KS (2019) A novel multivariate curve resolution-alternating least squares (MCR-ALS) methodology for application in hyperspectral Raman imaging analysis. Analyst 144(18):5425–5438. https://doi.org/10.1039/C9AN00787C

  129. Nagai Y, Sohn WY, Katayama K (2019) An initial estimation method using cosine similarity for multivariate curve resolution: application to NMR spectra of chemical mixtures. Analyst 144:5986

  130. Pearson K (1901) LIII. On lines and planes of closest fit to systems of points in space. Philos Magaz J Sci 2(11):559–572

  131. Cserháti T (2010) Data evaluation in chromatography by principal component analysis. Biomed Chromatogr 24(1):20–28. https://doi.org/10.1002/bmc.1294

  132. Wold S, Esbensen K, Geladi P (1987) Principal component analysis. Chemom Intell Lab Syst 2(1–3):37–52

  133. Pang T, Zhang H, Wen L et al (2021) Quantitative analysis of a weak correlation between complicated data on the basis of principal component analysis. J Anal Methods Chem 2021:8874827. https://doi.org/10.1155/2021/8874827

  134. Soares EJ, Clifford AJ, Brown CD, Dean RR, Hupp AM (2019) Balancing resolution with analysis time for biodiesel–diesel fuel separations using GC, PCA, and the Mahalanobis distance. Separations 6(2):28

  135. Sudol PE, Gough DV, Prebihalo SE, Synovec RE (2020) Impact of data bin size on the classification of diesel fuels using comprehensive two-dimensional gas chromatography with principal component analysis. Talanta 206:120239

  136. Cook DW, Rutan SC (2014) Chemometrics for the analysis of chromatographic data in metabolomics investigations. J Chemom 28:681–987

  137. Smilde AK, Bro R, Geladi P (2004) Multi-way analysis with applications in the chemical sciences. John Wiley & Sons

  138. Khakimov B, Amigo JM, Bak S, Engelsen SB (2012) Plant metabolomics: resolution and quantification of elusive peaks in liquid chromatography–mass spectrometry profiles of complex plant extracts using multi-way decomposition methods. J Chromatogr A 1266:84–94. https://doi.org/10.1016/j.chroma.2012.10.023

  139. Kumar K (2019) Optimizing parallel factor (PARAFAC) assisted excitation-emission matrix fluorescence (EEMF) spectroscopic analysis of multifluorophoric mixtures. J Fluoresc 29(3):683–691. https://doi.org/10.1007/s10895-019-02379-z

  140. Kumar K, Kumar Mishra A (2015) Parallel factor (PARAFAC) analysis on total synchronous fluorescence spectroscopy (TSFS) data sets in excitation–emission matrix fluorescence (EEMF) layout: certain practical aspects. Chemom Intell Lab Syst 147:121–130. https://doi.org/10.1016/j.chemolab.2015.08.008

  141. Dramićanin T, Zeković I, Periša J, Dramićanin MD (2019) The parallel factor analysis of beer fluorescence. J Fluoresc 29(5):1103–1111. https://doi.org/10.1007/s10895-019-02421-0

  142. Peng N, Wang K, Tu N, Liu Y, Li Z (2020) Fluorescence regional integration combined with parallel factor analysis to quantify fluorescencent spectra for dissolved organic matter released from manure biochars. RSC Adv 10(52):31502–31510. https://doi.org/10.1039/D0RA02706E

  143. Barker M, Rayens W (2003) Partial least squares for discrimination. J Chemom 17(3):166–173. https://doi.org/10.1002/cem.785

  144. Ruiz-Perez D, Guan H, Madhivanan P, Mathee K, Narasimhan G (2020) So you think you can PLS-DA? BMC Bioinf 21(1):2. https://doi.org/10.1186/s12859-019-3310-7

  145. Bayci AWL, Baker DA, Somerset AE et al (2018) Metabolomic identification of diagnostic serum-based biomarkers for advanced stage melanoma. Metabolomics 14(8):105. https://doi.org/10.1007/s11306-018-1398-9

  146. Dittgen CL, Hoffmann JF, Chaves FC, Rombaldi CV, Filho JMC, Vanier NL (2019) Discrimination of genotype and geographical origin of black rice grown in Brazil by LC-MS analysis of phenolics. Food Chem 288:297–305. https://doi.org/10.1016/j.foodchem.2019.03.006

  147. Tsanaktsidou E, Karavasili C, Zacharis CK, Fatouros DG, Markopoulou CK (2020) Partial least square model (PLS) as a tool to predict the diffusion of steroids across artificial membranes. Molecules. https://doi.org/10.3390/molecules25061387

  148. Wang X, Wang P, Zhang A, Sun H (2015) Chapter 11 - metabolic profiling and potential biomarkers of Shenyinxu syndrome and the therapeutic effect of liuweidihuang wan. In: Wang X, Zhang A, Sun H (eds) Chinmedomics. Academic Press, Boston, pp 175–194

  149. Zhang J, Yu Q, Cheng H, Ge Y-Q, Liu H, Ye X, Chen Y (2018) Metabolomic approach for the authentication of berry fruit juice by liquid chromatography quadrupole time-of-flight mass spectrometry coupled to chemometrics. J Agric Food Chem 66(30):8199–8208

  150. Cao S, Du H, Tang B, Xi C, Chen Z (2021) Non-target metabolomics based on high-resolution mass spectrometry combined with chemometric analysis for discriminating geographical origins of Rhizoma Coptidis. Microchem J 160:105685. https://doi.org/10.1016/j.microc.2020.105685

  151. Pisner DA, Schnyer DM (2020) Chapter 6 - support vector machine. In: Mechelli A, Vieira S (eds) Machine learning. Academic Press, pp 101–121

  152. Zheng S (2016) Smoothly approximated support vector domain description. Pattern Recogn 49:55–64. https://doi.org/10.1016/j.patcog.2015.07.003

  153. Képeš E, Vrábel J, Adamovsky O, Střítežská S, Modlitbová P, Pořízka P, Kaiser J (2022) Interpreting support vector machines applied in laser-induced breakdown spectroscopy. Anal Chim Acta 1192:339352. https://doi.org/10.1016/j.aca.2021.339352

  154. Khanmohammadi Khorrami M et al (2021) Genetic algorithm based support vector machine regression for prediction of SARA analysis in crude oil samples using ATR-FTIR spectroscopy. Spectrochim Acta A Mol Biomol Spectrosc 245:118945. https://doi.org/10.1016/j.saa.2020.118945

  155. Hwang H, Jeong HK, Lee HK et al (2020) Machine learning classifies core and outer fucosylation of N-glycoproteins using mass spectrometry. Sci Rep 10(1):318. https://doi.org/10.1038/s41598-019-57274-1

  156. Usman AG et al (2020) Artificial intelligence-based models for the qualitative and quantitative prediction of a phytochemical compound using HPLC method. Turk J Chem 44(5):1339–1351. https://doi.org/10.3906/kim-2003-6

  157. Mendez KM, Broadhurst DI, Reinke SN (2020) Migrating from partial least squares discriminant analysis to artificial neural networks: a comparison of functionally equivalent visualisation and feature contribution tools using Jupyter notebooks. Metabolomics 16(2):17. https://doi.org/10.1007/s11306-020-1640-0

  158. D’Archivio AA, Incani A, Ruggieri F (2011) Retention modelling of polychlorinated biphenyls in comprehensive two-dimensional gas chromatography. Anal Bioanal Chem 399:903–913

  159. Malenović A, Jančić-Stojanović BS, Kostić N, Ivanović D, Medenica M (2011) Optimization of artificial neural networks for modeling of atorvastatin and its impurities retention in micellar liquid chromatography. Chromatographia 73:993–998

  160. Xu Y, Chen J, Yang D, Hu Y et al (2021) Development of LC-MS/MS determination method and backpropagation artificial neural networks pharmacokinetic model of febuxostat in healthy subjects. J Clin Pharm Ther 46(2):333–342. https://doi.org/10.1111/jcpt.13285

  161. Huang S, Liu Y, Sun X, Li J (2021) Application of artificial neural network based on traditional detection and GC-MS in prediction of free radicals in thermal oxidation of vegetable oil. Molecules. https://doi.org/10.3390/molecules26216717

  162. Mert Ozupek N, Cavas L (2021) Modelling of multilinear gradient retention time of bio-sweetener rebaudioside a in HPLC analysis. Anal Biochem 627:114248. https://doi.org/10.1016/j.ab.2021.114248

  163. Dinç E, Yücesoy C, Onur F (2002) Simultaneous spectrophotometric determination of mefenamic acid and paracetamol in a pharmaceutical preparation using ratio spectra derivative spectrophotometry and chemometric methods. J Pharm Biomed Anal 28(6):1091–1100. https://doi.org/10.1016/S0731-7085(02)00031-6

  164. Dinç E (2003) Linear regression analysis and its application to the multivariate spectral calibrations for the multiresolution of a ternary mixture of caffeine, paracetamol and metamizol in tablets. J Pharm Biomed Anal 33(4):605–615. https://doi.org/10.1016/s0731-7085(03)00260-7

  165. Niazi A, Goodarzi M (2008) Orthogonal signal correction-partial least squares method for simultaneous spectrophotometric determination of cypermethrin and tetramethrin. Spectrochim Acta A Mol Biomol Spectrosc 69(4):1165–1169. https://doi.org/10.1016/j.saa.2007.06.017

  166. Dinç E, Baydan E, Kanbur M, Onur F (2002) Spectrophotometric multicomponent determination of sunset yellow, tartrazine and allura red in soft drink powder by double divisor-ratio spectra derivative, inverse least-squares and principal component regression methods. Talanta 58(3):579–594. https://doi.org/10.1016/S0039-9140(02)00320-X

  167. Chen K, Park J, Li F, Patil SM, Keire DA (2018) Chemometric methods to quantify 1D and 2D NMR spectral differences among similar protein therapeutics. AAPS PharmSciTech 19(3):1011–1019. https://doi.org/10.1208/s12249-017-0911-1

  168. Wasim M, Brereton RG (2005) Application of multivariate curve resolution methods to on-flow LC-NMR. J Chromatogr A 1096(1–2):2–15. https://doi.org/10.1016/j.chroma.2005.05.101

  169. Gargallo R, Tauler R, Cuesta-Sánchez F, Massart DL (1996) Validation of alternating least-squares multivariate curve resolution for chromatographic resolution and quantitation. TrAC Trends Anal Chem 15(7):279–286. https://doi.org/10.1016/0165-9936(96)00048-9

  170. Peré-Trepat E, Lacorte S, Tauler R (2005) Solving liquid chromatography mass spectrometry coelution problems in the analysis of environmental samples by multivariate curve resolution. J Chromatogr A 1096(1):111–122. https://doi.org/10.1016/j.chroma.2005.04.089

  171. Bylund D, Danielsson R, Malmquist G, Markides KE (2002) Chromatographic alignment by warping and dynamic programming as a pre-processing tool for PARAFAC modelling of liquid chromatography–mass spectrometry data. J Chromatogr A 961(2):237–244. https://doi.org/10.1016/S0021-9673(02)00588-5

  172. Deng X, Liao Q, Xu X et al (2014) Analysis of essential oils from cassia bark and cassia twig samples by GC-MS combined with multivariate data analysis. Food Anal Methods 7(9):1840–1847. https://doi.org/10.1007/s12161-014-9821-y

  173. Fraga CG (2003) Chemometric approach for the resolution and quantification of unresolved peaks in gas chromatography–selected-ion mass spectrometry data. J Chromatogr A 1019(1–2):31–42. https://doi.org/10.1016/s0021-9673(03)01329-3

  174. Lukitaningsih E et al (2012) Quantitative analysis of lard in cosmetic lotion formulation using FTIR spectroscopy and partial least square calibration. J Am Oil Chem Soc 89(8):1537–1543. https://doi.org/10.1007/s11746-012-2052-8

  175. Rohman A, Che Man YB (2011) Application of Fourier transform infrared (FT-IR) spectroscopy combined with chemometrics for authentication of cod-liver oil. Vib Spectrosc 55(2):141–145. https://doi.org/10.1016/j.vibspec.2010.10.001

  176. Yan F, Liang Z, Jianna C, Zhengtao W, Losahan X, Zhengxing Z (2001) Analysis of Cnidium monnieri fruits in different regions of China. Talanta 53(6):1155–1162. https://doi.org/10.1016/S0039-9140(00)00594-4

  177. Comas E, Gimeno RA, Ferré J, Marcé RM, Borrull F, Rius FX (2004) Quantification from highly drifted and overlapped chromatographic peaks using second-order calibration methods. J Chromatogr A 1035(2):195–202. https://doi.org/10.1016/j.chroma.2004.02.069

  178. Detroyer A, Schoonjans V, Questier F, Vander Heyden Y, Borosy AP, Guo Q, Massart DL (2000) Exploratory chemometric analysis of the classification of pharmaceutical substances based on chromatographic data. J Chromatogr A 897(1):23–36. https://doi.org/10.1016/S0021-9673(00)00803-7

  179. Lu X-F, Bi K-S, Zhao X, Chen X-H (2012) Authentication and distinction of shenmai injection with HPLC fingerprint analysis assisted by pattern recognition techniques. J Pharm Anal 2(5):327–333. https://doi.org/10.1016/j.jpha.2012.07.009

  180. Abdollahi H, Nazari F (2003) Rank annihilation factor analysis for spectrophotometric study of complex formation equilibria. Anal Chim Acta 486:109–123. https://doi.org/10.1016/S0003-2670(03)00471-9

  181. Kong W-J, Zhao Y-L, Xiao X-H, Jin C, Li Z-L (2009) Quantitative and chemical fingerprint analysis for quality control of Rhizoma Coptidischinensis based on UPLC-PAD combined with chemometrics methods. Phytomedicine 16(10):950–959. https://doi.org/10.1016/j.phymed.2009.03.016

  182. O’Connell M-L, Ryder AG, Leger MN, Howley T (2010) Qualitative analysis using Raman spectroscopy and chemometrics: a comprehensive model system for narcotics analysis. Appl Spectrosc 64(10):1109–1121

  183. Porfire A, Tomuta I, Tefas L, Leucuta SE, Achim M (2012) Simultaneous quantification of l-α-phosphatidylcholine and cholesterol in liposomes using near infrared spectrometry and chemometry. J Pharm Biomed Anal 63:87–94. https://doi.org/10.1016/j.jpba.2012.01.017

  184. Candolfi A, De Maesschalck R, Massart DL, Hailey PA, Harrington AC (1999) Identification of pharmaceutical excipients using NIR spectroscopy and SIMCA. J Pharm Biomed Anal 19(6):923–935. https://doi.org/10.1016/s0731-7085(98)00234-9

  185. Rodríguez Cáceres MI, Durán Merás I, Ornelas Soto NE, López de Alba PL, López Martinez L (2008) Determination of anticarcinogenic and rescue therapy drugs in urine by photoinduced spectrofluorimetry using multivariate calibration: comparison of several second-order methods. Anal Bioanal Chem 391(4):1119–1127. https://doi.org/10.1007/s00216-008-2069-x

  186. Ter Horst JP, Turimella SL, Metsers F, Zwiers A (2021) Implementation of quality by design (QBD) principles in regulatory dossiers of medicinal products in the European Union (EU) between 2014 and 2019. Ther Innov Regul Sci 55(3):583–590. https://doi.org/10.1007/s43441-020-00254-9

  187. Gurrala S, Raj S, Cvs S, Anumolu PD (2022) Quality-by-design approach for chromatographic analysis of metformin, Empagliflozin and Linagliptin. J Chromatogr Sci 60(1):68–80

  188. Zacharis CK, Vastardi E (2018) Application of analytical quality by design principles for the determination of alkyl p-toluenesulfonates impurities in aprepitant by HPLC. Validation using total-error concept. J Pharm Biomed Anal 150:152–161

  189. Almeida J, Bezerra M, Markl D, Berghaus A, Borman P, Schlindwein W (2020) Development and validation of an in-line API quantification method using AQBD principles based on UV-vis spectroscopy to monitor and optimise continuous hot melt extrusion process. Pharmaceutics 12(2):150

  190. Žigart N, Časar Z (2020) Development of a stability-indicating analytical method for determination of venetoclax using AQBD principles. ACS Omega 5(28):17726–17742

  191. Bandopadhyay S, Beg S, Katare O, Sharma T, Singh B (2020) Integrated analytical quality by design (AQBD) approach for the development and validation of bioanalytical liquid chromatography method for estimation of valsartan. J Chromatogr Sci 58(7):606–621

  192. Bossunia MTI, Urmi KF, Chironjit Kumar S (2017) Quality-by-design approach to stability indicating RP-HPLC analytical method development for estimation of canagliflozin API and its validation. Pharm Methods 8:2

  193. Peraman R, Bhadraya K, Reddy YP, Reddy CS, Lokesh T (2015) Analytical quality by design approach in RP-HPLC method development for the assay of etofenamate in dosage forms. Indian J Pharm Sci 77(6):751–757. https://doi.org/10.4103/0250-474x.174971

  194. Tumpa A, Stajić A, Jančić-Stojanović B, Medenica M (2017) Quality by design in the development of hydrophilic interaction liquid chromatography method with gradient elution for the analysis of olanzapine. J Pharm Biomed Anal 134:18–26

  195. Peraman R, Kalva B, Shanka S, Reddy YP (2014) Analytical quality by design (AQBD) approach to liquid chromatographic method for quantification of acyclovir and hydrocortisone in dosage forms. Anal Chem Lett 4(5–6):329–342

  196. Peraman R, Bandi J, Kondreddy VK et al (2021) Analytical quality by design approach versus conventional approach: development of HPLC-DAD method for simultaneous determination of etizolam and propranolol hydrochloride. J Liq Chromatogr Relat Technol 44(3–4):197–209

  197. Palakurthi AK, Dongala T, Katakam LNR (2020) QBD based development of HPLC method for simultaneous quantification of telmisartan and hydrochlorothiazide impurities in tablets dosage form. Pract Lab Med 21:e00169

  198. Krishna MV, Dash RN, Reddy BJ, Venugopal P, Sandeep P, Madhavi G (2016) Quality by design (QBD) approach to develop HPLC method for eberconazole nitrate: application oxidative and photolytic degradation kinetics. J Saudi Chem Soc 20:S313–S322

  199. Thakur D, Jain A, Ghoshal G, Shivhare U, Katare O (2017) RP-HPLC method development using analytical QBD approach for estimation of cyanidin-3-o-glucoside in natural biopolymer based microcapsules and tablet dosage form. J Pharm Investig 47(5):413–427

  200. Sandhu PS, Beg S, Kumar R, Katare O, Singh B (2017) Analytical QBD-based systematic bioanalytical HPLC method development for estimation of quercetin dihydrate. J Liq Chromatogr Relat Technol 40(10):506–516

  201. Alruwaili NK (2021) Analytical quality by design approach of reverse-phase high-performance liquid chromatography of atorvastatin: method development, optimization, validation, and the stability-indicated method. Int J Anal Chem. https://doi.org/10.1155/2021/8833900

  202. Fink DW (1988) Ivermectin analytical profiles of drug substances. Elsevier, pp 155–184

  203. Abdel-Moety EM, Al-Khamees HA (1990) Analytical profile of azintamide. In: Analytical profiles of drug substances, vol 18, pp 1–32. Elsevier

  204. Zubair MU, Hassan MM, Al-Meshal IA (1986) Caffeine. In: Analytical profiles of drug substances, vol 15, pp 71–150. Elsevier

  205. Florey K (1979) Aspirin. In: Analytical profiles of drug substances, vol 8, pp 1–46. Elsevier

  206. Townley ER (1979) Griseofulvin. In: Analytical profiles of drug substances, vol 8, pp 219–249. Elsevier

  207. Pitré D (1986) Iodamide. In: Analytical profiles of drug substances, vol 15, pp 337–365. Elsevier

  208. Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B et al (2009) HMDB: a knowledgebase for the human metabolome. Nucl Acids Res 37(suppl_1):D603–D610

  209. SDBSWeb (2014) SDBSWeb: National Institute of Advanced Industrial Science and Technology. Accessed May 2021 from http://sdbs.riodb.aist.go.jp

  210. Hassan MM, Elazzouny AA (1982) Clofibrate. In: Analytical profiles of drug substances, vol 11, pp 197–224. Elsevier

  211. Piskorik HG (1985) Tripelennamine hydrochloride. In: Analytical profiles of drug substances, vol 14, pp 107–133. Elsevier

  212. Pitrè D (1985) Iopanoic acid. In: Analytical profiles of drug substances, vol 14, pp 181–206. Elsevier

  213. Brittain HG (2001) Malic acid. Profiles Drug Subst Excip Relat Methodol 28:153–195

  214. Al-Badr AA, El-Obeid HA (1988) Analytical profile of primidone. In: Analytical profiles of drug substances, vol 17, pp 749–795. Elsevier

  215. Fairbrother JE (1974) Acetaminophen. In: Analytical profiles of drug substances, vol 3, pp 1–109. Elsevier

  216. Ali SL (1983) Benzocaine. In: Analytical profiles of drug substances, vol 12, pp 73–104. Elsevier

  217. Eisner U, Gore PH (1958) The light absorption of pyrroles. Part I. Ultraviolet spectra. J Chem Soc 186:922–927. https://doi.org/10.1039/JR9580000922

  218. Filpponen I, Sadeghifar H, Argyropoulos DS (2011) Photoresponsive cellulose nanocrystals. Nanomater Nanotechnol 1:7. https://doi.org/10.5772/50949

  219. Arayne MS, Sultana N, Zuberi MH, Siddiqui FA (2009) Spectrophotometric quantitation of metformin in bulk drug and pharmaceutical formulations using multivariate technique. Indian J Pharm Sci 71(3):331–335. https://doi.org/10.4103/0250-474x.56022

  220. Dunn DL, Jones WJ, Dorsey ED (1983) Analysis of chlorobutanol in ophthalmic ointments and aqueous solutions by reverse-phase high-performance liquid chromatography. J Pharm Sci 72(3):277–280. https://doi.org/10.1002/jps.2600720317

  221. PubChem (2022) PubChem database. Accessed May 2021 from https://pubchem.ncbi.nlm.nih.gov

  222. Fairbrother JE (1973) Chloral hydrate. In: Florey K (ed) Analytical profiles of drug substances. Academic Press, pp 85–143

  223. Daley RD (1972) Halothane. In: Florey K (ed) Analytical profiles of drug substances. Academic Press, pp 119–147

  224. Orzech CE, Nash NG, Daley RD (1979) Halcinonide. In: Florey K (ed) Analytical profiles of drug substances. Academic Press, pp 283–314

  225. Stober HC (1986) Lithium carbonate. In: Florey K (ed) Analytical profiles of drug substances. Academic Press, pp 367–391

  226. Chao MKC, Albert KS, Fusari SA (1978) Phenobarbital. In: Florey K (ed) Analytical profiles of drug substances. Academic Press, pp 359–399

  227. Newman AW, Vitez IM, Mueller RL, Kiesnowski CC et al (1999) Sorbitol. In: Brittain HG (ed) Analytical profiles of drug substances and excipients. Academic Press, pp 459–502

  228. Manius GJ (1978) Trimethoprim. In: Florey K (ed) Analytical profiles of drug substances. Academic Press, pp 445–475

  229. Coates J (2000) Interpretation of infrared spectra: a practical approach. John Wiley & Sons Ltd, Chichester

  230. Vilas S, Thilagar S (2021) Formulation and optimisation of lamivudine-loaded Eudragit® S 100 polymer-coated pectin microspheres for colon-specific delivery. IET Nanobiotechnol 15(1):90–99. https://doi.org/10.1049/nbt2.12010

  231. Bansal R, Guleria A, Acharya PC (2013) FT-IR method development and validation for quantitative estimation of zidovudine in bulk and tablet dosage form. Drug Res 63(4):165–170. https://doi.org/10.1055/s-0032-1333297

  232. Anbazhagan S, Indumathy N, Shanmugapandiyan P, Sridhar SK (2005) Simultaneous quantification of stavudine, lamivudine and nevirapine by UV spectroscopy, reverse phase HPLC and HPTLC in tablets. J Pharm Biomed Anal 39(3–4):801–804

  233. Chandramowli B, Kumar BS, Bhikshapathi D, Rajkamal BB (2018) A new quantitative analytical method development and validation for the analysis of boceprevir in bulk and marketed formulation. Int J Pharm Sci Drug Res 10(3):201–205

  234. Snehal DJ, Prafulla SC, Vishal SM (2018) HPLC method development of cidofovir as bulk drug and its formulation. Int J Eng Devel Res 6(2):870–874

  235. Abdul K, Milon M, Mohammad A, Parvin M, Rafiquzzaman M, Kundu S (2017) Development and validation of a new RP-HPLC method for the determination of daclatasvir dihydrochloride in bulk and pharmaceutical dosage forms. Int J Pharm 7:7–13

  236. Satyanarayana L, Naidu S, Rao MN, Ayyanna C, Kumar A (2011) The estimation of RALTIGRAVIR in tablet dosage form by RP-HPLC. Asian J Pharm Anal 1(3):56–58

  237. Tiruveedhi VBG, Battula VR, Bonige KB (2021) RP-HPLC (stability-indicating) based assay method for the simultaneous estimation of doravirine, tenofovir disoproxil fumarate and lamivudine. Int J Appl Pharm 13(1):153–159

  238. Tejaswi KD et al (2019) Reverse-phase high-performance liquid chromatography method development and validation for simultaneous estimation and forced degradation studies of Emtricitabine, Rilpivirine, and Tenofovir Alafenamide in solid dosage form. Asian J Pharm Clin Res 12:112. https://doi.org/10.22159/ajpcr.2018.v12i1.28765

  239. Mastanamma S, Chandini S, Reehana S, Saidulu P (2018) Development and validation of stability indicating RP-HPLC method for the simultaneous estimation of Sofosbuvir and Ledipasvir in bulk and their combined dosage form. Future J Pharm Sci 4(2):116–123

  240. Dey S, Patro SS, Babu NS, Murthy P, Panda S (2017) Development and validation of a stability-indicating RP–HPLC method for estimation of atazanavir sulfate in bulk. J Pharm Anal 7(2):134–140

  241. Rathnasamy R, Karuvalam R, Pakkath R, Kamalakannan P, Sivasubramanian A (2018) RP-HPLC method development and method validation of lopinavir and ritonavir in pharmaceutical dosage form. Am J Clin Microbiol Antimicrob 1(1):1002

  242. Ezzeldin E, Abo-Talib NF, Tammam MH, Asiri YA, Amr AE-GE, Almehizia AA (2020) Validated reversed-phase liquid chromatographic method with gradient elution for simultaneous determination of the antiviral agents: sofosbuvir, ledipasvir, daclatasvir, and simeprevir in their dosage forms. Molecules 25(20):4611

  243. Ahmad SAR, Patil L, Usman MRM, Imran M, Akhtar R (2018) Analytical method development and validation for the simultaneous estimation of abacavir and lamivudine by reversed-phase high-performance liquid chromatography in bulk and tablet dosage forms. Pharm Res 10(1):92. https://doi.org/10.4103/pr.pr_96_17

  244. Haneef M, Rajkamal B, Goud VM (2013) Development and validation by RP-HPLC method for estimation of zidovudine in bulk and its pharmaceutical dosage form. Asian J Res Chem 6(4):341–344

  245. Godela R, Gummadi S (2021) A simple stability indicating RP-HPLC-DAD method for concurrent analysis of tenofovir disoproxil fumarate, doravirine and lamivudine in pure blend and their combined film coated tablets. Ann Pharm Fr

  246. Harikrishnan N, Prasad VV et al (2019) Stability indicating RP-HPLC method development and validation for the simultaneous estimation of pibrentasvir and glecaprevir in bulk and pharmaceutical dosage form. Int J Res Pharm Sci 10(3):1841–1846

  247. Ramya V et al (2019) Simultaneous estimation of amantadine hydrochloride and oseltamivir phosphate using precolumn derivatization technique. Int J Pharm Sci Res 10(12):5443–5449. https://doi.org/10.13040/IJPSR.0975-8232.10(12).5443-49

  248. García J, Márquez A, Ruiz R, López LF, Claro C, Lucero MJ (2006) Determination of foscarnet (trisodium phosphonoformate) in pharmaceutical preparations by high-performance liquid chromatography with ultraviolet detection. Biomed Chromatogr 20(10):1024–1027

  249. Ramesh P, Basavaiah K, Vinay K, Xavier CM (2012) Development and validation of RP-HPLC method for the determination of ganciclovir in bulk drug and in formulations. Int Schol Res Not. https://doi.org/10.5402/2012/894965

  250. Nekkala K, Kumar JS, Ramachandran D (2019) Stability indicating RP-HPLC method for quantification of impurities in fosamprenavir calcium drug substance. J Pharm Sci Res 11(3):712–717

  251. Bulduk İ (2021) HPLC-UV method for quantification of favipiravir in pharmaceutical formulations. Acta Chromatogr 33(3):209–215

  252. Asha E, Surendra Babu K (2020) A new selective separation method development and validation of trifluridine and tipiracil and its degradents were characterized by LC-MS/MS/QTOF. J Pharm Sci Res 12(1):199–205

  253. Reddy Y, Harika K, Sowjanya K, Swathi E, Soujanya B, Reddy S (2011) Estimation of zanamivir drug present in tablets using RP-HPLC method. Int J PharmTech Res 3:180–186

  254. Dalmora SL, Sangoi M et al (2010) Validation of a stability-indicating RP-HPLC method for the determination of entecavir in tablet dosage form. J AOAC Int 93(2):523–530

  255. Prakash KV, Rao JV, Raju NA (2007) RP-HPLC method for the estimation of nelfinavir mesylate in tablet dosage form. E J Chem 4(3):302–306

  256. Jing Q, Shen Y, Tang Y, Ren F, Yu X, Hou Z (2006) Determination of nelfinavir mesylate as bulk drug and in pharmaceutical dosage form by stability indicating HPLC. J Pharm Biomed Anal 41(3):1065–1069

  257. Attaluri T, Seru G, Varanasi SNM (2021) Development and validation of a stability-indicating RP-HPLC method for the simultaneous estimation of bictegravir, emtricitabine, and tenofovir alafenamide fumarate. Turk J Pharm Sci 18(4):410–419. https://doi.org/10.4274/tjps.galenos.2020.70962

  258. Jitta SR, Salwa K et al (2021) Development and validation of high-performance liquid chromatography method for the quantification of remdesivir in intravenous dosage form. Assay Drug Dev Technol 19(8):475–483. https://doi.org/10.1089/adt.2021.074

  259. Shuangjin C, Fang F, Han L, Ming M (2007) New method for high-performance liquid chromatographic determination of amantadine and its analogues in rat plasma. J Pharm Biomed Anal 44(5):1100–1105

  260. Kaliszan R (2007) QSRR: quantitative structure-(chromatographic) retention relationships. Chem Rev 107(7):3212–3246

  261. Venkat Kumar C, Anantahakumar D, Rao J (2010) A new validated RP-HPLC method for determination of penciclovir in human plasma. Int J Chem Sci 2:95–102

  262. Raees Ahmad SA, Patil L, Mohammed Usman MR, Imran M, Akhtar R (2018) Analytical method development and validation for the simultaneous estimation of abacavir and lamivudine by reversed-phase high-performance liquid chromatography in bulk and tablet dosage forms. Pharmacognosy Res 10(1):92–97. https://doi.org/10.4103/pr.pr_96_17

  263. Hussain Shah SS, Nasiri MI, Sarwar H, Ali A et al (2021) RP-HPLC method development and validation for quantification of daclatasvir dihydrochloride and its application to pharmaceutical dosage form. Pak J Pharm Sci 34(3):951–956

  264. Hiremath SN, Bhirud CH (2015) Development and validation of a stability indicating HPLC method for the simultaneous analysis of lopinavir and ritonavir in fixed-dose combination tablets. J Taib Univ Med Sci 10(3):271–277. https://doi.org/10.1016/j.jtumed.2014.11.006

  265. dos Santos JV, de Carvalho LA, Pina ME (2011) Development and validation of a RP-HPLC method for the determination of zidovudine and its related substances in sustained-release tablets. Anal Sci 27(3):283–289. https://doi.org/10.2116/analsci.27.283

  266. Higashi Y, Uemori I, Fujii Y (2005) Simultaneous determination of amantadine and rimantadine by HPLC in rat plasma with pre-column derivatization and fluorescence detection for pharmacokinetic studies. Biomed Chromatogr 19(9):655–662. https://doi.org/10.1002/bmc.492

  267. Mamatha J, Devanna N (2017) Development and validation of a RP-HPLC method for analysis of cidofovir in medicinal form. Indian J Sci Technol 10:1–5. https://doi.org/10.17485/ijst/2017/v10i34/117108

  268. Patel J, Bedi H, Middha A, Prajapati L, Parmar V (2012) RP-HPLC method for estimation of nitazoxanide in oral suspension formulation. Der Pharm Chem 4:1140–1144

  269. Kanagala V, Anusha M, Reddy S (2015) Rapid RP-HPLC method development and validation for analysis of raltegravir in bulk and pharmaceutical dosage form. Asian J Pharm Anal. https://doi.org/10.5958/2231-5675.2015.00003.4

  270. Babu G, Atmakuri LR, Rao J (2015) A rapid RP-HPLC method development and validation for the quantitative estimation ribavirin in tablets. Int J Pharm Pharm Sci 7:60–63

  271. Chandrasekaran B, Abed SN, Al-Attraqchi O, Kuche K, Tekade RK (2018) Chapter 21 - computer-aided prediction of pharmacokinetic (ADMET) properties. In: Tekade RK (ed) Dosage form design parameters. Academic Press, pp 731–755

  272. Wu Y-J, Zhan T, Hou Z, Fang L, Xu Y (2020) Physical and chemical descriptors for predicting interfacial thermal resistance. Sci Data 7:36. https://doi.org/10.1038/s41597-020-0373-2

  273. Zhang C, Idelbayev Y, Roberts N, Tao Y et al (2017) Small molecule accurate recognition technology (SMART) to enhance natural products research. Sci Rep 7(1):14243. https://doi.org/10.1038/s41598-017-13923-x

  274. Stagner WC, Haware RV (2019) QbD innovation through advances in PAT, data analysis methodologies, and material characterization. AAPS PharmSciTech 20(7):295. https://doi.org/10.1208/s12249-019-1506-9

  275. Skoog DA et al (2014) Fundamentals of analytical chemistry, 9th edn. Brooks/Cole, Belmont CA

  276. Booksh KS, Kowalski BR (1994) Theory of analytical chemistry. Anal Chem 66(15):782A–791A. https://doi.org/10.1021/ac00087a718

  277. Wise BM, Gallagher NB (1996) The process chemometrics approach to process monitoring and fault detection. J Process Control 6(6):329–348

  278. Fu H, Yin Q, Xu L, Wang W, Chen F, Yang T (2017) A comprehensive quality evaluation method by FT-NIR spectroscopy and chemometric: fine classification and untargeted authentication against multiple frauds for Chinese Ganoderma lucidum. Spectrochim Acta A Mol Biomol Spectrosc 182:17–25. https://doi.org/10.1016/j.saa.2017.03.074

  279. Rady AM, Guyer DE (2015) Evaluation of sugar content in potatoes using NIR reflectance and wavelength selection techniques. Postharvest Biol Technol 103:17–26. https://doi.org/10.1016/j.postharvbio.2015.02.012

  280. Hong X, Wang J, Qi G (2015) E-nose combined with chemometrics to trace tomato-juice quality. J Food Eng 149:38–43. https://doi.org/10.1016/j.jfoodeng.2014.10.003

  281. Canizo BV, Escudero LB, Pérez MB, Pellerano RG, Wuilloud RG (2017) Intra-regional classification of grape seeds produced in Mendoza province (Argentina) by multi-elemental analysis and chemometrics tools. Food Chem 242:272–278

  282. Liu W, Liu C, Hu X, Yang J, Zheng L (2016) Application of terahertz spectroscopy imaging for discrimination of transgenic rice seeds with chemometrics. Food Chem 210:415–421. https://doi.org/10.1016/j.foodchem.2016.04.117

  283. Barbosa RM, de Paula ES, Paulelli AC, Moore AF, Souza JMO et al (2016) Recognition of organic rice samples based on trace elements and support vector machines. J Food Compos Anal 45:95–100. https://doi.org/10.1016/j.jfca.2015.09.010

  284. Zheng W, Fu X, Ying Y (2014) Spectroscopy-based food classification with extreme learning machine. Chemom Intell Lab Syst 139:42–47. https://doi.org/10.1016/j.chemolab.2014.09.015

  285. Hopke PK (1985) Receptor modeling in environmental chemistry. John Wiley & Sons, New York

  286. Golubović J, Protić A, Otašević B, Zečević M (2016) Quantitative structure-retention relationships applied to development of liquid chromatography gradient-elution method for the separation of sartans. Talanta 150:190–197. https://doi.org/10.1016/j.talanta.2015.12.035

  287. Khosrokhavar R, Ghasemi JB, Shiri F (2010) 2D quantitative structure-property relationship study of mycotoxins by multiple linear regression and support vector machine. Int J Mol Sci 11(9):3052–3068. https://doi.org/10.3390/ijms11093052

  288. Fu T, Sieng IDY (2021) A comparative study between PCR, PLSR, and LW-PLS on the predictive performance at different data splitting ratios. Chem Eng Commun. https://doi.org/10.1080/00986445.2021.1957853

  289. El-Gindy A, Emara S, Mostafa A (2005) HPLC and chemometric-assisted spectrophotometric methods for simultaneous determination of atenolol, amiloride hydrochloride and chlorthalidone. Il Farmaco 60(3):269–278. https://doi.org/10.1016/j.farmac.2004.11.013

  290. Luis ML, Fraga JMG, Jiménez F, Jiménez AI, Arias JJ (2001) Simultaneous spectrophotometric determination of diuretics by using multivariate calibration methods. Talanta 53(4):761–770. https://doi.org/10.1016/S0039-9140(00)00538-5

  291. Cramer RD (1993) Partial least squares (PLS): its strengths and limitations. Perspect Drug Discov Des 1(2):269–278. https://doi.org/10.1007/BF02174528

  292. El-Gindy A, Emara S, Mostafa A (2006) Application and validation of chemometrics-assisted spectrophotometry and liquid chromatography for the simultaneous determination of six-component pharmaceuticals. J Pharm Biomed Anal 41(2):421–430. https://doi.org/10.1016/j.jpba.2005.12.005

  293. Koleini F, Balsini P, Parastar H (2022) Evaluation of partial least-squares regression with multivariate analytical figures of merit for determination of 10 pesticides in milk. Int J Environ Anal Chem 102(8):1900–1910. https://doi.org/10.1080/03067319.2020.1745198

  294. Pate ME, Turner MK, Thornhill NF, Titchener-Hooker NJ (1999) The use of principal component analysis for the modelling of high performance liquid chromatography. Bioprocess Eng 21(3):261–272. https://doi.org/10.1007/s004490050674

  295. Singh VD, Daharwal SJ (2017) Development and validation of multivariate calibration methods for simultaneous estimation of paracetamol, enalapril maleate and hydrochlorothiazide in pharmaceutical dosage form. Spectrochim Acta A Mol Biomol Spectrosc 171:369–375. https://doi.org/10.1016/j.saa.2016.08.028

Acknowledgements

We thank our institution, co-workers, and collaborators for their help in shaping this review; without their enthusiasm and hard work, it would not have been possible. We also thank the developers of the QSRR Automator, ORCA, and Avogadro software.

Study involving plants/licence for the study

Not applicable.

Funding

Not applicable.

Author information

Corresponding authors

Correspondence to M. Malarvannan or P. Ramalingam.

Ethics declarations

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors whose names are listed in this manuscript certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Malarvannan, M., Kumar, K.V., Reddy, Y.P. et al. Assessment of computational approaches in the prediction of spectrogram and chromatogram behaviours of analytes in pharmaceutical analysis: assessment review. Futur J Pharm Sci 9, 86 (2023). https://doi.org/10.1186/s43094-023-00537-6

  • DOI: https://doi.org/10.1186/s43094-023-00537-6

Keywords