Inter-HVI: Bridging Interpretability and Accuracy in Hypervalent Iodine Reactivity Prediction


Uğurlu S. Y.

JOURNAL OF SOLUTION CHEMISTRY, vol.54, no.11, pp.1403-1453, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 54 Issue: 11
  • Publication Date: 2025
  • DOI: 10.1007/s10953-025-01509-5
  • Journal Name: JOURNAL OF SOLUTION CHEMISTRY
  • Journal Indexes: Scopus, Science Citation Index Expanded (SCI-EXPANDED), Academic Search Premier, Aerospace Database, Chemical Abstracts Core, Communication Abstracts, Metadex, Civil Engineering Abstracts
  • Page Numbers: pp.1403-1453
  • Akdeniz University Affiliated: Yes

Abstract

Hypervalent iodine (HVI) reagents are widely used in organic synthesis due to their oxidative versatility, tunable reactivity, and environmentally friendly profile. However, accurately predicting their reactivity, typically quantified by bond dissociation energy (BDE), remains computationally intensive and experimentally demanding. In this work, we propose Inter-HVI, a transparent and high-performing machine learning framework for BDE prediction. Inter-HVI combines 2,809 molecular descriptors from RDKit, Mordred, PyBioMed, CDK, and Avalon/Morgan/MACCS fingerprints. Descriptors with more than 5% missing values were removed to reduce computational cost and to avoid the redundancy that imputation with mean or median values could introduce. After a generous feature selection process, the Inter-HVI model was trained using RuleFit, which remains robust even with a large number of descriptors due to its tree-derived rule structure. This structure enables the model to focus on the most informative feature interactions while naturally filtering out irrelevant or redundant variables, thus maintaining both accuracy and interpretability. As a result, Inter-HVI achieved top-tier performance in predicting BDE, matching the test R² of the benchmark ANN model at 0.960 while improving cross-validation R² (0.931 vs. 0.887). Its test RMSE was essentially the same as the benchmark's (3.033 vs. 3.030 kcal·mol⁻¹; 1 kcal = 4.184 kJ), its cross-validation RMSE was lower (3.836 vs. 4.690 kcal·mol⁻¹), and its test MAE was lower (2.230 vs. 2.276 kcal·mol⁻¹). These results show that Inter-HVI maintains predictive accuracy comparable to advanced deep learning models while offering enhanced interpretability. Beyond predictive performance, seven complementary model explanation techniques were employed to uncover the relationships between molecular features and HVI reactivity.
In particular, the interpretable rules extracted by RuleFit offer human-readable insights and can guide rational optimization of HVI compounds by modifying key descriptors to achieve desired bond dissociation properties.
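
The abstract mentions two steps that are easy to make concrete: dropping descriptors with more than 5% missing values, and scoring predictions with R², RMSE, and MAE. The following is a minimal sketch of both using only the Python standard library. It is not the authors' code; the function names (`drop_sparse_descriptors`, `regression_metrics`), the dictionary-of-columns layout, and the toy numbers are assumptions made for illustration, and the metric formulas are the standard textbook definitions rather than anything the abstract specifies.

```python
import math

KCAL_TO_KJ = 4.184  # conversion factor quoted in the abstract


def drop_sparse_descriptors(table, max_missing=0.05):
    """Keep descriptor columns whose missing fraction does not exceed max_missing.

    `table` maps a descriptor name to a list of values; None and NaN count
    as missing. (Hypothetical layout, chosen for illustration.)
    """
    kept = {}
    for name, values in table.items():
        missing = sum(
            1 for v in values
            if v is None or (isinstance(v, float) and math.isnan(v))
        )
        if missing / len(values) <= max_missing:
            kept[name] = values
    return kept


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE using their standard definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Toy descriptor table (made-up values): 40 molecules, three descriptors.
descriptors = {
    "desc_a": [1.0] * 40,                  # no missing values, kept
    "desc_b": [1.0] * 38 + [None, None],   # exactly 5% missing, kept
    "desc_c": [1.0] * 37 + [None] * 3,     # 7.5% missing, dropped
}
filtered = drop_sparse_descriptors(descriptors)

# Toy BDE values in kcal/mol (made-up, not from the paper).
bde_true = [10.0, 20.0, 30.0, 40.0]
bde_pred = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(bde_true, bde_pred)
rmse_kj = metrics["rmse"] * KCAL_TO_KJ  # same error expressed in kJ/mol

print(sorted(filtered))
print(metrics, rmse_kj)
```

The threshold test uses `<=` so that a descriptor with exactly 5% missing values is retained, which matches the abstract's wording "more than 5%"; a stricter reading would need only a change of comparison operator.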