LLM Reasoning relevance: 6/10

An Interpretable Machine Learning Framework for Non-Small Cell Lung Cancer Drug Response Analysis

Ann Rachel, Pranav M Pawar, Mithun Mukharjee, Raja M, Tojo Mathew
arXiv: 2603.16330v1 Published: 2026-03-17 Updated: 2026-03-17

AI Summary

The paper builds a personalized lung cancer drug response prediction model using XGBoost and SHAP values, combined with explanations from DeepSeek.

Main Contributions

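  • Builds a drug response prediction model based on genetic information
  • Uses SHAP values to analyze the model's interpretability
  • Uses DeepSeek to verify biological plausibility and provide contextual explanations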
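
The SHAP analysis listed above rests on Shapley values, which split one prediction into per-feature contributions that sum to the gap between that prediction and a baseline. The following is a minimal brute-force sketch of that idea, not the paper's pipeline: the toy `model`, the `background` rows, and the helper names are all hypothetical, and the enumeration is exponential in the number of features, so it only suits tiny examples. Libraries such as `shap` use much faster algorithms for tree models.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical toy predictor with one additive and one interaction term."""
    return 2.0 * x[0] + x[1] * x[2]


def coalition_value(model, x, background, subset):
    """Mean prediction when features in `subset` take x's values
    and all other features are drawn from the background rows."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classic Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Made-up background data and one made-up sample to explain.
background = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]
x = [2.0, 3.0, 2.0]

phi = shapley_values(model, x, background)
base_value = coalition_value(model, x, background, set())
print("base value:", base_value)
print("contributions:", phi)
```

The contributions plus the base value reconstruct `model(x)`, which is the additivity property that makes SHAP values readable as a per-feature breakdown of a single prediction.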

Methodology

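An XGBoost regressor predicts drug sensitivity (LN-IC50); hyperparameters are tuned with cross-validation and randomized search, and the model is interpreted with SHAP and DeepSeek.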
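
The tuning loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' setup: a one-feature ridge regression stands in for the XGBoost regressor so the example needs only the standard library, the data are synthetic, and the search range for the hypothetical penalty `lam` is arbitrary.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y ~ w * x (no intercept); stand-in for XGBoost."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def kfold_mse(xs, ys, lam, k=5):
    """Mean squared error averaged over k contiguous folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold * n // k, (fold + 1) * n // k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        w = fit_ridge_1d(train_x, train_y, lam)
        errs = [(ys[i] - w * xs[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / k


def randomized_search(xs, ys, n_iter=20, seed=0):
    """Sample lam log-uniformly from [1e-3, 1e3]; keep the lowest CV error."""
    rng = random.Random(seed)
    best_lam, best_score = None, float("inf")
    for _ in range(n_iter):
        lam = 10 ** rng.uniform(-3, 3)
        score = kfold_mse(xs, ys, lam)
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam, best_score


# Synthetic data: y = 2x plus small Gaussian noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(-1, 1) for _ in range(50)]
ys = [2.0 * x + data_rng.gauss(0, 0.1) for x in xs]

best_lam, best_score = randomized_search(xs, ys)
print("best lam:", best_lam, "cv mse:", best_score)
```

Randomized search samples a fixed number of candidate settings instead of sweeping a full grid, which keeps the cost predictable as the number of hyperparameters grows; scoring each candidate by cross-validated error guards against picking a setting that only fits one particular split.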

Original Abstract

Lung cancer is a condition where there is abnormal growth of malignant cells that spread in an uncontrollable fashion in the lungs. Some common treatment strategies are surgery, chemotherapy, and radiation which aren't the best options due to the heterogeneous nature of cancer. In personalized medicine, treatments are tailored according to the individual's genetic information along with lifestyle aspects. In addition, AI-based deep learning methods can analyze large sets of data to find early signs of cancer, types of tumor, and prospects of treatment. The paper focuses on the development of personalized treatment plans using specific patient data focusing primarily on the genetic profile. Multi-Omics data from Genomics of Drug Sensitivity in Cancer have been used to build a predictive model along with machine learning techniques. The value of the target variable, LN-IC50, determines how sensitive or resistive a drug is. An XGBoost regressor is utilized to predict the drug response focusing on molecular and cellular features extracted from cancer datasets. Cross-validation and Randomized Search are performed for hyperparameter tuning to further optimize the model's predictive performance. For explanation purposes, SHAP (SHapley Additive exPlanations) was used. SHAP values measure each feature's impact on an individual prediction. Furthermore, interpreting feature relationships was performed using DeepSeek, a large language model trained to verify the biological validity of the features. Contextual explanations regarding the most important genes or pathways were provided by DeepSeek alongside the top SHAP value constituents, supporting the predictability of the model.

Tags

Lung cancer, drug response prediction, XGBoost, SHAP, DeepSeek, interpretability

arXiv Categories

cs.CV cs.AI cs.LG