Predicting Sentinel Lymph Node Involvement in Breast Cancer Patients Using TabNet: A Comprehensive Approach

Abstract

We develop a machine learning model that predicts sentinel lymph node (SLN) involvement in breast cancer patients using the TabNet algorithm. The model relies on preoperative features and outperformed a logistic regression baseline on the test set. The findings suggest that TabNet could help guide treatment decisions and improve patient outcomes.

Introduction

Sentinel lymph node involvement is an important factor in breast cancer diagnosis and treatment. Involvement indicates that cancer cells have spread beyond the primary tumor, which calls for prompt intervention. Traditional methods for predicting SLN involvement often rely on clinical and histopathological evaluations, which can be limited in accuracy and generalizability.

Machine learning offers an alternative that can learn complex relationships between features and outcomes from large datasets. This study applies TabNet, a deep learning algorithm designed for tabular data, to predict SLN involvement.

Methods

We analyzed data from 1832 breast cancer patients who underwent SLN biopsy. The preoperative information comprised patient demographics, tumor characteristics, and core needle biopsy (CNB) results.

The TabNet model was trained, validated, and tested using a 70-30-10 split of the dataset. As a benchmark, we compared its performance with that of logistic regression, a widely used statistical method for binary classification.
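
The partitioning step can be sketched as a stratified split that keeps the proportion of SLN-positive cases similar in each subset. This is a minimal illustration, not the study's protocol: the stated 70-30-10 ratios sum to 110%, so the sketch assumes 70/20/10 instead, and the function name `stratified_split`, the seed, and the synthetic class balance are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.2, 0.1), seed=0):
    """Return (train, val, test) index lists with similar class ratios.

    `fractions` is an ASSUMED 70/20/10 split, not the study's exact protocol.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)

    # Group sample indices by class label so each class is split separately.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Synthetic labels (1 = SLN involved, 0 = not involved). Only the total of
# 1832 patients comes from the text; the class balance here is made up.
labels = [1] * 600 + [0] * 1232
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting per class avoids a test set whose share of positive cases differs sharply from the training set, which matters for metrics such as sensitivity and specificity.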

Results

On the test set, the TabNet model achieved an accuracy of 75%, sensitivity of 78%, specificity of 70%, F1-score of 79%, and AUC of 0.74. The logistic regression model achieved an accuracy of 70%, sensitivity of 75%, specificity of 66%, F1-score of 72%, and AUC of 0.71.

TabNet outperformed the logistic regression model on every reported metric.
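
For reference, the five reported metrics follow from standard definitions. The sketch below computes them on a small toy example, not the study's data; the helper names `classification_metrics` and `roc_auc` and the 0.5 decision threshold are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # recall on SLN-positive cases
        "specificity": tn / (tn + fp),   # recall on SLN-negative cases
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: four positive and four negative cases with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, sensitivity, specificity, and F1 depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric can move independently.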

Discussion

This study is an early effort to predict SLN involvement in breast cancer patients from preoperative tabular data. The TabNet model's performance suggests that it could serve as a useful tool for clinicians, supporting informed treatment decisions and, ultimately, better patient outcomes.

The study also lays the groundwork for more accurate and generalizable models for SLN prediction.

Conclusion

The findings of this study indicate that TabNet is a promising approach for predicting SLN involvement in breast cancer patients. Preoperative prediction of SLN involvement can help clinicians make informed treatment decisions and may improve patient outcomes.

Future research could explore ways to improve the model's performance further, such as custom loss functions and learning rate annealing. Integrating imaging data with the tabular data is also promising and may yield greater accuracy in SLN prediction.
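
The text does not specify which loss or annealing schedule is intended. As one possible reading, the sketch below pairs a cosine learning rate schedule with a binary cross-entropy that weights SLN-positive cases more heavily. The function names, the default learning rates, and the `pos_weight` value are hypothetical choices for illustration only.

```python
import math


def cosine_annealed_lr(step, total_steps, lr_max=0.02, lr_min=1e-4):
    """Cosine annealing: decay smoothly from lr_max (step 0) to lr_min."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def weighted_bce(y_true, p_pred, pos_weight=2.0, eps=1e-12):
    """Mean binary cross-entropy with the positive-class term scaled by
    pos_weight, so missed positive cases cost more than false alarms."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Learning rate at the start, midpoint, and end of a 100-step run.
schedule = [cosine_annealed_lr(s, 100) for s in (0, 50, 100)]

# The same predicted probability is penalized more for a positive case.
loss_pos = weighted_bce([1], [0.5])
loss_neg = weighted_bce([0], [0.5])
print(schedule, loss_pos, loss_neg)
```

Raising `pos_weight` would be expected to trade specificity for sensitivity, so any such change should be judged on the validation set rather than the test set.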

Ultimately, this work aims to advance breast cancer care and reduce the burden of the disease on patients.