Purpose Primary central nervous system lymphoma (PCNSL) is a rare, aggressive form of extranodal non-Hodgkin lymphoma. Predicting overall survival (OS) in advance is of utmost importance because it can aid clinical decision-making. Although radiomics-based machine learning (ML) has demonstrated promising performance in PCNSL, it requires extensive manual feature extraction from magnetic resonance images beforehand. Deep learning (DL) overcomes this limitation.

Methods In this paper, we tailored a 3D ResNet to predict the OS of patients with PCNSL. To overcome the limitation of data sparsity, we introduced data augmentation and transfer learning, and we evaluated the results using r stratified k-fold cross-validation. To explain the results of our model, gradient-weighted class activation mapping (Grad-CAM) was applied.

Results We obtained the best performance on post-contrast T1-weighted (T1Gd) images, with standard errors in parentheses: area under the curve = 0.81 (0.03), accuracy = 0.87 (0.07), precision = 0.88 (0.07), recall = 0.88 (0.07), and F1-score = 0.87 (0.07). We also compared our model with ML-based models trained on clinical data and radiomics data, respectively, which further confirmed the stability of our model. In addition, we observed that PCNSL is a whole-brain disease and that, in cases where the OS is less than 1 year, it is more difficult to distinguish the tumor boundary from normal brain tissue, which is consistent with the clinical outcome.

Conclusions All these findings indicate that T1Gd can improve prognosis predictions for patients with PCNSL. To the best of our knowledge, this is the first use of DL to explain model patterns in OS classification of patients with PCNSL. Future work will involve collecting more data from patients with PCNSL, or conducting additional retrospective studies on different patient populations with rare diseases, to further promote the clinical role of our model.
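
The evaluation protocol named in Methods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "r stratified k-fold cross-validation" means repeating a stratified k-fold split r times with different shuffles, and that the reported standard error is the standard deviation of the per-fold scores divided by the square root of their count. The abstract states neither point explicitly. The function names `repeated_stratified_kfold` and `mean_and_standard_error` are hypothetical, and only the Python standard library is used.

```python
import math
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, k=5, r=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx), keeping class ratios per fold.

    Assumption: r is the number of repeats and k the number of folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    for rep in range(r):
        folds = [[] for _ in range(k)]
        pos = 0  # running counter so fold sizes stay balanced across classes
        for y in sorted(by_class):
            members = by_class[y][:]
            rng.shuffle(members)
            # Deal each class round-robin so every fold gets a similar share.
            for idx in members:
                folds[pos % k].append(idx)
                pos += 1
        for f in range(k):
            test_idx = sorted(folds[f])
            train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
            yield rep, f, train_idx, test_idx


def mean_and_standard_error(scores):
    """Return the mean and the standard error of the mean of a score list."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return mean, math.sqrt(variance / n)


if __name__ == "__main__":
    # Toy binary labels (e.g. OS below vs. above a threshold); not study data.
    toy_labels = [0] * 10 + [1] * 10
    for rep, fold, train_idx, test_idx in repeated_stratified_kfold(toy_labels, k=5, r=2):
        positives = sum(toy_labels[i] for i in test_idx)
        print(rep, fold, len(train_idx), len(test_idx), positives)
```

In practice, a model would be trained on `train_idx` and scored on `test_idx` for each split, and the r × k scores for each metric would be passed to `mean_and_standard_error` to produce values in the "mean (standard error)" form used in Results.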