Combining Machine Learning Models and Screening to Enhance Suicide Risk Identification for American Indian Patients: Retrospective Cohort Study

Background: American Indian and Alaska Native communities experience disproportionately high suicide rates. While machine learning (ML) models leveraging electronic health records have emerged as promising tools for suicide risk identification, the optimal integration of these models with existing screening practices remains unclear.

Objective: The objective of this study was to compare parallel and serial testing strategies that combine an ML suicide risk model and the Ask Suicide-Screening Questions (ASQ) against using the ASQ alone.

Methods: We conducted a retrospective secondary analysis of electronic health record data. The cohort consisted of adult emergency department visits at an Indian Health Service facility between October 1, 2019, and October 2, 2021. The final sample included 7897 American Indian patients with 26,896 visits, 824 (3.1%) of which had a positive ASQ result and 102 (0.4%) of which had the outcome of suicide attempt or death within 90 days of the visit. The logistic regression ML model previously developed using Indian Health Service–specific data was operationalized at the 95th and 75th percentiles to evaluate high-risk and medium-risk thresholds, respectively. Sensitivity, specificity, predictive values, and 95% CIs were averaged across 10 cross-validated patient-level folds. A sensitivity analysis was performed to evaluate identification approaches across all emergency department visits during this period.
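The difference between the two combination strategies can be illustrated with a minimal sketch. It assumes the usual definitions: parallel testing flags a visit when either the ML model or the ASQ is positive, and serial testing flags a visit only when both are positive. The ten visits, the score cutoff `ML_CUTOFF`, and all function names below are hypothetical and are not taken from the study's data or code.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(flags: list[bool], outcomes: list[bool]) -> Metrics:
    """Compute the four screening metrics from per-visit flags and outcomes."""
    pairs = list(zip(flags, outcomes))
    tp = sum(f and o for f, o in pairs)
    fp = sum(f and not o for f, o in pairs)
    fn = sum(not f and o for f, o in pairs)
    tn = sum(not f and not o for f, o in pairs)
    return Metrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Hypothetical toy data: one entry per visit (NOT study data).
ml_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
asq_positive = [True, False, False, True, False, False, True, False, False, False]
outcomes = [True, True, False, False, False, False, True, False, False, False]

# Hypothetical cutoff standing in for a percentile-based risk threshold.
ML_CUTOFF = 0.65
ml_flag = [score >= ML_CUTOFF for score in ml_scores]

# Parallel testing: positive if EITHER the model or the ASQ is positive.
parallel_flag = [m or a for m, a in zip(ml_flag, asq_positive)]
# Serial testing: positive only if BOTH the model and the ASQ are positive.
serial_flag = [m and a for m, a in zip(ml_flag, asq_positive)]

asq_alone = evaluate(asq_positive, outcomes)
parallel = evaluate(parallel_flag, outcomes)
serial = evaluate(serial_flag, outcomes)
```

On this toy data, the parallel rule can never flag fewer outcome visits than the ASQ alone, and the serial rule can never flag more, which is the trade-off between sensitivity and positive predictive value that the study quantifies.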
Results: The ML medium-risk threshold alone identified the most true positives (sensitivity: 0.782, 95% CI 0.648-0.915; specificity: 0.751, 95% CI 0.725-0.777; positive predictive value [PPV]: 0.012, 95% CI 0.009-0.014; negative predictive value [NPV]: 0.999, 95% CI 0.998-0.999) in comparison to the ML high-risk threshold alone (sensitivity: 0.429, 95% CI 0.287-0.572; specificity: 0.955, 95% CI 0.948-0.961; PPV: 0.035, 95% CI 0.022-0.048; NPV: 0.998, 95% CI 0.997-0.999) or the ASQ alone (sensitivity: 0.178, 95% CI 0.073-0.282; specificity: 0.970, 95% CI 0.968-0.971; PPV: 0.022, 95% CI 0.010-0.034; NPV: 0.997, 95% CI 0.996-0.998). Combining the ML high-risk threshold with the ASQ in series yielded the greatest positive predictive ability (PPV: 0.050, 95% CI 0.014-0.086) at the cost of reduced sensitivity (0.129, 95% CI 0.036-0.221). Finally, the parallel testing approach using the ML medium-risk threshold yielded the greatest sensitivity (0.795, 95% CI 0.671-0.920; specificity: 0.742, 95% CI 0.716-0.767; PPV: 0.012, 95% CI 0.009-0.014; NPV: 0.999, 95% CI 0.998-0.999) without missing any cases identified through screening.

Conclusions: Unlike existing studies that evaluate ML and screening tools in isolation, this study innovates by assessing combined parallel and serial testing strategies in a real-world setting. We demonstrated that, while serial testing maximizes predictive accuracy, it is often infeasible. Instead, parallel testing brings value as a clinical “safety net” to catch at-risk patients missed by standard practices. Ultimately, integrating ML in suicide prevention requires balancing statistical accuracy with setting-specific, real-world workflows.

http://dlvr.it/TSTwnF
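The very low PPVs reported above follow from the low outcome prevalence rather than from poor discrimination. As a rough consistency check, the sketch below recomputes PPV from sensitivity, specificity, and prevalence using Bayes' rule. It assumes a visit-level prevalence of 102/26,896 taken from the reported counts; because the published metrics are averaged across cross-validation folds, the recomputed values are only expected to agree approximately. The function name is hypothetical.

```python
def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV via Bayes' rule: P(outcome | flagged)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Visit-level outcome prevalence from the reported counts (about 0.4%).
prevalence = 102 / 26896

# Point estimates for sensitivity and specificity as reported in the Results.
ppv_ml_medium = ppv_from_rates(0.782, 0.751, prevalence)
ppv_ml_high = ppv_from_rates(0.429, 0.955, prevalence)
ppv_asq = ppv_from_rates(0.178, 0.970, prevalence)
ppv_parallel_medium = ppv_from_rates(0.795, 0.742, prevalence)
```

Under these assumptions the recomputed values land close to the reported PPVs of 0.012, 0.035, 0.022, and 0.012, respectively. The serial strategy is omitted because its specificity is not reported in the abstract.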