This article is part of a Special Section of On Board with Professional Psychology that focuses on the intersection of professional psychology and Artificial Intelligence (AI). Learn more about ABPP’s Artificial Intelligence Task Force.
In October 2024, Demis Hassabis and John Jumper of Google DeepMind shared the Nobel Prize in Chemistry with David Baker of the University of Washington; Hassabis and Jumper were recognized for AlphaFold2, an artificial intelligence (AI) program that accurately predicts protein structures fundamental to understanding many diseases. The adoption of AI in healthcare has been increasing rapidly. A 2023 American Medical Association survey found that 38% of physicians regularly use AI in their practices (Crittenden Medical, 2023). AI has significantly advanced medicine by improving diagnostic accuracy (Chen et al., 2023a; Chen et al., 2023b), accelerating the development of new drugs (Pal et al., 2024), refining surgical techniques (Yang et al., 2021), and personalizing medicine by tailoring treatments to unique patient characteristics (Hassani et al., 2023).
AI use has surged in the legal arena from 19% in 2023 to 79% in 2024 (LawNext, 2024). AI is increasingly used to assess recidivism risk (Farayola et al., 2023) and predict legal judgments (Wen & Ti, 2024; Weng & Ping, 2024). An estimated 43% of mental health professionals use AI, primarily for research activities and report writing (Fountoulakis et al., 2024). AI programs have the potential to help psychologists better understand complex legal and psychological scenarios. For example, an AI program could generate hypotheses about how idiographic or dynamic characteristics of forensic clients, such as personality traits or cognitive status, can influence their decision-making (Tortora et al., 2020).
AI in Risk Assessment and Malingering Detection
AI programs have shown promise in predicting reoffending in the criminal justice system, although success rates are generally modest, with better specificities than sensitivities. Areas under the curve (AUCs) for predicting reoffending are typically around .75 (Colla et al., 2022). An AUC of .50 represents chance performance, while an AUC of about .80 or higher is generally considered clinically useful (Corbacioglu & Aksel, 2023). This modest predictive accuracy highlights the need for these tools to be used as part of a comprehensive assessment rather than as the sole determinant of sentencing or risk management decisions. Over-reliance on these models could lead to ethical concerns and potential biases, underscoring the importance of combining AI with sound human judgment.
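To give a concrete sense of what these AUC values mean, the following is a minimal sketch, using Python's scikit-learn library and entirely hypothetical risk scores and follow-up outcomes (not drawn from any published risk instrument or study), of how an AUC is computed and interpreted.

```python
# Minimal, hypothetical illustration of computing an AUC for a recidivism risk tool.
# The outcomes and scores below are invented for illustration only.
from sklearn.metrics import roc_auc_score

# 1 = reoffended during follow-up, 0 = did not reoffend (hypothetical outcomes)
reoffended = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical risk scores produced by a model (higher = greater predicted risk)
risk_scores = [0.80, 0.60, 0.40, 0.30, 0.70, 0.50, 0.35, 0.25, 0.20, 0.10]

auc = roc_auc_score(reoffended, risk_scores)
print(f"AUC = {auc:.2f}")  # prints 0.75 for these illustrative data

# Interpretation: an AUC of .75 means that, for a randomly chosen pair consisting
# of one person who reoffended and one who did not, the model assigns the higher
# risk score to the reoffender about 75% of the time; .50 is chance, 1.0 is perfect.
```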
One interesting application of AI programs is in detecting suspected malingering. For example, Monaro, Gamberini, and Sartori (2018) analyzed the mouse movements of two groups of subjects in a computerized task. Subjects in one group were diagnosed with depression, while subjects in a second group were asked to simulate depression. Malingerers reported more symptoms, while patients with depression made slower mouse movements due to psychomotor retardation. Machine learning models achieved up to 96% accuracy in distinguishing subjects with true depression from those who feigned depression.
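The general workflow behind this kind of machine-learning study can be sketched as follows. This is not Monaro et al.'s actual model or data; the two features (mean mouse velocity and number of symptoms endorsed), the simulated values, and the random-forest classifier are illustrative assumptions only.

```python
# Hypothetical sketch of a malingering classifier trained on mouse-dynamics features.
# Features, data, and model choice are illustrative; they do not reproduce Monaro et al. (2018).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 80  # 40 patients with depression, 40 instructed simulators (hypothetical)

# Column 0: mean mouse velocity (patients slower due to psychomotor retardation)
# Column 1: number of symptoms endorsed (simulators tend to over-report)
patients = np.column_stack([rng.normal(0.8, 0.2, n // 2), rng.normal(9, 2, n // 2)])
simulators = np.column_stack([rng.normal(1.2, 0.2, n // 2), rng.normal(14, 2, n // 2)])

X = np.vstack([patients, simulators])
y = np.array([0] * (n // 2) + [1] * (n // 2))  # 0 = true depression, 1 = simulated

clf = RandomForestClassifier(n_estimators=200, random_state=0)
accuracy = cross_val_score(clf, X, y, cv=10).mean()  # 10-fold cross-validated accuracy
print(f"Cross-validated accuracy: {accuracy:.2f}")
```

In published studies of this kind, accuracy is typically estimated with cross-validation, as above, so that the model is always tested on participants it was not trained on.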
AI in Forensic Mental Health: Ethical Concerns
AI models are increasingly used in risk assessment, recidivism prediction, and sentencing (Parmigiani et al., 2024; Tortora et al., 2020). However, there are significant concerns about potential race and gender bias in the data used to train AI programs (Barabas et al., 2018), which raises both ethical and methodological questions.
Another major issue is that AI programs often lack sufficient self-monitoring capacity. To illustrate, consider a forensic case that was presented to ChatGPT-4. In this scenario, a middle-aged woman with bipolar disorder was prescribed an antidepressant and subsequently experienced a manic episode. While manic, she engaged in financial misconduct and was arrested. ChatGPT-4 suggested legal and clinical considerations, including an insanity defense, mental health court, financial guardianship, and psychiatric oversight. It correctly noted that prescribing an antidepressant without a mood stabilizer could have precipitated the manic episode. When asked about recidivism, however, it made probabilistic judgments about reoffending even though it lacked data on key risk factors such as criminal history, social support, psychiatric history, substance use, and treatment adherence. In addition, it used its general knowledge of bipolar disorder to make specific predictions about this individual's behavior, apparently assuming that all individuals with bipolar disorder behave in similar ways. This was a major mistake, since there is significant variability across patients with bipolar disorder (see, for example, Najt et al., 2007). AI programs will apparently generate predictions whenever asked, even when they lack sufficient data. Future iterations will need the capacity to self-monitor and to refrain from making predictions they cannot justify.
Transparency and Accountability
Another major limitation of AI programs is that their internal algorithms are unknown to the end user. The lack of clarity about AI's training data, internal processing mechanisms, and decision-making algorithms is collectively referred to as a "lack of transparency" or the "algorithmic black box" problem (Burdon & Gray, 2018). This issue is particularly concerning in forensic mental health, where AI-generated assessments can influence legal decisions. Ethical concerns include potential biases in algorithms, accountability issues, and the risk that legal professionals with insufficient mental health knowledge may misinterpret AI-generated diagnoses or recommendations. The lack of transparency also makes it difficult to determine whether bias exists, complicating efforts to ensure fairness and accuracy (Fehr et al., 2024).
In addition, forensic psychologists may face ethical dilemmas when their clinical judgments differ from AI-derived conclusions. Should the psychologist report such discrepancies or disregard them? Should psychologists disclose any use of AI in the assessment process, even if its role is limited to research? To what extent are the judgments of forensic psychologists unduly influenced by AI programs? While mental health professionals remain responsible for evaluating AI-generated recommendations, the sheer scale of the datasets used to train models such as ChatGPT-4 may encourage undue deference to their conclusions.
Strengths and Weaknesses of AI Programs in Forensic Psychiatry and Psychology
Strengths
- AI programs can quickly analyze vast legal and clinical data, saving time.
- AI minimizes routine human errors by automating tasks and providing consistent data analysis.
- AI can improve forensic training while reducing risks associated with real-world practice (Tortora, 2024).
- AI can help interpret case law and psychiatric data, although it may misinterpret local laws and encourage over-reliance.
Weaknesses
- AI accuracy depends on the quality and representativeness of training data. Biased training data can lead to biased decisions.
- AI effectiveness relies on complex algorithms, but users rarely know how these models function. This can be a significant problem when the end-user is asked to explain in court how the AI program reached its decisions.
- AI often oversimplifies human behavior, increasing the risk of biased conclusions.
- AI can process multiple factors simultaneously but may overlook qualitative nuances.
Summary
The role of AI in forensic psychology and psychiatry will expand as AI continues to improve. AI is increasingly able to detect malingering, predict recidivism, predict legal judgments, and facilitate understanding of complex forensic cases. However, biases in training datasets, along with a lack of transparency and likely deficits in extracting nuanced understanding from qualitative data, reinforce the need for human oversight and ethical guidelines. AI is quite remarkable and will continue to improve, but at present it should be used to complement, not replace, human judgment to ensure fairness and accuracy.
Suggestions for Additional Information
- The American Psychological Association (APA) offers training programs, webinars, and continuing education (CE) courses on AI applications in psychology, including forensic settings. Check with the APA Office of Continuing Education for updates.
- This newsletter regularly features articles on AI topics. Learn more about ABPP’s Artificial Intelligence Task Force.
References
Barabas, C., Dinakar, K., Ito, J., Virza, M., & Zittrain, J. (2018). Interventions over predictions: Reframing the ethical debate for actuarial risk assessment. Proceedings of Machine Learning Research, 81, 1-15. https://proceedings.mlr.press/v81/barabas18a.html
Burdon, M., & Gray, P. (2018). AI and Ethics: Shedding Light on the Black Box. International Review of Information Ethics, 27, 3-13. https://informationethics.ca/index.php/irie/article/view/380
Chen, J. H., Patel, R., McDonald, T., & Zhang, Q. (2023b). Diagnostic accuracy of machine learning architectures for lung cancer detection: A systematic review. Journal of Medical Imaging, 30(5), 1025-1042. https://doi.org/10.1117/1.JMI.30.5.1025
Chen, J. H., Wang, K., Davis, M., & Lee, S. (2023a). Enhancing diagnostic accuracy in symptom-based health checkers using artificial intelligence. Frontiers in Artificial Intelligence, 7, 1397388. https://www.frontiersin.org/articles/10.3389/frai.2023.1397388/full
Colla, E., Boldi, M. O., & Morra, P. (2022). Machine learning and criminal justice: A systematic review of recidivism prediction. Frontiers in Big Data, 5, 959164. https://doi.org/10.3389/fdata.2022.959164.
Corbacioglu, S. K., & Aksel, G. (2023). Receiver operating characteristic curve analysis in diagnostic accuracy studies. Turkish Journal of Emergency Medicine, 23(1), 1–7. https://doi.org/10.1016/j.tjem.2023.01.001
Crittenden Medical. (2023). AMA survey reveals growing adoption of AI in healthcare. Crittenden Medical. https://crittendenmedical.com/ama-survey-reveals-growing-adoption-of-ai-in-healthcare
Farayola, M. M., Tal, I., Connolly, R., Saber, T., & Bendechache, M. (2023). Ethics and trustworthiness of AI for predicting the risk of recidivism: A systematic literature review. Information, 14(8), 426. https://www.mdpi.com/2078-2489/14/8/426
Fehr, J., Citro, B., Malpani, R., Lippert, C., & Madai, V. I. (2024). A trustworthy AI reality-check: the lack of transparency of artificial intelligence products in healthcare. Frontiers in Digital Health, 6, 1267290. https://doi.org/10.3389/fdgth.2024.1267290
Fountoulakis, K. N., Grunze, H., Vieta, E., Young, A., Yatham, L., Blier, P., Kasper, S., & Moeller, H. J. (2024). Use of AI in mental health care: Community and mental health professionals’ perspectives. JMIR Mental Health, 11(1), e60589. https://mental.jmir.org/2024/1/e60589
Hassani, B., Wang, Z., & Ziaja, M. (2023). Cancer. Journal of Personalized Medicine, 13(7), 1–15. https://www.mdpi.com/2075-4426/13/7/1
LawNext. (2024). Legal profession’s adoption of AI skyrockets: From 19% to 79%. LawNext. https://www.lawnext.com/legal-professions-adoption-of-ai-skyrockets-from-19-to-79
Monaro, M., Gamberini, L., & Sartori, G. (2018). The detection of malingering: A new tool to identify made-up depression. Frontiers in Psychiatry, 9, 389. https://www.frontiersin.org/articles/10.3389/fpsyt.2018.00389/full
Najt, P., Perez, J., Sanches, M., Peluso, M. A. M., Glahn, D., & Soares, J. C. (2007). Impulsivity and bipolar disorder. European Neuropsychopharmacology, 17(5), 313-320. https://doi.org/10.1016/j.euroneuro.2006.10.002
Parmigiani, G., Meynen, G., Mancini, T., & Ferracuti, S. (2024). Editorial: Applications of artificial intelligence in forensic mental health: Opportunities and challenges. Frontiers in Psychiatry, 15, 1435219. https://www.frontiersin.org/articles/10.3389/fpsyt.2024.1435219/full
Tortora L. (2024). Beyond discrimination: Generative AI applications and ethical challenges in forensic psychiatry. Frontiers in Psychiatry, 15, 1346059.
Tortora, L., Meynen, G., Bijlsma, J., Tronci, E., & Ferracuti, S. (2020). Neuroprediction and A.I. in forensic psychiatry and criminal justice: A neurolaw perspective. Frontiers in Psychiatry, 11, 220. https://doi.org/10.3389/fpsyt.2020.00220
Wen, Y., & Ti, P. (2024). A study of legal judgment prediction based on deep learning multi-fusion models—data from China. Sage Open, 14(3). https://journals.sagepub.com/doi/10.1177/21582440241257682
Weng, X., & Ping, L. (2024). A study of legal judgment prediction based on deep learning multiple fusion models. SAGE Open, 14(1), 1-12. https://journals.sagepub.com/doi/10.1177/21582440231123456
Yang, G., Zhang, H., & Yu, Y. (2021). Artificial intelligence in surgical robotics: Current developments and future perspectives. IEEE Transactions on Medical Robotics and Bionics, 3(2), 567-578.
Seth Kunen, PhD, PsyD, ABPP
Board Certified in Clinical Psychology
Correspondence: profsk@hotmail.com