In your capacity as the lead physician of a healthcare facility, your team has grappled with enhancing the patient journey during laboratory visits. To address this, your data science unit devised a machine-learning (ML) model that can potentially identify patients likely to have unsatisfactory experiences during their visit to the lab. Your research team describes the model as a proof of concept with a clinical research use case and real-world representative data; the data pipeline is prototype caliber, and intended users were identified from a clinical user-experience perspective. However, the team has not yet begun stakeholder analysis or product-caliber specifications. The team provided you with a model card to communicate their work so far.
The correct answer is MLTRL4.
In this vignette, we introduce two central concepts: the AI lifecycle framework and model cards. These concepts are pivotal for evaluating the maturity of a machine learning (ML) system. Our vignette focuses on an ML system devised within a multifaceted healthcare system, with the expectation that it will be integrated into the prevailing software landscape. To ascertain the maturity of this AI technology, exploring the Machine Learning Technology Readiness Level (MLTRL) framework may be warranted. This framework considers the obligations of an ML system and emphasizes other crucial aspects such as fairness, ethics, reliability, and robustness.
Drawing on experiences with ML systems across diverse sectors, including aerospace, defense, and civil engineering, Lavin et al. delineated processes and testing standards, enabling a structured approach to assessing the maturity of ML systems intended for real-world applications.
The objective of the MLTRL framework is to cultivate and launch ML systems that are robust, reliable, and responsible. This framework streamlines workflows and bolsters communication across teams at various AI system development stages. Its design mirrors the stages of AI product evolution, spanning research and development, productization, and deployment.
For instance, MLTRL4 serves as a proof of concept, showcasing the model's performance on real-world data. At this juncture, however, the ML system has not advanced far enough to meet the requirements of data governance, product standards, code integrity, and regulatory compliance expected at MLTRL5 and MLTRL6. On the other hand, MLTRL1 denotes goal-oriented research, marking the nascent stages of ML model development. MLTRL9 signifies the culmination of the framework, where a deployed ML system is continually monitored and refined.
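To make these level distinctions concrete, the sketch below encodes a few stages as a simple enumeration. The level names, groupings, and the `assess` helper are illustrative assumptions, not part of any published MLTRL tooling.

```python
from enum import IntEnum

class MLTRL(IntEnum):
    """Illustrative subset of MLTRL stages, paraphrased from the framework."""
    GOAL_ORIENTED_RESEARCH = 1   # nascent, research-driven model development
    PROOF_OF_CONCEPT = 4         # demonstrated on real-world representative data
    PRODUCTIZATION = 5           # governance and product standards begin to apply
    DEPLOYED_AND_MONITORED = 9   # deployed system under continuous monitoring

def assess(level: MLTRL) -> str:
    """Hypothetical helper: summarize what a given level implies."""
    if level < MLTRL.PRODUCTIZATION:
        return "Research maturity: not yet held to product, code-quality, or regulatory requirements."
    if level < MLTRL.DEPLOYED_AND_MONITORED:
        return "Productization: data governance, code integrity, and compliance requirements apply."
    return "Deployment: performance is continuously monitored and refined."

print(assess(MLTRL.PROOF_OF_CONCEPT))  # the vignette's system sits here
```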
Our vignette also underscores the importance of the MLTRL model card within the MLTRL framework. These model cards document and elucidate the performance attributes of an ML model, making them vital for communication both within and across functional teams. They serve to relay project updates and to facilitate team-member onboarding. Beyond this, they act as gauges of maturity, helping determine a model's position on the continuum between foundational research and deployment. By fostering communication among stakeholders throughout and beyond the development phase, these cards distill some of the tacit knowledge embedded in the process. We propose that MLTRL cards act as foundational pillars supporting other tools, such as model cards, data cards, and ethical/responsibility parameters.
MLTRL model cards are cohesive, transferable documents that link an ML system with its present context and its maturity journey. The external environment is portrayed via project data, overarching requirements, intended applications, and tacit knowledge. In contrast, the system's intrinsic attributes are conveyed through details on the model algorithm, testing status, potential biases, technical knowledge debt, and MLTRL stage debriefings. An intermediary layer sits between the internal and external domains, emphasizing data-related considerations such as acquisition, sharing, privacy, and ethics.
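A rough, machine-readable rendering of these three layers might look like the sketch below; the schema and field names are hypothetical, chosen only to mirror the external, intermediary, and internal layers just described.

```python
from dataclasses import dataclass, field

@dataclass
class MLTRLCard:
    """Hypothetical card schema mirroring the three layers described above."""
    # External layer: the system's context
    project_data: str = ""
    overarching_requirements: list[str] = field(default_factory=list)
    intended_applications: list[str] = field(default_factory=list)
    tacit_knowledge: str = ""
    # Intermediary layer: data considerations
    data_acquisition: str = ""
    data_sharing_and_privacy: str = ""
    data_ethics: str = ""
    # Internal layer: the system itself
    model_algorithm: str = ""
    testing_status: str = ""
    potential_biases: list[str] = field(default_factory=list)
    technical_knowledge_debt: str = ""
    mltrl_level: int = 0
    stage_debriefs: list[str] = field(default_factory=list)

# The vignette's system, captured at MLTRL4 (all values are illustrative):
card = MLTRLCard(
    project_data="Patient lab-visit experience model",
    intended_applications=["flag patients at risk of an unsatisfactory lab visit"],
    model_algorithm="supervised classifier (placeholder)",
    mltrl_level=4,
)
```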
The MLTRL model cards are not merely standalone documents but community resources that complement tools for documentation and data provenance. For instance, our choice was to adopt the datasheets for datasets proposed by Gebru et al., given their transparency and ease of implementation. Additionally, MLTRL model cards can be synchronized with data cards using DVC (Data Version Control), reinforcing best practices in ML system development.
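As a minimal sketch of that synchronization, the snippet below retrieves a dataset at the exact revision recorded on a card, using DVC's Python API; the repository URL, file path, and tag are placeholders, not real resources.

```python
import dvc.api

# Read the dataset exactly as it existed when the MLTRL card was last
# updated, assuming the card records a Git tag for the DVC-tracked repo.
data = dvc.api.read(
    path="data/lab_visits.csv",                               # placeholder path
    repo="https://github.com/example/patient-experience-ml",  # placeholder repo
    rev="mltrl-4-signoff",                                    # placeholder tag
)
print(data[:200])  # first 200 characters of the versioned file
```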
Lastly, MLTRL model cards champion transparency and thoroughness, especially in AI clinical trials. Researchers can harness these cards to craft clinical trial protocols that align seamlessly with frameworks such as SPIRIT-AI and CONSORT-AI. Moreover, these model cards pinpoint potential biases within an ML system, an aspect not inherently addressed in AI clinical trials. Effective reporting in AI trials commences with comprehensive documentation of ML system development. SPIRIT-AI clinical trial protocols, for example, can be augmented using MLTRL cards, enriching sections such as the title, background, interventions, and potential harms; a sketch of one possible mapping follows.
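The crosswalk below is an illustrative assumption, not a published correspondence: it simply pairs a few SPIRIT-AI protocol sections with the hypothetical card fields defined earlier.

```python
# Illustrative crosswalk (assumed, not published): which hypothetical
# MLTRLCard fields could inform which SPIRIT-AI protocol sections.
spirit_ai_crosswalk = {
    "Title": ["intended_applications"],
    "Background and rationale": ["project_data", "overarching_requirements"],
    "Interventions": ["model_algorithm", "testing_status", "mltrl_level"],
    "Harms": ["potential_biases", "data_ethics"],
}

for section, fields in spirit_ai_crosswalk.items():
    print(f"{section}: informed by card fields {fields}")
```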
EzzAddin Al Wahsh, M.D., M.B.A.
Fellow, Clinical Informatics
Mayo Clinic
Christopher Garcia, M.D.
Senior Associate Consultant, Computational Pathology and AI
Mayo Clinic