Abstract
This narrative mini-review defines the concepts of clinical phenotypes, molecular endotypes, and therapeutic subtypes (theratypes) in osteoarthritis (OA). It explores the conceptual transition from a disease that, until recently, was incorrectly considered homogeneous, to a heterogeneous disorder with broadly defined clinical phenotypes (observable traits such as malalignment, joint instability, or metabolic dysfunction), and on to molecular endotypes that represent the specific biological pathways driving disease pathogenesis. Studies that incorporate neo-epitope biomarkers of disease activity, multiplex measurements of pro-inflammatory cytokines, and ‘omics’ technologies into large datasets are not yet the gold standard but, if widely implemented, they will allow increased use of machine learning to cluster large OA populations into subgroups with common endotype drivers. This approach is essential for exploring the underlying molecular pathways and for developing future digital biomarkers and algorithms that can identify distinct therapeutic subtypes characterised by specific, predictable treatment responses that may even be modelled in silico. Identifying these molecular drivers within phenotypic clusters is important for advancing OA clinical trials because it allows for phenotypic enrichment in clinical studies of drugs and biological agents with well-known mechanisms of action, increasing the likelihood of demonstrating efficacy for future disease-modifying interventions.
Key Points
1. This review discusses osteoarthritis as a highly heterogeneous joint disorder characterised by distinct clinical phenotypes and molecular endotypes.
2. The authors propose that identifying endotypes and biological drivers using multi-modal datasets is a high priority and can be achieved by incorporating neo-epitope biomarkers, cytokine profiling, and ‘omics’ technologies in clinical trials.
3. Implementing advanced clustering approaches and machine learning algorithms on these multidimensional datasets opens opportunities for clinical trial patient stratification and phenotypic enrichment to optimise the evaluation and efficacy assessment of targeted disease-modifying interventions.
INTRODUCTION
Osteoarthritis (OA) is the most common form of arthritis and a leading cause of pain, disability, and healthcare costs.1,2 OA currently affects almost 600 million people globally, representing over 7% of the world’s population.3 It accounts for more than 10% of the global burden of disease and is a major contributor to chronic pain and disability.4 The prevalence of OA is projected to reach nearly one billion people by 2050, driven largely by shifting demographics such as ageing populations5 and the link to obesity, and cardiovascular and metabolic diseases.2,5,6
OA is a chronic, progressive musculoskeletal disorder characterised by insidious molecular alterations that precede the degradation of articular cartilage, synovial inflammation, and remodelling of subchondral bone.7 Historically, OA was considered a simple age-related ‘wear-and-tear’ phenomenon.8 However, it has been redefined as a highly heterogeneous, whole-joint disease, involving complex interactions between metabolic, mechanical, and inflammatory pathways driven by systemic and synovial factors.9-12 Despite its high and rising prevalence as a serious disease and its economic impact on healthcare systems, a significant gap exists in our current understanding of its molecular pathogenesis, which partly explains the limited options available for its clinical management.13 Current therapeutic approaches, which largely include non-steroidal anti-inflammatory drugs and corticosteroids, often provide only temporary symptomatic relief without addressing the underlying causes of joint degeneration.14 This leaves many patients and healthcare systems with an unmet medical need. This gap is further compounded by the phenotypic heterogeneity of OA, characterised by various clinical phenotypes and molecular endotypes that require more interdisciplinary, integrated, and personalised approaches for the development of disease-modifying interventions.15,16
This narrative mini-review article explores recent advances in understanding the endophenotypes of OA. The shift from viewing OA as a wear-and-tear disease to a complex, heterogeneous disorder driven by intrinsic and extrinsic factors has necessitated a conceptual transition from developing futile ‘one-size-fits-all’ approaches to more personalised treatments, facilitated by identification, validation, and characterisation of clinical phenotypic clusters and molecular (mechanistic) clusters, also known as endophenotypes.
DEFINITIONS AND DETERMINATION OF DISEASE DRIVERS
The heterogeneity of OA is one of the main reasons why many clinical trials of novel treatments have failed. The clinical diversity of the patient population has not been a key consideration in recent clinical trials. The aim of current research in this area is to identify OA subsets to improve trial design and clinical outcomes. The aim is not translation of biomarkers into clinical decision-making, which many clinicians assume (and wish for). Instead, the effort is focused on developing tools and novel computational approaches that can improve OA clinical trial outcomes. However, an important pre-requisite is defining phenotypes and differentiating between the purely observable characteristics and/or ‘endotypes’ that represent the molecular drivers and specific biological pathways. Therefore, getting the terminology right is crucially important for advancing this field.17
Clinical phenotypes of OA can be characterised by purely clinically observable patient traits from population studies and OA initiatives (e.g., the Osteoarthritis Initiative [OAI], Amsterdam Osteoarthritis [AMS-OA], and other cohorts), such as joint instability, systemic metabolic dysfunction (e.g., obesity, diabesity), or psychological factors that drive chronic pain sensitisation.18-22 Chronification and amplification of pain are the symptomatic hallmarks of OA. Understanding the complexity of pain in OA, and predicting which patients are at risk of not responding to standard OA pain therapy, are essential for improving pain management.23 In this context, pain phenotyping is a largely parallel but necessary exercise. A detailed discussion of pain phenotyping is beyond the scope of this article; readers are referred to a recent article in which we proposed a spectrum of pain complexity in patients with OA, with high-complexity subtypes representing the most difficult-to-treat patients.24
The following text includes the most relevant definitions of clinical phenotypes, molecular endotypes, and therapeutic subtypes (theratypes) of OA (see Figure 1 for further details).25,26

Figure 1: Interactions that determine phenotypes, endotypes, and theratypes and the omics technologies for their interrogation and validation.25,36
Phenotypes (the clinical presentation of a disease) are influenced by life events, ageing, and interactions between genes and the exposome. Co-morbidities also influence clinical phenotypes and can drive some pathogenic phenotypes. The observable characteristics that define clinical phenotypes result from a combination of hereditary and environmental influences but do not include molecular and genetic aspects. Endotypes are defined by specific molecular or pathophysiological mechanisms, differentiating them from other variants of the same clinical condition. The application of all the relevant omics is shown on the right-hand side.
Adapted from Mobasheri et al.25 and Welsing et al.36
Clinical Phenotype
From a purely clinical perspective, a phenotype is defined as any clinically observable characteristic or trait of a disease. In a primary healthcare setting, a clinical phenotype is defined by externally observable morphological and behavioural characteristics. Biochemical, physiological, and molecular properties are not included in this purely clinical phenotypic context. This implies that mechanistic aspects are distinct and are not part of a clinical phenotype. A clinical phenotype is simply the presentation of a disease in a given individual from a purely phenotypic perspective. The observable characteristics that define clinical phenotypes result from a combination of hereditary and environmental influences, but these do not include the molecular and genetic aspects.27
Molecular Endotype
Molecular endotypes are distinct mechanistic pathways that explain variable clinical phenotypes. They are measures and features derived from molecular, cellular, immunological, genetic, and genomic analyses that provide insight into the underlying mechanisms that promote disease pathogenesis and drive pathological progression.28
Theratype
Theratypes are distinct therapeutic subtypes. A theratype essentially represents a distinct molecular endotype that responds to a small number of targeted therapeutics. Many new molecular endotypes and theratypes of disease are being revealed by omic approaches and the emerging theratypes are being defined by the specificity of treatment responses in subtypes of patients. The term ‘theratype’ was not coined by any researcher in the field of OA. It was originally described as a means to group variants of cystic fibrosis transmembrane conductance regulator (CFTR) protein in the lung according to their interaction with drugs. However, it is becoming clear that this term can be extended across other areas of medicine to define therapeutic subtypes.29
Phenotyping and clustering patients from a purely clinical practice perspective are important, but they are not the highest priorities. Rather, they are starting points for next-generation clinical trials.30 These surface-level clinical clusters often mask a diverse range of underlying molecular endotypes, in which distinct biological pathways drive the structural damage that is visible on imaging (e.g., X-ray radiography and MRI). There has been a recent call for screening MRI as a tool for enhancing OA clinical trials.31 Nevertheless, identifying the specific molecular drivers within these phenotypic clusters is essential for developing effective pharmacological and biological therapeutics that can help the field move beyond the ‘one-size-fits-all’ treatment model that has hampered progress and impeded the development of effective treatments. By identifying molecular biomarkers and mapping underlying biological mechanisms to distinct clinical presentations, we can develop more targeted therapies that address the specific pathophysiology of each patient subgroup, cluster, or phenotype, ultimately increasing the success rate of clinical trials.
This same endophenotyping approach can also incorporate the idea of sex differences and hormonal influences in OA as emerging phenotypes and leverage the potential for developing future therapeutic approaches that employ sex hormones and their derivatives for specific subsets of the population.32-34
Pathophysiology and Endophenotypes of OA
OA was previously conceptualised as an inevitable ‘wear and tear’ disease of old age.9 This view persisted for decades and effectively inhibited therapeutic development and significant progress in the field. It also prevented multidisciplinary collaboration, especially between the disciplines of orthopaedics, rheumatology, immunology, and metabolic research. More recently, based on molecular, cellular, preclinical, and clinical data, OA has been redefined as a highly heterogeneous and complex disorder of the entire synovial joint.35-37 It is increasingly defined as a low-grade, chronic mechano-inflammatory disease characterised by a diverse range of clinical phenotypes and molecular endotypes.25,26,38,39 There is also emerging evidence for a link between the gut microbiome, intestinal dysbiosis, and OA pain, connecting the synovial joint with the central and peripheral nervous systems.40-43 The enormous diversity and clinical heterogeneity mean that there is no single natural history for OA; instead, the disease manifests through multiple aetiologies, including ageing, obesity, and joint injury, which drive distinct pathological pathways.1,42
Pathophysiologically, disease progression is linked to the disruption of cartilage homeostasis, in which chondrocytes fail to maintain equilibrium between extracellular matrix synthesis and catabolism.44 Age-related changes induce mitochondrial dysfunction and cellular senescence, prompting chondrocytes to adopt a senescence-associated secretory phenotype.45-47 This shift accelerates the production of reactive oxygen species (ROS)48 and triggers low-grade, chronic inflammation within the synovium.12 Key inflammatory mediators, notably IL-1β, TNF-α, and IL-6, activate downstream signal transduction pathways, such as the nuclear factor kappa-light-chain-enhancer of activated B cells (NFκB) pathway, innate-immune driven synovitis, and the crystal-activated nucleotide-binding oligomerisation domain-like receptor family pyrin domain containing 3 (NLRP3) inflammasome.49-52 The activation of these molecular cascades upregulates the expression of destructive metalloproteinases and aggrecanases, promoting cartilage catabolism by matrix metalloproteinase-13 (MMP-13) and a disintegrin and metalloproteinase with thrombospondin motifs 5 (ADAMTS-5), while actively suppressing the growth factors and anabolic mediators (e.g., TGF-β) that regulate the synthesis of extracellular matrix macromolecules. Together, these changes ultimately drive the progressive structural breakdown of cartilage, subchondral bone remodelling, and osteophyte formation.53-56
Biomarkers for Endophenotypic Identification
The exploration of OA endophenotypes and clusters relies heavily on the identification and validation of urinary and serum biochemical markers that reflect specific pathological processes within the joint.38,57,58 While traditional radiographic measures like the Kellgren-Lawrence grade provide a snapshot of structural joint damage, they fail to capture current ‘disease activity’.59 The use of high-throughput ‘omics’ technologies to achieve ‘deep phenotyping’ is an emerging area of investigation, allowing for the identification of molecular signatures that precede structural changes.36 For instance, markers of Type II collagen degradation (e.g., C-terminal telopeptide of Type II collagen [CTX-II]) and synovial inflammation (e.g., C-reactive protein [CRP]) are being used to cluster patients into ‘high-progressor’ or ‘inflammatory’ subgroups.38 Pro-inflammatory cytokines are important for driving disease pathology in OA, but measuring them in clinical studies using currently available multiplex approaches may not necessarily assist with endophenotyping or provide a stable reflection of dynamic changes in endophenotype profiles over time.60 Consequently, it is important to combine cytokine measurements with neo-epitope biochemical marker-driven approaches for clinical trial stratification and for defining the molecular taxonomy of OA. Ultimately, digital biomarkers and digital algorithms derived from big data studies are needed to move beyond subjective clinical observations to stratified patient selection for clinical trials and for the development of multimodal and multidisciplinary clinical management strategies.61,62
Clinical Implications and Future Directions
The transition from a ‘one-size-fits-all’ approach to personalised medicine in OA is contingent upon matching specific therapies to identified endophenotypes and establishing clearer theratype-to-endotype connections. In the context of clinical trials, the failure of many disease-modifying OA drugs can be attributed to the inclusion of heterogeneous patient populations in which the mechanism of action of the drug being investigated did not align with the primary disease driver in the recruited population. By using phenotypes as inclusion or exclusion criteria, a process known as phenotypic enrichment, we can increase the likelihood of demonstrating drug efficacy. For example, a senotherapeutic agent that targets cellular senescence may fail in a post-traumatic cluster driven primarily by mechanical instability. Similarly, an anti-inflammatory intervention (e.g., one that targets IL-1β signalling) may fail in a cluster that is not driven by inflammation at all. Besides improving the likelihood of detecting efficacy, such endotype-matched enrichment may also reduce outcome variability: aligning the drug's mode of action with the included or excluded endotypes leads to a lower standard deviation and, in turn, to smaller sample sizes being required to demonstrate statistically significant treatment effects.63
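The link between outcome variability and sample size can be illustrated with the standard normal-approximation formula for a two-arm parallel trial with a continuous outcome, in which the number of participants per arm is 2(z(1-alpha/2) + z(power))^2 x SD^2 / delta^2. The following is a minimal sketch only and is not taken from the cited study: the function name, the assumed treatment effect, and the two standard deviations are hypothetical values chosen purely for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(sd, delta, alpha=0.05, power=0.80):
    """Participants per arm for a two-arm trial with a continuous outcome.

    Normal approximation:
        n = 2 * (z_(1 - alpha/2) + z_power)^2 * sd^2 / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)


# Hypothetical inputs: identical expected treatment effect, different outcome SD
DELTA = 10.0          # assumed between-arm difference in the outcome score
SD_UNSELECTED = 25.0  # assumed SD in a heterogeneous, unselected population
SD_ENRICHED = 18.0    # assumed SD after endotype-matched enrichment

n_unselected = n_per_arm(SD_UNSELECTED, DELTA)
n_enriched = n_per_arm(SD_ENRICHED, DELTA)
print(n_unselected, n_enriched)
```

Because the required number of participants scales with the square of the standard deviation, even a modest reduction in variability has a disproportionate effect: with these assumed values the requirement per arm falls by roughly half.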
Given the challenges of using standard multiplex cytokine assays to measure dynamic changes in ‘disease activity’ and in endophenotype, more work is needed to determine how this gap can be bridged with surrogate markers or alternative sampling methods (e.g., synovial micro-biopsies instead of synovial fluid) to make the inflammatory endotype a viable target for drug development.
Digital Health, Machine Learning, and AI
Emerging digital health tools and AI offer a powerful means to integrate multi-modal data for more accurate patient stratification. However, translation of these tools into routine patient stratification in a clinical setting has not yet occurred, although proof-of-concept efforts have been undertaken. Machine learning algorithms can process vast datasets, including clinical symptoms, imaging, and biochemical markers, to reveal latent ‘clusters’ that are not visible through standard clinical assessment. For instance, within the Innovative Medicines Initiative-Applied Public-Private Research enabling OsteoArthritis Clinical Headway (IMI-APPROACH) knee OA cohort, machine learning has been used to rank patients according to their likelihood of progression in terms of structure and/or pain, as a strategy to select a trial population enriched for progressors.64,65 In the same setting, unsupervised clustering of biochemical markers has aided identification of OA endotype clusters.38 However, the diversity of trial datasets and the use of different outcome measures make it very difficult to integrate these studies to identify clusters in a much larger population. These AI-driven approaches are increasingly used to define theratypes characterised by their specific response to a treatment. Such work is part of Patient Relevant Osteoarthritis endpoints using Big data Evaluation (PROBE), a recently launched public-private partnership funded by the European Commission’s Innovative Health Initiative (IHI). Integrating these technologies into clinical practice is a long way off but, for now, it provides researchers with projects to pursue. Providing a complete and logically sound future direction is challenging at this point in time.
However, methodological improvements and consortium-led solutions (e.g., development of federated learning in IHI-PROBE and establishment of standardised core outcome sets) will be required to harmonise these diverse multimodal datasets (some with missing data and need for imputation) for future computational, modelling, and AI applications.
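The unsupervised clustering of biochemical markers mentioned in this section can be sketched in a few lines. This is an illustrative toy example under stated assumptions and not the method used in IMI-APPROACH or any cited study: the marker values are invented, the two markers are hypothetical stand-ins for a cartilage-degradation marker and an inflammation marker in arbitrary units, and plain k-means with a deterministic farthest-point initialisation is used simply because it is the most widely known clustering algorithm.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardise each marker (column) to mean 0 and SD 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every patient
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(dim) for dim in zip(*members)]
    return labels


# Invented toy data: (degradation marker, inflammation marker), arbitrary units
patients = [
    (9.1, 1.2), (8.7, 0.9), (9.5, 1.4),  # high degradation, low inflammation
    (3.2, 7.8), (2.9, 8.4), (3.6, 7.1),  # low degradation, high inflammation
    (2.8, 1.1), (3.1, 0.8), (2.5, 1.3),  # low on both markers
]

labels = kmeans(zscore_columns(patients), k=3)
print(labels)
```

Standardising each marker before clustering prevents the marker with the largest numerical range from dominating the distance calculation. Real cohort data would additionally require handling of missing values, a principled choice of the number of clusters, and validation of cluster stability, none of which is attempted here.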
CHALLENGES AND LIMITATIONS
A clinically measurable phenotype in a primary healthcare setting is normally restricted to externally observable morphological and behavioural characteristics, which explicitly exclude imaging and biochemical marker properties. The minimal joint disease category described by Dell’Isola and Steultjens18 (Table 1) relies on radiographic assessment (and Kellgren-Lawrence grading). Such assessment is rarely performed, especially in early OA. Moreover, since radiographic structural data are internal pathophysiological properties and not externally observable traits, this older phenotype description does not match the conceptual framework proposed by Mobasheri and Loeser.25 Whether a strict definition of a clinical phenotype should accommodate imaging-dependent classifications in its taxonomy will require further debate, particularly with regard to the clinical utility of such phenotypes in low-resource settings and in ongoing clinical trials that do not include any imaging modalities (e.g., trials of combinations of glucagon-like peptide-1 [GLP-1] and glucose-dependent insulinotropic polypeptide [GIP] drugs for treating obesity-related OA). It should also be acknowledged that no single biomarker or panel has yet been qualified for routine clinical decision-making in OA, and that the translation of identified endotype clusters into an actionable tool for successfully guiding treatment is yet to be proven.

Table 1: Integrated classification of OA clinical phenotypes and molecular endotypes.12,18,25
This table categorises proposed OA subsets based on the conceptual framework established by Mobasheri et al.,25 distinguishing between externally observable phenotypic traits and underlying biological mechanisms. Clinical phenotypes represent patient clusters that have been identified through collection of patient information such as family history, demographics, and physical examination without the primary requirement for any form of advanced imaging. Molecular endotypes represent molecular entities and mechanistic pathways that drive disease pathogenesis. Note that while clinical phenotypes like minimal joint disease are defined by symptomatic stability, they require baseline radiography (e.g., Kellgren-Lawrence Grade 1) to confirm the lack of structural burden, as originally proposed by Dell’Isola and Steultjens.18
ACL: anterior cruciate ligament; ADAMTS: a disintegrin and metalloproteinase with thrombospondin motifs; MMP: matrix metalloproteinase; OA: osteoarthritis; SASP: senescence-associated secretory phenotype.
CONCLUSION
The aim of this article is not to incorporate molecular clustering into clinical decision-making. There are significant limitations in the tools available for endophenotyping, and there are currently no approaches available to reconcile biomarker findings into molecular clusters that seamlessly translate into novel clinical decision-making tools. The shift toward precision medicine in OA research and clinical development requires sharper tools, and these are likely to be digital biomarkers that combine imaging with biochemical markers of ‘disease activity’. This approach necessitates a departure from viewing OA as an inevitable consequence of ageing. Future research in this area will need to focus on machine learning approaches, such as those used in the IMI-APPROACH study, and on big data analytics, and to bridge the gap between ‘omics’-based basic discovery and clinical trial stratification, ensuring that the next generation of OA therapeutics will be as diverse and targeted as the disease itself, with its many clinical and molecular features.64-68




