Abstract
Knee osteoarthritis (OA) diagnosis is based on symptoms, assessed through questionnaires such as the WOMAC. However, the inconsistency of pain recording and the discrepancy between joint phenotype and symptoms highlight the need for objective biomarkers in knee OA diagnosis. To this end, we study relationships among clinical and molecular data in a cohort of women (n = 51) with Kellgren–Lawrence grade 2–3 knee OA through a Support Vector Machine (SVM) and a regulation network model. Clinical descriptors (i.e., pain catastrophism, depression, functionality, joint pain, rigidity, sensitization and synovitis) are used to classify patients. A Youden’s test is performed for each classifier to determine optimal binarization thresholds for the descriptors. Thresholds are tested against patient stratification according to baseline WOMAC data from the Osteoarthritis Initiative, and the mean accuracy is 0.97. For our cohort, the data used as SVM inputs are knee OA descriptors, synovial fluid proteomic measurements (n = 25), and transcription factor activation obtained from regulatory network model stimulated with the synovial fluid measurements. The relative weights after classification reflect input importance. The performance of each classifier is evaluated through ROC-AUC analysis. The best classifier with clinical data is pain catastrophism (AUC = 0.9), highly influenced by funcionality and pain sensetization, suggesting that kinesophobia is involved in pain perception. With synovial fluid proteins used as input, leptin strongly influences every classifier, suggesting the importance of low-grade inflammation. When transcription factors are used, the mean AUC is limited to 0.608, which can be related to the pleomorphic behaviour of osteoarthritic chondrocytes. Nevertheless, funcionality has an AUC of 0.7 with a decisive importance of FOXO downregulation. Though larger and longitudinal cohorts are needed, this unique combination of SVM and regulatory network model shall help to stratify knee OA patients more objectively.
Similar content being viewed by others
![](https://cdn.statically.io/img/media.springernature.com/w215h120/springer-static/image/art%3A10.1038%2Fs41413-020-00109-x/MediaObjects/41413_2020_109_Fig1_HTML.png)
Introduction
Synovial joints (e.g., knee, hip and hand) allow smooth movements between adjacent bones and are surrounded by an articular capsule that defines a cavity filled with synovial fluid. Bone extremities are also covered by a layer of hyaline articular cartilage that prevents bone-to-bone contact, cushions possible noxious impacts, and provides a low-friction surface for joint articulation1. Epidemiological data from the Institute for Health Metrics and Evaluation shows that knee osteoarthritis (OA) affects 22% of men and 31% of women over the age of 55 in 2019. Such prevalence is expected to rise due to the increase in life expectancy and the overall body mass index. Currently, treatments are conservative (i.e., weight loss, low-impact exercises and analgesics) and knee OA-modifying drugs are not available because knee OA pathophysiology is not fully understood. Eventually, total knee replacement may be necessary. Predictions suggest that the need for joint replacement will continue to rise because of both changes in patients’ demography, and expanding indications for surgery2. Consequently, the medical costs associated with knee OA are increasing, making this disease one of the world’s leading health problems3.
knee OA diagnosis occurs mainly during its moderate to severe/late stage when the articular cartilage has become irreversibly damaged. At this point, the decision to propose the patient a total knee replacement or conservative treatment considers the patient’s age and the extension of joint structural alterations, but the eventual decision is largely conditioned by the symptoms, pain being the critical one. Every person has a unique perception, as it is affected by biological, psychological and social factors4. Pain is typically assessed through questionnaires such as Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). However, the inconsistency and poor reporting of questionnaires, along with a poor consistency between physical joint damage, as assessed through radiological biomarkers, and pain symptoms, motivate the search for more objective measures that could be used as new biomarkers in knee OA diagnosis5,6.
To find such biomarkers, gait data have been used to develop prediction models for the progression of knee OA7. Also, serum and urine biochemical markers have been explored to establish knee OA phenotype classification systems: urinary C-terminal crosslinked telopeptide of type II collagen and serum degrading enzymes levels, cartilage oligomeric matrix protein and hyaluronan have some predictive value for knee OA classification5. But, as pain, at least the nociceptive dimension might be correlated with the levels of synovial inflammation8, there is a growing interest in exploring synovial fluid molecules as possible biomarkers9,10. For example, Haraden et al.11 predicted the development of inflammatory knee OA endotype based on synovial fluid molecules11. Synovial fluid data have also been combined with gait data to enable the construction of support vector machine classifiers to predict cartilage damage by Donnenfield et al.10.
But, patient stratification systems that helped to differentiate between sensory and affective pain dimensions from the nociceptive extent have not been fully explored5. Therefore, objectively and biologically characterizing current knee OA clinical descriptors (i.e., WOMAC domains, catastrophism, or sensitization) may help to gain valuable insight into patient stratification which eventually might help in clinical decision-making. Machine learning models can link seemingly unrelated features by finding patterns in training data. Specifically, this cross-sectional study aims to describe a cohort of women (n = 51) by mining the relationships among typical knee OA descriptors and molecular data through Support Vector Machine (SVM) classifiers to provide objectivity in current knee OA diagnostic systems. We further intend to add interpretability to the results by enriching proteomic real-world data. Specifically, by using as initial conditions the proteomic data we personalized simulations of a previously developed regulation network model12. The transcription factor results out of the regulatory model were used as the third set of input features for the SVM, allowing us to characterize the cohort with intracellular molecular information.
Results
We explore a total of 686 classification tasks to classify a cohort of 51 women regarding individual levels of functionality (FU), rigidity (RI), joint pain (JP), pain catastrophism (CA), pain sensitization (SE), depression (DE) and synovitis (SY). Three different types of datasets are used as input features: (i) sets of knee OA-clinical features; (ii) synovial proteomics data; (iii) in-silico data regarding the activation of transcription factors from a regulatory network-based model.
Figure 1A shows the relative importance of each clinical variable when target knee OA descriptors (those that can be binarized with an \(accuracy > 0.75\) on the train set) are classified: the closer the weight is to 1 (or yellow according to the colour bar employed), the more relevant such feature is for the classification. WOMAC domains classification (i.e., JP, RI and FU) are more influenced by CA, FU and SE, respectively (\(AUC > 0.8\)). DE is mostly influenced by FU, but in this case, the AUC score is lower (\(AUC = 0.79\)). CA classifier has the highest AUC score (\(AUC = 0.9\)) and is highly influenced by FU. The inflammatory classifier (SY) is mostly influenced by RI.
(A) Normalized importance when classifying each knee OA-descriptor with specific sets of knee OA-descriptors as input features (the omitted features for each classifier are depicted in white). (B) ROC-AUC analysis for the classifiers explored in A. functionality (FU), rigidity (RI), joint pain (JP), pain catastrophism (CA), pain sensetization (SE), depression (DE) and synovitis (SY).
Figure 2A shows the relative importance of SF molecules to classify knee OA clinical features. The classification of WOMAC domains (i.e., JP, RI and FU) are highly influenced (\(AUC > 0.8\)) by MCP1, IL-1RA and IL-6, respectively. With a lower score (\(AUC = 0.71\)), DE categorization is equally influenced by LEPTIN and VEGF-A. When classifying patients against CA and SE, IL-RA and LEPTIN are the parameters that have a higher influence, but the AUC score decreases below 0.6, reaching 0.41 for the SE classifier. SY is relatively well classified (\(AUC = 0.67\)), but in this case, the most influencing parameter is MCP1.
(A) Normalized importance when classifying knee OA-descriptors with synovial proteomics data as input features. (B) ROC-AUC analysis for the classifiers explored in A. functionality (FU), rigidity (RI), joint pain (JP), pain catastrophism (CA), pain sensetization (SE), depression (DE) and synovitis (SY).
Figure 3 shows the classification of knee OA clinical descriptors for n = 25 when the simulated activities of the TF of the regulatory network-based model of a chondrocyte12 are used as input features. Generally FOXO and Sox9 are the most influential TF (see Fig. 3A). Figure 3B reveals the robust effect of FOXO in JP and FU classification as it discriminates either the false positives (\(AUC = 0.3\)) or the true positives (\(AUC = 0.7\)) of the classifiers. The other classifications lead to AUC scores lower than 0.65: RI (\(AUC = 0.62\)) is greatly influenced by Sox9. The SE classifier (\(AUC = 0.64\)) is the only one mostly controlled by CREB. The SY classifier (\(AUC = 0.5\)) is mostly influenced by a set of TF that includes Sox9, CREB and FOXO among the most relevant, and contributions of Hif2a, NF-\(\kappa\)B and Runx2 result largely irrelevant.
(A) Normalized importance when classifying knee OA-descriptors with transcription factor activities simulated with. a regulatory network-based model, as input features. (B) ROC-AUC analysis. functionality (FU), rigidity (RI), joint pain (JP), pain catastrophism (CA), pain sensetization (SE), depression (DE) and synovitis (SY).
Table 1 summarizes the results of the classification of the WOMAC domains data recruited in the OAI initiative after binarizing them with the thresholds found with our data. The average accuracy of the six explored classifiers is 0.9794.
Discussion
Knee OA diagnosis is influenced by subjective parameters (i.e., WOMAC domains)13, which impacts treatment decisions. To reduce the subjectivity in knee OA diagnosis, the present study aimed to characterize seven clinical knee OA descriptors commonly used by examining the contributions of synovial and chondrocyte-derived information. Specifically, a cohort of 51 women with Kellgren–Lawrence grades 2–3 was classified with SVM. However, logistic regression would be another plausible choice, and yet similar results to the ones provided by the SVM would be obtained, as the SVM and logistic regression loss functions are similar. However, SVM is more robust to outliers and overfitting, thanks to the tunning of parameter C. Accordingly, SVM was used to evaluate the value of biological data to inform objectively about the clinical stratification of patients39.
First, we assessed how the descriptors that play a role in the knee OA clinical manifestations can be classified based on a combination of these very same descriptors. Fo Remarkably, pain catastrophism (CA) was the leading input for the classification of the joint pain dimension (JP) of the WOMAC questionnaire (Fig. 1). Pain catastrophizing is characterized by a patient’s tendency to magnify pain stimuli and feel helpless in the context of pain5. It has been argued that the assessment of CA in response to a specific stimulus might account for most of the variance in pain reports14,15,16. Accordingly, the influence of CA in our classification of knee OA descriptors suggests that JP is not assessing accurately the pain due to noxious stimulus, highlighting potential bias in current diagnosis systems17,18. At the same time, DE was the second leading feature for JP classification. A previous study demonstrated a high correlation between high WOMAC pain scores and depression19. Furthermore, knee OA patients were found to have increased negative beliefs and mental health issues20. These results might emphasise a bidirectional relation regarding the feeling of pain and mental health: patients who develop depression might be more prone to develop low tolerances to pain (a phenomenon called central sensitization, SE) and reduce their physical activity (i.e., sedentarism or static postures). In other words, our classifications support the idea that pain perception contributes to depression and emotional distress in knee OA patients, which in turn, feedback loops with pain reporting. Accordingly, SE was the leading feature for DE classification, which is consistent with previous findings: in individuals with knee OA, the presence of enlarged pain areas was associated with more persistent and severe pain, as well as higher anxiety levels. These findings were interpreted as a sign of altered central pain processing21 Beyond its influence on DE, SE played an important role in classifying the WOMAC domains within our cohort, in general, since it resulted to be one of the leading influential features (Fig. 1A). Accordingly, SE might increase WOMAC scores indicating secondary influences in the outcome of these questionnaires.
FU had also a clear influence on the classification of RI, which matches previous findings by Wolfe et al.19 In turn, FU classification was greatly affected by SE and CA. These results suggest that the fear of movement (kinesophobia) might impose circular interactions among cognitive factors and pain perception, as Wong et al.22 reported previously for the case of CA23. There is a growing consensus that disability symptoms in knee OA patients rest upon various factors, including central pain processing mechanisms20. As a result, a person may distance herself from activities and social situations to avoid the appearance of pain, increasing the risk of developing an unhealthy lifestyle and depression24. In fact, a previous study predicted that static postures might decrease chondrocyte anabolic activity which decreases the deposition of extracellular matrix12.
Figure 1 further suggests that subjective clinical descriptors influence the classification of inflammation (i.e., SY). Although knee pain (JP) has been linked to joint inflammation25, our findings, as illustrated in Fig. 1A, do not reflect this behaviour: RI led the way in classifying SY with AUC = 0.86. The lack of mechanistic significance of these inputs highlights the need for new objective biological markers in knee OA clinical-decision making. In this sense, serum and urine biochemical markers (in combination with magnetic resonance imaging) have been used to establish knee OA phenotype classification systems9. As the pain might be correlated to the intensity of articular cartilage degradation and a pro-inflammatory environment26, there is a growing interest in exploring the synovial fluid inflammatory molecules10,11,27. However, to the best of our knowledge, no study has specifically explored the capacity of synovial fluid markers to increase our understanding of existing knee OA diagnosis descriptors. Hence, we explored the potential of measurable molecules in synovial liquid to provide insights into KAO clinical features. Figure 2A reveals a high potential of leptin to explain the classification of knee OA patients according to clinical descriptors. Specifically, the relatively leading role of leptin in the good discrimination of patients regarding the SY levels (\(AUC > 0.67\)) suggests that the endocrine action of leptin might have an important role in joint inflammation stratification, especially in such joints not affected by over-compression. Leptin is a hormone produced by white adipocytes, and it is overexpressed in individuals with metabolic syndrome. This syndrome is characterized by the presence of systemic low-grade inflammation, that has the potential to induce pathological processes in several tissues. In the specific case of chondrocytes, leptin has been linked with increased production of iNOS (the main producer of NO)28. Moreover, leptin shows a synergic effect with IL-1\(\beta\) on the production of NO in human articular cartilage chondrocytes29. Besides, leptin has been associated with increased production of matrix-degrading enzymes (i.e., MMP1 and MMP3)30. Other studies demonstrated that leptin alone can induce the synthesis of PGE2 (a molecule associated to modulate inflammatory pain), IL-6, IL-8, MMP1, MMP3 and MMP13 in cartilage explants31,32.
But, the main influencing parameter for SY and JP classification was MCP1 (Fig. 2A). This suggests that MCP1 might be used as a new biomarker to evaluate inflammation as it efficiently classifies SY (\(AUC = 0.67\), Fig. 2B). Curiously, a study relates structural progression and pain with the abundance of activated macrophages33. For SY classification, VEGF-A emerges as the second most influential parameter, which is a reasonable inference: once macrophages are recruited, they stimulate VEGF production, which increases angiogenesis and synovitis. This creates a positive feedback loop where macrophage recruitment contributes to the perpetuation of joint inflammation, thereby leading to more severe forms of knee OA34. Thus, our results suggest that patients with higher levels of MCP-1 might be more prone to have painful forms of knee OA. MCP-1 might be used as an objective factor to differentiate pain induced by sensitization from pain induced by nociception and adjust the clinical decision accordingly.
We used SF information as initial conditions to a previous RNM for articular cartilage chondrocytes to obtain patient-specific information about the TF proteomic profile. Figure 3A illustrates that TF that usually are activated in healthy chondrocytes (i.e., FOXO, Sox9, and Cited2) contribute more in the classification tasks, while the hypertrophic (i.e., Runx2 and Hif2) and acute inflammatory (i.e., NF-\(\kappa\)B) TFs do not greatly affect the stratification of patients. But, AUC scores in Fig. 3B (with an overall mean of 0.56) suggest that there is not a clear pattern within knee OA patients. This fact might reflect the pleomorphic behaviour of osteoarthritic chondrocytes associated with unstructured activation of intracellular signalling pathways, which leads to a microheterogeneity of cellular reaction patterns within subjects35. However, the AUC analysis supports the use of TF for the classification of JP because it discriminates false positives (\(AUC = 0.3\)). Previous studies with animal models demonstrate that FOXO has protective functions in the response of cartilage to joint trauma and mechanical overload, and is downregulated by pro-inflammatory mediators (i.e., TNF-\(\alpha\) and IL-1\(\beta\))36,37. This and the poor influence of NF-\(\kappa\)B might help understand why clinical manifestations are insufficiently explained by acute nociceptive pain mechanisms (i.e., tissue damage and acute inflammatory responses). Our results highlight an important role of systemic low-grade inflammation in knee OA (overall leading role of leptin in Fig. 2). However, the relatively low AUC scores point out the need for further corroboration of this hypothesis in larger (longitudinal) cohorts.
Before performing the classification tasks, the clinical descriptors underwent a binarization process with a mean accuracy of 0.83 (see Supplementary Material 1). To validate thresholds, we investigated their application in the baseline OAI dataset. Our findings revealed that the accuracy was above 0.95 for six of the explored binarized outputs. We also examined whether the sensitization threshold matched the threshold recommended by Pujol et al.38) and both thresholds were the same mean (\(accuracy = 0.67\)) on the tests sets.
As with any study, our investigation has limitations: the most notable is the need for confirmation of our presented results in a larger cohort. Besides, according to the size of our data set, we implemented a nested leave-one-out validation. When applying this method, the train set remained highly consistent across each fold. Arguably this particular form of cross-validation may introduce higher variance in the error estimate, yet represents one of the optimal strategies to take most advantage of small data set39. Another weak point is the elevated number of features used when SF and TF are explored as inputs, which limits the algorithm to work optimally. Besides, TF data come from a model that was not validated using in-vitro/in-vivo data. Also, we should keep in mind that we used a linear classifier, which cannot model nonlinear relationships in the data. Furthermore, the current study has a cross-sectional design, which difficult to extract conclusions regarding the causality between biomarkers and current knee OA diagnostic systems. However, our findings can guide the design of future studies based on the most influential associations identified.
In conclusion, to the best of our knowledge, this study is the first to explore potential biomarkers that could be related to common diagnostic criteria for knee OA, such as WOMAC scores, pain catastrophism, sensitization, and depression, to potentially increase objectivity in knee OA patient’s stratification. The unique application of Support Vector machine-based classifiers in combination with a regulatory network-based model allows for a more objective understanding of knee OA based on synovial fluid biomarkers. However, it is important to acknowledge that limitations exist, and larger cohorts and longitudinal studies are needed to fully map out objective descriptors for knee OA diagnosis.
Methods
We explored how to classify subjects in a cohort of women (n = 51) with Kellgren–Lawrence grade 2–3 knee OA, described extensively in terms of inclusion and exclusion criteria by Tassani et al.40 From those patients that presented effusion (n = 25) synovial liquid (SL) extraction was performed and measured through Luminex\(\circledR\) to obtain information related to inflammatory soluble factors. Such inflammatory mediators were further used as input data for a qualitative dynamical system of a regulatory-network based model (RNM) that summarizes chondrocyte mechanobiological and intracellular activity12. From the RNM we obtained personalized virtual information regarding the activation level of seven transcription factors (TF). An overview of the methodoly followed is depicted in Fig. 4.
But, the primary contribution of this manuscript is the characterization of patients regarding the level of seven clinical descriptors (catastrophism (CA), depression (DE), functionality (FU), joint pain (JP), rigidity (RI), sensitization (SE), and synovitis (SY)), which were binarized for the classification purposes with a Youden test. Three sets of data were used as input for the SVM-based classifiers: (i) the most appropriate set of knee OA descriptors; (ii) SL inflammatory data (n = 25), and (iii) in-sillico TF activation levels. All in all, lead to the exploration of 21 classification tasks. The binary classification tasks were validated using the receiver operating characteristic curve (ROC-AUC) analysis. The binarization thresholds were further validated in the baseline of the osteoarthritis initiative (OAI) data set.
Cohort description
At the Rheumatology/Orthopaedic Surgery Department of the Hospital del Mar (Barcelona, Spain), a cohort was built revising the clinical history of patients diagnosed with knee OA. The study followed the Good Clinical Practice guidelines and the Declaration of Helsinki, and the Clinical Research Ethical Committee approved the protocol (2016/6747/I). All participants signed an approved informed consent and agreed to do a wash-up treatment for 3 months of intra-articular hyaluronic acid infiltrations, 1 month for oral or intra-articular corticoids, and 1 week of nonsteroidal anti-inflammatory or opioid drugs. A total of 51 women were selected who presented radiographic signs of moderate knee OA (Kellgren–Lawrence grade of 2/3) and symptomatology (pain, dysfunction and/or effusion) in the last 3 months. Patients who presented any sign of secondary OA were excluded (meniscectomy, inflammatory or connective tissue diseases, overuse of the joint). Women with effusion (n = 25) were eligible for synovial liquid extraction40.
Description of the output labels
The targeted OA descriptors used for the stratification of patients were:
-
Joint Pain (JP): it is assessed using the WOMAC pain scale, where patients rate the level pain experienced during five quotidian (i.e., walking, sitting or standing) activities. For each of these activities, the level of pain felt is reported on a 5-point Likert scale ranging from “none” to “extreme”41.
-
The functionality of the joint (FU): it contains the results of the WOMAC functional impairment questionnaire that summarizes 17 items with 1—5 Linkert scale responses.
-
The rigidity of the joint (RI): it is the WOMAC stiffness subtest contains two categories and is used to rank the rigidity of the joint.
-
Central pain sensitization or pain hypersensitivity (SE): Defined by the International Association for the Study of Pain as “increased responsiveness of nociceptive neurons in the central nervous system to either normal or subthreshold afferent input”42. SE was evaluated following the protocol described in Pujol et al.38
-
Effusion (EF): it is a radiographic factor (echography), post-analysed and graded by an expert38. It is a measure of the liquid accumulated inside the synovial joint.
-
Depression and anxiety (DE): It is obtained from the Hospital Anxiety and Depression Scale (HAD), which measures core symptoms of anxiety and depression without including physical symptoms43.
-
Catastrophism (CA): based on the Pain Catastrophism Scale, it assesses catastrophic thinking related to pain with or without chronic pain. CA refers to the tendency to focus on, and magnify, ache sensations, and to feel helpless in front of pain16.
Originally, these descriptors were collected in a discrete non-binary form, but when used as targets to predict, we converted them into binary categorical variables (i.e. targeted labels). This was done by selecting a threshold that best divided the cohort into two groups. The optimal threshold for each variable was determined through a grid search (see Supplementary Material 1) to maximise the Youden’s index:
TP refers to true positives, FN refers to false negative, TN refers to true negative, and FP refers to false positive. Maximizing Youden’s index was equivalent to achieving the highest value of the sum of sensitivity (first term on the right side of the equation) and specificity (second term). However, if the accuracy across the entire fold of the leave-one-out validation was less than 0.75 in the train set, we omitted such descriptor as an output label because we assumed that it could not be binarized properly.
Input features selection
Three types of input features were used for the classification of 7 clinical OA descriptors. A total of 21 classification tasks were explored. The first set of input features was the continuous form of the same clinical OA characteristics. For this case, feature selection was done ahead of binarization, based on the numerical values of the input and output variables. Hence, both Pearson and Spearman are good options. Yet, the Pearson correlation test was preferred because it measures linear relationships among variables, according to the linearity of the kernel of the SVM. We aimed to reduce any possible redundancy of descriptors by avoiding high linear dependency relationships among the input and output sets. Specifically, we excluded those cases with Pearson correlation coefficient \((r) > 0.7\) (see Fig. A1). This ensured that the identified features were completely independent and had no strong linear relationship with the targeted output. To this end, a correlation matrix using Python 7.29.0 (corr. method from pandas library) was generated, which helped identify the highly linearly correlated features regarding the output label.
Synovial fluid (SF) samples from 25 patients were obtained through aspiration and analysed with Luminex\(\circledR\) technology to measure the amounts of:
-
pro-inflammatory cytokines: Interleukin 6 (IL-6), Interleukin 8 (IL-8), Interleukin 4 (IL-4), Tumor necrosis factor alpha (TNF-\(\alpha\)), Interleukin 18 (IL-18), Interferon-gamma (INF-\(\gamma\)), Interleukin 17 (IL-17). These molecules contribute to the degeneration of the articular cartilage and inflammation44.
-
Interleukin-1 receptor antagonist protein (IL-1RA) inhibits the activity of the Interleukin-1\(\beta\) (IL-1\(\beta\)) receptor.
-
Monocyte chemoattractant protein 1 (MCP1) recruits monocytes to inflammation sites produced by tissue injury, propagating inflammation and tissue damage45.
-
Vascular endothelial growth factor A (VEGF-A) increases angiogenesis46.
-
Leptin modulates the inflammatory processes and articular cartilage remodelling47.
AC is nourished through the SF. Hence, diffusion of the aforementioned molecules to articular cartilage influences the chondrocyte’s metabolism. Thus, the transcription factor proteomic profile might provide valuable data to characterize patients biologically. Accordingly, and because articular cartilage samples cannot be collected in the patients without causing any serious damage, information from a chondrocyte RNM was used12. The RNM models the concentration of molecules (i.e., proteins) by time-dependent variables. The synthesis of each network node (i) is regulated by rate equations in the form \(\frac{\textrm{d}x_i}{\textrm{d}t}=f_i(x_n)\), which depends on regulation nodes (n). The RNM can capture the intracellular channelling of external signals into identifiable cellular behaviours48. In this work, we have used the patient SF as an external signal of a chondrocyte RNM previously developedSegarra23. From the simulations, we generated personalized synthetic information about 8 TF: Activator Protein 1 (AP1); cAMP response element-binding protein (CREB); forkhead box 1 (FOXO); Nuclear factor kappa-light-chain-enhancer (NF-\(\kappa\)B); transcription factor Sox9 (Sox9); Cbp/p300-interacting transactivator 2 (CITED2); Runt-related transcription factor 2 (Runx2); Hypoxia-inducible factor 2-alpha (HIF2a). Detailed integration can be found in Additional Information and summarized in Fig. 5.
Overview of the methodology followed for obtaining patient-specific virtual information about the activation of transcription factors. (A) Synovial liquid from patients who presented effusion was extracted. (B) Luminex technology was used to measure soluble inflammatory mediators \(y_i^n\), \(i =\) IL-8, IL-6, IL-4, TNF-\(\alpha\), IL-18, IFN-\(\gamma\), IL-17, VEGF. (C) Such information is re-scaled between 0 and 1 to be used as an input (\(x_i^n(t_0)\)) in a regulatory network-based model of a chondrocyte12. (D) The model converts a static graph (a simplified version of the original published at12) into a system of ordinary differential equations following Mendoza et al. methodology48. 25 simulations were run with Segarra-Queralt et al.12 to obtain patient-specific information about the activation of transcription factors (\(x_o^n(t_{SS})\), \(o =\) CITED2, CREB, Runx2, FOXO, SOX9, HIF2a, NF-\(\kappa\)B, AP1).
Mathematical formulation
SVM is a supervised approach for classifying49. It divides the samples into two groups by identifying the separating hyper-plane with the largest distance (margin) to the nearest training data points, called the support vectors. In this work, a binary linear SVM was used to predict a class \(y_n\in [0,1]\), based on the input training set \(x_n\in \mathbb {R}^M,n=1,\ldots ,N,\) where M was the number of features and N the number of samples. A linear hyperplane (a line if m = 2 or a plane if m = 3) can be written as the set of points x satisfying
where \(b\in \mathbb {R}\) was the ordinate at the origin and \(\omega \in \mathbb {R}^M\) was the normal vector to the hyperplane. The algorithm found a prediction function that categorized most samples correctly by minimizing:
subject to
\(y_n\) was the known label corresponding to the sample \(x_n\), \(\zeta _n\) is the extent to which the margin constraint on \(x_n\) can be violated, and C was the penalty term that controlled the number and severity of the violations to the margin. Given a sample x, the algorithm classified it by determining on which side of the hyperplane was \(y_n\) (i.e., whether \(\omega ^T(x_n)+b\) was positive or negative.
To ensure the maximum performance of the classifier, hyperparameter tuning is done before the training step based on k-fold cross-validation (\(k\ =\ 5\)). Once the best hyperparameters were found, the SVM was built in Python (Version 3.9.7)50. Furthermore, according to the size of the data set, leave-one-out validation was applied. Thus, we have trained \(n=50\) models when clinical data is used as input feature, and 24 when synovial fluid and transcription data are used as inputs. With the linear SVM, the size of each weight \(\omega _m\in \mathbb {R}\) relative to the other ones indicated how important the m-th feature was for the separation. With the leave-one-out validation, the \(\bar{\omega }_m=\frac{\sum ^{n-1}\omega _m}{n-1}\) was obtained. The \(\bar{\omega }\) value was normalized between 0 and 1 for each M of the \(x_n\) for visualization purposes. Finally, the performance of each classifier was evaluated using a nested receiver operating characteristic curve (ROC-AUC) analysis, and mean AUC scores were reported50.
Thresholds validation
We validated the binarization of features in the Osteoarthritis Initiative (OAI) baseline dataset using the thresholds found for the WOMAC domains (see Table A1). The OAI dataset recruited 4796 individuals in the United States. To minimize the noise in the data, we excluded individuals who had either knee history of knee surgery (i.e., arthroscopy, ligament repair, meniscectomy) and participants with less than 80% of variables completed, resulting in 3706 subjects. Pearson correlation was used for dimensionality reduction, to remove highly correlated variables (\(r>0.7\)), leaving 785 features for analysis. Finally, information gain was used to identify the first 50 most important features (see Figs. A2, A3, A4, A5, A6 and A7). Information gain calculation was based on the reduction in entropy of the dataset to evaluate each feature’s importance51. Hyperparameter tuning was then performed, followed by SVM classification for six targeted outputs: WOMAC pain, functionality and rigidity of the right and left knee. A detailed description of this process can be found in Additional Information.
Topical subheadings are allowed. Authors must ensure that their Methods section includes adequate experimental and characterization data necessary for others in the field to reproduce their work.
Data availability
The data that support the findings of this study are available from the Hospital del Mar but restrictions apply to the availability of these data, which were used under the terms of the informed consent signed by the patients for the current project, and so are not publicly available. Data are however available from the corresponding author, Jérôme Noailly, upon reasonable request and with permission of the Ethical Committee of the Hospital del Mar. Data for the thresholds validation is available at https://nda.nih.gov/data_structure.html?short_name=clinoq01OAI web-page.
References
Tamer, T. M. Hyaluronan and synovial joint: Function, distribution and healing. Interdiscip. Toxicol. 6, 111. https://doi.org/10.2478/INTOX-2013-0019 (2013).
Jeffery, A. E., Wylde, V., Blom, A. W. & Horwood, J. P. “it’s there and i’m stuck with it’’: Patients’ experiences of chronic pain following total knee replacement surgery. Arth. Care Res. 63, 286–292. https://doi.org/10.1002/ACR.20360 (2011).
Mobasheri, A. & Batt, M. An update on the pathophysiology of osteoarthritis. Ann. Phys. Rehabil. Med. 59, 333–339. https://doi.org/10.1016/J.REHAB.2016.07.004 (2016).
Trouvin, A. P. & Perrot, S. Pain in osteoarthritis: Implications for optimal management. Joint Bone Spine 85, 429–434. https://doi.org/10.1016/J.JBSPIN.2017.08.002 (2018).
Lotz, M., Martel-Pelletier, J. & Christiansen, C. Value of biomarkers in osteoarthritis: Current status and perspectives. Ann. Rheum. Dis. 72, 1756–1763. https://doi.org/10.1136/annrheumdis-2013-203726 (2013).
Copsey, B. et al. Problems persist in reporting of methods and results for the womac measure in hip and knee osteoarthritis trials. Qual. Life Res. 28, 335–343. https://doi.org/10.1007/S11136-018-1978-1/TABLES/6 (2019).
Long, M. J., Papi, E., Duffell, L. D. & McGregor, A. H. Predicting knee osteoarthritis risk in injured populations. Clin. Biomech. 47, 87–95. https://doi.org/10.1016/J.CLINBIOMECH.2017.06.001 (2017).
Schaible, H. G. Mechanisms of chronic pain in osteoarthritis. Curr. Rheumatol. Rep. 14, 549–556. https://doi.org/10.1007/S11926-012-0279-X (2012).
Nelson, A. E. et al. A machine learning approach to knee osteoarthritis phenotyping: Data from the fnih biomarkers consortium. Osteoarth. Cartil. 27, 994–1001. https://doi.org/10.1016/J.JOCA.2018.12.027 (2019).
Donnenfield, J. I. et al. Predicting severity of cartilage damage in a post-traumatic porcine model: Synovial fluid and gait in a support vector machine. PLoS ONE 17, e0268198. https://doi.org/10.1371/JOURNAL.PONE.0268198 (2022).
Haraden, C. A., Huebner, J. L., Hsueh, M. F., Li, Y. J. & Kraus, V. B. Synovial fluid biomarkers associated with osteoarthritis severity reflect macrophage and neutrophil related inflammation. Arth. Res. Ther. 21, 1–9. https://doi.org/10.1186/S13075-019-1923-X/FIGURES/2 (2019).
Segarra-Queralt, M., Piella, G. & Noailly, J. Network-based modelling of mechano-inflammatory chondrocyte regulation in early osteoarthritis. Front. Bioeng. Biotechnol. 11, 87. https://doi.org/10.3389/FBIOE.2023.1006066 (2023).
Zuiderbaan, H. A. et al. Predictors of subjective outcome after medial unicompartmental knee arthroplasty. J. Arthroplast. 31, 1453–1458. https://doi.org/10.1016/J.ARTH.2015.12.038 (2016).
Sullivan, M. J., Stanish, W., Waite, H., Sullivan, M. & Tripp, D. A. Catastrophizing, pain, and disability in patients with soft-tissue injuries. Pain 77, 253–260. https://doi.org/10.1016/S0304-3959(98)00097-9 (1998).
Keefe, F. J. et al. The relationship of gender to pain, pain behavior, and disability in osteoarthritis patients: The role of catastrophizing. Pain 87, 325–334. https://doi.org/10.1016/S0304-3959(00)00296-7 (2000).
Quartana, P. J., Campbell, C. M. & Edwards, R. R. Pain catastrophizing: A critical review. Expert Rev. Neurother. 9, 745. https://doi.org/10.1586/ERN.09.34 (2009).
Goodin, B. R. et al. The association of greater dispositional optimism with less endogenous pain facilitation is indirectly transmitted through lower levels of pain catastrophizing. J. Pain 14, 126–135. https://doi.org/10.1016/j.jpain.2012.10.007 (2013).
Stratford, P., Kennedy, D. & Clarke, H. Confounding pain and function: The womac’s failure to accurately predict lower extremity function. Arthroplast. Today 4, 488. https://doi.org/10.1016/J.ARTD.2018.09.003 (2018).
Wolfe, F. Determinants of womac function, pain and stiffness scores: Evidence for the role of low back pain, symptom counts, fatigue and depression in osteoarthritis, rheumatoid arthritis and fibromyalgia. Rheumatology 38, 355–361. https://doi.org/10.1093/RHEUMATOLOGY/38.4.355 (1999).
López-Ruiz, M. et al. Central sensitization in knee osteoarthritis and fibromyalgia: Beyond depression and anxiety. PloS Onehttps://doi.org/10.1371/JOURNAL.PONE.0225836 (2019).
Wood, L. R., Peat, G., Thomas, E. & Duncan, R. Knee osteoarthritis in community-dwelling older adults: Are there characteristic patterns of pain location?. Osteoarth. Cartil. 15, 615–623. https://doi.org/10.1016/J.JOCA.2006.12.001 (2007).
Wong, W. S. et al. The fear-avoidance model of chronic pain: Assessing the role of neuroticism and negative affect in pain catastrophizing using structural equation modeling. Int. J. Behav. Med. 22, 118–131. https://doi.org/10.1007/S12529-014-9413-7/TABLES/5 (2015).
Helminen, E. E., Sinikallio, S. H., Valjakka, A. L., Väisänen-Rouvali, R. H. & Arokoski, J. P. Determinants of pain and functioning in knee osteoarthritis: A one-year prospective study. Clin. Rehabil. 30, 890–900. https://doi.org/10.1177/0269215515619660/FORMAT/EPUB (2016).
de Oliveira Vargas e Silva, NC., da Silva-Gusmão-Cardoso, T., de Andrade, E. A., Battistella, L. R. & Alfieri, F. M. (2020) Pain disability and catastrophizing in individuals with knee osteoarthritis. BrJP3, 322–327, https://doi.org/10.5935/2595-0118.20200193.
Syx, D., Tran, P. B., Miller, R. E. & Malfait, A. M. Peripheral mechanisms contributing to osteoarthritis pain. Curr. Rheumatol. Rep. 20, 1–11. https://doi.org/10.1007/S11926-018-0716-6/METRICS (2018).
Neogi, T. The epidemiology and impact of pain in osteoarthritis. Osteoarth. Cartil. 21, 1145–1153. https://doi.org/10.1016/J.JOCA.2013.03.018 (2013).
Neuman, P., Dahlberg, L. E., Englund, M. & Struglics, A. Concentrations of synovial fluid biomarkers and the prediction of knee osteoarthritis 16 years after anterior cruciate ligament injury. Osteoarth. Cartil. 25, 492–498. https://doi.org/10.1016/J.JOCA.2016.09.008 (2017).
Otero, M., Lago, R., Lago, F., Reino, J. J. G. & Gualillo, O. Signalling pathway involved in nitric oxide synthase type ii activation in chondrocytes: Synergistic effect of leptin with interleukin-1. Arth. Res. Therapyhttps://doi.org/10.1186/AR1708 (2005).
Otero, M., Reino, J. J. G. & Gualillo, O. Synergistic induction of nitric oxide synthase type ii: In vitro effect of leptin and interferon-gamma in human chondrocytes and atdc5 chondrogenic cells. Arth. Rheum. 48, 404–409. https://doi.org/10.1002/ART.10811 (2003).
Hui, W. et al. Leptin produced by joint white adipose tissue induces cartilage degradation via upregulation and activation of matrix metalloproteinases. Ann. Rheum. Dis. 71, 455–462. https://doi.org/10.1136/ANNRHEUMDIS-2011-200372 (2012).
Moilanen, E. et al. Leptin enhances synthesis of proinflammatory mediators in human osteoarthritic cartilage-mediator role of no in leptin-induced pge 2, il-6, and il-8 production. Med. Inflamm.https://doi.org/10.1155/2009/345838 (2009).
Koskinen, A., Vuolteenaho, K., Nieminen, R., Moilanen, T. & Moilanen, E. Leptin enhances mmp-1, mmp-3 and mmp-13 production in human osteoarthritic cartilage and correlates with mmp-1 and mmp-3 in synovial fluid from oa patients. Clin. Exp. Rheumatol. 29, 57–64 (2011).
Daghestani, H. N., Pieper, C. F. & Kraus, V. B. Soluble macrophage biomarkers indicate inflammatory phenotypes in patients with knee osteoarthritis. Arth. Rheumatol. 67, 956–965. https://doi.org/10.1002/ART.39006 (2015).
Scanzello, C. R. & Goldring, S. R. The role of synovitis in osteoarthritis pathogenesis. Bone 51, 249–257. https://doi.org/10.1016/j.bone.2012.02.012 (2012).
Aigner, T., Söder, S., Gebhard, P. M., McAlinden, A. & Haag, J. Mechanisms of disease: Role of chondrocytes in the pathogenesis of osteoarthritis-structure, chaos and senescence. Nat. Clin. Pract. Rheumatol. 3, 391–399. https://doi.org/10.1038/ncprheum0534 (2007).
Ludikhuize, J. et al. Inhibition of forkhead box class o family member transcription factors in rheumatoid synovial tissue. Arth. Rheum. 56, 2180–2191. https://doi.org/10.1002/ART.22653 (2007).
Matsuzaki, T. et al. Foxo transcription factors influence cartilage maturation, homeostasis and osteoarthritis pathogenesis by modulating autophagy and proteoglycan 4 hhs public access. Sci. Transl. Med.https://doi.org/10.1126/scitranslmed.aan0746 (2018).
Pujol, J. et al. Brain imaging of pain sensitization in patients with knee osteoarthritis. Pain 158, 1831–1838. https://doi.org/10.1097/J.PAIN.0000000000000985 (2017).
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning (Springer, 2001).
Tassani, S. et al. Relationship between the choice of clinical treatment, gait functionality and kinetics in patients with comparable knee osteoarthritis. Front. Bioeng. Biotechnol. 10, 202. https://doi.org/10.3389/FBIOE.2022.820186/BIBTEX (2022).
Riddle, D. L. & Perera, R. A. The womac pain scale and crosstalk from co-occurring pain sites in people with knee pain: A causal modeling study. Phys. Ther. 100, 1872. https://doi.org/10.1093/PTJ/PZAA098 (2020).
Adams, G. R. et al. Do “central sensitization’’ questionnaires reflect measures of nociceptive sensitization or psychological constructs? protocol for a systematic review. Pain Reports 6, e962. https://doi.org/10.1097/PR9.0000000000000962 (2021).
Stern, A. F. The hospital anxiety and depression scale. Occup. Med. 64, 393–394. https://doi.org/10.1093/OCCMED/KQU024 (2014).
Kapoor, M., Martel-Pelletier, J., Lajeunesse, D., Pelletier, J. P. & Fahmi, H. Role of proinflammatory cytokines in the pathophysiology of osteoarthritis. Nat. Rev. Rheumatol. 7, 33–42. https://doi.org/10.1038/nrrheum.2010.196 (2011).
Raghu, H. et al. Ccl2/ccr2, but not ccl5/ccr5, mediates monocyte recruitment, inflammation and cartilage destruction in osteoarthritis. Ann. Rheum. Dis. 76, 914–922. https://doi.org/10.1136/ANNRHEUMDIS-2016-210426 (2017).
Hamilton, J. L. et al. Targeting vegf and its receptors for the treatment of osteoarthritis and associated pain. J. Bone Miner. Res. 31, 911–924. https://doi.org/10.1002/JBMR.2828 (2016).
Eldjoudi, D. A. et al. Leptin in osteoarthritis and rheumatoid arthritis: Player or bystander?. J. Mol. Sci 2022, 23. https://doi.org/10.3390/ijms23052859 (2022).
Mendoza, L. & Xenarios, I. A method for the generation of standardized qualitative dynamical systems of regulatory networks. Theor. Biol. Med. Model. 3, 13. https://doi.org/10.1186/1742-4682-3-13 (2006).
Cortes, C., Vapnik, V. & Saitta, L. Support-vector networks. Mach. Learn. 20, 273–297. https://doi.org/10.1007/BF00994018 (1995).
Pedregosa, F. et al. Scikit-learn: Machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Azhagusundari, B. & Thanamani, A. S. Feature selection based on information gain. Int. J. Innov. Technol. Explor. Eng. https://doi.org/10.1016/j.asoc.2008.05.006 (2013).
Acknowledgements
Catalan and Spanish Governments (2020FI b00680; STRATO PID2021126469ob-C21-2), European Commission (MSCA-TN-ETN-2020-Disc4All-955735, ERC-2021-CoG-O-Health-101044828). G. Piella is supported by the ICREA Academia programme.
Author information
Authors and Affiliations
Contributions
The conception and design of the study were made by J.N. L.T. and J.M. contributed to the acquisition of data. M.S.Q. wrote the manuscript and along with G.P., M.G. and J.N. analysed and interpreted the data. All authors contributed to the final version to be submitted and revised for important intellectual content.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Segarra-Queralt, M., Galofré, M., Tio, L. et al. Characterization of clinical data for patient stratification in moderate osteoarthritis with support vector machines, regulatory network models, and verification against osteoarthritis Initiative data. Sci Rep 14, 11797 (2024). https://doi.org/10.1038/s41598-024-62212-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-024-62212-x
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.