Introduction: It is important to recognise inflammatory back pain (IBP) for early diagnosis of ankylosing spondylitis (AS). The aims of this study were to develop a valid, reliable Bengali IBP tool and to assess the performance of different IBP criteria sets, including Calin, Berlin set 8a and 7b, and new Assessment of SpondyloArthritis International Society (ASAS) expert criteria, in radiographic axial spondyloarthritis (axSpA) and nonradiographic axSpA.
Method: This case-control study was performed in three phases. The first phase involved development of an IBP tool by adding the fifth parameter of ASAS expert criteria to the National Health and Nutrition Examination Survey (NHANES) 2009–2010 arthritis questionnaires; the second phase assessed test-retest reliability (50 participants completed the tool and 37 of them were retested at a 5-day interval). Finally, according to the imaging arm of ASAS axSpA classification criteria, 50 patients with axSpA were included as cases while 50 patients with chronic mechanical back pain (MBP) were included as controls.
Results: IBP was detected in patients with SpA versus patients with MBP in 76.0% versus 10.0% by Calin criteria, 72.0% versus 6.0% by Berlin set 8a, 58.0% versus 12.0% by Berlin set 7b, and 64.0% versus 18.0% by ASAS criteria. The Calin criteria set had the highest sensitivity (76.0%) and Berlin set 8a had the highest specificity (94.0%) in the differentiation of IBP from MBP.
Conclusion: The performance of the new ASAS criteria was analogous to the other existing criteria sets. The highest positive likelihood ratio and odds ratio were found for Berlin set 8a criteria. The Berlin set 8a criteria can still be used in primary care practice at the first screening because of high sensitivity.
Back pain is a very common problem worldwide. It is the most frequent reason for visits to the physician.1-5 Approximately 80% of the world’s population develops low back pain at some point in their adult life. Back pain is considered chronic when it persists for 3 months or more. This chronic condition may reflect inflammatory back pain (IBP) or mechanical back pain (MBP). Approximately 38.7% of patients with chronic back pain have IBP.6 IBP is the earliest symptom of axial and other forms of spondyloarthritis (SpA).7-11 The sacroiliac joint is the primary site of inflammation.12 The presence of sacroiliitis on pelvic X-ray, according to modified New York criteria, defines radiographic SpA; the presence of bone marrow oedema, synovitis and capsulitis, enthesitis, subchondral sclerosis, erosions (marginal foci or articular bone loss), periarticular fat deposition, and ankylosis on the MRI short TI inversion recovery image defines nonradiographic axial spondyloarthritis (axSpA). Axial spondyloarthropathy includes classical ankylosing spondylitis (AS) as well as nonradiographic axSpA. Inflammatory changes in the entire axial skeleton are characteristic of axSpA and can be visualised by MRI; structural alterations, such as new bone formation with syndesmophytes and ankylosis, develop later in the course of the disease. AS is defined by the presence of sacroiliitis on X-ray and other structural changes on spine X-rays, which may eventually progress to bony fusion of the spine.4 Males tend to be more commonly affected than females.12 AS primarily affects young adults, with a higher incidence in patients <45 years old.
Clinical features of axial SpA or AS include IBP, alternating buttock pain, enthesitis, arthritis, dactylitis, acute anterior uveitis, a positive family history, and a good response to nonsteroidal anti-inflammatory drugs. Among these features, IBP is often present at disease onset.13 Over recent decades, it has become increasingly evident that in many patients with AS or SpA, it takes many years to develop radiographic sacroiliitis from the onset of IBP.14 The higher prevalence rate of SpA in this subcontinent has become a prime concern.15 As IBP is the key clinical symptom, it is very important to recognise IBP for early diagnosis of axSpA or AS.16 To detect IBP, powerful tools or tests are needed, not only for the diagnosis of patients with AS,12,17 but also for the diagnostic evaluation of patients with chronic back pain.18,19
Up to now, several criteria sets have been developed that measure IBP. In chronological order, these criteria sets include Calin,16 modified New York criteria for ankylosing spondylitis,20 Amor,21 European Spondyloarthropathy Study Group (ESSG),22 Berlin,23 and Assessment of SpondyloArthritis International Society (ASAS) criteria.23,24 Although these criteria sets share many common clinical features, they diverge on some parameters such as age limit, mode of onset of pain, duration of pain, presence of morning stiffness or night pain, and improvement of pain with rest or exercise, which may be responsible for the difference between their reported sensitivity and specificity. The Berlin criteria have two subsets, Berlin set 8a and 7b, which differ in the number and variation of their parameters. However, there are no published data in Bangladesh, as well as in this subcontinent, regarding the performance of these IBP criteria sets.
The aims of this study were to develop a valid, reliable Bengali IBP tool and to assess the performance of the Calin, Berlin, and new ASAS expert criteria in patients with radiographic and nonradiographic axSpA, using a control group of patients with chronic MBP for ≥3 months. The study also aimed to help determine which criteria sets better recognise the presence of IBP in the Bengali population.
MATERIALS AND METHODS
Following the minimum prevalence rate of IBP in the previous studies, the authors recruited participants >20 years of age from the Medicine outpatient department of Chattogram Medical College. Convenience sampling was used. Medical data were collected from patients who either consulted spontaneously or were referred for further evaluation by Medicine Indoor or Physical Medicine Indoor of Chattogram Medical College Hospital, from April 2019 to September 2019.
The study was performed in three phases. In the first phase, the English National Health and Nutrition Examination Survey (NHANES) 2009–2010 Arthritis Questionnaire (ARQ) was translated into Bengali according to the translation procedure of Beaton et al.25 (ARQ010, ARQ020, ARQ024, ARQ025, ARQ022, ARQ040, ARQ060, ARQ073, ARQ077, ARQ080, and ARQ100 were translated). For an intraclass correlation coefficient of 0.8 with a 95% confidence interval (CI) width of 0.1, a minimum of 37 subjects was required to assess the reliability statistics of the instrument. In the second phase, for test-retest reliability, the translated Bengali IBP tool was administered to 50 participants; of these, only 37 participated in a retest by the same assessor at a 5-day interval. In the third phase, the performance of different IBP criteria sets was assessed with the Bengali IBP tool in a sample of 100 participants who attended the outpatient and inpatient departments of the Medicine and Physical Medicine departments with chronic back pain for ≥3 months. Fifty patients with axSpA, diagnosed according to the imaging arm of the ASAS axSpA classification, who had chronic back pain for ≥3 months with radiographic sacroiliitis by modified New York criteria or sacroiliitis on the MRI short TI inversion recovery image, were included as study cases. The control group comprised 50 patients with a diagnosis of chronic (≥3 months) MBP, with a normal pelvic radiograph as well as normal MRI of the sacroiliac joints. Because ankylosing spondylitis is not the only cause of IBP, exclusion of other diseases was confirmed by MRI of the whole spine when necessary.
Sensitivity and specificity were calculated from 2×2 contingency tables. Receiver operating characteristic (ROC) analyses, using the empirical nonparametric method, were performed to evaluate the performance of the Bengali version of the Calin, Berlin set 8a, Berlin set 7b, and ASAS IBP criteria, and the area under the curve (AUC) was computed for each criteria set. ROC curves provide a graphical representation of the overall accuracy of a test by plotting sensitivity against 1−specificity (the false-positive rate) for all thresholds, while the AUC quantifies the accuracy of the test. This study also calculated positive and negative likelihood ratios (+LR, -LR), positive predictive value (PPV), and negative predictive value (NPV) to evaluate the external validity of each tool. The ability of the tools to detect IBP was also evaluated in patients with SpA. Statistical analysis used SPSS® (Version 23.0; IBM, Armonk, New York, USA).
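As an illustration of the measures listed above, the following minimal Python sketch derives them from a single 2×2 table. It is not the SPSS procedure used in the study. The class and function names are hypothetical, and the example counts are an assumption back-calculated from the reported Calin percentages (76.0% of 50 cases and 10.0% of 50 controls).

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # cases (axSpA) meeting the IBP criteria
    fn: int  # cases not meeting the criteria
    fp: int  # controls (MBP) meeting the criteria
    tn: int  # controls not meeting the criteria


def accuracy_metrics(t: TwoByTwo) -> dict:
    """Standard diagnostic accuracy measures from one 2x2 table.

    Assumes no zero cells in the denominators (for example, specificity < 1).
    """
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }


# Assumed counts: 76.0% of 50 cases = 38, 10.0% of 50 controls = 5.
calin = TwoByTwo(tp=38, fn=12, fp=5, tn=45)
for name, value in accuracy_metrics(calin).items():
    print(f"{name}: {value:.3f}")
```

Because cases and controls were enrolled in equal numbers, the PPV and NPV computed this way reflect the 50% prevalence fixed by the study design rather than the prevalence in a screening population.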
Firstly, the different IBP criteria sets are defined (Table 1),16,23,24 with results explained successively.
A total of 100 respondents were enrolled in this study. The mean age was 39.30 (±13.31) years in the SpA group and 35.58 (±14.56) years in the MBP group. In both groups, 54.0% of participants were male and 46.0% were female. The most common age band was 40–49 years in the SpA group (38.0%) and 19–29 years in the MBP group (39.6%). Most of the patients lived in urban areas: 27 in the SpA group (61.4%) and 34 in the MBP group (75.6%). Among patients with SpA, 34.7% (n=17) had completed primary level education, whereas 31.3% (n=15) had completed graduation level education. In both groups, homemaker was the predominant occupation: 19 (43.2%) in the SpA group and 11 (25.6%) in the MBP group. The duration of disease was 115 (±79) months in the SpA group and 62 (±7) months in the MBP group. Biochemically, the level of haemoglobin was nearly equal in both groups. The levels of C-reactive protein (CRP) were significantly higher in the SpA group (25.95±30.24) than in patients with MBP (2.41±1.09), consistent with elevated CRP being a clinical feature of SpA. Serum glutamic pyruvic transaminase levels were relatively higher in patients with SpA (55.83±76.38) compared with the MBP group (0.81±0.12). Among the features of SpA in the case group, elevated CRP was the most frequent, present in 39 (79.6%) patients. Other features were a good response to nonsteroidal anti-inflammatory drugs in 36 (73.5%), arthritis in 25 (51.0%), and enthesitis in 18 (36.0%) patients. A history of anterior uveitis was present in 4 (8.3%) cases, a positive family history of SpA in only 4 (8.3%) patients, and psoriasis in 3 (6.0%) patients in the case group. The SpA features were absent in the MBP group because they were exclusion criteria for that group. On imaging, most patients presented with bilateral sacroiliitis (76.0%; n=38), and unilateral sacroiliitis was found in 24.0% (n=12) of cases.
The calculated Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) for 50 patients with axSpA was 2.780 (±1.232).
Among the available criteria sets for the definition of IBP, the Calin criteria had the highest sensitivity (76%), while the Berlin set 8a criteria had the highest specificity (94%). The Berlin set 8a also had a sensitivity (72.0%) near to Calin. The recently described ASAS IBP criteria showed the most balanced performance, with no clear superiority over the other two criteria sets (sensitivity: 64%; specificity: 82%). The highest +LR was 12 (95% CI: 3.952–36.436) for Berlin set 8a criteria. A comparison of different IBP criteria sets is shown in Table 2.16,23,24
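The likelihood ratios and their confidence intervals can be approximately reproduced from the reported percentages. The sketch below is an illustration only: it assumes that the intervals were obtained with the usual log-transformation method (the paper does not state the method) and that the positive counts are the reported percentages applied to 50 cases and 50 controls. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def lr_positive_ci(tp, n_cases, fp, n_controls, level=0.95):
    """Positive likelihood ratio with a log-method confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lr = (tp / n_cases) / (fp / n_controls)
    # Standard error of ln(+LR) for two independent binomial proportions.
    se = math.sqrt(1 / tp - 1 / n_cases + 1 / fp - 1 / n_controls)
    return lr, lr * math.exp(-z * se), lr * math.exp(z * se)


def diagnostic_odds_ratio(tp, n_cases, fp, n_controls):
    """Odds of a positive test in cases divided by the odds in controls."""
    fn = n_cases - tp
    tn = n_controls - fp
    return (tp * tn) / (fn * fp)


# Assumed positive counts (cases, controls) derived from the reported percentages.
positives = {
    "Calin": (38, 5),
    "Berlin 8a": (36, 3),
    "Berlin 7b": (29, 6),
    "ASAS": (32, 9),
}

for name, (tp, fp) in positives.items():
    lr, low, high = lr_positive_ci(tp, 50, fp, 50)
    dor = diagnostic_odds_ratio(tp, 50, fp, 50)
    print(f"{name}: +LR {lr:.3f} (95% CI {low:.3f} to {high:.3f}), OR {dor:.2f}")
```

Under these assumptions, Berlin set 8a yields the largest +LR and the largest diagnostic odds ratio of the four criteria sets, because it has the fewest positive results among controls.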
The individual performance of IBP items revealed some significant findings. IBP item ‘pain improves with activity, not with rest’ showed the highest sensitivity (97.0%); the best specificity was found for ‘morning stiffness >30min’ (88.0%). The highest +LR of 9.50 (95% CI: 9.49–9.50) was observed for the item ‘pain awakens second half of night’. ‘Pain response to exercise’ showed a significant odds ratio (OR) of 8.367 (95% CI: 3.610–19.395). The performance of individual items of the criteria sets for the detection of IBP is shown in Table 3.
With a few exceptions in demographic features, the clinical features and results of previous studies were similar to those of the present study. Half of the study subjects in both the SpA and MBP groups had an educational qualification above secondary school. The SpA group included 43% homemakers and 34% service holders; the MBP group, on the other hand, comprised 30% students and 26% homemakers.
Among the clinical variables, patients with SpA had higher CRP values (25.95±30.24) than controls (2.41±1.09). The proportions of patients with SpA and with MBP in whom IBP was detected were 76.0% and 10.0% by Calin, 72.0% and 6.0% by Berlin set 8a, 58.0% and 12.0% by Berlin set 7b, and 64.0% and 18.0% by ASAS criteria, respectively. The estimated BASDAI for patients with axSpA was 2.780 (±1.232). Assessment of the individual performance of IBP items revealed some significant findings. The item ‘age at onset’ showed good sensitivity (78.0%) and low specificity (16.0%) for SpA, which was consistent with other studies.26 The item ‘insidious onset’ was not clearly defined by previous studies or by the original developers of the various criteria sets. The NHANES questionnaire offers several response options for this item; in this study, it was measured using two of them: ‘over 3 weeks’ and ‘month up to a year’.
For the option ‘over 3 weeks’, the sensitivity and specificity were 98.0% and 14.0%, respectively, which is a very poor trade-off with specificity in the case of SpA and dissimilar to other studies. For the option ‘month up to a year’, however, the sensitivity and specificity were 65.8% and 57.6%, with an OR of 1.080. For morning stiffness, the NHANES questionnaire on which this study was structured had only one option: ‘morning stiffness >30 minute’. This item showed 70.0% sensitivity and 58.0% specificity, with a significant OR of 4.50 (95% CI: 2.036–9.945).
‘No improvement with rest’ achieved 90.0% sensitivity and 15.0% specificity. The item ‘improves with exercise but not with rest’, used instead of ‘no improvement with rest’, had higher specificity (90.0%), along with a significant OR of 1.250 (95% CI: 1.088–1.436). For ‘awakening during the second half of the night’, the item was scored as positive when the respondent endorsed either of two options: ‘wake up after have been sleeping for 4 or more hours’ or ‘kept from sleeping for more than 4 hours at a time’. The sensitivity (59.0%) and specificity (70.0%) of this item, which indicates ‘nocturnal pain’, were also consistent with previous studies.
The last IBP item, ‘alternating buttock pain’, showed a significant difference between this study (84.0% sensitivity and 70.0% specificity) and past studies. In addition, when components of the IBP criteria sets were analysed individually, the highest ORs were observed for ‘pain improves with exercise but not with rest’, ‘pain improves with exercise or activity’, and ‘morning stiffness’. The highest +LR of 9.5 (95% CI: 9.49–9.50) and OR of 8.367 (95% CI: 3.610–19.395) were observed for ‘pain awakens second half of night’ and ‘pain improves with exercise or activity’, respectively. Therefore, considering the duration of morning stiffness >30min, Calin’s sensitivity (88.4%) and specificity (78.9%) were consistent with the sensitivity and specificity of a previous study.27
Regarding AUC assessment, the Calin criteria had an AUC of 0.830 (95% CI: 0.749–0.911), which also supports the validity of this study. A +LR of 7.6 (95% CI: 3.261–17.71) and a disease prevalence of 0.50 (95% CI: 0.398–0.602) were found for Calin in this study.
In this study, the sensitivity and specificity of the Berlin set 8a criteria were 72.0% and 94.0%, respectively. The specificity (94.0%) in this study was consistent with that of the ASAS validation study (91.4%).27 Amongst individual items of IBP, the highest sensitivity (84.0%) for SpA was that of ‘alternating buttock pain’. Berlin set 7b and 8a criteria have similar item combinations, except that ‘alternating buttock pain’ is not an item of Berlin set 7b. With this reduced item set, the sensitivity of Berlin set 7b was lower than that of set 8a, but it was consistent (58.0%) with previous studies;26,27 the sensitivity of ‘alternating buttock pain’ might be responsible for this difference.
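The effect of dropping an item can be illustrated with a minimal scoring sketch. The item keys are hypothetical labels, and the threshold of at least two positive items for both sets is an assumption made here for illustration, since the criteria definitions of Table 1 are not reproduced in this section.

```python
# Hypothetical item keys; the threshold of two positive items is an assumption.
BERLIN_8A_ITEMS = (
    "morning_stiffness_over_30min",
    "improves_with_exercise_not_rest",
    "awakens_second_half_of_night",
    "alternating_buttock_pain",
)
# Set 7b is modelled as set 8a without 'alternating buttock pain'.
BERLIN_7B_ITEMS = BERLIN_8A_ITEMS[:3]


def meets_set(responses: dict, items, threshold: int = 2) -> bool:
    """Return True when at least `threshold` of the listed items are positive."""
    return sum(bool(responses.get(item, False)) for item in items) >= threshold


# A respondent whose second positive item is alternating buttock pain.
patient = {
    "morning_stiffness_over_30min": True,
    "alternating_buttock_pain": True,
}
print("Berlin 8a:", meets_set(patient, BERLIN_8A_ITEMS))
print("Berlin 7b:", meets_set(patient, BERLIN_7B_ITEMS))
```

With the same threshold and one item fewer, every respondent positive on the smaller set is also positive on the larger set, so under this assumption the sensitivity of set 7b cannot exceed that of set 8a.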
Regarding AUC analysis, Berlin set 8a had an AUC of 0.830 (95% CI: 0.745–0.915), a +LR of 12 (95% CI: 3.952–36.436), and a prevalence of 0.50 (95% CI: 0.3983–0.6017). ROC curve analysis showed that the ASAS criteria had an AUC of 0.730 (95% CI: 0.629–0.831); a +LR of 3.556 (95% CI: 1.899–6.656) and a prevalence of 0.50 (95% CI: 0.3983–0.6017) were found.
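For a criteria set scored as simply positive or negative, the empirical ROC curve has a single operating point, and the area under it reduces to the mean of sensitivity and specificity. The sketch below illustrates this with the reported values. It is not the SPSS output, and the function name is hypothetical.

```python
def single_threshold_auc(sensitivity: float, specificity: float) -> float:
    """Trapezoidal area under a ROC curve with one operating point.

    The curve joins (0, 0), (1 - specificity, sensitivity), and (1, 1).
    """
    fpr = 1 - specificity
    left = 0.5 * fpr * sensitivity                # triangle from (0, 0) to the point
    right = 0.5 * (1 - fpr) * (sensitivity + 1)   # trapezoid from the point to (1, 1)
    return left + right


# Sensitivity and specificity as reported for each criteria set.
reported = {
    "Calin": (0.76, 0.90),
    "Berlin 8a": (0.72, 0.94),
    "ASAS": (0.64, 0.82),
}
for name, (sens, spec) in reported.items():
    print(f"{name}: AUC = {single_threshold_auc(sens, spec):.3f}")
```

This explains why Calin and Berlin set 8a share the same AUC despite different operating points: the higher sensitivity of one is offset by the higher specificity of the other.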
In conclusion, this study aimed to develop a valid, reliable Bengali IBP tool to assess the prevalence of IBP among the 260 million Bengali people living around the world. This tool can also help physicians to assess IBP among Bengali people. Moreover, the performance of IBP criteria sets is not the same around the world. These results suggest that, among the available criteria sets for the definition of IBP, the Berlin set 8a criteria had a sensitivity of 72% and the highest specificity (94%). The sensitivity of Berlin set 8a was also close to that of Calin (76%). The recently described ASAS IBP criteria showed a balanced performance, with no clear superiority over the other two criteria sets.
The highest +LR was found for Berlin set 8a criteria. The Berlin 8a criteria set can be advocated for use in primary care practice because sensitivity is important at the first screening, while specificity becomes more important at higher levels of care.
- Due to financial constraints and time limitations, this study was conducted in a small sample. With future financial support, it could be repeated in a larger population.
- This is a screening test. This study included cases and controls according to the imaging arm of ASAS axSpA classification criteria, which is already established.
- There may be a chance of some degree of recall bias.