One can easily see why the CDC 2T is flawed by looking at the raw data in the latest WB study in China. Even though the study is for B. afzelii, there is no fundamental reason why Borrelia burgdorferi in the US would vary that much, since the immune response and the surface proteins are similar. One would not expect a difference large enough to support the CDC's 5/10 IgG criterion, and the choice of the actual 10 antibodies is dubious at best.
One wildcard is whether this study, or the US studies, got the antigens used in the WB right for the genotype surface-antigen variants. Nobody seems to account for strain-level variation, even though they swear only Bb infects people in the US. Gary Wormser mentions this problem but always just uses good old B31 for the antigens. The Chinese used one cultured sample for their antigens. Maybe everyone gets it wrong for this reason. Every study enrolls people infected with a variety of strains, yet every study uses one, or maybe two, strains for its WB antigens. If a few (or many) of the actual infecting strains have surface antigens with sufficiently different epitopes, the patients' antibodies may not bind the blotted antigens and simply won't show up on the WB.
A moment on sensitivity
True positive = correctly identified
False positive = incorrectly identified
True negative = correctly rejected
False negative = incorrectly rejected
sensitivity = true positives / (true positives + false negatives)
So it's critical that one knows, with near certainty, the number of true positives (this is where many studies fail).
It's also critical to know the number of false negatives, i.e. the incorrectly rejected.
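To make the definitions concrete, here is a minimal sketch. The counts are hypothetical, not from any study:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of genuinely infected patients the test correctly flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of uninfected patients the test correctly rejects."""
    return true_neg / (true_neg + false_pos)

# Hypothetical example: 100 culture/PCR-confirmed infections, 30 missed by the test.
print(sensitivity(true_pos=70, false_neg=30))   # 0.7
print(specificity(true_neg=90, false_pos=10))   # 0.9
```

This is exactly why the true-positive count matters: the denominator of sensitivity is the number of people who really are infected, regardless of what the test under study says.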
It's also extremely critical that the study population contain a reasonable mix of infecting strains/species, one that matches what will be encountered in the clinical setting. A study with only patients from the Northeast does not represent US diversity. For example, failing to include strains found in CA or the South will produce inaccurate results, since the CDC 2T is sensitive to strain and species diversity (Gary Wormser).
Knowing the number of true positives requires a test, such as culture or PCR, that is independent of the test under study (i.e. the CDC 2T). That test must be very good. The new ALS culture may be good enough, especially when combined with amplified high-volume blood PCR. Together these two yield an accurate mix of strains and an accurate count of true positives.
Now one must know the false negatives, the incorrectly rejected. If one knows the REAL number of true positives, and the mix of strains matches the clinical setting, then the CDC 2T negatives among the known true positives provide all the data needed.
Leaving off highly cross-reacting antibodies gives the best specificity, while using the most commonly occurring antibodies yields the best sensitivity: the usual sensitivity/specificity tradeoff.
Back to the Chinese study example:
So what does the raw data show? It shows that the US IgG 5/10 requirement for positive is far too biased toward specificity versus sensitivity. No wonder it fails so often. The US researchers routinely claim that virtually 100% of late NB is CDC positive. That is also absurd based on this data.
The ALS culture is undergoing validation at 2 independent labs. If it performs as advertised, then we can finally know what species and strains are in these study participants. Then it will be possible to properly "tune" the antigens used in the WB to be sure strain (and sometimes species) diversity isn't screwing up the CDC 2T test. I suspect the culprits are strain diversity, the lack of a test good enough to establish true positives, and study selection bias.
Table 1 shows, for each category (EM, NB, late, plus controls), the percentage of patients with a positive band for each antibody, for both IgG and IgM.
Let's focus on the IgG.
Table 1 shortened:
kDa        EM   NB   Late
P83/100     6   11    18
P75         3    2     7
P66         3    2     6
P58         5    5    11
P43         2    2     5
P41        15   20    25
P39         5   13    10
OspB 35     2    5     9
OspA 32    12   12    17
P30         4    3     4
P28         2    2     7
OspC 22     2    5    12
P17         3    7     7
P14         8    3    18
It doesn't take a ROC analysis to see that P83/100 and P14, the most common bands apart from P41, each appear in only 18% of the late category. There is a 4/5 probability that a given patient will lack each of these most common antibodies, and the odds only get worse for the remaining ones. For simplicity's sake, assume the 10 antibodies chosen in the ROC analysis all had an occurrence of 20%. That's better than the 2 best, P83/100 and P14, and far better than P43 and P30 at 5% and 4% respectively. In this optimistic case the probability of having any one band is 1/5, so the expected number of positive bands is 10 x 1/5 = 2. Even in this very optimistic case where they all appear at 20% (and they don't), the average test would show only 2/10 bands.
All the antibodies together add up to only about 157%. That means the average study participant had only about 1.57 antibodies. In other words, the odds of having one antibody are excellent while the odds of two are poor. They then run the ROC analysis to find the best set and the criterion x/n, trading sensitivity against specificity. One can see by inspection that the IgG criterion must have a 1 in its numerator or the sensitivity will plummet. That is the CDC 2T's fatal flaw.
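As a sanity check on that arithmetic, one can sum the late-NB column of the shortened table. Note the rounded values shown here sum to 156%; the ~157% figure presumably reflects the unrounded study data:

```python
# Late-NB band percentages from the shortened Table 1 (rounded values).
late_pct = {
    "P83/100": 18, "P75": 7, "P66": 6, "P58": 11, "P43": 5, "P41": 25,
    "P39": 10, "OspB": 9, "OspA": 17, "P30": 4, "P28": 7, "OspC": 12,
    "P17": 7, "P14": 18,
}

total = sum(late_pct.values())
print(total)         # 156 (rounded table; ~157 from unrounded study values)
print(total / 100)   # ~1.56 expected bands per late-NB patient
```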
The actual 10 antibodies chosen are 83/100, 58, 39, OspB, OspA, 30, 28, OspC, 17 and 14. Their total occurrence rate is 113%, so the expected number of bands per patient from this set is about 1.13 and the odds of having at least one are good. So 1/10 makes sense. The choice of the actual antibodies comes from the ROC analysis, which finds the best sensitivity versus specificity tradeoff. P41 is eliminated because even though it helps sensitivity, it is disastrous to specificity due to its high cross-reactivity; it is removed from the IgG criteria because it costs more specificity than it gains in sensitivity. The US boys blew this one.
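Under a crude independence assumption between bands (an assumption, since bands in one patient are surely correlated), the 1/10 criterion's sensitivity can be estimated directly from the occurrence rates of the ten chosen antigens, and the estimate lands close to the 69.8% the study reports:

```python
# Late-NB occurrence rates from the shortened Table 1 for the ten chosen IgG antigens.
chosen = {"P83/100": 0.18, "P58": 0.11, "P39": 0.10, "OspB": 0.09,
          "OspA": 0.17, "P30": 0.04, "P28": 0.07, "OspC": 0.12,
          "P17": 0.07, "P14": 0.18}

expected_bands = sum(chosen.values())   # linearity of expectation: ~1.13 bands

p_none = 1.0
for p in chosen.values():
    p_none *= (1 - p)                   # assumes bands occur independently
p_at_least_one = 1 - p_none

print(round(expected_bands, 2))     # 1.13
print(round(p_at_least_one, 2))     # 0.7, near the reported 69.8% sensitivity
```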
The ROC analysis chooses the best sensitivity versus specificity tradeoff by examining sets of antibodies, starting with the most common ones since that simplifies the analysis. They came up with 1/10 for IgG, and their 10 antibodies are not the same as in the US. I would expect the B. burgdorferi data in the US to differ somewhat in the preferred set, but not significantly in frequency of occurrence, if the selection bias used by most US researchers were corrected. It looks like the Chinese are smarter than US researchers. That's scary, since they are slowly cleaning our clock elsewhere.
They use the ROC analysis to choose which antibodies and the value of the denominator. Recall the US criterion has 5 in the numerator and 10 in the denominator. Both the choice of antibodies and the 5/10 are absurd if the US data for LB is anything like this data.
Yes, this is not B. burgdorferi, but the immune response and surface proteins are sufficiently similar that one would expect roughly similar results. For the results to support 5/10 for IgG, the late-NB percentages of the denominator set would need to sum to over 500%, meaning 5 bands would be seen on average. This data clearly supports only a 1 in the numerator.
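The same independence sketch shows just how far out of reach a 5/10 requirement is at these occurrence rates. This computes the exact Poisson-binomial distribution over band counts; the rates are from the shortened Table 1 and the independence between bands is an assumption:

```python
# Late-NB occurrence rates for the ten chosen IgG antigens (shortened Table 1).
probs = [0.18, 0.11, 0.10, 0.09, 0.17, 0.04, 0.07, 0.12, 0.07, 0.18]

# Build the exact distribution of the number of positive bands, assuming
# bands occur independently (Poisson-binomial via dynamic programming).
dist = [1.0]
for p in probs:
    new = [0.0] * (len(dist) + 1)
    for k, q in enumerate(dist):
        new[k] += q * (1 - p)       # this band absent
        new[k + 1] += q * p         # this band present
    dist = new

p_ge_5 = sum(dist[5:])
print(f"P(>=5 of 10 bands) = {p_ge_5:.4f}")   # well under 1%
```

So even granting these antigens, a patient population with this band distribution would almost never clear a 5-band hurdle.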
The total EM column doesn't even add up to 100%. That suggests even their criterion of 1/10 from P83/100, P58, P39, OspB, OspA, P30, P28, OspC, P17, and P14 will have poor sensitivity for EM patients.
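Applying the same independence sketch to the EM column supports that suspicion. The EM rates for the chosen ten are from the shortened Table 1, and independence between bands is again an assumption:

```python
# EM band rates for the ten chosen IgG antigens (shortened Table 1):
# P83/100, P58, P39, OspB, OspA, P30, P28, OspC, P17, P14.
em = [0.06, 0.05, 0.05, 0.02, 0.12, 0.04, 0.02, 0.02, 0.03, 0.08]

p_none = 1.0
for p in em:
    p_none *= (1 - p)               # assumes bands occur independently

est_sens = 1 - p_none
print(f"estimated 1/10 IgG sensitivity for EM: {est_sens:.2f}")  # roughly 0.4
```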
Results: "Criteria for a positive diagnosis of Lyme disease were established as at least one band of P83/100, P58, P39, OspB, OspA, P30, P28, OspC, P17, and P14 in the IgG test, and at least one band of P83/100, P58, P39, OspA, P30, P28, OspC, P17, and P41 in the IgM test. For IgG criteria, the sensitivity, specificity and Youden index were 69.8%, 98.3%, and 0.681, respectively; for IgM criteria, the sensitivity, specificity and Youden index were 47%, 94.2%, and 0.412, respectively."
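The reported Youden indices check out against the standard formula J = sensitivity + specificity - 1, a common one-number summary of a diagnostic cutoff (figures taken from the quoted results):

```python
def youden(sens: float, spec: float) -> float:
    """Youden's J statistic: 0 for a useless test, 1 for a perfect one."""
    return sens + spec - 1

print(round(youden(0.698, 0.983), 3))   # 0.681, as reported for IgG
print(round(youden(0.470, 0.942), 3))   # 0.412, as reported for IgM
```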
The greater the ignorance, the greater the dogmatism.
Attributed to William Osler, 1902