Estimating the 1st report’s missing raw data

Once we start working with estimates and deductive limits it is important to remember that the number of those “swapped” is a different issue from the fact that a swap took place. In the preceding section I demonstrated with mathematical rigor how the swap takes place and that this or that estimate is not the point. The swap is inherent in the methodology.

However, using examples can be a useful aid. So in the following pages I will present quantities. There are many ways to tease the numbers out of the authors’ material.

But again, my purpose here is to demonstrate a math-assisted swap and not a science-based estimate of FH prevalence. There is always a risk that the central argument will be lost in peripheral distractions: if we can see this or that basis for an estimate, then we have an easier-to-find conflict, and conflicts are the main literary vehicle for entertainment.  We can then be distracted from the conflict-free deduction which underlies this analysis and which affords no room for debate.

Good math precedes a solid scientific conclusion, and bad math precludes it. So the following pages will not be an attempt at scientific discussion but at illustrating the bad math. It is as if a telephone pole were down and, after showing that, I then proceeded to illustrate that same fact, but this time by teasing apart and testing combinations of wires in the telephone box. If a dispute emerges surrounding “blue” and “red wires,” “here” and “not there,” then it is hoped that the central issue of the downed telephone pole will be remembered as the real target of the discussion.

A Guideline: We will estimate the mutation hits originally above the clinical detection point.

Heretofore we have given the 1st report the best possible footing by working with a deductive ceiling: the maximum number of Top4 mutation carriers that could exist above the clinical cutoff used for the 1st report, which is 25. However, that number would require that there be zero Top4 hits among the next 40,000 added to the 2nd report. We would find roughly 1 in every 2,400 in the 1st report’s 60,000, and then, after scanning the next 40,000, we would find zero mutation carriers who scored DLCN definite or probable. This is of course unrealistic.

So then, how can we keep estimates in a responsible range? We try to remember not only the estimate for the 1st report but also the consequences such a number would have for the 2nd report.  We can’t allow too much for the one without leaving the other ridiculous.

The following chart demonstrates the interdependence that estimates have within the 1st and 2nd reports.[1] The more hits for the Top4 originally above the clinical detection level that one estimates for the 1st report, the fewer one has for the 2nd report. We also track the resulting false positive percentage, using a practical setting. (I.e., we don’t assume that molecular testing will take place for an entire population, but only on the pool isolated by a passing clinical score.)

Note that the 60-40 proportion (the 1st report’s population compared to what was added to the 2nd report) results in a real-world false positive rate of 80.5%. This estimates only 15 Top4 hits originally above clinical detection.
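The interdependence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it reads the ceiling of 25 as the combined count of Top4 carriers above the clinical cutoff across both reports (which is what the ceiling argument implies), and the names `split_hits` and `proportional_estimate` are hypothetical helpers, not anything taken from the reports. The false positive percentage is left out because its inputs are not given in this section.

```python
# Assumption: 25 is the combined number of Top4 carriers above the clinical
# cutoff across both reports, as implied by the deductive ceiling.
TOTAL_TOP4_ABOVE_CUTOFF = 25
FIRST_REPORT_POPULATION = 60_000   # 1st report
SECOND_REPORT_ADDITION = 40_000    # added for the 2nd report


def split_hits(first_report_hits):
    """Return (hits left for the 2nd report's addition,
    people per hit in the 1st report, people per hit in the addition)."""
    if not 0 <= first_report_hits <= TOTAL_TOP4_ABOVE_CUTOFF:
        raise ValueError("estimate must lie between 0 and the ceiling")
    remaining = TOTAL_TOP4_ABOVE_CUTOFF - first_report_hits
    # None stands for "no hits at all" in that group.
    per_hit_first = (FIRST_REPORT_POPULATION / first_report_hits
                     if first_report_hits else None)
    per_hit_addition = (SECOND_REPORT_ADDITION / remaining
                        if remaining else None)
    return remaining, per_hit_first, per_hit_addition


def proportional_estimate():
    """1st report's share of the hits if they follow the 60/40 population split."""
    total_population = FIRST_REPORT_POPULATION + SECOND_REPORT_ADDITION
    return TOTAL_TOP4_ABOVE_CUTOFF * FIRST_REPORT_POPULATION / total_population


if __name__ == "__main__":
    for estimate in (25, 20, 15, 10):
        print(estimate, split_hits(estimate))
    print("60/40 estimate:", proportional_estimate())
```

At the ceiling of 25 the addition is left with no hits at all, while at 15 both groups show the same rate of one hit per 4,000 people, which is what the 60/40 proportion means.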

[Chart: Estimates of the 1st report’s molecular hits which originally scored above the clinical cutoff (range of Top4 in the first Danish report). Axis: estimated number of the four most frequent FH and FDB patients who scored Probable or higher. The 60/40 proportion matches the 1st report to the 2nd report’s addition. At an estimate of 15 Top4 hits originally above the clinical detection point, the real-world false positive rate works out to 80.5%.]

Let’s plug in our estimates and view the breakdown through the 1st report’s method. For our estimate of the Top4 mutation carriers who scored above the clinical detection point, we are going to give the 1st report the best footing possible. Instead of taking the average of 15, we’re going to remember that we’ve adjusted the 1st report’s clinical scores in order to work on the same population scale as the molecular results. (Click here for details.) We want to leave as little room for doubt as possible. So although this only affects the 14.5 estimate above, we will nonetheless add the 13% back in, always conscious of tilting grey areas in favor of the authors. The higher this number, the lower the false positives in the result. We will use 17.

With this, let’s estimate the full mutation spectrum of the 1st report according to the ratio used in the 2nd report (.387).[3]
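As a rough sketch of that scaling step, assuming that .387 is the share of the full mutation spectrum accounted for by the Top4 (the function name `full_spectrum_estimate` is hypothetical and used only for illustration):

```python
# Assumption: 0.387 is the fraction of all mutation carriers that the Top4
# mutations account for, so the full spectrum is Top4 hits divided by it.
TOP4_SHARE = 0.387


def full_spectrum_estimate(top4_hits, top4_share=TOP4_SHARE):
    """Scale a Top4 count up to an estimate for the full mutation spectrum."""
    if not 0 < top4_share <= 1:
        raise ValueError("share must be a fraction between 0 and 1")
    return top4_hits / top4_share


if __name__ == "__main__":
    # 17 is the Top4 estimate chosen above for the 1st report.
    print(round(full_spectrum_estimate(17), 1))
```

Under that reading, 17 Top4 hits scale to roughly 44 carriers across the full spectrum.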


[1] Click here for False Positive percentages under different scenarios.

[2] Click here and here.

[3] For treatment of the distribution of the Top4 and Ex-Top4 within the clinical categories, click here: “Weakest link in my analysis is nonetheless very strong.”