Spin in science writing – 6: A critique of a case-control study

This post is a follow-up to my previous post on the same topic, which was inspired by a journal club podcast that dealt with a research paper on cannabis use and psychosis.

The podcast was presented by Matt, Chris, and Don at the Population Health Exchange of the Boston University School of Public Health.

Delving into the subject, in this post I deconstruct the paper with an emphasis on “spin in science writing”.

About the study

This was a case-control study conducted in several countries. It appeared in The Lancet Psychiatry in 2019 under the title “The contribution of cannabis use to variation in the incidence of psychotic disorder across Europe (EU-GEI): a multicenter case-control study”. You can access the full paper by clicking this link.

About the study design

In the abstract’s methods section, the authors write: “We included patients aged 18–64 years who presented to psychiatric services in 11 sites across Europe and Brazil with first-episode psychosis and recruited controls representative of the local populations.”

My comment: The cases were patients with first-episode psychosis, the ideal choice for capturing incident (new) cases. However, other research reveals that in the UK the median time between the first appearance of symptoms and the diagnosis of first-episode psychosis is 2–2.5 years, and the median age at diagnosis is 30 years. Further research reveals that as many as 64 percent of those who experienced first-episode psychosis had used cannabis, and 30 percent of them had a cannabis use disorder. These facts resonate with one of the inherent problems of the retrospective case-control design: recall bias. It is troubling because the relationship between cannabis use and first-episode psychosis could be bi-directional.

Moreover, as the journal club podcast presenters point out, cannabis may just be unmasking psychosis among those who are genetically predisposed.

For these reasons, temporality – whether cannabis use precedes first-episode psychosis – is very difficult to establish.

Causality assumption:

In the abstract’s methods section, the authors write: “Assuming causality, we calculated the population attributable fractions (PAFs) for the patterns of cannabis use associated with the highest odds of psychosis and the correlation between such patterns and the incidence rates for psychotic disorder across the study sites.”

Assuming causality: The authors “assume causality” in their study based on findings from previous studies. As the podcast presenters highlight, this assumption is a huge leap because the case-control design, by definition, does not allow causal inference. Having assumed causality, they go further and calculate population attributable fractions (PAFs).
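
For readers unfamiliar with the PAF, here is a minimal sketch of how such a figure can be computed, using Levin's formula with invented numbers (not the study's actual inputs). Note that treating the odds ratio as a stand-in for the relative risk is itself only defensible under the causality assumption being questioned here.

```python
# A minimal sketch of the population attributable fraction (PAF)
# using Levin's formula. Numbers are hypothetical, not the study's.

def paf(exposure_prevalence: float, relative_risk: float) -> float:
    """Levin's formula: PAF = p(RR - 1) / (p(RR - 1) + 1)."""
    excess = exposure_prevalence * (relative_risk - 1)
    return excess / (excess + 1)

# Hypothetical example: 20% of the population are daily users, and an
# odds ratio of 3.2 is treated as an approximate relative risk.
print(f"PAF = {paf(0.20, 3.2):.1%}")  # -> PAF = 30.6%
```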

Causality language cannot be used in case-control study designs.

Causality language

Case-control study designs are observational by nature. They allow us to conclude associations, certainly not causation. In other words, we cannot use words or phrases that allude to causality, which means no declarative verbs. Instead, we should ideally use “descriptive” verbs to describe associations.

You can find more details about what declarative and descriptive verbs mean by reading this post.

In this post, I search for and highlight the sections, sentences, and phrases that I consider to contain declarative and descriptive words and verbs.

However, I invite readers to contribute to this post, and I am willing to correct myself if, in your opinion, my facts and arguments are incorrect.

In the study’s title:

The authors write: “The contribution of cannabis use to variation in the incidence of psychotic disorder across Europe (EU-GEI): a multicenter case-control study”.

My comment: Case-control study designs may demonstrate association but not causation. The phrase – the “contribution” of cannabis use to the “incidence” of psychotic disorder – alludes to a causal relationship. However, the word “contribution” may also denote an association.

In the abstract:

Findings section: The authors write: “Daily cannabis use was associated with increased odds of the psychotic disorder compared with never users (adjusted odds ratio [OR] 3·2, 95% CI 2·2–4·1)”.

My comment: This is a correct characterization of the study findings because case-control study designs warrant the phrase “associated with”.
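
As a refresher, the odds ratio in a case-control design comes from a 2×2 table of exposure among cases and controls. Below is a minimal sketch with invented counts (not the EU-GEI data) showing the unadjusted calculation; the paper's figure of 3.2 is an adjusted estimate from logistic regression.

```python
import math

# A minimal sketch of an unadjusted odds ratio with a 95% CI from a
# 2x2 table. Counts are hypothetical, not taken from the EU-GEI study.
#              exposed  unexposed
# cases          a=120      b=280
# controls       c=60       d=440
a, b, c, d = 120, 280, 60, 440

odds_ratio = (a * d) / (b * c)                # cross-product ratio
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of log(OR), Woolf method
low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI {low:.2f}-{high:.2f}")
# -> OR = 3.14, 95% CI 2.23-4.43
```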

Prevention?:

Findings section: The authors write: “The PAFs calculated indicated that if high-potency cannabis were no longer available, 12·2% (95% CI 3·0–16·1) of cases of first-episode psychosis could be prevented across the 11 sites, rising to 30·3% (15·2–40·0) in London and 50·3% (27·4–66·0) in Amsterdam.”

My comment: They claim that by removing high-potency cannabis, 12.2 percent of first-episode psychosis cases across the 11 sites could be prevented. And in some sites – London and Amsterdam – the figure could be as high as 30.3 percent and 50.3 percent, respectively. This claim goes well beyond what the study design warrants because it rests on the assumption of causality.

Interpretation section: The authors write: “Differences in frequency of daily cannabis use and in use of high-potency cannabis contributed to the striking variation in the incidence of psychotic disorder across the 11 studied sites. Given the increasing availability of high-potency cannabis, this has important implications for public health.”

My comment: Is it appropriate to use the phrase “contributed to … the incidence of psychotic disorder” when, again, it alludes to causality?

In the main text:

Results section: The authors write: “The use of high-potency cannabis (THC ≥10%) modestly increased the odds of a psychotic disorder compared with never use.”

My comment: In this sentence, the word “increased” suggests causality.

The authors write: “Adjusted logistic regression indicated that daily use of high-potency cannabis carried more than a four-times increase in the risk of a psychotic disorder (OR 4·8, 95% CI 2·5–6·3) compared with never having used cannabis.”

My comment: The phrase “four-times increase in the risk of psychotic disorder” suggests causality.

Discussion section: The authors write: “The strongest independent predictors of whether any given individual would have a psychotic disorder or not were daily use of cannabis and use of high-potency cannabis.”

My comment: The word “predictor” suggests causality.

The authors write: “The odds of the psychotic disorder among daily cannabis users were 3·2 times higher than for never users.”

My comment: I believe that this is a correct characterization of the findings.

The main focus of this post is to highlight the reporting practices of a case-control study with special reference to causality language. However, there are several other areas we could discuss here; one such issue is the recruitment of controls and their comparability.

The problem of recruiting controls:

Ideally, the recruited controls should be “potential cases”: should they experience first-episode psychosis, they would be expected to enter the study as cases. Therefore, the two groups’ basic characteristics should be broadly similar. However, according to Table 1, there are statistically significant differences between cases and controls in age, ethnicity, and education status; the cases were younger, had lower educational attainment, and were more often non-white. That means the control group did not represent the local population from which the potential cases of first-episode psychosis should originate.
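
A quick way to check whether such a baseline imbalance is statistically significant is a chi-square test on the case/control frequency table. A minimal sketch with invented counts (not Table 1's actual numbers):

```python
from scipy.stats import chi2_contingency

# A minimal sketch of the kind of baseline-comparability check behind
# Table 1. Counts are hypothetical, not the study's data.
# Rows: cases, controls; columns: e.g. white vs non-white ethnicity.
table = [
    [400, 500],  # cases:    400 white, 500 non-white
    [900, 380],  # controls: 900 white, 380 non-white
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
# A small p-value signals that cases and controls differ on this
# characteristic, i.e. the controls may not represent the source
# population that produced the cases.
```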

I found another excellent account based on this paper, written by Suzanne H. Gage for The Lancet; it also enriches this discussion. And I found another excellent write-up by Kristen Monaco for MedPage Today.

Spin in science writing – 5: Psychiatry and psychology

Earlier, we saw how distorted abstract reporting – a form of spin – occurs in health research. This post dives into a specific subject area: psychiatry and psychology.

In 2019, Samuel Jellison and his team published an excellent paper on this exact topic in the British Medical Journal. They looked at the frequency of distorted reporting in the abstracts of randomized controlled trials (RCTs) with non-significant primary endpoints, irrespective of funding source, published in psychiatry and psychology journals between January 2012 and December 2017.

They identified 116 papers and determined that 65 (56 percent) contained distorted reporting in their abstracts. Interestingly, they also found no statistically significant association between such reporting and the funding source, whether industry-funded or otherwise.

How spin occurs

Jellison and his team described how spin occurs in the abstracts.

Spin in the results section of the abstracts
  • Focusing on statistically significant secondary endpoints while omitting one or more statistically non-significant primary endpoints
  • Focusing only on statistically significant primary endpoints while omitting other, statistically non-significant primary endpoints
  • Claiming equivalence for statistically non-significant primary endpoints
  • Using phrases like “trending towards significance”
  • Focusing on statistically significant sub-group analyses of the primary endpoint

Spin in the abstract conclusions

  • Claiming benefit based on statistically significant secondary endpoints
  • Claiming equivalence versus comparator for a statistically non-significant endpoint
  • Claiming benefit using statistically significant sub-group analysis

In 2017, a Japanese group of researchers published a paper in PLOS ONE on exactly this topic. They compared the conclusions written in the abstracts of 60 papers with the results of their primary outcomes. These papers reported effective interventions in the mental health and psychiatry field.

They determined that twenty of the sixty papers included “overstatements”. And nine papers reported statistically significant results of secondary outcomes or subgroup analyses in their abstracts when none of their primary outcomes showed positive results.

Let us look at a few details as they appear in the paper.

Not reporting non-significant results of the primary outcomes, instead reporting significant results of secondary outcomes

Example 1:

This study compared the efficacy of a web-based, counselor-assisted problem-solving intervention (n=65) – the web-based aspect was not mentioned in the abstract – with access to internet resources (n=67). It was a randomized clinical trial involving adolescents aged 12–17 years admitted to a hospital with traumatic brain injuries. The interviewers were blinded to the intervention method. The primary outcome was measured using the Child Behavior Checklist (CBCL), as reported by the parents before and after the intervention.

The result for the primary outcome – the CBCL for adolescents aged 12–17 years – was not statistically significant. The authors did not report it in the abstract; instead, the abstract includes significant results for its sub-group analyses – late adolescents versus early adolescents.

Example 2:

This was a randomized controlled trial evaluating the effectiveness of a depression intervention, compared with usual care, for women who screened positive for major depression, dysthymia, or both. The primary outcomes were changes in depressive symptoms and functional status 12 months after the intervention. The secondary outcomes were at least a 50% reduction in, and remission of, depressive symptoms, global improvement, treatment satisfaction, and quality of care.

According to the results reported in the main text, of the two primary outcomes, symptom reduction was statistically significant at 12 months but functional status was not; the secondary outcomes, however, achieved statistically significant results.

In the abstract, the authors mentioned only the positive results.

Their advice:

Scrutinize the full-text results; do not rely only on the abstracts.

Shinohara, K., Suganuma, A. M., Imai, H., Takeshima, N., Hayasaka, Y., & Furukawa, T. A. (2017). Overstatements in abstract conclusions claiming the effectiveness of interventions in psychiatry: A meta-epidemiological investigation. PLoS ONE, 12(9). https://doi.org/10.1371/journal.pone.0184786

Relationship with journal impact factor and sample size

The authors of this study reported a very interesting relationship between abstract “overstatements” and the publishing journal’s impact factor and the study’s sample size: journals with an impact factor below 10 and studies with sample sizes below 300 were more likely to carry abstract “overstatements”.

Not reporting abstracts in the structured format

As early as 2013, CONSORT recommended that the abstracts of all randomized controlled trials be reported in a structured format; however, the study authors noted that a number of studies had not followed the recommendation.

Spin in science writing – 4: Non-reporting of negative outcomes

This post brings another two spin methods: reporting statistically significant secondary endpoints (outcomes) in the abstract when the primary endpoints (outcomes) are statistically non-significant in randomized controlled trials (RCTs), and not reporting the adverse effects of interventions.

RCTs carry the highest level of evidence strength in research. In this study design, we compare a new treatment or intervention against an existing one. We do not always find statistically significant results in the experimental arm. In those situations, evidence exists that researchers tend to give undue prominence to secondary outcomes, pushing the primary outcome into the background when it is not statistically significant.

Let us find out how it occurs in context.

A group of Canadian researchers reviewed 164 papers on randomized clinical trials carried out to determine the efficacy of new treatments for breast cancer. They extracted the papers, published between 1995 and 2011, from PubMed. The breakdown by type of trial was as follows: 148 on new drugs, 11 on new radiation methods, and 5 on new surgical methods. The main outcome (primary endpoint) was survival – overall, disease-free, or progression-free – in years. The paper appeared in Annals of Oncology in 2013.

These researchers found that 72 of all the trials (43.9 percent) reported statistically significant primary outcomes (endpoints); the remaining 92 trials (56.1 percent) did not.

What did they find with regard to spin?

Reporting the trials as positive based on statistically significant secondary outcomes

They found that 54 of the 92 trials (59 percent) with statistically non-significant primary outcomes (endpoints) were reported as positive in the abstracts on the basis of statistically significant secondary outcomes (endpoints). However, the researchers do not elaborate in the paper on what these secondary outcomes were.

Interestingly, this practice did not show any significant association with the journal’s impact factor.

Inaccurate reporting of adverse effects (toxicity) of the interventions

This warrants a little explanation. The reviewers used a hierarchical scale from 1 (excellent) to 7 (very poor). If severe and life-threatening toxicities were not mentioned in the abstract, they classified that paper as poor: somewhere between 5 and 7 on the scale.

On this scale, they had to classify 110 papers (67 percent of all the papers) as “poor”; notably, these include not only the trials with non-significant primary outcomes but also trials with significant primary outcomes.

Another interesting finding was that although most of the trials – 103 (almost two-thirds) – were industry-funded, the reviewers could not find a significant association between the funding source and the reporting or non-reporting of toxicities.

Spin in science writing – 3: Misuse of adjectives

This is my third post about spin in science writing. While the first one discussed inappropriate usage of causal language in reporting observational studies, the second focused on making inappropriate recommendations based on observational study designs.

This post deals with the inappropriate use of adjectives and adjectival phrases in reporting both observational studies and randomized controlled trials (RCTs). The contents of this post are based on the findings of a study on this topic in randomized trials.

In 2015, a research team published a paper on their findings on the use of adjectives in 16,789 randomized controlled trial abstracts indexed in PubMed.

Among many, the most common adjectives and adjectival phrases found in either the titles or abstracts of RCTs are as follows (a small code sketch after the list shows how one might flag such terms automatically):

  • well tolerated
  • meaningful
  • feasible
  • successful
  • usual
  • good
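
The sketch below shows, with an invented abstract, how one might flag such adjectives automatically; the word list simply follows the terms quoted above.

```python
import re

# A minimal sketch for flagging spin-prone adjectives in an abstract.
# The word list follows the terms quoted above; the abstract text is
# invented for illustration.
SPIN_ADJECTIVES = [
    "well tolerated", "well-tolerated", "meaningful", "feasible",
    "successful", "usual", "good",
]
pattern = re.compile(
    "|".join(re.escape(term) for term in SPIN_ADJECTIVES), re.IGNORECASE
)

abstract = ("The drug was well tolerated and showed good efficacy, "
            "suggesting a clinically meaningful benefit.")

for match in pattern.finditer(abstract):
    print(f"possible spin adjective: {match.group(0)!r}")
```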

Is it wrong to use adjectives in the abstract?

The use of adjectives in science writing is a double-edged sword: they allow writers to generalize study findings, which we should not do on the basis of a single study.

In 2012, the Medical Publishing Insights and Practices Initiative (MPIP) recommended avoiding the use of adjectival phrases such as, “well-tolerated” and “generally safe”.

I quote below, in context, some of the adjectival phrases the MPIP highlighted, to help us better understand their usage.

  • “The data suggest that the drug ——- is well tolerated by the high-risk patients”
  • “clinically meaningful”
  • “feasible”
  • “demonstrate good clinical efficacy and safety”
  • “successful treatment”

Spin in science writing – 2: making clinical recommendations

In my previous post, I wrote about one spin method in science writing: the use of causal language in reporting findings from observational study designs.

This post adds another spin method frequently used in reporting the results of observational studies: making clinical recommendations based on results from observational study designs.

Observational studies are very useful in science; however, these designs cannot be used to make clinical recommendations without first suggesting randomized trials. This is because these designs only allow us to determine prevalence and incidence or to demonstrate associations or correlations. They certainly do not allow us to infer causation.

However, this happens:

In 2013, a research team reviewed 298 observational studies published in 2010 in four high-impact research journals.

They found that more than half (exactly 56 percent) of these studies made clinical recommendations without first calling for randomized controlled trials; only 14 percent suggested that step.

To put this into context, I delved into two of the studies mentioned in their paper.

Case study 1

The title of this study is “Fructose-rich beverages and the risk of gout in women”. It was a prospective cohort study spanning 22 years. The researchers documented 778 new cases of gout during the study period and found a statistically significant association between the consumption of fructose-rich beverages and the occurrence of gout.

They wrote their conclusion in the abstract as follows: “the data suggest that fructose-rich beverages increase the risk of gout in women”.

Can they make this conclusion? If you remember my previous post, these researchers have made the same mistake here: using causal language in an observational study design.

Further, in the main text, they step into the other type of spin: under the discussion section, they promote the reduction of fructose intake – a recommendation the study design does not allow. They do not suggest a randomized study either.

Case study 2

In this study, the researchers compared 366 children with ADHD against 1047 controls for genetic variants. Based on their positive results, they recommend routine screening of children with ADHD for such variants.

Can they make such a recommendation based on this study?

The simple answer is no.

However, the presence of findings from randomized controlled trials is not the only criterion for making recommendations; it is a far more complex topic. Because of that, experts in the field have developed the GRADE approach as a guideline for developing practice recommendations.

What is the GRADE approach?

GRADE is an acronym for the Grading of Recommendations Assessment, Development, and Evaluation. Its working group presented its first report in 2004.

It categorizes the quality (strength) of the evidence produced into four levels: high, moderate, low, and very low. While evidence derived from randomized studies is considered high quality, evidence from observational studies is considered low quality with regard to making recommendations.

However, with regard to testing the accuracy of diagnostic tests, there are instances where observational studies provide high-quality evidence.
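
To make the scheme concrete, here is a minimal sketch of GRADE's starting levels by study design, together with an illustrative (not official or exhaustive) summary of the factors that move a rating down or up:

```python
# A minimal sketch of GRADE's starting evidence quality by study design.
# The modifier lists summarize the GRADE factors for illustration; they
# are not an official or complete encoding of the approach.
STARTING_QUALITY = {
    "randomized controlled trial": "high",
    "observational study": "low",
}

RATE_DOWN = ["risk of bias", "inconsistency", "indirectness",
             "imprecision", "publication bias"]
RATE_UP = ["large effect", "dose-response gradient",
           "plausible confounding would strengthen the effect"]

design = "observational study"
print(f"{design}: starts at {STARTING_QUALITY[design]!r} quality; "
      f"may be rated up (e.g. {RATE_UP[0]}) or down (e.g. {RATE_DOWN[0]}).")
```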

In any case, before making any recommendation, one has to consult the GRADE approach since, in addition to the study design, several other factors must be considered.


Spin in science writing – 1: Observational study designs

Spin in writing occurs all the time, and it occurs in writing research papers too.

What is “spin in writing”?

It is a sort of word game that distorts the evidence shown by the data. It may or may not be deliberate.

Contrary to popular belief, research has shown that spin occurs irrespective of the funding source – industry-funded or public-funded.

This post discusses one aspect of spin in science writing. I use a much-debated research topic and then explore expert suggestions for avoiding this kind of spin.

Hormone replacement therapy for coronary heart disease in those with post-menopausal symptoms

Background

Since the early 1980s, physicians have been prescribing hormones – estrogen and progesterone – to alleviate women’s post-menopausal symptoms. During this time, some believed these hormones might also prevent the coronary heart disease associated with the post-menopausal period. A group of interested researchers followed 121,964 female nurses from 1976 to 1980, collecting their responses to a mailed questionnaire periodically during the study period. Their aim was to compare non-fatal and fatal heart attack rates between nurses who took the hormones and those who did not. The findings of their study appeared in the New England Journal of Medicine in 1985. You can read its abstract here.
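
Since this is a cohort design, the natural effect measure is a relative risk rather than an odds ratio. A minimal sketch with invented counts (not the actual study data):

```python
# A minimal sketch of a relative risk from cohort-style counts.
# Numbers are hypothetical, not the Nurses' Health Study data.
events_exposed, n_exposed = 30, 30_000      # hormone users with CHD events
events_unexposed, n_unexposed = 60, 30_000  # non-users with CHD events

risk_exposed = events_exposed / n_exposed
risk_unexposed = events_unexposed / n_unexposed
relative_risk = risk_exposed / risk_unexposed

print(f"RR = {relative_risk:.2f}")  # -> RR = 0.50, half the risk among users
# Even a striking RR from an observational cohort describes only an
# association; it does not license a declarative verb like "reduces".
```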

Look closely at the last sentence of the abstract. The study authors claim that their “data support the hypothesis that the postmenopausal use of estrogen reduces the risk of severe coronary heart disease”.

The keyword here is “reduces”.

Can they make that claim?

The experts say “NO”

They should not have used the verb, “reduces”.

Why?

Because they adopted an observational study design. Observational study designs do not warrant the use of “declarative” verbs; we can use only “descriptive” verbs.

Observational study designs provide information about associations or correlations between the study variables, certainly not about causative relationships. If we want strong evidence for causative mechanisms, we should adopt randomized controlled study designs. In other words, we can use declarative verbs such as “reduce” only with randomized designs.

Declarative verbs and descriptive verbs

It is worth digging a little further into declarative and descriptive verbs. I found a very useful paper on this subject: in 2017, two Austrian surgeons published a short paper describing the differences between declarative and descriptive verbs, citing examples. I am using the same example verbs and verbal phrases here because the list is comprehensive and very useful, particularly for those for whom English is not their primary language.

What are declarative verbs and verb phrases?

Some examples of declarative and descriptive verbs and verbal phrases are shown below. The former declare some definitive action; the latter describe.

Declarative words and phrases:
  • show
  • demonstrate
  • prove
  • establish
  • cause
  • determine
  • result in
  • reduce
  • increase/decrease

Descriptive words and phrases:
  • are in favor
  • are in association with
  • believe
  • bring forward
  • carry, provide
  • find, report, suggest
  • get
  • have
  • highlight
  • hold, underscore
  • identify, think, underlie
  • look, observe
  • maintain, place

About the verb “show”

In that paper, the authors caution us about the verb “show” – it can be used in either a descriptive or a declarative sense – and they suggest using the following verbs instead when “show” is needed in its descriptive sense.

  • depict
  • exhibit
  • display
  • expose
  • reveal

Hedging verbs

There is another set of verbs called “hedging verbs” or verbal phrases. The following are examples:

  • This may account for
  • seems
  • appears
  • suggest
  • could support
  • indicate

As the paper suggests, we should not use these to imply or define causation.

How should we use the adjective, “significant”?

I had this puzzle at the beginning of my career. In research papers, the word “significant” is used both with its literal meaning and when discussing statistical findings; we describe our findings as either “statistically significant” or “statistically non-significant” based on the probability value (p-value). The paper suggests reserving the adjective for use with “statistically” to avoid confusion. For other uses, the authors suggest words such as “substantial” and “substantive” instead of “significant”.

So, how should we now write up a statistically significant or non-significant finding from an observational study design?

The authors of the above paper suggest that we should write sentences such as follows:

  • A statistically significant difference was found …
  • A is associated with a statistically significant increase (or decrease) …

“trend”

We commonly find this word in research papers, and even in abstracts, when the p-value is more than 0.05. The experts advise against this practice, particularly when it is used to “imply or claim statistical significance”.

Correlate and agreement

These verbs, too, should be reserved for discussing statistics: “correlate” belongs with the Pearson r and Spearman rank correlation tests. Similarly, “X was in agreement with Y” should be avoided because of Cohen’s kappa “agreement” statistic.
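
As a quick illustration of why these words are tied to specific statistics, here is a minimal sketch with invented data (using SciPy and scikit-learn):

```python
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import cohen_kappa_score

# A minimal sketch, with invented data, of the statistics these words
# are reserved for: "correlate" for Pearson/Spearman, "agreement" for
# Cohen's kappa.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r, p_r = pearsonr(x, y)       # Pearson product-moment correlation
rho, p_rho = spearmanr(x, y)  # Spearman rank correlation

rater_a = ["yes", "no", "yes", "yes", "no"]
rater_b = ["yes", "no", "no", "yes", "no"]
kappa = cohen_kappa_score(rater_a, rater_b)  # Cohen's "agreement" kappa

print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}, kappa = {kappa:.2f}")
```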

Verb tense

Verb tense also matters. The two editor-surgeons above claim that the past tense is always descriptive and the present tense declarative. For example, “X (intervention) decreased mortality” is descriptive, while “X decreases mortality” is declarative and implies causality.

However, the year 2002 reversed it all. JAMA published a paper based on findings from a randomized controlled design (the Women’s Health Initiative trial). According to the paper, the authors stopped the study prematurely because the breast cancer risk exceeded expectations; they also found higher risks of coronary heart disease and stroke. This debate is still going on, and I do not intend to delve into the conflict. My aim is to emphasize choosing the verbs that most appropriately reflect the strength of evidence, given the study design.

Those who are interested can read the following blog post by Hilda Bastian, which explains more about this subject: https://absolutelymaybe.plos.org/2016/03/17/how-to-spot-research-spin-the-case-of-the-not-so-simple-abstract/