
Researchers at Rutgers University have identified significant shortcomings in how algorithms designed to detect “fake news” operate. The algorithms place disproportionate emphasis on the credibility score of an article’s source rather than assessing the credibility of each individual article, an approach the researchers argue is unreliable.

Because these algorithms judge an article by its source’s reputation rather than its own content, their labels often miss the mark on individual pieces: the study found that source-level labels align with article-level labels only 51% of the time. The researchers argue for a more nuanced approach, focusing on article-level labeling, to improve the detection of misinformation in online news.

Source Credibility vs. Article Credibility

According to Vivek K. Singh, an associate professor at the Rutgers School of Communication and Information and a co-author of the study, the credibility of a source does not automatically imply the accuracy of all articles published by that source. Likewise, articles from sources often labeled as ‘non-credible’ are not necessarily ‘fake news.’ Singh and Lauren Feldman, another associate professor and co-author of the paper, argue that assigning credibility labels based on the source is as arbitrary as randomly assigning true/false labels to news stories.

Implications of Source-Level Labeling

The study reveals that using source-level labels for credibility aligns with article-level labels only 51% of the time. This insight carries significant ramifications for the development of robust fake news detectors and for maintaining fairness across the political landscape. The researchers underscore that their findings highlight the necessity for more reliable and nuanced methods for online news misinformation detection.
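The 51% agreement figure amounts to asking, for each article, whether the label inherited from its source matches the label assigned to the article itself. A minimal sketch of that comparison is below; the helper function, article IDs, and source names are invented for illustration and are not from the study.

```python
# Illustrative sketch: how often does a source-level credibility label
# agree with the article-level label? All data below is hypothetical.

def agreement_rate(article_labels, source_of_article, source_labels):
    """Fraction of articles whose source's label matches the label
    assigned to the article itself."""
    matches = sum(
        1 for article_id, label in article_labels.items()
        if source_labels[source_of_article[article_id]] == label
    )
    return matches / len(article_labels)

# Toy example: four articles from two sources. Only a1's inherited
# label matches its article-level label, so agreement is 1/4.
article_labels = {"a1": "credible", "a2": "not_credible",
                  "a3": "credible", "a4": "credible"}
source_of_article = {"a1": "site_x", "a2": "site_x",
                     "a3": "site_y", "a4": "site_y"}
source_labels = {"site_x": "credible", "site_y": "not_credible"}

print(agreement_rate(article_labels, source_of_article, source_labels))  # 0.25
```

An agreement rate near 50% on binary labels, as the study reports, is what random guessing would produce, which is why the authors call source-level labeling arbitrary.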

Article-Level Labeling and Misinformation Detection

In response, the research team built a new dataset of news articles individually labeled for journalistic quality and outlined an approach for misinformation detection and fairness audits. The team assessed the credibility and political leaning of 1,000 news articles and used these article-level labels to train misinformation detection algorithms. The aim was to understand how article-level labeling affects the process and to establish whether the bias observed at the source level persists when the same machine-learning approach is applied to individual articles.
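The study does not publish its model code, but the general idea of training a detector on article-level labels can be sketched with a toy word-count (naive Bayes-style) classifier. The four-article corpus, labels, and function names below are invented for demonstration only.

```python
# Hypothetical sketch: train a tiny naive Bayes-style classifier on
# article-level labels (not source-level ones). Corpus is invented.
import math
from collections import Counter, defaultdict

def train(articles):
    """articles: list of (text, label) pairs. Returns per-label word
    counts and per-label document counts (priors)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in articles:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label maximizing log P(label) + sum log P(word|label),
    with add-one smoothing over the shared vocabulary."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(doc_count / total_docs)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

corpus = [
    ("officials confirm report with documented evidence", "credible"),
    ("study cites peer reviewed data and evidence", "credible"),
    ("shocking secret they refuse to reveal", "not_credible"),
    ("miracle cure doctors refuse to admit", "not_credible"),
]
wc, lc = train(corpus)
print(predict("report cites documented evidence", wc, lc))  # credible
```

The key design point the study motivates is where the labels come from: here each training example carries its own label, so the classifier never inherits a blanket judgment from an article's source.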

Presented at the 15th Association for Computing Machinery Web Science Conference 2023, the study represents a collaboration between professionals in journalism, information science, and computer science. The authors stress that validating online news and combating the spread of misinformation is vital to maintain trustworthy online environments and safeguard democracy.