Automated indexing

From UBC Wiki
MTIA vs MTIX indexing F-scores in MEDLINE. NLM Tech Bull. 2024

Introduction

Automated indexing has been defined as “indexing the subject content of papers by means of a computer, either with some human intervention and oversight, or none at all” (Giustini et al., 2025). In 2025, automated indexing can be performed using various computational methods, algorithms (hence, algorithmic indexing), natural language processing and artificial intelligence (AI). The term can also cover “semi- and/or partly automated” processes, depending on the level of human curation involved. According to Ruiz and Aronson (2009), automatic indexing is a form of text categorization in which machines assign terms from a controlled vocabulary to documents in order to summarize their contents.

Automated (or, semi-automated) compared to human indexing?

A commonly stated goal of state-of-the-art automated indexing is to mimic human indexing; its main challenge is to extract a set of terms as exhaustive and precise as the one a human indexer would choose to represent the subject content of every document in a database.

By 2022, NLM had implemented fully automated indexing in MEDLINE using the Medical Text Indexer (MTI-Auto), with human review for certain subjects and for other records selected at random. Since 2002, several large-scale MeSH indexing approaches have been proposed to improve automated indexing, such as MeSHLabeler, DeepMeSH and MeSHProbeNet. However, the performance of these models is hampered by their reliance on titles and abstracts; better results can be achieved using a paper's full text. NLM continues to evaluate innovative technologies to improve indexing performance, but new problems seem to arise as new medical concepts are introduced in biomedical papers.

What is automated indexing in MEDLINE?

Automated indexing in MEDLINE is sometimes referred to as algorithmic indexing (see Amar-Zifkin et al., 2025). In the MTI, algorithms are key to the indexing workflow at NLM. In 2022, first-line indexing for all MEDLINE records was performed by the MTIA, with humans limiting their curation to sets involving genes and proteins. In 2025, the NLM uses the MTIX, which is based on neural network technology.

According to the Encyclopedia of Knowledge Organization, “[algorithmic] indexing has referred historically to search-engines where automation plays an important role because of the scale of information”. Similarly, semantic indexing assumes that terms used in related documents tend to have similar subject content and meaning. Based on this assumption, associations between terms that occur in similar documents are calculated, and concepts for those documents are then extracted from a corpus. Indexing with semantic AI technologies scales well, which is a major driver of its application in large web search engines. Further, algorithms have been shown to improve indexing consistency, although inconsistencies are not solved by applying automatic indexing methods alone. In other words, automated indexing is not an “objective” process: it reflects the worldview of the texts it indexes, and may perpetuate its own specific perspectives and biases. Because these algorithms rely on a large corpus of raw text to return outputs, they suffer from their own indexing imprecision and unreliability.
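The idea that terms are associated when they occur in similar documents can be illustrated with a toy calculation. The sketch below is only an assumption-laden illustration: the five "documents" are invented, and comparing co-occurrence counts with cosine similarity is one simple way to measure association, not the method of any particular search engine or of NLM.

```python
from collections import Counter
from math import sqrt

# Toy corpus: each "document" is a set of terms (all values are invented).
DOCS = [
    {"heart", "attack", "aspirin"},
    {"myocardial", "infarction", "aspirin"},
    {"heart", "attack", "troponin"},
    {"myocardial", "infarction", "troponin"},
    {"hip", "fracture", "surgery"},
]


def context_vector(term, docs):
    """Count the other terms that share a document with `term`."""
    counts = Counter()
    for doc in docs:
        if term in doc:
            counts.update(doc - {term})
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def association(term_a, term_b, docs=DOCS):
    """Association between two terms, based on the company they keep."""
    return cosine(context_vector(term_a, docs), context_vector(term_b, docs))
```

In this toy corpus, "attack" and "infarction" never appear in the same document, yet they receive a positive association score because both appear alongside "aspirin" and "troponin"; "attack" and "fracture" share no context and score zero.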

Medical text indexer (MTI) and MEDLINE

The Medical Text Indexer (MTI) is the automated indexing tool developed by the National Library of Medicine (NLM) for MEDLINE. The MTI is one of the more impressive achievements for any national library, accomplished over decades of research, implementation and testing.

In 2024, the MTIX (Medical Text Indexer-NeXt Generation) replaced the MTI-Auto; it uses machine learning and neural networks to assign MeSH terms to articles. Its main benefits are improved indexing speed and scalability. Trained on millions of MEDLINE citations from 2007–2022, the MTIX analyzes titles, abstracts, and journal metadata to recommend MeSH terms with high recall (e.g., >94% for disease detection) and precision (e.g., 87% for disease categories). MTIX supports semi-automated and fully automated indexing, reducing the workload for human indexers while maintaining standards. Still, an F-score of 0.90 implies that, very roughly, one term assignment in ten is either missed or spurious. See Amar-Zifkin et al., 2025; Askin et al., 2025.
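For readers unfamiliar with the F-score, the following minimal sketch shows how it is usually computed, as the harmonic mean of precision and recall (the standard F1 definition). The function names are our own, and pairing the 87% precision and 94% recall figures quoted above is purely illustrative, since those figures are examples for disease terms rather than overall results.

```python
def f_score(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f_score_from_counts(true_pos, false_pos, false_neg):
    """The same F1 score, computed from counts of term assignments."""
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0


# Illustrative only: combining the example figures quoted in the text.
example = f_score(precision=0.87, recall=0.94)  # about 0.90
```

Because the score is computed over term assignments, it summarizes missed terms (false negatives) and spurious terms (false positives) in a single number.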

Neural networks used in the MTIX enable rapid, precise indexing, which is critical for managing the growing volume of biomedical literature. In 2024, almost 1.4 million papers were added to MEDLINE. While human curation remains in place for quality control, NLM's use of AI also supports applications such as the publicly available MeSH on Demand. For medical texts, NLM says that "... automated indexing is currently based on the title and abstract of articles; future work will investigate automated indexing based on processing of the article’s full text (where NLM has access to that text for computational purposes)". Full-text processing would be expected to improve term coverage over title-and-abstract-based methods. Filtering techniques, such as ranking scores and excluding lengthy documents, further boost accuracy.

Transformers have significantly improved Medical Subject Headings (MeSH) indexing in MEDLINE, the core task of the Medical Text Indexer (MTI). Pretraining on in-domain biomedical text is critical: it gives the models the vocabulary, subword tokenization, and background knowledge needed to understand complex medical terminology. General-purpose models (like base BERT or even SciBERT) lack this knowledge and underperform on MeSH prediction. Since 2020, the National Library of Medicine (NLM) has incorporated BERT-based models (BioBERT, PubMedBERT) and later large language models into the MTI pipeline. These transformer models power the “First-Lines” and “Full-Text” predictors, dramatically boosting recall for rare MeSH terms and reducing the indexing workload for human revisers. State-of-the-art MTI performance now combines traditional approaches with transformer-based ranking (e.g., BERT rerankers and cross-encoders), achieving F-scores above 0.70.

Despite these advancements to the MTIX, human indexers are still needed to correct and curate MEDLINE records.

Automated indexing from MTI (2002) to MTI-Auto (2022)

Rules-based systems such as the Medical Text Indexer – Automated (MTIA) use human-written instructions (i.e., "based on NLM policies, use most specific MeSH term in tree") and ask the underlying algorithm to follow them. In rules-based systems, match and synonym rules are built automatically from the vocabulary's see references, that is, "See XYZ, Use XYZ." For example, if a newly published paper contained the phrase “heart attack” in the title, the MTI's algorithm would assign the MeSH heading Myocardial Infarction. While precise, rules-based approaches are rigid, and newer terms, synonyms, or complex phrasing could cause the system to miss relevant MeSH terms. By 2024, machine learning systems using neural networks had emerged; the NLM implemented the MTIX, built from the data in millions of previously indexed records from 2007 to 2022. Instead of relying on fixed rules, the MTIX looks at linguistic patterns and adapts to new terminology, maintaining indexing precision while improving recall.
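A "See XYZ, Use XYZ" rule can be pictured as a lookup table from phrases to headings. The sketch below is a simplified assumption about how such a rule might work, not NLM's code: the table ENTRY_TERMS and the function assign_headings are hypothetical, and only the “heart attack” mapping comes from the text above.

```python
import re

# Hypothetical entry-term table in the spirit of "See XYZ, Use XYZ" rules.
# Only the "heart attack" mapping is taken from the text; the others are
# added for illustration.
ENTRY_TERMS = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "high blood pressure": "Hypertension",
}


def assign_headings(title):
    """Return the headings whose entry terms appear verbatim in the title."""
    normalized = re.sub(r"\s+", " ", title.lower())
    headings = []
    for phrase, heading in ENTRY_TERMS.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, normalized) and heading not in headings:
            headings.append(heading)
    return headings
```

A title such as "Aspirin after Heart Attack in older adults" yields Myocardial Infarction, while "Outcomes after acute coronary events" yields nothing because no listed phrase matches. That second case is the rigidity described above: the rule only finds what its table anticipates.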

The MTIX of 2024 replaced the MTIA (Auto) (2019, 2022) because the MTIA was a legacy “rules-based” system. Rules-based methods refer to previous algorithms (MTI, MTI-FL, MTIA) and their reliance on hand-crafted rules and heuristics, rather than learning from data found in MEDLINE citations. The rules-based MTI performed a range of pre-determined assignments according to MEDLINE indexing policies and directions embedded in see references and scope notes in the MeSH vocabulary. For example, the MTI matched exact keywords found in titles and abstracts of articles to possible MeSH terms; identified patterns (e.g., if “fracture of the hip” appeared in the title or abstract, MTI would assign “Hip Fractures” as a MeSH term); and used rules such as word frequency thresholds, among other semantic techniques, to decide relevance.
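Pattern rules and frequency thresholds can be combined in a few lines. This is a hedged sketch only: the regular expression, the function hip_fracture_rule, and the threshold of two mentions are all assumptions chosen for illustration, and the actual MTI rules are not documented here.

```python
import re

# Hypothetical pattern covering a few surface variants of the same concept.
HIP_PATTERN = re.compile(
    r"\b(?:hip fractures?|fractures? of the hip)\b", re.IGNORECASE
)


def hip_fracture_rule(title, abstract, min_mentions=2):
    """Assign "Hip Fractures" when the pattern is in the title, or occurs
    at least `min_mentions` times in the abstract (an assumed threshold)."""
    if HIP_PATTERN.search(title):
        return ["Hip Fractures"]
    if len(HIP_PATTERN.findall(abstract)) >= min_mentions:
        return ["Hip Fractures"]
    return []
```

Under these assumptions, a single passing mention in an abstract is treated as incidental and no heading is assigned, whereas a title match or repeated mentions trigger the heading. Tuning such thresholds by hand is part of the maintenance burden of rules-based indexing.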

Rules-based systems (2002-2022) worked well for two decades but became prone to errors, requiring constant updating and human intervention; the MTIA missed synonyms, misunderstood new terminology, and made more work for indexers. As the biomedical literature grew and became more complex, the MTIX’s machine learning approach was evaluated as being better at handling a myriad of linguistic, semantic and other issues. AI-based indexing for MEDLINE scales up to the 1.5 million papers published annually. Still, human indexers amend records that have been assigned MeSH terms incorrectly. With an F-score of 0.90, roughly one term assignment in ten is missed or spurious, so a sizeable share of records will need human curation at some point.

Key errors found in automated indexing records

The following (incomplete) list was created in an early analysis for a study entitled Automated indexing of the biomedical literature in MEDLINE: a scoping review, and is based in part on comments from NLM's PubMed Office Hours in 2022–2024. In general, algorithmic indexing can perpetuate a range of biases along various dimensions such as gender, sexual orientation and race (however, more research is needed).

  1. Missing MeSH terms and check tags (False Negatives): automated indexing may not "see" concepts (in the full text, for example) and may therefore fail to assign relevant MeSH terms and check tags that would be obvious to a human indexer.
    • Amar-Zifkin et al. assessed a sample of MEDLINE records (indexed using MTIA) from February–March 2023; 47% of records had inadequacies in indexing, such as missing significant concepts, use of overly general headings, or misassignments, confirming substantial false negatives and reduced recall. The “Musings on MeSH” blog likewise reported their conclusion that 47% of records had minor or major MeSH issues that would affect retrieval. Still, relying solely on human indexing is no longer practical, and the continuous refinement of the algorithm underscores NLM's commitment to accuracy.
  2. Extra (Spurious) MeSH Terms (False Positives)
    • Askin et al., in their JMLA article “Filtering failure: the impact of automated indexing in Medline on retrieval of human studies for knowledge synthesis,” reported that indexing of human studies often includes irrelevant terms or omits obviously relevant ones. Concerns about check tag errors, such as gender biases favoring “Male” over “Female”, underscore the problem of false positives in machine learning models. Chen et al. (2023) noted frequent misuse or omission of check tags (e.g., gender or age).
  3. Overly General or Inaccurate Publication Types: errors in publication types (PTs) were often hierarchically related, e.g., tagging a record as Historical Article instead of the more accurate Biography, or as Clinical Trial when it is a Clinical Study. PT errors reduce the precision of searches and of the search filters used in knowledge synthesis.
    • NLM's MeSH 2025 Update showed that NLM makes adjustments to Publication Types—such as introducing “Network Meta-Analysis” or “Scoping Review.” Automated indexing systems (like MTIX) may lag or misclassify when publication types change or are too general.
    • Menke et al, 2025 report that "full-text features, enhanced document representations, and fine-tuning optimizations improve publication type and study design indexing."
  4. Limited Context (Missing Populations or Methods Details): MTIX relies on titles and abstracts (plus metadata such as journal and publication year) rather than full text. It can miss details, such as populations or methodology, that are commonly found only in the full text.
    • Amar-Zifkin et al. note that MTIX (as of early 2025) was trained on citations up to 2022 and primarily uses titles, abstracts, and metadata, not full-text content. The “Musings on MeSH” blog states that automated indexing is based only on title and abstract, meaning details found deeper in the full text, such as population descriptors or methodology, can be missed. Chen et al. said the MTI tended to rank the “Male” check tag more highly than “Female,” and frequently omitted the “Aged” check tag, reflecting how terms can be missed.
  5. New or Drifted MeSH Terms: MTIX’s training data covers citations up to 2022, so new MeSH terms, or terms whose meaning has evolved (“drifted”), may not be recognized or applied. NLM addresses this by adding examples of new or drifted terms for MTIX retraining, but gaps still exist.
    • NLM reported that MTIX “needs new training data” in order to recognize new MeSH terms or drifted terminology—indicating gaps if new concepts emerge post-training. NLM's MeSH 2025 Update shows revisions (e.g., additions of AI-related headings, publication type changes) are being made to the vocabulary. MTIX’s older training means it may miss or misapply these new terms.
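The first two error types in the list above (missing and spurious terms) can be made concrete by comparing the terms an automated system assigned with those a human indexer would have chosen. The sketch below uses an invented record and a hypothetical function name, compare_indexing; it does not reproduce the method of any study cited here.

```python
def compare_indexing(automated, reference):
    """Compare automated MeSH assignments with a human reference set."""
    automated, reference = set(automated), set(reference)
    return {
        "missing": sorted(reference - automated),   # false negatives
        "spurious": sorted(automated - reference),  # false positives
        "agreed": sorted(automated & reference),    # true positives
    }


# Invented example record, loosely modelled on the check tag problems above.
human_terms = ["Myocardial Infarction", "Humans", "Female", "Aged"]
machine_terms = ["Myocardial Infarction", "Humans", "Male"]
report = compare_indexing(machine_terms, human_terms)
# report["missing"]  is ["Aged", "Female"]
# report["spurious"] is ["Male"]
```

Counting the entries in each group across many records gives the true positive, false positive and false negative totals from which precision, recall and the F-score are derived.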

For more detail, see the Medical Library Association (MLA) 2025 presentation, Automated indexing in MEDLINE, and National Library of Medicine. NLM Medical Text Indexer. NLM Technical Bulletin. March-April 2024.

Questions re: impact on comprehensive searching

Health sciences librarians (HSLs) may want to consider the impact of automated indexing on their search practices and MEDLINE instruction. Understanding MTIX's AI features may lead HSLs to test more search filters and search strategies that combine MeSH and free-text terms to ensure comprehensive retrieval, particularly for recent or non-indexed literature. HSLs may also want to communicate the basics of automated indexing to users, share best practices at meetings, and explain its implications for search precision and recall in MEDLINE.
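As a small illustration of combining MeSH and free-text terms, the sketch below assembles a PubMed-style search block. The helper build_query is hypothetical; the [MeSH Terms] and [Title/Abstract] field tags are standard PubMed syntax, and the example terms are chosen only for illustration.

```python
def build_query(mesh_headings, free_text_terms):
    """Combine MeSH headings and free-text phrases into one OR block."""
    parts = [f'"{heading}"[MeSH Terms]' for heading in mesh_headings]
    parts += [f'"{term}"[Title/Abstract]' for term in free_text_terms]
    if not parts:
        raise ValueError("at least one heading or free-text term is required")
    return "(" + " OR ".join(parts) + ")"


query = build_query(
    ["Myocardial Infarction"],
    ["heart attack", "myocardial infarction"],
)
```

The free-text half of the block can still retrieve records that are not yet indexed or where the automated indexer missed the heading, which is the reason for pairing the two approaches.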

  • How will automated indexing change our search practices in support of knowledge synthesis (KS) and our users — if at all?
  • How will health sciences librarians (HSLs) be influenced in their searching by their understanding of the MTIX and its AI features?
  • What sort of pivots are HSLs making in their MEDLINE teaching, and in creating search strategies in support of expert searching?
  • How are librarians responding to questions about MeSH assignments in MEDLINE i.e., "How are MeSH terms assigned in MEDLINE"?

Feel free to share your comments and concerns: Dean Giustini, UBC Biomed librarian, dean.giustini@ubc.ca

References

Disclaimer

  • Note: Please use your critical reading skills while reading entries. No warranties, implied or actual, are granted for any health or medical search or AI information obtained while using these pages. Check with your librarian for more contextual, accurate information.