
Automated indexing

From UBC Wiki
MTIA vs MTIX indexing F-scores in MEDLINE. NLM Tech Bull. 2024

Introduction

Automated indexing has been defined as “indexing the subject content of papers by means of a computer, either with some human intervention and oversight, or none at all” (Giustini et al., 2025). In 2025, automated indexing can be performed using various computational methods, including algorithms (hence, algorithmic indexing), natural language processing and artificial intelligence (AI). The term covers both fully automated and “semi- and/or partly automated” processes, depending on the level of human curation involved. According to Ruiz and Aronson (2009), automatic indexing is a form of text categorization in which machines assign terms from a controlled vocabulary to documents in order to summarize their contents.

Automated (or, semi-automated) compared to human indexing?

A commonly stated goal of state-of-the-art automated indexing is to mimic human indexing. The principal challenge, however, lies in extracting an exhaustive and precise set of controlled terms that accurately represents the subject content of each document in a database, as a human indexer would.

Since 2002, several large-scale MeSH indexing approaches such as MeSHLabeler, DeepMeSH and MeSHProbeNet have been proposed to enhance automated indexing. However, the performance of these models is limited by their reliance on article titles and abstracts; improved results could be achieved by leveraging full-text content. While the National Library of Medicine (NLM) continues to evaluate innovative technologies to improve indexing performance, new challenges persist as novel medical concepts are introduced in the biomedical literature.

In fiscal year 2021, the average time required to index articles fully reviewed by human indexers was 145 days, excluding the time needed for bibliographic data review. By 2022, the NLM had implemented a fully automated indexing program for MEDLINE using its Medical Text Indexer (MTI-Auto). Under this system, human review is retained for selected subject areas, while other records are reviewed on a random basis. Automation has shortened indexing time considerably, to about one business day.

What is automated indexing in MEDLINE?

Automated indexing in MEDLINE is sometimes referred to as algorithmic indexing (see Amar-Zifkin et al., 2025). Algorithms are central to the MTI-based indexing workflow at NLM. In 2022, first-line indexing for all MEDLINE records was performed by the MTIA, with human curation limited to sets involving genes and proteins. In 2025, the NLM uses the MTIX, which is based on neural network technology.

According to the Encyclopedia of Knowledge Organization, “[algorithmic] indexing has referred historically to search-engines where automation plays an important role because of the scale of information”. Semantic indexing similarly assumes that terms used in related documents tend to have similar subject content and meaning. Based on these assumptions, associations between terms that occur in similar documents are calculated, and concepts for those documents are then extracted from the corpus. Indexing with semantic AI technologies is scalable, which is a major driver of its application in large web search engines. Further, algorithms have been shown to improve indexing consistency, although inconsistencies are not solved by applying automatic indexing methods alone. In other words, automated indexing is not an “objective” process, as it reflects the worldview of the texts it indexes and may perpetuate its own specific perspectives and biases. Because these algorithms rely on a large corpus of raw text to return outputs, they suffer from their own indexing imprecision and unreliability.
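
The idea that terms appearing in similar documents tend to share meaning can be illustrated with a small co-occurrence sketch. This is not the method used by NLM or by any particular search engine; the toy corpus, the term names and the helper functions `term_vectors` and `cosine` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy corpus: each document is reduced to a set of terms.
docs = [
    {"myocardial infarction", "troponin", "chest pain"},
    {"heart attack", "troponin", "chest pain"},
    {"hip fracture", "osteoporosis", "falls"},
]


def term_vectors(documents):
    """Map each term to a Counter of the terms it co-occurs with."""
    vectors = {}
    for doc in documents:
        for a, b in combinations(sorted(doc), 2):
            vectors.setdefault(a, Counter())[b] += 1
            vectors.setdefault(b, Counter())[a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse co-occurrence vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = term_vectors(docs)

# The two terms never appear together, but they share the same neighbours.
similar = cosine(vectors["myocardial infarction"], vectors["heart attack"])

# These two terms share no neighbours at all.
unrelated = cosine(vectors["myocardial infarction"], vectors["hip fracture"])

print(round(similar, 3), unrelated)
```

In this toy corpus, "myocardial infarction" and "heart attack" are scored as closely associated because they occur alongside the same terms, even though no rule links them. The same mechanism explains why such methods inherit whatever patterns and biases are present in the corpus.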

Medical text indexer (MTI) and MEDLINE

The Medical Text Indexer (MTI) is the automated indexing tool developed by the National Library of Medicine (NLM) for MEDLINE and represents one of the most significant achievements in large-scale automated indexing by a national library. Its development reflects decades of sustained research, implementation, and evaluation.

In 2024, MTIX (Medical Text Indexer–NeXt Generation) replaced MTI-Auto, incorporating machine learning and neural network–based methods to assign MeSH terms to biomedical articles. The primary advantages of MTIX include substantially improved indexing speed and scalability. Trained on millions of MEDLINE citations published between 2007 and 2022, MTIX analyzes article titles, abstracts, and journal metadata to recommend MeSH terms with high recall (e.g., greater than 94% for disease detection) and strong precision (e.g., approximately 87% for disease categories). MTIX supports both semi-automated and fully automated indexing workflows, significantly reducing the workload for human indexers while maintaining indexing standards. Nevertheless, despite an overall F-score of 0.74, the system exhibits an estimated error rate of as much as one-third to one-half (Amar-Zifkin et al., 2025; Askin et al., 2025).
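
For readers less familiar with these measures, the sketch below shows how precision, recall and the F-score (here F1, the harmonic mean of precision and recall) can be computed for a single record by comparing machine-assigned terms with a human reference set. The function name `score_record` and the example terms are hypothetical, and NLM's published figures may be aggregated differently across records and terms.

```python
def score_record(predicted, reference):
    """Compare machine-assigned terms with a human reference set."""
    predicted, reference = set(predicted), set(reference)
    true_pos = predicted & reference   # correctly assigned terms
    spurious = predicted - reference   # false positives
    missing = reference - predicted    # false negatives

    precision = len(true_pos) / len(predicted) if predicted else 0.0
    recall = len(true_pos) / len(reference) if reference else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "spurious": spurious,
        "missing": missing,
    }


# Hypothetical record: terms a human indexer assigned vs. terms a system assigned.
reference = {"Myocardial Infarction", "Humans", "Female", "Aged"}
predicted = {"Myocardial Infarction", "Humans", "Male"}

result = score_record(predicted, reference)
print(result)
```

In this example two of three assigned terms are correct (precision 2/3) and two of four reference terms are found (recall 1/2), giving an F1 of 4/7, or about 0.57. The `spurious` and `missing` sets correspond to false positives and false negatives respectively.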

Neural networks used in the MTIX enable rapid, precise indexing, which is critical for managing the growing volume of biomedical literature. In 2024, almost 1.4 million papers were added to MEDLINE. While human curation remains in place for quality control, NLM's use of AI supports applications such as the publicly available MeSH on Demand. For medical texts, NLM says that "... automated indexing is currently based on the title and abstract of articles; future work will investigate automated indexing based on processing of the article’s full text (where NLM has access to that text for computational purposes)". Full-text processing could improve term coverage over title-and-abstract-based methods. Filtering techniques, such as ranking scores and excluding lengthy documents, further boost accuracy.

Since 2020, the National Library of Medicine (NLM) has incorporated Bidirectional Encoder Representations from Transformers (BERT)-based models, such as BioBERT and PubMedBERT. These transformer models underpin the “First-Lines” and “Full-Text” predictors, substantially improving recall for rare MeSH terms and reducing the workload for human indexers. Pretraining on biomedical corpora is critical for these models, as it provides the specialized vocabulary, subword tokenization, and contextual knowledge required to interpret complex medical terminology. General-purpose models (e.g., BERT or SciBERT) lack these capabilities and therefore underperform in MeSH prediction tasks.

State-of-the-art MTI performance now combines traditional indexing approaches with transformer-based ranking methods, such as BERT rerankers and cross-encoders, achieving F-scores above 0.70. Despite these and other advances incorporated into MTIX, human indexers remain essential for correcting errors and curating MEDLINE records to ensure indexing quality and consistency.

Automated indexing from MTI (2002) to MTI-Auto (2022)

Rules-based systems such as the Medical Text Indexer – Automated (MTIA) use human-written instructions (e.g., "based on NLM policies, use most specific MeSH term in tree") and ask the underlying algorithm to follow them. In rule-based systems, match and synonym rules are built automatically from the list of cross-references, that is, "See XYZ, Use XYZ." For example, if a newly published paper contained the phrase “heart attack” in the title, the MTI's algorithm would assign the MeSH heading Myocardial Infarction. While precise, rules-based approaches are rigid, and newer terms, synonyms, or complex phrasing could cause the system to miss relevant MeSH. By 2024, machine learning systems using neural networks had emerged; the NLM implemented the MTIX, built from the data in millions of previously indexed records from 2007 to 2022. Instead of relying on fixed rules, the MTIX looks at linguistic patterns and adapts to new terminology, maintaining indexing precision while improving recall.

Rules-based systems (2002-2022) worked well for two decades but became prone to errors requiring constant updating and human intervention; the MTIA missed synonyms, misunderstood new terminologies, and made more work for indexers. As the biomedical literature grew, and became more complex, the machine learning approach was evaluated as being better at handling a myriad of linguistic, semantic and other issues. AI-based indexing for MEDLINE now scales up to the 1.5 million papers published annually. Still, human indexers amend records that have been assigned MeSH terms incorrectly.

MTIX of 2024

MTIX, introduced in 2024, replaced MTIA (Auto) (2019, 2022), which was a legacy rules-based system. Rules-based methods—including earlier versions such as MTI, MTI-FL, and MTIA—relied on hand-crafted rules and heuristics rather than learning directly from data in MEDLINE citations. These systems applied predetermined assignments based on MEDLINE indexing policies, as well as directives embedded in see references and scope notes within the MeSH vocabulary.

For example, MTI matched exact keywords in article titles and abstracts to candidate MeSH terms and applied pattern-based rules (e.g., assigning the MeSH term Hip Fractures when phrases such as “fracture of the hip” appeared). Additional rules were used to assess relevance, including word-frequency thresholds and other heuristic semantic techniques.
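
The kind of pattern rule and frequency threshold described above can be sketched in a few lines. This is a minimal illustration only: the rule list, the threshold value and the function `assign_headings` are hypothetical and do not reproduce NLM's actual rules.

```python
import re

# Hypothetical pattern rules: (compiled phrase pattern, MeSH heading to assign).
RULES = [
    (re.compile(r"\bheart attacks?\b", re.IGNORECASE), "Myocardial Infarction"),
    (
        re.compile(r"\bfractures? of the hip\b|\bhip fractures?\b", re.IGNORECASE),
        "Hip Fractures",
    ),
]


def assign_headings(title, abstract, min_abstract_mentions=2):
    """Assign a heading if its pattern matches the title, or matches the
    abstract at least `min_abstract_mentions` times (an assumed threshold)."""
    headings = []
    for pattern, heading in RULES:
        in_title = bool(pattern.search(title))
        abstract_hits = len(pattern.findall(abstract))
        if in_title or abstract_hits >= min_abstract_mentions:
            headings.append(heading)
    return headings


# A title match fires the rule; a single passing mention in the abstract does not.
matched = assign_headings(
    "Outcomes after heart attack in older adults",
    "One patient had a fracture of the hip.",
)

# Unanticipated phrasing matches no rule, so the relevant heading is missed.
missed = assign_headings("Acute coronary occlusion outcomes", "")

print(matched, missed)
```

The second call shows the rigidity of this approach: a paper on the same topic phrased as "acute coronary occlusion" receives no heading until someone writes a new rule for it.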

By contrast, MTIX employs data-driven, machine learning–based methods that have dramatically improved indexing efficiency. The MTIX also leverages neural network–based models to learn complex semantic relationships between biomedical text and Medical Subject Headings (MeSH), enabling more accurate and scalable indexing than earlier rule-based systems. By training on millions of MEDLINE citations, these neural architectures capture contextual meaning and synonymy that cannot be encoded through hand-crafted rules. As a result, MTIX achieves faster indexing turnaround while maintaining high recall and precision across diverse biomedical domains. As of 2026, article citations are typically indexed within one day of receipt in NLM’s indexing system. In practical terms, most articles from MEDLINE-indexed journals now appear in PubMed with assigned MeSH terms within one business day (see https://www.nlm.nih.gov/bsd/indexfaq.html#descriptor).

Key errors found in automated indexing records

The following list was created in an early analysis for “Automated indexing of the biomedical literature in MEDLINE: a scoping review”, and is based in part on comments from NLM's PubMed Office Hours in 2022-2024. In general, algorithmic indexing can perpetuate a range of biases along various dimensions such as gender, sexual orientation and race, although more research is needed.

  1. Missing MeSH (False Negatives) terms and tags — automated indexing may not "see" concepts (in the full-text, for example) and therefore may not assign relevant MeSH terms and check tags that would be obvious to a human indexer.
    • Amar-Zifkin et al. assessed a sample of MEDLINE records (indexed using MTIA) from February–March 2023; 47% of records had inadequacies in indexing, such as missing significant concepts, use of overly general headings, or misassignments, confirming substantial false negatives and reduced recall. The “Musings on MeSH” blog reported the same finding, noting that these minor or major MeSH issues would affect retrieval. Still, relying solely on human indexing is no longer practical, and the continuous refinement of the algorithm underscores NLM's commitment to accuracy.
  2. Extra (Spurious) MeSH Terms (False Positives)
    • Askin et al., in their JMLA article “Filtering failure: the impact of automated indexing in Medline on retrieval of human studies for knowledge synthesis,” said that indexing of human studies often includes irrelevant terms or omits obviously relevant ones. Concerns about check tag errors, such as gender biases favoring “Male” over “Female”, underscore the problem of false positives in machine learning models. Chen et al. (2023) noted a frequent misuse or omission of check tags (e.g., gender or age).
  3. Overly General or Inaccurate Publication Types — errors in publication types were often hierarchically related — e.g., tagging as Historical Article instead of more accurate Biography, or Clinical Trial when it’s a Clinical Study. PT errors affect search precision and filtering in search filters and knowledge synthesis.
    • NLM's MeSH 2025 Update showed that NLM makes adjustments to Publication Types—such as introducing “Network Meta-Analysis” or “Scoping Review.” Automated indexing systems (like MTIX) may lag or misclassify when publication types change or are too general.
    • Menke et al, 2025 report that "full-text features, enhanced document representations, and fine-tuning optimizations improve publication type and study design indexing."
  4. Limited Context: Missing Populations or Methods Details — MTIX relies on titles and abstracts (plus metadata in journal and pubyear) rather than full text. It can miss details such as populations or methodology—commonly found in full text.
    • Amar-Zifkin et al. note that MTIX (as of early 2025) was trained on citations up to 2022 and primarily uses titles, abstracts, and metadata, not full-text content. The “Musings on MeSH” blog states that automated indexing is based only on title and abstract, meaning that details found deeper in the full text, such as population descriptors or methodology, can be missed. Chen et al. said the MTI tended to rank the “Male” check tag more highly than “Female,” and frequently omitted the “Aged” check tag, reflecting how terms can be missed.
  5. New or Drifted MeSH Terms — MTIX’s training data covers citations up to 2022, so new MeSH terms or those evolved in meaning (“drifted”) may not be recognized or applied. NLM addresses this by adding examples of new or drifted terms for MTIX retraining, but gaps still exist.
    • NLM reported that MTIX “needs new training data” in order to recognize new MeSH terms or drifted terminology—indicating gaps if new concepts emerge post-training. NLM's MeSH 2025 Update shows revisions (e.g., additions of AI-related headings, publication type changes) are being made to the vocabulary. MTIX’s older training means it may miss or misapply these new terms.

For more detail, see the Medical Library Association (MLA) 2025 presentation, Automated indexing in MEDLINE, and National Library of Medicine. NLM Medical Text Indexer. NLM Technical Bulletin. March-April 2024.

Questions re: impact on comprehensive searching

Health sciences librarians (HSLs) may wish to consider how automated indexing is reshaping search practices and MEDLINE instruction. Understanding MTIX and its AI-driven features suggests a growing need to test and refine search strategies that combine MeSH and free-text terms to ensure comprehensive retrieval—particularly for very recent, partially indexed, or non-indexed literature. HSLs may also play an important role in communicating the fundamentals of automated indexing to users, sharing emerging best practices with colleagues, and explaining the implications of these changes for search precision and recall in MEDLINE.

This raises several questions for practice and professional reflection:

  • How will automated indexing influence our search strategies in support of knowledge synthesis (KS) and our users—if at all?
  • In what ways might HSLs’ searching evolve as they develop a deeper understanding of MTIX and its AI features?
  • What pivots are HSLs making in MEDLINE instruction and in the design of expert search strategies?
  • How are librarians responding to user questions about MeSH assignment in MEDLINE, such as “How are MeSH terms assigned?”
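
As one illustration of the combined MeSH and free-text approach mentioned in this section, the sketch below assembles a single PubMed concept block that ORs a MeSH heading with free-text synonyms searched in titles and abstracts. The helper `build_concept_block` and the chosen terms are hypothetical; `[MeSH Terms]` and `[Title/Abstract]` are standard PubMed search field tags. A real search strategy would normally be developed and tested by a librarian.

```python
def build_concept_block(mesh_heading, free_text_terms):
    """Build one OR-ed PubMed concept block from a MeSH heading and synonyms."""
    clauses = [f'"{mesh_heading}"[MeSH Terms]']
    clauses += [f'"{term}"[Title/Abstract]' for term in free_text_terms]
    return "(" + " OR ".join(clauses) + ")"


# Free-text terms can still retrieve records whose MeSH indexing is missing,
# incomplete, or not yet assigned.
query = build_concept_block(
    "Myocardial Infarction",
    ["myocardial infarction", "heart attack"],
)
print(query)
```

The free-text clauses act as a safety net: a record with no MeSH terms, or with an incorrectly assigned heading, can still be retrieved if the concept is named in its title or abstract.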

Feel free to share your comments, experiences, and concerns. Dean Giustini, UBC Biomedical Librarian, dean.giustini@ubc.ca

References

Disclaimer

  • Note: Please use your critical reading skills while reading entries. No warranties, implied or actual, are granted for any health or medical search or AI information obtained while using these pages. Check with your librarian for more contextual, accurate information.