Using AI tools to perform data extraction in knowledge synthesis (KS)
Introduction

Using AI tools to perform data extraction in knowledge synthesis (KS) refers to the application of computational methods, including machine learning, natural language processing, and large language models (LLMs), to automate or semi-automate the identification, retrieval, and structuring of information from the scientific literature in systematic reviews and meta-analyses. Systematic reviews are resource-intensive undertakings; a comprehensive systematic review can take thousands of person-hours to complete. AI-assisted data extraction has emerged as a possible way to reduce this researcher burden. Caveat: consult a biostatistician or methodologist before using any AI tool or platform for data extraction.

Background to the SR "data extraction" phase

Systematic reviews involve comprehensive retrieval of published and unpublished literature on a defined research question, followed by rigorous screening, quality appraisal, and synthesis of results. The data extraction phase, in which reviewers manually read each eligible study and record key variables such as population characteristics, interventions, outcome measures, and effect sizes (see the record sketch below), is among the most time-consuming steps in the process. Data extraction has typically required two independent reviewers to code each included study manually, with disagreements resolved by a third party. Given that a single systematic review may include dozens to hundreds of primary studies, this approach demands considerable human labour. The need for scalable, reproducible, and timely KS has driven substantial interest in automation and AI-powered systems.

End-to-end tools and platforms

Several tools have emerged that support AI-assisted data extraction as well as other steps in KS. Platforms such as Rayyan, Covidence, EPPI-Reviewer, and Abstrackr incorporate machine learning and natural language processing but are designed to augment, rather than replace, human judgment.
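Whether extraction is done manually or with tool support, the target output is one structured record per included study covering the variables noted above. The sketch below shows a minimal way to represent such a record in Python; the field names and types are illustrative assumptions, not the schema of any platform mentioned here.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractionRecord:
    """One row of a systematic-review extraction table (illustrative fields only)."""
    study_id: str                         # e.g. first author + year
    population: str                       # who was studied
    intervention: str                     # what was done
    comparator: Optional[str] = None      # control condition, if any
    outcome_measure: Optional[str] = None
    sample_size: Optional[int] = None
    effect_size: Optional[float] = None   # e.g. standardized mean difference
    extracted_by: str = "human"           # "human", "ai", or "ai+human"

record = ExtractionRecord(
    study_id="Smith2021",
    population="adults with type 2 diabetes",
    intervention="structured exercise programme",
    comparator="usual care",
    outcome_measure="HbA1c",
    sample_size=120,
    effect_size=-0.42,
)

Fixing the record structure up front is what makes double extraction and later AI-versus-human comparison tractable: two extractions of the same study can be compared field by field.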
AI-powered search tools

AI-powered search tools are reshaping how researchers discover and extract evidence by combining retrieval systems with LLMs. Unlike traditional databases, tools such as Elicit.com, Undermind.ai, and Perplexity use retrieval augmented generation (RAG) to locate relevant documents and generate answers grounded in real sources, reducing hallucinations and improving transparency, though they remain works in progress. A key innovation is the integration of data extraction into the search process. For example, Elicit.com functions as an AI research assistant that retrieves papers and extracts structured data such as study design, sample size, and outcomes into comparison tables. This allows users to move quickly from discovery to synthesis, a task that traditionally required extensive manual effort. Similarly, Undermind.ai performs “deep search” by iteratively refining queries and identifying literature, producing comprehensive reports that approximate systematic searching workflows. Other tools emphasize synthesis over extraction: Consensus focuses on peer-reviewed literature, aggregating findings into a high-level “consensus” answer that helps users understand the overall direction of evidence, while Perplexity operates as a real-time, web-based answer engine that retrieves and summarizes information with inline citations. Together, these tools signal a shift from searching as retrieval towards searching as analysis, where finding, extracting, and synthesizing evidence occur in a single integrated workflow. All of these systems require further testing. Note: before using any AI tools, consult a librarian who can explain the differences between traditional search methods and AI-powered methods.
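A minimal sketch of the RAG pattern these tools share: retrieve the passages most similar to the question, then have a language model answer only from those passages. Here a toy word-overlap retriever stands in for a neural embedding index, and generate() is a placeholder for a real LLM call; neither reflects the internals of any tool named above, and the corpus is invented.

from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by similarity to the query; a stand-in for an embedding index."""
    q = Counter(query.lower().split())
    ranked = sorted(corpus, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
    return ranked[:k]

def generate(question: str, passages: list[str]) -> str:
    """Placeholder for an LLM call; a real system would prompt the model
    to answer only from the supplied passages and to cite them."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Answer to '{question}' grounded in:\n{context}"

corpus = [
    "Smith 2021: exercise reduced HbA1c in adults with type 2 diabetes (n=120).",
    "Lee 2020: screening tools cut title/abstract review time by 40 percent.",
    "Khan 2019: diet alone showed no significant effect on HbA1c levels.",
]
print(generate("Does exercise lower HbA1c?", retrieve("exercise HbA1c diabetes", corpus)))

Grounding the generation step in retrieved passages is what allows these tools to show inline citations; it reduces, but does not eliminate, hallucination.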
Accuracy and validation

Evaluations of AI-assisted data extraction tools report accuracy in the range of 70–95%, depending on the data elements, domain, and model used.
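Validation typically means comparing AI-extracted values against a human gold standard, element by element. A minimal sketch, assuming both extractions are stored as field-to-value dictionaries; the data are invented for illustration.

def per_field_accuracy(ai: list[dict], gold: list[dict]) -> dict[str, float]:
    """Share of studies where the AI value matches the human gold standard, per field."""
    fields = gold[0].keys()
    return {
        f: sum(a.get(f) == g[f] for a, g in zip(ai, gold)) / len(gold)
        for f in fields
    }

gold = [
    {"sample_size": 120, "outcome": "HbA1c", "effect_size": -0.42},
    {"sample_size": 85,  "outcome": "HbA1c", "effect_size": -0.10},
]
ai = [
    {"sample_size": 120, "outcome": "HbA1c", "effect_size": -0.42},
    {"sample_size": 58,  "outcome": "HbA1c", "effect_size": -0.10},  # transposed digits
]
print(per_field_accuracy(ai, gold))
# {'sample_size': 0.5, 'outcome': 1.0, 'effect_size': 1.0}

Element-level comparisons of this kind are what produce the wide 70–95% range reported above: accuracy is not one number but varies by field.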
Ethical and methodological considerations

The integration of AI into systematic reviews raises important questions about transparency, reproducibility, and accountability. Bodies such as the PRISMA group and the Cochrane Collaboration are grappling with how the use of AI should be reported in systematic reviews, though formal standards remain under development. Key concerns include the risk of systematic bias introduced by training data, the opacity of model decision-making, and the challenge of auditing AI-generated extractions. Researchers have emphasised that AI should be positioned as a tool to augment, rather than replace, expert human judgement in evidence synthesis.

Future directions

Given the imperfections of fully automated systems, researchers advocate hybrid workflows in which AI performs an initial extraction pass and human reviewers verify, correct, or augment the outputs (a workflow sketch follows this section). This approach can reduce the time spent on data entry while retaining human expertise. Such workflows must be carefully designed to avoid automation bias: the tendency of human reviewers to over-trust machine outputs and miss errors that would have been caught in a fully manual review. Ongoing research is exploring multimodal AI systems capable of extracting data not only from text but also from tables, figures, and supplementary materials. Retrieval augmented generation (RAG) and knowledge graphs are being used to improve the contextual accuracy of extractions and to enable more sophisticated cross-study comparisons. As LLMs improve in capability and reliability, and as validation frameworks mature, AI-assisted data extraction is expected to become a standard component of the systematic review workflow, potentially enabling near-real-time evidence synthesis at a scale previously thought impossible. Caveat: consult a biostatistician or methodologist versed in SR workflows and processes before using any AI tool or platform.
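One way to operationalize such a hybrid workflow is to route every AI-extracted field through a confidence gate: high-confidence values are accepted provisionally, low-confidence ones are queued for human review. A minimal sketch; the threshold and confidence scores are invented for illustration, and a real design would also sample accepted values for audit to counter automation bias.

REVIEW_THRESHOLD = 0.85  # assumed cut-off; would be tuned against a validation set

def triage(extraction: dict[str, tuple[object, float]]) -> tuple[dict, list[str]]:
    """Split AI output into provisionally accepted values and fields needing human review.

    `extraction` maps field name -> (value, model confidence in [0, 1]).
    """
    accepted, needs_review = {}, []
    for fld, (value, conf) in extraction.items():
        if conf >= REVIEW_THRESHOLD:
            accepted[fld] = value
        else:
            needs_review.append(fld)
    return accepted, needs_review

ai_output = {
    "sample_size": (120, 0.97),
    "effect_size": (-0.42, 0.62),   # low confidence: a human verifies this field
    "outcome": ("HbA1c", 0.91),
}
accepted, queue = triage(ai_output)
print("auto-accepted:", accepted)   # {'sample_size': 120, 'outcome': 'HbA1c'}
print("human review:", queue)       # ['effect_size']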
