Machine learning

Compiled by

Dean Giustini, UBC Biomed librarian, dean.giustini@ubc.ca

Updated

12 March 2026 | Part of Knowledge Synthesis (KS) & AI Search Wiki 2026 & A to Z Listing

Introduction

Machine learning (ML) refers to a process of teaching computers how to perform certain tasks without explicitly programming them to do so.

ML applies algorithms to large amounts of data so that a system's predictive performance improves gradually. In this way, systems can imitate human learning and get better with experience. ML supports tasks such as image classification, data analysis, and outcome prediction, activities that traditionally require human intelligence. At its core, machine learning focuses on developing algorithms that enable systems to make data-driven decisions and predictions.
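
To make "improving with experience" concrete, here is a minimal sketch in Python. It assumes a toy model with a single adjustable weight and uses gradient descent, one common training method among many. The data, function names, and settings are hypothetical and chosen only for illustration.

```python
# Toy illustration: a model "learns" y ≈ w * x from examples instead of
# being given the rule. All names and data here are hypothetical.

def mean_squared_error(w, examples):
    """Average squared gap between predictions (w * x) and true values."""
    return sum((w * x - y) ** 2 for x, y in examples) / len(examples)


def train(examples, learning_rate=0.01, epochs=50):
    """Fit a single weight w by gradient descent; return w and the error history."""
    w = 0.0
    history = [mean_squared_error(w, examples)]
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * gradient
        history.append(mean_squared_error(w, examples))
    return w, history


# Hypothetical training data generated by the hidden rule y = 3 * x.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
w, history = train(examples)
print(f"learned weight: {w:.3f}")
print(f"error before training: {history[0]:.3f}, after: {history[-1]:.6f}")
```

The rule y = 3 * x is never written into the training code. The weight moves toward 3 only because each pass over the examples reduces the prediction error.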

Machine learning encompasses several approaches, including supervised learning, unsupervised learning, and reinforcement learning. Language models such as GPT and Bidirectional Encoder Representations from Transformers (BERT) are built using ML techniques, particularly deep learning. These models are neural networks, often based on transformer architectures, trained on vast amounts of text to understand and generate human-like language.
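
The difference between supervised and unsupervised learning can be sketched with a small Python example. This is a simplified illustration under stated assumptions: the supervised part is a nearest-average classifier, the unsupervised part is a one-dimensional version of k-means with two clusters, and all data and function names are hypothetical. Reinforcement learning is not shown.

```python
from statistics import mean


def fit_supervised(labelled_values):
    """Supervised: learn one average value per label from (value, label) pairs."""
    by_label = {}
    for value, label in labelled_values:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Assign the label whose learned average is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values, iterations=10):
    """Unsupervised: split unlabelled values into two clusters (1-D k-means, k=2).

    Assumes the values contain at least two distinct numbers.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        cluster_low = [v for v in values if abs(v - low) <= abs(v - high)]
        cluster_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(cluster_low), mean(cluster_high)
    return cluster_low, cluster_high


# Hypothetical measurements; the labels exist only in the supervised case.
labelled = [(1.0, "small"), (1.5, "small"), (2.0, "small"),
            (8.0, "large"), (8.5, "large"), (9.0, "large")]
centroids = fit_supervised(labelled)
print(predict(centroids, 7.0))

unlabelled = [value for value, _ in labelled]
print(two_means(unlabelled))
```

In the supervised case the algorithm is told which group each example belongs to and learns to label new values. In the unsupervised case it receives only the numbers and must find the grouping on its own.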

Retrieval augmented generation (RAG)

Many AI-powered search tools, such as Elicit.com, use retrieval augmented generation (RAG) techniques to deliver results. RAG combines the strengths of retrieval-based and generative AI models. In RAG, an AI system first retrieves information from a large dataset or knowledge base and then uses the retrieved material to generate a response. In effect, the generation step is augmented with additional context pulled from relevant sources.
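
The retrieve-then-generate sequence can be sketched in a few lines of Python. This is a simplified illustration, not how Elicit or any specific tool works: retrieval here is plain word overlap (real systems typically use search indexes or embeddings), the knowledge base is three made-up sentences, and `generate` is a placeholder standing in for a language model call.

```python
import re

# Hypothetical mini knowledge base; a real system would index many documents.
KNOWLEDGE_BASE = [
    "Supervised learning trains a model on labelled examples.",
    "Retrieval augmented generation combines a retriever with a generative model.",
    "Systematic reviews require transparent and reproducible search strategies.",
]


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by the number of words shared with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Augment the question with retrieved passages before generation."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


def generate(prompt):
    """Placeholder for a language model call (an assumption, not a real API)."""
    return f"[model output would be generated from a {len(prompt)}-character prompt]"


question = "What does retrieval augmented generation combine?"
passages = retrieve(question, KNOWLEDGE_BASE)
prompt = build_prompt(question, passages)
print(prompt)
print(generate(prompt))
```

The sketch shows that the quality of the final answer depends on what the retrieval step returns. If the retrieved passages are irrelevant or incomplete, the generated response inherits that weakness.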

Presentation

Note: This video is for informational purposes only. Any claims made in the video should be checked for accuracy.

Biased outputs in machine learning

There is substantial evidence that machine learning models and AI tools can exhibit bias.

Amazon developed an AI system to screen job applicants’ résumés, but the tool was found to be biased against women (Dastin, 2022). Because the model was trained on historical hiring data from a predominantly male industry, it learned to penalize résumés containing terms such as “women’s.” In another case, an Amazon facial recognition system incorrectly matched 28 members of the U.S. Congress with people who had been arrested for crimes. Similarly, Google’s natural language processing models have been shown to label sentences referring to religious and ethnic minorities as “negative,” reflecting biases embedded in sentiment analysis training data.

Comparable biases have been identified in neural networks designed to recognize skin lesions when only 5–10 percent of the training images showed Black skin. Barros et al. (2023) reported that model accuracy dropped by nearly half when evaluated on images of Black skin, increasing the risk of misdiagnosis and adverse health outcomes for Black patients. Given that Black patients have an estimated five-year skin cancer survival rate of approximately 70 percent, compared with 94 percent for white patients, the consequences of deploying such biased algorithms in clinical settings can be substantial.
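
One basic way to detect this kind of problem is to report accuracy separately for each group instead of as a single overall figure. The Python sketch below shows the idea. The records are made up for illustration and do not come from Barros et al. or any other study, and the function and group names are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: fraction of correct predictions} for labelled records.

    Each record is a (group, true_label, predicted_label) tuple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, prediction in records:
        total[group] += 1
        if truth == prediction:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up evaluation records for illustration only; not data from any study.
records = [
    ("group_a", "malignant", "malignant"),
    ("group_a", "benign", "benign"),
    ("group_a", "malignant", "malignant"),
    ("group_a", "benign", "benign"),
    ("group_b", "malignant", "benign"),
    ("group_b", "benign", "benign"),
    ("group_b", "malignant", "benign"),
    ("group_b", "benign", "benign"),
]

per_group = accuracy_by_group(records)
overall = sum(t == p for _, t, p in records) / len(records)
print(f"overall accuracy: {overall:.2f}")
for group, accuracy in sorted(per_group.items()):
    print(f"{group}: {accuracy:.2f}")
```

In this made-up example the overall figure hides the fact that every malignant case in the second group was missed, which is why a single headline accuracy number can be misleading.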

Librarian view

"...The idea that we should outsource academic authorship to LLMs rests on the assumption that writing is (only) a mechanical, predictable or reductive process which, with the right prompts, can be replicated with ease." — Masters, 2025.

Bottom line: For health sciences librarians, AI tools may offer useful support when working with health professionals; however, many of the underlying processes raise serious concerns for those committed to scientific accuracy, transparency, and methodological rigour in evidence reviews. Information about AI tools is evolving rapidly, so readers should verify details using current sources or consult a librarian.

Note: Librarians are trained to distinguish between searching for sources and searching for answers. Many AI systems prioritize the latter while obscuring the former, so limited transparency remains a significant weakness. Transparency is a critical part of doing knowledge synthesis (KS).

References

Disclaimer

Note: Please use your critical reading skills while reading entries. No warranties, implied or actual, are granted for any health or medical search or AI information obtained while using these pages. Check with your librarian for more contextual, accurate information.