Library:Circle/Guidelines for Recommending UBC Library Repositories for Data: cIRcle and Borealis

About this Guide

UBC Library offers a number of services for managing and preserving research data. This guide was created and is maintained by members of the cIRcle and Research Data Management teams. Its primary purpose is to provide guidance on managing data deposit requests between cIRcle[1], UBC’s open access digital repository, and the UBC Dataverse Collection (@Borealis)[2], a cross-disciplinary data repository on Borealis that supports online management of and access to data.

To learn more about the major data repositories used by Canadian researchers, including Borealis, FRDR, and Dryad, see the Where Should I Deposit My Data?[3] decision tree.

For more information about Research Data Management at UBC, visit researchdata.library.ubc.ca.[4]

Thesis and Dissertation Data

If you are interested in depositing data that is not part of your formal thesis or dissertation submission but is relevant to your research, you may wish to explore the options discussed below. Questions about what to include in your thesis or dissertation proper should be directed to your graduate office: graduate.thesis[at]ubc.ca for UBC-V and gradtheses.ok@ubc.ca for UBC-O.

Overview

UBC Library defines research data as “[…] the data created or generated as part of a research project and exists in many formats including numeric data, text, transcripts, images, video and audio recordings” (Source: Casrai RDM Glossary[5]).

In recognition of the diversity of data and the complexity of data stewardship, UBC Library maintains or provides support for four repositories where research data may be stored, accessed, and managed: UBC Dataverse Collection[6], FRDR[7], Dryad[8], and cIRcle (DSpace). Each repository presents different strengths and limitations in the management of research data. The advantages of each are dictated, in large part, by the data creator’s goals for access, preservation, and re-use. This document provides general guidance to help librarians decide when to direct data deposits to cIRcle and when to direct them to the UBC Dataverse Collection. It also outlines how relationships between content held in both repositories may be identified and described.

Guiding Principles

General

All four repositories can hold multiple files per record. Ideally, a record within a repository holds all appropriate files related to a study. Where possible, multiple files from a study should not be split across multiple records within a repository. However, depending on the type of information deposited, a study may have a record for a publication in cIRcle, for example, and corresponding records for data in Borealis, FRDR, or Dryad. This scenario is described in greater detail in the section: When there is both Research Data and Documents.

Where materials for a study may require multiple records within a single repository or across different repository platforms, a consultation with any repository staff member will ensure best practices are applied.

Final and Work-in-Progress Data

cIRcle is UBC’s digital repository for teaching and research materials. cIRcle’s mandate prioritizes open access and preservation for static or final versions of materials that will not require adjustments or changes, such as articles, theses, and conference papers, as well as recordings of events and lectures. Although cIRcle does accept datasets, its services offer limited support for version control. When a new version of a file becomes available, cIRcle generally recommends adding the updated file to the existing record rather than replacing (i.e. deleting) the earlier file. For this reason, cIRcle is an ideal repository for completed datasets, particularly when paired with an article, presentation, or paper.

Borealis is an open source web application for sharing, preserving, citing, exploring, and analyzing research data. It facilitates making data available to others and allows replication of others' work. It has robust version control and allows granular access to data files. Borealis is an excellent place for files that are toward the end of the research data lifecycle, or for final versions that users might still amend.

Access & Management

While anyone can access materials deposited to cIRcle, submissions or edits to existing records are managed by cIRcle staff upon request. Although cIRcle does support embargoes, data should be added to cIRcle with the ultimate goal of open access and re-use. See the cIRcle FAQ[9] for more information on permissions, withdrawal, and editing policies.

Borealis allows and encourages continuing use and further editing of research data, including version control and granular access to research data, which requires a UBC CWL login[10].

Data Type

Tabular data files can be deposited to either cIRcle or Borealis depending on the data content and other considerations (final/in-progress, access, etc.).

SPSS (.por, .sav), Stata (.dta), R (.RData), LIDAR, and GIS data are better suited for deposit in the UBC Dataverse Collection (Borealis).

File Sizes

cIRcle recommends individual file sizes not exceed 2GB. For help with larger files, please contact cIRcle staff[11].

In Borealis, there is a maximum size of 3GB for each file that users upload via a browser. Larger files can be deposited by Borealis staff via the API. For very large datasets (e.g. TBs), we recommend that data be deposited in FRDR. Contact the RDM team for help: research.data[at]ubc.ca.
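
The size guidance above can be read as a simple routing rule. The Python sketch below is an illustration only, not a tool used by cIRcle or Borealis staff. The function name `suggest_by_size` is hypothetical, GB is assumed to mean 10^9 bytes, and the 1 TB cutoff for a "very large" dataset is an assumption standing in for the guide's "e.g. TBs".

```python
GB = 10**9   # assumption: decimal gigabytes; the guide does not specify GB vs GiB
TB = 10**12

CIRCLE_RECOMMENDED_MAX = 2 * GB   # cIRcle's recommended per-file size
BOREALIS_BROWSER_MAX = 3 * GB     # Borealis per-file limit for browser uploads
VERY_LARGE_DATASET = 1 * TB       # assumed cutoff for "very large (e.g. TBs)"


def suggest_by_size(largest_file_bytes: int, total_bytes: int) -> list[str]:
    """Return size-related notes for a deposit request (hypothetical helper)."""
    if total_bytes >= VERY_LARGE_DATASET:
        return ["Very large dataset: recommend FRDR; contact the RDM team."]

    notes = []
    if largest_file_bytes <= CIRCLE_RECOMMENDED_MAX:
        notes.append("Within cIRcle's recommended file size.")
    else:
        notes.append("Exceeds cIRcle's recommended file size: contact cIRcle staff.")

    if largest_file_bytes <= BOREALIS_BROWSER_MAX:
        notes.append("Can be uploaded to Borealis via a browser.")
    else:
        notes.append("Too large for Borealis browser upload: Borealis staff may deposit via the API.")
    return notes


# Example: a deposit whose largest file is 2.5 GB and whose total size is 4 GB.
for note in suggest_by_size(int(2.5 * GB), 4 * GB):
    print(note)
```

Size is only one of the considerations in this guide; data type, versioning needs, and access requirements still apply.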

File Formats

Ideally, open and non-proprietary file formats should be used to store and share data. Please refer to the recommended file formats, and recommend that depositors convert their files to those formats when possible.

cIRcle is file type agnostic and can accommodate a variety of file formats, though non-proprietary formats are preferred. Through Open Collections, cIRcle is able to support in-browser viewing of documents (.pdf), images (.jpg, .png, .gif), videos (.mp4), and audio (.mp3).

Borealis is file format agnostic and can accommodate a variety of file formats. However, to make research data previewable and analyzable in a browser while using Borealis, we recommend uploading the data in, or converting it to, CSV (.csv), SPSS (.por, .sav), Stata (.dta), or R (.RData) file formats.

Excel files (.xls, .xlsx) may be directed to either repository.
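
As a rough aid, the in-browser support described above can be expressed as a lookup by file extension. This is a minimal sketch: the extensions come from the text of this guide, while the function name `browser_support`, the dictionary keys, and the case-insensitive matching are assumptions made for illustration.

```python
from pathlib import PurePath

# Extensions listed in this guide; matching is case-insensitive by assumption.
CIRCLE_VIEWABLE = {".pdf", ".jpg", ".png", ".gif", ".mp4", ".mp3"}
BOREALIS_PREVIEWABLE = {".csv", ".por", ".sav", ".dta", ".rdata"}


def browser_support(filename: str) -> dict[str, bool]:
    """Report which in-browser features the guide mentions for this file type."""
    ext = PurePath(filename).suffix.lower()
    return {
        "circle_in_browser_viewing": ext in CIRCLE_VIEWABLE,
        "borealis_preview_and_analysis": ext in BOREALIS_PREVIEWABLE,
    }


print(browser_support("survey_results.sav"))
print(browser_support("report.pdf"))
```

Both repositories accept any file format, so a result of False for both keys (for example, an Excel file) only means the guide names no in-browser support for that extension, not that the file cannot be deposited.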

When there is both Research Data and Documents

Many researchers may wish to deposit both the research data for their study and the publication or supplemental material. There are advantages and disadvantages to having all items for a study (regardless of format) within one repository.

cIRcle is an ideal home for publications with corresponding research data that is static and relatively small (under 2GB). It’s also the place most users expect to find publications from UBC creators. Full-text indexing is provided through the multi-repository search portal, Open Collections, for improved discoverability and access. Researchers may choose, however, to deposit their article in cIRcle and link to their datasets in another data repository if file size, flexibility, or interactivity requirements do not support having related materials in one place.

Borealis is the preferred repository for researchers who want to deposit, describe, and manage their datasets, particularly where versioning is a priority. For extremely large datasets (e.g. spectrometer data), FRDR is recommended.

Linking Records Across Repositories

Depending on priorities and the strengths and limitations of each repository, researchers may find that it makes sense to use multiple repositories for a given research project. For example, researchers might deposit their protocols in OSF, datasets in Borealis, and the final publication in cIRcle. Scholarly communication is enhanced when research inputs are not only made available, but also connected to their final scholarly output. Deposits to cIRcle, OSF, and UBC Dataverse Collection (Borealis) are all citable and provided with a persistent identifier (e.g. DOI, URI, etc.). This means that related records in different repositories can be connected by referring to each record's persistent identifier.

cIRcle

When depositing into cIRcle, link out to related records by adding a note in the Abstract/Description field that includes each related record's persistent identifier.
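
A note of this kind could be assembled as in the following sketch. The function name `related_records_note`, the "Related materials:" wording, and the identifiers (which use the 10.5555 example DOI prefix and do not point to real records) are all hypothetical; cIRcle does not prescribe a format for the note.

```python
def related_records_note(records: list[tuple[str, str]]) -> str:
    """Build a plain-text note listing related records and their identifiers."""
    if not records:
        return ""
    lines = [f"{label}: {identifier}" for label, identifier in records]
    return "Related materials:\n" + "\n".join(lines)


# Placeholder identifiers for illustration only; these are not real records.
note = related_records_note([
    ("Dataset (Borealis)", "https://doi.org/10.5555/example-dataset"),
    ("Protocol (OSF)", "https://doi.org/10.5555/example-protocol"),
])
print(note)
```

The resulting text could then be pasted into the Abstract/Description field when the deposit is submitted.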

UBC Dataverse Collection

When depositing into Borealis, link out to related records by adding their persistent identifiers in the Related Materials field.

Open Science Framework

When depositing into OSF, there are a number of ways to make connections to cIRcle and Dataverse:[12]

Help

For questions about this document or research and data repository services, please contact cIRcle at circle.repository[at]ubc.ca or Research Data Management at research.data[at]ubc.ca.

Version 2.0. Created September 2016. Last updated March 2023.

References