The RI DataHUB is Rhode Island's Statewide Longitudinal Data System (SLDS), housed at the University of Rhode Island. The DataHUB is a persistent, holistic, integrated data system designed to continually take in administrative data from partner state agencies, link data across disparate sources, and provide de-identified request-ready data to approved researchers. The DataHUB currently has over 20 collections of data from 18 sources including multiple state agencies, local governments, and independent non-profits.
The RI DataHUB is built and maintained by DataSpark, a program unit at the University of Rhode Island (URI). Day-to-day operation of the DataHUB, including data preparation, import, linkage, and quality reporting, is managed by the DataSpark team.
However, data in the DataHUB belongs to the partner agencies, not to DataSpark or URI. Therefore, oversight of the DataHUB and decisions about data use and access fall to a governance board made up of representatives of these partner agencies.
There are three main users of the RI DataHUB:
Under data sharing agreements that the University of Rhode Island and its program unit, DataSpark, have established with partner agencies and organizations, we integrate individual-level data on early childhood, K-12 and higher education, wages, unemployment, and health and human services into the system. These administrative records are collected on an ongoing basis by our partners and are updated by DataSpark at regular intervals.
Data are currently integrated from partnering agencies and organizations across Rhode Island including:
There are three ways that a researcher can work with data in the RI DataHUB.
Data use requires the permission of the individual state agencies who own the data. DataSpark can help you navigate the process of obtaining authorization to use DataHUB data.
The DataHUB’s funding comes from federal, state, local, and foundation grants and contracts. It is not funded by URI or by the state’s budget.
Current funding comes from participating state agencies, community partners and researchers. Recent funders of the DataHUB include: RI Department of Education, RI Office of the Postsecondary Commissioner, RI Department of Health, and RI Department of Labor and Training.
Yes. DataSpark, which manages the DataHUB, operates on a fee-for-service model. DataSpark is a Service Center per the University of Rhode Island Office of the Controller Policy 97-03.
Requests for aggregate or individual-level data are subject to DataSpark's standard hourly rate (cost-recovery fees) due to the time and effort required to obtain appropriate legal review, data use and release approvals; fulfill the request; and provide technical assistance to the requestor after the transfer of data.
Depending on the type of Data Use Agreement, agency consent, IRB approvals, etc., DataSpark may release data at the individual (de-identified) or aggregate level.
Aggregate data consists of counts or percentages, and cannot generally be tracked back to a single individual: for example, “70% of students graduated on time.” Per its legal restrictions, DataSpark may release aggregate data to requestors without data use agreements; however, aggregate data must be used responsibly so that individuals cannot be identified from counts of small groups or by combining different aggregations. DataHUB agency partners require suppression of cell counts smaller than 10 individuals.
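As a rough illustration of the small-cell rule above, the sketch below masks any count smaller than 10 in a table of group counts. It is not DataSpark's actual procedure: the function name, the group labels, and the "suppressed" marker are hypothetical, and the sketch does not handle the harder case of values that can be recovered by combining different aggregations.

```python
# Threshold taken from the text: cell counts smaller than 10 are suppressed.
SUPPRESSION_THRESHOLD = 10


def suppress_small_cells(counts, threshold=SUPPRESSION_THRESHOLD, marker="suppressed"):
    """Return a copy of `counts` with every count below `threshold` masked.

    `counts` maps a group label to the number of individuals in that group.
    The marker string is an illustrative placeholder, not an official format.
    """
    return {
        group: (n if n >= threshold else marker)
        for group, n in counts.items()
    }


# Hypothetical example data, invented for illustration only.
example_counts = {"School A": 124, "School B": 7, "School C": 10}
released = suppress_small_cells(example_counts)
print(released)
```

In this example only "School B" is masked; a count of exactly 10 is not smaller than 10, so it is left as-is.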
Individual-level data is information about a single individual. Per its legal restrictions, DataSpark may not release individual-level data with personally identifiable information under any circumstances, and it may not release de-identified individual-level data without a signed data use agreement from the requestor. It’s important to note that statewide longitudinal data systems are not intended to track individuals, as in a case management system. Rather, they are intended to make individual-level data statistically available so that researchers can investigate the relationships between individual characteristics.
Under the right conditions, DataSpark can match researcher-collected data with DataHUB administrative data to measure program outcomes or examine other research questions. The data must be collected with an identifier that is common to our system, participants must have consented to this use of the data or the data must have been collected under a FERPA exception, and the data must be returned to the researcher completely de-identified.
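The matching step described above can be pictured with a minimal sketch: join researcher-collected records to administrative records on a shared identifier, then drop that identifier before the linked rows are returned. Everything here is an assumption made for illustration, including the field name `common_id`, the sample records, and the function itself. Real de-identification involves more than removing one field, and this is not a description of how DataSpark performs linkage.

```python
def link_and_deidentify(researcher_rows, admin_rows, id_field="common_id"):
    """Join two lists of dict records on `id_field`, then remove the identifier.

    Researcher rows with no matching administrative record are skipped.
    `id_field` is a hypothetical name for the identifier common to both sources.
    """
    admin_by_id = {row[id_field]: row for row in admin_rows}
    linked = []
    for row in researcher_rows:
        match = admin_by_id.get(row[id_field])
        if match is None:
            continue  # no administrative record for this participant
        merged = {**match, **row}  # combine fields from both sources
        merged.pop(id_field)       # drop the identifier before returning
        linked.append(merged)
    return linked


# Hypothetical sample records, invented for illustration only.
researcher_data = [
    {"common_id": "A1", "program": "tutoring"},
    {"common_id": "Z9", "program": "tutoring"},
]
admin_data = [
    {"common_id": "A1", "graduated_on_time": True},
    {"common_id": "B2", "graduated_on_time": False},
]
linked_rows = link_and_deidentify(researcher_data, admin_data)
print(linked_rows)
```

Only the participant present in both sources appears in the result, and the returned row no longer carries the identifier.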
Data preparation is usually a straightforward process, but obtaining approvals and importing new or updated datasets where required can be time-consuming and unpredictable. Once Data Use Agreements have been executed, contracts established, and data updates/imports completed, DataSpark can begin preparing your data. Legal documents can take up to 6 months to execute, depending on the complexity of the request and the queues of agency, URI, and researcher legal counsel. DataSpark strives to fill requests in the order in which they were received and in time for requestors’ grant proposal or publication deadlines.
Most of our data sharing agreements with partner agencies require a preview period prior to publication. As a result of these legal agreements, which are also passed along to requestors, and in the spirit of courtesy to our agency partners who collect and provide regular data feeds to the DataHUB as a public service, we require that data users share, at minimum, tabular outputs or Executive Summaries of findings in advance. The time period and the level of detail of findings to be shared are determined by the policy of the agency whose data is requested (often 15 days prior to any release or publication). This period allows participating agencies to prepare for findings and public comment; it is not intended to prevent release of findings. Agency partners may also request a presentation of findings with the researcher’s interpretation of the policy or program implications.