Supported Employment Demonstration Public Use File
Overview of the Supported Employment Demonstration (SED)
Overview of the SED Study Design
The Supported Employment Demonstration Public Use File
The SED Public Use File: Analysis file structure and contents
PUF Documentation
SED PUF Summary Folder Contents
Suggested citation for the Supported Employment Demonstration Public Use File:
Published summary statistics using the SED PUF file:
Overview of the SED PUF preparations
Overview of the Supported Employment Demonstration (SED)
In August 2016, SSA awarded a contract to Westat, Inc., a research firm located in Rockville, Maryland, to implement the SED and evaluate whether offering evidence-based interventions of integrated vocational, medical, and behavioral health services to individuals with behavioral health challenges can significantly reduce the demand for disability benefits and help individuals remain in the labor force. For additional details about the study sample, implementation, activities, and processes, as well as data sources and context, see the final evaluation reports (i.e., the Final Process Analysis Report and the Final Impact and Cost-Benefit Analysis Report) on Social Security Online - Supported Employment Demonstration (ssa.gov), where these and related reports are posted. Westat prepared the SED Public Use File for SSA; the SSA Project Data Disclosure Review Board reviewed and approved it for release in June 2023.
Overview of the SED Study Design
The SED used a mixed-methods randomized controlled trial (RCT) design. This design allowed two treatment alternatives, “Full-Service” and “Basic-Service,” to be compared against a “Usual Services” control group as the counterfactual. Participants randomized to one of the two treatment arms received intervention services from one of 30 SED study sites across 20 states. The SED recruited, enrolled, and randomized 2,944 participants, aged 18 to 50 years, for 36 months of study participation.
The SED sites were community-based organizations that provided social services, including mental health, behavioral health, and employment services to residents in their geographic catchment (service) areas. The sites delivered services to primarily urban residents but also served residents in mixed urban and rural areas. The participating states and the number of sites per state are:
- CA 1 site
- CO 1 site
- FL 1 site
- IL 3 sites
- KS 2 sites
- KY 2 sites
- MA 1 site
- MD 2 sites
- MI 1 site
- MN 1 site
- NC 1 site
- NY 1 site
- OH 3 sites
- OK 1 site
- OR 1 site
- SC 2 sites
- TN 2 sites
- TX 1 site
- WA 1 site
- WI 2 sites
SSA recruited participants from December 2017 through March 2019. The final participants transitioned off the study in March 2022. Outcome analyses, the Final Process Analysis Report, and the Final Impact and Cost-Benefit Analysis Report (see link above) were completed in December 2022.
The treatment interventions integrated supported employment with behavioral health treatment following the evidence-based Individual Placement and Support (IPS) model of employment services. The interventions provided to both treatment groups also included care management services to address barriers to employment and modest financial support for individual work-related expenses and out-of-pocket expenses associated with behavioral health and other care management services not covered by health insurance. The Full-Service intervention also included Medication Management Support (MMS) delivered by a Nurse Care Coordinator (NCC). Usual Services participants received a comprehensive resource manual and sought out services independently as they normally would. A summary ‘Levels of Service’ table for the intervention appears below.
The recruitment period was 16 months (December 2017 through March 2019), allowing staggered enrollment of individuals whose disability claim to SSA for a mental health condition had been denied in the previous 30-60 days.
The Supported Employment Demonstration Public Use File
The SED Public Use File (PUF) is an analytical file consisting of data from the 2,944 enrollees, who provided survey responses at baseline and quarterly throughout their 3-year enrollment in the SED. The analytic file also includes SSA administrative data on participants’ claiming behaviors (e.g., appeals) and outcomes (e.g., allowances and denials, payments) during the study period. The files available in the PUF provide deidentified data for study participants and present final outcomes from the study.
All enrollees, regardless of study arm, participated in quarterly follow-up interviews, in which they provided updates on their employment and use of services during the previous three months. The annual survey (i.e., quarters 4, 8, and 12) included additional items related to health status and functioning. The survey covered the following domains: clinical recovery, employment and earnings, and quality of life. The following table presents the measures where the data source is the survey.
Response rates among eligible participants held above 70 percent for the first two years of study enrollment (quarters 1 through 8). The third year of the study saw a drop-off in completion rates; by Quarter 12, roughly two-thirds (65.3%) of eligible enrollees completed the survey.
The SED Public Use File: Analysis file structure and contents
The PUF data set contains one row per participant (n=2,944). Repeated measures carry suffixes that indicate the timing of the measurement. All timing is relative to each participant’s enrollment date (enrollment dates range from December 2017 to March 2019). The suffixes use the following notation:
Quarters. Quarters after study enrollment are numbered sequentially: _q1, _q2, _q3, …_q12. Baseline measures (data collected at the time of enrollment) have the suffix _q0.
Annual measures. Data covering a year of study enrollment have the suffix _y1, _y2, or _y3 to indicate the year after study enrollment.
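As an illustration of the suffix notation, the following Python sketch splits a column name into its base measure and timing, then reshapes one wide row into long records. It is a minimal sketch and not part of the PUF documentation: the column names (earn_q0, earn_q1, sf12_y1) and both helper functions are hypothetical, and the actual variable names are listed in the PUF codebook. Because the PUF carries no person-level identifier, the sketch uses the row position to tie records together.

```python
import re

# Timing suffixes described above: _q0 .. _q12 (quarters) and _y1 .. _y3 (years).
SUFFIX = re.compile(r"^(?P<base>.+)_(?P<unit>[qy])(?P<num>\d+)$")


def split_timing(column):
    """Return (base, unit, number), or (column, None, None) if there is no valid suffix."""
    match = SUFFIX.match(column)
    if match is None:
        return column, None, None
    unit, num = match.group("unit"), int(match.group("num"))
    # Accept only the ranges documented for the PUF.
    if (unit == "q" and 0 <= num <= 12) or (unit == "y" and 1 <= num <= 3):
        return match.group("base"), unit, num
    return column, None, None


def wide_to_long(row, row_number):
    """Turn one wide participant row (a dict) into a list of long records."""
    records = []
    for column, value in row.items():
        base, unit, num = split_timing(column)
        if unit is None:
            continue  # columns without a timing suffix are skipped
        records.append(
            {"row": row_number, "measure": base, "unit": unit, "period": num, "value": value}
        )
    return records


# Hypothetical column names, for illustration only.
example_row = {"arm": "Full-Service", "earn_q0": 0, "earn_q1": 1200, "sf12_y1": 41.5}
long_rows = wide_to_long(example_row, row_number=0)
```

A long layout of this kind is convenient for plotting or modeling a measure over time, while the wide layout shipped in the PUF keeps one row per participant.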
PUF Documentation
The data set is provided in four formats:
- SAS
- CSV
- STATA
- ASCII
The SED PUF user guide provides further details about the file structure and instructions for using the included weights to conduct weighted analyses. A full list of variable names and labels is provided in SED Contents. The PUF Codebook is contained in Codebook and includes variable names, labels, and frequencies and/or means. A catalog of the formats used to add labels to the dataset is contained in the Formats file.
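The user guide remains the authority on weighted analysis. Purely to illustrate what a weighted point estimate involves, the following Python sketch computes a weighted mean from CSV-formatted rows. Everything specific in it is an assumption: the column names earn_y1 and wt are hypothetical, the data are toy values, and blank cells are assumed to mark missing values (the codebook should be checked for the coding actually used).

```python
import csv
import io


def weighted_mean(rows, value_col, weight_col):
    """Weighted mean of value_col, skipping rows where either field is blank."""
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        value, weight = row.get(value_col, ""), row.get(weight_col, "")
        if value == "" or weight == "":
            continue  # assumption: blank means missing
        numerator += float(value) * float(weight)
        denominator += float(weight)
    if denominator == 0:
        raise ValueError("no usable rows")
    return numerator / denominator


# Toy stand-in for the CSV file; column names are hypothetical.
toy_csv = io.StringIO("earn_y1,wt\n1000,1.5\n,2.0\n3000,0.5\n")
rows = list(csv.DictReader(toy_csv))
result = weighted_mean(rows, "earn_y1", "wt")
```

This yields a point estimate only; standard errors and any subgroup analysis should follow the procedures in the user guide.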
SED PUF Summary Folder Contents
- Public Use Data file (SAS format)
- Public Use Data file (CSV format)
- Public Use Data file (STATA format)
- Public Use Data file (ASCII format)
- Public Use File format catalog
- Codebook containing variable labels, frequency distributions/means, and ranges
- User guide to accompany the PUF
Suggested citation for the Supported Employment Demonstration Public Use File:
U.S. Social Security Administration, Office of Retirement and Disability Policy, Office of Research, Demonstration, and Employment Support. (2023). Overview and Documentation of the Social Security Administration's Disability Analysis File Public Use File [Data file and code book]. Retrieved from SED PUF Files.
Published summary statistics using the SED PUF file:
Due to the recency of posting, no summary statistics using the SED PUF file data have been published.
Overview of the SED PUF preparations
The Program Data Disclosure Review Board (PDDRB) of SSA met with SSA and its Office of Research, Demonstration, and Employment Support (ORDES) component at least weekly from late February 2023 through mid-May 2023. The PDDRB approved the release of the SED PUF on May 10, 2023.
To prepare the Supported Employment Demonstration (SED) Public Use File (PUF), SSA sought to balance two requirements: 1) reduce the risk of identification of SED participants by a public use file user to an acceptable level, and 2) maintain utility for outside researchers. Meeting these requirements required tradeoffs. Given the sensitive nature of the data, the team gave priority to requirement 1 (reducing risk of identification) in making decisions to modify the data for public use.
This page provides a high-level overview of the steps SSA took to prepare the PUF and notes the remaining risks that could not be completely eliminated, along with the mitigating factors associated with each. It outlines the SED study and data collection, the analysis of risk, and the changes made to the PUF to reduce risk, and it lists the files contained in the SED PUF.
SSA believes the SED PUF reflects a proper balance of disclosure risk management and data utility. This assessment rests on the following points:
- There is no PII and no person-level identifier. Open-ended fields that could contain PII have been removed.
- The PUF contains no geographic variables or site information. While ‘catchment areas’ (a type of SED specific service area) are discussed in the User Guide, catchment areas are not defined geographically and are unknown to a PUF user.
- Variables that contain geographically-linked information, open-ended fields, or that provide information on a small and unique population have been dropped.
- Variables have been coarsened (recategorized or top-coded) as described in the ‘checklist.’
- Although there is a very high number of unique-occurrence violations (i.e., 1 or 2 people with a unique combination of values across 4-way to 8-way cross-tabulations of variables), several factors mitigate the risk:
  - Coarsening greatly reduces the chance of isolating a particular person.
  - There are 2,944 enrollees in the PUF, drawn from a pool of nearly 47,000 potential enrollees (denied disability applicants) provided by SSA. Furthermore, because the catchment areas are not defined in the PUF documentation or in publicly available documents, the population of denied disability applicants that could have participated is much larger than the pool of potential participants provided for recruitment. Although we do not have survey response data for the approximately 44,000 potential enrollees who did not enroll, the numbers alone suggest that each person in the PUF may represent more than 3 persons in the potential enrollee population.
  - A PUF user would not know when a participant enrolled in the study or the timing of the baseline survey or follow-up surveys. This reduces the risk of identification for variables measured at baseline that can change over time (e.g., education, marital status) because the reference period is not known to the PUF user.
  - For earnings variables, the calendar period covered by a given quarter differs from enrollee to enrollee. Each quarter (as defined in the data) is counted from the enrollment date, which is not known to the PUF user.
  - Physical and mental health status variables (SF-12, CSI, CIDI, WD-FAB) are scale scores constructed from multiple survey items. The individual survey items have been removed from the PUF. Respondents would not know their own score on each scale, and a PUF user who knows only the constructed score cannot recover the responses to the individual items. This reduces the risk of a “nosy neighbor” identifying a study participant from the constructed scale scores in the PUF, because the respondent could not accidentally disclose the constructed score.
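The unique-occurrence check mentioned in the list above can be illustrated with a short Python sketch. It is a hedged illustration only, not the procedure SSA or the PDDRB used: the function name, variable names, and toy records are hypothetical, and the threshold of 2 simply mirrors the "1 or 2 people" wording above. For each k-way combination of variables, the sketch counts how many distinct value combinations are shared by no more than the threshold number of records.

```python
from collections import Counter
from itertools import combinations


def small_cells(records, variables, k, threshold=2):
    """Count k-way value combinations shared by at most `threshold` records."""
    flagged = 0
    for combo in combinations(variables, k):
        # Tabulate how many records fall in each cell of this k-way crosstab.
        counts = Counter(tuple(record[v] for v in combo) for record in records)
        flagged += sum(1 for n in counts.values() if n <= threshold)
    return flagged


# Toy records with hypothetical variable names and values.
toy = [
    {"race": "White", "educ": "HS", "married": "N", "age_grp": "18-29"},
    {"race": "White", "educ": "HS", "married": "N", "age_grp": "18-29"},
    {"race": "White", "educ": "HS", "married": "N", "age_grp": "18-29"},
    {"race": "Black", "educ": "BA", "married": "Y", "age_grp": "40-50"},
]
variables = ["race", "educ", "married", "age_grp"]
four_way = small_cells(toy, variables, k=4)
two_way = small_cells(toy, variables, k=2)
```

Coarsening a variable merges cells in such a tabulation, which is why it lowers the count of small cells.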
The PDDRB agreed with SSA that additional coarsening, dropping, or recoding would substantially reduce the utility of the data for future research.
Outside researchers may utilize the public use data set to conduct follow-on analyses of program impact. Many of these follow-on analyses will include sub-group analyses using characteristics of the participants in the study. Characteristics such as employment, earnings, race and ethnicity, household size, health status, and quality of life would be key information for an outside researcher. Further coarsening of data would substantially limit the utility of these data for sub-group analysis. For example, race and ethnicity have already been recoded to three categories (White, Black, and Other) to reduce disclosure risks.
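The two coarsening operations named on this page, top coding and recategorization, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SSA's actual recoding: the helper functions, the household-size cap of 6, and the input race categories are hypothetical. Only the three output categories (White, Black, and Other) come from the text above.

```python
def top_code(value, cap):
    """Top coding: report any value above `cap` as `cap`."""
    return min(value, cap)


def recategorize(value, keep, other="Other"):
    """Recategorization: keep the listed categories and pool the rest into `other`."""
    return value if value in keep else other


# Hypothetical inputs, for illustration only.
household_sizes = [1, 3, 9, 12]
races = ["White", "Black", "Asian", "Multiracial"]

coarse_sizes = [top_code(n, cap=6) for n in household_sizes]
coarse_races = [recategorize(r, keep={"White", "Black"}) for r in races]
```

Each operation trades detail for protection: the pooled values can no longer distinguish the rare cases they absorb, which is the utility cost discussed above.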
To prepare the PUF, SSA dropped many variables, recoded or recategorized others, removed direct identifiers, and excluded geography. SSA staff also reviewed hundreds of factual variables that were not recoded and assessed that they are not particularly identifying, either because they are not unique in the population (e.g., diabetes diagnosis) or because they can change over time (e.g., marital status). These assessments were best-judgment calls made during discussions. The SED PUF represents a convenience sample of 2,944 enrollees who provided survey responses at baseline and quarterly throughout their enrollment in the SED. While the size of the known potentially eligible population for the SED (denied disability applicants) suggests that each enrollee likely represents more than 3 persons in that population, we lack administrative data or survey responses from the non-enrolled population to support that claim directly. Hence, this is not a directly traceable sample. Also, given the relatively small sample size of the SED PUF, a very high number of enrollee cases are unique on 4 or more cross-tabulated categorical variables. Again, it is quite likely, but not directly verifiable, that each of those enrollees represents more than 3 individuals in the broader population.
The PDDRB asserted that the disclosure risks associated with the SED PUF have been reasonably mitigated and managed, and that the PUF affords an appropriate balance of disclosure protection and data utility. This assertion rests on the absence of PII and person-level direct identifiers, the absence of geography, the coarsening and blurring of variable values, the presence of values that can change over time, the use of person-specific floating quarters (tied to each individual’s enrollment date rather than to a standard calendar), and the time that has passed since the SED data collection.