The purpose of this page is to give you the tools you will need to analyze the PIAAC dataset. The Data Tools section provides information on how to analyze your dataset. The Data Files are organized by national, international, and trend categories. Before working with PIAAC data, make sure to read our "What you need to consider" guide.
We also highly recommend that you watch modules 2 and 4 of the Distance Learning Dataset Training (DLDT) before beginning your analysis. The DLDT online training modules explain the statistical procedures and methods needed for proper analysis of PIAAC datasets. If you have further questions about using the PIAAC dataset, you may visit our Q&A page or contact us at email@example.com.
Note to users: The OECD released updated public-use data files for Round 1 countries and the data for Round 2 countries (released June 28, 2016). The data updates include the rescaling of the skills use indices and the recalculation of the derived earnings variables. A detailed list of the updated variables can be found here.
SAS and Stata users need macros for analysis that take into account the plausible values and sampling weights. You can download the macros and data analysis manual here. Please note that for Stata, the only macro available is REPEST. For additional information, please read the "What you need to consider" guide.
SPSS users will need to use the IDB Analyzer for analysis, which takes into account the plausible values and sampling weights. The IDB Analyzer provides a user-friendly interface for merging the micro-data files of the participating countries and for creating SPSS syntax that produces accurate statistical results. The IDB Analyzer is a Windows program used in conjunction with SPSS for Windows version 15 or higher. The program is available from the International Association for the Evaluation of Educational Achievement (IEA) website. You can download the IDB Analyzer here. Please visit module 5 of the Distance Learning Dataset Training (DLDT) to review an instructional video on how to use the IDB Analyzer.
The International Data Explorer (IDE) is a user-friendly online tool that can be used for basic analyses such as averages, percentage distributions, and percentiles, as well as a few more advanced functions such as gap analysis and regression analysis. It takes into account the plausible values and sampling weights for you (U.S. IDE: http://nces.ed.gov/surveys/international/ide/; OECD IDE: http://piaacdataexplorer.oecd.org/ide/idepiaac/).
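For readers who want a sense of what "taking into account the plausible values and sampling weights" involves, the following Python sketch may help. It is only an illustration on made-up numbers, not the procedure implemented by the macros, the IDB Analyzer, or the IDE. It assumes that each respondent has several plausible values for a score and one final sampling weight; the function names and the toy data are hypothetical. It computes a weighted mean for each plausible value, averages those means, and derives the imputation variance using Rubin's combination rule. The sampling variance, which the tools above derive from replicate weights, is deliberately left out.

```python
from statistics import mean, variance


def weighted_mean(values, weights):
    """Weighted mean of one plausible value across respondents."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def combine_plausible_values(pv_columns, weights):
    """Average the weighted mean over all plausible values.

    Returns (point_estimate, imputation_variance). The sampling variance
    (normally obtained from replicate weights) is NOT computed here.
    """
    estimates = [weighted_mean(column, weights) for column in pv_columns]
    m = len(estimates)
    point_estimate = mean(estimates)
    # Between-imputation variance, inflated by (1 + 1/m) as in Rubin's rule.
    imputation_variance = (1 + 1 / m) * variance(estimates)
    return point_estimate, imputation_variance


# Hypothetical toy data: 3 respondents, 2 plausible values each.
pv_columns = [
    [250.0, 300.0, 270.0],  # first plausible value for each respondent
    [260.0, 290.0, 300.0],  # second plausible value for each respondent
]
weights = [1.0, 2.0, 1.0]   # one final sampling weight per respondent

point, imp_var = combine_plausible_values(pv_columns, weights)
print(point, imp_var)
```

Using a single plausible value, or ignoring the weights, would still produce a number, but it would not reflect the measurement uncertainty or the sample design, which is why the dedicated tools are recommended for actual analyses.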
Introduction to PIAAC Data Files
The PIAAC 2012/14 data are now available in a U.S. national database. The national database contains additional variables that are unique to the United States, such as respondents’ race and ethnicity. The 2012/14 national database contains the total sample from the first and second rounds of U.S. household data collection (including older adults, ages 66-74), while the international database contains only the internationally comparable sample (excluding older adults, ages 66-74). Both the international and national databases come with documentation and example programs for reading in data and conducting analyses. These resources are listed in the section below.
Note that the U.S. PIAAC 2012 data are also found in both an international database and a U.S. national database. The international database contains all information collected through the international PIAAC instruments.
These data files, which contain individual unit record data in SAS and SPSS format, are available for download for each country participating in PIAAC other than Australia. They contain both responses to the background questionnaire and the cognitive assessment. Full documentation of the contents of the files is provided. Please note that these files are primarily designed for use by researchers and data analysts. The files are also large and may take some time to download.
U.S. National Data Files and Resources
Public Data Files: The U.S. national PIAAC 2012/14 database can be found under the "PIAAC Publications & Products" section of the NCES PIAAC website. The U.S. databases include all U.S. data, both international and U.S.-only variables. Resources available include:
- U.S. PIAAC database (in SPSS and SAS formats)
- U.S. codebook and data compendia
- U.S. national technical report
Restricted Data Files: A U.S. restricted-use database is also available for data that were suppressed in the public-use dataset because of confidentiality concerns for respondents. Restricted-use databases contain more detailed information, such as continuous age and earnings variables. To access the restricted-use data, you must apply for and obtain a restricted-use license from NCES. More information on the process is available at: http://nces.ed.gov/pubsearch/licenses.asp
Click here to view information on the PIAAC U.S. 2012/14 restricted data files page ›
2012 Public and Restricted Data Files: The 2012 datasets contain data only from the 2012 Main Study sample and do not include data from the additional sample collected in the 2014 U.S. National Supplement. The 2012/14 dataset includes data from the expanded national sample, which supports more accurate and reliable national estimates, estimates for the subgroups oversampled in the National Supplement (young adults and unemployed adults) and, in the case of older adults, estimates for new groups not represented in the first round of PIAAC. The original 2012 data were updated, reweighted, and revised with the release of the PIAAC 2012/14 restricted-use dataset. The 2012/14 dataset should be used for analyses instead of the original version unless you are seeking to reproduce historic analyses.
Click here to view information on the PIAAC U.S. 2012 restricted data files page ›
Synthetic Restricted Data Files: The U.S. PIAAC synthetic restricted-use file (S-RUF) is available to researchers outside the U.S. so that they can prepare computer code for analyzing PIAAC data on the U.S. restricted-use file (RUF). The U.S. RUF contains some variables that are not included in the U.S. international public-use file (PUF), as well as data for some international variables at greater levels of disaggregation than are available in the PUF. The structure of the S-RUF is similar to that of the actual RUF, which allows researchers to develop and test computer code for analytical routines. Because the data on the S-RUF are synthetic, no conclusions can be drawn from the output generated.
Researchers wishing to run analyses on the actual U.S. RUF can submit their SAS, SPSS, or Stata code to firstname.lastname@example.org, where the requested analyses will be run on the real U.S. RUF. The output will undergo a confidentiality review and will be returned to the researcher after approval.
Click here to access the PIAAC U.S. synthetic restricted data files ›
International Data Files and Resources
- PIAAC international database (in SPSS and SAS formats)
- International codebook and data compendia
- International technical report (Round 1 and Round 2)
Australia Public Data Files: To access the Australian Public Use File, please write to email@example.com
Trend Data Files and Resources
IALS (1994-1998), ALL (2003-2008)
Rescaled public-use micro-data files from the International Adult Literacy Survey (IALS) and the Adult Literacy and Lifeskills Survey (ALL) are available from Statistics Canada. These data files have rescaled literacy scores (IALS and ALL) and rescaled numeracy scores (ALL) that can be used in trend analysis with PIAAC scores. The rescaled public-use files for IALS and ALL, which include data from all of the countries that participated in these surveys, can be requested from Statistics Canada by emailing firstname.lastname@example.org. The request should specify "rescaled public-use data files for IALS (1994-1998) and ALL (2003-2008)". The files will be available for download once you have signed a license.
International Log Files
Log Files (PIAAC 2012 Round 1 countries)
The Log Files contain a record of the interactions between respondents and the PIAAC computer testing application during the course of the assessment. They are useful for better understanding test-taking behavior and the strategies and processes followed by respondents in responding to test items. Individual records can be matched with corresponding background and cognitive response data available in the PIAAC PUFs using the SEQID variable. Resources available include:
- Public use files containing log data from the PIAAC cognitive assessments
- Full documentation of the contents of the files
- Customized data analysis tool, the PIAAC LogDataAnalyzer
PIAAC log file data are available for 18 countries that participated in the first round of the PIAAC study (PIAAC 2012), including the United States.
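The matching of log records to PUF records on SEQID described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official procedure: it assumes both sources have already been exported to CSV for a single country and that SEQID is unique within the PUF extract. Every column name other than SEQID (item, time_on_task, gender) and both helper functions are hypothetical. If files from several countries are combined, an additional country key may be needed alongside SEQID.

```python
import csv
import io


def index_by_seqid(rows):
    """Map each SEQID to its row; assumes SEQID is unique within the file."""
    return {row["SEQID"]: row for row in rows}


def attach_background(log_rows, puf_rows):
    """Attach PUF variables to every log record that shares its SEQID."""
    puf_by_id = index_by_seqid(puf_rows)
    merged = []
    for log_row in log_rows:
        # Log records without a PUF match keep only their own columns.
        background = puf_by_id.get(log_row["SEQID"], {})
        merged.append({**background, **log_row})
    return merged


# Hypothetical miniature extracts standing in for the real files.
log_csv = "SEQID,item,time_on_task\n101,U01a,35.2\n102,U01a,12.9\n"
puf_csv = "SEQID,gender\n101,1\n102,2\n"

log_rows = list(csv.DictReader(io.StringIO(log_csv)))
puf_rows = list(csv.DictReader(io.StringIO(puf_csv)))

merged = attach_background(log_rows, puf_rows)
for row in merged:
    print(row)
```

Each merged row then carries both the log-derived fields and the background variables for the same respondent, so test-taking behavior can be analyzed alongside respondent characteristics.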