Linking 1940 U.S. Census Data to Modern Aging Surveys
At the Minnesota Population Center, where I worked as a postdoctoral researcher from 2016 to 2019, we link 1940 full-count U.S. census data to five modern aging surveys:
(1) Wisconsin Longitudinal Study (WLS), co-sponsored by the University of Wisconsin.
(2) Health and Retirement Study (HRS), co-sponsored by the University of Michigan-Institute for Social Research (ISR).
(3) Panel Study of Income Dynamics (PSID), co-sponsored by the University of Michigan-Institute for Social Research (ISR).
(4) National Social Life, Health, and Aging Project (NSHAP), co-sponsored by the University of Chicago-NORC.
(5) National Health and Aging Trends Study (NHATS), co-sponsored by the Johns Hopkins Bloomberg School of Public Health and Westat in Rockville, Maryland.
In addition, we are working with the Urban Transition Historical GIS Project at Brown University-S4 to add geographic information to the census side of the linked data.
Please see IPUMS-USA for information on the 1940 U.S. census data. Ancestry.com provides us with the raw and digitized census data.
In the linking projects, we implement a variety of automated record linkage algorithms, including Stata's reclink module, the Jaro-Winkler string-similarity algorithm, and a probit-based machine learning algorithm.
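To illustrate the string-comparison step, here is a minimal sketch of Jaro-Winkler similarity in plain Python, together with a toy matching rule. It is not the project's actual linking code: the field names (surname, birth_year, id), the exact-birth-year blocking, and the 0.9 threshold are all assumptions made for the example.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def best_match(record, candidates, threshold=0.9):
    """Return the highest-scoring candidate with the same birth year, or None.

    Hypothetical rule: block on exact birth year, compare surnames only.
    """
    best, best_score = None, threshold
    for cand in candidates:
        if cand["birth_year"] != record["birth_year"]:
            continue
        score = jaro_winkler(record["surname"], cand["surname"])
        if score >= best_score:
            best, best_score = cand, score
    return best


# Made-up records for illustration only.
survey_record = {"surname": "JOHNSON", "birth_year": 1935}
census_records = [
    {"id": 1, "surname": "JOHNSTON", "birth_year": 1935},
    {"id": 2, "surname": "JOHNSON", "birth_year": 1936},
    {"id": 3, "surname": "SMITH", "birth_year": 1935},
]
match = best_match(survey_record, census_records)
```

In practice a linkage would compare several fields (given name, birthplace, parents' names) and check that the best candidate is unique, but the scoring idea is the same.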
The resulting linked census-survey samples can be used to study a variety of research questions, such as the effects of early-life characteristics (e.g., childhood lead exposure) on later-life outcomes, or the relationship between parental social mobility and children's later-life earnings.
Surname-Based Ethnicity Variable in Historical U.S. Census Data
In this project, I create a surname-based ethnicity variable for foreign-born males in the 1920 and 1930 U.S. full-count census data. I use several machine learning algorithms (e.g., naïve Bayes, OLS, SVM, and logistic regression) to classify ethnicity based on the linguistic origin of the surname. This ethnicity variable can be used to study ethnic residential segregation, heterogeneity in post-migration labor market outcomes, intermarriage, and related topics.
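As a rough illustration of how a surname can be classified by linguistic origin, the sketch below trains a small multinomial naïve Bayes model on character trigrams. The feature choice (trigrams with start and end markers), the add-one smoothing, the class name SurnameNaiveBayes, and the handful of training surnames are all assumptions for the example; they are not the project's features, labels, or data.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(surname: str, n: int = 3) -> list:
    """Character n-grams of a surname, with ^ and $ marking start and end."""
    padded = f"^{surname.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class SurnameNaiveBayes:
    """Multinomial naive Bayes over character n-grams (hypothetical helper)."""

    def __init__(self, n: int = 3, alpha: float = 1.0):
        self.n = n
        self.alpha = alpha                      # additive smoothing constant
        self.class_counts = Counter()           # surnames seen per class
        self.ngram_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocab = set()

    def fit(self, surnames, labels):
        for surname, label in zip(surnames, labels):
            self.class_counts[label] += 1
            for gram in char_ngrams(surname, self.n):
                self.ngram_counts[label][gram] += 1
                self.vocab.add(gram)
        return self

    def log_scores(self, surname: str) -> dict:
        """Unnormalized log posterior for each class."""
        total_names = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, count in self.class_counts.items():
            total_grams = sum(self.ngram_counts[label].values())
            score = math.log(count / total_names)  # class prior
            for gram in char_ngrams(surname, self.n):
                numerator = self.ngram_counts[label][gram] + self.alpha
                denominator = total_grams + self.alpha * vocab_size
                score += math.log(numerator / denominator)
            scores[label] = score
        return scores

    def predict(self, surname: str) -> str:
        scores = self.log_scores(surname)
        return max(scores, key=scores.get)


# Tiny illustrative training set, not drawn from the census data.
training = {
    "Italian": ["Rossi", "Bianchi", "Ricci", "Marino", "Romano"],
    "German": ["Schmidt", "Schneider", "Schulz", "Hoffmann", "Zimmermann"],
    "Polish": ["Kowalski", "Nowak", "Wisniewski", "Kaminski", "Lewandowski"],
}
names = [name for group in training.values() for name in group]
labels = [label for label, group in training.items() for _ in group]

model = SurnameNaiveBayes().fit(names, labels)
predicted = model.predict("Dombrowski")
```

With real data, the same structure would be trained on a large set of surnames with known origin, and the predicted class (or the class scores) would become the ethnicity variable.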