
United States of America

Progress report of the UNECE Task Force on subjective poverty measures, Thesia Garner (U.S. Bureau of Labor Statistics)

Objective poverty measures alone are not sufficient to understand the complexity of poverty; subjective measures can complement them in important ways, especially with regard to reaching the poorest and making their voice heard. Given this fact, during the 2019 Conference of European Statisticians Bureau meeting, subjective poverty measurement was selected as a topic for in-depth review (/ECE/CES/2019/14/Add.13).



UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE

Subjective Poverty

Report prepared by the UNECE Task Force on Subjective Poverty Measures


Acknowledgements

This Report has been prepared by the UNECE Task Force on Subjective Poverty Measures, which consisted of the following members representing national statistical offices, international organizations, and academia:

Thesia Garner, U.S. Bureau of Labor Statistics – Chair of the Task Force

Nikki Graf, U.S. Bureau of Labor Statistics
Jake Schild, U.S. Bureau of Labor Statistics
Andrew Heisz, Statistics Canada
Kimberly Newman, Statistics Canada
Christine Laporte, Statistics Canada
Eric Olson, Statistics Canada
Alex Miller, Statistics Canada
Rania Abdulla, Statistics Canada
Rana Maarouf, Statistics Canada
Jarl Quitzau, Statistics Denmark
Daniel Gustafsson, Statistics Denmark
Yafit Alfandari, Israel
Ellys Monahan, Office for National Statistics
Ellys Croal, Office for National Statistics
Tim Vizard, Office for National Statistics
Andrew Zelinsky, Office for National Statistics
Anna Szukiełoć-Bieńkuńska, Statistics Poland
Maria Vyshnikova, Belarus
João Hallak Neto, Brazilian Institute of Geography and Statistics (IBGE)
Leonardo Santos de Oliveira, Brazilian Institute of Geography and Statistics (IBGE)
Agata Kaczmarek-Firth, Eurostat
Estefania Alaminos Aguilera, Eurostat
Carlotta Balestra, OECD
Elena Danilova-Cross, UNDP Regional Bureau for Europe and CIS
Esther Dzifa Bansah, UNDP Regional Bureau for Europe and CIS
Alexander Kirianov, CIS-Stat
Gerardo Leyva, INEGI Mexico
Adriana Pérez, INEGI Mexico
Gwyther Rees, UNICEF
Siraj Mahmudlu, UNICEF
Sabina Alkire, OPHI
Fanni Kovesdi, OPHI
Tomas Zelinsky, Durham University (United Kingdom)
Martina Mysikova, Institute of Sociology of the Czech Academy of Sciences


Table of Contents

Chapter 1. INTRODUCTION
Chapter 2. FOCUS ON SUBJECTIVE POVERTY
I. INTRODUCTION
II. DEFINITION OF SUBJECTIVE POVERTY
A. Contrast to objective poverty
B. Frameworks for subjective poverty
C. Collection and analysis of subjective poverty at National Statistical Offices
D. Collection and analysis of subjective poverty at International Agencies
III. WHY MEASURE SUBJECTIVE POVERTY AND A BRIEF REVIEW OF THE LITERATURE
A. Why measure subjective poverty?
B. Evolution of subjective poverty measurement
Chapter 3. APPROACHES FOR MEASUREMENT AND ANALYSIS
I. APPROACHES TO MEASUREMENT
A. Qualitative Questions not Focused on Specific Levels of Income (or Consumption)
Identification
Evaluation
Prediction
B. Qualitative Categorical Questions Focused on Specific Income (or Consumption)
Evaluation
Prediction
C. Money Metric Valuation Questions
II. ANALYSIS
A. Relationships
B. Subjective Poverty Lines
Leyden Poverty Line based on Money Metric Evaluation Question
Intersection Method Based on the Minimum Income Question
Quasi Leyden Poverty Line Based on the Deleeck Question
An Approach Based on Proportional Odds Logistic Regression
An Approach Based on Dichotomized Data
C. Country/international organization examples
Chapter 4. STATCAN contribution
Methods of data collection and guidelines
Survey Frame and sample considerations
Traditional surveys
Case Study 1: National Survey of Self-reported Well-being (ENBIARE) 2021 of Mexico
Omnibus Survey
Case Study 2: The Quality of Life framework for Canada
Opinion Poll Survey
Rapid response
Case Study 3: The U.S. Census Bureau Household Pulse Survey Financial Well-being Question
Web-panel
Crowdsourced surveys
Case Study 4: Using crowdsourced data
Administrative and registry data
Case Study 5: Use of administrative data for sampling and calibration of EU-SILC at Statistics Denmark
Sources of error: concerns with response and representativeness
Validity and relationship to other measures of poverty and economic well-being
Quality reports and validating data
Advantages of subjective poverty measures
Disadvantages of subjective poverty measures
Differences in personal opinion
Timeframe for data collection and release
Cross-sectional versus longitudinal data collection
OECD subjective well-being guidelines
Hypothetical assessments of subjective poverty
What is the role of question wording?
Statistics Canada
Cognitive tests Bureau of Labor Statistics
Framing and mode effects
Subjective poverty and the evolution of measures
Case Study 5: Subjective assessments versus objective measures of poverty – discussion of the definitions of selected poverty measures based on the Polish edition of the EU-SILC survey
What is the role of defining minimums in assessing one’s subjective poverty position?
What is the role of geographic differences in prices?
What is the role of household composition and assumptions regarding sharing?
What is the role of Social Transfers in Kind (STIK)?
What is the role of housing wealth and imputed rent?
What is the role of differences in “culture” and religion?
Concluding remarks on hypothetical questions
Lessons learned from COVID-19
Subjective Poverty in SEIA Questionnaires and Comparability Analysis
Poverty defined in a fully subjective way (direct self-identification as poor, feeling of poverty)
Perceived financial difficulties
Subjective poverty line approach – perceived poverty line
Subjective poverty lines assessed with the use of statistical methods (so-called objectivised, quasi-subjective poverty lines)
Perception of poverty as a social phenomenon
Other Approaches
An overview of UNDP Socio-Economic Impact Assessments (SEIAs) for households in countries of UNECE region
Case study 6: Self-assessed Financial Well-being: comparing objective and subjective measures
Overlaps in Dimensions of Poverty
Implications regarding experience with COVID outbreak
Conclusion
Chapter 5. RECOMMENDATIONS
Appendix


Chapter 1. INTRODUCTION

Objective poverty measures alone are not sufficient to understand the complexity of poverty; subjective measures can complement them in important ways, especially with regard to reaching the poorest and making their voice heard.

Given this fact, during the 2019 Conference of European Statisticians Bureau meeting, subjective

poverty measurement was selected as a topic for in-depth review (/ECE/CES/2019/14/Add.13).

This was followed up by an in-depth review of subjective poverty measures which was presented

before the Bureau of the Conference of European Statisticians (CES) in October 2021. This was

largely based on a paper prepared by Statistics Poland summarizing survey responses from

National Statistical Offices from 52 countries, with additional information regarding

international activities. Reference is also made to another study, conducted by the United Nations Development Programme during the COVID-19 outbreak in 2020, covering 15 countries/territories in the Europe and Central Asia region.

A summary of the in-depth review follows (from document ECE/CES/BUR/2021/OCT/2):

1. Both the literature review and research practices indicate different ways of

understanding and defining the term subjective poverty. This indicates a need to clarify

terminology and develop a system of concepts related to the measurement of subjective

poverty.

2. At present, both at national and international level, objective indicators play a

dominant role in monitoring the phenomenon of poverty, and statistical offices give

priority to the production of these data. The measurement of subjective poverty is

generally very limited or not considered at all.

3. In the framework of “official statistics”, direct self-identification as poor is very rarely

used. In most countries, household surveys include questions on subjective

assessments of living standards, which can provide a basis for calculating indirect

measures of subjective poverty. However, in practice these data are not fully exploited

for the analysis of subjective poverty.

4. The omission of the subjective approach, as complementary to the objective

measurement, significantly weakens the diagnosis of poverty. In this context it seems

important to disseminate knowledge on the usefulness and interpretation of subjective

data on poverty.

5. Taking into consideration the conclusions of the review of methods used to measure subjective poverty and the opinion of National Statistical Offices on the usefulness of

work in this area at international level, it is proposed to develop a guide on methods for

measuring subjective poverty and to agree on a short list of harmonised subjective

poverty indicators for international comparisons. To ensure the implementation of these

tasks it is proposed to establish under the umbrella of the Conference of European

Statisticians a Task Force on Subjective Poverty Measurement.


The Bureau asked the UNECE Secretariat, together with the Steering Group on Measuring

Poverty and Inequality, to prepare a proposal for follow-up work addressing the priority areas

raised in the in-depth review, considering the discussions on subjective poverty at the meeting

of the Group of Experts on Measuring Poverty and Inequality in December 2021. During the

December meeting it was suggested that a task force be created to consider going beyond

quantitative approaches to measuring poverty to include qualitative measures as well.

The UNECE Secretariat together with the Steering Group on Measuring Poverty and

Inequality prepared terms of reference for the Task Force on Subjective Poverty Measures.

The objective of the Task Force was to develop a guide on measuring subjective poverty,

including a set of subjective poverty indicators that could be used for international

comparison. As noted from CES Bureau discussions in October 2021 and February 2022, the

proposed list of subjective poverty indicators to be developed should be coherent, holistic, and

short. The indicators should relate to existing international work, i.e., to the measuring of

subjective perception of living conditions defined in the EU Survey on Income and Living

Conditions (EU-SILC), and to the OECD guidelines on measuring subjective well-being. The

proposed guide on measuring subjective poverty should include a list of indicators, the related

conceptual considerations, and guidelines on how to develop the indicators. In follow-up,

electronic consultations with the CES member States on the in-depth review of subjective

poverty measures were conducted in April-May 2022 (for reference, see

ECE/CES/2022/9/Add.1, 31 May 2022). The following 13 countries replied to the electronic

consultation: Austria, Belarus, Canada, Costa Rica, Denmark, Finland, Hungary, Lithuania,

Mexico, Poland, Russian Federation, Turkey, and Ukraine.

A summary of comments from these consultations follows:

1. All responding countries welcomed the outcome of the in-depth review paper and

expressed support for further steps in the area.

2. The proposal to develop a guide on measuring subjective poverty, containing a description of approaches and best practices, a system of indicators and the methodology behind their measurement, as well as further recommendations for statistical services concerning international comparisons, was highly valued.

3. Poverty in general, as well as subjective poverty, are complex phenomena. Clear terminology and unambiguous interpretation are preconditions for international

harmonisation. Different economic, social, political, and cultural conditions across

countries should be taken into consideration when measuring subjective poverty.

4. The use of the subjective approach as complementary to the objective measurement can

be a very useful and efficient diagnostic tool of poverty. It allows for a better

understanding of what poverty means to people and verifying whether objective

evaluations of poverty are consistent with social experience. At the same time, having more than one measure of poverty could be challenging at the national and policy levels, and a large dissemination effort would likely be required for additional measures of poverty to become sufficiently widely used.


5. There was some agreement that the proposed list of subjective poverty indicators to be

developed should be coherent, holistic, and short.

According to Members of the Task Force and experts responding to the survey and electronic

consultation with National Statistical Offices representatives, subjective poverty measurement

is not an alternative to objective poverty measurement but should be considered as

complementary. The subjective approach shows the problem of poverty from a completely

different perspective than the objective one.

Applying a subjective approach allows for a better understanding of what poverty means to people and for verifying whether objective evaluations of poverty are consistent with the social perception of this phenomenon. Subjective measures also provide information on 'public moods,' which can influence people's behaviour in the economic, social, and political spheres. Statistical analyses related to the use of subjective and quasi-subjective measures may

also be used to verify and even construct measures of an objective nature (e.g., the consensus

method for constructing deprivation indices, verification of equivalence scales used).

The purpose of this guide is to enrich the subjective assessment of poverty by improving the

understanding of what people think it means to be poor and by going beyond a purely

economic approach to poverty measurement. This guide builds upon existing UNECE

networks of experts in measuring poverty and inequality and follows the methodological work

under the Conference that has led to the publication of the Guide on poverty measurement in

2017 and the Guide on disaggregated poverty measures in 2020.

Chapter 2. FOCUS ON SUBJECTIVE POVERTY

I. INTRODUCTION

Scholars across different disciplines of the social sciences agree that poverty is a multidimensional phenomenon. It is well recognized that traditional resource-based indicators (e.g., income compared to an official poverty line) alone cannot fully capture the complex nature of well-being, and thus ignoring measures other than the traditional, objective income/expenditure-based ones can distort the overall picture. As with objective measures, the focus of this report is poverty defined in terms of people not having the economic resources to realize a set of basic “functionings” or a minimum level or standard of living (Sen 1985, 1993).1 Whether this minimum level has been achieved, however, can be assessed using subjective measures, not just objective ones.2 As with other measures of poverty, this achievement can be influenced by many factors (see Figure 1). While poverty can be approached from various perspectives, including domains such as human rights or sustainable development, the UNECE Task Force on Subjective Poverty determined that its primary focus would be on economic poverty.

1 An alternative conceptualization of poverty is based on scarcity theory (Mullainathan and Shafir, 2013). Following this theory, poverty can be defined as “the gap between one's needs and the resources available to fulfil them” (Mani et al., 2013, 976). Identifying one’s needs and this gap is based on subjective assessments and can be used to define poverty.

2 There is much research on the dynamic relationship between subjective and objective measures. For example, many sociologists write about it with regard to social boundaries and identity, including Lamont and Mizrachi (2012), Mizrachi and Zawdu (2012), and Harold et al. (2021). Blanchflower and Bryson (2023) explore the role that COVID-19 and the Great Recession had on objective and subjective well-being.

The challenge for National Statistical Offices is to develop measures that can tie various

aspects of poverty together, and that then could be used by governments to determine how

effective policies are in supporting people in meeting minimum needs. We propose that

subjective measures be included among the set of assessment tools used by countries. We are

not proposing that these replace objective measures or multidimensional measures; rather that

these be included in the arsenal used by countries to assess poverty. The Stiglitz et al. (2009) report cites the need for a wider perspective and recommends that objective and subjective measures of well-being be included in a dashboard. The OECD references this report and its recommendations as a motivation behind collecting subjective well-being data (OECD, 2013). Additionally, following the report, Eurostat developed the EU-SILC ad-hoc module on “well-being” in 2013. All of this has led to the creation of the OECD Better Life Initiative (2023), which includes objective and subjective measures but no measure of poverty specifically. The primary purpose of this chapter is to provide an overview of the theoretical and conceptual background of subjective poverty measurement.

II. DEFINITION OF SUBJECTIVE POVERTY

To understand the concept of subjective poverty, we start with a description of what is subjective, emphasizing its relevance within the context of welfare. Something is subjective if it reflects one’s personal views, experiences, preferences, attitudes, values, or background and arises out of one’s own perceptions. In developing these perceptions, individuals compare their perceived status against their own standards of desirability. These perceptions are influenced by each respondent’s own income/expenditures/wealth, personality, family influences (e.g., background such as religion, disability of family members), and subjective well-being (e.g., happiness, life satisfaction in general) plus views regarding one’s community, society at large, and the general economy. Along these lines, many people now are familiar with the more broadly defined concept of “subjective well-being,” which focuses on life satisfaction or happiness (Mahoney 2023). Indicators of subjective poverty can be seen as complements to indicators of subjective well-being, with both drawing on related measurement methods.3 An early contribution to the quantification of happiness in surveys was Cantrilʼs (1965) idea of the “ladder of life.” For subjective well-being, see, for example, Diener (1984) and Kashdan (2004). Early applications of subjective welfare concepts in economics included van Praag (1968), Kapteyn and van Praag (1976), and Easterlin (1974). Though the origins of subjective welfare come from happiness or life satisfaction, we focus here on subjective economic welfare and specifically subjective poverty.

Figure 1. Concepts used in the definition or measurement of poverty
[Diagram elements: economic resources; set of feasible functionings (capabilities); realized functionings; subjective welfare; (dis)abilities and circumstances; preferences; personal standards and expectations]
From Karel Van den Bosch, Identifying the Poor, Ashgate Publishing, Hampshire, England, 2001, page 6.

The determination of whether an individual or household is poor is based on their situation

compared to a standard which could be objectively or subjectively determined and could be

assessed in terms of a money-metric response (e.g., with respect to levels of income,

expenditures, consumption, or wealth) or qualitative categorical response (e.g., one’s

perception of being poor or satisfaction with one’s income). For subjective poverty, measures

do not rely on any externally given absolute or relative resource-based threshold or measure.

Rather, they rely on individuals’ own assessments of their economic situation, or of others’ economic situations. For example, being in poverty based on a subjective measure could mean being below a subjectively defined national threshold, experiencing a state

of being that is less than that of others, or experiencing a state of being that is less than one’s

own standard such as reporting having great difficulty making ends meet. The majority of

subjective assessments, particularly those associated with poverty, reflect the respondent’s

own situation; however, other questions refer to hypothetical situations or families.

Assessments referring to another’s living conditions or expectations regarding minimum living

standards are often referred to as hypothetical or consensual. In this report we consider

hypothetical/consensual measures as a type of method for assessing subjective poverty. A

detailed discussion comparing the use of the respondent’s own situation or a hypothetical one

is provided in Chapter 4.

A. Contrast to objective poverty

Subjective and objective assessments of poverty are related; however, they are distinct. When considered together, they provide a more comprehensive view of poverty. Objective approaches are typically based on household income, expenditures, consumption, wealth, access to or possession of various goods or services, or “attainment” of certain observable and “objectively” measurable variables. On the other hand, subjective approaches rely on respondents’ self-assessments of their own or another’s financial and/or material situations and reflect all circumstances of their living conditions. With subjective measures there are particular concerns about methodological issues such as comparability (across people and time), validity, reproducibility, and generalizability cross-nationally. While objective measures, such as a specific income level, can be influenced by these same circumstances, the reporting of this income is not expected to be influenced by one’s self-assessment of one’s financial situation. The objective approach is typically preferred by national and international statistical offices, as the data are often readily available from large-scale household surveys and cross-country comparisons are more easily understood; however, (low) income represents only one dimension of poverty.

3 See Simona-Moussa (2020) for a recent study of subjective wellbeing and measures of vulnerability to poverty considered together.

To produce valid and practical poverty standards for a country, subjective assessments are also

needed. These assessments provide insight into how well people are faring personally and

adapting to policies to alleviate poverty. In addition, they can be used as indicators of

economic insecurity or vulnerability regarding needs that are unmet by current policies.4 For

example, a family may have income that is just above an objectively defined poverty

threshold, but still may have difficulty meeting its material needs due to circumstances not

accounted for in this objective measure. In this case, a subjective measure can provide

additional information for the development of policies to improve the economic well-being of

such families that income alone has not been able to address.
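The situation described above can be illustrated with a minimal sketch. It assumes hypothetical survey records that hold a household's income and a yes/no response on having great difficulty making ends meet. The `Household` class, the `classify` function, the threshold of 10,000, and all values are invented for illustration and do not come from any survey discussed in this report.

```python
from dataclasses import dataclass


@dataclass
class Household:
    household_id: str
    income: float               # annual income, in the same units as the threshold
    difficulty_ends_meet: bool  # True if the respondent reports great difficulty


def classify(household: Household, threshold: float) -> str:
    """Cross-classify a household on one objective and one subjective indicator."""
    objective_poor = household.income < threshold
    subjective_poor = household.difficulty_ends_meet
    if objective_poor and subjective_poor:
        return "poor on both measures"
    if objective_poor:
        return "objectively poor only"
    if subjective_poor:
        return "subjectively poor only"
    return "poor on neither measure"


# Hypothetical objective poverty threshold and illustrative records (not real data).
THRESHOLD = 10_000
households = [
    Household("A", 9_000, True),
    Household("B", 11_000, True),   # just above the threshold, yet struggling
    Household("C", 9_500, False),
    Household("D", 25_000, False),
]

labels = {h.household_id: classify(h, THRESHOLD) for h in households}
for household_id, label in labels.items():
    print(household_id, label)
```

Household B is the case the paragraph has in mind: it would not be counted as poor by the income threshold alone, but the subjective indicator flags it for attention.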

B. Frameworks for subjective poverty

Recent UNECE studies have proposed alternative frameworks to group questions that can be

used for the measurement of subjective poverty. The UNECE Guide on Poverty Measurement

(2017) proposed grouping questions into three groups: (1) ability to meet various needs

focused on financial restrictions faced by the household; (2) considering oneself as poor via

individual self-assessment; and (3) income necessary to make ends meet and households’

minimum perceived needs. In a 2021 report published by the Conference of European

Statisticians, Statistics Poland presents a framework based on responses to a survey on current

country practices for measuring subjective poverty (2021). They classify questions as (1)

direct identification, (2) perceived financial difficulty, and (3) a subjective poverty line

approach. The subjective poverty line approach is divided into two subcategories: perceived

poverty line and statistical methods.

The purpose of subjective poverty questions is to provide a subjective measure of the welfare

space, where the “welfare space” is defined as economic poverty. To measure the welfare

space, we first need to operationalize it. Ravallion (2014) suggested there are two approaches

to measuring subjective poverty based on responses. The first approach asks for a money

metric of subjective welfare, and the second approach uses qualitative categories in the

welfare space. Adopting Ravallion’s suggestion, we propose a framework for thinking about

subjective poverty questions based on the same two approaches. Our framework aligns closely

with the work by Statistics Poland and the UNECE proposal, while also taking into

consideration the qualitative categorical classification proposed by the OECD in their 2023

report, Subjective Well-being Measurement: Current Practices and New Frontiers.5

4 For an example, see Duvoux and Papuchon (2019a,b) and Bertolini et al. (2017).

5 Alternative frameworks are available when discussing subjective wellbeing more generally, rather than subjective poverty specifically. For example, Ryff (1989) discusses wellbeing questions using the framework of eudaimonic (psychological) and

Money metric questions ask respondents to report a specific monetary value. The subject of

these questions is typically income or expenditures with respect to some attribute, such as

ability to make ends meet, satisfaction, or adequacy of consumption. Such questions were designed for

estimation of subjective poverty lines.6 Though attempts have been made to apply simpler

methods, such as averaging responses to subjective quantitative questions (for example, respondents’

reported minimum income to meet basic needs), or contrasting the responses directly with

actual income (comparing respondents’ actual income with their reported minimum incomes),

these (naïve) methods lead to less reliable results. This is because individuals often

misperceive the true minimum income. Econometric methods have been developed that are

based on the intersection of actual and reported minimum incomes that produce reliable results

(Knight and Gunatilaka, 2012; Garner and Short, 2005). It is the multidimensionality of

factors considered by respondents and the heterogeneity in their answers that predetermines

the necessity to apply appropriate econometric techniques to analyze the subjective

quantitative questions.7

In contrast, qualitative questions rely on categorical responses, rather than a specific monetary

value, and typically ask respondents about perceptions of their (or a hypothetical household’s)

material, financial, or economic situation. For instance, does the respondent consider his/her

family to be poor? Yes or No. The goal of such questions is for respondents to assess their

situations holistically as opposed to providing a particular income or expenditure. When

assessing their financial or economic situation, respondents are expected (and sometimes

asked specifically) to consider factors such as income sufficiency, the extent of their savings

and other financial assets, their ability to repay debt, and their capacity to cover unexpected

expenses. Within the concept of qualitative questions, we further operationalize the welfare

space by specifying three subcategories or groups based on what the question is asking of the

respondent: evaluation, identification, and prediction. More detailed descriptions of the money

metric and qualitative categorical questions, as well as examples, are provided in Chapter IV

Section A.

hedonic (life satisfaction, negative affect, and positive affect). In their 2013 report “Subjective Well-Being: Measuring Happiness, Suffering, and Other Dimensions of Experience,” the National Academies of Science (NAS) build on Ryff’s framework. They classify subjective wellbeing questions as evaluative, experienced, and eudaimonic. The 2023 OECD report, Subjective Well-being Measurement: Current Practices and New Frontiers, presents a similar framework, classifying questions as evaluative, affective, and eudaimonic (page 6) as follows. (1) Life evaluation: Evaluative measures of subjective well-being refer to the general assessments people make of their lives, or specific aspects of it, and is most commonly captured through an indicator asking respondents to reflect on how satisfied they are with their lives (i.e. life satisfaction). Domain satisfaction measures, relating to how satisfied one is with various aspects of one’s life, also fall under the evaluative heading. (2) Affect: Affective measures capture people’s feelings, emotions or states, often measured with respect to a defined time period (e.g., “over the course of yesterday”, etc.). (3) Eudaimonia: Eudaimonia can be thought of as psychological flourishing, operationalised in the Guidelines as a measure of feeling one’s life has purpose or meaning, though also containing aspects of autonomy, competence and self-actualisation.

6 While subjective monetary measures that ask about income or expenditures might be more useful in developed countries, measures focusing on consumption could be more relevant for lesser developed ones. Consumption-based measures typically focus on one’s assessment of the value of consumption needed for the respondent to feel well-off and account for not just income but all resources available, for example, home production and uses of credit and access to wealth.

7 See Chapter IV Section B for an overview of the most common estimation procedures.

C. Collection and analysis of subjective poverty at National Statistical Offices

Measurement and analysis of subjective poverty tend to be neglected or omitted by most

National Statistical Offices. This was the conclusion of Statistics Poland based on an in-depth

review of current country practices for measuring subjective poverty that was tasked by the

Bureau of the Conference of European Statisticians, under the auspices of the United Nations

Economic Commission for Europe (UNECE 2021).8 Seven of the 52 countries surveyed did

not report collecting any information or conducting any work related to subjective poverty.9

Among the remaining 45 countries, all reported asking subjective poverty questions, but only

a small subset of these regularly produce, analyze, and publish data in this area. However, 37

of the respondents saw a need to prepare a guide providing an overview of the methods used to

measure subjective poverty, and 34 countries were in favor of working on a short list of

subjective poverty indicators for international comparison.

Another study with data collected from national statistical offices was conducted by the United

Nations Development Programme (UNDP). The focus of this study was Socio-Economic

Impact Assessments (SEIAs) of households and their response to COVID-19 (Danilova-Cross

2022). Information was collected from 15 countries, six of which reported the collection

and use of subjective poverty measurement.10 Five of these embarked on the collection of

primary data to support the measurement, and one, Serbia, reported making use of subjective

poverty measures in its annual national surveys. In the surveys, households were asked

questions to assess their perceptions of how the COVID-19 pandemic changed household

income levels and their ability to meet material and non-material needs or household expenses

as they fall due. This approach “gave a voice to respondents and sought to determine poverty

criteria on the basis of their opinions and experiences resulting from the pandemic. Employing

this method in socio-economic impact assessments is of particular importance as it helps

gauge where economic hardship is being experienced in the face of a global pandemic” (page

6).

It should be noted that the results of the study conducted by Statistics Poland and the one

conducted by the UNDP (Danilova-Cross 2022) are based on reports from National Statistical Offices

regarding country-specific measurement and analysis. Several statistical offices have

conducted analyses in an experimental capacity or commissioned research to be done by

individuals outside of their agency. Much of this work is cited and discussed in the brief

review of the literature provided in the next chapter.

8 A copy of the report can be found via the following link: https://unece.org/sites/default/files/2021-10/02_In-depth_review_Subjective_poverty.pdf.

9 These seven countries are Azerbaijan, Czech Republic, Dominican Republic, Georgia, Japan, Mongolia, and United States. Although the Czech Republic did not report collecting data related to subjective poverty, they participate in the European Union Statistics on Income and Living Conditions Survey, which does collect data related to subjective poverty. It should also be noted, after this survey was conducted the United States began collecting data related to subjective poverty via the Household Pulse Survey. For more information about this question see Garner et al. (2020).

10 These six include: Kyrgyz Republic, Moldova, Serbia, Tajikistan, Ukraine, and Uzbekistan.

D. Collection and analysis of subjective poverty at International Agencies

In contrast to the lack of work in this area by National Statistical Offices, several international

organizations have demonstrated positive practices in measuring some aspects of subjective

poverty. Two agencies in particular are Eurostat and the OECD.

At the European level, EU-SILC11 is the EU reference source for comparative statistics on

income, social inclusion and living conditions.12 The EU-SILC survey, which is managed by

Eurostat (European Commission), is a household and individual data collection whose output

is harmonised because it is regulated by legislation.13 Among the different variables that EU-SILC

collects, some (e.g., in the field of subjective assessments of living standards,

questions about making ends meet) constitute a potential data source for measuring some

aspects of subjective poverty at the European level (e.g., estimating quasi-subjective poverty

lines or calculating indirect measures of subjective poverty).

On the basis of EU-SILC data, analytical work in the area of subjective poverty has been

carried out by various research centres (Zelinsky et al., 2022). In addition, Eurostat, on the

basis of a harmonised question included in EU-SILC, calculates and publishes on its website

the indicator “Inability to make ends meet” as a monetary measure of subjective poverty.14

This makes it possible to compare, at the European level15, measures of objective poverty

with people’s feelings of subjective economic poverty, identified as stress in the survey.

The OECD has been collecting evidence on subjective poverty through Compare your Income

(CYI), a web-based interactive tool that allows users to explore income statistics and compare

how well or badly off they are, and test whether their perceptions are in line with the actual

situation in their country.16 The web-tool was launched in 2015 and has so far collected more

than 2 million entries. Over the course of years, the web-tool attracted a varied audience,

thanks to the fact that it covers all OECD countries (except Colombia, for which

internationally comparable income data are currently missing), is available in eight languages,

and has been widely promoted. The OECD uses the data from the CYI in two ways. First,

subjective poverty lines and equivalence scales are derived and compared with the equivalence

scale used by the OECD for official reporting. The results of this analysis are unpublished at the

time of writing this report. Second, although not focused on subjective poverty, data on

perceptions of income inequality across countries have been published in an OECD report,

Does Inequality Matter?: How People Perceive Economic Disparities and Social Mobility

(2021).

11 EU-SILC - the European Union Statistics on Income and Living Conditions Survey

12 See: https://ec.europa.eu/eurostat/web/income-and-living-conditions/overview

13 In addition, Eurostat issues yearly methodological guidelines which provide extended explanations and recommendations on the implementation of the data collection.

14 See: https://ec.europa.eu/eurostat/databrowser/view/ILC_MDES09__custom_6666774/default/table?lang=en

15 EU-SILC provides cross-sectional and longitudinal data for the 27 European Union Member States, Iceland, Norway, Switzerland, Albania, Kosovo, Montenegro, North Macedonia, Serbia and Türkiye.

16 See: https://www.oecd.org/wise/compare-your-income.htm

III. WHY MEASURE SUBJECTIVE POVERTY AND A BRIEF REVIEW OF THE LITERATURE

A. Why measure subjective poverty?

The conventional and the most commonly adopted approach to measuring poverty is based on

an indirect or so-called “welfarist” approach. This approach relies on the assumption that

individuals are rational and can reasonably be considered the most capable assessors of the

kind of life and pursuits that optimize their personal satisfaction and happiness (Duclos and

Araar, 2006). Within this conceptual framework, assessments of poverty are typically based on

measures of income or resources. As these indicators are observed and generally considered

objectively measurable, this can also be referred to as the objective approach. In this context,

along with an additional set of assumptions, income is seen as a measure of individual welfare

as all welfare-relevant goods and services can be purchased through market transactions.

When the measure is based on resources, including in-kind transfers and home production, the attainment of

an individual’s welfare is not limited to market transactions. Shortfalls in income or resources

can be interpreted as shortfalls in economic welfare or poverty. Nanda and Banerjee (2021) as

well as van Praag and Ferrer-i-Carbonell (2006) point out that objective measures of poverty

based on income may not be appropriate for developing nations because such societies are not

completely “monetarized” and there is a considerable amount of home production and in-kind

transfers. For such countries, the broader resource measure would be more appropriate. Or as

suggested by Ravallion (2016), consumption could be a better measure of welfare particularly

when considered in terms of individuals’ subjective evaluation of the adequacy of their

consumption. Furthermore, there is no generally agreed objective standard for where to draw

the income threshold that defines poverty.

To build an argument for the addition of subjective measures to assess poverty, we again turn

to Sen (2007), who noted:

“ – p –

w w k w

x p w

p .”

This definition is founded within his argument that welfare should be thought of in terms of a

person’s capabilities or the functionings, not just income, that a person is able to achieve (Sen,

1985, 1993). Based on this approach, someone is poor when they have limited freedoms or

chances of realizing their own lifestyle. Also noted by Sen (1992, p. 107) is the following

warning, “We are not entirely free to characterize poverty in any way we like…There are some

clear associations that constrain the nature of the concept [i.e., poverty].” Given this guidance

and wisdom, one could attempt the Sisyphean task of trying to define “limited freedoms” in

order to establish a poverty threshold or the task could be given to the people via subjective

poverty questions, thereby establishing poverty criteria on the basis of public opinion.

Alternatively, one can ask about financial difficulty, minimum income, and other subjective

poverty questions directly as measures of a person’s ability or inability to lead a decent –

minimally acceptable – life.17 Regardless of the subjective measures selected, drawing upon

the UNECE (2020) recommendation for deprivation measures (28.1), a key criterion is that the

measures be “based on clear and explicit theory or normative definition of poverty to ensure

that the questions used are valid indicators of poverty as opposed to unrelated concepts of

general wellbeing or happiness.”

17 See Van den Bosch (2001) for a discussion of these options in his treatise, Identifying the Poor: Using Subjective and Consensual Measures.

It should be noted that although we provide an argument for measuring subjective poverty, we

do not advocate for subjective poverty to replace objective measures. Rather, measures of

subjective poverty should be seen as complements to objective measures. This

recommendation aligns with the recommendations made by the Commission on the

Measurement of Economic Performance and Social Progress (Stiglitz et al., 2009). Their

report emphasizes the importance of developing robust measures of social connections,

political voice, and insecurity that can predict life satisfaction, using both objective and

subjective data. And, in addition, the report highlights the need for statistical offices to

incorporate objective and subjective indicators that capture people’s life evaluations, hedonic

experiences, and priorities in their surveys.

B. Evolution of subjective poverty measurement

Early subjective well-being questions and measures were adapted to fit a narrow definition of

economic welfare. For example, the Cantril ladder was designed to ask respondents to rank

themselves on a ladder with steps numbered from zero at the bottom to ten at the top,

supposing that the top of the ladder represents the best possible life, and the bottom of the

ladder represents the worst possible life. This scale has been used to assess subjective well-

being with results currently included in the OECD WISE dashboard for countries.18

An example of using such a ladder for subjective qualitative poverty measurement is a

rich/poor scale included in the Eurobarometer survey for the first time in 1976 (Riffault,

1991). The ladder included seven rungs with the bottom rung representing “poor”. An example

of directly labelling the rungs with respect to poverty (e.g., “poor”, “borderline”,

“non-poor”) was used by Mangahas (1995). In the economics literature, these types of

questions are also referred to as the Economic Welfare Question or Economic Ladder Question

(Ravallion and Lokshin, 2002, Ravallion, 2014). The current European survey EU–Statistics

on Income and Living conditions (EU-SILC) applies a 6-point scale question asking

households to self-evaluate their ability to make ends meet with respect to their income, which

is a monetary version of the question that can be used to assess subjective poverty.

The most common approach to identify poor populations using qualitative categorical

responses to an Economic Ladder Question is to set an arbitrary threshold based on one or

more bottom ladder rungs (Carletto and Zezza, 2006, Mysíková et al., 2019). Though a

threshold must be selected by the researcher (i.e., a category below which the household is

identified as poor), the advantage is that such an approach does not require specifying a

monetary value of the subjective poverty line (Duvoux and Papuchon, 2019). Attempts to

estimate the subjective poverty line based on the categorical welfare ladder questions are less

frequent (Piasecki and Bieńkuńska 2018, Pradhan and Ravallion, 2000, Želinský et al., 2020,

see section III.B). The references cited represent three different estimation methods; however,

only the method by Pradhan and Ravallion – explained below – has been used in several other

papers, but mostly by the same group of authors.
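The threshold approach described above can be illustrated with a minimal sketch. The function name, the six-rung ladder, and the response data are hypothetical and chosen only for illustration; the point is that the resulting rate depends directly on how many bottom rungs the researcher treats as "poor".

```python
from collections import Counter


def subjective_poverty_rate(rungs, poor_rungs):
    """Share of respondents whose reported ladder rung is in `poor_rungs`."""
    if not rungs:
        raise ValueError("no responses supplied")
    counts = Counter(rungs)
    poor = sum(counts[r] for r in poor_rungs)
    return poor / len(rungs)


# Hypothetical responses on a six-rung ladder (1 = poorest, 6 = richest).
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]

# The threshold is the researcher's choice: bottom rung only, or bottom two rungs.
rate_bottom_one = subjective_poverty_rate(responses, {1})
rate_bottom_two = subjective_poverty_rate(responses, {1, 2})
```

With these illustrative responses, moving the threshold from the bottom rung to the bottom two rungs changes the share identified as poor, which is why the choice of threshold should be reported alongside any rate.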

18 See https://www.oecd.org/wise/measuring-well-being-and-progress.htm

Pradhan and Ravallion (2000) proposed employing a different type of qualitative categorical

question, the Consumption Adequacy Question (CAQ). Using this question avoids asking

respondents (in Jamaica and Nepal; Lokshin et al., 2006, applied the CAQ in Madagascar)

about a precise amount of income needed to make ends meet. Households, especially in rural

areas, may have different concepts of income, making the answers to Minimum Income

Questions (MIQ) less comparable. The issues raised concerned whether income covers cash income only

or also other components of total income, such as imputed income from own housing and

production (e.g., a family farm) or production costs. Therefore, instead of asking about

minimum income, respondents were asked to evaluate if their consumption of various

commodities (food, housing, clothing) was adequate or not. Thus, this approach drops the

monetary component and applies categorical questions instead to facilitate respondents’

answers.

Another source of objections to subjective indicators arises from latent heterogeneity, a

phenomenon occurring when people with similar observable characteristics (for example, age,

income, education) but different latent personality traits provide different responses to

subjective welfare questions (Ravallion, 2014). In other words, people may employ different

criteria when assessing their well-being, as they may hold distinct perceptions of what

constitutes “wealth” or “poverty” and what signifies satisfaction or dissatisfaction in their lives

(Beegle et al., 2012). As further argued by Ravallion, even individuals with similar observable

and latent personality traits may use different criteria to assess their welfare, which can be

influenced by a “frame-of-reference” bias (Ravallion, 2008). To address this concern, Beegle

et al. (2012) used vignettes to test for bias due to latent heterogeneity in individual scales of

subjective welfare. Respondents were asked: “Imagine a 6-step ladder where on the bottom,

the first step, stand the poorest people, and the highest step, the sixth, stand the rich. On which

step are you today?” In a later section of the questionnaire, respondents were asked to place

four vignettes of hypothetical families on the six-step ladder and then to place themselves on

the same scale. Their findings demonstrate the presence of a frame-of-reference effect on

individuals’ subjective well-being (SWB), indicating that people from diverse socioeconomic backgrounds

consistently employ distinct scales when responding to inquiries about their welfare.

Nevertheless, their results indicate that this factor is not a significant source of bias in

producing subjective poverty lines.

In contrast, money metric questions, at times, ask individuals to state a concrete amount that

represents a certain living standard. Although such questions ask for very specific amounts of money,

objections to them also arise. The concept of a money metric approach to measure

subjective economic poverty was first introduced by van Praag (1968, 1971), with the Income

Evaluation Question (IEQ). The IEQ asked respondents to provide explicit income values that

they considered “very bad” to “very good”, with a number of options in between. The answers

to the IEQ from all respondents were fitted to a utility function with the formula of the log-normal

distribution function (van Praag and Kapteyn, 1994). The derived poverty line is

referred to as the Leyden Poverty Line (LPL). Such questions were primarily designed to be

used for econometric modelling of subjective poverty lines, which then would be compared to

respondents’ actual income.
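For reference, the log-normal form mentioned above can be written out as follows. The notation is assumed here for illustration and is not taken from the cited papers: y denotes income, μ and σ are location and scale parameters estimated from the IEQ answers, Φ is the standard normal cumulative distribution function, and δ is a welfare level chosen by the analyst.

```latex
U(y) \;=\; \Lambda(y;\mu,\sigma) \;=\; \Phi\!\left(\frac{\ln y - \mu}{\sigma}\right),
\qquad
U(y_\delta) = \delta \;\Longrightarrow\; y_\delta = \exp\!\bigl(\mu + \sigma\,\Phi^{-1}(\delta)\bigr)
```

Under this form U lies between 0 and 1, so a poverty line can be read off as the income at which U reaches the chosen level δ. The choice of δ is a normative assumption rather than something the data determine.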

The foundation of a model-based approach to produce subjective poverty lines is the MIQ, a

specific case of the IEQ. The MIQ, again a monetary-based subjective question (Kapteyn et al.,

1988, Kapteyn, 1994, Goedhart et al., 1977), asks what income is needed to make ends meet.

The Subjective Poverty Line (SPL) is econometrically estimated such that the expected

minimum income equals actual income across the population rather than at the individual

household level (see section III.B for estimation details). Objectively measured income

normalised by the SPL is used as the welfare indicator, i.e., actual income below SPL

identifies the subjectively poor population. Flik and van Praag (1991) compared the LPL and

SPL and concluded that LPL seems to be theoretically superior to the SPL given the fact that

IEQ is a multi-level question, while MIQ is a one-level question, which makes the latter more

likely to be subject to random response fluctuations.
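The intersection idea behind the SPL can be sketched in a few lines. This is a minimal illustration, not the estimation procedure of any particular study: it assumes a log-linear relationship between reported minimum income and actual income with no other covariates (such as family size), fits it by ordinary least squares, and solves for the income at which the predicted minimum equals actual income. The function name and the data are hypothetical.

```python
import numpy as np


def estimate_spl(actual_income, minimum_income):
    """Income level at which predicted minimum income equals actual income.

    Assumes ln(y_min) = a0 + a1 * ln(y), fitted by ordinary least squares.
    The intersection then solves ln(y*) = a0 / (1 - a1).
    """
    log_actual = np.log(np.asarray(actual_income, dtype=float))
    log_minimum = np.log(np.asarray(minimum_income, dtype=float))
    # polyfit returns coefficients from highest degree down: slope, then intercept.
    slope, intercept = np.polyfit(log_actual, log_minimum, 1)
    if slope >= 1.0:
        raise ValueError("no intersection: fitted slope must be below 1")
    return float(np.exp(intercept / (1.0 - slope)))


# Synthetic, illustrative data (not from any survey): reported minimum income
# rises with actual income, but less than proportionally.
actual = np.array([1000.0, 2000.0, 3000.0, 4000.0, 6000.0])
reported_minimum = np.exp(4.0 + 0.5 * np.log(actual))

spl = estimate_spl(actual, reported_minimum)
# Households with actual income below the SPL are counted as subjectively poor.
poverty_rate = float(np.mean(actual < spl))
```

Because the line is defined at the population level, a household can report a minimum income above its own income and still be classified as non-poor if its actual income exceeds the estimated SPL.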

Simplified methods based on averaging responses to questions of subjectively evaluated living

standards or comparing the responses directly to the actual income (referred to as the

individual method) are less common but have been applied, for instance, by Vrooman and

Hoff (2004), Thijssen and Wildeboer Schut (2005), and Mysíková et al. (2019). These latter

approaches are presumed to produce less reliable subjective poverty measures than those

based on the model-based subjective poverty lines. The simplified or naïve methods have been

criticized for “heterogeneity, such that people at the same standard of living can give different

answers on subjective welfare” (Ravallion, 2014, pp. 146–147; Pittau and Zelli, 2023). One

way to control for this heterogeneity is to use monetary-based subjective questions to estimate

model-based subjective poverty lines (Goedhart et al., 1977, Kapteyn et al., 1988).
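For contrast, the two simplified methods described in this paragraph can be written down directly. The sketch below is illustrative only: the function names and the four-household data set are hypothetical, and the code makes no adjustment for the heterogeneity discussed above.

```python
def average_method_line(minimum_incomes):
    """Averaging method: the poverty line is the mean reported minimum income."""
    return sum(minimum_incomes) / len(minimum_incomes)


def individual_method_rate(actual_incomes, minimum_incomes):
    """Individual method: a household is poor if its own actual income is
    below its own reported minimum income."""
    poor = sum(1 for a, m in zip(actual_incomes, minimum_incomes) if a < m)
    return poor / len(actual_incomes)


# Hypothetical households: actual income and reported minimum income.
actual = [900, 1500, 2500, 4000]
minimum = [1000, 1400, 2600, 3000]

line = average_method_line(minimum)
rate = individual_method_rate(actual, minimum)
```

In this toy data the third household is counted as poor under the individual method even though its income is above the averaged line, which shows how the two simplified methods can classify the same household differently.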

Rather than derive the subjective poverty lines based on income, Morissette and Poulin (1991)

for Canada and Garner and Short (2003, 2004) for the U.S., used a similar question, the

Minimum Spending Question (MSQ) to assess poverty based on subjective questions. For the

U.S., Garner and Short (2005) compared MSQ-based lines to household expenditure outlays.

They concluded that such a question resulted in poverty thresholds/rates similar to those

based on NAS methods (NRC 1995).

A similar approach was introduced by the Centre for Social Policy (CSP). For this approach,

the subjective line is derived based on the MIQ question but is only applied to a subsample of

respondents (Deleeck, 1977, Deleeck et al., 1984). The subsample is selected based on a

monetary, categorical question that asks respondents to evaluate on a 6-point scale how they

can make ends meet with their actual household disposable income. This question is known as

a “Deleeck” question (also included in EU-SILC survey, see Chapter III, Box 7) and the

derived poverty line as a CSP poverty line. The method only selects respondents who

classified themselves as making ends meet “with some difficulty”, as these are assumed to be

on the margin of poverty and consequently to have the best knowledge of the situation. After

excluding outliers, the CSP poverty line is derived as an average value of the minimum

between the actual household income and the reported subjective minimum income (from

MIQ).
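The CSP calculation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the field names and the data are hypothetical, and the outlier-exclusion step is omitted because the rule used for it is not specified here.

```python
def csp_poverty_line(households, reference_category="with some difficulty"):
    """Average, over the reference group only, of the smaller of actual
    income and reported minimum income. Outlier exclusion is not applied."""
    values = [
        min(h["income"], h["minimum_income"])
        for h in households
        if h["make_ends_meet"] == reference_category
    ]
    if not values:
        raise ValueError("no households in the reference category")
    return sum(values) / len(values)


# Hypothetical records; only the "with some difficulty" group enters the line.
sample = [
    {"income": 1800, "minimum_income": 2000, "make_ends_meet": "with some difficulty"},
    {"income": 2400, "minimum_income": 2200, "make_ends_meet": "with some difficulty"},
    {"income": 5000, "minimum_income": 3000, "make_ends_meet": "easily"},
    {"income": 900, "minimum_income": 1500, "make_ends_meet": "with great difficulty"},
]

csp_line = csp_poverty_line(sample)
```

Only two of the four hypothetical households contribute to the line, which makes concrete how strongly the result depends on the size and composition of the reference group.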

The selection of the subsample assumes that the poverty line must be determined by

respondents who are at the border of poverty as these have the best knowledge of the situation.

Some researchers considered this assumption to be too strong and disagreed with the strong

dependence of the poverty line on the choice of the subsample of respondents, especially

because the reference group could possibly include only a few people (Flik and van Praag,

1991). Alternative methods and modifications broadly based on LPL, SPL or CSP lines have

been further developed in the literature.

Subsequent literature raised concerns about how respondents interpret the MIQ (Garner and de

Vos, 1995) and that the concept of income may not be well-defined for respondents, especially

in developing countries (Pradhan and Ravallion, 2000).19 De Vos and Garner (1991) analyzed

the relationship between expenditures and responses to MIQ. Consequently, perceptions of

minimum expenditures started to supplement or supplant income.

Garner and Short (2003, 2004) discussed a notion that respondents consider a higher living

standard when answering the MIQ than the MSQ. The reasons might be that respondents could

include savings or loan payments in the minimum income, while they are asked to focus

specifically on spending and basic necessities such as food, shelter, clothing and other

essential items for daily living in the MSQ. The MIQ refers to a broader set of needs than

MSQ. Therefore, they suggested the higher MIQ-based line as representing a “social minimum

standard”, while the lower MSQ-based line could be considered a “subsistence minimum

standard”. The difference between MIQ-based and MSQ-based SPLs was shown using U.S.

data.

Similar to the CSP method in that qualitative and quantitative questions were used together to

estimate the poverty line, Pradhan and Ravallion (2000) used responses to the Consumption

Adequacy Question (CAQ) in combination with actual reports of consumption. Specifically,

respondents were asked to evaluate if their consumption of various commodities (food,

housing, clothing) was adequate or not. Two methods were used to estimate the subjective

poverty line, both based on regressions. Method (1) anchors the subjective poverty line to the

perceived adequacy of food consumption alone; Method (2) also includes non-food

consumption, but the approach is the same in both cases. The difference is that in Method (2)

Pradhan and Ravallion also estimate a reduced-form Engel curve to make “an allowance for

the remaining components of spending which is an estimate of the expected value for someone

consuming the subjective poverty line level for core expenditure.”

Chapter 3. APPROACHES FOR MEASUREMENT AND ANALYSIS

I. APPROACHES TO MEASUREMENT

Following the framework developed in Chapter 2, we provide a discussion of the various

approaches to measuring subjective poverty. We divide qualitative categorical response

questions into three groups: identification, evaluation, and prediction. The first two align

closely with what are considered standard notions of poverty, while prediction more closely

aligns poverty with economic insecurity or vulnerability. In contrast, a money metric question

requires a specific money value response. A description of each type of question is provided in

this section. To help elucidate this framework, we provide examples of subjective poverty

questions along with the descriptions; we limit our presentation to the country responses to

the 2021 UNECE survey developed by Statistics Poland.

19 This is the case especially for subsistence farmers, who are a significant group of poor, but may not impute income/expenditure for the produce which they use for their own consumption.

Figure 1 presents the number of countries asking subjective poverty questions by type as

collected in the 2021 UNECE survey. The pie chart in the center of the figure shows 29

countries report asking only monetary subjective poverty questions, 3 countries report asking

only non-monetary questions, and 13 countries report asking both types of questions. Among

country representatives who reported asking monetary questions, as well as countries who

reported asking non-monetary questions, “evaluation” was the most frequently reported

subcategory, with 40 countries reporting monetary evaluation questions and 14 countries

reporting non-monetary evaluation questions. A more detailed breakdown of the questions can

be found in Table A.1, which provides counts of the number of questions by type, by country.

Figure 1: Number of Countries Asking Subjective Poverty Questions by Type

Note: Data comes from the responses to the UNECE survey developed by Statistics Poland.

Several responses to the survey fell more into the area of measuring deprivation, social

exclusion, or well-being, rather than subjective poverty, which are outside the guidelines of

this Task Force. Therefore, we did not include them in our analysis; however, we do make a

record of these responses in Table A.1. They are classified as “other.” 22 countries reported

asking at least one question that fell outside the scope of subjective poverty.

A. Qual tative Que tio ot Focu e o Spec c Level o I co e (or Co u ptio )

Identification

“Identification” is the most direct way of collecting data on subjective poverty. This type of

question asks respondents to identify themselves as poor or experiencing poverty in a

qualitative sense based on a categorical response. Countries can then use the responses to this


question to produce simple statistics to describe the subjective poverty status of their

population. Only four of the 52 countries (i.e., Colombia, Israel, Kyrgyz Republic, and Viet Nam) reported questions in which the respondent was asked to identify themselves or their household as poor or feeling at risk of poverty. There was no standard question wording across countries. See Box 1 for examples of questions from Colombia and the Kyrgyz Republic.

Box 1. Examples of Qualitative Categorical Identification Questions

[Colombia] Do you consider yourself poor?

• Yes

• No

[Kyrgyz Republic] How do you assess the circumstances of your household?

• Rich

• Average

• Poor

• Very poor

Evaluation

Qualitative categorical evaluation questions ask respondents to assess their economic or

financial situation holistically with respect to some attribute such as satisfaction. Fourteen countries report asking a categorical evaluation question, with the most frequently used question wording asking respondents about their current financial situation. See Box 2 for examples. Canada, Hungary, Norway, and Switzerland reported asking questions using this phrasing.20

Five countries asked respondents to indicate their level of satisfaction with their financial

situation using a scale from 0 to 10; however, the scales were not uniformly defined. Canada

designates their scale as “very dissatisfied” (0) to “very satisfied” (10), whereas the other countries have scales that range from “not at all satisfied” (0) to “completely/very satisfied” (10).

In addition to the 0 to 10 scale, Canada also includes a satisfaction question where the

responses follow a 5-point Likert scale.21 Even though the questions are worded similarly

across countries, because the scales are defined differently, cross-country comparisons,

specifically with Canada, are not possible.

Box 2. Examples of Qualitative Categorical Evaluation Questions, Current Financial

Situation

[Canada] How do you feel about your finances?

0 – Very dissatisfied

10 – Very Satisfied

[Switzerland] In general, how satisfied are you with the current financial situation of your

household?

20 In the UNECE CIS report (2023), Kazakhstan was also identified as asking a categorical evaluation question with wording focused on satisfaction with one’s financial situation.

21 The 5-point Likert scale used by Canada was (1) very satisfied, (2) satisfied, (3) neither satisfied nor dissatisfied, (4) dissatisfied, and (5) very dissatisfied.


0 – Not satisfied at all

10 – Completely satisfied

The next most common qualitative categorical question is to ask respondents how they

perceive their current financial or economic situation compared to a reference point in the past.

See Box 3 for examples. Two countries, Colombia and Ukraine, ask respondents to consider

“12 months ago” and “the last 12 months,” respectively. In contrast, Belarus and Finland use

the “previous year” as the reference point, with Finland specifying the calendar year in the

question. The different wording can result in different reference periods. For example,

consider an individual being interviewed in December of 2020. A respondent asked to consider

the last calendar year (all of 2019) will likely answer differently than if their reference point

was the previous 12 months (December 2019 through December 2020) or even 12 months ago

(December 2019). All countries use a 5-point Likert scale for responses.

Box 3. Examples of Qualitative Categorical Evaluation Questions, Current Financial

Situation Compared to the Past

[Colombia] How do you consider the economic situation of your household compared to 12

months ago?

(1) Much better

(2) Better

(3) Same

(4) Worse

(5) Much worse

[Finland] Compared to the previous year, that is [20XX-1], has your financial situation:

(1) Changed significantly for the better

(2) Changed somewhat for the better

(3) Remained unchanged

(4) Changed somewhat for the worse

(5) Changed significantly for the worse

Another frequently reported qualitative categorical question asked respondents to select a

phrase from a set of options that best describes their current financial situation. See Box 4 for

examples. Denmark, Lithuania, and Netherlands all report asking this type of question. The

phrases respondents select from can provide a detailed picture of their financial situation. For

example, one of the options Lithuania offers is “we are having to draw on our savings.”

However, similar to the problem previously encountered when asking respondents how they

feel about their financial situation, cross-country comparisons are only possible if the response

options are worded in a comparable manner.

Box 4. Examples of Qualitative Categorical Evaluation Questions, Describe Current

Financial Situation

[Denmark] How is the present financial situation of your household, or in other words:


• Do you spend more than you earn?

• Do you find it difficult to make ends meet?

• Are you able to put money aside?

[Lithuania] Which of these statements best described the current financial situation of your

household:

• We are saving a lot

• We are saving a little

• We are just managing to make ends meet on our income

• We are having to draw on our savings

• We are running into debt

Prediction

The final type of qualitative categorical question is “prediction,” which asks respondents to

consider how they think their current financial, material, or economic situation will change

over a specified period. See Box 6 for examples. Four countries report asking this type of

question: Belarus, Colombia, Hungary, and Ukraine, and all four use the next twelve months or next year as the prediction period. However, a country could also ask about the next six

months, two years, or even longer, depending on whether they are interested in measuring

respondents’ short- or long-run perceptions.

Box 6. Examples of Qualitative Categorical Evaluation Questions, Prediction

[Colombia] […] in 12 months

compared to now?

• Much better

• Better

• Same

• Worse

• Much worse

[Ukraine] How do you think the material status of your household could change for the next

12 months?

• It will get better

• It will remain without any changes

• It will get worse

• It is difficult to specify

[Belarus] How do you think the material situation of your household will change next year?

• It will get better

• It will remain without any changes

• It will get worse


As with the previous questions, the question wording and response options were not

standardized across countries. Both Belarus and Ukraine report asking respondents to evaluate

potential change in their material situation over the next 12 months, but Belarus asks

respondents to consider how the material situation “will change” over the next year, whereas

Ukraine asks respondents to consider how things “could change.” Although the wording is

only slightly different, the choice of “will” or “could” may impact how a respondent evaluates

the future. Both Colombia and Hungary also ask respondents to consider how their financial situation will change over the next 12 months but provide different response options. Colombia uses a 5-point Likert scale, whereas Hungary uses only a 3-point Likert scale.22
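The recoding mentioned in footnote 22 can be sketched as follows. This is a minimal illustration in Python, not a procedure described by the Task Force: the names `FIVE_TO_THREE` and `collapse_to_three_points` are hypothetical, and the choice to merge the two outer categories on each side into Hungary's labels is an assumption made here for illustration.

```python
# Hypothetical recoding of Colombia's 5-point prediction scale onto the
# 3-point scale reported for Hungary. The labels are the response options
# quoted in the text; the mapping itself is an illustrative assumption.
FIVE_TO_THREE = {
    "much better": "it will get better",
    "better": "it will get better",
    "the same": "it will not change",
    "same": "it will not change",
    "worse": "it will get worse",
    "much worse": "it will get worse",
}


def collapse_to_three_points(responses):
    """Map each 5-point answer to its 3-point counterpart (case-insensitive)."""
    return [FIVE_TO_THREE[answer.strip().lower()] for answer in responses]


# Example with made-up answers
recoded = collapse_to_three_points(["Much better", "Same", "Worse"])
```

Collapsing categories discards information about the intensity of expected change, so the recoded series supports comparison only at the coarser 3-point level.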

Other types of qualitative categorical questions refer to money in particular. These are

presented in the next section.

B. Qualitative Categorical Questions Focused on Specific Income (or Consumption)

Evaluation

Qualitative categorical evaluation questions ask respondents to evaluate their income with

respect to some attribute, such as ability to make ends meet, satisfaction, or adequacy of

consumption. Responses to these types of questions are categorical and can be used to create

simple statistics to describe the subjective poverty status of a country’s population. Responses

to these evaluation questions can also be combined with money metric valuation questions,

questions that require the respondent to report a specific dollar value such as the minimum

income question, to create a subjective poverty threshold.23 See Section II. B in this chapter

for more information regarding the estimation of such thresholds.

Forty countries report asking at least one qualitative categorical question that was focused on

income in particular. The overwhelming popularity of this type of question is, in part, a result

of it being included in the EU-SILC. The exact wording of the question reported by the EU-SILC countries differs slightly across countries but follows the same general pattern of asking respondents to evaluate their ability to make ends meet with respect to their income. The EU-SILC survey offers response options following a 6-point Likert scale. See Box 7 for an example.24 This type

of question is also known within the literature as a Deleeck question.

Box 7. Examples of Qualitative Categorical Evaluation Questions Focused on Income,

EU-SILC Countries

[EU-SILC participating countries] A household may have different sources of income and

more than one household member may contribute to it. Thinking of your household’s total

22 Colombia’s response options are “much better,” “better,” “the same,” “worse,” and “much worse.” Hungary’s response options are “it will get better,” “it will not change,” and “it will get worse.” Colombia’s 5-point Likert scale could be converted to a 3-point Likert scale to make the responses comparable to Hungary’s.

23 This approach is also known as the Deleeck Method of measuring subjective poverty. See Flik and van Praag (1991) for more details about this method.

24 The example provided is the suggested wording of the monetary evaluation question provided by the 2021 EU-SILC Guidelines. Each country’s statistical office must translate it into their country’s official language, so the exact wording may vary from country to country.


income, is your household able to make ends meet, namely, to pay for its usual necessary

expenses?

• With great difficulty

• With difficulty

• With some difficulties

• Fairly easily

• Easily

• Very easily

Of the 12 non-EU countries that reported asking a qualitative categorical evaluation question focused on income, five (Armenia, Brazil, Russian Federation, Turkey, and Ukraine) report asking a question that is akin to the one asked in the EU-SILC.

Respondents are asked to evaluate their ability to make ends meet with respect to their income.

Additionally, the response options that were reported follow the 6-point Likert scale. Since

these countries and those participating in the EU-SILC asked similar income evaluation

questions with similar response options, it is possible for subjective measures of the ability to

make ends meet to be compared across these countries as well as the EU-SILC participating

countries.

A closely related qualitative categorical question asks respondents to evaluate their income,

but instead of asking respondents about their ability to make ends meet, respondents are asked

to describe their current income by selecting from a list of descriptions. Belarus, Colombia,

Mexico, New Zealand, Ukraine, and Uzbekistan report asking this type of question; however,

response options are substantially different, making cross-country comparison difficult. See

Box 8 for examples.

Box 8. Examples of Qualitative Categorical Evaluation Questions Focused on Making

Ends Meet, Descriptive Responses

[Belarus] How do you assess the total income of your household?

• Income is barely enough to buy food.

• Income is enough to buy food, but it is difficult to buy clothes and other necessary

goods and services.

• Income is enough to buy food, clothes and other necessary goods and services but

it is difficult to buy durables (TV, refrigerator, other).

• Income is enough to buy durables, but expensive goods (car, etc.) are difficult to

buy.

• Income is enough to buy everything we think we need.

[Colombia] […] …

• is not enough to cover minimum expenses.

• is enough to cover the minimum expenses.

• covers more than the minimum expenses.


The remaining questions classified as qualitative categorical evaluation focused on income or

a related resource measure are either unique to the country or only asked by one other country.

For example, Belarus reports asking respondents “how satisfied” they are with their money

income. Costa Rica provides respondents with a reference household and asks them to

evaluate whether the monthly income for the household is enough to live on. Both the

Netherlands and Slovakia ask respondents how their income has changed compared to the

previous year.

Prediction

Similar to the earlier qualitative question that did not refer to income specifically, the

qualitative income-focused version of “prediction” asks respondents to evaluate how their

income will change over a specific period in the future, or what it will be in some future period. Only two countries, Canada and the Netherlands, reported asking these types of questions. See Box 9 for

the specific question wording.

Box 9. Examples of Qualitative Income-focused Prediction Questions

[Canada] […]

increase, decrease, or stay the same?

• Increase

• Decrease

• Same

[Netherlands] Do you expect your income/total household income to increase, stay the same or

decrease over the next 12 months?

• Increase

• Stay the same

• Decrease

[Canada] Taking all of the various sources of retirement income into account for your

household (including government sources as well as personal and occupational pensions and

provisions), how adequate do you think your household income in retirement will be to

maintain your standard of living? […] …?

• More than adequate

• Adequate

• Barely adequate

• Inadequate

• Very inadequate

C. Money Metric Valuation Questions

Money metric valuation questions ask respondents to provide a specific value of income or

money they think is necessary for the specified situation. Twenty-two countries report asking a valuation question. Seventeen of these countries report asking respondents to provide the minimum


income they believe is needed to “make ends meet,”25 “meet the basic needs,”26 or “cover all

normally necessary expenses”27.28 Of the five remaining countries, three report asking similar

questions but set the reference for the minimum at different points. This type of question is

referred to in the literature as a Minimum Income Question (MIQ). See Box 10 for examples.

The Kyrgyz Republic and Ukraine set the minimum at avoiding poverty instead of making ends

meet.29 Republic of Moldova asks two questions; the first asks for the minimum income

needed to live day-to-day, and the second asks for the minimum income needed for a decent

life. Although some of the reference points are similar, such as “making ends meet,” “avoiding

poverty,” and “live from day-to-day,” it is not guaranteed that they will evoke the same image

for a respondent. Thus, responses to these questions should not be compared across countries

and cannot be used to create the same subjective poverty threshold.

Box 10. Examples of Money Metric Valuation Questions, Minimum Income Question

(MIQ)

[Brazil] Taking into account the current situation of your family, what would be the minimum

[…] “make ends meet”?

[Ukraine] […]

your household members is needed in order to not feel poor?

[Kyrgyz Republic] What is your opinion, how much money on average per month at today's

price are needed for the family with the same number of people as you have in order to avoid

poverty?

[Moldova] What monthly cash income would meet the minimum needs of one person in order

to 'live from day to day’?

[Belarus] In your opinion, what amount of money does your household need to have monthly

to meet[satisfy] the minimum needs of all its members?

The remaining two countries, Armenia and Hungary, do not ask respondents to report only the

minimum income needed to make ends meet or avoid poverty. Instead, they ask respondents to

report the income needed for a variety of living standards. This type of question is also

referred to in the literature as an Income Evaluation Question (IEQ). See Box 11 for the

specific question wording. Brazil and Turkey also report asking a multi-point valuation

25 Austria, Belgium, Brazil, Cyprus, Germany, Ireland, Italy, Lithuania, Luxembourg, Malta, Republic of North Macedonia, Russian Federation, and Spain use the phrase “make ends meet.”

26 Costa Rica uses the phrase “meet the basic needs,” and Belarus uses the phrase “meet the minimum needs.”

27 Switzerland and Turkey use the phrase “cover all normally necessary expenses.”

28 A few of the countries that report asking this type of question indicate that it is asked as part of the EU-SILC. However, not all the EU-SILC countries that responded to the survey reported a valuation question. Twelve of the 29 EU-SILC countries that participated in the survey reported asking a minimum income question. Hungary does not report asking a minimum income question but does report asking a valuation question. The remaining 16 countries did not report asking any type of valuation question. Because these countries did not report any valuation questions, we do not include them in the analysis, even though the EU-SILC was reported to include a minimum income question at the time of the survey.

29 The Kyrgyz Republic sets the minimum income at what is needed “to avoid poverty,” whereas Ukraine sets the minimum at what “is needed to order not to feel like the poor.”


question using a similar five- and three-point scale, respectively.

Box 11. Example of Money Metric Valuation Questions, Income Evaluation Question

(IEQ)

[Armenia] How much money does your family need monthly to make ends meet (survive)?

How much money does your family need monthly to live well? How much money does your

family need to live very well in a month?

[Hungary] What (net) amount of income do you think your household would need in a month

• a very low standard of living?

• a low standard of living?

• an average standard of living?

• a high standard of living?

• a very high standard of living?

II. ANALYSIS

The literature provides numerous examples of applications of estimation techniques in relation

to subjective welfare or subjective poverty. Some of these assess factors related to subjective

welfare and search for determinants that explain the variation in responses. Others are applied

to estimate subjective poverty lines that allow for the identification of subjectively poor

subpopulations and, hence, the subjective poverty rates. After a brief overview of relevant

determinants of subjective poverty in the literature, we introduce several estimation techniques

to derive subjective poverty lines with respect to different types of subjective poverty

questions.

A. Relationships

Empirical studies agree that income level is positively correlated with subjective welfare (e.g., Herrera et al., 2006) and, in turn, related to subjectively based poverty. When

analyzing responses to questions that ultimately are used to assess subjective poverty, these

relationships need to be acknowledged and accounted for in measurement.

A huge stream of literature focuses on the relationship between income and subjective welfare,

mostly defined in a broader sense, e.g., in terms of happiness and/or life satisfaction (e.g.,

Easterlin, 2001). The correlation was found to be stronger in developing countries than in

developed ones (Herrera et al., 2006). However, it was also realized that the correlation is not

perfect and that it is not only current own income that matters (Ravallion and Lokshin, 2002),

but also past incomes, income expectations and aspirations, and/or relative/comparison

incomes (Clark and Oswald, 1996).

The empirical literature broadly analyses factors of subjective poverty, where survey responses

have been regressed on individual and household characteristics. Besides income, other factors


such as household size, age and gender composition, education and employment status, and

regional dummies are commonly controlled for in model estimations. For an example of a

wide list of analyzed characteristics, Ravallion and Lokshin (2002) examined how the answers

to a nine-rung economic welfare question (with the rungs ranging from “poor” to “rich”)

varied with various variables grouped in three areas: (i) supplementary objective indicators of

personal or household circumstances (expenditure, assets and durables, education, health,

employment status, age and marital status), also utilizing the panel nature of the applied data

(past incomes); (ii) measures of relative income (variables measuring the individual’s relative

position within certain reference groups, e.g., position within the respondent’s household or

within the locality where they live); and (iii) attitudinal variables (e.g., expectations about

future welfare, perceived insecurity of employment, and whether the government cares about

people), which, however, may have raised concerns about endogeneity.

Some of the variables might affect subjective welfare through effects on expected future

income or perceived riskiness of individuals’ current incomes. Lower subjective welfare of

divorced or widowed individuals may stem from perceived lower economic security. Relative

income within one’s locality was found to account for almost all the variance attributable to

geographic effects; people in richer areas felt relatively worse off. Ravallion and Lokshin

(2002) concluded that “results clearly reject any notion that one only gets noise from the

answers to subjective questions. However, it is also unclear whether the systematic factors that

influence self-rated welfare will all be deemed relevant to the types of inter-personal welfare

comparisons that are required for making specific policy choices.” (p. 1471).

The type of regression modelling utilized will be based on how the dependent variable is

defined. When subjective welfare is represented by ordinal data from a welfare ladder

question, ordered probit regression models are typically applied. When continuous data is used

as the dependent variable, such as with the MIQ, standard OLS regression is commonly

applied. Researchers have mostly agreed that if regression models are used to estimate

subjective poverty lines, covariates, such as household size, should be included in order to get

unbiased estimates of other variables (Garner and de Vos, 1995).

B. Subjective Poverty Lines

In this section we present an overview of the two best-known approaches to estimate

subjective poverty lines based on money metric valuation questions: the Leyden Poverty Line

based on Income Evaluation Question (IEQ) and the Subjective Poverty Line based on

Minimum Income Question (MIQ). Though both the approaches were developed around the

1970s, the latter gained more interest in the literature because of the availability of the

questions in recent surveys. While the IEQ was rarely included, the MIQ was asked annually

in the EU-SILC up to 2020.30

30 The related variable is likely to be collected every six years, starting in 2026, in the EU-SILC six-yearly rolling module on “over-indebtedness, consumption and wealth”. This module will be legally adopted by the end of 2024.


Leyden Poverty Line based on Money Metric Evaluation Question

The construction of the Leyden Poverty Line (LPL) relies on estimating parameters of the

individual welfare function of income (income utility function), which is typically based on

the so-called IEQ. The IEQ (presented in Box 11) asks respondents to report what they

consider to be (very) bad/(in)sufficient/(very) good income, in their circumstances (van Praag,

1968, 1971). The amounts corresponding to these categories are used to form the individual

welfare function, and this function is further used as a basis for estimating the LPL (see Box

12). Within this framework, it is necessary to decide upon the value of a parameter α, the welfare (utility) level under which a household is considered poor. Ultimately, a household is considered poor if the welfare level associated with its total household income falls below α. Note that the parameter α is arbitrarily chosen.

Box 12. Leyden Poverty Line

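The individual poverty line $y_{\alpha i}$ is defined by solving (Flik and van Praag, 1991):

$$\Phi\left(\frac{\ln(y_{\alpha i}) - \mu_i}{\sigma_i}\right) = \alpha, \tag{1}$$

where α is the welfare (utility) level below which a household is considered poor, Φ(∙) denotes the cumulative distribution function of the standard normal distribution; $\mu_i$ and $\sigma_i$ are the mean and standard deviation estimated from responses to the IEQ.

Assuming that

$$\mu_i = \beta_0 + \beta_1 \ln(s_i) + \beta_2 \ln(y_i), \tag{2}$$

where $s_i$ is household size and $y_i$ is actual household income, we get:

$$\ln(y_{\alpha i}) = \beta_0 + \beta_1 \ln(s_i) + \beta_2 \ln(y_i) + \sigma_i \Phi^{-1}(\alpha). \tag{3}$$

Fixing σ at the population average $\bar{\sigma}$ and setting actual income equal to the poverty line ($y_i = y_\alpha$), the log of national LPL can be computed as:

$$\ln(y_\alpha) = \frac{\beta_0 + \beta_1 \ln(s) + \bar{\sigma}\,\Phi^{-1}(\alpha)}{1 - \beta_2}. \tag{4}$$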

A specific LPL can be found for each value of household size. In addition, further

household characteristics can be included in the equation.
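The computations in Box 12 can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the Task Force's implementation: the function names are hypothetical, μ and σ are taken here as the mean and sample standard deviation of the logged IEQ amounts (an assumption about how the IEQ responses are summarised), the β coefficients are assumed to come from a regression that is not shown, and all numbers are made up.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev


def ieq_parameters(amounts):
    """Return (mu_i, sigma_i) as the mean and sample standard deviation of
    the logged IEQ amounts (assumed summary of one household's answers)."""
    logs = [log(a) for a in amounts]
    return mean(logs), stdev(logs)


def individual_lpl(mu_i, sigma_i, alpha):
    """Solve Phi((ln y - mu_i) / sigma_i) = alpha for y (equation 1)."""
    return exp(mu_i + sigma_i * NormalDist().inv_cdf(alpha))


def national_lpl(beta0, beta1, beta2, sigma_bar, size, alpha):
    """Poverty line for a given household size (equation 4)."""
    z = NormalDist().inv_cdf(alpha)
    return exp((beta0 + beta1 * log(size) + sigma_bar * z) / (1.0 - beta2))


# Made-up IEQ answers of one household and made-up regression coefficients
mu, sigma = ieq_parameters([1000, 2000, 4000])
line_one_household = individual_lpl(mu, sigma, alpha=0.4)
line_two_person = national_lpl(4.0, 0.3, 0.5, 0.4, size=2, alpha=0.4)
```

Because the chosen α enters only through Φ⁻¹(α), raising α shifts the line upward for every household size, which makes the arbitrariness of α noted in the text directly visible.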

Intersection Method Based on the Minimum Income Question

Intermediate approaches developed in the 1990s aimed to identify cost and/or utility functions

based on subjective money metric valuation questions. The most well-known approach derives

the Subjective Poverty Line based on subjective valuations of MIQ (Box 10), first introduced

by Goedhart et al. (1977). It is model-based in the sense that individuals’ responses do not directly generate the poverty line (Kapteyn et al., 1988). There were attempts to classify as poor anyone whose actual income was lower than their reported subjective minimum; however, people at the same standard of living can provide different answers to the MIQ. This heterogeneity must be accounted for because it would otherwise lead to inconsistencies in the poverty measures (Pradhan and Ravallion, 2000; Ravallion, 2014).

It has been shown that there exists a positive relationship between the expected answer to MIQ

and actual income. More generally, the income effect on subjective welfare has been identified

as robust across countries, within countries, and over time in the literature (Stevenson and Wolfers, 2008; Clark et al., 2008). The intersection method conditions the existence of SPL on subjective minimum income being an increasing function of actual income, more concretely, a concave function as illustrated by Figure II.1. The intersection (Z*) of this function with the line representing the equality of minimum and actual incomes (i.e., the 45‐degree line in Figure II.1) determines the Subjective Poverty Line. The intersection point assumes that only respondents with actual

incomes equal to their subjective minimum incomes have a realistic idea of the minimum

income level. Richer respondents tend to overestimate their minimum necessary income while

poorer respondents tend to do the opposite.

Figure II.1 Subjective Poverty Line based on Minimum Income Question

Source: Illustrative picture.

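Notes: Z* is the estimated Subjective Poverty Line. The vertical axis shows subjective minimum income, the horizontal axis shows actual income, and the 45-degree line marks equality of the two.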

The seminal paper by Goedhart et al. (1977) estimated the subjective minimum income as a

function of actual income and household size only, but the authors suggested that “any

quantifiable factor that has a measurable effect” might have been incorporated (p. 518).

Subsequent studies extended the set of explanatory variables as differentiating factors for the

subjective poverty lines (e.g., García‐Carro and Sánchez‐Sellero, 2019; Mysíková et al., 2021,

2022; Želinský, 2022). These commonly included employment status, sex, age, education, and

degree of urbanization. Discussions on the inclusion of explanatory or control variables mostly

argue that even if a variable causing a significant effect is not accepted as a factor

differentiating the poverty line, it should be included in order to obtain unbiased estimates of

other variables (e.g., Garner and de Vos, 1995). Though effects caused by differences in


personality, tastes, lifestyles, or, for instance, incomes of reference groups (household or

community) or recent income changes may contribute to explain the variance in subjective

minimum income, they are unlikely to be considered relevant to policy choices (Ravallion

and Lokshin, 2002).

Depending on the authors’ judgements about the empirical, theoretical and/or political

relevance of the explanatory variables to the poverty lines, the methods to calculate subjective

poverty lines differ (Garner and Short, 2004). One way would be to calculate a single poverty

line holding the explanatory variables at their national averages (or, more frequently, a set of

lines differentiated by the variables defining subpopulations of interest, holding the values of

other control variables at their national averages), while the other would employ all (relevant)

explanatory variables to calculate household-specific lines. The latter approach is particularly

useful when the key aim is distinguishing populations below and above the lines, rather than a

definition of the line itself (Želinský et al., 2022). However, the approach is different from

simply calculating the number of households reporting actual household income that is less

than the household expected minimum income or setting the average reported MIQ as the

poverty line. See Box 13 for an example of the estimation of an SPL.

Box 13. Subjective Poverty Line and the intersection method

In practical applications, a standard OLS regression model is applied to estimate the

subjective minimum income as a function of actual income. Natural logarithms of both

subjective and actual incomes are used instead of original values. The estimated function is:

$$\ln(\hat{Y}) = \alpha + \beta \ln(X), \tag{1}$$

where Y is the subjective minimum income, X represents the actual household income, and α and β are the estimated coefficients. At the intersection point, where Y = X = Z*, rearranging the equation yields:

$$\ln(Z^*) = \frac{\alpha}{1-\beta}, \tag{2}$$

with necessary conditions α > 0 and 0 < β < 1.

A household i is identified as subjectively poor if the following inequality holds:

$$X_i < Z^*. \tag{3}$$

Employing control variables in Equation (1) we obtain:

$$\ln(\hat{Y}) = \alpha + \beta \ln(X) + \sum_{k=1}^{K} \gamma_k V_k, \tag{4}$$

where $V_k$, k = 1, …, K, are control variables and $\gamma_k$ are the corresponding estimated coefficients.

The definition of SPL extends to:

$$\ln(Z^*) = \frac{\alpha + \sum_{k=1}^{K} \gamma_k V_k}{1-\beta}. \tag{5}$$
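The steps in Box 13 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimation code used in any of the cited studies: the function names are hypothetical, the data are made up (generated exactly from ln(Y) = 3.5 + 0.5 ln(X)), survey weights and control variables are ignored, and the OLS fit is written out by hand for the single-regressor case.

```python
from math import exp, log


def fit_log_log(actual_incomes, minimum_incomes):
    """OLS fit of ln(Y) = alpha + beta * ln(X); returns (alpha, beta)."""
    xs = [log(x) for x in actual_incomes]
    ys = [log(y) for y in minimum_incomes]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def subjective_poverty_line(alpha, beta):
    """Intersection point Z* = exp(alpha / (1 - beta)) from equation (2)."""
    if not (alpha > 0 and 0 < beta < 1):
        raise ValueError("intersection requires alpha > 0 and 0 < beta < 1")
    return exp(alpha / (1.0 - beta))


def subjective_poverty_rate(actual_incomes, z_star):
    """Share of households with actual income below Z* (inequality 3)."""
    return sum(x < z_star for x in actual_incomes) / len(actual_incomes)


# Made-up data: actual incomes and the minimum incomes implied by the model
incomes = [500.0, 1000.0, 2000.0, 4000.0]
minimums = [exp(3.5 + 0.5 * log(x)) for x in incomes]

alpha, beta = fit_log_log(incomes, minimums)
z_star = subjective_poverty_line(alpha, beta)
rate = subjective_poverty_rate(incomes, z_star)
```

With control variables, the same intersection logic applies after adding the term Σ γ_k V_k to the numerator, evaluated either at national averages or at each household's own values, depending on which of the two approaches described above is followed.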

The intersection method can also be used to estimate SPL based on Minimum Spending

Question (MSQ) instead of MIQ. An example of an MSQ is provided in Box 14. Garner and

Short (2003, 2004) found the MSQ-based poverty lines to be lower than the MIQ-based

poverty lines, because the MSQ refers to a more narrowly defined set of needs than the MIQ

(See Box 14). Compared to the MIQ-based poverty lines, the MSQ-based poverty lines were

more like the absolute poverty lines applied in the U.S. (Garner and Short, 2003).

Box 14. Minimum Spending Question in SIPP in 1995


In your opinion, how much would you have to spend each year in order to provide the basic

necessities for your family? By basic necessities I mean barely adequate food, shelter,

clothing, and other essential items required for daily living.

SIPP – Survey of Income and Program Participation (Garner and Short, 2003)

In addition, subjective poverty lines have been compared to population-based mean and median incomes, and objective and relative poverty thresholds. For example, de Vos and Garner (1991) reported that for both the U.S. and the Netherlands, the SPLs lay in the range

of 60–75% of incomes in most household size groups. In addition, with respect to the

Netherlands, the subjective poverty line would have been higher than the objective and

relative income poverty line currently applied in the EU (i.e., with the poverty line set at 60%

of equivalised household income). With the same actual income compared to each

threshold, the subjective poverty rate would have been highest. In addition, Saunders et al.

(1994) found that the poverty rates resulting from the use of thresholds derived from

subjective measures were markedly higher than those based on relative income poverty

thresholds (i.e., with the poverty line defined as 50% of equivalised household

income) for Australia and Sweden around the 1980/1990s. García-Carro and Sánchez-Sellero

(2019), using the national EU-SILC data between 2008 and 2016, found the subjective poverty

rate to be about 40% for Spain, as compared to the official relative income poverty (at risk-of-

poverty rate, AROP) rate of roughly 20%.

As opposed to country case-studies, the recent study by Želinský et al. (2022) compared the

subjective poverty rates based on SPLs with the “at risk of poverty” (AROP) rates in

all EU member states over the period of 2004–2019. It showed a substantially greater variation

in subjective poverty rates than AROP rates across the EU countries: the subjective poverty

rate substantially exceeded the AROP rate in some Eastern and Southern European countries,

while it was lower in Scandinavian countries.

Quasi Leyden Poverty Line Based on the Deleeck Question

As the IEQ puts a burden on respondents, it is rarely integrated in statistical surveys. Piasecki

and Bieńkuńska (2018) propose an alternative way to estimate a subjective poverty line using

the intuition behind the LPL utilising the Deleeck-type of question (Box 7). In the first step,

the approach assigns a utility level to each response option of the six-category Deleeck question. In the second step, the parameters of a regression function are estimated that models the level of actual income at which the household would find itself at the poverty threshold. The value of the poverty threshold at an (arbitrarily) given utility level (α) depends on the size of the household and may also depend on additional characteristics of the household. See Box 15.

Box 15. Quasi-Leyden Poverty Line

The estimation procedure has several steps:

(1) Assigning a value of utility to the evaluation of actual income for each household using

the transformation


ui = (ji – 0.5)/m, (1)

where ji is the answer of household i to the Deleeck question and m is the number of categories

(m = 6 for the Deleeck question integrated in EU-SILC survey).

(2) Estimating parameters of an OLS regression function:

$\ln(y_{\alpha i}) = \gamma_0 + \gamma_1 \ln(s_i) + \gamma_2 \Phi^{-1}(u_i)$, (2)

where $y_{\alpha i}$ is the actual income of household i, $s_i$ is the size of household i, α is the utility level proxied by $u_i$, and $\Phi^{-1}(u_i)$ is the value of the inverse of the standard normal distribution function at $u_i$.

(3) The estimated regression coefficients then allow us to derive the subjective poverty

lines for different values of household size (si). In formula (2), we employ α, which is an

arbitrarily chosen parameter representing the level of utility from being at the poverty

threshold. Piasecki and Bieńkuńska (2018) report estimations based on different values of α

(0.25; 0.3; 0.33; 0.4; 0.5). Including further control variables also allows us to derive the

poverty thresholds for other subgroups of households.

Note that the estimated value of a subjective poverty line is also determined by the assumed utility level (α). The subjective poverty line estimated for a certain household size therefore depends on an arbitrarily chosen welfare level below which households are considered poor. Nevertheless, individual poverty lines can be estimated for each household, and aggregating poverty lines across households can help to address this concern.
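
The steps in Box 15 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the eight-household data set are hypothetical, and the final step assumes that the poverty line for household size s at utility level α is obtained by evaluating the fitted regression at Φ⁻¹(α), which is one reading of step (3).

```python
import math
from statistics import NormalDist

import numpy as np

STD_NORMAL = NormalDist()  # standard normal distribution, used for its inverse CDF

def quasi_leyden_fit(income, hh_size, answer, m=6):
    """Steps (1) and (2): utilities u_i = (j_i - 0.5) / m, then OLS of ln(income)."""
    u = (np.asarray(answer, dtype=float) - 0.5) / m
    probit_u = np.array([STD_NORMAL.inv_cdf(float(v)) for v in u])
    design = np.column_stack([
        np.ones(len(u)),                           # gamma_0
        np.log(np.asarray(hh_size, dtype=float)),  # gamma_1 * ln(s_i)
        probit_u,                                  # gamma_2 * Phi^{-1}(u_i)
    ])
    target = np.log(np.asarray(income, dtype=float))
    gamma, *_ = np.linalg.lstsq(design, target, rcond=None)
    return gamma

def quasi_leyden_line(gamma, hh_size, alpha=0.4):
    """Step (3), as assumed here: evaluate the fitted function at utility level alpha."""
    g0, g1, g2 = gamma
    return float(np.exp(g0 + g1 * math.log(hh_size) + g2 * STD_NORMAL.inv_cdf(alpha)))

# Hypothetical households, with incomes generated exactly from the model
sizes = [1, 2, 3, 4, 1, 2, 3, 4]
answers = [1, 2, 3, 4, 5, 6, 2, 5]  # Deleeck question, 6 categories
utilities = [(j - 0.5) / 6 for j in answers]
incomes = [
    math.exp(7.0 + 0.5 * math.log(s) + 0.8 * STD_NORMAL.inv_cdf(v))
    for s, v in zip(sizes, utilities)
]

gamma = quasi_leyden_fit(incomes, sizes, answers)
line_single = quasi_leyden_line(gamma, hh_size=1, alpha=0.5)
```

Because Φ⁻¹(0.5) = 0, the line for a one-person household at α = 0.5 is simply exp(γ0) in this toy example; choosing a lower α moves the line down, which is the arbitrariness the text notes.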

An Approach Based on Proportional Odds Logistic Regression

Utilizing ordered categorical data (such as the Deleeck question, Box 7) allows us to employ

proportional odds logistic regression, as recently suggested by Pittau and Zeli (2023).

Adopting the alternative specification of ordered probit/logit model, as discussed by the

authors, allows a direct interpretation of the estimated intercepts as thresholds on the scale of

income. The poverty line is constructed as described in Box 16.

Box 16.

As the original (ordered) responses correspond to the self-declared status (e.g., the ability to

make ends meet elicited on scale 1 – 6), the following parametrization of the model is

required:

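$$
y_i = \begin{cases}
1 & \text{if } z_i < c_{1.5} \\
2 & \text{if } z_i \in (c_{1.5}, c_{2.5}) \\
\vdots & \\
5 & \text{if } z_i \in (c_{4.5}, c_{5.5}) \\
6 & \text{if } z_i > c_{5.5}
\end{cases}
$$

where $z_i = x_i + \varepsilon_i$, $\varepsilon_i \sim \mathrm{N}(0, \sigma^2)$.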

Adopting this parametrization, intercepts c1.5, c2.5, …, c5.5 can be directly interpreted as

thresholds on the scale of income.


Considering the proportional odds model:

$\log \dfrac{\mathrm{Prob}(y \le k)}{\mathrm{Prob}(y > k)} = c_k + \beta x$,

where

$c_k$ are the intercepts, i.e. the cut-points that need to be estimated,

x is income,

β is the regression coefficient that needs to be estimated;

the estimated thresholds can be transformed in the scale of income using a simple re-

parametrization:

$\hat{c}_{1.5} = \dfrac{\hat{c}_{1|2}}{\hat{\beta}}$; $\hat{c}_{2.5} = \dfrac{\hat{c}_{2|3}}{\hat{\beta}}$; etc., where $\hat{c}_{1|2}, \ldots, \hat{c}_{5|6}$ are the estimates of the standard parametrization provided within a statistical software output.

For further details, refer to the study by Pittau and Zeli (2021).
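
The re-parametrization in Box 16 amounts to dividing each estimated cut-point by the estimated income coefficient. The sketch below shows only that arithmetic. The function name and the numbers are hypothetical and do not come from any fitted model, and it assumes the software reports cut-points and the coefficient under the sign convention for which the division holds; that convention should be checked against the documentation of the package used.

```python
def income_scale_thresholds(cutpoints, beta_hat):
    """Convert standard-parametrization cut-points to the income scale.

    cutpoints: estimates c_{1|2}, c_{2|3}, ..., c_{5|6} as reported by the software
    beta_hat:  estimated coefficient on income (must be non-zero)
    """
    if beta_hat == 0:
        raise ValueError("beta_hat must be non-zero")
    # c_{k.5} = c_{k|k+1} / beta_hat, as in Box 16
    return [c / beta_hat for c in cutpoints]

# Purely illustrative values, not estimates from any survey
example_cutpoints = [1.0, 2.0, 3.0, 4.0, 5.0]
example_beta = 0.002
thresholds = income_scale_thresholds(example_cutpoints, example_beta)
```

If income enters the model in logarithms rather than in levels, the resulting thresholds are on the log scale and would need to be exponentiated before being read as income amounts.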

An Approach Based on Dichotomized Data

An alternative way to estimate a monetary subjective poverty line from categorical variables was proposed by Želinský et al. (2020). This method was designed for a dichotomized variable. However, the question currently most frequently applied in the EU, the ability to make ends meet question (Box 7) integrated in the EU-SILC survey, is a 6-point scale variable. A way to proceed is first to dichotomize the question responses (e.g., households that report great difficulty making ends meet are deemed poor and all other households are deemed non-poor). Because this step is rather arbitrary, it is necessary to assess the robustness of the results by considering alternative dichotomizations.

Once the responses are converted to a binary variable, we can utilize an approach proposed by

Duclos and Araar (2006) allowing for the estimation of subjective poverty lines with discrete

information. This approach relies on a binary variable (or a dichotomized multi-categorical

variable) with 1 representing subjectively poor and 0 otherwise. The working assumption is

that respondents compare their actual income to an unknown subjective poverty line Z* which

is unobserved and must be estimated. As shown by Figure II.2, with the binary classification

of (non-)poor, some respondents can misclassify their own situation, i.e., individuals with high

income classify themselves as poor (“false poor”), while individuals with low income classify

themselves as non-poor (“false rich”). To estimate the subjective poverty line Z*, it is

necessary to minimize the numbers of “false poor” and “false rich”.

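Figure II.2 Estimating a subjective poverty line with a binary categorical variable

[Figure: binary response "Feel poor?" (0/1) plotted against income, with the threshold z* and the regions labelled 'false poor' and 'false rich']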

Source: Želinský et al. (2020, p. 2); based on Duclos and Araar (2006, p. 125).

Notes: Z* represents the subjective poverty line.

Following this intuition, Želinský et al. (2020) propose utilization of the Youden J index as an

option to estimate the unknown subjective poverty line. The Youden Index estimates the

poverty line by selecting the value of income at which the numbers of “false-poor” and “false-

rich” individuals are minimized. As illustrated in Figure II.2, the cut-off point Z* (subjective

poverty line) is defined as the income level that differentiates households which are

subjectively poor from those who are not. The poverty line can be operationalized as in Box

17.

One of the disadvantages of this approach is that it does not automatically allow for

considering control variables, and subjective poverty lines need to be estimated separately for

each subgroup of interest to account for household/individual characteristics.

Box 17. Subjective poverty line based on dichotomized data

Statistically, the Youden index, J, is a function of c which maximizes the sum of sensitivity

(Se) and specificity (Sp) classification measures:

$J(c) = \max_c \{Se(c) + Sp(c) - 1\}$. (1)

At a given c, Se(c) and Sp(c) denote the probabilities of correctly identifying subjectively

non-poor and poor households. Denoting X1, X2, . . . , Xm and Y1, Y2, . . . , Yn as the income

levels of the non-poor and poor household groups, respectively, the Youden index is

calculated as:

$J(c) = \max_c \left\{ \dfrac{\sum_{i=1}^{m} I(X_i \ge c)}{m} - \dfrac{\sum_{j=1}^{n} I(Y_j > c)}{n} \right\}$, (2)

where I(D) is an indicator function with I(D) = 1 if D is true, 0 otherwise. Subsequently,

the optimal value of c is the one which maximizes the value of J, or equivalently, the

number of correctly classified households. Statistically, the Youden J index is based on


maximising the sum of sensitivity and specificity classification measures. J = 1 represents a

perfect classification while J < 1 indicates otherwise.

The Youden (1950) index was initially introduced in medical literature to assess the ability

of a biomarker test to classify individuals as either diseased or non-diseased, based on

which side of a cut-off point, c, their biomarker values fell on along the distribution of

possible values. The Youden index can be adapted to the poverty context by defining the

cut-off point as the income level that differentiates households which are subjectively poor

from those which are not. Nevertheless, the classification exercise is not limited to the

adoption of the Youden index but can also be based on alternative metrics such as those

based on a Receiver Operating Characteristics (ROC) curve.
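
Formula (2) in Box 17 can be evaluated by a direct search over the observed income values. The sketch below is a minimal illustration under stated assumptions: the function name and the nine-household data set are hypothetical, candidate cut-offs are restricted to observed incomes, and ties in J are resolved in favour of the lowest cut-off.

```python
import numpy as np

def youden_poverty_line(income, feels_poor):
    """Return (c, J): the cut-off maximizing formula (2) and the index value there."""
    income = np.asarray(income, dtype=float)
    poor = np.asarray(feels_poor, dtype=bool)
    x = income[~poor]  # incomes of the subjectively non-poor (X_1, ..., X_m)
    y = income[poor]   # incomes of the subjectively poor (Y_1, ..., Y_n)
    if x.size == 0 or y.size == 0:
        raise ValueError("both poor and non-poor households are required")
    best_c, best_j = None, -np.inf
    for c in np.unique(income):  # sorted candidate cut-offs
        # share of non-poor at or above c, minus share of poor above c
        j = np.mean(x >= c) - np.mean(y > c)
        if j > best_j:           # strict inequality keeps the lowest cut-off on ties
            best_c, best_j = float(c), float(j)
    return best_c, best_j

# Hypothetical households: income and dichotomized answer (1 = subjectively poor)
incomes = [400, 600, 800, 1100, 900, 1300, 1500, 1800, 2200]
feels_poor = [1, 1, 1, 1, 0, 0, 0, 0, 0]

z_star, j_value = youden_poverty_line(incomes, feels_poor)
```

In this toy data set one non-poor household has an income of 900, below some of the poor households, so no cut-off classifies everyone correctly and J stays below 1. To account for household characteristics, the function would need to be run separately for each subgroup, as noted above.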

C. Country/International organization examples

From the in-depth review of current country practices organized by the Bureau of the

Conference of European Statisticians, only two countries reported using responses to monetary

subjective poverty questions to produce such thresholds. The Italian National Institute of

Statistics reported using the Subjective Poverty Line (SPL) method. The Brazilian Institute of Geography and Statistics reported periodically using the SPL method as well as exploring the possibility of using the Leyden Poverty Line (LPL) and the Center for Social Policy Poverty Line (CSP) methods.

Chapter 4. STATCAN contribution

Methods of data collection and guidelines

This section focuses on data collection methods for subjective poverty research, offering an

overview of various approaches and guidelines, including their characteristics, benefits, and

limitations. It underscores the importance of survey frame quality and sample selection in

method selection, providing organizations with a comprehensive toolkit to choose the most

suitable approach. Additionally, it hints at a forthcoming systematic review of questions

conducted across the UNECE region by 15 countries in subjective poverty research, aiming to

provide a comprehensive resource for organizations seeking to gather relevant data for their

specific needs and priorities.

The initial step in gathering and validating subjective poverty data involves understanding the

range of collection methods in use. This section provides a description and comparison of

common approaches, focusing on major methods and offering specific use cases. These

approaches span from complex sampling surveys to simpler web panel data collected through

crowdsourcing, summarized in Table 1. While this table does not serve as an exhaustive study

comparing these methods, it offers an overview based on Statistics Canada's experience,

considering factors such as data quality, sample control, duration, and cost. Notably, there is a trade-off: cheaper and quicker surveys tend to have higher error rates and limited generalizability to population estimates, which limits the ability to study subpopulations, whereas more expensive traditional surveys are designed to produce higher quality data. Therefore, aligning data collection methods with specific research needs is a critical initial step, and Table 1 serves as a helpful starting point for organizations engaged in subjective poverty research.

In essence, this section outlines the importance of understanding various subjective poverty

data collection methods and introduces a practical reference tool, as seen in Table 1, which

organizations can use to make informed decisions based on their resource constraints and

research objectives.

Table 1 – Data collection methods

Data collection type | Description | Control over sample | Approximate duration (planning to execution) | Cost | Country use cases

Traditional survey | ‘Specialized need’ | Very high control | 1+ year | Most expensive | EU-SILC

Opinion poll survey | ‘Specialized need’ | Some control | 1+ year | Medium expense | United States – Gallup Poll

Omnibus survey | ‘General Social Data’ | High control | 9 months | Medium expense | Canadian Social Survey

Rapid response | ‘Quick and stand-alone’ | Some control | 7-8 months | Medium expense | Bureau of Labor Statistics

Web panel31 | ‘Rapid indicator’ | Low control | 4 months | Low expense | Statistics Canada

Crowdsourcing | ‘Pulse check’ | Voluntary (low sample control) | Shortest (4-month turnaround) | Low expense | Statistics Canada

Administrative data | Used to improve sampling and calibration of surveys | Often mandatory (tax data) | n/a | Varies | Statistics Denmark (for EU-SILC)

Source: Statistics Canada, 2022

Survey Frame and Sample considerations

Prior to elaborating further on each of these survey designs it is worth mentioning two

overarching considerations common to all approaches. One of them is the necessity of a high-quality survey frame, and the second is sample selection. More detailed descriptions of survey frames can be found elsewhere, as this chapter assumes a certain degree of prior knowledge of surveys by its audience. However, a very broad review is helpful here for understanding the descriptions that follow. There are two types of frames used at Statistics Canada: a list frame and an area frame. Qualities of a good frame include:

• Relevance: the extent to which the survey frame corresponds and permits access to the target population.

• Accuracy: includes evaluation of coverage errors to minimize and assess coverage and classification errors of the statistical units in the frame.

• Timeliness: how up to date the frame is with respect to the survey reference period and current affairs.

• Cost: the total cost to develop the frame in comparison to the total cost of a survey.

(Statistics Canada, 2010).

31 Program and proceedings (statcan.gc.ca)

The second consideration is sample selection when choosing a data collection method. Sample

selection poses the following questions: (1) Is the survey mandatory or voluntary? (2) Is it a

probability or non-probability sampling? (3) How large is the sample size? Like the previous

consideration, better references exist for more systematic review of survey design and sample

considerations32. The following section is written in an accessible way such that, with the

descriptions above, a more complex understanding of survey frames and samples is not

needed. The details of each should be considered as secondary to the broad overview of

approaches described below.

The use of online surveys is increasing. Online surveys have gained popularity due to their cost-effectiveness, quick distribution, and ability to use multimedia elements. However, online surveys often differ in their sampling principles. Many do not use probability sampling, which is what allows for unbiased estimates and accuracy calculations; instead, they rely on self-selection of respondents (Bethlehem, 2008). This departure from probability sampling can lead to biased results and prevents the application of probability theory, so self-selection surveys are not a viable solution. However, web surveys conducted within the framework of probability sampling hold potential, either as standalone surveys or as part of mixed-mode approaches. In these cases, web surveys can contribute to addressing the dilemma of limited budgets and increased information demands.

Traditional Survey

The first approach is the traditional survey, whose strength resides in standardization, generalizability33, and versatility. It is a method of gathering information from a set of people with the purpose of generalizing the results to a larger population. Surveys are used to understand the choices, preferences, and experiences of respondents. They are longer and more detailed than polls and can be conducted in person, over the phone, or online. Compared to non-survey-based data collection techniques such as focus groups, traditional surveys are a more cost-effective way to capture data on a population, but they are the most expensive data collection technique reviewed here. Strict control over the survey sample facilitates probability sampling and improves generalizability to the target populations.

32 References for developing samples include: Survey Methods and Practices (statcan.gc.ca)

1. American Association for Public Opinion Research (AAPOR): Survey Practice

2. The U.S. Census Bureau Our Surveys & Programs (census.gov)

3. The World Bank's Data Quality Assessment Framework (DQAF): Data Quality Assessment Framework (DQAF) for the International Comparison Program (ICP) : paper for session five (worldbank.org)

33 Generalizability is a measure of how representative a sample is of the target population, also known as external validity.

The European Statistics on Income and Living Conditions (EU-SILC) is an example of a traditional survey. It collects timely, cross-sectional, and longitudinal microdata from multiple European countries on income, social inclusion and living conditions, covering objective and subjective aspects in monetary and non-monetary terms for households and individuals. Anchored in the European Statistical System (ESS), this survey was launched in 2003, replacing the European Community Household Panel (ECHP), which expired in 2001. The data it collects are comparable between the member countries on: (a) income, (b) poverty, (c) social exclusion, (d) housing, (e) labour, (f) education, (g) health. They are used to monitor the Europe 2030 targets of the European Pillar of Social Rights Action Plan34, particularly its poverty reduction targets.

The reference population includes all private households and their residents who were in the

country at the time of data collection. All household members are considered, but only those

aged 16 or older are interviewed. Persons living in collective households or institutions are

excluded from the target population.

Case Study 1: National Survey of Self-reported Well-being (ENBIARE) 2021 of Mexico

The National Survey of Self-reported Well-being (ENBIARE)35 2021 in Mexico aims to

capture people's subjective well-being perceptions. This survey used two

questionnaires, one for housing and households and another to collect data from adults aged

18 and older, covering various dimensions of well-being, life events, and financial difficulties,

including perceptions of income sufficiency and future financial outlook. It employs a

probabilistic, stratified, three-stage sampling method, resulting in a national sample of 37,000

housing units. ENBIARE uses a Master Sample provided by Mexico's National Statistical

Office, INEGI, to select diverse clusters for data collection. The data are available five months

after collection, and the survey is expected to be conducted biennially. Data collected from

June 3rd to July 23rd, 2021, revealed that 64% of respondents faced difficulties paying

household expenses in the past year, and 43% anticipated insufficient income for the following

month. The survey provides valuable insights at both national and state levels into well-being

and financial challenges among Mexico’s population.

ENBIARE asks about the minimum income sufficient to pay for monthly household needs. Once the minimum sufficient income has been declared, the respondent is asked whether they consider that their household will be able to reach that minimum sufficient income. This question is put to one adult, 18 years or older, selected from each household whose members share common expenses and reside in the housing units assigned to the survey. The selection of the appropriate informant begins with the identification of the usual members of the household who are within the established age range of 18 years of age or older, based on the information collected in the Household Questionnaire. Additionally, the informant must meet the criteria of knowing how to read, write, and speak Spanish.

34 EU 2030 target on social protection aims that “out of 15 million people to lift out of poverty or social exclusion by 2030, at least 5 million should be children.” The European Pillar of Social Rights Action Plan (europa.eu)

35 National Survey of Self-reported Well-being (ENBIARE) 2021 (inegi.org.mx)

Minimum income perception question:

MINIMUM INCOME PERCEPTION OF MINIMUM INCOME

In your opinion, how much income would be enough to meet all your household needs for a month?

$|___| ,|___|___|___| , |___|___|___| PREFERS NOT TO RESPOND 9 999 999

Do you consider that you or your home will reach this income level next month?

Yes.........................................1

No .........................................2

Doesn't know …....................9

In ENBIARE the definition of minimum income refers to the amount of income from various

sources, defined by the person, sufficient to meet all their household needs in a month.

Results:

The population that considered they would not get the minimum income necessary to meet

household needs next month was 43.4%, 11.3% did not know, and 45.4% declared they would

get it.

Figure 1. Share of households by perception of getting the minimum income level, 2021

Source: INEGI. National Survey of Self-reported Well-being (ENBIARE) 2021, Database.

Encuesta Nacional de Bienestar Autorreportado (ENBIARE) 2021 (inegi.org.mx)

[Figure 1 values: Will reach 45.4%; Won't reach 43.4%; Doesn't know 11.3%]

Regarding conceptual and statistical design, the ENBIARE target population is adults aged 18

years or over who are literate and Spanish-speaking. Observation units are the housing units selected for the sample, the households, the population residing in those households, and the selected people aged 18 years and over who can read, write, and speak Spanish. ENBIARE provides estimates with a geographical breakdown at the national and state levels. The indicator of subjective poverty in ENBIARE refers to the household where the adult population resides. The household income necessary to make ends meet is based on the respondent's personal perception of their household's minimum needs.

On the other hand, Mexico has an official, objective measurement of multidimensional

poverty. This means that, in addition to considering the insufficiency of economic resources, it

considers several additional dimensions on which social policy should focus. Under the

General Law of Social Development, the guidelines and criteria to define, identify, and

measure poverty are issued by the National Council for the Evaluation of Social Development

Policy (CONEVAL, by its Spanish acronym). CONEVAL must use the information generated

by INEGI through the National Survey of Household Income and Expenditure (ENIGH) to

estimate poverty.

The following graph compares the subjective poverty indicator (43.4%) with the population in

poverty, those with income below the poverty line, and those below the extreme poverty line

by income. The subjective indicator reports a similar level to the objective indicator that

captures the population below the income poverty line (43.5 percent).

Figure 2. Subjective poverty indicator and objective poverty indicators, 2021 and 2022

Source: INEGI. National Survey of Self-reported Well-being (ENBIARE) 2021, Database.

Encuesta Nacional de Bienestar Autorreportado (ENBIARE) 2021 (inegi.org.mx) National Council for the Evaluation of Social Development Policy (CONEVAL, by its Spanish acronym)

https://www.coneval.org.mx/Medicion/MP/Paginas/AE_pobreza_2022.aspx

Note: ENBIARE data refers to the year 2021. CONEVAL data refers to 2022.

[Figure 2 values: Won't be able to reach the next month's income (ENBIARE) 43.4%; Population in poverty (CONEVAL) 36.3%; Population with income below the income poverty line (CONEVAL) 43.5%; Population with income below the extreme poverty line by income (CONEVAL) 12.1%]

Figure 1. Percentage won't be able to reach the next month's income, 2021

Source: INEGI. National Survey of Self-reported Well-being (ENBIARE) 2021, Database.

Encuesta Nacional de Bienestar Autorreportado (ENBIARE) 2021 (inegi.org.mx)

Omnibus Survey

An omnibus survey collects data on a wide variety of subjects in the same interview while

sharing the common demographic data collected from each respondent. Omnibus surveys provide a

convenient and efficient way to collect data from a consistent group of respondents. They

allow researchers to leverage the same sample over time, thereby improving the accuracy of

their results, optimizing survey procedures, and potentially reducing costs associated with

recruiting new samples for each individual survey. This approach is particularly valuable when

there is a need for quick and frequent insights across different subjects within a population.

Case Study 2 below elaborates on an omnibus survey methodology.

[Figure (ENBIARE 2021): percentage who will not be able to reach the next month's income, by Mexican state; values range from 29.5 to 58.7 across the 32 states]

Case Study 2: The Quality of Life framework for Canada

Canada's Quality of Life Framework, introduced in the 2021 budget alongside the report

"Measuring What Matters," aims to move beyond GDP and incorporate social, economic, and

environmental factors into Canada's assessment of quality of life. This framework

acknowledges the multifaceted nature of well-being and incorporates both subjective and

objective measures, some of which can be adapted to assess subjective poverty. It aligns with

global trends seen in frameworks from countries like New Zealand, Scotland, Iceland, and the

UK36, which blend subjective and objective indicators in response to recommendations from

the 2009 Commission on the Measurement of Economic Performance and Social Progress.

The Canadian Quality of Life Framework consists of 84 indicators organized into five

domains: prosperity, health, environment, good governance, and society. Statistics Canada

gathers data for many of these indicators through surveys and administrative sources, with 58

of them presently defined on the Quality of Life hub. Some indicators relevant to subjective

poverty include job satisfaction, financial well-being, self-rated health, and trust. Data

collection primarily relies on the Canadian Social Survey (CSS), a versatile survey that

examines various social issues every three months and pools the data over a year to track

changes in living conditions and well-being, showcasing Statistics Canada's approach to

studying subjective well-being.

Opinion Poll Survey

Opinion polls serve as a rapid means to gather public sentiment on specific topics and can be

conducted through online, paper, in-person, or phone surveys. A poll is a method of collecting

data by asking a single question with a limited number of answer options. Polls are generally

used to make quick decisions and are conducted at various stages. These polls are particularly

useful for gauging majority opinions and can be applied to assess perceived poverty levels or

evaluate the validity of official poverty thresholds. With an adequate sample size and

randomization, opinion polls offer reliable insights across various demographic groups and are

generally cost-effective compared to traditional surveys. An illustrative example is a 1989

Gallup poll in the United States that revealed public opinion placed the Official Poverty

Measure thresholds 19% higher than calculated using conventional objective methods. In

Canada, government departments often collaborate with external organizations to conduct

public opinion research, utilizing their expertise in questionnaire design and occasionally

involving subject matter experts, such as psychologists or sociologists, to refine questionnaire

wording and content.

Rapid response

Rapid response surveys are ad-hoc surveys that provide snapshots of a population on specific issues and can obtain information directly on the most pressing data needs. While they share many features with typical surveys, when timeliness is of great importance certain parameters are loosened, such as randomization of the sample. This allows the survey to be developed and fielded faster than a typical survey.

36 Our Living Standards Framework | The Treasury New Zealand, Quality of life in the UK - Office for National Statistics (ons.gov.uk), National Performance Framework | Our Place, Iceland – Wellbeing Framework : Wellbeing Economy Alliance (weall.org)

The benefit of this is that it can provide a pulse on a particular subject. These have been used

widely during the pandemic, when the rapidly changing economic and political environment

due to the ongoing health crisis necessitated more timely information for decision makers than

had previously been built into official data collection strategies. The drawback to this speed is that such surveys are often less representative of the target population and their data are considered to be of lower quality.

Case Study 3: The U.S. Census Bureau Household Pulse Survey Financial Well-being Question

In response to the COVID-19 pandemic, the U.S. Census Bureau launched the Household Pulse Survey (HPS)37 in collaboration with multiple federal agencies. This survey aimed to provide more timely and efficient data than traditional surveys. The HPS operates in two-week survey periods, with a one-week gap between them, and data are released about a week after each survey period ends38. Since the beginning of the HPS in 2020, federal agencies have contributed

critical questionnaire items to inform their missions and understand the pandemic's impact on

individuals, families, and households. The questions are periodically reviewed and updated to

address evolving economic conditions and agency-specific needs.

The HPS sampling frame combines the Census Bureau's Master Address File with email addresses and mobile phone numbers. Participants receive email or text invitations to complete the online questionnaire, and follow-up reminders are sent if there is no response. Each survey period involves approximately one million households, resulting in about 80,000 respondents despite low response rates of around 8%. Weight adjustments ensure that responses are representative of the U.S. population. The HPS collects a wide range of data, including both objective and subjective well-being dimensions. Objective questions cover household income, employment experiences, healthcare access, educational disruptions, and vaccination status. Subjective questions focus on perceptions of food and housing security, physical and mental health, and general financial well-being. Garner, Safir, and Schild (2020)39 40 analyzed responses to the financial difficulty questions in relation to income using data collected from August 19 to 31, 2020. The data show that financial difficulty is correlated with income, with 59.1% of those earning less than $25,000 reporting some financial difficulty compared to 7.5% among those earning $200,000 or more. Depending on how poverty is defined, the share of the population identified as poor ranges from one-third (those experiencing some difficulty) to 8.3% (those facing both difficulty and lower income).

37 Additional details about the Household Pulse Survey and the public use data can be found at the following link: https://www.census.gov/programs-surveys/household-pulse-survey.html

38 This schedule is how the survey is currently being conducted but is not how it has always been conducted. Additional information about how the survey was conducted during earlier cycles can be found in the technical documentation available on the Census Bureau’s Household Pulse Survey webpage. See footnote 37 for the link.

39 https://www.bls.gov/opub/mlr/2020/article/changes-in-consumer-behaviors-and-financial-well-being-during-the-coronavirus-pandemic.htm

Web-panels

Web panel surveys have become a fast and cost-efficient method in market research, driven by widespread internet use and by the rising nonresponse rates and costs of traditional surveys. Per Bethlehem (2008), web panels are just another mode of data collection: questions are not asked face-to-face or by telephone, but over the Internet. The difference is that the principles of probability sampling are not applied. When samples are selected at random, probability theory can be applied, making it possible to compute unbiased estimates and to quantify their accuracy. Web surveys often rely instead on self-selection of respondents, which can seriously affect the quality of the survey. There are also risks of coverage and measurement errors. The absence of an inferential framework and of data quality indicators is an obstacle to using the web panel approach for high-quality statistics about general populations.

Crowdsourced surveys

Crowdsourcing involves collecting information by accessing a large community of online

users on a given topic. Statistics Canada has conducted several crowdsourced surveys by

means of a mobile application and engagement. This method lessens the burden for

respondents and allows for quick responses on a variety of subjects. Case Study 4 below provides

more information on Statistics Canada’s use of crowdsourced surveys to collect subjective

poverty data.

Crowdsourcing is less costly than traditional surveys, quicker than other survey types, and can

be a tool to improve how information is collected by filling data gaps. Its strengths, however,

come with risks of population bias due to the lack of sampling control.

Case Study 4: Using crowdsourced data

Two Statistics Canada papers discussed the methodological issues that arise from integrating crowdsourced data into existing data sources. The goal is to use existing data sources to improve accuracy and remove bias in the crowdsourced data. The two approaches were those of Poirier (2021) and of Ding and Chatrchi (2021). Both papers explored the Canadian Perspectives Survey Series (CPSS), an initiative that began during the pandemic to improve data timeliness. It collected data on just over 32,000 Canadians every month.

The approach of Poirier (2021) combined the larger sample of the CPSS crowdsourced survey with an online web-panel survey a quarter of its size. Only provincial estimates could be provided due to the smaller sample size. The web-panel survey used a probability sample of randomly selected respondents aged 15 years and older from the Labour Force Survey (LFS). Using the probability sample, sample weights from the LFS were applied to a portion of the CPSS respondents, thus reducing bias in the crowdsourced data, with the caveat that the bias reduction depended on the variable of interest.

The approach of Ding and Chatrchi (2021) used a basic area-level model to evaluate the effectiveness of a crowdsourced survey in reducing the variance of web-panel estimates. It adopted a methodology similar to that of the LFS. The small area estimate is based on two quantities: the direct estimate from the survey data and a prediction from a model, also known as a synthetic estimate. The results from the first round of modeling were successful for the domains of province, age group, and sex. For the other domains of interest, such as the Census Metropolitan Area (CMA)41, the results were unsatisfactory. The area-level model may have improved the precision of estimates, yet achieving a suitable model remains a challenge.
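
The way a small area estimate combines a direct and a synthetic estimate can be sketched as follows. This is a minimal illustration of the composite form commonly associated with basic area-level (Fay-Herriot type) models, not the specification used in the Statistics Canada papers. The function name, the variance inputs, and the example figures are assumptions made purely for illustration.

```python
def composite_estimate(direct, synthetic, sampling_var, model_var):
    """Combine a direct survey estimate with a model-based synthetic estimate.

    The weight on the direct estimate, gamma, grows as the sampling variance
    of the direct estimate shrinks relative to the model variance.
    Returns the composite estimate and gamma.
    """
    if sampling_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = sampling_var + model_var
    if total == 0:
        raise ValueError("at least one variance must be positive")
    gamma = model_var / total  # weight given to the direct estimate
    return gamma * direct + (1 - gamma) * synthetic, gamma


# Hypothetical domain: a noisy direct estimate of 30% and a synthetic
# (model-predicted) estimate of 20%. All figures are invented.
estimate, gamma = composite_estimate(
    direct=0.30, synthetic=0.20, sampling_var=0.0009, model_var=0.0003
)
print(f"weight on direct estimate: {gamma:.2f}, composite: {estimate:.3f}")
```

In domains with few respondents the sampling variance is large, so the composite leans on the synthetic estimate. This is why the quality of the underlying model matters most for small domains such as CMAs.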

Administrative and registry data

Administrative and registry data are valuable for enhancing survey data and reducing response

burden, although they are not typically used directly to measure subjective poverty. These data

sources, including demographics, income, wealth, labor market participation, and education,

can improve data quality through methods like weight calibration after sampling. For instance,

a census dataset linked to administrative data like income or education allows statisticians to

oversample low-income households, enhancing the accuracy of subjective poverty surveys.

In countries with low response rates and biases in voluntary household surveys, calibrating

survey weights based on factors such as income and demographics can help mitigate these

biases, provided there is a strong correlation between these factors and the measure of

subjective poverty under investigation. However, one limitation of administrative data is its

timeliness, as income data may not align with survey collection periods, necessitating the use

of preceding years' data or preliminary income information.
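
The calibration idea described above can be sketched with post-stratification, one of the simplest forms of weight calibration. The sketch assumes that population counts by income group are known from administrative data. The group labels, counts, and responses are hypothetical toy values, and real calibration typically uses several auxiliary variables at once.

```python
from collections import Counter


def poststratify(sample_groups, population_counts):
    """Return one weight per respondent: population count / respondent count of the group."""
    sample_counts = Counter(sample_groups)
    empty = sorted(set(population_counts) - set(sample_counts))
    if empty:
        raise ValueError(f"no respondents in groups: {empty}")
    return [population_counts[g] / sample_counts[g] for g in sample_groups]


def weighted_share(flags, weights):
    """Weighted share of respondents for whom the flag is True."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


# Hypothetical toy data: low-income households respond less often than their
# population share (2 of 8 respondents, but 400 of 1,000 households).
groups = ["low", "low"] + ["high"] * 6
feels_poor = [True, True, True, False, False, False, False, False]
population = {"low": 400, "high": 600}  # assumed register totals

weights = poststratify(groups, population)
unweighted = sum(feels_poor) / len(feels_poor)
calibrated = weighted_share(feels_poor, weights)
print(f"unweighted: {unweighted:.3f}, post-stratified: {calibrated:.3f}")
```

In this toy example the estimated share rises once under-represented low-income respondents are weighted up. The correction helps only to the extent that the grouping variable is correlated with the subjective measure, as noted above.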

Case Study 5: Use of administrative data for sampling and calibration of EU-SILC at Statistics Denmark

In Denmark, the EU-SILC42 survey serves as the primary source for data on subjective poverty. Participation is voluntary, and the response rate was 52% in 2022; because low-income households participate less frequently, unadjusted responses are biased. To address this bias, Statistics Denmark employs administrative registers extensively for both sampling and post-calibration of survey weights.

Using anonymized personal identifiers from the Danish Central Person Register (CPR), Statistics

Denmark links surveys and administrative data, obtaining comprehensive information on both

respondents and non-respondents. The Danish census is continually updated, providing an up-

to-date sampling frame for EU-SILC. To ensure adequate coverage of less populated regions,

41 A Census Metropolitan Area (CMA) is formed by one or more adjacent municipalities centered on a population center (known as the core). A CMA must have a total population of at least 100,000, based on data from the current Census of Population Program, of which 50,000 or more must live in the core based on adjusted data from the previous Census of Population Program. Source: Dictionary, Census of Population, 2021 – Census metropolitan area (CMA) and census agglomeration (CA) (statcan.gc.ca)

42 Documentation of statistics: Survey on Living Conditions (SILC) - Statistics Denmark (dst.dk)

the EU-SILC sample is stratified regionally (NUTS-2) and incorporates preliminary income

data to oversample households likely to have incomes below 60% of the median.

Following data collection, the survey undergoes calibration using administrative data on age

groups, household size, income groups, and socio-economic status for the entire population,

ensuring more accurate and representative results. This comprehensive approach leveraging

administrative data helps mitigate bias and improve the quality of subjective poverty data in

Denmark's EU-SILC survey.

Note:

1. Eurostat is the statistical office of the European Union. Who we are - Eurostat

(europa.eu).

2. Nomenclature of Territorial Units for Statistics (NUTS-2).

Sources of error: concerns with response and representativeness

This section delves into sources of error and precision requirements related to EU-SILC

(European Union Statistics on Income and Living Conditions), emphasizing the importance of

studying error sources and standardizing quality measures across EU countries. In 2021, new

legislation brought changes to EU-SILC data collection43, including precision requirements at

national and regional levels for poverty and social exclusion indicators. The legislation,

Regulation (EU) 2019/1700 44, establishes standards for geographical coverage, sample

characteristics, data gathering periods, and data processing, striving to align with the EU's

regulations.

The section identifies six measures of error: standard errors, coverage errors, measurement and

processing errors, non-response errors (both unit and item), sampling error, and

representativeness error. Standard errors gauge data reliability and were considered during

EU-SILC's design to ensure an absolute precision of about one percentage point for the at-risk-of-poverty

rate. Coverage errors relate to imperfections in the sampling frame and are influenced by the

use of population registers or census databases, necessitating frequent updates. Measurement

and processing errors can arise from questionnaire design and data collection complexity,

impacting data accuracy.
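
The link between a precision target and sample size can be sketched with the textbook standard error of a proportion. This is a simplified illustration, not the EU-SILC precision formula. It assumes simple random sampling inflated by a design effect (`deff`), and it reads the precision target as the half-width of a 95% confidence interval. Both are assumptions, as are the function names and the example rate.

```python
import math


def se_proportion(p, n, deff=1.0):
    """Standard error of an estimated proportion p from n respondents."""
    return math.sqrt(deff * p * (1 - p) / n)


def required_sample_size(p, half_width, deff=1.0, z=1.96):
    """Smallest n for which z * SE does not exceed the target half-width."""
    return math.ceil(deff * z**2 * p * (1 - p) / half_width**2)


# Hypothetical example: an at-risk-of-poverty rate of 15% and a target
# half-width of one percentage point, with no design effect.
n = required_sample_size(p=0.15, half_width=0.01)
print(n, round(1.96 * se_proportion(0.15, n), 4))
```

A design effect above 1, which is typical for clustered or unequally weighted household samples, increases the required sample size proportionally.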

Non-response errors, including unit and item non-response, are inevitable and can introduce bias, particularly when systematic patterns emerge, such as a particular question being skipped by a significant number of respondents. Corrective measures, such as post-stratification or logistic regression models, are employed to address non-response. Sampling error is a further concern, and measures of subjective phenomena are also susceptible to non-sampling error, such as changes in respondent mood or external factors affecting perceptions. Representativeness error arises particularly in crowdsourced surveys, where the lack of control over who responds can produce population bias and, potentially, biased outcomes.

43 Legislation - Income and living conditions - Eurostat (europa.eu)

44 Regulation (EU) 2019/1700 establishing a common framework for European statistics relating to persons and households (IESS regulation). https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.LI.2019.261.01.0001.01.ENG

Validity and relationship to other measures of poverty and economic well-being

This section offers guidance on validity and reliability, beginning by examining the advantages

and disadvantages of subjective measures in comparison to alternative measures. It also

complements the decision regarding data collection methods and question design by

summarizing typical errors related to responsiveness and representativeness, regardless of the

chosen approach.

Quality reports and validating data

The national quality reports for EU-SILC45 are specified in Regulation (EU) 2019/1700 on European statistics relating to persons and households and are further regulated by Regulations (EU) 2019/2180 46 and 2019/2242 47. They underline the importance of validating collected data against other reliable sources. Countries are required to submit quality reports to Eurostat, following these regulations, to help ensure data accuracy. These reports cover various aspects, including sample design, data collection procedures, measurement errors, and data comparability.

Regarding subjective well-being (SWB) assessment, the EU-SILC reports reveal an alignment

between respondents’ hypothetical scenarios and their anticipated SWB rankings. Factors

influencing this alignment include a sense of purpose, perceived control over life, family

happiness, and social status. The research draws upon data from diverse sources, with an 83%

alignment rate between SWB and choices. However, systematic differences in some instances

warrant investigation.

Figure 1. Criterion and Construct Validity

45 See Quality - Income and living conditions - Eurostat (europa.eu)

46 EUR-Lex - 32019R2180 - EN - EUR-Lex (europa.eu)

47 EUR-Lex - 32019R2242 - EN - EUR-Lex (europa.eu)

Source: Bureau of Labour Statistics (?)

Each measure identifies about 20% of the population as poor; 33% of the population is identified as poor by at

least one indicator, and only 5.7% by all three.

Furthermore, the report underscores the challenges in assessing various dimensions of poverty

and social exclusion. It highlights the lack of overlap among measures such as deprivation,

subjective poverty, and income poverty. The study explores overlapping poverties and

different permutations, concluding that multiple measures are essential for reliable results.

Various factors contribute to the lack of overlap, including transition, differing perceptions of

poverty, and technical considerations like housing costs and income distribution within

households. Ultimately, using multiple measures can lead to more accurate and nuanced

insights into poverty.
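
The overlap analysis discussed above can be sketched as a simple tabulation of three binary indicators per person. The records below are hypothetical toy data chosen for readability, not the figures reported in the text, and the function and key names are assumptions for illustration.

```python
def overlap_summary(records):
    """Summarize overlap among three binary poverty indicators.

    Each record is a tuple (income_poor, deprived, subjectively_poor).
    Returns the share identified by each measure, by at least one, and by all three.
    """
    records = list(records)
    n = len(records)
    if n == 0:
        raise ValueError("no records")
    return {
        "income_poor": sum(r[0] for r in records) / n,
        "deprived": sum(r[1] for r in records) / n,
        "subjectively_poor": sum(r[2] for r in records) / n,
        "at_least_one": sum(any(r) for r in records) / n,
        "all_three": sum(all(r) for r in records) / n,
    }


# Hypothetical sample of ten people (unweighted, for simplicity).
sample = [
    (True, True, True),
    (True, False, False),
    (False, True, False),
    (False, False, True),
    (True, True, False),
] + [(False, False, False)] * 5

for name, share in overlap_summary(sample).items():
    print(f"{name}: {share:.0%}")
```

Even in this small example the share identified by at least one measure is well above the share identified by any single measure, while the share identified by all three is far smaller. This is the pattern that motivates using several measures together.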

Advantages of subjective poverty measures

Subjective poverty measures offer several advantages, including their multidimensionality, as

respondents can consider various factors such as income, costs, living conditions, and societal

norms in their assessments. Unlike one-dimensional income-based measures, subjective

approaches reflect what individuals consider necessary to avoid poverty and meet their

family's needs, considering socio-psychological factors that influence well-being.

Disadvantages of subjective poverty measures

Subjective poverty measures, despite their value in reflecting people's perceptions of their

circumstances, come with certain drawbacks. They rely on individual opinions to identify

deprivations, which can vary significantly based on location, culture, aspirations, age, and

other factors, making it challenging to define adequate needs universally.

Subjective measures of welfare, while valuable, come with several challenges. One major

concern is the potential for response errors, variations in interpreting survey questions, mood

fluctuations, and differences in personality and tastes among respondents (Ravallion and

Lokshin, 2002, p. 1471). People may have diverse ideas about what it means to be "poor" or

"rich," leading them to interpret subjective welfare questions differently (Ravallion, 2014, p.

182–183). This subjectivity can lead to frame of reference bias, where individuals in

vulnerable positions may adapt their preferences to their circumstances, resulting in an

underestimation of their actual hardship (Graham, 2010). Conversely, those with objectively

comfortable lives may express dissatisfaction, causing lower subjective welfare ratings than

those who are objectively worse off (Ravallion, 2014, p. 160).

Another challenge is the variability in responses over time, with studies showing fluctuations

in reported subjective well-being for the same individuals when interviewed at different times

(Ravallion, 2014, p. 153). Additionally, the framing and context of questions can impact

responses, whether through interviewer-administered surveys or self-administered ones (Conti

and Pudney, 2011, p. 1093). These challenges emphasize the complexity and subjectivity

inherent in measuring welfare and well-being, making it crucial to consider multiple factors

and sources when assessing individuals' economic and overall well-being.

Differences in personal opinion

Subjective indicators pose challenges when the cutoffs are set relative to the sampled

population. This can complicate the interpretation of poverty trends because changes in

poverty may result from changes in either the indicator thresholds or the relative threshold's

adjustment. For example, if the subjective poverty threshold is recalibrated with each new

dataset according to the sampled population, it can impact the axiomatic properties of

measures, potentially rendering some axioms inapplicable (Alkire and Foster 2011).

Most multidimensional measures typically set indicator thresholds based on consistent

international or national standards, adjusting them transparently every decade or so. These

standards often incorporate expert opinions, participatory exercises, international regulations,

and development targets. Having fixed and given indicator thresholds simplifies policy

interpretation and allows policymakers to track progress and allocate resources effectively

based on observed disparities in poverty levels (Alkire, Kanagaratnam and Suppa 2018).

However, changes in the population's frame of reference and aspirations could lead to shifts in

subjective poverty thresholds, making it challenging to interpret objective improvements

alongside measured decreases in subjective poverty.

Time frame for data collection and release

Subjective poverty is influenced by various factors and can be either a lasting or temporary

condition. Alfandari (2020) states that when measuring temporary subjective poverty,

determining the appropriate time frame is crucial. A one-year time frame is recommended

because it is less susceptible to temporary fluctuations caused by short-term circumstances.

This period provides a robust assessment of subjective poverty.

Moreover, subjective poverty indicators should not be considered in isolation but should be

compared to indicators from different domains. Using a one-year time frame for data

collection allows for insights into both the present scope and nature of the phenomenon and

estimates of assistance required. Lifetime experience data, collected over the years, provides

an overall picture of the total number of individuals affected by subjective poverty, offering a

comprehensive perspective. This approach is consistent with measuring other complex social

phenomena like violence against women.

Cross-sectional versus longitudinal data collection

In marketing research, editors, reviewers, and authors have raised increasing concern about the validity of cross-sectional surveys. These validity concerns center on reducing common method variance bias and enhancing causal inferences. Longitudinal data collection is commonly offered as a solution to these problems. A study by Rindfleisch et al. (2008) examined the role of longitudinal surveys in addressing these concerns, comparing the validity of cross-sectional and longitudinal surveys using two data sets and a Monte Carlo simulation. Under certain conditions, cross-sectional data exhibit validity comparable to that of longitudinal data. Longitudinal surveys offer advantages in reducing these two validity threats and are appropriate when the temporal nature of the phenomena is clear, when it is unlikely that intervening events could confound a follow-up study, or when alternative explanations are likely. A cross-sectional approach may be adequate for studies that examine concrete and externally oriented constructs, sample highly educated respondents, employ diverse measurement scales, and are strongly rooted in theory (Rindfleisch et al. 2008).

Marketing researchers have recommended using longitudinal analysis and multilevel modeling to minimize random measurement error and common method bias by measuring the study variables at multiple time points. A study by Shashanka et al. (2021) adopted multilevel structural equation modeling (ML-SEM) to analyze longitudinal data on the factors influencing shoppers' impulse purchase behavior (IPB). Structural equation modeling (SEM) was conducted to examine changes in the causal effects at each time point of data collection. The results of ML-SEM indicate significant fluctuations in the factors influencing IPB over time. Results from the SEM indicated that a few factors (such as store ambience and salesperson interactions) had a significant influence on IPB initially, during shoppers' first store visits, but lost significance over time. The findings suggest that the store crowd, the influence of secondary customers, and in-store promotions show a significant influence on IPB. The results of both longitudinal and cross-sectional modeling of the research model at five time points therefore indicated that the model's validity does not remain significant over time. The study enhances the statistical validity of the research model by analyzing its fluctuations over a period of time (Shashanka et al., 2021).

OECD subjective well-being guidelines

The OECD Guidelines for Micro Statistics on Household Wealth publication introduces a set

of internationally agreed guidelines for producing micro-level statistics on household wealth,

addressing a crucial gap in existing global guidance for measuring different aspects of

individuals' economic well-being. These guidelines aim to resolve common conceptual,

definitional, and practical challenges that nations encounter when generating such statistics

and to enhance the comparability of country-specific data. They are essential for integrating

micro-level data on household wealth with information on other dimensions of economic well-

being, such as income and consumption. Understanding the composition and distribution of

household wealth at the micro-level is valuable for policymakers as it provides insights into

various aspects, including debt distribution, homeownership drivers, liquidity constraints, and

the impact of economic shocks on wealth and indebtedness.

To meet the increasing demand for micro-level wealth statistics and integrated economic well-being

data, the OECD Committee on Statistics established an Expert Group in 2010 48. This

group was tasked with developing guidelines for collecting and presenting household wealth

statistics, resulting in a comprehensive report (2013). These guidelines complement the

Framework for Statistics on the Distribution of Household Income, Consumption, and Wealth.

While macro-level statistics are already well-established, focusing on economy-wide

performance and institutional sectors, micro-level wealth statistics delve into the ownership

and distribution of wealth among individual households, necessitating some conceptual and

practical distinctions. These guidelines help address these differences and provide

recommendations for conducting wealth surveys and addressing challenges in measuring asset

and liability components. They emphasize the importance of a life-cycle perspective when

analyzing wealth data, as wealth accumulation and usage vary across different life stages. The

report also underscores the need for periodic reviews and refinement of these guidelines to

stay aligned with evolving measurement methodologies and analytical requirements,

encouraging countries to test and adapt them according to their specific contexts.

Income, consumption, and wealth are three distinct dimensions of economic well-being, and

this framework describes their central concepts, relationships, and additional elements that

together form a self-contained system for assessing household economic well-being. The

OECD framework49 recognizes that higher levels of income and wealth can contribute to

higher economic well-being by enabling greater consumption and saving for future

consumption. It also considers capital transfers, in-kind income, and expenditure payments as

key elements in understanding household economic resources and transactions. While

households are the primary unit for analysis, the report recommends reporting both household-

and person-weighted statistics to provide a comprehensive view of economic well-being,

considering factors like economies of scale in larger households. It suggests a one-year

reference period for implementing the framework and discusses practical data collection

methods, including the use of surveys, administrative sources, and statistical matching.

Additionally, the report highlights tools for presenting and analyzing information on

household economic well-being and suggests ongoing testing and refinement of the

framework to adapt to evolving practices and emerging research needs.

Hypothetical assessment of subjective poverty

The following section focuses on hypothetical questions to assess subjective poverty.

Researchers often employ hypothetical questions to ask respondents to consider the basic

needs of a reference or hypothetical family, such as what would be required for a family of

two adults and two children to make ends meet or not be considered poor. This approach

allows researchers to maintain control over the survey context and reduces concerns about

48 OECD Guidelines for Micro Statistics on Household Wealth | OECD iLibrary (oecd-ilibrary.org)

49 OECD Framework for Statistics on the Distribution of Household Income, Consumption and Wealth | OECD iLibrary (oecd-ilibrary.org)

respondents' current situations.

What is the role of question wording?

The role of question wording and survey design in subjective questions is critical, impacting

the data collected. Research suggests that respondents often prefer precise, straightforward

language and questions categorized by components (e.g., shelter, transportation, food)

(Morissette and Poulin, 1991). While considering respondents' preferences can reduce

response burden, it remains uncertain whether this enhances data accuracy due to the lack of

consistent measures of external validity for subjective questions.

Notable studies, such as Andrews and Withey's (1976) quality-of-life surveys, have explored

effective scales like delighted/terrible (D/T) for measuring income-related feelings. Kapteyn et

al. (1979) focused on income evaluation questions (IEQ) and D/T scales for assessing an

individual's welfare function of income (WFI), with a preference for annual income reporting.

Antonides et al. (1968) examined ten alternative methods for measuring welfare functions,

emphasizing the need for further research. Garner's work (1991) compared data between the

United States and the Netherlands, highlighting variations in responses attributed to question

wording, survey design, and data collection instruments. These studies underscore the

significance of question formulation and survey design in subjective data collection but also

highlight the complexities in achieving consistency across responses.

Statistics Canada

A study conducted at Statistics Canada by Morissette and Poulin (1991) found, using an Income Satisfaction Survey (IS), that question wording had a significant impact on the average minimum income reported by respondents. Using more restrictive language reduced the average minimum income by between 12% and 32%, based on the 1987 and 1988 survey questions. The 1987 IS was split into two sample groups, each being asked a variation of the minimum income question, with the notable difference of using ‘consider necessary’ in one and ‘absolutely necessary’ in the other. The more restrictive language found in Figure 2 Version 2 led to a 12% decrease in the amount of income reported.

Figure 2 – More restrictive language lowers reported minimum income

Version 1 (1987): To meet the expenses you consider necessary, what do you think is the minimum income a family like yours needs, on a yearly basis, to make ends meet (if you are not living with relatives, what are the minimum income needs of an individual like you)?

Version 2 (1987): What do you think is the smallest yearly income a family the size of yours would need to meet absolutely necessary expenses (if you are not living with relatives, what is the smallest yearly income an individual like you would need?).

Source: Morissette and Poulin (1991)

As in the 1987 IS survey, the 1988 IS survey had two subsamples. It found an even larger

impact due to question wording. Compared with using ‘consider necessary’ language and an

additional qualifier of ‘before tax’ income, the more restrictive language referring to ‘basic

needs’ in Figure 3 Version 2 reduced respondents’ minimum income by 32%.

Figure 3 – ‘Before tax’ in the question has a large impact on income reported

Version 1 (1988): To meet the expenses you consider necessary, what do you think is the minimum income, before tax, a family like yours needs, on a yearly basis, to make ends meet (if you are not living with relatives, what are the minimum needs, before tax, of an individual like you)?

Version 2 (1988): In your opinion, how much do you have to spend each year in order to provide the basic needs for your family? By basic needs I mean barely adequate food, shelter, clothing and other essential items required for daily living.

Source: Morissette and Poulin (1991)

It is important to note that these surveys also contained unchanged questions, which made it possible to verify that the distributions of average minimum incomes were relatively stable over time. The data obtained from the original unchanged questions for 1983, 1986, and 1987 confirmed this (Morissette, 1991). This underscores the importance of consistent question wording over time.

In other examples, such as the General Social Survey (GSS), extensive cognitive testing of new concepts of criminal victimization was carried out to better understand how sensitive survey topics such as family violence required greater security. Once cognitive tests had been established as necessary for studying sensitive topics, researchers also began to use them to evaluate subjective poverty questions.

Cognitive tests: Bureau of Labor Statistics

Stinson (1997 and 1998) ran a series of cognitive tests to evaluate the effectiveness of various subjective poverty questions and alternative approaches to asking questions. The questions tested in 1996 included the Minimum Income Question (MIQ), Minimum Satisfaction Question (MSQ), Income Evaluation Question (IEQ), and Delighted/Terrible (D/T) 7-point scales ranging from a deep frown to a broad smile. The 1997 cognitive test looked at alternative measures to test respondents’ feelings about the questions by using images such as faces, feeling thermometers, D/T, circles, economic attitudes, income balance, and positive and negative line scales50. Both tests revealed important lessons for subjective poverty questions, as summarized below for Waves 1 and 2.

Wave 1 findings showed that questions about feelings towards income and expenses were

informative but complex and burdensome, with hidden internal questions increasing

respondent burden. Language framing and response categories were also ambiguous,

suggesting the need for clearer language to enhance response precision.

In Wave 2, cognitive testing introduced new question wording and formats. Respondents

preferred a segmented MIQ question, breaking it down into food, shelter, clothing, utilities,

and work expenses, making it simpler and easier to understand. About 67% of respondents

favored a shorter IEQ version. These findings emphasized the importance of question format

for the consistency of responses and revealed some inconsistencies between feelings expressed and

objective assessments. Overall, respondents preferred simple, traditional survey question

wording.

Framing and mode effects

Research has emphasized the significance of frame and mode effects in survey design and

delivery, particularly when examining subjective phenomena. Frame effects, influenced by the

survey's content or theme, have been observed to impact responses to subjective indicators. A

study comparing the General Social Survey (GSS) and the Canadian Community Health Survey (CCHS) revealed that the GSS's changing theme led to variations in life satisfaction

responses, mainly due to framing effects (Waverock et al., 2023). These effects were

responsible for substantial year-over-year fluctuations in average self-reported life satisfaction.

Mode effects, on the other hand, are influenced by the method of data collection, such as

interviews, online surveys, or paper questionnaires. These effects have been found to create

differences in self-reported life satisfaction, particularly across various socio-demographic

backgrounds. Furthermore, the design and content of welcome screens in online surveys play a

critical role in influencing response rates. Factors like the stated survey duration and the

emphasis on explaining privacy rights on the welcome screen significantly impact participants'

decisions to engage in web surveys.

Both effects can influence a respondent, but the potential impact is greater for subjective questions. Individuals’ responses can be ‘primed’ by preceding questions. Mode effects can also give rise to social desirability bias (Atkeson, Adams and Alvarez 2014; Tourangeau and Yan 2007): respondents may answer differently if they believe they will be viewed negatively by the interviewer, resulting in differences depending on the method of data collection.

50 Faces: When used by Andrews and Withey, the faces formed a seven (7)-point scale ranging from a deep frown to a broad smile. In Stinson (1998), the test restricted the scale to five (5) faces. The “Feeling Thermometer” is a graphic device printed on a card that looks like a thermometer. It is, in fact, a nine (9)-point scale ranging from 0 degrees (very cold or unfavorable feeling) up to 100 degrees (very warm or favorable feeling). The Delighted/Terrible (D/T) Scale is a 7-point scale with a “mixed” category as the midpoint. In a previous test of this question, we found subjects generally unwilling to endorse extreme categories as an expression of their feelings about their income. The Circles Scale is a series of seven circles that have each been divided into six segments. At the lowest end of the range, the six segments have all been labeled with minus signs; at the highest end of the range, there are plus signs placed within each segment. Economic Attitudes: Of all the question formats that were tested, this series of five short-answer questions (dubbed “economic attitude” questions) was the only section universally approved and applauded by all respondents. The Income Balance was a single short-answer question asking respondents to compare the amounts of their income and expenses. The Line was a simple flat line with one end point labeled with a “+” and the other end point labeled with a “-”. In between the poles were three equally spaced vertical marks. Respondents were instructed to place their feelings about their total family income at the appropriate place along the line.

Measurement errors in surveys like EU-SILC can stem from various sources, including the

questionnaire, interview process, respondent, and data collection methods. To ensure data

accuracy, it's crucial to construct questionnaires that facilitate accurate and efficient responses.

This involves drawing insights from pilot surveys and past EU-SILC waves to identify and

address potential issues. Pre-testing questionnaires helps anticipate problems and enhance the

data collection process.

Subjective poverty and the evolution of measures

Subjective poverty is a concept rooted in individuals' personal perceptions and assessments of

their economic well-being, influenced by factors like income, personality, and societal

perspectives. Unlike objective measures, which rely on externally set thresholds, subjective

measures assess poverty based on personal evaluations and can encompass both monetary and

non-monetary aspects. Monetary measures often center on respondents' perceptions of the

income required for financial security, while non-monetary measures assess aspects like the

ability to make ends meet or afford specific items.

Subjective poverty can also be viewed through the lens of scarcity theory, which sees poverty

as the gap between one's needs and available resources. Subjective income expectations play a

significant role in this context, shaping how individuals perceive their welfare levels and make

decisions regarding consumption and savings. While subjective and objective poverty

assessments are related, they are often treated separately, with comprehensive measures

considering both. This recommendation comes from the Stiglitz et al. report (2009) and has

manifested in initiatives like the OECD Better Life Index (2023), which encompasses

objective and subjective measures.

This section explores various perspectives on developing subjective poverty measures,

including consensual methods51 that define minimum needs or standards through responses

about hypothetical situations and methods based on respondents' assessments of their own

family or situation, which are more commonly used and theoretically grounded. These

approaches aim to provide a holistic understanding of subjective poverty, offering valuable

insights for policy development beyond income considerations.

51 Van den Bosch, 2001, p. xvi.

Case Study 5: Subjective assessments versus objective measures of poverty – discussion of the definitions of selected poverty measures based on the Polish edition of the EU-SILC survey

Anna Bieńkuńska, Tomasz Piasecki

Measuring poverty is essential for social policy planning and evaluation, but it is a complex concept with multiple definitions and measurement approaches, including objective and subjective ones. Subjective assessments complement objective measures, offering a different perspective on poverty and enabling a more comprehensive diagnosis of the phenomenon.

These assessments can also be used to verify and discuss the definitions of objective measures. An

analysis based on 2019 micro-data from the Polish edition of the European Survey on Income

and Living Conditions (EU-SILC) examines the relationship between objective poverty

assessments and respondents' subjective evaluations of their material situation. It compares

various objective poverty measures and demonstrates how subjective assessments can verify

and interpret objective measures, including the discussion of poverty thresholds.

The EU-SILC survey does not directly measure subjective poverty but provides variables for

indirect methods of measurement. This analysis focuses on indirect methods and uses a

question about the ability to make ends meet to calculate an indicator of subjective economic

stress, serving as an indirect measure of subjective poverty. The indicator represents the

percentage of people in households struggling to make ends meet. Additionally, the study

considers both commonly used poverty measures like the 'at-risk-of-poverty rate' (AROP) and

the 'severe material and social deprivation rate' (SMSD)52 for international comparisons and

more specific indicators related to income poverty and deprivation.
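The indicator of subjective economic stress described above can be illustrated with a minimal sketch. The answer coding (1 to 6), the choice of which answers count as "struggling", the `Household` record and the sample values are all assumptions made for illustration; they are not taken from the study.

```python
from dataclasses import dataclass

# Assumed answer coding (hypothetical): 1 = with great difficulty ... 6 = very easily.
# Which answers count as "struggling" is also an assumption made for this sketch.
STRUGGLING_ANSWERS = {1, 2}


@dataclass
class Household:
    members: int      # number of people living in the household
    weight: float     # household survey weight
    ends_meet: int    # answer to the "ability to make ends meet" question


def economic_stress_rate(households):
    """Weighted percentage of people living in households that report struggling."""
    total_people = sum(h.members * h.weight for h in households)
    stressed_people = sum(
        h.members * h.weight for h in households if h.ends_meet in STRUGGLING_ANSWERS
    )
    return 100.0 * stressed_people / total_people


# Made-up sample: 7 of the 10 people live in households that report struggling.
sample = [
    Household(members=4, weight=1.0, ends_meet=2),
    Household(members=1, weight=1.0, ends_meet=5),
    Household(members=3, weight=1.0, ends_meet=1),
    Household(members=2, weight=1.0, ends_meet=4),
]
rate = economic_stress_rate(sample)
print(f"Subjective economic stress: {rate:.1f}%")
```

The indicator is counted over people rather than households, which is why each household's answer is multiplied by its size.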

52 See Glossary: Severe material and social deprivation rate (SMSD) - Statistics Explained (europa.eu)

Figure 3. ‘False poverty’ rate by poverty threshold (restrictiveness of the poverty definition) – theoretical model
[Figure not reproduced. Horizontal axis: poverty threshold, from extreme poverty (left) to moderate poverty (right).]

Figure 4. ‘Undetected poverty’ rate by poverty threshold (restrictiveness of the poverty definition) – theoretical model

Figures 3 and 4 illustrate the expected relationship between the restrictiveness of the poverty

threshold and various poverty indicators. A more restrictive threshold indicates extreme

poverty, suggesting that those considered poor under such conditions should have worse living

conditions on average, making it less likely for people with positive assessments of their

material situation to be classified as poor ('false poverty'). Conversely, less restrictive poverty

thresholds may lead to more frequent cases of 'false poverty' among those experiencing less

acute poverty. Additionally, cases where individuals with a negative assessment of their

situation are not considered poor ('undetected poverty') are more likely with restrictive

thresholds. As the threshold becomes less restrictive, the incidence of 'undetected poverty'

should decrease. Any decrease in threshold restrictiveness accompanied by changes in the opposite direction (a falling 'false poverty' rate or a rising 'undetected poverty' rate) would raise doubts about the relationship between the chosen poverty measure and economic hardship, potentially calling into question the validity of the measure itself.

[Figure 4 not reproduced. Horizontal axis: poverty threshold, from extreme poverty (left) to moderate poverty (right).]

Figure 5. ‘False poverty’, ‘undetected poverty’ and overall misclassification – shares in the whole population (theoretical model)

The relationship between 'false poverty' and 'undetected poverty' and the restrictiveness of the poverty threshold should follow the same pattern when both are expressed as shares of the total population; added together, they give the overall misclassification. This overall misclassification reaches a minimum at a certain threshold

value. This suggests that there exists an optimal threshold value for the objective poverty

measurement method analyzed, where the classification of people into poor and non-poor

aligns most closely with subjective assessments. This approach allows for the evaluation of

poverty threshold values in terms of optimality and facilitates comparisons between various

poverty measurement methods that use threshold values as parameters set at different levels.
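The search for an 'optimal' threshold can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function names, the income values, the subjective assessments and the candidate thresholds are all hypothetical, and shares are computed over the whole population as in Figure 5.

```python
def misclassification(incomes, feels_poor, threshold):
    """Return (false poverty, undetected poverty, overall) as population shares.

    'False poverty': below the threshold but with a positive self-assessment.
    'Undetected poverty': at or above the threshold but with a negative one.
    """
    n = len(incomes)
    false_poor = sum(1 for y, s in zip(incomes, feels_poor) if y < threshold and not s)
    undetected = sum(1 for y, s in zip(incomes, feels_poor) if y >= threshold and s)
    return false_poor / n, undetected / n, (false_poor + undetected) / n


def best_threshold(incomes, feels_poor, candidates):
    """Candidate threshold with the lowest overall misclassification."""
    return min(candidates, key=lambda t: misclassification(incomes, feels_poor, t)[2])


# Hypothetical data: income per person and whether that person reports
# difficulty making ends meet (True = negative assessment).
incomes = [400, 600, 800, 1000, 1200, 1500, 2000, 2500]
feels_poor = [True, True, True, False, False, True, False, False]
candidates = [500, 700, 900, 1100, 1300, 1600]

for t in candidates:
    fp, up, total = misclassification(incomes, feels_poor, t)
    print(f"threshold={t}: false={fp:.3f} undetected={up:.3f} overall={total:.3f}")

optimal = best_threshold(incomes, feels_poor, candidates)
print("optimal threshold:", optimal)
```

In this made-up sample, 'undetected poverty' falls and 'false poverty' rises as the threshold becomes less restrictive, and their sum is lowest at a single candidate value.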

This in-depth analysis delves into the relationship between various objective poverty measures

and individuals' subjective assessments of their economic well-being. It aims to understand the

extent to which these different measurements align and examines the impact of poverty

thresholds on these alignments.

One key finding of the study is that the severe material and social deprivation indicator

(SMSD) exhibits the highest consistency with subjective assessments among the objective

poverty measures considered. In this regard, individuals classified as experiencing deprivation

according to SMSD criteria tend to report greater economic stress and difficulties making ends

meet. This suggests that SMSD effectively captures non-monetary aspects of poverty,

providing a more comprehensive view of individuals' material conditions.

Conversely, the study highlights some anomalies when considering extremely low-income

thresholds to define poverty. Surprisingly, among those classified as extremely poor based on

income criteria, a significant proportion still reports making ends meet easily or fairly easily.

This raises questions about the accuracy of identifying extreme poverty solely through

income-based measures, indicating that additional factors may influence individuals'

perceptions of their material situation.

[Figure 5 not reproduced. Horizontal axis: poverty threshold, from extreme poverty (left) to moderate poverty (right). Curves: ‘undetected poverty’, ‘false poverty’ and overall misclassification, with the ‘optimal’ threshold marked.]

The analysis emphasizes the complexity of poverty as a multifaceted phenomenon and

underscores the importance of using a combination of both objective and subjective measures

to comprehensively assess it. It argues that subjective assessments should complement

objective measures, as they offer unique insights into individuals' experiences of poverty.

However, the study also highlights the need for clear communication about the strengths and

limitations of each measure to avoid misinterpretation and ensure that policymakers and the

public have a nuanced understanding of poverty.

What is the role of defining minimums in assessing one’s subjective poverty position?

A decent lifestyle, in socio-economic terms, is defined by the quality, quantity, and price of the goods and services required for a decent life, which should be sufficient to meet one's physiological, psychological and social needs and enable full participation in society. It comprises the goods and services needed in everyday life so that people can ‘get by’ and their lives run smoothly while they feel part of the surrounding society. A decent minimum describes a consumption

level that is necessary for all members of society in order to live a decent life but excludes

commodities that are not necessary. A decent lifestyle necessary for preventing poverty is

often defined in relation to the average consumption level without paying attention to the fact

that the present average consumption in western welfare states is ecologically unsustainable

(Lettenmeier et al 2014).

One approach to defining minimums is the basic needs approach, under which poverty means having less than an objectively defined minimum. This method defines the absolute minimum in terms of “basic needs,” such as food, clothing, and housing. It requires the assessment of a minimum amount necessary to meet each of these needs. These amounts are added up to arrive at a poverty line in terms of income. In the Netherlands, budget experts from the Social Services Administration in Leeuwarden have calculated a poverty line based on this approach. The poverty line, while somewhat arbitrary, is differentiated according to household composition (Hagenaars, A., & de Vos 1988).

A simpler approach is the subjective minimum income definition, which is based on a survey question used to observe the income level that people consider to be “just sufficient” for their household. Comparing this amount with the actual household income places the household in the poor or non-poor category: if actual income is less than the amount considered “just sufficient,” the household is considered poor. This subjective poverty definition is based on the assumption that the expressions “sufficient” and “insufficient” are associated with the same welfare levels by everybody (Hagenaars, A., & de Vos 1988).
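The classification rule of the subjective minimum income definition can be written down in a few lines. The sketch below is only an illustration: the function names and the income figures are hypothetical, and a household whose income exactly equals its stated minimum is treated as non-poor.

```python
def is_subjectively_poor(actual_income, just_sufficient_income):
    """Poor if actual income is below the amount the household calls 'just sufficient'."""
    return actual_income < just_sufficient_income


def subjective_poverty_rate(households):
    """Share of households classified as poor.

    `households` is a list of (actual_income, just_sufficient_income) pairs.
    """
    poor = sum(1 for actual, minimum in households if is_subjectively_poor(actual, minimum))
    return poor / len(households)


# Hypothetical survey answers: (actual yearly income, stated "just sufficient" income).
households = [(18000, 20000), (25000, 22000), (30000, 30000), (12000, 15000)]
rate = subjective_poverty_rate(households)
print(f"Subjectively poor households: {rate:.0%}")
```

Because each household supplies its own minimum, the resulting rate depends on the assumption noted above that “sufficient” means the same welfare level to everybody.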

A third approach is the subjective minimum consumption definition, which reconciles the subjective poverty and the basic needs definitions. Essentially, it asks people what they consider to be basic needs and how much they need to meet these necessities. For example, the amount people consider to be minimally necessary for food is compared with the amount actually spent on food, and this comparison with the subjective minimum is used to categorize the household as poor or non-poor (Hagenaars, A., & de Vos 1988).

In the Finnish welfare state, the minimum level of social benefit should guarantee a decent and


dignified lifestyle. People living on minimum income ought to have not only sufficient means

for fulfilling basic needs (such as having a shelter or adequate nutrition) but also means for

participation (such as having a phone, recreational activities and other forms of social

participation). Thus, in Finland, reference budgets were compiled by using consumer panels to

define which products and services are regarded necessary and parts of a decent lifestyle. The

budget contains: food, clothing and footwear, household appliances, entertainment electronics,

information and communication technology, health and personal care, leisure, participation,

transport, and housing. The material footprint of a decent minimum based on the reference budget is approximately 20 tons per year; it is measured by total material consumption, which is the material requirement of an economy minus export-based resource use. The households studied show that, in present Finnish society, the material footprint of people living on minimum income is roughly 15 to 20 tons per person per year. This affords them decent housing, adequate nutrition, means for participation and possibilities for recreational activities as well as some basic services. Below this amount, deprivation such as homelessness or eating only leftover food would occur (Lettenmeier et al 2014).

The rate of success of a reference budget depends on its accuracy in identifying the essential

products, consumption quantities, prices, and the life span. The reference budgets should

enable consumption that meets a decent minimum standard of living and allows participation

in society, in the form of decent clothing, proper nutrition and eating out, and the opportunity

to obtain and transmit information, based on today’s society. To determine quantities of

products used, statistics, calculations, and the Finnish Household Budget Survey were used.

Evaluation of the quantities and life spans of commodities was extracted from group

discussion participants. The price and quality level chosen is the average, and items are

expected to last a reasonable time. Low-quality or cheap products were not included in the

study. Price information is available on the Internet, and price levels of food items and the

differences in prices between various trade groups in different parts of Finland were gathered

from a food price survey of the National Consumer Research Centre (Lehtinen et al 2011).

What is the role of geographic differences in prices?

While geographic differences in the cost of living are part of popular discourse, assessing

these differences faces both data availability and conceptual challenges. Despite the obvious

large gaps in prices that prevail in different areas, most studies take no account of geographic

price differences or attempt to control for them (Carrillo et al. 2016). Since 1968, the Council

for Community and Economic Research has produced the American Chamber of Commerce

Researchers Association (ACCRA) price indices for six broad categories of goods and an

overall consumer price index for many urban areas (Carrillo et al. 2016). One study attempted

to construct an interarea housing price index for each metropolitan area and the non-

metropolitan part of each state in 2000. It was based on a large data set with detailed

information about the characteristics of dwelling units and their neighborhoods. For most

areas, the price index for all goods—other than housing—is calculated from the ACCRA price

indices, using a regression model explaining differences in the composite price index for non-

housing goods for the areas where it is available, and used to predict a price of other goods for

the uncovered areas. The price indices for housing services and other goods were combined

with data from the Consumer Expenditure Survey to produce an overall consumer price index

for all areas of the United States. The price indexes estimated with the hedonic equation were consistent with popular views about differences in housing prices. The resulting overall

consumer price index is not sensitive to the expenditure weights used and it differs little from

a simple ideal consumer price index that accounts for how individuals alter their consumption

in response to changes in relative prices (Carrillo et al. 2016).

Because no national database that includes rural areas is available to test the perception that these regions have lower prices, researchers may be led to faulty conclusions. Adjusting the poverty threshold for differences in the ‘cost of living’ based on perceptions of lower costs in rural areas superficially reduces poverty rates for rural areas, lowering federal funding and placing rural low-income families at greater risk. Rural residents commonly face higher prices for food and electricity than their urban counterparts because of higher operating costs.

Differences in the material conditions of rural living also lead to additional costs not typically

found in urban areas. While interarea price comparisons assume that the material conditions of

living are the same, Zimmerman et al. (2008) looked at the differences in rural versus urban

living. They found that there were additional costs incurred for residents in the rural counties.

For instance, in all eight U.S. rural counties studied, extended area phone service would have

doubled the cost of having a phone compared to that in the urban areas. There were costs that

price comparisons alone did not capture. In some cases, going to the grocery store to buy food

meant on average driving 30 miles round trip. This would add additional cost to the price of

the food purchased in order to cover transportation. Some median household income levels

might be artificially inflated due to only parts of a rural area being more prosperous. For

example, counties not part of a micropolitan area, yet adjacent to an interstate, may have a

median household income level similar to the state as a whole, therefore increasing their home

prices. However, the higher income level may be influenced by a small area that in one case

was dominated by high-income lake-based tourism with luxury boats and second homes, while

the bulk of the county is sparsely populated with a limited number of businesses. Without a

better understanding of the material conditions of rural life and local research there is a risk of

exacerbating place-based inequities (Zimmerman et al 2008).

Another study by Yilmazkuday (2017) focused on the determinants of the expected number of

consumers searching for gas prices before making a purchase across zip codes. It was based on

geographic, demographic and economic characteristics. Using maximum likelihood estimation of a consumer search model, the author recovered the distribution of search costs for each zip code in the U.S. by considering the gasoline purchasing behaviour of consumers.

Consumers in zip codes suffering from poverty search for more gas stations before purchasing

gasoline, while consumers at or above 150% of the poverty level do not search more than

other consumers. Consumers double their expected number of stations searched when the

average distance goes up, when the zip code area is tripled in size, and when the population

density goes. Gasoline price spreads are higher in zip codes with spatially dispersed gas

stations. Consumers would halve their expected number of searches when their income is quadrupled. This is attributed to the opportunity cost of searching for lower gasoline prices: higher-income consumers do not find the search profitable enough. The expected

number of stations searched is halved when commuting time is quadrupled (Yilmazkuday

2017).


What is the role of household composition and assumptions regarding sharing?

The impact of sharing was found to depend on the type of household composition. Using 2010 Luxembourg Income Study data, Tai (2017) examines cross-national patterns in youth poverty rates by household composition. The

increase in poverty following young adults' leaving the parental home indicates not only the

tremendous impact of household composition, but also the marginalization of young adults in

welfare states due to prolonged education and postponed entry into the labor market and

marriage. School-leavers, first-time job seekers, and young adults cycling between education

and work may cease to be eligible for unemployment benefits or social assistance. Thus,

young adults are likely to meet economic needs by living with their parents, pooling their

household income, and sharing living expenses. The prevalence of co-residence with parents is

critical for the economic well-being of East Asian and Southern European young adults. If

Taiwanese young adults had the same living arrangements as young adults in Scandinavian

countries, the poverty level of Taiwanese young adults would increase by 5 to 9 percentage

points. With 62% of respondents residing with their coupled parents, the household

composition of Taiwan seems to be the most economically beneficial for young adults. In

addition, many young people live in households with their grandparents, other relatives, or

non-family household members. Young adults living with coupled parents or with their spouse

are less likely to be poor. Scandinavian single parents are actually better off than single young

adults without children due to Nordic welfare regimes providing generous social provisions

for families with children. Single mothers are most vulnerable, with poverty rates ranging

from 13.5% for Japan to 94.5% for Germany (Tai 2017).

Snyder et al. (2006) looked at race and residential variation in the prevalence of female-headed

households with children and how household composition is associated with several key

economic well-being outcomes using data from the 2000 U.S. Census. Household poverty is

highest for female-headed households with children that do not have other adult household

earners. Earned income from other household members lifts many cohabiting and

grandparental female-headed households out of poverty, as does retirement and Social

Security income for grandmother headed households. Poverty was found to be at its highest

among racial/ethnic minorities and for female-headed households with children in non-

metropolitan areas compared to central cities and suburban areas. The presence of other

earners in non-metro female-headed households with children is an important income source

that lifts many out of poverty. The economic benefits of other household earners are important

for white cohabiting households, and for black and Hispanic grandmother-headed households.

When the effect of another earner is added in the model, cohabiting female-headed households

with children remain significantly less likely to be poor compared to single mother only

families, indicating that this factor accounts for some of the association between household

composition and household poverty. It was also found that an additional 100 hours worked by

the household head in the prior year translates into a reduction in the odds of poverty by 14%.

The earnings of a male partner are especially important for non-metro female-headed

cohabiting households with children as it cuts poverty in half for these households for all

ethnic groups considered. The presence of additional earners in the household is associated

with a significant reduction in household poverty. This confirms the need to evaluate

household composition, as it is an important determinant of household poverty due to the


economic resources that are available to specific household living arrangements (Snyder et al

2006).
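A reduction in the odds of poverty is not the same as an equal reduction in the probability of poverty. The short sketch below shows the conversion for the 14% figure reported above, read as an odds ratio of 0.86 per additional 100 hours worked. The baseline poverty probability of 0.30 and the function name are assumptions chosen only for illustration; they do not come from Snyder et al.

```python
def apply_odds_ratio(probability, odds_ratio):
    """Convert a probability to odds, scale the odds, and convert back."""
    odds = probability / (1.0 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.30                                  # hypothetical baseline probability
after = apply_odds_ratio(baseline, 1.0 - 0.14)   # 14% lower odds per extra 100 hours
print(f"poverty probability: {baseline:.3f} -> {after:.3f}")
```

With this assumed baseline, a 14% fall in the odds corresponds to a fall of about three percentage points in the probability of poverty; the size of the change depends on the starting probability.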

Tai (2009) reviewed data on individuals in households with older adults for 22 countries in the Luxembourg Income Study. The study related the risk of poverty to the type of state welfare regime, with Taiwan as the point of comparison, and to the characteristics of the household head and the number of earners, older adults, and children. It finds that persons in households with older adults are

significantly less likely to be poor in countries with social democratic welfare regimes than in

Taiwan, where there are limited social welfare programs. Living with fewer children, more

older adults, and more earners lowers the risk of poverty, as does having a married and better

educated household head. For persons residing in a household with an older adult, having a

single man or a woman rather than a couple heading the household is linked to a greater

likelihood of poverty. In households with more earners, people are less likely to be poor, if only because stronger ties to the labor market bring greater income. An additional older adult in the household is associated with a lower risk of being poor, if only because they are eligible for old-age benefits. Both the risk of poverty and the likelihood of older people living with others are higher where state provisions for dependents and families are limited. Family co-residence and welfare state provisions are alternative strategies that help older adults and their kin to cope when their market income falls short. Given the value that societies such as those in southern Europe and East Asia place on families, it is not surprising that state welfare programs have been slow to develop in these regions, the opposite of what is observed in generous welfare states such as the Nordic countries (Tai 2009).

What is the role of Social Transfers in Kind (STIK)?

According to research conducted by Eurostat, social transfers in kind (STiKs)53 are significant

contributors to household income, particularly for those with lower incomes. These transfers,

provided by governments or non-profit organizations, encompass various services and support

for needs such as education, health, childcare, and long-term care. The analysis conducted by

Alaminos and Geske specifically focuses on health-related STiKs received by households from

governments. Understanding the impact of these social transfers is crucial for assessing

material well-being, especially in Europe, both before and during economic crises.

Household disposable income represents the income available to a household after taxes, which can be spent or saved. It comprises both monetary and non-monetary components. Traditional

monetary income indicators, derived from disposable income, are frequently used to analyze

poverty and inequality. People are considered at risk of monetary poverty when their

equivalized disposable income falls below the at-risk-of-poverty threshold, typically set at

60% of the national median disposable income after social transfers. However, these indicators do not account for non-monetary income. Adjusted disposable income, which includes both monetary income and Social Transfers in Kind (STiKs), provides a more equitable measure of

income distribution. International statistical guidelines recommend using adjusted disposable

income to analyze the total redistributive impact of government interventions in the form of

benefits and taxes on household income.
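The at-risk-of-poverty rule described above can be sketched as follows. This is a simplified illustration, not Eurostat's production method: survey weights are ignored, the household figures are invented, and the equivalence scale used (1.0 for the first adult, 0.5 for each additional person aged 14 or over, 0.3 for each child under 14) is an assumption, since the text does not specify one.

```python
from statistics import median


def equivalised_income(disposable_income, adults, children_under_14):
    """Household income divided by an assumed equivalence scale."""
    scale = 1.0 + 0.5 * (adults - 1) + 0.3 * children_under_14
    return disposable_income / scale


def at_risk_of_poverty(households, share_of_median=0.60):
    """Return (threshold, share of people below it).

    `households` is a list of (disposable_income, adults, children_under_14).
    Every household member is assigned the household's equivalised income.
    """
    person_incomes = []
    for income, adults, children in households:
        eq = equivalised_income(income, adults, children)
        person_incomes.extend([eq] * (adults + children))
    threshold = share_of_median * median(person_incomes)
    poor = sum(1 for y in person_incomes if y < threshold)
    return threshold, poor / len(person_incomes)


# Hypothetical households: (yearly disposable income, adults, children under 14).
households = [
    (21000, 2, 2),
    (30000, 2, 0),
    (25000, 1, 0),
    (6500, 1, 1),
    (40000, 1, 0),
]
threshold, rate = at_risk_of_poverty(households)
print(f"threshold={threshold:.0f}, at-risk-of-poverty rate={rate:.0%}")
```

Replacing the monetary income in the first tuple element with an adjusted income that also includes imputed transfers in kind would give the corresponding indicator on an adjusted basis.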

53 Impact of health social transfers in kind on income distribution and inequality - Statistics Explained (europa.eu)


Non-monetary indicators complement traditional monetary measures and help explore aspects

of inequality not covered by monetary indicators. In Eurostat's analysis, the EU-SILC survey

microdata on disposable income is augmented by imputing health-related STiKs to calculate health STiK-adjusted disposable income. These health-related STiKs align with government health expenditure profiles by age and gender, as reported in the National Accounts. The study examines the impact of health-related STiKs on income distribution and inequality measures like the Gini index. The findings demonstrate that health STiKs contribute to a more equitable distribution of household income across income quintiles, reducing income shares in the highest quintiles and increasing them in the lowest. Without these health-related STiKs,

income inequality would significantly worsen, especially for those needing to cover primary

health expenditures from their own pockets.
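The effect of an imputed transfer on the Gini index can be illustrated with a small sketch. The incomes and the flat transfer amount are hypothetical; in Eurostat's analysis the imputed values vary by age and gender, so this shows only the direction of the effect, not its size.

```python
def gini(values):
    """Gini index of a list of non-negative incomes (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum, with ranks starting at 1 for the lowest income.
    ranked = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked / (n * total) - (n + 1.0) / n


# Hypothetical equivalised disposable incomes and a flat imputed health transfer.
disposable = [10000, 20000, 30000, 40000, 100000]
health_stik = [3000, 3000, 3000, 3000, 3000]
adjusted = [y + t for y, t in zip(disposable, health_stik)]

gini_before = gini(disposable)
gini_after = gini(adjusted)
print(f"Gini before: {gini_before:.3f}, after health transfers in kind: {gini_after:.3f}")
```

A transfer of the same amount to everyone is a larger share of a low income than of a high one, so income shares move toward the lower end of the distribution and the index falls.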

What is the role of housing wealth and imputed rent?

Non-financial assets such as the principal residence represent the largest component of wealth

for most households. Per Maestri (2015), imputed rent for owner-occupied accommodation is

the most important form of non-cash income advantage. This economic advantage is difficult to perceive because of the dual nature of housing, which represents consumption and investment at the same time. Living in social housing is another form of housing advantage.

The rental equivalence approach consists of estimating the market rent that homeowners or

below-market rate tenants should pay if they had to rent their places at full price. For

homeowners, the capital market approach can also be applied; under rental equivalence, imputed rent is estimated as the rent that they would pay if the house were rented (net of costs such as mortgage interest). For tenants in social housing or under rent control, imputed rent is

estimated as the difference between market and paid rent. Including tenants with below-

market rent reduces relative poverty and inequality. By contrast, when only homeowners are

treated as beneficiaries of imputed rent, inequality and relative poverty tend to increase.

If market rent is imputed for tenants with below-market rent as well,

inequality and relative poverty decrease (Maestri 2015).
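The imputation rules described above can be sketched as follows. This is a minimal illustration, not the procedure used by Maestri (2015): the tenure labels, the function name `imputed_rent`, the flooring at zero, and all monetary figures are assumptions made for the example.

```python
def imputed_rent(tenure, market_rent, paid_rent=0.0, owner_costs=0.0):
    """Annual imputed rent for one household under the rules in the text.

    tenure is one of "owner", "below_market_tenant", "market_tenant"
    (hypothetical labels). Negative results are floored at zero (assumption).
    """
    if tenure == "owner":
        # Rent the owner would otherwise pay, net of costs such as mortgage interest.
        return max(market_rent - owner_costs, 0.0)
    if tenure == "below_market_tenant":
        # Difference between the market rent and the rent actually paid.
        return max(market_rent - paid_rent, 0.0)
    if tenure == "market_tenant":
        return 0.0
    raise ValueError(f"unknown tenure: {tenure}")


# Hypothetical households with annual amounts.
households = [
    {"income": 30_000, "tenure": "owner", "market_rent": 9_000, "owner_costs": 2_500},
    {"income": 18_000, "tenure": "below_market_tenant", "market_rent": 7_200, "paid_rent": 4_800},
    {"income": 22_000, "tenure": "market_tenant", "market_rent": 7_200, "paid_rent": 7_200},
]

# Extended income = disposable income plus imputed rent.
extended = [
    h["income"]
    + imputed_rent(
        h["tenure"],
        h["market_rent"],
        paid_rent=h.get("paid_rent", 0.0),
        owner_costs=h.get("owner_costs", 0.0),
    )
    for h in households
]
print(extended)
```

Restricting the imputation to the `"owner"` branch, or extending it to below-market tenants, reproduces the two scenarios whose distributional effects are contrasted in the text.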

There are three ways of estimating imputed rents. First is the rental equivalence approach,

which calculates the value of housing from equivalent units in the private rental market. Rents

are estimated per square metre, housing costs are deducted, and the result is compared with

owner-occupied housing to arrive at a market value. This method finds that imputed rents reduce income

inequality as the distribution of imputed rents, while right skewed, is less unequal than the

distribution of other income (Maestri 2015).

The second estimation method is the capital market approach, which sees housing as capital

income from an investment and assumes a return on its value in housing. Using the capital

market approach reduces the dampening effect of imputed rent on income inequality.

The third is the self-assessment method, which uses owners' subjective estimates of the rent

their housing would command. These estimates measure the opportunity cost of owner-occupied

housing and serve as a proxy for rent. This method leads to the smallest

reduction in inequality (Maestri 2015).

An assessment of the impact of households' housing situation using 2010 EU-SILC data

shows that relative income poverty and inequality decrease if imputed rent is

taken into account, while they increase if housing expenses are considered. Therefore, the

deduction of housing expenses provides a better measure of relative poverty. Imputed rent

can be estimated with the rental equivalence or capital market method, while the housing

expenses to be deducted from disposable income can be obtained with the out-of-pocket approach.

The comparison of disposable income plus imputed rent, minus housing expenses and

perception of housing costs provides useful hints on the distributional effects of housing in

different housing systems and sheds some light on their possible future developments (Maestri

2015).

In another study, the Household Finance and Consumption Survey (HFCS) conducted by the

European System of Central Banks was used to estimate non-cash income from owner-

occupied housing, subsidised rental housing, and free use of the main residence in Austria. The

HFCS provides detailed information on mortgages, debt of renters in cooperative housing and

subjective information provided by interviewers on the dwellings and building quality. It

enabled the evaluation of the impact of non-cash income from housing on the full

unconditional household income distribution. Imputed rents have an equalising effect on the

distribution of income, and we find similar evidence for non-cash income from subsidised

rents. However, imputed rents from owner-occupied housing equalise the upper part of the

income distribution, and subsidised housing has an (albeit smaller) equalising effect for the

lower part of the income distribution (Fessler et al 2016).

What is the role of differences in “culture” and religion?

A study by Yurdakul (2016) on the role of religion discusses how religion may alter beliefs

about the causes of poverty, helping the poor with coping mechanisms. These beliefs are

classified as individualistic (poverty is related to the lack of ability or effort), structural (causes

of poverty are the economic and social systems), and fatalistic (poverty is not caused by the

individual or the system, but by forces such as chance, luck, and fate). Fatalistic beliefs in this

case are closely related to religion. The discourses of informants from a Turkish panel reveal

that religion helps them in resolving the tensions between reality (their poverty) and desire

(especially the desire to consume). Religious beliefs can contribute to the different stances

low-income consumers take towards their poverty, affecting the level of internalization and

resistance to the poverty stigma, and how people respond to the marketing institution. When

resistance is directed toward the desire to consume, arguments are often fueled by religious

beliefs. The effects of religious beliefs differ when used for resistance versus non-resistance

strategies stemming from different interpretations of Islam. Whereas resistant informants

emphasize religious ethics regarding worldly issues, such as greed, sin, improperness of

desire, non-resistant informants emphasize self-blame, fatalism, and the afterlife.

Yurdakul’s findings indicate the empowering aspect of religious arguments in providing low-

income consumers with the strength to cope by resisting consumer culture and re-creating

meaning beyond consumption. Informants further disclose a form of subtle resistance when

they intentionally stay away from consuming beyond the basic necessities for survival. Non-

resistant informants, especially in the cases of fatalism and belief in the afterlife, disclose that

internalized poverty stigma leads to negative feelings and contributes to perceived

vulnerability. Religiosity is more prominent among non-resisters who are more fatalistic in


their beliefs. Participants with a more critical stance are more active in their efforts to improve

their current situation, such as taking an active role in the workers’ unions, trying to break up

the vicious cycle of persistent poverty, or engaging in subtle forms of resistance such as non-

consumption (Yurdakul 2016).

A study by Atkin (2016) used India’s National Sample Surveys of 1983 and 1987–1988, which asked

households about their consumption of a broad set of foods as well as about their migration

particulars, to look at the relation between culture and deprivation. The surveys record

household expenditures and quantities for each food item consumed in the last 30 days. The

surveys also provided information on expenditures on non-food items as well as household

demographics and characteristics. The findings suggest that interstate migrants consume fewer

calories per rupee of food expenditure compared to their non-migrant neighbours, even for

households on the edge of malnutrition. Migrants make calorically suboptimal food choices

due to strong preferences for the favoured foods of their origin states. Migrants bring their

origin-state food preferences with them when they migrate, and these preferences are

stronger when there are more migrants in the household. The most adversely affected migrants

would consume 7% more calories if they possessed the same preferences as their neighbours.

These results provide insight into the value that households place on their culture. Even

households on the edge of malnutrition are willing to substantially reduce their caloric intake

to accommodate their cultural food preferences (Atkin 2016).

Deprivation theory holds that poverty will be associated with high levels of religious

identification for those who are already affiliated with a religion. Hoverd (2013) used a large

national probability sample to gather information about religious affiliation (state of having a

commitment to a religion) and level of religious identification (strength of their religious

commitment among those who stated having a religious affiliation). Results indicate that

deprivation initially predicted religious affiliation, but only because deprivation tapped into

variance also shared with ethnicity. When statistically adjusting for ethnicity, deprivation did

not predict whether people affiliated with a religious group. To measure deprivation, the New

Zealand Deprivation Index 2006 (NZDep2006) was used. This index allocates a deprivation

score to each neighbourhood based on the proportion of adults receiving a government-

supplied welfare benefit; household income; not owning their own home; single-parent

families; unemployed; lacking qualifications; household crowding; no telephone access; and

no car access. To examine whether deprivation was associated with levels of religious

identification, a model including education and ethnicity among other factors was constructed.

Results suggested that when controlling for deprivation, more educated participants were more

likely to be strongly identified with their religious group. When ethnicity was added to the

model, it revealed that cultural inheritance affected the strength of identification in connection

with poverty (Hoverd 2013).

In a 2002 Hunt study, three dependent variables were examined in a stratification survey that

was conducted in southern California measuring the importance attributed to individualistic,

structuralist, and fatalistic reasons for poverty. A series of statements representing possible

explanations for why some people are poor were presented to respondents. Separate measures

were constructed. Individualistic beliefs are composed of personal irresponsibility, lack of

discipline, effort, thrift, ability, talent, money management among those who are poor.

Structuralist beliefs are concentrated on low wages and lack of good jobs in some businesses


and industries, failure of society to provide good schools, discrimination. Fatalistic beliefs are

measured simply with just bad luck as an explanation for poverty. Findings reveal that

Protestants and Catholics are most likely to endorse the historically dominant individualistic

interpretation. Minority religions are most likely to support structural challenges to poverty.

Catholics and Jews are most likely to take the fatalistic view of poverty. Significant race/ethnic

group differences are found between religious affiliation and structuralist and fatalistic beliefs.

Among Whites, Protestants are significantly less likely than the other examined affiliations to

endorse structuralist beliefs, while among Blacks and Latinos, Protestantism is significantly

more positively aligned with structuralist beliefs. For racial and ethnic minorities in America,

Protestantism is more collectivist in orientation. Catholics are similar to Protestants on

individualistic beliefs but are significantly more likely than Protestants to “system blame” for

poverty. Among Blacks and Latinos, unlike Whites, being Catholic is significantly more

predictive of fatalism arising from the need for an alternative account of inequality to

supplement the explanatory limits of individualism. It is important to intersect race/ethnicity

and religion in research on stratification beliefs. Cultural differences between Protestants and

Catholics in America in ideological beliefs about poverty differ among Blacks, Latinos, and

Whites (Hunt 2002).

Concluding remarks on hypothetical questions

Hypothetical assessments can be framed as second-order beliefs, where respondents are asked

not to provide their opinion but to estimate what other respondents would answer on average.

This approach helps assess social norms, which can shape individuals' first-order beliefs and

influence what they find acceptable. Some argue that second-order beliefs are better predictors

of behavior than personal beliefs and can be incentivized to reduce social desirability bias

(Babin, 2019). However, it is essential to recognize that hypothetical household questions

represent a departure from the more common subjective approach, as they gauge respondents'

perceptions of a hypothetical family's welfare rather than their own, resulting in different

conceptualizations of poverty.

Lessons learned from COVID-19

In this section, we explore the dynamic landscape of subjective poverty research, driven by

several key factors such as declining response rates in national surveys and the rapid adoption

of online data collection methods, a trend notably accelerated by the COVID-19 pandemic. As

a response to the challenge of survey fatigue, statistical agencies have increasingly prioritized

shorter surveys and concise questioning to maintain respondents' engagement (Statistics

Canada, 2019; see footnote 54). This shift in survey design has profound implications for the study of

subjective phenomena, including subjective poverty.

The section opens by providing a comprehensive overview of the OECD's ongoing research

into subjective well-being indicators, which significantly overlaps with the broader subject of

subjective poverty. It highlights the importance of understanding and measuring well-being

from a subjective perspective, emphasizing the need for nuanced indicators that capture the

multifaceted nature of poverty and well-being. Furthermore, the discussion pivots to the

54 Modernization: a key to Statistics Canada's efforts to reduce response burden (statcan.gc.ca)


emergence of Socio-Economic Impact Assessments (SEIAs) conducted across 15 European

and Central Asian countries during the onset of the COVID-19 pandemic. These assessments

play a vital role in enhancing our comprehension of subjective poverty by examining the

socio-economic impacts of the pandemic on individuals and communities. Through SEIA

questions and comparability analyses, we gain valuable insights into how subjective poverty

evolves in the face of crises.

This section underscores the transformative impact of the COVID-19 pandemic on the

landscape of subjective poverty research and the need to adapt research methodologies to

effectively capture and understand subjective experiences, especially concerning poverty and

well-being assessments. It also underscores the significance of international organizations like

the OECD and UNDP in coordinating global efforts to advance subjective poverty research,

shaping the future of this field.

Subjective Poverty SEIA Questionnaires and Comparability Analysis

Of the 15 countries across the UNECE region that conducted Socio-Economic Impact

Assessments (SEIAs), six incorporated subjective poverty measurements into their

assessments: the Kyrgyz Republic, Moldova, Serbia, Tajikistan, Ukraine, and Uzbekistan. Among

these, five countries collected primary data to support these measurements, while Serbia

utilized secondary data from its 2018 and 2019 annual surveys conducted by the Statistical

Office of the Republic of Serbia (SORS). Data collection primarily focused on households and

enterprises, with one exception being Mahalla-level administration in Uzbekistan (see footnote 55).

Subjective poverty was predominantly assessed through direct methods in SEIA

questionnaires. Households were queried about their perceptions of financial and material

changes resulting from the COVID-19 pandemic. These questions aimed to understand how

the pandemic affected household income, their capacity to meet material and non-material

needs, and timely household expenses. This approach allowed respondents to voice their

experiences and opinions, offering insights into poverty criteria based on their pandemic-

related experiences. In contrast, traditional poverty measurements evaluate household material

resources and categorize households as poor if they fall below a certain threshold. The use of

direct methods in socio-economic impact assessments is especially significant as it helps

identify areas of economic hardship in the context of a global pandemic.

The questionnaires employed in SEIA included inquiries using minimum income and

economic ladder questions. Thirteen of the participating countries conducted primary data

collection, primarily through quantitative surveys. While randomness and representativeness

criteria were generally met, household-level data collection was less common, with nine

countries conducting household surveys and one focusing on municipal-level data. Over and

above the secondary data collection, high-frequency data, statistics, and desk reviews that

were used, some countries employed adapted Post Disaster Needs Assessment (footnote 56)

methodologies and qualitative studies to complement quantitative and secondary data. Some

countries even utilized Big Data sources like telecom and satellite data for a more

55 The smallest state administrative unit in Uzbekistan which consists of households.

56 https://www.undp.org/publications/pdna


comprehensive view of the pandemic's impact.

Table 2 provides a summary of the countries and data collection methods on subjective

poverty used in SEIA. Multiple subjective poverty approaches were adopted in SEIA

questionnaires, which will be explored further below.

Table 2 – Summary of data collections in SEIA

Country | Primary data collection | HH Survey | Other Surveys | Use of digital survey | Use of big and alternative data
Armenia | Yes | 3550 households | 2100 local governance service providers | Yes, Kobo | No
Azerbaijan | Yes | No | No | No | No
Belarus | Yes | No | No | No | No
Bosnia and Herzegovina | Yes | 2182 respondents | No | No | No
Kazakhstan | Yes | 12024 households | No | No | No
Kosovo* | Yes | 1412 respondents | No | No | No
Kyrgyzstan | Yes | 2340 respondents (1371 women) based on random sampling | No | No | No
Moldova | Yes | UNDP analysis of the ad hoc module of the NBS Household budget survey | 450 company respondents | No | Yes: telecom and satellite data; micronarratives (300 collected)
Montenegro | Yes | 1006 households | No | No | No
Tajikistan | Yes | 1250 enterprises, individual entrepreneurs and dehkans (farmers) | In-depth interviews (150 HHs, including 100 women and girls and 100 youth, and 50 MSMEs) | No | No
Turkey | Yes | No | No | No | No
Ukraine | Yes | 1098 households | No | Yes, Kobo | No
Uzbekistan | Yes | No | Mahalla survey (3670 mahallas surveyed) | No | No


Poverty defined in a fully subjective way (direct self-identification as poor, feeling of poverty)

Several countries, including the Kyrgyz Republic, Moldova, Tajikistan, and Uzbekistan, adopted the

method of self-identification to assess the feeling of poverty. Respondents were asked questions

to determine if they had felt poor in the past, currently, or anticipated feeling at risk of poverty in

the near future due to the pandemic's impacts. This approach was widely used in the surveys and

adaptable, with questions often focusing on respondents' expectations regarding their household's

financial well-being.

Perceived financial difficulties

Countries utilizing subjective poverty measures in their SEIA assessments often included

questions aimed at assessing respondents' subjective economic stress. Questions inquired

about the ability to meet expected/unexpected expenses, make ends meet, or satisfy basic

needs. These questions considered not only traditional basic needs but also pandemic-related

necessities, such as access to the internet for online schooling or personal protective

equipment. Assessments mainly focused on changes in income-to-expense ratios and coping

mechanisms.

Subjective poverty line approach – perceived poverty line

The Kyrgyz Republic employed this method, which involves questions about the income

needed to secure a basic standard of living or meet necessities. Respondents were asked to

estimate the amount of money required by a family with the same number of members to

avoid poverty, considering prevailing price levels. Open-ended questions were also used to

capture changes in respondents' lives related to the pandemic.
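One way such answers could be turned into a perceived poverty line is sketched below. This is only an illustration under stated assumptions, not the method used in the Kyrgyz SEIA: the responses are invented, and taking the median stated amount per household size as the line is a choice made for the example.

```python
from collections import defaultdict
from statistics import median

# Hypothetical responses: (household_size, stated_minimum_income, actual_income).
responses = [
    (2, 300, 250), (2, 340, 400), (2, 320, 310),
    (4, 500, 450), (4, 560, 700), (4, 520, 530),
]

# Group the stated minimum incomes by household size.
stated_by_size = defaultdict(list)
for size, stated_minimum, _ in responses:
    stated_by_size[size].append(stated_minimum)

# Perceived poverty line per household size: median stated amount (assumption).
perceived_line = {size: median(values) for size, values in stated_by_size.items()}

# Households whose actual income falls below the perceived line for their size.
below_line = [
    (size, income)
    for size, _, income in responses
    if income < perceived_line[size]
]
print(perceived_line, below_line)
```

Grouping by household size mirrors the wording of the question, which asks about a family with the same number of members as the respondent's.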

Subjective poverty lines assessed with the use of statistical methods (so-called objectivised, quasi-subjective poverty lines)

The Kyrgyz Republic and Ukraine directly employed this method in their SEIA assessments.

Respondents were questioned about household assets or funds, which were used as indicators

of deprivation. This approach focused on assessing financial restrictions resulting from cost-

related inaccessibility of essential items.

Perception of poverty as a social phenomenon

The Kyrgyz Republic included questions on respondents' views regarding poverty as a social

phenomenon. These questions encompassed definitions of poverty, perceptions of poverty's

extent in the country, its causes, and the role of the government in poverty reduction. They

also examined opinions on the government's anti-crisis measures and the type of assistance

needed personally.


Other Approaches

In addition to the subjective poverty measures outlined above, various other approaches were

adopted as well, including inquiries about the availability and access to food, estimations of

fair expenses on basic needs, and negative coping mechanisms adopted by households due to

pandemic-induced income reductions. Some questions assessed the dependence of individuals

on their families during economic hardships. The approaches listed in this paragraph were

used by countries such as the Kyrgyz Republic and Tajikistan.

An overview of UNDP Socio-Economic Impact Assessments (SEIAs) for households in countries of UNECE region

The SEIA assessments revealed varying impacts across countries, with differences in intensity

based on economic structure, social protection systems, and other vulnerabilities. At the

individual and household levels, the assessments highlighted the unwinding of development

gains, increased poverty, and rising inequality. Regional and rural-urban disparities were

observed, particularly affecting informal businesses in urban areas. The assessments also

underscored the need to reconsider social protection systems to cover new classes of

vulnerability, often referred to as the "missing middle." Challenges encountered during SEIA

implementation included designing research methodologies, questionnaires, and sampling

methods, as well as targeting vulnerable groups, dealing with fieldwork constraints, and

ensuring data comparability. Coordination with various institutions, access to data, and data

sharing by government and big data providers were additional hurdles. Nevertheless, some

best practices emerged, including Digital SEIA, innovative use of Big Data, combining "thick"

data (micronarratives; see footnotes 57 and 58), high-frequency data, and other methods for sense-making during

the pandemic.

The process of sensemaking involved the integration of various pieces of evidence during the

SEIA assessments, as summarized in Table 2. This integration encompassed the use of qualitative studies to complement quantitative and secondary data.

Additionally, certain countries leveraged Big Data sources, such as telecom and satellite data,

to gain insights into the context of the COVID-19 pandemic.

Emerging issues from the SEIA assessments conducted reveal differentiated impacts. Income

disparities are exacerbated by increasing unemployment, especially in urban areas, as well as

reduced income, higher food and healthcare costs, and limited savings for many households.

Gender disparities are notable, with women disproportionately affected in the labor market,

while multi-dimensional consequences, including long-term education and health effects,

contribute to rising inequalities among different groups. Entrepreneurs, migrant laborers, and

informal workers face heightened vulnerabilities, with youth and women bearing the brunt of

these impacts. School closures and ineffective remote learning exacerbate long-term

challenges for children.

57 Thick data is a term that refers to qualitative data that reveals the contexts, emotions, and stories of the subjects being studied.

58 Micronarratives are a collection of short stories written by survey respondents.


Macro-economic vulnerabilities have been exposed to varying degrees due to external and

internal shocks, including declines in exports, remittances, and oil prices, as well as

lockdowns. These vulnerabilities translate into micro-economic consequences affecting

individuals, households, and small and medium-sized enterprises (SMEs). Demand-side

shocks have led to falls in remittances and household incomes, reduced demand in sectors like

tourism and hospitality, border closures disrupting supply chains, and increased household

costs for essential goods and services. Supply-side challenges include temporary border

closures affecting value chains and labor movement, plummeting commodity prices, currency

depreciation, higher import costs, financial risks, and debt servicing burdens, as well as fixed

costs and SME weaknesses. The impact of these macro-economic vulnerabilities varies among

countries, with commodity-dependent nations facing a double shock from declining oil and

gas prices. As the pandemic persists with multiple waves of infection, uncertainty rises,

placing increased pressure on public policies and recovery efforts, particularly in terms of debt

and fiscal space. Socio-economic impact assessments underscore the disproportionate effects

on vulnerable groups, households, smaller enterprises, and disparities between urban and rural

areas.

Moreover, SEIAs reveal the need to reassess social protection systems to encompass new

classes of vulnerability often referred to as the “missing middle.” This group includes formerly

non-poor informal workers who lack basic security, occasional and gig workers who

supplement their income with occasional work, long-term unemployed individuals who have

lost eligibility for unemployment benefits, and labor migrants and seasonal workers who face

challenges earning money abroad due to travel restrictions and increased costs. These findings

emphasize the importance of adapting social protection systems to address evolving

vulnerabilities in the wake of the pandemic.

Figure 6. Missing Middle. [Diagram showing three groups: the middle and upper middle class employed in the formal economy and covered with social security; the missing middle; and the poor covered with targeted social assistance transfers.]

Source: Socio-Economic Impact Assessments, Statistics Canada (2022)

Case study 6: Self-assessed Financial Well-being: comparing objective and subjective measures

This case study from Statistics Canada examines the comparison between subjective

and objective measures in the context of self-reported financial well-being and official poverty

measures, such as the market basket measure. It contributes to the evolving understanding of

subjective poverty measurement trends.

Due to the impact of the COVID-19 pandemic, Statistics Canada undertook the task of

establishing a timelier approach to collecting data, enabling a monthly assessment of

households' financial well-being. As a result, a supplementary question was introduced into the

Labour Force Survey (LFS) from April 2020 onward. This question inquired about the ease or

difficulty that households experienced in meeting their financial needs in various areas,

including transportation, housing, food, clothing, and other essential expenses over the past

month.

This monthly incorporation of the question presents Canada with a distinctive opportunity to

enrich its comprehension of official poverty measurements by adopting the perspective of

subjective poverty. This approach offers advantages such as adaptability to evolving

information demands, cost-effective data collection, and time-saving benefits for survey

participants. Nevertheless, there are drawbacks, as the data must undergo an extensive

validation process before being disseminated, which can introduce delays from approval to

results. This is where the monthly LFS data proves advantageous, as it expedites data

collection for swifter outcomes. However, this comes with increased costs and potential data

reliability concerns. Combining monthly and administrative data appears to bridge the gaps

between subjective and objective poverty measures.

This new incorporation thus offers the possibility to construct an indicator that amalgamates

socio-demographic variables and income data from the Canadian Income Survey (CIS) to

provide a more comprehensive analysis of subjective poverty. Research studies have been

conducted to investigate sociodemographic characteristics in cases where individuals' financial

well-being diverges from the anticipated official poverty line. Linking the CIS 2020 data with

the financial difficulty data extracted from the supplemental LFS between January 2021 and

July 2021 allows comparisons between subjective and objective poverty measures.

The advantages derived from juxtaposing perceived financial well-being with official poverty

measures can be observed in this case study by Statistics Canada. It delves into a comparison

between employed and unemployed individuals, focusing on their Market Basket Measure

(MBM) in relation to their financial well-being. The age group under examination was

restricted to individuals aged 25 to 54. Results revealed that 43.7% of employed individuals

reported financial comfort (Figure 7), in contrast to 21.0% of the unemployed cohort (Figure

8). Among individuals above the poverty line, a larger percentage of the unemployed group

reported financial difficulty than of the employed group. This shows that a person's perception of

poverty does not always align with their objective poverty status.
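The kind of comparison reported here can be sketched as a simple cross-tabulation of objective status against self-reported financial well-being. The records below are invented and unweighted, and the labels and the helper `share` are hypothetical; an analysis of the linked CIS and LFS data would apply survey weights.

```python
from collections import Counter

# Hypothetical linked records: (status relative to the MBM line, reported ease).
records = [
    ("below", "difficult"), ("below", "difficult"), ("below", "neither"), ("below", "easy"),
    ("above", "difficult"), ("above", "neither"), ("above", "easy"), ("above", "easy"),
    ("above", "easy"), ("above", "difficult"),
]
counts = Counter(records)


def share(mbm_status, reported):
    """Share of one MBM group giving a particular self-reported answer."""
    group_total = sum(n for (status, _), n in counts.items() if status == mbm_status)
    return counts[(mbm_status, reported)] / group_total


# Mismatches between the objective and the subjective measure.
above_but_difficult = share("above", "difficult")
below_but_easy = share("below", "easy")
print(f"{above_but_difficult:.1%} above the line report difficulty")
print(f"{below_but_easy:.1%} below the line report ease")
```

Running the same tabulation separately for employed and unemployed respondents gives the two panels compared in the case study.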

Numerous avenues exist for understanding poverty, but the aim of this case study is to merge

the subjective and objective dimensions to conceptualize and understand poverty more

profoundly. Accordingly, Statistics Canada has been employing yearly data to calculate the


MBM, while the LFS relies on monthly data. By linking these two datasets, a deeper insight

into subjective poverty and its nuances is achieved. The example offers just a glimpse of the

potential when these two poverty conceptions converge. Yet, there remains a wealth of

opportunities to explore further aspects, such as gauging the proximity to the poverty line and

juxtaposing it with the ability to meet financial needs or examining the proportion of

immigrants and visible minorities experiencing poverty at income levels exceeding the poverty

threshold. These represent only a subset of the possibilities that could stimulate an array of

future research endeavors.

Figure 7: Comparing MBM to subjective financial well-being in age group 25-54, Employed persons. [Stacked bar chart, 0% to 100%: shares reporting Difficult, Neither, or Ease for the groups Below, Above, and Total.]

Source: Statistics Canada, Canadian Income Survey, 2020 and Labour Force Survey, September 2020 to September 2021.

Figure 8: Comparing MBM to subjective financial well-being in age group 25-54, Unemployed persons. [Stacked bar chart, 0 to 100: shares reporting Difficult, Neither, or Ease for the groups Below, Above, and Total.]

Source: Statistics Canada, Canadian Income Survey, 2020 and Labour Force Survey, September 2020 to September 2021.

Overlaps in Dimensions of Poverty

Building on this comparison, the article Overlaps in Dimensions of Poverty explores the overlap among

three dimensions of poverty and finds that there is minimal overlap in the group of individuals

considered poor by these dimensions, largely due to differences in reliability and validity of

measures (Bradshaw and Finch, 2003). This lack of overlap implies that the policy response to

poverty will vary depending on the chosen measure. For example, cumulatively poor

individuals, those poor in multiple dimensions, exhibit different characteristics and social

exclusion patterns compared to those poor in only one dimension. This suggests that

cumulatively poor individuals might be a more reliable way to identify poverty and distinguish

between different levels of poverty. The article recommends using a combination of measures

in future poverty studies to provide a more robust basis for drawing conclusions, as relying on

a single dimension has limitations in terms of reliability and validity.
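The idea of overlap and cumulative poverty can be sketched with sets of identifiers. The three set names and their members are illustrative assumptions and are not data from Bradshaw and Finch (2003).

```python
# Hypothetical identifiers of people classified as poor on each dimension.
income_poor = {"a", "b", "c", "d"}
deprived = {"c", "d", "e"}
subjectively_poor = {"d", "e", "f", "g"}

dimensions = [income_poor, deprived, subjectively_poor]

# Everyone who is poor on at least one dimension.
poor_on_any = set().union(*dimensions)

# Number of dimensions on which each person is poor.
poor_count = {
    person: sum(person in dimension for dimension in dimensions)
    for person in poor_on_any
}

# Cumulatively poor on all three dimensions versus poor on exactly one.
poor_on_all = {p for p, k in poor_count.items() if k == len(dimensions)}
poor_on_one_only = {p for p, k in poor_count.items() if k == 1}

print(sorted(poor_on_all), sorted(poor_on_one_only))
```

In this toy example seven people are poor on at least one dimension but only one is poor on all three, which is the pattern of limited overlap the article describes.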

Implications regarding experience with COVID outbreak

The Socio-Economic Impact Assessments (SEIAs) conducted during the COVID-19 outbreak

faced several challenges. One major challenge was ensuring the accuracy and suitability of

primary data collection, including research methodology, questionnaire design, sampling

methods, and reaching vulnerable groups. Designing questionnaires proved complex,

particularly for household-level assessments, given the multidimensional nature of impacts

and the need to avoid respondent fatigue during remote data collection.

Sampling presented its own challenges, as some countries had to balance the rapid need for

data with the potential for wider margins of error with smaller sample sizes. Others faced the

trade-off of collecting larger samples, which required more time for data collection fieldwork.

Remote data collection made it difficult to reach hard-to-reach and vulnerable groups like

informal workers and migrants.
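
The trade-off between speed and precision can be made concrete with a rough calculation. The sketch below assumes simple random sampling and a normal approximation at the 95 percent level; actual SEIA designs are not described here and would typically involve design effects and nonresponse that widen these margins. The function names are illustrative.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p estimated from a
    simple random sample of size n (z = 1.96 gives roughly 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

def required_sample_size(p, target_moe, z=1.96):
    """Smallest simple random sample whose margin of error is at most target_moe."""
    return math.ceil(z ** 2 * p * (1 - p) / target_moe ** 2)

# Quadrupling the sample size only halves the margin of error.
for n in (250, 1000, 4000):
    print(n, round(margin_of_error(0.5, n), 4))
```

Because the margin shrinks with the square root of the sample size, halving it requires roughly four times as many interviews, which is why larger samples implied longer fieldwork.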

Timing for data collection preparations varied across countries, influenced by factors such as

country size, partnerships, and the pandemic's onset. Fieldwork constraints arose due to

quarantine measures, lockdowns, and movement restrictions, further delaying data collection.

Remote data collection methods, including telephone interviews and digital tools, became

essential.

Ensuring data comparability across SEIAs posed a significant challenge. Different countries

employed context-specific approaches with varying questionnaires and sampling strategies,

affecting cross-country comparisons and data aggregation. Maintaining questionnaire

comparability for time-series comparisons with earlier surveys conducted in 2020 was also a

concern.

Data sharing by government partners and big data providers presented another hurdle. While

some countries had open-source secondary data, others had lengthy processes to access data

from partners. Additionally, primary data collected by various entities was often shared in

forms for end-users, not as raw datasets, and some government counterparts were reluctant to

share sensitive primary data. These challenges underscore the complexity of conducting

SEIAs during a global crisis.

In summary, subjective poverty measures in SEIA assessments demonstrated interconnections

and provided valuable insights into the impacts of the COVID-19 pandemic on households.

These measures allowed affected households to establish poverty criteria and express their

opinions about needed assistance. Gathering opinions about poverty and necessary support


proved invaluable in shaping government policies, including subsidies, direct cash transfers,

and bill payment deferments. Regular adoption of subjective poverty measures, not only in

SEIA but also at the country level, can inform government policymaking effectively.

Conclusion

Subjective poverty measures are gaining popularity, but their relationship with existing

monetary and multidimensional poverty measures needs clarification. Key questions revolve

around overlaps and discrepancies in identifying poverty and their relevance for public policy.

It remains uncertain how much information subjective measures capture that monetary and

multidimensional measures already encompass, what novel insights they offer, and whether

they should stand alone or complement other measures. It is imperative that efforts to

understand subjective poverty elucidate the utility of subjective measures for policymakers combating poverty.

Future research should address the proportion of those reporting subjective poverty who also

experience multidimensional or monetary poverty and explore what unique information

subjective measures reveal for those not deemed poor by conventional standards. Additionally,

it should discern when subjective measures provide value for those classified as poor by

conventional criteria and when they reflect adaptive preferences. Whether subjective measures

should replace or work alongside other poverty metrics is a critical consideration for guiding

policymaking effectively. Caution is advised against portraying subjective questions as

simplistic, as their adoption could potentially displace more robust multidimensional

measures, necessitating a balanced approach to ensure a comprehensive understanding of

poverty. It is recommended that subjective poverty questions always complement rather than

replace multidimensional ones to avoid sacrificing valuable insights into poverty's

multidimensional nature.

Chapter 5. RECOMMENDATIONS

Chapter 2 addresses the questions “what is subjective poverty”, “what is a subjective poverty

measure” and “why should National Statistics Offices (NSOs) measure subjective poverty”?

As its name suggests, subjective poverty is based on the personal perspective and evaluation

of individuals. In subjective poverty, poverty is assigned in one of two ways. In the first way,

individuals or households are asked to evaluate their life situation, thereby identifying

themselves as “poor” or finding it “very difficult to make ends meet” through their response to

a question. In the second, a household makes an evaluation of what resources are required to

meet a standard such as “making ends meet”, which can in turn be converted into a “subjective

poverty line”. Subjective poverty measures can capture aspects of poverty missed by

traditional monetary poverty metrics. Subjective poverty incorporates the fundamental aspect

of reflecting citizens’ perspectives on what constitutes poverty – an aspect which is, perhaps

surprisingly, under-considered in policy development.

Recommendation 1

Subjective measures of poverty should be included among the set of assessment tools

used by countries. These do not replace objective measures or multidimensional


measures; rather, they are a complement. Countries with dashboards of poverty

indicators should include subjective assessments among the poverty indicators.

Chapters 2 and 3 relate non-monetary subjective poverty measures to the more common

measures of subjective well-being, such as the Cantril ladder, and introduce the most

common non-monetary subjective poverty question forms. They also introduce the most

common monetary subjective poverty question forms, including the “Deleeck” question and

the “Minimum Income Question”.

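Examples of subjective poverty measures include some that ask respondents to self-identify as

poor (for example, by asking whether they consider themselves poor); evaluate their own

situation as one of “making ends meet” (Thinking of your household’s total income, is your

household able to make ends meet, namely, to pay for its usual necessary expenses? With great

difficulty, With difficulty, With some difficulty, Fairly easily, Easily, Very easily); or provide

a subjective valuation of a poverty line (for example, by stating the income the household

would need to “make ends meet”). The second of these questions is

known as the “Deleeck” question and is found in the EU-SILC. The last of these questions is

known as the Minimum Income Question (MIQ).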

The chapter then describes various ways that subjective questions can be used to create a

subjective poverty line. The MIQ is one type of subjective poverty question that can be used to

create a subjective poverty line, using a method known as the intersection approach.

Recommendation 2

Given their inclusion in EU-SILC, and their utility in identifying subjective poverty, the

Deleeck question and the Minimum Income Question should be considered by NSOs as a

standard for international comparison purposes.

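A household may have different sources of income and more than one household member may

contribute to it. Thinking of your household's total income, is your household able to make

ends meet, namely, to pay for its usual necessary expenses? (With great difficulty, With

difficulty, With some difficulty, Fairly easily, Easily, Very easily). EU-SILC question HS120.

In your opinion, what is the very lowest net monthly income that your household would have

to have in order to make ends meet, that is to pay its usual necessary expenses? Please answer

in relation to the present circumstances of your household, and what you consider to be usual

necessary expenses (to make ends meet). EU-SILC variable HS130.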

Recommendation 3

Utilize the Minimum Income Question and the intersection approach as the primary

methods for estimating subjective poverty lines.
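
One common formulation of the intersection approach regresses the logarithm of each household's answer to the Minimum Income Question on the logarithm of its actual income, and takes the subjective poverty line to be the income at which the fitted answer equals actual income. The sketch below assumes this single-covariate, log-linear form; the data are invented, and applied work usually adds household size and other covariates, so it should be read as an illustration rather than the procedure described in the report.

```python
import math

def intersection_line(actual_income, min_income):
    """Fit ln(min_income) = a + b * ln(actual_income) by ordinary least squares
    and return the income y* where the fitted minimum equals actual income,
    that is ln(y*) = a / (1 - b)."""
    xs = [math.log(v) for v in actual_income]
    ys = [math.log(v) for v in min_income]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    if b >= 1:
        raise ValueError("no intersection: fitted slope must be below 1")
    return math.exp(a / (1 - b))

# Invented data generated exactly from a = 4.0 and b = 0.5, so ln(y*) = 8.
actual = [1000.0, 2000.0, 4000.0, 8000.0]
stated_min = [math.exp(4.0 + 0.5 * math.log(y)) for y in actual]
line = intersection_line(actual, stated_min)
```

Households with actual income below the returned line would be classified as subjectively poor under this formulation.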

Chapter 4 examines in depth good practices associated with surveys which can be used to

determine subjective poverty. Several different survey types can be considered for subjective

poverty content. While subjective poverty measures are not considered replacements for

objective poverty measures, their inclusion in “pulse”, “omnibus”, “crowdsourced” and

opinion polls can provide timely information on individuals’ self-assessments of poverty status.

Nevertheless, different survey models may have implications for results. Similarly,

experimental results show that small differences in question wording or changes in question

wording over time can have large effects on observed results.

Chapter 4 also examines several efforts made by statistical agencies worldwide to pivot

quickly and provide timely information during the COVID-19 pandemic. For example, Socio-

Economic Impact Assessments (SEIAs) were conducted across the UNECE region by 15

countries. The example underscores the transformative impact of the COVID-19 pandemic on

the landscape of subjective research and the need to adapt research methodologies to

effectively capture and understand subjective experiences, especially concerning poverty and

well-being assessments. It also demonstrated challenges in applying rapid collection

approaches, multi-nationally, in a quickly changing environment. In the conclusions, Chapter 4

underscores the need to continue to demonstrate, through empirical studies, the policy utility

of subjective poverty measures. As with other measures of poverty, subjective poverty is

concentrated among particular groups. A breakdown similar to the one

suggested in the UNECE publication Poverty Measurement: Guide to Data Disaggregation

should be used for disaggregation of subjective poverty. These would include age, sex,

disability status, migratory status, ethnicity, household type, employment status, tenure status

of the household, receipt of social transfers, educational attainment and degree of urbanisation.

Recommendation 4

NSOs and analysts should consider the possible impacts of survey mode, context

(framing), sampling methods and wording differences when analysing subjective

indicators such as subjective poverty.

Recommendation 5

NSOs and analysts should continue to demonstrate the utility of subjective poverty

measures, considering issues of overlap with objective poverty measures and policy

applications.

Recommendation 6

Subjective poverty measures should be disaggregated to at-risk groups, in a similar

fashion as recommended in UNECE’s guide to disaggregation.
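
A disaggregated indicator of this kind can be sketched as a weighted share computed within each group. The field names (weight, subjectively_poor, age_group) and the records below are hypothetical; an NSO would substitute its own survey weights, poverty flag and grouping variables.

```python
from collections import defaultdict

def rate_by_group(records, group_key):
    """Weighted share of respondents flagged as subjectively poor, by group."""
    poor = defaultdict(float)
    total = defaultdict(float)
    for r in records:
        total[r[group_key]] += r["weight"]
        if r["subjectively_poor"]:
            poor[r[group_key]] += r["weight"]
    return {g: poor[g] / total[g] for g in total}

# Hypothetical respondent records with survey weights.
records = [
    {"age_group": "18-64", "weight": 100.0, "subjectively_poor": True},
    {"age_group": "18-64", "weight": 300.0, "subjectively_poor": False},
    {"age_group": "65+", "weight": 50.0, "subjectively_poor": True},
    {"age_group": "65+", "weight": 150.0, "subjectively_poor": True},
    {"age_group": "65+", "weight": 200.0, "subjectively_poor": False},
]
rates = rate_by_group(records, "age_group")
```

The same function can be called with any of the grouping variables listed above, such as sex, disability status or household type, provided the field exists in the records.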

Appendix

Table A.1: Question Types Reported Being Asked by Country in UNECE (2021) Study

Columns: Country; Qualitative Categorical (Identification, Evaluation, Prediction); Money Metric (Evaluation); Total # of Subjective Poverty Questions; Other (Deprivation, Social Exclusion, Well-being)

Armenia 1 1 2 8

Austria* 1 1 2 3

Azerbaijan

Belarus 5 1 2 7 2

Belgium* 1 1 2

Bosnia and Herzegovina 1 1 1

Brazil 1 1 4

Bulgaria* 1 1 1

Canada 8 1 1 9

Colombia 1 4 2 1 7 9

Costa Rica 1 1 2

Croatia* 1 1 1 6

Cyprus* 1 1 2

Czech Republic*

Denmark* 2 1 2

Dominican Republic

Estonia* 1 1 1 8

Finland* 2 1 2 17

Georgia

Germany* 1 1 2

Hungary* 3 1 1 5 8

Ireland* 1 1 2

Israel 2 1 3

Italy* 1 1 2

Japan

Kyrgyz Republic 1 2 1

Latvia* 1 1 1

Lithuania* 2 1 3

Luxembourg* 1 1 2 6

Malta* 1 1 2

Mexico 1 1 1 1

Mongolia

Montenegro* 1 1 1

Netherlands* 3 1 2 4 12

New Zealand 1 1 1 10

Republic of North Macedonia* 1 1 2 10

Norway* 2 1 1

Portugal* 1 1 1

Republic of Moldova 2

Romania* 2 2 2 2

Russian Federation 2 1 3 4

Republic of Serbia* 1 1 1

Slovakia* 2 2 2 3

Slovenia* 1 1 1

Spain* 1 1 2

Sweden* 1 1 1

Switzerland* 2 1 3 1

Turkey 1 1 3 1

Ukraine 3 1 2 5 7

United States

Uzbekistan 1 1 1

Viet Nam 1 1 1

Total # of Countries 4 42 6 40 45 22

National Experimental Wellbeing Statistics, Liana Fox (United States Census Bureau)



National Experimental Wellbeing Statistics Version 1, released February 14, 2023

SEHSD Working Paper Number 2023-02 CES Working Paper Number 23-04

Adam Bee, Joshua Mitchell, Nikolas Mittag, Jonathan Rothbaum, Carl Sanders, Lawrence Schmidt, and Matthew Unrath

Abstract

This is the U.S. Census Bureau’s first release of the National Experimental Wellbeing Statistics (NEWS) project. The NEWS project aims to produce the best possible estimates of income and poverty given all available survey and administrative data. We link survey, decennial census, administrative, and commercial data to address measurement error in income and poverty statistics. We estimate improved (pre-tax money) income and poverty statistics for 2018 by addressing several possible sources of bias documented in prior research. We address biases from (1) unit nonresponse through improved weights, (2) missing income information in both survey and administrative data through improved imputation, and (3) misreporting by combining or replacing survey responses with administrative information. Reducing survey error substantially affects key measures of wellbeing: We estimate median household income is 6.3 percent higher than in the survey estimate, and poverty is 1.1 percentage points lower. These changes are driven by subpopulations for which survey error is particularly relevant. For householders aged 65 and over, median household income is 27.3 percent higher than in the survey estimate and for people aged 65 and over, poverty is 3.3 percentage points lower than the survey estimate. We do not find a significant impact on median household income for householders under 65 or on child poverty. Finally, we discuss plans for future releases: addressing other potential sources of bias, releasing additional years of statistics, extending the income concepts measured, and including smaller geographies such as state and county.

∗Send correspondence to [email protected] Bee: U.S. Census Bureau, [email protected]; Mitchell: U.S. Census Bureau, [email protected]; Mittag: CERGE-EI, [email protected]; Rothbaum: U.S. Census Bureau, [email protected]; Sanders: U.S. Census Bureau, [email protected]; Schmidt: MIT Sloan School of Management, [email protected]; and Unrath: U.S. Census Bureau, [email protected]

Any opinions and conclusions expressed herein are those of the authors and do not reflect the views of the U.S. Census Bureau. The Census Bureau has reviewed this data product to ensure appropriate access, use, and disclosure avoidance protection of the confidential source data used to produce this product (Data Management System (DMS) number: P-7524052, Disclosure Review Board (DRB) approval number: CDRB-FY23-SEHSD003-025).


1 Introduction

Accurately measuring household income and poverty is essential to understanding the nation’s

overall economic wellbeing. Previous studies suggest that measurement error stemming

from unit nonresponse, item nonresponse, and misreporting biases key official statistics

such as mean or median income and the official poverty rate. The direction of bias differs

among these sources of measurement error. Unit and item nonresponse have been found

to bias income up and poverty down (Rothbaum et al., 2021; Rothbaum and Bee, 2022;

Bollinger et al., 2019; Hokayem, Raghunathan and Rothbaum, 2022), while misreporting

can bias income down and poverty up (Bee and Mitchell, 2017; Meyer et al., 2021b; Larrimore,

Mortenson and Splinter, 2020). These previous papers document aspects of the overall

problem of survey error in isolation, so the overall impact of these sources of error on the

accuracy of survey estimates remains unclear.1 Important next steps are to study the joint

impact of these error sources, and to develop a comprehensive solution that addresses all

partial problems simultaneously. Doing so would provide survey users with the best possible

measure of income.

This paper summarizes the National Experimental Wellbeing Statistics (NEWS) Project, a

project to create the most accurate estimates of household income and poverty. The NEWS

project makes three unique contributions towards a more comprehensive solution to the

problem of measuring income accurately. First, we address as many sources of bias as we

can simultaneously, including unit and item nonresponse and underreporting in surveys as

well as the various challenges in administrative data such as measurement error, conceptual

misalignment, and incomplete coverage. Simultaneously addressing these error sources is

crucial, since they have been found to bias key statistics in different directions. Second, we

bring together all of the available survey and administrative data in order to overcome the

shortcomings of individual data sources. For example, we use 5 different sources of wage and

salary earnings, each of which captures earnings and jobs not reported in the others. Third,

we propose a model to combine survey and administrative earnings data given measurement

error in both sources, replacing ad hoc assumptions that have been used in prior work.2

1We discuss these existing approaches and how our methodology compares with them in section 2.4.

To demonstrate the importance of more accurate data, we estimate pre-tax money income

and poverty statistics for 2018, mirroring the Census Bureau’s annual income and poverty

report (Semega et al., 2019). Under our approach, median household income is 6.3 percent

higher than the survey-only estimate. The official poverty rate is 1.1 percentage points

lower than the survey-only estimate, with 9.4 percent fewer people in poverty.3 However,

these differences vary considerably across groups. Median household income is 27.3 percent

higher for householders aged 65 and older, 5.0 percent higher for those aged 55-64, and

not statistically different or lower for all other householder ages. Likewise, poverty is 3.3

percentage points lower for persons aged 65 and over (34.2 percent fewer people in poverty),

compared to 0.7 percentage points lower for those aged 18-64 (6.7 percent fewer people in

poverty), and not statistically different for children 17 and under.4
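
The paragraph above reports each poverty difference two ways: in percentage points and as a percent change in the number of people in poverty. For a fixed population the two are linked by the baseline rate, since the percent change in the count equals the percentage point change divided by the baseline rate. The sketch below backs out the baseline rates this relationship implies; the results are approximations affected by rounding in the published figures and are not values reported in the paper.

```python
def implied_baseline_rate(pp_reduction, pct_fewer_people):
    """Baseline poverty rate (in percent) implied by a percentage point drop
    and the matching percent drop in the number of poor, for a fixed population."""
    return pp_reduction / (pct_fewer_people / 100.0)

# (percentage point reduction, percent fewer people in poverty) from the text.
groups = {
    "all people": (1.1, 9.4),
    "65 and over": (3.3, 34.2),
    "18-64": (0.7, 6.7),
}
implied = {g: implied_baseline_rate(pp, pct) for g, (pp, pct) in groups.items()}
```

The calculation also shows why the same percentage point change can correspond to very different proportional changes across groups with different baseline rates.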

We find that combining survey responses and administrative records matters for the measured

income distribution, with different roles played by nonresponse and misreporting. At

the bottom of the income distribution, we find that weighting and imputation augmented

with administrative records decrease income at the lowest percentiles of the survey-response

only income distribution. This negative shift of the income distribution is more than offset,

however, by the additional income that administrative records report relative to surveys. We

compare the household income distribution with and without the administrative data and

find large effects across the distribution, from 17.1 percent more income at the 10th percentile,

to 10.3 percent more at the 25th, 6.8 percent more at the median, and 3.6 percent

more at the 75th. As a result, while the survey estimate of household income at the 90th

percentile is 12.5 times as large as at the 10th percentile, with the NEWS estimates, the ratio

is 11.5.

2More detail on the earnings measurement error model will be provided in a forthcoming companion paper, Bee et al. (2023).

3All comparisons are statistically significant at the 5 percent level unless otherwise noted.

4Estimates are shown for median household income by subgroup in Table 1 and Figure 1, for poverty by subgroup in Table 2 and Figure 2, and for inequality in Table 3.

In addition to the substantive differences summarized above, our analyses yield three key

methodological takeaways. First, to obtain an improved income measure, it is indeed necessary

to simultaneously address error sources such as nonresponse and misreporting. Our

combined nonresponse bias corrections (weighting and improved income imputation) generally

adjust the point estimates of income down and poverty up.5 Including administrative

wage and salary earnings to address underreporting, particularly when survey-reported

earnings are zero, shifts income up and poverty down. Addressing retirement income underreporting

(defined benefit pensions and defined contribution withdrawals) has the biggest

impact on household income across much of the distribution, echoing findings from Bee and

Mitchell (2017). For householders under 55 whose income comes predominantly from wage

and salary earnings, which is one of the best reported income sources in surveys, we find

limited differences in income and poverty estimates. However, for those 55 and over and

particularly for those 65 and over, who have more income in underreported sources (retirement,

interest, dividends, etc.), the increase in income due to the underreporting adjustment

is greater than the decline in income from the nonresponse bias correction.

A second key takeaway is that each data source has its own strengths and shortcomings,

making it difficult to produce accurate estimates of income and poverty when relying only

on a single source. As is well-established in the literature, survey data have a number

of limitations. For example, over 40 percent of all income is imputed in the CPS ASEC

(Hokayem, Raghunathan and Rothbaum, 2022), including 46 percent of wage and salary

earnings from a primary job. However, analyses which use administrative data alone are

not a panacea. Administrative sources can miss income as well – 5 percent of adults report

wage and salary earnings in the CPS ASEC but do not receive a W-2 (Bee, Mitchell and

Rothbaum, 2019). Likewise, 7 percent of occupied addresses in the 2018 CPS ASEC cannot

be linked to any available source of administrative data. Program eligibility requirements

often imply that certain regions or jobs are excluded from the administrative data.

5The differences in this paper are not generally statistically significant, however, as shown in Figure 3 Panel A.

A third key takeaway is that it is critical to incorporate multiple survey and administrative

data sources. Using multiple data sources allows us to combine their strengths and thereby

reduce the shortcomings we point out above. On the positive side, we find that for some

populations, a single data source can yield quite accurate estimates. Yet each single data

source also misses or contains substantial error for categories of gross income that are of

crucial importance to other subpopulations. Thus, improving measures of income for a wider

population requires combining multiple data sources. Overall, we find that a comprehensive

approach that leverages the strengths of each data source is required to construct the most

accurate estimates of poverty and inequality.

2 Income Measurement Challenges

The major challenge to estimating income is that we do not observe all the information that

we would like for all individuals.

2.1 Survey Income

With the survey data, there are several potential sources of missing data and measurement

error, such as:

1. Survey unit nonresponse - not all individuals or households respond to the survey,

which has been found to bias income up and poverty down (Rothbaum et al., 2021;

Rothbaum and Bee, 2022).

2. Survey item nonresponse - individuals who do respond may choose not to respond

to specific questions (a particular problem for income questions), which has been found


to bias income up and poverty down (Bollinger et al., 2019; Hokayem, Raghunathan

and Rothbaum, 2022).

3. Survey mis- and underreporting - income is not always reported accurately on

surveys and can be severely underreported for many income types, which has been

found to bias income down and poverty up (Bee and Mitchell, 2017; Rothbaum, 2015).

We refer to this as misreporting in the rest of the paper.

As Meyer and Mittag (2021) showed in decomposing bias in estimates of means-tested program

benefits, the various sources of measurement error can have biases of different signs and

magnitudes across different programs and surveys. Also, correcting for one source of bias

without addressing others does not necessarily reduce the overall bias in the estimates.
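
A small numerical illustration shows why a partial correction can backfire. The bias components below are invented for the example (expressed as percent of the true mean) and are not estimates from this paper; only their signs follow the directions described above.

```python
def total_bias(components):
    """Net bias when the individual bias components simply add up."""
    return sum(components.values())

# Invented components: nonresponse pushes income up, misreporting pushes it down.
biases = {
    "unit nonresponse": 2.0,
    "item nonresponse": 1.0,
    "misreporting": -3.5,
}

before = total_bias(biases)
# Correct misreporting only, leaving both nonresponse biases in place.
after_partial_fix = total_bias(
    {name: b for name, b in biases.items() if name != "misreporting"}
)
partial_fix_is_worse = abs(after_partial_fix) > abs(before)
```

In this stylized case the offsetting errors nearly cancel before any correction, so removing only one of them leaves the estimate further from the truth than the uncorrected figure.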

We address all of these sources of measurement error simultaneously, building on prior work

at the Census Bureau that addressed them separately. First, we create improved weights

to address survey unit nonresponse (extending Rothbaum et al. 2021 and Rothbaum and

Bee 2022). We use imputation to address survey item nonresponse (extending Hokayem,

Raghunathan and Rothbaum 2022). We combine survey and administrative data (including

replacing survey responses), which also helps address survey item nonresponse as well as

survey misreporting (extending Bee and Mitchell 2017).
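
To illustrate the general idea behind a nonresponse weighting adjustment, the sketch below scales respondents' base weights by the inverse of the weighted response rate in their adjustment cell. This is a textbook cell-based adjustment under the assumption that cell membership is known for every sampled unit; it is not the NEWS weighting method, and the field names and sample are hypothetical.

```python
from collections import defaultdict

def nonresponse_adjusted_weights(sample):
    """Return respondents with weights scaled so that each cell's respondents
    carry the full base weight of the cell (respondents plus nonrespondents).
    Assumes every cell contains at least one respondent."""
    total = defaultdict(float)
    responded = defaultdict(float)
    for unit in sample:
        total[unit["cell"]] += unit["base_weight"]
        if unit["responded"]:
            responded[unit["cell"]] += unit["base_weight"]
    return [
        {**unit, "weight": unit["base_weight"] * total[unit["cell"]] / responded[unit["cell"]]}
        for unit in sample
        if unit["responded"]
    ]

# Hypothetical sample: the "low" cell responds at half the rate of the "high" cell.
sample = (
    [{"cell": "low", "base_weight": 10.0, "responded": r} for r in (True, True, False, False)]
    + [{"cell": "high", "base_weight": 10.0, "responded": r} for r in (True, True, True, True)]
)
adjusted = nonresponse_adjusted_weights(sample)
```

Respondents in the cell with the lower response rate receive larger weights, so the weighted respondent sample again represents both cells in proportion to the full sample.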

2.2 Administrative Income

Replacing survey responses with administrative records does not fully address measurement

error concerns. Many of the same types of issues in survey data are also present in administrative

data, including:

1. Selection into administrative data - not all individuals, households, or firms may

be present in the administrative data due to how and why the administrative data

is collected. For example, many low-income individuals are not required to file a tax

return, meaning they may not be represented in tax data. And certain jobs are not


covered by unemployment insurance, meaning those jobholders are not included in

commonly used earnings data.

2. Administrative data “nonresponse” - some records may be absent from the administrative

data that should have been present. For example, although firms are

required to file a W-2 for nearly all workers, some may not for a variety of reasons such

as firm closure, or paying workers “under the table”, etc.

3. Administrative misreporting - even when an administrative record exists, it may

not be accurate. For example, “under-the-table” earnings, such as unreported tips or

underreported self-employment earnings, would result in underreporting in administrative

earnings.

4. Conceptual misalignment - in some cases the income concept measured by administrative

data does not match the concept we would like to measure. For example,

the W-2 information received by the Census Bureau does not include information on

employee pre-tax earnings used to pay health insurance premiums.6 For these workers,

W-2 earnings are effectively an “underreport” of gross earnings.

5. Incomplete data coverage - we may not have access to the data for specific individuals.

For example, state-provided data on earnings and means-tested program

participation are not available for all states.

These make it inappropriate to rely on administrative data alone. For example, selection into

administrative data can exclude subpopulations of interest, such as low-income households

which may be underrepresented in tax data. Larrimore, Mortenson and Splinter (2020)

created households using addresses from tax filings and information returns to estimate

poverty over time, addressing income underreporting in surveys. However, they could not

observe individuals and households that did not receive any information return or file taxes.

6Starting in 2012, Box DD on the W-2 reports the total cost of the employee’s health insurance premium, including the employer and employee contribution. Box DD is not currently available for this work.


Instead, they had to impute the presence and poverty status of an unknown number of

individuals per year, which they estimated at 4 to 6 million. Through random sampling

from the universe of residential addresses, surveys do not have the coverage gaps we see in

administrative data.7 For example, in the 2019 Current Population Survey Annual Social and

Economic Supplement (CPS ASEC), 7 percent of occupied housing units cannot be linked to

any administrative or commercial data. But thanks to information from survey responses,

we can generate improved weights, imputations, and income measures to better approximate

our target universe of individuals and households, even in the absence of administrative data

for some.

2.3 Addressing These Challenges

The best estimates of income and poverty would rely on both survey and administrative

data. Having different sources of information allows us to address shortcomings in each

source. For example, we use 5 separate sources of wage and salary earnings. These are (1)

W-2s, (2) the Detailed Earnings Record (DER) file from the Social Security Administration

(SSA), (3) Longitudinal Employer-Household Dynamics (LEHD) data reported by firms to

state unemployment insurance offices, (4) 1040 tax filings, and (5) survey responses. Survey

earnings can help with “nonresponse” in administrative data, as 5 percent of adults report

wage and salary earnings in the CPS ASEC but do not receive a W-2 (Bee, Mitchell and

Rothbaum, 2019). Some individuals with no W-2s also report wage and salary earnings on

their tax returns.

Given the possibility of misreporting in administrative data, we develop a measurement error

model for survey and administrative reports of wage and salary earnings. We use that model

to replace ad hoc assumptions about when to use survey or administrative earnings given

measurement error in both. We discuss that model in Section 4.3.1 and in more detail in a

forthcoming companion paper (Bee et al., 2023).

7The Master Address File, from which housing units are sampled, is discussed in Section 3.


Likewise, conceptual misalignment in one source can be addressed using information from

other sources. For example, while available W-2 data do not include employee pre-tax contributions

to health insurance premiums, LEHD earnings for the same job should. Workers

with survey-reported private health insurance coverage are 3 to 5 times more likely to have

LEHD earnings that exceed the W-2 amounts by 1-3 percent, 3-5 percent, 5-10 percent, and

10+ percent, shown in Table A1.8

However, incomplete data coverage makes it more difficult to measure gross earnings in

the administrative data. Many jobs are not covered by unemployment insurance, and are

excluded from the LEHD – for 2018, there are nearly 20 million more W-2 jobs than LEHD

jobs, shown in Table A2.9

LEHD and state-provided means-tested program data are also not available for some states.

We use imputation to address this source of incomplete data coverage, to correct for underreporting

of means-tested program receipt in surveys, and to estimate missing gross earnings

(given incomplete LEHD data), extending the work in Fox et al. (2022) and Hokayem,

Raghunathan and Rothbaum (2022).

An additional challenge in using linked survey and administrative data is selection into link-

age. Linkage rates vary by group, which can bias income estimates that include only linked

individuals (Bond et al., 2014), but if unlinked individuals are are also subject to survey

measurement challenges above, then income estimates are biased if we measure unlinked

8Note that the CPS ASEC variable indicates private coverage, but not necessarily whether that job was the source of that coverage, rather than another job or another individual’s job, such as from a spouse, partner, or other family member.

9Workers not covered by unemployment insurance include federal employees and those in various pri- vate sector occupations For example, Maryland’s Department of Labor lists the following jobs as exempt from unemployment insurance: barbers and beauticians, taxicab drivers, owner-operated tractor drivers in certain E and F classifications, maritime employment, election workers, church employees, clergy, cer- tain governmental employees, railroad employment, newspaper delivery, insurance sales, real estate sales, messenger service, direct sellers, foreign employment, other state unemployment insurance programs, work- relief and work-training, family members, hospital patients, student nurses or interns, yacht salespersons who work for a licensed trader on solely a commission basis, services of aliens who are students, scholars, trainees, teachers, etc., who enter the U.S. solely to pursue a full course of study at certain vocational and other non-academic institutions, recreational sports officials, home workers, and casual labor. Refer to https://www.dllr.state.md.us/employment/empfaq.shtml accessed 11/1/2022.


individuals’ incomes using survey responses only. We use weighting of households with all

adults linked conditional on their survey responses to create a representative sample of linked

individuals, extending Rothbaum et al. (2021) and Rothbaum and Bee (2022).

2.4 Relationship to Prior Research

This is not the first project to attempt to address shortcomings in survey data to estimate

improved income and poverty statistics.10 There have been several efforts to adjust survey

data for underreporting in the absence of linked administrative data, including from the

Congressional Budget Office (CBO), Bureau of Economic Analysis (BEA), and the Transfer

Income Model (TRIM) from the Urban Institute. In each case, researchers had to make

assumptions about underreporting that could not be verified without linked data, such as

whether underreporting is on the extensive or intensive margin, which households are more

likely to misreport, etc. If those assumptions are not correct, which is likely in the absence of

linked data, they risk imputing income and benefits to the wrong individuals and households,

introducing biases of unclear direction and magnitude.11

10There has been considerable work on measurement error in income data, as well as comparing survey income to administrative data. As far back as the 1970s, Kilss and Scheuren (1978) used CPS data linked to data from the Internal Revenue Service (IRS) and Social Security Administration (SSA) to evaluate survey income data. More recent examples include Abowd and Stinson (2013), Bee (2013), Benedetto, Stinson and Abowd (2013), Harris (2014), Bee, Gathright and Meyer (2015), Giefer et al. (2015), Hokayem, Bollinger and Ziliak (2015), Bhaskar et al. (2016), Chenevert, Klee and Wilkin (2016), Noon, Fernandez and Porter (2016), Bee and Mitchell (2017), Fox, Heggeness and Stevens (2017), O’Hara, Bee and Mitchell (2017), Abowd, McKinney and Zhao (2018), Benedetto, Stanley and Totty (2018), Bhaskar, Shattuck and Noon (2018), Brummet et al. (2018), Eggleston and Reeder (2018), Meyer and Wu (2018), Murray-Close and Heggeness (2018), Rothbaum (2018), Shantz and Fox (2018), Bee, Mitchell and Rothbaum (2019), Bollinger et al. (2019), Imboden, Voorheis and Weber (2019), Jones and Ziliak (2019), Eggleston and Westra (2020), Larrimore, Mortenson and Splinter (2020), Abraham et al. (2021), Eggleston (2021), Larrimore, Mortenson and Splinter (2021), Meyer and Mittag (2021), Rothbaum et al. (2021), Carr, Moffitt and Wiemers (2022), Fox et al. (2022), Hokayem, Raghunathan and Rothbaum (2022), Larrimore, Mortenson and Splinter (2022), McKinney and Abowd (2022), Moffitt et al. (2022), Moffitt and Zhang (2022), Rothbaum and Bee (2022), and others. For a more complete discussion of nonsampling error in income and poverty statistics, refer to Bee and Rothbaum (2019), which also discusses the challenges in addressing these issues and the research agenda that led to this project.

11For example, BEA’s approach scales up income on the intensive margin in some cases, risking imputing income to accurate reporters rather than correcting extensive margin misreporting, which is common for retirement income (Bee and Mitchell, 2017) and means-tested program benefits (Shantz and Fox, 2018; Meyer and Mittag, 2019). The CBO model imputes missing income and benefits on the extensive margin conditional on survey characteristics, but underreporting is often not well captured by the observable survey information


Similar work has been pursued under a separate project at the Census Bureau, the Com-

prehensive Income Database (CID, refer to Medalia et al. 2019), including Meyer and Wu

(2018), Meyer et al. (2021b), Meyer et al. (2021a), and Corinth, Meyer and Wu (2022). A

main focus of the CID project has been on addressing misreporting in income and means-

tested program benefits. We additionally address nonresponse bias, missing administrative

data, and model measurement error in survey and administrative earnings.

3 Data

We would like to use any available data that can help inform estimates of income, resources,

or wellbeing, broadly defined. This includes survey and decennial census data collected by

the Census Bureau, administrative data, and commercial data. The data could be useful to

directly measure resources, to model estimates of resources, to validate measures, to address

nonresponse, etc. In this section, we discuss each source of data, also shown in Table 4.

Figures 4 and 5 show how we put these data sources together to create the files we use to

generate the income and poverty estimates, which we discuss in Section 3.7.

3.1 Survey Data

Surveys collect information on many characteristics of individuals and households that are

not available or well-measured in administrative data for all or subsets of the population.

These include race, Hispanic origin, tenure (homeownership vs. renting), educational attain-

ment, household composition, and much more. Surveys also include information on income,

although we have considerable evidence on misreporting of income on surveys.

(Mittag 2019 and Fox et al. 2022). The TRIM model uses unlinked auxiliary data and program rules to impute missing benefits on the extensive margin. However, Shantz and Fox (2018) and Mittag (2019) show that the underreported program benefits may not be missing from households that appear to qualify for them either through the rules-based imputations or from matching to auxiliary data, with the caveat that income item nonresponse means that household income and program receipt may be less correlated as the regular survey imputations do not condition on administrative program data.


Survey operations also provide information that can be crucial for these estimates. First,

major surveys conducted by the Census Bureau are stratified random samples of addresses,

in which the occupancy status of housing units (vacant/occupied) is assessed as part of

the survey. This provides a sample of units in our target universe, occupied housing units,

and their sampling probability. In administrative records, it can often be unclear or even

impossible to identify the set of occupied units with no available data – i.e., households and

individuals that received no W-2 or other information return and did not file taxes or units

with no linked information because they are not the primary residence for a high-income or

high-wealth household. The unobserved units in administrative data may be more likely to be

at one end of the income distribution than the other – making their absence particularly

problematic when measuring inequality or hardship, such as poverty.

We use data from two household surveys. First, we use the Current Population Survey’s

(CPS) Annual Social and Economic Supplement (ASEC). The CPS ASEC is an

annual survey conducted from February to April each year as a supplement to the monthly

CPS. Respondents are asked social and demographic questions, as well as questions about

their income and resources in the prior calendar year. CPS ASEC data are available at

the Census Bureau from 1967 to the present. In 2019, approximately 95,000 addresses were

sampled for the CPS ASEC.12 It is the source of the official poverty measure produced by

the Census Bureau as well as widely cited measures of the household income distribution

(Semega et al., 2019). In Version 1, we estimate income and poverty statistics on the 2019

CPS ASEC sample for income in year 2018.

Second, we use the American Community Survey (ACS), which is available from 2005 to

the present. The ACS is an ongoing survey of more than 2 million respondent households each

year. Respondents are asked questions similar to (although generally less detailed than) those in the

CPS ASEC, particularly for income. Additionally, ACS respondents are asked about income

12Refer to the CPS ASEC technical documentation at https://www2.census.gov/programs-surveys/cps/techdocs/cpsmar19.pdf.


in the prior 12 months, rather than the prior calendar year as in the CPS ASEC.13 For Version

1 of this project, the ACS provides summary information by geography and occupation that

are used in our weighting model and earnings measurement error model.

Both the CPS and ACS use field representatives to assess the occupancy status of housing

units, the CPS as part of the Housing Vacancy Survey and ACS for estimates of vacancy

rates.14

3.2 Other Census Bureau Data

The Census Bureau has other data available on the nation’s people and households that we

use. First, we use data from the decennial census. This includes information on each

individual’s race, Hispanic origin, and age.

We also use information from the Master Address File (MAF).15 The MAF contains

continuously updated information of all known living quarters in the United States. The

MAF is used to select housing units for inclusion in household surveys, including the CPS

and ACS, as well as for decennial census operations. The MAF also includes housing unit

characteristics, such as whether addresses are in single-family or multi-family units.

We also use the Master Address File Auxiliary Reference File (MAF-ARF) which

links addresses in the MAF to individuals who reside there in each year. The MAF-ARF is

constructed from multiple administrative data sources, including from the IRS, Department

of Housing and Urban Development (HUD), and the U.S. Postal Service, among others.

Each of these other Census Bureau data sources provides information that can help us address

nonresponse bias and better estimate income and poverty statistics on representative samples

13ACS technical documentation is available at https://www.census.gov/programs-surveys/acs/technical-documentation.html and https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_general_handbook_2020.pdf.

14Refer to https://www.census.gov/topics/housing/guidance/vacancy-fact-sheet.html for a discussion of housing vacancy estimates in the Housing Vacancy Survey (from the CPS), ACS, and American Housing Survey.

15The specific file we use is the MAF extract file, or MAFx.


of individuals, families, and households.

3.3 Federal Administrative Data

The federal government data we use are provided primarily by the IRS and Social Security

Administration (SSA). The Census Bureau also has an agreement with the Department of

Health and Human Services (HHS) for data on the Temporary Assistance for Needy Families

(TANF) program from some states. That data will be discussed in Section 3.4, as TANF

data are also shared with the Census Bureau by individual partner state agencies.

3.3.1 IRS Data

From the IRS, we have the following data:

1. the Information Return Master File (IRMF) from 2005 to the present,

2. the universe of Form 1099-R returns on “Distributions From Pensions, Annuities,

Retirement or Profit-Sharing Plans, IRAs, Insurance Contracts, etc.” from 1995 to the

present,

3. the universe of Form W-2 returns on “Wage and Tax Statement” for all W-2 covered

jobs from 2005 to the present, and

4. the universe of Form 1040 tax filings every five years from 1969 to 1994, 1995, and

then each year from 1998 to the present.

The IRMF includes an indicator for each individual that received one of several information

returns in a given year as well as their address, including for Forms 1098, 1099-DIV, 1099-G,

1099-INT, 1099-MISC, 1099-R, 1099-S, SSA-1099, and W-2. The IRMF allows us to link

individuals to their addresses and is used in constructing the MAF-ARF. The IRMF does

not include any information on income amounts.

The 1099-R extracts provided by the IRS include information on amounts of defined-benefit


pension payments (including survivor and disability pensions) and withdrawals from defined-

contribution retirement plans. These extracts exclude 1099-R records corresponding to direct

rollovers between accounts.

The W-2 extracts provided by the IRS include select W-2 boxes, including wages and salary

net of pre-tax deductions for health insurance premiums and deferred compensation, as well

as the total amount of deferred compensation. This means that employee and employer

pre-tax contributions to health insurance premiums are not available in the W-2 data.

The 1040 extracts provided by the IRS include information on tax-unit wage and salary

income, gross rental income, gross Social Security income, taxable and tax-exempt interest

income, dividends, Adjusted Gross Income, and a constructed measure of Total Money

Income (TMI). TMI is the sum of taxable wage and salary income, interest (taxable and

tax-exempt), dividends, gross Social Security income, unemployment compensation, alimony

received, business income or losses (including for partnerships and S-corps), farm income

or losses, and net rent, royalty, and estate and trust income.16 The 1040 also includes

information on marital status through filing status and filer information and identifies up to

four dependents.
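As a minimal sketch of how the TMI components listed above add up, consider the following. The field names are hypothetical and are not the variable names in the IRS extracts; the dollar amounts are made up for illustration.

```python
# Hypothetical field names for the TMI components named in the text
# (tax year 2018 definition, i.e., excluding pensions and annuities).
TMI_COMPONENTS = (
    "taxable_wages",
    "taxable_interest",
    "tax_exempt_interest",
    "dividends",
    "gross_social_security",
    "unemployment_compensation",
    "alimony_received",
    "business_income",                # may be negative (losses)
    "farm_income",                    # may be negative (losses)
    "net_rent_royalty_estate_trust",
)


def total_money_income(tax_unit: dict) -> float:
    """Sum the TMI components for one tax unit, treating absent items as zero."""
    return sum(tax_unit.get(name, 0.0) for name in TMI_COMPONENTS)


# Illustrative tax unit with a business loss.
example_unit = {
    "taxable_wages": 42_000.0,
    "taxable_interest": 150.0,
    "dividends": 300.0,
    "business_income": -2_000.0,
}
tmi = total_money_income(example_unit)
print(tmi)
```

Because losses enter with a negative sign, TMI can fall below wage and salary income for a tax unit with business or farm losses.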

We use IRS data to address nonresponse bias and measurement error.

3.3.2 Social Security Administration (SSA) Data

From the SSA, we use the following data:

1. the Numerical Identification System (Numident) file,

2. extracts from the Detailed Earnings Records (DER),

3. several files from the Payment History Update System (PHUS), and

16Prior to tax year 2018, TMI also included total pensions and annuities. However, this was removed from TMI due to a change to income reporting on the Form 1040 and the regulations regarding data sharing between IRS and the Census Bureau.


4. several files from the Supplemental Security Records (SSR).

The Numident contains information on any individual to ever receive a Social Security Num-

ber (SSN), including their sex, date of birth, date of death, information on their citizenship

status, and their location of birth.

The DER contains job-level W-2 information that generally corresponds to the data provided

by IRS, but with the potential for additional cleaning and error correction from SSA as part

of their administration of the Social Security system. The DER also includes Social Security

covered self-employment earnings reported on the Form 1040 SE (if at least $400). Like

many SSA data sets, including some PHUS and SSR files, the DER is only available for

linked respondents from specific surveys and years.17

The PHUS contains monthly Old Age, Survivors, and Disability Insurance (OASDI) program

payment information from 1984 to the present. There are several PHUS files available to the

Census Bureau. One set of PHUS files includes OASDI recipients in 2020 and 2021, with one

record per address. There are also PHUS files for linked respondents from specific surveys

and years.

The SSR contains monthly Supplemental Security Income (SSI) payments for both federal

SSI payments and state payments administered by the SSA, from 1984 to the present. One

set of SSR files includes SSI recipients in 2020 and 2021, with one record per address.

There are also SSR files for linked respondents from specific surveys and years.

We use the survey-linked SSA data (DER, PHUS, and SSR) to address item nonresponse

bias and measurement error. The Numident and address-level SSA data (PHUS and SSR)

are useful for weighting to address nonresponse bias.

17Specifically, the DER includes respondents with an assigned Protected Identification Key (discussed in Appendix A) who can be linked to the Numident from the CPS ASEC in 1973, 1979, 1981-1991, 1994, and 1996-present; the Survey of Income and Program Participation (SIPP) in 1984, 1990-93, 1996, 2001, 2004, 2008, 2014, and 2018-present; and the ACS in 2019.


3.4 State Administrative Data

We use several data sets shared with the Census Bureau by state government agencies:

1. the Longitudinal Employer-Household Dynamics (LEHD) files,

2. data on Supplemental Nutrition Assistance Program (SNAP) participation,

and

3. data on Temporary Assistance for Needy Families (TANF) program participa-

tion.

3.4.1 LEHD

Under the LEHD program, states provide data on wage and salary earnings reported by

firms for the administration of the unemployment insurance (UI) program. Firms report

gross earnings to UI offices, so the LEHD should include non-taxable earnings that are not

reported on a Form W-2 for the same job such as pre-tax employee contributions for health

insurance premiums. However, coverage in the LEHD data we use is not complete, as many

government employees (such as federal civilian employees, postal workers, and Department

of Defense employees) are not covered by state UI benefits. Furthermore, some private-sector

employees, including those employed by religious organizations, are not covered by UI, and

are therefore not present in the LEHD data. Finally, data sharing agreements between a

state and the Census Bureau are not always available, resulting in LEHD earnings missing

for all jobs in specific states and years.18

LEHD data are useful for addressing nonresponse bias and misreporting.

18More information on the LEHD program and data is available at http://lehd.ces.census.gov/data/lehd-snapshot-doc/latest/, accessed 12/16/2022. While the LEHD program does receive data from the Office of Personnel Management (OPM) for many federal employees, those data are not part of the more recent years of data in the LEHD Interleave file used in this project.


3.4.2 SNAP

The Census Bureau has agreements with many states to receive data on SNAP participation,

although the available states vary by year.19 The SNAP data include benefits received for

each case as well as the individual members recorded in that SNAP case.

SNAP data are useful for addressing misreporting of other income items. SNAP is not

included in money income, but these data will be used to address misreporting of in-kind

benefits in future releases.

3.4.3 TANF

The Census Bureau also has agreements with many states to receive data on TANF partici-

pation. In addition to the state agency data, the Census Bureau also has data on TANF cash

assistance receipt from HHS. As with SNAP, the available states vary by year.20 TANF data

are also available by case (benefit amounts) with individuals in each TANF case recorded as

well.

TANF data are useful for addressing misreporting.

3.5 Commercial Data

We use information on home values from Black Knight, a third-party aggregator of property

tax records, which can be useful in correcting for selection into nonresponse on surveys.21

These data are useful for weighting to address nonresponse bias.

19For example, SNAP data are available for 17 states in 2018, 20 states in 2014, 16 states in 2010, and 6 states in 2006. In 2018, the states with available SNAP data are Arizona, Connecticut, Florida, Hawaii, Idaho, Indiana, Kentucky, Maryland, Mississippi, Montana, Nevada, New Jersey, New York, North Dakota, Tennessee, Utah, and Wyoming.

20TANF data are available for 36 states in 2018, 37 states in 2014, 36 states in 2010, and one state in 2006.

21Chapin et al. (2018) evaluated the use of similar data from CoreLogic in ACS production and discuss some strengths and limitations of this kind of data. One limitation is that the coverage varies by location.


3.6 Firm Data

We also use data on firm characteristics from the Longitudinal Business Database (LBD),

which is described in Chow et al. (2021). The LBD contains establishment-level information

on firm employment and payroll. The LBD is constructed from other data sources at the

Census Bureau, including the Business Register (BR), that are constructed using data from

the IRS and surveys of businesses, including the Economic Census.

Firm data are useful for addressing nonresponse bias, because they help predict survey

responses. They can also be used to address misreporting when there is measurement error

in both survey and administrative data, since firm information might help us diagnose error

in both data sources.

3.7 Linkage and File Construction

To make use of all of these data, we link them to create two main files: (1) the Address File

and (2) the Person File, with linkages made at the following levels:

• Individual - using Protected Identification Keys (PIKs),

• Address - using Master Address File identifiers (MAFIDs),

• Job - using PIKs and Employer Identification Numbers (EINs) and by the job matching

procedure described below,

• Firm - using the LBD firm identifiers (LBDFID) and Employer Identification Numbers

(EINs), and

• Geography - by state, county, and census tract.

The data linkage process for the individuals and addresses is straightforward. We match

observations using unique identifiers attached to each person (PIK) and address (MAFID)

in each file. The assignment of these identifiers is discussed in Appendix A. To link a


survey respondent to any administrative data, we must assign that respondent a PIK using

the personally identifiable information (PII) on the survey. If a survey respondent is not

assigned a PIK, they cannot be linked to any administrative data.

As discussed in Section 2, we have many sources of wage and salary earnings information.

Three of them are available at the job level – W-2s, the DER, and LEHD. However, linking

LEHD and W-2 jobs is not trivial.22 In the simplest case, a firm files a W-2 and reports

the job to the UI office with the same EIN. We can link these “direct matches” by PIK

and EIN. However, some firms do not file their W-2s and UI reports under the same EIN.

We use individual and job-level information from the universe of W-2 and LEHD jobs to

create indirect matches of firm identifiers across datasets. We discuss this process in detail

in Appendix A.3 with an example in Figure A1.
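The direct-match step described above can be sketched as a set comparison on (PIK, EIN) pairs. This is an illustration only: the tuple layout and the identifiers are hypothetical, and the indirect matching procedure of Appendix A.3 is not reproduced here.

```python
def classify_jobs(w2_jobs, lehd_jobs):
    """Classify jobs by whether the (PIK, EIN) pair appears in both sources.

    w2_jobs, lehd_jobs: iterables of (pik, ein) tuples (hypothetical layout).
    Returns a dict of three sets: direct matches, W-2 only, and LEHD only.
    """
    w2 = set(w2_jobs)
    lehd = set(lehd_jobs)
    return {
        "direct": w2 & lehd,      # same person and same EIN in both sources
        "w2_only": w2 - lehd,     # candidates for indirect matching
        "lehd_only": lehd - w2,   # candidates for indirect matching
    }


# Made-up identifiers: person P2 has a job under E20 on the W-2 and E21 in the LEHD.
w2_jobs = [("P1", "E10"), ("P2", "E20"), ("P3", "E30")]
lehd_jobs = [("P1", "E10"), ("P2", "E21"), ("P4", "E40")]
result = classify_jobs(w2_jobs, lehd_jobs)
print(sorted(result["direct"]))
```

In this toy example, the two P2 records are the kind of residual pair that an indirect match would examine, since the same firm may report under different EINs in the two systems.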

After direct and indirect linkage, of the 264 million jobs, we find 82 percent of jobs matched

directly by PIK-EIN, 6 percent matched indirectly, 10 percent unmatched from W-2s, and

3 percent unmatched from the LEHD (shown in Table A2). We use this linked job infor-

mation to better estimate gross earnings at the job and person level for use in our income

estimates.

Because firms do not necessarily correspond to unique EINs, we use information from the

redesigned Longitudinal Business Database (LBD) to link workers (through EINs in the job

data) to unique firms (Joint Committee on Taxation, 2022; Chow et al., 2021), which we

discuss in Appendix A.4.

We create the Address File by linking the sample of occupied (non-vacant) housing units in

the survey to the aforementioned sources of administrative, survey, census, and commercial

data, as shown in Figure 4. By starting with addresses, we have information from all occupied

units, including respondents and nonrespondents. In the address file, we do not use any

information from survey responses other than whether the unit responded. This file is used

22As the DER is sourced from W-2s, linking DER and W-2 jobs is generally simple.


with the Person File to construct the weights that address selection into our sample and

selection into linkage, issues discussed in Section 2.

We then create the Person File by linking survey respondents to administrative data, as

shown in Figure 5. In combination with the weights created using this and the Address File,

the Person File is used for all of the subsequent steps in generating the income and poverty

estimates.

The Address and Person Files are discussed in more detail in Appendix B.

4 Methodology

In this section, we describe the steps needed to take the data described in Section 3 through

to estimating income and poverty statistics, shown in Table 5. We have categorized the steps

into three groups: (1) weighting, (2) imputation, and (3) estimation.

4.1 Weighting

Our analysis sample is the set of households that respond to the CPS ASEC with all survey-

adults assigned a PIK.23 We use weighting to address several measurement challenges dis-

cussed in Section 2, particularly survey unit nonresponse and selection into linkage. Weight-

ing is particularly useful when all of the information is missing for a subset of units – in our

case we have no survey information for nonrespondents and no administrative information

for individuals that cannot be assigned a PIK.
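The analysis sample definition can be expressed as a simple filter. The sketch below assumes a hypothetical person-level layout (keys 'hh_id', 'age', and 'pik') and treats survey-adults as those aged 15 and over, as defined in the footnote to this section.

```python
SURVEY_ADULT_MIN_AGE = 15  # income questions are asked of everyone 15 and over


def analysis_households(persons):
    """Return IDs of households in which every survey-adult has a PIK.

    persons: iterable of dicts with hypothetical keys 'hh_id', 'age',
    and 'pik' (None when no PIK could be assigned).
    Households with no survey-adults at all are not returned.
    """
    all_linked = {}
    for p in persons:
        if p["age"] < SURVEY_ADULT_MIN_AGE:
            continue  # children do not affect the linkage requirement
        linked = p["pik"] is not None
        all_linked[p["hh_id"]] = all_linked.get(p["hh_id"], True) and linked
    return {hh for hh, ok in all_linked.items() if ok}


persons = [
    {"hh_id": 1, "age": 40, "pik": "A"},
    {"hh_id": 1, "age": 10, "pik": None},   # unlinked child, household still qualifies
    {"hh_id": 2, "age": 35, "pik": "B"},
    {"hh_id": 2, "age": 16, "pik": None},   # unlinked survey-adult, household excluded
]
sample = analysis_households(persons)
print(sample)
```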

To address survey unit nonresponse, we use information from the linked administrative and

decennial census data which is not observed in the survey. This information is available

for all linkable households regardless of whether they responded, as is the geographic sum-

mary information. We weight respondent households so that the weighted estimates for these

23We define survey-adults as those 15 and over as the survey income questions are asked for all individuals 15 and over in a household.


linked characteristics match the estimates obtained using all occupied households given their

sampling probability in the CPS while the person-level weights also match to external pop-

ulation controls by state. This should address survey unit nonresponse, following prior work

in the ACS (Rothbaum et al., 2021) and the CPS ASEC (Rothbaum and Bee, 2022).
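To illustrate the idea of reweighting respondents so that weighted totals match targets computed from all occupied units, the sketch below uses raking (iterative proportional fitting). Raking is a stand-in chosen for brevity and is not necessarily the procedure used here, which is described in Appendix C. The dimensions, categories, and target totals are made up.

```python
def rake(units, base_weights, targets, n_iter=50):
    """Iterative proportional fitting (raking).

    units: list of dicts mapping a dimension name to the unit's category.
    base_weights: starting weights (e.g., inverse sampling probabilities).
    targets: {dimension: {category: target weighted total}}.
    Returns a list of adjusted weights, one per unit.
    """
    w = list(base_weights)
    for _ in range(n_iter):
        for dim, totals in targets.items():
            # Current weighted total in each category of this dimension.
            current = {}
            for u, wi in zip(units, w):
                current[u[dim]] = current.get(u[dim], 0.0) + wi
            # Scale each unit so the category total hits its target.
            w = [wi * totals[u[dim]] / current[u[dim]] for u, wi in zip(units, w)]
    return w


# Four respondent households; targets come from all sampled occupied units.
units = [
    {"has_w2": "yes", "state": "MD"},
    {"has_w2": "yes", "state": "VA"},
    {"has_w2": "no", "state": "MD"},
    {"has_w2": "no", "state": "VA"},
]
base = [100.0, 100.0, 100.0, 100.0]
targets = {
    "has_w2": {"yes": 300.0, "no": 200.0},
    "state": {"MD": 300.0, "VA": 200.0},
}
weights = rake(units, base, targets)
print([round(x, 2) for x in weights])
```

The administrative characteristic ('has_w2' here) is observed for respondents and nonrespondents alike, which is what allows the targets to be computed on the full sample of occupied units.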

To address selection into linkage, we extend that work by estimating statistics from survey

responses in the respondent sample and reweight households with all adults linked (our

analysis sample) so that the weighted estimates from the analysis sample simultaneously match:

(1) the linked administrative characteristics from the sample of occupied units, (2) the survey-

response estimates from the respondent sample, and (3) the external population controls by

state. This step should address selection into linkage, extending the prior work that was

focused only on survey estimates and survey unit nonresponse.

Weighting also helps address selection into administrative data and administrative data

nonresponse. The survey frame contains geographic summary information at the address

level for each occupied household and survey responses for respondent households that we

cannot link to administrative data, whether at the individual or address level.

For a more complete discussion of weighting, including the underlying assumptions, imple-

mentation details, and statistics validating the model, refer to Appendix C.

4.2 Imputation

Many of our measurement challenges are not the result of blocks of information missing

completely for defined subsets of observations. For example, an individual that does not

respond to the survey earnings question (46 percent of all workers) or has a missing LEHD

job may have all the other information (e.g., other survey responses, W-2 job earnings, etc.)

that we need to estimate income and poverty. For these measurement challenges, imputation

is a better approach to fully utilize the information that is available (Raghunathan et al.,

2001).


There are four sets of variables that we impute:

1. Survey earnings,

2. LEHD job-level gross earnings,

3. Means-tested program benefits (TANF and SNAP), and

4. Administrative income for tax nonfilers in certain categories (unemployment compen-

sation, interest, and dividends)

In the 2019 CPS ASEC, 46 percent of individuals with earnings in the survey had their

primary job earnings imputed.24 We impute earnings for these individuals (and the individ-

uals with missing earnings from other jobs/employers) conditional on the survey and linked

administrative data. These imputed values reflect the distribution of differences between

survey and administrative earnings, conditional on the observed information. This allows

us to address potential measurement error in administrative earnings for survey nonrespon-

dents.
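One simple way to make imputed survey earnings reflect the distribution of survey-administrative differences is a within-cell hot deck on the log gap between the two reports. The sketch below is a simplified stand-in under that assumption, not the imputation model described in Appendix D; the record keys and the 'cell' grouping are hypothetical.

```python
import math
import random


def impute_survey_earnings(records, rng):
    """Fill missing survey earnings by drawing a donor's log(survey/admin) gap.

    records: list of dicts with hypothetical keys 'cell' (a grouping of
    observed characteristics), 'admin' (administrative earnings, > 0), and
    'survey' (reported earnings > 0, or None when missing).
    Assumes every cell with a missing value has at least one donor.
    """
    gaps = {}
    for r in records:
        if r["survey"] is not None:
            gaps.setdefault(r["cell"], []).append(math.log(r["survey"] / r["admin"]))
    completed = []
    for r in records:
        out = dict(r)
        if out["survey"] is None:
            gap = rng.choice(gaps[out["cell"]])  # random donor from the same cell
            out["survey"] = out["admin"] * math.exp(gap)
            out["imputed"] = True
        else:
            out["imputed"] = False
        completed.append(out)
    return completed


records = [
    {"cell": "a", "admin": 30_000.0, "survey": 33_000.0},  # donor: reports 10% above admin
    {"cell": "a", "admin": 50_000.0, "survey": None},      # nonrespondent
]
completed = impute_survey_earnings(records, random.Random(0))
print(round(completed[1]["survey"], 2))
```

Drawing the gap rather than the earnings level keeps the imputed value anchored to the nonrespondent's own administrative earnings while preserving the observed disagreement between the two sources.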

Likewise, we are missing LEHD job-level gross earnings for 8 percent of individuals’ highest

earning job.25 There are additional jobs where W-2 earnings exceed LEHD earnings or

the disagreement between them is sufficiently large that we impute gross earnings out of

concerns about data quality. As discussed in Section 2, we would like gross earnings from

all jobs because of the conceptual misalignment between available W-2 earnings and the

gross earnings we would like to measure. However, gross earnings are not always available because of

incomplete data coverage (some states missing from the LEHD), selection into administrative

data (some jobs not covered by unemployment insurance and thus missing from the LEHD),

administrative data “nonresponse” (missing jobs in the LEHD that should be present), and

administrative data misreporting.
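The conditions under which job-level gross earnings are imputed can be sketched as a flagging rule. The text does not give the cutoff for a "sufficiently large" disagreement, so the 'max_ratio' parameter below is a hypothetical placeholder, as is the function name.

```python
def needs_gross_imputation(w2, lehd, max_ratio=2.0):
    """Return True when LEHD gross earnings would be imputed for a job.

    w2, lehd: annual earnings for the same job; None when the report is missing.
    max_ratio: hypothetical cutoff for a 'sufficiently large' disagreement.
    """
    if lehd is None:
        return True   # no LEHD record for this job
    if w2 is not None and w2 > lehd:
        return True   # W-2 wages exceed reported gross earnings
    if w2 is not None and w2 > 0 and lehd / w2 > max_ratio:
        return True   # gap between the two reports is implausibly large
    return False


flags = [
    needs_gross_imputation(40_000.0, None),       # missing LEHD job
    needs_gross_imputation(40_000.0, 39_000.0),   # W-2 exceeds LEHD
    needs_gross_imputation(40_000.0, 42_000.0),   # plausible pre-tax deductions
    needs_gross_imputation(40_000.0, 100_000.0),  # large disagreement
]
print(flags)
```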

Following Fox et al. (2022), we also use imputation for missing means-tested program benefits

24Refer to Table 6 for rates of missing data for imputed income items.

25If we order jobs from highest to lowest earning in the job-level administrative data.


due to incomplete data coverage.

Finally, we impute specific administrative income items for individuals that do not file taxes

using parameters estimated on more detailed data by Rothbaum (2023). 85 percent of

survey-adults can be linked to a 1040 tax filing (refer to Table A4). For those individuals,

the Total Money Income measure includes many income items that are underreported on

surveys such as unemployment insurance compensation, interest, and dividends, even if

not all items are available separately. However, we observe only whether non-filers received

several information returns, including Forms 1099-G, 1099-INT, and 1099-DIV in the IRMF.

From these we have information on whether they received UI compensation, interest income,

and dividends, respectively. Each of these income sources is significantly underreported on

surveys (Rothbaum, 2015). Rothbaum (2023) worked with more detailed data available

under a separate agreement between the Census Bureau and IRS, for limited use. In that

data, the 1099-G, 1099-INT, and 1099-DIV data are available, including income amounts.

Rothbaum (2023) released coefficients that can be used to impute these amounts for nonfilers

conditional on survey responses and the administrative data used in this project. We use

that information to impute these underreported income items for nonfilers. This imputation

addresses selection into administrative data (tax filing) and survey misreporting of these

specific income types.

For a more complete discussion of imputation, including the underlying assumptions, imple-

mentation details, and statistics on the imputed values, refer to Appendix D.

4.3 Estimation

With the Person File, weights, and imputations, we have complete data for all the inputs

used in the NEWS estimates. The final step in processing is putting that data together to

estimate income and poverty.


4.3.1 Earnings Measurement Error Model

Earnings represent 80 percent of all income (Rothbaum, 2015). Measurement error in survey

and administrative earnings, therefore, merits particular attention.26

Although survey wage and salary earnings are relatively well reported when compared to

external benchmark aggregates (Rothbaum, 2015), work with linked microdata has identified

systematic differences between administrative records and survey responses.27 This work has

generally found survey wage and salary earnings are “mean-reverting” relative to adminis-

trative reports; i.e., low earners in the administrative data tend to report higher earnings on

surveys, and high earners in the administrative data tend to report lower earnings in surveys.

There is also extensive margin disagreement between survey and administrative records –

about 10 percent of working-age individuals have earnings in one data source but not the

other (Bee, Mitchell and Rothbaum 2019).

Some papers in the survey misreporting literature assumed the administrative records were

free of error (Bound and Krueger 1991, Bound et al. 1994, Pischke 1995, for example).28

However, more recent work considers the possibility that administrative data also contain

measurement error, such as unreported earnings. Abowd and Stinson (2013) consider a

model in which both survey and administrative reports for a given job may contain error.

Under their approach, “true” earnings are a weighted average of the two reports, but they

leave the selection of the proper weight to future work. Using Danish administrative data,

Bingley and Martinello (2017) cannot rule out that survey income reports have only classical

measurement error given the presence of measurement error in administrative records. We

26Some of the discussion in this section follows Bee and Rothbaum (2019) closely.

27Alvey and Cobleigh (1975), Duncan and Hill (1985), Bound and Krueger (1991), Bound et al. (1994), Pischke (1995), Bollinger (1998), Bound, Brown and Mathiowetz (2001), Roemer (2002), Kapteyn and Ypma (2007), Gottschalk and Huynh (2010), Meijer, Rohwedder and Wansbeek (2012), Abowd and Stinson (2013), Murray-Close and Heggeness (2018), Bee, Mitchell and Rothbaum (2019), Imboden, Voorheis and Weber (2019), Jenkins and Rios Avila (Forthcoming), and many others have studied wage and salary earnings.

28In some cases, the authors restrict their analysis to a subset of workers for which the assumption is more likely to be valid. For example, Pischke (1995) compares surveys of employees of a particular firm against firm reports of the same workers’ earnings. Bound and Krueger (1991) specifically remove occupations they suspect may have under-the-table earnings.

do not assume that measurement error is only present in surveys. Under-the-table earnings

are, by definition, not reported to the IRS, which can bias income estimates for particular

subgroups of the population (such as by occupation). In the absence of a “truth set” of data,

it is an open question how much of this disagreement is due to misreporting on surveys or

measurement error in the administrative data.29

We have several separate reports of administrative earnings. In Table 7, we show summary

statistics on the number of individuals assigned a PIK with any wage and salary earnings

reported from all possible combinations of W-2s, the DER, and the LEHD. We also show the

probability that survey respondents report non-zero survey earnings for each combination of

administrative wage and salary sources. The vast majority of individuals with earnings in

one source have earnings in all three.30

From the three separate administrative job-level wage and salary earnings sources (including

gross earnings imputed as discussed in Section 4.2), we construct our job-level estimate of

gross earnings. We aggregate these job-level earnings to estimate total administrative wage

and salary earnings for each individual. This gives a measure of total administrative wage

and salary earnings (ya), which we then use in the model with our final post-imputation

total survey wage and salary earnings (ys) discussed in Appendix D.

29Compounding the challenge, it is not always the case that different sources of administrative data agree. Bee, Mitchell and Rothbaum (2019) found a 0.4 percentage point difference in the estimated poverty rate if survey earnings are replaced using administrative earnings data from SSA compared to data from IRS, both of which are based on the same W-2s.

30Table 7 also has information on how the W-2 earnings information available in the DER differs from the IRS W-2 information. In Panel B, we focus on individuals we can and cannot link to the Numident (a proxy for having a valid SSN). If individuals have W-2 and DER earnings, they are basically always present in the Numident and are very likely to report wage and salary earnings in the survey (87 percent). However, if individuals are in the Numident and have W-2 earnings, but no DER earnings, then they are very likely not to report wage and salary earnings in the survey. This suggests that there is measurement error in the W-2 file for these cases that is not in the cleaned, SSA-provided DER data. We therefore default to the DER information in these cases of no job-level administrative earnings. However, if individuals are not in the Numident and have W-2 earnings, but no DER earnings, they are very likely to report wage and salary earnings on the survey (85 percent). In these cases, we conclude the DER is missing earnings for those without SSNs that are correctly present in W-2s. For these individuals, we default to the W-2 information of positive job-level earnings. This is a clear example of how administrative data are not necessarily free of error and different sources of administrative data covering the same concept (wage and salary earnings) from the same tax information do not necessarily agree.

The survey and administrative earnings can differ on the extensive or intensive margin. With

extensive margin disagreement, where earnings are present in one but not both sources, we

default to the earnings report that is non-zero. In other words, we assume that any survey

report in the absence of administrative earnings reflects under-the-table income or a reporting

or linkage issue in the administrative data. We also assume that any administrative earnings

without a corresponding survey earnings report reflect under-/misreporting on the survey.

These are both assumptions that we plan to examine in future work.
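The extensive-margin default described above can be sketched as a small function. This is an illustration only: the function name, and the convention of returning `None` when both sources report earnings (so that the case is left to the intensive-margin model), are assumptions and not the authors' implementation.

```python
from typing import Optional


def resolve_extensive_margin(y_survey: float, y_admin: float) -> Optional[float]:
    """Apply the extensive-margin default to one person's wage and salary reports.

    Returns the non-zero report when exactly one source has earnings,
    0.0 when neither source has earnings, and None when both sources
    report earnings (a case left to the intensive-margin model).
    """
    has_survey = y_survey != 0
    has_admin = y_admin != 0
    if has_survey and not has_admin:
        # Treated as under-the-table income or an administrative reporting/linkage issue.
        return y_survey
    if has_admin and not has_survey:
        # Treated as under- or misreporting on the survey.
        return y_admin
    if not has_survey and not has_admin:
        return 0.0
    return None  # Both non-zero: not an extensive-margin disagreement.
```

For example, `resolve_extensive_margin(12000.0, 0.0)` keeps the survey report, while `resolve_extensive_margin(25000.0, 27000.0)` returns `None` because both reports are present.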

The other difference we observe is intensive margin differences in reporting, where the re-

ported values are not equal. Figure 6 shows a scatterplot of survey versus administrative

reports of wage and salary earnings.31 Several important features of the data are visible in

the figure. First, survey and administrative earnings generally agree, reflected in the clus-

tering around the 45◦ line. However, regressing survey on W-2 wage and salary earnings

(in logs) yields a slope of 0.8, which is consistent with mean reversion in survey earnings

reports.32
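To see why a log-log slope below one is consistent with mean reversion, the following sketch simulates survey reports that are pulled toward the mean and then regresses them on administrative earnings. All data are simulated; the mean, the noise levels, and the reversion factor of 0.8 are assumptions imposed by construction, not estimates from the linked data.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


random.seed(0)
MEAN_LOG_EARNINGS = 10.5   # assumed, for illustration only
ASSUMED_REVERSION = 0.8    # imposed by construction, not estimated from real data

# Simulated log administrative earnings.
log_admin = [random.gauss(MEAN_LOG_EARNINGS, 0.8) for _ in range(5000)]
# Survey reports are pulled toward the mean and carry independent noise.
log_survey = [
    MEAN_LOG_EARNINGS
    + ASSUMED_REVERSION * (a - MEAN_LOG_EARNINGS)
    + random.gauss(0.0, 0.3)
    for a in log_admin
]

slope = ols_slope(log_admin, log_survey)
print(f"slope of log survey on log admin earnings: {slope:.2f}")
```

In this setup the estimated slope recovers the imposed reversion factor up to sampling noise. The same slope could also arise from classical noise in the administrative measure, which is why the slope alone does not identify the source of the error.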

In our forthcoming companion paper, Bee et al. (2023) define a model that parameterizes

the measurement error in ya and ys relative to the unobserved true earnings (y) for intensive

margin disagreement. We provide a concise summary of the model here.

Since there can be measurement error in both survey and administrative earnings reports

and we do not have data on “true” earnings for anyone, we must impose assumptions on

the data that are untestable or can only be tested indirectly. For example, we believe that

administrative earnings could be underreported either because some income is missing (such

as some portion of tips) or some jobs may be missing. Likewise, we do not assume that

administrative earnings are free of classical measurement error, or noise, even if we believe

that noise may be of lower variance than the noise in survey earnings reports.

31The figure is reproduced from O’Hara, Bee and Mitchell (2017) as more recent disclosure rules limit the possibility of releasing such detailed information of individual survey and administrative earnings values.

32For example, if we assumed no measurement error in W-2 earnings, then a slope that is less than one could indicate mean-reverting, non-classical measurement error in survey responses.

These assumptions provide some structure to our earnings measurement error model. The

model setup consists of two earnings measures: (a) survey earnings, which are condition-

ally unbiased but have potentially downward-biased conditional variances, and (b) admin-

istrative earnings records, which can be conditionally biased but have accurate conditional

variances.33

While these assumptions on survey versus administrative records are not directly testable,

they were chosen to be both consistent with prior literature on measurement error in earnings

and to be consistent with previous measurements of average income. Under our assumptions,

the survey would be unbiased for average income measures but may have trouble accurately

assessing income in the tails of the distributions. On the other hand, relying only on admin-

istrative records may generate significant biases in the estimation of income for populations

with income typically not captured by those data. Combining these two sources allows us

to mitigate both these problems simultaneously.

With our assumptions on survey and administrative earnings from above, Bee et al. (2023)

define a model in a Mean Squared Error (MSE) framework with a set of parameters on the

random noise and relative mean reversion in survey report, ys, and administrative record, ya,

conditional on other observed characteristics, x. The model also defines a “survey confidence”

(SC) measure that is a function of two sets of terms. The first is a measure of the estimated

bias in the administrative data by comparing E(ys|x) to E(ya|x). The second set of terms

compares the relative variance of the random noise in the two reports conditional on x. We

33To further motivate the relevance of these assumptions, consider estimating earnings for auto mechanics as a group. Assumption (a) would imply that if you asked auto mechanics to report what they earned on a survey, some would over-report and some would under-report, but you would still recover an unbiased estimate of average earnings. At the individual level, however, these mechanics might not remember their exact earnings and so might report an average of their earnings from prior years, such that variation across survey reports would not reflect true variation in earnings for that year. Assumption (b), by contrast, implies that administrative records would fail to generate a correct average for auto mechanic earnings, presumably due to the prevalence of under-the-table payments. Under assumption (b), administrative data better capture variation across individual-level earnings, such that a mechanic whose W-2 earnings were twice as large as another mechanic’s would be expected to have actually earned twice as much in that year. This would be satisfied if, for example, all auto mechanics reported 50 percent (or any fixed percent) of their income to the IRS.

select the survey report if the squared bias term exceeds the difference in the variance terms;

that is, in the MSE framework, if the estimated administrative bias outweighs the benefit of the

administrative data’s relatively lower noise.

The model is only identified and possible to estimate with an assumption about the degree

of mean reversion in survey reports relative to administrative reports. This mean reversion

parameter, κ (or “kappa” in tables and figures in this paper), cannot be estimated, and must

be assumed because true earnings, y, are never observed. If κ = 1, there is no mean reversion

in the survey relative to the administrative data. We assume greater mean reversion as κ

decreases from 1. With a given κ, we can estimate the SC measure for each individual

conditional on his or her x characteristics, which would reflect the model’s “confidence” by

comparing the bias and variance terms in an MSE framework. We use this SC measure in

our decision rule to select the survey or administrative wage and salary earnings report —

if SC > 0, we select the survey report.34
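The decision rule can be illustrated with a deliberately simplified sketch. The exact functional form of SC is defined in Bee et al. (2023) and is not reproduced in this paper, so the formula below (squared administrative bias minus the excess noise variance of the survey) is an assumption based only on the verbal description above. It omits the role of κ and the additional features of the decision rule, and the function and argument names are hypothetical.

```python
def survey_confidence(mean_survey, mean_admin, var_noise_survey, var_noise_admin):
    """Assumed simplified SC for one group of people with characteristics x.

    mean_survey and mean_admin stand in for E(ys|x) and E(ya|x);
    the variance arguments stand in for the conditional noise variances.
    """
    squared_bias = (mean_survey - mean_admin) ** 2            # estimated administrative bias, squared
    variance_difference = var_noise_survey - var_noise_admin  # extra noise in the survey report
    return squared_bias - variance_difference


def select_report(sc):
    """Decision rule from the text: use the survey report when SC > 0."""
    return "survey" if sc > 0 else "admin"


# A large gap between the conditional means favors the survey report.
large_gap = survey_confidence(30_000, 20_000, 4e7, 1e7)
# A small gap does not offset the survey's extra noise.
small_gap = survey_confidence(30_000, 29_000, 4e7, 1e7)
print(select_report(large_gap), select_report(small_gap))
```

Because the inputs depend only on x, everyone with the same characteristics receives the same choice of source, regardless of their individual reports.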

We select the “best” wage and salary earnings report for individuals based on their observable

characteristics x, but not conditional on their actual survey or administrative reports. This

is in contrast with Meyer et al. (2021b), which takes the maximum of survey-reported and

administrative earnings in at least some cases. In other words, we take survey reports for

people whose characteristics suggest that their survey reports are better according to the

SC measure than their administrative reports. Bee et al. (2023) discuss potential limitations

and extensions of this approach to incorporate the actual earnings reports and additional

information, such as longitudinal earnings histories, to improve our estimates of earnings

given survey and administrative reports.

Misclassification of wages versus self-employment earnings further complicates efforts to rec-

oncile multiple earnings reports. If individuals report wage and salary earnings on the

34Bee et al. (2023) discuss the implementation details of the estimation and additional features of our decision rule in the case when we determine that E(ys|x) < E(ya|x) with some confidence for a given individual.

survey but self-employment earnings on their tax returns, it’s not clear whether those rep-

resent two separate sources of income or the same income reported in different categories.

Misclassification appears to be a common issue. Only 35 percent of individuals with pos-

itive administrative self-employment earnings report any self-employment earnings on the

survey and less than 50 percent of the survey self-employed have positive self-employment

earnings in the administrative data (Abraham et al., 2021). At this time, we generally defer

to the administrative data when there is disagreement about the source of earnings (wage

and salary vs. self-employment) or if self-employment is reported in both survey and administrative

data. Addressing earnings misclassification and self-employment earnings misreporting is an

important avenue for future research and for improving our income estimates.

In Table A8, we summarize the possible combinations of survey and administrative reports

of wage and salary and self-employment earnings and show which we use in our income

estimates. The measurement error model discussed in this section is used for 53 percent

of adults35 and for 74 percent of individuals with any reported earnings in either source.

Another 39 percent of adults had no survey or administrative earnings or reported earnings

in one source, but not the other. Given that we default to the source with reported earnings

under extensive margin disagreement, that leaves about 8 percent of adults or 12 percent

of individuals with earnings in either source for whom we ignore survey reported wage and

salary earnings and use only administrative data due to potential misclassification or other

data issues.

In Table A9, we show the share of individuals whose survey earnings would be used for various

κ mean-reversion parameter values (from the set of people listed as using the measurement

error model in Table A8). The share varies from 6 percent (κ = 0.7) to 31 percent (κ = 1,

no survey-report mean reversion). For the NEWS estimates, we select κ = 0.9 as it implies

35In this context, we define adult as people aged 15 and above who are asked the CPS ASEC earnings questions.

a relatively modest level of mean reversion and selects the survey wage and salary earnings

report 21 percent of the time. However, we assess robustness to alternative values of κ in

Section 5.2.

Given our chosen survey mean reversion parameter, Table 8 reports the share of individuals

whose survey earnings were used as part of our measurement error model (as a share of

workers from Table A8 for whom the measurement error model was used). Overall, we use

survey earnings for 21 percent of workers. The rate at which survey earnings are used varies

by age, race, occupation, and industry. For example, survey earnings are used less often for

Black workers and younger (18-24) and older (55+) workers. However, survey earnings are

used for 59 percent of workers in the construction industry.

4.3.2 Income Replacement

In this section, we discuss the final step – combining the survey and administrative data and

replacing particular survey income components with their counterparts in the administrative

data in order to estimate each survey respondent’s money income. We use separate processes

for filers and nonfilers. There is more income information available for tax filers, but some

of it is only available at the tax unit, but not the individual, level. Table A10 summarizes

the income information available for filers and nonfilers.

For tax filers, we start with Total Money Income (TMI) constructed from their 1040s, which

is the sum of taxable wage and salary income, interest (taxable and tax-exempt), dividends,

alimony received, business income or losses (including from partnerships and S-corps), farm

income or losses, net rent, royalty, and estate and trust income, unemployment compensation

and gross Social Security benefits (as noted in Section 3.3.1).

For wage and salary earnings, TMI includes taxable wage and salary earnings reported on the

1040. This amount will understate true earnings if gross earnings are greater than taxable

earnings, for example, if individuals have deferred compensation or use pre-tax earnings to

pay health insurance premiums. It will also understate earnings if filers underreport their

true earnings to the IRS. Therefore, we replace the wage and salary earnings component

of TMI with our survey or job-level administrative earnings according to the rules shown

in Table A8 and discussed in Section 4.3.1. We also replace 1040-reported Social Security

income, as we are more confident in the data quality of the SSA data than in the gross 1040

amounts, which may not be well-reported in tax returns (particularly for non-taxable Social

Security income).

For retirement income, we cannot distinguish defined contribution (DC) plan withdrawals

from defined benefit (DB) pensions in the 1099-R data.36 In the CPS ASEC, DC withdrawals

are only counted as income for people aged 59 and above. We therefore follow that convention

and include 1099-R retirement income for all individuals aged 59 and older. For those under

59, we include the 1099-R income if they reported pension or annuity income on the survey.

We add this retirement income to TMI.
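The age convention for 1099-R retirement income can be written as a short rule. This is a sketch under the assumptions stated in the paragraph above; the function and argument names are hypothetical.

```python
DC_WITHDRAWAL_AGE = 59  # age threshold stated in the text


def retirement_income_to_add(age: int, amount_1099r: float,
                             reported_pension_on_survey: bool) -> float:
    """Return the 1099-R amount to add to TMI for one person.

    Aged 59 and older: always include the 1099-R amount.
    Under 59: include it only if pension or annuity income was reported on the survey.
    """
    if age >= DC_WITHDRAWAL_AGE or reported_pension_on_survey:
        return amount_1099r
    return 0.0
```

For instance, a 45-year-old with a 1099-R distribution but no survey-reported pension or annuity income would have nothing added under this rule.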

Finally, we add several income components that are not taxable. From administrative

sources, we add SSI and TANF and from the survey, we add educational assistance, fi-

nancial assistance, workers’ compensation, and veterans benefit payments. For filers, that

gives us our adjusted TMI, which we use in the income and poverty estimates.
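The filer calculation can be summarized as "start from 1040 TMI, swap out two components, then add what the 1040 total lacks." The sketch below follows that description; the field names are hypothetical, and the arithmetic is an illustration and not the production code.

```python
from dataclasses import dataclass


@dataclass
class FilerIncome:
    tmi_1040: float              # Total Money Income built from the 1040
    wages_1040: float            # taxable wage and salary earnings on the 1040
    social_security_1040: float  # gross Social Security reported on the 1040
    selected_wages: float        # survey or job-level administrative earnings (Table A8 rules)
    social_security_ssa: float   # Social Security from SSA data
    retirement_1099r: float      # 1099-R income after the age rule
    ssi: float                   # from administrative sources
    tanf: float                  # from administrative sources
    survey_nontaxable: float     # educational and financial assistance, workers' comp, veterans benefits


def adjusted_tmi(f: FilerIncome) -> float:
    """Adjusted TMI for a filer: replace two 1040 components, then add the rest."""
    base = f.tmi_1040 - f.wages_1040 - f.social_security_1040
    replaced = f.selected_wages + f.social_security_ssa
    added = f.retirement_1099r + f.ssi + f.tanf + f.survey_nontaxable
    return base + replaced + added


example = FilerIncome(
    tmi_1040=60_000, wages_1040=50_000, social_security_1040=0,
    selected_wages=54_000, social_security_ssa=0,
    retirement_1099r=0, ssi=0, tanf=0, survey_nontaxable=1_000,
)
print(adjusted_tmi(example))
```

In the example, gross earnings of 54,000 replace taxable earnings of 50,000 and 1,000 of survey-reported nontaxable income is added.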

For nonfilers, we must add up the available components individually, since we do not have

a 1040 TMI amount. To get the nonfiler equivalent of adjusted TMI, we start with wage

and salary and self-employment earnings as indicated in Table A8. From administrative

data sources, we add Social Security income (PHUS), retirement income (from the 1099-R

following the same rules for filers as noted above by age), SSI (SSR), and TANF (state

data). We add UI compensation, interest, and dividends imputed using the parameters

estimated on the complete 1099-G, 1099-DIV, and 1099-INT data (Rothbaum, 2023). From

the survey, we add rent and royalty income, educational assistance, financial assistance,

36We will apply and extend the work in Bee and Mitchell (2017) to characterize individual withdrawals as defined benefit or defined contribution in future work.

workers’ compensation, and veterans benefit payments. The sum of these amounts represents

our best estimate of adjusted TMI for nonfilers, which we use in the income and poverty

estimates in the next section.

5 Results

5.1 NEWS Estimates

Table 1 and Figure 1 compare the NEWS estimates for median household income in 2018 to

the survey estimates released in Semega et al. (2019).37 Across all households, the NEWS

estimate for median household income was 6.3 percent higher ($67,170 vs. $63,180). Median

household incomes were also higher for nearly all subgroups shown. The main exceptions

were by age of householder. Pooled together, median household income for households under

age 65 was not statistically different (the point estimate was 0.1 percent lower), whereas households

65 and older had 27.3 percent greater median household income ($55,610 vs. $43,700). For

households aged 55-64, the difference was 5.0 percent ($72,430 vs. $68,950). For all age groups

below 55, the estimated differences were either negative or not statistically different from zero.
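The percent differences quoted in this paragraph follow directly from the dollar figures, as the short check below shows. The index values 399.0 and 369.8 are taken from the footnote on converting to 2021 dollars; the function names are illustrative.

```python
CPI_2021 = 399.0  # index values quoted in the footnote on 2021 dollars
CPI_2018 = 369.8


def percent_difference(news: float, survey: float) -> float:
    """Percent by which the NEWS estimate exceeds the survey estimate."""
    return (news - survey) / survey * 100


def to_2021_dollars(amount_2018: float) -> float:
    """Convert a 2018-dollar amount to 2021 dollars using the quoted index values."""
    return amount_2018 * CPI_2021 / CPI_2018


for label, news, survey in [
    ("all households", 67_170, 63_180),
    ("65 and older", 55_610, 43_700),
    ("55-64", 72_430, 68_950),
]:
    print(label, round(percent_difference(news, survey), 1))
```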

Figure 7 shows estimates from the 10th to 95th percentiles of the household income distribu-

tion overall and by race and Hispanic origin, age of householder, and educational attainment.

Overall, income increased more in proportional terms at the bottom of the distribution than

at the top. This is particularly true for age 65 and over households, for which NEWS house-

hold income was 31 percent higher at the 25th percentile, 20 percent higher at the 75th

percentile, and 15 percent higher at the 90th percentile.

Comparisons between NEWS and survey estimates for poverty are shown in Table 2 and

Figure 2. Overall, poverty was 1.1 percentage points lower than in the survey estimate,

equivalent to 9.4 percent fewer people in poverty. As with income, poverty was much lower

37All estimates are in 2018 dollars. To adjust to 2021 dollars using the R-CPI-U-RS as in official Census Bureau publications, multiply each income estimate by 399.0/369.8 = 1.079.

for the 65 and older population. We estimate a 3.3 percentage-point lower poverty rate

and 34.1 percent fewer people in poverty. There were no groups for which poverty was

statistically higher with the NEWS estimates. However, we did not find a statistically

significant difference in poverty for Black individuals, children, residents of the Midwest,

those outside of Metropolitan Statistical Areas, those with a disability, and those with some

college education.

Finally, in Table 3, we compare NEWS estimates for inequality statistics to the survey

estimates, including for income shares, the Gini index, and various percentile ratios.38 For

shares of income, we find a decrease in the share of income in the 2nd to 4th quintile and an

increase in the share of income in the top quintile and particularly the top 5 percent. We

estimate an increase in the Gini coefficient from 0.459 to 0.476. This is likely coming from

no top coding and higher extreme income values in the administrative data relative to the

survey, despite the larger increase in income at lower percentiles of the income distribution

shown in Figure 7, Panel A.39 However, consistent with that figure, we find declines in the

percentile ratio estimates (90/10, 90/50, and 50/10). For example, in the survey responses,

household income at the 90th percentile is 12.5 times as large as at the 10th percentile. With

the NEWS estimates, the ratio is 11.5.
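The Gini coefficient can rise while percentile ratios do not because the Gini depends on the whole distribution, including extreme top values, whereas the 90/10 ratio ignores everything above the 90th percentile. The sketch below illustrates this with made-up, unweighted data; actual estimates would use survey weights, which are omitted here for brevity.

```python
import statistics


def gini(incomes):
    """Unweighted Gini coefficient for non-negative incomes."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def percentile_ratio(incomes, upper=90, lower=10):
    """Ratio of two percentiles, e.g. the 90/10 ratio."""
    cuts = statistics.quantiles(incomes, n=100)  # 99 cut points; cuts[k - 1] is the k-th percentile
    return cuts[upper - 1] / cuts[lower - 1]


# Hypothetical data: replacing the top value mimics removing a top code.
base = list(range(1, 1001))
with_extreme_top = base[:-1] + [100_000]

print(round(gini(base), 3), round(gini(with_extreme_top), 3))
print(percentile_ratio(base), percentile_ratio(with_extreme_top))
```

In this example the single extreme value raises the Gini coefficient but leaves the 90/10 ratio unchanged.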

5.2 Robustness to Alternative Uses of Earnings Data

Figure A5 compares NEWS estimates of household income to estimates using alternative

combinations of survey and administrative wage and salary earnings. In Panel A, we show

how income varies under different rules for using earnings when the survey and administrative

38One important area of future research is how to address potential data issues that affect inequality, including how well our sample captures income at the far right tail of the distribution and how to address administrative data issues (like implausible extreme values) that might bias inequality statistics. We note this when discussing our future plans in Section 6. This will affect statistics such as income shares and the Gini coefficient that condition on the entire income distribution, but have less of an impact on statistics such as percentile ratios.

39Survey income top codes vary by income item, but generally do not exceed $1.1 million for a given income source.

data disagree at the extensive margin, whether any earnings are present. We compare four

scenarios to the NEWS estimates (with ya for administrative earnings and ys for survey

earnings): (1) use ya unless ya = 0 and ys ≠ 0, (2) use ya, even if ya = 0 and ys ≠ 0, (3) use ys

unless ys = 0 and ya ≠ 0, and (4) use ys, even if ys = 0 and ya ≠ 0. Scenarios (1) and (2) give

priority to administrative earnings and (3) and (4) give priority to survey earnings. If we

rely on a single source even when it is zero and the other is not (scenarios (2) and (4)), income

declines substantially, particularly at lower income levels. If we use administrative earnings

whenever they are nonzero (scenario (1)), the household income point estimates are generally

lower than the NEWS estimates, although most of the differences are not statistically

significant. If we use survey earnings whenever they are nonzero (scenario (3)), the household

income point estimates are lower everywhere, but the differences are only statistically

significant in the tails of the distribution.
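The four extensive-margin scenarios can be stated compactly in code. This sketch applies each rule to a single pair of reports; the function name and signature are hypothetical.

```python
def earnings_under_scenario(y_s: float, y_a: float, scenario: int) -> float:
    """Wage and salary earnings used under robustness scenarios (1) to (4).

    y_s is the survey report and y_a is the administrative report.
    """
    if scenario == 1:
        # Administrative priority, falling back to the survey when only it is non-zero.
        return y_s if (y_a == 0 and y_s != 0) else y_a
    if scenario == 2:
        # Administrative earnings only, even when they are zero.
        return y_a
    if scenario == 3:
        # Survey priority, falling back to administrative data when only it is non-zero.
        return y_a if (y_s == 0 and y_a != 0) else y_s
    if scenario == 4:
        # Survey earnings only, even when they are zero.
        return y_s
    raise ValueError("scenario must be 1, 2, 3, or 4")
```

Scenarios (2) and (4) discard the only available report whenever the prioritized source is zero, which is why they lower measured income most at the bottom of the distribution.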

To summarize, how we handle extensive margin disagreement substantially affects our income

estimates, as does whether we prioritize survey or administrative earnings. Compared to

just using administrative earnings (when nonzero), the measurement error earnings model does not

have a substantial impact on household income overall, despite using survey earnings for

21 percent of the individuals the model was used on. In Figure A5 Panel B, we estimate

the household income distribution for alternative κ/survey mean-reversion parameters in

the earnings measurement error model. As κ varies from 0.7 to 1, the share of individuals

whose survey earnings are used changes from 6 to 31 percent. Despite this, and while

there are statistically significant differences between the NEWS estimates (κ = 0.9) and

estimates with other κ, there are few economically meaningful differences in the household

income estimates. For example, none of the alternative κ values yields a statistically significant

difference in median household income, and the point estimates range from -0.05

percent to 0.03 percent different from the NEWS estimate. At the 95th percentile, the estimates

range from -0.46 percent to 0.89 percent different from the NEWS estimate (only the 0.89

percent difference, for κ = 0.7, is statistically significant).

However, the choice of how to combine survey and administrative earnings could matter

considerably more, shown in Panel C of Figure A5. We add another possible decision rule,

which is to take the maximum of the two reports. This approach might be reasonable if one

thinks all misreporting in both survey and administrative data is underreporting, although

that does not seem consistent with the noise in survey reports around administrative wage

and salary earnings we observe in Figure 6.40 Taking the maximum of reported wage and

salary earnings would vastly increase measured household income across the distribution.

Across the percentiles plotted in Figure A5, the income estimate using the maximum rule

would be 13.5 percent greater than the NEWS estimate, on average.

5.3 Impact of Different Processing Steps on Income and Poverty Estimates

The NEWS estimates reflect several bias correction steps, including reweighting for non-

response, reweighting for linkage to administrative data, imputing to address nonrandom

nonresponse, replacement of survey responses with administrative income information (in-

cluding observed and imputed TANF and gross earnings), and the earnings measurement

error model to select survey or administrative earnings. In Figure 3, we decompose the ad-

justments to show the impact of each of these steps on the distribution of household income.

In Panel A, we show the weighting and survey imputation steps compared to the survey es-

timates, as these steps use administrative data to adjust for bias in survey-only information

(the weights and imputed earnings). In Panel B, we show the impact of using administrative

data (as discussed in Section 4.3.2) and the earnings measurement error model compared to

the adjusted survey estimates from Panel A. In other words, Panel A illustrates the effect of the

survey-only adjustments and Panel B shows the effect of the final two steps after accounting

40Meyer et al. (2021b) take the maximum of survey and administrative earnings (total earnings, not just wage and salary) at least in some cases. However, they argue their estimates of extreme poverty are not affected by this because in most cases both the survey and the administrative earnings measures exceed their extreme poverty thresholds when they disagree on the intensive margin.

for the survey-only adjustments.

The weighting steps lower income across most of the distribution by 1 to 2 percent.41 Re-

placing the survey earnings imputations (and accounting for uncertainty through multiple

imputation) lowers the point estimates at the bottom of the distribution, consistent with

the selection into response observed by Bollinger et al. (2019) in the tails and results in

confidence intervals that are wider on average.

In Figure 3 Panel B, we show the impact of the final two steps, income replacement and the

earnings measurement error model, compared to the estimate after survey earnings impu-

tation from Panel A. We compare the household income distribution with and without the

administrative data and find large effects across the distribution, from 17.1 percent at the

10th percentile, to 10.3 percent at the 25th, 6.8 percent at the median, and 3.6 percent at the

75th. Panel B also shows the impact of the earnings measurement error model and the use

of survey earnings, which has a minimal impact on household income.42 Panel C shows the

overall comparison between the NEWS and survey estimates.43

Figure A6 shows the same decomposition by survey adjustments (Panel A) and adminis-

trative income replacement and measurement error model (Panel B) for the subgroups in

Table 1. Figure A7 does the same for poverty. In both, it is generally the case that the

survey adjustments move point estimates for median household income down and poverty

up, but generally the differences are not statistically significant. The administrative income

replacements move income up and poverty down for most subgroups as well.

41This is slightly different than Rothbaum and Bee (2022), which found no statistically significant differences across the distribution with an average point estimate of -0.23. However, we use more data, particularly contemporaneous rather than lagged 1040 income in the NEWS project, which may reflect selection into response that was not captured in that paper using data available during the regular CPS ASEC production schedule.

42We discuss how alternative uses of survey earnings could have had a large impact in the next section.

43The same information by age of householder (under 65 and 65 and over) is available in the Appendix in Figure A2.

5.4 Impact of Different Income Types on Income and Poverty Estimates

Finally, we assess how specific administrative income components affect the household income

distribution and poverty. To do so, we start with the NEWS income estimates and replace

each administrative income item one by one (not sequentially or cumulatively) with its survey

counterpart and compare each statistic after the replacement to the NEWS estimate. The

results are shown for income in Figure 8 and poverty in Figure 9.
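The one-at-a-time replacement can be sketched as follows. Each administrative item is swapped for its survey counterpart while all other items stay at their NEWS values, so the effects are neither sequential nor cumulative. The component names and dollar amounts are hypothetical, and the sketch works on a single unit's income; the statistics in the paper are computed over the full distribution.

```python
def one_at_a_time_totals(news_components, survey_components, admin_items):
    """Total income after replacing each administrative item, one at a time.

    Returns a dict mapping each replaced item to the resulting total,
    with every other component held at its NEWS value.
    """
    baseline = sum(news_components.values())
    totals = {}
    for item in admin_items:
        totals[item] = baseline - news_components[item] + survey_components[item]
    return totals


# Hypothetical component values for one household.
news = {"wages": 40_000, "retirement": 9_000, "social_security": 12_000, "interest": 500}
survey = {"wages": 38_000, "retirement": 4_000, "social_security": 11_000, "interest": 300}

totals = one_at_a_time_totals(news, survey, ["retirement", "social_security", "interest"])
print(totals)
```

Comparing each entry of `totals` with the NEWS baseline isolates the contribution of that one income type.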

For income, we make several replacements: (1) interest and dividends, (2) retirement income,

including DC withdrawals and retirement, survivor, and disability pensions, (3) Social Se-

curity and SSI, and (4) wage and salary earnings.

For interest and dividends, we make three replacements: 1) replace administrative interest

income with survey interest income, including the survey measure of interest (and other returns)

on retirement accounts, 2) replace administrative interest income with survey interest income,

excluding the retirement account interest, and 3) replace administrative dividends with sur-

vey dividends, with detail shown in Figure A3 Panel A. If we include interest on retirement

accounts (as is the case in the survey income estimate), we get more income across the dis-

tribution than using administrative income (which does not include this interest). Because

we already count withdrawals from these same retirement accounts as income, this risks

double counting the same income, which is why we exclude it from the NEWS estimate.

If we replace interest or dividends excluding this interest from retirement accounts, we see

slightly lower income across the distribution. Together, interest and dividend replacement

with survey responses lowers income by 1.3 percent at the 25th percentile and 0.5 percent

at the 75th percentile, shown in Figure 8.

Next, we look at transfer income, including Social Security (OASDI), SSI, and TANF income,

shown in detail in Figure A3 Panel B. If we just replace SSI income with survey responses, we

observe increases in income at the bottom of the distribution, primarily because of misclassification

of Social Security and SSI, effectively double counting Social Security for individuals

that reported Social Security income as SSI. If we replace Social Security only, we observe

big declines in income at the bottom and smaller declines higher in the income distribution.

If we replace both together, we observe slightly smaller declines at the bottom because we

are preserving the misclassified income (SSI reported as Social Security on the survey, for

example). Replacing TANF with survey responses results in small declines in income that

are only significantly different at a handful of points. Replacing both Social Security and

SSI together lowers income by 1.0 percent at the 25th percentile, but the difference is not

statistically significant at the 75th percentile, shown in Figure 8.

Figure 8 also shows the impact of replacing retirement, survivor, disability, and pension

income (retirement income, from Form 1099-R) with the corresponding survey items. Even

for overall income, the retirement income replacement has the biggest impact across much

of the income distribution, including 8.7 percent at the 25th percentile and 4.1 percent at

the 75th percentile.

As shown in Figure 9, overall poverty is higher when using survey reports for interest and

dividends. It is much higher if we use survey-reported retirement income. Likewise, replacing

administrative with survey earnings has a large effect on poverty, particularly if we ignore

positive administrative earnings when the survey reports are zero.

6 Release and Future Research

6.1 Transparency and Data Availability

An integral goal of the NEWS project is to be as transparent and open as possible about the data we

use, how we clean them, and how we combine them to generate the NEWS income, poverty,

and resource estimates. Clarity and transparency are especially important in this context, as

there are many decisions about how to clean, process, and combine survey and administrative

data that can have major effects on the results. These choices can be relatively opaque and

“in-the-weeds” for even a well-informed outsider. For example, using the maximum of survey

and administrative income reports, as shown in Figure A5 Panel C, would drastically bias

our income and poverty estimates in a way that is not consistent with the survey reporting

noise in Figure 6. Transparency about our methods, code, and estimates is required for

readers to understand the implications of those kinds of detailed data choices.
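A small simulation illustrates why taking the maximum of two reports pushes income estimates upward. It is a sketch under simplifying assumptions that are not taken from the NEWS data: both the survey and the administrative report are modeled as the true value plus independent, symmetric, mean-zero noise, and the function name `simulate_max_bias` and all parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_max_bias(n=20000, noise_sd=5000.0, seed=0):
    """Compare the mean of true income with the mean of max(survey, admin)
    and the mean of the simple average of the two reports."""
    rng = random.Random(seed)
    true_vals, max_vals, avg_vals = [], [], []
    for _ in range(n):
        true_income = rng.uniform(20000, 80000)
        # Assumption: each report is unbiased with independent Gaussian noise.
        survey = true_income + rng.gauss(0, noise_sd)
        admin = true_income + rng.gauss(0, noise_sd)
        true_vals.append(true_income)
        max_vals.append(max(survey, admin))
        avg_vals.append((survey + admin) / 2)
    return mean(true_vals), mean(max_vals), mean(avg_vals)


if __name__ == "__main__":
    true_mean, max_mean, avg_mean = simulate_max_bias()
    print(f"true: {true_mean:.0f}  max: {max_mean:.0f}  average: {avg_mean:.0f}")
```

Under these assumptions the maximum systematically selects the report with the larger positive error, so its mean exceeds the true mean even though each report is unbiased on its own, while the simple average stays close to the truth.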

As such, we commit to making all of the code and as much of the data as we are permitted

available to researchers through the Federal Statistical Research Data Center (FSRDC) system.44 We

also commit to making the code publicly available, with only those edits required

by the rules on the disclosure of code to abide by Titles 13 and 26 and our agreements with

data providers.

With each run of the NEWS code, we also plan to log any changes to input extracts so we

can track any changes to input data (such as data provided by the IRS or an updated version

of a survey file) that may affect our estimates. We also use git, a software version control

system, to ensure that the code that generated the results in this paper (or any future paper

with updated data, code, and methods) can be replicated.45
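One way to log changes to input extracts between runs is to record a checksum for each input file and compare it with the checksums from the previous run. The following is a hypothetical sketch of that idea, not the logging used in the NEWS code: the function names `file_digest` and `changed_inputs` and the JSON manifest layout are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_inputs(paths, manifest_path):
    """Compare current input digests with the stored manifest, rewrite the
    manifest, and return the sorted list of inputs that are new or changed."""
    manifest_path = Path(manifest_path)
    previous = {}
    if manifest_path.exists():
        previous = json.loads(manifest_path.read_text())
    current = {str(p): file_digest(p) for p in paths}
    changed = sorted(p for p, d in current.items() if previous.get(p) != d)
    manifest_path.write_text(json.dumps(current, indent=2, sort_keys=True))
    return changed
```

Committing the manifest alongside the code in git would tie each set of estimates to both a code version and a specific set of input files.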

We also have written documentation for nearly all the files and functions involved in loading

and cleaning the data, creating the address and person extracts, implementing the reweight-

ing, imputation, and earnings measurement error model, generating the final person and

tax unit income variables, and estimating income and poverty. While no documentation is

perfect, we have endeavored to be as detailed as possible in this documentation, detailing

what each section of code is doing, including references to particular line numbers. This is

44Subject to the constraints of our data agreements with the various state and federal agencies and commercial data providers.

45Up to the limit of what is possible in the software we use. Unfortunately, there are functions we currently use, such as Stata’s rmcoll function to remove collinear variables from a regression, that do not necessarily remove the same variables even when run with the same random seed. The exact set of variables kept can then affect the results from subsequent steps, such as LASSO regression feature selection. A goal for future releases is to remove our dependence on any function that has this property, as we would like to ensure that a rerun of the code with the same data and initial seeds generates exactly the same estimates.

in addition to the regular commenting provided within the code itself.

6.2 Future Plans

This release represents version 1.0 of the NEWS project. There are many aspects of this

work that we were not able to include in this release and have left for future work. In this

section we discuss our goals for version 2.0 and beyond.

First, we have estimated income and poverty in a single year, 2018, as a proof of concept

and first step in this work. We plan to expand this to include more years, both earlier years

and years up to the present. This will introduce additional challenges. Some administrative

data are not available before a specific year. For example, the Census Bureau currently only

has access to the universe of W-2 earnings starting in 2005. Likewise, not all administrative

data are available in time for estimates of income in the prior year. For example, we might

get data from SSA or state agencies with a lag of a year or more. Creating historical or

preliminary estimates in the absence of complete data is an important direction for future

research.

Second, we have only estimated income and poverty statistics at the national level. In

the future we plan to extend the estimates to smaller geographic units, including states,

counties, and possibly census tracts. However, to do so would require changes to how the

estimates are generated. First, we would likely move to the ACS as the main source of

survey information for subnational estimates. However, the ACS has less detailed income

information, which makes this work more challenging and would require our using a different

approach to estimating various income sources. For example, the ACS does not have separate

survey reports of interest, dividends, rental income, unemployment compensation, workers’

compensation, etc., because these items are reported as part of questions that ask about

several income items simultaneously. Therefore, it will be difficult to know whether the

respondent was also reporting another type of income that is not well-covered by available

administrative data. In the long term, we may even move beyond the survey sample (while

using survey information in the process) to better estimate statistics for small areas using

the available administrative, decennial census, and commercial data.

Third, we have generated estimates only for pre-tax money income, as measured in the Cen-

sus Bureau’s annual income and poverty release (Semega et al., 2019). However, there is

considerable interest in how in-kind benefits, taxes, and credits affect measures of material

wellbeing. We plan on expanding the notions of resources we measure as well as the set

of wellbeing and deprivation statistics we report. For example, we could measure the distri-

bution of disposable income, disposable income plus the cash value of some (or all) in-kind

transfers, improved measures of compensation that include employer matches to retirement

contributions and employer contributions to health insurance premiums, the Supplemental

Poverty Measure (SPM), etc. This will entail estimating taxes and credits and/or addressing

household roster disagreement between administrative and survey data (Unrath, 2022; Meyer

et al., 2022), incorporating additional data on housing assistance from the Department of

Housing and Urban Development and from states on the Special Supplemental Nutrition

Program for Women, Infants, and Children (WIC), and potentially improved imputation

and misreporting corrections for other programs such as the National School Lunch

Program, etc.

Finally, there are dimensions of misreporting and measurement error that we were not able

to address in this version. For example, we have discussed how self-employment earnings

are underreported in both survey and administrative data (Hurst, Li and Pugsley, 2014;

Internal Revenue Service, Research, Analysis & Statistics, 2016) and how much survey

and administrative reports disagree on the extensive margin (Abraham et al., 2021). It is

not settled in the literature how to adjust for this underreporting (Auten and Splinter, 2018;

Piketty, Saez and Zucman, 2017), much less how one would do so and get unbiased estimates

by subgroup. We plan to extend our measurement error model to self-employment earnings

for which different assumptions about misreporting would be necessary. Likewise, it may be

the case that survey samples, even those as large as the ACS, do not adequately capture

the incomes of the top individuals and households. Imputation, combination, or reweighting

may be insufficient to address this issue to estimate unbiased inequality statistics from a

survey sample. We plan on also researching methods to better estimate inequality statistics

that account for the far-right tail of the income distribution.

We would also like to further investigate how our adjustments affect estimates for subgroups

that may be challenging to reach or be unlikely to be present in the administrative data,

such as non-citizens. Weighting and imputation, in particular, assume that the data are

missing at random conditional on the observable information. However, there may be limited

observable information in the address-linked administrative records to identify and adjust

for selection into response by citizenship status. Likewise, our weighting adjustment for

linkage uses survey response information to reweight individuals and households that can be

linked to administrative data to be representative of the full sample. However, it may be

that conditional on the observable survey information (and the address-linked administrative

data), the data are not missing at random and that our final estimates for this group are

biased. Similarly, there are difficult-to-reach subgroups that are not in sample for the CPS

ASEC that we would like to estimate wellbeing statistics for, such as individuals in group

quarters and the homeless or unhoused.
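The missing-at-random assumption behind a linkage reweighting can be made concrete with a simple cell-based adjustment. This sketch is an illustration only and is not the NEWS weighting method: the function `reweight_linked`, the record layout, and the choice of cells are hypothetical. Within each cell of observable survey characteristics, linked records are scaled up so that their weights sum to the cell total for the full sample.

```python
from collections import defaultdict


def reweight_linked(records, cell_key):
    """Scale the weights of linked records so that, within each cell, they
    sum to the weighted total of all records (linked and unlinked).

    Assumes linkage is missing at random within a cell. A cell with no
    linked records contributes nothing to the output, which is exactly
    the situation in which this kind of adjustment cannot help.
    """
    full_total = defaultdict(float)
    linked_total = defaultdict(float)
    for rec in records:
        cell = cell_key(rec)
        full_total[cell] += rec["weight"]
        if rec["linked"]:
            linked_total[cell] += rec["weight"]

    adjusted = []
    for rec in records:
        if rec["linked"]:
            cell = cell_key(rec)
            factor = full_total[cell] / linked_total[cell]
            adjusted.append({**rec, "weight": rec["weight"] * factor})
    return adjusted


# Made-up example: two cells defined by a hypothetical age group variable.
sample = [
    {"age_group": "under_65", "weight": 100.0, "linked": True},
    {"age_group": "under_65", "weight": 100.0, "linked": False},
    {"age_group": "65_plus", "weight": 50.0, "linked": True},
    {"age_group": "65_plus", "weight": 50.0, "linked": True},
    {"age_group": "65_plus", "weight": 100.0, "linked": False},
]
linked_sample = reweight_linked(sample, cell_key=lambda r: r["age_group"])
```

If unlinked records differ from linked records in ways the cell variables do not capture, for example by citizenship status, the adjusted weights reproduce the cell totals but the resulting estimates can still be biased.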

7 Conclusion

This release under the NEWS project is a first step toward integrating what we know about

bias and measurement error in survey and administrative data into a set of “best possible”

estimates of income, poverty, and resource statistics. We have attempted to address as

many of the sources of bias as possible, including nonresponse bias (unit and item), selection

into linkage to administrative data, misreporting of survey and administrative income, and

incomplete data. However, much work remains to be done to address additional potential

sources of error. As we and other researchers advance our understanding of how to address

these measurement challenges, we will revise these estimates.

This work also suggests several additional avenues of possible research at the Census Bureau.

For example, estimating income and poverty from linked survey and administrative data

could impact the information we depend on surveys to provide. Surveys could focus less on

items that are well captured in administrative data (such as Social Security payments) and

more on items that improve linkage and those that are less well captured by administrative

data (self-employment income, etc.). The Census Bureau could also increase efforts to collect

survey responses from hard-to-reach groups who may be less well covered by administrative

data.

The focus of this project is on improving our estimates of income and poverty. However,

much of our planned future work entails trying to understand the quality of various data

sources. This commitment promises many potential benefits to users of both survey and

administrative data who are not primarily focused on income and poverty measurement.

We hope to extend our work, particularly on earnings, to help characterize the data quality

issues that other researchers may confront.

References

Abowd, John M, and Martha H Stinson. 2013. “Estimating measurement error in annual job earnings: A comparison of survey and administrative data.” Review of Economics and Statistics, 95(5): 1451–1467.

Abowd, John M, Kevin L McKinney, and Nellie L Zhao. 2018. “Earnings inequality and mobility trends in the United States: Nationally representative estimates from longitudinally linked employer-employee data.” Journal of Labor Economics, 36(S1): S183–S300.

Abraham, Katharine G, John C Haltiwanger, Claire Hou, Kristin Sandusky, and James R Spletzer. 2021. “Reconciling survey and administrative measures of self-employment.” Journal of Labor Economics, 39(4): 825–860.

Alvey, Wendy, and Cynthia Cobleigh. 1975. “Exploration of differences between linked Social Security and Current Population Survey earnings data for 1972.” Proceedings of the Social Statistics Section, American Statistical Association.

Ambler, Gareth, Rumana Z. Omar, and Patrick Royston. 2007. “A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome.” Statistical methods in medical research, 16(3): 277–298.

Auten, Gerald, and David Splinter. 2018. “Income inequality in the United States: Using tax data to measure long-term trends.” Draft subject to change. http://davidsplinter.com/AutenSplinter-Tax Data and Inequality.pdf.

Bee, Adam. 2013. “An Evaluation of Retirement Income in the CPS ASEC Using Form 1099-R Microdata.” Unpublished U.S. Census Bureau Working Paper.

Bee, Adam, and Jonathan Rothbaum. 2019. “The Administrative Income Statistics (AIS) Project: Research on the Use of Administrative Records to Improve Income and Resource Estimates.” U.S. Census Bureau SEHSD Working Paper #2019-36.

Bee, Adam, and Joshua Mitchell. 2017. “Do Older Americans Have More Income Than We Think?” U.S. Census Bureau SEHSD Working Paper #2017-39.

Bee, Adam, Graton Gathright, and Bruce D. Meyer. 2015. “Bias from Unit Nonresponse in the Measurement of Income in Household Surveys.” Unpublished U.S. Census Bureau Working Paper.

Bee, Adam, Joshua Mitchell, and Jonathan Rothbaum. 2019. “Not So Fast? How the Use of Administrative Earnings Data Would Change Poverty Estimates.” Unpublished U.S. Census Bureau Working Paper.

Bee, Adam, Joshua Mitchell, Nicolas Mittag, Jonathan Rothbaum, Carl Sanders, Lawrence Schmidt, and Matthew Unrath. 2023. “Addressing Measurement Error in Income Reports by Combining Survey and Administrative Earnings.” Unpublished U.S. Census Bureau Working Paper.

Benedetto, Gary, Joanna Motro, and Martha Stinson. 2016. “Introducing Parametric Models and Administrative Records into 2014 SIPP Imputations.”

Benedetto, Gary, Jordan C. Stanley, and Evan Totty. 2018. “The Creation and Use of the SIPP Synthetic Beta v7.0.” U.S. Census Bureau Working Paper.

Benedetto, Gary, Martha Stinson, and John M Abowd. 2013. “The creation and use of the SIPP Synthetic Beta.” U.S. Census Bureau Working Paper.

Bhaskar, Renuka, James M Noon, Brett O’Hara, and Victoria Velkoff. 2016. “Medicare Coverage and Reporting: A Comparison of the Current Population Survey and Administrative Records.” U.S. Census Bureau CARRA Working Paper #2016-12.

Bhaskar, Renuka, Rachel Shattuck, and James Noon. 2018. “Reporting of Indian Health Service Coverage in the American Community Survey.” U.S. Census Bureau CARRA Working Paper #2018-14.

Bingley, Paul, and Alessandro Martinello. 2017. “Measurement Error in Income and Schooling and the Bias of Linear Estimators.” Journal of Labor Economics, 35(4): 1117–1148.

Bollinger, Christopher R. 1998. “Measurement error in the Current Population Survey: A nonparametric look.” Journal of Labor Economics, 16(3): 576–594.

Bollinger, Christopher R, and Barry T Hirsch. 2006. “Match bias from earnings imputation in the Current Population Survey: The case of imperfect matching.” Journal of Labor Economics, 24(3): 483–519.

Bollinger, Christopher R, Barry T Hirsch, Charles M Hokayem, and James P Ziliak. 2019. “Trouble in the tails? What we know about earnings nonresponse 30 years after Lillard, Smith, and Welch.” Journal of Political Economy, 127(5): 2143–2185.

Bondarenko, Irina, and Trivellore E Raghunathan. 2007. “Multiple Imputations Using Sequential Semi and Nonparametric Regressions.” American Statistical Association Alexandria, VA.

Bond, Brittany, J David Brown, Adela Luque, Amy O’Hara, et al. 2014. “The nature of the bias when studying only linkable person records: Evidence from the American Community Survey.” U.S. Census Bureau CARRA Working Paper #2014-08.

Bound, John, and Alan B Krueger. 1991. “The extent of measurement error in longitudinal earnings data: Do two wrongs make a right?” Journal of Labor Economics, 9(1): 1–24.

Bound, John, Charles Brown, and Nancy Mathiowetz. 2001. “Measurement error in survey data.” In Handbook of Econometrics. Vol. 5, 3705–3843.

Bound, John, Charles Brown, Greg J Duncan, and Willard L Rodgers. 1994. “Evidence on the validity of cross-sectional and longitudinal labor market data.” Journal of Labor Economics, 12(3): 345–368.

Brummet, Quentin. 2014. “Comparison of Survey, Federal, and Commercial Address Data Quality.” U.S. Census Bureau CARRA Working Paper #2014-06.

Brummet, Quentin, Denise Flanagan-Doyle, Joshua Mitchell, John Voorheis, Laura Erhard, and Brett McBride. 2018. “What Can Administrative Tax Information Tell Us about Income Measurement in Household Surveys? Evidence from the Consumer Expenditure Surveys.” Statistical Journal of the IAOS, 34(4): 513–520.

Carr, Michael D, Robert A Moffitt, and Emily E Wiemers. 2022. “Reconciling Trends in Male Earnings Volatility: Evidence from the SIPP Survey and Administrative Data.” Journal of Business & Economic Statistics, 1–10.

Chapin, William, Sandra Clark, Amanda Klimek, Christopher Mazur, Chase Sawyer, and Ellen Wilson. 2018. “Housing Administrative Records Simulation.” U.S. Census Bureau ACS Research and Evaluation Report #ACS18-RER-07.

Chenevert, Rebecca L, Mark A Klee, and Kelly R Wilkin. 2016. “Do imputed earnings earn their keep? Evaluating SIPP earnings and nonresponse with administrative records.” U.S. Census Bureau SEHSD Working Paper #2016-18.

Chernozhukov, Victor, Iván Fernández-Val, and Alfred Galichon. 2010. “Quantile and probability curves without crossing.” Econometrica, 78(3): 1093–1125.

Chow, Melissa C, Teresa C Fort, Christopher Goetz, Nathan Goldschlag, James Lawrence, Elisabeth Ruth Perlman, Martha Stinson, and T Kirk White. 2021. “Redesigning the Longitudinal Business Database.” NBER Working Paper #28839.

Corinth, Kevin, Bruce D Meyer, and Derek Wu. 2022. “The Change in Poverty from 1995 to 2016 Among Single Parent Families.” National Bureau of Economic Research Working Paper #29870.

Deming, W Edwards, and Frederick F Stephan. 1940. “On a least squares adjustment of a sampled frequency table when the expected marginal totals are known.” The Annals of Mathematical Statistics, 11(4): 427–444.

Deville, Jean-Claude, and Carl-Erik Särndal. 1992. “Calibration estimators in survey sampling.” Journal of the American statistical Association, 87(418): 376–382.

Duncan, Greg J, and Daniel H Hill. 1985. “An investigation of the extent and consequences of measurement error in labor-economic survey data.” Journal of Labor Economics, 3(4): 508–532.

Eggleston, Jonathan. 2021. “Comparing Respondents and Nonrespondents in the ACS: 2013-2018.” Unpublished U.S. Census Bureau Working Paper.

Eggleston, Jonathan, and Ashley Westra. 2020. “Incorporating Administrative Data in Survey Weights for the Survey of Income and Program Participation.” U.S. Census Bureau SEHSD Working Paper #2020-07.

Eggleston, Jonathan, and Lori Reeder. 2018. “Does Encouraging Record Use for Financial Assets Improve Data Accuracy? Evidence from Administrative Data.” Public Opinion Quarterly, 82(4): 686–706.

Estevao, Victor M, and Carl-Erik Särndal. 2006. “Survey estimates by calibration on complex auxiliary information.” International Statistical Review, 74(2): 127–147.

Fox, Liana E, Misty L Heggeness, and Kathryn Stevens. 2017. “Precision in measurement: Using SNAP administrative records to evaluate poverty measurement.” U.S. Census Bureau SEHSD Working Paper #2017-49.

Fox, Liana, Jonathan Rothbaum, Kathryn Shantz, et al. 2022. “Fixing Errors in a SNAP: Addressing SNAP Underreporting to Evaluate Poverty.” AEA Papers and Proceedings, 112: 330–334.

Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. 2010. “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1): 1–22.

Giefer, Katherine, Abby Williams, Gary Benedetto, and Joanna Motro. 2015. “Program confusion in the 2014 SIPP: Using administrative records to correct false positive SSI reports.” FCSM 2015 Proceedings.

Gottschalk, Peter, and Minh Huynh. 2010. “Are earnings inequality and mobility overstated? The impact of nonclassical measurement error.” Review of Economics and Statistics, 92(2): 302–315.

Hainmueller, Jens. 2012. “Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies.” Political Analysis, 25–46.

Harris, Benjamin Cerf. 2014. “Within and Across County Variation in SNAP Misreporting: Evidence from Linked ACS and Administrative Records.” U.S. Census Bureau CARRA Working Paper #2014-05.

He, Yulei, Alan M Zaslavsky, MB Landrum, DP Harrington, and P Catalano. 2010. “Multiple imputation in a large-scale complex survey: a practical guide.” Statistical methods in medical research, 19(6): 653–670.

He, Yulei, and Trivellore E Raghunathan. 2006. “Tukey’s gh distribution for multiple imputation.” The American Statistician, 60(3): 251–256.

Hokayem, Charles, Christopher Bollinger, and James P Ziliak. 2015. “The role of CPS nonresponse in the measurement of poverty.” Journal of the American Statistical Association, 110(511): 935–945.

Hokayem, Charles, Trivellore Raghunathan, and Jonathan Rothbaum. 2022. “Match Bias or Nonignorable Nonresponse? Improved Imputation and Administrative Data In the CPS ASEC.” Journal of Survey Statistics and Methodology, 10(1): 81–114.

Hurst, Erik, Geng Li, and Benjamin Pugsley. 2014. “Are household surveys like tax forms? Evidence from income underreporting of the self-employed.” Review of Economics and Statistics, 96(1): 19–33.

Imboden, Christian, John Voorheis, and Caroline Weber. 2019. “Measuring Systematic Wage Misreporting by Demographic Groups.” Unpublished U.S. Census Bureau Working Paper.

Internal Revenue Service, Research, Analysis & Statistics. 2016. “Federal Tax Compliance Research: Tax Gap Estimates for Tax Years 2008–2010.” Publication 1415 (Rev. 5-2016).

Jenkins, Stephen P, and Fernando Rios Avila. Forthcoming. “Reconciling reports: modelling employment earnings and measurement errors using linked survey and administrative data.” Journal of the Royal Statistical Society.

Joint Committee on Taxation. 2022. “Linking Entity Tax Returns and Wage Filings.” JCT Publication #JCX-5-22.

Jones, Margaret R., and James P. Ziliak. 2019. “The Antipoverty Impact of the EITC: New Estimates from Survey and Administrative Tax Records.” U.S. Census Bureau Center for Economic Studies Working Paper.

Kang, Joseph DY, and Joseph L Schafer. 2007. “Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data.” Statistical science, 22(4): 523–539.

Kapteyn, Arie, and Jelmer Y Ypma. 2007. “Measurement error and misclassification: A comparison of survey and administrative data.” Journal of Labor Economics, 25(3): 513–551.

Kilss, Beth, and Frederick J Scheuren. 1978. “The 1973 CPS-IRS-SSA exact match study.” Social Security Bulletin, 41: 14.

Larrimore, Jeff, Jacob Mortenson, and David Splinter. 2020. “Presence and persistence of poverty in US tax data.” National Bureau of Economic Research Working Paper #26966.

Larrimore, Jeff, Jacob Mortenson, and David Splinter. 2021. “Household incomes in tax data using addresses to move from tax-unit to household income distributions.” Journal of Human Resources, 56(2): 600–631.

Larrimore, Jeff, Jacob Mortenson, and David Splinter. 2022. “Unemployment Insurance in Survey and Administrative Data.”

Little, Roderick J, and Sonya Vartivarian. 2005. “Does weighting for nonresponse increase the variance of survey means?” Survey Methodology, 31(2): 161.

McKinney, Kevin L, and John M Abowd. 2022. “Male Earnings Volatility in LEHD before, during, and after the Great Recession.” Journal of Business & Economic Statistics, 1–8.

Medalia, Carla, Bruce D Meyer, Amy B O’Hara, and Derek Wu. 2019. “Linking survey and administrative data to measure income, inequality, and mobility.” International Journal of Population Data Science, 4(1).

Meijer, Erik, Susann Rohwedder, and Tom Wansbeek. 2012. “Measurement error in earnings data: Using a mixture model approach to combine survey and register data.” Journal of Business & Economic Statistics, 30(2): 191–201.

Meng, Xiao-Li. 1994. “Multiple-imputation inferences with uncongenial sources of input.” Statistical Science, 538–558.

Meyer, Bruce D, and Derek Wu. 2018. “The poverty reduction of social security and means-tested transfers.” ILR Review, 71(5): 1106–1153.

Meyer, Bruce D, and Nikolas Mittag. 2019. “Using Linked Survey and Administrative Data to Better Measure Income: Implications for Poverty, Program Effectiveness, and Holes in the Safety Net.” American Economic Journal: Applied Economics, 11(2): 176–204.

Meyer, Bruce D, and Nikolas Mittag. 2021. “An empirical total survey error decomposition using data combination.” Journal of Econometrics, 224(2): 286–305.

Meyer, Bruce D, Angela Wyse, Alexa Grunwaldt, Carla Medalia, and Derek Wu. 2021a. “Learning about homelessness using linked survey and administrative data.” National Bureau of Economic Research Working Paper #28861.

Meyer, Bruce D, Derek Wu, Grace Finley, Patrick Langetieg, Carla Medalia, Mark Payne, and Alan Plumley. 2022. “The Accuracy of Tax Imputations: Estimating Tax Liabilities and Credits Using Linked Survey and Administrative Data.” In Measuring Distribution and Mobility of Income and Wealth, ed. Raj Chetty, John N Friedman, Janet C Gornick, Barry Johnson and Arthur Kennickell, Chapter 15, 459–498. University of Chicago Press.

Meyer, Bruce D, Derek Wu, Victoria Mooers, and Carla Medalia. 2021b. “The Use and Misuse of Income Data and the Rarity of Extreme Poverty in the United States.” Journal of Labor Economics, 39(S1): S5–S58.

Mittag, Nikolas. 2019. “Correcting for Misreporting of Government Benefits.” American Economic Journal: Economic Policy, 11(2): 142–164.

Moffitt, Robert, and Sisi Zhang. 2022. “Estimating trends in male earnings volatility with the Panel Study of Income Dynamics.” Journal of Business & Economic Statistics, 1–6.

Moffitt, Robert, John Abowd, Christopher Bollinger, Michael Carr, Charles Hokayem, Kevin McKinney, Emily Wiemers, Sisi Zhang, and James Ziliak. 2022. “Reconciling trends in US male earnings volatility: Results from survey and administrative data.” Journal of Business & Economic Statistics, 1–11.

Murray-Close, Marta, and Misty L Heggeness. 2018. “Manning up and womaning down: How husbands and wives report their earnings when she earns more.” U.S. Census Bureau SEHSD Working Paper #2018-20.

Noon, James, Leticia Fernandez, and Sonya Porter. 2016. “Response Error and the Medicaid Undercount in the Current Population Survey.” U.S. Census Bureau CARRA Working Paper #2016-11.

O’Hara, Amy, Adam Bee, and Joshua Mitchell. 2017. “Preliminary Research for Replacing or Supplementing the Income Question on the American Community Survey with Administrative Records.” Center for Administrative Records Research and Applications Memorandum Series #16-7.

Piketty, Thomas, Emmanuel Saez, and Gabriel Zucman. 2017. “Distributional national accounts: Methods and estimates for the United States.” The Quarterly Journal of Economics, 133(2): 553–609.

Pischke, Jörn-Steffen. 1995. “Measurement error and earnings dynamics: Some estimates from the PSID validation study.” Journal of Business & Economic Statistics, 13(3): 305–314.

Raghunathan, Trivellore E, James M Lepkowski, John Van Hoewyk, Peter Solenberger, et al. 2001. “A multivariate technique for multiply imputing missing values using a sequence of regression models.” Survey methodology, 27(1): 85–96.

Roemer, Marc. 2002. “Using administrative earnings records to assess wage data quality in the March Current Population Survey and the Survey of Income and Program Participation.” U.S. Census Bureau Center for Economic Studies Working Paper.

Rosenbaum, Paul R, and Donald B Rubin. 1983. “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika, 70(1): 41–55.

Rothbaum, Jonathan. 2015. “Comparing Income Aggregates: How Do the CPS and ACS Match the National Income and Product Accounts, 2007–2012.” U.S. Census Bureau SEHSD Working Paper #2015-01.

Rothbaum, Jonathan. 2018. “Evaluating the Use of Administrative Data to Reduce Respondent Burden in the Income Section of the American Community Survey.” Unpublished U.S. Census Bureau Working Paper.

Rothbaum, Jonathan. 2023. “Research on Creating Synthetic Data to Better Model the Income of Nonfilers through the Release of Public-Use Parameters.” Unpublished U.S. Census Bureau Working Paper.

Rothbaum, Jonathan, and Adam Bee. 2022. “Addressing Nonresponse Bias in Household Surveys using Linked Administrative Data.” U.S. Census Bureau SEHSD Working Paper #2020-10, Update for 2021 and 2022 unpublished.

Rothbaum, Jonathan, Jonathan Eggleston, Adam Bee, Mark Klee, and Brian Mendez-Smith. 2021. “Addressing Nonresponse Bias in the American Community Survey During the Pandemic Using Administrative Data.” U.S. Census Bureau SEHSD Working Paper #2021-24.

Rubin, Donald B. 1976. “Inference and missing data.” Biometrika, 63(3): 581–592.

Rubin, Donald B. 1981. “The Bayesian Bootstrap.” The annals of statistics, 130–134.

Rubin, Donald B. 1996. “Multiple imputation after 18+ years.” Journal of the American statistical Association, 91(434): 473–489.

Schmidt, Lawrence D.W., Yinchu Zhu, Brice Green, and Luxi Han. 2022. “quantspace: Quantile Regression via Quantile Spacing.” R package version 0.2.1.

Semega, Jessica, Melissa Kollar, John Shrider, Creamer, and Abinash Mohanty. 2019. “Income and Poverty in the United States: 2018.” U.S. Census Bureau Current Population Reports.

Shantz, Kathryn, and Liana E Fox. 2018. “Precision in Measurement: Using State-Level Supplemental Nutrition Assistance Program and Temporary Assistance for Needy Families Administrative Records and the Transfer Income Model (TRIM3) to Evaluate Poverty Measurement.” U.S. Census Bureau SEHSD Working Paper #2018-30.

Slud, Eric V, and Leroy Bailey. 2010. “Evaluation and selection of models for attrition nonresponse adjustment.” Journal of Official Statistics, 26(1): 127.

Unrath, Matthew. 2022. “Married... With Children? Assessing Alignment between Tax Units and Survey Households.” Unpublished U.S. Census Bureau Working Paper.

U.S. Census Bureau. 2009. “Estimating ASEC Variances with Replicate Weights Part I: Instructions for Using the ASEC Public Use Replicate Weight File to Create ASEC Variance Estimates.” URL: http://usa.ipums.org/usa/resources/repwt/Use_of_the_Public_Use_Replicate_Weight_File_final_PR.doc, Accessed: 2022-08-11.

Van Buuren, Stef. 2007. “Multiple imputation of discrete and continuous data by fully conditional specification.” Statistical methods in medical research, 16(3): 219–242.

Wagner, Deborah, and Mary Layne. 2014. “The Person Identification Validation System (PVS): Applying the Center for Administrative Records Research and Applications’ record linkage software.” U.S. Census Bureau CARRA Report Series #2014-01.

Woodcock, Simon D, and Gary Benedetto. 2009. “Distribution-preserving statistical disclosure limitation.” Computational Statistics & Data Analysis, 53(12): 4228–4242.

Zhao, Qingyuan, and Daniel Percival. 2017. “Entropy Balancing is Doubly Robust.” Journal of Causal Inference, 5(1).

Figure 1: NEWS Estimate of Median Household Income Relative to Survey in 2018

[Figure: horizontal chart of the Percent Difference from Survey (axis from -30 percent to 30 percent) for All Households and for subgroups of householders. Subgroup panels: Type of Household; Race and Hispanic Origin of Householder; Age of Householder; Nativity of Householder; Region; Education; Residence.]

Notes: This figure shows the percent difference between the NEWS estimates of median household income compared to the survey estimates in 2018, also shown in Table 1. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Figure 2: NEWS Estimate of Poverty Relative to Survey in 2018

[Figure: horizontal chart of the Percentage Point Difference (axis from -5 to 2) for All people and for subgroups. Subgroup panels: Race and Hispanic Origin; Sex; Age; Nativity; Region; Disability Status; Educational Attainment; Residence.]

Notes: This figure shows the percentage point difference between the NEWS estimates of poverty compared to the survey estimate in 2018, also shown in Table 2. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Figure 3: Decomposition of NEWS Processing Steps: Household Income

[Figure: three panels plotting Percent Difference (axis from -10 to 25) against Household Income Percentile (0 to 100). Panel A, Survey Steps: Weighting and Earnings Imputation, with series Reweighted (Nonresponse), + Reweighted for Linkage, and + Imputed Earnings. Panel B, Administrative Income Replacement and Survey Earnings Choice Modeling, with series + Administrative Income and NEWS (+ Earnings Choice Model). Panel C, Overall.]

Notes: This figure decomposes the impact of the NEWS processing steps on household income. Panel A shows the adjustments made to the survey data, including reweighting and improved earnings imputation, comparing household income after each adjustment to the survey estimate. Panel B shows the impact of replacing survey income responses with administrative income, comparing the estimates after each step to the estimates after reweighting and earnings imputation. The full impact of all adjustments is shown in Panel C. The 95 percent confidence interval for the last step is shown in each panel: for Panel A, comparing the estimate after earnings imputation to the survey estimate, and for Panel B, comparing the final NEWS estimate to the estimate after earnings imputation. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Figure 4: Linkage Diagram for Address File

[Diagram: Survey Housing Units (Occupied) are linked to three sets of files: Geographic Summaries of Characteristics from the Decennial Censuses and ACS 5-Year Files, joined by Geographic ID (State, County, Tract); Housing Unit Information from the Master Address File and Black Knight, joined by MAFID; and files that Link Addresses to People (MAFID→PIK), namely the IRMF, MAFARF, and 1040 Tax Returns. The resulting Linked Individuals at Occupied Units are joined by PIK to IRS Data (W-2s, 1040 Tax Returns, Information Returns (IRMF), 1099-Rs), SSA Data (Numident, Social Security/OASDI Payments (PHUS), SSI Payments (SSR)), and State Data from partner states (LEHD), with a Job-Level Match by EIN to Firm Data (LBD).]

Notes: This diagram shows the linkage used to create the address-based extract file used for weighting. The file starts with the set of occupied addresses in the survey. That file is linked to three sets of files: (1) Geographic summaries of characteristics (by state, county, and tract identifiers), (2) housing unit information from the Master Address File and Black Knight data, and (3) files to link the addresses to people living in them (MAFID → PIK). From the third set of files, we create a roster of all individuals found in the occupied surveyed units and link them to the files shown to the right.
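The address-to-person linkage described in these notes can be illustrated with a toy example. This is only a sketch under stated assumptions: the identifiers, wage amounts, and the function name `build_address_extract` are hypothetical, a plain dictionary stands in for the MAFID → PIK crosswalk, and W-2 wages stand in for the full set of administrative records.

```python
# Hypothetical toy records; identifiers and values are invented for illustration.
survey_units = [{"mafid": "M1"}, {"mafid": "M2"}, {"mafid": "M3"}]

# Address-to-person crosswalk (the role played by files such as the MAFARF):
# each MAFID maps to the PIKs of the people found at that address.
mafid_to_piks = {"M1": ["P1", "P2"], "M2": ["P3"]}

# Person-level administrative records keyed by PIK (here, W-2 wages only).
w2_wages_by_pik = {"P1": 42_000, "P3": 18_500}


def build_address_extract(units, crosswalk, wages):
    """Link each occupied unit to its residents, then to their admin records."""
    extract = []
    for unit in units:
        # Units with no crosswalk entry get an empty roster.
        piks = crosswalk.get(unit["mafid"], [])
        extract.append({
            "mafid": unit["mafid"],
            "n_linked_people": len(piks),
            # People without a W-2 contribute zero to the unit total.
            "total_w2_wages": sum(wages.get(pik, 0) for pik in piks),
        })
    return extract


address_extract = build_address_extract(survey_units, mafid_to_piks, w2_wages_by_pik)
```

Because the extract is built for every occupied sampled unit, including units with no linked people, it can describe respondents and nonrespondents alike, which is what makes it usable for weighting.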

Figure 5: Linkage Diagram for Person File

[Diagram: Survey Respondents are assigned PIKs (Assign PIKs) to form the set of Linked Survey Respondents, who are joined by PIK to IRS Data (Information Returns (IRMF), 1040 Tax Returns, W-2s, 1099-Rs), SSA Data (Social Security/OASDI Payments (PHUS), Detailed Earnings Record (DER), SSI Payments (SSR)), and State Data from partner states (SNAP, TANF (states + HHS), LEHD). Job-level records are joined by EIN (Job-Level Match) to Firm Data (LBD).]

Notes: This diagram shows the linkage used to create the person-level extract file. The file starts with the set of respondents in the survey. For those respondents that can be linked to their Social Security Numbers and therefore assigned a Protected Identification Key (PIK), we link them to the administrative records shown.

Figure 6: Intensive Margin Disagreement in Wage and Salary Earnings

[Figure: scatter plot of Log ACS Wage and Salary Earnings (vertical axis) against Log W-2 Earnings (horizontal axis), with a Regression Fit line (β = 0.8) and a 45° Line.]

Notes: This figure was published in O’Hara, Bee and Mitchell (2017) and is replicated here with permission, as it is no longer possible to disclose scatter plots of individual earnings reports. The figure compares individual survey wage and salary earnings reports to W-2 earnings from the 2011 ACS. The regression fit line is shown, and the 45° line is visible in the clustering of points below the regression line on the left side of the figure and above the regression fit on the right. While the survey reports cluster around the 45° line, there is considerable noise in the survey relative to the administrative reports, and the figure is consistent with mean-reversion of survey relative to administrative reports (both in the location of points relative to the diagonal and the fact that β̂ < 1). The axes are unlabeled as a condition of the original release. Source: O’Hara, Bee and Mitchell (2017) using 2011 American Community Survey data linked to 2010 W-2s.
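The link between a fitted slope below one and mean reversion can be seen in a small simulation. The sketch below is illustrative only: the data are simulated, the mean and noise levels are hypothetical, and the slope of 0.8 is set to match the regression fit reported for Figure 6 rather than estimated from any real records.

```python
import random

random.seed(0)  # fixed seed so the simulated sample is reproducible

MEAN_LOG_EARNINGS = 10.5  # hypothetical mean of log administrative earnings
SLOPE = 0.8               # set to the fitted slope reported for Figure 6

# Simulated log W-2 earnings and noisy, mean-reverting log survey reports.
admin = [random.gauss(MEAN_LOG_EARNINGS, 0.8) for _ in range(5000)]
survey = [
    MEAN_LOG_EARNINGS + SLOPE * (a - MEAN_LOG_EARNINGS) + random.gauss(0.0, 0.4)
    for a in admin
]


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance


def mean_gap(x, y, keep):
    """Average of (y - x) over the observations where keep(x) is true."""
    gaps = [yi - xi for xi, yi in zip(x, y) if keep(xi)]
    return sum(gaps) / len(gaps)


slope = ols_slope(admin, survey)
# Survey minus administrative log earnings, below and above the mean.
gap_low = mean_gap(admin, survey, lambda a: a < MEAN_LOG_EARNINGS)
gap_high = mean_gap(admin, survey, lambda a: a >= MEAN_LOG_EARNINGS)
```

In this simulated sample, survey reports exceed administrative earnings on average for low earners and fall short for high earners, which is the pattern the notes describe as points lying on either side of the diagonal.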

Figure 7: NEWS Estimate of Household Income Relative to Survey by Subgroup in 2018

[Figure: three panels plotting Percent Difference from Survey (axis from -10 to 30) against Household Income Percentile (0 to 100). Panel A, Race and Hispanic Origin, with series All, White Non-Hispanic, Black, Asian, and Hispanic. Panel B, Age of Householder, with series 25-34, 35-44, 45-54, 55-64, and 65+. Panel C, Educational Attainment, with series No HS, HS, Some College, and Bachelor's and Above.]

Notes: This figure shows the percent difference between the NEWS estimates of household income compared to the survey estimate at the 10th, 25th, 50th, 75th, and 90th percentiles in 2018. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Figure 8: Effect of Removing Individual Administrative Income Items on Household Income

[Figure: Percent Difference from NEWS (axis from -10 to 4) plotted against Household Income Percentile (0 to 100), with series Interest & Dividends, Retirement, Social Security & SSI, and WS Earnings.]

Notes: In this figure, we replace individual income items from the NEWS estimates with the corresponding survey information and compare the estimate after replacement with the NEWS estimate. An estimate below the zero line indicates that the administrative item increases income at that percentile. We replace: (1) interest and dividends, (2) retirement income, including withdrawals from Defined Contribution plans and retirement, survivor, and disability pensions, (3) Social Security and SSI, and (4) wage and salary earnings. For interest and dividends, we exclude survey-reported interest earned in Defined Contribution retirement plans. For wage and salary earnings, we replace administrative wage and salary earnings with survey responses in all cases where the individual does not have administrative self-employment earnings, even if the individual reported no earnings on the survey. More detailed decompositions are available in Figure A3. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.
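The replacement exercise in these notes can be sketched on toy data. Everything below is an illustrative assumption: the household records and dollar amounts are invented, the item names are simplified, the percentile uses an unweighted nearest-rank rule, and the function names are hypothetical.

```python
import math

# Hypothetical households: each income item has a NEWS (administrative-based)
# value and a survey-reported value. All amounts are invented.
households = [
    {"news": {"earnings": 20_000, "retirement": 0, "interest": 0},
     "survey": {"earnings": 20_000, "retirement": 0, "interest": 0}},
    {"news": {"earnings": 0, "retirement": 30_000, "interest": 1_000},
     "survey": {"earnings": 0, "retirement": 24_000, "interest": 500}},
    {"news": {"earnings": 50_000, "retirement": 0, "interest": 0},
     "survey": {"earnings": 50_000, "retirement": 0, "interest": 0}},
    {"news": {"earnings": 80_000, "retirement": 0, "interest": 2_000},
     "survey": {"earnings": 80_000, "retirement": 0, "interest": 1_000}},
    {"news": {"earnings": 150_000, "retirement": 0, "interest": 10_000},
     "survey": {"earnings": 150_000, "retirement": 0, "interest": 4_000}},
]


def household_income(household, item_from_survey=None):
    """Total income, taking one named item from the survey instead of NEWS."""
    total = 0
    for item, news_value in household["news"].items():
        if item == item_from_survey:
            total += household["survey"][item]
        else:
            total += news_value
    return total


def percentile(values, p):
    """Unweighted nearest-rank percentile."""
    ordered = sorted(values)
    index = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[index]


def replacement_effect(records, item, p):
    """Percent difference from NEWS at percentile p after swapping one item."""
    news = percentile([household_income(h) for h in records], p)
    swapped = percentile([household_income(h, item) for h in records], p)
    return 100.0 * (swapped - news) / news


retirement_effect_p25 = replacement_effect(households, "retirement", 25)
retirement_effect_p50 = replacement_effect(households, "retirement", 50)
```

In this toy sample, swapping in survey retirement income lowers the 25th percentile and leaves the median unchanged, so the value at the 25th percentile would plot below the zero line.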

Figure 9: Effect of Removing Individual Administrative Income Items on Poverty

[Figure: horizontal chart of the Percentage Point Difference (axis from -1.5 to 1.5) for each replaced item: Interest (including from Retirement Plans); Interest; Dividends; Retirement; Social Security & SSI; TANF; Survey WS Earnings; Survey WS Earnings (Adrecs if Survey == 0).]

Notes: In this figure, we replace individual income items from the NEWS estimates with the corresponding survey information, including for interest, dividends, retirement income, Social Security, SSI, TANF, and survey wage and salary earnings. An estimate above the zero line indicates that the administrative item decreases overall poverty. For survey interest, we show two measures, including and excluding the interest earned in Defined Contribution retirement plans such as 401(k)s. We replace Social Security and SSI together to address misclassification across programs, as discussed in Bee and Mitchell (2017). We replace administrative wage and salary earnings with two survey-based earnings measures. In the first, we use survey responses in all cases where the individual does not have administrative self-employment earnings, even if the individual reported no earnings on the survey. In the second, we only replace administrative wage and salary earnings if the survey report was positive. Retirement includes Defined Contribution plan withdrawals, pensions, and survivor and disability pensions. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Table 1: NEWS Median Household Income Estimates Compared to Survey in 2018

| Characteristic | Survey: Number (thousands) | Survey: Median Income (dollars) | Survey: 95 percent CI | NEWS: Number (thousands) | NEWS: Median Income (dollars) | NEWS: 95 percent CI | Percent Difference (NEWS - Survey) | Percent Difference: 95 percent CI |
|---|---|---|---|---|---|---|---|---|
| HOUSEHOLDS | | | | | | | | |
| All Households | 128,600 | 63,180 | 823 | 133,700 | 67,170 | 962 | 6.3*** | 1.4 |
| Type of Household | | | | | | | | |
| Family households | 83,480 | 80,660 | 791 | 85,840 | 85,210 | 1,221 | 5.6*** | 1.3 |
|   Married-couple | 61,960 | 93,650 | 1,340 | 63,950 | 98,100 | 1,402 | 4.7*** | 1.4 |
|   Female householder, no husband present | 15,040 | 45,130 | 1,329 | 15,250 | 47,490 | 1,754 | 5.2*** | 3.6 |
|   Male householder, no wife present | 6,480 | 61,520 | 1,485 | 6,644 | 63,550 | 2,798 | 3.3 | 4.4 |
| Nonfamily households | 45,100 | 38,120 | 983 | 47,890 | 41,800 | 846 | 9.6*** | 2.7 |
|   Female householder | 23,510 | 32,010 | 794 | 24,860 | 38,010 | 1,201 | 18.7*** | 3.6 |
|   Male householder | 21,580 | 45,750 | 1,034 | 23,030 | 46,230 | 1,212 | 1.0 | 2.5 |
| Race and Hispanic Origin of Householder | | | | | | | | |
| White | 100,500 | 66,940 | 769 | 104,000 | 71,390 | 984 | 6.6*** | 1.3 |
|   White, not Hispanic | 84,730 | 70,640 | 777 | 87,370 | 74,210 | 1,166 | 5.1*** | 1.4 |
| Black | 17,170 | 41,360 | 1,079 | 18,290 | 43,100 | 2,058 | 4.2* | 4.3 |
| Asian | 6,981 | 87,190 | 3,342 | 7,019 | 89,270 | 5,614 | 2.4 | 5.6 |
| Hispanic (any race) | 17,760 | 51,450 | 876 | 18,400 | 57,710 | 2,314 | 12.2*** | 4.2 |
| Age of Householder | | | | | | | | |
| Under 65 years | 94,420 | 71,660 | 683 | 99,370 | 71,580 | 1,001 | -0.1 | 1.2 |
|   15 to 24 years | 6,199 | 43,530 | 3,204 | 6,961 | 41,350 | 2,245 | -5.0 | 6.4 |
|   25 to 34 years | 20,610 | 65,890 | 1,281 | 22,080 | 65,110 | 1,764 | -1.2 | 2.3 |
|   35 to 44 years | 21,370 | 80,740 | 1,276 | 22,490 | 78,600 | 2,390 | -2.7* | 2.7 |
|   45 to 54 years | 22,070 | 84,460 | 2,198 | 23,000 | 84,940 | 2,017 | 0.6 | 2.7 |
|   55 to 64 years | 24,170 | 68,950 | 1,720 | 24,840 | 72,430 | 1,975 | 5.0*** | 2.9 |
| 65 years and older | 34,160 | 43,700 | 972 | 34,360 | 55,610 | 1,370 | 27.3*** | 3.0 |
| Nativity of Householder | | | | | | | | |
| Native born | 108,600 | 64,240 | 848 | 114,100 | 67,680 | 981 | 5.3*** | 1.3 |
| Foreign born | 20,020 | 58,780 | 1,891 | 19,670 | 64,140 | 2,322 | 9.1*** | 3.9 |
|   Naturalized citizen | 11,040 | 65,520 | 2,682 | 10,480 | 72,290 | 2,877 | 10.3*** | 4.6 |
|   Not a citizen | 8,976 | 51,940 | 1,254 | 9,193 | 55,670 | 4,458 | 7.2* | 8.3 |
| Region | | | | | | | | |
| Northeast | 22,050 | 70,110 | 2,247 | 22,840 | 76,810 | 2,876 | 9.6*** | 3.4 |
| Midwest | 27,690 | 64,070 | 1,722 | 28,730 | 66,460 | 1,726 | 3.7*** | 2.5 |
| South | 49,740 | 57,300 | 978 | 52,470 | 58,890 | 1,418 | 2.8** | 2.2 |
| West | 29,100 | 69,520 | 1,900 | 29,700 | 77,560 | 2,366 | 11.6*** | 3.1 |
| Residence | | | | | | | | |
| Inside metropolitan statistical areas | 110,800 | 66,160 | 725 | 112,600 | 71,010 | 1,049 | 7.3*** | 1.4 |
|   Inside principal cities | 42,980 | 59,360 | 1,457 | 43,040 | 63,210 | 1,653 | 6.5*** | 2.4 |
|   Outside principal cities | 67,810 | 70,930 | 902 | 69,520 | 75,780 | 1,522 | 6.8*** | 1.8 |
| Outside metropolitan statistical areas | 17,790 | 49,870 | 1,941 | 21,170 | 50,040 | 1,722 | 0.3 | 3.2 |
| Education | | | | | | | | |
| Age 25 and Above | 122,400 | 64,760 | 806 | 126,800 | 69,200 | 963 | 6.8*** | 1.4 |
|   No HS | 11,230 | 28,330 | 1,260 | 11,850 | 32,400 | 1,599 | 14.4*** | 5.3 |
|   HS | 31,810 | 46,070 | 870 | 33,270 | 50,630 | 999 | 9.9*** | 2.3 |
|   Some College | 33,940 | 60,940 | 918 | 35,090 | 64,620 | 1,432 | 6.0*** | 2.0 |
|   Bachelor’s and Above | 45,410 | 101,800 | 1,135 | 46,550 | 105,400 | 1,940 | 3.5*** | 1.6 |

Notes: This table compares the NEWS median household income estimates to the survey estimates by subgroup in 2018. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for percent differences. Federal surveys give respondents the option of reporting more than one race. Therefore, two basic ways of defining a race group are possible. A group, such as Asian, may be defined as those who reported Asian and no other race (the race-alone or single-race concept) or as those who reported Asian regardless of whether they also reported another race (the race-alone-or-in-combination concept). This table shows data using the first approach (race alone). The use of the single-race population does not imply that it is the preferred method of presenting or analyzing data. The Census Bureau uses a variety of approaches. About 2.9 percent of people reported more than one race in the 2010 Census. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.
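The percent differences and significance stars in Table 1 can be approximately reproduced from the published columns. The sketch below rests on two assumptions that the table does not state: that the "95 percent CI" column reports the half-width of a symmetric 95 percent interval, and that a normal approximation is adequate. The function names are illustrative.

```python
import math


def percent_difference(news, survey):
    """Percent difference of the NEWS estimate relative to the survey estimate."""
    return 100.0 * (news - survey) / survey


def significance_stars(estimate, half_width_95):
    """Stars at the 1, 5, and 10 percent levels from a 95 percent half-width.

    Assumes a normal approximation: standard error = half-width / 1.96.
    """
    standard_error = half_width_95 / 1.96
    z = abs(estimate) / standard_error
    # Two-sided p-value from the standard normal CDF.
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# All Households in Table 1: survey median 63,180 and NEWS median 67,170.
all_households = percent_difference(67_170, 63_180)
```

Under these assumptions, the All Households difference of about 6.3 percent with a half-width of 1.4 is significant at the 1 percent level, while the difference of 4.2 percent for Black householders with a half-width of 4.3 is significant only at the 10 percent level, matching the stars shown. Published values are rounded, so borderline cases may not reproduce exactly.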

Table 2: NEWS Poverty Estimates Compared to Survey in 2018

| Characteristic | Survey: Percent | Survey: 95 percent CI | NEWS: Percent | NEWS: 95 percent CI | Change in poverty (NEWS - Survey): Difference | Change in poverty: 95 percent CI |
|---|---|---|---|---|---|---|
| PEOPLE | | | | | | |
| Total | 11.78 | 0.29 | 10.67 | 0.39 | -1.11*** | 0.37 |
| Race and Hispanic Origin | | | | | | |
| White | 10.07 | 0.30 | 8.91 | 0.40 | -1.16*** | 0.41 |
|   White, not Hispanic | 8.07 | 0.28 | 7.48 | 0.35 | -0.59*** | 0.35 |
| Black | 20.77 | 1.16 | 20.10 | 1.46 | -0.67 | 1.31 |
| Asian | 10.10 | 0.94 | 8.52 | 1.41 | -1.57** | 1.38 |
| Hispanic (any race) | 17.56 | 0.80 | 14.61 | 1.14 | -2.95*** | 1.19 |
| Sex | | | | | | |
| Male | 10.57 | 0.32 | 9.71 | 0.40 | -0.86*** | 0.40 |
| Female | 12.94 | 0.33 | 11.59 | 0.48 | -1.34*** | 0.45 |
| Age | | | | | | |
| Under 18 years | 16.20 | 0.67 | 15.62 | 0.86 | -0.57 | 0.83 |
| 18 to 64 years | 10.68 | 0.29 | 9.97 | 0.37 | -0.71*** | 0.35 |
| 65 years and older | 9.75 | 0.46 | 6.42 | 0.45 | -3.33*** | 0.56 |
| Nativity | | | | | | |
| Native-born | 11.45 | 0.31 | 10.48 | 0.40 | -0.97*** | 0.38 |
| Foreign-born | 13.79 | 0.67 | 11.86 | 0.97 | -1.93*** | 1.01 |
|   Naturalized citizen | 9.93 | 0.75 | 9.07 | 0.99 | -0.86* | 1.01 |
|   Not a citizen | 17.46 | 1.01 | 14.40 | 1.59 | -3.06*** | 1.63 |
| Region | | | | | | |
| Northeast | 10.28 | 0.66 | 9.14 | 0.86 | -1.14** | 0.87 |
| Midwest | 10.37 | 0.66 | 10.51 | 0.83 | 0.14 | 0.75 |
| South | 13.57 | 0.55 | 12.24 | 0.66 | -1.33*** | 0.66 |
| West | 11.22 | 0.64 | 9.41 | 0.83 | -1.80*** | 0.83 |
| Residence | | | | | | |
| Inside metropolitan statistical areas | 11.34 | 0.32 | 10.11 | 0.43 | -1.23*** | 0.39 |
|   Inside principal cities | 14.59 | 0.65 | 13.47 | 0.74 | -1.12*** | 0.70 |
|   Outside principal cities | 9.42 | 0.40 | 8.18 | 0.47 | -1.24*** | 0.45 |
| Outside metropolitan statistical areas | 14.68 | 0.99 | 13.93 | 1.14 | -0.75 | 0.98 |
| Disability Status | | | | | | |
| Total, aged 18 to 64 | 10.68 | 0.29 | 9.97 | 0.37 | -0.71*** | 0.35 |
| With a disability | 25.72 | 1.32 | 26.64 | 1.66 | 0.92 | 1.58 |
| With no disability | 9.46 | 0.25 | 8.68 | 0.36 | -0.78*** | 0.35 |
| Educational Attainment | | | | | | |
| Total, aged 25 and older | 9.90 | 0.24 | 8.62 | 0.32 | -1.27*** | 0.30 |
| No high school diploma | 25.90 | 1.05 | 21.96 | 1.36 | -3.94*** | 1.41 |
| High school, no college | 12.73 | 0.47 | 10.83 | 0.56 | -1.90*** | 0.56 |
| Some college | 8.38 | 0.38 | 8.05 | 0.51 | -0.33 | 0.52 |
| Bachelor’s degree or higher | 4.37 | 0.32 | 3.65 | 0.33 | -0.72*** | 0.38 |

Notes: This table compares the NEWS poverty estimates to the survey estimates by subgroup in 2018. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for differences. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Table 3: NEWS Inequality Estimates Compared to Survey in 2018

| Measure | Survey: Estimate | Survey: 95 percent CI | NEWS: Estimate | NEWS: 95 percent CI | Percent Difference (NEWS - Survey): Estimate | Percent Difference: 95 percent CI |
|---|---|---|---|---|---|---|
| Shares of Aggregate Income | | | | | | |
| 1st Quintile | 0.036 | 0.001 | 0.037 | 0.001 | 0.001 | 0.001 |
| 2nd Quintile | 0.091 | 0.001 | 0.089 | 0.002 | -0.002* | 0.002 |
| 3rd Quintile | 0.148 | 0.001 | 0.142 | 0.003 | -0.005*** | 0.003 |
| 4th Quintile | 0.227 | 0.002 | 0.215 | 0.004 | -0.012*** | 0.004 |
| 5th Quintile | 0.498 | 0.004 | 0.516 | 0.009 | 0.018*** | 0.008 |
| Top 5 Percent | 0.218 | 0.005 | 0.252 | 0.012 | 0.034*** | 0.012 |
| Summary Measures | | | | | | |
| Gini Index | 0.459 | 0.004 | 0.476 | 0.009 | 0.017*** | 0.009 |
| 90/10 percentile ratio | 12.52 | 0.34 | 11.52 | 0.36 | -1.00*** | 0.35 |
| 90/50 percentile ratio | 2.92 | 0.04 | 2.82 | 0.04 | -0.10*** | 0.05 |
| 50/10 percentile ratio | 4.29 | 0.10 | 4.09 | 0.10 | -0.20*** | 0.11 |

Notes: This table compares NEWS inequality statistics to the survey estimates in 2018. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for percent differences. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.
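For readers less familiar with the measures in Table 3, the sketch below shows one common way to compute quintile shares, the Gini index, and percentile ratios. It is a simplified illustration, not the estimation code behind the table: it ignores survey weights, uses a nearest-rank percentile rule, and runs on five invented incomes.

```python
import math


def quintile_shares(incomes):
    """Share of aggregate income held by each fifth of households (unweighted)."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    shares = []
    for q in range(5):
        lower, upper = q * n // 5, (q + 1) * n // 5
        shares.append(sum(ordered[lower:upper]) / total)
    return shares


def gini(incomes):
    """Gini index from the rank-weighted sum of sorted incomes (unweighted)."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    ranked_sum = sum(rank * x for rank, x in enumerate(ordered, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1) / n


def percentile_ratio(incomes, upper, lower):
    """Ratio of two percentiles using the nearest-rank definition."""
    ordered = sorted(incomes)
    n = len(ordered)

    def nearest_rank(p):
        return ordered[max(0, math.ceil(p / 100 * n) - 1)]

    return nearest_rank(upper) / nearest_rank(lower)


# Hypothetical incomes, one household per quintile.
toy_incomes = [10, 20, 30, 40, 100]
toy_shares = quintile_shares(toy_incomes)
toy_gini = gini(toy_incomes)
toy_ratio_90_10 = percentile_ratio(toy_incomes, 90, 10)
```

The toy data also illustrate why the summary measures in Table 3 can move in different directions: the Gini index and top shares respond to income across the whole distribution, while the 90/10 ratio depends only on two points.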

Table 4: Data Sources

| File | Data Source | Description |
|---|---|---|
| Current Population Survey Annual Social and Economic Supplement (CPS ASEC) | Census | Annual survey fielded in February to April with household structure and characteristics at the time of interview and income from the prior calendar year. About 95,000 housing units sampled each year. |
| American Community Survey (ACS) | Census | Rolling survey fielded throughout the year about income from the prior 12 months. About 3.5 million housing units sampled each year. |
| Short Form Decennial Census | Census | Complete count decennial census data from 2000 and 2010. |
| Master Address File (MAF) | Census | File of residential addresses used to support census survey and decennial operations. Survey samples are drawn from this file for both the CPS ASEC and ACS. |
| Master Address File Auxiliary Reference File (MAFARF) | Census | Commingled file constructed from administrative records, including the IRMF, postal service change of address information, program data, etc., that links individuals (identified by Protected Identification Keys) to addresses in the Master Address File (identified by MAFIDs). |
| Longitudinal Business Database (LBD) | Census | Database of private non-farm establishments with employees from 1976 forward. For each establishment, the LBD has information on industry, payroll, employment, and a firm identifier to group establishments into firms. |
| Information Returns Master File (IRMF) | IRS | Universe file with flags for whether an individual received each of the following information returns forms: 1098, 1099-DIV, 1099-INT, 1099-G, 1099-MISC, 1099-R, 1099-S, SSA-1099, and W-2. No income information is available. Also contains address information, which has been matched to the MAF to get a MAFID for each form. |
| Form 1040 Tax Returns (1040s) | IRS | Universe tax filings with a subset of the information on the complete Form 1040. The extracts provided by the IRS include information on tax-unit wage and salary income, gross rental income, taxable social security income, taxable and tax-exempt interest income, interest income, dividends, Adjusted Gross Income, and a constructed measure of Total Money Income (TMI). TMI is the sum of taxable wage and salary income, interest (taxable and tax-exempt), dividends, gross social security income, unemployment compensation, alimony received, business income or losses (including for partnerships and S-corps), farm income or losses, and net rent, royalty, and estate and trust income. Self-employment income is not available (except as a component of TMI), but flags exist for the filing of different 1040 schedules (such as C, D, E, F, SE). |
| Form W-2 (W-2s) | IRS | Universe data with a subset of information from the Form W-2. The extracts provided by the IRS include select boxes from the form, including wages and salary net of pre-tax deductions for health insurance premiums and deferred compensation (boxes 1 and 5), as well as the total amount of deferred compensation (summed values from Box 12 Codes D-H). Employee and employer pre-tax contributions to health insurance premiums are not available in the W-2 data. |
| Form 1099-R (1099-Rs) | IRS | Universe data with a subset of information from the Form 1099-R. The extracts provided by the IRS include information on amounts of defined-benefit pension payments (including for survivor and disability pensions) and withdrawals from defined-contribution retirement plans. |
| Numerical Identification System (Numident) | SSA | The Numident contains information for anyone ever to have received a Social Security Number. It includes information on date and place of birth, date of death, sex, and some information on citizenship. |
| Payment History Update System (PHUS) | SSA | Monthly Old Age, Survivors, and Disability Insurance (OASDI) payments from 1984 to the present. The PHUS exists for several subsamples of individuals including 1) those receiving payments in 2020 and 2021, 2) CPS ASEC respondents in linked years, and 3) ACS respondents in linked years (currently only 2019). |
| Supplemental Security Record (SSR) | SSA | Monthly Supplemental Security Income (SSI) payments from 1984 to the present for federal SSI and federally administered state SSI. The SSR exists for several subsamples of individuals including 1) those receiving payments in 2020 and 2021, 2) CPS ASEC respondents in linked years, and 3) ACS respondents in linked years (currently only 2019). |
| Detailed Earnings Record (DER) | SSA | Annual job-level income (by Employer Identification Number, EIN) from Form W-2s and annual positive self-employment income (from Form 1040 Schedule SE). The DER exists for several subsamples: 1) CPS ASEC respondents in linked years and 2) ACS respondents in linked years (currently only 2019). |
| Longitudinal Employer Household Dynamics (LEHD) | States | Quarterly job earnings reports from firms to state Unemployment Insurance offices for participating states. For covered jobs, the LEHD includes gross earnings, including employee contributions for health insurance premiums not available on the W-2 extracts. Coverage in the LEHD is not complete, as many government employees, such as federal civilian employees, postal workers, and Department of Defense employees, are not covered by state UI benefits. Some private-sector employees, including those employed by religious organizations, are not covered by UI and are therefore not present in the LEHD data. |
| Supplemental Nutrition Assistance Program (SNAP) | States | SNAP participant data from partner states. In 2018, SNAP data is available for 17 states. |
| Temporary Assistance for Needy Families (TANF) | States + HHS | TANF participant data from partner states as well as from the Department of Health and Human Services (HHS) for additional states. In 2018, TANF data is available for 36 states. |
| Black Knight Home Value (Black Knight) | Black Knight | Third party data on home values and housing unit characteristics. |

Notes: This table describes the data used in this project, including the source of the data and a short description. The name used for each file in Figures 4 and 5 is in parentheses.

Table 5: Measurement and Estimation Steps

| Section | Step | Inputs | Category | Measurement Challenge | Description | Related Work |
|---|---|---|---|---|---|---|
| A. Weighting | 1. Weight respondents | Address and Person Files | Survey | Survey unit nonresponse; Selection into administrative data; Administrative data “nonresponse” | Use linked information on all occupied housing units and population controls to weight respondent sample to be representative of the target universe of households | Rothbaum et al. (2021); Rothbaum and Bee (2022) |
| A. Weighting | 2. Weight respondents with all adults assigned a PIK | Address and Person Files | Survey | Survey unit nonresponse; Selection into administrative data; Administrative data “nonresponse”; Selection into Linkage | Use information from A1 and reweight households with all adults assigned a PIK to be representative of the target universe of households | |
| B. Imputation | 1. Impute survey earnings | Person File | Survey | Survey item nonresponse | Impute survey earnings conditional on survey and administrative information | Hokayem, Raghunathan and Rothbaum (2022) |
| B. Imputation | 2. Impute LEHD gross earnings | Person File | Admin | Administrative data “nonresponse”; Conceptual misalignment; Incomplete data coverage | Impute LEHD earnings when missing or there is large disagreement between W-2s and LEHD | |
| B. Imputation | 3. Impute missing means-tested program benefits | Person File | Admin | Incomplete data coverage | Impute means-tested program data (TANF and SNAP) for states for which administrative data is not available | Fox et al. (2022) |
| B. Imputation | 4. Impute administrative income for nonfilers | Person File and nonfiler income parameters | Admin | Selection into administrative data; Incomplete data coverage | Impute unemployment insurance compensation, interest, and dividends for nonfilers | Rothbaum (2023) |
| C. Estimation | 1. Earnings Measurement Error Model | Person File (for CPS ASEC and ACS) | Admin | Survey misreporting; Administrative misreporting | Combine survey and administrative wage and salary earnings according to the earnings measurement error model | Bee et al. (2023) |
| C. Estimation | 2. Income replacement | Person File | Admin | Survey misreporting; Administrative misreporting | Use survey and administrative data, imputed income, and earnings from the measurement error model to construct household and family income | Bee and Mitchell (2017) |
| C. Estimation | 3. Estimate income and poverty statistics | Person File | Admin | | | |

Notes: This table describes the processing steps used to address measurement error and estimate income and poverty. For each step, we include the Category (Survey or Administrative) matching the breakdown used in the decomposition used in Figure 3. Each step also references the relevant measurement challenges discussed in Section 2 and related work done at the Census Bureau that is being integrated into the NEWS project and extended.
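The idea behind step A1, using information available for all sampled occupied units to make respondents represent the full sample, can be illustrated with a simple weighting-class adjustment. This technique is swapped in purely for illustration and is not the NEWS weighting method, which is described in the related work cited in the table. The cells, base weights, and function name below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sampled occupied units: a base weight, an adjustment cell built
# from linked administrative information, and a response indicator.
sampled_units = [
    {"id": 1, "base_weight": 100.0, "cell": "low_income", "responded": True},
    {"id": 2, "base_weight": 100.0, "cell": "low_income", "responded": False},
    {"id": 3, "base_weight": 100.0, "cell": "high_income", "responded": True},
    {"id": 4, "base_weight": 100.0, "cell": "high_income", "responded": True},
    {"id": 5, "base_weight": 100.0, "cell": "high_income", "responded": False},
    {"id": 6, "base_weight": 100.0, "cell": "high_income", "responded": False},
]


def nonresponse_adjusted_weights(units):
    """Scale respondent weights so each cell sums to its full-sample total."""
    all_weight = defaultdict(float)
    respondent_weight = defaultdict(float)
    for unit in units:
        all_weight[unit["cell"]] += unit["base_weight"]
        if unit["responded"]:
            respondent_weight[unit["cell"]] += unit["base_weight"]
    return {
        unit["id"]: unit["base_weight"]
        * all_weight[unit["cell"]]
        / respondent_weight[unit["cell"]]
        for unit in units
        if unit["responded"]
    }


adjusted_weights = nonresponse_adjusted_weights(sampled_units)
```

After adjustment, the respondent weights in each cell add up to the weight of all sampled units in that cell, so cells with lower response rates receive larger adjustment factors. Step A2 applies the same logic a second time, to the subset of households in which all adults are assigned a PIK.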

Table 6: Rates of Missing Data for Imputed Income Items

| Item | Missingness Rate | Standard Error |
|---|---|---|
| Survey | | |
| Earnings from Primary Job | 0.456 | (0.003) |
| Earnings from Other Employers: Wage and Salary | 0.367 | (0.007) |
| Earnings from Other Employers: Self Employment | 0.445 | (0.014) |
| Earnings from Other Employers: Farm Self Employment | 0.574 | (0.020) |
| Usual Hours Worked Per Week | 0.260 | (0.003) |
| Weeks Worked Last Year | 0.250 | (0.003) |
| Administrative | | |
| Job 1: LEHD (gross earnings) missing, given W-2 or DER not missing | 0.080 | (0.001) |
| Job 1: or large disagreement between LEHD and W-2 | 0.178 | (0.002) |
| Job 2: LEHD (gross earnings) missing, given W-2 or DER not missing | 0.120 | (0.002) |
| Job 2: or large disagreement between LEHD and W-2 | 0.184 | (0.003) |
| SNAP administrative data unavailable | 0.695 | (0.001) |
| TANF administrative data unavailable | 0.474 | (0.001) |

Notes: This table shows the share of the 2019 CPS ASEC sample that is missing information for the various items imputed in this work, as discussed in Section 4.2. Standard errors are in parentheses. Jobs are ordered in the administrative data (Job 1, Job 2, etc.) from highest to lowest earnings across the three sources of job-level earnings (W-2, DER, and LEHD). Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Table 7: Sources of Administrative and Survey Earnings

A. All Individuals

Administrative Earnings Sources Share with Unimputed Survey:

W-2 DER LEHD N Wage and Salary Earnings Self-Employment Earnings

X X X 72,000 0.887 0.029 (0.002) (0.001)

X X 5,900 0.704 0.033 (0.010) (0.003)

X X 400 0.105 0.034 (0.018) (0.011)

X 300 0.804 0.024 (0.036) (0.011)

X X 30 1.000 Z Z Z

X <15 Z Z Z Z

X 500 0.244 0.058 (0.026) (0.016)

75,000 0.045 0.027 (0.001) (0.001)

B. Citizenship and DER Earnings

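Administrative Earnings Sources (W-2, DER, LEHD); N for Survey Earnings Respondents Only; Share Reporting Wage and Salary Earnings and Self-Employment Earnings

| W-2 | DER | LEHD | N: In Numident | N: Not In Numident | Wage and Salary Earnings: In Numident | Wage and Salary Earnings: Not In Numident | Self-Employment Earnings: In Numident | Self-Employment Earnings: Not In Numident |
|---|---|---|---|---|---|---|---|---|
| X | X | Yes or No | 47,000 | <15 | 0.874 (0.002) | Z (Z) | 0.029 (0.001) | Z (Z) |
| X | | Yes or No | 350 | 200 | 0.093 (0.018) | 0.847 (0.033) | 0.035 (0.011) | 0.023 (0.011) |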

Notes: This table shows the counts and share of adults with each possible administrative earnings data source (W-2, DER, and LEHD) as well as the share in each group that reported survey earnings (among those that responded to the survey earnings questions). Panel A shows the estimates for all individuals in the CPS ASEC. Panel B shows how the presence or absence of DER earnings given W-2 earnings is related to differential probability of reporting survey earnings for individuals who can be assigned PIKs that have SSNs (In Numident) and those that do not (Not In Numident). Z indicates an estimate rounds to zero. Standard errors are in parentheses. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.

Table 8: Combining Administrative and Survey Earnings: Use of Survey Earnings by Group

A. Race and Hispanic Origin

| Race/Hispanic Origin | Share Survey Earnings: Overall | Share Survey Earnings: Relative to Average |
|---|---|---|
| All | 20.6 (2.7) | Z (0.2) |
| Black | 13.8 (2.9) | -6.8* (3.1) |
| Hispanic | 22.1 (2.9) | 1.5 (1.2) |
| White Non-Hispanic | 22.6 (3.0) | 2.0 (1.2) |

B. Age

| Age | Share Survey Earnings: Overall | Share Survey Earnings: Relative to Average |
|---|---|---|
| 18-24 | 6.3 (1.4) | -14.3** (3.5) |
| 25-34 | 29.0 (4.4) | 8.4** (2.5) |
| 35-44 | 26.8 (3.5) | 6.3** (1.8) |
| 45-54 | 20.5 (4.1) | -0.1 (2.1) |
| 55-64 | 16.2 (3.3) | -4.3* (2.1) |
| 65+ | 8.7 (2.6) | -11.9*** (2.4) |

Notes: This table shows the share of individuals in each subgroup for whom survey earnings are used, based on the measurement error model for choosing survey or administrative earnings discussed in Section 4.3.1 and in more detail in Bee et al. (2023). Standard errors in parentheses. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for differences relative to average. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table 8: Combining Administrative and Survey Earnings: Use of Survey Earnings by Group, Continued

C. Occupation

| Occupation (Last Week) | Share Survey Earnings: Overall | Share Survey Earnings: Relative to Average |
|---|---|---|
| Unemployed | 14.2 (5.3) | -6.4 (6.0) |
| Management | 30.3 (5.6) | 9.7** (3.0) |
| Business and Financial Operations | 25.2 (2.8) | 4.6 (3.2) |
| Computer and Mathematical | 41.5 (7.2) | 20.9** (6.9) |
| Architecture and Engineering | 52.3 (4.0) | 31.7*** (2.9) |
| Life, Physical, and Social Science | 9.1 (2.1) | -11.5*** (2.2) |
| Community and Social Services | 3.1 (1.8) | -17.5*** (3.3) |
| Legal | 11.0 (11.0) | -9.6 (8.5) |
| Education, Training, and Library | 8.8 (4.2) | -11.8*** (2.5) |
| Arts, Design, Entertainment, Sports, and Media | 7.5 (2.7) | -13.1** (3.5) |
| Healthcare Practitioners and Technical | 21.9 (3.8) | 1.3 (2.0) |
| Healthcare Support | 4.1 (1.6) | -16.4*** (3.8) |
| Protective Service | 15.4 (3.5) | -5.2 (5.8) |
| Food Preparation and Serving Related | 10.2 (9.8) | -10.4 (7.7) |
| Building and Grounds Cleaning and Maintenance | 15.1 (6.1) | -5.5 (3.9) |
| Personal Care and Service | 8.8 (4.0) | -11.8* (4.8) |
| Sales and Related | 11.9 (1.2) | -8.7*** (1.8) |
| Office and Administrative Support | 16.9 (1.9) | -3.7 (1.9) |
| Farming, Fishing, and Forestry | 61.1 (24.3) | 40.5 (22.3) |
| Construction Trades and Extraction Workers | 42.2 (11.3) | 21.6 (10.6) |
| Installation, Maintenance, and Repair Workers | 38.4 (4.6) | 17.8** (5.9) |
| Production Occupations | 20.5 (5.1) | -0.1 (3.7) |
| Transportation | 11.9 (2.6) | -8.7** (2.7) |
| Material Moving | 29.9 (5.4) | 9.3* (4.2) |

D. Industry

| Industry (Last Week) | Share Survey Earnings: Overall | Share Survey Earnings: Relative to Average |
|---|---|---|
| Unemployed | 14.2 (5.3) | -6.4 (6.0) |
| Agriculture, Forestry, Fishing, and Hunting | 64.1 (30.7) | 43.5 (28.7) |
| Mining | 29.2 (11.2) | 8.6 (8.8) |
| Construction | 58.6 (12.1) | 38.0** (11.5) |
| Manufacturing | 18.9 (6.6) | -1.7 (5.1) |
| Wholesale Trade | 13.5 (7.6) | -7.1 (8.4) |
| Retail Trade | 4.2 (1.5) | -16.4*** (2.8) |
| Transportation and Warehousing | 17.2 (6.6) | -3.4 (5.8) |
| Utilities | 6.8 (5.9) | -13.8* (6.4) |
| Information | 23.9 (8.4) | 3.3 (8.2) |
| Finance and Insurance | 43.8 (8.1) | 23.2* (10.2) |
| Real Estate and Rental and Leasing | 79.0 (11.3) | 58.4*** (11.7) |
| Professional, Scientific, and Technical Services | 36.2 (11.6) | 15.7 (11.1) |
| Management of companies and enterprises | 2.0 (3.6) | -18.6*** (4.5) |
| Administrative and support and waste management services | 22.8 (11.2) | 2.2 (9.2) |
| Educational Services | 9.8 (3.7) | -10.8*** (2.1) |
| Health Care and Social Assistance | 10.9 (2.3) | -9.7*** (1.7) |
| Arts, Entertainment, and Recreation | 39.3 (24.6) | 18.7 (23.7) |
| Accommodation and Food Service | 14.4 (14.5) | -6.2 (12.4) |
| Other Services | 27.0 (9.3) | 6.4 (10.4) |
| Public Administration | 7.4 (4.7) | -13.2 (7.0) |



Figure A1: Simple Job Linkage Example

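W-2 Jobs

| PIK | EIN | Earnings |
|---|---|---|
| 1 | 100 | 10,000 |
| 2 | 100 | 20,000 |
| 2 | 400 | 12,000 |
| 3 | 100 | 5,000 |
| 3 | 500 | 200 |
| 3 | 600 | 2,600 |

LEHD Jobs

| PIK | EIN | Earnings |
|---|---|---|
| 1 | 200 | 11,000 |
| 2 | 200 | 20,005 |
| 2 | 400 | 12,000 |
| 3 | 200 | 5,200 |
| 3 | 500 | 225 |

Direct Matches

| PIK | W-2 EIN | W-2 Earnings | LEHD EIN | LEHD Earnings |
|---|---|---|---|---|
| 2 | 400 | 12,000 | 400 | 12,000 |
| 3 | 500 | 200 | 500 | 225 |

Indirect Matches

| PIK | W-2 EIN | W-2 Earnings | LEHD EIN | LEHD Earnings |
|---|---|---|---|---|
| 1 | 100 | 10,000 | 200 | 11,000 |
| 2 | 100 | 20,000 | 200 | 20,005 |
| 3 | 100 | 5,000 | 200 | 5,200 |

Unmatched

| PIK | W-2 EIN | W-2 Earnings | LEHD EIN | LEHD Earnings |
|---|---|---|---|---|
| 3 | 600 | 2,600 | | |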

Notes: This is an example of how jobs are linked between W-2s and the LEHD (all PIKs, earnings, and EINs in the example are made up and do not correspond to actual individuals or firms). First and easiest are the jobs that match on PIK and EIN (same person, same firm identifier), which we call direct matches. Next, we find the indirect matches, where each person has one EIN on the W-2s and another on the LEHD (same person, but different firm identifiers on the two files). In this example, everyone with W-2 EIN = 100 has a job with similar earnings on the LEHD, but with EIN = 200. Finally, there are jobs that remain unmatched and only exist on one file or the other.
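The linkage in this example can be sketched in code. The following is a minimal illustration using the made-up records from Figure A1, not the procedure from Section A.3: the 15 percent earnings tolerance (`tol`), the requirement that an EIN pair be supported by at least two people (`min_support`), and all function and variable names are assumptions introduced here. It also assumes at most one job per PIK and EIN on each file.

```python
from collections import Counter

# Made-up (PIK, EIN, earnings) records copied from Figure A1.
W2_JOBS = [(1, 100, 10_000), (2, 100, 20_000), (2, 400, 12_000),
           (3, 100, 5_000), (3, 500, 200), (3, 600, 2_600)]
LEHD_JOBS = [(1, 200, 11_000), (2, 200, 20_005), (2, 400, 12_000),
             (3, 200, 5_200), (3, 500, 225)]


def similar(a, b, tol):
    """Hypothetical similarity rule: earnings within tol of the larger value."""
    return abs(a - b) <= tol * max(a, b)


def link_jobs(w2, lehd, tol=0.15, min_support=2):
    # Step 1: direct matches share both PIK and EIN.
    lehd_by_key = {(pik, ein): earn for pik, ein, earn in lehd}
    direct, w2_left = [], []
    for pik, ein, earn in w2:
        if (pik, ein) in lehd_by_key:
            direct.append((pik, ein, earn, ein, lehd_by_key.pop((pik, ein))))
        else:
            w2_left.append((pik, ein, earn))
    lehd_left = [(pik, ein, earn) for (pik, ein), earn in lehd_by_key.items()]

    # Step 2: candidate indirect matches are leftover jobs of the same person
    # with similar earnings but different EINs on the two files.
    candidates = [(w, l) for w in w2_left for l in lehd_left
                  if w[0] == l[0] and similar(w[2], l[2], tol)]
    # Count how many distinct people support each (W-2 EIN, LEHD EIN) pair.
    people = {(w[0], w[1], l[1]) for w, l in candidates}
    support = Counter((w_ein, l_ein) for _, w_ein, l_ein in people)

    indirect, used_w, used_l = [], set(), set()
    for w, l in candidates:
        ok = support[(w[1], l[1])] >= min_support
        if ok and w not in used_w and l not in used_l:
            indirect.append((w[0], w[1], w[2], l[1], l[2]))
            used_w.add(w)
            used_l.add(l)

    # Step 3: whatever is left exists on only one file.
    unmatched_w2 = [w for w in w2_left if w not in used_w]
    unmatched_lehd = [l for l in lehd_left if l not in used_l]
    return direct, indirect, unmatched_w2, unmatched_lehd


direct, indirect, unmatched_w2, unmatched_lehd = link_jobs(W2_JOBS, LEHD_JOBS)
```

Under these assumed thresholds, the sketch reproduces the three groups in the figure: two direct matches, three indirect matches from W-2 EIN 100 to LEHD EIN 200, and one W-2 job (PIK 3, EIN 600) left unmatched.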


Figure A2: Decomposition of NEWS Processing Steps By Age: Distribution of Household Income

[Figure A2 graphic omitted. Two rows of plots: A. Under 65 and B. 65 and Over. Each row has three plots: Survey Steps; Administrative Income + Earnings Measurement Error; Overall. Vertical axis: Percent Difference (-20 to 35). Horizontal axis: Household Income Percentile (0 to 100). Series in Survey Steps: Reweighted (Nonresponse), + Reweighted for Linkage, + Imputed Earnings. Series in Administrative Income + Earnings Measurement Error: + Administrative Income, NEWS (+ Earnings Choice Model). Series in Overall: NEWS (+ Earnings Choice Model).]

Notes: This figure decomposes the impact of the NEWS processing steps on household income. In the first column, the figures show the adjustments made to the survey data, including reweighting and improved earnings imputation, comparing household income after the adjustment to the survey estimate. In the second column, the figures show the impact of replacing survey income responses with administrative income, comparing the estimates after each step to the estimates after reweighting and earnings imputation. The full impact of all adjustments is shown in the third column. The 95 percent confidence interval for the last step is shown in each: for A comparing the estimate after earnings imputation to the survey estimate and for B comparing the final NEWS estimate to the estimate after earnings imputation. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Figure A3: Effect of Removing Individual Administrative Income Items on Household Income, Additional Detail

[Figure A3 graphic omitted. Three panels, each with vertical axis Percent Difference from NEWS (-10 to 4) and horizontal axis Household Income Percentile (0 to 100). A. Interest and Dividends, series: Interest (including from Retirement Plans), Interest, Dividends. B. Transfers, series: Social Security, SSI, Social Security & SSI, TANF. C. Wage and Salary Earnings, series: WS Earnings, WS Earnings (Adrecs if Survey == 0).]

Notes: In this figure, we replace individual income items from the NEWS estimates with the corresponding survey information and compare the estimate after replacement with the NEWS estimate. An estimate below the zero line indicates that the administrative item increases income at that percentile. In Panel A, we replace interest and dividend income with survey responses. For survey interest, we show two measures, including and excluding the survey-reported interest earned in Defined Contribution retirement plans such as 401(k)s. In Panel B, we replace Social Security and SSI separately and together (to address misclassification across programs, as discussed in Bee and Mitchell (2017)) and TANF with survey-reported public assistance income. In Panel C, we replace administrative wage and salary earnings with two survey-based earnings measures. In the first, we use survey responses in all cases where the individual does not have administrative self-employment earnings, even if the individual reported no earnings on the survey. In the second, we only replace administrative wage and salary earnings if the survey report was positive. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Figure A4: Effect of Removing Individual Administrative Income Items on Household Income by Householder Age

[Figure A4 graphic omitted. Two panels: A. Under 65 and B. 65 and Over. Vertical axis: Percent Difference from NEWS (-24 to 4). Horizontal axis: Household Income Percentile (0 to 100). Series in both panels: Interest (including from Retirement Plans), Interest & Dividends, Retirement, Social Security & SSI, WS Earnings.]

Notes: In this figure, we replace individual income items from the NEWS estimates with the corresponding survey information and compare the estimate after replacement with the NEWS estimate. An estimate below the zero line indicates that the administrative item increases income at that percentile. We show each of the major administrative income items, including (1) interest (including and excluding the interest earned in Defined Contribution, DC, retirement plans such as 401(k)s), (2) interest (without DC plan interest) and dividends, (3) DC plan withdrawals, pensions, and survivor and disability pensions (Retirement), (4) Social Security and SSI, and (5) wage and salary earnings. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Figure A5: Alternative Uses of Survey and Administrative Earnings

[Figure A5 graphic omitted. Three panels, each with vertical axis Percent Difference from NEWS (-10 to 15) and horizontal axis Percentile (0 to 100). A. Extensive Margin Disagreement, series: Administrative (if != 0), Administrative (even if == 0), Survey Earnings (if != 0), Survey (even if == 0). B. Alternative Kappa Parameters, series: kappa = 0.70, kappa = 0.75, kappa = 0.80, kappa = 0.85, kappa = 0.95, kappa = 1.00, Administrative (if != 0). C. Maximum of Survey and Administrative, series: Administrative (if != 0), Survey Earnings (if != 0), Max.]

Notes: This figure shows the impact on household income (relative to the baseline NEWS estimates) of alternative uses of survey and administrative earnings in the income estimates. In Panel A, we show how income estimates vary when survey or administrative wage and salary earnings are used for individuals indicated as “Measurement error model” in Table A8. The four options in Panel A are: (1) administrative earnings if they are not equal to 0, (2) administrative earnings even if they are equal to 0 and survey earnings are positive, (3) survey earnings if they are not equal to 0, and (4) survey earnings even if they are equal to 0 and administrative earnings are positive. Panel B shows the impact on household earnings of alternative mean-reversion kappa parameters in the measurement error model (with the share of individuals whose survey earnings are used under each shown in Table A9). Panel B also includes (1) from Panel A, with administrative earnings if they are not equal to 0. Panel C compares the NEWS estimates to simpler uses of survey and administrative earnings, including (1) and (3) from Panel A and using the maximum of administrative and survey earnings. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.
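The simple selection rules compared in Panels A and C can be written out as a short sketch. This is an illustration only: the rule names, the function, and the three example earners are made up here, and the fallback to the other source when the preferred source equals 0 in rules (1) and (3) is an assumed reading of the notes. The measurement error model behind the baseline NEWS estimates is not reproduced.

```python
def choose_earnings(survey, admin, rule):
    """Pick wage and salary earnings for one person under a simple rule.

    The rule names are hypothetical labels for options (1)-(4) in Panel A
    and the maximum rule in Panel C.
    """
    if rule == "admin_if_nonzero":    # (1) assumed to fall back to survey if admin == 0
        return admin if admin != 0 else survey
    if rule == "admin_even_if_zero":  # (2) administrative earnings always
        return admin
    if rule == "survey_if_nonzero":   # (3) assumed to fall back to admin if survey == 0
        return survey if survey != 0 else admin
    if rule == "survey_even_if_zero": # (4) survey earnings always
        return survey
    if rule == "max":                 # Panel C: larger of the two reports
        return max(survey, admin)
    raise ValueError(f"unknown rule: {rule}")


# Made-up (survey, administrative) earnings for three illustrative people.
people = [(30_000, 0), (0, 25_000), (40_000, 42_000)]
rules = ["admin_if_nonzero", "admin_even_if_zero",
         "survey_if_nonzero", "survey_even_if_zero", "max"]
results = {r: [choose_earnings(s, a, r) for s, a in people] for r in rules}
```

The first two people disagree on the extensive margin (one source reports 0), so only rules (2) and (4) assign them zero earnings; the third person differs only on the amount.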


Figure A6: Decomposition of NEWS Processing Steps By Subgroup: Median Household Income

A. Survey Steps: Weighting and Earnings Imputation

[Figure A6, Panel A graphic omitted. Horizontal axis: Percent Difference (-30 percent to 30 percent). Series: Reweighted (Nonresponse), + Reweighted for Linkage, + Imputed Earnings. Rows are household subgroups:
Type of Household: All Households; Family households (Married-couple; Female householder, no husband present; Male householder, no wife present); Nonfamily households (Female householder; Male householder).
Race and Hispanic Origin of Householder: White; White, not Hispanic; Black; Asian; Hispanic (any race).
Age of Householder: Under 65 years (15 to 24 years; 25 to 34 years; 35 to 44 years; 45 to 54 years; 55 to 64 years); 65 years and older.
Nativity of Householder: Native born; Foreign born (Naturalized citizen; Not a citizen).
Region: Northeast; Midwest; South; West.
Education (age 25 and older householder): No high school diploma; High school, no college; Some college; Bachelor's degree or higher.
Residence: Inside metropolitan statistical areas (Inside principal cities; Outside principal cities); Outside metropolitan statistical areas.]

Notes: This figure decomposes the impact of the NEWS processing steps on median household income. In Panel A, the figure shows the adjustments made to the survey data, including reweighting and improved earnings imputation, comparing median household income for each group after the adjustment to the survey estimate. In Panel B, the figure shows the impact of replacing survey income responses with administrative income, comparing the estimates after each step to the estimates after reweighting and earnings imputation. The 95 percent confidence interval for the last step is shown in each: for Panel A comparing the estimate after earnings imputation to the survey estimate and for Panel B comparing the final NEWS estimate to the estimate after earnings imputation. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Figure A6: Decomposition of NEWS Processing Steps By Subgroup: Median Household Income, Continued

B. Administrative Income Replacement and Survey Earnings Choice Modeling

[Figure A6, Panel B graphic omitted. Horizontal axis: Percent Difference (-30 percent to 30 percent). Series: + Administrative Income, NEWS (+ Earnings Choice Model). Rows are the same household subgroups as in Panel A, grouped by Type of Household, Race and Hispanic Origin of Householder, Age of Householder, Nativity of Householder, Region, Education, and Residence.]



Figure A7: Decomposition of NEWS Processing Steps By Subgroup: Poverty

A. Survey Steps: Weighting and Earnings Imputation

[Figure A7, Panel A graphic omitted. Horizontal axis: Percentage Point Difference (-5 to 2). Series: Reweighted (Nonresponse), + Reweighted for Linkage, + Imputed Earnings. Rows are subgroups:
All.
Race and Hispanic Origin: White; White, not Hispanic; Black; Asian; Hispanic (any race).
Sex: Male; Female.
Age: Under age 18; Age 18 to 64; Aged 65 and older.
Nativity: Native-born; Foreign-born (Naturalized citizen; Not a citizen).
Region: Northeast; Midwest; South; West.
Disability Status: With a disability; With no disability.
Educational Attainment (aged 25 and older): No high school diploma; High school, no college; Some college; Bachelor's degree or higher.
Residence: Inside metropolitan statistical areas (Inside principal cities; Outside principal cities); Outside metropolitan statistical areas.]

Notes: This figure decomposes the impact of the NEWS processing steps on poverty. In Panel A, the figure shows the adjustments made to the survey data, including reweighting and improved earnings imputation, comparing poverty for each group after the adjustment to the survey estimate. In Panel B, the figure shows the impact of replacing survey income responses with administrative income, comparing the estimates after each step to the estimates after reweighting and earnings imputation. The 95 percent confidence interval for the last step is shown in each: for Panel A comparing the estimate after earnings imputation to the survey estimate and for Panel B comparing the final NEWS estimate to the estimate after earnings imputation. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Figure A7: Decomposition of NEWS Processing Steps By Subgroup: Poverty, Continued

B. Administrative Income Replacement and Survey Earnings Choice Modeling

[Figure A7, Panel B graphic omitted. Horizontal axis: Percentage Point Difference (-5 to 2). Series: + Administrative Income, NEWS (+ Earnings Choice Model). Rows are the same subgroups as in Panel A, grouped by Race and Hispanic Origin, Sex, Age, Nativity, Region, Disability Status, Educational Attainment, and Residence.]



Figure A8: Comparing Bias in Linked Administrative Characteristics with Different Weights

[Figure A8 graphic omitted. Series in all panels: Respondents, Survey, HH EBW, EBW, EBW + PIKed.
A. Linkage Rates by Adrec Data. Horizontal axis: Percentage Point From Target (-3 to 6). Rows: Any Linkage; SSA Data (PHUS, SSR, Numident); IRS Data (IRMF, 1099-R, W-2 or LEHD, Any 1040, 1040 (2018), 1040 (2019)); Census Bureau Data (Decennial, MAFARF); 3rd Party Data (Black Knight).
B. Address-Linked Demographics. Horizontal axis: Percentage Point From Target (-3 to 6). Rows: Age (0-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+); Race/Hispanic Origin (Black, White, Hispanic); Citizen/Foreign-Born (Citizen, Foreign Born).
C. Address-Linked Income. Horizontal axis: Difference From Target (-10000 to 10000). Rows: W-2 Earnings and AGI, each at the 10th, 25th, 50th, 75th, and 90th percentiles.]

Notes: This figure shows various statistics of address-linked administrative, decennial census, and commercial data (refer to Section B.1) using different weights compared to the weighting targets (discussed in Appendix C and shown in Table A5). “Respondents” uses the base weights which adjust only for probability of selection into the sample. “Survey” uses the survey weights. “HH EBW” are the Stage 1 weights that adjust for selection into response at the household level. “EBW” are the Stage 2 weights that further adjust to population controls and “EBW + PIKed” are the Stage 3 weights that further adjust for selection into linkage. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Figure A9: Comparing Survey Characteristics with Different Weights

[Figure A9 graphic omitted. Series in both panels: EBW, EBW + PIKed.
A. Respondent Demographics. Horizontal axis: Percentage Point From Survey-Weighted (-3 to 3). Rows: Age (0-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+); Race/Hispanic Origin (Black, White, Hispanic); Citizen/Foreign-Born (Native-Born Citizen, Foreign-Born Citizen, Non-Citizen); Education (High School, Some College, Bachelors, Masters, Professional); Poverty; Homeowner.
B. Respondent Survey Income. Horizontal axis: Difference From Survey-Weighted (-10000 to 10000). Rows: Person and Household income, each at the 10th, 25th, 50th, 75th, and 90th percentiles.]

Notes: This figure shows various statistics of survey demographics and survey-reported income using the entropy balance weights (discussed in Appendix C) relative to the survey-weighted estimates. “EBW” are the Stage 2 weights that further adjust to population controls and “EBW + PIKed” are the Stage 3 weights that further adjust for selection into linkage. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table A1: Comparing Job-Level LEHD and W-2 Earnings

| LEHD-W-2 Comparison | All | Health Insurance: Yes | Health Insurance: No | Yes - No |
|---|---|---|---|---|
| LEHD < W-2 | 8.7 (0.2) | 9.7 (0.2) | 3.9 (0.3) | 5.85*** (0.30) |
| LEHD ≥ W-2: | | | | |
| 0-1 percent greater | 66.9 (0.3) | 61.8 (0.3) | 89.3 (0.4) | -27.52*** (0.54) |
| 1-3 percent greater | 6.4 (0.1) | 7.5 (0.2) | 2.0 (0.2) | 5.51*** (0.26) |
| 3-5 percent greater | 4.9 (0.1) | 5.8 (0.2) | 1.3 (0.1) | 4.50*** (0.20) |
| 5-10 percent greater | 6.8 (0.1) | 8.0 (0.2) | 1.6 (0.2) | 6.32*** (0.24) |
| 10+ percent greater | 6.3 (0.1) | 7.3 (0.2) | 2.0 (0.2) | 5.34*** (0.25) |
| Observations | 47,000 | 39,000 | 8,100 | |

Notes: This table shows basic summary statistics on job-level comparisons of LEHD earnings to W-2 earnings (including deferred compensation) for the highest earning job. Jobs are classified by the ratio of LEHD to W-2 earnings. The first category, LEHD < W-2, indicates that W-2 earnings exceed LEHD earnings by more than a trivial amount ($100). The other categories indicate that LEHD gross earnings exceeded W-2 earnings + deferred compensation by specific percent ranges. Because LEHD gross earnings should exceed W-2 taxable earnings + deferred compensation primarily due to employee pre-tax contributions to health insurance premiums, the sample in this table includes only individuals that responded to the health insurance question in the CPS ASEC, i.e., whose health insurance status was not imputed. The first column shows the share in each LEHD-W-2 bin for all workers with a job in both data sources. The next two columns show estimates for those that reported having and not having private health insurance, respectively. The last column shows the difference in the share in each bin between those having and not having private health insurance. Standard errors in parentheses. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for differences. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.
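The binning described in these notes can be sketched as a small function. This is a hedged illustration: the function name, the half-open bin edges, and the treatment of jobs where W-2 earnings exceed LEHD earnings by $100 or less (placed in the 0-1 percent bin) are assumptions made here, not details confirmed by the notes.

```python
def classify_job(lehd, w2, trivial=100.0):
    """Assign a job to an LEHD-W-2 comparison bin.

    lehd: LEHD gross earnings for the job.
    w2:   W-2 earnings plus deferred compensation for the same job.
    """
    if w2 <= 0:
        raise ValueError("W-2 earnings must be positive to compute a ratio")
    # W-2 exceeds LEHD by more than the trivial amount ($100 in the notes).
    if w2 - lehd > trivial:
        return "LEHD < W-2"
    # Percent by which LEHD exceeds W-2 (slightly negative for trivial gaps).
    pct = 100.0 * (lehd - w2) / w2
    if pct < 1:
        return "0-1 percent greater"
    if pct < 3:
        return "1-3 percent greater"
    if pct < 5:
        return "3-5 percent greater"
    if pct < 10:
        return "5-10 percent greater"
    return "10+ percent greater"


# Made-up examples: (LEHD earnings, W-2 earnings).
examples = [(50_000, 50_500), (50_000, 50_050), (51_000, 50_000),
            (52_000, 50_000), (56_000, 50_000)]
bins = [classify_job(lehd, w2) for lehd, w2 in examples]
```

The shares in Table A1 would then be the weighted fraction of jobs in each bin, computed separately by reported private health insurance status.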


Table A2: Direct and Indirect Job Linkage Statistics

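| Total Jobs | All Jobs | EIN Matches Only: Unmatched Jobs | EIN Matches Only: Share of Implied Total | EIN and Indirect Matches: Unmatched Jobs | EIN and Indirect Matches: Share of Implied Total |
|---|---|---|---|---|---|
| W-2 | 256,800,000 | 40,720,000 | 0.146 | 25,680,000 | 0.097 |
| LEHD | 237,900,000 | 21,780,000 | 0.078 | 6,744,000 | 0.026 |
| EIN Matches | 216,100,000 | | 0.776 | | 0.820 |
| Indirect Matches | 15,040,000 | | | | 0.057 |

Implied Total Jobs: 278,600,000 (EIN matches only); 263,600,000 (EIN and indirect matches).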

Notes: This table shows the count of jobs that could be directly linked by Employer Identification Number (EIN) and indirectly linked as discussed in Section A.3. Source: 2018 W-2 and Longitudinal Employer-Household Dynamics data.
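The implied totals and shares in Table A2 follow from simple arithmetic on the job counts: the implied total is the number of matched jobs plus the jobs left unmatched on each file. The sketch below recomputes them from the published counts. The function and variable names are introduced here for illustration, and because the published counts are rounded, the recomputed figures agree with the table only up to rounding.

```python
# Published (rounded) counts from Table A2.
W2_TOTAL = 256_800_000
LEHD_TOTAL = 237_900_000
EIN_MATCHES = 216_100_000
INDIRECT_MATCHES = 15_040_000


def implied_total(w2_total, lehd_total, matched):
    """Matched jobs count once; unmatched jobs on either file count once each."""
    unmatched_w2 = w2_total - matched
    unmatched_lehd = lehd_total - matched
    return matched + unmatched_w2 + unmatched_lehd


# EIN matches only.
total_ein = implied_total(W2_TOTAL, LEHD_TOTAL, EIN_MATCHES)
share_w2_unmatched_ein = (W2_TOTAL - EIN_MATCHES) / total_ein
share_lehd_unmatched_ein = (LEHD_TOTAL - EIN_MATCHES) / total_ein
share_ein_matched = EIN_MATCHES / total_ein

# EIN and indirect matches together.
matched_all = EIN_MATCHES + INDIRECT_MATCHES
total_all = implied_total(W2_TOTAL, LEHD_TOTAL, matched_all)
share_w2_unmatched_all = (W2_TOTAL - matched_all) / total_all
share_lehd_unmatched_all = (LEHD_TOTAL - matched_all) / total_all
share_ein_all = EIN_MATCHES / total_all
share_indirect_all = INDIRECT_MATCHES / total_all
```

Adding the indirect matches lowers the implied total because each indirect match merges one previously unmatched W-2 job and one previously unmatched LEHD job into a single job.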


Table A3: Weighted Linkage Rates by Administrative Data Source in the Address Data

| Data Source | Target Estimate: Base-Weighted, Occupied Units | Difference from Target: Base-Weighted, Respondent Units | Difference from Target: Survey Weighted, Respondent Units | Difference from Target: EBW-Weighted, Respondent Units | Difference from Target: EBW-Weighted, Respondent + All Adults PIKed Units |
|---|---|---|---|---|---|
| Any Linkage | 0.932*** (0.002) | 0.0037*** (0.0006) | 0.0047*** (0.0010) | -0.0006 (0.0006) | -0.0012*** (0.0005) |
| SSA Data: | | | | | |
| PHUS | 0.402*** (0.002) | 0.0584*** (0.0010) | 0.0427*** (0.0019) | Z (0.0024) | 0.0001 (0.0043) |
| SSR | 0.050*** (0.001) | 0.0050*** (0.0004) | 0.0003 (0.0007) | Z (0.0010) | Z (0.0015) |
| Numident | 0.921*** (0.002) | 0.0046*** (0.0006) | 0.0058*** (0.0012) | Z (0.0007) | Z (0.0004) |
| IRS Data: | | | | | |
| IRMF | 0.837*** (0.002) | 0.0085*** (0.0008) | 0.0067*** (0.0014) | -0.0005 (0.0013) | -0.0018 (0.0014) |
| 1099-R | 0.436*** (0.002) | 0.0127*** (0.0010) | 0.0070*** (0.0018) | Z (0.0006) | Z (0.0019) |
| Any 1040 | 0.856*** (0.002) | 0.0018*** (0.0007) | 0.0055*** (0.0013) | Z (0.0009) | 0.0001 (0.0006) |
| 1040 (2018) | 0.828*** (0.002) | 0.0027*** (0.0008) | 0.0068*** (0.0014) | Z (0.0005) | 0.0001 (0.0008) |
| 1040 (2019) | 0.835*** (0.002) | 0.0021*** (0.0008) | 0.0055*** (0.0014) | Z (0.0009) | 0.0001 (0.0007) |
| W-2 or LEHD | 0.751*** (0.002) | -0.0060*** (0.0008) | 0.0037** (0.0017) | Z (0.0010) | 0.0001 (0.0010) |
| Census Bureau Data: | | | | | |
| Decennial | 0.867*** (0.002) | 0.0084*** (0.0008) | 0.0083*** (0.0013) | Z (0.0013) | 0.0001 (0.0015) |
| MAFARF | 0.822*** (0.002) | 0.0092*** (0.0009) | 0.0065*** (0.0014) | Z (0.0022) | -0.0014 (0.0031) |
| 3rd Party Data: | | | | | |
| Black Knight | 0.644*** (0.003) | 0.0119*** (0.0011) | 0.0071*** (0.0020) | Z (0.0019) | Z (0.0034) |

Notes: This table shows statistics on selection into response at the household level by data source that can be linked to occupied housing units, as discussed in Section B.1. The target estimate is calculated on the base-weighted set of all occupied housing units in the March monthly CPS. The other estimates show differences from the target (evidence of selection into the sample unaddressed by weighting if ≠ 0) for the indicated samples of respondents and weights. Standard errors in parentheses. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for differences. Z indicates an estimate rounds to zero. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table A4: Linkage Rates by Administrative Data Source in the Person Data

| | Full Sample: Survey-Adults (15+) | Full Sample: Survey-Children (<15) | NEWS Sample (All Survey-Adults in HH Assigned PIK): Survey-Adults | NEWS Sample (All Survey-Adults in HH Assigned PIK): Survey-Children |
|---|---|---|---|---|
| Assigned PIK | 85.8 (0.18) | 79.4 (0.33) | 100.0 | 89.4 (0.30) |
| Any Adrec Linked to Address: | | | | |
| If Assigned PIK | 94.7 (0.15) | 95.6 (0.22) | 93.9 (0.16) | 95.0 (0.26) |
| If Not Assigned PIK | 89.9 (0.40) | 92.6 (0.48) | | 92.3 (0.88) |
| Present in (if assigned PIK): | | | | |
| Any Administrative Record | 98.1 (0.05) | 85.2 (0.30) | 98.0 (0.07) | 87.4 (0.33) |
| IRS Data: | | | | |
| Tax Filing (1040) | 84.6 (0.17) | 83.2 (0.30) | 84.4 (0.19) | 85.6 (0.34) |
| IRMF | 89.4 (0.10) | 7.8 (0.22) | 88.2 (0.12) | 7.5 (0.24) |
| W-2 | 64.3 (0.16) | 1.0 (0.07) | 63.9 (0.17) | 1.0 (0.08) |
| 1099-R | 21.1 (0.14) | 0.1 (0.02) | 20.1 (0.13) | Z (0.02) |
| SSA Data: | | | | |
| DER | 67.6 (0.16) | 0.3 (0.04) | 67.2 (0.17) | 0.3 (0.05) |
| PHUS | 37.8 (0.16) | 3.9 (0.16) | 35.2 (0.16) | 3.5 (0.16) |
| SSR | 3.6 (0.09) | 1.3 (0.10) | 3.4 (0.09) | 1.2 (0.10) |
| State Data: | | | | |
| LEHD | 64.3 (0.16) | 1.0 (0.07) | 63.9 (0.17) | 1.0 (0.08) |

Notes: This table shows statistics on the individuals that can be assigned a PIK as well as the households in which those 15 and over (survey-adults) can be assigned a PIK. For all households and the 82 percent of households with all survey-adults assigned a PIK (the NEWS analysis sample), we show the share of survey-adults and survey-children that can be linked to various data sets. Estimates and standard errors that are 0 by construction are omitted. Z indicates an estimate rounds to zero. Standard errors in parentheses. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table A5: Entropy Balance Reweighting Procedure

| Stage/Step | Moment Variables | Moment Sample | Reweighted Sample |
|---|---|---|---|
| 1. Housing-unit level | Linked survey, administrative, and census variables | Non-vacant housing units in March Basic CPS (respondents and nonrespondents) | Respondent housing units |
| 2. Person level | | | |
| A. Preserve distribution of housing unit characteristics | Linked survey, administrative, and census variables | Householders and householder-partners, using the housing-unit level weights from Stage 1 | Householders and householder partners |
| B. Spousal equivalence | Linked survey, administrative, and census variables | Married couples and cohabiting partners | Married couples and cohabiting partners |
| C. External population targets | State-level population estimates by race, Hispanic-origin, gender, and age | External population estimates | All individuals |
| D. Match distribution of household characteristics in March Basic Sample | Subset of linked survey, administrative, and census variables and state-level population controls | Householders and householder partners in the March Basic File | Householders and householder partners in the full CPS ASEC sample |
| 3. Address Selection into PIK assignment (for all adults in HH) | | | |
| A. Preserve distribution of respondent and housing unit characteristics | Linked survey, administrative, and census variables. Additional moments for survey-only and linked survey-administrative characteristics from full respondent sample | Respondent sample with weights from step 2. | Households where all individuals asked income questions (age 15+) are linked to a PIK. |
| B. External population targets | State-level population estimates by race, Hispanic-origin, gender, and age | External population estimates | |

Notes: This table describes the entropy balance reweighting procedure. In the first stage, respondent housing units are reweighted to control for selection into response. This is done by reweighting them to match the characteristics of the target population – all nonvacant housing units in sample. In the second stage, we estimate individual weights that preserve the distribution of housing-unit characteristics from the first stage, while also matching external population totals and approximating the spousal equivalence of weights that are a part of the existing CPS ASEC weights, as in Rothbaum and Bee (2022). To address selection into PIK assignment (and the availability of administrative data), we add a third-stage weighting adjustment.
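To make the idea of a single reweighting stage concrete, the sketch below implements a generic entropy balancing step: it finds the weights closest (in relative entropy) to a set of base weights such that weighted means of chosen characteristics match target moments. It is a minimal illustration under stated assumptions and not the production procedure: the function name, the Newton solver with step halving, and the toy data (an indicator for being linked to a 1040 and age) are all introduced here, and the multi-stage chaining, spousal equivalence, and population controls in Table A5 are not reproduced.

```python
import numpy as np


def entropy_balance(X, target_means, base_weights=None, tol=1e-8, max_iter=100):
    """Return weights w (summing to 1) with w @ X equal to target_means.

    Weights have the form w_i proportional to q_i * exp(lam . (x_i - target)),
    where q are the base weights; lam minimizes the convex dual objective.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    q = np.full(n, 1.0 / n) if base_weights is None else np.asarray(base_weights, dtype=float)
    q = q / q.sum()
    Z = X - np.asarray(target_means, dtype=float)  # deviations from targets

    def dual(lam):
        a = Z @ lam
        return a.max() + np.log(np.sum(q * np.exp(a - a.max())))

    lam = np.zeros(X.shape[1])
    for _ in range(max_iter):
        a = Z @ lam
        w = q * np.exp(a - a.max())
        w /= w.sum()
        grad = Z.T @ w  # weighted mean deviation from the targets
        if np.max(np.abs(grad)) < tol:
            return w
        # Hessian of the dual is the weighted covariance of the deviations.
        hess = (Z * w[:, None]).T @ Z - np.outer(grad, grad)
        step = np.linalg.solve(hess, grad)
        t = 1.0
        # Halve the Newton step until the dual objective does not increase.
        while dual(lam - t * step) > dual(lam) + 1e-12 and t > 1e-8:
            t /= 2.0
        lam = lam - t * step
    raise RuntimeError("entropy balancing did not converge")


# Toy respondent sample (made up): columns are linked-to-1040 indicator and age.
X = np.array([[1, 30], [1, 45], [0, 60], [1, 50], [0, 25], [0, 70]], dtype=float)
# Hypothetical target moments from the full (respondent + nonrespondent) sample.
targets = np.array([0.4, 50.0])
weights = entropy_balance(X, targets)
```

In a multi-stage setup like the one in Table A5, the output weights of one stage would serve as the `base_weights` of the next, with a different set of moment variables and a different reweighted sample at each stage.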


Table A6: Imputation Summary Statistics: Survey Earnings

| Statistic | W-2 Earnings | Respondents | Imputed: Survey | Imputed: SRMI | SRMI - Survey estimate (percent difference for dollar values) |
|---|---|---|---|---|---|
| Has Survey Earnings | = 0 | 0.181 | 0.282 | 0.230 | -0.052*** (0.007) |
| | != 0 | 0.908 | 0.860 | 0.907 | 0.046*** (0.005) |
| | q = 1 | 0.676 | 0.623 | 0.706 | 0.083*** (0.014) |
| | q = 2 | 0.924 | 0.842 | 0.921 | 0.079*** (0.009) |
| | q = 3 | 0.967 | 0.928 | 0.961 | 0.033*** (0.008) |
| | q = 4 | 0.984 | 0.960 | 0.978 | 0.018*** (0.006) |
| | q = 5 | 0.985 | 0.960 | 0.973 | 0.013** (0.006) |
| Average Wage and Salary Earnings (from main job) | = 0 | 45,760 | 43,550 | 40,440 | -0.071 (0.061) |
| | != 0 | 55,520 | 52,470 | 53,330 | 0.016 (0.047) |
| | q = 1 | 11,960 | 22,010 | 20,840 | -0.053 (0.084) |
| | q = 2 | 23,540 | 29,810 | 26,300 | -0.118* (0.055) |
| | q = 3 | 37,750 | 43,950 | 37,910 | -0.137** (0.045) |
| | q = 4 | 57,340 | 62,050 | 56,790 | -0.085 (0.058) |
| | q = 5 | 120,300 | 100,000 | 124,900 | 0.248*** (0.061) |
| Median Wage and Salary Earnings (from main job) | = 0 | 25,900 | 30,210 | 31,360 | 0.038 (0.092) |
| | != 0 | 41,200 | 37,690 | 37,090 | -0.016 (0.047) |
| | q = 1 | 6,747 | 12,400 | 13,780 | 0.111 (0.158) |
| | q = 2 | 20,720 | 24,660 | 22,160 | -0.102 (0.055) |
| | q = 3 | 35,630 | 36,250 | 33,570 | -0.074 (0.055) |
| | q = 4 | 55,350 | 51,490 | 52,060 | 0.011 (0.045) |
| | q = 5 | 100,300 | 78,690 | 97,460 | 0.238** (0.073) |

Notes: This table shows basic summary statistics of survey wage and salary earnings conditional on W-2 earnings (having a W-2 and by W-2 earnings quintile for q = 1,2,3,4,5). Each row shows the relevant survey wage and salary earnings statistic for survey earnings respondents, imputed as part of regular survey production and by SRMI, as discussed in Appendix D. Standard errors in parentheses. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for differences. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table A7: Imputation Summary Statistics: Means-Tested Benefits

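| Program | Source | Statistic | Administrative data available: Yes | Administrative data available: No | Difference (No - Yes) | Diff in Diff: (Adrec - Survey) and (No - Yes) |
|---|---|---|---|---|---|---|
| TANF | Survey | Receipt | 1.03 (0.08) | 1.05 (0.08) | 0.02 (0.11) | 0.17 (0.20) |
| TANF | Survey | Amount | 3,054 (205) | 3,937 (331) | 882** (391) | -975** (471) |
| TANF | Administrative | Receipt | 0.78 (0.06) | 0.97 (0.16) | 0.19 (0.18) | |
| TANF | Administrative | Amount | 2,604 (168) | 2,511 (244) | -93 (293) | |
| SNAP | Survey | Receipt | 9.85 (0.32) | 9.28 (0.22) | -0.57* (0.38) | -0.42 (0.51) |
| SNAP | Survey | Amount | 2,363 (70) | 2,345 (51) | -18 (87) | 73 (120) |
| SNAP | Administrative | Receipt | 16.11 (0.44) | 15.12 (0.39) | -0.99* (0.58) | |
| SNAP | Administrative | Amount | 2,807 (60) | 2,862 (80) | 55 (100) | |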

Notes: This table shows basic summary statistics of means-tested benefits imputed for incomplete state-level administrative data. For both TANF and SNAP, the first rows show how survey responses vary across states with and without administrative records and the next set of rows show the administrative and imputed estimates. For each, we then compare the states without administrative data (No) to the states with (Yes) and take the difference in difference by comparing the administrative (No - Yes) to the survey (No - Yes). The means-tested benefit imputation is discussed in Appendix D. Standard errors in parentheses. ***, **, and * indicate significance at the 1, 5, and 10 percent levels and are only shown for differences. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.
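
The difference-in-difference column of Table A7 can be reproduced from the two (No - Yes) differences. The following is a minimal Python sketch using the rounded values printed in the table, so agreement holds only up to rounding; the dictionary layout and the function name `diff_in_diff` are illustrative and not part of the authors' code.

```python
# Rounded (No - Yes) differences as printed in Table A7.
# Keys are (program, statistic); values are (survey_diff, admin_diff).
no_minus_yes = {
    ("TANF", "receipt"): (0.02, 0.19),
    ("TANF", "amount"): (882, -93),
    ("SNAP", "receipt"): (-0.57, -0.99),
    ("SNAP", "amount"): (-18, 55),
}


def diff_in_diff(survey_diff, admin_diff):
    """Administrative (No - Yes) minus survey (No - Yes)."""
    return admin_diff - survey_diff


# Round to two decimals to match the precision of the printed table.
did = {key: round(diff_in_diff(*vals), 2) for key, vals in no_minus_yes.items()}

for key, value in did.items():
    print(key, value)
```

The four results correspond to the Diff in Diff entries reported on the survey rows for TANF and SNAP.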


Table A8: Combining Survey and Administrative Earnings

A. By Reported Earnings Type and Source Survey Administrative Rule Percent of Sample

Wage and Salary Self Employment Wage and Salary Self Employment Wage and Salary Self Employment All Adults Any Earnings

X X X X Job-level administrative 1040 (from TMI) 0.4 0.6 X X X Job-level administrative 1040 (from TMI) 0.4 0.6

X X X Job-level administrative 1040 (from TMI) 4.1 5.7 X X Job-level administrative 1040 (from TMI) 0.4 0.5

X X X None (administrative) 1040 (from TMI) 0.7 1.0 X X None 1040 (from TMI) 1.5 2.1

X X None (administrative) 1040 (from TMI) 1.3 1.7 X None 1040 (from TMI) 1.2 1.7

X X X Measurement error model Survey 1.8 2.4 X X Measurement error model 0.8 1.1

X X Measurement error model None 50.5 70.1 X Job-level administrative None 5.6 7.7

X X Survey Survey 0.8 1.1 X None Survey 1.0 1.4

X Survey None 1.6 2.3 None None 28.0

B. By Combination Rule

| Combination Rule | Percent of Sample: All Adults | Percent of Sample: Any Earnings |
|---|---|---|
| Simple - no earnings or only earnings in one source | 38.6 | 14.7 |
| Earnings Choice | 53.0 | 73.6 |
| Default to administrative data due to data issues (potential misclassification, missing self-employment, etc.) | 8.4 | 11.7 |

Notes: This table describes the possible combinations of survey and administrative reports of wage and salary and self-employment earnings as well as our rules for when we use survey and administrative reports for each. If the administrative wage and salary earnings on the 1040 is positive but there are no reported job-level administrative earnings, then we use the 1040 value when the rule indicates use of the job-level data. “All adults” includes anyone 15 or over as they are asked survey earnings questions. The sample only includes individuals in the NEWS sample. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table A9: Combining Administrative and Survey Earnings: Share with Survey Earnings by Mean Reversion Parameter Kappa

| Kappa | Share with Survey Earnings |
|---|---|
| 0.7 | 5.8 (1.1) |
| 0.75 | 8.4 (1.5) |
| 0.8 | 11.8 (2.0) |
| 0.85 | 16.0 (2.3) |
| 0.9 (NEWS) | 20.6 (2.7) |
| 0.95 | 25.8 (3.4) |
| 1 | 30.9 (3.8) |

Notes: This table shows how variation in the mean-reversion parameter kappa in the measurement error model affects the share of individuals whose survey wage and salary earnings are used. Figure A5 shows how the household income distribution differs under these alternatives. Standard errors in parentheses. Source: 2019 Current Population Survey Annual Social and Economic Supplement linked to administrative, decennial census, and commercial data.


Table A10: Income Type by Source for Filers and Nonfilers

| Income Type | Source: Filers | Source: Nonfilers | Notes |
|---|---|---|---|
| Wage and Salary Earnings | W-2, DER, LEHD, 1040 | W-2, DER, LEHD | Administrative data may miss unreported "under-the-table" earnings. Current W-2s and DER do not include pre-tax employee contributions to health insurance premiums. LEHD does not have complete coverage. Survey has potential for misreporting and underreporting. |
| Self-Employment Earnings | 1040, DER | Survey only | Under-reported substantially on surveys and in administrative records. Considerable disagreement between extensive margin reporting on surveys and administrative data (Abraham et al., 2021). |
| Social Security | 1040, PHUS | PHUS | |
| Supplemental Security | SSR | SSR | |
| Unemployment Insurance | 1040 | Survey only | Included in 1040 Total Money Income. Imputed for nonfilers using disclosed results from more detailed 1099-G data. |
| Worker's Compensation | Survey only | Survey only | Not available in federal administrative data. |
| Public Assistance | TANF | TANF | Current data only covers some states. TANF data does not cover all possible cash assistance programs. |
| Veteran's Benefits | Survey only | Survey only | Potential for VA data use in the future. |
| Disability, Survivor, and Retirement Income | 1099-R | 1099-R | |
| Interest | 1040 | Survey only | Imputed for nonfilers using disclosed results from more detailed 1099-INT data. |
| Dividends | 1040 | Survey only | Imputed for nonfilers using disclosed results from more detailed 1099-DIV data. |
| Rent and Royalty Income | 1040 | Survey only | Net rent and royalty income included in 1040 Total Money Income. Gross rent and royalty income available as a separate variable. |
| Educational Assistance | Survey only | Survey only | |
| Financial Assistance | Survey only | Survey only | |
| Alimony | 1040 | Survey only | Included in 1040 Total Money Income. |
| Gambling Winnings | 1040 | Survey only | Included in 1040 Total Money Income. Potentially available on survey as "other income." |

Notes: This table describes the available data sources for the various types of income, including notes about the limitations of various sources. The availability of income varies between filers and nonfilers, with more income sources available in the currently available administrative records for filers.


Appendices

A Data Linkage

A.1 Person Linkage46

The Census Bureau developed the Person Identification Validation System (PVS) to probabilistically

match individuals’ records in survey and other data to their SSN or Individual Taxpayer

Identification Number (ITIN) using personally identifying information (PII), such as name, date

of birth, and residential address (Wagner and Layne, 2014). Linked records are assigned a Protected

Identification Key (PIK), and the PII and SSN or ITIN are removed. The PIK serves as the

anonymized linkage key to match individuals across data sets.

As a result, if PVS is unable to assign a PIK to a given survey respondent, no administrative data

are available for that respondent. Bollinger et al. (2019) found a linkage rate in their CPS ASEC

sample (2006-2011) of 86 percent, which matches our estimate for the 2019 CPS ASEC. Because

observable characteristics, such as race, ethnicity, citizenship status, etc., are correlated with PIK

assignment (Bond et al., 2014), we must account for this selection into linkage in our estimates,

which we discuss in Appendix C.

A.2 Address Linkage

Brummet (2014) describes the development and performance of the system used to link household

records, via residential address fields, to the Master Address File (MAF), called the “MAF Match.”

Information such as house number (and suffix, such as apartment number), street name (and

prefix/suffix, such as rural routes or state highway identifiers), city, state, ZIP code, etc. is used to

link addresses in each data set to the MAF, to assign them MAFIDs.

As with PIKs, this means that if the MAF Match process is unable to assign a MAFID to an

address, the information associated with that address in that data source cannot be linked to other

address-level data. For recent years of surveys such as the ACS, CPS ASEC, and SIPP, every

housing unit has a MAFID because the sample was drawn directly from the MAF.

46The discussion in this section follows Bee and Rothbaum (2019) closely.

A.3 Job Linkage

The W-2, DER, and LEHD files all have information on individual jobs. However, unlike the LEHD,

the W-2s and DER do not capture gross earnings. The Census Bureau receives W-2 extracts from

the IRS that include Box 1 “Wages, tips, and other compensation,” Box 3 “Social Security wages,”

and the sum of deferred compensation in Box 12 codes D-H.47 We only observe taxable earnings and

deferred compensation, but not other non-taxable earnings. We therefore do not have information

on pre-tax employee payments for health insurance and other forms of pre-tax compensation not

available in the extract provided by the IRS, such as contributions to Health Savings Accounts. In

most of this section, we will primarily discuss W-2s and not the DER, as the two are identical for

most workers for whom the DER is available.

Not all jobs are covered by unemployment insurance, and thus some jobs are out of universe for the

LEHD. This includes all federal government employees and some private sector employees.48

In the earnings question on the CPS ASEC and ACS, respondents are asked to report “money

income”, which includes gross wage and salary earnings. To match this concept, we would like

gross earnings for each individual job, which we could then use to estimate person-level gross

earnings. However, we have gross earnings for only a subset of jobs (from the LEHD) and taxable

earnings + deferred compensation from the universe of jobs (from W-2s). Because the LEHD

includes a subset of jobs we should observe in W-2s, it is possible for an individual to have one job

in the LEHD and two in the W-2s. Therefore, we cannot just sum the earnings within each source

and take the maximum, because the one with the higher value (in this case, W-2 earnings from two

jobs) may understate this individual’s true gross earnings.

47These codes include elective deferrals to plans under Box 12 codes D: 401(k), E: 403(b), F: 408(k)(6), G: 457(b), and H: 501(c)(18)(D). These boxes cover 96.3 percent of all elective retirement contributions on W-2s, calculated from IRS Statistics of Income Tax Stats for Individual Information Return Form W-2 Statistics, Table 7.A at https://www.irs.gov/statistics/soi-tax-stats-individual-information-return-form-w2-statistics, accessed 11/17/2021.

48For example, Maryland’s Department of Labor lists the following jobs as exempt: barbers and beauticians, taxicab drivers, owner-operated tractor drivers in certain E and F classifications, maritime employment, election workers, church employees, clergy, certain governmental employees, railroad employment, newspaper delivery, insurance sales, real estate sales, messenger service, direct sellers, foreign employment, other state unemployment insurance programs, work-relief and work-training, family members, hospital patients, student nurses or interns, yacht salespersons who work for a licensed trader on solely a commission basis, services of aliens who are students, scholars, trainees, teachers, etc., who enter the U.S. solely to pursue a full course of study at certain vocational and other non-academic institutions, recreational sports officials, home workers, and casual labor. Refer to https://www.dllr.state.md.us/employment/empfaq.shtml, accessed 11/1/2022.

Therefore, we would like to combine the LEHD and W-2 records at the job level. For an individual

with one LEHD job and two W-2 jobs, we would then observe gross earnings for one job and

taxable earnings plus deferred compensation for the other. For the second job, we could impute

gross earnings conditional on the other information observed about them (discussed in Appendix

D) and then sum the job-level gross earnings to estimate their administrative gross earnings.

However, linking LEHD and W-2 jobs is not trivial. In the simplest case, a firm files a W-2 and

reports the job to the UI office with the same EIN. We can link these “direct matches” by PIK

and EIN. However, some firms do not file their W-2s and UI reports under the same EIN, and

some firms use multiple EINs in one source but a single EIN in the other (i.e., a separate EIN for

each state’s employment in the LEHD but one EIN in the W-2s). Other firms use other identifiers,

such as state EINs, when they report jobs to UI offices. Therefore, we cannot directly link many

jobs between the LEHD and W-2 files using PIK/EIN combinations. Since nearly all jobs in both

files include a PIK, we can create a set of possible matches that match on PIK but not EIN. We

can then identify the W-2 EINs that correspond to a different EIN or state EIN in the LEHD by

looking across all workers with unmatched jobs. We create a W-2 EIN to LEHD EIN crosswalk of

these “indirect match” jobs.

An example of how we find direct and indirect matches is shown in Figure A1. In the example, we

have three workers (PIK = 1, 2, 3) and their W-2 and LEHD jobs. For EIN = 400 and 500, the

jobs match at the PIK-EIN level. However, EINs 100 and 600 in the W-2s and 200 in the LEHD do

not match. Each worker with EIN = 100 in the W-2s also has a job with EIN = 200 in the LEHD

and each of those jobs has similar earnings on the two files. We use this information to infer that

W-2 EIN 100 is the same firm as LEHD EIN 200. We would then be left with the W-2 job at EIN

= 600 that does not match to any job in the LEHD, perhaps representing a job that is not covered

by unemployment insurance.
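
The distinction between direct matches and possible indirect matches can be sketched in Python. The earnings values and the one-job-per-PIK-EIN layout below are hypothetical, chosen only to mirror the structure described for Figure A1 (three workers, W-2 EINs 100, 400, 500, and 600, and LEHD EINs 200, 400, and 500); this is an illustration, not the production linkage code.

```python
# Hypothetical job records keyed by (PIK, EIN) with annual earnings as values.
w2 = {
    (1, 100): 30000, (2, 100): 45000, (3, 100): 52000,
    (1, 400): 12000, (2, 500): 8000, (3, 600): 5000,
}
lehd = {
    (1, 200): 30500, (2, 200): 46000, (3, 200): 52500,
    (1, 400): 12400, (2, 500): 8100,
}

# Direct matches: the same PIK-EIN combination appears in both files.
direct = sorted(w2.keys() & lehd.keys())

# Jobs left unmatched after the direct match.
w2_unmatched = {k: v for k, v in w2.items() if k not in lehd}
lehd_unmatched = {k: v for k, v in lehd.items() if k not in w2}

# Possible indirect matches: same PIK, different EIN.
possible = [
    (pik_w, ein_w, ein_l)
    for (pik_w, ein_w) in sorted(w2_unmatched)
    for (pik_l, ein_l) in sorted(lehd_unmatched)
    if pik_w == pik_l
]

print("direct:", direct)
print("possible (PIK, W-2 EIN, LEHD EIN):", possible)
```

Note that the W-2 job at EIN 600 also appears as a candidate against LEHD EIN 200 because candidates are formed on PIK alone; earnings comparisons are what later distinguish plausible pairs from implausible ones.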

To create a crosswalk of all indirect matches between W-2 and LEHD EINs, we develop an iterative


algorithm using three pieces of information:

1. The difference in earnings reported on the W-2 and LEHD for the possible job match,

2. The share of jobs in the W-2 EIN that match to the same LEHD EIN and the share of jobs

from the LEHD EIN that match to the same W-2 EIN, and

3. The number of likely matches between a W-2 EIN and an LEHD EIN.

For the first rule, we can identify matches as likely if the W-2 and LEHD earnings are within some

percent of each other. For the second, we can only keep matches in the crosswalk if many or most

of the jobs in a W-2 or LEHD EIN are identified as likely matches to a single EIN on the other file.

For the third, we may be more confident of a possible match if 100 jobs are all flagged as likely

matches than if two are.

We create an iterative process to create our indirect matches where we set the thresholds for

each of these three possible rules to identify likely matches. We identify the W-2 EIN-LEHD EIN

combinations that match under these thresholds, add those combinations to our crosswalk and then

remove the matched jobs from our possible match dataset. The removed jobs include all jobs with

those pairs of EINs, not just the ones flagged as likely matches by our percent difference cutoff.

We then repeat the process with the remaining jobs after adjusting the thresholds used to identify

possible matches. The goal of the iterative process is to first add the matches we are sure of from

the set of unmatched jobs (large firms, for example) before we match jobs from smaller firms or

with larger differences in earnings across the files.

For example, in the first pass at identifying indirect matches, we flag jobs as likely matches if the

W-2 and LEHD earnings are within 10 percent of each other. We then keep the W-2 EIN-LEHD

EIN combinations where 50 percent or more of them match in one direction or the other - i.e., 50

percent of jobs at a W-2 EIN match to the same LEHD EIN or 50 percent of jobs at the LEHD

EIN match to the same W-2 EIN. Finally, we only keep EIN matches for the crosswalk if at least

5 jobs match.

In the example in Figure A1, there are three jobs at W-2 EIN = 100 and LEHD EIN = 200 that are


within 10 percent of each other and flagged as likely matches. All jobs in W-2 EIN = 100 match to

LEHD EIN = 200 (and vice versa). This combination meets the first two conditions. However, the

number of matches is 3, which is less than the threshold of 5 so this combination of EINs would not

be flagged as a match. These jobs would be kept in the set of unmatched jobs for the next round

of the process.

In subsequent rounds, we can (1) increase the tolerance on likely matches (e.g., from 10 to 20 percent

difference in earnings), (2) reduce the share of matches needed within W-2 or LEHD EINs (e.g., from

50 percent to 25 percent), or (3) lower the threshold of likely matches needed to confirm a match

(e.g., from 5 to 3). From Figure A1, if we lowered the number of likely matches to 3, then we would

count W-2 EIN = 100, LEHD EIN = 200 as an indirect match, add that match to our crosswalk,

and remove the matches under Indirect Matches from the set of unmatched jobs.49
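
The iterative process can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the percent difference is taken relative to the mean of the two earnings values (the text does not define it), the earnings data are hypothetical values shaped like the Figure A1 example, and the two-step threshold schedule follows the illustration in the text (minimum of 5 likely matches, then 3).

```python
from collections import Counter


def pct_diff(a, b):
    """Absolute difference relative to the mean of the two values (assumed definition)."""
    return abs(a - b) / ((a + b) / 2)


def one_pass(w2_left, lehd_left, tol, min_share, min_count):
    """Return the (W-2 EIN, LEHD EIN) pairs accepted under one set of thresholds."""
    likely = Counter()  # (w2_ein, lehd_ein) -> number of likely job matches
    for (pik, w2_ein), w2_earn in w2_left.items():
        for (pik_l, lehd_ein), lehd_earn in lehd_left.items():
            if pik == pik_l and pct_diff(w2_earn, lehd_earn) <= tol:
                likely[(w2_ein, lehd_ein)] += 1
    w2_size = Counter(ein for _, ein in w2_left)      # unmatched jobs per W-2 EIN
    lehd_size = Counter(ein for _, ein in lehd_left)  # unmatched jobs per LEHD EIN
    accepted = set()
    for (w2_ein, lehd_ein), n in likely.items():
        # The share rule may be met in either direction.
        share = max(n / w2_size[w2_ein], n / lehd_size[lehd_ein])
        if share >= min_share and n >= min_count:
            accepted.add((w2_ein, lehd_ein))
    return accepted


def build_crosswalk(w2_left, lehd_left, schedule):
    """Apply successively looser thresholds, removing matched jobs after each pass."""
    w2_left, lehd_left = dict(w2_left), dict(lehd_left)
    crosswalk = []
    for tol, min_share, min_count in schedule:
        accepted = one_pass(w2_left, lehd_left, tol, min_share, min_count)
        for w2_ein, lehd_ein in sorted(accepted):
            crosswalk.append((w2_ein, lehd_ein))
            # Remove every job pair at these EINs, not only the flagged ones.
            piks = ({p for p, e in w2_left if e == w2_ein}
                    & {p for p, e in lehd_left if e == lehd_ein})
            for p in piks:
                del w2_left[(p, w2_ein)]
                del lehd_left[(p, lehd_ein)]
    return crosswalk, w2_left, lehd_left


# Hypothetical unmatched jobs keyed by (PIK, EIN), shaped like Figure A1.
w2_unmatched = {(1, 100): 30000, (2, 100): 45000, (3, 100): 52000, (3, 600): 5000}
lehd_unmatched = {(1, 200): 30500, (2, 200): 46000, (3, 200): 52500}

# (earnings tolerance, minimum share, minimum number of likely matches)
schedule = [(0.10, 0.50, 5), (0.10, 0.50, 3)]

first_pass = one_pass(w2_unmatched, lehd_unmatched, *schedule[0])
crosswalk, w2_rest, lehd_rest = build_crosswalk(w2_unmatched, lehd_unmatched, schedule)
print(first_pass, crosswalk, w2_rest, lehd_rest)
```

With these inputs the first pass accepts nothing because only three jobs are flagged, the second pass adds W-2 EIN 100 to LEHD EIN 200 to the crosswalk, and the W-2 job at EIN 600 remains unmatched.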

Finally, we implement a series of additional steps to match the remaining set of jobs. First, we try

to find jobs that have multiple EINs in the LEHD but one EIN in the W-2s, for example if a firm

changed EIN mid-year for any reason (restructuring, acquisition, etc.). In that case, the LEHD

might have multiple EINs during the year as the firm filed its quarterly reports, but only one EIN

for the workers’ W-2s. We then flag remaining unmatched jobs as ad hoc likely matches if their

earnings are within a certain percent of each other, but they were not matched by the iterative

process.

In Table A2, we show summary statistics from the linkage process. In the W-2s, there are 257

million unique jobs in 2018, with 238 million in the LEHD. Of those, 216 million are direct matches

by PIK-EIN combination. This leaves 41 million unmatched W-2 jobs and 22 million unmatched

LEHD jobs. However, we find an additional 15 million indirect matches through our matching

algorithm, covering 70 percent of the unmatched LEHD jobs and 37 percent of the unmatched W-2

jobs. We then have 82 percent of jobs matched directly by PIK-EIN, 6 percent matched indirectly,

10 percent unmatched from W-2s, and 3 percent unmatched from the LEHD. We use this linked

job information to better estimate gross earnings at the job and person level for use in our income

estimates.

49In practice, we first increase the earnings percent difference threshold for likely matches from 10 percent to 20 percent to 25 percent. We also decrease the share of matches within an EIN that must match from 50 percent to 25 percent to 10 percent. Finally, we also decrease the minimum number of matches from 5 to 2 to 1. We make each of these changes separately from the initial thresholds and then change them simultaneously.
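
The shares reported from Table A2 can be approximately reproduced from the job counts given in the text. The Python sketch below assumes that each indirect match pairs exactly one W-2 job with one LEHD job; because the counts are rounded to the nearest million, the reproduced shares can differ from the reported ones by a point or two.

```python
# Job counts in millions, rounded, as reported in the text for 2018.
w2_jobs, lehd_jobs, direct, indirect = 257, 238, 216, 15

w2_unmatched = w2_jobs - direct      # unmatched W-2 jobs after direct matching
lehd_unmatched = lehd_jobs - direct  # unmatched LEHD jobs after direct matching

# Assumption: each indirect match removes one job from each unmatched set.
w2_final = w2_unmatched - indirect
lehd_final = lehd_unmatched - indirect

# Unique jobs across both files after linking.
total = direct + indirect + w2_final + lehd_final

shares = {
    "direct": direct / total,
    "indirect": indirect / total,
    "unmatched W-2": w2_final / total,
    "unmatched LEHD": lehd_final / total,
}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")

print(f"indirect / unmatched LEHD: {indirect / lehd_unmatched:.1%}")
print(f"indirect / unmatched W-2: {indirect / w2_unmatched:.1%}")
```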

Since LEHD earnings should exceed W-2 taxable earnings + deferred compensation in large part

due to employee pre-tax payments for health insurance premiums, we compare them in our CPS

ASEC sample for individuals who reported whether they have private health insurance coverage.50

As shown in Table A1, individuals with private coverage are less likely to have LEHD earnings

that are approximately the same as their W-2 earnings + deferred compensation (LEHD ≥ W-2

by 0-1 percent), and covered individuals are 3 to 5 times more likely to have LEHD values that

exceed the W-2 amounts by 1-3 percent, 3-5 percent, 5-10 percent, and 10+ percent. This likely

reflects the fact that W-2 earnings exclude employee pre-tax contributions to health insurance

premiums.

However, Table A1 also shows that there is a substantial number of jobs whose W-2 taxable earnings

+ deferred compensation exceeds LEHD gross earnings. At present, we treat these jobs as having

measurement issues in the LEHD and default to the taxable earnings + deferred compensation

from the W-2 and impute gross earnings for those jobs as discussed in Appendix D. We plan to

investigate this issue further in future NEWS releases.

A.4 Firm Linkage

Our firm identifier in the employment data is the EIN. However, as we noted when crosswalking

the job-level data between the W-2 and LEHD, an EIN does not necessarily correspond to a firm.

Some firms have multiple EINs, for example in each state of operation, which can make matching

individual workers to their firm (rather than subunits of the firm) difficult.

50Note that the CPS ASEC variable we use indicates receipt of private coverage, but not necessarily that the individual’s job (rather than a spouse, partner, or other family member) was the source of the coverage.

This is a challenge for all users of EIN-based administrative data (Joint Committee on Taxation,

2022; Chow et al., 2021). Chow et al. (2021) redesigned the Longitudinal Business Database (LBD)

in part to help bridge this gap and to make linkages between various worker- and firm-level datasets

easier. We use this redesigned LBD to map EINs to LBD firm identifiers (LBDFID). In the LBD,

each establishment is associated with one or more EINs and also with an LBDFID. We create a

crosswalk of all EIN to LBDFID combinations by year. If a firm restructures during a given year,

it is possible for the same EIN to map to different LBDFIDs in the same year. When that happens,

we assign the EIN to the associated LBDFID in the subsequent year. From that, we create a

year-by-year EIN-LBDFID crosswalk for all firms in our data. We can then merge the job-level

data by EIN to an LBDFID to match each worker to a firm. At the firm level (by LBDFID), we

can then use LBD data or create our own summary statistics on firm employment and payroll from

the linked job-level data. At present, we use this firm information for modeling, imputation, and

weighting.
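
The year-by-year EIN-LBDFID crosswalk can be sketched in Python as follows. The record layout, identifiers, and function name are hypothetical. The rule of using the subsequent year's LBDFID when an EIN maps to several firms in one year follows the text; leaving the EIN unresolved when the subsequent year is missing or also ambiguous is an assumption of this sketch, since the text does not cover that case.

```python
from collections import defaultdict


def build_ein_firm_crosswalk(records):
    """records: iterable of (year, ein, lbdfid) establishment-level rows."""
    firms_by_year_ein = defaultdict(set)
    for year, ein, lbdfid in records:
        firms_by_year_ein[(year, ein)].add(lbdfid)

    crosswalk = {}
    for (year, ein), firms in firms_by_year_ein.items():
        if len(firms) == 1:
            crosswalk[(year, ein)] = next(iter(firms))
        else:
            # Rule from the text: use the LBDFID from the subsequent year.
            # Sketch assumption: unresolved (None) if that year is missing
            # or itself maps to more than one LBDFID.
            nxt = firms_by_year_ein.get((year + 1, ein), set())
            crosswalk[(year, ein)] = next(iter(nxt)) if len(nxt) == 1 else None
    return crosswalk


# Hypothetical example: EIN 10 moves from firm F1 to firm F2 during 2018.
records = [
    (2017, 10, "F1"),
    (2018, 10, "F1"), (2018, 10, "F2"),
    (2019, 10, "F2"),
    (2018, 20, "F3"),
]
xwalk = build_ein_firm_crosswalk(records)

# Merge hypothetical job-level data (PIK, EIN, year) to a firm identifier.
jobs = [(1, 10, 2018), (2, 20, 2018)]
firm_of_worker = {pik: xwalk.get((year, ein)) for pik, ein, year in jobs}
print(firm_of_worker)
```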

B File Construction

B.1 Address File

The first file we create from the data in Sections 3.1-3.6 is the Address File. We link the sample of

occupied (non-vacant) housing units in the survey to the aforementioned sources of administrative,

survey, census, and commercial data, as shown in Figure 4. By starting with addresses, we have

information from all occupied units, including respondents and nonrespondents. In the address file,

we do not use any information from survey responses other than whether the unit responded. This

file is used to construct the weights that address selection into our sample, discussed in Appendix C.

First, we link the MAFIDs of occupied housing units to the MAF and Black Knight data to get

information on the housing units, such as home value and type (single vs. multi-unit). We then link

the same MAFIDs to several files that have both MAFIDs and PIKs, including the IRMF, MAF-ARF,

and 1040 tax returns, giving us information on the information returns (W-2, 1099-G, etc.)

sent to that address, their income (from tax returns), and PIKs for individuals who are associated

with that address. We create a roster of PIKs for the linked individuals in each occupied unit. We

then link this roster to various files, including the universe PHUS and SSR files, the Numident, W-2s,

LEHD, and the IRMF and 1040 tax returns.51 We then link the LEHD and W-2 jobs together

using the job crosswalk discussed in Appendix A.3. We also link those jobs to the characteristics of

the employer firm in the LBD using the EIN-firm ID crosswalk discussed in Appendix A.4.

Finally, we create geographic summary files at different levels of aggregation (state, county, and

tract) that summarize the characteristics of residents of those locations from different files. These

include (1) a summary of demographic characteristics from the 2010 decennial census, (2) demographic

and socioeconomic characteristics from 5-year ACS files, (3) earnings and information

return receipt from the IRMF and W-2 files, (4) citizenship information from the MAF-ARF linked

to the Numident, and (5) income and marital status information from 1040 tax returns.

This gives us information on the income, earnings, industry, race, Hispanic origin, marital status,

presence of children, home value, housing unit type, etc., as well as information about the neighborhoods

in which each household lives. However, data coverage is not perfect. As shown in Table

A3, we can link 93 percent of occupied CPS ASEC addresses to at least one data set (excluding

the MAF, from which the addresses were sampled). That leaves 7 percent of addresses that

we cannot link to any data other than the MAF. For these, we have no additional address-level

information, and we cannot link the address to possible residents, which means that we cannot

observe any address-level demographic or socioeconomic characteristics for these households (apart

from the survey responses). For them, we only have information about their communities from

the geographic summary files and about their housing unit from the MAF. Furthermore, we do

not directly observe some characteristics that may be related to wellbeing and survey response,

such as educational attainment, health insurance status, disability status (except if receiving SSI

or OASDI), etc.52

51For the IRMF and tax return link, we do this in case an individual associated with the address received an information return at a different address or was on a 1040 tax return filed from a different address.

52Rothbaum and Bee (2022) evaluate how well weighting can control for differences between respondents and nonrespondents by one of the dimensions unobserved in our linked data, educational attainment, by linking the subset of housing units to prior ACS responses. They find that most, but not all, of the selection into response by educational attainment is addressed by weights created using similar linked data.


B.2 Person File

The second file we create from the data in Sections 3.1-3.6 is the Person File. We create this file by

linking survey respondents to administrative data, as shown in Figure 5. In combination with the

weights created using the Address File, the Person File is used to create our income and poverty

estimates.

The Person File contains survey responses, including demographics, socioeconomic characteristics,

income, etc. as well as administrative information on income on the following files: 1040s, W-2s,

DER, LEHD, 1099-Rs, PHUS, SSR, and TANF. Table A10 shows the data sources with information

by income type (wage and salary earnings, Social Security, etc.) for tax filers and nonfilers. For

tax filers, most income types are available in the administrative data, either as separate variables

or as part of 1040 Total Money Income. For nonfilers, we observe wages and salary earnings (W-2s,

DER, and LEHD), OASDI benefits (PHUS), SSI (SSR), retirement, survivor and disability income

(1099-R), and TANF income (state data), as well as flags for the potential presence (but not

amount) of interest income (1099-INT), dividends (1099-DIV), and unemployment compensation

(1099-G). Several types of income are only available on the survey, regardless of tax filing status,

including workers’ compensation, veterans benefits, educational assistance, and inter-household

financial assistance. Table A4 shows the share of the sample that can be assigned a PIK and the

share of individuals with a PIK that can be linked to each of the administrative data sources.

C Weighting

Weighting is one method for addressing missing data, where variables are completely unobserved

for a subset of the sample.53 Let R be an indicator for whether the information is available for an

individual or unit (e.g., response to a survey). Suppose we have a set of k variables X = {x_1, x_2, ..., x_k} for n

units (individuals, households, firms). These covariates are observed for some units but not others, so

X = {X_O, X_M}, where O indicates observed (R = 1) and M indicates missing (R = 0).

53The discussion in this section follows Rothbaum and Bee (2022) closely.

There are several possible relationships between missing data and the individual and household

characteristics we are interested in estimating. The simplest possible pattern of missingness (for

the analyst) is if the data are missing completely at random (MCAR). In this case, nonresponse is

completely random and not related to X_O or X_M, that is, R ⊥ (X_O, X_M). For example, if a unit flips a

coin when deciding whether to respond to the survey, nonresponse would be MCAR. If the data are

MCAR, then the solution is easy: we do not need any adjustment to the data to get an unbiased

estimate. We can just drop missing observations. Only precision is affected by MCAR data, as

the sample is smaller than if all individuals were observed.

Another possibility is that the data are missing at random (MAR), conditional on the observable

information. Given a distribution f(·), data are MAR if f(R|X) = f(R|X_O), which means that

missingness is conditionally independent of the unobserved information (X_M). This is the underlying

assumption of most nonresponse bias adjustments, such as survey weights.
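
A small worked example may help show why weighting recovers an unbiased estimate when missingness depends only on observed information. The Python sketch below uses a hypothetical population with an observed group indicator; within each group, response is unrelated to income, but response rates differ across groups. All numbers are invented for illustration.

```python
# Each tuple: (group, income, number in population, number responding).
# Within each group, respondents are spread evenly across income levels.
cells = [
    (0, 10, 30, 15),
    (0, 30, 30, 15),
    (1, 60, 20, 5),
    (1, 100, 20, 5),
]

pop_total = sum(n for _, _, n, _ in cells)
true_mean = sum(y * n for _, y, n, _ in cells) / pop_total

# Unweighted respondent mean: biased because group 1 responds less often.
resp_total = sum(r for _, _, _, r in cells)
naive_mean = sum(y * r for _, y, _, r in cells) / resp_total

# Inverse response-rate weights by the observed group.
group_pop, group_resp = {}, {}
for g, _, n, r in cells:
    group_pop[g] = group_pop.get(g, 0) + n
    group_resp[g] = group_resp.get(g, 0) + r
weight = {g: group_pop[g] / group_resp[g] for g in group_pop}

weighted_mean = (
    sum(weight[g] * y * r for g, y, _, r in cells)
    / sum(weight[g] * r for g, _, _, r in cells)
)

print(true_mean, naive_mean, weighted_mean)
```

Here the unweighted respondent mean understates the population mean, while the weighted mean matches it, because response is independent of income once the observed group is taken into account.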

However, another possibility is that the data are not missing at random (NMAR), where f(R|X) ≠

f(R|X_O). This is much more challenging to address. Suppose the probability of information

availability varies with income, which is in X and is unobserved for nonrespondents. Then f(R|X) ≠ f(R|X_O), and we cannot easily

recover the true underlying income distribution from the observed data in X_O without strong,

generally difficult to verify assumptions about f(R|X).

However, MAR is an independence assumption conditional on X. Suppose there is another set

of variables A that are observed for the full sample, independent of response. In that case it is

possible that the data are NMAR with respect to X, but MAR with respect to A, or more formally

f(R|X) ≠ f(R|X_O) but f(R|X, A) = f(R|X_O, A). Rothbaum and Bee (2022) found that from 2020

to 2022, nonresponse in the CPS ASEC was NMAR with respect to X and that income statistics

were biased by 2-3 percent as a result. They used additional information from administrative data

linked at the address level to the addresses of respondent and nonrespondent households to adjust

the weights for nonresponse.54

54Rothbaum et al. (2021) did the same to address nonresponse bias in the 2020 ACS.

There are several aspects of our data that lend themselves to weighting to address missing information, where a subset of variables is completely missing for some units. For survey nonresponse, none of the survey information is observable for the nonresponding units. For incomplete linkage, none of the administrative data is available for the unlinkable individuals. If survey nonresponse or linkage are MAR, we can address the bias through weighting.

To include additional characteristics in the weighting model, we use entropy balancing (Hainmueller,

2012). Entropy balancing is an application of exponential empirical calibration. Empirical

calibration has a long history of use in survey weighting (Deming and Stephan, 1940; Deville and Särndal,

1992) – the existing weighting models (using raking) in the ACS and CPS ASEC are applications

of empirical calibration.55

We use information that is not observable in the survey: the linked administrative and decennial

census data, which are available for all linkable households regardless of whether they responded,

as well as the geographic summary information. Entropy balancing estimates weights that match

a specified set of moment constraints (i.e., to adjust the weights according to f(R|XO, A)) while

keeping the final weights as close as possible to the initial weights.

Entropy balancing has several appealing features for this application. The first is flexibility. Inverse

probability weighting (or any simple regression-based reweighting technique) is only amenable to

matching characteristics of the distribution in the sample, but not external targets. Empirical

calibration will adjust the weights to match any properly specified target moment, whether that

moment was estimated on the sample or with external data. The second is statistical efficiency,

which is achieved by keeping the final weights as close as possible to the initial probabilities of

selection.56 Third, entropy balancing directly adjusts the weights to the moment conditions, like with

raking but unlike single-index propensity score weighting approaches (such as inverse probability

weights). In propensity score approaches, the adjustment is made to the single index generally

estimated from a regression. The resulting balance must be assessed to evaluate the success and

quality of the propensity score model. In some cases, a misspecified propensity score model can

make balance worse on a given set of dimensions. As entropy balancing directly targets those

moments, balance is assured. Fourth, unlike raking, or cell-based empirical calibration methods,

entropy balancing allows for the inclusion of continuous variables in the weighting model.

55Raking, also called iterative proportional fitting, adjusts the weights for each group to match the population total for that group. It is solved by iterating across groups to match the different population targets in stages.

56Through the minimization in equation C.1.

The fifth is computational efficiency – entropy balancing allows matching to a high-dimensional

vector of moment constraints. In terms of our MAR assumption, if A or X is high dimensional,

then the computational efficiency makes it feasible to include all of A and X in the weighting model.

As in Rothbaum and Bee (2022), we use state-level population controls that include estimates of the

share of the population in 20 separate groups in each of the 50 states and the District of Columbia.

That yields 1,020 separate target population moments before even considering information from

the linked administrative data. The computational efficiency of the entropy balancing optimization

algorithm allows us to match to both the linked administrative and population control targets

simultaneously. This eliminates the need for an additional population control raking step that can

undo the balance from the nonresponse adjustment.57

Next we discuss entropy balancing in detail. Suppose we have n observations, where i = 1, 2, . . . , n

with base weights based on sampling probabilities of q = {q1, q2, . . . , qn}. Entropy balancing

estimates weights w = {w1, w2, . . . , wn} that solve the following minimization problem:

$$\min_{w} \sum_{i=1}^{n} w_i \log\left(\frac{w_i}{q_i}\right) \tag{C.1}$$

subject to several sets of constraints. First, we have p moment conditions. Let X = {X1, . . . , Xp} be a matrix of observable characteristics. For characteristic j, the moment conditions are defined to match a vector of pre-specified constants $\bar{c}_j$, where:

$$\sum_{i=1}^{n} w_i c_j(X_{i,j}) = \bar{c}_j. \tag{C.2}$$

$c_j(\cdot)$ can be any arbitrary function.

57Several studies have implemented first-stage nonresponse adjustments followed by second-stage raking to population controls that do not condition on the first-stage adjustment. Slud and Bailey (2010) found that for some metrics of weight quality, the benefits of the first-stage adjustment disappeared after the application of the second-stage raking to population controls. Eggleston and Westra (2020) found that for some measures used in the first-stage adjustment, the bias is not improved or can be greater using the final weights after raking to population controls, although most statistics show reduced bias after the second-stage raking. Rothbaum et al. (2021) found something similar in follow-up work on the ACS when applied to the 5-year release. Without including very detailed population controls in the 2020 1-year ACS weights (down to tract-level population), when the 2016-2020 files were combined and raked to the 5-year population controls, the 2020 nonresponse adjustment had little impact on the 5-year estimates. Only when the 2020 file was simultaneously reweighted to detailed population controls and the linked administrative targets, limiting the need for additional raking adjustments, did the nonresponse bias adjustment persist on the final 5-year file.

Second, we have constraints on the weights themselves:

$$\sum_{i=1}^{n} w_i = \bar{w}, \qquad w_i \geq 0, \quad i = 1, \ldots, n \tag{C.3}$$

which ensure that the weights sum to some pre-specified total weight w̄, which can be the population

count or 1. The value of w̄ does not affect the relative weights of each observation.

As such the weights can be adjusted to match pre-specified moments such as population means,

variances, higher-order moments, moments of any transformed distribution of $X_{i,j}$, etc. In

summary, entropy balancing adjusts the weights according to (C.1), subject to the constraints in

(C.2) and (C.3).58
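To make the optimization concrete, the following Python sketch solves (C.1) subject to (C.2) and (C.3) through its convex dual, using the standard result that the solution has the exponential-tilting form $w_i \propto q_i \exp(\lambda' c_i)$. It is a minimal illustration under stated assumptions and not the production implementation: the function name `entropy_balance`, the damped Newton solver, the simulated data, and the choice of moments (mean and second moment of a continuous variable plus a binary share) are all hypothetical.

```python
import numpy as np


def entropy_balance(C, q, target_means, total=1.0, tol=1e-8, max_iter=200):
    """Reweight q so the weighted means of the columns of C equal target_means.

    The weights have the form w_i proportional to q_i * exp(c_i' lam); lam is
    found by minimizing the convex dual with a damped Newton method.
    """
    C = np.asarray(C, dtype=float)
    base = np.asarray(q, dtype=float)
    base = base / base.sum()
    Z = C - np.asarray(target_means, dtype=float)   # centered moment functions
    lam = np.zeros(Z.shape[1])

    def probs(lam):
        s = Z @ lam
        p = base * np.exp(s - s.max())               # subtract max for stability
        return p / p.sum()

    def dual(lam):
        s = Z @ lam
        return s.max() + np.log(np.sum(base * np.exp(s - s.max())))

    for _ in range(max_iter):
        p = probs(lam)
        grad = Z.T @ p                               # remaining imbalance per moment
        if np.max(np.abs(grad)) < tol:
            break
        Zc = Z - grad
        hess = (Zc * p[:, None]).T @ Zc              # weighted covariance of moments
        step = np.linalg.solve(hess + 1e-12 * np.eye(lam.size), grad)
        t, f0 = 1.0, dual(lam)
        while dual(lam - t * step) > f0 - 1e-4 * t * (grad @ step) and t > 1e-8:
            t *= 0.5                                 # backtracking line search
        lam = lam - t * step
    return total * probs(lam)


# Hypothetical example: respondents are reweighted to match full-sample moments.
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)                               # continuous characteristic
d = (rng.random(n) < 0.4).astype(float)              # binary characteristic
respond = rng.random(n) < 1.0 / (1.0 + np.exp(-(0.5 * x - 0.7 * d)))
q = np.ones(n)                                       # base weights

C = np.column_stack([x, x**2, d])                    # c_j(X): mean, second moment, share
target = np.average(C, axis=0, weights=q)            # moments from the full sample
w = entropy_balance(C[respond], q[respond], target, total=q.sum())
balanced = np.average(C[respond], axis=0, weights=w)

print("target moments:  ", np.round(target, 4))
print("balanced moments:", np.round(balanced, 4))
```

The stopping rule on the largest remaining imbalance plays the role of the tolerance on the moment constraints. Because the moment functions are arbitrary columns of `C`, continuous variables and indicator variables for bins can be targeted in the same call.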

Entropy balancing was developed as an application of empirical calibration to balance treatment

and control groups when estimating causal treatment effects in observational studies. Zhao and

Percival (2017) show that, in that context, entropy balancing is equivalent to estimating a logistic

model for the propensity score and a linear regression model for the outcome, conditional on the

covariates used in the moment conditions. They find that entropy balancing is doubly robust - if

at least one of the two models is correctly specified, the estimated population average treatment

effect on the treated (PATT) is consistent.59 Using the notation of that literature, let γ be the

PATT, Y be an outcome of interest where Y (1) is the outcome if treated and Y (0) is the outcome

if untreated, then:

$$\gamma = E[Y(1) \mid T = 1] - E[Y(0) \mid T = 1]. \tag{C.4}$$

58In practice, as it is not necessarily possible to satisfy all constraints simultaneously through weighting adjustment, the analyst sets a tolerance level for the moment constraints. The weighting algorithm adjusts the weights iteratively until all constraints are satisfied subject to the specified tolerance.

59Double robustness is not a panacea. Kang and Schafer (2007) show via simulation that doubly robust models for missingness can perform poorly when neither model is correctly specified, or as they write, “in at least some settings, two wrong models are not better than one.”

In the causal inference literature, the challenge is that E[Y (0)|T = 1] is not observed. Under

entropy balancing, given $\sum_{i=1}^{n} q_i = \bar{q}$, the PATT is estimated as:

$$\hat{\gamma}_{ebw} = \frac{1}{\bar{q}} \sum_{T_i=1} q_i Y_i - \frac{1}{\bar{w}} \sum_{T_i=0} w_i Y_i. \tag{C.5}$$

In the case of survey weights, the “treatment” is nonresponse, and the double robustness result

applies. Entropy balancing reweights the sample so that the estimate of Y for the weighted

respondents is equal to the estimate of Y for the population,60 or:

$$E[Y] = \frac{1}{\bar{w}} \sum_{i=1}^{n} w_i Y_i. \tag{C.6}$$

We would like to reweight the respondent sample so that its distribution of characteristics matches

the target population from which the sample was drawn. However, some characteristics are not

observable for all housing units with the available linked census, survey, and administrative data.

For example, we do not observe any demographic information for housing units that are not linked

to an information return in the IRMF file, as the IRMF provides the identifier needed (PIK) to link

individuals to all other data sources. Therefore, we use a second source of data for our reweighting

– the aforementioned external estimates of population by geography. For both the linked data and

the external population estimates, we can specify a set of moment conditions, which are intended

to capture the distribution of characteristics in the target population. In the language of our MAR

assumption, we are concerned that f(R|A) ≠ f(R|X) and that we need XO (the demographic

information) in the weighting model as well, such that f(R|A,XO) = f(R|X).

60Conditional on strong ignorability (Y (0), Y (1) ⊥ T |X) and overlap (0 < P (T = 1|X) < 1), from Rosenbaum and Rubin (1983), as well as the proper specification of the moment conditions required for the Zhao and Percival (2017) double robustness result.

Our data have one additional complication: the target moments are at separate levels of

aggregation. Estimates from the linked administrative and census data are at the housing unit level

whereas the external state-level population moments are at the individual level. Entropy balancing

is not amenable to matching moments at different levels of aggregation. Therefore, we proceed with

a multi-stage reweighting procedure, which we discuss below and summarize in Table A5. This is

analogous to two-step calibration, as discussed in Estevao and Särndal (2006).

In the first stage, we adjust the household base weights for nonresponse, controlling to moments

estimated from the linked administrative and census data. The target distribution is estimated

using the nonvacant housing units in the March Basic CPS Sample, which includes both

respondent and nonrespondent housing units. Given the known probability of inclusion in the sample

(from the base weights), these are estimates of the underlying population moments for each of the

included characteristics. The moments include housing-unit-level summary statistics on race,

Hispanic origin, age, marital status, income, sources of income (through information return dummies),

citizenship, and nativity.

Entropy balancing adjusts the housing unit weights so that the weighted estimates from respondent units match the moments estimated from all nonvacant households. Let us designate the housing-unit moment constraint variables as $X^L_{i,j}$, where L indicates linked data. Let $w^1_i$ be the output weights of the first-stage reweighting. Given n respondent households, and a set of nonvacant (occupied) households NV, where $i = 1, 2, \ldots, n_{NV}$ with survey base weights $q_i$, the moment conditions are of the form:

$$\sum_{i=1}^{n} w^1_i c_j(X^L_{i,j}) = \sum_{i=1}^{n_{NV}} q_i c_j(X^L_{i,j}). \tag{C.7}$$

With these moment conditions, we estimate $w^1_i$ for each household using entropy balancing.

In the second stage, we would like to create weights (denoted $w^2_{m,i}$) for each individual m and household i, where m = 1, 2, . . . , M, that adjust to external population controls while maintaining the household weighting adjustment from the first stage. We do so by simultaneously matching to three sets of target moments (2A-C in Table A5):

A Preserve the distribution of housing unit characteristics

B Spousal equivalence

C External population targets

In the first set of constraints (A), we calculate person-weighted moments from the stage-1 weights. Given the number of people in household i, $n^{HH}_i$, we define the moment conditions using the stage-1 weights as follows:

$$\sum_{m=1}^{M} w^2_{m,i} \frac{1}{n^{HH}_i} c_j(X^L_{i,j}) = \sum_{i=1}^{n} w^1_i c_j(X^L_{i,j}). \tag{C.8}$$

This ensures that if we take the average weight of household members in household i ($HH_i$) as $\bar{w}^2_i = \frac{1}{n^{HH}_i} \sum_{m \in HH_i} w^2_{m,i}$, the following condition will be satisfied:

$$\sum_{i=1}^{n} \bar{w}^2_i c_j(X^L_{i,j}) = \sum_{i=1}^{n} w^1_i c_j(X^L_{i,j}). \tag{C.9}$$

This does not require that $\bar{w}^2_i$ is equal to $w^1_i$ for any household i, but rather that the specified

constraints from stage one hold in the final entropy-balance weights, when the final weights are

averaged across all household members. This procedure of dividing the household moments equally

among the family members helps ensure that each person contributes to satisfying the moments

from the linked administrative and decennial census data, which should reduce the variability of

weights among household members. It is particularly important for person-level statistics, such as

poverty or health insurance status, that are functions of household or family characteristics. For

example, poverty status (poor/non-poor) is defined at an aggregated level (the family), but the

share in poverty is estimated from individual weights. By having each household member be part

of the moment conditions for the linked data, administrative income affects each member’s weight,

which affects the poverty estimate.
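The link between the person-level condition (C.8) and the household-average condition (C.9) is an accounting identity, which the following Python sketch checks on a toy example. The household identifiers, person weights, and household characteristic values are made up for illustration only.

```python
import numpy as np

# Hypothetical data: six people in three households.
hh_id = np.array([0, 0, 0, 1, 2, 2])               # household of each person m
w2 = np.array([1.2, 0.8, 1.0, 2.0, 1.5, 0.5])      # person-level stage-2 weights
c_hh = np.array([10.0, 40.0, 25.0])                # c_j(X^L_{i,j}) for each household i

n_hh = np.bincount(hh_id)                          # household sizes n^HH_i: [3, 1, 2]

# Left-hand side of (C.8): each person carries 1/n^HH_i of the household moment.
person_level_total = np.sum(w2 / n_hh[hh_id] * c_hh[hh_id])

# Left-hand side of (C.9): average member weight times the household moment.
w_bar = np.bincount(hh_id, weights=w2) / n_hh
household_level_total = np.sum(w_bar * c_hh)

print(person_level_total, household_level_total)
```

Both totals coincide, so constraining the person-level sum to the stage-1 target constrains the household-average weights to the same target, without forcing any individual weight to equal the stage-1 household weight.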

For the second set of moments in the second-stage reweighting (2.B. in Table A5), we approximate

the spousal equalization that is part of existing CPS ASEC weights. We include this set of conditions

because household- and family-level statistics should also be invariant to which spouse’s weight is

used as the family or household weight. Let S = {0, 1, 2}, where S = 0 if an individual is unmarried,

1 if the individual is the first spouse or cohabiting partner on the file, and 2 if the individual is

the second spouse or partner on the file. Given an indicator function I(·), the spousal equivalence moment condition for a given characteristic in the linked data is:

$$\sum_{m=1}^{M} \left[ I(S = 1)\, w^2_{m,i}\, c_j(X^L_{i,j}) - I(S = 2)\, w^2_{m,i}\, c_j(X^L_{i,j}) \right] = 0. \tag{C.10}$$

This does not require that each individual’s weight be equal to their partner’s, as that would require

a separate moment condition for each couple. Instead, it requires that the characteristics of the

households of partners in the linked data be balanced.
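As a small numerical illustration of the spousal equivalence condition, the Python sketch below evaluates the left-hand side of (C.10) for two hypothetical sets of person weights. The data and the helper name `spousal_moment` are assumptions made for this example.

```python
import numpy as np

# Hypothetical file: two couples and one unmarried person.
S = np.array([1, 2, 0, 1, 2])                      # 0 unmarried, 1 first, 2 second partner
c = np.array([30.0, 30.0, 12.0, 50.0, 50.0])       # household characteristic, shared by partners


def spousal_moment(w2, S, c):
    """Weighted total for first partners minus weighted total for second partners."""
    sign = (S == 1).astype(float) - (S == 2).astype(float)
    return float(np.sum(sign * w2 * c))


w_unbalanced = np.array([1.0, 1.0, 2.0, 1.5, 1.1])
w_balanced = np.array([1.2, 1.0, 2.0, 1.38, 1.5])

print(spousal_moment(w_unbalanced, S, c))          # nonzero: condition violated
print(spousal_moment(w_balanced, S, c))            # approximately zero: condition holds
```

In the second set of weights no couple has equal weights, yet the aggregate condition holds, which is the sense in which the constraint balances characteristics across partners rather than equalizing weights within each couple.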

The third set of moment conditions (2.C. in Table A5) reweight the individual observations to

match the age by race/Hispanic-origin/gender cells for each state and the District of Columbia, as

noted above. These conditions have the simple form of equation (C.2).

With these three sets of conditions, we reweight the March Basic CPS sample to simultaneously

match the household-level linked administrative data and the individual-level state population

targets. For each individual, the initial weights for the stage 2 reweighting are the household weights from the stage 1 reweighting ($w^1_i$), so that the minimization from (C.1) becomes:

$$\min_{w^2} \sum_{i=1}^{n} w^2_i \log\left(\frac{w^2_i}{w^1_i}\right). \tag{C.11}$$

However, for the full CPS ASEC sample, there is one more complication. The full sample includes

groups that were oversampled based on characteristics reported in earlier survey responses,

including Hispanic origin and the presence of children. Therefore, in the full sample, the weights for

these oversampled individuals and households need to be adjusted to reflect their prevalence in

the population and characteristics. To do this, we add a fourth set of moment conditions (2.D.

in Table A5). We create these conditions from the entropy-balance weighted March Basic sample,

because it is a stratified random sample that is not affected by oversampling based on observable

characteristics from prior survey responses. Let $w^{2,March}_{m,i}$ be the second-stage weights from the March Basic Sample, $w^{2,Full}_{m,i}$ be the second-stage weights from the full CPS ASEC sample, and $M^{Full}$ and $M^{March}$ be the number of individuals in the full and March Basic CPS samples. This fourth set of conditions has the form:

$$\sum_{m=1}^{M^{Full}} w^{2,Full}_{m,i} c_j(X_{i,j}) = \sum_{m=1}^{M^{March}} w^{2,March}_{m,i} c_j(X_{i,j}). \tag{C.12}$$

This fourth set of moments includes information on race, Hispanic origin, income (from the linked

administrative data), and the number of adults and children in the household. Without this set of

conditions, estimates of the number of households by type (especially for oversampled groups) differ

between the full and March Basic CPS ASEC samples. Additionally, without these constraints,

observables-based oversampling in the full CPS ASEC biases estimates for oversampled subgroups

relative to estimates from the March Basic sample. Although we focus on the estimates from the

full CPS ASEC sample in this paper, we present the results from the Basic March sample in the

Appendix as well, because it is a stratified random sample with no oversampling based on observable

characteristics from earlier survey responses.

At this point, the weights would adjust for selection into response. However, because we are using

administrative data to address survey misreporting, inclusion in our sample is also conditional on

linkage to a PIK as that is the key to linking each individual to every source of administrative data.

We therefore include in our sample only those households in which all those old enough to receive

survey income questions (15+) are assigned a PIK. To address this selection, we add a third stage

to the entropy balancing weight procedure used in Rothbaum and Bee (2022), as shown in Table

A5, Stage 3.

Stages 3A and 3B have the same form as 2A and 2C, but add additional moments to the already

specified ones from the linked data and external population controls. In adjusting for selection into

linkage, we include moments on survey-reported income, administrative income, and survey poverty

status by survey reported demographics such as race, Hispanic-origin, citizenship, and age.

The weights after this third-stage adjustment should adjust the sample for both selection into

survey response and selection into linkage, to the extent possible given the observable survey and

linked administrative data.

For valid inference, we repeat the above two-stage reweighting procedure 160 additional times using

the baseline successive difference replicate factors created during the sampling process, which are

available for all households regardless of response status. These replicate factors account for the

sampling design of the monthly Basic CPS and CPS ASEC. Also, the first-stage target moments

from the March Basic CPS sample are estimates and thus subject to sampling error. By repeating

the procedure with the base weights and replicate factors, the target moments for each replicate

will vary, and variation in the final weights across the replicates will reflect the uncertainty in

our linked data estimates. All standard errors reported using EBW are calculated with these 160

replicate-factor EBW.61
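For reference, the successive difference replication variance estimator used with CPS ASEC replicate weights takes the form $\widehat{Var}(\hat{\theta}_0) = \frac{4}{160} \sum_{r=1}^{160} (\hat{\theta}_r - \hat{\theta}_0)^2$, where $\hat{\theta}_0$ is the full-sample estimate and $\hat{\theta}_r$ is the estimate under replicate r. The Python sketch below applies that formula to a weighted mean. The function name `sdr_standard_error` and the simulated data are hypothetical, and the replicate weights are generated from random factors purely for illustration; in the procedure described above, each replicate weight would instead come from rerunning the reweighting with the corresponding replicate factor.

```python
import numpy as np


def sdr_standard_error(theta_full, theta_reps):
    """Standard error from successive difference replicate estimates."""
    theta_reps = np.asarray(theta_reps, dtype=float)
    r = theta_reps.size
    return float(np.sqrt(4.0 / r * np.sum((theta_reps - theta_full) ** 2)))


# Hypothetical data: an outcome, full-sample weights, and 160 replicate weights.
rng = np.random.default_rng(2)
n, n_reps = 1000, 160
y = rng.normal(50, 10, size=n)
w0 = rng.uniform(500, 1500, size=n)

# Illustrative replicate factors only; real factors come from the sample design.
values = [1 - 2 ** -0.5, 1.0, 1 + 2 ** -0.5]
factors = rng.choice(values, size=(n, n_reps), p=[0.25, 0.5, 0.25])
rep_w = w0[:, None] * factors

theta0 = np.average(y, weights=w0)
theta_r = (rep_w * y[:, None]).sum(axis=0) / rep_w.sum(axis=0)
se = sdr_standard_error(theta0, theta_r)

print(f"estimate: {theta0:.3f}, SDR standard error: {se:.3f}")
```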

As noted in Rothbaum et al. (2021), in addition to changing point estimates, improved weights can

also affect standard errors. It is generally understood that increased variability among the survey

weights can increase the standard errors, so weighting adjustments aimed at reducing bias are often

done at the expense of increasing variance. However, Little and Vartivarian (2005) show that this

may not hold true if variables used to adjust for nonresponse are correlated with survey variables

of interest, a property they call “super-efficiency.” This also has implications for how weighting

models should be constructed, as including variables that are not strongly predictive of response,

but are correlated with outcomes of interest can reduce variance of an estimate even if they do not

affect its bias.

The full reweighting procedure is described in Table A5. Stage 1 adjusts for nonresponse at the

housing unit level by reweighting respondent households to match the characteristics of occupied

households estimated from the linked administrative, decennial, and commercial data. Stage 2

creates individual weights that maintain the adjustment from Stage 1, but additionally adjust the

person weights to match the external population controls. As in Rothbaum and Bee (2022), the

Stage-2 weights adjust the sample for selection into survey response.

61Refer to “Estimating ASEC Variances with Replicate Weights” (U.S. Census Bureau, 2009) for a discussion of successive difference replication in the CPS ASEC. Note also that at present we do not include uncertainty in the external population targets, but we hope to explore how best to account for that uncertainty in the weights as well in future research.

However, because we are using administrative data to address survey misreporting, inclusion in

our sample is also conditional on linkage to a PIK, as that is the key to linking each individual

to every source of administrative data. Our final sample includes only those households where all

those old enough to receive survey income questions (15+) are assigned a PIK. To address this

selection, we add a third stage to the entropy balancing weighting procedure used in Rothbaum

and Bee (2022), as shown in Table A5, Stage 3. The Stage-3 weights maintain the adjustments

of the Stage-2 weights, but also control for selection into linkage, to the extent possible given the

observable survey and linked administrative data.

Figure A8 shows the bias in estimates of address-linked characteristics using the various weights.

In each panel, we compare the five separate weights to the target moments estimated on the set of

all occupied housing units. They are:

1. Respondents: the weights only adjust for the probability the housing unit is selected into the sample

2. Survey: the final survey weights

3. HH EBW: the Stage 1 weights that adjust for response at the household level only

4. EBW: the Stage 2 weights that adjust for response at the household level and to the external population controls

5. EBW + PIKed: the Stage 3 weights that adjust for response at the household level, to external population controls, and for selection into linkage.

From Figure A8, we can see that OASDI recipients (linked to the PHUS) are overrepresented with

the respondent and survey weights (Panel A), as are housing units with residents that are 65 and

over (Panel B). The EBW bias estimates in Panels A and B (those that can be directly targeted

in the weighting) are all very close to zero, with few statistically significant differences.62

Figure A9 compares statistics estimated on survey responses using the survey weights to those

estimated using the Stage 2 (EBW) and Stage 3 (EBW + PIKed) weights. In this case, the survey-

weighted and EBW estimates by race, Hispanic origin, and age should match the survey estimates

by construction (as they are each weighting to external population controls). However, differences

for other statistics for the EBW relative to the survey-weighted estimates reflect potential bias in

the survey estimates, which we see, for example, for household income.

D Imputation

Suppose we have two variables Yi and Yj with missing values indicated by Ri = 0 or Rj = 0.63

Missingness is monotone if Rj = 0 in all cases where Ri = 0. The pattern of missingness discussed

above for weighting is one case of monotone missingness.64 Missingness is non-monotone if Ri = 0

does not imply that Rj = 0.

62Percentiles cannot be directly matched by entropy balancing. Instead, the weighting model weights respondents to match the share of units in different income bins (i.e., the share of households with address-level W-2 earnings ≤ $25,000).

63The discussion in this section follows Hokayem, Raghunathan and Rothbaum (2022) and Fox et al. (2022) closely.

64In that case, we are assuming that for all variables in X, Ri = R, where i = 1, . . . , k.

While weighting can address missing data for the monotone missingness discussed in the prior

section, it is not optimal as a general missing data correction when missingness is non-monotone.

For non-monotone missingness, imputation is a better approach as it fully utilizes the available

information (Raghunathan et al., 2001). In this section, we discuss imputation models generally

followed by our implementation.
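The distinction between monotone and non-monotone missingness can be checked mechanically from the response indicators. The Python sketch below is illustrative only: the helper names `implies_missing` and `is_monotone` and the two small indicator matrices are assumptions. A pattern is treated as monotone when, for every pair of variables, missingness in one implies missingness in the other.

```python
import numpy as np


def implies_missing(r_i, r_j):
    """True if R_j = 0 in every case where R_i = 0."""
    r_i, r_j = np.asarray(r_i), np.asarray(r_j)
    return bool(np.all(r_j[r_i == 0] == 0))


def is_monotone(R):
    """True if every pair of columns of R is nested in one direction or the other."""
    R = np.asarray(R)
    k = R.shape[1]
    return all(
        implies_missing(R[:, a], R[:, b]) or implies_missing(R[:, b], R[:, a])
        for a in range(k)
        for b in range(a + 1, k)
    )


# Unit nonresponse: every variable is missing for the same units (R_i = R).
unit_nonresponse = np.array([[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]])

# Item nonresponse: one unit is missing only Y_2, another only Y_1.
item_nonresponse = np.array([[1, 0], [0, 1], [1, 1]])

print(is_monotone(unit_nonresponse))   # True
print(is_monotone(item_nonresponse))   # False
```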

Suppose O is a collection of observable variables with no missing values, with O = (O1, O2, . . . , Oq)

and Y1, Y2, . . . , Yp are variables with missing values, with Y = (Y1, Y2, . . . , Yp). Further, let U

be a set of unobserved characteristics. Let f(Y |O,U, θ) be the conditional joint density, with

θ = (θ1, θ2, . . . , θp) and where θj is a vector of parameters in the conditional distribution for Yj

such as regression coefficients and dispersion parameters. An imputation model imposes some

assumptions on f and θ to assign plausible values to Y where data are missing.

In this case, Y is MAR if missingness can be accounted for by observable characteristics, which

can be written as f(Y |O, θ) = f(Y |O,U, θ) (Rubin, 1976).65 Another way to view imputation is

through the lens of a researcher or data user. Consider a statistic Q, which could be a distributional

statistic (such as a mean or median), a regression coefficient, or any other statistic or parameter

of interest to the researcher. An imputation model is congenial or proper and results in unbiased

estimates of Q if E(Q̂|O, θ) = E(Q̂|O,U, θ) = Q and has valid confidence intervals for Q̂ (Meng,

1994; Rubin, 1996).

This is only true when the imputation model is congenial and proper for the analysis being

conducted. There are many examples in the literature where this congeniality condition fails for a given

statistic or set of statistics. An example is match bias in the CPS. Bollinger and Hirsch (2006)

showed that because the imputation model in the CPS does not include union status, estimates of

the relationship between union status and earnings are attenuated in the imputed data. Even in

this case, the issue is not that their earnings are misclassified (as very rarely will imputed earnings

match the true value for a given individual), but that they are drawn from the wrong

distribution, one that does not condition on union status. However, uncongeniality for one statistic does

not indicate bias for other related statistics. For example, match bias on union status does not

necessarily mean that the CPS imputation model will bias statistics of the unconditional earnings

distribution.

65It is NMAR if f(Y |O, θ) ≠ f(Y |O,U, θ).

It is impossible for congeniality to hold for all possible statistics Q, unless the model perfectly

predicts the missing values, i.e., there is no misclassification.66 However, we could assess the

quality of an imputation model by comparing a set of the resulting Q̂ estimates against known

Q values. Fox et al. (2022) took this approach, using a variety of statistics, including regression

coefficients and conditional and unconditional distributional statistics to evaluate their imputation

model.

Hokayem, Raghunathan and Rothbaum (2022) addressed survey nonresponse in the CPS ASEC in

2009-2013 by including more covariates in the imputation model than the current CPS ASEC hot

deck approach and comparing models with and without administrative data on earnings and income

in the model. They find further evidence of match bias. However, with sufficient information in

the model, they do not find evidence of nonignorable nonresponse (NMAR) when they compare

the estimates of imputes that condition on administrative income to those that do not.

This non-monotone missingness is present in several variables in our data. Income items are

particularly prone to survey nonresponse - over 40 percent of earnings (and all income) is imputed in the

CPS ASEC due to nonresponse in recent years (Hokayem, Raghunathan and Rothbaum, 2022). We

also do not observe gross wage and salary earnings (in the LEHD) for all jobs because not all jobs

are covered by unemployment insurance and non-covered jobs are not reported to state UI offices.

Gross earnings are also missing for jobs that are not available in the LEHD for other reasons, such

as firms that erroneously fail to report jobs and states with no data-sharing agreement in a given

year. For the missing survey responses and missing gross earnings, we observe a lot of information

(variables in O) that can help us predict the missing values, such as W-2 job-level earnings, survey-

reported occupation, hours and weeks worked, educational attainment, private health insurance

coverage, etc.

66In this sense, misclassification can be important. If the imputed value equals the true value for all cases, the data are not truly “imputed.” However, in practice, imputations are unlikely to have extremely low misclassification rates, and we must evaluate the potential bias of each Q̂ with the available information.

We use Sequential Regression Multivariate Imputation (SRMI) to impute plausible values for the missing data (Raghunathan et al., 2001).67 SRMI is an iterative resampling technique to estimate f(Y|O, θ) while imposing fewer strong parametric assumptions on the joint conditional distribution f. Under SRMI imputation, we estimate the model for each $Y_j$ iteratively as follows. In the first iteration, $Y_1$ is regressed on O and the missing values are imputed. Any imputation model can be used to impute values for each $Y_j$, such as a regression model, a hot deck, or predictive mean matching, with their attendant assumptions about f(Y|O, θ). Let $Y_1^{(1)}$ denote the filled-in version of the variable $Y_1$ from the first iteration. Now $Y_2$ is imputed using $(O, Y_1^{(1)})$ as covariates to generate $Y_2^{(1)}$, the filled-in version of $Y_2$ from the first iteration. This process continues until the missing values in $Y_p$ are imputed using $(O, Y_1^{(1)}, Y_2^{(1)}, \ldots, Y_{p-1}^{(1)})$ as predictors.

We cannot stop at iteration 1 because the imputation of $Y_1^{(1)}$, for example, fails to exploit the
observed information from (Y2, Y3, . . . , Yp). Iterations t = 2, 3, . . . proceed in the same manner
except that all other variables (some filled in the current iteration and the rest in the previous
iteration) are used in imputing each variable. Specifically, at iteration 2, Y1 is re-imputed using
$(O, Y_2^{(1)}, Y_3^{(1)}, \ldots, Y_p^{(1)})$ as predictors; Y2 is re-imputed using $(O, Y_1^{(2)}, Y_3^{(1)}, \ldots, Y_p^{(1)})$ as predictors,
etc. In each iteration, we are updating our predictions of θ as well as Y.

In general, at iteration $t > 1$, $Y_j$ is re-imputed using $(O, Y_1^{(t)}, Y_2^{(t)}, \ldots, Y_{j-1}^{(t)}, Y_{j+1}^{(t-1)}, \ldots, Y_p^{(t-1)})$ as

predictors. The iterations are continued several times in order to fully use the predictive power

of the rest of the variables when imputing each variable. Empirical analysis has shown that fewer

than 20 and generally as few as 5 to 10 iterations are sufficient to condition the imputed values in

any variable on all other variables (Ambler, Omar and Royston, 2007; Van Buuren, 2007; He et al.,

2010). By repeating the imputation process in each iteration, SRMI is akin to a Gibbs or MCMC

resampling technique that should iteratively converge to the true conditional joint density (if the

model is properly specified).
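The iteration scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the production procedure: the per-variable model here is a plain least-squares regression with a resampled residual (the text notes that any imputation model may be used), and the helper names `dot`, `ols_fit`, and `srmi`, as well as the toy data, are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (Gaussian elimination)."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    b = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            A[r] = [x - f * z for x, z in zip(A[r], A[c])]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for c in reversed(range(k)):
        beta[c] = (b[c] - sum(A[c][j] * beta[j] for j in range(c + 1, k))) / A[c][c]
    return beta

def srmi(O, Y, n_iter=10, seed=0):
    """Sequentially re-impute each column of Y given O and the other columns."""
    rng = random.Random(seed)
    n, p = len(Y), len(Y[0])
    miss = [[Y[i][j] is None for j in range(p)] for i in range(n)]
    filled = [row[:] for row in Y]
    for t in range(n_iter):
        for j in range(p):
            # First pass: only variables already filled (k < j). Later passes:
            # all others, current-iteration values for k < j, previous for k > j.
            others = [k for k in range(p) if k != j and (t > 0 or k < j)]
            X = [[1.0] + list(O[i]) + [filled[i][k] for k in others] for i in range(n)]
            obs = [i for i in range(n) if not miss[i][j]]
            beta = ols_fit([X[i] for i in obs], [Y[i][j] for i in obs])
            resid = [Y[i][j] - dot(beta, X[i]) for i in obs]
            for i in range(n):
                if miss[i][j]:
                    # Assumed imputation model: prediction plus a resampled residual.
                    filled[i][j] = dot(beta, X[i]) + rng.choice(resid)
    return filled

# Toy data (hypothetical): one observed covariate, two variables with gaps.
rng = random.Random(1)
O = [[rng.gauss(0, 1)] for _ in range(200)]
truth = []
for (o,) in O:
    y1 = 1 + 2 * o + rng.gauss(0, 1)
    truth.append([y1, y1 - o + rng.gauss(0, 1)])
Y = [[v if rng.random() > 0.2 else None for v in row] for row in truth]
completed = srmi(O, Y)
```

Because `filled` is updated in place, each variable is automatically conditioned on the most recent version of every other variable, which is the property the general formula for iteration t expresses.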

We impute survey earnings, job-level administrative gross earnings (or LEHD-equivalent earnings),

and missing state-level means-tested program data. For survey earnings, we impute extensive

67SRMI has also been called Fully Conditional Specification and Flexible Conditional Models in the literature.

margin earnings receipt and intensive margin earnings amounts for all earnings variables. In the

CPS ASEC this includes the variables ern_yn (earnings receipt), ern_srce (primary job earnings
source: wage and salary, self-employment, or farm self-employment), ern_val (earnings amount
from primary job), ws_yn, se_yn, and frm_yn (receipt of secondary wage and salary, self-employment, or
farm self-employment earnings), and ws_val, se_val, and frm_val (amount of secondary earnings in

each category). We also impute upstream variables that are highly predictive of earnings, including

weeks worked last year (wkswork) and hours worked per week last year (hrswork).

For gross earnings by job (for the two highest earning jobs for each worker), we impute several

variables to simplify the imputations and capture important features in the data. First, we impute

a dummy variable for whether gross earnings ≈ taxable earnings + deferred compensation, which

is true for a large share of workers. For those where gross earnings > taxable earnings + deferred

compensation, we then impute a series of dummies for whether gross earnings/(taxable earnings

+ deferred compensation) falls in several bins, including 1.1 and above, [1.05, 1.1), [1.03, 1.05),

[1.02, 1.03), [1.01, 1.02), and (1, 1.01). After assigning each job to a gross earnings/(taxable
earnings + deferred compensation) bin, we then impute the amount of gross earnings for each job.
We chose this approach because many variables (such as survey-reported private health insurance
coverage) are good predictors of whether gross earnings/(taxable earnings + deferred compensation)
exceeds specific thresholds while not necessarily being good predictors of the exact value of gross
earnings/(taxable earnings + deferred compensation).
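As a rough illustration of the binning step, the sketch below assigns a job to one of the ratio bins listed above. The function name `ratio_bin`, the labels, and the tolerance used to decide when gross earnings are "approximately equal" to taxable earnings plus deferred compensation are assumptions made for the example; the text does not state how the approximate-equality check is defined.

```python
from bisect import bisect_right

# Bin edges taken from the text; labels are illustrative.
EDGES = [1.01, 1.02, 1.03, 1.05, 1.10]
LABELS = ["(1, 1.01)", "[1.01, 1.02)", "[1.02, 1.03)",
          "[1.03, 1.05)", "[1.05, 1.1)", "1.1 and above"]

def ratio_bin(gross, taxable, deferred, tol=1e-6):
    """Classify gross / (taxable + deferred) into the bins used for imputation.

    tol is a hypothetical tolerance for the 'approximately equal' dummy.
    """
    ratio = gross / (taxable + deferred)
    if abs(ratio - 1.0) <= tol:
        return "approximately equal"
    if ratio < 1.0:
        # Gross below the tax-based amount is not one of the listed bins.
        return "below 1 (not binned)"
    return LABELS[bisect_right(EDGES, ratio)]

examples = [
    ratio_bin(50_000, 48_000, 2_000),   # ratio 1.0
    ratio_bin(50_250, 50_000, 0),       # ratio 1.005
    ratio_bin(52_500, 50_000, 0),       # ratio 1.05
    ratio_bin(60_000, 48_000, 2_000),   # ratio 1.2
]
```

Predicting bin membership first and the amount second means a covariate only has to be informative about threshold crossings, which is the motivation given in the text.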

For each earning variable, we have separate imputation models by spouse (by sex if an opposite-sex

couple, by order on the file if a same-sex couple). This allows for a more flexible imputation model

and allows us to condition on spousal income in the SRMI.

For state-level means-tested program data, we impute program receipt ({Program}_yn) and, conditional
on receipt, the amount received ({Program}_val) for each program at the household level.

As discussed in Hokayem, Raghunathan and Rothbaum (2022), there are a number of challenges to

implementing SRMI in this context. First, many income types do not follow a normal distribution.

Second, we must select predictors for the modelling of each income variable from a very large set

of possible covariates. Third, we must properly account for uncertainty in our estimates of the

parameters in θ. Included in this uncertainty is the selection of variables for our imputation models

because when we select predictors for our models, we are imposing the assumption that there is

no relationship between the excluded variables and the variable being imputed conditional on the

included variables. Next, we discuss how we address each of these issues.

To address non-normality, we transform each continuous variable using the inverse hyperbolic sine,

which allows us to include negative values, as in Fox et al. (2022).68 As the inverse hyperbolic sine

is nearly perfectly correlated with the natural log over most of the defined range of the natural log,

one can interpret the regression coefficients of continuous variables as elasticities (for continuous

dependent variables) or semi-elasticities (for binary dependent variables).
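A small numerical check makes the two properties used here concrete: the inverse hyperbolic sine is defined at zero and for negative values, and for large positive x it tracks log(2x), so it differs from the natural log by roughly a constant. The wrapper names `ihs` and `ihs_inverse` are illustrative only.

```python
import math

def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined for all real x."""
    return math.asinh(x)

def ihs_inverse(z):
    """Back-transform an imputed value to the original (dollar) scale."""
    return math.sinh(z)

for x in (-50_000.0, 0.0, 1_000.0, 50_000.0):
    z = ihs(x)
    # log(2x) only exists for positive x; shown for comparison.
    log_part = math.log(2 * x) if x > 0 else float("nan")
    print(f"x={x:>10.1f}  ihs={z:9.4f}  log(2x)={log_part:9.4f}")
```

Since the transform is odd, a loss of 50,000 maps to the negative of the value for a gain of 50,000, so negative self-employment income can stay in the same model as positive income.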

As a practical matter, there are too many potential variables in O to be used in our model. We

reduce the set of variables to be used to impute each Yj in two stages, both using the Least Absolute
Shrinkage and Selection Operator (LASSO; Friedman, Hastie and Tibshirani, 2010). In the first stage, we take

all of the possible interaction terms we specify in O and use LASSO to prune the list to Ôj that

predict Yj (including all non-interacted terms in Ôj). The set of variables in Ôj will generally be

large (hundreds of variables and interactions, if the regression sample size is large). In terms of the

general notation f(Y |O, θ), this process places constraints on θ.69

During the imputation process, we apply a second stage of regularization when we estimate the values
in θ̂. As θ̂ is a set of unknown parameters, we must also incorporate the uncertainty in θ̂ into the
imputation process – the third challenge noted above. We do this as follows. In each implicate c
(an independent run of the imputation model), we start by taking a Bayesian Bootstrap of the sample.
We then run a second-stage variable selection process to further reduce the number of variables in
Ôj to Ôj,c, again using LASSO regularization.70 From the regression of Yj on Ôj,c, we estimate θ̂j,c.

68Hokayem, Raghunathan and Rothbaum (2022) tested alternative transformations, such as Tukey’s gh transformation (He and Raghunathan, 2006) and an empirical normal transformation (Woodcock and Benedetto, 2009). However, as in Fox et al. (2022), they found the inverse hyperbolic sine performed well, and we use that transformation here.

69This is primarily done for practical speed considerations. Reducing the number of candidate variables upfront considerably speeds up the process of imputation for each variable in each implicate.

70The Bayesian Bootstrap (Rubin, 1981) is the Bayesian analogue of the bootstrap. Each observation is drawn (with replacement) with an expected probability of 1/n, but with variability. The probabilities of being drawn are defined by taking n − 1 draws from the uniform distribution (0,1), ordering draws from

Doing this on a Bayesian Bootstrap sample enables us to account for the uncertainty present in

each step of this process, including which variables are used as model predictors (Ôj,c) and to draw

from the distribution of parameter values θ̂j,c. This resampling approach to estimating uncertainty

in regression-based imputation has been taken in other data products and research, including SIPP

topic flag imputation (Benedetto, Motro and Stinson, 2016), the SIPP Gold Standard and SIPP

Synthetic Beta (Benedetto, Stinson and Abowd, 2013), and imputation research on missing income

in the CPS ASEC (Hokayem, Raghunathan and Rothbaum, 2022).
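The Bayesian Bootstrap draw described in footnote 70 can be sketched as follows: n − 1 uniform draws are sorted, and the gaps between 0, the sorted draws, and 1 become the selection probabilities. The function names and the five-row toy sample are hypothetical; in the actual workflow the resampled data would feed the second-stage LASSO and regression.

```python
import random

def bayesian_bootstrap_weights(n, rng):
    """Gaps between sorted uniforms (Rubin, 1981): n probabilities summing to 1,
    each with expectation 1/n."""
    cuts = [0.0] + sorted(rng.random() for _ in range(n - 1)) + [1.0]
    return [cuts[i] - cuts[i - 1] for i in range(1, n + 1)]

def bayesian_bootstrap_sample(rows, rng):
    """Resample rows with replacement using the Bayesian Bootstrap probabilities."""
    probs = bayesian_bootstrap_weights(len(rows), rng)
    return rng.choices(rows, weights=probs, k=len(rows))

rng = random.Random(42)
weights = bayesian_bootstrap_weights(5, rng)
sample = bayesian_bootstrap_sample(["a", "b", "c", "d", "e"], rng)
```

Running this once per implicate gives each implicate its own resampled data set, so both the selected predictors and the estimated coefficients vary across implicates.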

With the transformed continuous variables, regularization, and Bayesian Bootstrap-based estima-

tion of the uncertainty of θ̂, we are almost ready to impute missing values. We must also specify

the functional form of our imputation models (parametrizing f(Y |O, θ)). Unless otherwise indicated,
we use predictive mean matching (PMM) to impute both binary and continuous dependent

variables.

For binary dependent variables, we use a Linear Probability Model (LPM), regressing the dependent

variable on the model selected using the LASSO on the Bayesian Bootstrap sample. We then predict

the vector p̂j(Y = 1|X, θ̂j), which includes the estimated probability for all individuals in sample

whether Rj = 0 or Rj = 1. We then take a random draw for each unit i where Ri,j = 0 from the

ten nearest units k where Rk,j = 1 to assign Yi,j values. We use LPM rather than a logit or probit

model because the LPM can accommodate more predictor variables. Although LPM does not impose 0 ≤ p̂i,j ≤ 1,

the Yi,j draws must equal 0 or 1. Fox et al. (2022) used the same approach for imputing SNAP

receipt and showed that this PMM model performed well for several conditional and unconditional

statistics (Q’s such as SNAP receipt, SNAP receipt conditional on earnings and demographics, for

example).

For continuous dependent variables, we use Ordinary Least Squares (OLS), regressing the dependent

variable on the model selected using the LASSO on the Bayesian Bootstrap sample. We then predict

lowest to highest, where u = u0, u1, u2, . . . , un given u0 = 0 and un = 1. The probability of being drawn for each observation i is based on the gaps between each adjacent value in u, so that for observation i the probability of being drawn is gi = ui − ui−1. As noted in Benedetto, Stinson and Abowd (2013), using the Bayesian Bootstrap adds variability to the imputation process to account for the fact that the sample distribution may not be the same as the population distribution. Without the use of the Bayesian Bootstrap, the confidence intervals would not be proper.

the vector Ŷj(Y−j , X, θ̂j) where Y−j is the matrix Y excluding Yj , again for all individuals in sample

whether Rj = 0 or Rj = 1. We then take a random draw for each unit i where Ri,j = 0 from the

ten nearest units k where Rk,j = 1 to assign Yi,j values.
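The matching step is the same for binary and continuous variables, and a minimal sketch may help. It assumes that "nearest" means closest in predicted value, which the text implies but does not spell out; `pmm_impute` and the simulated scores are hypothetical.

```python
import random

def pmm_impute(pred_donors, y_donors, pred_recipients, k=10, rng=None):
    """Predictive mean matching: for each recipient, draw an observed value
    from the k donors whose predictions are closest to the recipient's."""
    rng = rng or random.Random(0)
    imputed = []
    for p in pred_recipients:
        nearest = sorted(range(len(pred_donors)),
                         key=lambda d: abs(pred_donors[d] - p))[:k]
        imputed.append(y_donors[rng.choice(nearest)])
    return imputed

# Hypothetical LPM scores for donors; some fall outside [0, 1] by design.
rng = random.Random(7)
donor_scores = [rng.uniform(-0.1, 1.1) for _ in range(100)]
donor_y = [1 if rng.random() < min(max(s, 0.0), 1.0) else 0 for s in donor_scores]

# Recipients with out-of-range scores still receive a valid 0/1 value,
# because the imputed value is always an observed donor outcome.
imputed = pmm_impute(donor_scores, donor_y, [-0.05, 0.4, 1.08], k=10, rng=rng)
```

The prediction is used only to rank units, which is why an LPM score below 0 or above 1 does no harm here.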

For survey wage and salary earnings from the longest job (ern_val if ern_srce == 1), rather than

using PMM, we use a two-stage model that incorporates OLS and quantile regressions. As before, we

first use OLS to predict Ŷj(Y−j , X, θ̂j) after LASSO regularization. We then use quantile regression

to regress Yj on binned Ŷj and several variables from O, including race and Hispanic origin, age,

education, and hours worked. We do this for each 5th percentile from the 5th to the 95th. This

gives us an estimate of Ŷj,i,q for each individual i at each quantile q.71 From the values of Ŷj,i,q,

we have a posterior predictive distribution (PPD) of Yj,i for each individual i (after interpolation

using Schmidt et al. (2022)). For each individual, we then draw a percentile value from 0 to 1 to

impute Yj,i from the PPD.72
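A compact sketch of the last two steps, for one individual, is given below: the 19 quantile predictions are sorted to remove crossings (the rearrangement described in footnote 71), and a uniform draw is mapped to a value on the resulting curve. Linear interpolation and clamping of draws outside the 5th to 95th percentile range are simplifying assumptions made here; the text uses the interpolation method of Schmidt et al. (2022), which is not reproduced.

```python
import random

QUANTILES = [q / 100 for q in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95

def rearrange(preds):
    """Sort predicted quantiles so the curve is monotone in q."""
    return sorted(preds)

def draw_from_ppd(preds, u):
    """Map a uniform draw u to a value on the rearranged quantile curve."""
    vals = rearrange(preds)
    u = min(max(u, QUANTILES[0]), QUANTILES[-1])  # assumption: clamp the tails
    for i in range(1, len(QUANTILES)):
        if u <= QUANTILES[i]:
            lo_q, hi_q = QUANTILES[i - 1], QUANTILES[i]
            w = (u - lo_q) / (hi_q - lo_q)
            return vals[i - 1] + w * (vals[i] - vals[i - 1])
    return vals[-1]

# Hypothetical predictions (transformed scale) with one crossing at positions 3 and 4.
preds = [10.0 + i for i in range(19)]
preds[3], preds[4] = preds[4], preds[3]

rng = random.Random(3)
imputed_value = draw_from_ppd(preds, rng.random())
```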

Using quantile regression to estimate the PPD is useful if there is potential heterogeneity in the

relationship between specific variables in O and Yj . For example, if the average relationship
between education and earnings reflects a bigger right tail for college graduates (more very high
earners), the PMM-based estimate would not necessarily reflect that in the resulting imputes,
whereas the quantile regression-based PPD would. However, more data (a large sample) are required
to use quantile regressions to reliably estimate the PPD. Because of the possibility of heterogeneity
and the greater data needs, we implement this approach only for survey wage and salary earnings
from the primary job (the largest single source of survey income, covering almost 70 percent of
total income).

For the means-tested program variables imputed at the household level, we recode the data to

summarize the information of household members (such as presence of members by race, total

71The regressions do not impose monotonicity, i.e., they do not ensure that for two quantiles q and r where r > q, Ŷj,i,r > Ŷj,i,q (the quantile crossing problem). Following Chernozhukov, Fernández-Val and Galichon (2010), we rearrange the curve by sorting the Ŷj,i,q values from lowest to highest and assigning them to the corresponding position’s q value. As Chernozhukov, Fernández-Val and Galichon (2010) show, the rearranged curve is closer to the true quantile curve than the original curve in finite samples.

72If any part of this process fails (such as from nonconvergence in a quantile regression estimate), we impute using PMM. This is unusual, but possible, in an automated process like SRMI that runs many regressions per iteration repeated across implicates.

household earnings, etc.) and household head variables (such as education, race, etc.) to use as

predictors and then impute receipt and amounts using PMM as discussed above.

For nonfilers, we observe whether they received several information returns, including Forms 1099-

G, 1099-INT, and 1099-DIV in the IRMF. From these we have information on whether they received

UI compensation, interest income, and dividends, respectively. Each of these is vastly underreported
on surveys (Rothbaum, 2015). Rothbaum (2023) has been working with more detailed data
available under a separate agreement between the Census Bureau and IRS, for limited use. In that
work, the 1099-G, 1099-INT, and 1099-DIV data are available, including income amounts. Rothbaum

(2023) released coefficients that can be used to impute these amounts for nonfilers conditional on

survey responses and the administrative data used in this project.

To release these statistics, Rothbaum (2023) estimated models for the synthesis of four variables:

1. UI compensation receipt conditional on receipt of a Form 1099-G

2. UI compensation amount conditional on receipt of UI compensation

3. Interest income amount conditional on receipt of a Form 1099-INT

4. Dividend income amount conditional on receipt of a Form 1099-DIV

In order to allow the creation of synthetic data to correct for survey underreporting, Rothbaum

(2023) released three sets of results for each variable.

For UI compensation receipt, they estimate a Linear Probability Model (LPM) of UI compensation

receipt conditional on receiving a Form 1099-G. Individuals receive a 1099-G for various government

payments, including (1) UI compensation, (2) state or local income tax refunds, credits, or offsets,

(3) reemployment trade adjustment assistance payments, (4) taxable grants, and (5) agricultural

payments. This model is estimated as described above using the two-stage LASSO feature selection,

with the second stage estimated on a Bayesian Bootstrap. As such, the released parameters are

effectively a draw from the distribution of possible parameter estimates that could be used to

predict nonfiler UI receipt.

With these regression coefficients, we can estimate the expected probability of UI receipt for each

nonfiler (p̂j(Y = 1|X, θ̂j)) on a separate sample (or the data without access to the more detailed

1099-G data). However, as they were estimated using an LPM, we cannot directly use them to
synthesize UI receipt data, as p̂j(Y = 1|X, θ̂j) can be < 0 or > 1. (PMM addresses this by
taking a random draw from individuals with similar p̂j(Y = 1|X, θ̂j), but with observed values for
Yj .) Instead, Rothbaum (2023) separated the expected probability space into bins and released
the boundaries between those bins and the empirical probability that an observation received UI
compensation in each bin. For example, suppose the top quintile of observations has an expected
probability of receipt of 0.87 or higher (the boundary). If the share of observations in that bin
that received UI compensation was 0.98 (the empirical probability in
the bin), then we can impute UI receipt for this group by drawing a random number between 0
and 1 and assigning receipt if it is ≤ 0.98.
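The binned draw can be sketched as below. Only the top-bin boundary (0.87) and its empirical probability (0.98) come from the example in the text; the other boundaries and probabilities are invented placeholders, and `impute_receipt` is a hypothetical name, not a released function.

```python
import random
from bisect import bisect_right

def impute_receipt(score, boundaries, bin_probs, rng):
    """Find the bin containing the LPM score, then draw receipt with that
    bin's empirical probability. Scores below 0 or above 1 fall in the
    outermost bins, so no clipping of the score is needed."""
    b = bisect_right(boundaries, score)
    return 1 if rng.random() <= bin_probs[b] else 0

# Placeholder statistics; only 0.87 and 0.98 are taken from the text's example.
boundaries = [0.10, 0.35, 0.60, 0.87]
bin_probs = [0.02, 0.20, 0.50, 0.75, 0.98]

rng = random.Random(0)
top_bin_draws = [impute_receipt(0.93, boundaries, bin_probs, rng) for _ in range(10_000)]
share_top = sum(top_bin_draws) / len(top_bin_draws)
```

With many draws, the share imputed as recipients in the top bin settles near 0.98, matching the released empirical probability rather than the raw LPM score.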

By releasing regression coefficients, bin boundaries, and empirical probabilities, Rothbaum (2023)

implement a semiparametric imputation technique that is similar to the binned imputation proposed

by Bondarenko and Raghunathan (2007).

For the income variables – UI compensation, interest income, and dividends – the approach is

slightly different. The first step is the same as above for continuous variables – estimate an OLS

model to predict expected income amounts conditional on the available information. Again, the

models are estimated using the two-stage LASSO feature selection, with the second stage estimated

on a Bayesian Bootstrap. The coefficients from this model are released so that the expected income

amount can be estimated on a separate sample (ŷi,j). To allow the synthesis of continuous variables,

Rothbaum (2023) release two further sets of statistics. First, they partition ŷi,j into bins. Then, using
quantile regression at various percentiles, they regress income amounts on bin dummies. As with
ern_val above, these regression coefficients can be used to estimate a PPD for each individual. By

drawing a value from 0 to 1, we can impute income amounts from these PPDs.

In summary, for each income amount synthesized, Rothbaum (2023) release three sets of statistics
(regression coefficients, bin boundaries, and quantile regression coefficients) to enable relatively low

dimensional data to be used to synthesize or impute UI compensation amounts, interest income,

and dividends.

Finally, we repeat this process five times, to create the five independent implicates. In each impli-

cate, we use SRMI to impute the survey and gross earnings variables, followed, in a separate step,

by the imputation of means-tested program variables. For any statistic or parameter estimate, we

can account for the uncertainty in the imputation process (Rubin, 1976). To do so, we calculate

the total variance by combining the within-implicate variation (for example, the standard error of

an estimate in one implicate) with the between-implicate variation (the variance of the estimates

for that parameter across the five implicates).
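The combination of within- and between-implicate variation is commonly done with Rubin's multiple-imputation rule, T = W + (1 + 1/m)B. The sketch below assumes that standard rule is the one applied here (the text does not give the formula), and the five poverty-rate estimates are made-up numbers.

```python
from statistics import mean, variance

def combine_implicates(estimates, variances):
    """Rubin's combining rule for m implicates.

    estimates: point estimate from each implicate
    variances: squared standard error from each implicate
    Returns (combined estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)          # combined point estimate
    within = mean(variances)         # W: average within-implicate variance
    between = variance(estimates)    # B: sample variance across implicates
    total = within + (1 + 1 / m) * between
    return q_bar, total

# Hypothetical poverty-rate estimates (percent) and squared standard errors.
est = [11.2, 11.5, 11.1, 11.4, 11.3]
var = [0.04, 0.05, 0.04, 0.05, 0.04]
point, total_var = combine_implicates(est, var)
std_err = total_var ** 0.5
```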

In Table 6, we show the rates of missing data for survey earnings, state program data, and LEHD

job-level gross earnings. In the 2019 CPS ASEC, 46 percent of individuals with earnings had

their primary job earnings imputed. We do not have state-level administrative TANF data for 47

percent of households. Finally, we impute gross earnings for 18 percent of jobs, either because

there is no LEHD information for them (8 percent of highest earning jobs) or because the LEHD

and W-2 values disagree substantially (i.e., the LEHD < W-2, about 10 percent of highest earning

jobs).

As the imputation models are applications from prior work (Hokayem, Raghunathan and Rothbaum

2022 for earnings, Fox et al. 2022 for means-tested benefits, and Rothbaum 2023 for nonfiler UI,

interest, and dividends), we provide limited statistics on the imputation outputs. Table A6 shows

summary statistics for survey earnings imputation, comparing the CPS ASEC imputations to the

NEWS SRMI imputations conditional on W-2 earnings. The SRMI estimates fewer individuals

with zero survey earnings conditional on having zero W-2 earnings. Also, the SRMI estimates

higher survey earnings conditional on having higher W-2 earnings (such as in the 5th quintile of W-2

earnings). Table A7 provides some summary statistics for means-tested program imputation.

Expanding the family of U.S. CPIs, Thesia Garner (U.S. Bureau of Labor Statistics)

In recent years, there has been increased interest in going beyond headline measures of inflation to better describe the experiences of households. The CPI for All Urban Consumers (CPI-U) targets the inflation experience of over 90 percent of households in the United States, but it may not reflect the inflation experience of an individual household or group of households. This presentation describes two ongoing research efforts at the Bureau of Labor Statistics to expand its offerings of consumer price indexes in ways that allow for a richer description of household experiences.

UNITED NATIONS

ECONOMIC COMMISSION FOR EUROPE

CONFERENCE OF EUROPEAN STATISTICIANS

Group of Experts on Measuring Poverty and Inequality

28-29 November 2023

Workshop on Harmonization of Poverty Statistics to Measure

SDG 1 and 10

27 November 2023

Title of contribution: Expanding the family of U.S. CPIs

Author Name(s): William JOHNSON, Joshua KLICK, Paul LIEGEY, Robert MARTIN, Anya STOCKBURGER

Presenter Name: Thesia GARNER

Presenter Organization: U.S. Bureau of Labor Statistics

Topic: Inflation and its impact on poverty and inequality

Summary:

In recent years, there has been increased interest in going beyond headline measures of inflation to

better describe the experiences of households. The CPI for All Urban Consumers (CPI-U) targets the

inflation experience of over 90 percent of households in the United States, but it may not reflect the

inflation experience of an individual household or group of households. This presentation describes

two ongoing research efforts at the Bureau of Labor Statistics to expand its offerings of consumer price

indexes in ways that allow for a richer description of household experiences. First, in response to

increasing user demand, we construct consumer price indexes for different groups along the income

distribution. From 2006 to 2023, lower income households generally faced larger inflation rates than

higher income households, and the gap is highest when measured using the Chained CPI, which is a

closer approximation to a cost-of-living index. We explore how different budget items contribute to

this gap, as well as how it changes over time. Second, we estimate a family of price indexes known as

Household Cost Indexes (HCI), which aim to measure the average inflation experiences of households

as they purchase consumer goods and services. These differ from the usual CPIs in two main respects.

First, the upper-level aggregation of the HCIs weights households equally, unlike most headline CPIs

which implicitly give more weight to higher-expenditure households. Second, the HCIs use the

payments approach to value owner-occupied housing services explicitly using household outlays. In

contrast, the U.S. CPIs use rental equivalence. The HCI for all urban consumers has an average 12-

month change of 1.51% over December 2011 to December 2021, compared to 1.86% for the CPI-U.

Roughly 95% of the difference is due to the payments approach.

Household Cost Indexes: Prototype Methods and Results, US

We estimate a family of price indexes known as Household Cost Indexes (HCI) using U.S. data. HCIs aim to measure the average inflation experiences of households as they purchase goods and services for consumption, and similar indexes are produced in the United Kingdom and New Zealand. These differ from the Bureau of Labor Statistics’ headline Consumer Price Index (CPI) products in two main respects. First, the upper-level aggregation of the HCIs weights households equally, unlike most headline CPIs which implicitly give more weight to higher-expenditure households.

Household Cost Indexes: Prototype Methods and

Results1

Robert S. Martin, Joshua Klick, William Johnson, Paul Liegey2

June 1, 2023

CONFERENCE PAPER/PRELIMINARY

Abstract

We estimate a family of price indexes known as Household Cost Indexes (HCI) using U.S.

data. HCIs aim to measure the average inflation experiences of households as they purchase

goods and services for consumption, and similar indexes are produced in the United Kingdom

and New Zealand. These differ from the Bureau of Labor Statistics’ headline Consumer Price

Index (CPI) products in two main respects. First, the upper-level aggregation of the HCIs weights

households equally, unlike most headline CPIs which implicitly give more weight to higher-

expenditure households. Second, the HCIs use the payments approach to value owner-occupied

housing services explicitly using household outlays. In contrast, the U.S. CPIs use rental

equivalence. The HCI for all urban consumers has an average 12-month change of 1.51% over

December 2011 to December 2021, compared to 1.86% for the CPI-U. The bulk of the

difference is due to the payments approach.

Key Words: Price index; inflation; democratic aggregation; payments approach

JEL Codes: C43, E31

1 We thank Anya Stockburger, Robert Cage, Thesia I. Garner, and many others at the Bureau of Labor Statistics for helpful comments and guidance.

2 Division of Price and Index Number Research (Martin), Division of Consumer Price Indexes (Klick, Liegey), Division of Price Statistical Methods (Johnson), Bureau of Labor Statistics, 2 Massachusetts Ave., NE, Washington, DC 20212, USA.

1. Introduction

This article estimates Household Cost Indexes (HCIs) using U.S. data. Similar price

indexes are already produced in the United Kingdom (Office for National Statistics, 2017) and

New Zealand (Statistics New Zealand, 2020). HCIs measure the change in cash outflows

required, on average, for households to access the goods and services they purchase at a

constant quality. Like the headline and subpopulation Consumer Price Indexes (CPIs) produced

by the Bureau of Labor Statistics (BLS), the HCIs aim to capture price change for consumer

goods and services. However, the HCIs differ in two important methodological respects from

the CPIs. First, the upper-level aggregation of the HCIs weights households equally, whereas the

CPI market baskets implicitly give higher weight to higher-expenditure households.3 Second,

the HCIs use the payments approach to value services from owner-occupied housing, using

outlays on mortgage interest, property taxes, and the full reported value of insurance,

appliances, maintenance and repairs (i.e., what the household pays and when they pay it). The

CPIs, in contrast, use an implicit measure of owner-occupied housing consumption called rental

equivalence, and all other goods and services are valued using acquisition prices and

expenditures (i.e., when the household acquired or took possession of the good). For HCIs in

principle, the payments approach should be applied more broadly, but this paper focuses only

on owner-occupied housing. We are ignoring household outlays for the purchase of vehicles

and other durable goods and instead are including the full acquisition expenditures for these

regardless of financing; including these in an HCI is left for a future study.

3 Households are still weighted by their sampling weight so that averages represent the population.

We compute an HCI for the urban U.S. population covering the period December 2011

to December 2021. The HCI is based on the Lowe (modified Laspeyres) formula using average

annual household weights with about a two-year lag. From December 2012 to December 2021,

we find an average twelve-month inflation rate of 1.51 percent for the HCI-U, compared to 1.86
percent for the CPI-U and 1.73 percent for the Chained CPI-U. We find that empirical differences between the

HCIs and CPIs are primarily due to the HCI’s use of the payments approach, which we estimate

subtracts 0.39 percentage points per year on average relative to an index that uses rental

equivalence. This difference reflects both a lower weight for owner-occupied housing in the HCI

as well as lower inflation in explicit housing costs when compared to owners’ equivalent rent. In

contrast, we estimate that equal household weighting increases the index only about 0.05

percentage points per year on average compared to an index which uses the standard

expenditure weighting, but otherwise uses the same methodology as the HCI.
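The difference between the two weighting schemes can be seen in a two-household toy example. This is a simplified share-weighted calculation with invented spending and price relatives, not the Lowe formula with lagged weights used in the paper; the function names are hypothetical, and `weight` stands in for the sampling weight mentioned in footnote 3.

```python
def household_index(spending, relatives):
    """Price index for one household: budget-share-weighted price relatives."""
    total = sum(spending.values())
    return sum(spending[item] / total * relatives[item] for item in spending)

def democratic_index(households, relatives):
    """Equal household weighting (HCI-style): average of household indexes,
    using only the sampling weight."""
    w = sum(h["weight"] for h in households)
    return sum(h["weight"] * household_index(h["spending"], relatives)
               for h in households) / w

def plutocratic_index(households, relatives):
    """Expenditure weighting (conventional CPI-style): each household counts
    in proportion to its total spending."""
    num = den = 0.0
    for h in households:
        exp = h["weight"] * sum(h["spending"].values())
        num += exp * household_index(h["spending"], relatives)
        den += exp
    return num / den

# Invented data: food prices rise 10 percent, housing costs rise 2 percent.
relatives = {"food": 1.10, "housing": 1.02}
households = [
    {"weight": 1.0, "spending": {"food": 100.0, "housing": 100.0}},  # lower spender
    {"weight": 1.0, "spending": {"food": 200.0, "housing": 800.0}},  # higher spender
]

hci_style = democratic_index(households, relatives)
cpi_style = plutocratic_index(households, relatives)
```

Here the lower-spending household devotes a larger budget share to the item with higher inflation, so the equally weighted index (about 1.048) exceeds the expenditure-weighted one (about 1.040). The direction and size of the gap in real data depend on how budget shares and item inflation line up.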

CPIs are used in a wide variety of economic applications—as an overall macroeconomic

indicator, to deflate national accounts, to adjust marginal tax rates, and measure changes in the

cost-of-living representative of the entire economy. In such applications, measuring the change

in purchasing power of the average dollar of expenditure using an implicit consumption

concept like owners’ equivalent rent may be appropriate. In other cases, such as comparing the
economic conditions of population subgroups, a measure tied to explicit outlays may be
attractive. One index cannot usually satisfy all needs, and in this sense the HCIs can provide
useful complementary information about the average household inflation experience.

2. Literature Review

Current BLS CPI methodology is based on market-level expenditure weights and the

rental equivalence approach to owner-occupied housing (Bureau of Labor Statistics, 2020).

Household-weighted aggregation and the payments approach differ substantially from current

BLS CPI methodology, though neither is new to the price index literature. Astin and Leyland

(2015) propose using these methods to better capture the inflation experiences of households.

They argue such a measurement is more credible for indexing monetary values, while a

traditional CPI is superior for macroeconomic analysis and inflation targeting. Based in part on

their research, the Office for National Statistics developed a set of HCIs for the United Kingdom

(Office for National Statistics, 2017). Statistics New Zealand publishes a similar set of indexes

called the Household Living-Costs Price Indexes. Research on a similar set of indexes for the U.S.

began with Cage et al. (2018).

Household-weighted aggregation (also known as democratic aggregation) has been

considered at least since Prais (1958). The topic has been developed and reviewed in Pollak

(1989), National Research Council (2002), International Labor Organization (2004, Chapter 18),

Ley (2005), and Martin (2022), among others. Spending patterns differ across the distribution of

total expenditure. To the extent that these differences coincide with expenditure categories

that have higher or lower inflation than average, a household-weighted index will differ from a

traditional expenditure-weighted one. Equally weighted indexes have been studied with U.S.

data in Kokoski (2000) and Hobijn et al. (2009). The latter is notable for statistically matching

the interview and diary components of the Consumer Expenditure Survey (CE), and we follow

many aspects of its approach. Our paper also builds on work from Cage et al. (2018) and

Martin (2022), the latter of which finds that household-weighted aggregation adds about 0.08

percentage points per year to inflation measured by a Lowe-type CPI from December 2001 to

June 2021.
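To make the distinction between the two aggregation schemes concrete, the following minimal sketch compares an expenditure-weighted index with a household-weighted index for two hypothetical households and two item categories. All names and numbers are invented for illustration and do not come from the CE or from the cited papers.

```python
# Hypothetical spending (dollars) by two households on two categories.
spending = {
    "low_spender": {"rent": 800.0, "travel": 200.0},
    "high_spender": {"rent": 1000.0, "travel": 3000.0},
}
# Hypothetical price relatives (comparison period / reference period).
price_relatives = {"rent": 1.05, "travel": 1.01}


def expenditure_weighted_index(spending, relatives):
    """Weight each item by its share of aggregate spending."""
    total = sum(sum(basket.values()) for basket in spending.values())
    index = 0.0
    for item, rel in relatives.items():
        item_total = sum(basket[item] for basket in spending.values())
        index += (item_total / total) * rel
    return index


def household_weighted_index(spending, relatives):
    """Average each household's own budget shares with equal weight."""
    n = len(spending)
    index = 0.0
    for basket in spending.values():
        hh_total = sum(basket.values())
        index += sum((basket[i] / hh_total) * relatives[i] for i in relatives) / n
    return index


plutocratic = expenditure_weighted_index(spending, price_relatives)
democratic = household_weighted_index(spending, price_relatives)
```

In this toy case the low-spending household devotes more of its budget to the category with faster price growth, so the household-weighted index exceeds the expenditure-weighted one.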

The payments approach to owner-occupied housing has been discussed at least since

the 1989 version of the International Labor Organization (ILO) CPI manual (as cited by

Goodhart, 2001), and much of our initial approach follows the 2004 version (International Labor

Organization, 2004, Chapter 10). The payments approach to owner-occupied housing focuses

on the month-to-month outlays by households rather than an upfront purchase price (the

acquisition approach) or the implicit consumption value (the use approach).4 In addition to the

HCIs for the United Kingdom and New Zealand, the payments approach is also used in the CPI

for Ireland (Central Statistics Office, 2016). Mortgage interest is also included in the housing

component of the CPI for Canada (Statistics Canada, 2019), and was a part of the U.S. CPI

housing component prior to 1983 (Gillingham and Lane, 1982). Diewert and Nakamura (2009)

contains a conceptual comparison of the payments approach against other methods like the

user cost approach and rental equivalence, while Garner and Verbrugge (2009) compare

methods empirically using the CE.

Astin and Leyland (2015) argue that the payments approach is superior for comparing

household inflation experiences and escalating payments. They make the case that because

rental equivalence is not tied to explicit outlays, an index which includes it as a large

4 Rental equivalence and user cost are both flavors of the use approach.

component may be less tethered to the actual price movements that affect household budgets.

For some subpopulations, there can be large differences between implicit rents and explicit

cash flows. For instance, in Cage et al. (2018), the subpopulation of households which receives

at least 50% of its before-tax income from Social Security has higher relative expenditures on

shelter (35-39%) when measured using rental equivalence than the overall urban population

(32%), but lower relative expenditures when measured using payments (16-23%). This is

because these households are disproportionately likely to be owner-occupiers without

mortgages, meaning their explicit housing outlays are limited to items like property taxes,

insurance, and maintenance.

Astin and Leyland (2015), as well as ILO (2003), advocate a payments-approach index for escalation

purposes, but this position is not universally held. Diewert and Shimizu (2021) argue “it is not an

index that can measure household consumption of the services of durable goods because it

focuses on the immediate costs associated with the purchase of durable goods and ignores

possible future benefits of these purchases.” The payments approach has also been criticized in

Goodhart (2001), Poole, Ptacek, and Verbrugge (2005), and elsewhere on the basis that it

doesn’t reflect consumption in an economic sense. We agree that a flow-of-service method like

rental equivalence is more appropriate for a macro-focused CPI or a representative consumer’s

cost-of-living index (see, e.g., Diewert, 1976). However, we study the HCIs as complementary

series intended to capture explicit outlays of households rather than the implicit consumption

prices (in an economic theoretic sense) reflected in a traditional CPI, though initially the

distinction is limited to owner-occupied housing. The objective of our paper is primarily to

compare owner-occupied housing and household aggregation methods.

3. Methods and Data

Our methods for this paper are preliminary and based on utilizing existing BLS surveys or

publicly available data sources. Like the CPIs, the HCIs are constructed in two stages. First, basic

indexes are constructed for item-area strata (e.g., coffee in Washington, DC). These are then

aggregated using expenditure weights from the CE. As our initial version only applies the

payments approach to owner-occupied housing, the elementary indexes and underlying

household expenditures used in upper-level aggregation are largely the same. See Bureau of

Labor Statistics (2020) for more details. For housing, the owner equivalent rent elementary

indexes are replaced with indexes for property taxes, mortgage interest, and property

management services. In addition, we use the full reported value of household expenditures on

household appliances, maintenance and repair, and insurance when constructing upper-level

aggregation weights. Finally, we estimate equally weighted averages of household expenditure

shares based on matched CE Interview and Diary data and use these in the second-stage

aggregation.

3.A. Payments Approach Item Structure and Elementary Indexes

The payments approach for owner-occupied housing reflects the housing-related cash

outflows of households. Compared to the CPI, the HCI item structure excludes owner’s

equivalent rent and includes three additional expenditure classes—property taxes, mortgage

interest, and other primary residence expenses. The payments approach also removes several

adjustments CPI makes to other category weights, which we discuss more later in this section.

Within property taxes and mortgage interest, we create new elementary item indexes

representing primary residences. These also serve as proxies for secondary residences. In the

CPI, the price index for owner’s equivalent rent of primary residences (numbered “01”) also

serves as the proxy for the unpriced item (numbered “09”) representing secondary residences.

A further item classification (see

Table 1 for details) for other primary residence expenses consists of ground rent,

parking, and property management services. This category comprises less than one half of one

percent of the overall index weight, and we provisionally measure its price change using the

producer price index for final demand property management services as a proxy. Finally, our

objective, where possible, is to limit expenditures to those pertaining to primary residences and

vacation homes and exclude investment properties.

The rest of this section details the construction of the property tax and mortgage

interest payment indexes. We follow what is (to our knowledge) international practice by

including the interest component of mortgage payments (excluding second mortgages or home

equity lines of credit) and excluding the portion that goes toward principal reduction (and by

this reasoning down payments and cash purchases). From the 2004 ILO manual, only the

interest portion is considered a pure cash outflow; the principal portion immediately shows up

on the household’s balance sheet as an increase in assets, so it may be considered more like an

investment with a potential future return (International Labor Organization 2004, Chapter 10).

This view is not universal (see Astin and Leyland, 2015). However, including mortgage principal

presents additional technical challenges.5

Also following international practice, the mortgage interest and property tax payments

indexes derive conceptually from two sources of potential change: a rate (an interest rate or an

effective property tax rate) and the base to which the rate is applied (the debt level or the

5 The most straightforward method to estimate the proportional impact of changing interest rates on mortgage principal payments would involve plugging in aggregate (i.e., average) interest rates into a nonlinear function. In the sense of measuring a change in average payments across households, the potential bias of such a plug-in procedure from Jensen’s Inequality is unknown.

dwelling value). Changes in rates alone do not capture changes in purchasing power

(International Labor Organization 2004, Chapter 10). Some users could be concerned about

allowing home prices to influence the index, given these could be associated with (eventual) financial

returns to households. In our view, there is a tradeoff between representing the explicit outlays

of households and controlling for investment using economic theory. Indeed, as noted by

Poole, Ptacek, and Verbrugge (2005), adjusting housing payments to account for investment

results in the user cost approach, which is another implicit housing cost concept. Empirically,

Garner and Verbrugge (2009) show that user costs can differ greatly from explicit payments.6

Our initial strategy, following international practice, aims to exclude the investment aspect of

housing ownership by excluding mortgage principal. Appendix A shows the decision to

indirectly include home prices is significantly inflationary for the housing payments indexes and

suggests the decision to exclude mortgage principal is somewhat deflationary.

Finally, our preliminary results compute a single set of payments approach elementary

item indexes representing the U.S. urban population. We leave it to future research to extend

these methods to create elementary indexes by CPI geographic areas.

3.A.1. Mortgage Interest Payment Index

The mortgage interest payments index measures the proportional change in the interest

payment amount that would occur holding fixed the financing conditions—such as the loan

term and proportion of principal remaining. We aim to follow the recommendations in the

2004 ILO manual (Chapter 10), which is to use both a representative basket of interest rates

6 Garner and Verbrugge (2009) also find that user cost measures based on different underlying assumptions can differ greatly from each other and from implicit rents.

and a debt index, which holds “constant the age of the debt” between index periods

(International Labor Organization 2004, Chapter 10). Payments in each period are determined

by transactions occurring at many previous points in time, as mortgage loans are long-term

contracts. Consequently, our index is based on weighted averages of interest rates and house

prices corresponding to loans or debt of different ages. A fixed-basket approach has the

advantage of being feasible with aggregate interest rate and house price data, but the

disadvantage of not being micro-founded.7

Similar to Canada (Statistics Canada, 2019), we define the index as the product of a debt

index (which is influenced by home prices) and an interest rate index, which together compare payments

in the comparison period $t$ against the reference period $s$.8 The index is based on the model of

a thirty-year fixed rate mortgage, which dominates the U.S. market (about 75% of existing loans

as reported in the CE).9 It is written:

$$P^{MIP} = P^{D} P^{r}, \tag{1}$$

where $P^{D}$ is the debt index and $P^{r}$ is the interest rate index. They are written

$$P^{D} = \frac{\prod_{j=0}^{\bar{\theta}} H_{t-j}^{\psi_{bj}}}{\prod_{j=0}^{\bar{\theta}} H_{s-j}^{\psi_{bj}}} \tag{2}$$

and

$$P^{r} = \frac{\prod_{j=0}^{\theta-1} r_{t-j}^{\varphi_{bj}}}{\prod_{j=0}^{\theta-1} r_{s-j}^{\varphi_{bj}}}. \tag{3}$$

7 We considered such a micro-founded approach which could, for example, average proportional changes in rates actually paid by households between the reference and comparison periods without fixing the loan age. Such an approach may be more appropriate for the U.S. market, which is dominated by 30-year fixed rate mortgages. However, basing such an approach on CE interest rate microdata misses any variation which occurs when a consumer unit moves from one house to another since consumer units are not followed.

8 While our debt index is similar to the housing component of Canada’s mortgage interest index, their interest rate component is based on unit value-like averages using administrative banking data.

9 We ignore preferential treatment of mortgage interest in the tax code.

The indexes measure change from period $s$ to period $t$ by weighting past home prices (relative

to a common base) and interest rates according to the relative importance of loans or debt

initiated in those months to the index periods $t$ and $s$.10

In these expressions, $H_\tau$ is a home price index for month $\tau$, $r_\tau$ is an average interest rate

for month $\tau$, $\psi_{bj}$ is the population-weighted proportion of mortgagor-month observations with

debt of age $j$ (measured as the number of months since the property was acquired), and $\varphi_{bj}$ is

the population-weighted proportion of mortgagor-month observations with current loans of

age $j$ (measured as the number of months since the first payment) during the reference period

$b$. The $\psi$ and $\varphi$ parameters differ due to refinances. We use the proportion of mortgagors

(rather than the proportion of debt, which is closer to what Statistics Canada uses) in keeping

with the equal-weighting objective of the HCI. The parameter $\theta$ equals 360 to reflect the

number of potential payments in a thirty-year loan, while $\bar{\theta}$ is set higher to allow for acquisition

periods to be earlier on refinanced properties. While not well bounded in theory, we set $\bar{\theta}$

equal to 408 to accommodate the beginning of our house price indexes in January 1975. This

covers about 97.5% of observations in our sample. We evaluate adjacent months $t$ and $s$. We

set $b$ as the fourth quarterly lag of the quarter containing month $t$. This reflects a realistic

production constraint for using CE data to construct the weights while keeping them as current

as possible. We use CE microdata on mortgage expenses and keep those observations with 30-

10 While the product of two geometric means with identical weights could be written as one geometric mean, writing the index as a product of two components makes for convenient discussion and analysis.

year fixed rate first mortgages on primary residences. We drop loan records that likely pertain

to non-housing expenditures (second mortgages and home equity lines of credit).
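As a rough illustration of Equations (1) through (3), the sketch below computes the debt index and the interest rate index as ratios of weighted geometric means of lagged values. The function names, the dictionary layout, and the three-bucket age distributions are assumptions made for this example; the indexes described in the text use far longer lag windows and weights estimated from CE microdata.

```python
import math


def weighted_geomean_ratio(series, t, s, weights):
    """Ratio of weighted geometric means of lagged values at t versus s.

    series  : dict mapping a month number to a positive value (H or r)
    weights : list where weights[j] is the share of observations of age j
    """
    log_ratio = sum(
        w * (math.log(series[t - j]) - math.log(series[s - j]))
        for j, w in enumerate(weights)
    )
    return math.exp(log_ratio)


def mortgage_interest_index(home_prices, rates, t, s, psi, phi):
    """Product of a debt index (from home prices) and an interest rate index."""
    debt_index = weighted_geomean_ratio(home_prices, t, s, psi)
    rate_index = weighted_geomean_ratio(rates, t, s, phi)
    return debt_index * rate_index, debt_index, rate_index


# Hypothetical inputs: home prices grow 1% per month, rates are flat at 4%.
home_prices = {m: 100.0 * 1.01 ** m for m in range(5)}
rates = {m: 0.04 for m in range(5)}
psi = [0.5, 0.3, 0.2]  # hypothetical debt-age shares (sum to one)
phi = [0.6, 0.3, 0.1]  # hypothetical loan-age shares (sum to one)

mip, debt, rate = mortgage_interest_index(home_prices, rates, t=4, s=3, psi=psi, phi=phi)
```

With flat interest rates the rate index equals one, so the whole monthly change comes from the debt index, which here reproduces the 1% monthly growth in home prices because the age shares sum to one.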

We use monthly averages of the weekly 30-year fixed mortgage rate averages from the

Freddie Mac Primary Mortgage Market Survey (PMMS), which are available only for the U.S.

market. We also use the Federal Housing Finance Agency’s (FHFA) All Transactions House Price

Index. This index is quarterly, and we interpolate monthly values using the natural spline in

SAS’s PROC EXPAND. The FHFA’s purchase-only house price index is monthly and superior

conceptually for a debt index representing past home purchases. However, this series only goes

back to 1991, and would not be long enough to cover all loan ages in our sample.
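The interpolation step can be sketched with a textbook natural cubic spline (zero second derivative at both ends). This is not a replica of PROC EXPAND, whose treatment of observation timing and end conditions may differ; placing each quarterly value at the middle month of its quarter is an assumption of this example, and the index values are invented.

```python
def natural_cubic_spline(x, y):
    """Return a function evaluating the natural cubic spline through (x, y).

    x must be strictly increasing; second derivatives are zero at both ends.
    """
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]
    # Right-hand side of the tridiagonal system for the c coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (y[i + 1] - y[i]) / h[i] - 3 * (y[i] - y[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal solve.
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l = 2 * (x[i + 1] - x[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l
    # Back substitution for the polynomial coefficients on each interval.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(xq):
        # Locate the interval containing xq (end intervals extrapolate).
        j = n - 1
        for i in range(n):
            if xq <= x[i + 1]:
                j = i
                break
        dx = xq - x[j]
        return y[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


# Hypothetical quarterly index values, placed at the middle month of each quarter.
quarter_months = [2, 5, 8, 11]
quarter_values = [100.0, 101.5, 103.8, 104.9]
spline = natural_cubic_spline(quarter_months, quarter_values)
monthly = {m: spline(m) for m in range(2, 12)}
```

The spline passes through every quarterly observation, and it reproduces a straight line exactly when the inputs are linear, which gives a simple sanity check before applying it to a real series.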

3.A.2. Property Tax Payment Index

The property tax payment index measures the change in average property tax payments

for households. Our proposed method attempts to hold the aggregate quality of the housing

stock constant and uses annual data from the CE.11 Let $X_{s,t}$ and $V_{s,t}$ denote proportional growth

in population aggregates for property tax payments and owner-occupied housing unit values

between years $s$ and $t$, and let $H_{s,t}$ be a constant-quality home price index between years $s$ and

$t$. We use time series representing the entire U.S. and leave it for future research to extend the

method to geographic areas, which require more granular tax data than we currently have. We

compute the following:

$$P^{PTP} = \frac{X_{s,t}}{V_{s,t}} H_{s,t}. \tag{4}$$

11 The CE asks homeowners the annual property taxes owed on their primary residence and adjusts these amounts if the property is partly used as a business. The CE also asks the consumer unit to estimate the market value of their primary residence. Investigating potentially more timely sources of property tax data is a task for future research.

Our method is similar to that of Statistics Canada and the Office for National Statistics,

which compute unit value indexes, or ratios of average property tax payments, though they do

so for different geographic areas. Let $N_{s,t}$ be the growth in the number of owner-occupied

housing units between $s$ and $t$. A similar approach we explored with CE data computes

$$P^{PTUV} = \frac{X_{s,t}}{N_{s,t}}, \tag{5}$$

where we use the number of owner-occupier consumer units to proxy for the number of

owner-occupied housing units.12 Equation (4) is equal to equation (5) divided by $(V_{s,t}/N_{s,t})/H_{s,t}$,

which is the growth in average home values deflated by the constant-quality home price index.

We interpret this ratio as a measure of change in dwelling quality which is relevant under the

assumption that the total housing market valuations $V_{s,t}$ and the house price indexes $H_{s,t}$

approximate changes in value and price as would be measured by tax assessors. We found that

the long-term trends of Eq. (4) and (5) were very similar. As in Canada and the U.K., we do not

attempt to control for potential differences in quality of municipal services.
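The relationship between the two property tax measures can be checked numerically. The sketch below reads Equation (4) as the growth in the effective tax rate (tax payments relative to housing values) multiplied by the constant-quality home price change, which is the form implied by its stated relationship to Equation (5). The function name and the growth factors are hypothetical.

```python
def property_tax_indexes(tax_growth, value_growth, unit_growth, hpi_growth):
    """Return the constant-quality index, the unit value index, and the quality ratio.

    tax_growth   : growth in aggregate property tax payments (X)
    value_growth : growth in aggregate owner-occupied housing values (V)
    unit_growth  : growth in the number of owner-occupied units (N)
    hpi_growth   : constant-quality home price index change (H)
    """
    ptp = tax_growth / value_growth * hpi_growth          # Eq. (4)
    ptuv = tax_growth / unit_growth                       # Eq. (5)
    quality = (value_growth / unit_growth) / hpi_growth   # implied quality change
    return ptp, ptuv, quality


# Hypothetical annual growth factors.
ptp, ptuv, quality = property_tax_indexes(
    tax_growth=1.06, value_growth=1.08, unit_growth=1.01, hpi_growth=1.05
)
```

Dividing the unit value index by the implied quality change recovers the constant-quality index, so the two measures differ only to the extent that average home values grow faster or slower than constant-quality prices.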

Our preliminary efforts use annual property tax aggregates from the CE, as the survey

asks about annual tax obligations rather than monthly payments. The monthly expenditure

microdata include these figures divided by 12. We find that using Equations (4) and (5) on

this average monthly data leads to substantial short-term sampling variation. For this reason,

we compute the property tax index at an annual frequency and interpolate monthly values

12 In the CE, consumer units are equivalent to households in the vast majority of cases but are defined by joint economic decision making rather than residence or familial relationships.

using a spline function. Statistics Canada and the Office for National Statistics, for instance,

update their property tax indexes once per year. The CE is not the ideal source for property tax

and housing value data, as data for a calendar year are released about nine months after that

year ends. For this reason, this paper’s analysis only covers through the end of 2021. Finding

timelier and larger samples using alternative data is an objective for future research.

3.B. Upper-level Aggregation

As in the CPI, we use CE data to derive upper-level aggregation weights, with some

important differences. As shown in

Table 1, the set of eligible elementary item strata now includes property taxes and mortgage

interest and excludes owner equivalent rent. The property tax and mortgage interest weights

are derived from the monthly expenditures on those items as collected by the CE. In addition,

we use the full reported values of expenditures on items like maintenance and repair,

homeowner’s insurance, appliances, and household furnishings. Under the rental equivalence

approach, these items are scaled down for owner-occupiers to reflect the likelihood of a renter

making the same purchase. Table 2 compares average housing-related relative importance

across consumer units in different subpopulations —by housing tenure, an indicator for being a

wage earner or clerical worker (as in the CPI-W), and an indicator for being elderly (age greater

than or equal to 62, as in the R-CPI-E)13—both under the payments approach and rental

equivalence. In general, housing payments make up a smaller share of overall spending under

the payments approach than under rental equivalence. For the urban population, for instance,

housing under the payments approach amounts to 34.3% of the market basket on average,

versus 42.9% on average under rental equivalence. Interestingly, patterns of spending across

some subpopulations differ by housing approach. For instance, under rental equivalence, the

average share going to housing among the elderly is relatively high at 46.8%. Under the

payments approach, however, the elderly have a high proportion going to insurance,

appliances, maintenance, and repairs (“other housing”), but relatively less going to mortgage

interest, resulting in a total housing weight of 34.1%, slightly less than the overall urban

population (34.3%).

13 Consumer units were classified according to their reported demographic in their last interview in the sample.

Table 1: Weights for Select Housing Items for the HCI Subsample in 2019

| Code | Description | Payments: $ Bil. | Payments: % RI* | Rental Equivalence: $ Bil. | Rental Equivalence: % RI* |
|------|-------------|------------------|-----------------|----------------------------|---------------------------|
| HC01 | Owner’s Equivalent Rent of Primary Residence | NA | NA | 1,144.36 | 22.40 |
| HC09 | Unsampled Own. Equiv. Rent of Second. Res. | NA | NA | 56.29 | 0.75 |
| HD01 | Tenants’ and Household Insurance | 38.02 | 1.01 | 17.24 | 0.38 |
| HH01 | Floor Coverings | 8.29 | 0.18 | 2.54 | 0.05 |
| HK01 | Major Appliances | 17.05 | 0.39 | 2.38 | 0.06 |
| HK09 | Other Appliances | 0.08 | 0.00 | 0.07 | 0.00 |
| HM01 | Tools, Hardware, and Supplies | 17.23 | 0.43 | 11.67 | 0.26 |
| HM09 | Unsamp. Tools, Hardw., Outdoor Equip, Supp. | 58.44 | 1.31 | 9.35 | 0.20 |
| HP04 | Repair of Household Items | 46.52 | 0.83 | 4.14 | 0.08 |
| HP09 | Unsampled Household Operations | 10.69 | 0.23 | 4.29 | 0.07 |
| HR01 | Property Tax of Primary Residence | 199.70 | 4.51 | NA | NA |
| HR09 | Property Tax of Secondary Residence | 8.61 | 0.16 | NA | NA |
| HS01 | Mortgage Interest of Primary Residence | 211.64 | 4.26 | NA | NA |
| HS09 | Mortgage Interest of Secondary Residence | 4.55 | 0.08 | NA | NA |
| HT01 | Other Owner Payments for Primary Residence | 14.10 | 0.42 | NA | NA |
| HT09 | Other Owner Payments for Secondary Res. | 1.29 | 0.02 | NA | NA |

* Average (equally weighted) relative importance across consumer units.

Table 2: Average Household Relative Importance for Housing by Subpopulation (percent)

| Category | Urban | Wage-earner | Elderly | Own. w/ Mortgage | Own. w/o Mortgage | Renter |
|----------|-------|-------------|---------|------------------|-------------------|--------|
| Payments Approach | | | | | | |
| Rent | 9.2 | 13.0 | 6.3 | 0.1 | 0.2 | 31.8 |
| Property Tax | 4.7 | 4.2 | 5.7 | 6.1 | 7.0 | 0.2 |
| Mortgage Interest | 4.3 | 5.2 | 2.7 | 10.2 | 0.2 | 0.1 |
| Other Housing | 16.0 | 14.8 | 19.4 | 16.9 | 22.0 | 8.8 |
| Total Housing | 34.3 | 37.2 | 34.1 | 33.2 | 29.5 | 40.9 |
| Rental Equivalence Approach | | | | | | |
| Rent | 9.2 | 13.0 | 6.3 | 0.1 | 0.2 | 31.7 |
| Owner’s Equiv. Rent | 23.1 | 20.9 | 29.4 | 31.1 | 33.9 | 0.8 |
| Other Housing | 10.6 | 10.4 | 11.1 | 10.9 | 12.0 | 8.7 |
| Total Housing | 42.9 | 44.3 | 46.8 | 42.1 | 46.1 | 41.2 |

Note: Cells show average December 2020 relative importance (2019 reference period weights price-updated to December 2020 values) across households meeting the HCI sample requirement. While expenditures cover a year, consumer units are classified by attributes from their last collection quarter.

Our upper-level aggregation uses the Lowe formula and, as in the CPI (as of January

2023), the quantity weights pertain to annual expenditure reference periods, which are updated

each year. The household-weighted aggregation starts from the CE Interview sample, as

consumer units contribute up to one year of data and the Interview comprises most eligible

expenditures. Eligible expenditures from the Diary survey are imputed to the Interview sample

using a matching procedure based on Hobijn et al. (2009), which is described later in

this section and is similar to that used in Martin (2022). The procedure matches eligible Diary

consumer units to an Interview consumer unit based on demographic characteristics that are

predictive of total expenditure. The second-stage aggregation is then based on the Lowe

formula with lagged expenditure weights.

$$P^{HCI} = \sum_{a \in \mathcal{A}} \sum_{i \in \mathcal{I}} \bar{s}_{a,i,v,b} \, P_{a,i,t,v} \tag{6}$$

$$\bar{s}_{a,i,v,b} = \left( \frac{H_{a,b}}{H_{b}} \right) H_{a,b}^{-1} \sum_{h \in \mathcal{H}_{a,b}} \omega_h \, s_{i,v,b,h} \tag{7}$$

$$H_{a,b} = \sum_{h \in \mathcal{H}_{a,b}} \omega_h, \qquad H_{b} = \sum_{a \in \mathcal{A}} \sum_{h \in \mathcal{H}_{a,b}} \omega_h, \tag{8}$$

where $a$ indexes the geographic area, $i$ the item stratum, $v$ the index pivot month, $b$ the weight

reference period, and $h$ the consumer unit. The set of areas is $\mathcal{A}$, the set of items $\mathcal{I}$, and the

set of consumer units in area $a$ during period $b$ is $\mathcal{H}_{a,b}$. The elementary index between pivot

month $v$ and period $t$ for item $i$ in area $a$ is given by $P_{a,i,t,v}$. The associated household-weighted

expenditure shares are $\bar{s}_{a,i,v,b}$. These are equally (with respect to the population) weighted

averages of individual consumer unit annual expenditure shares $s_{i,v,b,h}$, with $\omega_h$ being

household $h$’s sampling weight. The weight reference period $b$ is the calendar year two years

prior to the calendar year containing month $t$, and the expenditure shares $s_{i,v,b,h}$ are

price-updated to represent period $v$ values using the ratio of the elementary index in month $v$ to its

average over period $b$.
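The second-stage aggregation can be sketched as follows, assuming each consumer unit's price-updated expenditure shares are already available. Because the area weight total cancels in the share formula, each area-item share is simply the sampling-weighted sum of unit-level shares divided by the total weight across all areas. The record layout, function names, and numbers are hypothetical.

```python
def household_weighted_shares(units):
    """Sampling-weighted mean of consumer-unit expenditure shares by (area, item).

    units: list of dicts with keys 'area', 'weight' (sampling weight) and
           'shares' (dict mapping item to the unit's price-updated share).
    The returned shares sum to one across all areas and items.
    """
    total_weight = sum(u["weight"] for u in units)
    s_bar = {}
    for u in units:
        for item, share in u["shares"].items():
            key = (u["area"], item)
            s_bar[key] = s_bar.get(key, 0.0) + u["weight"] * share / total_weight
    return s_bar


def hci(s_bar, elementary):
    """Share-weighted sum of elementary indexes keyed by (area, item)."""
    return sum(share * elementary[key] for key, share in s_bar.items())


# Hypothetical consumer units and elementary indexes.
units = [
    {"area": "A", "weight": 1000.0, "shares": {"food": 0.3, "housing": 0.7}},
    {"area": "A", "weight": 3000.0, "shares": {"food": 0.1, "housing": 0.9}},
    {"area": "B", "weight": 1000.0, "shares": {"food": 0.5, "housing": 0.5}},
]
elementary = {
    ("A", "food"): 1.02, ("A", "housing"): 1.04,
    ("B", "food"): 1.03, ("B", "housing"): 1.05,
}

s_bar = household_weighted_shares(units)
index = hci(s_bar, elementary)
```

Note that each unit contributes in proportion to its sampling weight rather than its total spending, which is what distinguishes this from expenditure-weighted aggregation.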

Consumer units participate in the CE for up to four collection quarters, providing up to

twelve months of expenditures. Because participation is on a rolling basis and there is unit

nonresponse and occasional attrition, the number of observations exactly lining up with a single

calendar year is relatively small, often only a few hundred. Therefore, for the HCI, we define a

“reference year” sample differently than does either the CE or CPI. We assign a consumer unit

to a reference year $b$ if its last month of expenditure occurred during year $b$. So that each $h$’s

expenditure basket reflects a whole year, we include only observations which completed all

four quarterly interviews, even if some of their expenditures occurred in the prior calendar

year. For the 2019 reference year (used for indexes in 2021), for instance, we include

consumer units with at least one month occurring in 2019, meaning we include some

observations whose sample tenure started as early as February 2018. With the four-quarter

requirement, this amounts to a sample of 3,063 unique consumer units (12,252 collection

quarters) representing our 2019 reference year. In comparison, 11,740 unique consumer units

(comprising 22,957 collection quarters) in the CE have expenditures recorded for the calendar

year 2019.14 For index subgroup definitions, we use consumer unit characteristics from their

final collection quarter.
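The eligibility rule can be expressed compactly. The sketch below assumes a simplified record layout in which each consumer unit carries a list of completed collection quarters, each with the first and last month of expenditures it covers; this layout and the function name are illustrative and do not reflect the CE file format.

```python
from datetime import date


def reference_year(quarters):
    """Return the reference year for one consumer unit, or None if ineligible.

    quarters: list of (first_month, last_month) date pairs, one per completed
              collection quarter.
    """
    if len(quarters) != 4:  # require all four quarterly interviews
        return None
    last_month = max(end for _, end in quarters)
    return last_month.year  # year of the last month of expenditure


# A unit whose expenditures run from February 2018 through January 2019.
full_tenure = [
    (date(2018, 2, 1), date(2018, 4, 1)),
    (date(2018, 5, 1), date(2018, 7, 1)),
    (date(2018, 8, 1), date(2018, 10, 1)),
    (date(2018, 11, 1), date(2019, 1, 1)),
]
assigned = reference_year(full_tenure)
dropped = reference_year(full_tenure[:3])
```

The first unit is assigned to 2019 even though eleven of its twelve months fall in 2018, while the unit with only three completed quarters is excluded.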

14 These sample sizes were calculated by counting the number of unique FAMID (or the consumer-unit specific portion of the FAMID) for a given expenditure reference period.

As discussed in Martin (2022), including observations with periods less than one year

can distort household-weighted indexes due to greater variability in total expenditures and

lower average expenditure shares for less frequently purchased items. However, there is a

potential trade-off with the four-quarter requirement due to representativity. Table 3 shows

differences in the relative frequencies of a few consumer unit demographics. For the 2019

reference year, the HCI subsample has a greater proportion of owners and elderly than the full

sample of urban consumer units. At the same time, Table 2 shows there are differences in the

average expenditure shares on housing-related payments across these groups, suggesting

potential consequences for price indexes. For instance, the elderly spend relatively more on

property taxes than on mortgage interest, reflecting that they are disproportionately owners

without mortgages.

Table 3: Frequency of Consumer Unit Characteristics by Sample in 2019 (percent)

| Characteristic | All Urban | HCI Subsample |
|----------------|-----------|---------------|
| Owner with mortgage | 37.3 | 41.4 |
| Owner without mortgage | 23.6 | 29.1 |
| Renter | 39.2 | 29.6 |
| Wage earner | 27.0 | 25.3 |
| Elderly | 30.8 | 37.7 |

Nevertheless, we find little evidence of a sample selection bias stemming from our HCI

eligibility criteria, at least during our sample period. Table 4 shows (comparing columns 2 and

3) the impact of using the CE subsample on major group-level weights is small relative to the

effect of using the payments approach or household aggregation. Additionally, we find

(Appendix C) that the sample selection impact on an expenditure-weighted version of the HCI-U

(corresponding to column 4 of Table 4) is minimal, about 0.01 percentage points per year.

Furthermore, our results show a CPI-like index calculated from these subsamples (with Diary

expenditures imputed as described in the next subsection), corresponding to column 3 of Table

4, closely matches the published CPI-U. These together imply our results are driven by the

payments approach and household-weighted aggregation, and not the reference period or CE

subsample. Our current method makes no adjustments to the CE sampling weights, which we

leave to future research. Such adjustments may be more important with more recent data than

our sample period, particularly with recent surges in mortgage interest rates.

There are a few other differences between our research indexes and official CPI

methods. Since the HCI is based on consumer unit-specific shares, which must be weakly

positive, we censor negative annual expenditures at zero.15 We also make some small item-

structure changes to simplify calculations using historical data. Finally, we omit weight-

smoothing procedures used in the CPI, including composite estimation for the item-area

weights, which are designed to lower their sampling variance across geographic areas. Our all

items, all areas CPI-U replications closely match the published indexes even without these

procedures, and our prototype procedure only estimates property tax and mortgage interest at

the national level. We leave it to future research to extend weight-smoothing procedures to the

HCIs.
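The censoring of negative annual expenditures mentioned above can be illustrated with a short sketch. It assumes that censoring is applied item by item before shares are formed and that the denominator is the censored total; both are assumptions of this example, and the item names and amounts are invented.

```python
def censored_shares(annual_expenditures):
    """Censor negative item expenditures at zero, then convert to shares."""
    censored = {item: max(0.0, amount) for item, amount in annual_expenditures.items()}
    total = sum(censored.values())
    if total == 0:
        return {item: 0.0 for item in censored}
    return {item: amount / total for item, amount in censored.items()}


# Hypothetical consumer unit whose vehicle sales exceeded its vehicle purchases.
shares = censored_shares({"food": 6000.0, "rent": 12000.0, "used_cars": -2000.0})
```

Without censoring, the negative vehicle entry would produce a negative share for that unit, which a household-level share cannot sensibly take.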

Figure 1 below shows the December 2020 relative importance by major expenditure

group and select housing categories and compares them with the published shares for the CPI-

U. The HCI shares correspond to the 2019 weight reference year, while for the CPI they

15 This affects items RC01 “Sports Vehicles, Including Bicycles”, TA02 “Used Cars and Trucks”, and TA09 “Unsampled New and Used Motor Vehicles.” The CPI counts returns or sales as negative expenditures.

correspond to the 2017-18 reference period. Table 4 tracks the change in relative importance

by major group as different HCI elements are activated. The effects of the payments approach

and household-weighted aggregation on the relative weights are significant but sometimes

offset each other. For instance, the overall housing weight in the HCI is smaller than in the

CPI, as property tax, mortgage interest, and the increase in other housing outlays together amount to

less than the decrease due to the exclusion of OER. By itself, this decrease in housing weight

increases the weight allocated to other categories, like medical and recreation. At the same

time, however, household-weighted aggregation shifts weight toward households with lower

total expenditures, further increasing the relative importance of rent and food while decreasing

that of transportation.

Figure 1: December 2020 Relative Importance for HCI-U and CPI-U

Panel a: HCI-U (2019 weights). Food & Bev. 20.2%; Housing: Rent 9.2%; Housing: Prop. Tax 4.7%; Housing: Mortgage 4.3%; Housing: Other 16.0%; Apparel 3.1%; Transportation 14.2%; Medical 11.1%; Recreation 6.6%; Educ. & Comm. 6.8%; Other 3.8%.

Panel b: CPI-U (2017-18 weights). Food & Bev. 15.2%; Housing: Rent 7.9%; Housing: OER 24.3%; Housing: Other 10.3%; Apparel 2.7%; Transportation 15.2%; Medical 8.9%; Recreation 5.8%; Educ. & Comm. 6.8%; Other 3.2%.

23

Table 4: December 2020 Relative Importance for Different Index Types (percent)

Major Group            CPI-U         (2)           (3)           (4)           HCI-U
Food and Beverages     15.16         15.68         15.60         17.96         20.16
Housing                42.39         41.84         42.13         33.34         34.26
Apparel                 2.66          2.70          2.67          3.07          3.15
Transportation         15.16         15.43         14.60         16.80         14.23
Medical                 8.87          8.79          9.18         10.58         11.09
Recreation              5.80          5.80          6.16          7.08          6.59
Education and Comm.     6.81          6.72          6.57          7.61          6.76
Other                   3.16          3.04          3.09          3.56          3.76

Methods*
Reference Period       2017-18       2018-19       2019**        2019**        2019**
CE Sample              Full          Full          4-quarter     4-quarter     4-quarter
Aggregation            Expenditure   Expenditure   Expenditure   Expenditure   Household
Owner Occ. Housing     REQ***        REQ***        REQ***        Payments      Payments

* Columns 2-5 also reflect other methodology changes and simplifications described in text.
** Under our sample eligibility criteria, this includes spending back to February 2018.
*** REQ = Rental Equivalence

3.B.1. Interview-Diary Matching Procedure

As mentioned, the basis of our household average expenditure weights is the CE

Interview sample, which covers about three-quarters of the expenditure basket as traditionally

sourced by the CPI. We implement a statistical matching procedure based on Hobijn et al.

(2009) to impute the remaining proportion which CPI sources from the Diary.16 Similar

observations from the Diary sample provide the remaining expenditure data for each Interview

consumer unit, according to a model of expenditures as a function of demographic

characteristics. The dependent variable is expenditures on items which HCI (and the CPI)

sources from the Diary, but for which the Interview either collects the same item or has more

16 Garner et al. (2022) and Martin (2022) also use matching processes based on Hobijn et al. (2009).

aggregate data.17 The model is a convenient way of combining many characteristics according

to which linear combination most strongly predicts expenditures. We then use the predicted

values to form measures of distance between an Interview recipient and its potential Diary

donors. For our main results, the only attribute guaranteed to match between donor and

recipient is quintile group membership based on the distribution of annual before-tax income.18

For our results on housing tenure subpopulations, we also guarantee that housing tenure matches.

The matching procedure is many-to-one, as we draw four donor Diaries for each Interview in

each month with replacement. The procedure is implemented separately by month so that

weekly Diary donors are evenly distributed temporally over the recipient Interview’s sample

tenure. Due to the sample selection criteria outlined earlier, for reference year 2019, for

example, that means we are running monthly regressions from February 2018 to December

2019. The stratification and model estimation are done on the full Interview sample, not just

the four-quarter subsample.

First, we stratify both Interview and Diary consumer unit samples for the reference

period by the sample quintiles of annual before-tax income. For each month $t$ and quintile

grouping $q$, we use the Interview sample to estimate the regression

$$y_{ht} = \boldsymbol{x}_{ht}\boldsymbol{\beta}_{qt} + u_{ht}, \tag{9}$$

17 From Martin (2022), Table A2, these amount to about 80% of Diary-sourced expenditures in 2019. Alternatively, it might seem attractive to use the Diary sample to estimate Diary expenditures as a function of demographic characteristics, as we intend to impute these expenditures for the Interview sample. However, we find that characteristics explain relatively little variation in Diary expenditures, perhaps due to the short (week-long) recall period.

18 The Diary samples are small enough that conditioning on multiple characteristics quickly leads to empty cells. See Hobijn et al. (2009) for more discussion.

where $y_{ht}$ is the logged expenditure of consumer unit $h$ in month $t$. The term $u_{ht}$ is an error term, and $\boldsymbol{x}_{ht}$

includes Census region, urban/rural status, the age, race, sex, and education of the reference person,

consumer unit size, the log of annual before-tax income (if positive), and an indicator for

whether income was negative.19 We use the least squares estimator weighted by the CE

sampling weight, finlwt21. Over the sample period, R-squared values for the quintile and

month-specific regressions averaged 0.17, while income quintile itself explained about 0.31 of

the variation in the dependent variable.

Let $\hat{\boldsymbol{\beta}}_{qt}$ be the slope estimate for quintile $q$ in month $t$. As household characteristics are

available and comparably defined in both surveys, we calculate predicted values $\hat{y}_{ht} = \boldsymbol{x}_{ht}\hat{\boldsymbol{\beta}}_{qt}$

for each Diary and Interview observation. For a given Interview observation $h$ and Diary

observation $k$, the distance metric is defined as

$$\delta_t(h, k) = |\hat{y}_{ht} - \hat{y}_{kt}|. \tag{10}$$

Within each month and income quintile, we calculate $\delta_t(h, k)$ for all $\{h, k\}$ pairs. Then for each

Interview observation $h$, we randomly select (with replacement) four $k$ from the twenty

smallest $\delta_t(h, k)$ out of all the Diary observations from the same month and income quintile.

The random component is intended to ensure a more even distribution of matches across Diary

observations. The detailed set of expenditures of the donor Diary is then assigned to the

recipient Interview. As one donor Diary is intended to represent one quarter of one month of

expenditure, but Diaries correspond to a one-week recall period, the donor Diary expenditures

19 These demographic variables technically pertain to the collection quarter or some other reference period, so we implicitly assume they represent the associated reference months. For the matching regressions, we allow a consumer unit’s attributes to vary by collection quarter.

are scaled by 13/12. This process is repeated for each Interview observation, for each month it

is in the sample.20 Since the Interview sample is much larger than the Diary on a per-month

basis, each Diary is matched with several Interviews. Further analysis of the matching

procedure is in Appendix B.
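The matching steps for a single month and income quintile can be sketched as follows. This is an illustrative sketch on synthetic data, not the authors' code: the function names, the three regressors, and all simulated values are assumptions. It fits the weighted regression of Eq. (9) on the Interview sample, forms the distance of Eq. (10), draws four donors with replacement from the twenty nearest Diaries, and applies the 13/12 scaling.

```python
import numpy as np

def fit_wls(X, y, w):
    """Weighted least squares estimate of b in y = X b + u (weights w)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def match_donors(X_int, y_int, w_int, X_diary, rng, n_nearest=20, n_draws=4):
    """For each Interview unit, return indices of randomly drawn Diary donors."""
    beta_hat = fit_wls(X_int, y_int, w_int)          # Eq. (9), Interview sample only
    yhat_int = X_int @ beta_hat                      # predicted values, both surveys
    yhat_diary = X_diary @ beta_hat
    dist = np.abs(yhat_int[:, None] - yhat_diary[None, :])   # Eq. (10)
    k = min(n_nearest, X_diary.shape[0])
    nearest = np.argsort(dist, axis=1)[:, :k]        # k closest Diaries per Interview
    picks = rng.integers(0, k, size=(X_int.shape[0], n_draws))  # with replacement
    return np.take_along_axis(nearest, picks, axis=1)

# Synthetic data for one month-by-quintile cell (all values are made up).
rng = np.random.default_rng(0)
n_int, n_diary = 200, 60
X_int = np.column_stack([np.ones(n_int), rng.normal(size=(n_int, 2))])
X_diary = np.column_stack([np.ones(n_diary), rng.normal(size=(n_diary, 2))])
y_int = X_int @ np.array([6.0, 0.3, -0.2]) + rng.normal(scale=0.5, size=n_int)
w_int = rng.uniform(0.5, 2.0, size=n_int)            # stand-in for sampling weights

donors = match_donors(X_int, y_int, w_int, X_diary, rng)

# Each donor Diary covers one week; four donors scaled by 13/12 stand in for a month.
diary_weekly_total = rng.gamma(shape=2.0, scale=100.0, size=n_diary)
imputed_month = (13 / 12) * diary_weekly_total[donors].sum(axis=1)
```

Drawing at random among the twenty nearest candidates, rather than always taking the single closest Diary, spreads matches across donors, as the text notes. Over a quarter, twelve donor weeks scaled by 13/12 represent thirteen weeks.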

4. Results

We find the HCI-U follows similar patterns of acceleration and deceleration as the CPI-U,

but it has significantly lower average rates of growth during our sample period. The 12-month

change in the HCI-U averages 1.51%, versus 1.86% for the CPI-U, as shown in

Table 5.
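For readers unfamiliar with the statistic, the following sketch shows one plausible way to compute an average 12-month percent change by calendar year from monthly index levels. The function name and the synthetic index are assumptions for illustration; the exact averaging convention used for Table 5 is not spelled out in this section.

```python
import numpy as np

def average_12_month_change(index, first_year):
    """Average 12-month percent change by calendar year.

    `index` holds monthly index levels starting in January of `first_year`.
    """
    index = np.asarray(index, dtype=float)
    # Percent change of each month relative to the same month one year earlier.
    pct = 100.0 * (index[12:] / index[:-12] - 1.0)
    n_years = pct.size // 12
    by_year = pct[: n_years * 12].reshape(n_years, 12).mean(axis=1)
    return {first_year + 1 + i: float(by_year[i]) for i in range(n_years)}

# Synthetic index growing at a constant 0.15% per month (made-up numbers).
levels = 100.0 * 1.0015 ** np.arange(36)
averages = average_12_month_change(levels, first_year=2012)
print(averages)
```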

20 In the CPI, diary expenditures are multiplied by 13 to account for the difference in recall periods between weekly diaries and quarterly interviews. The scaling in our procedure is analogous in that an interview is matched with a total of 12 diaries each quarter, and with the scaling these also represent 13 weeks.

Figure 2 plots the index levels, showing markedly different trends between the CPI-U

and HCI-U from 2012 to 2020. The two indexes increased at a similar rate in 2021, averaging

4.6-4.7% year-over-year growth throughout the year.

Table 5 includes an index (U-EW-REQ) which uses expenditure weighting and the rental

equivalence approach but uses our CE subsample and processing methods. It also includes a

comparable series (U-EW-PAY) which instead uses the payments approach but uses

expenditure weighting as in the CPI. Comparisons of these indexes and the HCI-U show the

difference in trends and average growth reflects primarily the impact of the payments

approach. U-EW-PAY averages about 0.39 percentage points per year less than U-EW-REQ, and

in a single year (2016) averages 0.74 percentage points lower. In 2021, the impact of the

payments approach is to add 0.15 percentage points to the average 12-month percent change,

reflecting increasing home prices and interest rates. In 2022, we also expect this effect to be

positive and much larger in magnitude due to the large increase in mortgage interest rates. In

contrast, comparing HCI-U to U-EW-PAY shows household-weighted aggregation adding

only a slight amount to the overall average 12-month percent change (0.05 percentage points), but yearly average

differences are as high as 0.16 percentage points in 2017. In 2021, household-weighted

aggregation lowers HCI-U by 0.1 percentage points on average.
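The comparisons in this paragraph amount to differencing columns of Table 5. The sketch below repeats that arithmetic for the overall average and three of the years discussed; the dictionary layout and function name are illustrative choices. Because the table values are rounded to two decimals, differences computed this way can deviate from figures quoted in the text by about 0.01 percentage points.

```python
# Average 12-month percent changes copied from Table 5 (percent).
table5 = {
    "average": {"HCI-U": 1.51, "CPI-U": 1.86, "U-EW-REQ": 1.84, "U-EW-PAY": 1.46},
    2016: {"HCI-U": 0.56, "CPI-U": 1.26, "U-EW-REQ": 1.24, "U-EW-PAY": 0.51},
    2017: {"HCI-U": 1.76, "CPI-U": 2.13, "U-EW-REQ": 2.13, "U-EW-PAY": 1.60},
    2021: {"HCI-U": 4.62, "CPI-U": 4.69, "U-EW-REQ": 4.58, "U-EW-PAY": 4.73},
}

def decompose(row):
    """Split the HCI-U minus CPI-U gap into three steps (percentage points)."""
    return {
        # CPI-like replication on the HCI sample versus the published CPI-U.
        "sample_and_processing": row["U-EW-REQ"] - row["CPI-U"],
        # Switching from rental equivalence to the payments approach.
        "payments_approach": row["U-EW-PAY"] - row["U-EW-REQ"],
        # Switching from expenditure-weighted to household-weighted aggregation.
        "household_weighting": row["HCI-U"] - row["U-EW-PAY"],
    }

effects = {period: decompose(row) for period, row in table5.items()}
for period, parts in effects.items():
    print(period, {name: round(value, 2) for name, value in parts.items()})
```

The three steps sum to the total HCI-U minus CPI-U gap by construction, which makes it easy to see that the payments approach accounts for most of the difference on average.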

Figure 2: HCI-U and CPI-U Index Levels

[Line chart of index levels, 2011-2021. Series: lowe-u (ew, req); lowe-u (ew, pay); cpi-u; hci-u.]

Table 5: Average 12-month Percent Changes by Year, HCI and CPI

Year      HCI-U     CPI-U     U-EW-REQ   U-EW-PAY   HCI-OM    HCI-ONM   HCI-RNT
2013       0.99%     1.47%     1.43%      0.86%      0.52%     1.22%     1.57%
2014       1.41%     1.62%     1.63%      1.27%      1.02%     1.65%     1.77%
2015      -0.44%     0.12%     0.15%     -0.44%     -0.88%    -0.52%     0.27%
2016       0.56%     1.26%     1.24%      0.51%      0.10%     0.55%     1.19%
2017       1.76%     2.13%     2.13%      1.60%      1.41%     1.79%     2.24%
2018       2.36%     2.44%     2.42%      2.33%      2.32%     2.23%     2.52%
2019       1.39%     1.81%     1.81%      1.43%      1.30%     1.02%     1.80%
2020       0.93%     1.24%     1.21%      0.84%      0.65%     0.89%     1.31%
2021       4.62%     4.69%     4.58%      4.73%      4.54%     4.95%     4.44%
Average    1.51%     1.86%     1.84%      1.46%      1.22%     1.53%     1.90%

Notes: U signifies urban population. U-EW-REQ is a CPI-like replication using the HCI sample and simplified expenditure processing methods, but expenditure-weighting and rental equivalence. Similarly, U-EW-PAY uses expenditure-weighting, but the payments approach. “OM” is owners with a mortgage, “ONM” is owners without a mortgage, and “RNT” is renters.

Figure 3 describes further how the actual outlays for owner-occupiers are associated

with lower inflation than would be implied by rental equivalence. Over the sample period, the

official index for owner’s equivalent rent increased 33.8% cumulatively, while our sub-aggregate

for owner’s payments (combining property tax, mortgage interest, and other owner payments)

increased only 11.5%. Of the two major components of owner’s payments, the

property tax index trends similarly to owner’s equivalent rent for most of the sample period.

However, the mortgage interest index trends flat, not yet picking up the sharp increases in

interest rates occurring in 2022 after our sample period ends.21 We also note that the evolution of

the mortgage interest index is smoother than current average mortgage interest rates (from

21 Our analysis is constrained by sourcing property tax payments from the CE, which as of June 2023 are only available through the first half of 2022. The average 12-month change for the mortgage interest index is 8.2% in 2022. Using the first half of 2022 property tax burden (X/V) as a crude forecast, we find an average change in the owner’s payments index of 10.0% in 2022 (versus 5.7% for owner’s equivalent rent), and an average change in the HCI-U of 8.7% (versus 8% for the CPI-U).

the Freddie Mac PMMS), because the index is averaging over 30 years of past mortgage rates in

order to reflect current payments.
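The smoothing effect of averaging over past rates can be seen in a toy calculation. The sketch below is not the paper's mortgage interest index: it uses a synthetic annual rate series and, as a loudly simplifying assumption, equal weights on each of the last 30 years, whereas the actual index weights are defined elsewhere in the paper.

```python
import numpy as np

# Synthetic annual mortgage rates in percent for 40 years (made-up values).
years = np.arange(40)
current_rate = 6.0 + 2.0 * np.sin(years / 3.0)

# Hypothetical equal-weight average over the 30 most recent years of rates.
window = 30
trailing = np.array([
    current_rate[i - window + 1 : i + 1].mean()
    for i in range(window - 1, current_rate.size)
])
recent = current_rate[window - 1 :]

# Range of the current rate versus the trailing average over the same years.
range_current = recent.max() - recent.min()
range_trailing = trailing.max() - trailing.min()
print(f"range of current rate:     {range_current:.2f}")
print(f"range of trailing average: {range_trailing:.2f}")
```

Each new year changes the trailing average by only one thirtieth of the difference between the rate entering and the rate leaving the window, so the averaged series moves far less than the current rate.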

Figure 3: Owner’s Equivalent Rent vs. Owner’s Payments

[Line chart, 2011-2021. Left axis, index levels: Owner's Equiv. Rent (HC); Owner's Payments (HR, HS, HT); Property Tax (HR); Mortgage Interest (HS); Other Owner Payments (HT). Right axis, percent: 30-yr fix. rate (PMMS).]

Finally, we further illustrate the treatment of owned housing outlays by estimating HCIs

for three subpopulations: owners with a mortgage (OM), owners without a mortgage (ONM),

and renters (RNT). We define these using the housing tenure value reported by the consumer

unit in their final interview. The final three columns of Table 5 show the average 12-month

percent changes, while Figure 4 plots the index levels. HCI-RNT has average inflation of 1.9%

and is closest to the CPI-U. While there may be overall weight

differences between the urban population and the subpopulation of renters, the evolution of

owner’s equivalent rent is close enough to the evolution of actual rent that this result is not

surprising. In contrast, the HCI inflation for owners is significantly lower, averaging 1.53% per

year for those without a mortgage and 1.22% per year for those with a mortgage. As with the

urban indexes, the relative rankings are not the same year to year. For instance, owners

without mortgages had the highest average inflation in 2021, 4.95%, versus 4.54% for owners

with a mortgage and 4.44% for renters.

Figure 4: HCIs for Housing Tenure Subpopulations

[Line chart of index levels, 2011-2021. Series: hci-om; hci-onm; hci-rnt.]

4.A. Alternative Treatments of Owner Payments for Housing

As discussed in Section 3.A, we follow international practice in excluding mortgage

principal and basing mortgage interest and property tax index changes on two sources: a

change in a rate (the interest rate or the effective property tax rate), and the change in a

monetary base (the debt level and the housing value). The appendix, including Figure 5 and

Figure 6, explores the sensitivity of the indexes to these decisions. Including mortgage principal

would raise the owner’s payments subindex (combining mortgage interest, property tax, and

other payments as in Figure 3) by 0.8 percentage points per year. Combined with the associated

weight increase to mortgages, this would result in an all-items HCI-U that is higher by 0.10

percentage points per year. Excluding home prices would have a more substantial effect, lowering the

owner’s payments index by 4.0 percentage points per year and the all-items HCI-U by 0.38

percentage points per year.

5. Conclusions and Future Research

Our results show the HCI differs substantially from the CPI because it uses the payments

approach for owner-occupied housing, and slightly because it weights households equally in its

upper-level aggregation. The payments approach tracks the actual outlays of homeowners,

which over our sample period of 2012 to 2021 have grown more slowly than (imputed)

owner’s equivalent rent, resulting in lower inflation as measured by the HCI than as measured

by the CPI. We do not argue that the payments approach is superior from the standpoint of

measuring the cost-of-living as an economic theoretic concept or for use in monetary policy.

Rather, by reflecting the explicit outlays of owners, we show the HCI offers a measurement of

the household inflation experience which is empirically different than the CPI.

Future research could focus on many areas. Our measures of price change for mortgage

and property tax payments use only national-level data. A natural next step would be to extend

these to subnational geographic areas, if relevant and feasible. Further down the road,

exploring mortgage microdata of the sort described by Bhutta et al. (2020) could be

informative on different experiences of subpopulations, to the extent that long enough

histories can be obtained to account for the long lives of mortgage loans. More timely and

granular property tax data would also improve the HCI. In addition, in principle, the payments

approach could be extended to any durable good where payment occurs over a long

timeframe, with automobiles in particular being a high priority. Martin (2022) suggests that treating

automobiles under an approach consistent with the target of the index (payments, in our case)

is critical if higher-frequency household weights are to be taken seriously, such as for a monthly

weighted superlative like the C-CPI-U. Custom sampling weights should also be created to

account for demographic differences for the four-quarter sample of consumer units used for

the HCIs, but further analysis may also be warranted related to weight frequency and

subsample selection. With the payments approach weighting of automobiles, for instance,

perhaps the infrequent purchase issue discussed in Martin (2022) is less salient. Finally, the impact

of household-weighted aggregation on the all-items index’s sampling variation and the potential of

weight-smoothing techniques have yet to be explored.

References

Astin, J., & Leyland, J. (2015). Towards a Household Inflation Index: Compiling a consumer price index with public credibility. Royal Statistical Society. Retrieved November 20, 2020, from https://rss.org.uk/RSS/media/News-and-publications/Publications/Reports%20and%20guides/Astin-Leyland-HII-paper-Apr-2015.pdf

Bhutta, N., Fuster, A., & Hizmo, A. (2020). Paying Too Much? Price Dispersion in the US Mortgage Market. Washington, DC: Board of Governors of the Federal Reserve System. doi:10.17016/FEDS.2020.062

Bureau of Labor Statistics. (2020). The Consumer Price Index. In Handbook of Methods. Washington, DC. Retrieved from https://www.bls.gov/opub/hom/cpi/home.htm

Central Statistics Office. (2016). Consumer Price Index: Introduction of Updated Series (Base: December 2016=100). Cork: Central Statistics Office. Retrieved from https://www.cso.ie/en/media/csoie/methods/consumerpriceindex/CPI_-_introduction_to_series_2016.pdf

Diewert, W. E. (1976). Exact and Superlative Index Numbers. Journal of Econometrics, 4(2), 115-145. doi:10.1016/0304-4076(76)90009-9

Diewert, W. E., & Nakamura, A. O. (2009). Accounting for Housing in a CPI. Philadelphia: Federal Reserve Bank of Philadelphia. Retrieved from https://www.philadelphiafed.org/-/media/frbp/assets/working-papers/2009/wp09-4.pdf

Diewert, W. E., & Shimizu, C. (2021). Chapter 10: The Treatment of Durable Goods and Housing. In Consumer Price Index: Theory (Draft). Washington, D.C.: International Monetary Fund. Retrieved from https://www.imf.org/en/Data/Statistics/cpi-manual#companion

Federal Housing Finance Agency. (2021). House Price Index Datasets. Retrieved from https://www.fhfa.gov/DataTools/Downloads/Pages/House-Price-Index-Datasets.aspx

Freddie Mac. (2022). Primary Mortgage Market Survey - About. Retrieved April 29, 2022, from Primary Mortgage Market Survey: https://www.freddiemac.com/pmms/about-pmms

Freddie Mac. (2023). Primary Mortgage Market Survey - Archive. Retrieved March 17, 2023, from Primary Mortgage Market Survey: https://www.freddiemac.com/pmms/pmms_archives

Garner, T. I., & Verbrugge, R. (2009). Reconciling user costs and rental equivalence: Evidence from the US consumer expenditure survey. Journal of Housing Economics, 18(3), 172-192. doi:10.1016/j.jhe.2009.07.001

Gillingham, R., & Lane, W. (1982). Changing the treatment of shelter costs for homeowners in the CPI. Monthly Labor Review, 9-14. Retrieved from https://www.bls.gov/opub/mlr/1982/06/art2full.pdf

Goodhart, C. (2001). What Weight Should be Given to Asset Prices in the Measurement of Inflation? The Economic Journal, F335-F356. doi:10.1111/1468-0297.00634

International Labour Organization. (2003). Resolution concerning consumer price indices. Resolution of the Seventeenth International Conference of Labour Statisticians. Geneva. Retrieved from http://ilo.org/wcmsp5/groups/public/---dgreports/---stat/documents/normativeinstrument/wcms_087521.pdf

International Labour Organization. (2004). Consumer Price Index Manual: Theory and Practice. (P. Hill, Ed.) Geneva: International Labour Organization. Retrieved from https://www.ilo.org/wcmsp5/groups/public/---dgreports/---stat/documents/presentation/wcms_331153.pdf

Office for National Statistics. (2017). Household Costs Indices: Methodology. Office for National Statistics. Retrieved from https://www.ons.gov.uk/economy/inflationandpriceindices/methodologies/householdcostsindicesmethodology

Office for National Statistics. (2019). Consumer Prices Indices Technical Manual. Office for National Statistics. Retrieved from https://www.ons.gov.uk/economy/inflationandpriceindices/methodologies/consumerpricesindicestechnicalmanual2019

Poole, R., Ptacek, F., & Verbrugge, R. (2005). Treatment of Owner-Occupied Housing in the CPI. Washington, DC: Bureau of Labor Statistics. Retrieved from https://www.bls.gov/advisory/fesacp1120905.pdf

Prais, S. J. (1959). Whose cost of living? The Review of Economic Studies, 126-134. doi:10.2307/2296170

Statistics Canada. (2019). The Canadian Consumer Price Index Reference Paper. Statistics Canada. Retrieved from https://www150.statcan.gc.ca/n1/pub/62-553-x/62-553-x2019001-eng.htm

Statistics New Zealand. (2020). Household living-costs price indexes (HLPIs) data dictionary (Version 33). Wellington: Statistics New Zealand. Retrieved November 23, 2022, from https://datainfoplus.stats.govt.nz/Item/nz.govt.stats/a46a6353-947a-4062-89e7-c6faef4fece1/?_ga=2.96280540.1570432553.1669226241-1704970333.1669226240

Appendix

A. Alternative Mortgage Interest and Property Tax Indexes

The mortgage payments index which includes mortgage principal replaces the interest

rate component, Eq. (3), with the following representing change in full mortgage payments

between months $s$ and $t$:

$$P^{f} = \frac{\prod_{j=0}^{\theta-1} \left[f(r_{t-j}, \theta-j)\right]^{\varphi^{b}_{j}}}{\prod_{j=0}^{\theta-1} \left[f(r_{s-j}, \theta-j)\right]^{\varphi^{b}_{j}}}, \tag{11}$$

where $f(r, \omega) = rR^{\omega}/(R^{\omega}-1)$, $\omega > 1$, where $R = 1 + r$. The function $f$ represents the

fixed mortgage payment as a proportion of the current debt amount. In this expression, the

interest rate $r$ is the annualized rate divided by 12 so that it corresponds to one month. Note,

when estimated using aggregate data, even if $r_{t-j}$ equals an average interest rate across

households with loans of age $j$, the amount $f(r_{t-j}, \theta-j)$ cannot be interpreted as an average

mortgage payment ratio across households due to Jensen’s inequality. The relationship

between $f(r_{t-j}, \theta-j)$ and a true household average is unknown (at least to the authors) but

using such an average in a price index would require microdata tracking individual mortgagors

across loan changes including refinances (which we can observe in the CE) and new loans

(which we often do not observe due to address-based sampling). The mortgage payment

indexes without home prices remove the debt index component, Eq. (2), while the property tax

index without home prices is just the effective tax rate component, $X_{s,t}/V_{s,t}$, from Eq. (4).
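The payment function and the Jensen's inequality caveat can be made concrete with a small numerical sketch. The function name and the two hypothetical loan rates are assumptions for illustration; the formula is the $f(r, \omega)$ defined above, with $r$ a monthly rate and $\omega$ the number of remaining monthly payments.

```python
def payment_factor(r, omega):
    """Fixed monthly payment as a share of current debt: r R^w / (R^w - 1), R = 1 + r."""
    growth = (1.0 + r) ** omega
    return r * growth / (growth - 1.0)

# Two hypothetical households with 30-year loans (360 monthly payments remaining).
annual_rates = [0.03, 0.06]
monthly_rates = [a / 12 for a in annual_rates]
omega = 360

# Average of the household-level payment factors ...
mean_of_factors = sum(payment_factor(r, omega) for r in monthly_rates) / len(monthly_rates)
# ... versus the payment factor evaluated at the average interest rate.
factor_at_mean_rate = payment_factor(sum(monthly_rates) / len(monthly_rates), omega)

jensen_gap = mean_of_factors - factor_at_mean_rate
print(f"mean of f(r):   {mean_of_factors:.6f}")
print(f"f(mean of r):   {factor_at_mean_rate:.6f}")
print(f"difference:     {jensen_gap:.6f}")
```

In this example the two quantities differ, which is why a payment factor computed from an average rate should not be read as an average payment ratio across households.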

Figure 5 plots the different Owner’s Payment subindexes (combining mortgage interest,

property taxes, etc., as in Figure 3) and compares them again against owner’s equivalent rent.

Adding mortgage principal increases the owner’s payments index by about 0.8 percentage

points per year when home prices are included, and about 1 percentage point per year when

home prices are excluded. Given the strong upward trend of home prices over the past several

decades, removing them lowers the payments index by 4.0 percentage points per year when

mortgage principal is excluded and by 6.6 percentage points per year when mortgage principal

is included, resulting in downward trends. Figure 6 traces the effect of these payments index changes on

the all-items HCI-U, accounting for changes in both the elementary indexes and the aggregation

weights. The overall effect of mortgage principal is modest, adding 0.10 or 0.03 percentage

points per year depending on whether house prices are included. Home prices themselves have

a larger impact on the all-items index, decreasing it by either 0.38 or 0.45 percentage points per

year depending on whether mortgage principal is included.

Figure 5: Alternative Versions of Owner’s Payments

[Line chart of index levels, 2011-2021. Series: Owner's Equiv. Rent (HC); Paym. (HR, HS, HT); Paym. (with principal); Paym. (no home prices); Paym. (with principal, no home prices).]

Figure 6: HCI-U Under Alternative Versions of Owner’s Payments

[Line chart of index levels, 2011-2021. Series: cpi-u; hci-u; hci-u (with principal); hci-u (no home prices); hci-u (with principal, no home prices).]

B. Interview-Diary Matching Details

We base our household-averaged weights on the CE Interview sample but use a

statistical matching procedure to assign sets of weekly Diary expenditures to each Interview

consumer unit. Our procedure is similar in spirit to that of Hobijn et al. (2009), though that

paper models expenditure change (implied by a consumer-unit specific price index) rather than

expenditure levels. Modeling expenditure changes is attractive given the ultimate use of the

matched dataset for price indexes, but Martin (2022) finds demographics explain much less of

the variation in expenditure changes. We limit the dependent variable to categories collected in

both the Interview and the Diary to ensure that the correlations picked up by the model are

relevant to the expenditures we ultimately wish to impute. Over the sample period, R-squared

values for the quintile and month-specific regressions averaged 0.17, while income quintile

itself explained about 0.31 of the variation in the dependent variable. Figure 7 below plots the

average regression R-squared for each quintile, where the averaging is over the 23 months used

for each reference period. The figure shows that the average R-squared values for the income quintiles are

fairly stable over time, averaging about 0.23 for the first quintile, 0.17 for the second quintile,

0.13 for the third quintile, 0.11 for the fourth quintile, and 0.15 for the fifth quintile. The fits

(conditional on income quintile) are not particularly strong, which motivates matching an actual

diary’s expenditure set to an interview consumer unit rather than using regression fitted values.

Figure 7: Average R-Squared by Reference Period and Income Quintile

[Line chart of average R-squared by reference period, 2010-2019. Series: IQ1; IQ2; IQ3; IQ4; IQ5.]

The rest of this section presents figures comparing the imputed weekly diary

expenditures to the actual. Figure 8 shows average imputed weekly expenditures for the

reference period track the actual averages well over time, always falling within 1% of the true

averages. Figure 9 compares average weekly Diary expenditures over time by major group. For

food and beverages, which is by far the largest category sourced from the Diary, the imputed

averages fall within 1% of the actual averages, and they fall within 10% for all other categories.

Figure 10 compares the deciles of weekly imputed Diary expenditures to those of the actual

Diary expenditures for the 2019 reference period (results are similar for other periods). The two

marginal distributions line up well: the imputed deciles are within a few dollars of the actual

deciles.

Figure 8: Actual and Imputed Average Weekly Diary Expenditures by Reference Period

[Line chart by reference period, 2010-2019. Series: actual; imputed.]

Figure 9: Average Weekly Diary Expenditures by Reference Period and Major Group

[Eight panels, each plotting actual and imputed averages by reference period. Panel a: Food and Beverages; Panel b: Housing; Panel c: Apparel; Panel d: Transportation; Panel e: Medical; Panel f: Recreation; Panel g: Education and Communication; Panel h: Other.]

Figure 10: Deciles of Actual and Imputed Weekly Diary Expenditures for 2019 Reference Year

[Chart of deciles 1-9. Series: actual; imputed.]

In terms of joint distributions, the matching procedure also does a good job at

replicating average diary expenditures by several demographic characteristics, as shown in

Figure 11 for 2019. Not surprisingly, because income quintile is conditioned on, the procedure

replicates average expenditures by income quintile quite well. The procedure also does well

replicating average differences by housing tenure, age categories, Census region, presence of

children, and education categories, even though these characteristics are not explicitly

conditioned on in the matching process. In these cases, the match quality is being driven by the

correlation between these characteristics and income, as well as the extent to which similarity

in these characteristics across surveys is predictive of expenditures and so leads to a lower

distance between similarly attributed observations.

Figure 11: Average Weekly Diary Expenditures by Attribute, 2019 Reference Period

[Six panels, each comparing actual and imputed averages. Panel a: Income Quintile (1 to 5); Panel b: Housing Tenure (Own w/ Mort., Own w/o Mort., Renter, No cash rent, Student); Panel c: Age (<=61, >61); Panel d: Presence of Children (No kids, Kids); Panel e: Census Region (NE, MW, S, W); Panel f: Education (< H.S., H.S. & Some Coll., >= Bachelors).]

C. All-items Indexes Using Different CE subsamples

Figure 12: Twelve-month inflation of CPI and indexes using payments approach by subsample

[Line chart of twelve-month changes, monthly, axis labeled Dec-12 through Sep-21. Series: lowe-u (ew, pay, 4Q); lowe-u (ew, pay, full-be); lowe-u (ew, pay, full-a); cpi-u.]

As a check of our sample requirement that consumer units contributing to the HCI have

four quarters of data in the CE survey, we compare all-items indexes (all using the payments

approach) with this eligibility requirement against all-items indexes without it. For this

comparison, we examine expenditure-weighted aggregates across households, as equally

weighted aggregates can be sensitive to weight frequency and overall dispersion in total

expenditures (Ley, 2005; Martin, 2022). We consider both the full CE sample for the reference

year and the full CE sample for the biennial period ending in the reference year, as

our HCI subsample also includes four-quarter households who entered the CE in the year prior

to the reference year. Figure 12 plots the twelve-month percent changes of these indexes as

well as the CPI-U for reference. Over this period, average inflation of the CPI-U is 1.86% per

year. The payments approach index using the four-quarter sample averaged 1.46%, while the

indexes using the full annual and biennial samples averaged 1.47% and 1.46%, respectively.

Figure 13: Twelve-month inflation of HCI and indexes using payments approach by subsample

Figure 13 repeats the analysis in Figure 12, but compares the HCI-U and comparable

household-weighted indexes using the full annual or biennial CE samples. The HCI-U averaged

1.51% year-over-year, while the index using the full annual and full biennial samples averaged

1.50% and 1.51%, respectively, though larger differences occurred in 2021. Here, index

differences could reflect sample selection effects, but also likely reflect the mixed frequencies

of household weights underlying the full-sample indexes, as some consumer units have only a

few months or quarters of expenditure due to normal sample rotations and unit nonresponse.

Higher frequency expenditure shares tend to give less weight to less frequently purchased

items and more weight to more frequently purchased items (Martin, 2022). We do not want to

capture this latter effect because, in the case of the HCIs, it is an artifact of using CPI weights for automobiles, which are measured by full purchase price at the time of acquisition, rather than ongoing monthly payments. In 2021, the HCI-U (over the four-quarter sample) has slightly higher inflation than the two full-sample indexes. Vehicle price inflation that year was high relative to the average inflation across all items, and the comparison in the figure is consistent with the full-sample indexes giving too little weight to vehicles. A payments approach for vehicles should mitigate this effect in the full samples.

[Figure 13 plot: twelve-month changes (vertical axis -0.02 to 0.08), December 2012 to September 2021. Series: hci-u; lowe-u (hw, pay, full-be); lowe-u (hw, pay, full-a).]

Impact of Weight Timeliness on the US CPI


1 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Impact of Weight Timeliness on the US CPI

Anya Stockburger, Joshua Klick, Chris Miller, Jessie Park

Bureau of Labor Statistics

Meeting of the Group of Experts on CPIs

June 9, 2023

BLS Consumer Price Indexes

[Chart: index levels, vertical axis 100 to 190. Series: CPI-U; Final C-CPI-U.]

Motivation: Weighting Improvements

Release lag is the difference between the index reference period and publication. Weight lag is the difference between the weight reference period and the index reference period.

• CPI-U: release lag ~10 days; weight lag 12-60 months (biennial) or 12-36 months (annual); improvement goal: reduce weight lag
• Final Chained CPI-U: release lag ~10 days + 10-12 months; weight lag 1-2 months; improvement goals: reduce release lag, increase visibility


Outline

Motivation

Impact of weight timeliness

Reducing weight lag in CPI-U

Future research

Weight changes over time: cumulative percentage change in annual spending shares

[Chart: percent change (vertical axis -80% to 60%) by year, 2000 to 2020. Series: Apparel; Education and Communication; Food and Beverages; Other Goods and Services; Housing; Medical Care; Recreation; Transportation.]

Airline Fares: Monthly Expenditure Weights

[Chart: monthly expenditure weight, vertical axis 0.0% to 1.8%.]


Impact of chain drift - Tornqvist

Cage, Williams, Church 2021 “Chain Drift in the Chained Consumer Price Index: 1999-2017”

Impact of chain drift – monthly chained Laspeyres

Historical Impact of Timely Weights

[Chart: difference in 12-month percent change, CPI-U less Final C-CPI-U, 2001 to 2021; vertical axis -0.40% to 1.00%.]

Recent Impact of Timely Weights

[Chart: difference in 12-month percent change, CPI-U less Final C-CPI-U, monthly, 2018 to 2022; vertical axis -0.30% to 0.70%.]

Items contributing to negative substitution bias

[Bar chart: Upper-Level Substitution Bias Contribution, December 2021; horizontal axis -0.02 to 0.02. Items shown:]

• Full-service meals and snacks
• Limited-service meals and snacks
• Admissions
• College tuition and fees
• Airline fare
• Personal computers
• Toys
• New vehicles
• Motor vehicle insurance
• Owner's Equivalent Rent

Negative substitution bias: price and weight change

[Scatter plot: weight change (vertical axis, -100% to 150%) against price change (horizontal axis, -20% to 10%), annotated "Atypical consumer substitution". Labeled items: Full-service meals and snacks; Limited-service meals and snacks; Admissions; College tuition and fees; Personal computers; Toys; New vehicles; Motor vehicle insurance; Owner's Equivalent Rent.]

Items contributing to positive substitution bias

[Bar chart: Upper-Level Substitution Bias Contribution, December 2021; horizontal axis -0.2 to 0.5. Items shown:]

• Used cars and trucks
• Gasoline
• New vehicles
• Full-service meals and snacks
• Limited-service meals and snacks
• Food at employee sites and schools
• Clocks, lamps, and décor
• Jewelry
• Motor vehicle insurance
• Owner's equivalent rent

Positive substitution bias: price and weight change

[Scatter plot: weight change (vertical axis, -50% to 110%) against price change (horizontal axis, -60% to 60%), annotated "Typical consumer substitution". Labeled items: Used cars and trucks; Gasoline; New vehicles; Full-service meals and snacks; Limited-service meals and snacks; Food at employee sites and schools; Clocks, lamps, and décor; Motor vehicle insurance; Owner's equivalent rent.]

Improvements to Weight Timeliness

Weight update frequency   Annualized 12-month percent change (2002-2020)   Upper-level substitution bias
Biennial                  2.06                                             0.24
Annual                    2.03                                             0.21
Quarterly                 1.95                                             0.13
Tornqvist (monthly)       1.82                                             -

• Chain drift: none at aggregate levels for annual or quarterly; an issue for lower-level quarterly
• Research papers: Annual: Klick 2021; Quarterly: Klick, Park 2022
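The bias column in the table above appears to equal each index's annualized change minus the monthly Tornqvist benchmark. The short Python sketch below checks that arithmetic against the reported figures; the function and variable names are illustrative and not part of the presentation.

```python
# Annualized 12-month percent changes (2002-2020) as reported in the table.
annualized_change = {
    "Biennial": 2.06,
    "Annual": 2.03,
    "Quarterly": 1.95,
}
tornqvist_monthly = 1.82  # benchmark computed with monthly (timely) weights


def substitution_bias(index_change, benchmark):
    """Upper-level substitution bias in percentage points, rounded to 2 places.

    Assumption: bias is the simple difference between the fixed-weight index
    and the Tornqvist benchmark, which reproduces the reported column.
    """
    return round(index_change - benchmark, 2)


bias = {
    frequency: substitution_bias(change, tornqvist_monthly)
    for frequency, change in annualized_change.items()
}
print(bias)
```

Under this reading, moving from biennial to annual weights removes about 0.03 percentage points of bias per year, and moving to quarterly weights removes about 0.11.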

Implement annual weight update

• Implemented annual weight updates with January 2023 indexes
  • More relevant weights (replace 2019/2020 with 2021 expenditure data)
  • See website for more information on 2022 and 2023 weight updates


Weight changes in 2023 Apparel Food, Alcohol Away

& Haircuts

Recreation &

Transportation

Lodging Away

Food, Alcohol at Home Other & Housing Goods

Hospital Services & Medicinal Drugs

What’s next – improve timeliness of the C-CPI?

• Medium-term: 6-month lag to publish final C-CPI
  • Survey protocol (placement dates)
  • Processing efficiencies (auto-coding, monthly processing, streamlined outlier review)
  • Design changes (survey recall length)
• Long-term: real-time capture of expenditure information
  • Funding for pilot test included in FY24 President’s budget request

Additional References

• Greenlees, John and Williams, Elliot (2009). Reconsideration of weighting and updating procedures in the US CPI. BLS working paper 431.
• Kurtzon, Greg (2018). How much does formula vs. chaining matter for a cost-of-living index? The CPI-U vs. the C-CPI-U. BLS working paper 498.
• Klick, Josh (2021). Measuring price change during economic downturns. Beyond the Numbers Vol. 10 No. 13.
• Matsumoto, Brett (2022). The impact of changing consumer expenditure patterns at the onset of the COVID-19 pandemic on measures of consumer inflation. Monthly Labor Review.

Contact Information


Anya Stockburger Chief, Branch of Revision Methodology

Division of Consumer Price Indexes www.bls.gov/cpi

[email protected]


Expanding the family of US Consumer Price Indexes


1 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Expanding the family of US Consumer Price Indexes

Anya Stockburger, Bill Johnson, Joshua Klick, Paul Liegey, Robert Martin,

Bureau of Labor Statistics

Meeting of the Group of Experts on CPIs

June 8, 2023

CPI Family of Indexes

• Production Indexes: CPI-U, Chained CPI-U, CPI-W
• Research Indexes: R-CPI-E, R-CPI-Income, Chained R-CPI-Income, Household Cost Index


Outline

Motivation

Income-based indexes

Household Cost Indexes

Next steps

Motivation – Increased need for data granularity

• Committee on National Statistics recommendation
• Federal Reserve Bank interest
• Office of Management and Budget, Bureau of Economic Analysis, and other government interest
• General user interest (major media)
• Publications: Initial working paper, Spotlight on Statistics

CPI by Income Methodology

[Bar chart: Median Equivalized Income (Interview Survey - 2021), vertical axis 0 to 140,000. Q1: $12,000; Q5: $118,000.]

• Expenditure weights: group CE respondents into weighted ranking of equivalized income quintiles
• Prices/rents: all lower-level data the same (prices, outlets, rents)
• Index aggregation: Lowe, Tornqvist aggregation from lowest-level basic indexes

Snapshot of spending weights by population, 2019-2020 biennial expenditure weight share, equivalized income

[Bar chart: expenditure weight share (horizontal axis 0% to 30%) for Q1, U, and Q5. Categories: Rent; Food at home; Motor fuel; Owner's equivalent rent; Vehicles and maintenance; Food away from home; Recreation.]

Annualized Inflation Gap

Annualized inflation rate, CPI by income quintile, Lowe Formula, December 2005 - December 2022

[Bar chart: vertical axis 2.1 to 2.7, by income quintile and for the urban population.]

• Q1: 2.60
• Q2: 2.54
• Q3: 2.47
• Q4: 2.41
• Q5: 2.33
• Urban: 2.43

Inflation Gap Variation: lowest income quintile – highest income quintile

Annual 12-month percent change, December 2006 – December 2022

[Chart: inflation gap, vertical axis -1.0% to 1.5%. Series: Equivalized income; Unadjusted income.]

Items contributing to inflation gap (2022)

[Bar chart: horizontal axis -5 to 15. Items shown:]

• Rent primary residence
• Gasoline (all types)
• Electricity
• Utility (piped) gas service
• Cigarettes
• Commercial Health Insurance
• Owners' rent primary residence
• Lodging away from home
• Airline fare
• New vehicles

Limitations and Future Improvements

• Lower-level price heterogeneity
  • High levels of aggregation under-estimate inflation gap (Jaravel 2019)
  • Re-weighting housing prices shows little impact (Larsen and Molloy 2021)
• BLS future research
  • Further investigate housing adjustments
  • Re-weighting alternative data (gasoline, new vehicles)
  • Interested in a scanner data program (CNSTAT recommendation), but funding…

Household Cost Index

• Inspired by Office for National Statistics and Statistics New Zealand
• Definition: Measure the change in cash outflows required, on average, for households to access the goods and services they consume
• Methodology:
  • Household-weighted (democratic) aggregation
  • Payments approach to owner-occupied housing
  • Urban population

Household-weighted (Democratic) Aggregation

• Create household-level expenditure shares
  • Consumer Expenditure Surveys (Diary and Interview) sample different households
  • Eligible expenditures from the Diary survey imputed to the Interview sample using a matching procedure based on Hobijn et al. (2009)
• Aggregation
  • Aggregate across items/areas first for each household using Lowe formula with lagged expenditure weights
  • Average equally across households
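The two aggregation steps above can be contrasted with the usual expenditure-weighted (plutocratic) approach in a few lines of Python. This is a minimal sketch with made-up households and price relatives; it ignores survey sample weights and the Diary-to-Interview imputation, and the function names are hypothetical rather than BLS code.

```python
def lowe_index(spending, price_relatives):
    """Share-weighted arithmetic mean of price relatives for one basket."""
    total = sum(spending.values())
    return sum(spending[item] / total * price_relatives[item] for item in spending)


def democratic_index(households, price_relatives):
    """Household-weighted: index each household first, then average equally."""
    per_household = [lowe_index(h, price_relatives) for h in households]
    return sum(per_household) / len(per_household)


def plutocratic_index(households, price_relatives):
    """Expenditure-weighted: pool all spending, then aggregate once."""
    pooled = {}
    for household in households:
        for item, spend in household.items():
            pooled[item] = pooled.get(item, 0.0) + spend
    return lowe_index(pooled, price_relatives)


# Illustrative weight-period expenditures for two households (not CE data).
households = [
    {"food": 600.0, "rent": 400.0},    # smaller budget, food-heavy
    {"food": 1000.0, "rent": 9000.0},  # larger budget, rent-heavy
]
price_relatives = {"food": 1.10, "rent": 1.02}

demo = democratic_index(households, price_relatives)
pluto = plutocratic_index(households, price_relatives)
print(demo, pluto)
```

In this toy case the household-weighted index is higher because the food-heavy household counts as much as the high-spending one, while the pooled index is dominated by the larger budget.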

Payments Approach – Mortgage Interest Payment

• Weights
  • Consumer Expenditure Survey
• Prices
  • Mortgage interest payment index = Debt index * Interest rate index
  • Data sources:
    • Federal Housing Finance Agency’s All Transactions House Price Index
    • Freddie Mac Primary Mortgage Market Survey
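The product formula above can be illustrated with a toy calculation. The sketch below assumes both components are expressed as relatives to a common base period (base = 1.0), with the debt relative proxied by a house price relative and the interest rate relative taken as a ratio of mortgage rates; how BLS actually constructs each component is not described on the slide, so these inputs are placeholders.

```python
def mortgage_interest_index(debt_relative, rate_relative):
    """Mortgage interest payment relative = debt relative * interest rate relative."""
    return debt_relative * rate_relative


# Hypothetical inputs: house prices up 5 percent, mortgage rate down from 4.0 to 3.6.
debt_relative = 1.05
rate_relative = 3.6 / 4.0

payment_relative = mortgage_interest_index(debt_relative, rate_relative)
print(round(payment_relative, 3))
```

Here the fall in interest rates more than offsets the rise in debt, so the payment index declines even though house prices rose.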

Payments Approach – Property Tax Payments

• Weights
  • Consumer Expenditure Survey
• Prices
  • Property Tax Payment Index = (Total property tax payments / Total housing stock value) * Constant quality home price index
  • Data source: CE

HCI – Relative Importance, December 2020

Major Group            CPI-U   HCI-U
Food and Beverages     15.2    20.1
Housing                42.4    34.3
Apparel                2.7     3.1
Transportation         15.2    14.3
Medical                8.9     11.1
Recreation             5.8     6.6
Education and Comm.    6.8     6.7
Other                  3.2     3.7

HCI – Index Results

Average 12-month % change:

• CPI-U: 1.86%
• HCI (Payments Approach + Household-weighted Aggregation): 1.51%
• HCI-U (Payments Approach Only): 1.46%


Explaining the HCI results

Limitations - HCI

• Household-weighted aggregation
  • Infrequent purchases (challenge especially with Tornqvist)
  • Include in HCI given small impact?
• Payments approach
  • Investigate a microdata approach for mortgage interest index
  • Investigate including mortgage principal

What’s next?

• Improve methodology
  • Income-group specific lower-level indexes
  • Next step for HCI research?
• Stakeholder outreach
  • Group of Experts
  • BLS advisory committees
  • Federal Committee on Statistical Methodology
• Publish regular updates
  • R-CPI-Income
  • C-CPI-Income
  • HCI?

Contact Information


Anya Stockburger Chief, Branch of Revision Methodology

Division of Consumer Price Indexes www.bls.gov/cpi

[email protected]


United States Inflation Experience across the Income Distribution


United States Inflation Experience across the Income Distribution

Joshua Klick, Anya Stockburger

WORKING DRAFT

Prepared for the Group of Experts on Consumer Price Indices UNECE

Geneva, June 2023

Abstract

The Bureau of Labor Statistics (BLS) produces the Consumer Price Index (CPI) as a measure of price change faced by consumers. The CPI for All Urban Consumers (CPI-U) targets the inflation experience of nearly all consumers in the United States, which may not reflect the inflation experience of an individual household or group of households. Increasingly, there is user demand for CPIs across the income distribution. This paper builds on the authors’ prior research by modifying the cohort definition and extending the period of analysis. From 2006 to 2022, lower income households generally faced larger inflation rates than higher income households. The short-term gap between lower and higher income households’ inflation rates changes when the cohort definition accounts for varying family sizes.

JEL Codes: C43, E31

Executive Summary

The Bureau of Labor Statistics (BLS) produces the Consumer Price Index (CPI) as a measure of price change faced by consumers. The CPI for All Urban Consumers (CPI-U) targets the inflation experience of urban consumers, who make up over 90 percent of the total population of the United States. This broad-based coverage may not reflect the inflation experience of an individual household or group of households1. Increasingly, there is user demand for CPIs across the income distribution. These indexes paint a fuller picture of inflation for users interested in the state of the economy. Other users demand an index limited to lower income households for escalation purposes2.

This paper builds on the authors’ prior research by modifying the cohort definition3. The prior analysis defined four income quartile cohorts based on unadjusted total household before-tax income. This analysis defines five income quintile cohorts based on equivalized (household size-adjusted) total family income. Adjusting for household size is standard practice in the income inequality literature. By adjusting household income to a single-member equivalent, income levels are more comparable across households. For example, an $80,000 household income does not convey the same level of resources available to a 4-person family as it does to a single-person household.

While this adjustment did not impact the long-term results, there are several notable short-term differences. Lowest income households almost always faced larger inflation rates than highest income households during the study period; however, there are several spans when the opposite occurred. This anomalous result occurred more frequently and during different months for the adjusted indexes than for the unadjusted indexes.

This paper extends the period of analysis to December 2022 for the CPI indexes. The 12-month change in the CPI-U for All Items was 1.9 percent in December 2018, the last month included in the prior analysis. Since mid-2021, inflation accelerated to a peak of 9.1 percent in June 2022. The average annual inflation rate from December 2005 to December 2022 was largest for the lowest income quintile and smallest for the highest income quintile. The gap in inflation rates between lowest and highest income households was 0.27 percentage points per year.

Background and Issues

The BLS publishes consumer price indexes for subgroups of the target urban population. The CPI for Wage Earners and Clerical Workers (CPI-W) became a subgroup index in 1978 when the BLS adopted an urban population target and began calculating the CPI-U. In 1988, the BLS introduced a research series measuring price change for older Americans, the CPI-E. Research conducted by BLS on inflation rates for low-income consumers began in the 1990s4. Prior research is briefly summarized in an earlier working paper published in March 20215.

1 While the population target is the urban population, the measurement unit is households rather than persons.
2 The BLS is researching a new index product for escalation purposes. Research in this paper describes a low-income subgroup definition that could be applied to the new index product.
3 Klick, Stockburger, “Experimental CPI for Higher and Lower Income Households,” March 2021, BLS working paper 537

Interest in income-based inflation measures continues. In June 2021, an Interagency Technical Working Group convened by the Office of Management and Budget issued a report recommending the BLS produce a new consumer price index to be used in the calculation of the U.S. Official Poverty Measure. The group recommended a low-income Chained CPI. In April 2022, the National Academies of Sciences, Engineering, and Medicine issued a report recommending development of price indexes by income group6.

In the authors’ March 2021 working paper, we outline numerous caveats and limitations with the current methodology to calculate subgroup indexes. Other researchers have shown that using the same underlying microdata to calculate indexes for both the target population and subgroups underestimates the gap in inflation rates between highest and lowest income households7. The methodological improvements presented in this paper do not account for consumer heterogeneity at lower levels of index aggregation, and so the same caveats and limitations from the March 2021 working paper apply.

Methodology

Income Cohort Definition

The March 2021 working paper describes the index methodology and data sources in detail. We use data collected in the Consumer Expenditure Surveys (CE) from 2004 to 20218. We estimate expenditures on the full market basket of items using integrated data from the Diary and Interview surveys. We use elementary price indexes, for example Bananas in Boston, that form the foundation of Consumer Price Index aggregation from 2006 to 2022. We derive implicit quantities for the modified Laspeyres formula indexes from biennial expenditures lagged two to three years. For example, we use expenditures from 2019 and 2020 to weight modified Laspeyres indexes in 2022. We refer the reader to the March 2021 working paper for additional information on index methods and formulas.
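The role of implicit quantities in a modified Laspeyres (Lowe) index can be sketched as follows. This is an illustrative Python example under simplifying assumptions: two item cells, a single weight reference period, and made-up index levels. The function names are hypothetical, and the March 2021 working paper remains the reference for the actual formulas.

```python
def implicit_quantities(expenditures, weight_period_index):
    """Divide weight-period expenditures by weight-period price index levels."""
    return {
        cell: expenditures[cell] / weight_period_index[cell]
        for cell in expenditures
    }


def lowe_index(quantities, index_start, index_end):
    """Cost of the fixed implicit basket at the end period over the start period."""
    cost_end = sum(quantities[cell] * index_end[cell] for cell in quantities)
    cost_start = sum(quantities[cell] * index_start[cell] for cell in quantities)
    return cost_end / cost_start


# Made-up weight-period expenditures and elementary index levels.
expenditures = {"bananas": 200.0, "rent": 800.0}
weight_period_index = {"bananas": 100.0, "rent": 100.0}
index_start = {"bananas": 110.0, "rent": 104.0}
index_end = {"bananas": 121.0, "rent": 106.08}

quantities = implicit_quantities(expenditures, weight_period_index)
price_change = lowe_index(quantities, index_start, index_end)
print(round(price_change, 4))
```

Because the quantities are fixed at the lagged weight period, the index reflects only price movement between the start and end periods.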

This paper improves the income group cohort definition. First, we employ a household weighted ranking to distribute the sample weights relatively equally across quintiles. Previously, we used an unweighted income ranking that did not reflect an equal distribution of household weights across quartiles. The BLS calibrates CE sample weights to the Current Population Survey to control for several demographic characteristics such as age, race, owner or renter, geography, and Hispanic ethnicity.9 Weighting methods also control for subsampling and include a non-interview adjustment that controls for geography, household size, number of contacts, and average gross income for a household’s zip code. The use of sample weights reflects known urban population totals, particularly relevant when comparing owners and renters, so that the weights are equivalent across quintiles, and are comparable to CE’s weighted ranking of the total population.10,11 An inherent benefit to this approach is that weights are relatively evenly distributed across defined quantiles. CE processes this income ranking variable for the total population. Therefore, urban and rural population differences across the CE quintiles (the rural proportion is higher for lower quintiles) provide motivation for CPI to calculate a weighted income distribution so that weights are distributed relatively equally for the urban population. This improvement did not substantively change the results at the All-Items US City Average level.

4 Thesia Garner, David Johnson, and Mary Kokoski 1996, “An experimental Consumer Price Index for the poor” https://www.bls.gov/opub/mlr/1996/09/art5full.pdf
5 Klick, Stockburger, “Experimental CPI for Higher and Lower Income Households,” March 2021, BLS working paper 537
6 National Academies of Sciences, Engineering, and Medicine. 2022. Modernizing the Consumer Price Index for the 21st Century. Washington, DC: The National Academies Press. https://doi.org/10.17226/26485.
7 Examples include Broda and Romalis (2009), Broda, Leibtag, and Weinstein (2009), Argente and Lee (2017), Jaravel (2017), and Kaplan and Schulhofer-Wohl (2017).
8 BLS began imputing missing values of income in 2004, and income data from 2003 are not comparable. To initialize this research, we used a single year of expenditures in 2004 to calculate spending shares used in index calculation for 2006 and 2007. The remaining spending shares use two years of expenditures, consistent with CPI-U methodology.
9 See CE Handbook of Methods, Calculation Methodology https://www.bls.gov/opub/hom/cex/calculation.htm#calculation-methodology
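The household-weighted ranking described above can be sketched in Python. The example sorts units by income (with a random tie-break), accumulates sample weights, and assigns quintiles from the cumulative weight share, so that each quintile holds roughly a fifth of the total weight rather than a fifth of the respondents. The incomes and weights are invented, and `assign_weighted_quintiles` is a hypothetical helper, not BLS production code.

```python
import math
import random


def assign_weighted_quintiles(incomes, weights, seed=0):
    """Return a quintile (1-5) per unit based on cumulative sample weight."""
    rng = random.Random(seed)
    # Sort by income; a random number breaks ties between equal incomes.
    order = sorted(range(len(incomes)), key=lambda i: (incomes[i], rng.random()))
    total_weight = sum(weights)
    quintiles = [0] * len(incomes)
    cumulative = 0.0
    for i in order:
        cumulative += weights[i]
        rank = cumulative / total_weight  # lies in the interval (0, 1]
        # Small tolerance keeps ranks sitting exactly on a cut point
        # (0.2, 0.4, ...) in the lower quintile.
        quintiles[i] = min(5, math.ceil(rank * 5 - 1e-9))
    return quintiles


# Ten illustrative units (incomes in thousands) with unequal sample weights.
incomes = [5, 8, 12, 15, 22, 30, 34, 41, 55, 90]
weights = [150, 50, 100, 100, 200, 100, 100, 50, 50, 100]

weighted_quintiles = assign_weighted_quintiles(incomes, weights)
print(weighted_quintiles)
```

With these weights the third quintile contains a single heavily weighted unit and the fifth contains three lightly weighted ones, whereas an unweighted ranking would place two units in every quintile.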

Second, we divide the CE respondents into quintiles of equivalized income, rather than quartiles as in the prior analysis. We determined that the proportion of quintile households is comparable to the wage-earner population (W) as summarized in Figure 1. Additionally, coverage of item-area weight cells for consumer price index estimation was sufficient to calculate five income groups rather than four as described in the results section. More detailed income groups provide greater granularity to data users and facilitate comparisons of lowest, median, and highest quintiles.

Figure 1. Household respondent summary from 2021 collection quarter 4

            Count    Proportion relative to U (percentage)
Survey      U        W      E      Q1     Q2     Q3     Q4     Q5
Interview   4,515    21.4   36.5   20.2   20.2   19.8   19.9   19.9
Diary       2,694    22.8   38.5   20.3   20.9   19.5   18.8   20.5

Third, we equivalize household income to account for differing family sizes. There is a long literature using equivalence scales to adjust household income to account for different characteristics across households12.

Household size and composition vary across respondents. Equivalized income, defined as income divided by the square root of family size, adjusts income to make it comparable across households and better reflects household economies of scale.13 The first and fourth quintile maximum income cut points from the Diary and Interview are greater than the corresponding maximum equivalized income summarized in Figure 2. The median income and equivalized income for the third quintile are equivalent to those of the urban population. The median equivalized income is less steep from the first to fifth quintile than median income, reflecting the improved comparability across households as displayed in Figure 3.

10 See CE Table 1101. Quintiles of income before taxes https://www.bls.gov/cex/tables/calendar-year/mean-item-share-average-standard-error/cu-income-quintiles-before-taxes-2021.pdf
11 For CE income distribution methodology see https://www.bls.gov/cex/csxguide.pdf. CE creates a before tax income ranking variable as a distribution over the interval (0,1] so that weights are relatively equally distributed across defined quantiles. The income ranking variable is created by sorting by income and a random number, used to break ties for CUs reporting the same income, in ascending order for each collection quarter and survey source. The total sum of FINLWT21 serves as the denominator, and the cumulative sum of FINLWT21 serves as the numerator to create the distribution that ranges from greater than 0 to less than 1, to 7 decimal places of precision.
12 Angela Daley, Thesia Garner, Shelley Phipps, Eva Sierminska, “Differences Across Place and Time in Household Expenditure Patterns: Implications for the Estimation of Equivalence Scales,” BLS Working Paper, 2020 https://www.bls.gov/osmr/research-papers/2020/pdf/ec200010.pdf
13 https://www.brookings.edu/blog/up-front/2019/04/17/whats-in-an-equivalence-scale
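The square-root equivalence scale defined above is simple to compute. The Python sketch below applies it to the $80,000 example from the executive summary; the function name is illustrative.

```python
import math


def equivalized_income(income, family_size):
    """Income divided by the square root of family size."""
    return income / math.sqrt(family_size)


single_person = equivalized_income(80_000, 1)
family_of_four = equivalized_income(80_000, 4)
print(single_person, family_of_four)
```

On this scale, an $80,000 income for a four-person family is treated as equivalent to $40,000 for a single-person household.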

Figure 2: Household maximum income and equivalized income summary from 2021 collection quarter 4 (in thousands)

[Bar chart: Q1 to Q4, horizontal axis 0 to 140. Series: Diary Income; Interview Income; Diary Equivalized Income; Interview Equivalized Income.]

Figure 3. Household median income and equivalized income summary from 2021 collection quarter 4 (in thousands)

The household weighted ranking described above is used to evaluate equivalized income quintiles (E1:E5) and non-equivalized income quintiles (N1:N5). The counts of households for CPI weighted income quintiles can be compared to those same households for other income definitions to highlight the degree of similarity between subpopulation definitions, as summarized in Figure 4. Overlap is the proportion of the same households relative to the respective CPI weighted non-equivalized income ranking quintiles (N1:N5). The All group represents the sum of the five quintiles. When the urban portion of the CE total population income weighted ranking is compared to non-equivalized income rankings, there is a high degree of overlap, ranging from 94% to 100%. When the equivalized income groups are compared to the non-equivalized income rankings, the degree of overlap ranges from 53% to 83%, highlighting definitional differences of household income and potential differences for weighting these respective indexes.

[Figure 3 bar chart: U, W, E, and Q1 to Q5; horizontal axis 0 to 200. Series: Diary Income; Interview Income; Diary Equivalized Income; Interview Equivalized Income.]

Figure 4: Proportion of counts of same households relative to non-equivalized income across quintile definitions (Percentage)

               Diary (D)                        Interview (I)
               All   Q1   Q2   Q3   Q4   Q5     All   Q1   Q2   Q3   Q4   Q5
CE             97    98   97   95   97   100    96    97   95   94   96   100
CPI-(E1:E5)    67    80   62   54   55   83     67    82   62   53   55   82

An additional improvement is the smoothing of expenditure cells, comparable to production weight processing. The survey data collected by the CE are subject to sampling error across geography, which makes them unreliable for index estimation, particularly for subpopulation quintiles. The CPI smooths basic item-area cell weights to reduce variance across geography. Local area annual weights are composite estimated with a more stable, broader level of geography (self-representing regions and non-self-representing regions). The composite estimate weight is between 0 and 1 and is based on minimizing the mean squared error between the local area and the broader geography.14 The impact of smoothing is described below in terms of expenditure weight cell coverage, measured as the proportion of missing basic item-area cells.
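The form of a composite estimate can be sketched in Python as a convex combination of the local-area figure and the broader-geography figure. In the sketch below the composite weight `alpha` is simply supplied as an input; the mean-squared-error calculation that determines it in production is documented in the paper cited in footnote 14 and is not reproduced here. All names and numbers are illustrative.

```python
def composite_estimate(local_value, broad_value, alpha):
    """Blend a noisy local estimate with a more stable broad-area estimate.

    alpha is the composite weight on the local estimate and must lie in [0, 1].
    Both inputs are assumed to be on the same scale (for example, the broad
    estimate already apportioned to the local area).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * local_value + (1.0 - alpha) * broad_value


# Hypothetical item-area cell: the local estimate is noisy, so it gets less weight.
smoothed = composite_estimate(local_value=120.0, broad_value=100.0, alpha=0.25)
print(smoothed)
```

A smaller `alpha` pulls the cell weight toward the broader geography, which lowers variance at the cost of some local detail.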

With an improved definition of income incorporating population weights and equivalization, we considered how to divide households into quintiles. The BLS produces consumer expenditure estimates by income quintile. Those income quintiles are defined by cut points that are rarely adjusted15. To produce a time-series consistent definition of income groups for index estimation of 243 items by 32 geographic area cells, we chose to define income quintiles that shift to include a fifth of CE households in each group rather than defining cut-points that would need to be revised over time.

We also considered defining income quintile groupings by geography such that CE respondents are classified into income quintiles within a city (Primary Sampling Unit, PSU) selected for inclusion in the CPI. We ultimately concluded a nationally defined income distribution was preferred to represent all households as a single distribution for a national level index, that is methodologically consistent with Bureau of Economic Analysis (BEA) Personal Consumption Expenditures (PCE) and BLS PCE income quintile products.16 Area stratification of the income distribution has a minimal impact to national level indexes and changes the overarching definition/purpose of the product. A limitation of this method is that subnational indexes are not feasible because the weights are not equivalent across quintiles. We will continue research geographic considerations for weighting CPIs by income.

14 For details see https://www.bls.gov/osmr/research-papers/1999/pdf/st990050.pdf

15 For example, the nominal income bounds of the lowest income quintile were less than $3,000 from 1960-1983, and less than $5,000 to present. Historically, the income definitions are subject to change based in part on inflation particularly relevant beginning 2021. These weights are not equivalent across groups limiting distributional comparisons. Also, households are not equivalized based on the number of people within a consumer unit resulting in dissimilar measures of income groups. Income as a standalone variable is not sufficient for weighting subpopulation indexes.

16 See BEA Measuring Inequality in the National Accounts https://www.bea.gov/system/files/papers/measuring-inequality-in-the-national-accounts_0.pdf, and BLS Distribution of U.S. Personal Consumption Expenditures Using Consumer Expenditure Surveys Data: Methods and Supplementary Results https://www.bls.gov/cex/pce-ce-distributions.htm

Income Cohort Demographic Characteristics

In addition to income and expenditures, the CE Surveys collect a variety of demographic information about survey respondents. In this section, we present the demographic differences between income quintiles. By construction, the average household size and number of children is more consistent across income quintiles after equivalizing income. Other demographic differences give further context for the expenditure share differences presented in the next section.

Using household size to equivalize income results in more consistent household sizes and number of children across income quintiles (figure 5). Without accounting for household size, more single person households and households without children are included in the lowest income quintile. Conversely, fewer single person households and households without children are included in the highest income quintile. When adjusting for household size, more families with children are included in the lowest income quintile and fewer families with children are included in the highest income quintile. These changes are consistent across income quintiles, and we include only the first and fifth income quintiles to simplify the presentation of results.

Figure 5: Average Family Size and Number of Children, Urban and by Income Quintile, 2020

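| | Urban | Q1 Unadjusted | Q1 Equivalized | Q5 Unadjusted | Q5 Equivalized |
|---|---|---|---|---|---|
| Family Size | | | | | |
| 1 person | 30% | 63% | 45% | 7% | 20% |
| 2 people | 33% | 22% | 25% | 34% | 40% |
| 3 or more people | 37% | 15% | 30% | 59% | 40% |
| Number of Children | | | | | |
| None | 62% | 80% | 66% | 43% | 61% |
| 1-2 children | 30% | 16% | 24% | 46% | 34% |
| 3 or more children | 8% | 4% | 10% | 11% | 5% |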

These data confirm the importance of equivalizing income to adjust for varying household sizes. Expenditure pattern differences between unadjusted income quintiles largely reflect differences in household composition. Once household sizes are standardized, the expenditure pattern differences presented in the next section more closely reflect income differences.

Households grouped by income quintile have different rates of home ownership, working status, and educational attainment (figure 6). Households earning the lowest quintile of income are more likely to rent their home and not work for pay than higher income households. Of the households with retired members, 65% report incomes that fall in the first and second quintile. The large number of retired individuals in the lower income quintiles explains why more than half of households earning income in the first and second quintiles own their home with no mortgage. Higher income households are more likely to own their home with an outstanding mortgage. Higher income households are also more likely to hold advanced degrees.

Figure 6: Housing tenure, working status, and educational attainment by population

| | Urban | Q1 | Q5 |
|---|---|---|---|
| Housing Tenure | | | |
| Owner with a mortgage | 41% | 18% | 63% |
| Owner with no mortgage | 25% | 29% | 20% |
| Renter | 34% | 53% | 17% |
| Working status | | | |
| Not working (due to disability or taking care of family) | 9% | 23% | 3% |
| Not working (retired) | 21% | 34% | 6% |
| Working | 70% | 43% | 91% |
| Educational Attainment | | | |
| Less than high school degree | 8% | 18% | 1% |
| High school degree or some college | 41% | 55% | 17% |
| Advanced degree | 51% | 28% | 81% |

Data Inputs

CPI Basic Item-Area Expenditure Weight coverage

The above household coverage analysis indicates that each of the quintiles has approximately the same number of households as the wage earner subpopulation. Expenditure weight coverage measures the proportion of missing item-area cells used to weight basic indexes for second-stage estimation. When price change occurs, weighting basic indexes accurately relative to the All-Items US level is imperative for constructing aggregate indexes. A cell is counted as missing when its expenditure weight is less than $1, and coverage is reported as the proportion of item-area cells that are missing. Each item series has 32 area cells. The 243 basic item series can be divided into priced item series and non-sampled item series. Expenditures are reported infrequently for the non-sampled series, and their price movement is based on aggregate priced series. Combining non-sampled and priced items therefore distorts overall coverage results. Health insurance is subject to an additional adjustment and is excluded from this data quality metric. We display results in figure 7.
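The coverage metric can be sketched in a few lines. The data layout (a mapping from item series to a list of area-cell expenditure weights in dollars) and all names are hypothetical; only the $1 threshold comes from the text.

```python
def proportion_missing(cell_weights, threshold=1.0):
    """Percent of item-area cells whose expenditure weight is below `threshold` dollars."""
    cells = [w for areas in cell_weights.values() for w in areas]
    if not cells:
        raise ValueError("no cells supplied")
    missing = sum(1 for w in cells if w < threshold)
    return 100.0 * missing / len(cells)


# Hypothetical example: 2 item series by 4 areas, with two cells below $1.
weights = {
    "item_a": [250.0, 0.0, 310.5, 42.0],
    "item_b": [0.4, 18.0, 77.0, 5.0],
}
pct_missing = proportion_missing(weights)
```

Running the same function on collected and on smoothed weights gives the two sides of the comparison reported in figure 7.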

For the urban population, the collected proportion missing is 3.6% overall and 0.5% for priced items; smoothing reduces the priced-item proportion to 0.0%. For wage earners, the collected proportion missing for priced items is 7.1%, and smoothing reduces it to 0.6%. For the lowest income quintile, the collected proportion missing for priced items is 13.7%, and smoothing reduces it to 0.0%. For the highest income quintile, the collected proportion missing for priced items is 4.6%, and smoothing reduces it to 0.0%. Smoothing therefore has a larger impact on the lowest income quintile and improves weighting coverage for index estimation.

Figure 7: 2021 Reference year expenditure weight basic cell coverage as proportion missing (percentage)

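| | Collected: # Items | Collected: U | Collected: W | Collected: Q1 | Collected: Q5 | Smoothed: # Items | Smoothed: U | Smoothed: W | Smoothed: Q1 | Smoothed: Q5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Overall | 209 | 3.6 | 13.1 | 19.9 | 10.0 | 225 | 0.4 | 1.4 | 0.4 | 0.4 |
| Non-sampled | 26 | 25.2 | 55.4 | 63.3 | 48.3 | 26 | 3.8 | 7.7 | 3.8 | 3.8 |
| Priced | 183 | 0.5 | 7.1 | 13.7 | 4.6 | 199 | 0.0 | 0.6 | 0.0 | 0.0 |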

Income Quintile Spending Weights

We produce price indexes, which use spending weights to calculate an average price change. While the spending weights for the urban population reflect average spending, they may not reflect spending of any individual household or groups of households. Spending weights vary across the income distribution. Overall, households earning the lowest quintile of income devote a larger share of their spending on essential goods and services. Households earning the highest quintile of income allocate a larger share of their spending on recreational and leisure goods and services. Figure 8 shows a snapshot of these spending differences in 2019-2020 for select categories. We present more categories in the appendix. We used spending weights constructed from these data to calculate indexes in 2022.

Figure 8: Snapshot of spending weights by population, 2019-2020 biennial expenditure weight share, equivalized income

[Figure 8 bar chart. y-axis: 0.0 to 30.0; series: Q1, Q2, Q3, Q4, Q5.]

These spending weights reflect the differences between quintiles of equivalized income. There are a few notable shifts in these spending weights from unadjusted income we used in an earlier analysis. The share of spending on owner’s equivalent rent by households classified in the first quintile of equivalized income is 2 percentage points lower than households classified in the first quintile of unadjusted income. The households shifting out of the first income quintile after adjusting for household size are more likely to be retired and own their own homes without a mortgage. The households shifting into the first income quintile after adjusting for household size are more likely to rent their homes or own their homes with a mortgage. Although homeowners without mortgages pay less out of pocket to live in their home than other households, the owners’ equivalent rent approach to owned housing imputes an implicit rent. For retirees, who are more likely to own their home without a mortgage than other households, owner’s equivalent rent constitutes a large share of their spending weights. This is evidenced by the spending shares for the CPI-E population, nearly 60 percent of whom are retired. The net effect of households shifting into and out of the first income quintile is a reduction in spending on shelter services.

We also observe a notable shift in transportation spending weights from unadjusted income we used in an earlier analysis. Spending on all vehicle-related categories (new and used vehicles, motor fuels, and vehicle insurance) is 1.1 percentage points higher for the households categorized in the first quintile of equivalized income relative to their unadjusted counterparts. Again, retirees are likely the cause of this shift. Households included in the CPI-W population typically spend a larger share of their budget on vehicle-related expenses than urban households. The wage-earner and clerical worker population includes very few retirees (4 percent). With more households with members who are working included in the lowest quintile of equivalized income, they dedicate more of their budget to vehicle-related expenses.
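The size of these shifts can be checked against the two first-quintile columns of the Appendix 1 tables. The sketch below copies a handful of Q1 spending weights from those tables and takes the difference; the variable names are illustrative.

```python
# First-quintile (Q1) spending weights in percent, copied from Appendix 1.
equivalized_q1 = {
    "Owner's equivalent rent": 21.4,
    "Rent": 14.3,
    "Transportation": 13.9,
    "Motor fuels": 3.2,
    "Vehicle insurance": 2.6,
}
unadjusted_q1 = {
    "Owner's equivalent rent": 23.4,
    "Rent": 14.3,
    "Transportation": 12.8,
    "Motor fuels": 2.8,
    "Vehicle insurance": 2.3,
}

# Percentage-point shift from unadjusted to equivalized income, rounded to the
# one-decimal precision of the published tables.
shift = {
    item: round(equivalized_q1[item] - unadjusted_q1[item], 1)
    for item in equivalized_q1
}

for item, points in shift.items():
    print(f"{item}: {points:+.1f} percentage points")
```

Because the published weights are rounded to one decimal place, differences computed this way are only accurate to about 0.1 percentage points.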

Price Analysis

As noted in the methodology section, the BLS calculates price indexes for different populations by applying varying spending weights to the same set of underlying basic price indexes. That is, when averaging price changes across all items, the price change for rent has a greater impact on overall price change for lowest-income versus highest-income households. If prices changed at the same rate for all item categories, there would be no difference in inflation rates by population. In this section we present price changes by item category which will explain overall index differences.
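The point about identical price changes can be illustrated with a deliberately simplified sketch: a share-weighted average of item price changes over a two-item basket. This is not the production index calculation. The rent and lodging shares are taken from Appendix 1, while the price changes and all names are hypothetical.

```python
def all_items_change(shares, price_changes):
    """Share-weighted average of item price changes (percent)."""
    total = sum(shares.values())
    return sum(shares[item] / total * price_changes[item] for item in shares)


# Spending shares (percent) for two categories, from Appendix 1 (equivalized income).
q1_shares = {"rent": 14.3, "lodging away from home": 0.5}
q5_shares = {"rent": 3.7, "lodging away from home": 1.6}

# Hypothetical price changes (percent).
uniform = {"rent": 4.0, "lodging away from home": 4.0}
varied = {"rent": 4.0, "lodging away from home": 10.0}

same_q1 = all_items_change(q1_shares, uniform)
same_q5 = all_items_change(q5_shares, uniform)
diff_q1 = all_items_change(q1_shares, varied)
diff_q5 = all_items_change(q5_shares, varied)
```

With uniform price changes the two populations see the same rate regardless of their weights; a gap appears only when price changes differ across categories with different spending shares.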

In figure 9, we show price change against spending share differences between the first and fifth income quintiles for select components of the CPI. The y-axis shows price change from January 2020 to December 2021; new and used motor vehicles and motor fuel had the largest price increases during that period (nearly 27%). The x-axis shows the ratio of spending shares (first quintile divided by fifth quintile) using 2019-2020 spending shares. Compared to fifth-quintile households, first-quintile households devoted four times the budget share to rent and a quarter of the budget share to lodging away from home.

Figure 9: Price change and spending share scatterplot, first and fifth quintile (price change January 2020-December 2021, spending shares 2019-2020 ratio Q1/Q5)

Results

In this section we present indexes by income quintile. Overall, the trends we observe in previous analysis continued in 2019-2022. Lowest-income households tend to experience larger inflation rates than highest-income households. In this section we present index results and further analyze periods that defy the overall trend.

Overall Index Results

Lowest-income households tend to experience larger inflation rates than highest-income households. We show the annualized inflation rates over the period in Figure 10. Lowest-income households faced inflation rates that were on average 0.27 percentage points larger than highest-income households every year over this period. Cumulatively, the inflation gap is 5.18% over 17 years.
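For readers who want to relate annualized and cumulative figures, a minimal sketch follows. It assumes the annualized rate is the compound (geometric) average over the period, which the text does not state explicitly; the function names and index levels are hypothetical.

```python
def annualized_rate(index_start, index_end, years):
    """Compound average annual percent change between two index levels."""
    return ((index_end / index_start) ** (1.0 / years) - 1.0) * 100.0


def cumulative_change(annual_rate_pct, years):
    """Total percent change implied by compounding an annual rate over `years`."""
    return ((1.0 + annual_rate_pct / 100.0) ** years - 1.0) * 100.0


# Hypothetical index levels 17 years apart.
start_level = 100.0
end_level = 150.0
rate = annualized_rate(start_level, end_level, 17)
total = cumulative_change(rate, 17)
```

Small differences in annualized rates compound, so a cumulative gap computed from rounded published rates will not necessarily match one computed from unrounded index levels.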

[Figure 9 scatterplot. x-axis: Ratio of Spending Shares (Q1/Q5), 0 to 5; y-axis: Price Change, -10% to 30%. Labeled points: Rent of primary residence (HA), Tobacco and smoking products (GA), Pork (FD), Poultry (FF), Energy services (HF), Beef and veal (FC), Telephone services (ED), Motor fuel (TB), New and used motor vehicles (TA), Lodging away from home (HB).]

Figure 10: Annualized inflation rate, CPI by income quintile, Lowe Formula, December 2005 - December 2022

Variation of income inflation gap over time

On average from 2006 through 2022, lowest-income households faced larger inflation rates than highest-income households. At its peak in August 2008, the gap in inflation rates was 1.37 percentage points. At its trough in February 2016, the gap in inflation was reversed with highest-income households facing inflation rates 0.31 percentage points larger than lowest-income households. The long-term gap in the average inflation rate is the same whether classifying households using equivalized income or unadjusted income. In the short-term, the magnitude and direction of the inflation gap differs depending on the income definition we use to classify households. We show the difference in annual 12-month percent change between lowest- and highest-income households in Figure 12.

[Figure 10 bar chart. Annualized inflation rate (percent) by income quintile: Q1 2.60, Q2 2.54, Q3 2.47, Q4 2.41, Q5 2.33; Urban 2.43. y-axis: 2.15 to 2.65.]

Figure 12: Annual 12-month percent change in CPI, difference between lowest and highest income quintile, December 2006 – December 2022

Variation in inflation gap by item category

Over the period studied, lower income households faced larger inflation rates than higher income households aggregated across all items in the market basket. Which item categories drive this difference? In this section, we present inflation rates by eight broad classifications called major groups. In the next section we decompose the contribution of each of these major groups to the overall inflation gap between lowest and highest income households.

Figure 13 displays inflation rates (annualized 12-month change) for each major group and the inflation gap between lowest and highest income households. For Apparel and Medical Care, highest income households faced larger inflation than lowest income households. The inflation gap was the smallest for Education and Communication and Food and Beverages. Lowest income households faced larger inflation rates than highest income households for Other Goods and Services, Housing, and Transportation. How much each major group contributed to the overall inflation gap depends on the spending shares. We present contribution information in the next section.

[Figure 12 line chart. y-axis: -1.00% to 1.50%; x-axis: June and December of each year, December 2006 through December 2022. Series: Equivalized income, Unadjusted income.]

Figure 13: Inflation gap by CPI major group; Annualized inflation rate, Lowe Formula, December 2005 - December 2022

| Item Category - Major Group | Urban | Lowest Income Quintile (Q1) | Highest Income Quintile (Q5) | Inflation Gap (Q1-Q5) |
|---|---|---|---|---|
| Apparel | 0.35 | 0.25 | 0.42 | -0.17 |
| Education and Communication | 1.34 | 1.63 | 1.67 | -0.04 |
| Food and Beverages | 2.89 | 2.90 | 2.89 | 0.01 |
| Other Goods and Services | 2.90 | 3.38 | 2.43 | 0.95 |
| Housing | 2.66 | 2.85 | 2.53 | 0.32 |
| Medical Care | 3.07 | 2.89 | 3.16 | -0.27 |
| Recreation | 1.13 | 1.18 | 1.14 | 0.04 |
| Transportation | 2.33 | 2.52 | 2.23 | 0.29 |

To interpret these results, recall our methodology adjusts spending shares on item categories such as women’s dresses, men’s pants, and children’s clothing to reflect the shopping behavior of households in each income quintile. Price change at the major group level reflects different averages of price change across those item categories. Highest income households faced larger apparel inflation than lowest income households because they spent a larger share on item categories whose prices were rising faster than average (or smaller shares on item categories whose prices were falling or rising slower than average). These results do not indicate any differences in shopping behaviors below the item strata level.

The BLS calculates the CPI including all goods and services purchased by consumers. For some uses of the CPI, users prefer calculating a CPI over a smaller set of goods and services. The BLS calculates a CPI Less Food and Energy index, which some users refer to as “core” inflation since it excludes some volatile item categories. Another subset calculation is the CPI for Food, Clothing, Shelter, and Utilities (FCSUti). The BLS uses this index to calculate a research poverty measure, and it can be considered an “essentials” index.17 We show inflation gap results for these indexes in Figure 14.

17 The Supplemental Poverty Measure website explains the SPM methodology and use of the FCSUti CPI. https://www.bls.gov/pir/spmhome.htm

Figure 14: Inflation gap for “core” and “essentials” items; Annualized inflation rate, Lowe Formula, December 2005 - December 2022

| Special Aggregation Index | Urban | Lowest Income Quintile (Q1) | Highest Income Quintile (Q5) | Inflation Gap (Q1-Q5) |
|---|---|---|---|---|
| All Items | 2.43 | 2.60 | 2.33 | 0.27 |
| “Core” All Items Less Food and Energy (X) | 2.35 | 2.57 | 2.23 | 0.34 |
| “Essentials” Food, Shelter, Clothing, and Utilities (FCSUti) | 2.63 | 2.71 | 2.60 | 0.11 |

Excluding food and energy, the inflation gap between lowest and highest income households is wider than when those categories are included. The inflation gap is smaller for the “essentials” index.

Which items explain the inflation gap?

In the previous section, we showed the variability in the inflation gap below the All-Items level. To understand how different components of the market basket contribute to the All-Items inflation gap, we need a measure that incorporates relative weights across item categories. These measures are called contributions and effects.18 The effect of an item category is the amount by which the all-items price change would differ if that category’s price had remained unchanged instead of changing by the measured value. The contribution expresses each item’s effect relative to the all-items price change.
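A minimal sketch of the two measures follows. It assumes the simplified relationship in which an item's effect is its relative importance times its price change, and its contribution is its effect as a percent of the all-items change; the exact formulas used in this paper are given in Appendix 2. All names and numbers are hypothetical.

```python
def effects_and_contributions(relative_importance, price_change):
    """Return (all_items_change, effects, contributions).

    relative_importance: item -> expenditure share in percent (summing to 100)
    price_change: item -> percent price change over the period
    """
    # Effect: percentage points of all-items change attributable to the item.
    effects = {
        item: relative_importance[item] * price_change[item] / 100.0
        for item in relative_importance
    }
    all_items_change = sum(effects.values())
    if all_items_change == 0:
        raise ValueError("contributions are undefined when the all-items change is zero")
    # Contribution: the item's effect as a percent of the all-items change.
    contributions = {
        item: 100.0 * effect / all_items_change for item, effect in effects.items()
    }
    return all_items_change, effects, contributions


# Hypothetical three-item basket.
basket_weights = {"shelter": 40.0, "fuel": 10.0, "other": 50.0}
basket_changes = {"shelter": 5.0, "fuel": 20.0, "other": 2.0}
change, item_effects, item_contributions = effects_and_contributions(
    basket_weights, basket_changes
)
```

In this example fuel has a quarter of shelter's weight but the same effect, because its price change is four times as large.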

These contributions and effects can be extended to explain inflation rates for income groups. In figure 15, we display the top and bottom three contributing items to the urban, first income quintile, and fifth income quintile populations. For example, inflation in owner’s equivalent rent was the largest contributor of any single item stratum to the 2022 inflation rates of the urban, lowest income quintile, and highest income quintile populations. If owner’s equivalent rent had not changed in 2022, the all-items price change would have been 1.3 percentage points smaller for the urban population, 1.2 percentage points smaller for the lowest income quintile, and 1.4 percentage points smaller for the highest income quintile.

Figure 15: 2022 Year over year item ranking of contribution and effect (percentage) for U, Q1, and Q5

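All Items: U 8.0; Q1 8.3; Q5 7.7

| Rank | U: Item | U: Effect | U: Contribution | Q1: Item | Q1: Effect | Q1: Contribution | Q5: Item | Q5: Effect | Q5: Contribution |
|---|---|---|---|---|---|---|---|---|---|
| 1 | HC01 | 1.3 | 16.6 | HC01 | 1.2 | 15.2 | HC01 | 1.4 | 18 |
| 2 | TB01 | 1.1 | 13.9 | TB01 | 1.2 | 14.5 | TB01 | 0.9 | 11 |
| 3 | TA02 | 0.5 | 6.0 | HA01 | 0.9 | 10.8 | TA01 | 0.5 | 7.0 |
| 209 | ED03 | -0.0 | -0.1 | ED03 | -0.0 | -0.1 | ED03 | -0.0 | -0.1 |
| 210 | RA01 | -0.0 | -0.2 | RA01 | -0.0 | -0.2 | RA01 | -0.0 | -0.2 |
| 211 | EE04 | -0.1 | -0.7 | EE04 | -0.1 | -0.7 | EE04 | -0.0 | -0.6 |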

18 See Footnote 1 https://www.bls.gov/news.release/cpi.t07.htm

Gasoline is the second largest contributor to inflation for all three populations in 2022. The third ranked items differ across populations: used vehicles for the urban population, rent for lowest income households, and new vehicles for highest income households.19 The bottom three ranked items display the small negative effects and contributions.

Since owner’s equivalent rent is an important contributor to all populations, what explains the inflation gap between lowest and highest income quintiles? To home in on this question, we redefine the contribution and effect measure to identify the item categories that most contribute to widening the inflation gap (positive effect) and narrowing the inflation gap (negative effect). Formulas used for this analysis are described in Appendix 2. The 2022 year-over-year inflation gap is 0.5%, with a positive effect of 2.1% and a negative effect of -1.6%.

We show in figure 16 the item categories contributing most to the positive effect (greater than 0) and the negative effect (less than 0). Rent, gasoline, and electricity had the largest contributions to the positive effect. The lowest income quintile of households spends more of their budget share on these categories than the highest income quintile of households. New vehicles and airline fares had the largest contributions to the negative effect. The highest income quintile of households spends more of their budget share on these categories than the lowest income quintile of households.
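The gap measure can be sketched as follows, based on the definitions in Appendix 2: each item's gap effect is its Q1 effect minus its Q5 effect, and its proportional effect is the absolute gap effect divided by the sum of absolute gap effects. The item effects below are hypothetical, as are all names.

```python
def gap_decomposition(effects_q1, effects_q5):
    """Split the Q1-Q5 inflation gap into item-level effects.

    Returns (item_gap, positive_effect, negative_effect, proportional), where
    proportional gives each item's share (percent) of the summed absolute effects.
    """
    items = sorted(set(effects_q1) | set(effects_q5))
    item_gap = {k: effects_q1.get(k, 0.0) - effects_q5.get(k, 0.0) for k in items}
    positive_effect = sum(d for d in item_gap.values() if d > 0)  # widens the gap
    negative_effect = sum(d for d in item_gap.values() if d < 0)  # narrows the gap
    total_abs = sum(abs(d) for d in item_gap.values())
    proportional = (
        {k: 100.0 * abs(d) / total_abs for k, d in item_gap.items()}
        if total_abs
        else {}
    )
    return item_gap, positive_effect, negative_effect, proportional


# Hypothetical item effects in percentage points.
q1_effects = {"rent": 0.9, "gasoline": 1.2, "new vehicles": 0.2}
q5_effects = {"rent": 0.3, "gasoline": 0.9, "new vehicles": 0.5}
item_gap, widening, narrowing, proportional = gap_decomposition(q1_effects, q5_effects)
net_gap = widening + narrowing
```

The positive and negative effects sum to the overall gap, which is why a small net gap can conceal larger offsetting item-level differences.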

19 See Appendix 7 Consumer Price Index items by publication level for item definitions https://www.bls.gov/cpi/additional-resources/index-publication-level.htm

Figure 16. 2022 Year over year inflation gap (Q1-Q5) CPI contributions to All-Items (percentage)

Summary/Conclusion/Future analysis/Next Steps

BLS produces different measures of inflation used to assess the health of the American economy. With this research, we add additional measures of consumer inflation across the distribution of household income. This paper builds on the authors’ prior research by modifying the cohort definition and extending the period of analysis to 2022. As with earlier periods studied, lower income households generally faced larger inflation rates than higher income households through 2022. The long-term inflation gap between lowest and highest income households is unaffected by the cohort definition changes to better account for varying household sizes (however short-term differences in the inflation gap emerge).

The inflation gap is the result of differences in spending shares across households. Prices for rent, gasoline, electricity, new vehicles, and owner’s equivalent rent rose faster than average in 2022. The impact of rent, gasoline, and electricity spending share differences generated larger inflation measures for lowest income households. The larger spending shares highest income households dedicated to new vehicles and owner’s equivalent rent had a moderating impact on the inflation gap. Modifying the set of item categories over which inflation measures are calculated changes the spending share differences across households, leading to differences in inflation gap measures.

[Figure 16 bar chart. Axis range: -5 to 15. Items shown: Rent primary residence (HA01), Gasoline (all types) (TB01), Electricity (HF01), Utility (piped) gas service (HF02), Cigarettes (GA01), Motor vehicle insurance (TE01), Limited service meals/snacks (FV02), Juices and drinks (FN03), Cable & satellite tv/radio (RA02), Chicken (FF01), Club membership (RB02), Child care & nursery school (EB03), Owners' rent secondary res. (HC09), Leased cars and trucks (TA03), Full service meals and snacks (FV01), Commercial Health Insurance (ME01), Owners' rent primary residence (HC01), Lodging away from home (HB02), Airline fare (TG01), New vehicles (TA01).]

Throughout this paper, we have identified potential areas for future research. Perhaps most importantly, we recognize the importance of capturing price change differences at the lower level by income quintile. As other researchers have demonstrated, there may be considerable heterogeneity in the prices paid and unique items purchased that can have an impact on the overall measure of inflation. Previous research has found little difference in rent inflation by income group. We are interested in exploring this finding further and the impact of rent subsidies that have a larger impact on the lowest quintile of households.

We define cohorts by income quintile but recognize there may be other cohorts better suited for different uses. For example, cohorts defined by expenditure can lead to reclassification of some households into different quintiles (some lowest quintile households would fall into the highest quintile of expenditure, for example). There could also be different geographic stratifications that would be helpful (below the national level). Furthermore, there could be measures of wealth that are more useful for categorizing households.

Finally, this research is limited to the income group cohort definition. BLS is developing another product, Household Cost Indexes, that could be calculated for different subgroups. It would also be useful to develop confidence intervals and standard errors to identify statistically significant differences between inflation measures for different populations.

Appendix 1

Snapshot of spending weights by population, 2019-2020 biennial expenditure weight, equivalized income

| Item Category | Urban | Q1 | Q2 | Q3 | Q4 | Q5 |
|---|---|---|---|---|---|---|
| Food and beverages | 14.2 | 14.9 | 14.6 | 14.2 | 14.5 | 13.5 |
| Alcoholic beverages | 0.9 | 0.5 | 0.6 | 0.8 | 1 | 1.3 |
| Food away from home | 5.1 | 4.5 | 4.9 | 5 | 5.2 | 5.4 |
| Food at home | 8.1 | 9.9 | 9.1 | 8.5 | 8.3 | 6.8 |
| Housing | 42.9 | 46.6 | 44.6 | 42.1 | 41.3 | 42.5 |
| Owner’s equivalent rent | 24.7 | 21.4 | 23.3 | 22.9 | 24.5 | 27.6 |
| Rent | 7.6 | 14.3 | 11 | 9.1 | 6.5 | 3.7 |
| Fuels and utilities | 4.5 | 5.9 | 5.4 | 4.8 | 4.4 | 3.5 |
| Household furnishings and operations | 4.8 | 4.1 | 4 | 4.3 | 4.7 | 5.6 |
| Lodging away from home | 0.9 | 0.5 | 0.5 | 0.6 | 0.8 | 1.6 |
| Apparel | 2.7 | 2.5 | 2.5 | 2.4 | 2.7 | 2.9 |
| Transportation | 16.5 | 13.9 | 15.3 | 17.5 | 17.6 | 16.5 |
| Motor fuels | 3 | 3.2 | 3.4 | 3.5 | 3.3 | 2.4 |
| Public transportation | 0.9 | 0.6 | 0.6 | 0.7 | 0.8 | 1.3 |
| Vehicle purchase and maintenance and repair | 9.4 | 7.1 | 8 | 9.8 | 10.1 | 10.2 |
| Vehicle insurance | 2.5 | 2.6 | 2.9 | 2.9 | 2.8 | 2 |
| Medical care | 8.8 | 7.9 | 10 | 9.7 | 9.1 | 7.9 |
| Health insurance, retained earnings | 0.8 | 0.5 | 0.7 | 0.9 | 0.9 | 0.8 |
| Professional services | 3.8 | 3.5 | 4.3 | 4.1 | 3.9 | 3.5 |
| Recreation | 5.3 | 4.3 | 4.2 | 4.7 | 5.5 | 6.4 |
| Education and communication | 6.9 | 6.8 | 5.7 | 6.4 | 6.7 | 7.8 |
| Education | 2.8 | 2.4 | 1.3 | 1.9 | 2.4 | 4.4 |
| Communication | 4.1 | 4.5 | 4.4 | 4.5 | 4.3 | 3.4 |
| Other goods and services | 2.8 | 3.1 | 3 | 2.9 | 2.7 | 2.6 |

Snapshot of spending weights by population, 2019-2020 biennial expenditure weight, unadjusted income

| Item Category | Urban | Q1 | Q2 | Q3 | Q4 | Q5 |
|---|---|---|---|---|---|---|
| Food and beverages | 14.2 | 14.1 | 14.2 | 14.4 | 14.5 | 13.8 |
| Alcoholic beverages | 0.9 | 0.6 | 0.7 | 0.8 | 0.9 | 1.2 |
| Food away from home | 5.1 | 4.4 | 4.7 | 5.2 | 5.1 | 5.5 |
| Food at home | 8.1 | 9.2 | 8.8 | 8.4 | 8.4 | 7.2 |
| Housing | 42.9 | 48.4 | 45.4 | 43.2 | 40.8 | 41.4 |
| Owner’s equivalent rent | 24.7 | 23.4 | 22.9 | 23.2 | 23.8 | 27.2 |
| Rent | 7.6 | 14.3 | 12 | 9.7 | 6.7 | 3.2 |
| Fuels and utilities | 4.5 | 5.7 | 5.4 | 4.9 | 4.4 | 3.6 |
| Household furnishings and operations | 4.8 | 4.1 | 4.1 | 4.5 | 4.7 | 5.4 |
| Lodging away from home | 0.9 | 0.5 | 0.5 | 0.6 | 0.8 | 1.5 |
| Apparel | 2.7 | 2.3 | 2.3 | 2.6 | 2.7 | 3 |
| Transportation | 16.5 | 12.8 | 15.3 | 17.1 | 18.1 | 16.7 |
| Motor fuels | 3 | 2.8 | 3.3 | 3.5 | 3.4 | 2.6 |
| Public transportation | 0.9 | 0.7 | 0.6 | 0.7 | 0.8 | 1.3 |
| Vehicle purchase and maintenance and repair | 9.4 | 6.6 | 8.1 | 9.4 | 10.5 | 10.2 |
| Vehicle insurance | 2.5 | 2.3 | 2.8 | 2.9 | 2.8 | 2.1 |
| Medical care | 8.8 | 8.8 | 9.6 | 9.3 | 9.2 | 8 |
| Health insurance, retained earnings | 0.8 | 0.5 | 0.7 | 0.8 | 0.9 | 0.8 |
| Professional services | 3.8 | 3.9 | 4.1 | 3.9 | 4 | 3.5 |
| Recreation | 5.3 | 4 | 4.6 | 4.6 | 5.3 | 6.4 |
| Education and communication | 6.9 | 6.5 | 5.6 | 5.9 | 6.6 | 8.1 |
| Education | 2.8 | 2.3 | 1.2 | 1.4 | 2.3 | 4.7 |
| Communication | 4.1 | 4.2 | 4.4 | 4.5 | 4.4 | 3.5 |
| Other goods and services | 2.8 | 3 | 3.1 | 2.9 | 2.7 | 2.6 |

Appendix 2 Subpopulation difference of effects and contribution formulas

Effects: When pivot month the same across 12-month average (odd index years):

&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d461;&#x1d461;−&#x1d45b;&#x1d45b;→&#x1d461;&#x1d461;;&#x1d44e;&#x1d44e;,&#x1d456;&#x1d456;→&#x1d456;&#x1d456;,&#x1d434;&#x1d434; &#x1d45d;&#x1d45d; = �

(&#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d461;&#x1d461;,&#x1d434;&#x1d434;,&#x1d456;&#x1d456; − &#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d461;&#x1d461;−12,&#x1d434;&#x1d434;,&#x1d456;&#x1d456;) (&#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d461;&#x1d461;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; − &#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d461;&#x1d461;−12,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c;)

∗ (&#x1d43c;&#x1d43c;&#x1d43c;&#x1d43c;���&#x1d461;&#x1d461;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; − &#x1d43c;&#x1d43c;&#x1d43c;&#x1d43c;���,&#x1d461;&#x1d461;−&#x1d45b;&#x1d45b;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c;)

&#x1d43c;&#x1d43c;&#x1d43c;&#x1d43c;���&#x1d461;&#x1d461;−&#x1d45b;&#x1d45b;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; ∗ 100�

When pivot month is revised across 12 month average (even index years): &#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d438;&#x1d461;&#x1d461;−12→&#x1d461;&#x1d461;;&#x1d44e;&#x1d44e;,&#x1d456;&#x1d456;→&#x1d456;&#x1d456;,&#x1d434;&#x1d434;

&#x1d45d;&#x1d45d; =

⎣ ⎢ ⎢ ⎢ ⎡�&#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d438;&#x1d438;&#x1d461;&#x1d461;,&#x1d434;&#x1d434;,&#x1d456;&#x1d456; − &#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d438;&#x1d438;&#x1d461;&#x1d461;−&#x1d45b;&#x1d45b;,&#x1d434;&#x1d434;,&#x1d456;&#x1d456; ∗ �

&#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d463;&#x1d463;,&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d45b;&#x1d45b;&#x1d434;&#x1d434;&#x1d434;&#x1d434;,&#x1d434;&#x1d434;,&#x1d456;&#x1d456; &#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d463;&#x1d463;,&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;,&#x1d434;&#x1d434;,&#x1d456;&#x1d456;

��

�&#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d438;&#x1d438;&#x1d461;&#x1d461;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; − &#x1d438;&#x1d438;&#x1d450;&#x1d450;&#x1d438;&#x1d438;&#x1d461;&#x1d461;−&#x1d45b;&#x1d45b;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; ∗ � &#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d463;&#x1d463;,&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d45b;&#x1d45b;&#x1d434;&#x1d434;&#x1d434;&#x1d434;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; &#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d463;&#x1d463;,&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c;

�� ∗ �

&#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d461;&#x1d461;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c;

&#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d461;&#x1d461;−&#x1d45b;&#x1d45b;,&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c; ∗ &#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d463;&#x1d463;,&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c;,&#x1d434;&#x1d434;

&#x1d436;&#x1d436;&#x1d436;&#x1d436;&#x1d463;&#x1d463;,&#x1d434;&#x1d434;&#x1d434;&#x1d434;&#x1d45b;&#x1d45b;&#x1d434;&#x1d434;&#x1d434;&#x1d434;,&#x1d43c;&#x1d43c;,&#x1d434;&#x1d434; − 1� ∗ 100

⎦ ⎥ ⎥ ⎥ ⎤

\[ \text{Effect}_i^{Q1,Q5} = \text{Effect}_i^{Q1} - \text{Effect}_i^{Q5} \] (Normalized based on absolute value).

Contributions: For an individual population contribution the terms highlighted in gray are removed. For subpopulation contribution difference the absolute value of item \( \text{Effect}_i^{Q1,Q5} \) is evaluated relative to the sum representing the subpopulation proportional effect. The sum of subpopulation proportional effects equals 100%. Positive subpopulation item effects represent items where Q1>Q5. Negative subpopulation item effects represent items where Q1<Q5.

\[ \text{Subpopulation proportional Effect}_i^{Q1,Q5} = \frac{\left| \text{Effect}_i^{Q1,Q5} \right|}{\sum_i \left| \text{Effect}_i^{Q1,Q5} \right|} \]

t = current period
CW = cost weight
p = Q1 or Q5
A = aggregate All US
i = lower-level item
t – 12 = period 12 months prior
AWnew = new aggregation weight
v = pivot month index
I = aggregate All items
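The normalization described above can be illustrated with a minimal sketch. The item names and effect values below are hypothetical and are not taken from the report; the function name is likewise only illustrative.

```python
def proportional_effects(effect_q1, effect_q5):
    """Return per-item Q1-Q5 differences and their proportional shares.

    effect_q1, effect_q5: dicts mapping item name -> item effect for that quintile.
    """
    # Signed difference: positive where Q1 > Q5, negative where Q1 < Q5.
    diffs = {item: effect_q1[item] - effect_q5[item] for item in effect_q1}
    # Normalize each absolute difference by the sum of absolute differences.
    total_abs = sum(abs(d) for d in diffs.values())
    if total_abs == 0:
        raise ValueError("all item differences are zero; shares are undefined")
    shares = {item: abs(d) / total_abs for item, d in diffs.items()}
    return diffs, shares


# Hypothetical item effects (percentage points), for illustration only.
q1 = {"rent": 0.90, "gasoline": 0.40, "airfare": 0.05}
q5 = {"rent": 0.60, "gasoline": 0.30, "airfare": 0.15}

diffs, shares = proportional_effects(q1, q5)
for item in diffs:
    print(f"{item}: difference={diffs[item]:+.2f}, share={shares[item]:.1%}")
```

By construction the shares sum to 100%, while the sign of each difference still indicates whether the item effect is larger for Q1 or Q5.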


Impact of Weight Timeliness on the US CPI


1 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Impact of Weight Timeliness on the US CPI

Anya Stockburger, Joshua Klick, Chris Miller, Jessie Park

Bureau of Labor Statistics

Meeting of the Group of Experts on CPIs

June 9, 2023

BLS Consumer Price Indexes

[Figure: index levels for CPI-U, Final C-CPI-U, and Initial C-CPI-U; vertical axis from 100 to 190]

Motivation: Weighting Improvements

Product | Release Lag (difference between index reference period and publication) | Weight Lag (difference between weight and index reference period) | Improvement Goals
CPI-U | ~10 days | 2-4 years 2 years | Reduce weight lag
Final Chained CPI-U | 9-12 months | None | Reduce release lag; Increase visibility
Initial Chained CPI-U | ~10 days | 2-4 years 2 years | Reduce revision size; Increase visibility


Outline

Motivation

Impact of weight timeliness

Reducing weight lag in CPI-U

Future research


Annual Weight Changes

Airline Fares Monthly Expenditure Weights

[Figure: monthly expenditure weight for airline fares, January 2012 through 2021; vertical axis from 0.0% to 1.8%; labeled points: April 2020, 0.021%; July 2021, 0.931%]


Impact of chain drift

Cage, Williams, Church 2021 “Chain Drift in the Chained Consumer Price Index: 1999-2017”

Historical Impact of Timely Weights

[Figure: difference in 12-month percent change, CPI-U less Final C-CPI-U, 2001 to 2022; vertical axis from -0.40% to 1.00%]

Recent Impact of Timely Weights

[Figure: difference in 12-month percent change, CPI-U less Final C-CPI-U, monthly, 2018 to 2022; vertical axis from -0.30% to 0.70%]

Snapshot #1: December 2020

[Figure: bar chart captioned "Upper-Level Substitution Bias Contribution - December 2021 (-0.13%)"; horizontal axis from -0.02 to 0.02; items shown: Full-service meals and snacks; Limited-service meals and snacks; Admissions; College tuition and fees; Airline fare**; Personal computers; Toys; New vehicles; Motor vehicle insurance; Owner's Equivalent Rent]

2020 Price and Weight Change

[Figure: scatter plot of weight change (vertical axis, -100% to 150%) against price change (horizontal axis, -20.00% to 10.00%); labeled items: Full-service meals and snacks; Limited-service meals and snacks; Admissions; College tuition and fees; Airline fare**; Personal computers; Toys; New vehicles; Motor vehicle insurance; Owner's Equivalent Rent]

Snapshot #2: December 2021

[Figure: bar chart captioned "Upper-Level Substitution Bias Contribution - December 2021 (0.54%)"; horizontal axis from -0.2 to 0.5; items shown: Used cars and trucks; Gasoline; New vehicles; Full-service meals and snacks; Limited-service meals and snacks; Food at employee sites and schools; Clocks, lamps, and décor; Jewelry**; Motor vehicle insurance; Owner's equivalent rent]

2021 Price and Weight Change

[Figure: scatter plot of weight change (vertical axis, -50% to 110%) against price change (horizontal axis, -60.00% to 60.00%); labeled items: Used cars and trucks; Gasoline; New vehicles; Full-service meals and snacks; Limited-service meals and snacks; Food at employee sites and schools; Clocks, lamps, and décor; Jewelry**; Motor vehicle insurance; Owner's equivalent rent]

Improvements to Weight Timeliness

  • Implemented annual weight updates with January 2023 indexes
    • Reduced historical upper-level substitution bias by 0.03 percentage points per year
    • More relevant weights (replace 2019/2020 with 2021 expenditure data)
    • See website for more information on 2022 and 2023 weight updates

Weight Changes in 2023

[Figure: weight changes in 2023 by category; category labels: Apparel; Food, Alcohol Away; Haircuts; Recreation; Transportation; Lodging Away; Food, Alcohol at Home; Other Goods; Housing; Hospital Services; Medicinal Drugs]

What’s next – quarterly weight updates?

  • Reduces aggregate upper-level substitution bias
  • Issues at sub-aggregate level
    • Chain drift
    • Outlier impact
  • Klick, Park 2022

Weight update frequency | Annualized 12-month percent change (2002-2020) | Upper-level substitution bias
Biennial | 2.06 | 0.24
Annual | 2.03 | 0.21
Quarterly | 1.95 | 0.13
Tornqvist (monthly) | 1.82 | -
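The bias column in the table above appears to equal each index's annualized change less the monthly Tornqvist change. A minimal sketch of that arithmetic, under that assumption, using only the figures shown on the slide:

```python
# Annualized 12-month percent changes (2002-2020) as shown on the slide.
annualized_change = {"Biennial": 2.06, "Annual": 2.03, "Quarterly": 1.95}
tornqvist_monthly = 1.82

# Assumption: upper-level substitution bias = index change minus Tornqvist change.
bias = {
    frequency: round(change - tornqvist_monthly, 2)
    for frequency, change in annualized_change.items()
}

for frequency, value in bias.items():
    print(f"{frequency}: {value:.2f} percentage points")
```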

What’s next – improve timeliness of the C-CPI?

  • Medium-term: 6-month lag to publish final C-CPI
    • Survey protocol (placement dates)
    • Processing efficiencies (auto-coding, monthly processing, streamlined outlier review)
    • Design changes (survey recall length)
  • Long-term: real-time capture of expenditure information
    • Funding for pilot test included in FY24 President’s budget request

Contact Information


Anya Stockburger Chief, Branch of Revision Methodology

Division of Consumer Price Indexes www.bls.gov/cpi

[email protected]


Hedonic price estimates for new vehicles: When do rotations lead to drift? United States


Hedonic price estimates for new vehicles: When do rotations lead to drift?

Brendan K. Williams

May 2023

WORKING DRAFT

Prepared for the Group of Experts on Consumer Price Indices

UNECE, Geneva, June 2023

Abstract

Using a transaction level dataset on car purchases, we document the empirical relationship between standard hedonic index methods, including hedonic imputation and time dummy estimates, and the matched-model approach. We extend this analysis to investigate the effects of product cycles on hedonic estimates and the potential for coefficient “drift” that may result in biased price indexes. We distinguish between these effects and conventional “chain drift” in the context of bilateral, pooled, similarity linking, and multilateral index approaches. We map transaction records to additional sources to incorporate more detailed vehicle attributes and performance metrics. We introduce a new method of similarity linking specific to hedonic regression. Our results offer guidance on hedonic index construction methods and incorporating alternative data for industries into official price statistics.

JEL Codes: C43, E31

__________________________________________________________ Brendan K. Williams is a Senior Economist in the Branch of Consumer Prices at the Bureau of Labor Statistics (BLS). This research arose out of an earlier project with Leonard Nakamura, Ryan Michaels, and Erik Sager. Thank you to Bill Thompson and Nicole Shepler for their comments and review.


1. Introduction

The availability of transaction and scanner data has shifted the focus of price index research, with methods related to chain drift receiving a great deal of attention. In many cases, downward drift may be generated by a product cycle and not the conventional “chain drift” mechanism, yet product cycle effects have received relatively little attention in the literature. We conduct our empirical analysis on new vehicle sales data, which have documented product cycle effects related to intertemporal price discrimination and to price changes being introduced simultaneously with product updates. In a traditional, fixed-sample consumer price index, item replacement is used to address issues that result from this drift, and quality adjustment emerges as a necessity to address the quality bias that may ensue. Hedonic methods are associated with measuring technological improvements and are expected to have negative price index impacts as a result. However, when hedonic imputation methods are used on data with a product cycle, we often see positive effects, as hedonic imputation allows improved measurement of long-run price change.

The relationships between product cycles and quality change, and how they relate to various price index construction methods, have not been well investigated. We use a dataset of new vehicle sales records to compare several standard approaches to price index construction and to analyze the empirical effects of product cycles. New vehicles were the subject of the very first hedonic analysis and have continued to be a subject of interest in the literature. We revisit some earlier hedonic models of vehicles and combine these specifications with more recent hedonic index methods.

When using transaction data, product matching and grouping or hedonic imputation may be used to address drift. Multilateral methods play a role in offsetting product cycle drift, but mainly as a means of time aggregating hedonic imputations and addressing drift they may induce. We find that multilateral methods applied without hedonic imputation or product matching do not address product cycle drift. Within just the past few years, similarity linking methods have shown promise for dealing with chain drift as an alternative to GEKS-type multilaterals (Diewert, 2021). In an apparent first, we combine similarity linking methods with hedonic imputation. Similarity linking methods entail finding similar time periods for bilateral price index comparisons. Several different methods have been proposed to quantify “similarity” in practice. We introduce a new method where Chow test statistics are used with hedonic regression estimates to assess relative similarity between time periods.

We begin with an explanation of our data. We receive transaction level data from J.D. Power. These data are currently in estimation for the U.S. CPI based on the methodology in Williams and Sager (2019). We match these sales observations to more detailed specification information, including measures of vehicle performance, from Wards. The addition of this information allows us to produce more detailed hedonic models and reproduce specifications of historical interest.

We move to a discussion of previous work on hedonic estimates for vehicles. We conduct our empirical analysis on new vehicle sales. New vehicles have a long history in hedonic research going back to the seminal papers introducing and popularizing hedonic methods. We revisit these earlier model specifications and evaluate them in terms of more recent hedonic price index methods.

Next, we review evidence for product cycles and the intuition behind their effect on price indexes.


We then review the existing, established methodologies for using hedonic, multilateral, and similarity linking methods. We detail the use of our novel similarity linking method based on hedonic imputation.

Much of this paper is devoted to documenting the behavior of various standard hedonic and other price index methods on a single data source with well-documented product cycle behavior. We show that matched model price indexes tend to show implausibly large decreases. Hedonic estimates tend to show less of a decline. Our results are consistent with several other papers where matched model indexes are downwardly biased. We find that hedonic imputation methods may address product cycle issues by allowing price comparisons to be made over a long time horizon.

2. Data

Transaction records with pricing data from J.D. Power were linked to specification data from Wards. The two sources were not consistent with each other, especially when identifying trim and packages. An algorithm-assisted process was used to link records between the two sources. Concatenations of the model, trim, and package fields for both sources were created and compared. One-to-one matches were assumed to be correct. When a J.D. Power record matched several potential Wards vehicles, records were prioritized based on the highest number of words in common followed by the minimum string distance. These matches were then manually reviewed. The Wards data did not cover all observations in the J.D. Power data; certain trims and even a few models were omitted.
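The prioritization rule described above (most words in common, then minimum string distance) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify the string distance, so Levenshtein edit distance is used here as a stand-in, and the function names and vehicle strings are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (assumed distance measure)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def make_key(model, trim, package):
    """Concatenate the model, trim, and package fields into one comparison key."""
    return " ".join(part.strip().lower() for part in (model, trim, package) if part)


def best_match(source_key, candidate_keys):
    """Pick the candidate with the most shared words, then the smallest distance."""
    source_words = set(source_key.split())

    def rank(candidate):
        shared = len(source_words & set(candidate.split()))
        return (-shared, edit_distance(source_key, candidate))

    return min(candidate_keys, key=rank)


# Hypothetical records, for illustration only.
transaction_key = make_key("Sedan X", "Sport", "Premium Pkg")
spec_keys = [
    make_key("Sedan X", "Sport", "Premium Package"),
    make_key("Sedan X", "Sport", ""),
    make_key("Sedan X", "Touring", "Premium Package"),
]
print(best_match(transaction_key, spec_keys))
```

In the process the paper describes, matches chosen this way would still be reviewed manually.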

Our Wards data contain specification information from the 2005 to 2019 model year. 2020 model year vehicles began sales in 2019, meaning our specification information only covers our transaction data through 2018. Data for the 2019 model year also does not include mileage estimates (presumably these were not available at the time when we received this data from Wards), which limits model specifications that include fuel efficiency to the index through 2017.

This paper will focus on the pricing of passenger cars (defined as vehicles with listed body types of sedans, convertibles, hatchbacks, coupes, and wagons). Truck and van vehicle configurations (such as cabin type, bed length, and van height) can add variation to price, and our data does not indicate these specifications consistently. Moreover, many of these configurations are intended for commercial use and outside the scope of a Consumer Price Index.

Unlike much of the research published by BLS, the indexes created here are not intended as candidates for production use in the U.S. CPI. We have made many simplifications that would not be used in a production series (such as estimating a single-stage price index at the national level rather than estimating area-level indexes and performing an aggregation).

3. Background on New Vehicle Hedonic Research

New vehicle pricing has been the focus of landmark hedonic1 methods papers, including the foundational research in Court (1939) and the popularization of hedonics following Griliches (1961), and more recently in the broader demand estimation literature with Berry, Levinsohn, and Pakes (1995). These early hedonic papers focused on tangible aspects of vehicles and their performance, with Court proposing a three-variable specification of weight, wheelbase, and horsepower. Griliches (1961) added dummy variables for V8 engines, hardtops, transmission, compact body type, and power brakes and steering. Triplett (1969) followed this specification but combined power brakes and steering and also proposed a truncated model. Cowling and Cubbin (1972) introduced several other variables including vehicle fuel efficiency.

1 Court first used the term “hedonic” and attributed the name to a suggestion from Alexander Sachs. Court’s paper is typically described as the beginning of hedonics, and it seems to have laid theoretical foundations, but other papers in agriculture preceded it in using features as predictors of price. See Colwell and Dilmore (1999).

Of these historical specifications, we can most directly reproduce the specifications used in Court (1939) and Ohta and Griliches (1976). Power brakes have long been standard on almost all vehicles sold in the United States and only a few models are available with manual steering (even manual transmission has become uncommon). The simple, three-variable model in Court remains directly applicable even to modern vehicles. The other models are less applicable, as options like power brakes and steering are now nearly universally standard while others no longer exist, namely the pillarless “hardtop,” which has not been sold since the 1970s. Omitting hardtop as an obsolete feature, the Ohta and Griliches specification is producible given our data and has the advantage of accounting for vehicle make. Since “make” is generally indicative of the level of finishings in a vehicle, including make gives a rough control on interior quality (an aspect generally otherwise omitted in our data and the papers discussed below).

Table 1: Comparison of historical model specifications

Court (1939)

Griliches (1961)

Triplett (1969)

Triplett Trunc. (1969)

Cowling & Cubbin (1972)

Ohta & Griliches (1976)

Weight x x x x

x Wheelbase x length/wheelbase

Horsepower x x x

x x Length

length/wheelbase x

x x

V8

x x

x Hardtop

x x

x

Transmission

x X Comb.

Power brakes

x Comb. Comb. x

Power steering

x Comb. Comb.

Compact

x x x

Over4Gears

x Luxury

x

PassengerArea

x Efficiency

x

Make

Indicator variables


4. Product Cycle

We refer to regular patterns in product entry and exit and their effects on price and quantity measurement as “product cycles.” For cars, we focus on two elements of the product life cycle: price declines over a single product iteration driven by intertemporal price discrimination and the tendency for price change to be associated with model updates. Both of these elements of the vehicle product cycle lead to potential bias in estimating price change. Taking a simple matched model approach with product entry and exit through overlap would result in persistent downward index movement since price discrimination leads to a strong tendency for price discounts over a single iteration. Similarly, if sellers update their pricing strategies with new product entry, a matched model index with overlap would not reflect the change between pricing regimes. While these product cycle effects pertain to the new vehicle industry, other item categories also exhibit product cycle behavior that results in similar measurement issues.

Aizcorbe, et al. (2010) and Williams and Sager (2019) document evidence for intertemporal price discrimination related to consumer heterogeneity over the product cycle. Chained price comparisons that reflect the price change across variants fail to offset product life cycle effects. Williams and Sager (2019) found multilateral indexes without linking across product versions failed to counter downward drift and proposed a year-over-year, model-on-model measurement for the trend price in order to avoid the effects of price discrimination and account for price change with product updates.

Reinsdorf, et al. (1996) noted that sellers often introduced prices alongside new models. If indexes only show price change for the same version of an item (and overlap old and new products as they enter and exit the market), price change between regimes, which is the most important in the long run, will be omitted. In cases when sellers update their price strategy or schedules for inflation at the time of changing product offerings, omitting this price change will result in downward bias. This issue is addressed in traditional fixed sample surveys by showing price change between new and replacement items. Williams (2021) finds that the effects of item replacement and class-mean imputation, which is motivated by the need to capture and impute price change across product updates, are very large, larger than estimated quality bias in the index. Moreover, the need to correct for quality bias during these comparisons is ultimately motivated by the need to measure the price change across updates.

In a scanner data context, product matching and grouping have often been used. However, results can be sensitive to the procedures used to map products together. Moreover, the timing and other aspects of item replacement and dynamic weighting further complicate translating “item replacement” methods to scanner data.

Another approach is to use hedonic estimation. Looking at apparel data, Greenlees and McClelland (2010) found a pattern similar to the one we see in the vehicle market, where prices decline strongly within a product version. Their results varied greatly depending on the specific technique of hedonic index construction used. Multilateral methods did not address product cycle effects as “the relentless downward march of prices completely overwhelm the chain drift issue.” Below we investigate various approaches to hedonic index construction and how they relate to product cycles.


5. Hedonic Methods

Hedonic methods predict a product’s price as a function of its attributes. We can revisit the early model in Court to serve as an example.

\[ \ln(\text{Price}) = \alpha + \beta_1 \times \text{Wheelbase} + \beta_2 \times \text{Weight} + \beta_3 \times \text{Horsepower} \]

Using linear regression, Court estimated the coefficient values (for wheelbase in inches, weight in hundredweight) in a joint time period regression for 1925 to 1930:

\[ \ln(\text{Price}) = 4.1256 + 0.0161 \times \text{Wheelbase} + 0.0461 \times \text{Weight} + (-0.0003) \times \text{Horsepower} \]

To estimate the price of a Model T in 1925 we can enter in the specification values for the Model T:2

\[ \ln(\text{Price}) = 4.1256 + 0.0161 \times 100 + 0.0461 \times 12 + (-0.0003) \times 20 \]

This estimates the price of a Model T as $535.29 in 1925.

We can perform the same exercise using a model estimated on modern data and estimate what a new Model T would cost in 2019 as

\[ \ln(\text{Price}) = 9.9 + (-0.0149) \times 100 + 0.0004 \times 1200 + 0.0029 \times 20 \]

an estimate of $7673.06.
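The two predictions above can be reproduced with a short sketch that evaluates the log-linear equation and exponentiates the result. The function name is illustrative only. The 1925 figure matches the text to the cent; the 2019 figure comes out close to, but not exactly at, the reported $7673.06, presumably because the printed coefficients are rounded (an assumption on our part).

```python
import math


def predicted_price(intercept, b_wheelbase, b_weight, b_horsepower,
                    wheelbase, weight, horsepower):
    """Evaluate the three-variable log-price equation and return a price level."""
    log_price = (intercept
                 + b_wheelbase * wheelbase
                 + b_weight * weight
                 + b_horsepower * horsepower)
    return math.exp(log_price)


# Court's 1925-1930 coefficients; weight is in hundredweight (12, not 1200).
price_1925 = predicted_price(4.1256, 0.0161, 0.0461, -0.0003,
                             wheelbase=100, weight=12, horsepower=20)

# Coefficients as printed for the modern model; weight is in pounds.
price_2019 = predicted_price(9.9, -0.0149, 0.0004, 0.0029,
                             wheelbase=100, weight=1200, horsepower=20)

print(f"1925 Model T estimate: ${price_1925:,.2f}")
print(f"2019 Model T estimate: ${price_2019:,.2f}")
```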

Hedonic methods are often suggested to address selection bias related to the immediate entry and exit of a product. While the literature generally expects hedonic indexes or hedonic adjustment to have a downward impact on indexes, BLS research has found that hedonic adjustment has small or even upward effects and shows that, were the BLS to omit the item replacement process entirely, indexes would generally be substantially lower. Previous research has found that these adjustments have little impact on the U.S. CPI (Brown and Stockburger, 2006; Johnson, et al., 2006; Williams, 2021). Williams (2021) finds that product cycle effects are much larger than estimates of quality bias. Here, we focus on hedonic imputation as a means of calculating long-run price change in order to address these product cycle effects.

Model

We continue in the vein of the models discussed above, focusing on vehicle performance attributes and basic elements of vehicle size. In addition to horsepower, we also have data on torque and on mileage broken out into city and highway estimates. Automotive engineers face basic tradeoffs in terms of power, weight, and efficiency. We create a highly interacted model to allow parameter estimates to account for the underlying relationships between these variables and better fit our data. We produce the Court and Ohta and Griliches models for their ease of interpretation and historical interest. The Ohta and Griliches specification is used as a benchmark for several comparisons in this paper.

2 Court originally estimated with weight as “hundred weight,” so we use 12 instead of 1200.


Hedonic Imputation Indexes

Hedonic imputation (HI) is often cited as the preferred approach to hedonic indexes (Diewert, 2019). In a hedonic imputation index, a hedonic regression is estimated on each period. The prices for the sets of goods in other periods are then estimated. Following the example above, taking a Model T imputed price from 1925, $535.29, and the imputed price of the Model T in 2019 gives us a Laspeyres price index increase of 1422% which is not far off from the overall CPI change of 1455% for the same period.

Silver and Heravi (2007) find them preferable to adjacent period time dummy hedonics (TDH). Unlike time product dummy (TPD) and TDH indexes, the dependent variable of price is not restricted to the natural log transformation. In TPD and TDH approaches, the dependent variable must be in the form of a natural logarithm to allow the time dummy to be interpreted as a proportional change in price.

Here we focus on full-imputation HI indexes, where both omitted and observed prices are replaced with the predicted value produced by a hedonic regression for the corresponding period. Imputed “missing” observations consist of an imputed price with a zero-value quantity and expenditure weight.

In our approach, all observations with the same set of values for a given specification are grouped together into one unit. For a detailed specification, this is equivalent or nearly equivalent to defining a unit by product identifier. In less detailed specifications, such as the Court model, a unit consists of transactions from multiple product identifiers.
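A full-imputation Laspeyres index of the kind described above can be sketched as follows. This is a simplified illustration, not the specification used in the paper: it assumes a single characteristic, an unweighted log-linear regression per period, and synthetic data in which no unit is sold in both periods. All names and values are hypothetical.

```python
import math


def fit_log_linear(characteristics, prices):
    """OLS of ln(price) on one characteristic; returns (intercept, slope)."""
    log_prices = [math.log(p) for p in prices]
    n = len(characteristics)
    mean_x = sum(characteristics) / n
    mean_y = sum(log_prices) / n
    sxx = sum((x - mean_x) ** 2 for x in characteristics)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(characteristics, log_prices))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, characteristic):
    """Predicted price level from a fitted (intercept, slope) pair."""
    intercept, slope = model
    return math.exp(intercept + slope * characteristic)


def hi_laspeyres(base_units, comparison_units):
    """Full-imputation Laspeyres index using base-period quantities.

    Each unit is a (characteristic, price, quantity) tuple. Both the
    base-period and comparison-period prices of the base-period units are
    replaced with predictions from that period's own regression.
    """
    model_0 = fit_log_linear([u[0] for u in base_units], [u[1] for u in base_units])
    model_1 = fit_log_linear([u[0] for u in comparison_units],
                             [u[1] for u in comparison_units])
    numerator = sum(q * predict(model_1, x) for x, _, q in base_units)
    denominator = sum(q * predict(model_0, x) for x, _, q in base_units)
    return numerator / denominator


# Synthetic data: prices follow exp(9 + 0.004 * horsepower) in the base period
# and are 5% higher in the comparison period; the two product sets do not overlap.
base = [(hp, math.exp(9 + 0.004 * hp), qty)
        for hp, qty in [(100, 30), (150, 20), (200, 10)]]
comparison = [(hp, 1.05 * math.exp(9 + 0.004 * hp), qty)
              for hp, qty in [(120, 25), (170, 25), (220, 10)]]

index = hi_laspeyres(base, comparison)
print(f"Full-imputation Laspeyres index: {index:.4f}")
```

Because no unit appears in both periods, a matched-model comparison has nothing to match here, while the imputation approach still recovers the 5% price change built into the synthetic data.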

Time Dummy Hedonic

Time dummy hedonic regressions constrain coefficients to have the same value over time. If the underlying parameter shifts between periods, the residual will be correlated with time period. The time dummy variable will then capture the difference.

In a time dummy model estimated on a pooled dataset, the data used to estimate the coefficients are pooled over a long time period. Another approach is to use an adjacent period TDH, where a series of regressions is estimated on data pairs of adjacent periods with a dummy variable indicating the later period. These time dummy variables can be accumulated into a chained multiperiod index. The adjacent period index allows the

The time dummy hedonic equation is

\[ \ln p_{it} = \alpha + \sum_{t=1}^{T} \delta_t D_t + \sum_{k=1}^{K} \beta_k z_{ik} \]
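Because the dependent variable is the log of price, the exponent of a time dummy coefficient can be read as a price relative. The accumulation of adjacent-period time dummies into a chained index can be sketched as below. The coefficient values are hypothetical, and the sketch ignores any small-sample bias adjustment to the exponentiated coefficients.

```python
import math


def chain_time_dummies(adjacent_deltas, base_level=100.0):
    """Accumulate adjacent-period time dummy coefficients into index levels.

    adjacent_deltas[j] is the estimated dummy for the later period in the
    regression on periods j and j + 1.
    """
    levels = [base_level]
    for delta in adjacent_deltas:
        # exp(delta) is the estimated price relative between the two periods.
        levels.append(levels[-1] * math.exp(delta))
    return levels


# Hypothetical estimates from three adjacent-period regressions.
deltas = [0.010, -0.005, 0.020]
levels = chain_time_dummies(deltas)
print([round(level, 2) for level in levels])
```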


Time-Product Dummy

The time-product dummy (TPD) method assigns a “dummy” or “indicator” variable to each “product” or model, where each product is identified by a model number or a particular set of features.

Following the representation in de Haan et al. (2021), the time-product-dummy equation is:

\[ \ln p_{it} = \alpha + \sum_{t=1}^{T} \delta_t D_t + \sum_{i} \gamma_i D_i \]


The TPD can accommodate additional fixed effects specific to a product. Any variables omitted from a hedonic specification would contribute to a product-specific fixed effect. TPD could also accommodate differences in coefficient values (e.g., differences in how a given feature contributes to the price of a car versus a truck). The TPD is the equivalent of a flexible, data-driven hedonic; Krsinich (2016) describes the TPD as a fully interacted TDH. However, it fails in the case of hedonic adjustment’s reason for being, product entry and exit. The time-product dummy approach approximates a matched model index and only trivially includes “unmatched” observations. The TPD produces a similar index to the geometric matched model (Aizcorbe, 2014).

For a “good” hedonic specification, we would expect that the fixed effects for a product i would be approximately equal to the coefficient effects from the hedonic model. As de Haan et al. (2021) point out, the TPD is a special case of the TDH where:

\[ \gamma_i = \sum_{k=1}^{K} \beta_k z_{ik} \]

Given the close relationship between a TPD and matched model, and a TPD as a “fully” interacted TDH, we should be wary that a TDH is susceptible to the same issues we see in the matched model.

Multilaterals with Hedonic Imputation

Ivancic, Diewert, and Fox (2011) introduced the GEKS formula and, more generally, sparked interest in multilateral approaches to address chain drift in price indexes. Chain drift can broadly be defined as divergence between the chained and fixed-base versions of a price index. The literature on chain drift focuses on the “stock” economic explanation for chain drift.3 Here, consumers buy a product at price p0 in quantity q0. The product goes on sale: price decreases to p1 and quantity increases dramatically to q1. Consumers stock up on the product at the sale price, p1, and satiate their demand over a longer time horizon than the measurement period. As such, even though the price returns to p0 in period 2 (p2=p0), quantity is significantly lower than the original amount demanded at the same price (q2<q0). Many price indexes, including the generally preferred superlative indexes, will show this as a permanent price decrease since the weight on the price increase is not symmetric with the weight on the price decrease. (Other indexes, such as the Jevons, an unweighted geometric index, would not show a permanent decrease.) Other factors may lead to a divergence between chained and fixed-base index results, especially when product turnover requires methods for product matching or grouping.
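The three-period sale example described above can be worked through numerically. The sketch below uses hypothetical prices and quantities for two goods, with the first good going on sale in period 1, and compares a chained Tornqvist index (a superlative index) with its direct counterpart and with a chained Jevons index.

```python
import math


def tornqvist(p0, q0, p1, q1):
    """Bilateral Tornqvist index between two periods."""
    spend0 = [p * q for p, q in zip(p0, q0)]
    spend1 = [p * q for p, q in zip(p1, q1)]
    shares0 = [s / sum(spend0) for s in spend0]
    shares1 = [s / sum(spend1) for s in spend1]
    log_index = sum(0.5 * (s0 + s1) * math.log(b / a)
                    for s0, s1, a, b in zip(shares0, shares1, p0, p1))
    return math.exp(log_index)


def jevons(p0, p1):
    """Unweighted geometric mean of price relatives."""
    return math.exp(sum(math.log(b / a) for a, b in zip(p0, p1)) / len(p0))


# Hypothetical data: good 1 is discounted in period 1 and consumers stock up,
# then its price returns to the original level with depressed demand.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [50, 10], [2, 10]]

chained_tornqvist = (tornqvist(prices[0], quantities[0], prices[1], quantities[1])
                     * tornqvist(prices[1], quantities[1], prices[2], quantities[2]))
direct_tornqvist = tornqvist(prices[0], quantities[0], prices[2], quantities[2])
chained_jevons = jevons(prices[0], prices[1]) * jevons(prices[1], prices[2])

print(f"Chained Tornqvist: {chained_tornqvist:.4f}")
print(f"Direct Tornqvist:  {direct_tornqvist:.4f}")
print(f"Chained Jevons:    {chained_jevons:.4f}")
```

All prices in period 2 equal those in period 0, so the direct index and the chained Jevons return to 1, while the chained Tornqvist stays below 1 because the sale period carries a much larger expenditure share on the way down than on the way back up.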

These methods have been combined with hedonic imputation in research beginning with de Haan and Krsinich (2014). De Haan and Daalmans (2019) discuss “single imputation,” where only missing prices are imputed, and “double imputation,” where both a missing price and the observed price that corresponds to it in a price relative are imputed. They note that double imputation may mitigate the effects of omitted variable bias.

Similarity Linking with Hedonic Imputation

Similarity linking has recently gained attention as a means of addressing chain drift in price indexes. Like multilateral methods, similarity linking first arose in the context of spatial price measurement but has been translated to intertemporal price indexes. As noted in Modernizing the Consumer Price Index for the 21st Century (National Academies, 2022), similarity linking has two advantages over multilateral indexes: first, the indexes satisfy the multiperiod identity test, and, second, they are fully transitive, unlike rolling window extensions of multilateral indexes. The National Academies’ report also suggests that hedonic imputation could be combined with similarity linking, and we explore that recommendation.

3 See Diewert (2021).

Similarity linking methods create chained price indexes where each period’s price relative is a bilateral comparison between the given period and the prior period determined to be most “similar.” The intuition is that price comparisons between periods with similar consumption patterns and weight distributions will reduce drift. The question arises of how to quantify “similarity.”

Proposed methods include using the dissimilarity in predicted product shares between periods and the relative dispersion of the Laspeyres and Fisher indexes.

We introduce a method where the period-specific hedonic regressions themselves are used to determine the similarity between periods, using the test for regression model similarity proposed in Chow (1960). To determine the most similar link for a month t, we run Chow tests between t and each preceding period and take the period with the minimum Chow statistic as the link. Following the same process as other similarity linking procedures, the index level for t, $I_t$, is calculated based on the bilateral price index, P, and the most similar period to t, $t_{min}$:

$$I_t = I_{t_{min}} \times P(H_t, q_t, H_{t_{min}}, q_{t_{min}})$$

The Chow statistic measures how well a model estimated on a combination of two samples compares with models fitted on the samples individually. The Chow test consists of taking two sets of data—in our case, time period t and t-a—and estimating three regressions: one for each period and one where the data is combined into a single pool. The Chow statistic is then calculated based on the sum of squared errors from each regression, SSE, number of observations from each sample, N, and number of parameter estimates, k. These values produce the F-distributed Chow statistic:

$$F = \frac{\left(SSE_{comb} - (SSE_t + SSE_{t-a})\right)/k}{(SSE_t + SSE_{t-a})/(N_t + N_{t-a} - 2k)}$$

We modify the typical Chow test to include a dummy variable for time in the combined regression, which, for t and t-1, would be equivalent to an adjacent period time dummy regression (this allows time periods to match based on similar coefficients even with aggregate price change). The time period with the lowest Chow statistic, $t_{min}$, is determined to be the most similar regression model to t and is selected as the link.
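The Chow-based link selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code behind the results: it uses ordinary least squares via NumPy, assumes each design matrix already contains an intercept column, and the function names and the `data` layout are hypothetical. The divisor k follows the formula as printed; because k is the same for every comparison, it does not affect which period has the minimum statistic.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def chow_statistic(X_t, y_t, X_s, y_s, time_dummy=True):
    """Chow statistic comparing the regressions for periods t and s.

    X_t and X_s must share the same k columns (intercept included).
    With time_dummy=True the pooled regression gets a period dummy,
    so a pure shift in the price level does not count as dissimilarity.
    """
    k = X_t.shape[1]
    sse_t = sse(X_t, y_t)
    sse_s = sse(X_s, y_s)
    X_pool = np.vstack([X_t, X_s])
    y_pool = np.concatenate([y_t, y_s])
    if time_dummy:
        dummy = np.concatenate([np.ones(len(y_t)), np.zeros(len(y_s))])
        X_pool = np.column_stack([X_pool, dummy])
    sse_pool = sse(X_pool, y_pool)
    n = len(y_t) + len(y_s)
    separate = sse_t + sse_s
    return ((sse_pool - separate) / k) / (separate / (n - 2 * k))

def most_similar_period(data, t):
    """Return the earlier period whose regression is closest to period t.

    data maps a period number to a (X, y) pair.
    """
    X_t, y_t = data[t]
    earlier = [s for s in data if s < t]
    return min(earlier, key=lambda s: chow_statistic(X_t, y_t, *data[s]))
```

A period whose coefficients match those of t but whose overall price level differs will still be selected as the link when the time dummy is included, which is the behavior the modification is meant to achieve.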

In addition to numerical advantages, similarity linking has significantly lower computational requirements than multilateral indexes. Each new period of data is directly compared to each preceding period once and then an index is calculated. Given w periods in an index window, updating a similarity index requires w calculations: w-1 similarity comparisons and one index calculation. This is a substantial reduction from GEKS-type indexes, which require index comparisons on the order of w^2 for a w-length window.
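The update rule can be sketched in a few lines. This is a generic illustration, not the paper's implementation: `dissimilarity` and `bilateral` are hypothetical callbacks standing in for any similarity measure (such as a Chow statistic) and any bilateral index formula.

```python
def similarity_linked_index(n_periods, dissimilarity, bilateral, base=100.0):
    """Build a similarity-linked index over periods 0..n_periods-1.

    dissimilarity(t, s) returns a non-negative score for periods t and s.
    bilateral(s, t) returns the price index for period t relative to s.
    Returns the index levels and the link chosen for each period.
    """
    index = [base]
    links = [None]
    for t in range(1, n_periods):
        # One similarity comparison against each preceding period.
        t_min = min(range(t), key=lambda s: dissimilarity(t, s))
        # One bilateral index calculation against the chosen link.
        index.append(index[t_min] * bilateral(t_min, t))
        links.append(t_min)
    return index, links


# Toy usage with a hypothetical price level that dips and recovers.
level = [1.0, 0.5, 1.0]
index, links = similarity_linked_index(
    len(level),
    dissimilarity=lambda t, s: abs(level[t] - level[s]),
    bilateral=lambda s, t: level[t] / level[s],
)
```

In the toy usage, period 2 links directly to period 0 rather than to period 1, so the index returns to its base level instead of chaining through the sale period.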


6. Matched Model Indexes and Product Definition

The “matched model” index is the standard approach to measuring price change. Individual “models” are identified by either an identifier (for example, UPC or GTIN) or a set of specification values. The price for the same good is then compared from period to period. In a traditional, fixed sample, survey-based price index, product cycle effects are dealt with by comparing the price of a discontinued product with the price of a similar successor product. When a replacement product is not considered comparable, the difference between the two items may be imputed. The item replacement and related imputation process can have extremely large effects on a price index (see Williams, 2021).

In a scanner data context, the direct relationship between an exiting and an entering good does not exist as it does in a fixed sample survey. Products may be allowed to come and go from calculations as they enter and exit the market or remain on the market without any recorded sales. When a matched model index allows goods to fluidly enter and exit calculations, the “maximum overlap” approach to product turnover is being used. However, this omits price change that may be introduced with model updates and allows for bias from product cycle effects. When working with scanner data, some researchers use the concept of a “product relaunch” to link old and new products together. Similarly, “product grouping” can be used so that multiple products are grouped together and treated as one.

Here we investigate product definition in terms of aggregating transactions to a given model specification level. For example, following the Court specification, all observations that are 180 inches long, 160 horsepower, and 2000 pounds would be aggregated together to form a mean price and total quantity used in regression and matched model price index estimates. Once again, we use the specifications from Court, Ohta and Griliches, and our own specification. For a given specification, transactions are aggregated into an arithmetic mean price across all transactions meeting a given combination of variables, with the total number of transactions as the quantity (and expenditure implied by the product of the mean price and total quantity). These indexes constitute a matched model index (without imputation) where a unique set of variable values for a given model specification constitutes a product definition.
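The aggregation step can be sketched as below. The record layout and field names (`price`, `length`, `horsepower`, `weight`) are assumptions made for this example and are not the variable names in the source data.

```python
from collections import defaultdict

def aggregate_by_specification(transactions, spec_keys):
    """Group transactions that share the same specification values.

    transactions is an iterable of dicts with a "price" field plus the
    specification fields named in spec_keys. Each unique combination of
    specification values is treated as one product.
    """
    totals = defaultdict(lambda: [0.0, 0])  # product -> [sum of prices, count]
    for row in transactions:
        product = tuple(row[key] for key in spec_keys)
        totals[product][0] += row["price"]
        totals[product][1] += 1
    return {
        product: {
            "mean_price": price_sum / count,  # arithmetic mean price
            "quantity": count,                # number of transactions
            "expenditure": price_sum,         # mean price times quantity
        }
        for product, (price_sum, count) in totals.items()
    }


# Hypothetical transactions for one month.
sales = [
    {"price": 20000.0, "length": 180, "horsepower": 160, "weight": 2000},
    {"price": 22000.0, "length": 180, "horsepower": 160, "weight": 2000},
    {"price": 31000.0, "length": 195, "horsepower": 250, "weight": 3400},
]
products = aggregate_by_specification(sales, ["length", "horsepower", "weight"])
```

A coarser specification (fewer keys) merges more transactions into each product, which increases matching across model years at the cost of mixing vehicles that differ in unlisted features.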

7. Results

We reproduced the specifications used in Court (1939) and Ohta and Griliches (1976) for various forms of hedonic indexes, namely pooled, adjacent period, and single period. Regression results for the pooled versions of the Court and Ohta and Griliches specifications are presented below in tables 2 and 3, respectively. The dummy variables for month have been excluded from both, and nameplate has been excluded in table 3. The pooled result for our interacted specification is in the appendix. Results were similar to adjacent and single period coefficient estimates. Variables were generally significant and had the expected sign, with the exception of “length” and “wheelbase,” which had negative coefficients (except for wheelbase in the interacted model).


Table 2: Regression results for the pooled Court model

               Estimate    Std. Error   t value     Pr(>|t|)
(Intercept)    1.06E+01    1.57E-02     677.231     < 2e-16 ***
Wheelbase      -2.84E-02   1.66E-04     -171.311    < 2e-16 ***
Weight         6.31E-04    2.35E-06     268.856     < 2e-16 ***
Horsepower     2.28E-03    7.41E-06     308.125     < 2e-16 ***
MONTHS         -           -            -           -

Multiple R-squared: 0.6887
Adjusted R-squared: 0.6884

Table 3: Regression results for the pooled Ohta & Griliches model

               Estimate    Std. Error   t value     Pr(>|t|)
(Intercept)    9.75E+00    1.31E-02     742.181     < 2e-16 ***
Length         -5.54E-03   6.38E-05     -86.912     < 2e-16 ***
Weight         3.50E-04    1.98E-06     176.808     < 2e-16 ***
Horsepower     1.12E-03    6.37E-06     176.371     < 2e-16 ***
Cylinders 4    6.49E-02    8.02E-03     8.096       5.69E-16 ***
Cylinders 5    -2.46E-02   8.63E-03     -2.854      0.004313 **
Cylinders 6    1.73E-01    8.20E-03     21.034      < 2e-16 ***
Cylinders 8    4.29E-01    8.47E-03     50.681      < 2e-16 ***
Cylinders 10   9.59E-01    1.05E-02     91.744      < 2e-16 ***
Cylinders 12   7.56E-01    1.28E-02     58.97       < 2e-16 ***
NAMEPLATE      -           -            -           -
MONTHS         -           -            -           -

Multiple R-squared: 0.8717
Adjusted R-squared: 0.8715

Matched Model Index Results

When using basic matched model methods, indexes drifted downward substantially. The results align with expectations from basic cost-of-living index theory: the Laspeyres and Paasche form upper and lower bounds (respectively), and the Törnqvist and Fisher are essentially equivalent. Applying multilateral methods does not address drift. This reinforces the finding from Williams and Sager (2019) that the index declines result from product cycle pricing patterns, not weight-driven “chain drift.” Interestingly, the multilateral indexes with longer window lengths showed more of a decline than those with shorter window lengths, the opposite of what we see in the hedonic imputation indexes. The final period chained bilateral Törnqvist and Fisher indexes are in between the 13- and 24-month multilateral indexes, suggesting little overall effect. Since no matches are being made across different versions of a product, longer windows for multilateral indexes will capture more sales for very old cars. This suggests extending the window will only worsen “drift.” Extending the window length to 36 months resulted in bilateral comparisons with no matched observations.
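For reference, the four bilateral formulas named above can be computed on the maximum-overlap set of products as in the following sketch. The product labels, prices, and quantities are hypothetical, and the function name is illustrative.

```python
import math

def bilateral_indexes(p0, q0, p1, q1):
    """Matched model indexes between two periods.

    p0, q0, p1, q1 map product identifiers to prices and quantities.
    Only products sold in both periods (maximum overlap) are used, so
    entering and exiting products drop out of the comparison.
    """
    matched = sorted(set(p0) & set(p1))
    v00 = sum(p0[i] * q0[i] for i in matched)  # base prices, base quantities
    v10 = sum(p1[i] * q0[i] for i in matched)  # current prices, base quantities
    v01 = sum(p0[i] * q1[i] for i in matched)  # base prices, current quantities
    v11 = sum(p1[i] * q1[i] for i in matched)  # current prices, current quantities
    laspeyres = v10 / v00
    paasche = v11 / v01
    fisher = math.sqrt(laspeyres * paasche)
    log_tornqvist = sum(
        0.5 * (p0[i] * q0[i] / v00 + p1[i] * q1[i] / v11) * math.log(p1[i] / p0[i])
        for i in matched
    )
    return {
        "laspeyres": laspeyres,
        "paasche": paasche,
        "fisher": fisher,
        "tornqvist": math.exp(log_tornqvist),
    }


# Hypothetical example: "old" exits and "new" enters, so only a and b match.
result = bilateral_indexes(
    p0={"a": 10.0, "b": 20.0, "old": 5.0},
    q0={"a": 5.0, "b": 5.0, "old": 1.0},
    p1={"a": 8.0, "b": 22.0, "new": 30.0},
    q1={"a": 8.0, "b": 4.0, "new": 2.0},
)
```

In this example the Paasche lies below the Laspeyres, and the Fisher and Törnqvist fall between them and within a tenth of a percentage point of each other. The entering and exiting products contribute nothing, which is the source of the product cycle problem discussed in this section.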


The prevailing expectation may be that hedonic estimates would show lower indexes than conventional matched model indexes. To the contrary, we see that, of all the methodologies, matched model indexes produce the largest declines. This is consistent with similar research including Greenlees and McClelland (2010) and de Haan and Daalmans (2019). Matched model, maximum overlap price indexes show price change only for the same item, so constant quality is maintained. These indexes also allow products to enter and exit calculations. They do not exhibit “quality bias” in the sense that price comparisons are made between goods of differing quality, which is often the motivation behind applying hedonic methods. However, the indexes are still subject to selection bias and product life cycle effects.

[Figure: Matched Model (SquishVIN) Price Indexes. Vertical axis: Index (January 2007 = 100); horizontal axis: month. Series: Tornqvist, Laspeyres, Paasche, Fisher, GEKSmean13, GEKSmean24, CCDImean13, CCDImean24.]

Product Grouping and Multilateral Indexes

As an alternative to hedonic imputation, cross-version price change can be measured by aggregating products with the same set of specification values and treating them as one product as new iterations are introduced. Using the Court and Ohta and Griliches specifications to group products leads to indexes that decline much less than the matched model index based on a product identifier. Moreover, the application of multilateral formulas reduces the declines further. These indexes still do not represent plausible estimates for price change. For the decline of one product to be offset, it must have another exact match in terms of the specified features and continue to sell in the market. If an exact specification match does not exist, product cycle effects will bias the index.

Product matching is often viewed as incidental to price index methods; however, our results show that making price comparisons across broader time horizons is essential for accurately measuring long-run price change. In other words, accumulations of short-term, same-version price change do not result in accurate price measures, even when multilateral and similarity linking methods are applied. Hedonic methods or product grouping and matching methods are needed.

It is important to consider that many of the issues related to chain drift may arise as secondary effects of the method of product matching, grouping, or hedonic estimation applied to the data, rather than as a feature of the data in terms of a matched model.

Pooled and Adjacent Period Time Dummy Hedonic

Pooled regression TDH indexes were consistently higher than the corresponding adjacent period indexes. Pooled regressions constrain coefficients to the same value over the entire period. This constrains the valuation of different features to remain the same over the entire period, which does not accommodate changes in consumer tastes. Pooled TDH indexes are also subject to revision, as previous period values are reestimated with each additional month of data.
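The difference between the two estimators can be sketched with NumPy. This is an unweighted OLS illustration under assumptions (log price regressed on characteristics plus time dummies, hypothetical function names), not the specification or weighting used for the results reported here.

```python
import numpy as np

def time_dummy_index(log_price, Z, period, base=100.0):
    """Pooled time dummy hedonic index.

    One regression over all periods, so characteristic coefficients are
    constrained to be constant. Z is a 2-D array of characteristics and
    period is a 1-D array of period labels. The index is base * exp(delta_t).
    """
    periods = np.unique(period)
    dummies = (period[:, None] == periods[None, 1:]).astype(float)
    X = np.column_stack([np.ones(len(log_price)), Z, dummies])
    coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
    deltas = np.concatenate([[0.0], coef[1 + Z.shape[1]:]])
    return base * np.exp(deltas)

def adjacent_period_index(log_price, Z, period, base=100.0):
    """Chain of two-period time dummy regressions.

    Coefficients are re-estimated for every pair of consecutive periods,
    and earlier index levels are never revised by later data.
    """
    periods = np.unique(period)
    levels = [base]
    for prev, cur in zip(periods[:-1], periods[1:]):
        mask = (period == prev) | (period == cur)
        pair = time_dummy_index(log_price[mask], Z[mask], period[mask], base=1.0)
        levels.append(levels[-1] * pair[1])
    return np.array(levels)
```

When the true coefficients are constant over time the two functions return the same index; they diverge when the valuation of features changes, which is the situation described above.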

[Figure: Product Grouped Product Multilaterals. Horizontal axis: month. Series: OhtaGrilGrouped, OhtaGrilGroupedCCDI13, OhtaGrilGroupedCCDI24, CourtGrouped, CourtGroupCCDI13, CourtGroupCCDI24.]

Bilateral Hedonic Imputation and Time-Product Dummy

Comparing these same adjacent period TDH indexes with their bilateral hedonic imputation counterparts shows little difference between the methods, with the exception of a period of divergence in the Court models. Both the Ohta and Griliches and interacted models were within a few percentage points of each other. The Court specification with hedonic imputation showed volatile behavior in 2015 that caused a divergence from its adjacent period counterpart. Shortly after, the Court hedonic imputation index appears to stabilize and run close to parallel with the adjacent period index.

While the adjacent period and hedonic imputation indexes appear plausible, there are still concerns that they may reflect product cycle bias.

Our results confirm the expectation that TPD and a geometric matched model index would perform similarly as the resulting indexes are extremely close.

[Figure: Pooled vs Adjacent Period TDH Indexes. Horizontal axis: month. Series: CourtPooled, CourtAdj, OktaGrilPooled, OktaGrilsAdj, InteractPool, InteractAdj.]

Hedonic Imputation with Multilateral Methods

The 13-month extension window had a downward effect compared to a bilateral hedonic imputation index (constructed on single period index imputation). The shorter window would reduce the occurrence of longer-run relatives compared to indexes with longer extensions, but it is unclear why it would lower an index below the bilateral hedonic imputation index.

Longer window multilaterals decline less than those with shorter windows. In the matched model indexes above this relationship was inverted, with the 24-month window multilateral falling more than the 13-month. This suggests that the positive effects of extending the window are not related to addressing weight fluctuations that lead to drift but, rather, to increasing the weight placed on longer-term, hedonically imputed price change. A fixed base hedonic imputation index should not be sensitive to drift or product cycle effects, but the index will lose representativeness over time as the base period set of products becomes less relevant. We construct a fixed base hedonic imputation Törnqvist index which, over a 12-year span, is about 1.5% higher than the hedonic imputation CCDI (Caves-Christensen-Diewert-Inklaar index, a GEKS-type multilateral index based on Törnqvist bilateral comparisons) with a 36-month window.
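A GEKS-type index built from Törnqvist bilaterals can be sketched as follows. This is a single-window illustration with hypothetical data and function names; it omits the rolling windows, extension (splicing) methods, and hedonic imputation used for the results here, and it assumes every pair of periods shares at least one matched product.

```python
import math

def tornqvist(period_a, period_b):
    """Bilateral Tornqvist index of period_b relative to period_a.

    Each period maps a product identifier to a (price, quantity) pair.
    Only products present in both periods are compared.
    """
    matched = set(period_a) & set(period_b)
    va = sum(period_a[i][0] * period_a[i][1] for i in matched)
    vb = sum(period_b[i][0] * period_b[i][1] for i in matched)
    log_index = sum(
        0.5 * (period_a[i][0] * period_a[i][1] / va
               + period_b[i][0] * period_b[i][1] / vb)
        * math.log(period_b[i][0] / period_a[i][0])
        for i in matched
    )
    return math.exp(log_index)

def ccdi(window, base, t):
    """CCDI index of period t relative to period base within one window.

    Geometric mean, over every link period l in the window, of the
    indirect comparison base -> l -> t.
    """
    log_sum = sum(
        math.log(tornqvist(window[base], window[l]) * tornqvist(window[l], window[t]))
        for l in range(len(window))
    )
    return math.exp(log_sum / len(window))


# Hypothetical three-period window: product A goes on sale in period 1.
window = [
    {"A": (1.0, 10.0), "B": (1.0, 10.0)},
    {"A": (0.5, 100.0), "B": (1.0, 10.0)},
    {"A": (1.0, 2.0), "B": (1.0, 10.0)},
]
direct = ccdi(window, 0, 2)
chained = ccdi(window, 0, 1) * ccdi(window, 1, 2)
```

Within a fixed window the direct and chained CCDI comparisons agree, which is the transitivity property. Each window of length w requires bilateral comparisons between all pairs of periods, so longer windows put more weight on long-run relatives at a higher computational cost.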

To avoid product cycle effects in cars, an index must reflect price change across different iterations of goods (model years). A fully transitive index is not dependent on intervening periods, so within-model-year price change would not alter the long-run measurement of the index. However, full period multilaterals are difficult to calculate because of product turnover and computational demand. Moreover, they lead to revisions of prior months, which are not acceptable for the publication of many official statistics. Extension methods can lead to indexes that are nearly transitive, but longer windows are preferred to better capture long-run price comparisons.

[Figure: Adjacent TDH, Hedonic Imputation, TPD. Horizontal axis: month. Series: CourtAdj, CourtHITorn, OhtaGrilsAdj, OhtaGrilHItorn, InteractAdj, InterHItorn, MMTornqvist, TPDummyWLS.]

Similarity Linking with Hedonic Imputation

The three methods of similarity linking combined with hedonic imputation all produced similar results. The Chow similarity and predicted share methods were highly correlated. The similarity link indexes without hedonic imputation (matched model indexes with similarity linking) showed large declines. The case mirrors the results of applying multilateral methods to matched model indexes: without product matching or hedonic imputation to offset product cycle effects and capture price change with model updates, indexes will decline to implausible levels.

Our indexes using similarity linking showed results comparable to a CCDI index with a 36-month extension window. However, the most similar month was typically the preceding month, with 105 of the 143 periods tested selecting the prior month as the most similar. Unlike GEKS-type multilaterals, similarity linking does not necessarily force a comparison over a longer time horizon. If changes are incremental or if the most “similar” link remains the previous month even after a pricing regime changes, similarity linking may not address aspects of the product cycle.

[Figure: Ohta and Griliches HI with CCDI. Horizontal axis: month. Series: OhtaGrilHItorn, HIccdiOhtaGril13, HIccdiOhtaGril24, HIccdiOhtaGril36, HiFixOhtaGril.]

8. Conclusion

Hedonic estimates have often been used to impute the prices for entering and exiting products. Hedonic estimates may also be used to estimate long-run price relatives, which allow better measurement of price change across product cycles. Product cycle effects have generally been neglected and the focus has been on “quality bias.” In matched model indexes, quality bias emerges as a secondary effect from the use of product matching as a means of addressing product cycle issues including price change with model updates and price discrimination.

Previous research has also found that multilateral indexes with hedonic imputation tend to fall less when a longer extension window is used. We find evidence that this is mostly due to the additional influence long-term, cross-product cycle relatives have in multilaterals with longer extended windows.

Estimates from hedonic imputation can be used with similarity linking methods. Like other multilateral methods, similarity linking without product replacement or hedonic imputation does not remedy product cycle effects. Using regression model similarity as a method for linking produces similar results to other multilateral methods but with greater simplicity and less computational demand.

[Figure: Similarity Linking. Vertical axis: Index (January 2007 = 100); horizontal axis: month. Series: ChowSimilarityOktaImp, PredShareImpOktaSimilar, PLSpreadOktaImpSIm.]

Works Cited

Aizcorbe, Ana M. 2014. A Practical Guide to Price Index and Hedonic Techniques. Oxford: Oxford University Press.

Aizcorbe, Ana, Benjamin Bridgman, and Jeremy Nalewaik. 2010. "Heterogeneous car buyers: A stylized fact." Economics Letters 109 (1): 50-53. doi:10.1016/j.econlet.2010.08.003.

Berry, Steven, James Levinsohn, and Ariel Pakes. 1995. "Automobile Prices in Market Equilibrium." Econometrica 63 (4): 841-890. doi:10.2307/2171802.

Brown, Craig, and Anya Stockburger. 2006. "Item Replacement and Quality Change in Apparel Price Indexes." Monthly Labor Review 129 (12): 35-45. https://www.bls.gov/opub/mlr/2006/12/art3full.pdf.

Chow, Gregory C. 1960. "Tests of Equality Between Sets of Coefficients in Two Linear Regressions." Econometrica 28 (3): 591-605. doi:10.2307/1910133.

Colwell, Peter F., and Gene Dilmore. 1999. "Who Was First? An Examination of an Early Hedonic Study." Land Economics 75 (4): 620-626. doi:10.2307/3147070.

Court, A.T. 1939. "Hedonic Price Indexes with Automotive Examples." General Motors Corporation, Automobile Demand.

Cowling, Keith, and John Cubbin. 1972. "Hedonic Price Indexes for United Kingdom Cars." Economic Journal 82 (327): 963-978.

de Haan, Jan, and Jacco Daalmans. 2019. "Scanner Data in the CPI: the imputation CCDI Index revisited." 16th meeting of the Ottawa Group.

de Haan, Jan, Rens Hendriks, and Michael Scholz. 2016. "A Comparison of Weighted Time-Product Dummy and Time Dummy Hedonic Indexes." Graz Economics Papers.

Diewert, Erwin. 2018. "Scanner Data, Elementary Price Indexes and the Chain Drift Problem." Vancouver School of Economics Discussion Paper 20-07.

Diewert, W. Erwin. 2019. "Quality Adjustment and Hedonics: A Unified Approach." Vancouver School of Economics.

Feenstra, Robert C. 1995. "Exact Hedonic Price Indexes." The Review of Economics and Statistics 77 (4): 634-653.

Goodman, Allen C. 1998. "Andrew Court and the Invention of Hedonic Price Analysis." Journal of Urban Economics 44 (2): 291-298. doi:10.1006/juec.1997.2071.

Greenlees, J., and R. McClelland. 2010. "Superlative and Regression-Based Consumer Price Indexes for Apparel Using U.S. Scanner Data." St. Gallen, Switzerland: Conference of the International Association for Research in Income and Wealth.

Greenlees, John S., and Robert McClelland. 2011. "Does Quality Adjustment Matter for Technologically Stable Products? An Application to the CPI for Food." The American Economic Review 101 (3): 200-205. https://www.jstor.org/stable/29783739.

Griliches, Zvi. 1961. "Hedonic Price Indexes for Automobiles: An Econometric Analysis of Quality Change." In The Price Statistics of the Federal Government, 173-96. NBER.

Ivancic, Lorraine, W. Erwin Diewert, and Kevin J. Fox. 2011. "Scanner data, time aggregation and the construction of price indexes." Journal of Econometrics 161 (1): 24-35. doi:10.1016/j.jeconom.2010.09.003.

Johnson, David S, Stephen B. Reed, and Kenneth J Stewart. 2006. "Price measurement in the United States: a decade after the Boskin Report." Monthly Labor Review 129 (5): 10-19. https://www.bls.gov/opub/mlr/2006/05/art2full.pdf.

Konny, Crystal G., Brendan K. Williams, and David M. Friedman. 2019. "Big Data in the U.S. Consumer Price Index: Experiences and Plans." In Big Data for Twenty-First Century Economic Statistics, edited by Katharine G. Abraham, Ron S. Jarmin, Brian Moyer and Matthew D. Shapiro. University of Chicago Press. http://www.nber.org/chapters/c14280.

Krsinich, Frances. 2014. "The FEWS index: Fixed effects with a window splice." Meeting of the Group of Experts on Consumer Price Indices. Geneva, Switzerland. https://unece.org/sites/default/files/datastore/fileadmin/DAM/stats/documents/ece/ces/ge.22/2014/New_Zealand_-_FEWS.pdf.

National Academies of Sciences, Engineering, and Medicine. 2022. Modernizing the Consumer Price Index for the 21st Century. Washington, DC: The National Academies Press.

Ohta, Makoto, and Zvi Griliches. 1976. "Automobile Prices Revisited: Extensions of the Hedonic Hypothesis." In Household Production and Consumption, edited by Nestor E. Terleckyj, 325-398. NBER.

Pakes, Ariel. 2003. "A Reconsideration of Hedonic Price Indexes with an Application to PC's." The American Economic Review 93 (5): 1578-1596. https://www.jstor.org/stable/3132143.

Reinsdorf, Marshall B., Paul Liegey, and Kenneth Stewart. 1996. "New Ways of Handling Quality Change in the U.S. Consumer Price Index." BLS Working Paper 276. https://www.bls.gov/osmr/research-papers/1996/pdf/ec960040.pdf.

Silver, Mick, and Saeed Heravi. 2007. "The Difference between Hedonic Imputation Indexes and Time Dummy Hedonic Indexes." Journal of Business & Economic Statistics 25 (2): 239-246. http://www.jstor.org/stable/27638928.

Triplett, Jack E. 1969. "Automobiles and Hedonic Quality Measurement." Journal of Political Economy 77 (3): 408-417. doi:10.1086/259524.

Triplett, Jack. 2006. Handbook on Hedonic Indexes and Quality Adjustments in Price Indexes. Paris: OECD Publishing.

Williams, Brendan. 2021. "Twenty-One Years of Adjustments for Quality Change in the US Consumer Price Index." Group of Experts on Consumer Price Indices. Online: UNECE. https://unece.org/sites/default/files/2021-05/Session_3_US-BLS_Paper_0.docx.

Williams, Brendan, and Erick Sager. 2019. "A New Vehicles Transaction Price Index: Offsetting the Effects of Price Discrimination and Product Cycle Bias with a Year-Over-Year Index." BLS Working Papers (Working Paper 514). https://www.bls.gov/osmr/pdf/ec190040.pdf.

Appendix

Table 4: Pooled Interacted Regression

                        Estimate   Std. Error   t value    Pr(>|t|)
(Intercept)             8.5590     0.0268       319.715    < 2e-16 ***
BASE..ins..             0.0058     0.0002       33.998     < 2e-16 ***
Length..ins..           -0.0002    0.0001       -2.464     0.013759 *
weight                  0.0003     0.0000       45.374     < 2e-16 ***
horsepower              0.0016     0.0001       28.577     < 2e-16 ***
AWDdummy                0.0806     0.0013       62.736     < 2e-16 ***
displacement            -0.2948    0.0044       -66.646    < 2e-16 ***
height                  -0.0155    0.0002       -67.134    < 2e-16 ***
MPGCity                 0.0216     0.0011       20.094     < 2e-16 ***
MPGHwy                  -0.0168    0.0011       -14.834    < 2e-16 ***
HybrDummy               0.4583     0.0179       25.607     < 2e-16 ***
torque                  0.0011     0.0000       121.504    < 2e-16 ***
I(horsepower/weight)    18.0400    0.1710       105.457    < 2e-16 ***
Make1                   -0.2496    0.0027       -91.931    < 2e-16 ***
Make2                   -0.1644    0.0470       -3.497     0.000471 ***
Make3                   -0.1594    0.0038       -41.437    < 2e-16 ***
Make4                   -0.1441    0.0027       -52.95     < 2e-16 ***
Make5                   -0.1405    0.0021       -67.696    < 2e-16 ***
Make6                   -0.1396    0.0055       -25.335    < 2e-16 ***
Make7                   -0.1328    0.0022       -59.761    < 2e-16 ***
Make8                   -0.1301    0.0039       -33.104    < 2e-16 ***
Make9                   -0.1283    0.0025       -50.53     < 2e-16 ***
Make10                  -0.1239    0.0022       -55.747    < 2e-16 ***
Make11                  -0.1148    0.0066       -17.365    < 2e-16 ***
Make12                  -0.0870    0.0032       -26.788    < 2e-16 ***
Make13                  -0.0521    0.0038       -13.8      < 2e-16 ***
Make14                  -0.0333    0.0021       -15.912    < 2e-16 ***
Make15                  -0.0215    0.0025       -8.516     < 2e-16 ***
Make16                  -0.0211    0.0021       -9.951     < 2e-16 ***
Make17                  -0.0194    0.0024       -8.062     7.56E-16 ***
Make18                  -0.0066    0.0029       -2.305     0.021175 *
Make19                  0.0632     0.0022       28.216     < 2e-16 ***
Make20                  0.1381     0.0059       23.425     < 2e-16 ***
Make21                  0.1722     0.0030       57.081     < 2e-16 ***
Make22                  0.1824     0.0039       47.164     < 2e-16 ***
Make23                  0.1907     0.0029       65.505     < 2e-16 ***
Make24                  0.2104     0.0124       16.971     < 2e-16 ***
Make25                  0.2244     0.0032       69.286     < 2e-16 ***
Make26                  0.2343     0.0112       20.872     < 2e-16 ***
Make27                  0.2479     0.0026       94.48      < 2e-16 ***
Make28                  0.2561     0.0043       60.248     < 2e-16 ***
Make29                  0.3073     0.0026       117.379    < 2e-16 ***
Make30                  0.3261     0.0030       108.382    < 2e-16 ***
Make31                  0.3711     0.0026       144.678    < 2e-16 ***
Make32                  0.4131     0.0033       125.356    < 2e-16 ***
Make33                  0.4226     0.0028       153.396    < 2e-16 ***
Make34                  0.5864     0.0278       21.129     < 2e-16 ***
Make35                  0.8271     0.0180       45.862     < 2e-16 ***
Make36                  0.8463     0.0030       279.909    < 2e-16 ***
cylinders3              -0.0625    0.0084       -7.429     1.10E-13 ***
cylinders4              -0.1437    0.0041       -35.445    < 2e-16 ***
cylinders5              -0.2308    0.0042       -54.936    < 2e-16 ***
cylinders6              -0.1201    0.0027       -44.113    < 2e-16 ***
cylinders10             0.2364     0.0056       42.428     < 2e-16 ***
cylinders12             0.2428     0.0080       30.451     < 2e-16 ***
BODYSTYLEconvertible    0.1467     0.0017       85.684     < 2e-16 ***
BODYSTYLEcoupe          0.0261     0.0014       19.003     < 2e-16 ***
BODYSTYLEhatchback      0.0209     0.0015       14.213     < 2e-16 ***
BODYSTYLEwagon          0.0823     0.0017       48.44      < 2e-16 ***
MPGCity:HybrDummy       0.0018     0.0004       4.759      1.95E-06 ***
MPGHwy:HybrDummy        -0.0102    0.0005       -19.042    < 2e-16 ***
HybrDummy:torque        0.0005     0.0000       15.657     < 2e-16 ***
displacement:MPGHwy     0.0207     0.0003       65.344     < 2e-16 ***
displacement:MPGCity    -0.0160    0.0004       -40.57     < 2e-16 ***
horsepower:MPGCity      0.0001     0.0000       14.697     < 2e-16 ***
horsepower:MPGHwy       -0.0002    0.0000       -68.501    < 2e-16 ***
weight:MPGHwy           0.00001    0.0000       13.83      < 2e-16 ***
weight:MPGCity          0.0000     0.0000       -0.219     0.827012

  • 2. Data
  • 3. Background on New Vehicle Hedonic Research
  • 4. Product Cycle
  • 5. Hedonic Methods
    • Model
    • Hedonic Imputation Indexes
    • Time Dummy Hedonic
    • Time-Product Dummy
    • Multilaterals with Hedonic Imputation
    • Similarity Linking with Hedonic Imputation
  • 6. Matched Model Indexes and Product Definition
  • 7. Results
    • Matched Model Index Results
    • Product Grouping and Multilateral Indexes
    • Pooled and Adjacent Period Time Dummy Hedonic
    • Bilateral Hedonic Imputation and Time-Product Dummy
    • Hedonic Imputation with Multilateral Methods
    • Similarity Linking with Hedonic Imputation
  • 8. Conclusion
  • Works Cited
  • Appendix

Hedonic price estimates for new vehicles: When do rotations lead to drift? United States


1 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Hedonic price estimates for new vehicles:

When do rotations lead to drift?

Brendan Williams Senior Economist

Consumer Price Index Division

Bureau of Labor Statistics

Prepared for Group of Experts 6 June 2023


Summary
  • New vehicles and other items subject to product cycle effects
  • Multilateral methods alone do not address product cycle
    – Price change must be measured across versions
    – Hedonic methods allow price measurement across product cycles
  • Hedonic imputation can be used with similarity linking
    – New link method based on the similarity of regression estimates based on a Chow test


Data

• J.D. Power: Transaction records of car sales
  – SquishVIN as product ID
  – Some information on features
• Wards: Specification information
  – Data on vehicle performance and other features
    – Horsepower, torque, fuel efficiency, vehicle size…
• Combine based on manufacturer, engine type, and string matching for model and trim
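The combination step can be sketched as follows. This is a minimal illustration, not the procedure actually applied to the J.D. Power and Wards data: the field names (`make`, `engine`, `model`, `trim`), the 0.8 similarity threshold, and the sample records are all hypothetical. The sketch requires an exact match on manufacturer and engine type, then keeps the specification record whose model and trim string is most similar.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case, turn punctuation into spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())


def best_spec_match(sale, specs, min_score=0.8):
    """Return the best-matching specification record for a sale, or None.

    Manufacturer and engine type must match exactly; model and trim are
    compared with a string-similarity ratio (assumed threshold: min_score).
    """
    target = normalize(sale["model"] + " " + sale["trim"])
    best, best_score = None, min_score
    for spec in specs:
        if spec["make"] != sale["make"] or spec["engine"] != sale["engine"]:
            continue
        candidate = normalize(spec["model"] + " " + spec["trim"])
        score = SequenceMatcher(None, target, candidate).ratio()
        if score >= best_score:
            best, best_score = spec, score
    return best


# Hypothetical records, invented for illustration only.
sale = {"make": "Acme", "engine": "V6", "model": "Roadster-200", "trim": "LX Sport"}
specs = [
    {"make": "Acme", "engine": "V6", "model": "Roadster 200", "trim": "LX sport", "horsepower": 290},
    {"make": "Acme", "engine": "V8", "model": "Roadster 200", "trim": "LX sport", "horsepower": 395},
    {"make": "Acme", "engine": "V6", "model": "Hauler", "trim": "Base", "horsepower": 240},
]
match = best_spec_match(sale, specs)
```

Requiring exact agreement on the structured fields before any fuzzy comparison keeps the string matcher from pairing similarly named trims that carry different engines.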


Product Cycle

• Intertemporal price discrimination
  – Evidence documented in Aizcorbe et al. (2010) and Williams and Sager (2019)
  – Price decreases consistently for a product over a given model year
• Price updates with model updates
  – Sellers introduce new pricing regimes with product updates
  – Related to theory of price rigidity


Matched Model (SquishVIN) Price Indices

[Figure: line chart of index levels (vertical axis 20 to 100), Jan-07 to Sep-18. Series: Tornqvist, Laspeyres, Paasche, Fisher, GEKSmean13, GEKSmean24, CCDImean13, CCDImean24.]
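The bilateral formulas named in the legend can be sketched on a matched sample as below. This is an illustrative sketch only: the product IDs, prices, and quantities are made up, and the functions are not taken from the presentation. Product "C" appears only in the second period, so a matched-model index drops it, which is how price change between product versions goes unmeasured.

```python
import math


def matched(p0, p1):
    """Product IDs with a price in both periods (the matched-model sample)."""
    return sorted(set(p0) & set(p1))


def laspeyres(p0, q0, p1, q1):
    ids = matched(p0, p1)
    return sum(p1[i] * q0[i] for i in ids) / sum(p0[i] * q0[i] for i in ids)


def paasche(p0, q0, p1, q1):
    ids = matched(p0, p1)
    return sum(p1[i] * q1[i] for i in ids) / sum(p0[i] * q1[i] for i in ids)


def fisher(p0, q0, p1, q1):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return math.sqrt(laspeyres(p0, q0, p1, q1) * paasche(p0, q0, p1, q1))


def tornqvist(p0, q0, p1, q1):
    """Geometric index weighted by average expenditure shares of matched items."""
    ids = matched(p0, p1)
    e0 = sum(p0[i] * q0[i] for i in ids)
    e1 = sum(p1[i] * q1[i] for i in ids)
    log_index = sum(
        0.5 * (p0[i] * q0[i] / e0 + p1[i] * q1[i] / e1) * math.log(p1[i] / p0[i])
        for i in ids
    )
    return math.exp(log_index)


# Made-up data: "C" is a new product version sold only in period 1.
p0 = {"A": 100.0, "B": 200.0}
q0 = {"A": 10, "B": 5}
p1 = {"A": 90.0, "B": 210.0, "C": 300.0}
q1 = {"A": 12, "B": 4, "C": 3}

L = laspeyres(p0, q0, p1, q1)   # uses only A and B
P = paasche(p0, q0, p1, q1)     # uses only A and B
```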


Matched Model (SquishVIN) Price Indices

[Figure: line chart of index levels (vertical axis 20 to 100), Jan-07 to Sep-18. Series: Tornqvist, Laspeyres, Paasche, Fisher, GEKSmean13, GEKSmean24, CCDImean13, CCDImean24, SimPredShare, SimPaaLaspSpread.]


Product Grouping with CCDI

[Figure: line chart of index levels (vertical axis 60 to 105), Jan-07 to Oct-18. Series: OhtaGrilGrouped, OhtaGrilGroupedCCDI13, OhtaGrilGroupedCCDI24, CourtGrouped, CourtGroupCCDI13, CourtGroupCCDI24.]


Historical Models

Columns: Court (1939) | Griliches (1961) | Triplett (1969) | Triplett Truncated (1969) | Cowling & Cubbin (1972) | Ohta & Griliches (1976)

Marks appear in column order; cells left blank in the original are not shown.

Weight: x x x x x
Wheelbase: x, Length/wheelbase
Horsepower: x X x x x
Length: Length/wheelbase, x x x
V8: X x x
Hardtop: x x x
Transmission: x X Comb.
Power brakes: x Comb. Comb. x
Power steering: x Comb. Comb.
Compact: x x x
Over4Gears: x
Luxury: x
PassengerArea: x
Efficiency: x
Make: Indicator variables


Interacted Model

• Wheelbase, length, weight, horsepower, displacement, height, MPGCity, MPGHwy, torque
• Indicators: # of Cylinders, Make, Bodystyle, Hybrid, AWD
• Interactions:
  – Hybrid (MPGCity, MPGHwy, torque)
  – MPGCity/MPGHwy (weight, horsepower, displacement)
  – horsepower/weight


Basic Hedonic Methods

• A bilateral Time-Product Dummy (TPD), WLS regression
  – Nearly identical to matched model
• TPD is a “fully interacted” time-dummy hedonic (TDH), Krsinich (2016)
• Pooled TDH constrains the implicit values of features to be constant over the pooled time period
• Hedonic imputation: Hedonic predicted price for each specification, weighted with observed quantities
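Hedonic imputation can be sketched as below, under several assumptions that are not from the presentation: a log-linear hedonic function with a single characteristic `x` (standing in for the full specification), unweighted OLS fitted separately in each period, and a Tornqvist-type aggregation that weights each specification by its observed expenditure share. All data are invented. Because the price relative for each specification is built from predicted prices in both periods, no specification has to be sold in both periods.

```python
import math


def fit_loglinear(obs):
    """OLS of ln(price) on one characteristic x; returns (intercept, slope)."""
    n = len(obs)
    xs = [o["x"] for o in obs]
    ys = [math.log(o["price"]) for o in obs]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def predict(coef, x):
    """Predicted price for a specification with characteristic x."""
    intercept, slope = coef
    return math.exp(intercept + slope * x)


def hedonic_imputation_tornqvist(obs0, obs1):
    """Index from period 0 to period 1 using imputed prices in both periods."""
    coef0, coef1 = fit_loglinear(obs0), fit_loglinear(obs1)

    def weighted_log_change(obs):
        # Expenditure-share weights come from observed prices and quantities.
        total = sum(o["price"] * o["qty"] for o in obs)
        return sum(
            (o["price"] * o["qty"] / total)
            * math.log(predict(coef1, o["x"]) / predict(coef0, o["x"]))
            for o in obs
        )

    return math.exp(0.5 * weighted_log_change(obs0) + 0.5 * weighted_log_change(obs1))


def base_price(hp):
    """Invented period-0 price schedule used to build the example."""
    return 20000.0 * math.exp(0.002 * hp)


# Period 1 prices are 10% above the period-0 schedule, and the mix of
# specifications changes: x=150 exits and x=300 enters.
obs0 = [{"x": hp, "price": base_price(hp), "qty": q} for hp, q in [(150, 40), (200, 30), (250, 10)]]
obs1 = [{"x": hp, "price": 1.1 * base_price(hp), "qty": q} for hp, q in [(200, 35), (250, 25), (300, 15)]]

index = hedonic_imputation_tornqvist(obs0, obs1)
```

In this constructed case the index recovers the 10% shift in the price schedule even though the set of specifications sold differs between the two periods.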


Pooled vs Adjacent Period Hedonic Indexes

[Figure: line chart of index levels (vertical axis 85 to 125), Jan-07 to Sep-18. Series: CourtPooled, CourtAdj, OhtaGrilPooled, OhtaGrilsAdj, InteractPool, InteractAdj.]


Pooled Regression Fits

Model Specification R-Squared Adjusted R-Squared

Court 0.6887 0.6884

Ohta & Griliches 0.8717 0.8715

Interacted 0.9253 0.9252


Adjacent TDH vs Hedonic Imputation

[Figure: line chart of index levels (vertical axis 55 to 115), Jan-07 to Sep-18. Series: CourtAdj, CourtHITorn, OhtaGrilsAdj, OhtaGrilHItorn, InteractAdj, InterHItorn, MMTornqvist, TPDummyWLS.]


Product Cycle and Measurement

• Long-term objective price change should fully reflect the difference between completely different regimes
• In the case of intertemporal price discrimination (IPD), the long-run relative may still be biased, but the bias will not compound, and cycle patterns may disperse over a longer time horizon


Index Methods and Product Cycle: Accuracy Issues

• Matched model, TPD
  – Inaccurate: Price change omitted between regimes
• Product grouping or product matching
  – Partially accurate: Dependent on matching method and weighting
• Short-term hedonic imputation and adjacent period TDH
  – Partially accurate: Dependent on weighting of transition


Index Methods and Product Cycle: Most Accurate Measures

• Long-term relatives
• Intermediate price changes are transitive or not included
• Methods:
  – Hedonic imputation with fixed base
    – No chain drift, but dependent on base period and loses representativeness
  – Hedonic imputation multilaterals
  – Similarity linking
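The difference between a chained index and a transitive multilateral index can be illustrated with a small sketch. The assumptions here are mine, not the presentation's: observed rather than hedonic-imputed prices, a single full window instead of a rolling window with splicing, and invented data in which a temporary price cut is followed by a return to the original prices and quantities. The chained Tornqvist does not return to 1, while the CCDI (GEKS built on Tornqvist) does.

```python
import math


def tornqvist(p0, q0, p1, q1):
    """Bilateral Tornqvist index over products priced in both periods."""
    ids = [i for i in p0 if i in p1]
    e0 = sum(p0[i] * q0[i] for i in ids)
    e1 = sum(p1[i] * q1[i] for i in ids)
    return math.exp(sum(
        0.5 * (p0[i] * q0[i] / e0 + p1[i] * q1[i] / e1) * math.log(p1[i] / p0[i])
        for i in ids
    ))


def chained(prices, quantities):
    """Period-to-period Tornqvist links multiplied together."""
    index = [1.0]
    for t in range(1, len(prices)):
        link = tornqvist(prices[t - 1], quantities[t - 1], prices[t], quantities[t])
        index.append(index[-1] * link)
    return index


def ccdi(prices, quantities):
    """CCDI over the whole window, normalised so that period 0 equals 1."""
    n = len(prices)
    index = []
    for t in range(n):
        # Geometric mean of indirect comparisons 0 -> l -> t through every period l.
        log_sum = sum(
            math.log(tornqvist(prices[0], quantities[0], prices[l], quantities[l]))
            + math.log(tornqvist(prices[l], quantities[l], prices[t], quantities[t]))
            for l in range(n)
        )
        index.append(math.exp(log_sum / n))
    return index


# Invented data: product A goes on sale in period 1, buyers stock up, demand
# is depressed in period 2, and period 3 repeats period 0 exactly.
prices = [{"A": 1.0, "B": 1.0}, {"A": 0.5, "B": 1.0}, {"A": 1.0, "B": 1.0}, {"A": 1.0, "B": 1.0}]
quantities = [{"A": 10, "B": 10}, {"A": 30, "B": 10}, {"A": 2, "B": 10}, {"A": 10, "B": 10}]

chained_index = chained(prices, quantities)
ccdi_index = ccdi(prices, quantities)
```

With these numbers the chained index ends well below 1 although period 3 is identical to period 0, which is the kind of drift that fixed-base and transitive methods avoid.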


Ohta and Griliches Model Imputes with Multilaterals

[Figure: line chart of index levels (vertical axis 95 to 115), Jan-07 to Sep-18. Series: Adj TDH, Hed Impute, HI CCDI Mean13, HI CCDI Mean 24, HI CCDI Mean 36, HI Fixed Base.]


Similarity Linking Indexes

• Pass multiperiod identity test and are fully transitive
• A new period index, $I_t$, is found by linking period $t$ to the most similar previous period, $t_{Sim}$, with a bilateral index $P$
• Different methods exist for quantifying similarity or dissimilarity

$$I_t = I_{t_{Sim}} \times P(p_t, q_t, p_{t_{Sim}}, q_{t_{Sim}})$$


Similarity Linking and Hedonics

• Use hedonic imputed price with a similarity linking method
• New similarity linking method
  – Estimate a single period hedonic regression and find the previous period with the closest fit
• For each period $t-a$ prior to $t$, find

$$F = \frac{\left(SSR_{C} - (SSR_{t} + SSR_{t-a})\right)/k}{(SSR_{t} + SSR_{t-a})/(N_{t} + N_{t-a} - 2k)}$$

  where $SSR_{C}$ is the sum of squared residuals from the regression on periods $t$ and $t-a$ combined, $SSR_{t}$ and $SSR_{t-a}$ come from the separate single-period regressions, $N_{t}$ and $N_{t-a}$ are the numbers of observations, and $k$ is the number of estimated parameters
• The prior period with the lowest Chow test statistic is used as a link
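The link selection can be sketched as follows, using the standard Chow F statistic. This is an assumption-laden illustration rather than the author's implementation: the hedonic regression is reduced to log price on a single characteristic (so k = 2), the regressions are unweighted, and the observations are invented. The period labels `"t-1"` and `"t-2"` are placeholders.

```python
def ols_ssr(points):
    """Sum of squared residuals from OLS of y on x with an intercept."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    intercept = y_bar - slope * x_bar
    return sum((y - intercept - slope * x) ** 2 for x, y in points)


def chow_f(current, prior, k=2):
    """Chow F statistic comparing the regressions of two periods."""
    ssr_separate = ols_ssr(current) + ols_ssr(prior)
    ssr_combined = ols_ssr(current + prior)
    dof = len(current) + len(prior) - 2 * k
    return ((ssr_combined - ssr_separate) / k) / (ssr_separate / dof)


def pick_link_period(current, priors):
    """Label of the prior period whose regression is closest to the current one."""
    return min(priors, key=lambda label: chow_f(current, priors[label]))


# Invented (characteristic, log price) observations.
current = [(1, 1.0), (2, 2.1), (3, 2.9), (4, 4.0)]
priors = {
    "t-1": [(1, 2.0), (2, 2.4), (3, 3.1), (4, 3.5)],  # different pricing regime
    "t-2": [(1, 1.1), (2, 1.9), (3, 3.1), (4, 3.9)],  # similar pricing regime
}

link = pick_link_period(current, priors)
```

Here the most recent period is not the one chosen: the earlier period with a similar price schedule gives the smaller statistic, so the index would be linked through it.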


Similarity Linking Indices for Ohta & Griliches Model

[Figure: line chart of index levels (vertical axis 95 to 111), Jan-07 to Oct-18. Series: Min Chow Test, Predicted Share, Paasche Laspeyres Divergence.]
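A similarity-linked index with a Paasche-Laspeyres dissimilarity measure might look like the sketch below. The exact measures behind the plotted series are not given in the slides, so the choices here are assumptions: dissimilarity is the absolute log gap between the Laspeyres and Paasche indexes, the bilateral link is a Fisher index, and observed (not hedonic-imputed) prices are used. The data are invented, with period 3 repeating period 0.

```python
import math


def laspeyres_paasche(p0, q0, p1, q1):
    """Laspeyres and Paasche indexes over products priced in both periods."""
    ids = [i for i in p0 if i in p1]
    lasp = sum(p1[i] * q0[i] for i in ids) / sum(p0[i] * q0[i] for i in ids)
    paas = sum(p1[i] * q1[i] for i in ids) / sum(p0[i] * q1[i] for i in ids)
    return lasp, paas


def fisher(p0, q0, p1, q1):
    lasp, paas = laspeyres_paasche(p0, q0, p1, q1)
    return math.sqrt(lasp * paas)


def pl_spread(p0, q0, p1, q1):
    """Assumed dissimilarity measure: |ln(Laspeyres / Paasche)|."""
    lasp, paas = laspeyres_paasche(p0, q0, p1, q1)
    return abs(math.log(lasp / paas))


def similarity_linked_index(prices, quantities):
    """Link each period to the least dissimilar earlier period."""
    index, links = [1.0], [None]
    for t in range(1, len(prices)):
        s = min(
            range(t),
            key=lambda r: pl_spread(prices[r], quantities[r], prices[t], quantities[t]),
        )
        index.append(index[s] * fisher(prices[s], quantities[s], prices[t], quantities[t]))
        links.append(s)
    return index, links


# Invented data: period 3 has the same prices and quantities as period 0.
prices = [{"A": 1.0, "B": 1.0}, {"A": 0.5, "B": 1.0}, {"A": 0.8, "B": 1.0}, {"A": 1.0, "B": 1.0}]
quantities = [{"A": 10, "B": 10}, {"A": 30, "B": 10}, {"A": 5, "B": 10}, {"A": 10, "B": 10}]

index, links = similarity_linked_index(prices, quantities)
```

Because period 3 links directly to the identical period 0, the index returns to its starting level, which is the multiperiod identity property.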


Ohta and Griliches Model Imputes with Multilaterals

[Figure: line chart of index levels (vertical axis 95 to 115), Jan-07 to Sep-18. Series: Adj TDH, Hed Impute, HI CCDI Mean13, HI CCDI Mean 24, HI CCDI Mean 36, Sim Chow Test, HI Fixed Base.]


Conclusions

• Product cycle effects dominate quality change in certain markets.
• Product cycle can drive drift and chain drift/multilateral may be a secondary effect.
• Hedonic imputation should be preferred over product matching when possible.
• Similarity linking with hedonic imputation appears promising.


Contact Information

Brendan Williams Senior Economist

Consumer Prices Division Office of Prices and Living Conditions

www.bls.gov/cpi [email protected]
