The Commission on Evidence-Based Policymaking was formed in response to a bill sponsored by Speaker of the House Paul Ryan and Senator Patty Murray (AP Photo/ Scott Applewhite, File)

The bipartisan, congressionally mandated Commission on Evidence-Based Policymaking released its final report today, advocating for a number of sound changes to the way the federal government collects, manages, and makes use of federal data. The Washington Center for Equitable Growth, a grant-giving organization that works closely with academic economists to expand our understanding of inequality in the economy, knows firsthand the challenges posed by current federal data-stewardship practices and applauds the Commission for making a number of smart recommendations for modernizing this infrastructure.

The work of the Commission is complete, and it is now incumbent on Congress and the Trump administration to implement these recommendations. We address some of the Commission’s recommendations below, but we must emphasize that without congressional action, the Commission’s report will do nothing. Unfortunately, Congress has not been kind to statistical agencies in 2017, raising the question of whether there is political will to provide the resources that the Commission’s plan will require.

Will Congress provide the necessary funding?

The Commission’s report does not address funding levels for existing statistical agencies, but funding for these agencies is not a luxury; it is critical to the functioning of a modern government. The Commission was told time and again during hearings that funding for agencies is too low and that data quality is at risk. As we have highlighted before, the House of Representatives is currently on track to cut budgets for important statistical agencies. If Speaker of the House Paul Ryan (R-WI) truly believes in the importance of this Commission’s work, then his first priority should be to reverse these cuts.

Despite presenting himself as a strong champion of using data in the governmental process, Speaker Ryan has given little indication that he is willing to pay for such efforts. In a 2014 policy document that first raised the possibility of the Commission, the Speaker proposed a clearinghouse for federal program and survey data. In that document, he suggested that such a clearinghouse should be funded by user fees to keep it revenue neutral. Prioritizing revenue-neutral funding mechanisms in an early conceptual document is another discouraging sign that the Speaker may be unwilling to make the necessary investment to turn the Commission’s recommendations into a reality.

Statistical agency budgets are measured in millions of dollars, a drop in the bucket in terms of annual government spending. To meet the Trump administration’s fiscal year 2017 budget target, the U.S. Bureau of Economic Analysis is proposing to halt programs to track the impact of small businesses, collect better data on trade, and measure health care more accurately for incorporation into quarterly gross domestic product calculations. The savings from cutting these three programs, which would help us understand regional variations in our economy and improve economic decisionmaking, are a mere $10 million. By way of comparison, cutting the top tax rate and reducing the number of tax brackets, part of the House Republican tax plan that Speaker Ryan endorses, would cost $94 billion next year and $1.4 trillion over the next decade.
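To put those magnitudes side by side, here is a small back-of-the-envelope calculation using only the dollar figures quoted above. The variable names are ours, chosen for illustration.

```python
# Dollar figures as quoted in the paragraph above.
bea_savings = 10_000_000               # savings from halting the three BEA programs
tax_cut_next_year = 94_000_000_000     # cost of the tax changes next year
tax_cut_decade = 1_400_000_000_000     # cost of the tax changes over the next decade

# How many times larger is the tax cut than the BEA savings?
ratio_next_year = tax_cut_next_year // bea_savings
ratio_decade = tax_cut_decade // bea_savings

# The BEA savings as a share of next year's tax-cut cost.
share_of_next_year = bea_savings / tax_cut_next_year

print(f"Next-year tax cut is {ratio_next_year:,} times the BEA savings")
print(f"Ten-year tax cut is {ratio_decade:,} times the BEA savings")
print(f"BEA savings are {share_of_next_year:.4%} of the next-year cost")
```

In other words, the proposed savings amount to roughly one hundredth of one percent of the first-year cost of the tax changes.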

Increasing access to administrative data is critical for modern governance

Administrative data, meaning data collected in the regular course of a federal agency performing its designated function, has already revolutionized our understanding of several economic phenomena. Most notably, the use of tax data has allowed economists to study income inequality at the top 1 percent of the income distribution and show that this group of earners is taking a much larger share of total income in the economy than it did 30 years ago. One of the first researchers to use tax data to study incomes noted that for the economics profession, “the economic lives of the rich, especially the rich who are not famous, are something of a mystery.”[1: Feenberg, Daniel R. and James M. Poterba, “Income Inequality and The Incomes of Very High-Income Taxpayers: Evidence from Tax Returns.” In Tax Policy and The Economy, edited by James M. Poterba. MIT Press. 2003.] That has changed: The New York Times recently published a chart created by academic researchers that showed massive income growth in the top 0.001 percent of all earners, all thanks to the availability of tax data to researchers. These data continue to be hard to obtain, even for researchers in other sections of the federal government. If policymakers want to identify and address modern economic challenges, data such as these need to be more widely available to researchers.

The Commission proposes a new agency, the National Secure Data Service, to provide data anonymization and linkage as a service to researchers. It would not be a data warehouse, as is sometimes proposed, but would instead be an intermediary between researchers and federal agencies to facilitate access to data. This is a reasonable approach to the problem of data access. It means that existing agencies can continue to store data as they have while this new agency would concentrate on developing the methodological capacity to evaluate projects, assess privacy concerns, and merge survey and administrative data.

The Commission’s report also calls attention to an underappreciated challenge for federal researchers: Often they cannot obtain data generated in another department. This prevents the Bureau of Economic Analysis, for example, from accessing individual tax data, which could be used to improve some of its current statistical processes. The Commission suggests revisiting parts of the U.S. Code that erect these barriers between agencies.

Old sources of data shouldn’t fall by the wayside

As the Girl Scouts say, “Make new friends, but keep the old.” Administrative data is new to the scene and much in demand among researchers, but for decades policymakers, academic economists, and pundits have relied on economic surveys such as the Current Population Survey. The Commission clearly understands the value of these surveys and notes that they are now suffering from decreased participation and reluctance of respondents to answer particular questions. It may be tempting to see administrative data as a wholesale replacement for these older tools. That would be a mistake.

First, these surveys capture some dynamics that administrative data does not. The Current Population Survey, for example, tells us about the income of low-income Americans who are not required to file a tax return because they owe no taxes. Researchers frequently merge those survey data with the tax data to obtain a complete universe of individuals.
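The idea of combining the two sources can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the person identifiers, the income values, and the rule of preferring the tax record when both sources cover the same person are all hypothetical, and real linkage work relies on protected identifiers and far more careful matching.

```python
def combine_universe(tax_records, survey_records):
    """Return one income record per person, preferring tax data when present.

    Both inputs map a (hypothetical) person identifier to an annual income.
    """
    combined = {}
    # Start with everyone the survey reaches, including nonfilers.
    for person_id, income in survey_records.items():
        combined[person_id] = {"income": income, "source": "survey"}
    # Overwrite with the tax record wherever a return was filed.
    for person_id, income in tax_records.items():
        combined[person_id] = {"income": income, "source": "tax"}
    return combined


# Made-up records: p3 appears only in the survey because no return was filed.
tax_records = {"p1": 52_000, "p2": 310_000}
survey_records = {"p1": 50_000, "p3": 9_500}

universe = combine_universe(tax_records, survey_records)
survey_only = sorted(
    person_id for person_id, rec in universe.items() if rec["source"] == "survey"
)

print(f"People covered by the combined data: {len(universe)}")
print(f"People found only in the survey: {survey_only}")
```

Neither source alone covers all three people in this example: the tax records miss the nonfiler, and the survey misses the high earner.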

More importantly, however, survey data comes with far fewer privacy concerns than administrative data, making it possible for the government to freely distribute the raw data. This in turn means that analysis is not limited to federal employees or researchers at elite universities. Journalists, bloggers, policy analysts, and casual enthusiasts all have access to the full dataset. This truly democratizes the data and contributes to the discourse over economics by incorporating a diverse set of voices.

Balance privacy and access

Per its congressional mandate, the Commission also engaged at length with the issue of privacy in data. Administrative data raises new privacy concerns and it is reasonable to approach this issue with caution. Researchers who work with administrative data are generally receiving data where obvious identifiers, such as names, birth dates, and addresses, have been removed. It may still be possible, however, to identify individuals in the dataset by examining combinations of the remaining fields. There are many ways that agencies can approach this problem, and recent advances promise new possibilities. Some agencies, for example, are researching the creation of synthetic datasets that use generated data but retain the statistical properties of the original dataset.
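For readers unfamiliar with synthetic data, the toy sketch below shows the basic idea: fit a simple statistical model to the real values, then release values drawn from the model instead of the originals. It is not the method any agency uses. The log-normal model, the function name `make_synthetic`, and the income figures are all assumptions made for illustration, and production approaches model many variables jointly and add further privacy protections.

```python
import math
import random
import statistics


def make_synthetic(incomes, n, seed=0):
    """Draw n synthetic incomes from a log-normal model fitted to `incomes`.

    Assumption for illustration: incomes are roughly log-normal, so we fit a
    normal distribution to their logarithms and sample from that.
    """
    logs = [math.log(x) for x in incomes]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]


# Made-up "confidential" incomes standing in for real records.
original = [18_000, 24_500, 31_000, 42_000, 55_000, 73_000, 120_000, 410_000]
synthetic = make_synthetic(original, n=10_000)

# Compare summary statistics on the log scale, where the model was fitted.
orig_logs = [math.log(x) for x in original]
syn_logs = [math.log(x) for x in synthetic]
print(f"Original  log-mean {statistics.mean(orig_logs):.3f}, "
      f"log-stdev {statistics.stdev(orig_logs):.3f}")
print(f"Synthetic log-mean {statistics.mean(syn_logs):.3f}, "
      f"log-stdev {statistics.stdev(syn_logs):.3f}")
```

The synthetic values track the mean and spread of the originals, yet none of them is a record belonging to an actual person. The trade-off is that any pattern the fitted model fails to capture is lost in the released data.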

While Equitable Growth agrees that privacy is important, it should be balanced against the benefits of access for researchers. The Commission notes this tension as well: “It is equally important, however, to calibrate the need for privacy with the public good that research findings based on such data can provide.” At Commission meetings, presenters in charge of sensitive datasets were frequently asked if they had experienced data breaches. On each occasion, the answer was no (although a few reported minor rules violations). It appears that existing safeguards and those used by state and foreign entities are sufficient to the task of maintaining privacy, so further restrictions on dataset use should be approached with care.