Can Tax-Calculator use a taxpayer dataset formatted for NBER TAXSIM? #2178

Closed
ernietedeschi opened this issue Dec 31, 2018 · 9 comments

@ernietedeschi
Contributor

I thought I read before that it could, but I didn't see / missed mention of this in the latest official documentation.

If so, are the required variables the same as in TAXSIM v27? E.g. https://users.nber.org/~taxsim/taxsim27/

Or are there any differences / additions?

ernietedeschi changed the title from "Can Tax-Calculator use a taxpayer dataset formatted for TAXSIM?" to "Can Tax-Calculator use a taxpayer dataset formatted for NBER TAXSIM?" on Dec 31, 2018

@MattHJensen
Contributor

MattHJensen commented Jan 3, 2019

@evtedeschi3, does this file look like what you need?

https://github.com/PSLmodels/Tax-Calculator/blob/7c39274b952328d7f36651f2d20848b9106120a2/taxcalc/validation/taxsim/taxcalc.sh

The action to translate from taxsim input to taxcalc input is happening here: https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/validation/taxsim/prepare_taxcalc_input.py, which is called from the above.

Look in the validation and validation/taxsim folders to see the context. @martinholmer has significantly more experience with these files (having written and used them), but he may be out on vacation at the moment.

@martinholmer
Collaborator

@evtedeschi3, sorry for my delay in responding to your question:

Can Tax-Calculator use a taxpayer dataset formatted for NBER TAXSIM-27?

While I was still on vacation, @MattHJensen gave a perfect answer. I can't improve on that answer but I can provide more context.

You were absolutely correct when you said:

I thought I read before that it could, but I didn't see / missed mention of this in the latest official documentation.

The old script that did this (which predated the tc --dump capability) has been replaced by the scripts in the taxcalc/validation/taxsim directory. And the new scripts have been updated to handle TAXSIM-27 (rather than the old 22-input-variable version of TAXSIM).

As of the beginning of 2019, there has been considerable Tax-Calculator-versus-TAXSIM-27 validation for 2017, and much less for the years 2013-2016. Because TAXSIM-27 is already using the actual policy parameter values for 2018 and Tax-Calculator is not (see issue #1694), we have not undertaken any comparison of output from the two models for 2018 filing units. We plan to update Tax-Calculator to use actual 2018 values (as opposed to 2017 projected values) soon. Once that is done we will be able to conduct a comparison for a randomly-generated sample of 2018 filing units.

All this means two things:

  1. You should submit a bug report if, in your own work, you find more than a rounding-error difference between Tax-Calculator and TAXSIM-27 payroll tax liability or individual income tax liability for a 2013-2017 filing unit.

  2. You should not be surprised if 2018+ output from Tax-Calculator differs from TAXSIM-27 output.
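
One way to apply the first point is to compare the two models' liabilities unit by unit and flag any gap larger than a small tolerance. The sketch below is only an illustration: the function name `liability_mismatches`, the sample numbers, and the one-cent tolerance are assumptions, since the thread does not define what counts as a rounding error.

```python
import math

def liability_mismatches(taxcalc, taxsim, tol=0.01):
    """Return (index, taxcalc_value, taxsim_value) for pairs differing by more than tol.

    tol is an assumed rounding-error threshold in dollars, not an official one.
    """
    if len(taxcalc) != len(taxsim):
        raise ValueError("both models must cover the same filing units")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(taxcalc, taxsim))
        if not math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
    ]

# Hypothetical income tax liabilities for three 2017 filing units.
tc_itax = [1520.00, 0.00, 8734.51]
ts_itax = [1520.00, 0.00, 8790.12]
for idx, tc, ts in liability_mismatches(tc_itax, ts_itax):
    print(f"unit {idx}: Tax-Calculator={tc:.2f} TAXSIM-27={ts:.2f}")
```

Any unit this prints for a 2013-2017 filing unit would be a candidate for a bug report.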

@ernietedeschi
Contributor Author

Perfect. Thank you both!

@ernietedeschi
Contributor Author

ernietedeschi commented Jan 9, 2019

OK, so I've created a file I've named c17_taxsim.csv. It's based on data from the 2018 CPS ASEC. It's comma-delimited and has columns for all the TAXSIM27 variables, though I have assigned zeros to ones with insufficient data in the CPS.

Crucially, my variables aren't all in the same order as the TAXSIM27 page presents them, though they do all have names / first row column headers that are the same as the TAXSIM27 named fields.

Also, I've narrowed down my file to just records with mstat == 1 | mstat == 2.

Here's the output when I run the script:

Traceback (most recent call last):
  File "prepare_taxcalc_input.py", line 114, in <module>
    sys.exit(main())
  File "prepare_taxcalc_input.py", line 57, in main
    invar = translate(ivar)
  File "prepare_taxcalc_input.py", line 76, in translate
    assert np.all(np.logical_or(mstat == 1, mstat == 2))
AssertionError
ERROR: prepare_taxcalc_input.py failed

I'm confused because all my mstats are 1 or 2, so I'm wondering if the column order in my input file matters and the script is reading the wrong variable as mstat.

ernietedeschi reopened this on Jan 9, 2019

@martinholmer
Collaborator

@evtedeschi3 said in Tax-Calculator issue #2178:

OK, so I've created a file I've named c17_taxsim.csv. It's based on data from the 2018 CPS ASEC. It's comma-delimited and has columns for all the TAXSIM27 variables, though I have assigned zeros to ones with insufficient data in the CPS.

Crucially, my variables aren't all in the same order as the TAXSIM27 page presents them, though they do all have names / first row column headers that are the same as the TAXSIM27 named fields.

Also, I've narrowed down my file to just records with mstat == 1 | mstat == 2.

Here's the output when I run the script:

Traceback (most recent call last):
 File "prepare_taxcalc_input.py", line 114, in <module>
   sys.exit(main())
 File "prepare_taxcalc_input.py", line 57, in main
   invar = translate(ivar)
 File "prepare_taxcalc_input.py", line 76, in translate
   assert np.all(np.logical_or(mstat == 1, mstat == 2))
AssertionError
ERROR: prepare_taxcalc_input.py failed

The prepare_taxcalc_input.py script expects a file that could be uploaded to TAXSIM for processing.
My guess is that your c17_taxsim.csv file would generate all kinds of errors if uploaded to TAXSIM because TAXSIM does not deal with CSV-formatted files. It expects a file with 27 space-delimited columns in a specific order as shown on the TAXSIM homepage. And it expects a file without a header row.
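
A quick pre-flight check can catch these problems before the script is run. The following is a minimal sketch, not part of Tax-Calculator: the function name `check_taxsim_input` is hypothetical, and the defaults (27 columns, marital status in the fourth column) are assumptions to verify against the TAXSIM-27 page. It splits on whitespace, and also on commas so that a comma-delimited file still gets a useful report.

```python
import re

def check_taxsim_input(lines, ncols=27, mstat_col=4):
    """Return a list of problems found in TAXSIM-style input lines.

    ncols and mstat_col (1-based) are assumptions to check against the
    TAXSIM-27 documentation; they are not read from Tax-Calculator.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if not fields:
            continue  # ignore blank lines
        try:
            values = [float(f) for f in fields]
        except ValueError:
            problems.append(f"line {lineno}: non-numeric field (header row?)")
            continue
        if len(values) != ncols:
            problems.append(f"line {lineno}: {len(values)} columns, expected {ncols}")
        elif values[mstat_col - 1] not in (1, 2):
            # Mirrors the assertion in the traceback: mstat must be 1 or 2.
            problems.append(f"line {lineno}: mstat column holds {values[mstat_col - 1]:g}")
    return problems

# Toy four-column sample (a real file has 27 columns); the first line is a header.
sample = ["taxsimid,year,state,mstat", "1 2017 0 2"]
for problem in check_taxsim_input(sample, ncols=4):
    print(problem)
```

A header row shows up as a non-numeric line, and a file whose columns are out of order will usually be flagged because the assumed mstat position holds some other variable.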

@ernietedeschi
Contributor Author

That's what I feared. Thank you.

@feenberg
Contributor

feenberg commented Jan 10, 2019 via email

@martinholmer
Collaborator

Dan @feenberg said in Tax-Calculator issue #2178:

taxsim 27 will accept commas between items, but as Martin says, no variable names in the first row and the order of variables is fixed.

Dan, thanks for the clarification.
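
For a CSV that has named columns in arbitrary order, the fix amounts to reordering the columns and dropping the header row. Here is a minimal sketch using only the Python standard library. The function name `to_taxsim_rows` is hypothetical, and the four-name `order` list is a toy subset for illustration: a real run would need all 27 TAXSIM-27 names, copied in the order shown on the TAXSIM-27 page.

```python
import csv
import io

def to_taxsim_rows(csv_text, column_order, sep=" "):
    """Reorder a CSV that has a header row into headerless lines following column_order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in column_order if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"input lacks columns: {missing}")
    return [sep.join(row[c].strip() for c in column_order) for row in reader]

# Toy example with only four fields (an assumption for brevity); the real
# list must hold all 27 TAXSIM-27 names in the documented order.
order = ["taxsimid", "year", "state", "mstat"]
text = "mstat,year,taxsimid,state\n2,2017,1,0\n1,2017,2,0\n"
print("\n".join(to_taxsim_rows(text, order)))
```

Passing `sep=","` would keep commas between items, which Dan notes TAXSIM-27 accepts.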

@evtedeschi3

@ernietedeschi
Contributor Author

Fantastic, thank you. Works now.

4 participants