Did AGI millionaires have a tax increase under the TCJA or am I doing something wrong? #2513

Closed
donboyd5 opened this issue Nov 24, 2020 · 15 comments

Comments

@donboyd5
Contributor

donboyd5 commented Nov 24, 2020

Hi,

I'm hoping @MattHJensen, @Peter-Metz, or someone else familiar with tax-calculator and TCJA can respond.

Am trying to make sure I know how to use and construct json files for different kinds of reforms.

As a test:

  • I tried to compare 2018 law (TCJA) to 2017 law on puf.csv at 2018 income levels (puf.csv advanced to 2018 with default parameters, thus using growfactors, pufweights, and adjust_ratios).

  • I pulled the json files from here -- 2017_law.json for 2017 law, and TCJA.json for 2018 law. (I know I can use the built-in 2018 law on 2018 income levels, and in fact I compared to that and got the same results. I am using json files here because I want to see and understand how parameters are set so that later I can change them selectively.)

  • I ran the reforms statically - no behavioral changes.

  • I examined results for the iitax concept -- Total federal individual income tax liability. In 2017 this is defined as 1040 line 63 minus line 57 minus line 62a.

  • I examined this for all records in puf.csv and for those that had agi (c00100) >= $1 million in 2017.

What I found:

  • The TCJA caused a static reduction in iitax, over all puf.csv records, of $162 billion, or 9.4%

  • For millionaires based on the 2017 definition of AGI, iitax actually increased by 0.7%.

  • For these same records (millionaires based on 2017-defined AGI), 2018-defined AGI at 2018 levels was 1.8% greater than 2017-defined AGI at 2018 levels.

This is not what I expected. I did not expect millionaires to have a tax increase and I did not expect their AGI to increase based solely on definitional changes. I am wondering if there is something I am doing wrong or if there is something about the TCJA that should have caused these results. I suspect the former.

I've copied my full code below. It should be reproducible save for changing folder locations. I would much appreciate any advice or feedback people can provide.

Many thanks.

Don


# %% imports
import taxcalc as tc
import pandas as pd
import numpy as np


# %%  locations
DIR_FOR_OFFICIAL_PUF = r'C:\Users\donbo\Dropbox (Personal)\PUF files\files_based_on_puf2011/2020-08-20/'
REFORMSDIR = r'C:\programs_python\puf_analysis\reforms/'


# %% constants
LATEST_OFFICIAL_PUF = DIR_FOR_OFFICIAL_PUF + 'puf.csv'  # August 20, 2020 puf.csv

# reform files and the locations I took them from
# https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/2017_law.json
law_2017 = REFORMSDIR + '2017_law.json'

# https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/reforms/TCJA.json
law_2018 = REFORMSDIR + 'TCJA.json'


# %% get reforms
params_2017 = tc.Calculator.read_json_param_objects(law_2017, None)
params_2018 = tc.Calculator.read_json_param_objects(law_2018, None)


# %% get data
puf = pd.read_csv(LATEST_OFFICIAL_PUF)
recs_puf = tc.Records(data=puf)


# %% estimate tax using puf.csv

pol = tc.Policy()

# estimate TCJA on 2018 data
pol.implement_reform(params_2018['policy'])
calc2018 = tc.Calculator(policy=pol, records=recs_puf)
calc2018.advance_to_year(2018)
calc2018.calc_all()
itax2018 = calc2018.weighted_total('iitax')

# estimate 2017 law on 2018 data
pol.implement_reform(params_2017['policy'])
calc2017 = tc.Calculator(policy=pol, records=recs_puf)
calc2017.advance_to_year(2018)
calc2017.calc_all()
itax2017 = calc2017.weighted_total('iitax')

itax2018  # 1557359689038.7046
itax2017  # 1719791464209.197
diff = itax2018 - itax2017  # $162 billion cut -162431775170.49243
pdiff = diff / itax2017 * 100   # 9.4% cut


# %% how did returns with 2017 agi > $1 million do?
df2017 = calc2017.dataframe(variable_list=['RECID', 's006', 'c00100', 'iitax'])
df2018 = calc2018.dataframe(variable_list=['RECID', 's006', 'c00100', 'iitax'])

sub2017 = df2017.query('c00100 >= 1e6')  # get 2017-agi-definition millionaires
sub2018 = df2018.query('RECID in @sub2017.RECID') # get the same records from df2018

# get weighted sums of tax
taxm2017 = (sub2017.iitax * sub2017.s006).sum()  # $524 billion  523957799640.81213
taxm2018 = (sub2018.iitax * sub2018.s006).sum()  # $528 billion 527798114885.58154
taxm2018 / taxm2017 * 100 - 100  # + 0.7%   0.7329436163374226

# get weighted sums of agi
agi2017 = (sub2017.c00100 * sub2017.s006).sum()
agi2018 = (sub2018.c00100 * sub2018.s006).sum()
agi2018 / agi2017 * 100 - 100  # +1.8%  1.7550784754322422


@MattHJensen
Contributor

MattHJensen commented Nov 24, 2020

@donboyd5, TCJA is the default current law for the calculator, so you do not need to implement the TCJA reform again. Could you try comparing a 2017-law calculator (as baseline) to a current-law calculator (as reform) in 2018?

If you'd like to implement the TCJA law explicitly, then you could first implement 2017 law for that calculator, and then implement the TCJA reform, thereby going on a "round trip".

@MattHJensen
Contributor

MattHJensen commented Nov 24, 2020

@donboyd5, I should also point you to recipe 1 and recipe 5 in the cookbook. Recipe 1 compares pre-tcja to tcja law, and recipe 5 shows how to use an alternative tab measure (like AGI) for the built-in distribution tables.

@donboyd5
Contributor Author

donboyd5 commented Nov 24, 2020 via email

@donboyd5
Contributor Author

donboyd5 commented Nov 25, 2020

@MattHJensen, lightly edited version of the email:

As far as I can tell puf.csv data really do show uber-millionaires getting tax increases.

I tried the approach above and several others, including Recipe 1, all of which gave me the same answer. My version of Recipe 1 code is at the bottom; I used puf.csv instead of the cps and there are a few other minor differences that I don't think could cause error.

In addition to the weighted deciles difference table in Recipe 1, I calculated 2 other difference tables, with different bins:

diff_table = calc1.difference_table(calc2, 'weighted_deciles', 'iitax')
diff_table2 = calc1.difference_table(calc2, 'standard_income_bins', 'iitax')
diff_table3 = calc1.difference_table(calc2, 'soi_agi_bins', 'iitax')

When you look at weighted deciles, everything looks reasonable - total tax cut is $162 billion, the top 1% get $5b of that, and the next 4% get $45b of that. It's very toploaded, as I would expect.

image

Next, I looked at the standard income bins. These use expanded income rather than AGI (which is what I looked at before), but I don't think that difference is too important here. For expanded income > $1 million, we see a tax INCREASE of $3.5 billion. That bin has only ~740k returns, whereas the top 1% from weighted deciles is the top 1.7 million returns, so we must be getting into very thin parts of the data. When I do the calculation manually using AGI (not available in difference_table) I get a tax increase of $3.8 billion, so expanded income is telling a similar story to AGI and the two income measures are reasonably in accord.

image

Next, I looked at finer expanded income breakdowns, using the soi_agi_bins. They're not labeled, but I put the cutpoints into the screenshot:

image

We have significant increases in the two bins above $5 million with about 70k taxpayers, but the poorest millionaires (338k of them) have a substantial tax cut.
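The bin-by-bin comparison described above can be sketched in plain Python. This is only an illustration of the idea, not what `difference_table` does internally; the function name, the cutpoints, and the four toy records are all hypothetical.

```python
from bisect import bisect_right

def weighted_change_by_bin(records, cutpoints):
    """Sum weight * (reform tax - baseline tax) within each income bin.

    records: iterable of (income, weight, baseline_tax, reform_tax) tuples.
    cutpoints: ascending bin edges. Bin 0 is below the first edge, and
    bin i covers cutpoints[i-1] <= income < cutpoints[i].
    """
    totals = [0.0] * (len(cutpoints) + 1)
    for income, weight, tax_base, tax_reform in records:
        totals[bisect_right(cutpoints, income)] += weight * (tax_reform - tax_base)
    return totals

# Hypothetical edges at $1 million and $5 million, and made-up records.
cutpoints = [1e6, 5e6]
toy_records = [
    (500e3, 100.0, 50e3, 45e3),    # below $1m: tax cut
    (2e6, 10.0, 600e3, 580e3),     # $1m to $5m: tax cut
    (1e6, 5.0, 300e3, 290e3),      # exactly $1m falls in the $1m to $5m bin
    (8e6, 2.0, 2.5e6, 2.7e6),      # above $5m: tax increase
]
totals = weighted_change_by_bin(toy_records, cutpoints)
for label, total in zip(["<$1m", "$1m-$5m", ">$5m"], totals):
    print(f"{label:>8}: {total:+,.0f}")
```

A net cut in the lower millionaire bin alongside a net increase in the top bin, as in the toy data, is the same shape as the pattern reported for the real bins.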

It probably is possible to track down by looking at intermediate calculations for a few very rich people and figuring out why they have tax increases. I'll see if I can do that.

import pandas as pd
import taxcalc as tc

DIR_FOR_OFFICIAL_PUF = r'C:\Users\donbo\Dropbox (Personal)\PUF files\files_based_on_puf2011/2020-08-20/'
LATEST_OFFICIAL_PUF = DIR_FOR_OFFICIAL_PUF + 'puf.csv'  # August 20, 2020 puf.csv

# read an "old" reform file
# ("old" means the reform file is defined relative to pre-TCJA policy)
# REFORMS_PATH = '../../taxcalc/reforms/'
REFORMS_PATH = r'C:\programs_python\puf_analysis\reforms/'

# specify reform dictionary for pre-TCJA policy
reform1 = tc.Policy.read_json_reform(REFORMS_PATH + '2017_law.json')

# specify reform dictionary for TCJA as passed by Congress in late 2017
reform2 = tc.Policy.read_json_reform(REFORMS_PATH + 'TCJA.json')

# specify Policy object for pre-TCJA policy
bpolicy = tc.Policy()
bpolicy.implement_reform(reform1, print_warnings=False, raise_errors=False)
assert not bpolicy.parameter_errors

# specify Policy object for TCJA reform relative to pre-TCJA policy
rpolicy = tc.Policy()
rpolicy.implement_reform(reform1, print_warnings=False, raise_errors=False)
assert not rpolicy.parameter_errors
rpolicy.implement_reform(reform2, print_warnings=False, raise_errors=False)
assert not rpolicy.parameter_errors


puf = pd.read_csv(LATEST_OFFICIAL_PUF)

# recs = tc.Records.cps_constructor()  # use cps

recs = tc.Records(data=puf)  # or use puf

# specify Calculator objects using bpolicy and rpolicy
calc1 = tc.Calculator(policy=bpolicy, records=recs)
calc2 = tc.Calculator(policy=rpolicy, records=recs)

CYR = 2018

# calculate for specified CYR
calc1.advance_to_year(CYR)
calc1.calc_all()
calc2.advance_to_year(CYR)
calc2.calc_all()

# compare aggregate individual income tax revenue in cyr
iitax_rev1 = calc1.weighted_total('iitax')  # bpolicy 2017 law
iitax_rev2 = calc2.weighted_total('iitax')  # rpolicy

iitax_rev2 / iitax_rev1 * 100 - 100


df2017 = calc1.dataframe(variable_list=['RECID', 's006', 'c00100', 'iitax'])
(df2017.iitax * df2017.s006).sum() - iitax_rev1
df2018 = calc2.dataframe(variable_list=['RECID', 's006', 'c00100', 'iitax'])
((df2017.iitax * df2017.s006).sum()  - (df2018.iitax * df2018.s006).sum()) / 1e9

sub2017 = df2017.query('c00100 >= 1e6')  # get 2017 agi millionaires
sub2018 = df2018.query('RECID in @sub2017.RECID') # get the same records from df2018
mtax2017 = (sub2017.iitax * sub2017.s006).sum()
mtax2018 = (sub2018.iitax * sub2018.s006).sum()
(mtax2018 - mtax2017) / 1e9

# construct reform-vs-baseline difference tables by income decile and by standard income bin
diff_table = calc1.difference_table(calc2, 'weighted_deciles', 'iitax')
diff_table.loc[:, ['count', 'tax_cut', 'tax_inc', 'tot_change']]

diff_table2 = calc1.difference_table(calc2, 'standard_income_bins', 'iitax')
diff_table2.loc[['>$1000K'], ['count', 'tax_cut', 'tax_inc', 'tot_change']]
diff_table2.loc[:, ['count', 'tax_cut', 'tax_inc', 'tot_change']]

@donboyd5
Contributor Author

donboyd5 commented Nov 25, 2020

@MattHJensen , @feenberg (see 10 RECIDS below)

The reason for millionaire tax increases seems to be related to (1) SALT, which is entirely expected from a tax calculation standpoint although I don't know whether we'd expect as much of it in the data as we have in puf.csv, (2) capital gains, which I don't understand but I suspect it's the same thing - that is, expected from a calculation perspective but I wouldn't know if we have what we want in the data, and (3) to a far lesser extent, miscellaneous deductions.

I will try to add some data re capital gains after this post and would welcome comments on why this could cause tax increases.

Here is what I did. I did the post-tax-calculator analysis in R for the sake of speed but happy to provide code if anyone wants:

  • ran 2017_law and TCJA json files on puf.csv extrapolated to 2018 with all defaults, static, in python and saved results; the rest is in R
  • selected all records in both files where agi-by-2017-definition was > $1 million; 29,403 records
  • calculated iitax change and weighted iitax change for these AGI millionaires -- the net weighted tax increase among all these millionaires was $3.8 billion, the same as noted above
  • if we only include the millionaires with increases, the weighted increase was $16 billion
  • increases were widespread -- 43% of the 29k unweighted millionaire records had increases
  • and were concentrated but not terribly concentrated -- about 500 records account for 50% of the $16b increase
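The concentration figure in the last bullet (how many records account for half of the total increase) can be computed as sketched below. The function name and the toy values are hypothetical; the real calculation was done in R on the puf.csv results.

```python
def records_for_share(weighted_increases, share=0.5):
    """Smallest number of records, taken largest first, whose weighted
    increases add up to at least `share` of the total increase."""
    ordered = sorted(weighted_increases, reverse=True)
    target = share * sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= target:
            return count
    return len(ordered)

# Hypothetical weighted iitax increases for seven records, in $ millions.
toy_increases = [400.0, 250.0, 150.0, 100.0, 50.0, 30.0, 20.0]
n_half = records_for_share(toy_increases)
print(n_half, "records account for half of the total increase")
```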

Next:

  • I identified the approximately 12.8k 2017law-agi-definition millionaire records that had iitax increases under TCJA
  • for each record, for each tax-calculator output variable, calculated the difference and weighted difference between 2017_law and TCJA values, and also calculated
  • for each variable where the difference between 2017_law and TCJA was nonzero, calculated the sum over the ~12.8k iitax-increase 2017-agi-defined-millionaire records, the number of recs with nonzero changes, the weighted number, the sum of the change, and the weighted sum of the change
  • sorted from greatest to smallest
  • the result is the table you see below
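A minimal Python sketch of the per-variable summary described in the steps above follows. The actual analysis was done in R, so the function name, the record layout (one dict per record, matched by position), and the toy values are assumptions made for illustration only.

```python
def summarize_changes(base, reform, variables, weight="s006"):
    """For each variable, summarize reform-minus-baseline changes over
    records with a nonzero change, sorted by weighted change (largest first).

    base and reform are lists of dicts describing the same records in the
    same order. Variables with no nonzero change are dropped.
    """
    rows = []
    for var in variables:
        n_nonzero, wtd_n, change_sum, wtd_change = 0, 0.0, 0.0, 0.0
        for b, r in zip(base, reform):
            change = r[var] - b[var]
            if change != 0:
                n_nonzero += 1
                wtd_n += b[weight]
                change_sum += change
                wtd_change += change * b[weight]
        if n_nonzero:
            rows.append({"variable": var, "n": n_nonzero, "wtd_n": wtd_n,
                         "change": change_sum, "wtd_change": wtd_change})
    return sorted(rows, key=lambda row: row["wtd_change"], reverse=True)

# Two made-up records under 2017 law (base) and TCJA (reform).
base = [
    {"s006": 10.0, "c00100": 200.0, "c04800": 100.0, "c18300": 50.0},
    {"s006": 20.0, "c00100": 400.0, "c04800": 300.0, "c18300": 80.0},
]
reform = [
    {"s006": 10.0, "c00100": 200.0, "c04800": 140.0, "c18300": 10.0},
    {"s006": 20.0, "c00100": 400.0, "c04800": 300.0, "c18300": 10.0},
]
table = summarize_changes(base, reform, ["c00100", "c04800", "c18300"])
for row in table:
    print(row)
```

In the toy data `c00100` does not change, so it drops out, which mirrors the step that keeps only variables with a nonzero difference.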

Comments:

  • for these records, c04800 taxable income increased by $86b
  • largest drivers were dwks14 and dwks19, which are related and increased by ~$68b and ~$66b respectively; they are defined in very complex calculations in the GainsTax function in calcfunctions.py, which "implements (2015) Schedule D Tax Worksheet logic for the special taxation of long-term capital gains and qualified dividends if CG_nodiff is false."; beyond that I don't understand what is going on and would really love to hear comments
  • the next major item is c18300 SALT where this group lost $69b of deductions; presumably the only question is whether that's a reasonable amount to have in the data and I suspect it is
  • after that, the numbers get small; c20800 Net limited miscellaneous deductions deducted is a reduction of $8b which again I presume is entirely expected

So I think the only real questions (and others may already know the answers) are:

  • are the capital gains tax increases expected (and would it be possible to explain why the increases occur - I would love to know)?
  • is $69b of lower SALT deductions reasonable, which I suspect it is, but will look around at some comparisons to IRS data and will report back

As for specific records, the 2017-agi-definition millionaire RECIDS with the 10-largest weighted increases in iitax are:

207120 236805 228028 223428 227389 205708 222811 231102 228467 234452

If anyone has ability to understand why any of those had increases (presumably, why dwks14 and dwks19 increased so much), it would be great to know. @feenberg, if you can look at any I would really appreciate it.

I'll report back on:

  • what the data inputs to GainsTax looked like in this group of 12.8k records, and
  • what we know of SALT for millionaires in 2018 from IRS data

image

@feenberg
Contributor

feenberg commented Nov 25, 2020 via email

@donboyd5
Contributor Author

donboyd5 commented Nov 25, 2020 via email

@donboyd5
Contributor Author

Here is the first of the 2 additional pieces of info I mentioned: a comparison of c18300 from puf.csv grown to 2017 (NOT 2018) with defaults, to the relevant IRS data for 2017, by 2017 agi group, giving us insight into how much SALT millionaires had to lose in 2017. Below that I copy a table that sums the millionaire ranges. It turns out that millionaires had a lot more to lose than $69 billion, and puf appears to understate the amount they had to lose. So while that could use some investigation in a weighting context, the millionaire tax increases we are seeing in puf.csv due to the SALT cap are if anything too low.

image

image

@MattHJensen
Contributor

MattHJensen commented Nov 25, 2020

@donboyd5, I will start looking at the data next, but first a couple of initial observations:

  1. we should expect a sizable portion of this million+ AGI group to have tax increases. TPC, here, in table 4, find that 18% of the top 0.01% of TPC expanded income (over $3.4mm) had tax increases in 2018. This is after they distribute estate and corporate tax changes, both of which would register as significant tax cuts for this income range, and neither of which are included in these tax-calculator results.

  2. the default treatment for QBID when using a tax-data produced puf.csv file w/ Tax-Calculator will significantly underestimate the value of the deduction and lead to more high-income taxpayers with tax increases. See Add switch for QBI deduction wage/capital limitations #2497.

@donboyd5
Contributor Author

donboyd5 commented Nov 25, 2020

Thanks, @MattHJensen, I suspect that is the case. If anything, the SALT data above suggest it might even be greater than puf.csv suggests. I want to make sure I understand it and can explain it, in relation to a NY project underway. I don't recall it being talked about much in the press (SALT in isolation yes, but I don't recall much discussion of net increases for millionaires), in the manner analyzed here (static, individual tax only), so I'm looking for some comfort.

The TPC results definitely provide some comfort. Here is some rough math on Table 4, converting losers to total dollars:

  • number in the top 0.1%: 172.7m taxpayers (from above) x 0.1% = ~173k taxpayers
  • times the 16.2% of them who are losers (see table below), times their average tax increase of $387.6k, gives a total tax increase for losers in the top 0.1% of $10.8b; that's definitely in the ballpark -- my similar (not comparable) number from puf was $16 billion; however, you note that the TPC number also includes estate and corp tax changes, which are not in what I did; so that is indeed comforting
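The rough math above can be checked in a few lines of Python. The inputs are the figures quoted in this comment; the variable names are just illustrative.

```python
# Figures quoted in the comment above (TPC Table 4).
total_taxpayers = 172.7e6   # all taxpayers
top_share = 0.001           # top 0.1%
loser_share = 0.162         # share of the top 0.1% with a tax increase
avg_increase = 387.6e3      # average increase among those losers, in dollars

n_top = total_taxpayers * top_share
n_losers = n_top * loser_share
total_increase = n_losers * avg_increase

print(f"taxpayers in top 0.1%: {n_top:,.0f}")
print(f"of which losers:       {n_losers:,.0f}")
print(f"total tax increase:    ${total_increase / 1e9:.1f} billion")
```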

TPC table below:
image

This is all very interesting and quite frankly a surprise to me.

I'll reread the QBID thread with an eye toward understanding it well enough to implement other treatment(s) to see how much impact it has. It may be that I should be using an alternative QBID treatment.

Here is the last piece I mentioned, on the inputs into the GainsTax function. I don't understand what's going on well enough to interpret it, but maybe others do. However, here's what I did to create the table below:

  • created comparison file for millionaires as above, with weighted values for law2017, law2018, and difference
  • selected variables that enter into GainsTax, and got the sums of weighted values for each variable, with dollar sums in $ billions
  • calculated number of weighted records that had nonzero amounts under 2017 law, and weighted number that had nonzero differences
  • sorted by 2017-law weighted sums

I am not sure what to make of it other than that these tax-increase millionaires had a lot of capital gains, it seems. But someone who understands what GainsTax is doing may glean information from it.

On to QBID.

image

@MattHJensen
Contributor

@donboyd5, note that JCT does show a net tax cut of around $16B for million+ expanded income on the individual side only in FY2019. Their line item revenue estimate for estate tax changes in the same year is roughly $8B. So let's say they find a net tax cut of around $8B w/o estate taxes. They include QBID on the individual side.

https://www.jct.gov/publications/2019/jcx-10-19/

image

image

@donboyd5
Contributor Author

donboyd5 commented Nov 25, 2020

Many thanks @MattHJensen . Our comments crossed.

So my summary of what I think we've learned so far:

  • JCT estimates a net individual tax cut for "millionaires" (we don't have quite consistent definition) of call it $8 billion
  • TPC suggests that a fair number of "millionaires" (top 0.1%-ers) have increases, even when corp and estate changes (likely cuts) are included, and could amount to $10b, and prob signif greater if we could net out corp and estate
  • puf.csv shows ~$16b increase for loser millionaires, which doesn't seem crazily out of line with TPC, and the SALT part of it probably understates it considerably because IRS data show much more SALT available to lose than puf in those income ranges
  • puf.csv shows net tax increase of $3.8b for this group, which is different from the net tax cut of ~$8b in JCT
  • would be nice to understand a bit more; candidates for this are (1) GainsTax, where puf has big increases; may make perfect sense but it involves tax-calculator calcs that are way beyond my current understanding and I'm hoping to learn from someone else, (2) QBID, where I will read and up my understanding and see if I can do alternative calcs, and (3) something else

@jdebacker
Member

@donboyd5 @MattHJensen Can you update us on the status of this issue? Thanks!

@donboyd5
Contributor Author

@jdebacker @MattHJensen @andersonfrailey:

I think @MattHJensen and I concluded that when we turn off the QBID phaseout (#2497) our results are reasonably consistent with others so I don't think we have a Tax-Calculator issue and it would be fine to close this.

We continue to have substantially different income and deduction magnitudes in top AGI ranges than the IRS, which I think is worthy of TaxData consideration, along with the other issues @jdebacker closed recently in Tax-Calculator. I continued to see large differences in my recent work on 50-state weights. I'm not sure it makes sense to re-open each of those issues in TaxData. Rather, it might make sense to have a more-general long-(or mid-)term TaxData issue on methods for (a) comparing TaxData to published and forecasted values, and (b) methods for bringing TaxData (puf.csv) in line with those values, time permitting.

@jdebacker
Member

@donboyd5 Thanks for the update and your work in this area. I'll go ahead and close this here, but I like the idea of having more discussion over in the taxdata repo as this seems like an important issue.
