
Memory not released under pressure #782

Open
mtrower opened this issue Jan 3, 2021 · 0 comments

Hi, I wasn't sure whether to file this against ZFS or the SPL, but the SPL has very few issues filed, so here I am.

After some heavy activity on a pool (intensive file creation and listing), I'm seeing wired memory consumption of 14.07GB, of which 10.5GB appears to be consumed by ZFS:

% sysctl kstat.spl.misc.spl_misc.os_mem_alloc
kstat.spl.misc.spl_misc.os_mem_alloc: 11457265664
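To keep an eye on it, I've been polling the counter in a loop (trivial sketch; the 5-second interval is arbitrary):

% while true; do sysctl kstat.spl.misc.spl_misc.os_mem_alloc; sleep 5; done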

That's all well and good, but even under memory pressure it wasn't dropping, so I tried to constrain the ARC:

kstat.zfs.darwin.tunable.zfs_arc_max: 0 -> 4294967296
kstat.zfs.darwin.tunable.zfs_arc_meta_limit: 0 -> 3221225472
kstat.zfs.darwin.tunable.zfs_arc_min: 0 -> 1610612736
kstat.zfs.darwin.tunable.zfs_arc_meta_min: 0 -> 1342177280
kstat.zfs.darwin.tunable.zfs_dirty_data_max: 1717986918 -> 536870912

ARC now looks like this:

    Time   read   miss  miss%   dmis  dm%  pmis  pm%   mmis  mm%   size  tsize  
13:44:04   122M    23M   18.9   466K  0.7   22M  39.1    23M  18.9  2515M  4294M  
13:44:05      0      0      0      0    0     0    0      0    0  2515M  4294M  
13:44:06      0      0      0      0    0     0    0      0    0  2515M  4294M 

but the SPL isn't dropping with it. No matter; let's apply some pressure and see if it releases.

% sudo memory_pressure -l warn -s 8

App Memory consumption slowly climbs over the next 5-10 minutes. Pressure rises, and Wired Memory does not drop. Eventually, Compressed memory shoots through the roof (8GB or so) and we finally hit "warn", where I hold pressure for a while to observe.
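While holding at warn, I sampled the kernel's reported pressure level alongside the SPL counter (assuming kern.memorystatus_vm_pressure_level is exposed on 10.14 as on recent macOS):

% while true; do sysctl kern.memorystatus_vm_pressure_level kstat.spl.misc.spl_misc.os_mem_alloc; sleep 10; done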

We can see that the ARC releases memory at a few points:

    Time   read   miss  miss%   dmis  dm%  pmis  pm%   mmis  mm%   size  tsize  
13:56:46      0      0      0      0    0     0    0      0    0  2535M  4294M  
13:56:47      0      0      0      0    0     0    0      0    0  2535M  4294M  
13:56:48      0      0      0      0    0     0    0      0    0  2534M  2534M  
13:56:49      0      0      0      0    0     0    0      0    0  2534M  2534M 
...
13:57:32      0      0      0      0    0     0    0      0    0  2512M  2512M  
13:57:34      0      0      0      0    0     0    0      0    0  2507M  2507M  
13:57:35      0      0      0      0    0     0    0      0    0  2487M  2487M  
13:57:36      0      0      0      0    0     0    0      0    0  1390M  1610M  
13:57:37      0      0      0      0    0     0    0      0    0  1390M  1610M  
13:57:38      0      0      0      0    0     0    0      0    0  1353M  1610M 

But the SPL remains iron-fisted.

kstat.spl.misc.spl_misc.os_mem_alloc: 11457265664

Alright; let's export the pool entirely (leaving no pools imported) and check the ARC again:

    Time   read   miss  miss%   dmis  dm%  pmis  pm%   mmis  mm%   size  tsize  
00:44:09   122M    23M   18.9   468K  0.7   22M  39.1    23M  18.9    44M  1610M  
00:44:10      0      0      0      0    0     0    0      0    0    44M  1610M

And the SPL...

kstat.spl.misc.spl_misc.os_mem_alloc: 10613424128

So this time, the SPL dropped by the amount of ARC freed. What the heck is it doing with the rest?
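One place the remainder might show up is the other SPL kstats; assuming everything is published under the kstat.spl node, dumping the whole subtree is just:

% sysctl kstat.spl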

Let's try putting the squeeze on again:

% sudo memory_pressure -l warn

Memory looks like this:

[Screenshot: Activity Monitor memory graph, 2021-01-03 at 00:52:38]

kstat.spl.misc.spl_misc.os_mem_alloc: 3674210304

We've released a lot, but we're still holding on to >3GB with no pools imported?

Let's repeat from the start (sort of; I'm not rebooting, nor relaxing the ARC constraints). Import pool; do some work:

    Time   read   miss  miss%   dmis  dm%  pmis  pm%   mmis  mm%   size  tsize  
01:02:16    58K    13K   23.7    215  0.7   13K  45.8    13K  23.7  4288M  4294M  
01:02:17    59K    13K   22.6    214  0.7   13K  43.7    13K  22.6  4250M  4294M
kstat.spl.misc.spl_misc.os_mem_alloc: 6923747328
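For reference, each cycle here amounts to roughly the following (a sketch only; the pool name "tank" and the workload path are placeholders for my actual setup):

% sudo zpool import tank
% mkdir -p /Volumes/tank/stress && cd /Volumes/tank/stress
% for i in $(seq 1 100000); do touch f$i; done; ls -la > /dev/null
% sudo memory_pressure -l warn
% sudo zpool export tank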

Seems reasonable. Applying pressure takes forever again (>10m to hit "warn", at a final pressure of 65%). For most of that time, memory consumption climbed very slowly with pressure holding around ~35%. It's as if memory_pressure is struggling to find pages to allocate, even though pressure is ostensibly low.
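While that was running, vm_stat's periodic mode gives a rough view of page movement (again, the 5-second interval is arbitrary):

% vm_stat 5

The ARC over the same window: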

    Time   read   miss  miss%   dmis  dm%  pmis  pm%   mmis  mm%   size  tsize  
01:14:35   159M    30M   19.4   616K  0.7   30M  39.6    30M  19.4  4161M  4294M  
01:14:36      0      0      0      0    0     0    0      0    0  4161M  4294M
...
01:24:59      0      0      0      0    0     0    0      0    0  4059M  4059M  
01:25:00      0      0      0      0    0     0    0      0    0  3974M  4059M  
01:25:01      0      0      0      0    0     0    0      0    0  3974M  4059M  
01:25:02      0      0      0      0    0     0    0      0    0  3974M  4059M  
01:25:03      0      0      0      0    0     0    0      0    0  2318M  1610M  
01:25:04      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:05      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:06      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:07      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:08      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:09      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:10      0      0      0      0    0     0    0      0    0  1001M  1610M  
01:25:11      0      0      0      0    0     0    0      0    0   882M  1610M  

And SPL has barely budged:

kstat.spl.misc.spl_misc.os_mem_alloc: 6909067264

Finally, I reversed all of the tuning (thinking maybe the ARC minimums were causing the SPL to hold memory):

kstat.zfs.darwin.tunable.zfs_arc_max: 4294967296 -> 0
kstat.zfs.darwin.tunable.zfs_arc_meta_limit: 3221225472 -> 0
kstat.zfs.darwin.tunable.zfs_arc_min: 1610612736 -> 0
kstat.zfs.darwin.tunable.zfs_arc_meta_min: 1342177280 -> 0
kstat.zfs.darwin.tunable.zfs_dirty_data_max: 536870912 -> 1717986918

exported the pool, and applied pressure, but the SPL is still holding >3GB:

    Time   read   miss  miss%   dmis  dm%  pmis  pm%   mmis  mm%   size  tsize  
01:54:00      0      0      0      0    0     0    0      0    0   144K  1610M
% sysctl kstat.spl.misc.spl_misc.os_mem_alloc
kstat.spl.misc.spl_misc.os_mem_alloc: 3529768960

System info

% sysctl zfs spl
zfs.kext_version: 1.9.4-0
spl.kext_version: 1.9.4-0

% sw_vers
ProductName:	Mac OS X
ProductVersion:	10.14.6
BuildVersion:	18G7016

Next steps?
