High memory use and possible leak with sequential access mode #268

Open
mzur opened this issue Apr 14, 2025 · 12 comments

Comments

@mzur

mzur commented Apr 14, 2025

Hi, first of all thank you @jcupitt for this library and your incredible support!

I ran across an issue where sequential access mode seems to consume a lot more memory than random mode. It also seems to leak: memory is not freed after processing. Here is a minimal example script:

<?php

include 'vendor/autoload.php';

use Jcupitt\Vips\Image;

$access = 'sequential';
// $access = 'random';

for ($i=0; $i < 10; $i++) {
    $image = Image::newFromFile('my_image.jpg', ['access' => $access]);
    $width = $image->width;
    $height = $image->height;

    $buf = $image->crop(round($width / 2) - 150, round($height / 2) - 150, 300, 300)
        ->writeToBuffer('.jpg');
}

When I use sequential mode and call /usr/bin/time -v php my_script.php I get:

	Command being timed: "php my_script.php"
	User time (seconds): 13.18
	System time (seconds): 5.63
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:18.82
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 15905256
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 4083940
	Voluntary context switches: 852
	Involuntary context switches: 672
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

So it uses 15 GB to process the images. When I increase the number of iterations in the loop, the consumed memory also increases.

With random mode I get:

	Command being timed: "php my_script.php"
	User time (seconds): 5.20
	System time (seconds): 4.96
	Percent of CPU this job got: 181%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.61
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 472804
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 107770
	Voluntary context switches: 7281
	Involuntary context switches: 1091
	Swaps: 0
	File system inputs: 0
	File system outputs: 12657432
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

It is faster and only uses 500 MB.

The image I use is a heavily compressed JPEG (~60 MB):

$ vipsheader my_image.jpg
my_image.jpg: 46789x46169 uchar, 3 bands, srgb, jpegload

I can share the actual file via email but not publicly here.

I use vips-8.15.1 and jcupitt/vips:2.4.1.

@jcupitt
Member

jcupitt commented Apr 14, 2025

Hi @mzur,

This is the libvips operation cache -- it's tracking recent operations and trying to reuse them. If you disable the cache with Vips\Config::cacheSetMax(0); (i.e. set the max cache size to 0), the sequential version runs in 500mb too. You will see memory use creep up over time due to heap fragmentation, but there's no leak.

random mode will decompress the entire image to a temporary file for random access, then reuse that 10 times. sequential will decompress each time, so it'll be a lot slower.

@mzur
Author

mzur commented Apr 14, 2025

Thanks! So now I have this:

<?php

include 'vendor/autoload.php';

use Jcupitt\Vips\Image;
use Jcupitt\Vips\Config;

$access = 'sequential';
// $access = 'random';

Config::cacheSetMax(0);

for ($i=0; $i < 10; $i++) {
    $image = Image::newFromFile('my_image.jpg', ['access' => $access]);
    $width = $image->width;
    $height = $image->height;

    $buf = $image->crop(round($width / 2) - 150, round($height / 2) - 150, 300, 300)
        ->writeToBuffer('.jpg');
}

But it still uses lots of memory, increasing with the number of loop iterations:

	Maximum resident set size (kbytes): 18410168

@mzur
Author

mzur commented Apr 14, 2025

cacheSetMax() only seems to have an effect on access random where it will no longer reuse the temporary file.

@mzur
Author

mzur commented Apr 16, 2025

I did some more experiments, re-encoded the image, converted it to PNG but nothing helped. Even if I found a way to reduce the amount of memory used, this would only delay the issue, as the used memory would still stack up (only slower). I found no way to clear the memory. It also seems to be outside of the memory_limit checks of PHP. So the only solution left to me is to configure my worker processes to restart after each job. That's an ok solution for me and I don't think anything else could be done here so I'll close this. Thanks again for everything you do here @jcupitt!
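
The point about memory_limit can be checked directly: memory_limit only covers allocations made through PHP's own memory manager, so memory that a native library allocates for itself does not count towards it. A minimal sketch of the comparison follows. It is Linux-only (it assumes /proc/self/status is readable), and the helper name rss_kb is made up for illustration, not part of php-vips.

```php
<?php
// Sketch: print PHP's own heap accounting next to the process RSS.
// Assumption: Linux, with /proc/self/status available.

/** Resident set size of the current process in kB, or null if unavailable. */
function rss_kb(): ?int
{
    $status = @file_get_contents('/proc/self/status');
    if ($status === false) {
        return null;
    }

    // The kernel reports a line such as "VmRSS:    123456 kB".
    return preg_match('/^VmRSS:\s+(\d+)\s+kB/m', $status, $m) ? (int) $m[1] : null;
}

// Memory obtained through PHP's allocator (this is what memory_limit sees).
$phpHeapKb = intdiv(memory_get_usage(true), 1024);

// Memory resident for the whole process, native libraries included.
$rssKb = rss_kb();

echo "php heap (kb): $phpHeapKb\n";
echo 'process rss (kb): ' . ($rssKb ?? 'n/a') . "\n";
```

Calling this inside the processing loop should show the RSS figure moving while the PHP heap figure stays roughly flat, if the growth really is in native allocations. A worker could compare rss_kb() against a budget of its own choosing to decide when to restart.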

@mzur mzur closed this as completed Apr 16, 2025
@jcupitt
Member

jcupitt commented Apr 16, 2025

Sorry, I was stuck on another issue. I'll have a look into this today.

@jcupitt jcupitt reopened this Apr 16, 2025
@mzur
Author

mzur commented Apr 16, 2025

No need to apologize at all! I think there is nothing to be done about the continuously increasing memory usage. PHP just isn't designed for continuously running scripts. My first observation with sequential vs random is irrelevant, as random just uses disk space instead of memory, as you pointed out. I'm fine with closing this issue.

@cypherbits

Hi, I think I may be hitting the same issue, but with the thumbnail function.

$original = Image::thumbnail($sourcePath, IMAGE_MAX_WIDTH, ['height' => IMAGE_MAX_HEIGHT, 'size' => 'down']);
$original->writeToFile($destinationPathDownload, [
                                'Q' => 78,
                                'strip' => true]);

I have this code in a loop and memory keeps increasing, even after I tried unset($original).
(I haven't tried setting the cache to 0 yet.)

@cypherbits

Had to disable the use of vips on production since I tried everything but memory was increasing on each thumbnail generated...

@jcupitt
Member

jcupitt commented May 28, 2025

This is strange, I don't see this when running at the CLI.

For example:

#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';
use Jcupitt\Vips;

if (count($argv) != 2) {
    echo("usage: ./thumbnail-loop.php input\n");
    exit(1);
}

function thumbnail($filename)
{
    $image = Vips\Image::thumbnail($filename, 600, [
        'height' => 10_000_000,
        'export-profile' => 'srgb'
    ]);

    $buf = $image->writeToBuffer('.jpg', [
        'Q' => 75,
        'strip' => TRUE,
        'optimize_coding' => TRUE,
        'profile'=>'srgb'
    ]);

    return $buf;
}


echo "iteration, now (kb), growth (kb)\n";
$prev = 0;
for ($i = 0; $i < 100000; $i++) {
    $buf = thumbnail($argv[1]);

    if ($i % 10 == 0) {
        gc_collect_cycles();
        $pid = getmypid();
        $now = intval(`ps --pid $pid --no-headers -orss`);
        $use = $now - $prev;
        $prev = $now;
        echo "$i, $now, $use\n";
    }
}

With this sample JPEG:

$ vipsheader nina.jpg
/home/john/pics/nina.jpg: 6048x4032 uchar, 3 bands, srgb, jpegload

I see:

$ ./thumbnail-loop.php nina.jpg
iteration, now (kb), growth (kb)
0, 107252, 107252
10, 139800, 32548
20, 142152, 2352
30, 143280, 1128
40, 145464, 2184
50, 146524, 1060
60, 146636, 112
70, 147932, 1296
80, 148272, 340
90, 148784, 512
100, 148476, -308
110, 148792, 316
120, 149328, 536
130, 149620, 292
140, 149780, 160
150, 151012, 1232
160, 150552, -460
...
990, 157220, 532
1000, 156336, -884
1010, 156412, 76
...
1990, 166904, 340
2000, 166456, -448
2010, 167068, 612
...
2990, 172524, 200
3000, 172792, 268
3010, 173144, 352
...
4990, 179592, 576
5000, 178828, -764
5010, 178896, 68
...
7990, 179812, -984
8000, 179684, -128
8010, 179752, 68
...

You can see size stabilises at about 180mb after maybe 8000 iterations, so I would say that's classic memory fragmentation and no leaks. jemalloc makes it stabilise more quickly:

$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./thumbnail-loop.php nina.jpg
iteration, now (kb), growth (kb)
0, 118720, 118720
10, 156060, 37340
20, 162596, 6536
30, 163560, 964
40, 164576, 1016
50, 165052, 476
60, 165824, 772
70, 165192, -632
80, 164836, -356
90, 163388, -1448
100, 162844, -544
110, 164424, 1580
120, 162100, -2324
130, 159912, -2188
140, 161928, 2016
150, 161348, -580
160, 159000, -2348
...

Now it's stable after only 100 iterations, and at a lower level.
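
The "stabilises, so fragmentation rather than a leak" reasoning can be made a little more mechanical by looking at the average change per sample over a window of readings. This is only a rough sketch: the function name mean_growth_kb is invented here, and the window sizes are arbitrary. The readings are copied from the first (glibc malloc) run above.

```php
<?php
/**
 * Average change in RSS (kB) per sample across a list of readings.
 * A value near zero suggests a plateau; a value that stays clearly
 * positive across many late windows would point towards a real leak.
 */
function mean_growth_kb(array $samples): float
{
    $n = count($samples);
    if ($n < 2) {
        throw new InvalidArgumentException('need at least two samples');
    }

    // Net change from first to last reading, spread over the intervals.
    return ($samples[$n - 1] - $samples[0]) / ($n - 1);
}

// Readings from the first run above (one sample every 10 iterations).
$early = [107252, 139800, 142152, 143280, 145464]; // iterations 0..40
$late = [179812, 179684, 179752];                  // iterations 7990..8010

printf("early: %.0f kb per sample\n", mean_growth_kb($early)); // early: 9553 kb per sample
printf("late:  %.0f kb per sample\n", mean_growth_kb($late));  // late:  -30 kb per sample
```

Start-up growth dominates the early window, while the late window is flat to within noise, which is the pattern described above.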

@jcupitt
Member

jcupitt commented May 28, 2025

That's php 8.4.5 on ubuntu 25.04 with libvips 8.17, I should have said, though I think older versions behave the same.

@jcupitt
Member

jcupitt commented May 28, 2025

I tried with libvips 8.14 and it also seems fine:

$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./thumbnail-loop.php nina.jpg
iteration, now (kb), growth (kb)
0, 82844, 82844
10, 103416, 20572
20, 107012, 3596
30, 110028, 3016
40, 110904, 876
50, 112632, 1728
60, 111280, -1352
70, 113108, 1828
80, 110136, -2972
90, 107776, -2360
100, 107028, -748
110, 107964, 936
120, 112792, 4828
130, 108032, -4760
140, 106568, -1464
150, 108268, 1700
160, 106944, -1324
170, 108624, 1680
180, 106192, -2432
190, 105140, -1052
...

I think I'd need some way to be able to reproduce this high memory use before I could investigate further.

@jcupitt
Member

jcupitt commented May 28, 2025

Hi again @mzur, I finally got around to this (sorry). I made this program, based on yours:

#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';
use Jcupitt\Vips;

$access = 'sequential';
// $access = 'random';
    
echo "iteration, now (kb), growth (kb)\n";
$prev = 0;
for ($i=0; $i < 100000; $i++) {
    $image = Vips\Image::newFromFile($argv[1], ['access' => $access]);
    $width = $image->width;
    $height = $image->height;
        
    $buf = $image
        ->crop(round($width / 2) - 150, round($height / 2) - 150, 300, 300)
        ->writeToBuffer('.jpg');

    if ($i % 10 == 0) {
        gc_collect_cycles();
        $pid = getmypid();
        $now = intval(`ps --pid $pid --no-headers -orss`);
        $use = $now - $prev;
        $prev = $now;
        echo "$i, $now, $use\n";
    }
}

With a 6k x 4k sample JPEG I see:

$ ./mzur.php ~/pics/nina.jpg
iteration, now (kb), growth (kb)
0, 97676, 97676
10, 420440, 322764
20, 491712, 71272
30, 617412, 125700
40, 646892, 29480
50, 571248, -75644
60, 583912, 12664
70, 598468, 14556
80, 653044, 54576
90, 653912, 868
100, 621688, -32224
110, 660300, 38612
120, 681684, 21384
130, 669884, -11800
140, 679448, 9564
150, 670444, -9004
160, 670124, -320
170, 671568, 1444
180, 689084, 17516
...

It seems stable up to 3000 iterations at least.

600mb is a lot. This PC has 32 cores and this task has almost no parallelism, so you can shrink the libvips threadpool right down with no loss in speed:

$ VIPS_CONCURRENCY=1 ./mzur.php ~/pics/nina.jpg
iteration, now (kb), growth (kb)
0, 67320, 67320
10, 97252, 29932
20, 103456, 6204
30, 105116, 1660
40, 106180, 1064
50, 107336, 1156
60, 107540, 204
70, 107048, -492
80, 107020, -28
90, 107076, 56
100, 106828, -248
110, 107208, 380
120, 107244, 36
130, 106964, -280
140, 107680, 716
150, 107524, -156
160, 107336, -188
170, 107764, 428
180, 107688, -76
190, 107736, 48
200, 107276, -460
210, 107216, -60
220, 107332, 116
230, 107568, 236
240, 107428, -140
250, 107140, -288
...
26070, 105752, 396
26080, 105368, -384
26090, 106204, 836

Now it's down to 100mb. jemalloc would help a little more.
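
If setting an environment variable is awkward (for example in a queue worker), the same limit can probably be applied from inside the script. The sketch below assumes that Config::concurrencySet() in php-vips has the same effect as VIPS_CONCURRENCY for this workload. That has not been measured here, so treat it as a starting point rather than a verified fix.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Jcupitt\Vips\Config;
use Jcupitt\Vips\Image;

// Assumption: one libvips worker thread is enough, since this task has
// almost no parallelism.
Config::concurrencySet(1);

// Operation cache off as well, as suggested earlier in the thread.
Config::cacheSetMax(0);

$image = Image::newFromFile($argv[1], ['access' => 'sequential']);

// Same centre crop as the test program above, using integer arithmetic.
$buf = $image
    ->crop(intdiv($image->width, 2) - 150, intdiv($image->height, 2) - 150, 300, 300)
    ->writeToBuffer('.jpg');

echo strlen($buf) . " bytes written\n";
```

Both calls are made once, before any image is opened, so they apply to everything the process does afterwards.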
