December 4 Johannes Schmitt schmittjoh

Composer, the GC performance improvement, and who else might be affected

Chances are you might have heard of sensational performance improvements in PHP’s package manager, composer, of 30-90% of original runtime by changing not even a single line of code.

I initially suggested this change to the composer team on December 1st, and after some positive reports, it made its way into the production version of composer soon after, and is since then available to all PHP users.

How was the issue found?

At Scrutinizer, we run a lot of CLI-based analysis tools, some of which are written in PHP, and some of these create a lot of objects f.e. during type-inference.

Very early in Scrutinizer’s development (maybe something like 1.5 years ago), we noticed that the analysis time of bigger packages was not increasing linearly, but rather expontentially. Of course, we used profilers like XHProf to track down the issue, but the results were inconclusive; time was spent randomly in different places.

After some research, we came to the conclusion that garbage collection might be responsible, and indeed after disabling it, the analysis of a bigger package like Symfony2 went down from over 50 minutes to less than 10 minutes (this was more than a year ago).

It was only until a few days ago, that we realized that composer did not contain this fix, and might benefit from this, too. When a performance improvement that Nils made had a significantly different impact depending on how many dependencies someone had. The rest of the story is history now :)

Is your app affected, too?

Chances are most likely, no. This particular bad behavior of PHP’s GC only manifests itself in long running CLI processes that create tens of thousands of objects. Web-requests are likely not affected.

If you profile your application, and time is randomly spent in different places from run to run. This is the usual pattern to look for. If you see this, only then try running with GC disabled.

There are also some initiatives now to provide better support for profilers to measure the time spent in GC which will make this even easier.

Is memory consumption increased when disabling GC?

The short, and counter-intuitive answer is no. You will not see a change in your memory consumption.

First, the call to gc_disable() does not turn off garbage collection entirely, it just disables one particular GC strategy that cleans up circular references.

Second, if you see this particular performance degradation, the garbage collection is not working anyway. It basically tries to clean-up your objects, only to recognize that it cannot clean them up, and it does that frequently. This is where the time is lost and the huge improvement is coming from when disabling it.

For a detailed explanation of the different GC strategies in PHP, check out this blog post from Anthony.

Which other tools might be affected?

Like said above, this is mostly something that affects CLI tools. One prime candidate that comes to mind is phpunit. If you have a big test-suite, try running it with GC disabled:

php -d zend.enable_gc=0 vendor/bin/phpunit

 

Have Feedback? Tweet to @scrutinizerci

If you experienced a bug or have any questions, please send them to [email protected].