Chances are you might have heard of sensational performance improvements in PHP’s package manager, composer, of 30-90% of original runtime by changing not even a single line of code.
I initially suggested this change to the composer team on December 1st. After some positive reports, it made its way into the production version of composer soon after, and it has been available to all PHP users since then.
At Scrutinizer, we run a lot of CLI-based analysis tools, some of which are written in PHP, and some of these create a lot of objects, for example during type inference.
Very early in Scrutinizer’s development (maybe something like 1.5 years ago), we noticed that the analysis time of bigger packages was not increasing linearly, but rather exponentially. Of course, we used profilers like XHProf to track down the issue, but the results were inconclusive; time was spent randomly in different places.
After some research, we came to the conclusion that garbage collection might be responsible, and indeed after disabling it, the analysis of a bigger package like Symfony2 went down from over 50 minutes to less than 10 minutes (this was more than a year ago).
It was not until a few days ago that we realized that composer did not contain this fix and might benefit from it, too: a performance improvement that Nils made had a significantly different impact depending on how many dependencies someone had. The rest of the story is history now :)
Should you disable garbage collection in your own application as well? Most likely, no. This particular bad behavior of PHP’s GC only manifests itself in long-running CLI processes that create tens of thousands of objects. Web requests are likely not affected.
If you profile your application and time is spent randomly in different places from run to run, that is the usual pattern to look for. Only if you see this should you try running with GC disabled.
There are also some initiatives now to provide better support for profilers to measure the time spent in GC which will make this even easier.
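One way to check whether the cycle collector is the culprit is to time the same workload with it enabled and disabled. The following is a minimal sketch, not a replacement for a profiler: timeRun() and the $buildGraph workload are hypothetical names made up for illustration, and only gc_enabled(), gc_enable(), gc_disable() and microtime() are PHP built-ins.

```php
<?php
/**
 * Hypothetical helper: runs $work with the cycle collector switched on or
 * off, restores the previous setting, and returns the elapsed seconds.
 */
function timeRun(callable $work, $gcEnabled)
{
    $wasEnabled = gc_enabled();
    $gcEnabled ? gc_enable() : gc_disable();

    $start = microtime(true);
    try {
        $work();

        return microtime(true) - $start;
    } finally {
        // Put the collector back into the state we found it in.
        $wasEnabled ? gc_enable() : gc_disable();
    }
}

// Hypothetical workload: many objects that reference each other and stay alive.
$buildGraph = function () {
    $nodes = array();
    for ($i = 0; $i < 20000; $i++) {
        $a = new stdClass();
        $b = new stdClass();
        $a->peer = $b;
        $b->peer = $a;
        $nodes[] = $a;
    }

    return count($nodes);
};

printf("GC enabled:  %.3fs\n", timeRun($buildGraph, true));
printf("GC disabled: %.3fs\n", timeRun($buildGraph, false));
```

A toy workload like this may show little or no difference; the comparison is only meaningful when $work is replaced with the real long-running task.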
The short and counter-intuitive answer is no: you will not see a change in your memory consumption.
First, the call to gc_disable() does not turn off garbage collection entirely; it just disables one particular GC strategy that cleans up circular references.
Second, if you see this particular performance degradation, the garbage collection is not working anyway. It basically tries to clean up your objects, only to recognize that it cannot clean them up, and it does that frequently. This is where the time is lost, and it is where the huge improvement comes from when disabling it.
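The first point can be illustrated with a small sketch. The Tracked class is made up for this example and simply counts destructor calls; it assumes a stock PHP build where the collector was enabled at startup. Objects without circular references are still freed right away by reference counting, and only the circular pair is left behind until the cycle collector runs.

```php
<?php
// Hypothetical class that counts how many instances have been destroyed.
class Tracked
{
    public static $destroyed = 0;
    public $other;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_disable(); // only switches off the cycle collector

$a = new Tracked();
unset($a); // reference count drops to zero, object is freed immediately
$afterSimple = Tracked::$destroyed; // 1

$b = new Tracked();
$c = new Tracked();
$b->other = $c;
$c->other = $b; // circular reference
unset($b, $c); // each object is still referenced by the other one
$afterCycle = Tracked::$destroyed; // still 1

gc_enable();
gc_collect_cycles(); // explicitly ask the cycle collector to run
$afterCollect = Tracked::$destroyed; // 3
```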
For a detailed explanation of the different GC strategies in PHP, check out this blog post from Anthony.
As mentioned above, this is mostly something that affects CLI tools. One prime candidate that comes to mind is phpunit. If you have a big test-suite, try running it with GC disabled:
php -d zend.enable_gc=0 vendor/bin/phpunit
The Open Web Application Security Project (OWASP) considered injection attacks to be the most critical security risk in 2013. There are many different forms of injection attacks such as SQL injection, path injection, command injection, and many more. Injection attacks are especially harmful because they are easy to exploit and can cause severe damage.
At Scrutinizer, we already notify you of SQL injection issues, for example if you do
not use the parameter binding offered by PHP’s PDO abstraction, or libraries like
Doctrine. However, there is a whole range of other PHP functions that are safe
per se, but can lead to severe security issues if passed arbitrary user input.
file_get_contents, for example, is one of these functions. Malicious users can
use it to gain access to sensitive credentials from your application.
Flagging each call to file_get_contents would be a very naive approach and not
really helpful. However, at Scrutinizer we have invested in a solid foundation.
Our PHP analyzer is almost like a compiler for PHP and uses advanced techniques
such as data flow analysis and abstract interpretation, which distinguishes us from
competitors and from existing open-source tools that only use AST-based approaches.
It is these very analysis techniques that help us understand how data flows through
your application.
In the most recent version of PHP analyzer, we have added a security analysis framework that specifically focuses on making sure your request data does not end up in one of PHP’s sensitive functions, which could consequently make you vulnerable to injection attacks. Let’s take a look at an example:
use Symfony\Component\Yaml\Yaml;
use Symfony\Component\HttpFoundation\Request;

class Instance
{
    private $config;

    public function __construct($config)
    {
        $this->config = $config;
    }

    public function getParsedConfig()
    {
        return Yaml::parse($this->config);
    }
}

class MyController
{
    public function createAction(Request $request)
    {
        $instance = new Instance($request->request->get('config'));

        // ... save new instance, etc.
    }
}
In this example, we are assigning a YAML configuration from request data to a class property, and then save the new instance. Let’s see what Scrutinizer finds when analyzing this code:
This might come as a bit of a surprise. There is no problem with database persistence
here, as Doctrine takes care of escaping the content for us, but we still introduced
a security issue by passing raw request data to the Yaml::parse() function, which
in turn calls file_get_contents(). Let’s take a look at it to see what is
happening there:
class Yaml
{
    public static function parse($input, $exceptionOnInvalidType = false, $objectSupport = false)
    {
        // if input is a file, process it
        $file = '';
        if (strpos($input, "\n") === false && is_file($input)) {
            if (false === is_readable($input)) {
                throw new ParseException(sprintf('Unable to parse "%s" as the file is not readable.', $input));
            }

            $file = $input;
            $input = file_get_contents($file);
        }

        // ...
    }
}
This function has a feature that expands the input in case it is a valid file path,
and replaces it with the contents of that path before parsing its actual contents.
If someone were to pass a path such as ../app/config/parameters.yml, they
might gain access to the parameters file of a Symfony2 application.
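One possible way to defend against this is to reject request values that would trigger the file expansion before they ever reach the parser. The guard below is only a sketch under that assumption: assertNotFilePath() is a hypothetical helper, not part of Symfony or Scrutinizer, and it simply mirrors the condition used inside Yaml::parse() above.

```php
<?php
/**
 * Hypothetical guard (not part of Symfony): refuses single-line strings
 * that name an existing file, i.e. exactly the inputs that the parser
 * shown above would replace with the contents of that file.
 */
function assertNotFilePath($input)
{
    if (!is_string($input)) {
        throw new InvalidArgumentException('Expected the YAML document as a string.');
    }

    // Same condition as the file expansion shown above.
    if (strpos($input, "\n") === false && is_file($input)) {
        throw new InvalidArgumentException('File paths are not accepted as YAML input.');
    }

    return $input;
}
```

In the controller above, the request value could be passed through such a guard before constructing the Instance, so that only literal YAML content is stored and parsed later.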
I would like to point out that we have not hard-coded the Yaml::parse function
in PHP analyzer. Since we analyze not only your application, but also all
your dependencies, we also assess the security relevance of the specific dependency
version that your application uses. Analyzing your dependencies is even more important
as you might not be as familiar with your dependencies’ code and consequently
also not aware of some of their features such as the file expansion shown above.
This also allows us to compile a stack trace for you that aids in assessing the
security issue more easily.
We are proud to now add this additional level of security to your PHP applications on
Scrutinizer. Currently, we support tracking request data from Symfony2’s and Zend’s
request abstractions, as well as usage of PHP’s superglobals like $_GET. If you are
using another request data abstraction, just let us know by opening an issue.
Happy & secure coding!
This is a commonly requested feature, and I’m happy to announce that this is now available. If one of your tasks is waiting and not executed, you can now view which other tasks are currently using your plan’s containers.
On your billing page, we also provide some statistics on how long your tasks wait before being run, helping you decide whether you might need to add another container to your plan.
Enjoy!
We have recently deployed a new version of PHP Analyzer. This minor version upgrade contains mostly bug fixes and a couple of new features/improvements.
One of the major new features in this upgrade is deadlock detection. This is mainly intended for CLI scripts where such deadlocks might go unnoticed for a while. PHP Analyzer now checks the exit conditions of loops for whether they are either always satisfied, or can never be satisfied.
Let’s have a look at an example:
function isMergeable(GitHubRepository $repository)
{
    $i = 0;
    do {
        if ($i > 0) {
            $waitTime = pow(2, $i) * 1;
            sleep($waitTime);
        }

        $prDetails = $this->api->getPrDetails($repository->getLogin(), $repository->getName(), $prNumber);
        if ($prDetails['mergeable'] === true) {
            return true;
        }
    } while ($i < 3);

    return false;
}
This code fragment comes from our code-base. It is run as a background process
whenever GitHub notifies us of a pull-request. We poll the GitHub API to check
whether a pull-request is mergeable and return true or false. However,
you might have spotted a small mistake in the loop: we are actually missing a $i++
or similar to increase the counter, and as such the condition $i < 3 is
always true.
This went unnoticed for a while since the actual result was still achieved, i.e. an inspection was created when the pull-request was mergeable, and the pull-request was ignored when it was not. However, we saw some errors because jobs exceeded their maximum runtime, which eventually led us to find this error.
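For comparison, here is a sketch of how a bounded version of such a polling loop could look. It is not the code we deployed: pollUntilTrue() and its parameters are hypothetical, the API call is replaced by a generic callable, and the sleep function is injectable so the loop can be exercised without actually waiting.

```php
<?php
/**
 * Hypothetical helper: calls $check up to $maxAttempts times and waits
 * 2, 4, 8, ... seconds between attempts, as in the loop above.
 * $sleep defaults to PHP's sleep() and is only a parameter for testing.
 */
function pollUntilTrue(callable $check, $maxAttempts = 3, $sleep = 'sleep')
{
    // The counter is incremented by the for loop, so the exit condition
    // can actually become false.
    for ($i = 0; $i < $maxAttempts; $i++) {
        if ($i > 0) {
            $sleep((int) pow(2, $i));
        }

        if ($check() === true) {
            return true;
        }
    }

    return false;
}
```

Using a for loop keeps the initialization, the exit condition and the increment on a single line, which makes a missing increment much easier to spot.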
Even if the cost of this error was not high in this case, PHP Analyzer now
makes sure that no unsatisfiable loops slip in. This check is enabled
by default if you are using the tools configuration, and can be enabled via
the checks configuration as follows:
# .scrutinizer.yml
# Only add this if you already have a "checks" section.
checks:
    php:
        deadlock_detection_in_loops: true
This release also contains a couple of bug fixes, and minor improvements. Some of the highlights below:
... and some more improvements, see the full changelog.
Thanks, and happy inspecting! :)
A frequent request we get, especially from those who start to parallelize their test-suite,
is to merge the code coverage data that is produced. We now support this feature natively.
All you need to do is define the runs option in your configuration:
# .scrutinizer.yml
tools:
    external_code_coverage:
        runs: 2 # Wait for two code coverage submissions
This will make Scrutinizer wait for two code coverage submissions instead of proceeding directly after receiving the first one.
Happy inspecting :)