October 28, 2014 Johannes Schmitt schmittjoh

Security Audit of Request Data for PHP

The Open Web Application Security Project (OWASP) considered Injection attacks to be the most critical security risk in 2013. There are many different forms of injection attacks such as SQL injection, Path injection, Command injection, and many more. Injection attacks especially hurtful because they are easy to exploit, and can cause severe damage.

At Scrutinizer, we already notified you of SQL injection issues f.e. if you did not use parameter binding offered by PHP’s PDO abstraction, or libraries like Doctrine. However, there is a whole range of other PHP functions that are safe per-se, but can lead to severe security issues if passed arbitrary user input. file_get_contents for example is one of these functions. Malicious users can use it to gain access to sensitive credentials from your application.

Flagging each call to file_get_contents would be a very naive approach and not really helpful. However, at Scrutinizer we have invested into a solid foundation. Our PHP analyzer is almost like a compiler for PHP and uses advanced techniques such as data flow analysis, and abstract interpretation which distinguishes us from other competitors, or also existing open-source tools that only use AST-based approaches. It are these very analysis techniques that help us understand how data flows through your application.

In the most recent version of PHP analyzer, we have added a security analysis framework that specifically focuses on making sure your request data is not ending up in one of PHP’s sensitive functions, and that consequentially could make you vulnerable to injection attacks. Let’s take a look at an example:

use Symfony\Component\Yaml\Yaml;
use Symfony\Component\HttpFoundation\Request;

class Instance
{
    private $config;

    public function __construct($config)
    {
        $this->config = $config;
    }

    public function getParsedConfig()
    {
        return Yaml::parse($this->config);
    }
}

class MyController
{
    public function createAction(Request $request)
    {
        $instance = new Instance($request->request->get('config'));
        // ... save new instance, etc.
    }
}

In this example, we are assigning a YAML configuration from request data to a class property, and then save the new instance. Let’s see what Scrutinizer finds when analyzing this code:

File-Expansion Security Issue

This might come a bit unexpected. There is no problem with database persistence here, Doctrine takes care of escaping the content for us, but we still introduced a security issue by passing raw request data to the Yaml::parse() function which in turn calls file_get_contents(). Let’s take a look at it to see what is happening there:

class Yaml
{
    public static function parse($input, $exceptionOnInvalidType = false, $objectSupport = false)
    {
        // if input is a file, process it
        $file = '';
        if (strpos($input, "\n") === false && is_file($input)) {
            if (false === is_readable($input)) {
                throw new ParseException(sprintf('Unable to parse "%s" as the file is not readable.', $input));
            }

            $file = $input;
            $input = file_get_contents($file);
        }

        // ...
    }
}

This function has a feature that expands the input in case it is a valid file path, and replaces it with the contents of that path before parsing its actual contents. If someone were to pass a path such as ../app/config/parameters.yml, he might gain access to the parameters file of a Symfony2 application.

I would like to point out that we have not hard-coded the Yaml::parse function in PHP analyzer, but since we not only analyze your application, but also all your dependencies. We also assess the security relevance of the specific dependency version that your application uses. Analyzing your dependencies is even more important as you might not be as familiar with your dependencies’ code and consequentially also not aware of some of their features such as the file expansion shown above. This also allows us to compile a stack trace for you that aids in assessing the security issue more easily.

We are proud to now add this additional level of security to your PHP applications on Scrutinizer. Currently, we support tracking request data from Symfony2’s, and Zend’s request abstraction, as well as usage of PHP’s super globals like $_GET. If you are using another request data abstraction, just let us know by opening an issue.

Happy & secure coding!


September 25, 2014 Johannes Schmitt schmittjoh

Better insight in how containers are used

This is a commonly requested feature, and I’m happy to announce that this is now available. If one of your tasks is waiting and not executed, you can now view which other tasks are currently using your plan’s containers.

Image that shows the tasks that use a container

We also provide some statistics on how long your tasks are in waiting state before being run on your billing page helping you decide whether you might need to add another container to your plan:

Image from the billing page that shows containers

Enjoy!


July 28, 2014 Johannes Schmitt schmittjoh

PHP Analyzer: New Features and Bug Fixes

We have recently deployed a new version of PHP Analyzer. This minor version upgrade contains mostly bug fixes and a couple of new features/improvements.

New: Deadlock Detection

One of the major new features in this upgrade is deadlock detection. This is mainly intended for CLI scripts where such deadlocks might go unnoticed for a while. PHP Analyzer now checks the exit conditions of loops for whether they are either always satisfied, or can never be satisfied.

Let’s have a look at an example:

function isMergeable(GitHubRepository $repository)
{
    $i = 0;
    do {
        if ($i > 0) {
            $waitTime = pow(2, $i) * 1;
            sleep($waitTime);
        }

        $prDetails = $this->api->getPrDetails($repository->getLogin(), $repository->getName(), $prNumber);
        if ($prDetails['mergeable'] === true) {
            return true;
        }
    } while ($i < 3);

    return false;
}

This code fragment comes from our code-base. It is run as a background process whenever GitHub notifies us of a pull-request. We pull the GitHub API to check whether a pull-request is mergeable and return true or false. However, you might have spotted a small mistake in the loop, we are actually missing a $i++ or similar to increase the counter, and as such the condition $i < 3 is always true.

This did go unnoticed for a while since the actual result, i.e. creating an inspection when the pull-request was mergeable or ignoring the pull-request if it was not mergeable was achieved. However, we saw some errors because jobs exceeded their maximum runtime which eventually led us to find this error.

Even if the cost of this error was not high in this case, PHP Analyzer now makes sure that we do not slip in any unsatisfiable loops. This check is enabled by default if you are using the tools configuration, and can be enabled via the checks configuration as follows:

# .scrutinizer.yml
# Only add this if you already have a "checks" section.
checks:
    php:
        deadlock_detection_in_loops: true

Bug Fixes & Improvements

This release also contains a couple of bug fixes, and minor improvements. Some of the highlights below:

  • Various fixes to our PHP stubs: This mainly improves checks where we verify that you pass the right number of arguments, and the right types. PHP stubs tell the analyzer which types are built-in into the PHP runtime.
  • Type inference around associative arrays: PHP Analyzer is more accurate at inferring types of arrays that are used as maps with named keys.
  • Side-effects analysis supports some cases better: This makes various checks more accurate that draw on the results of the side-effects analysis. The side-effects analysis checks whether a function invocation has any effect after the invocation is done.

... and some more improvements, see the full changelog.

Thanks, and happy inspecting! :)


June 2, 2014 Johannes Schmitt schmittjoh

Support for Merging Code Coverage Data

A frequent request we got, especially when you start to parallelize your test-suite, is to merge the code coverage data that is produced. We now support this feature natively. All you need to do, is to define the runs option in your configuration:

# .scrutinizer.yml
tools:
    external_code_coverage:
        runs: 2    # Wait for two code coverage submissions

This will make Scrutinizer wait for two code coverage submissions instead of proceeding directly after receiving the first one.

Happy inspecting :)


March 26, 2014 Johannes Schmitt schmittjoh

Finding Duplicated Code in PHP reloaded

Code duplication is generally considered a bad programming style. While it might be appealing and in certain situations faster to duplicate certain code fragments, it often comes at the cost of increased maintenance efforts like harder to change code. Usually, duplicated code is introduced either by copy/pasting. In large code bases, it could also happen that two developers write similar code without even knowing about it.

In PHP, Copy/Paste Detector so far could be used for detecting code clones. On Scrutinizer, it was enabled by default on all PHP projects. PHP Copy/Paste Detector operates on the token stream which is generated by PHP’s token_get_all function and uses string hashing techniques to find duplicates. This makes it quite good at detecting large literal code duplications. However, often developers modify copy/pasted code slightly which could then not be detected anymore.

To overcome the current shortcomings, we are happy to introduce a new tool for duplicated code detection, PHP Similarity Analyzer. PHP Similarity Analyzer is based on latest academic research combined with our in-depth practical experience. It has been evaluated against a range of open-source and closed-source projects and its results are already very promising. In contrast to PHP Copy/Paste Detector, it is robust against code modification and also finds smaller code fragments which make very good targets for refactoring. Besides, it not only detects copy/pasted code, but also semantically similar code.

You can enable PHP Similarity Analyzer on your repository with the following configuration:

tools:
    php_sim: true

    # PHP Similarity Analyzer and Copy/paste Detector cannot be used at
    # the same time right now. Make sure to either remove, or disable one.
    php_cpd: false

Check it out, and let us know what you think.

Happy inspecting :)