Preventing Problems in PHP Security

As any PHP developer who’s been around for a while will tell you, there’s a certain kind of stigma that comes with the language. They’ll hear from peers using other languages that PHP is “sloppy” or that “it’s just a scripting language, not a real one.” One other charge seems to follow the language around as well: that it’s insecure. Sure, PHP’s not without its problems, but any language is going to have its share. Ruby’s had several major vulnerabilities in the press lately and Java has definitely had its own list over its extensive lifetime. People put down PHP for not being secure, but they forget that it’s not the language that makes for insecure code, it’s the developer.

PHP, by its nature, is “meant to die” at the end of every request, so developers don’t have to worry about some things that more persistent languages do. There are still some common dangers, though, that you as a PHP developer should be aware of. The most common ones come from the well-known OWASP Top 10 list. Here’s a quick look at how to help prevent just a few:

SQL Injection attacks

If you keep up with security at all, you’ve undoubtedly seen articles about break-ins and hacks of large companies (smaller ones too) that were the result of something called a SQL injection (SQLi) flaw. Basically, this is an attack where the person wanting access to the system supplies a specially formatted string as part of an input, and that string makes its way down to the database level and executes a malicious command. For example, say an input in your script comes from the $_GET superglobal. If you’re not escaping or filtering what the user is giving you, that value could be anything. Scary, right? Well, there’s one easy thing you can do in PHP that can help with this. You can use the built-in database abstraction layer, PDO, and its prepared statements feature. Here’s an example that could go a long way to help prevent SQL injection issues:

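<?php
// Assumption: $myValue holds the untrusted input, e.g. from $_GET['one']
$myValue = isset($_GET['one']) ? $_GET['one'] : '';

$dbh = new PDO('mysql:dbname=test;host=127.0.0.1');
// The :one placeholder keeps the value out of the SQL text itself
$stmt = $dbh->prepare('select foo from baz where bar = :one');
$stmt->bindParam(':one', $myValue);
$stmt->execute();
?>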

The thing that makes this different is that the value passed in isn’t a part of a SQL statement built as a string. Imagine the kind of problems that might be caused if you allowed something like:

<?php
$sql = 'select foo from baz where bar = '.$_GET['one'];
?>

Using bound parameters is a good (and effective) first line of defense for keeping your application safe from SQL injection.
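To see the difference in practice, here’s a minimal sketch that runs without a MySQL server. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in; the `findFoo` helper and the sample rows are made up for illustration, while the table and column names follow the example above.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the MySQL server
// so the sketch runs without credentials.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE baz (foo TEXT, bar TEXT)');
$dbh->exec("INSERT INTO baz (foo, bar) VALUES ('public', 'a'), ('secret', 'b')");

// Hypothetical helper: look up rows with the user-supplied value bound as data.
function findFoo(PDO $dbh, string $userValue): array
{
    $stmt = $dbh->prepare('SELECT foo FROM baz WHERE bar = :one');
    $stmt->bindValue(':one', $userValue, PDO::PARAM_STR);
    $stmt->execute();

    // Return just the values of the first (and only) selected column.
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// A normal lookup matches the one row where bar is 'a'.
$normal = findFoo($dbh, 'a');

// A classic injection string is compared literally against bar, so nothing matches.
$attack = findFoo($dbh, "a' OR '1'='1");
```

Had the same attack string been concatenated into the query, the WHERE clause would have become `bar = 'a' OR '1'='1'` and matched every row in the table.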

Cross-Site Scripting

Cross-site scripting (XSS) is an attack where user input is blindly echoed back out to the browser unfiltered and unescaped. Unfortunately, PHP does get a knock on this one as nothing in the language escapes output automatically. There are several ways to output user-supplied data back to the browser, using things like echo or printf, but none of them will check the data for you. Instead, it’s left up to the developer to figure out the best way to handle the user data.

Obviously, this has its positives and negatives. A positive is that developers don’t have to wrestle with a system that may or may not do what they want. This gives them more flexibility to handle the data correctly for their situation. Unfortunately, the converse is also true. It’s very easy for a developer to just not think about filtering or validating the data and feed it right back to the user. Think what might happen if someone passed a string containing JavaScript in a $_GET parameter, like:

http://mysite.com?foo=<script>alert('xss is bad, mkay');</script>

Now, that’s a pretty simplistic example, but it leaves the door open to a lot of potential attacks, especially if there’s no filtering at all. So, what can a PHP developer do to help prevent this? The answer is one of the easiest to talk about, but one of the most difficult to implement: filtering input and escaping output.

Thankfully, PHP (and the community surrounding it) has come up with some things to help make mitigating this risk a bit simpler. First off, filtering input. PHP comes with some string handling functions that can make it easier to check the incoming data, but one function stands out as being both flexible and reliable: filter_var. It can be used to both filter and validate incoming data. Here are some examples:

<?php
// to validate an email address, no regex needed (result = false)
$result = filter_var('bad.email', FILTER_VALIDATE_EMAIL);

// to remove everything except digits and +/- signs (result = '12343', a string)
$result = filter_var('123gfdsb43gsgf', FILTER_SANITIZE_NUMBER_INT);
?>

There’s a whole list of filters that can be applied to the data to help you figure out if it’s valid or to get just the valid parts out of it.
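Validation filters can also take options. As a rough sketch, here’s one way to accept a value only if it’s an integer inside a given range. The `parsePage` helper and the 1 to 1000 range are hypothetical, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: accept a page number only if it is an integer
// between 1 and 1000; anything else yields null.
function parsePage($raw): ?int
{
    $page = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 1000],
    ]);

    // FILTER_VALIDATE_INT returns false when validation fails.
    return $page === false ? null : $page;
}

$good = parsePage('42');        // a valid integer string
$junk = parsePage('42abc');     // trailing junk fails validation
$low  = parsePage('0');         // outside the allowed range
$xss  = parsePage('<script>');  // not a number at all
```

Because the result is either a real integer or null, the rest of the code never has to handle a half-cleaned string.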

So that’s the input side…what about output? Well, there are a few things that can help with that too:

htmlspecialchars: This built-in function converts characters that have special meaning in HTML (like <, >, & and quotes) into their HTML entity equivalents. Be sure you call it with a value for the optional encoding, though, as this helps prevent a whole other kind of issue along with the escaping.
Using external output handling libraries: There’s a whole host of templating libraries out there for PHP and some are better at handling output escaping than others. One prime example to look into is the Symfony-related project Twig. Twig, by default, escapes output to help prevent HTML from being injected into your page. In fact, you have to specifically tell it where in your templates you want to turn that off, reducing the risk that you missed something somewhere.
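For the built-in route, a small wrapper keeps the flags and encoding consistent everywhere you output data. This is only a sketch: the helper name `e` is made up for illustration, and it assumes your pages are served as UTF-8.

```php
<?php
// Hypothetical helper name; wraps htmlspecialchars with explicit flags and encoding.
function e(string $value): string
{
    // ENT_QUOTES escapes both single and double quotes;
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning ''.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// The payload from the URL example above, as it would arrive in $_GET['foo'].
$foo = "<script>alert('xss is bad, mkay');</script>";

// The markup is rendered as text instead of being run by the browser.
$html = '<p>You searched for: ' . e($foo) . '</p>';
```

Escaping at the point of output, rather than when the data is stored, means the same value can still be used safely in other contexts that need different escaping.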

As there always is with security advice, there’s a word of warning that comes with this. Just because you implement something like Twig doesn’t mean that your site is going to automatically be protected. All libraries and tools have their limitations and flaws, so you still need to validate that it’s doing its job everywhere and not just in “most places.” There are lots of tools out there that can help you find these places; look for a good security scanner to run against your application.

Cross-Site Request Forgery

Finally, I want to talk about another major issue that’s pretty widespread across web applications, and not just PHP ones. It’s a vulnerability called “cross-site request forgery” (CSRF) and it can be pretty hard to detect. The reason it’s so dangerous is that it uses the victim’s own credentials to perform an action on the target site as if the victim had requested it. For example, if there’s a URL/resource in your app that adds admin rights to another user, and an attacker can get some unsuspecting admin to click on a special link, there’s a good possibility that an unprotected request could perform the action and you’d be none the wiser.

There are a few ways to help prevent this in PHP-based applications and they’re relatively simple to implement. The first of them is more of a good rule to follow when creating your applications and involves the difference in HTTP verbs. If you’re familiar with the underlying HTTP request/response structure, you already know what GET and POST are for. If not, here’s a quick summary: a GET request is usually what happens when you go to something like http://owasp.org in your browser. The browser asks the server to “get” the data for that page and send it back. POST, on the other hand, is submitting (“posting”) data back to the server.

A CSRF attack usually relies on a vulnerable GET URL as a target. The rule of thumb when planning an application is to make GET requests safe and idempotent. This means they can be run over and over without changing the state of the application. If you want the user to change something or give you data, you want to POST instead.
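One way to enforce that rule is to check the request method at the top of any script that changes data. The sketch below is just one possible shape for that check; the `statusForWriteRequest` helper is hypothetical, and the `??` operator assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the HTTP status a state-changing endpoint
// should answer with for the given request method.
function statusForWriteRequest(string $method): int
{
    // Only POST may change state here; everything else gets 405 Method Not Allowed.
    return strtoupper($method) === 'POST' ? 200 : 405;
}

// REQUEST_METHOD is not set on the command line, so fall back to GET there.
$method = $_SERVER['REQUEST_METHOD'] ?? 'GET';
$status = statusForWriteRequest($method);

if ($status !== 200) {
    http_response_code($status);
    // A real handler would stop here, before touching any data.
}
```

This doesn’t stop a forged POST on its own, but it does close off the easiest route: a link or image tag pointing at a GET URL that changes something.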

“So, how do I protect myself in a POST request,” you might be asking, “isn’t that still vulnerable in the same way?” The short answer is “yes” (by default at least). Nothing about a POST request in PHP proves that it came from a form on your own site. Enter CSRF tokens. A CSRF token is a randomly generated string that’s embedded into the form data (the POST data, usually coming from a form on a site). This same string is stored in the user’s session and validated when the POST submit is made.

Here’s an example using the excellent RandomLib library from Anthony Ferrara to generate a token:

<?php
// Assumes session_start() has already been called and RandomLib is autoloaded
$generator = (new \RandomLib\Factory())->getMediumStrengthGenerator();
$token = $generator->generateString(64);
$_SESSION['token'] = $token;
?>

This three-line call generates the token to use and sets it into the user’s session (sessions are created even if the user isn’t logged in). The value of `$token` can then be pushed out to the page as a hidden HTML field in the form. When the form is submitted, the values are checked against each other:

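<?php
// hash_equals() (PHP 5.6+) compares in constant time; a missing token also fails
if (!isset($_SESSION['token'], $_POST['token'])
    || !is_string($_POST['token'])
    || !hash_equals($_SESSION['token'], $_POST['token'])) {
    die('fail, the tokens did not match!');
}
?>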

Using this simple method, you can check to see if the request did in fact come from the page on your site. Making the string random each time the page is loaded also makes it less likely that an attacker would be able to guess it and abuse your site’s trust.
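If you’d rather not pull in a library, the same flow can be sketched with PHP’s own random_bytes function (PHP 7 and later), which is swapped in here for RandomLib. The three helper names are hypothetical, and `$session` stands in for `$_SESSION` so the sketch can run without an active session.

```php
<?php
// Hypothetical helper: create a token and remember it in the session array.
function issueCsrfToken(array &$session): string
{
    // 32 random bytes, hex-encoded, gives a 64-character token.
    $session['token'] = bin2hex(random_bytes(32));

    return $session['token'];
}

// Hypothetical helper: render the hidden form field that carries the token.
function csrfField(string $token): string
{
    return '<input type="hidden" name="token" value="'
        . htmlspecialchars($token, ENT_QUOTES, 'UTF-8') . '">';
}

// Hypothetical helper: check the submitted token against the stored one.
function isValidCsrfToken(array $session, array $post): bool
{
    if (!isset($session['token'], $post['token']) || !is_string($post['token'])) {
        return false;
    }

    // hash_equals compares in constant time to avoid leaking timing information.
    return hash_equals($session['token'], $post['token']);
}

// Stand-in for $_SESSION; in a real page, call session_start() and pass $_SESSION.
$session = [];
$token = issueCsrfToken($session);
$field = csrfField($token);
```

In a real application, `issueCsrfToken($_SESSION)` would run when the form is rendered and `isValidCsrfToken($_SESSION, $_POST)` when it is submitted.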

But wait, there’s more…

I’ve only gone over three of the OWASP Top 10 list in this article, and there are still seven others that are serious issues in their own right. There are, thankfully, a lot of articles out on the web about helping mitigate those in PHP applications. A word of warning, though: be sure to check the date on the articles you’re reading. Some of them may be stale and might not provide the most up-to-date and reliable solutions to some of these issues.

I hope that I’ve helped to shed some light on these three major issues, though. Security in development should be something that’s on every developer’s mind all through the development process, not just “bolted on” at the end. If you write good, solid and secure code from the start and use effective tools to help, you’ll save yourself and your company a lot of heartache down the line.
