# Seven habits for writing secure PHP applications

When it comes to security, remember that in addition to actual platform and operating system security issues, you need to ensure that you write your application to be secure. When you write PHP applications, apply these seven habits to make sure your applications are as secure as possible:

• Validate input
• Guard your file system
• Guard your database
• Guard your session data
• Guard against Cross-Site Scripting (XSS) vulnerabilities
• Verify form posts
• Protect against Cross-Site Request Forgeries (CSRF)

Validating data is the most important habit you can adopt when it comes to security. And when it comes to input, it's simple: Don't trust users. Your users are probably good people, and most are likely to use your application exactly as you intended. However, whenever there is a chance for input, there is also a chance for really, really bad input. As an application developer, you must guard your application against bad input. Carefully considering where your user input is going and what it should be will allow you to build a robust, secure application.

Although file system and database interaction are covered later, there are general validation tips that cover every sort of validation:

• Use white-listed values
• Always revalidate limited selections
• Use built-in escape functions
• Validate for correct data types, like numbers

White-listed values are values that are valid, as opposed to black-listed values that are invalid. The distinction is that often when doing validation, the list or range of possible values is smaller than the list of invalid values, many of which can be unknown or unexpected.
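As a minimal sketch of a white-list check, the function below accepts a posted value only when it is one of the options the form actually offered. The name `isAllowedChoice()` and the shipping options are hypothetical, chosen only for illustration.

```php
<?php
/**
 * Hypothetical helper: accepts $value only when it is one of the options
 * the form actually offered.
 */
function isAllowedChoice($value, array $choices)
{
    // Third argument true = strict comparison (type and value must match).
    return is_string($value) && in_array($value, $choices, true);
}

// Assumed example values for a drop-down box.
$shippingOptions = ['ground', 'two-day', 'overnight'];
```

Because the check lists what is valid, any unexpected value (including an array sent in place of a string) is rejected without having to be anticipated.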

When you're doing validation, remember that it's often easier to conceptualize and validate what the application allows instead of trying to guard against all the unknown values. For instance, to limit values in a field to all numbers, write a routine that makes sure the input is all numbers. Don't write the routine to search for non-numerical values and mark it as invalid if any are found.
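The all-numbers case described above might look like the following sketch. The function name `isAllDigits()` is hypothetical; the point is that the pattern describes what is allowed instead of hunting for what is not.

```php
<?php
/**
 * Hypothetical helper: returns true only when $value is a non-empty
 * string made up entirely of the digits 0-9 (a white-list check).
 */
function isAllDigits($value)
{
    // Arrays can arrive from fields named like "field[]", so insist on a string.
    if (!is_string($value)) {
        return false;
    }
    // \A and \z anchor to the very start and end, so a trailing newline fails.
    return preg_match('/\A[0-9]+\z/', $value) === 1;
}
```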

In July 2000, a Web site leaked customer data that was found in files on a Web server. A visitor to the Web site manipulated the URL to view files containing the data. Although the files were erroneously placed, this example underscores the importance of guarding your file system against attackers.

If your PHP application does anything with files and has variable data that a user can enter, be careful that you scrub the user input to make sure users can't do anything with the file system that you don't want them to do. Listing 1 shows an example of a PHP site that downloads an image given a name.

As you can see, the relatively dangerous script in Listing 1 serves any file that the Web server has read access to, including files in the session directory (see "Guard your session data") and even some system files such as /etc/passwd. This example has a text box in which the user can type the file name for example purposes, but the file name could just as easily be supplied in the query string.
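One possible defense, sketched here under the assumption that all downloadable files live under a single base directory, is to resolve the requested name and confirm that the result is still inside that directory. The helper name `resolveInsideDirectory()` is hypothetical.

```php
<?php
/**
 * Hypothetical helper: resolves $name inside $baseDir and returns the
 * real path, or false when the result falls outside $baseDir.
 */
function resolveInsideDirectory($baseDir, $name)
{
    $base = realpath($baseDir);
    if ($base === false) {
        return false;               // base directory does not exist
    }
    // realpath() collapses "..", ".", and symbolic links; false if missing.
    $path = realpath($base . DIRECTORY_SEPARATOR . $name);
    if ($path === false) {
        return false;
    }
    // Require the resolved path to start with "<base>/".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($path, $prefix, strlen($prefix)) === 0 ? $path : false;
}
```

With this check, a name such as `../../etc/passwd` resolves to a path outside the base directory and is refused before any file is opened.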

Combining file system access with user input is dangerous, so it's best to avoid it altogether by designing your application to use a database and hidden, generated file names. However, that's not always possible. Listing 2 provides an example of a routine that validates file names. It uses a regular expression to make sure that only valid characters are used in the file name and checks specifically for the dot-dot characters (..).

Listing 2. Checking for valid file name characters
    function isValidFileName($file) {
        /* don't allow ".."; allow only "word" characters and single dots */
        return preg_match('/^(((?:\.)(?!\.))|\w)+$/', $file);
    }

In April 2008, a U.S. state's Department of Corrections leaked sensitive data as a result of the SQL column names being used in the query string. This leak allowed malicious users to select which columns they wanted to display, submit the page, and get the data. The leak shows how users can figure out ways to make their input do things that the application developers definitely didn't foresee, and it underscores the need for careful defense against SQL injection attacks.

Listing 3 shows an example of a script that runs an SQL statement. In this example, the SQL statement is a dynamic statement that would allow the same attack. Owners of this form may be tempted to think they're safe because they've limited the column names to select lists. However, the code neglects the habit of guarding against form spoofing: just because the code limits the selection to drop-down boxes doesn't mean that someone can't post a form with whatever they want in it (including an asterisk [*]).

Listing 3. Executing an SQL statement

    /* ... */
    echo $select;
    $result = mysql_query($select) or die(mysql_error());
    while ($row = mysql_fetch_assoc($result)) {
        echo $row[$col];
    }
    mysql_close($link);
    /* ... */

So, to form the habit of guarding your database, avoid dynamic SQL code as much as possible. If you can't avoid dynamic SQL code, don't use input directly for columns. Listing 4 shows an example of the power of adding a simple validation routine to the account number field to make sure it cannot be a non-number, in addition to using static columns.

Listing 4. Guarding with validation and mysql_real_escape_string()

    /* ... */
        echo $select;
        $result = mysql_query($select) or die(mysql_error());
        while ($row = mysql_fetch_assoc($result)) {
            echo $row['account_number'] . ' ' . $row['name'] . ' ' . $row['address'];
        }
        mysql_close($link);
    } else {
        echo "Please supply a valid account number!";
    }
    /* ... */

This example also shows the use of the mysql_real_escape_string() function, which escapes the characters that have special meaning inside a quoted SQL string. Note that the mysql_* functions, including this one, were removed in PHP V7.0; the mysqli and PDO extensions replace them, and both support prepared statements. If you've been relying on magic_quotes_gpc, be forewarned that it was deprecated in PHP V5.3 and removed in PHP V5.4. Write your PHP applications to be secure without it. Also, remember that if you're using an ISP, there's a chance that it doesn't have magic_quotes_gpc enabled.
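The listings use the mysql_* functions. As an alternative, the sketch below swaps in a PDO prepared statement, which keeps the account number out of the SQL text entirely. It is an illustration, not the article's code: the function name `findAccount()` and the table name `account_data` are assumptions, and the column names are taken from Listing 4.

```php
<?php
/**
 * Sketch only: looks up one account with a bound parameter.
 * The table name account_data is an assumption made for illustration.
 */
function findAccount(PDO $pdo, $accountNumber)
{
    // White-list check first: the account number must be all digits.
    if (!is_string($accountNumber) || !ctype_digit($accountNumber)) {
        return null;
    }
    // Static column list; the value travels separately from the SQL text.
    $stmt = $pdo->prepare(
        'SELECT account_number, name, address
           FROM account_data
          WHERE account_number = :account_number'
    );
    $stmt->execute([':account_number' => $accountNumber]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

A caller would pass an open connection and the raw posted value, for example `findAccount($pdo, $_POST['account_number'])`, and treat a null result as "not found or not valid."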

Finally, in the improved example, you can see that the SQL statement and output do not include a dynamic column selection. This way, columns that you add to the table later and that hold different information are not exposed unless you explicitly print them.

If you're using a framework to work with your database, there is a chance that your framework does the SQL validation for you already. Check the documentation for your framework to be sure; if you're still unsure, do the validation to err on the safe side. Even if you're using a framework for database interaction, you still need to perform the other verification.
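Where a user-selectable column truly cannot be avoided, one option is to have the form post a short key and map it to a column name on the server, so that user input never becomes part of the SQL text. This is a sketch under assumptions: the function name `columnForKey()` and the keys are hypothetical, and the column names are the ones used in Listing 4.

```php
<?php
/**
 * Hypothetical helper: translates a posted key into a known column name.
 * Returns null for anything that is not in the map.
 */
function columnForKey($key)
{
    // Keys are what the form posts; values are the only identifiers
    // that may ever be placed in a query.
    $map = [
        'acct' => 'account_number',
        'name' => 'name',
        'addr' => 'address',
    ];
    // Non-string input (such as an array) is rejected before the lookup.
    if (!is_string($key) || !isset($map[$key])) {
        return null;
    }
    return $map[$key];
}
```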

By default, session information in PHP is written to a temporary directory. Consider the form in Listing 5, which shows how to store a user's ID and account number in a session.

Listing 5. Storing data in session
 Storing session information

Listing 6 shows the contents of the /tmp directory.

Listing 6. The session files in the /tmp directory
 -rw------- 1 _www wheel 97 Aug 18 20:00 sess_9e4233f2cd7cae35866cd8b61d9fa42b

As you can see, the session file, when printed (see Listing 7), contains the information in a fairly readable format. Because the file has to be readable and writable by the Web server user, the session files can create a major problem for anyone on a shared server. Someone other than you can write a script that reads these files so they can try to get values out of the session.

Listing 7. The contents of a session file

Passwords should never, ever, ever be stored in plain text anywhere: not in a database, session, file system, or any other form. The best way to handle passwords is to store them as salted one-way hashes and compare the hash of the supplied password with the stored hash. Although this may seem obvious, storing them in plain text seems to be done quite a bit in practice. Any Web site that can send you your password instead of resetting it either stores the password in plain text or has code available for decrypting it. Even in the latter case, the code for decryption can be found and exploited.
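On PHP V5.5 and later, the built-in password_hash() and password_verify() functions cover this. The following is a minimal sketch; the sample passwords are made up, and a real application would store `$storedHash` in its user table.

```php
<?php
// At registration: store only the hash, never the password itself.
$storedHash = password_hash('correct horse', PASSWORD_DEFAULT);

// At login: password_verify() re-hashes the attempt using the salt and
// parameters embedded in $storedHash, then compares the results.
$ok  = password_verify('correct horse', $storedHash);   // true
$bad = password_verify('wrong guess', $storedHash);     // false
```

Because the salt and algorithm are stored inside the hash string, nothing else needs to be kept alongside it.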

You can do two things to guard your session data. The first is to encrypt anything that you put into session. But just because you've encrypted the data doesn't mean it's completely safe, so be careful with relying on this as your sole means of guarding your session. The alternative is to store your session data in a different place, like a database. You still have to make sure you're locking down your database, but this approach solves two problems: First, it puts your data into a more secure place than a shared file system; second, it enables your application to scale across multiple Web servers more easily with shared sessions across multiple hosts.

To implement your own session persistence, see the session_set_save_handler() function in PHP. With it, you can store session information in a database or implement a handler for encrypting and decrypting all of your data. Listing 8 provides an example of the function use and skeleton functions for implementation. You can also check out examples of how to use a database in the Resources section.
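Since PHP V5.4, the same callbacks can also be grouped in a class that implements the built-in SessionHandlerInterface. The sketch below assumes PHP V8 or later for its type declarations and keeps sessions in a plain array purely for illustration; a real handler would read and write a database table. The class name `ArraySessionHandler` is hypothetical.

```php
<?php
/** Illustrative in-memory handler; data is lost when the script ends. */
class ArraySessionHandler implements SessionHandlerInterface
{
    /** @var array<string, string> serialized session data keyed by ID */
    private array $store = [];

    public function open(string $path, string $name): bool
    {
        return true;
    }

    public function close(): bool
    {
        return true;
    }

    // read() must return a string; an empty string means "no data yet".
    public function read(string $id): string|false
    {
        return $this->store[$id] ?? '';
    }

    public function write(string $id, string $data): bool
    {
        $this->store[$id] = $data;
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->store[$id]);
        return true;
    }

    // Nothing expires in this sketch, so report zero sessions removed.
    public function gc(int $max_lifetime): int|false
    {
        return 0;
    }
}

// Register before session_start(); PHP then calls the methods above:
// session_set_save_handler(new ArraySessionHandler(), true);
```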

Listing 8. session_set_save_handler() function example
    function open($save_path, $session_name) {
        /* custom code */
        return (true);
    }

    function close() {
        /* custom code */
        return (true);
    }

    function read($id) {
        /* custom code; must return the session data as a string */
        return ('');
    }

    function write($id, $sess_data) {
        /* custom code */
        return (true);
    }

    function destroy($id) {
        /* custom code */
        return (true);
    }

    function gc($maxlifetime) {
        /* custom code */
        return (true);
    }

    session_set_save_handler("open", "close", "read", "write", "destroy", "gc");

XSS vulnerabilities represented a large portion of all the documented Web-site vulnerabilities in 2007 (see Resources). An XSS vulnerability occurs when a user has the ability to inject HTML code into your Web pages. The HTML code can carry JavaScript code inside script tags, thus allowing JavaScript to run whenever a page is drawn. The form in Listing 9 could represent a forum, wiki, social networking, or any other site where it's common to enter text.

Listing 9. Form for inputting text
 Your chance to input XSS

Listing 10 demonstrates how the form could print the results, allowing an XSS attack.

Listing 10. showResults.php

    /* ... */
    echo("You typed this: ");
    echo($_POST['myText']);
    /* ... */

Listing 11 provides a basic example in which a new window pops open to Google's home page. If your Web application does not guard against XSS attacks, the only limit to the harm done is the imagination of the attacker. For instance, someone could add a link that mimics the style of the site for phishing purposes (see Resources).

Listing 11. Malicious input text sample


To guard yourself against XSS attacks, filter your input through the htmlentities() function whenever the value of a variable is printed to the output. Remember to follow the first habit of validating input data with white-listed values in your Web application's input for names, e-mail addresses, phone numbers, and billing information.
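A small wrapper makes the escaping step harder to forget. This is a sketch; the function name `escapeHtml()` is hypothetical, and the UTF-8 encoding is an assumption about the page.

```php
<?php
/**
 * Hypothetical helper: escapes a value for output inside HTML text or a
 * quoted attribute. ENT_QUOTES converts both double and single quotes.
 */
function escapeHtml($value)
{
    return htmlentities((string) $value, ENT_QUOTES, 'UTF-8');
}
```

A page would then print user-supplied text with `echo escapeHtml($_POST['myText']);`, so that an injected script tag is displayed as text instead of being run by the browser.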

A much safer version of the page that shows the text input is shown below.

Listing 12. A more secure form
    /* ... */
    echo("You typed this: ");
    echo(htmlentities($_POST['myText']));
    /* ... */

Form spoofing is when someone makes a post to one of your forms from somewhere you didn't expect. The easiest way to spoof a form is simply to create a Web page that submits to a form, passing all the values. Because Web applications are stateless, there's no way of being absolutely certain that the posted data is coming from where you want it to come from. Everything from IP addresses to hostnames, at the end of the day, can be spoofed. Listing 13 shows a typical form that allows you to enter information.

Listing 13. A form for processing text

    /* ... */
    echo("I am processing your text: ");
    echo($_POST['myText']);
    /* ... */

Listing 14 shows a form that posts to the form in Listing 13. To try this, you can put the form from Listing 13 on a Web site, then save the code in Listing 14 as an HTML document on your desktop. When you have saved the form, open it in a browser. Then you can fill in the data and submit the form, observing while the data is processed.

Listing 14. A form for collecting your data

The potential impact of form spoofing, really, is that if you have a form that has drop-down boxes, radio buttons, checkboxes, or other limited input, those limits mean nothing if the form is spoofed. Consider the code in Listing 15, which contains a form with invalid data.

Listing 15. A form with invalid data

Think about it: If you have a drop-down box or a radio button that limits the user to a certain amount of input, you may be tempted not to worry about validating the input. After all, your input form ensures that users can only enter certain data, right?

To limit form spoofing, build measures to ensure that posters are likely to be who they say they are. One technique you can use is a single-use token, which does not make it impossible to spoof your forms but does make it a tremendous hassle. Because the token is changed each time the form is drawn, a would-be attacker would have to get an instance of the sending form, strip out the token, and put it in their spoofing version of the form. This technique makes it highly unlikely that someone can build a permanent Web form to post unwanted requests to your application. Listing 16 provides an example of a one-time form token.

Listing 16. Using a one-time form token
    /* ... */
    echo 'Token from form=' . $_POST['token'];
    if ($_SESSION['token'] == $_POST['token']) {
        /* cool, it's all good... create another one */
    } else {
        echo 'Go away!';
    }
    $token = md5(uniqid(rand(), true));
    $_SESSION['token'] = $token;
    /* ... */
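On PHP V7 and later, the same one-time token idea can be sketched with random_bytes() for the token and hash_equals() for the comparison, in place of md5(uniqid(rand(), true)) and the == operator. The helper names `issueFormToken()` and `consumeFormToken()` are hypothetical; the session array is passed in so that the logic can be exercised without starting a real session.

```php
<?php
/** Hypothetical helper: creates a fresh token and stores it in $session. */
function issueFormToken(array &$session)
{
    // 32 bytes from the system CSPRNG, hex-encoded to 64 characters.
    $session['token'] = bin2hex(random_bytes(32));
    return $session['token'];
}

/** Hypothetical helper: checks a posted token once, then discards it. */
function consumeFormToken(array &$session, $posted)
{
    $expected = isset($session['token']) ? $session['token'] : null;
    unset($session['token']);           // single use, pass or fail
    // hash_equals() needs two strings and avoids leaking timing differences.
    return is_string($expected) && is_string($posted)
        && hash_equals($expected, $posted);
}
```

In a page, `issueFormToken($_SESSION)` would run when the form is drawn, with the result placed in a hidden field, and `consumeFormToken($_SESSION, $_POST['token'])` would run when the form is processed.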

Cross-Site Request Forgeries (CSRF attacks) are exploits that take advantage of user privileges to carry out an attack. In a CSRF attack, your users can easily become unsuspecting accomplices. Listing 17 provides an example of a page that carries out a certain action. This page looks up user login information from a cookie. As long as the cookie is valid, the Web page processes the request.

Listing 17. A CSRF example


CSRF attacks are often in the form of <img> tags because the browser unwittingly calls the URL to get the image. However, the image source could just as easily be the URL of a page on the same site that does some processing based on the parameters passed into it. When this <img> tag is placed with an XSS attack, the most common of the documented attacks, users can easily do something with their credentials without knowing it; thus, the forgery.

To guard yourself against CSRF, use the one-time token approach from the habit of verifying form posts. Also, use the explicit $_POST variable instead of $_REQUEST. Listing 18 demonstrates a poor example of a Web page that processes identically whether the page is called by a GET request or by having a form posted to it.

Listing 18. Getting the data from $_REQUEST

    /* ... */
    echo("I am processing your text: ");
    echo(htmlentities($_REQUEST['text']));
    /* ... */

Listing 19 shows a cleaned-up version of this page that only works with a form POST.

Listing 19. Getting the data only from $_POST

    /* ... */
    echo("I am processing your text: ");
    echo(htmlentities($_POST['text']));
    /* ... */

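The POST-only rule can also be enforced explicitly by checking the request method before touching the data. In this sketch the function name `postedText()` is hypothetical, and the two array parameters stand in for $_SERVER and $_POST so that the logic can be tested in isolation.

```php
<?php
/**
 * Hypothetical helper: returns the posted 'text' field, or null when the
 * request was not a POST or the field is missing.
 */
function postedText(array $server, array $post)
{
    $method = isset($server['REQUEST_METHOD']) ? $server['REQUEST_METHOD'] : '';
    if ($method !== 'POST') {
        return null;                    // ignore GET and everything else
    }
    if (!isset($post['text']) || !is_string($post['text'])) {
        return null;
    }
    return $post['text'];
}
```

A page would call `postedText($_SERVER, $_POST)` and do its processing only when the result is not null, which means an <img> tag that issues a GET request cannot trigger it.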
Starting with these seven habits for writing more secure PHP Web applications will help you avoid becoming an easy victim of malicious attacks. Like many habits, they may seem awkward at first, but they become more natural as time goes on.

Remember that the first habit is key: Validate your input. When you make sure your input doesn't include bad values, you can move on to guarding your file system, database, and session. Finally, make sure your PHP code is resilient to XSS attacks, form spoofs, and CSRF attacks. A disciplined approach to forming these habits goes a long way toward preventing easy attacks.


Nathan Good lives in the Twin Cities area of Minnesota. Professionally, he does software development, software architecture, and systems administration. When he's not writing software, he enjoys building PCs and servers, reading about and working with new technologies, and trying to get his friends to make the move to open source software. He's written and co-written many books and articles, including Professional Red Hat Enterprise Linux 3, Regular Expression Recipes: A Problem-Solution Approach, and Foundations of PEAR: Rapid PHP Development.