Wednesday, December 19, 2007

Web Application Security

Before expounding the topics in my last post, I thought I'd take a detour to explain something that is very important and often overlooked: web application security.

This is a broad topic, one that is the subject of innumerable mailing lists and that has spawned innumerable consulting firms. I will hardly be able to treat the topic thoroughly in this article. I can, however, give some advice that can knock out the most common and devastating vulnerabilities in web applications. This short list covers some of the biggest non-SQL-related vulnerabilities. SQL vulnerabilities and attack prevention are very important as well, and are a topic for a future article.

Cross-Site Scripting (XSS) - This vulnerability can affect any application that accepts and displays user input. If this input is not properly validated, an attacker can submit HTML or JavaScript code that is then served to every user who visits the site. This code can do a range of damage, from changing the display/style of the webpage to harvesting user login information by reading it through the JavaScript DOM and sending it to the attacker with an AJAX request.
Solution: Validate your data! Strip all HTML tags from your user input. If you must, allow a well-defined subset of harmless HTML tags (text formatting only, like bold, italics, etc.), and strip the rest.
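
As a rough illustration of the "well-defined subset" idea, here is a minimal Python sketch. It takes a slightly different route from stripping: it escapes all markup first, then restores only bare, attribute-free formatting tags. The function name sanitize_comment and the ALLOWED_TAGS list are assumptions made for this example, not part of any particular framework.

```python
import html
import re

# Hypothetical whitelist: formatting-only tags, no attributes allowed.
ALLOWED_TAGS = ("b", "i", "em", "strong")

# After escaping, "<b>" has become "&lt;b&gt;". Match only exact,
# attribute-free whitelisted tags (opening or closing).
_SAFE_TAG = re.compile(
    r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE
)

def sanitize_comment(raw: str) -> str:
    """Escape all markup, then restore a small set of harmless tags."""
    escaped = html.escape(raw, quote=True)
    return _SAFE_TAG.sub(
        lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), escaped
    )
```

With this approach a tag such as <b> survives, while <script> and anything carrying attributes (for example an onclick handler) is displayed as inert text instead of being interpreted by the browser.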

Cross-Site Request Forgeries (CSRF) or "session riding" - Attackers can trick users who are logged in to your application into unwittingly executing arbitrary site functionality. For instance, say someone has logged into your web forum. Later, he receives a malicious email containing a link like this:

<a href="http://yourwebforum.tld/post.php?body=I'm%20a%20loser!">Check out this funny picture!</a>

Or worse:

<img src="http://yourwebforum.tld/post.php?body=I'm%20a%20loser!"/>

The user wouldn't have to click on the above example--only open the email--for the request to be made. In this example, a post could be made using a user's credentials without his knowledge.
Solution: There are several easy things you can do to reduce the chance that an attack like this succeeds, like using POST instead of GET or making user credentials expire quickly. However, preventing this type of attack thoroughly takes more: associate a unique key with each form submission, a key that the attacker could never know. The key can be generated randomly when the form is displayed to the user and saved on the server, then matched against the POST when the form is submitted.
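
The per-form key could be sketched like this in Python. The session is modeled as a plain dict, and the names issue_form_token and is_valid_submission are hypothetical; a real application would hook this into whatever session storage it already uses.

```python
import hmac
import secrets

def issue_form_token(session: dict) -> str:
    """Generate a random key when the form is rendered and remember it."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token  # embed this in a hidden form field

def is_valid_submission(session: dict, submitted_token: str) -> bool:
    """Check the key that came back with the POST; each key is single-use."""
    expected = session.pop("csrf_token", None)
    if not expected or not submitted_token:
        return False
    # Constant-time comparison avoids leaking the key through timing.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())
```

A forged link or image tag like the ones above cannot include the right key, because the attacker never sees the form that was rendered for the victim.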

"Session hijacking" - An attacker can discover the ID of a user's session and send the session ID in the URL or in the HTTP headers as his own, thereby "hijacking" the user's credentials. This gives any access your user had to the attacker.
Solution: You can try associating the user's IP address with the session and checking it against the client's IP every time a request is made. This, though, can knock out valid users whose ISPs use load-balancing proxies for internet traffic; their IP addresses may change with every request. It could also allow an attacker behind the same router as the user to hijack a session undetected. Instead, you can set a cookie on the client's machine that contains nothing important or private, say, a unique key generated when the session is created. This key is also saved in the session, and is checked against the cookie at every request.
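
A minimal Python sketch of that cookie check follows, with the session and the request cookies modeled as plain dicts. The names start_session, request_is_trusted, and companion_key are assumptions for illustration only.

```python
import hmac
import secrets

def start_session(session: dict) -> str:
    """Create the companion key and return the value to set as a cookie."""
    key = secrets.token_urlsafe(32)
    session["companion_key"] = key
    return key

def request_is_trusted(session: dict, cookies: dict) -> bool:
    """Accept a request only if its cookie matches the key in the session."""
    expected = session.get("companion_key")
    presented = cookies.get("companion_key")
    if not expected or not presented:
        return False
    return hmac.compare_digest(expected.encode(), presented.encode())
```

Someone who learns only the session ID, for example from a URL, would still fail this check. It does not help if the attacker can also read the victim's cookies.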

Lack of URL access restriction - Sometimes a web application will only authenticate a user when he tries to access an entry point to a restricted portion of the site. For instance, if a user tries to view his personal dashboard, his credentials are checked and he is redirected to a login page before he can continue. However, if this user were to type in a direct URL to a page other than the dashboard, he would bypass the login procedure altogether. This is a glaring oversight in many web applications. All an attacker has to discover is the URLs of the non-entry pages, to which he would then have full access.
Solution: Validate credentials on every page that should have restricted access.
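
One way to make the check hard to forget is to wrap every restricted page handler in the same guard. Here is a small Python sketch. The handlers take a session dict as their first argument, and the names require_login, LoginRequired, dashboard, and account_settings are all made up for this example.

```python
from functools import wraps

class LoginRequired(Exception):
    """Raised when an unauthenticated request reaches a restricted page."""

def require_login(handler):
    """Decorator: run the credential check before the page handler."""
    @wraps(handler)
    def wrapper(session: dict, *args, **kwargs):
        if not session.get("user_id"):
            # A real application would redirect to the login page here.
            raise LoginRequired(handler.__name__)
        return handler(session, *args, **kwargs)
    return wrapper

@require_login
def dashboard(session):
    return "dashboard for user %s" % session["user_id"]

@require_login
def account_settings(session):
    # Not an entry point, but guarded all the same.
    return "settings for user %s" % session["user_id"]
```

Calling account_settings with an empty session raises LoginRequired even though the user never passed through the dashboard first.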

A good rule of thumb when it comes to web application security: Don't ever assume that an attacker won't try or won't figure out how. Ideally, you should be comfortable allowing an attacker to see your source code. It should be apparent that your security is so air-tight that he is forced to play by the rules.

Of course, there are many other types of attacks, and, as I mentioned, I will do another article on SQL vulnerabilities and exploits, which are also very important. For more information on the topics in this article, here are some great resources:

Monday, December 10, 2007

The Basics

From the perspective of most people, web software development is much simpler than compiled software development. Coding a desktop application should require far more planning than coding a web application, right? On the other hand, compiled software developers see most web applications as silly toys and most web application developers as silly people who don't know good software from a hole in Vista.

So how can these two perspectives be reconciled? The fact is, web software development and compiled software development are exactly the same. Each has attracted different types of developers and, in some cases, different types of users, but the same principles apply. Once web software developers have their eyes opened to the beauty of good software design, their opinion that web software doesn't require much planning will change. At the same time, as the design quality of web software increases, compiled software developers may actually begin to show a little respect for "silly" web software developers. Well, maybe.

There are some very basic questions you should ask yourself when designing ANY software:

Can I make it easier to code additional functionality in the future?
Some develop with the sole purpose of checking items off the list of requirements. True, this may help you meet your deadline faster (though even that is not likely). But what happens when, six months from now, your client asks for another feature to be added? Did you anticipate that feature request? More realistically, did you code a framework to support quick addition of any request? Here are some ways you can do so:
- Follow the DRY (Don't Repeat Yourself) principle. For example, do you find that your code is similar or even identical in different parts of your software? It would make far more sense to generalize the code and put it in a single function. Later, when you add a feature that requires similar code, you simply call this function. This works especially well for pulling and massaging data from your database.
- Anticipate changes in data. Sure, your client may say that "red" and "blue" are the only options, so you figure an enum in the database will work just fine. Don't be so trusting. There's an N% chance that your client will remove or add colors in the future, where N is large. Create a separate table for these options, and don't feel silly when there are only two rows in the table at your initial launch.
- Correctly identify relationships in your data. Does a shopping cart have an item, or is it the other way around? Can a cart have more than one item? Unless the alternative is absurd, don't ever assume that a cart won't have multiple items, or a user won't have multiple profiles, or a post won't have multiple comments, even if your client swears that they want to "keep it simple" and only have a one-to-one relationship.
- Separate anything that looks remotely like an option from your logic (i.e., don't "hard-code" anything). The number of pictures per row in a photo gallery, the number of file upload fields in a form, the maximum dimensions of a profile photo--all of these should be variables or constants in your code. Initially, before the client asks for the ability to modify these options, they can be set in a separate config file for the application. As the client realizes that these options need to be configurable through the web interface, it will be easy-cheesy to accommodate that.
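
To illustrate the last point, here is a minimal Python sketch of options kept apart from logic. The option names, their default values, and the functions load_options and gallery_rows are hypothetical. The overrides could come from a config file today and from a web interface later without the gallery code changing.

```python
# Defaults live in one place instead of being scattered through the code.
DEFAULTS = {
    "photos_per_row": 4,
    "upload_fields": 3,
    "max_profile_photo_px": (200, 200),
}

def load_options(overrides=None):
    """Merge client-specific overrides onto the defaults."""
    options = dict(DEFAULTS)
    for name, value in (overrides or {}).items():
        if name not in DEFAULTS:
            raise KeyError("unknown option: %s" % name)
        options[name] = value
    return options

def gallery_rows(photos, options):
    """Split a list of photos into rows; the row width is never hard-coded."""
    per_row = options["photos_per_row"]
    return [photos[i:i + per_row] for i in range(0, len(photos), per_row)]
```

Switching a client from four photos per row to three is then a one-line change: load_options({"photos_per_row": 3}).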

Can I make it easier to maintain this code?
Even if the client never asks for additional features, he may want "tweaks" to the software. Are tweaks going to be just that, or are they going to require an overhaul?
- Separate design from content. I can't stress enough how important this is. For instance, in a web application, have just a few source files (like a header, a footer, and a stylesheet) that define all design elements. If your client wants a re-design, you won't have to modify every single page of the website. This seems like common sense, but it's amazing how many people will forgo this huge time saver because "it's just a small site."
- Separate data from code. What if your database schema changes? Don't spend hours modifying eighty-two source files because you added or changed one column, or worse, avoid making changes that would improve functionality or efficiency because they're too much of a pain to make. Your database queries should be quarantined in functions. All your application should know about is functions like getUsers, addUser, getCart, removeAccount, etc. To demonstrate how flexible this can be, imagine converting a web application from MySQL to CSV flat files (don't actually do this, or I'll shoot you). None of your page-to-page code would have to change; you'd only have to re-work a few functions responsible for accessing data.
- Keep things organized. This is a broad point, and touches on things like consistent indentation of your code blocks, sane file naming conventions, sane directory structure, and grouping library functions in files. There's a place for everything, and everything should be in its place. All includes should appear at the top of a file, and, ideally, each source file should be included in only one place. Don't include a source file in the middle of your code just because you need to call a function defined in that file on the line below. How in the world are you going to follow that logic later?
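
Here is a small Python sketch of the "separate data from code" point, using the standard sqlite3 module with an in-memory database. The table layout and the functions connect, add_user, get_users, and render_user_list are assumptions for the example; they play the role of functions like addUser and getUsers mentioned above.

```python
import sqlite3

# --- Data access: the only code that knows about SQL or the schema ---

def connect():
    """Open an in-memory database with a minimal, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn

def add_user(conn, name):
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

def get_users(conn):
    rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    return [{"id": row[0], "name": row[1]} for row in rows]

# --- Page code: works with plain dicts, never with queries ---

def render_user_list(conn):
    return ", ".join(user["name"] for user in get_users(conn))
```

If the schema or the storage backend changes, only the three data access functions need rework; render_user_list keeps calling get_users as before.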

How can I make this code usable in other projects?
It's really nice to have a battle-hardened library of code that you can plug into a client's site for oft-requested functionality. Really, if you take the advice above, this takes care of itself. But just to stress a couple of important points:
- Separate data from code. You wrote an awesome photo gallery module for a client, but your new client uses a different type of database. It shouldn't matter. Just rewrite your data access functions.
- Options, options, options. One client may want four photos per row, but the other may only want three. Even if you don't have a web interface for them, separate all options from your code and keep them in a config file. By changing only a few config variables or constants, you can completely customize an application for a client.
- Separate design from code. How's your client going to feel when he sees an identical gallery on a competitor's site? Allowing your clients to skin their application to fit their branding gives them warm fuzzies inside.

This is a huge topic, and I've only scratched the surface. I will expound on some of these points in later articles. Please leave additional points in the comments.