Lecture Notes - Week 10
Readings
- Chapters:
- 22 - Creating Secure Web Applications
Screencast - Week 10
Outline of Topics
- OWASP Top Ten
- Secure Password Protection for Authenticating
- Guarding Against SQL Injection
- Leaking Information to Hackers
- Preventing Cross-Site Scripting Attacks
- File Uploads
- Securing Your Session
Lecture
Most people have heard about the horrific data breaches that occurred over the past ten or so years. One of the most significant was the Equifax data breach reported in September of 2017, in which the personal and credit records of roughly 147 million people were stolen, more than half of all adults in the U.S. The good news is that developers can prevent many vulnerabilities by properly securing their web applications.
OWASP Top Ten
Web security is a deep topic, and I can only scratch its surface. That said, a reference that every developer must become familiar with is the Open Web Application Security Project (OWASP) site. The two resources I find most useful on it are the OWASP Top Ten and the OWASP Cheat Sheet Series. The "OWASP Top Ten" lists the top ten security risks present in web applications today, with helpful links describing each vulnerability and how to mitigate it effectively. The "OWASP Cheat Sheet Series" is a collection of security information organized by topic. Each article explains a vulnerability and gives recommendations for mitigating the risks.
Let's discuss the more common vulnerabilities, what they are, and how to mitigate them.
Secure Password Protection for Authenticating
We need to protect sensitive data by requiring users to present credentials before they can access certain functionality. The mechanism we use to authenticate users must therefore be robust.
SHA-1 is not Secure
I used to teach this class from a book that used the example of securing user passwords with Secure Hash Algorithm 1 (SHA-1). Unfortunately, SHA-1 has not been considered secure for protecting data since 2005. SHA-1 hashed passwords, once obtained by an attacker, can be cracked offline with sufficient processing power. In 2017, researchers at Google and CWI Amsterdam demonstrated the first practical collision attack against SHA-1.
SHA-1 and SHA-256 can be Cracked
The problem with using fast hashing algorithms like SHA-1 or even SHA-256 (which has not been broken) for storing passwords is that an attacker can generate hashes very quickly. As a result, hashes are precomputed from dictionaries of potential passwords and stored in online databases known as rainbow tables.
This can be demonstrated by creating a SHA-256 hash using the SHA256 Hash Generator website. If I hash the password ilikebananas, I get a 64-character hexadecimal hash.
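The same kind of hash can be produced locally. The following is a minimal sketch using PHP's built-in hash() function; the variable names are only illustrative.

```php
<?php
// Unsalted SHA-256: the same input always yields the same 64-character hex
// digest, which is what makes precomputed lookup tables possible.
$password = 'ilikebananas';

$first  = hash('sha256', $password);
$second = hash('sha256', $password);

echo $first, PHP_EOL;
var_dump($first === $second); // bool(true): no salt, so the output never varies
```

Because nothing in the calculation is unique to a user, anyone who has already hashed this password can recognize it on sight.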
Next, if I navigate to the Crackstation website and enter this hash, I can successfully crack the hashed password.
This is why everyone tells you to make strong passwords.
Use Salted Password Hashes
These rainbow tables only work on hashed values that are unsalted. A salt is a string added to make a password hash output unique. Randomly generated salts make the hash output unique even if multiple users use the same password. Every user must have a unique salt for this to work. To understand more about how and why this works, read up on Salted Passwords in this week's reading in chapter 22 of the book.
- Use password_hash() and password_verify()
When signing up a new user, we use the password_hash() function to generate a salted hash. The standard way to use this function is to pass two arguments, the user’s password as a string and the constant PASSWORD_DEFAULT, and a salted hash is returned as a string.
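As a rough illustration of the sign-up step, the sketch below hashes the same password twice. The variable names are hypothetical; in a real application the password would come from the submitted form.

```php
<?php
// $plainPassword stands in for the value submitted by a sign-up form (hypothetical).
$plainPassword = 'ilikebananas';

// password_hash() generates a random salt and embeds it in the returned string.
$hashA = password_hash($plainPassword, PASSWORD_DEFAULT);
$hashB = password_hash($plainPassword, PASSWORD_DEFAULT);

// Same password, different salts, so the two stored hashes differ.
var_dump($hashA !== $hashB);

// Store $hashA in the user's row. The PHP manual suggests a column that can
// hold up to 255 characters, since the default algorithm may change over time.
```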
When authenticating a user logging in with their password, use the password_verify() function. This function takes two arguments: the first is the user-entered password, and the second is the hashed password you retrieve from the database. The function returns true if the password is verified, meaning you’ve successfully authenticated the user. Otherwise, it returns false.
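A minimal sketch of the login check might look like this. The authenticate() helper is hypothetical, and the stored hash is created inline only so the example runs on its own; normally it would be read from the user's database row.

```php
<?php
// Hypothetical helper: $storedHash would normally be read from the user's database row.
function authenticate(string $enteredPassword, string $storedHash): bool
{
    // password_verify() re-hashes the entered password using the salt and
    // algorithm details embedded in $storedHash, then compares the results.
    return password_verify($enteredPassword, $storedHash);
}

$storedHash = password_hash('ilikebananas', PASSWORD_DEFAULT);

var_dump(authenticate('ilikebananas', $storedHash)); // bool(true)
var_dump(authenticate('ilikeapples', $storedHash));  // bool(false)
```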
Guarding Against SQL Injection
SQL injection is consistently among the top security risks for web applications. Injection, which includes SQL injection, was listed as the number one application security risk (ASR) in the "OWASP Top Ten" in 2017 and, as of 2021, is listed as number 3. An SQL injection attack is possible whenever user-supplied data is inserted directly into an SQL query without first being sanitized. Therefore, any source of data passed to a query can be a vector of attack and must be appropriately escaped.
What is a SQL Injection Attack?
All the code we have written so far has been vulnerable to SQL injection. There is a good write-up on guarding against SQL injection in this week's reading in chapter 22 of the book.
However, to fully feel the impact of releasing application code into the wild that is vulnerable to SQL injection, take a look at this video from Dr. Mike Pound, lecturer and researcher in Computer Science at the University of Nottingham, where he demonstrates exactly what an SQL injection attack looks like and how bad it can be.
NOTE: Do not perform any of what I am about to show you on any website unless you are explicitly contracted and qualified to perform penetration testing for said website!
Mitigating SQL Injection
Thankfully, SQL injection attacks are straightforward to mitigate.
- The Band-Aid way using mysqli_real_escape_string()
The easiest way to sanitize form field inputs is to use the mysqli_real_escape_string() function.
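A minimal sketch of the escaping approach follows. The build_user_query() helper, the users table, and its email column are all assumptions made for illustration, and the connection details in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: builds a lookup query for a "users" table with an "email" column.
// $dbc is assumed to be an open connection returned by mysqli_connect().
function build_user_query(mysqli $dbc, string $email): string
{
    // Escapes quotes, backslashes, and other special characters for the
    // connection's character set so the value cannot close the string literal.
    $safeEmail = mysqli_real_escape_string($dbc, $email);

    // The escaped value must still be wrapped in quotes inside the query.
    return "SELECT id, email FROM users WHERE email = '$safeEmail'";
}

// Usage (requires a live connection, so it is left commented out):
// $dbc    = mysqli_connect('localhost', 'user', 'password', 'example_db');
// $result = mysqli_query($dbc, build_user_query($dbc, $_POST['email'] ?? ''));
```

Note that the escaped value is still pasted into the query text, so forgetting the surrounding quotes or a single escape call reopens the hole.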
The issue with using mysqli_real_escape_string() is that it escapes the form field entry without the context of the underlying SQL query using it. The reasons why this sometimes fails are pretty technical. However, take a look at http://phpa.me/so-sqli-edgecase for an edge case that fails sanitization.
- Prepared Statements are the best way to mitigate SQL Injection Attacks
A more robust method for mitigating SQL injections is to use prepared statements, also known as parameterized queries. The problem with mysqli_real_escape_string() is that it doesn’t separate your input from the query itself; in effect, it inserts the input into the query. Prepared statements, however, keep your database inputs separate from your queries, so the input cannot be used to insert SQL commands (like subqueries).
The PHP language, via the mysqli extension, provides two functions to parameterize our database inputs: mysqli_prepare() and mysqli_stmt_bind_param(). We can use these functions together to parameterize database queries. mysqli_prepare() turns the SQL query into a prepared statement, and mysqli_stmt_bind_param() binds the input parameters to that statement.
The next step is to execute the prepared SQL statement with mysqli_stmt_execute() and then get the results of the executed statement using mysqli_stmt_get_result().
Note, a ? is used as a placeholder for query parameters.
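Put together, the four calls might be used as in the sketch below. The find_user_by_email() helper, the users table, and its columns are assumptions made for illustration.

```php
<?php
// Hypothetical helper: looks up a user by email in a "users" table.
// $dbc is assumed to be an open connection returned by mysqli_connect().
function find_user_by_email(mysqli $dbc, string $email): ?array
{
    // The ? marks where the parameter goes; the query text itself never changes.
    $stmt = mysqli_prepare($dbc, 'SELECT id, email FROM users WHERE email = ?');
    if ($stmt === false) {
        return null;
    }

    // 's' declares one string parameter ('i' would declare an integer).
    mysqli_stmt_bind_param($stmt, 's', $email);
    mysqli_stmt_execute($stmt);

    // mysqli_stmt_get_result() requires the mysqlnd driver.
    $result = mysqli_stmt_get_result($stmt);
    $row = ($result !== false) ? mysqli_fetch_assoc($result) : null;

    mysqli_stmt_close($stmt);

    // mysqli_fetch_assoc() yields null when no row matched.
    return $row ?: null;
}
```

Because the value travels to the database separately from the query text, input such as ' OR 1=1 is only ever compared against the email column as a literal string.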
Leaking Information to Hackers
It is important to know what the attack vectors are for any application you write. These are:
- cookies
- query parameters
- form fields
- hints about our database schema (in the naming of our form fields)
- file system path (by not having an index file in each folder)
Cookies
Cookies reside on a client’s browser, and as such, they are viewable and vulnerable to modification. You should never store personal information in cookies. Instead, keep that data on the server in session variables and the database, and use only a single cookie to store the session ID.
Query Parameters
Query parameters are sent over in the URL of an HTTP GET request, after the ?.
A potential hacker can see them. Query parameters can be used as a persistence mechanism (similar to hidden variables in a form), but this practice exposes details about the application and how it runs. Often we only need to send a single query parameter, such as an ID that maps to a primary key in a database table, which the script can query after processing the GET request.
Since query parameters are user-supplied data, they should be parameterized when used in database queries and escaped when displayed in HTML.
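As a sketch of handling a single ID parameter defensively, the helper below validates it as a positive integer before it is used anywhere. The read_id_param() name and the id key are assumptions for illustration; in a real script you would pass $_GET.

```php
<?php
// Hypothetical helper: reads an integer "id" query parameter from a $_GET-style array.
function read_id_param(array $query): ?int
{
    // FILTER_VALIDATE_INT returns the integer on success and false on failure.
    $id = filter_var($query['id'] ?? null, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1],
    ]);

    return ($id === false) ? null : $id;
}

var_dump(read_id_param(['id' => '42']));        // int(42)
var_dump(read_id_param(['id' => '42 OR 1=1'])); // NULL
var_dump(read_id_param([]));                    // NULL
```

A validated ID should still be bound as a parameter in a prepared statement rather than concatenated into the query.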
Form Fields
Form fields are a gold mine of information for hackers trying to guess field names in a database schema. Unfortunately, naming form fields after the database columns they map to has been a common practice for developers, and it makes the process of guessing the database schema much easier in a blind SQL injection attack.
Many businesses have development requirements for obfuscating database column names when they are used in form fields. This is an excellent practice every developer should follow.
Preventing Cross-Site Scripting Attacks
Cross-Site Scripting (XSS) is a particularly nasty vulnerability. According to OWASP:
XSS flaws occur whenever an application includes untrusted data in a new web page without proper validation or escaping, or updates an existing web page with user supplied data using a browser API that can create HTML or JavaScript. XSS allows attackers to execute scripts in the victim’s browser which can hijack user sessions, deface web sites, or redirect the user to malicious sites.
XSS Attacks
There are three vectors of XSS attacks targeting browsers: Reflected XSS, Stored XSS, and DOM XSS.
Here is what OWASP says about these:
- Reflected XSS: The application or API includes unvalidated and unescaped user input as part of HTML output. A successful attack can allow the attacker to execute arbitrary HTML and JavaScript in the victim’s browser. Typically, the user will need to interact with some malicious link that points to an attacker-controlled page, such as malicious watering hole websites, advertisements, or similar.
- Stored XSS: The application or API stores unsanitized user input that is viewed at a later time by another user or an administrator. Stored XSS is often considered high or critical risk.
- DOM XSS: JavaScript frameworks, single-page applications, and APIs that dynamically include attacker-controllable data to a page are vulnerable to DOM XSS. Ideally, the application would not send attacker-controllable data to unsafe JavaScript APIs.
Mitigating XSS Attacks
Let's talk about the first two since we can mitigate against these within our application.
The mechanics of how XSS works are pretty involved due to the number of actors. But in brief, XSS involves three actors at its core: the attacker, the victim, and the vulnerable website. To mitigate the vulnerability, our code must sanitize every input to the website, escaping or removing any JavaScript and HTML markup, before outputting it.
Reflected XSS
Let's take a look at an application with a form for searching through an inventory of stuff. Suppose JavaScript has been entered into the search field. When the user submits the form, an alert box pops up.
This means the application is vulnerable to XSS attacks, and a hacker could end up submitting something far more nefarious, such as markup that displays a form asking for credentials. You certainly don't want to enter your credentials and submit that form.
The solution is straightforward and requires using the function filter_var(). Since we expect a string for our search term, we can pass it through filter_var() to sanitize our application from XSS.
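A minimal sketch of this idea follows. It assumes the FILTER_SANITIZE_FULL_SPECIAL_CHARS filter, because the older FILTER_SANITIZE_STRING filter is deprecated as of PHP 8.1. The sanitize_search_term() helper and the search key are hypothetical.

```php
<?php
// Hypothetical helper: $input is a $_GET- or $_POST-style array.
function sanitize_search_term(array $input): string
{
    $raw = (string) ($input['search'] ?? '');

    // Encodes <, >, &, and both quote characters as HTML entities, so any
    // submitted markup is displayed as text instead of being interpreted.
    return filter_var($raw, FILTER_SANITIZE_FULL_SPECIAL_CHARS);
}

$term = sanitize_search_term(['search' => '<script>alert("XSS");</script>']);
echo 'You searched for: ' . $term . PHP_EOL;
```

This filter behaves like calling htmlspecialchars() with ENT_QUOTES, so either can be used at the point where the value is written into the page.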
Now if some nefarious form is entered into the search field, it is rendered harmless instead of being interpreted by the browser.
Note that if you receive an email address as an input, you should use FILTER_SANITIZE_EMAIL as it removes all characters except those allowed in an email address. For more information on the list of filters for sanitization, see https://php.net/filter.filters.sanitize.
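For instance, a short sketch of sanitizing and then validating an email address (the input string is just an illustrative value):

```php
<?php
// Characters that are not allowed in an email address (here the parentheses
// and slashes) are removed.
$raw   = 'john(.doe)@exa//mple.com';
$clean = filter_var($raw, FILTER_SANITIZE_EMAIL);
echo $clean, PHP_EOL; // john.doe@example.com

// Sanitizing does not prove the address is well formed, so validate it afterwards.
$isValid = filter_var($clean, FILTER_VALIDATE_EMAIL) !== false;
var_dump($isValid); // bool(true)
```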
Stored XSS
Stored XSS attacks are similar to second-order SQL injections as they originate from either a form field or query parameter that gets stored in your database. You should be super paranoid about this and run the filter_var() function on everything you query out of your database that you plan to display on a web page.
You should couple this approach with validating and sanitizing everything coming into the web application. This means filtering every input from a form or a query parameter using filter_var() and then parameterizing all your queries from these inputs as well.
File Uploads
Another source of attack is uploaded files. Follow the guidance from OWASP. However, here are the minimal things you must consider to allow for securely uploading files.
Let's say we are asking a user to upload an image file using a form that submits a file input named picture with HTTP POST.
Validate the Uploaded File
There are a few steps you need to follow to validate an uploaded file:
- guard against a directory path traversal attack
- only accept uploaded files using POST
- check the file type
- check the file size
Protect Against a Path Traversal Attack
An attacker can try to acquire passwords from our web server or access a file that wasn’t meant to be accessed by setting the file’s name to a relative path (e.g. ../../../etc/passwd).
To guard against this attack, use the basename() function to strip any directory components from the file name.
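A small sketch of the effect, using a hard-coded name in place of the value a client would submit:

```php
<?php
// A client-supplied file name; in a real request this would come from
// $_FILES['picture']['name'].
$submittedName = '../../../etc/passwd';

// basename() keeps only the trailing name component and drops any directory path.
$safeName = basename($submittedName);
echo $safeName, PHP_EOL; // passwd
```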
Only Accept Uploaded Files Using HTTP POST
Beyond guarding the file name, make sure the web application cannot be manipulated into working on files it should not touch. Verify that the file was actually uploaded using HTTP POST with the is_uploaded_file() function.
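A minimal sketch of that check is shown below. The is_real_upload() helper is hypothetical; it expects one entry of $_FILES, such as $_FILES['picture'].

```php
<?php
// Hypothetical helper: $file is one entry of $_FILES, such as $_FILES['picture'].
function is_real_upload(array $file): bool
{
    $tmpPath = $file['tmp_name'] ?? '';

    // True only when PHP itself received this file through an HTTP POST upload.
    return is_string($tmpPath) && $tmpPath !== '' && is_uploaded_file($tmpPath);
}

// A path that did not arrive via an upload is rejected.
var_dump(is_real_upload(['tmp_name' => __FILE__])); // bool(false)
```

When the file is later moved to its final location, move_uploaded_file() performs the same check before moving it.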
Check the MIME File Type
We also want to prevent attackers from uploading files our application is not interested in (e.g. executable files). Rather than relying on the type reported by the browser in $_FILES['picture']['type'], it is best to use the finfo_open() and finfo_file() functions, which interrogate the actual file for its MIME file type.
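The sketch below shows one way to combine the two functions with an allow list. The has_allowed_mime_type() helper and the list of accepted types are assumptions for illustration, and it relies on PHP's fileinfo extension being enabled.

```php
<?php
// Hypothetical helper: returns true when the file's detected MIME type is on an allow list.
function has_allowed_mime_type(string $path, array $allowed): bool
{
    // FILEINFO_MIME_TYPE asks for just the type, e.g. "image/png", based on file contents.
    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    if ($finfo === false) {
        return false;
    }

    $mimeType = finfo_file($finfo, $path);

    return $mimeType !== false && in_array($mimeType, $allowed, true);
}

$allowed = ['image/gif', 'image/jpeg', 'image/png'];

// Demonstration with a plain-text temp file standing in for an upload.
$tmp = tempnam(sys_get_temp_dir(), 'upl');
file_put_contents($tmp, "This is plain text, not an image.\n");
var_dump(has_allowed_mime_type($tmp, $allowed)); // bool(false)
unlink($tmp);
```

In a real handler the path would be $_FILES['picture']['tmp_name'].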
If you are expecting an image type, in addition to explicitly validating the MIME file type, you should also use the function getimagesize().
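As a rough sketch, getimagesize() can be wrapped so that anything it cannot read as an image is rejected. The image_dimensions() helper is hypothetical.

```php
<?php
// Hypothetical helper: returns [width, height] for a readable image, or null otherwise.
function image_dimensions(string $path): ?array
{
    // getimagesize() returns false when the file is not a recognized image format.
    $info = getimagesize($path);
    if ($info === false) {
        return null;
    }

    // Index 0 is the width and index 1 is the height, both in pixels.
    return [$info[0], $info[1]];
}

// Demonstration with a plain-text temp file standing in for an upload.
$tmp = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($tmp, "This is plain text, not an image.\n");
var_dump(image_dimensions($tmp)); // NULL
unlink($tmp);
```

The returned dimensions can also be compared against limits the application sets for itself.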
Check the File Size
It is always a best practice to limit the maximum file size of uploaded files. We typically do that by setting a hidden input element with the name attribute set to max_file_size inside the form. However, an attacker can manipulate this value.
The web server also has an INI file directive called upload_max_filesize that limits the maximum size of a file that a browser can upload to the server.
A file will not be uploaded if it exceeds either the max_file_size or upload_max_filesize (whichever is smaller). In this case, $_FILES['picture']['error'] will be set to UPLOAD_ERR_FORM_SIZE or UPLOAD_ERR_INI_SIZE, respectively.
You can further limit the size (in bytes) of the uploaded file within the application by checking $_FILES['picture']['size'].
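A minimal sketch of these size checks follows. The 512 KB limit, the constant name, and the check_upload_size() helper are assumptions chosen for illustration.

```php
<?php
// Assumed application limit for this sketch: 512 KB.
const MAX_PICTURE_BYTES = 512 * 1024;

// Hypothetical helper: $file is one entry of $_FILES, such as $_FILES['picture'].
// Returns an error message, or null when the size checks pass.
function check_upload_size(array $file): ?string
{
    $error = $file['error'] ?? UPLOAD_ERR_NO_FILE;

    // PHP already rejected the file for exceeding the form or INI limit.
    if ($error === UPLOAD_ERR_FORM_SIZE || $error === UPLOAD_ERR_INI_SIZE) {
        return 'The file exceeds the maximum upload size.';
    }
    if ($error !== UPLOAD_ERR_OK) {
        return 'The upload did not complete.';
    }

    // Enforce the application's own limit, which the client cannot change.
    if (($file['size'] ?? 0) > MAX_PICTURE_BYTES) {
        return 'The file is larger than the application allows.';
    }

    return null;
}
```

Calling check_upload_size($_FILES['picture']) before any other processing lets the script stop early with a clear message.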
Securing Your Session
The last attack I’ll cover is session hijacking. An attacker can steal your session if they can get a hold of your session ID. This frequently happens as a result of an XSS or man-in-the-middle attack, or because the site does not use an encrypted connection.
Use HTTP-Only Session Cookies
You can help prevent session hijacking by making sure to set the web server’s INI directive session.cookie_httponly=On. This refuses access to the session cookie from JavaScript.
Another INI directive to consider is session.use_strict_mode=On. It prevents the session module from accepting session IDs that it did not generate, which can prevent the use of an attacker-initialized session ID.
For more information on securing sessions and INI settings, see https://php.net/session.security.ini.
HTTPS Uses Encrypted Communication
Another way to prevent a session from being hijacked is to make your web application available only using an encrypted connection (i.e., HTTPS). The HTTP protocol is communicated in clear text. In contrast, HTTPS encrypts the HTTP traffic using the Transport Layer Security (TLS) protocol. Setting up a webserver to use the HTTPS protocol involves installing a TLS/SSL certificate, which contains a public key that client browsers use to connect to the secure server, along with a matching private key that stays on the webserver. The public and private keys are used to establish the encrypted connection.
If your site uses HTTPS, you can further protect your sessions by setting session.cookie_secure=On. This setting only allows accessing the session ID cookie over HTTPS.
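When editing the server's INI file is not an option, the same three directives can usually be set from the script itself. The sketch below assumes this runs at the very top of the script, before session_start() and before any output.

```php
<?php
// These calls must run before session_start() and before any output is sent.
ini_set('session.cookie_httponly', '1'); // hide the session cookie from JavaScript
ini_set('session.use_strict_mode', '1'); // reject session IDs the server did not issue
ini_set('session.cookie_secure', '1');   // send the session cookie over HTTPS only

// session_start() would follow here in a real script.
```

Setting these in the INI file is still preferable where possible, since it covers every script on the site.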
Let’s Encrypt is a nonprofit Certificate Authority that provides TLS certificates free of charge.