Introduction
It’s been my observation that when it comes to XSS prevention, the angle brackets (<>) associated with HTML and script tags often get the majority of the attention. While eliminating script tags as a potential XSS vector is a great start, just as important, and seemingly most often forgotten when it comes to input validation and output encoding, are double (") and single (') quotes. Whether they are completely ignored or improperly escaped, they seem to be a frequent cause of reflected XSS. Because I come across this issue so often, both during application pen tests and in everyday Internet browsing, I wanted to dedicate a post to the dangers of quotes and XSS.
Discussion
Why is failure to safely handle user-submitted quotes such a bad thing? Quite simply, if that input is incorporated into the server response without proper input validation and output encoding, it could allow for the direct modification of HTML attributes. To demonstrate, I’ll start with the more obvious <input> tag and then move on to some other examples where I commonly find reflected XSS resulting from insecurely handled quotes.
Input Tags
Let’s take a look at a simple but common example of a webpage with a search function. In this example, when the user clicks Search, their search term is returned in the input box of the response page.
The HTML for the search function on this page might look like this:
Search for a product on our site:
Unfortunately, this site does not strip, escape, or encode quotes included in a search term. Should the user submit a quoted, compound search term such as "desk chair", the HTML returned in the response page would now look like this:
Search for a product on our site:
What has happened here is we have escaped the value attribute and created another attribute called desk. Of course in HTML parlance, this new attribute means nothing; however, using this approach we can insert valid attributes which can result in successful XSS exploits. Let’s use the common onmouseover attribute to execute a user-driven XSS exploit with the following search term:
" onmouseover="alert('XSS')
The HTML of the response page would now look like this:
Search for a product on our site:
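The page's actual markup is not reproduced here, so the following is only a minimal sketch of the behavior, written as server-side JavaScript. The template, the field name keywords, and the function names are assumptions for illustration. It shows the search term being concatenated straight into the value attribute, and the same template with the term HTML-encoded first.

```javascript
// Hypothetical template: the search term is concatenated directly into the value attribute.
function renderSearchBoxUnsafe(term) {
  return '<input type="text" name="keywords" value="' + term + '">';
}

// Encode the characters that are significant in HTML text and quoted attribute values.
function escapeHtmlAttr(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

// Same hypothetical template, but the term is encoded before it is reflected.
function renderSearchBoxSafe(term) {
  return '<input type="text" name="keywords" value="' + escapeHtmlAttr(term) + '">';
}

const payload = '" onmouseover="alert(\'XSS\')';

// The double quote closes value early and onmouseover becomes a real attribute:
// <input type="text" name="keywords" value="" onmouseover="alert('XSS')">
console.log(renderSearchBoxUnsafe(payload));

// The quotes are reflected as entities, so the whole term stays inside value:
// <input type="text" name="keywords" value="&quot; onmouseover=&quot;alert(&#x27;XSS&#x27;)">
console.log(renderSearchBoxSafe(payload));
```

With the encoded version, the browser still displays the original characters in the input box, but they can no longer terminate the attribute.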
This search term modifies the resulting HTML so that it executes the XSS should the user mouse over the input field. Aside from the more obvious <input> tag, sites often incorporate user input directly into some less noticeable, but equally injectable locations of the returned HTML of the server’s response. These include drop-down menus, hyperlinks, meta tags, 3rd party advertisements, and even directly into HTML-embedded JavaScript. Let’s take a closer look at some examples:
Drop-down Menus
Sites often use drop-down menus (<select> tags) as a means for users to sort or filter search results (by price, popularity, etc). Sometimes when the user selects an option from one of these menus, they are executing additional searches that incorporate their original search term along with some other parameters that serve to sort or filter the results. For example, selecting a drop down item to sort search results by popularity might append the parameter ?sort=popular to the original query and re-execute the search.
The response HTML for the drop-down menu in our previous “Products A-Z” example page (which sorts results by price) might look like the following:
Sort by:
In this case, selecting either option would execute the reflected XSS from the original search term.
Hyperlinks
Sites might also incorporate user input into customized hyperlinks within the returned HTML of a server response. Some examples I’ve come across several times are page navigation and “results per page” links. Once again, using our previous “Products A-Z” example page, the HTML of the “Results per page” links might look like the following:
Results per page:
Hovering the mouse over any of these links would also execute the reflected XSS.
Take notice of the URL preview in the lower left corner of the above screenshot. Specifically, note how the URL unexpectedly terminates after the ?keywords=” due to the double quote included in our search term. Seeing this behavior while navigating a site is a tell-tale sign that it does not properly handle quotes and may be vulnerable to reflected XSS.
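To make the truncated URL concrete, here is a rough sketch of how such a link might be built. The /search path, the count parameter, and the function names are hypothetical; only the keywords parameter comes from the example above. The second function URL-encodes the term with encodeURIComponent and restricts the count to a positive integer.

```javascript
// Hypothetical link builder: the reflected term goes into href unmodified.
function resultsPerPageLinkUnsafe(keywords, count) {
  return '<a href="/search?keywords=' + keywords + '&count=' + count + '">' + count + '</a>';
}

// URL-encode the term and only accept a positive integer for the count.
function resultsPerPageLinkSafe(keywords, count) {
  const n = Number.parseInt(count, 10);
  const safeCount = Number.isInteger(n) && n > 0 ? n : 10; // fall back to an assumed default
  const href = '/search?keywords=' + encodeURIComponent(keywords) + '&amp;count=' + safeCount;
  return '<a href="' + href + '">' + safeCount + '</a>';
}

const payload = '" onmouseover="alert(\'XSS\')';

// href ends right after ?keywords= and the rest becomes a new attribute:
// <a href="/search?keywords=" onmouseover="alert('XSS')&count=25">25</a>
console.log(resultsPerPageLinkUnsafe(payload, 25));

// The double quote is reflected as %22, so it cannot close the attribute:
// <a href="/search?keywords=%22%20onmouseover%3D%22alert('XSS')&amp;count=25">25</a>
console.log(resultsPerPageLinkSafe(payload, 25));
```

Note that encodeURIComponent leaves single quotes untouched, so this sketch only holds for double-quoted attributes; a template that wraps href in single quotes would need HTML attribute encoding as well.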
Meta Tags
Often overlooked, <meta> tags are another fairly common source of reflected XSS attacks because they include unvalidated, unencoded user inputs. For example, a site might use the following page description meta tag for Search Engine Optimization:
These tend to be overlooked because the meta tag itself is not visibly displayed to the screen. Even so, meta tags have attributes that make it possible to execute XSS attacks without the need for user interaction and even support encoded data which may help to evade rudimentary input validation filters.
Here’s an example <meta> tag that places a user’s search term in the content attribute without proper encoding.
One of the other attributes of the <meta> tag is http-equiv which, when set to the value "refresh", will refresh the page using the value in the content attribute. Further, we can base64 encode the XSS injection (in this case: <script>alert('XSS')</script>) and insert that into the content attribute, which in turn will execute it within the browser automatically. In this example, our XSS input would look like this:
a;url=data:text/html;base64,PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4" HTTP-EQUIV="refresh"
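The base64 portion of that input can be checked with a short sketch. It assumes an environment where btoa and atob are available as globals (browsers and recent Node.js releases); the variable names are illustrative only.

```javascript
// The script being smuggled inside the data: URL.
const script = "<script>alert('XSS')</script>";

// Base64-encode it and drop the trailing '=' padding, as in the input above.
const encoded = btoa(script).replace(/=+$/, '');

// Assemble the full search term: the double quote closes content,
// and HTTP-EQUIV is then injected as a new attribute.
const searchTerm = 'a;url=data:text/html;base64,' + encoded + '" HTTP-EQUIV="refresh"';

console.log(encoded);       // PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4
console.log(atob(encoded)); // <script>alert('XSS')</script>
console.log(searchTerm);
```

Because the encoded string contains no angle brackets or the word "script", a filter that only looks for those would let it through.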
This creates the following meta tag in the returned HTML:
Here’s an alternate, non-encoded version that works in Chrome:
Sometimes, rudimentary input validation or output escaping prevents the execution of a script in the meta tag. Instead of attempting to embed the script in the returned HTML, you might also consider testing a redirect to an external, vulnerable page.
For example, I’ve noticed sites that attempt to escape double quotes by inserting a backslash (\) in front of each one. This is ineffective at preventing XSS within HTML attributes, because HTML does not treat the backslash as an escape character. In our example above, this additional character renders the base64 injection string ineffective. Instead, we can insert the URL of a “malicious” site under our control (containing drive-by-download, XSS, etc), making the value of the content attribute: content="a;url=hxxp://malicious.com"
Note: If the site you are testing does inject escape characters and the URL to which you are attempting to redirect cannot end with a backslash, insert a hash symbol (#) to create an artificial URL fragment that won’t be processed server-side. Also, because the quotes around the HTTP-EQUIV value are optional, they can be omitted altogether to avoid any additional unwanted backslashes. With that, our injection input would become: a;url=hxxp://malicious.com#" HTTP-EQUIV=refresh which creates the following meta tag:
Once the page loads, the meta tag will execute and the user will be automatically redirected to the malicious site.
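The reason backslash escaping fails here can be sketched as follows. The meta template, the function names, and the regular expression are all assumptions for illustration; the regex is only a rough stand-in for how an HTML parser ends a double-quoted attribute value at the next double quote.

```javascript
// The ineffective approach: prefix each double quote with a backslash.
function backslashEscape(s) {
  return s.replace(/"/g, '\\"');
}

// HTML entity encoding, the appropriate escaping for an HTML attribute.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
function htmlEncode(s) {
  return s.replace(/[&<>"']/g, (c) => ENTITIES[c]);
}

// Hypothetical template for the description meta tag.
function renderMeta(term) {
  return '<meta name="description" content="' + term + '">';
}

// Rough approximation of attribute parsing: the value runs to the next double quote.
function contentAttributeValue(html) {
  const match = /content="([^"]*)"/.exec(html);
  return match ? match[1] : null;
}

const input = 'a;url=hxxp://malicious.com#" HTTP-EQUIV=refresh';

// The backslash lands inside the URL fragment; the quote still closes content.
console.log(contentAttributeValue(renderMeta(backslashEscape(input))));
// a;url=hxxp://malicious.com#\

// With entity encoding the whole input stays inside content.
console.log(contentAttributeValue(renderMeta(htmlEncode(input))));
// a;url=hxxp://malicious.com#&quot; HTTP-EQUIV=refresh
```

The first result also shows why the hash symbol helps: the stray backslash ends up after the #, in the fragment, rather than in the part of the URL sent to the server.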
3rd Party Advertisements
3rd party ads are another source of quote-based reflected XSS that I’ve come across. Even sites that do a good job of validating and encoding user input everywhere else seem to have a hard time with 3rd party advertisements and I suspect it’s because they might be copying/pasting the 3rd party ad code directly in their site’s HTML. Here’s a generalized example that I’ve seen on a popular website:
Note the value SEARCHTERM, where the user input is included directly in the URL value of the ad. Once again, if quotes are not properly handled, this too becomes a source of reflected XSS. What’s interesting is that this particular source of XSS can often be discovered simply by using a site and noticing abnormalities in the visual layout of the page. Using our previous vulnerable 3rd party ad <div> as an example, if the value for the URL attribute is unexpectedly terminated with a double quote, the remaining portion of the URL (sz=…) is excluded and the returned ad is not sized or constrained properly in the server response. Here’s a generic visual representation of what a vulnerable page would look like before a double quote is included in the ad URL:
Here is what the vulnerable page looks like after the double-quote is included (note the ads “floating” on the page):
Visual cues such as this can be key indicators that a site is vulnerable to reflected XSS.
Embedded JavaScript
Even when all other HTML attributes are properly validated and encoded, websites may forget about embedded JavaScript, especially if it serves a 3rd party function. This is where single quotes (') tend to be most applicable.
Take for example, the following embedded JavaScript that is used to display a 3rd party advertisement:
thirdpartyAds.displayAd({
thirdparty_max_ads : '0',
thirdparty_num_links : '8',
thirdparty_sw : 'SEARCHTERM',
thirdparty_max_results : '20',
templates: {
...
code
...
}
});
thirdpartyAds.render();
Again, note the SEARCHTERM value, which incorporates the user input directly into the script. Assuming this site is not properly escaping/encoding single quotes, we can make our input '}); alert('XSS'); test({ which would modify the JavaScript (and automatically execute XSS) as follows:
thirdpartyAds.displayAd({
thirdparty_max_ads : '0',
thirdparty_num_links : '8',
thirdparty_sw : ''}); alert('XSS'); test({
thirdparty_max_results : '20',
templates: {
...
code
...
}
});
thirdpartyAds.render();
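One way to avoid this in a JavaScript context is to let a serializer produce the string literal rather than wrapping the input in quotes by hand. The sketch below is an assumption about how a server-side template could do that, not the ad provider's code; the function names are hypothetical. It uses JSON.stringify and additionally escapes characters that matter when the script sits inside an HTML page.

```javascript
// Hypothetical template line: the term is wrapped in single quotes by hand.
function renderAdConfigUnsafe(term) {
  return "thirdparty_sw : '" + term + "',";
}

// Produce a JavaScript string literal for untrusted text.
// JSON.stringify handles quotes, backslashes, and control characters;
// the extra replacements keep "</script>" and similar out of the page source.
function jsStringLiteral(value) {
  return JSON.stringify(String(value))
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

function renderAdConfigSafe(term) {
  return 'thirdparty_sw : ' + jsStringLiteral(term) + ',';
}

const payload = "'}); alert('XSS'); test({";

// The first quote closes the string and the rest is parsed as code:
// thirdparty_sw : ''}); alert('XSS'); test({',
console.log(renderAdConfigUnsafe(payload));

// The whole payload stays inside one double-quoted string:
// thirdparty_sw : "'}); alert('XSS'); test({",
console.log(renderAdConfigSafe(payload));
```

In this sketch the template's own closing quote and comma still follow the injected text in the unsafe output, so a working payload may need to account for those trailing characters.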
There are obviously many more possibilities of reflected XSS due to insecure handling of quotes from user input, but these are representative examples of what I’ve seen in various sites and web applications.
XSS Prevention and Conclusion
I won’t go into great detail about XSS prevention as it’s covered quite well by OWASP. However, at its core, prevention of any XSS is fairly simple: validate untrusted input to the extent possible and be sure to safely encode/escape all user-provided input before incorporating (reflecting) it in the server response. Never rely solely on input validation as it is often incomplete and insufficient. For example, during a web app pen test, I observed that the application I was testing relied solely on input validation to strip out event attributes that included the word "on", such as onmouseover or onclick. At face value, this might seem to go a long way in preventing reflected XSS. However, I quickly bypassed this rudimentary filter by inserting an encoded null character between the "o" and the "n" of my input (o%00nmouseover). The null character was subsequently stripped by the web server and the full term "onmouseover" returned in the response, executing the reflected XSS input. Note: For more examples of how input validation can be defeated by filter evasion techniques, be sure to check out the OWASP filter evasion cheat sheet.
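The null-character bypass can be illustrated with a small sketch. Both functions are hypothetical stand-ins: the real application's filter and the server's null-byte handling were not visible to me in this form, so the regular expressions here are assumptions that only mimic the observed behavior.

```javascript
// Hypothetical validation step: remove anything that looks like on<name>=.
function stripEventHandlers(input) {
  return input.replace(/on\w+\s*=/gi, '');
}

// Stand-in for a later stage that silently drops null characters.
function dropNullBytes(input) {
  return input.replace(/\0/g, '');
}

// A plain payload is caught by the filter.
console.log(stripEventHandlers('" onmouseover="alert(\'XSS\')'));
// " "alert('XSS')

// The same payload with %00 between the "o" and the "n".
const raw = decodeURIComponent("%22%20o%00nmouseover%3D%22alert('XSS')");

// The filter sees no "on..." sequence, so it changes nothing.
const filtered = stripEventHandlers(raw);
console.log(filtered === raw); // true

// Once the null character is dropped, the attribute name is whole again.
console.log(dropNullBytes(filtered));
// " onmouseover="alert('XSS')
```

The order of operations is the weakness: the filter runs before the input reaches its final form, which is why encoding at output time is the more dependable control.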
When encoding or escaping output be sure to apply it consistently and use proper encoding/escaping for the given output language — in other words, don’t rely on backslashes to escape HTML!
Even before input validation and output encoding, consider limiting user input in the server response as much as possible. Be sure to consider all possible vectors of user input and all places in which they are used, including within 3rd party code.
Whether you’re involved with web application testing or development, hopefully this post gets you thinking about all of the various locations in which quotes can be used to execute XSS and underlines the importance of thorough XSS prevention.