An article on Netcraft describes the discovery of a back door that wheedled its way into the popular Web Application Security Penetration Testing collection of Firefox Add-ons.
The attacker exhibited some degree of mad genius by claiming the Add-on fixed problems with the very popular Tamper Data add-on -- popular, it's worth noting, among security testers and web developers.
The back door was discovered by chance when a vigilant user noticed his browser sending HTTP traffic to an unknown web site. (Check out the article for more details.)
This would make a great example for Chapter Seven, Malware and Browser Attacks, of The Book. Whereas many attacks target vulnerabilities in plugins like PDF readers or Flash player, there has not been as great a number of (observed) back-doored or otherwise malicious plug-ins.
Mozilla Sniffer may be the first back-doored Add-on for Firefox, but it's not the first one to be malicious. In December 2008 an Add-on dully labeled, "Basic Example Plugin for Mozilla," was discovered to be siphoning users' banking credentials from the browser.
Malicious plug-ins are a natural evolution of malware authors' endeavors to pull valuable data from the browser. Plug-ins are cross-platform and don't require buffer overflows or privileged access (other than having the user install them). In another sense these attacks are not really a dramatic evolution, but a small speciation of a well-established tactic1. As browser computing becomes more analogous to desktop computing the risks have simply shifted from downloading and installing an unverified .exe file to installing some unverified JavaScript as a browser extension.
Watch for more malicious plug-ins to follow the steps of Mozilla Sniffer. One improvement will likely be in the command-and-control channel. A more subtle plug-in might only launch on random pages or random times in order to decrease detection. Or the plug-in might have a pre-defined list of strings (bank, check-out, credit card, password) that cause it to trigger -- although Mozilla Sniffer already did this. The plug-in could even check a twitter feed or a URI shortener to dynamically load commands or report its data to a twitter feed rather than a static IP address.
=====
1 If you don't believe in evolution then blame Noah for bringing a pair of hackers aboard the Ark. He squeezed them between the penguins and the dinosaurs. I personally doubt this account of history because of the dubious possibility of hackers breeding.
July 19, 2010
June 15, 2010
Regex-based security filters sink without anchors
On June 7th the Stanford Web Security Research Group released a study of clickjacking countermeasures employed across Alexa Top-500 web sites. It's an excellent survey of different approaches taken by web developers to prevent their sites from being framed (i.e. subsumed by an <iframe> tag). To better understand the dangers of framing pages, read the paper or check out Chapter Two of The Book.
One interesting point emphasized in the paper is how easily regular expressions can be misused or misunderstood as security filters. Regexes can be used to create positive or negative security models -- either match acceptable content (whitelisting) or match attack patterns (blacklisting). Inadequate regexes lead to more vulnerabilities than just clickjacking.
One of the biggest mistakes made in regex patterns is leaving them unanchored. Anchors determine the span of a pattern's match against an input string. The '^' anchor matches the beginning of a line. The '$' anchor matches the end of a line. (Just to confuse the situation, when '^' appears as the first character inside square brackets it indicates negation, e.g. '[^a]+' means match one or more characters that are not 'a'.)
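A minimal sketch of how the anchors and a negated character class behave in JavaScript's RegExp. The sample string and variable names are made up for illustration:

```javascript
// Hypothetical sample input, chosen only to illustrate anchoring.
const sample = "xxabcxx";

// Unanchored: succeeds because "abc" appears somewhere in the input.
const foundAnywhere = /abc/.test(sample);

// Anchored at both ends: fails because the input is not exactly "abc".
const foundExact = /^abc$/.test(sample);

// A leading '^' inside square brackets negates the class, so this
// returns the first run of characters that are not 'a'.
const notA = sample.match(/[^a]+/)[0];

console.log(foundAnywhere, foundExact, notA);
```

The same '^' character plays two unrelated roles here, which is why it pays to read a pattern character by character before trusting it as a filter.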
Consider the example of nytimes.com's document.referrer check as shown in Section 3.5 of the Stanford paper. The weak regex is the pattern passed to match():
if (window.self != window.top &&
    !document.referrer.match(/https?:\/\/[^?\/]+\.nytimes\.com\//))
{
    top.location.replace(
        window.location.pathname);
}
As the authors point out (and anyone who is using regexes as part of a security or input validation filter should know), the pattern is unanchored and therefore easily bypassed. The site developers intended to check the referrer for links like these:
http://www.nytimes.com/
https://www.nytimes.com/
http://www.nytimes.com/auth/login
http://firstlook.blogs.nytimes.com/
Since the pattern isn't anchored, it will look through the entire input string for a match, which leaves the attacker with a simple bypass technique. In the following example, the pattern matches the trailing "http://www.nytimes.com/" inside the query string -- clearly not the developers' intent:
http://evil.lair/clickjack.html?a=http://www.nytimes.com/
The devs wanted to match a URI whose domain included ".nytimes.com", but the pattern would match anywhere within the referrer string.
The regex would be improved by requiring the pattern to begin at the first character of the input string. The new, anchored pattern would look more like this:
^https?:\/\/[^?\/]+\.nytimes\.com\/
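To see the difference, here is a small sketch that runs both patterns from the text against one legitimate referrer and the bypass URL. The variable names are illustrative and not from the nytimes.com code:

```javascript
// The unanchored pattern from the original check and the anchored revision.
const nytUnanchored = /https?:\/\/[^?\/]+\.nytimes\.com\//;
const nytAnchored = /^https?:\/\/[^?\/]+\.nytimes\.com\//;

const legitReferrer = "http://www.nytimes.com/auth/login";
const bypassReferrer = "http://evil.lair/clickjack.html?a=http://www.nytimes.com/";

// Both patterns accept the legitimate referrer.
console.log(nytUnanchored.test(legitReferrer), nytAnchored.test(legitReferrer));

// Only the unanchored pattern accepts the attacker's referrer; the text
// it matches sits inside the query string.
console.log(nytUnanchored.test(bypassReferrer), nytAnchored.test(bypassReferrer));
console.log(bypassReferrer.match(nytUnanchored)[0]);
```

Anchoring the start closes this particular bypass. It is still a pattern match rather than a parse of the URI, so treat it as an improvement, not a complete host check.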
The same concept applies to input validation for form fields and URI parameters. Imagine a web developer, we'll call him Wilberforce for alliterative purposes, who wishes to validate zip codes submitted in credit card forms. The simplest pattern would check for five digits, using any of these approaches:
[0-9]{5}
\d{5}
[[:digit:]]{5}
At first glance the pattern works. Wilberforce even tests some basic XSS and SQL injection attacks with nefarious payloads like <script src=...> and 'OR 19=19. The regex rejects them all.
Then our attacker, let's call her Agatha, happens to come by the site. She's a little savvier and, whether or not she knows exactly what the validation pattern looks like, tries a few malicious zip codes, each of which contains five consecutive digits:
90210'
<script>alert(0x42)<script>57732
10118<script>alert(0x42)<script>
Poor Wilberforce's unanchored pattern finds a matching string in all three cases, thereby allowing the malicious content through the filter and enabling Agatha to compromise the site. If the pattern had been anchored to match the complete input string from beginning to end then the filter wouldn't have failed so spectacularly:
^\d{5}$
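A quick sketch of Wilberforce's filter before and after anchoring, run against a valid zip code and Agatha's three inputs. The variable names are illustrative:

```javascript
const zipUnanchored = /\d{5}/;
const zipAnchored = /^\d{5}$/;

const zipCandidates = [
  "90210",                             // a well-formed zip code
  "90210'",                            // Agatha's inputs from the text
  "<script>alert(0x42)<script>57732",
  "10118<script>alert(0x42)<script>",
];

// Print each input with the verdict of the unanchored and anchored patterns.
for (const zip of zipCandidates) {
  console.log(JSON.stringify(zip), zipUnanchored.test(zip), zipAnchored.test(zip));
}
```

The unanchored pattern accepts all four inputs because each contains five consecutive digits somewhere; the anchored pattern accepts only the first.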
Unravelling Strings
Even basic string-matching approaches can fall victim to the unanchored problem; after all they're nothing more than regex patterns without the syntax for wildcards, alternation, and grouping. Let's go back to the Stanford paper for an example of walmart.com's document.referrer check based on a JavaScript String object's indexOf function. This function returns the first position in the input string of the argument or -1 in case the argument isn't found:
if (top.location != location) {
    if (document.referrer &&
        document.referrer.indexOf("walmart.com") == -1)
    {
        top.location.replace(document.location.href);
    }
}
Sigh. As long as the document.referrer contains the string "walmart.com" the anti-framing code won't trigger. For Agatha, the bypass is as simple as putting her booby-trapped clickjacking page on a site with a domain name like "walmart.com.evil.lair" or maybe using a URI fragment, http://evil.lair/clickjack.html#walmart.com. The developers neglected to ensure that the host from the referrer URI ends in walmart.com rather than merely contains walmart.com.
The previous sentence is very important. Read it again. The referrer string isn't supposed to end in walmart.com, the referrer's host is supposed to end with that domain. That's an important distinction considering the bypass techniques we've already mentioned:
http://walmart.com.evil.lair/clickjack.html
http://evil.lair/clickjack.html#walmart.com
http://evil.lair/clickjack.html?a=walmart.com
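The substring test can be exercised outside a browser with a small sketch. The function name naiveReferrerCheck is hypothetical; it only mirrors the indexOf condition from the original code, returning true when the anti-framing code would stay quiet:

```javascript
// Mirrors the original condition: the referrer is trusted whenever
// "walmart.com" appears anywhere inside it.
function naiveReferrerCheck(referrer) {
  return referrer.indexOf("walmart.com") !== -1;
}

// The three bypass strings listed in the text.
const bypassStrings = [
  "http://walmart.com.evil.lair/clickjack.html",
  "http://evil.lair/clickjack.html#walmart.com",
  "http://evil.lair/clickjack.html?a=walmart.com",
];

const naiveResults = bypassStrings.map(naiveReferrerCheck);
console.log(naiveResults);
```

Every entry comes back true, since the check never asks where in the URI the string appears.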
Parsers Before Patterns
Input validation filters often require an understanding of a data type's grammar. Sometimes this is simple, such as a five-digit zip code. More complex cases, such as email addresses and URIs, require that the input string be parsed before pattern matching is applied.
The previous indexOf string example failed because it doesn't actually parse the referrer's URI; it just looks for the presence of a string. The regex pattern in the nytimes.com example was superior because it at least tried to understand the URI grammar by matching content between the URI's scheme (http or https) and the first slash (/)1.
A good security filter must understand the context of the pattern to be matched. The improved walmart.com referrer check is shown below. Notice that the get_hostname_from_url function now uses a regex to extract the host name from the referrer's URI and the string comparison ensures the host name either exactly matches or ends with "walmart.com". (You could quibble that the regex in get_hostname_from_url isn't anchored, but in this case the pattern works because it's not possible to smuggle malicious content inside the URI's scheme. The pattern would fail if it returned the last match instead of the first match. And, yes, the typo in the comment in the killFrames function is in the original JavaScript.)
function killFrames() {
    if (top.location != location) {
        if (document.referrer) {
            var referrerHostname = get_hostname_from_url(document.referrer);
            var strLength = referrerHostname.length;
            if ((strLength == 11) && (referrerHostname != "walmart.com")) { // to take care of http://walmart.com url - length of "walmart.com" string is 11.
                top.location.replace(document.location.href);
            } else if (strLength != 11 && referrerHostname.substring(referrerHostname.length - 12) != ".walmart.com") { // length og ".walmart.com" string is 12.
                top.location.replace(document.location.href);
            }
        }
    }
}
function get_hostname_from_url(url) {
    return url.match(/:\/\/(.[^/?]+)/)[1];
}
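As an alternative sketch of the "parse first, then compare" idea, the host can be pulled out with the URL class available in current browsers and Node.js instead of a hand-written regex. This is an assumption on my part and not what walmart.com deployed; the function name isAllowedReferrer and its parameters are hypothetical:

```javascript
// Hypothetical helper, not part of the original walmart.com code.
// Returns true only when the referrer's host is the domain itself
// or a subdomain of it.
function isAllowedReferrer(referrer, domain) {
  let host;
  try {
    // The URL constructor throws a TypeError on input it cannot parse.
    host = new URL(referrer).hostname;
  } catch (e) {
    return false;
  }
  // Compare against the parsed host, never the raw referrer string.
  return host === domain || host.endsWith("." + domain);
}

console.log(isAllowedReferrer("http://www.walmart.com/cart", "walmart.com"));
console.log(isAllowedReferrer("http://walmart.com.evil.lair/clickjack.html", "walmart.com"));
```

Prefixing the domain with a dot in the suffix comparison matters: without it, a host such as "notwalmart.com" would also pass.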
Conclusion
Regexes and string matching functions are ubiquitous throughout web applications. If you're implementing security filters with these functions, keep these points in mind:
Normalize the character set. Ensure the string functions and regex patterns match the character encoding, e.g. multi-byte string functions for multi-byte sequences.
Always match the entire input string. Anchor patterns to the start ('^') and end ('$') of input strings. If you expect input strings to include multiple lines, understand how multiline (?m) and single line (?s) flags will affect the pattern -- if you're not sure then treat it as a single line. Where appropriate to the context, the results of string matching functions should be tested to see if the match occurred at the beginning, within, or end of a string.
Prefer a positive security model over a negative one. Whitelist content that you expect to receive and reject anything that doesn't fit. Whitelist filters should be as strict as possible to avoid incorrectly matching malicious content. If you go the route of blacklisting content, make the patterns as lenient as possible to better match unexpected scenarios -- an attacker may have an encoding technique or JavaScript trick you've never heard of.
Consider a parser instead of a regex. If you want to match a URI attribute, make sure your pattern extracts the right value. URIs can be complex. If you're trying to use regexes to parse HTML content...good luck.
Don't shy away from regexes because their syntax looks daunting; just remember to test your patterns against a wide array of malicious and valid input strings.
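The warning about multiline flags is easy to check. In this sketch (variable names and the two-line input are made up), the same anchored zip code pattern is applied with and without JavaScript's m flag:

```javascript
// Hypothetical two-line input: a valid zip code followed by a payload.
const twoLineInput = "90210\n<script>alert(0x42)</script>";

const wholeInput = /^\d{5}$/;   // '^' and '$' bind to the whole input
const perLine = /^\d{5}$/m;     // 'm' lets '^' and '$' bind to each line

console.log(wholeInput.test(twoLineInput), perLine.test(twoLineInput));
```

With the m flag the pattern is satisfied by the first line alone, so the payload on the second line rides along. Anchor behavior around newlines differs between regex engines, so it is worth running this kind of test in whatever language the filter is actually written in.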
=====
1 Technically, the pattern should match the host portion of the URI's authority. Check out RFC 3986 for specifics, especially the regexes mentioned in Appendix B.
Labels:
web security
June 11, 2010
DEFCON 18
DEFCON 18 is coming up from Friday July 30th to Sunday August 1st in Las Vegas. They always have cool badges so you should at least sign up just for that.
If badges aren't enough to whet your appetite, think about how much fun you might have learning about "Securing MMOs: A Security Professional's View From the Inside" or the forensics of video games in "The Games We Play".
The EFF is also running a contest to sign up attendees and gather new members. You can go directly to their web site or click on the following image to be part of The Book's efforts to support the EFF:
May 18, 2010
Cross-Site Tracing (XST): The misunderstood vulnerability
In January 2003 Jeremiah Grossman divulged a method to bypass the HttpOnly1 cookie restriction. He named it Cross-Site Tracing (XST), unwittingly starting a trend to attach "cross-site" to as many web-related vulnerabilities as possible.
Alas, the "XS" in XST evokes similarity to XSS (Cross-Site Scripting) which has the consequence of leading people to mistake XST for a method of injecting JavaScript. (Thankfully, character encoding attacks have avoided the term Cross-Site Unicode, XSU.) Although XST attacks rely on browser scripting to exploit the vulnerability, the vulnerability is not the injection of JavaScript. XST is a means for accessing headers normally restricted from JavaScript.
Confused yet?
First, review XSS. XSS vulnerabilities, better described as HTML injection, occur because a web application echoes an attacker's payload within the HTTP response body -- the HTML. This enables the attacker to modify a page's DOM by injecting characters that affect the HTML's layout, such as adding spurious characters like brackets (< and >) and quotes (' and "). Cross-site tracing relies on HTML injection to craft an exploit within the victim's browser, but this implies that an attacker already has the capability to execute JavaScript. So, XST isn't about injecting <script> tags into the browser; the attacker must already be able to do that.
Cross-site tracing takes advantage of the fact that a web server should reflect the client's HTTP message in its response.2 The common misunderstanding of an XST attack's goal is that it uses a TRACE request to cause the server to reflect JavaScript in the HTTP response body that the browser would consequently execute. As the following example shows, this is in fact what happens even though the reflection of JavaScript isn't the real vulnerability. The green and red text indicates the response body. The request was made with netcat.
The reflection of <script> tags is immaterial (the RFC even says the server should reflect the request without modification). The real outcome of an XST attack is that it exposes HTTP headers normally inaccessible to JavaScript.
Let that sink in for a moment. XST attacks use the TRACE (or synonymous TRACK) method to read HTTP headers that are otherwise blocked from JavaScript access.
For example, the HttpOnly attribute of a cookie is supposed to prevent JavaScript from reading that cookie's properties. The Authorization header, which for HTTP Basic Auth is simply the Base64-encoded username and password, is not part of the DOM and is not directly readable by JavaScript.
No cookie values or auth headers showed up when we made the request via netcat because, obviously, netcat doesn't have the internal state a browser does. So, take a look at the server's response when a browser's XHR object is used to make a TRACE request for. This is the snippet of JavaScript:
<script>
var xhr = new XMLHttpRequest();
xhr.open('TRACE', 'http://test.lab/', false);
xhr.send(null);
if (200 == xhr.status)
    alert(xhr.responseText);
</script>
The following image shows one possible response. Notice the text in red. The browser added the Authorization and Cookie headers to the XHR request, which have been reflected by the server:
Now we see that both an HTTP Basic Authentication header and a cookie value have shown up in the response text. A simple JavaScript regex could extract these values, bypassing the normal restrictions imposed on script access to headers or protected cookies. The drawback for attackers is that modern browsers (such as the ones that have moved into this decade) are savvy enough to block TRACE requests through the XMLHttpRequest object, which leaves the attacker to look for alternate capabilities in plug-ins like Flash.
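To make the "simple regex" concrete, here is a sketch that pulls header values out of a reflected TRACE body. The response text is invented for illustration (only the test.lab host comes from the snippet above), and the helper name reflectedHeader is hypothetical:

```javascript
// Hypothetical reflected TRACE body; the credentials and cookie are invented.
const traceBody = [
  "TRACE / HTTP/1.1",
  "Host: test.lab",
  "Authorization: Basic dXNlcjpwYXNzd29yZA==",
  "Cookie: sessionid=12345",
  "",
  "",
].join("\r\n");

// Returns the value of the named header line, or null if it is absent.
// The name is used verbatim in the pattern, so pass plain header names only.
function reflectedHeader(body, name) {
  const pattern = new RegExp("^" + name + ":[ \\t]*(.*)$", "im");
  const match = body.match(pattern);
  return match ? match[1] : null;
}

console.log(reflectedHeader(traceBody, "Authorization"));
console.log(reflectedHeader(traceBody, "Cookie"));
```

Here the m flag is wanted, since each header sits on its own line and the anchors should bind per line. This is the opposite of the input validation case, where the whole string has to match.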
This is the real vulnerability associated with cross-site tracing: peeking at header values. The exploit would be impossible without the ability to inject JavaScript in the first place3. Therefore, its real impact (or threat, depending on how you define these terms) is exposing sensitive header data. Hence, alternate names for XST could be TRACE disclosure attack, TRACE header reflection, TRACE method injection (TMI), or TRACE header & cookie (THC) attack.
We'll see if any of those actually catch on for the next OWASP Top 10 list.
-----
1 HttpOnly was introduced by Microsoft in Internet Explorer 6 Service Pack 1, which was released September 9, 2002. It was created to mitigate, not block, XSS exploits that explicitly attacked cookie values. It wasn't a method for preventing HTML injection (XSS) vulnerabilities from occurring in the first place. Mozilla magnanimously adopted it in Firefox 2.0.0.5 four and a half years later.
2 Section 9.8 of the HTTP/1.1 RFC.
3 Security always has nuanced exceptions. Merely requesting "TRACE /<script>alert(42)</script> HTTP/1.0" might be stored in the web server's log file or a database. If some log parsing tool renders requests like this to a web page without filtering the content, then HTML injection once again becomes possible. This is often referred to as second order XSS -- when a payload is injected via one application, stored, then rendered by a separate web app.
Labels:
cross site tracing,
html injection,
web security
May 8, 2010
At about this time...
"When a day that you happen to know is Wednesday starts off by sounding like Sunday, there is something seriously wrong somewhere."
Bill Masen's day only worsens as he tries to survive the apocalyptic onslaught of mobile, venomous plants.
John Wyndham's The Day of the Triffids doesn't feel like an outdated menace even though the book was published in 1951. Furthermore, the central character's initial condition serves as an excellent hook that both gives a brief history of the triffids and underlies the high tension with which the book starts off. Where Cormac McCarthy's The Road focuses on the harshness of personal survival and meaning after an apocalypse, Triffids considers the ways a society might try to emerge from one.
The 1981 BBC adaptation stays very close to the book's plot and pacing. Read the book first, as the time-capsule aspect of the mini-series might distract you -- '80s hair styles, control panels, and lens flares. If you've been devoted to Doctor Who since the Tom Baker era (or before), you'll feel right at home!
Labels:
completely unrelated

