Improper Neutralization of Script-Related HTML Tags in a Web Page (Basic XSS)

ID: 80
Date: (C)2012-05-14   (M)2012-11-08
Type: weakness
Status: INCOMPLETE
Abstraction Type: Variant


The software receives input from an upstream component, but it does not neutralize or incorrectly neutralizes special characters such as "<", ">", and "&" that could be interpreted as web-scripting elements when they are sent to a downstream component that processes web pages.

Extended Description

This may allow such characters to be treated as control characters, so that attacker-supplied markup is executed client-side in the context of the user's session. Although this can be classified as an injection problem, the more pertinent issue is the failure to convert such special characters to context-appropriate entities before displaying them to the user.
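
As an illustration of converting special characters for the context in which they are emitted, the following is a minimal sketch using Python's standard library. The function names encode_for_html and encode_for_url_component are hypothetical, and only two output contexts are shown; other contexts (such as script blocks or style sheets) need their own encoders.

```python
import html
from urllib.parse import quote


def encode_for_html(untrusted: str) -> str:
    """Convert &, <, >, and quote characters to HTML character entities.

    Intended for HTML element content or a quoted attribute value.
    """
    return html.escape(untrusted, quote=True)


def encode_for_url_component(untrusted: str) -> str:
    """Percent-encode a value for use as a single URL query component."""
    # safe="" means that even "/" is percent-encoded
    return quote(untrusted, safe="")


sample = '<script>alert("xss")</script>'
print(encode_for_html(sample))
print(encode_for_url_component(sample))
```

The same input yields different output for each context, which is why a single generic filter is rarely sufficient.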

Likelihood of Exploit: High to Very High

Applicable Platforms
Language Class: All

Time Of Introduction

  • Implementation

Related Attack Patterns

Common Consequences

Technical Impact:

  • Read application data
  • Execute unauthorized code or commands

Detection Methods

Potential Mitigations

Carefully check each input parameter against a rigorous positive specification (white list) defining the specific characters and format allowed. All input should be neutralized: not just the parameters that the user is supposed to specify, but all data in the request, including hidden fields, cookies, headers, and the URL itself. A common mistake that leads to continuing XSS vulnerabilities is to validate only the fields that are expected to be redisplayed by the site. Request data that the development team did not anticipate is often reflected by the application server or the application. In addition, a field that is not currently reflected may be used by a future developer. Validating ALL parts of the HTTP request is therefore recommended.
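
A positive specification of this kind might be sketched as follows in Python. The field names and patterns in FIELD_SPEC and the function validate_fields are hypothetical, chosen only for illustration. A real application would define one entry for every value it reads from the request, whether it arrives as a form field, cookie, or header.

```python
import re

# Hypothetical allow-list: each expected field maps to the pattern it must fully match.
FIELD_SPEC = {
    "username": re.compile(r"[A-Za-z0-9_]{3,20}"),
    "page": re.compile(r"[0-9]{1,4}"),
    "session_hint": re.compile(r"[a-f0-9]{32}"),  # e.g. a value arriving in a cookie
}


def validate_fields(fields):
    """Return the names of fields that fail the allow-list.

    A field fails if it has no entry in FIELD_SPEC (unexpected input)
    or if its value does not fully match the pattern for that field.
    """
    rejected = []
    for name, value in fields.items():
        pattern = FIELD_SPEC.get(name)
        if pattern is None or pattern.fullmatch(value) is None:
            rejected.append(name)
    return rejected


request_data = {
    "username": "alice_01",
    "page": "2",
    "referrer_note": "<script>alert(1)</script>",  # field nobody planned for
}
print(validate_fields(request_data))  # ['referrer_note']
```

Rejecting unknown fields by default, rather than passing them through, covers the case where data the team did not anticipate ends up reflected later.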
Output Encoding
Use and specify an output encoding that can be handled by the downstream component that is reading the output. Common encodings include ISO-8859-1, UTF-7, and UTF-8. When an encoding is not specified, a downstream component may choose a different encoding, either by assuming a default encoding or by automatically inferring which encoding is being used, which can be erroneous. When the encodings are inconsistent, the downstream component might treat some character or byte sequences as special, even if they are not special in the original encoding. Attackers might then be able to exploit this discrepancy and conduct injection attacks; they might even be able to bypass protection mechanisms that assume the original encoding is also being used by the downstream component.
The problem of inconsistent output encodings often arises in web pages. If an encoding is not specified in an HTTP header, web browsers often guess about which encoding is being used. This can open up the browser to subtle XSS attacks.
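
One way to avoid leaving the encoding to guesswork is to declare it explicitly in the Content-Type header and to encode the body with the same charset. The following is a minimal sketch as a WSGI application in Python; the names app and PAGE are hypothetical, and UTF-8 is an assumed choice rather than a requirement.

```python
# Hypothetical page content; the meta tag repeats the charset declared in the header.
PAGE = (
    '<!DOCTYPE html><html><head><meta charset="utf-8">'
    "<title>Demo</title></head><body><p>caf\u00e9</p></body></html>"
)


def app(environ, start_response):
    """Minimal WSGI application that states its output encoding explicitly."""
    body = PAGE.encode("utf-8")  # bytes on the wire match the declared charset
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The application can be served for local experiments with the standard library's wsgiref.simple_server.make_server.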
 With Struts, write all data from form beans with the bean's filter attribute set to true.
Identify and Reduce Attack Surface
To help mitigate XSS attacks against the user's session cookie, set the session cookie to be HttpOnly. In browsers that support the HttpOnly feature (such as more recent versions of Internet Explorer and Firefox), this attribute can prevent the user's session cookie from being accessible to malicious client-side scripts that use document.cookie. This is not a complete solution, since HttpOnly is not supported by all browsers. More importantly, XMLHttpRequest and other powerful browser technologies provide read access to HTTP headers, including the Set-Cookie header in which the HttpOnly flag is set.
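
Setting the flag could look like the following sketch, which uses Python's standard http.cookies module. The cookie name "session" and the function build_session_cookie are assumptions made for illustration; web frameworks usually expose an equivalent option of their own.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return the value of a Set-Cookie header for an HttpOnly session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True  # hide the cookie from document.cookie
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


print(build_session_cookie("abc123"))
```

The returned string is sent to the browser as the value of a Set-Cookie response header.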
Defense in Depth


Related CWE               Type       View      Chain
CWE-80 ChildOf CWE-896    Category   CWE-888

Demonstrative Examples

  1. In the following example, a guestbook comment isn't properly encoded, filtered, or otherwise neutralized for script-related tags before being displayed in a client browser.
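
A hypothetical sketch of such a guestbook in Python (not the original listing) is given here. The function names render_comment_unsafe and render_comment_safe are assumptions. The first reproduces the weakness, and the second shows the same output with script-related characters neutralized.

```python
import html


def render_comment_unsafe(author, comment):
    # Vulnerable: untrusted values are concatenated into the markup unchanged,
    # so a comment containing <script> tags becomes live script in the page.
    return "<div class='comment'><b>" + author + "</b>: " + comment + "</div>"


def render_comment_safe(author, comment):
    # Neutralized: <, >, &, and quote characters are converted to entities first.
    return (
        "<div class='comment'><b>" + html.escape(author) + "</b>: "
        + html.escape(comment) + "</div>"
    )


malicious = "<script>alert('hi')</script>"
print(render_comment_unsafe("mallory", malicious))
print(render_comment_safe("mallory", malicious))
```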

Observed Examples

  1. CVE-2002-0938 : XSS in parameter in a link.
  2. CVE-2002-1495 : XSS in web-based email product via attachment filenames.
  3. CVE-2003-1136 : HTML injection in posted message.
  4. CVE-2004-2171 : XSS not quoted in error page.


White Box Definitions
A weakness where the code path has:
1. start statement that accepts input from HTML page
2. end statement that publishes a data item to HTML where
a. the input is part of the data item and
b. the input contains XSS syntax

Black Box Definitions

Taxonomy Mappings



