SecPod SCAP Repo, a repository of SCAP Content (CVE, CCE, CPE, CWE, OVAL and XCCDF)


CWE

Improper Encoding or Escaping of Output

ID: 116		Date: (C)2012-05-14 (M)2022-10-10
Type: weakness		Status: DRAFT
Abstraction Type: Class

Description

The software prepares a structured message for communication with another component, but encoding or escaping of the data is either missing or done incorrectly. As a result, the intended structure of the message is not preserved.

Extended Description

Improper encoding or escaping can allow attackers to change the commands that are sent to another component, inserting malicious commands instead.

Most software follows a certain protocol that uses structured messages for communication between components, such as queries or commands. These structured messages can contain raw data interspersed with metadata or control information. For example, "GET /index.html HTTP/1.1" is a structured message containing a command ("GET") with a single argument ("/index.html") and metadata about which protocol version is being used ("HTTP/1.1").

If an application uses attacker-supplied inputs to construct a structured message without properly encoding or escaping, then the attacker could insert special characters that will cause the data to be interpreted as control information or metadata. Consequently, the component that receives the output will perform the wrong operations, or otherwise interpret the data incorrectly.
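
The effect can be sketched in Python with the HTTP request line from the example above. This is an illustrative sketch only: the function names `build_request_line_unsafe` and `build_request_line_encoded` are hypothetical, and percent-encoding via `urllib.parse.quote` is assumed here as one encoding suited to this particular context, not a general-purpose fix.

```python
from urllib.parse import quote


def build_request_line_unsafe(path: str) -> str:
    # Attacker-controlled data is pasted straight into the structured message.
    return f"GET {path} HTTP/1.1\r\n"


def build_request_line_encoded(path: str) -> str:
    # Percent-encode everything except "/" so spaces, CR and LF remain data.
    return f"GET {quote(path, safe='/')} HTTP/1.1\r\n"


# Hypothetical attacker input that smuggles extra lines into the message.
malicious = "/index.html HTTP/1.1\r\nX-Injected: yes\r\n\r\nGET /admin"

unsafe = build_request_line_unsafe(malicious)
encoded = build_request_line_encoded(malicious)

# Count line terminators: the unsafe message has grown beyond one line,
# while the encoded message still has exactly one.
print(unsafe.count("\r\n"))
print(encoded.count("\r\n"))
print(encoded)
```

In the unsafe variant the receiver sees an extra header and a second request line; in the encoded variant the whole input stays inside the single path argument.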

Likelihood of Exploit: Very High

Applicable Platforms
Language Class: All
Technology Class: Database-Server (Often)
Technology Class: Web-Server (Often)

Time Of Introduction

Architecture and Design
Implementation
Operation

Related Attack Patterns

Common Consequences

Scope	Technical Impact	Notes
Integrity, Confidentiality, Availability, Access_Control	Modify application data; Execute unauthorized code or commands; Bypass protection mechanism	The communications between components can be modified in unexpected ways. Unexpected commands can be executed, bypassing other security mechanisms. Incoming data can be misinterpreted.

Detection Methods

Name	Description	Effectiveness	Notes
Automated Static Analysis	This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives.	Moderate
Automated Dynamic Analysis	This weakness can be detected using dynamic tools and techniques that interact with the software using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The software's operation may slow down, but it should not become unstable, crash, or generate incorrect results.

Potential Mitigations

Phase	Strategy	Description
Architecture and Design	Libraries or Frameworks	Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, consider using the ESAPI Encoding control [R.116.1] or a similar tool, library, or framework. These will help the programmer encode outputs in a manner less prone to error. Alternately, use built-in functions, but consider using wrappers in case those functions are discovered to have a vulnerability.
Architecture and Design	Parameterization	If available, use structured mechanisms that automatically enforce the separation between data and code. These mechanisms may be able to provide the relevant quoting, encoding, and validation automatically, instead of relying on the developer to provide this capability at every point where output is generated. For example, stored procedures can enforce database query structure and reduce the likelihood of SQL injection.
Architecture and Design, Implementation		Understand the context in which your data will be used and the encoding that will be expected. This is especially important when transmitting data between different components, or when generating outputs that can contain multiple encodings at the same time, such as web pages or multi-part mail messages. Study all expected communication protocols and data representations to determine the required encoding strategies.
Architecture and Design		In some cases, input validation may be an important strategy when output encoding is not a complete solution. For example, you may be providing the same output that will be processed by multiple consumers that use different encodings or representations. In other cases, you may be required to allow user-supplied input to contain control information, such as limited HTML tags that support formatting in a wiki or bulletin board. When this type of requirement must be met, use an extremely strict whitelist to limit which control sequences can be used. Verify that the resulting syntactic structure is what you expect. Use your normal encoding methods for the remainder of the input.
Architecture and Design		Use input validation as a defense-in-depth measure to reduce the likelihood of output encoding errors (see CWE-20).
Requirements		Fully specify which encodings are required by components that will be communicating with each other.
Implementation		When exchanging data between components, ensure that both components are using the same character encoding. Ensure that the proper encoding is applied at each interface. Explicitly set the encoding you are using whenever the protocol allows you to do so.
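
As a rough illustration of why the output context matters, the following Python sketch encodes the same value for three different consumers using only standard-library functions. The sample value and the choice of three contexts are assumptions made for the example; a real application would pick the encoder that matches each interface it writes to.

```python
import html
import json
from urllib.parse import quote

# Hypothetical untrusted value containing characters special in several contexts.
value = 'Tom & "Jerry" <script>'

# HTML text or attribute content: escape markup-significant characters.
as_html = html.escape(value, quote=True)

# URL component: percent-encode everything outside the unreserved set.
as_url = quote(value, safe="")

# JSON string literal: let the serializer add quotes and backslash escapes.
as_json = json.dumps(value)

# Bytes on the wire: state the character encoding explicitly.
as_bytes = value.encode("utf-8")

print(as_html)
print(as_url)
print(as_json)
```

Each result is only correct for its own context; for instance, the HTML-escaped form is not a safe substitute for the URL-encoded form.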

Relationships
This weakness is primary to all weaknesses related to injection (CWE-74) since the inherent nature of injection involves the violation of structured messages.
CWE-116 and CWE-20 have a close association because, depending on the nature of the structured message, proper input validation can indirectly prevent special characters from changing the meaning of a structured message. For example, by validating that a numeric ID field should only contain the 0-9 characters, the programmer effectively prevents injection attacks.
However, input validation is not always sufficient, especially when less stringent data types must be supported, such as free-form text. Consider a SQL injection scenario in which a last name is inserted into a query. The name "O'Reilly" would likely pass the validation step since it is a common last name in the English language. However, it cannot be directly inserted into the database because it contains the "'" apostrophe character, which would need to be escaped or otherwise neutralized. In this case, stripping the apostrophe might reduce the risk of SQL injection, but it would produce incorrect behavior because the wrong name would be recorded.
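
The "O'Reilly" case can be sketched with Python's built-in `sqlite3` module. The `users` table and the `parse_numeric_id` helper are hypothetical names chosen for this example; the point is only that a parameterized query stores the name unchanged, while strict validation remains suitable for a numeric ID field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT)")

last_name = "O'Reilly"

# The "?" placeholder keeps the value out of the SQL text entirely, so the
# apostrophe is stored as data and nothing has to be stripped from the name.
conn.execute("INSERT INTO users (last_name) VALUES (?)", (last_name,))

stored = conn.execute("SELECT last_name FROM users").fetchone()[0]
print(stored)


def parse_numeric_id(raw: str) -> int:
    # Strict validation for a numeric ID field: ASCII digits 0-9 only.
    # (str.isdigit() is avoided because it also accepts non-ASCII digits.)
    if not raw or any(ch not in "0123456789" for ch in raw):
        raise ValueError("ID must contain only the characters 0-9")
    return int(raw)


print(parse_numeric_id("42"))
```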

Related CWE	Type	View	Chain
CWE-116 ChildOf CWE-896	Category	CWE-888

Demonstrative Examples

Observed Examples

CVE-2008-4636 : OS command injection in backup software using shell metacharacters in a filename; correct behavior would require that this filename could not be changed.
CVE-2008-0769 : Web application does not set the charset when sending a page to a browser, allowing for XSS exploitation when a browser chooses an unexpected encoding.
CVE-2008-0005 : Program does not set the charset when sending a page to a browser, allowing for XSS exploitation when a browser chooses an unexpected encoding.
CVE-2008-5573 : SQL injection via password parameter; a strong password might contain "&".
CVE-2008-3773 : Cross-site scripting in chat application via a message subject, which normally might contain "&" and other XSS-related characters.
CVE-2008-0757 : Cross-site scripting in chat application via a message, which normally might be allowed to contain arbitrary content.
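
The shell-metacharacter pattern in the first entry above can be illustrated generically in Python. This sketch does not reproduce any of the listed CVEs; the filename is a made-up value, and the child process is simply the current Python interpreter echoing back the argument it received.

```python
import shlex
import subprocess
import sys

# Hypothetical filename containing a shell metacharacter (";").
filename = "backup.tar; echo INJECTED"

# Passing an argument list with no shell delivers the filename to the
# child process as a single argument, so ";" is never interpreted.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", filename],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.rstrip("\r\n")

# If a POSIX shell command string cannot be avoided, shlex.quote escapes
# the value so the shell treats it as one word.
quoted = shlex.quote(filename)

print(received)
print(quoted)
```

Avoiding the shell altogether is usually the simpler option, since quoting rules differ between shells and `shlex.quote` targets POSIX shells only.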


White Box Definitions
None

Black Box Definitions
None

Taxonomy Mappings

Taxonomy	Id	Name
WASC	22	Improper Output Handling
CERT Java Secure Coding	IDS00-J	Sanitize untrusted data passed across a trust boundary
CERT Java Secure Coding	IDS12-J	Perform lossless conversion of String data between differing character encodings
CERT Java Secure Coding	IDS05-J	Use a subset of ASCII for file and path names
CERT C++ Secure Coding	MSC09-CPP	Character Encoding - Use Subset of ASCII for Safety
CERT C++ Secure Coding	MSC10-CPP	Character Encoding - UTF8 Related Issues

References:

OWASP. OWASP Enterprise Security API (ESAPI) Project.
Joshbw. Output Sanitization. 2008-09-18.
Niyaz PK. Sanitizing user data: How and where to do it. 2008-09-11.
Jeremiah Grossman. Input validation or output filtering, which is better? 2007-01-30.
Jim Manico. Input Validation - Not That Important. 2008-08-10.
Michael Eddington. Preventing XSS with Correct Output Encoding.
M. Howard and D. LeBlanc. Writing Secure Code, 2nd Edition. Microsoft, 2002. Chapter 11, "Canonical Representation Issues", page 363.

