Improper Encoding or Escaping of Output
ID: 116 | Date: (C)2012-05-14 (M)2022-10-10 |
Type: weakness | Status: DRAFT |
Abstraction Type: Class |
Description
The software prepares a structured message for communication
with another component, but encoding or escaping of the data is either missing
or done incorrectly. As a result, the intended structure of the message is not
preserved.
Extended Description
Improper encoding or escaping can allow attackers to change the commands
that are sent to another component, inserting malicious commands
instead.

Most software follows a certain protocol that uses structured messages for
communication between components, such as queries or commands. These
structured messages can contain raw data interspersed with metadata or
control information. For example, "GET /index.html HTTP/1.1" is a structured
message containing a command ("GET") with a single argument ("/index.html")
and metadata about which protocol version is being used ("HTTP/1.1").

If an application uses attacker-supplied inputs to construct a structured
message without properly encoding or escaping, then the attacker could
insert special characters that will cause the data to be interpreted as
control information or metadata. Consequently, the component that receives
the output will perform the wrong operations, or otherwise interpret the
data incorrectly.
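The "GET /index.html HTTP/1.1" case above can be sketched in a few lines. This is a minimal illustration, not a full HTTP client: the function names and the attacker string are hypothetical, and percent-encoding with `urllib.parse.quote` is only one encoding that suits this particular context.

```python
from urllib.parse import quote


def build_request_line_unsafe(path: str) -> str:
    # Raw concatenation: spaces or CR/LF inside `path` become part of the
    # message structure instead of staying inside the single argument.
    return "GET " + path + " HTTP/1.1\r\n"


def build_request_line(path: str) -> str:
    # Percent-encode everything except unreserved characters and "/",
    # so the path remains one argument of the GET command.
    return "GET " + quote(path, safe="/") + " HTTP/1.1\r\n"


# Hypothetical attacker-supplied value that tries to smuggle a header
# and a second request into the message.
attacker_path = "/index.html HTTP/1.1\r\nX-Injected: 1\r\n\r\nGET /admin"

unsafe = build_request_line_unsafe(attacker_path)
safe = build_request_line(attacker_path)

print(unsafe.count("\r\n"))  # 4: the data introduced extra protocol lines
print(safe.count("\r\n"))    # 1: only the terminator added by the builder
```

In the encoded version the receiver still sees exactly one command, one argument, and one protocol version, which is the structure the sender intended.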
Likelihood of Exploit: Very High
Applicable Platforms
- Language Class: All
- Technology Class: Database-Server (Often)
- Technology Class: Web-Server (Often)
Time Of Introduction
- Architecture and Design
- Implementation
- Operation
Related Attack Patterns
Common Consequences
Scope | Technical Impact | Notes |
---|
Integrity, Confidentiality, Availability, Access_Control | Modify application data; Execute unauthorized code or commands; Bypass protection mechanism | The communications between components can be modified in unexpected
ways. Unexpected commands can be executed, bypassing other security
mechanisms. Incoming data can be misinterpreted. |
Detection Methods
Name | Description | Effectiveness | Notes |
---|
Automated Static Analysis | This weakness can often be detected using automated static analysis
tools. Many modern tools use data flow analysis or constraint-based
techniques to minimize the number of false positives. | Moderate | |
Automated Dynamic Analysis | This weakness can be detected using dynamic tools and techniques that
interact with the software using large test suites with many diverse
inputs, such as fuzz testing (fuzzing), robustness testing, and fault
injection. The software's operation may slow down, but it should not
become unstable, crash, or generate incorrect results. | | |
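As a rough illustration of the dynamic approach, the sketch below fuzzes a round trip: random arguments are encoded into a message, parsed back, and compared with the original. The space-delimited message format, the function names, and the character alphabet are all assumptions made for this example; a real harness would drive the actual component under test.

```python
import random
from urllib.parse import quote, unquote

# Characters that commonly act as delimiters or metacharacters.
ALPHABET = "ab01 |&'\"<>\r\n%;/\\"


def encode_naive(command, args):
    # No escaping: delimiters inside an argument leak into the structure.
    return " ".join([command] + list(args))


def encode_escaped(command, args):
    # Percent-encode every argument so no delimiter survives inside it.
    return " ".join([command] + [quote(a, safe="") for a in args])


def parse_message(line):
    # Hypothetical receiver: single-space delimited, arguments are decoded.
    command, *args = line.split(" ")
    return command, [unquote(a) for a in args]


def fuzz_round_trip(encoder, trials=1000, seed=0):
    # Return every argument list whose structure did not survive the trip.
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        args = [
            "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, 8)))
            for _ in range(rng.randint(1, 3))
        ]
        if parse_message(encoder("SAY", args)) != ("SAY", args):
            failures.append(args)
    return failures


naive_failures = fuzz_round_trip(encode_naive)
escaped_failures = fuzz_round_trip(encode_escaped)
print(len(naive_failures) > 0, len(escaped_failures))
```

A non-empty failure list points at inputs for which the message structure was not preserved, which is the condition this weakness describes.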
Potential Mitigations
Phase | Strategy | Description | Effectiveness | Notes |
---|
Architecture and Design | Libraries or Frameworks | Use a vetted library or framework that does not allow this weakness to
occur or provides constructs that make this weakness easier to
avoid. For example, consider using the ESAPI Encoding control [R.116.1] or a
similar tool, library, or framework. These will help the programmer
encode outputs in a manner less prone to error. Alternately, use built-in functions, but consider using wrappers in
case those functions are discovered to have a vulnerability. | | |
Architecture and Design | Parameterization | If available, use structured mechanisms that automatically enforce the
separation between data and code. These mechanisms may be able to
provide the relevant quoting, encoding, and validation automatically,
instead of relying on the developer to provide this capability at every
point where output is generated. For example, stored procedures can enforce database query structure
and reduce the likelihood of SQL injection. | | |
Architecture and Design, Implementation | | Understand the context in which your data will be used and the
encoding that will be expected. This is especially important when
transmitting data between different components, or when generating
outputs that can contain multiple encodings at the same time, such as
web pages or multi-part mail messages. Study all expected communication
protocols and data representations to determine the required encoding
strategies. | | |
Architecture and Design | | In some cases, input validation may be an important strategy when
output encoding is not a complete solution. For example, you may be
providing the same output that will be processed by multiple consumers
that use different encodings or representations. In other cases, you may
be required to allow user-supplied input to contain control information,
such as limited HTML tags that support formatting in a wiki or bulletin
board. When this type of requirement must be met, use an extremely
strict whitelist to limit which control sequences can be used. Verify
that the resulting syntactic structure is what you expect. Use your
normal encoding methods for the remainder of the input. | | |
Architecture and Design | | Use input validation as a defense-in-depth measure to reduce the
likelihood of output encoding errors (see CWE-20). | | |
Requirements | | Fully specify which encodings are required by components that will be
communicating with each other. | | |
Implementation | | When exchanging data between components, ensure that both components
are using the same character encoding. Ensure that the proper encoding
is applied at each interface. Explicitly set the encoding you are using
whenever the protocol allows you to do so. | | |
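The advice above about understanding the output context and setting the character encoding explicitly can be sketched with Python's standard library. The sample value and the `build_html_response` helper are hypothetical; the point is only that one value needs a different encoding for each consumer, and that the charset is stated rather than left for the receiver to guess.

```python
import html
import json
import shlex
from urllib.parse import quote

# One untrusted value, several destinations.
value = "Tom & <b>Jerry</b>'s \"file\".txt"

as_html = html.escape(value, quote=True)   # HTML text or attribute value
as_url = quote(value, safe="")             # single URL path or query component
as_shell = shlex.quote(value)              # one argument on a POSIX shell line
as_json = json.dumps(value)                # JSON string literal


def build_html_response(fragment: str) -> bytes:
    # Encode for the HTML context, then state the charset explicitly so the
    # browser does not have to pick an encoding on its own.
    body = ("<p>" + html.escape(fragment) + "</p>").encode("utf-8")
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + body


response = build_html_response(value)
print(as_html)
print(as_url)
print(as_shell)
print(as_json)
```

None of these encodings is interchangeable: applying the HTML form inside a shell command, for instance, would leave the shell metacharacters unneutralized.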
Relationships
This weakness is primary to all weaknesses related to injection (CWE-74)
since the inherent nature of injection involves the violation of structured
messages.

CWE-116 and CWE-20 have a close association because, depending on the
nature of the structured message, proper input validation can indirectly
prevent special characters from changing the meaning of a structured
message. For example, by validating that a numeric ID field should only
contain the 0-9 characters, the programmer effectively prevents injection
attacks.

However, input validation is not always sufficient, especially when less
stringent data types must be supported, such as free-form text. Consider a
SQL injection scenario in which a last name is inserted into a query. The
name "O'Reilly" would likely pass the validation step since it is a common
last name in the English language. However, it cannot be directly inserted
into the database because it contains the "'" apostrophe character, which
would need to be escaped or otherwise neutralized. In this case, stripping
the apostrophe might reduce the risk of SQL injection, but it would produce
incorrect behavior because the wrong name would be recorded.
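The "O'Reilly" scenario can be reproduced with an in-memory SQLite database. This is a minimal sketch: the `users` table and its column are made up for the example, and a parameterized query is shown as one way to neutralize the apostrophe without altering the stored name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (last_name TEXT)")  # hypothetical schema

last_name = "O'Reilly"  # valid input that still contains a SQL metacharacter

# String concatenation: the apostrophe ends the SQL string literal early,
# so the statement no longer has the structure the programmer intended.
try:
    conn.execute("INSERT INTO users (last_name) VALUES ('" + last_name + "')")
    concat_error = None
except sqlite3.Error as exc:
    concat_error = exc

# Parameterized query: the driver keeps the value separate from the SQL text,
# so the name is stored unchanged.
conn.execute("INSERT INTO users (last_name) VALUES (?)", (last_name,))
stored = [row[0] for row in conn.execute("SELECT last_name FROM users")]

print(type(concat_error).__name__)
print(stored)
```

The concatenated statement fails outright with this benign name; with a crafted value it could instead parse successfully as a different statement.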
Related CWE | Type | View | Chain |
---|
CWE-116 ChildOf CWE-896 | Category | CWE-888 | |
Demonstrative Examples
- Consider a chat application in which a front-end web application
communicates with a back-end server. The back-end is legacy code that does
not perform authentication or authorization, so the front-end must implement
it. The chat protocol supports two commands, SAY and BAN, although only
administrators can use the BAN command. Each argument must be separated by a
single space. The raw inputs are URL-encoded. The messaging protocol allows
multiple commands to be specified on the same line if they are separated by
a "|" character.
- Here a value read from an HTML form parameter is reflected back to
the client browser without having been encoded prior to output.
- This example takes user input, passes it through an encoding scheme
and then creates a directory specified by the user.
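The chat scenario in the first example above can be sketched as follows. The source describes only the protocol (SAY and BAN commands, single-space separators, URL-encoded inputs, "|" between commands); the parsing logic, the function names, and the attack string below are assumptions made for illustration.

```python
from urllib.parse import quote, unquote


def backend_parse(line):
    # Hypothetical legacy back-end: split the line into commands on "|",
    # split each command on single spaces, then URL-decode the arguments.
    commands = []
    for part in line.split("|"):
        name, *args = part.split(" ")
        commands.append((name, [unquote(a) for a in args]))
    return commands


def frontend_say_unsafe(message):
    # The front-end only ever intends to send SAY, but passes the text raw.
    return "SAY " + message


def frontend_say(message):
    # Percent-encode the message so "|" and " " cannot act as delimiters.
    return "SAY " + quote(message, safe="")


attack = "hello|BAN user12"  # sent by a non-administrator

print(backend_parse(frontend_say_unsafe(attack)))
# [('SAY', ['hello']), ('BAN', ['user12'])]
print(backend_parse(frontend_say(attack)))
# [('SAY', ['hello|BAN user12'])]
```

With the raw message the back-end executes a BAN that the front-end never authorized; with the encoded message the same text arrives as the single argument of SAY.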
Observed Examples
- CVE-2008-4636 : OS command injection in backup software using shell metacharacters in a filename; correct behavior would require that this filename could not be changed.
- CVE-2008-0769 : Web application does not set the charset when sending a page to a browser, allowing for XSS exploitation when a browser chooses an unexpected encoding.
- CVE-2008-0005 : Program does not set the charset when sending a page to a browser, allowing for XSS exploitation when a browser chooses an unexpected encoding.
- CVE-2008-5573 : SQL injection via password parameter; a strong password might contain "&"
- CVE-2008-3773 : Cross-site scripting in chat application via a message subject, which normally might contain "&" and other XSS-related characters.
- CVE-2008-0757 : Cross-site scripting in chat application via a message, which normally might be allowed to contain arbitrary content.
White Box Definitions None
Black Box Definitions None
Taxonomy Mappings
Taxonomy | Id | Name | Fit |
---|
WASC | 22 | Improper Output Handling | |
CERT Java Secure Coding | IDS00-J | Sanitize untrusted data passed across a trust
boundary | |
CERT Java Secure Coding | IDS12-J | Perform lossless conversion of String data between differing
character encodings | |
CERT Java Secure Coding | IDS05-J | Use a subset of ASCII for file and path
names | |
CERT C++ Secure Coding | MSC09-CPP | Character Encoding - Use Subset of ASCII for
Safety | |
CERT C++ Secure Coding | MSC10-CPP | Character Encoding - UTF8 Related Issues | |
References:
- OWASP. "OWASP Enterprise Security API (ESAPI) Project".
- Jeremiah Grossman. "Input validation or output filtering, which is better?". 2007-01-30.
- Joshbw. "Output Sanitization". 2008-09-18.
- Niyaz PK. "Sanitizing user data: How and where to do it". 2008-09-11.
- Jim Manico. "Input Validation - Not That Important". 2008-08-10.
- Michael Eddington. "Preventing XSS with Correct Output Encoding".
- M. Howard and D. LeBlanc. "Writing Secure Code", 2nd Edition. Microsoft. Chapter 11, "Canonical Representation Issues", page 363. 2002.