Re: Canonicalization

Subject: Re: Canonicalization
Date: Thu, 20 Apr 2006 22:22:18 -0400
Andrew,

Is that “simplest form” achievable? Input can be encoded in many different ways, which makes decoding all of them difficult and resource-intensive. It is usually cheaper and safer to do a semantic check and treat the input as erroneous if it does not conform to the expected input format.

For example, if you are expecting a number, anything other than a number is an error. If you expect alphanumeric input, verify that it is composed only of letters and digits...
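
A minimal sketch of that kind of check, written in C# to match the example quoted further down the thread. The names (InputChecks, IsNumber, IsAlphanumeric) are made up for illustration, and "number" is assumed here to mean a plain run of ASCII digits with no sign or decimal point:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the class and method names are illustrative only.
public static class InputChecks
{
    // \A and \z anchor to the very start and end of the string,
    // so a value with a trailing newline is rejected as well.
    private static readonly Regex Digits = new Regex(@"\A[0-9]+\z");
    private static readonly Regex AlphaNumeric = new Regex(@"\A[A-Za-z0-9]+\z");

    // Assumption: "number" means one or more ASCII digits.
    public static bool IsNumber(string input)
    {
        return input != null && Digits.IsMatch(input);
    }

    public static bool IsAlphanumeric(string input)
    {
        return input != null && AlphaNumeric.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsNumber("12345"));          // True
        Console.WriteLine(IsNumber("12%2045"));        // False
        Console.WriteLine(IsAlphanumeric("abc123"));   // True
        Console.WriteLine(IsAlphanumeric("abc 123"));  // False
    }
}
```

Note that the still-encoded value "12%2045" is rejected without any attempt to decode it, which is the point: anything outside the expected format is treated as an error.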

Regards,
Rossen Raykov

--
Slug - The Slow Web Processes Manager
http://slug.webhydra.com/


Andrew van der Stock wrote:

Susan,

I am the lead OWASP Guide author so I hope I can answer your query.

The gist of that paragraph is that there are many ways to encode text in web apps, and if you're going to make decisions about that text, accept it for persistent storage, or re-display it, it's vital that you make it "canonical", or as simple as possible, before you act on it.

For example, if you get:

select%20*%20from%20...

from the user and you write code to tokenize input based upon spaces, it will not see any spaces, because each one is still encoded as %20.

So you must decode that string properly (so it becomes "select * from ...") and then you can process it "safely".
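
To make the difference concrete, here is a rough sketch that tokenizes the same value before and after a single decode. The class and method names are hypothetical, and it assumes the value is URL percent-encoded; WebUtility.UrlDecode also turns "+" into a space, which may or may not be what you want:

```csharp
using System;
using System.Net;

// Hypothetical helper: names are illustrative only.
public static class DecodeThenTokenize
{
    // URL-decode one level, then split on spaces.
    public static string[] Tokens(string raw)
    {
        if (raw == null)
        {
            return new string[0];
        }
        string decoded = WebUtility.UrlDecode(raw);
        return decoded.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string raw = "select%20*%20from%20...";

        // The raw value contains no literal spaces, so it is one token.
        Console.WriteLine(raw.Split(' ').Length);  // 1

        // After decoding, the tokenizer sees "select", "*", "from", "...".
        Console.WriteLine(Tokens(raw).Length);     // 4
    }
}
```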

Be aware of double and n-deep encodings - they can occur, and obviously there are many encodings you've never seen or considered. That's why I strongly advocate positive validation.

i.e. (in C# and .NET, but applicable to most languages):

// At the top of the file:
// using System.Collections;
// using System.Text.RegularExpressions;

Hashtable clean = new Hashtable();

// Ensure that if the statement fails for any reason,
// the collection has a safe value for our field
clean.Add("field", "");

// Request.Form returns null for a missing field, and Regex.IsMatch throws on null
string field = Request.Form["field"] ?? "";

// is the data a single word no more than 20 characters long, using only a-z and 0 to 9?
// (\z rather than $, because $ also matches just before a trailing newline)
if ( Regex.IsMatch(field, @"^[a-z0-9]{1,20}\z", RegexOptions.IgnoreCase) )
{
    // it's safe to take the value of the string as there's most likely no nasties
    clean["field"] = field;

    // now ensure that the business rules make sense
    processFieldBusinessRules(clean["field"]);
}
else
{
    throw ...
}

// Now it's moderately safe to use or store the data in clean[]

...
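
For the double and n-deep encodings mentioned earlier, one possible approach (a sketch only, not something taken from the OWASP Guide) is to decode until the value stops changing and to give up after a fixed number of layers. The Canonicalizer class and its maxDepth parameter are hypothetical, and only URL percent-encoding is handled; other encodings would need their own decoders:

```csharp
using System;
using System.Net;

// Hypothetical helper: names and the maxDepth limit are illustrative only.
public static class Canonicalizer
{
    // URL-decode repeatedly until the value is stable.
    // maxDepth is the number of encoding layers tolerated; if the value
    // is still changing after that many passes, return null so the
    // caller can reject the input outright.
    public static string Canonicalize(string raw, int maxDepth)
    {
        if (raw == null)
        {
            return null;
        }
        string current = raw;
        for (int pass = 0; pass <= maxDepth; pass++)
        {
            string next = WebUtility.UrlDecode(current);
            if (next == current)
            {
                return current; // stable: nothing left to decode
            }
            current = next;
        }
        return null; // too many layers of encoding
    }

    public static void Main()
    {
        // Double-encoded: %2520 -> %20 -> space
        Console.WriteLine(Canonicalize("select%2520*%2520from", 3)); // select * from

        // Triple-encoded value with only one layer tolerated: rejected
        Console.WriteLine(Canonicalize("%252520", 1) == null);       // True
    }
}
```

Decoding to a fixed point will also alter a legitimate value that happens to contain a literal "%" sequence, which is one more reason to run the result through a positive validation check like the one above rather than trusting the decoder alone.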

thanks,
Andrew

On 11/04/2006, at 11:12 PM, susam_pal@yahoo.co.in wrote:
I found the following paragraph in owasp.org. Can someone please elaborate on this?


Parameters must be converted to the simplest form before they are validated, otherwise, malicious input can be masked and it can slip past filters. The process of simplifying these encodings is called “canonicalization.”

