How can I let my users insert HTML code without risk? (Not only technical risk)
Hi guys. I developed a web application that lets my users manage some aspects of a website dynamically (yes, a kind of CMS) in a LAMP environment (Debian, Apache, PHP, MySQL). For example, they create a news item in their private area on my server, and it is then published on their website via a cURL request (or via Ajax). The news item is written with a WYSIWYG editor (FCKeditor at the moment, probably TinyMCE in the near future). So I can't disallow HTML tags, but how can I stay safe? Which tags MUST I remove (JavaScript?)? That covers being server-safe, but how can I be 'legally' safe? If a user uses my application to perform XSS, could I get into legal trouble?
If you are using PHP, an excellent solution is HTMLPurifier. It has many options for filtering out bad stuff and, as a side effect, guarantees well-formed HTML output. I use it to view spam, which can be a hostile environment.
The best general strategy here is to whitelist specific tags and attributes that you deem safe, and escape/remove everything else. For example, a sensible whitelist might be
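To make the whitelist idea concrete, here is a minimal sketch in Python, using only the standard library. The tag list, the allowed URL schemes, and the names `ALLOWED`, `is_safe_url` and `sanitize` are all assumptions made for illustration, not a recommendation from this thread; in production a maintained library such as HTMLPurifier is likely the safer choice.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration: tag -> attributes allowed on it.
ALLOWED = {
    "p": set(), "br": set(), "b": set(), "i": set(), "em": set(),
    "strong": set(), "ul": set(), "ol": set(), "li": set(),
    "a": {"href"},
}
VOID_TAGS = {"br"}                       # tags that never get a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # their text content is discarded too
SAFE_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(url):
    """Accept relative URLs and a few known schemes; reject javascript: etc."""
    # Browsers ignore whitespace and control characters inside the scheme.
    cleaned = "".join(ch for ch in url if ch > " ").lower()
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True  # no scheme, so a relative URL
    return head.split(":", 1)[0] in SAFE_SCHEMES


class WhitelistSanitizer(HTMLParser):
    """Re-emits only whitelisted tags/attributes; all text is re-escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # unknown tag: drop the tag, keep its inner text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <b>there</b></p><script>alert(1)</script>'))
```

The point of the design is that nothing from the input is copied through verbatim: every tag and attribute is rebuilt from the whitelist, so anything the sketch does not recognise simply disappears. It does not balance unclosed tags, which is one of the things a full library does for you.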
It doesn't really matter what you're looking to remove; someone will always find a way around it. For reference, take a look at this XSS Cheat Sheet. As an example, how are you ever going to remove this valid XSS attack:
Your best option is to allow only a subset of acceptable tags and remove everything else. This practice is known as whitelisting and is the best method for preventing XSS (besides disallowing HTML altogether). Also use the cheat sheet in your testing: fire as much as you can at your website and try to find ways to perform XSS.
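The "fire as much as you can at it" advice can be partly automated. The following is a rough sketch in Python, assuming a hypothetical harness: `PAYLOADS` holds a few classic payloads picked for illustration (a real run would use a much longer list, such as the cheat sheet), and `DangerDetector` is a simple heuristic of my own naming that re-parses a filter's output and flags obvious script vectors. Passing it proves nothing; failing it proves the filter is broken.

```python
from html import escape
from html.parser import HTMLParser

# A few classic payloads, chosen for illustration only.
PAYLOADS = [
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
    '<a href="javascript:alert(1)">click</a>',
]


class DangerDetector(HTMLParser):
    """Heuristic check: flags script tags, on* handlers and javascript: URLs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.findings.append("script tag")
        for name, value in attrs:
            if name.startswith("on"):
                self.findings.append(f"event handler {name}")
            # Remove whitespace first, since browsers tolerate it in schemes.
            if value and "".join(value.split()).lower().startswith("javascript:"):
                self.findings.append(f"javascript: URL in {name}")


def findings_for(sanitizer, payload):
    """Run one payload through a sanitizer and inspect what comes out."""
    detector = DangerDetector()
    detector.feed(sanitizer(payload))
    detector.close()
    return detector.findings


def identity(text):
    # Stand-in for "no filtering at all".
    return text


def escape_everything(text):
    # Stand-in for "disallow HTML": every character is escaped.
    return escape(text, quote=True)


for payload in PAYLOADS:
    print(payload, findings_for(identity, payload), findings_for(escape_everything, payload))
```

In practice you would pass your real filter in place of `escape_everything` and treat any non-empty findings list as a failed test.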
Rather than allow HTML, you should have some other markup that can be converted to HTML. Trying to strip out rogue HTML from user input is nearly impossible, for example
Removing from this will leave
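Why stripping is so fragile can be shown with a short Python sketch. The payload below is a hypothetical one picked for illustration, and `strip_once` and `strip_until_stable` are made-up helper names: removing a forbidden tag in a single pass can splice the surrounding fragments into exactly the tag you tried to remove.

```python
import re

# Naive blacklist: delete literal <script> and </script> tags.
SCRIPT_TAG = re.compile(r"</?script>", re.IGNORECASE)


def strip_once(text):
    """Single-pass removal, the kind of filter that is easy to get wrong."""
    return SCRIPT_TAG.sub("", text)


def strip_until_stable(text):
    """Repeat the removal until nothing changes any more."""
    previous = None
    while previous != text:
        previous, text = text, strip_once(text)
    return text


# Hypothetical payload: a script tag hidden inside a broken script tag.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"

print(strip_once(payload))          # <script>alert(1)</script>
print(strip_until_stable(payload))  # alert(1)

# Looping fixes this one trick, but the blacklist still misses other vectors.
print(strip_until_stable("<img src=x onerror=alert(1)>"))
```

Repeating the removal closes this particular hole, yet the last line shows the filter letting an event-handler attribute through untouched, which is the general problem with enumerating bad input instead of good input.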
For a C# example of the whitelist approach, which Stack Overflow uses, you can look at this page.
From StackOverflow.com




