Daniel’strae

It doesn't matter how often you go astray; what matters most is that you want to find your way again.

How can I allow my user to insert HTML code, without risks? (not only technical risks)

vote up 2 vote down
star
1

Hi guys.

I developed a web application that lets my users manage some aspects of a website dynamically (yes, a kind of CMS) in a LAMP environment (Debian, Apache, PHP, MySQL).

For example, they create a news item in their private area on my server, and it is then published on their website via a cURL request (or via Ajax).

The news item is written with a WYSIWYG editor (FCKeditor at the moment, probably TinyMCE in the near future).

So I can't disallow HTML tags, but how can I stay safe? Which tags MUST I remove (JavaScript?)? That covers being server-safe, but how can I be 'legally' safe? If a user uses my application to carry out XSS, could I get into legal trouble?

bdukes
7,249628
asked Mar 31 at 15:26
DaNieL
4189

7 Answers

vote up 6 vote down
check

If you are using php, an excellent solution is to use HTMLPurifier. It has many options to filter out bad stuff, and as a side effect, guarantees well formed html output. I use it to view spam which can be a hostile environment.

answered Mar 31 at 15:40
DGM
775111


I decided to go this way, plus some steps of my own. I must give my customers total freedom to use HTML tags (because of the WYSIWYG editor), restricting only certain things. I hope that keeping it updated against the latest security holes won't be too much trouble. – DaNieL Apr 1 at 7:40
 
 
I trust it much more than I trust my own efforts.... – DGM Apr 1 at 17:09

vote up 6 vote down
check

The general best strategy here is to whitelist specific tags and attributes that you deem safe, and escape/remove everything else. For example, a sensible whitelist might be <p>, <ul>, <ol>, <li>, <strong>, <em>, <pre>, <code>, <blockquote>, <cite>. Alternatively, consider human-friendly markup like Textile or Markdown that can be easily converted into safe HTML.
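A minimal sketch of that whitelist idea, written in Python with the standard-library html.parser module rather than PHP. The names ALLOWED_TAGS, WhitelistSanitizer and sanitize are made up for this illustration. It keeps only the listed tags, drops every attribute, and escapes all text, so anything not explicitly allowed cannot come out as markup. Treat it as a way to see the principle, not as a replacement for a maintained sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: the tag list suggested in the answer above; no attributes are allowed.
ALLOWED_TAGS = {"p", "ul", "ol", "li", "strong", "em",
                "pre", "code", "blockquote", "cite"}


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags; escape every piece of text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes (onclick, style, ...) are deliberately discarded.
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is always escaped, so stray "<" or ">" cannot form a tag.
        self.out.append(escape(data))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Because the output is rebuilt from scratch, the only "<" characters it can contain belong to tags in ALLOWED_TAGS. Allowing attributes would need a second whitelist per tag, as the comments below discuss.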

answered Mar 31 at 15:31
John Feminella
10.5k2547

 
 
Can't you still insert scripts in the allowed tags using a white-list? – jeroen Mar 31 at 15:38
 
 
That depends on how you're escaping them. If you're describing something like "<scr<script>ipt ...", I'd first note that "<scr" looks like the beginning of a tag. Since "scr" isn't whitelisted, we can escape it safely. Then we get to the "<script>" and it's also escaped/removed. – John Feminella Mar 31 at 15:45
 
 
I was thinking more about the attributes, but I guess that depends on whether your white-list has any tags that need them, in which case you would have to allow them. If you allow attributes, you'd have to get rid of the whole onclick="", etc. range, but I guess that's pretty obvious :) – jeroen Mar 31 at 15:54
 
 
Oh, absolutely. You have to whitelist attributes separately, though, just like you whitelist each tag. (That's the price you pay for being explicit.) – John Feminella Mar 31 at 16:18

vote up 4 vote down
check

It doesn't really matter what you're looking to remove, someone will always find a way to get around it. As a reference take a look at this XSS Cheat Sheet.

As an example, how are you ever going to remove this valid XSS attack:

<IMG SRC=&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A&#x61&#x6C&#x65&#x72&#x74&#x28&#x27&#x58&#x53&#x53&#x27&#x29>

Your best option is to allow only a subset of acceptable tags and remove anything else. This practice is known as whitelisting and is the best method for preventing XSS (besides disallowing HTML altogether).

Also use the cheat sheet in your testing; fire as much as you can at your website and try to find some ways to perform XSS.
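To see why the encoded IMG payload above defeats string matching, here is a small Python sketch that decodes the attribute value first and then accepts only known URL schemes. The names ALLOWED_SCHEMES and is_safe_url are hypothetical, and restricting URLs to http and https is an assumption made for the example.

```python
from html import unescape
from urllib.parse import urlsplit

# Assumption: only absolute http(s) URLs are acceptable in src/href.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_url(raw_attribute_value):
    """Decode character references, then whitelist the URL scheme."""
    url = unescape(raw_attribute_value).strip()
    return urlsplit(url).scheme.lower() in ALLOWED_SCHEMES


# The SRC value from the IMG example above, copied verbatim.
payload = ("&#x6A&#x61&#x76&#x61&#x73&#x63&#x72&#x69&#x70&#x74&#x3A"
           "&#x61&#x6C&#x65&#x72&#x74&#x28&#x27&#x58&#x53&#x53&#x27&#x29")

decoded = unescape(payload)        # what the browser effectively sees
payload_allowed = is_safe_url(payload)
```

Searching the raw payload for the word "javascript" finds nothing, while the decoded value starts with that scheme. Checking against a short list of allowed schemes rejects it without having to enumerate every encoding trick.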

answered Mar 31 at 15:32
LFSR Consulting
5,2851827

vote up 2 vote down
check

Rather than allow HTML, you should have some other markup that can be converted to HTML. Trying to strip out rogue HTML from user input is nearly impossible, for example

<scr<script>ipt etc="...">

Removing <script> from this will leave

<script etc="...">
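The effect is easy to reproduce. The sketch below uses a deliberately naive, single-pass removal (naive_strip is a made-up name) on the exact string from this answer:

```python
def naive_strip(text):
    # Blacklist approach: delete every literal "<script>" once, in one pass.
    return text.replace("<script>", "")


attack = '<scr<script>ipt etc="...">'
result = naive_strip(attack)
```

Here result is the reassembled script tag shown above: the removal itself glues the two halves together.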

answered Mar 31 at 15:31
ck
5,689219

 
 
Using a white list rather than a black list would solve this problem. – Gumbo Mar 31 at 15:37
 
 
see the img tag answer in stackoverflow.com/questions/701580/… – ck Mar 31 at 15:44
 
 
XSS is also possible through other markup languages, such as BBcode, so that doesn't really fix anything. A whitelist approach works pretty well. – troelskn Mar 31 at 16:17

vote up 2 vote down
check

For a C# example of white list approach, which stackoverflow uses, you can look at this page.

answered Mar 31 at 15:42
cagdas
1,0828

From StackOverflow.com

Filed under  //   Html   input-satinization   PHP  
Posted June 5, 2009
// 0 Comments

Best way to retrieve variable values from a text file - Python - Json

vote up 1 vote down
star

Referring to this question, I have a similar (but not identical) problem.

On my side, I'll have some text files structured like:

var_a: 'home'
var_b: 'car'
var_c: 15.5

And I need Python to read the file and then create a variable named var_a with the value 'home', and so on.

Example:

#python stuff over here
getVarFromFile(filename) #this is the function that I'm looking for
print var_b
#output: car, as string
print var_c
#output 15.5, as number.

Is this possible, and can the variable types even be preserved?

Note that I have full freedom over the text file's structure; I can use any format I like if the one I proposed isn't the best.

EDIT: ConfigParser could be a solution, but I don't like it much, because in my script I would then have to refer to the variables in the file with

config.get("set", "var_name")

But what I'd love is to refer to the variables directly, as if I had declared them in the Python script.

Is there a way to import the file as a Python dictionary?

Oh, one last thing: keep in mind that I don't know exactly how many variables the text file will contain.

Edit 2: I'm very interested in stephan's JSON solution, because that way the text file could also be read easily from other languages (PHP, or JavaScript via Ajax, for example), but something goes wrong when I try it:

#for the example, i dont load the file but create a var with the supposed file content
file_content = "'var_a': 4, 'var_b': 'a string'"
mydict = dict(file_content)
#Error: ValueError: dictionary update sequence element #0 has length 1; 2 is required
file_content_2 = "{'var_a': 4, 'var_b': 'a string'}"
mydict_2 = dict(json.dump(file_content_2, True))
#Error:
#Traceback (most recent call last):
#File "<pyshell#5>", line 1, in <module>
#mydict_2 = dict(json.dump(file_content_2, True))
#File "C:\Python26\lib\json\__init__.py", line 181, in dump
#fp.write(chunk)
#AttributeError: 'bool' object has no attribute 'write'

What kind of issues can I run into with the JSON format? And how can I read a JSON array from a text file and turn it into a Python dict?

P.S.: I don't like the solution using .py files; I'd prefer .txt, .inc, or anything else that isn't restricted to one language.

asked May 29 at 6:48
DaNieL
4189

 
 
You can't manage without any import modules... you could manage with just the sys module, but it wouldn't be as nice as the other solutions suggested :) – workmad3 May 29 at 7:12

OK, it's not a big problem – DaNieL May 29 at 7:16
 
 
regarding your Edit2: you want the_dict = json.loads('{"var_a": 4, "var_b": "a string"}'). Pls note that I have switched " and '. – stephan May 29 at 9:13

And that's exactly what i want. Thanks man! – DaNieL May 29 at 9:33 
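For reference, a short sketch of what went wrong in Edit 2 and of the working call, written for current Python 3. json.dump writes Python data out to a file object, while json.loads parses a string, and JSON itself requires double quotes around keys and strings. The helper name load_vars is made up for this example.

```python
import json

# Valid JSON: double quotes around keys and string values.
good = '{"var_a": 4, "var_b": "a string"}'
the_dict = json.loads(good)

# The single-quoted form from Edit 2 is a Python literal, not JSON.
bad = "{'var_a': 4, 'var_b': 'a string'}"
try:
    json.loads(bad)
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False


def load_vars(path):
    """Read a JSON object from a text file and return it as a dict."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

The types survive the round trip: 4 comes back as an int and "a string" as a str, and the same file can be read from PHP or JavaScript.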

6 Answers

vote up 2 vote down
check

Load your file with JSON or PyYAML into a dictionary the_dict (see doc for JSON or PyYAML for this step, both can store data type) and add the dictionary to your globals dictionary, e.g. using globals().update(the_dict).

If you want it in a local dictionary instead (e.g. inside a function), you can do it like this:

for x in the_dict.items():
    exec('%s=%r' % x)

as long as it is safe to use exec. If not, you can use the dictionary directly.
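If exec is off the table but plain dictionary lookups feel clumsy, one middle ground in current Python 3 is to wrap the dictionary in types.SimpleNamespace, which gives attribute-style access without executing anything. This is a sketch assuming every key is a valid Python identifier; the sample values are the ones from the question.

```python
from types import SimpleNamespace

# Stand-in for the dictionary loaded from the JSON or YAML file.
the_dict = {"var_a": "home", "var_b": "car", "var_c": 15.5}

# Each key becomes an attribute; values keep their original types.
cfg = SimpleNamespace(**the_dict)

label = cfg.var_b
number = cfg.var_c
```

The variables live on cfg rather than in the global namespace, so a key in the file cannot silently overwrite a name used elsewhere in the script.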

answered May 29 at 7:42
stephan
5007


Can you please show a basic example? – DaNieL May 29 at 7:58
1
 
A shorter idiom is globals().update(the_dict) or locals().update(the_dict). – Jouni K. Seppänen May 29 at 10:12
 
 
Thanks for pointing this out for globals(). Amended my answer. locals() is however read-only (see docs.python.org/library/functions.html#locals/…). – stephan May 29 at 10:29
 
 
It's not really read-only (I have used it myself) but apparently there are circumstances in which writing to it doesn't work. Thanks for pointing this out. – Jouni K. Seppänen May 31 at 9:59

vote up 5 vote down
check

But what I'd love is to refer to the variables directly, as if I had declared them in the Python script.

Assuming you're happy to change your syntax slightly, just use python and import the "config" module.

# myconfig.py:

var_a = 'home'
var_b = 'car'
var_c = 15.5

Then do

from myconfig import *

And you can reference them by name in your current context.

answered May 29 at 7:54
yangyang
963

 
 
+1, This is absolutely THE best way to do config files. No need to grow your own config syntax, config parsers, config loaders etc. when you can just re-use python's core parts! – Simon May 29 at 8:13
 
 
+1 yes, as long as you can trust your config file and don't need portability – stephan May 29 at 8:26
 
 
Convenience and flexibility ++ vs. security -- : in some situations it's great, in others... less so. – mavnn May 29 at 8:28

I have to discard this solution because I don't want to restrict the file to Python only; I think it's better to use a generic file format (.txt, .inc, .whatever) that can be accessed from other languages too (with all the safety measures that this will require). Anyway, I'll keep this approach in mind for the future. – DaNieL May 29 at 9:37

vote up 5 vote down
check

Use ConfigParser.

Your config:

[myvars]
var_a: 'home'
var_b: 'car'
var_c: 15.5

Your python code:

import ConfigParser

config = ConfigParser.ConfigParser()
config.read("config.ini")
var_a = config.get("myvars", "var_a")
var_b = config.get("myvars", "var_b")
var_c = config.get("myvars", "var_c")
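One detail worth knowing about this approach: the parser returns every value as a string, quotes included, so types are not preserved automatically. The sketch below uses the Python 3 spelling of the module (configparser) and reads the same sample from a string instead of config.ini so that it is self-contained.

```python
import configparser

# Same content as the config file above, inlined for the example.
SAMPLE = """\
[myvars]
var_a: 'home'
var_b: 'car'
var_c: 15.5
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# A whole section can be turned into a dictionary in one step.
myvars = dict(config["myvars"])

# Typed getters convert explicitly when a number is needed.
var_c = config.getfloat("myvars", "var_c")
```

Here myvars["var_a"] still contains the surrounding single quotes, so with this format the quotes are best left out of the file.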
answered May 29 at 6:57
Igor Krivokon
1,3939


Just edited: this way is fine, but if possible I'm looking for another one that would let me use the variables in a more convenient way. – DaNieL May 29 at 7:17
 
 
I've updated the answer to show you how to use these as variables. – Igor Krivokon May 29 at 8:26
 
 
Or you could take the Bunch class from code.activestate.com/recipes/52308/ and hack it to be recursive, so you could refer to config.myvars.var_a directly. – Jouni K. Seppänen May 29 at 10:16

vote up 2 vote down
check

How reliable is your format? If the separator is always exactly ': ', the following works. If not, a comparatively simple regex should do the job.

As long as you're working with fairly simple variable types, Python's eval function makes persisting variables to files surprisingly easy.

(The code below gives you a dictionary, btw, which you mentioned was one of your preferred solutions.)

def read_config(filename):
    f = open(filename)
    config_dict = {}
    for lines in f:
        items = lines.split(': ', 1)
        config_dict[items[0]] = eval(items[1])
    return config_dict
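A variant of the same function that avoids eval, sketched for current Python 3: ast.literal_eval accepts only literals (strings, numbers, lists, dicts and so on) and raises an error for anything else, so a line in the file cannot run code. The separator parameter and the skipping of blank lines are additions made for this example.

```python
import ast


def read_config(filename, separator=": "):
    """Parse 'name: literal' lines into a dict without executing code."""
    config = {}
    with open(filename, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            name, raw_value = line.split(separator, 1)
            # literal_eval keeps the type: 'home' -> str, 15.5 -> float
            config[name.strip()] = ast.literal_eval(raw_value.strip())
    return config
```

Values written with repr() round-trip as before, while something like a function call in the value position is rejected with a ValueError instead of being executed.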
answered May 29 at 8:27
mavnn
3129


I can write the text file exactly as I want; the format var: value is just an example. If other formats are better, feel free to suggest them. – DaNieL May 29 at 8:37
 
 
As long as it's consistent, and the value is of the form generated by repr(variable), it doesn't really matter for this method. Just pick a separator that will never be part of the variable name and use that in the split method. – mavnn May 29 at 10:47

vote up 1 vote down
check

What you appear to want is the following, but this is NOT RECOMMENDED:

>>> for line in open('dangerous.txt'):
...     exec('%s = %s' % tuple(line.split(':', 1)))
...
>>> var_a
'home'

This creates somewhat similar behavior to PHP's register_globals and hence has the same security issues. Additionally, the use of exec that I showed allows arbitrary code execution. Only use this if you are absolutely sure that the contents of the text file can be trusted under all circumstances.

You should really consider binding the variables not to the local scope, but to an object, and use a library that parses the file contents such that no code is executed. So: go with any of the other solutions provided here.

(Please note: I added this answer not as a solution, but as an explicit non-solution.)

answered May 29 at 7:55
community wiki



Yes, thanks for the recommendation, I banned the exec function from both my PHP and Python scripts some time ago ;) – DaNieL May 29 at 7:59

vote up 0 vote down
check

You can treat your text file as a python module and load it dynamically:

import

Filed under  //   PHP   Python  
Posted June 5, 2009
// 0 Comments