Brad Fitzpatrick (brad) wrote in fotobilder,

S2 and HTML cleaning

Now that the S2 compiler is in Perl, I've been working on making S2 development available to users, not just site admins.

The compiling part is easy now (I have the S2 compiler hooked up and spitting out HTML for the View Source mode already), but the important part now is to clean the HTML produced by untrusted styles.

I wrote a new HTML cleaner (quite unlike LiveJournal's) that works on a stream, cleaning HTML (stripping JavaScript) as it's received, and spitting it back out the other side.
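
To make the stream idea concrete, here is a minimal sketch in Python. It is not the actual cleaner, just an illustration under stated assumptions: the names StreamCleaner and clean_chunks are hypothetical, "stripping JavaScript" is taken to mean dropping script elements, on* attributes, and javascript: URLs, and style elements are dropped only because CSS cleaning is out of scope for the sketch. It is not a vetted sanitizer.

```python
import re
from html import escape
from html.parser import HTMLParser

BLOCKED = {"script", "style"}          # style dropped only to keep the sketch small
URL_ATTRS = {"href", "src", "action"}  # attributes checked for javascript: URLs
NAME = re.compile(r"[a-z][a-z0-9:-]*")  # conservative tag/attribute name check


class StreamCleaner(HTMLParser):
    """Hypothetical cleaner: parses chunks as they arrive, writes cleaned HTML."""

    def __init__(self, write):
        super().__init__(convert_charrefs=True)
        self.write = write      # callback that receives cleaned output
        self.blocked = False    # True while inside a dropped element

    def _safe_attrs(self, attrs):
        kept = []
        for name, value in attrs:
            if not NAME.fullmatch(name) or name.startswith("on"):
                continue        # odd names and event handlers such as onclick
            if value is not None and name in URL_ATTRS:
                # Ignore whitespace/control characters when checking the scheme.
                scheme = re.sub(r"[\x00-\x20]", "", value).lower()
                if scheme.startswith("javascript:"):
                    continue
            if value is None:
                kept.append(name)
            else:
                kept.append(f'{name}="{escape(value, quote=True)}"')
        return "".join(" " + a for a in kept)

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKED:
            self.blocked = True
        elif not self.blocked and NAME.fullmatch(tag):
            self.write(f"<{tag}{self._safe_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        if tag not in BLOCKED and not self.blocked and NAME.fullmatch(tag):
            self.write(f"<{tag}{self._safe_attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag in BLOCKED:
            self.blocked = False
        elif not self.blocked and NAME.fullmatch(tag):
            self.write(f"</{tag}>")

    def handle_data(self, data):
        if not self.blocked:
            self.write(escape(data, quote=False))  # re-escape decoded text


def clean_chunks(chunks):
    """Feed chunks one at a time and collect whatever the cleaner emits."""
    out = []
    cleaner = StreamCleaner(out.append)
    for chunk in chunks:
        cleaner.feed(chunk)
    cleaner.close()
    return "".join(out)
```

Output is written as soon as each tag or text run has been parsed, so a tag split across two chunks is simply held back until the rest arrives. Comments and doctype declarations are silently dropped in this sketch.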

Problem is, that's relatively slow, and sometimes JavaScript is desired, to achieve certain effects.

So, this will be the new behavior, as of tomorrow:

There will be two print statements. "print" (as now) and "uprint" (untrusted print). System styles may use either. print is fast and goes right to the client, unchecked and uncleaned. uprint gets piped through the HTML cleaner.

For system layers, "print" means "print", and the implicit print (a string literal) means "print" too.

For user (untrusted) layers, both "print" and "uprint" map to "uprint".

Using this scheme, the same source code can run in either context. (Well, the mapping is decided at compile time, not run time, but the same source code can be recompiled and run in either context.)
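
The mapping can be sketched as a small compile-time lookup. This is Python for illustration only, not the S2 compiler's code; resolve_print and compile_prints are hypothetical names, and treating the implicit print as "uprint" in user layers is an assumption (only the system-layer case is spelled out above).

```python
def resolve_print(kind: str, trusted: bool) -> str:
    """Return the operation a print statement compiles to for this layer."""
    if kind not in ("print", "uprint", "implicit"):
        raise ValueError(f"unknown print kind: {kind}")
    if not trusted:
        # User (untrusted) layers: everything goes through the cleaner.
        # Assumption: the implicit print follows the same rule.
        return "uprint"
    # System layers: "print" and the implicit print stay fast,
    # an explicit "uprint" is still cleaned.
    return "uprint" if kind == "uprint" else "print"


def compile_prints(statements, trusted):
    """Resolve every (kind, text) pair once, when the layer is compiled."""
    return [(resolve_print(kind, trusted), text) for kind, text in statements]
```

The same list of statements compiled with trusted=True keeps its fast prints, and compiled with trusted=False comes out as all "uprint", which is the recompile-for-either-context property described above.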

Why would a system style ever use uprint? Perhaps the style is printing something with a property variable interpolated, the property value coming from the user. Those cases should be rare, and usually could be avoided by just HTML-escaping the output and using the fast print instead.
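
As a rough illustration of the escape-then-fast-print route, here is a Python sketch. The function render_heading and the user_title property are made up for the example; the only real API used is Python's html.escape.

```python
from html import escape


def render_heading(user_title: str) -> str:
    """Build markup around a user-supplied property value.

    The trusted markup is written literally; only the user's value is
    escaped, so the result can go through the fast, uncleaned print.
    """
    return "<h1>" + escape(user_title, quote=True) + "</h1>"
```

Escaping turns any markup in the property value into inert text, so nothing in it needs the cleaner. uprint would only be needed if the value were meant to contain HTML of its own.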

The advantage to this system should be obvious... almost all users will run the stock code, so the majority of prints will be fast. Even in cases where users override specific functions from stock layouts, at least the layouts will print fast, and only the occasional prints in the untrusted override function will be slow.

I'm also writing a CSS cleaner which will be used when printing out a style's /res/stylesheet URL, and from the HTML cleaner, when parsing <style> elements and style='...' attributes.
