
Catalyst auto HTML/XSS scrubbing

At work, we needed to implement some HTML scrubbing and XSS protection across a Perl Catalyst-powered API, so we went looking for existing solutions. We found Catalyst::Plugin::HTML::Scrubber, which did some of what we needed but did not scrub within encoded PUT/POST bodies, e.g. POSTed JSON.

I implemented some improvements to provide this, but sadly the original author could not be reached – it seems he hasn’t been active in the Perl community for quite some time. With a little help from the CPAN admins (thanks!) I obtained maintainership of it, and have since got a couple of releases out which add the features we needed:

  • scrubbing HTML/XSS attempts within both normal parameters (querystring / POSTed forms) and also recursively within PUTted/POSTed JSON etc
  • the ability to whitelist certain parameters by name or regex to exclude them from scrubbing – we have some admin-only areas where staff can enter “message of the day” content which is allowed to contain HTML
  • a “no encode HTML entities” option to undo HTML::Scrubber’s automatic HTML entity encoding of e.g. angle brackets – whilst content destined for the browser should be HTML-encoded, inbound parameters should not be; we just want HTML/XSS attempts stripped, so a parameter value like >= 5 should be left alone
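
To make the three behaviours above concrete, here is a minimal, language-agnostic sketch in Python. It is not the plugin's code (that is Perl, built on HTML::Scrubber), and the names strip_html, is_ignored and scrub are hypothetical, made up for this illustration. It assumes a decoded JSON body, drops tags without re-encoding entities, and skips any key matched by name or regex.

```python
import re
from html.parser import HTMLParser


class _TagStripper(HTMLParser):
    """Collects text content, dropping tags and the bodies of script/style."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def strip_html(value):
    """Return the text of value with tags removed and no entity encoding."""
    parser = _TagStripper()
    parser.feed(value)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


def is_ignored(name, ignore):
    """True if name equals a string rule or matches a compiled regex rule."""
    for rule in ignore:
        if isinstance(rule, re.Pattern):
            if rule.search(name):
                return True
        elif rule == name:
            return True
    return False


def scrub(data, ignore=()):
    """Recursively scrub strings in dicts/lists, leaving ignored keys alone."""
    if isinstance(data, dict):
        return {
            key: value if is_ignored(key, ignore) else scrub(value, ignore)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [scrub(item, ignore) for item in data]
    if isinstance(data, str):
        return strip_html(data)
    return data  # numbers, booleans, None pass through unchanged
```

Calling scrub(body, ignore=[re.compile(r"_html$"), "motd"]) on a decoded JSON body would leave any key ending in _html, and the key motd, untouched while stripping markup everywhere else. A real deployment would want an allow-list based scrubber rather than this naive tag stripper.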

The amended version can be found on CPAN – Catalyst::Plugin::HTML::Scrubber.


(Aside: yes, it has been, er, quite some time since I posted anything on this blog.)

New song lyrics search site


I’ve been meaning to whack up a post about this – the other day I launched a new song lyrics search website, LyricsBadger.

It uses my Lyrics::Fetcher Perl module to fetch song lyrics from a variety of sites, and remembers previous requests so that it can present lists of the artists and songs that have already been looked up.

I built it as a testbed for Lyrics::Fetcher and to get some experience with Template Toolkit for Perl (which absolutely rocks!). The entire site is powered by one Perl script and a handful of templates, and uses a ScriptAlias directive to pass all requests to the one script so that it can provide nice clean URLs like /lyrics/Artist/Title.
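
For illustration, the clean-URL handling can be sketched roughly as below. This is Python rather than the site's actual Perl, and it assumes the script receives the request path (for example via the PATH_INFO environment variable); the function name parse_lyrics_path is hypothetical.

```python
from urllib.parse import unquote


def parse_lyrics_path(path):
    """Split '/lyrics/Artist/Title' into (artist, title); None if no match."""
    segments = [unquote(part) for part in path.strip("/").split("/")]
    # Expect exactly: "lyrics", a non-empty artist, a non-empty title.
    if len(segments) == 3 and segments[0] == "lyrics" and all(segments[1:]):
        return segments[1], segments[2]
    return None
```

Anything that does not fit the three-segment shape returns None, so the script can fall through to another page or a 404.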

Why not go and give LyricsBadger a try?