Category Archives: Perl

Perl, the duct tape of the Internet. Posts about Perl development, releases of my CPAN modules, etc.

Catalyst auto HTML/XSS scrubbing

At work, we needed to implement some HTML scrubbing and XSS protection across a Perl Catalyst-powered API, so we went looking for existing solutions. We found Catalyst::Plugin::HTML::Scrubber, which did some of what we needed, but did not scrub within encoded PUT/POST bodies, e.g. POSTed JSON.

I implemented some improvements to provide this, but sadly the original author could not be reached – it seems he hasn’t been active in the Perl community for quite some time. With a little help from the CPAN admins (thanks!) I obtained maintainership of it, and have since got a couple of releases out which add the features we needed:

  • scrubbing HTML/XSS attempts within both normal parameters (querystring / POSTed forms) and also recursively within PUTted/POSTed JSON etc
  • the ability to whitelist certain parameters by name or regex to exclude them from scrubbing – we have some admin-only areas where staff can enter “message of the day” content which is allowed to contain HTML
  • a “no encode HTML entities” option to undo HTML::Scrubber’s automatic HTML entity encoding of e.g. angle brackets – whilst content destined for the browser wants to be HTML-encoded, inbound parameters don’t; we just want HTML/XSS attempts stripped, so a parameter value like >= 5 should be left alone
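To illustrate the idea of recursive scrubbing with an ignore list, here is a minimal, dependency-free sketch. It is not the plugin's actual code: scrub_recursively and the tag-stripping coderef are hypothetical stand-ins (the real plugin delegates to HTML::Scrubber), and the parameter names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not the plugin's real code: walk a decoded request
# body and apply $scrub to every plain string, skipping any hash key that
# matches an entry in @$ignore (exact name or compiled regex).
sub scrub_recursively {
    my ($data, $scrub, $ignore) = @_;

    if (ref($data) eq 'HASH') {
        for my $key (keys %$data) {
            next if grep { ref($_) ? $key =~ $_ : $key eq $_ } @$ignore;
            $data->{$key} = scrub_recursively($data->{$key}, $scrub, $ignore);
        }
        return $data;
    }
    if (ref($data) eq 'ARRAY') {
        for my $item (@$data) {
            $item = scrub_recursively($item, $scrub, $ignore);
        }
        return $data;
    }

    # Leave undef and any other kind of reference untouched
    return $data if !defined($data) || ref($data);
    return $scrub->($data);
}

# Crude stand-in for HTML::Scrubber: removes anything that looks like a tag
my $strip_tags = sub { my $value = shift; $value =~ s/<[^>]*>//g; return $value };

my $body = {
    name   => 'Bob<script>alert(1)</script>',
    filter => '>= 5',
    motd   => '<b>Welcome</b>',
    tags   => [ '<i>perl</i>', 'catalyst' ],
};
scrub_recursively($body, $strip_tags, [ 'motd', qr/_html$/ ]);
```

Here the motd value is left alone because its name is in the ignore list, the tags are stripped from name and from each element of tags, and the >= 5 filter survives untouched because nothing re-encodes the angle bracket.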

The amended version can be found on CPAN – Catalyst::Plugin::HTML::Scrubber.


(Aside: yes, it has been, er, quite some time since I posted anything on this blog.)

DBI reading MySQL connection details from .my.cnf

Useful trick: I often have my MySQL account credentials stored in .my.cnf so the mysql command-line client can use them. I also often have Perl scripts which want to connect to the database, and want them to use that file, not have to put the params into the script or have the script read its own config file with the credentials duplicated there.

The answer:

use DBI;

# Credentials are read from ~/.my.cnf rather than hard-coded here
my $dsn = "DBI:mysql:database_name;mysql_read_default_file=$ENV{HOME}/.my.cnf";
my $dbh = DBI->connect($dsn, undef, undef, { RaiseError => 1 })
    or die "Failed to connect to DB!";

Easy!
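If you keep more than one set of credentials in the same file, DBD::mysql also has a mysql_read_default_group option which reads a named group in addition to [client]. A small sketch of building such a DSN follows; the helper name mysql_dsn_from_cnf and the "reporting" group are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a DBD::mysql DSN which takes its credentials
# from an option file, optionally reading an extra named group as well.
sub mysql_dsn_from_cnf {
    my (%args) = @_;
    my $file  = $args{file} // "$ENV{HOME}/.my.cnf";
    my @parts = ("DBI:mysql:$args{database}", "mysql_read_default_file=$file");
    push @parts, "mysql_read_default_group=$args{group}" if defined $args{group};
    return join ';', @parts;
}

my $dsn = mysql_dsn_from_cnf(
    database => 'database_name',
    file     => '/home/example/.my.cnf',   # assumed path, for illustration
    group    => 'reporting',               # assumed group name
);

# Then connect as before:
# my $dbh = DBI->connect($dsn, undef, undef, { RaiseError => 1 });
```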

Using SSL client certs with Perl’s LWP::UserAgent

I recently needed to authenticate to a remote API using an SSL client certificate, and had a bit of trouble getting LWP::UserAgent to work with it.

The examples I found which looked like they should work involved e.g.:

use LWP::UserAgent;

my $ua = LWP::UserAgent->new(
    ssl_opts => {
        SSL_use_cert => 1,
        SSL_cert_file   => "/path/to/clientcert.crt",
        SSL_key_file    => "/path/to/privatekey.key",
    },
);

However, that didn’t work; changing the paths to the cert/key to non-existent files didn’t cause any difference, so I suspected that those options were actually being ignored.

After a fair bit of digging, the option I found that actually worked was loading Net::SSL first, to make LWP use Crypt::SSLeay (which provides Net::SSL) rather than IO::Socket::SSL, and setting env vars pointing to the client cert and key to use:

use Net::SSL;
use LWP::UserAgent;

$ENV{HTTPS_CERT_FILE} = "/path/to/clientcert.crt";
$ENV{HTTPS_KEY_FILE}  = "/path/to/privatekey.key";
my $ua = LWP::UserAgent->new();

This, to me, is pretty icky – I’d much rather pass config to affect just that single LWP object. However, it gets it working.
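One way to make it slightly less icky might be to localise the env vars so they are only set while the request is being made. This is only a sketch: with_client_cert is a hypothetical helper, and it assumes Net::SSL consults the variables at the time the connection is made, which is worth verifying against your installed version.

```perl
use strict;
use warnings;

# Hypothetical helper: set the client cert env vars only while $code runs.
# `local` restores the previous values (or removes the keys again if they
# were not set before) when this sub returns, even if $code dies.
sub with_client_cert {
    my ($cert_file, $key_file, $code) = @_;
    local $ENV{HTTPS_CERT_FILE} = $cert_file;
    local $ENV{HTTPS_KEY_FILE}  = $key_file;
    return $code->();
}

my $seen_inside = with_client_cert(
    '/path/to/clientcert.crt',
    '/path/to/privatekey.key',
    sub {
        # In real use this would be something like:
        #   return $ua->get($url);
        return $ENV{HTTPS_CERT_FILE};
    },
);
```

It is still global state underneath, but at least other LWP objects used elsewhere in the same process don't pick up the cert by accident.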

Dancer talk at YAPC::NA 2012

Mark Allen will give a talk at YAPC::NA 2012 on the Dancer Perl web framework, which he describes as:

This talk presents the Dancer web framework beginning with “Hello World” and progressing through a couple of easy to digest introductory applications. All of the primary Dancer features are presented including URL routing, writing handlers, and output templating. A selection of useful and common Dancer plugins will also be covered. This talk is best suited for beginning and intermediate Perl programmers.

(via JT Smith, in turn via the YAPC::NA blog.)

I hope it’s recorded, as I’d like to see it, but won’t be able to afford to attend YAPC::NA.

Perl Advent Calendars for 2011

Well, December is upon us – time for advent calendars, and as usual, the Perl community doesn’t disappoint – here’s a list of the Perl-related advent calendars I’m aware of:

There are also several Japanese-language advent calendars:

If you know of any others, please feel free to let me know and I’ll add them to the list :)

LPW2011 : my thoughts overall

Yesterday, I attended the 2011 London Perl Workshop – my first ever Perl conference.

I had a good day, met a few members of the Perl community I knew from online interactions who I’d never met in person before, saw some good talks, and partook in some free food and beer (kindly paid for by the sponsors, including my employer, UK2).

Some brief mentions of talks I attended:

Matt S Trout (mst) – First, Tak wrote the world

I have IO::Pipeline. I have App::FatPacker. I have IPC::Command::Multiplex. And yet I still couldn’t whip up a five line example of bolting them all together that made a compelling argument for a perl-loving sysadmin to stop using fabric.

This problem, among others, will be solved by the conclusion of this talk.

Tak sounds like something which will be very useful to me – running code on multiple other hosts via SSH, but including Perl code – with all locally-installed modules available for use at the remote end!

As mst went through explaining how it all worked, my thoughts went from “hmm, useful”, to “hmm, useful but looks over-engineered, not sure it needs to be that complex” to “whoah, that’s genius”. Fatpacking and sending code to the remote side, which then adds a coderef to @INC which requests other modules from the local end, sent over and loaded remotely, is awesomely creative.

These kinds of tricks remind me of why I love Perl.
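For anyone who hasn't met coderefs in @INC before, here is a tiny self-contained sketch of the mechanism. It is not Tak's code: the %remote_source hash stands in for "ask the local end for the module", and Hypothetical::Greeter is an invented module which exists nowhere on disk.

```perl
use strict;
use warnings;

# Stand-in for the local end: module source keyed by the path that
# require() asks for. Tak would fetch this over the connection instead.
my %remote_source = (
    'Hypothetical/Greeter.pm' => <<'END_SOURCE',
package Hypothetical::Greeter;
sub greet { return "Hello, $_[1]" }
1;
END_SOURCE
);

# A coderef in @INC is called with itself and the wanted file name.
# Returning a filehandle supplies the source; returning nothing means
# "not mine", and perl moves on to the next @INC entry.
push @INC, sub {
    my ($self, $filename) = @_;
    my $source = $remote_source{$filename};
    return unless defined $source;
    open my $fh, '<', \$source
        or die "Cannot open in-memory source for $filename: $!";
    return $fh;
};

require Hypothetical::Greeter;
print Hypothetical::Greeter->greet('LPW'), "\n";
```

Because the hook is pushed onto the end of @INC, modules which really are installed are found first, and only the missing ones go through the hook.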

Mike Whitaker (Penfold) – Perl and Unicode, the 5.14 edition

A very good talk on handling Unicode safely in Perl, and the gotchas to avoid. Provided major impetus for me to upgrade to 5.14, too.

Zefram – why time is difficult

Dates, times, time intervals, clocks, calendars, and related phenomena are major contributors to hassle in programming, and the source of innumerable bugs.

Zefram’s talk, whilst barely Perl related, was very interesting, and very well delivered. I hadn’t realised quite how complex time was :)

Zefram’s amusing lightning talk on doing away with source code – simply storing bytecode, and editing it by deparsing it back to source, making changes, then “compiling” back to bytecode – was also entertaining.

Claes Jakobsson (claes) – Don’t debug now, debug later

Runops::Recorder is an alternate runloop for perl that writes down what your program does to disk for playback later.

It also comes with a viewer and some helper classes for you to write your own playback tools such as diffs etc.

This looks like a very useful debugging tool, recording the path of execution through your code and writing it to a file which can then later be “replayed” using a viewer – much like single-stepping or tracing through the debugger, but after the fact. The ability to leave it running and have it dump out a configurable amount of trace data when a die is encountered looks excellently useful for catching intermittent / rare problems – you should be able to leave it in place, wait until the problem occurs, then replay what happened leading up to the die to see what was going on.

Future versions should also be able to track changes to variables, etc, which will be very useful indeed.

There were a couple of workshops I’d like to have attended, but which I didn’t; partly because they conflicted with talks I wanted to see, and partly because I didn’t have a laptop with me to “work along” and didn’t think I could take much of value away from them.

Andrew Solomon – [[TRAINING SESSION]] Web development for beginners using Dancer

As a core developer for the Dancer perl web framework I’d love to have attended Andrew Solomon’s workshop, to see what was being taught, and offer any input desired. Unfortunately, I wasn’t there, but I’ll be looking with interest for any feedback from people who were, and what they learned and what they thought of Dancer if they hadn’t encountered it before. Making Perl accessible for new users is an important thing.

Gabor Szabo (szabgab) – [[TRAINING SESSION]] Testing in Perl

I’d also like to have taken part in Gabor’s workshops, but they were in two parts and conflicted with several other talks I wanted to see.

It was great to finally meet, in meatspace, a few members of the Perl community whom I had previously known only from online interactions. Unfortunately, there were a few others I’d meant to introduce myself to but never got the chance – including Tatsuhiko Miyagawa and Gabor Szabo.

Overall, it was a good day, and I imagine there’s a very good chance I’ll be back next year :)

LPW2011 : abigail’s “Business Aware Developer” talk

I caught abigail’s “Business Aware Developer” talk yesterday at the London Perl Workshop 2011.

Overall, I think it was a good talk which raised some good points, even if the “you don’t always have to write tests, write them only if they provide value” message was a little controversial with some of the audience, leading to a fair amount of debate; the talk ran late as a result, so some slides had to be skipped.

Personally, I agree to some degree – I think some people write tests simply to push up their test coverage figure, without really writing tests which are likely to catch bugs (exercising the code in both expected and unexpected ways, providing strange input and edge cases – does it blow up if given undef or a ref, say?).
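As a rough illustration of the kind of test I mean, here is a sketch using Test::More; tidy_name is a made-up function, invented purely so there is something to throw odd input at.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: trims surrounding whitespace from a
# name, returning an empty string for anything that isn't a plain string.
sub tidy_name {
    my ($name) = @_;
    return '' if !defined($name) || ref($name);
    $name =~ s/^\s+|\s+$//g;
    return $name;
}

# The expected case, which is where coverage-chasing tests tend to stop
is(tidy_name('  Bob '), 'Bob', 'surrounding whitespace is trimmed');

# The unexpected cases, which are the ones more likely to catch bugs
is(tidy_name(undef),     '', 'undef does not blow up or warn');
is(tidy_name([ 'Bob' ]), '', 'a reference is rejected rather than stringified');
is(tidy_name(''),        '', 'an empty string stays empty');

done_testing;
```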

However, I do think a fair amount of the talk is summed up by advice given to me by a boss at work, Ditlev, with regards to getting stuff out – sometimes you have to “launch crap but launch” – sometimes code that works well enough to be put into use and making money for you can be more valuable than taking longer to produce better quality code – which may be nicer and better to work with in the future, but isn’t ready to launch now. In other words, examining the trade-off between quick results now, and better quality code which becomes more valuable later – but “what if later never comes?”.

The impression I took away from the talk, which might be a misconception, is that Booking.com don’t do code reviews or refactoring, which would seriously put me off applying for a position there – I think code review in particular (even if just casual – it needn’t be a formal procedure) is very valuable to push yourself to be a better coder. If you know other members of the team are going to be glancing over your commits when they have time and pointing at bits you could have done better, that’s a good motivation to write good code, and also often helps you realise other ways you could have done things.

I’d be interested in seeing the other slides which abigail had to skip over, but I haven’t been able to find them anywhere online.

Graphing time-based data in Perl

I recently wanted to produce some graphs from a web app powered by the Dancer Perl web framework, and reevaluated the various Perl graphing modules out there.

Modules I considered were:

  • Chart::Strip
  • Chart::Graph
  • Google::Chart
  • Chart::Clicker
  • Chart::Gnuplot

    Unfortunately, I didn’t have time to do a full in-depth writeup trying every module like the excellent ones Neil Bowers has been doing, but I thought I’d write up a quick post on the choice I made, with example code, in case it helps other people looking to graph potentially irregularly-spaced time-based data samples in Perl easily.

    Chart::Clicker looked to be a nice choice (with a nice example of doing just what I want given as the topic answer to a question on StackOverflow), but had a huge chain of dependencies, finally failing when demanding Cairo and various X11 libraries (on my headless server).

    Chart::Strip seemed to do exactly what I wanted in a simple way, but I encountered a div-by-zero bug when dealing with a certain dataset with > 89 data points.

    I reported this to the author, Jeff Weisberg, in RT #72288, and he promptly released 1.08 with a fix (thanks Jeff!).

    Chart::Strip made it simple to do what I wanted:

    my @dataset;
    while (my $row = $sth->fetchrow_hashref) {
        push @dataset, { time => $row->{timestamp}, value => $row->{value} };
    }
    
    my $chart = Chart::Strip->new( title => "My chart" );
    $chart->add_data(\@dataset, { style => 'line' });
    
    # then get the chart as an image with $chart->png
    

    Nice and easy, just what I wanted – a way to say “here’s some timestamps and values (quite possibly irregularly spaced) – work out how to plot this sensibly for me”.
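My understanding of the Chart::Strip documentation is that each data point's time should be Unix epoch seconds. If your database hands back DATETIME strings instead, something like the following sketch could build the dataset; datetime_to_epoch is a hypothetical helper, the sample rows are invented, and it assumes the stored times are UTC.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: convert 'YYYY-MM-DD HH:MM:SS' (assumed to be UTC)
# into epoch seconds, dying on anything which doesn't match that format.
sub datetime_to_epoch {
    my ($datetime) = @_;
    my ($year, $mon, $day, $hour, $min, $sec) =
        $datetime =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or die "Unrecognised datetime '$datetime'";
    return timegm($sec, $min, $hour, $day, $mon - 1, $year);
}

# Invented sample rows, standing in for what fetchrow_hashref returns
my @rows = (
    { timestamp => '2011-01-01 00:00:00', value => 3 },
    { timestamp => '2011-01-01 00:07:30', value => 5 },
);

# Same shape as the dataset above: a list of { time => ..., value => ... }
my @dataset = map {
    +{ time => datetime_to_epoch($_->{timestamp}), value => $_->{value} }
} @rows;
```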

    The resulting graphs look good enough to me, e.g.:

    (Rendered intentionally a little smaller to fit the blog; naturally the graphs can be whatever size you want. Also, I had to use the transparent option to disable transparent backgrounds.)

Why IRC is a valuable tool to your development team

IRC (Internet Relay Chat) is a protocol for multi-server text chat between many participants in many channels, started back in 1988.

There are plenty of IRC networks out there for social chatter, including the likes of Freenode and irc.perl.org, which host many channels for Perl and open source projects in general, making it easy to get quick help from developers and users of your favourite project.

However, I find IRC to be a very valuable tool indeed to help development teams collaborate effectively; at work we make extensive use of it. It’s useful whether you’re a formal development team in a corporate environment, or an open source project whose developers / collaborators gather on IRC.

Why is it so useful? Well:

It enables quick discussion and collaboration without breaking your workflow

As a developer, you don’t want to lose your concentration – when you’re “in the zone”, you’re carrying information about the code you’re working on in your brain, and it doesn’t stay there for long if you’re distracted. Someone walking over and starting to talk to you, or a phone call, demands more or less 100% of your attention; you will be distracted, and you will “fall out of the zone”, causing your productivity to fall until you get back to where you were.

IRC, on the other hand, means you don’t have to respond quite so immediately, and I find it easy to flick between coding and IRC (both terminals within Terminator for me) without losing focus on where I’m up to and what I’m doing.

Most IRC clients support alerting you when your nick is mentioned in a channel or you receive a direct message, so you can ignore general chatter in the channel until you’re ready to read it, but know if someone is trying to get your attention.

Of course, it’s even more valuable when your development team work from multiple locations, whether that’s having employees working from home, or multiple offices.

Logs of discussions can be valuable for future reference

If you keep logs of your discussions, it’s easy to refer back to later – sometimes you’ll remember “ah, yes, we talked about this – what was the outcome?” – quick log search, and your answer is there. “Why did we decide that this was the best way to implement this?” – log search – “ah, that was why”.

Open, widely-supported protocol

IRC is an open, widely supported protocol; there are various clients available for pretty much every platform, so whatever system your devs work on, they’ll be able to find a client that suits them.

Easily extensible to integrate with other tools

It’s easy to write IRC “bots” which can help integrate with various other tools in various ways.

A good example is providing easy links to commits / bug reports or issues / pull requests etc.

If you’re using my Bot::BasicBot::Pluggable::Module::GitHub, for instance, you can mention an issue and have the bot automatically provide a summary and a URL for anyone who wants to see what the issue in question is – e.g.:

<user1> Anyone had a moment to look at Issue 42 and see what's going on?
<bot> Issue 42 (It doesn't work) https://github.com/....
<user2> Oh yeah - I fixed that in 5fcbb01
<bot> Commit 5fcbb01 (Retarded logic fail.) - https://github.com/....

It’s easy to cobble together a simple bot or bot module to do this kind of stuff for whatever your in-house situation requires, if there’s nothing suitable already out there on CPAN (which, a lot of the time, there already will be).
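The message-matching part of such a bot can be very small. Below is a sketch of just that part, kept free of any bot framework so it runs on its own; find_references is a hypothetical helper and not how the GitHub module is actually implemented. In a Bot::BasicBot::Pluggable module you would call something like it from a handler such as told(), using the message body, and return the reply text.

```perl
use strict;
use warnings;

# Hypothetical helper: pick out issue numbers ("Issue 42", "issue #42")
# and abbreviated commit SHAs (7 to 40 hex chars) from a line of chat.
# Note that a long run of plain digits would also look like a SHA here;
# a real bot would confirm each candidate against the repository.
sub find_references {
    my ($text) = @_;
    my @issues  = $text =~ /\bissue\s+#?(\d+)/gi;
    my @commits = $text =~ /\b([0-9a-f]{7,40})\b/g;
    return { issues => \@issues, commits => \@commits };
}

my $first  = find_references("Anyone had a moment to look at Issue 42 and see what's going on?");
my $second = find_references("Oh yeah - I fixed that in 5fcbb01");
```

The bot would then look up each issue number or SHA it found and reply with the summary and link.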

GitHub provide post-receive hooks which can be configured to announce pushes to your IRC channel(s). Bot::BasicBot::Pluggable::Module::GitHub::Announce can automatically announce new/updated issues, and, in future releases, also pull requests and commits/pushes.

System problems reported instantly

Use something like my Bot::BasicBot::Pluggable::Module::Nagios, and you can have system problems reported automatically to the appropriate IRC channels, for quick attention by whoever needs to deal with them. I use an applet in my GNOME system tray which alerts me to problems, but seeing them reported in detail on IRC is handy, and also strikes up conversation about it – a simple “I’m on it ^^” is enough to let others know you’re dealing with the issue and they don’t need to worry about it.

Announce tweets about your company/brand/project/interests

My Bot::BasicBot::Pluggable::Module::TwitterWatch module allows you to have the bot watch for and report new posts on Twitter about your company/project/brand/stuff of interest, and post them to your IRC channel – either for awareness, or to strike up discussion about them.

Wrap various other tools

Your IRC bot(s) can provide various other useful facilities – for instance, find the corelist command useful? Bot::BasicBot::Pluggable::Module::CoreList makes it easy for your bot to answer corelist lookups within the flow of a conversation.

<user1> Could use File::Spec - that's part of core, isn't it? 
<user2> bot: corelist File::Spec
<bot> File::Spec was first released with perl 5.00503 (released on 1999-03-28)
<user2> Yep :)
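The lookup behind that is simple enough to sketch with Module::CoreList directly. This is not the bot module's own code, and corelist_answer is a name I've made up; also note the version and date reported depend on the Module::CoreList data you have installed, so your answer may differ from the one in the log above.

```perl
use strict;
use warnings;
use Module::CoreList;

# Hypothetical helper: answer "when did this module first ship with perl?"
sub corelist_answer {
    my ($module) = @_;

    # first_release() returns undef if the module has never been in core
    my $version = Module::CoreList->first_release($module);
    return "$module was not in core (as far as this Module::CoreList knows)"
        unless defined $version;

    # %released maps perl versions to their release dates
    my $date = $Module::CoreList::released{$version};
    return "$module was first released with perl $version"
        . (defined $date ? " (released on $date)" : '');
}

print corelist_answer('File::Spec'), "\n";
```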

Bot::BasicBot::Pluggable::Module::Nagios released

I’ve just released Bot::BasicBot::Pluggable::Module::Nagios – a module for IRC bots powered by Bot::BasicBot::Pluggable which monitors one or more Nagios instances and reports problems to IRC channels.

I’ll be using this at work to have service problems reported to us on IRC for quick attention, but figured it’s something that’s likely to be of use to others elsewhere, too, so I’m releasing it.

There’s still some more features and improvements I want to make (a TODO list is included in the module POD), but it’s at a state where I consider it to be usable (it works for me).

Feedback/suggestions welcome.

It should be on a CPAN mirror near you soon, and the repo is on GitHub should you wish to submit pull requests or raise issues for bug reports/feature requests.