howto

Writing CPAN modules

it's hard

Oh, you wanted to write portable CPAN modules?

First: Decide what you mean by "portable".

I used to think I could write perl code that will run everywhere perl does.

But look at the platform coverage of CPAN smoke tests: lots of Linux, some FreeBSDs, Darwins, a few MSWin32s, a little Solaris...

My current take: if the VMS community wants us to support it, they should set up a smoke test machine.

Also note that while the smoke tests are a great feature, it takes a long time to get all the results back: a long dev cycle.

Also: decide how you feel about dependencies.

If your target audience is using automated dependency tracking (CPAN.pm, CPANPLUS.pm, or perhaps things like apt-get using re-packaged CPAN code) then there are no worries.
If you want to support people who are manually installing each module, then you need to worry about adding a dozen obscure dependencies.
(This is the kind of thing that makes Module::Install look good...)
Probably: you should avoid using non-core modules that only provide small conveniences (List::MoreUtils, Test::Differences...).
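As a rough illustration of that last point: order-preserving de-duplication, the sort of small convenience you might pull in List::MoreUtils for, takes only a few lines of core Perl. The name uniq_in_order is made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: return the arguments with duplicates removed,
# keeping the first occurrence of each value in its original position.
sub uniq_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @platforms = uniq_in_order(qw(linux freebsd linux darwin freebsd));
print "@platforms\n";
```

Whether a few copied lines beat one more dependency is a judgment call; this only shows that the trade is sometimes cheap.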

Aren't there simple Best Practices you can follow for portability?

Read "perldoc perlport". And prepare to be confused.

It says something like "Perl is already portable, so don't worry so much!" Then it goes on for twelve thousand words about things to worry about.

And it's not necessarily up-to-date, either.

File paths

Standard advice: you should use File::Spec rather than manipulate file paths directly.

And now there's Path::Class which can do it even more neatly.
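For reference, a minimal sketch of the File::Spec style that the standard advice has in mind. The file name is just a placeholder; catfile, splitdir and tmpdir are regular File::Spec class methods.

```perl
use strict;
use warnings;
use File::Spec;

# Build a path from its components instead of hard-coding "/" or "\".
my $path = File::Spec->catfile('lib', 'My', 'Module.pm');   # placeholder name

# Take it apart again using the platform's own rules.
my @parts = File::Spec->splitdir($path);

# Ask for the temp directory rather than assuming "/tmp".
my $tmpdir = File::Spec->tmpdir;

print "path:   $path\n";
print "parts:  @parts\n";
print "tmpdir: $tmpdir\n";
```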

But: I almost never use File::Spec... and yet my code gets through the smoke tests.

Pop quiz. Does this look like portable code?

   my $cmd = "locate seat_of_pants 2>&1";
   my $results = qx{ $cmd };

As far as I'm concerned, it is. It can fail if there's no "locate" command, but that Bourne-shell style I/O redirect is okay. Certainly it works on Windows (see "shell redirect on windows, qx?"). I have had better luck with backticks than with messing around with IPC::* modules. Quoting perlport:

"In general, don't directly access the system in code meant to be portable. That means, no "system", "exec", "fork", "pipe", "``", "qx//", "open" with a "|", nor any of the other things that makes being a perl hacker worth being."

It appears that that's out-of-date... at least, given my criterion for portability.
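If you want to try the redirect without depending on "locate" being installed, here is a small sketch that runs the current perl ($^X) as the external command. It assumes the path to perl contains no spaces, since nothing here quotes it.

```perl
use strict;
use warnings;

# Run a child perl that writes only to STDERR, and fold STDERR into the
# captured output with the Bourne-shell style "2>&1" redirect.
my $output = qx{$^X -e "print STDERR 42" 2>&1};

print "captured: $output\n";
```

If the redirect is honoured, the text written to STDERR ends up in $output rather than on the terminal.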

You might think you can write cross-platform code something like this:

  use Carp;

  if ( $^O =~ /win/ ) {
    # do it the windows way
  } elsif ( $^O =~ /VMS/ ) {
    croak "I give up";
  } elsif ( $^O =~ /.../ ) {
    # ...
  } else {
    # unix-like code
  }

There are many problems here.

First of all, we blew it on the /win/ pattern: it matches "cygwin" and "darwin", and since it is case-sensitive it doesn't even match "MSWin32".
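One way around the pattern problem is to compare $^O against exact strings. A minimal sketch follows; the function name os_family and the family labels are invented for illustration, and the list of platforms is nowhere near complete.

```perl
use strict;
use warnings;

# Hypothetical helper: classify an OS name (default: the running one)
# by exact comparison, so "darwin" and "cygwin" can't be mistaken
# for Windows.
sub os_family {
    my $os = @_ ? shift : $^O;
    return 'windows' if $os eq 'MSWin32';
    return 'cygwin'  if $os eq 'cygwin';
    return 'vms'     if $os eq 'VMS';
    return 'unix-like';   # everything else is lumped together here
}

print "this platform ($^O) counts as: ", os_family(), "\n";
```

This fixes the matching, but not the bigger problem described next.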

But more importantly: no one out there is really maintaining capabilities databases for all these platforms.

I don't even think there's a definitive list for all values of $^O. (Though many are in perlport, under PLATFORMS).

It makes more sense to try a plug-in system that looks for subclasses named after the value of $^O.

That way, if you want it to work on your platform, it's up to you to find someone using that platform to get the plugin working.

One module that works this way is File::Spec. (Browse its platform-specific subclasses to get an idea of what a mess cross-platform programming can be.)
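A minimal sketch of the plug-in idea, with everything crammed into one file so it runs as-is. All the package and method names (My::Tool and friends) are hypothetical; a real distribution would keep each plug-in in its own file and load it on demand.

```perl
use strict;
use warnings;

# Generic fallback implementation.
package My::Tool::Generic;
sub new         { my $class = shift; return bless {}, $class }
sub null_device { return '/dev/null' }

# Platform plug-in: the last part of the package name is the $^O value.
package My::Tool::MSWin32;
our @ISA = ('My::Tool::Generic');
sub null_device { return 'nul' }

package My::Tool;

# Return an object of the plug-in class for the given OS name
# (default: the running platform), or of the generic class if nobody
# has contributed a plug-in for that platform.
sub for_platform {
    my ($class, $os) = @_;
    $os = $^O unless defined $os;
    my $plugin = "My::Tool::$os";
    # With one file per plug-in, you would first try to load it here,
    # for example with: eval "require $plugin; 1"
    return $plugin->can('null_device')
        ? $plugin->new
        : My::Tool::Generic->new;
}

package main;

my $tool = My::Tool->for_platform;
print "null device here: ", $tool->null_device, "\n";
```

The point of the design is that supporting a new platform means adding a class, not editing a growing if/elsif chain.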

And these days, $^O reports "MSWin32" on nearly every Windows platform.

It could be there's a slight difference between Windows 95 and Vista.
The core module Win32 has a GetOSName() you can use to distinguish sub-types of "MSWin32".
So your plug-ins might need plug-ins. (David Sharnoff has a solution for that: Plugins. "Plugins allows plugins [to] have plugins and for all the plugins to share a single configuration file.")

Let's say that you want to use an external program if it's available on the system. How would you find out?

A good idiom is to try it and see if it works, then fall back to something else if it's not available.

This is a simple, reasonably robust way of checking for an external program:

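      use File::Spec;

      sub can_run_program {
        my $program = shift;
        my $devnull = File::Spec->devnull;   # "/dev/null", "nul", ...
        # Try "which" first; if it finds nothing, try running the program itself.
        my $found = qx{ which $program     2>$devnull } ||
                    qx{ $program --version 2>$devnull };
        return $found;
      }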
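A different technique, if you would rather not run anything: walk the PATH yourself. This is only a sketch. The name find_in_path is made up, and the PATHEXT handling for Windows is an assumption about what counts as runnable there.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of the first executable
# file named $program found in the PATH, or nothing if there is none.
sub find_in_path {
    my $program = shift;

    # On Windows, also try the usual executable extensions (assumption:
    # PATHEXT, or a short default list if it is not set).
    my @exts = ('');
    push @exts, split /;/, ( $ENV{PATHEXT} || '.exe;.bat;.cmd' )
        if $^O eq 'MSWin32';

    for my $dir ( File::Spec->path ) {
        for my $ext (@exts) {
            my $candidate = File::Spec->catfile( $dir, $program . $ext );
            return $candidate if -f $candidate && -x _;
        }
    }
    return;
}

my $where = find_in_path('locate');
print defined $where ? "locate is at $where\n" : "no locate here\n";
```

It won't tell you whether the program actually works, which is what the try-it-and-see approach is good at.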

internationalization

Once upon a time (5.8 era), I wanted to get capitalize_title() working with at least the European characters....

The solution (?): use locale

The pragma "use locale;" seemed like a simple solution.
It took the hint from my locale setting: en_US.iso8859-1
uc() would then know how to turn a ü into a Ü.

Can you write tests that depend on locale settings?

You can't know in advance if locale is going to be "C" or "en_US.iso8859-1" or something else entirely.

You can set it with the POSIX module's "setlocale"...

The tests for PerlIO::locale do something like this:

   use POSIX qw(locale_h);
   SKIP: {
       setlocale(LC_CTYPE, "en_US.UTF-8") or skip("no such locale", 1);
       open( my $fh_out, ">", "foo") or die $!;
       print $fh_out "\xd0\xb0";
       close $fh_out;
       open(my $fh_in, "<:locale", "foo") or die $!;
       is(ord(<$fh_in>), 0x430);
       close $fh_in;
   }

But...

Can you assume you're on a POSIX system? (The "perlport" docs call that "a pretty big assumption"...)
Is it common to have an "en_US.UTF-8" available, or will the test just get skipped a lot?
Can you even assume that "en_US.UTF-8" is what the locale will be called? Locale names aren't actually standardized that well (!): see "perldoc perllocale" and look under "Finding Locales".
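One hedge against the naming problem is to try several spellings and use the first one the system accepts. A sketch follows, assuming the POSIX module is usable on the platform; first_available_locale and the candidate list are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: return the first candidate name that setlocale()
# accepts for LC_CTYPE, or nothing if none is available. The locale in
# effect beforehand is restored afterwards.
sub first_available_locale {
    my @candidates = @_;
    my $original   = setlocale(LC_CTYPE);    # query only, no change
    my $found;
    for my $name (@candidates) {
        if ( defined setlocale( LC_CTYPE, $name ) ) {
            $found = $name;
            last;
        }
    }
    setlocale( LC_CTYPE, $original ) if defined $original;
    return $found;
}

# Guesses at common spellings, not an authoritative list.
my $utf8_locale =
    first_available_locale(qw(en_US.UTF-8 en_US.utf8 C.UTF-8));
print defined $utf8_locale
    ? "can test with $utf8_locale\n"
    : "no UTF-8 locale found, would skip\n";
```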

But if you take a look at the smoke test results for PerlIO::locale, you'll see that its author has largely gotten away with these assumptions:

   PASS (166)   FAIL (31)   NA (13)   UNKNOWN (18)

What I actually did for the Text::Capitalize tests, though, was get scared by the above issues and write operational tests: if it looks like uc() knows what to do with a few international characters, I go ahead with those test cases; if not, I skip them.

Something like this:

  SKIP: {
      skip( "Can't test strings with international chars", 1 ) unless i18n();
      my $result = capitalize_title( $case );
      is ($result, $expected, "test: $case");
  }

   # Returns true if uc() and lc() handle a sample accented character.
   sub i18n {
     my $lower = 'ü';
     my $upper = 'Ü';
     if ( ($upper eq uc($lower) )   &&
          ($lower eq lc($upper) ) ) { # transformed as expected
        return 1;
     }
     return 0;
   }

Recently, I noticed that these tests were always getting skipped, even on my own box.

"use locale" ignores UTF-8 locales

It turns out I'm now living a UTF-8 life: locale, editor, and terminal are all UTF-8 rather than iso8859-1.
When your locale is en_US.UTF-8, the pragma "use locale" no longer has any effect on the behavior of uc() and friends (!?).
If you want things like uc() to work right, you need to do a utf8::upgrade() on each variable that you expect to use uc() on.
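A small sketch of the utf8::upgrade() step, using an escape for the ü so the example doesn't depend on how the source file itself is encoded. What uc() does to a string that has not been upgraded varies with the perl version and the pragmas in effect, so only the upgraded case is shown.

```perl
use strict;
use warnings;

my $lower = "\xfc";        # u-umlaut as a single character (U+00FC)
utf8::upgrade($lower);     # same characters, UTF-8 internal representation

# On an upgraded string, uc() follows Unicode case rules.
my $upper = uc $lower;

printf "uc() turned U+%04X into U+%04X\n", ord $lower, ord $upper;
```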

Conclusion: perl internationalization is still a mess.

But wait, there's more... If you want to write portable perl code, you need to know what sort of input and output encodings the user is expecting.

portable encodings

What if you want your scripts to be able to output international characters in the encoding that the user expects when you don't know that encoding in advance?

  use PerlIO::locale;
  binmode STDOUT, ":locale";

And what if you want this to work with a "Test::More" based script?

You need to do this:

  use PerlIO::locale;
  my $builder = Test::More->builder;
  binmode $builder->output,         ":locale";
  binmode $builder->failure_output, ":locale";
  binmode $builder->todo_output,    ":locale";

So this is fine... Provided you're not worried about those FAILs on the PerlIO::locale smoke tests. Does doing this improve portability or hurt it? It's your guess...
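If the extra dependency worries you, something similar can be approximated with modules that ship with perl. To be clear, this is a different technique from PerlIO::locale, not a description of how it works. The sketch asks the C library for the locale's codeset through I18N::Langinfo and only uses the answer if Encode recognizes it. The function name locale_encoding_layer is made up, and I18N::Langinfo is assumed to be missing on some platforms, hence the eval.

```perl
use strict;
use warnings;
use Encode ();
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: return an ":encoding(...)" layer matching the
# user's locale, or nothing if it can't be worked out.
# Side effect: switches LC_CTYPE to the locale named in the environment.
sub locale_encoding_layer {
    my $codeset = eval {
        require I18N::Langinfo;
        setlocale( LC_CTYPE, "" );
        I18N::Langinfo::langinfo( I18N::Langinfo::CODESET() );
    };
    return unless defined $codeset && length $codeset;
    return unless Encode::find_encoding($codeset);   # unknown to Encode
    return ":encoding($codeset)";
}

my $layer = locale_encoding_layer();
binmode STDOUT, $layer if defined $layer;
print defined $layer
    ? "STDOUT now uses $layer\n"
    : "leaving STDOUT alone\n";
```

Whether this is more portable than PerlIO::locale is, again, your guess.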

punt with private tests

Writing tests is a good thing, but there are very important things you probably can't test.

One of my modules is intended to work with PostgreSQL. Does the system have PostgreSQL installed? Do you have an account and password to access it (maybe via a .pgpass file)? Do you have permissions to do a CREATE DATABASE? How about a DROP DATABASE? In practice: be willing to punt.

You can write private tests that you won't ship with the packages: you just don't include them in the MANIFEST... (and don't use Module::Install).

Another good trick: ship a test that skips itself unless it's run with a special option.
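A sketch of the self-skipping trick, using an environment variable as the "special option". The variable name MYMODULE_LIVE_DB_TESTS and the helper live_tests_enabled are invented for illustration, and the test body is a placeholder.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical switch: the live tests only run when the person running
# them has explicitly asked for it.
sub live_tests_enabled {
    return $ENV{MYMODULE_LIVE_DB_TESTS} ? 1 : 0;
}

SKIP: {
    skip 'set MYMODULE_LIVE_DB_TESTS=1 to run the live database tests', 1
        unless live_tests_enabled();

    # The real checks (connect, CREATE DATABASE, ...) would go here.
    pass('live database tests were requested');
}

done_testing();
```

On a smoke tester the variable won't be set, so the test reports a skip rather than a failure.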

Some examples to look at:


Joseph Brenner, 22 Sep 2009