An Introduction to the Perl Debugger
http://obsidianrook.com/web/perldb_talk.html
Update for the perl 5.8 era
October 2000
Revised: June 4, 2007
This talk is now out of date: the perl debugger commands have
changed significantly, and there's a standard perl debugger
tutorial now: perldebtut.
|
The important points to take away from here:
|
(1) Using "perl -d" is probably easier than you think, and if you
haven't tried it yet, you should: you can probably pick it up in
a half-hour (but don't try to learn it when you're working
on a live bug).
|
(2) Some of the debugger documentation is intimidating because of
the sheer number of the commands (there's around fifty,
not-including the many sub-options), but you can get started
doing useful work just knowing these:
- s - single step
- n - steps carefully to avoid falling into a sub
- v - view some code around the current line
- c - continue processing until a break of some sort
- r - continue running until you do a return from the current sub
- x - displays value of variables (including arrays & hashes)
- b - sets a break point (uses line number or sub name)
- w - sets a watch expression
- T - displays a stack back trace
- L - list all the breaks
- B * - delete all breakpoints
- W * - delete all watch expressions
- R - restart
|
And note that you don't need to do "v" listings if you're using a
good debugger front-end (such as the emacs "gud" system, possibly
with perlnow).
|
My favorite new command:
- m - lists all of an object's available methods
|
On with the original text (from the perl 5.6 era):
preamble with stupid joke
By the way, if anyone has any questions, feel free to just
jump right in. Gatherings like this always have lots of
people of different skill levels, and it's important not to
feel intimidated. Don't worry that you're about to ask a
stupid question. Because when you really come down to it
there aren't any intelligent questions.
|
intro to the intro
If there's one goal to this presentation, it's to get across
the fact that using perldb is probably easier than you think, and if
you haven't done it yet, you should: you can probably pick
it up in a half-hour.
I'm going to complicate this a bit by also recommending the
use of emacs (which could take you a bit longer than a
half-hour to get started with. Not to mention a lifetime to
master), but I won't neglect the command-line. Using the
perl debugger with emacs is (usually) better, but not
strictly necessary. (There's also a third way to go, a
graphical Tk front end called "ptkdb", but I haven't ever
played with that myself.)
Anyway, this is a subject where we definitely need to get
to put the emphasis on practice over theory, so let's
skip the introductions and lead off with some "practice".
|
Example session #1: contrived demo of some perl debugger
basics
I once sent some email to Randal Schwartz, complaining
about the first chapter of his "Learning Perl" book. It
seemed to me that the examples were way too contrived, and
for me the question "why would I want to do that?" kept
getting in the way of learning how to do anything. So I
thought long and hard about what kind of code to use to
quickly introduce the functions of the debugger:
#!/usr/bin/perl -w
# ~/bin/contrived_perldb_demo Fri Oct 20 19:02:56 2000
# Expected behavior: the variable "positive_guy" will
# (1) increase all the time.
# (2) will always be positive.
# (But there are bugs, and we'll use breakpoints and
# watchexpressions to track them down.)
use strict;
use vars qw($positive_guy $count $real_negative);
$positive_guy = 1; $count = 0;
while($count < 200) {
$count++;
$positive_guy++;
dostuff();
$positive_guy = jackitup($positive_guy);
dostuff();
$real_negative = evil_routine(\$positive_guy);
dostuff();
}
print "Real negative should be down now: $real_negative\n";
sub dostuff {
# doesn't really do much of anything
print "The count is $count, and positive_guy is $positive_guy\n";
1;
}
sub jackitup {
my $n = shift;
my $factor = 2;
$n = $n/$factor; # increase number by $factor
return $n;
}
sub evil_routine {
# sends $positive_guy negative, eventually
my $pgr = shift;
if ($count>100) {
${$pgr} = -100;
}
return ${$pgr};
}
Useful huh? I'll try to get into something more like real
code later.
|
So, the expected behavior that we're shooting for here is that
the variable "positive_guy" will:
- increase all the time.
- will always be positive.
When you try to run it, you immediately see that we've got
problems though, because at first we see output like this:
The count is 1, and positive_guy is 2
The count is 1, and positive_guy is 1
The count is 1, and positive_guy is 1
The count is 2, and positive_guy is 2
The count is 2, and positive_guy is 1
The count is 2, and positive_guy is 1
The count is 3, and positive_guy is 2
The count is 3, and positive_guy is 1
The count is 3, and positive_guy is 1
The count is 4, and positive_guy is 2
The count is 4, and positive_guy is 1
And then after awhile, we see it suddenly change like this:
The count is 100, and positive_guy is 1
The count is 100, and positive_guy is 1
The count is 101, and positive_guy is 2
The count is 101, and positive_guy is 1
The count is 101, and positive_guy is -100
The count is 102, and positive_guy is -99
The count is 102, and positive_guy is -49.5
The count is 102, and positive_guy is -100
The count is 103, and positive_guy is -99
The count is 103, and positive_guy is -49.5
The count is 103, and positive_guy is -100
The count is 104, and positive_guy is -99
The count is 104, and positive_guy is -49.5
|
So, it's debugger time. I'm going to lead off with using
the debugger from the command line (then I'm going to get
into using it inside of emacs as soon as I can, because I feel
very insecure when I'm forced to step outside of my
emacs).
|
Running the debugger is easy enough, you just use the "-d" flag:
perl -d ~/bin/contrived_perldb_demo
Now, if you've looked at the existing documentation, you
might have been scared off by the listing of dozens of
commands (there's around 45, and that ignores the many
sub-options). I claim you can get started doing useful work
just knowing these twelve:
- s - single step
- n - steps carefully to avoid falling into a sub
- c - continue processing until a break of some sort
- r - continue running until you do a return from the current sub
- w - shows a "window" of code around the current line
- b - sets a break point (uses line number *or* sub name)
- x - displays value of variables (*including* arrays & hashes)
- W - sets a watch expression
- T - displays a stack back trace
- L - list all the breaks
- D - delete all the breaks
- R - restart
So let's play with these a little now, before we get serious
about debugging. First of all, when I run the debugger, I
see this:
perl -d ~/bin/contrived_perldb_demo
Loading DB routines from perl5db.pl version 1.0402
Emacs support available.
Enter h or `h h' for help.
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
You can see it's skipped passed the comments and such and
stopped on what looks like the first real line of code.
If you want to get some context you can do a w:
DB<1> w
11: use strict;
12: use vars qw($positive_guy $count $real_negative);
13
14==> $positive_guy = 1; $count = 0;
15
16: while($count < 200) {
17: $count++;
18: $positive_guy++;
19: dostuff();
20: $positive_guy = jackitup($positive_guy);
DB<1>
That arrow symbol ==> indicates the line that is
*about* to be executed. That's an important point to think
about for a second, because most people tend to assume that
the arrow is pointing at the line that was just executed
(and I suspect that might've been the right way to do it).
|
Let's move forward through the code, single stepping with
the s command at first:
DB<1> s
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
DB<1> s
main::(/home/doom/bin/contrived_perldb_demo:16):
16: while($count < 200) {
DB<1> s
main::(/home/doom/bin/contrived_perldb_demo:17):
17: $count++;
DB<1> s
main::(/home/doom/bin/contrived_perldb_demo:18):
18: $positive_guy++;
DB<1>
|
The first trick to learn here in debugger 101 is that you
can peek at the current state of the variables whenever you
want. Let's check a few things with the x command:
|
DB<1> s
main::(/home/doom/bin/contrived_perldb_demo:18):
18: $positive_guy++;
DB<1> x count
0 'count'
DB<2> x $count
0 1
DB<4> x $positive_guy
0 1
Oops, first I made a mistake, and just asked it to print
"count" rather than $count. But when I got the
$ right, it told me the value of $count. Hm,
$count was
initialized at zero, and incremented once, so okay, "1" makes
sense. Does $positive_guy make sense, though? It was
initilized to one, incremented once and it's still one.
Oh, no, it *hasn't* been incremented yet. The increment
line is displayed (and if you do a w you'll see the
arrow pointing at it), so that's what's going to happen next.
Let's check that.
|
DB<5> s
main::(/home/doom/bin/contrived_perldb_demo:19):
19: dostuff();
DB<5> x $positive_guy
0 2
Right, now it's been bumped up one. And the next line is
a function call: "dostuff". And here we've got a decision to
make. We've been stepping through with s, which
always follows every line of code in detail: it follows
subroutine calls and starts stepping through every line
of the sub. There's an alternative we can use, the n
command, which executes the subroutine, but doesn't show us
the details, it treats a function call as a single step.
So s is the low road, and n is the high road,
and which way is going to get us there faster? Well, we've
got no reason to suspect that "dostuff" is where the problem
is, so there's some logic to just hitting an n here.
But on the other hand, we know that dostuff isn't very long,
so for the first time through, let's keep using s:
|
DB<6> s
main::dostuff(/home/doom/bin/contrived_perldb_demo:29):
29: print "The count is $count, and positive_guy is $positive_guy\n";
DB<6> s
The count is 1, and positive_guy is 2
main::dostuff(/home/doom/bin/contrived_perldb_demo:30):
30: 1;
DB<6> s
main::(/home/doom/bin/contrived_perldb_demo:20):
20: $positive_guy = jackitup($positive_guy);
Okay, so if you read that carefully, you can see that it
jumped down into main::dostuff, and you'll note
that when the print command was executed, we saw the output
appear here mixed in with the debugger commands:
The count is 1, and positive_guy is 2.
|
We could keep going like this, stepping through the code
a line at a time, deciding whether to navigate down into
subs, peeking at variables here and there. This can be
pretty tedious (though this might be a very good
thing to do if you're completely confused and looking for
clues). But things will go faster if you can come up with
some theories about where the problem might be, then think
About t a way to use the debugger to test the theory.
|
Our first problem is that we expected that $positive_guy
would increase, instead it seems to hold steady. A glance
at the code indicates that the function "jackitup" was probably
intended to increase it. Could it be that there's a problem
in "jackitup"? Let's do a restart (R) and pretend we
had that theory from the start. How would we proceed?
|
Let's set a breakpoint on the suspect routine.
Then when we do a c it will just chug along until
it hits the breakpoint.
|
DB<5> R
Warning: some settings and command-line options may be lost!
Loading DB routines from perl5db.pl version 1.0402
Emacs support available.
Enter h or `h h' for help.
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
DB<5> L
DB<5> b jackitup
DB<6> c
The count is 1, and positive_guy is 2
main::jackitup(/home/doom/bin/contrived_perldb_demo:34):
34: my $n = shift;
DB<6> w
31 }
32
33 sub jackitup {
34==>b my $n = shift;
35: my $factor = 2;
36: $n = $n/$factor; # increase number by $factor
37: return $n;
38 }
39
40 sub evil_routine {
And now that we figure we're close to the problem, we should
start single-stepping and peeking at variables:
|
DB<6> s
main::jackitup(/home/doom/bin/contrived_perldb_demo:35):
35: my $factor = 2;
DB<6> x $n
0 2
DB<7> s
main::jackitup(/home/doom/bin/contrived_perldb_demo:36):
36: $n = $n/$factor; # increase number by $factor
DB<7> x $factor
0 2
DB<8> s
main::jackitup(/home/doom/bin/contrived_perldb_demo:37):
37: return $n;
DB<8> x $n
0 1
Hold it. We're about to return this value and it didn't get
"jacked up" it got smaller. We just executed a line with
a comment that said it was going to increase it...
Duh, we're *dividing* by $factor not multiplying.
|
So, we run off to our favorite emacs -- I mean, "editor"
and change
$n = $n/$factor; # increase number by $factor
to
$n = $n*$factor; # increase number by $factor
and come back to the debugger window to do a restart
(R) to read in the changed file. And personally, I always do
an L right after an R, because I don't have
good feel for what get's zapped when you do a restart:
|
DB<9> R
Warning: some settings and command-line options may be lost!
Loading DB routines from perl5db.pl version 1.0402
Emacs support available.
Enter h or `h h' for help.
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
DB<9> L
/home/doom/bin/contrived_perldb_demo:
34: my $n = shift;
break if (1)
So the breakpoint remains in the same place, though it
appears that it's tied to the line number rather than the
sub name. A little peculiar, but it works:
|
DB<9> w
31 }
32
33 sub jackitup {
34==>b my $n = shift;
35: my $factor = 2;
36: $n = $n*$factor; # increase number by $factor
37: return $n;
38 }
39
40 sub evil_routine {
DB<9> s
main::jackitup(/home/doom/bin/contrived_perldb_demo:35):
35: my $factor = 2;
DB<9> s
main::jackitup(/home/doom/bin/contrived_perldb_demo:36):
36: $n = $n*$factor; # increase number by $factor
DB<9> s
main::jackitup(/home/doom/bin/contrived_perldb_demo:37):
37: return $n;
DB<9> x $n
0 4
Okay, that looks more like it, the function "jackitup" will
now be increasing the value of $positive_guy.
But running the code shows that the other problem still
hasn't gone away, all of a sudden $positive_guy
goes negative.
|
So, how are we going to investigate that problem? One
method would be to try setting a watch expression.
With a watch expression, you can tell the perl debugger
to keep an eye on something like the value of a variable,
and stop and tell you whenever it changes. So let's clear
the breakpoint with a D, and play around with some
W commands. You might think that that since we're
worried about $positive_guy that's literally what
we should keep an eye on:
|
DB<10> R
Warning: some settings and command-line options may be lost!
Loading DB routines from perl5db.pl version 1.0402
Emacs support available.
Enter h or `h h' for help.
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
DB<9> D
Deleting all breakpoints...
DB<9> W $positive_guy
DB<10> c
Watchpoint 0: $positive_guy changed:
old value: undef
new value: '1'
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
DB<10> c
Watchpoint 0: $positive_guy changed:
old value: '1'
new value: '2'
main::(/home/doom/bin/contrived_perldb_demo:19):
19: dostuff();
DB<10> c
The count is 1, and positive_guy is 2
Watchpoint 0: $positive_guy changed:
old value: '2'
new value: '4'
main::(/home/doom/bin/contrived_perldb_demo:21):
21: dostuff();
DB<10> c
The count is 1, and positive_guy is 4
The count is 1, and positive_guy is 4
Watchpoint 0: $positive_guy changed:
old value: '4'
new value: '5'
main::(/home/doom/bin/contrived_perldb_demo:19):
19: dostuff();
You can see there's a problem with just spying on
$positive_guy. It just changes too much, and we're going to
have to keep hitting c until we forget why we're
doing it.
|
Much better is to get a little more clever about what
expression we watch. The important point is that we're
talking about perl expressions here, which can be
arbitrarily complicated. Let's use a logical expression
like ($positive_guy > 0).:
|
DB<10> L
DB<10> W ($positive_guy > 0)
Use of uninitialized value at (eval 8) line 1.
eval '($@, $!, $^E, $,, $/, $\\, $^W) = @saved;package main; ($positive_guy > 0);
;' called at /usr/lib/perl5/5.00503/perl5db.pl line 1239
DB::eval called at /usr/lib/perl5/5.00503/perl5db.pl line 987
DB::DB called at /home/doom/bin/contrived_perldb_demo line 14
DB<11> c
Watchpoint 0: ($positive_guy > 0) changed:
old value: ''
new value: '1'
main::(/home/doom/bin/contrived_perldb_demo:14):
14: $positive_guy = 1; $count = 0;
Ah, it stops immediately. Is this significant? No, not
really: this is just the place where it's initialized, it's
gone from undef to 1, so the logical watch expression
becomes true (by the way, note that the debugger prints the
value of the *expression* not the *variable*). So let's do
another c and keep going:
|
DB<11> c
The count is 1, and positive_guy is 2
The count is 1, and positive_guy is 4
The count is 1, and positive_guy is 4
The count is 2, and positive_guy is 5
The count is 2, and positive_guy is 10
The count is 2, and positive_guy is 10
The count is 3, and positive_guy is 11
The count is 3, and positive_guy is 22
The count is 3, and positive_guy is 22
The count is 4, and positive_guy is 23
The count is 4, and positive_guy is 46
The count is 4, and positive_guy is 46
The count is 5, and positive_guy is 47
The count is 5, and positive_guy is 94
The count is 5, and positive_guy is 94
The count is 6, and positive_guy is 95
The count is 6, and positive_guy is 190
The count is 6, and positive_guy is 190
The count is 7, and positive_guy is 191
and so on...
The count is 97, and positive_guy is 4.75368975085586e+29
The count is 97, and positive_guy is 4.75368975085586e+29
The count is 98, and positive_guy is 4.75368975085586e+29
The count is 98, and positive_guy is 9.50737950171172e+29
The count is 98, and positive_guy is 9.50737950171172e+29
The count is 99, and positive_guy is 9.50737950171172e+29
The count is 99, and positive_guy is 1.90147590034234e+30
The count is 99, and positive_guy is 1.90147590034234e+30
The count is 100, and positive_guy is 1.90147590034234e+30
The count is 100, and positive_guy is 3.80295180068469e+30
The count is 100, and positive_guy is 3.80295180068469e+30
The count is 101, and positive_guy is 3.80295180068469e+30
The count is 101, and positive_guy is 7.60590360136938e+30
Watchpoint 0: ($positive_guy > 0) changed:
old value: '1'
new value: ''
main::evil_routine(/home/doom/bin/contrived_perldb_demo:46):
46: return ${$pgr};
So, could this be the place where $positive_guy get's
screwed up (could it just possibly be the "evil_routine" that
messes things up?). Let's look into it. Doing a w,
we can see that the last line to be processed was
${$pgr} = -100:
|
DB<11> w
43: if ($count>100) {
44: ${$pgr} = -100;
45 }
46==> return ${$pgr};
47 }
48
49
That's certainly suggestive. Can we be sure that ${$pgr}
has anything to do with $positive_guy?
|
DB<11> x ${$pgr}
0 '-100'
Not a suprize.
DB<12> x $pgr
0 SCALAR(0x80e27d0)
-> '-100'
Unfortunately, looking at value of the reference itself
doesn't help. It just tells you it's pointing at something
which now has a value of '-100', which we knew already.
|
DB<13> x $positive_guy
0 '-100'
But that's again very suggestive. For some reason
$positive_guy has assumed the same value as
${$pgr}... Just for the hell of it, let's look
at where this routine was called from. We can get that
pretty easily with a stack back trace:
|
DB<14> T
$ = main::evil_routine(ref(SCALAR)) called from file `/home/doom/bin/contrived_perldb_demo' line 22
A quick trip to our "favorite editor" again reveals that
line 22 is just this:
|
$real_negative = evil_routine(\$positive_guy);
And that confirms it. For some reason, some brain-dead loser
decided to pass in a reference to $positive_guy,
rather than just the value, which gave the "evil_routine"
a chance to mangle it.
|
The fix for that one is left as an exercise, because this is
silly code is getting *really* boring.
|
So that's the how, but what's the for
Let me do some musing here about the difference between
debugging by using embedded print statements and using the
debugger. |
Everyone starts with embedded print statements because
they know how to do it. You have to know how, or else you
couldn't write code in that language at all. And everyone
feels reluctant to get into using a debugger, because
it means you've got to take time out to learn something new
(and despite the usual academic attitude toward learning
for it's own sake, in reality you really want to avoid
learning new stuff without a good reason).
|
So what's the reason? What's so bad about "printf"?
I think the trouble with embedded print statements is that
they're clumsy for extended use:
you tend to constantly go back and add more debug print
statements as your ideas about the problems change, then
your error logs get bloated, then you have to decide which
statements are obsolete and need to be commented out, after
awhile the old/commented debugging code turns into cruft
making it hard to follow the real code, etc..
|
The main advantage of the debugger is a kind of
flexibility. You come up with theories, zero in on where
you think the problem is (with breakpoints) or what you
think the problem is (with watch expressions), and then
start looking closely to confirm or refute your theory.
When you're through with an idea, two keystokes wipe the
slate clean, and you're ready to try another one.
|
One of the beautiful things about perl is that you can often
express an important idea in a screenful of code. You can
see an important chunk of it all at once. When you start
cluttering it up with DEBUG statements, you start to lose this.
Better to keep the code untouched by the debugging process,
if at all possible.
|
A debugger lets you navigate around in the code testing
different theories about what's wrong without adding any
cruft. Instead of reading through bloated error logs of
debug printouts, you keep the code in front of you, and peek
at values only as required.
|
I'd argue that the main barrier to using an interactive debugger is
only a slight shift in mind-set. It's a good idea to take
out a little time to get across this barrier.
|
Running the debugger inside of emacs
To run the debugger under emacs, you don't just add a
-d to the command line. Instead, you do an Alt x
perldb and hit return. Emacs then prompts you for how
you want to run it. It'll make a guess, and you can often
just hit return (it tends to remember the last way it was
run, and reuse that command). Here's what the prompt looks
like:
Run perldb (like this): perl ~/bin/contrived_perldb_demo
I recommend just using a command line like "perl
<script_name>". Doing anything fancier tends to lead to
trouble. Further, if this is a cgi script you're working
on, it probably has the '-T' flag set: I recommend removing
it for now (and remember to put it back when you're done).
More about both of these points later under "gotchas".
|
Once you've got it running, you can immediately see the
main advantage of working inside of emacs: two tiled windows
one above show the debugger commands in one, and the code
stays displayed in the other. These two windows stay in
synch with each other... as you single-step using debugger
commands, you can see the familiar "=>" automatically
stepping down through the code window.
|
Note: this "=>" is not actually added to the code,
it's superimposed over it. You can skip into the code
window, make modifications and save without any fear that
the "=>" is going to be added to the file.
|
Note: for this screen shot, I turned on the emacs syntax
coloring feature. I don't always use it myself (sometimes
staring at the colors drives me crazy), but if you'd like to
play with it it's easy enough to activate it:
Alt x font-lock-mode <RET>.
And you can of course get this turned on permanently via
some .emacs magic: see Appendix 1 below. (This all just
applies to GNU emacs by the way, if you're using xemacs,
syntax coloring is on by default).
|
In general, the emacs two-window debugger configuration
is so convienient that I suspect that you should use emacs
as a perldb front-end, even if you don't use it as your
preferred editor. Presuming you're not an emacs person
already, let's think about what the bare minimum of emacs
commands are that you would need to know in order to use
perldb effectively:
- Starting emacs.
- Typing emacs in a terminal
window usually works.
- There's a sample .emacs file in Appendix 1
below that you might like to try putting in your home
directory ("C:\" under NT, as I remember it).
- For VI addicts
- If you're already familiar with the vi command set, you
might like to ease the transition to emacs by doing a
Alt x viper. Viper mode changes a lot of the key
bindings to make emacs a little more vi-like. (Haven't
played with this a lot, myself, so I can't vouch for it personally.)
- Opening a file
- Cntrl x d this runs "dired" the directory
editor. Use this to get a listing of your working
directory.
- cursor (or mouse?) down to the file you're interested
in. Doing a <RET> will open it. (Note, doing a
<RET> on a directory takes you down into that directory.)
- Running the debugger
- Alt x perldb
- perl code_to_be_debugged.pl
- Note: debugger commands (discussed above) work in this
window. A full list can be obtained with the debugger
command h.
- Scrolling a window.
- "Page Down" and Cntrl v are equivalent.
- "Page Up" and Alt v are equivalent.
- Alt Cntrl v is useful for doing a "page down" in the
other tiled window.
- Alt Cntrl Shift v (now, don't laugh) is useful
for doing a "page up" in the other tiled window.
- Switching to another emacs sub-window.
- Cntrl x o rotates among the open tiled emacs windows.
- Clicking with a mouse pointer seems to work as well.
- Text searches
- Cntrl S takes you into interactive/iterative
search feature. Type what you're looking for and it tries
to jump you there... A <RET> (among other things) will
terminate the search.
- Getting out of mistakes
- Cntrl G is the emacs break character. Hammering
on the Cntrl G key will typically get you out of
any crazy mode.
- Cntrl x k will kill the current buffer.
Sometimes I find it helpful to kill the debugger window
completely so that I know I'm starting from scratch.
- Insert mode
- In emacs, you're always (almost) in insert mode. Just start
typing.
- If you're stuck in the many odd modes of this
"non-modal" editor, try a bunch of control-Gs. You could
also try hitting the INS key.
- Saving a file
- You can save the active titled window with a Cntrl x
Cntrl s
- You can save all open buffers with a Cntrl x s
(note the subtle difference).
- Finding out more.
- Alt x help <RET> t gets you into an emacs
tutorial
- Alt x help <RET> i m emacs <RET> should get you
to the emacs info nodes. If you don't know much about
"info" just keep hitting the space bar. Alternately, you
can read about info: Alt x help <RET> i m info <RET>.
- Hint: if you don't know how to navigate from anywhere
but the top, you can just kill the buffer (Cntrl x k) and
start over to get back to the top.
Have fun with emacs. Remember, every typo is a learning experience.
|
Example session #2: some real code -- the commonplace book
Let's talk about some actual code here. There's a small
web application that I wrote called "the Commonplace Book",
|
A commonplace book was something you were likely to find in
the parlors of more civilized 19th century homes. It's
essentially an open notebook that members of the household
might use, for example to save and share favorite quotations
from books they were reading. My on-line commonplace book
is a quotation database that allows users to login and enter
favorite quotations of their own. But the feature that I
want to talk about is just the welcome screen, which
randomly displays a quote selected from the database.
|
Now, this is (as you might expect) a perl CGI script (though
it's been cleaned up enough to run under mod_perl using
Apache::Registry).
In essence, the database schema used to implement this
is something like this. There are two main tables "books" and
"quotes", and they're linked on the title field (and there's also
a "commoners" table, where user information is added):
books quotes commoners
|------------| |-------------| |------------|
| title | <-----> | title | /--------> | username |
| authors | | quotation | | ^ | password |
| user_added | ----\ | page_number | | | | email |
| when_added | | | chapter | | | | user_hash |
| [...] | | | user_added | --/ | | [...] |
| | | | when_added | | | |
| | | | [...] | | | |
| | | | | | | |
|------------| | |-------------| | |------------|
| |
| |
\--------------------------/
(You may notice that this schema is very primitive:
linking tables on varchar field doesns make sense in
a number of ways, but that's something to be fixed on
another day.)
|
As it happens, the code I originally wrote to implement this
was somewhat quick-and-dirty, and once it was working I went
back in to clean it up. Then I needed to clean up the
clean up.
|
Here's the original chunk of code, in all it's
quick-and-dirty lack of glory:
sub but_welcome {
# First: select a quotation randomly.
my $sth = $dbh->prepare("select max(index) from quotations");
$sth->execute();
my @row = $sth->fetchrow_array;
# The maximum index value for the quotations table:
my $maxindy = $row[0];
my $pick;
# Select record randomly, make sure it's valid, if not, pick again:
$pick = int( rand($maxindy) + 1 ); # random pick of number
# from 1 to maxindy:
$sth = $dbh->prepare("select $q_fields from $qtab where index=$pick");
$sth->execute();
my $hr;
until ($hr = $sth->fetchrow_hashref) { # Read of quotation
# with index = $pick failed.
$pick = int( rand($maxindy) + 1 ); # random pick of a number
# from 1 to maxindy:
$sth = $dbh->prepare("select $q_fields from $qtab where index=$pick");
$sth->execute();
};
%rec = (%rec, %{$hr});
# And now, do the books look-up
# (isn't this braindead? why not do it all in one SQL statement?)
my $statement = "select $b_fields from $btab where title='$rec{title}'";
my $sth_b = $dbh->prepare($statement);
$sth_b->execute();
# Add the newly read parameters to the existing values in %rec (if any):
%rec = (%rec, %{$sth_b->fetchrow_hashref});
$author = &nameswap($rec{'authors'});
$translator = &nameswap($rec{'translator'});
&form_welcome;
}
There are two problems with this that I wanted to clean up:
(1) Some unnecessarily duplicated lines of code, because
when originally written I just
couldn't remember how to do a loop in perl that does the
comparison at the end of the loop.
(2) More seriously, there are two "SQL" selects done here,
that could easily be merged into one, providing several
benefits: you get greater efficiency (better to let the
database do the work); it fixes a minor bug that used to pop
up if the title for the book was missing from the books
table for some reason; but most importantly of all, it
eliminates an embarrassing lapse that might give some people
the strange idea that I don't know what I'm doing. Doing
multiple selects like this is pretty classic mistake for the
beginning perl DBI/DBD programmer.
|
Here's my first stab at doing the revised code. Note I'm
using the do {...} until (...) loop to make the
code a little neater. I think that the main reason I didn't
just do it that way in the first place is that I consulted
the infamous Camel first, and this construct is only barely
mentioned there. I eventually found it in the on-line
documentation, in "perlsyn".
sub but_welcome {
# First: select a quotation randomly.
my $sth = $dbh->prepare("select max(index) from quotations");
$sth->execute();
my @row = $sth->fetchrow_array;
# The maximum index value for the quotations table:
my $maxindy = $row[0];
my $pick;
# Select record randomly, make sure it's valid, if not, pick again:
my ($hr, $sql);
do {
$pick = int( rand($maxindy) + 1 ); # random pick of a number
# from 1 to maxindy:
# Doing query something like:
# select * from quotations, books where
# index=262 and books.title = quotations.title;
$sql = "select $q_fields, $b_fields from $qtab, $btab "
. "where $qtab.index=$pick";
$sth = $dbh->prepare($sql);
$sth->execute();
} until ($hr = $sth->fetchrow_hashref) ;
# Add the newly read parameters to the existing values in %rec (if any):
%rec = (%rec, %{$hr});
$author = &nameswap($rec{'authors'});
$translator = &nameswap($rec{'translator'});
&form_welcome;
}
|
This is the way it looked (can you spot the problem?):
Notice anything funny about the style of this Balzac
biography? In fact this is a quote from a Mickey Spillane
novel, and the attribution is completely wrong. And when
you hit the reload button, you get a different quote,
but the attribution stays the same: it always claims that
it's written by Andre Maurois, no matter what it is.
|
Now I'm going to try talk about my own mental process in
doing this debugging, with emphasis on my own stupid
misconceptions... because debugging is all about being in a
state of confusion and trying to find a quick, orderly way
back to a state of understanding.
|
Anyway, as I remember it, my first thought from looking at
this behavior is that I had some sort of mod_perl
peculiarity going on. One of the ways that mod_perl differs
from CGI programming, is that you can't assume that you've
got a blank slate every time you run the code. It's one of
the well-known gotchas, that you need to carefully
initialize your variables. Otherwise, rather than start off
with everything undefined, a mod_perl script might lead off
with a leftover value from the last run. The broken
behavior of the website suggested to me that something like
this might be going on: the value of the book information is
"stuck". Could it be coming from the previous run?
|
Current directory is /home/httpd/cgi-bin/
Loading DB routines from perl5db.pl version 1.0402
Emacs support enabled.
Enter h or `h h' for help.
Running program /home/httpd/cgi-bin/common_perldb_demo.pl 0.95pre: Sun Oct 22 00:52:28 2000
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> s
DB<1> x $q_fields
0 'quotations.title, quotations.quote, quotations.page, quotations.chapter, quotations.topic, quotations.when_added, quotations.user_added, quotations.context, quotations.tag_comment, quotations.pointers'
DB<2> s
DB<2> s
DB<2> s
DB<2> x $b_fields
0 'books.title, books.authors, books.year_published, books.edition, books.publisher, books.translator, books.original_language, books.physical_desciption, books.cover_artist, books.when_added, books.user_added, books.isbn, books.pointers'
First I single_stepped through, and then checked the values
for $q_fields and $b_fields, since I'd just modified them,
adding those prefixes to specify the table. Now they're all
quotation.* and books.* as desired (in SQL
you usually don't use these prefixes unless you need them,
but they never hurt, so I decided to add them to all my
fields so I wouldn't have to worry about it).
|
Now, I'm going to set the breakpoint for the
but_welcome routine, and skip there with a
c for continue. But then I decided to just hit
another c and let the program finish. The output
shows the naked html, but it doesn't have to be read very
carefully to see that the problem is still there:
|
DB<3> b but_welcome
DB<4> c
Running sub freezefields.
DB<4> c
Running sub nameswap.
Running sub nameswap.
Running sub form_welcome.
Running sub debug_userprobe.
Running sub readcookie.
username
Content-Type: text/html; charset=ISO-8859-1
<!DOCTYPE HTML
PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML LANG="en-US"><HEAD><TITLE>Commonplace (0.95pre)</TITLE>
</HEAD><BODY BGCOLOR="#CCCC99" TEXT="#000000" LINK="#330066"><H1>Commonplace (0.95pre)</H1><P><I>A collection of quotations</I></P><FORM METHOD="POST" ENCTYPE="application/x-www-form-urlencoded">
Running sub print_quote.
<BLOCKQUOTE><P><B></B></P>No matter what foolish declarations people make about
money, they one and all, if they live in Paris, must
grovel before accounts, do homage to figures, and kiss
the forked hoof of the golden calf. <DL> <DT><DD>Andre Maurois, </DD><I>Prometheus: The Life of Balzac</I>, 1965<BR><BR>Translated by Norman Denny</DL><BR><I>Quote entered by doom</I><P></BLOCKQUOTE><HR><INPUT TYPE="submit" NAME="but_welcome" VALUE="Show random quote"> <INPUT TYPE="submit" NAME="but_welcome_login" VALUE="Enter a quote"> <INPUT TYPE="submit" NAME="but_welcome_search" VALUE="Search"><HR></FORM><DIV align='right'><A HREF="mailto:doom@kzsu.stanford.edu"><I>doom@kzsu.stanford.edu</I></A></DIV></BODY></HTML> DB<4>
Note that while the quote has changed, it still cites
that Maurois biography of Balzac.
|
An obvious point here is that since the script seems to
behave the same way inside of the debugger (which is
treating it as a CGI script) as from the website, it can't
be a mod_perl artifact. So that means I dropped my first
idea immediatly, right? No way: I wasn't thinking this
through anywhere near as clearly as that. I was getting a
feeling that it wasn't a mod_perl thing, or indeed any kind
of "stuck" value issue, but I still didn't have any good
alternatives in mind, so I kept probing the different
variables, looking for a place where something could be
"stuck".
|
So next I did a R to start over, and a
L revealed I still had my breakpoint. A c
carried me back to there again.
|
DB<4> R
Warning: some settings and command-line options may be lost!
Loading DB routines from perl5db.pl version 1.0402
Emacs support enabled.
Enter h or `h h' for help.
Running program /home/httpd/cgi-bin/common_perldb_demo.pl 0.95pre: Sun Oct 22 01:05:20 2000
DB<4> L
/home/httpd/cgi-bin/common_perldb_demo.pl:
185: my $sth = $dbh->prepare("select max(index) from quotations");
break if (1)
DB<4> c
Running sub freezefields.
DB<4> s
DB<4> n
DB<4> n
DB<4> n
DB<4> n
DB<4> x $sth
0 DBI::st=HASH(0x8416a30)
empty hash
DB<5> s
DB<5> x $sth
0 DBI::st=HASH(0x8416a30)
empty hash
DB<6> s
DB<6> x @row
0 268
DB<7> s
DB<7> x $maxindy
0 268
DB<8> s
DB<8> x $pick
0 undef
DB<9> s
DB<9> s
DB<9> s
DB<9> x $pick
0 228
DB<10> s
DB<10> x $sql
0 'select quotations.title, quotations.quote, quotations.page, quotations.chapter, quotations.topic, quotations.when_added, quotations.user_added, quotations.context, quotations.tag_comment, quotations.pointers, books.title, books.authors, books.year_published, books.edition, books.publisher, books.translator, books.original_language, books.physical_desciption, books.cover_artist, books.when_added, books.user_added, books.isbn, books.pointers from quotations, books where quotations.index=6'
DB<11>
Here's a different, probably obvious thought: could it be
that I just got my SQL screwed up somehow? But that didn't
seem *too* likely... after all, I carefully experimented
with the SQL from the command-line monitor (psql is
what's used in the postgresql world) to make sure I had the
syntax right. Once I had a command that worked, I copied it
into the perl code as a comment, and only then did I
generalize it as perl. I was being so careful with this,
it didn't seem likely that the SQL syntax was at fault in
anyway.
|
So what I *really* did was to keep going around and around
checking the values of variables. I made some simple
modifications to the code to make things easier to read, and
to break out some of the interesting values, e.g. I didn't
originally have a $sql variable.
|
But to keep from dragging this out completely though,
let's just jump to the point where it finally occured to me
to do a very simple test: take the value of $sql output by
the debugger, and cut and paste it back into the psql
monitor to doublecheck that it did what I expected it to.
Here's that sql again, cleaned up a little:
|
select
quotations.title, quotations.quote, quotations.page,
quotations.chapter, quotations.topic, quotations.when_added,
quotations.user_added, quotations.context,
quotations.tag_comment, quotations.pointers, books.title,
books.authors, books.year_published, books.edition,
books.publisher, books.translator, books.original_language,
books.physical_desciption, books.cover_artist,
books.when_added, books.user_added, books.isbn,
books.pointers
from quotations, books
where quotations.index=6
So does it work? Answer:
hell no. Rather than retrieving *one* record, it dumped
a huge number of them, and notably the *last* result has the
same book title information that I kept getting in the
website. I can't show you the full output, but here's part of the
listing:
|
The By-Pass Control|Her mouth was a wild thing,
sucking hungrily, tasting quickly to absorb
the present and the
future in the few seconds left
to us, then I pushed her away
when I didn't want her to go at all.
| 174| 11| |Wed Oct 06 12:48:41 1999 PDT|doom | |
| | 7|Prometheus: The Life of Balzac |
Maurois, Andre |
1965|1st Carrol & Graf edition, 1983 |C
arrol & Graf Publishers|Norman Denny |French |
Trade paperback, blue cover, white & red text, engraving of Balzac|
|Tue Aug 24 14:54:32 1999 PDT|doom | |
The By-Pass Control|Her mouth was a wild thing,
sucking hungrily, tasting quickly to absorb
the present and the
future in the few seconds left
to us, then I pushed her away
when I didn't want her to go at all.
| 174| 11| |Wed Oct 06 12:48:41 1999 PDT|doom | |
| | 7|The Histories |
Herodotus |
-450|1971 |P
enguin Books Ltd |Aubrey de Selincourt |Greek |
paperback, black cover, photo of greek sculpture |
|Wed Sep 08 20:43:34 1999 PDT|doom | |
[...]
The By-Pass Control|The whole situation was like a huge bowl of Jell-o
that was liquid-hot and you had to stand by until it set before
it could be handled properly. And you knew there wouldn't be
that much time alloted you.
| 125|Chapter 8| |Fri Sep 17 00:38:04 1999 PDT|doom | |
| | 6|Fantasy and Science Fiction Column, 11/82|
Budrys, Algis |
1982| |
| | |
|
|Tue Oct 10 21:08:38 2000 PDT|doom | |
The By-Pass Control|The whole situation was like a huge bowl of Jell-o
that was liquid-hot and you had to stand by until it set before
it could be handled properly. And you knew there wouldn't be
that much time alloted you.
| 125|Chapter 8| |Fri Sep 17 00:38:04 1999 PDT|doom | |
| | 6|The Mandarins |
Beauvoir, Simone de |
1954| |
|Friedman, Leonard M. |French |
|
|Sun Oct 15 23:58:14 2000 PDT|doom | |
(57 rows)
At that point it was glaringly obvious: my $sql statement
actually didn't match the prototype I included as a
comment. Somehow I had *dropped* the books.title =
quotations.title constraint. This is in fact a
*classic* gotcha with SQL, all SQL beginners have to learn
that inner joins often dump far more data than you would expect,
and that they need to be filtered carefully with the
WHERE clause. My (broken) WHERE limited us to one
row from the quotations table, but supplied *all* the rows
from the *books* table, and the result returned is every
combination of the two.
|
When you do this from the command line, you immediately
see the problem, but in this particular perl script, I used
the DBI method fetchrow_hashref, which despite the
word "row" (singular) there, actually was happy to fetch
many, many rows, and to then try to stash them all in the
same hash. So my result was actually the last record of the
fetch (and not at all the right one).
|
So finally, here we have the finished, debugged code:
sub but_welcome {
# First: select a quotation randomly.
my $sth = $dbh->prepare("select max(index) from quotations");
$sth->execute();
my @row = $sth->fetchrow_array;
# The maximum index value for the quotations table:
my $maxindy = $row[0];
my $pick;
# Select record randomly, make sure it's valid, if not, pick again:
my ($hr, $sql);
do {
$pick = int( rand($maxindy) + 1 ); # random pick of a number
# from 1 to maxindy:
# Doing query something like:
# select * from quotations, books where
# index=262 and books.title = quotations.title;
$sql = "select $q_fields, $b_fields from $qtab, $btab "
. "where $qtab.index=$pick and $qtab.title = $btab.title";
$sth = $dbh->prepare($sql);
$sth->execute();
} until ($hr = $sth->fetchrow_hashref) ;
# Add the newly read parameters to the existing values in %rec (if any):
%rec = (%rec, %{$hr});
$author = &nameswap($rec{'authors'});
$translator = &nameswap($rec{'translator'});
&form_welcome;
}
|
And we can see that this is working much better:
|
Was this the right way to debug?
Point to discuss: did using the debugger help that much
here?
|
On the one hand, just running the script from the
command-line could have quickly ruled out the idea that the
problem was a mod_perl artifact.
|
And in theory if I'd guessed right about what variables
needed to be probed, I could have used embedded print
statements that would have revealed the value of things like
"$sql"... but I'd argue that that's a big if. I've tried to
convey a little bit of the sense of the confused blundering
around I was doing before finally zeroing in on the SQL
syntax, but I really can't remember all the half-formed
ideas I tried on and discarded while probing this little
piece of code.
|
There's one little difference I should mention between
what I really did and what I've shown you here: the
commonplace app is something I've often used embedded print
statements to debug. If you look at the code that I've put
up there on the web, it's absolutely riddled with if
(DEBUG) this and that, and calls to a
print_debug routine. The but_welcome
routine that we've been debugging here doesn't have any of
this extraneous cruft now, only because I stripped it out to
simplify presenting the routine to you. I think there's
some irony here: one of the real reasons I had trouble
seeing the problem at a glance was that there was a lot of
old debugging cruft scattered around in the code. If
only I hadn't been using that cheesy embedded print statement
debugging business, maybe neither debugging process
would have been necessary.
|
Debugging CGI, in the general case
Another point, is it useful for CGI scripts *in general*?
|
It was straightforward to run this script under the perl
debugger, because though it's a CGI script, it didn't
require any input to run this particular routine.
In the general case, CGI scripts are intended to process the
information in name/value pairs, hidden fields, and
cookies. Using the CGI.pm module helps in this
effort a little bit, making it possible to pass name value
pairs in on the command line. But this alone doesn't get
you into the land of debuggery... I can't think of any other
approach besides writing some sort of wrapper script that
calls the code to be debugged in just the right way. Then
you debug the wrapper script, and break/step your way into
the real code...
|
If there are any better techniques for debugging CGI
scripts, I'd be glad to hear about them... (in particular,
some kind of open source http recorder to automatically
generate the name/value pairs would be a good thing to
have...).
|
I understand that there's a new Apache::DB; module
for debugging mod_perl scripts. I haven't looked into this
as of yet.
|
Some perl debugger gotchas I have known
- If you do an s when you wanted an n, you
should realize that you can do a r to keep chugging
to the end of the routine. Note that when you're using some
external modules (which is like, always, since this is perl
we're talking about), it's relatively easy to accidentally
start single-stepping through someone else's code.
- In general the the perl debugger is a little finicky
about how it's run from inside of emacs:
- If you're working on a CGI script, you will almost
certainly have the taint flag enabled, which will give you a
"too late to taint" error when you run perldb on it.
-T is required to be in the first group of options
fed into perl. You can do a #!/usr/bin/perl -dwT
and run the command line debugger, but if you try to run a
#!/usr/bin/perl -wT script using the emacs perldb
command, you're stuck. The simplest (though not necessarily
the most elegant) thing to do is just to manually remove the
-T flag first, and restore it when you're done.
- Also, I have yet to find a way to do something as simple
as run the emacs perl debugger on a script with the output
piped to a file. You would think that you could just feed
perldb something like perl ~/bin/myscript.pl > /tmp/output.txt
but that doesn't work (it seems to just ignore the redirect).
I typically just let the output show up
in the debugging window: this certainly works, though it can
be annoying when there's a lot of it.
- The perl dbugger gets confused sometimes and starts
debugging itself. This can be kind of amusing, but isn't
usually very productive: just kill it, and try again in
a slightly different way until it works as expected.
- It's tempting to use the capital X command rather
than the lowercase x that I've been recommending,
because X does a better job of pretty printing.
I keep it off of my recommended list because it doesn't
handle things like the @_ array right. I also find it
confusing to switch between the two, because the syntax is a little
different: It's x @array versus X array.
- A quick glance at the docs gives one the impression that
the V command is identical to X, but it isn't
at all. I hate the V command myself. It
dumps far too much information to the screen. For example,
you do not want to accidentally do a V if
you've slurped a half megabyte file into a variable.
|
Appendix 1: a simple emacs init file (.emacs)
This file should work with GNU emacs. I don't know if
it works with xemacs. xemacs is incompatible with some GNU
emacs stuff, though they use the same start-up file name,
.emacs (?!).
;; A simple .emacs file, recommened for the home directories
;; of people experimenting with ESC x perldb
;; -- Joe Brenner, doom@kzsu.stanford.edu
; A convenient way to jump to a given line number, using control-x L
(global-set-key "\C-xL" 'goto-line)
; RMS is clearly a mix of genius and insanity, and if you
; want an example of the "insanity" part, go no farther than
; the fact that Control-H is the default help command.
; Backspace for help? Let's try ESC h instead.
; This is courtesy of coleman@twinsun.com (Mike Coleman):
; Fix stupid EMACS problem.
;
(global-set-key "\C-h" 'delete-backward-char)
(global-set-key "\C-?" 'delete-backward-char)
(setq search-delete-char ?\^h)
(define-key esc-map "h" 'help-command)
(define-key esc-map "?" 'help-command)
(define-key esc-map "\C-h" 'backward-kill-word)
(define-key help-map "\M-h" 'help-for-help)
(define-key help-map "\M-?" 'help-for-help)
;; a subshell inside of emacs (ESC x shell) has a lot of
;; silly Control-Ms echoed to it, unless you do this
;; (courtesy of the NT (!) emacs FAQ):
(add-hook 'comint-output-filter-functions 'shell-strip-ctrl-m nil t)
;; Turns on syntax coloring for perl code by default.
(add-hook 'perl-mode-hook 'turn-on-font-lock)
|
Appendix 2: if you're *going* to use embedded print statements
If you're going to use embedded print statments for
debugging, do it right. I'd argue that this is right.
At the beginning of your program do something like:
use constant DEBUG => 1; # Turns on debugging statements to error_log (STDERR)
use constant VERSION => "0.95pre";
BEGIN {
$_ = scalar localtime;
my $datestamp = "\nRunning program $0 " . VERSION . ": " . $_ . "\n";
print_debug($datestamp);
sub print_debug {
(DEBUG) && print STDERR "$_[0]";
}
}
You can begin your subs with a line like this:
my $function = (caller(0))[3]; # courtesy of recipe 10.4, perl cookbook
&print_debug("Running sub $function .\n");
Fancier structures might require something like this (could
be I should learn to use Data::Dumper or whatever it's
called, eh?):
if (DEBUG) {
print STDERR "hash from querying the quotations and books tables:\n";
foreach my $key (keys %rec) {
print STDERR "Value of rec for key $key is: ";
($rec{$key}) && print STDERR "$rec{$key}";
print STDERR "\n";
}
}
|
- Contact:
-
doom@kzsu.stanford.edu
- About doom (that's me, not the damn game):
- http://obsidianrook.com/