This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
To: sfpug@sf.pm.org
From: zenji@gmx.net
Subject: Re: [sf-perl] Dinner before SFPUG?
Date: Thu, 30 Aug 2001 22:39:06 +0200 (MEST)

> I wasn't able to make it to the meeting, but I am curious about
> how things went. Anyone?

We traded a lot of horror stories about insane code we had inherited
from other people, and talked about using various tools to analyse
control flow, from the debugger to various partial solutions people had
tried to code on their own. We came out with a few big lessons. They're
a little pessimistic:

1) Code archaeology is _hard_.

There are no panaceas; there are some basic tools and techniques that
help, but there's a lot of painstaking work involved, too.

2) Often just documenting the code is a good start.

You will have to read through it and understand it, so you might as
well share your understanding. Rich put forth the idea of different
levels of documentation, where Level 0 is simply readable code, Level 1
is well-commented code, and higher levels correspond to system design
documents, user documentation, et cetera. Vicki pointed out that
documenting the data structures code expects is just as important as
documenting the code itself, and recommended putting sample structures
in the comments at the top of blocks of code that use them.

3) There are no good static analysis tools for Perl.

It would be nice to see all the places where a sub or method is called,
displayed in a flowchart or on demand when you mouse over a method, but
there are no tools that do this. For one thing, Perl allows for
run-time evaluation of code, so such tools would have to be limited.
Worse, Perl is so syntactically rich that it is difficult to parse. An
oft-repeated observation was "The only thing that can parse Perl is
perl [the interpreter]."

4) Re-writing code is a perfectly legitimate solution.

It often proves faster than trying to understand existing code.
It's sometimes easier to understand what a piece of code is supposed to
do than to understand what it actually does. The tricky part is
figuring out when you have an isolated chunk of code that can be
replaced safely, without breaking the rest of the system. What if it
doesn't do what it is supposed to do, but other pieces of code depend
on its broken behavior? You can end up putting your arms around bigger
and bigger parts of the system, trying to find an independent unit to
replace. Really bad code is often the least modular, so you may be out
of luck in this respect.

5) The bad systems that require code archaeology are the result of
sociological problems, not technological ones.

The two major problems are duplication and layering of code.

Layering occurs when an individual programmer cannot understand a piece
of code needed for an application, and just writes to it anyway,
hacking away until the errors seem to stop. Sometimes this means
writing yet another level of indirection around a library; sometimes it
means cutting and pasting code, then modifying it for the situation
(thereby trashing any chances for generality!) Layering has the nasty
property of being self-perpetuating: The greater the accretion of cruft
in the program, the harder it is for anyone to understand it, let alone
change it safely. Without understanding, the temptation to add yet
another layer is great.

Duplication, related to layering but more widely recognized, happens
when various programmers solve the same general problem with different
pieces of code (usually, in the Perl case, writing their own modules).
In particularly egregious cases, various parts of the code do the same
thing, overriding each other in hard-to-predict ways. Matt mentioned a
past Web programming job where they constantly had to play "Find the
Header", since the template library auto-generated HTTP headers, but
various pieces of code also manipulated them directly, either before or
after the template was applied.

The proactive solution to both problems is good communication. This can
mean more consistency in choosing libraries and more frequent code
reviews, but neither of these solutions is sufficient on its own. More
important is changing the mentality of programmers who think they
should work in isolation and never discuss the problems they are
solving. If programmers talk regularly to explain the designs they are
developing, they can recognize common needs and come up with general
solutions (avoiding the duplication problem). Likewise, they can
actually keep some understanding of each other's code (heading off the
layering problem). This doesn't mean understanding all the
implementation details; it means understanding the API (after making
sure there is a defined API at all!)

Code archaeologists seldom have the luxury of taking proactive
measures; they are called in to clean up the mess after the classic
mistakes have already been made. On the other hand, if you're working
more generally as a programmer, you can recognize the signs of layering
and duplication and know to go into archaeology mode before things get
worse. While you do, you should advocate for better communication
processes to prevent the problem from happening again.

And, yes, there was enough pizza. :)

--Q

===

To: sfpug@sf.pm.org
From: zenji@gmx.net
Subject: [sf-perl] Running the debugger non-interactively
Date: Thu, 30 Aug 2001 22:51:10 +0200 (MEST)

In the meeting, I mentioned that you could use the debugger
non-interactively to print a stack trace of executing code. The way to
do this is to set the environment variable $PERLDB_OPTS. It's a
space-separated list of options; you want something like
"NonStop frame=2", where NonStop sets non-interactive mode and frame
specifies a level of detail at which to print stack traces.

From the perldebug man page:

`frame'
    Affects the printing of messages upon entry and exit from
    subroutines. If `frame & 2' is false, messages are printed on
    entry only.
    (Printing on exit might be useful if interspersed with other
    messages.)

    If `frame & 4', arguments to functions are printed, plus context
    and caller info. If `frame & 8', overloaded `stringify' and
    `tie'd `FETCH' is enabled on the printed arguments. If
    `frame & 16', the return value from the subroutine is printed.

    The length at which the argument list is truncated is governed by
    the next option:

Another useful option (again, from the man page):

`maxTraceLen'
    Length to truncate the argument list when the `frame' option's
    bit 4 is set.

You can also use a .perldb file to set options.

--Q

===

To: sfpug@sf.pm.org
From: Peter Prymmer <pvhp@best.com>
Subject: Re: [sf-perl] Dinner before SFPUG?
Date: Thu, 30 Aug 2001 14:58:14 -0700 (PDT)

On Thu, 30 Aug 2001 zenji@gmx.net wrote:

> > I wasn't able to make it to the meeting, but I am curious about
> > how things went. Anyone?
>
> We traded a lot of horror stories about insane code we had inherited from
> other people, and talked about using various tools to analyse control flow,
> from the debugger to various partial solutions people had tried to code on
> their own. We came out with a few big lessons. They're a little
> pessimistic:
>
> 1) Code archaeology is _hard_.
>
> There are no panaceas; there are some basic tools and techniques
> that help, but there's a lot of painstaking work involved, too.

[snip]

> And, yes, there was enough pizza. :)

That was an excellent summary of the discussion. Thanks.

One tool that was mentioned that only a few people seemed to have heard
of was the Perl reformatter/pretty printer called perltidy. It is
currently a SourceForge project accessible via:

http://perltidy.sourceforge.net/

While it is not a static analysis tool, it seemed to be well liked by
those who had mentioned it or used it.

Peter Prymmer

===