This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
From: "Tom Ford" <fordt@uci.edu>
Subject: Regular Expressions & Substitution
Date: Mon, 7 Aug 2000 12:14:44 -0700
Hi, I just recently started learning Perl, and I have been loving it.
Especially the Regular Expressions. However, I have run into a problem I
can't seem to get around. Perhaps someone can help me solve it.
I am using a two-dimensional array for my regular-expression substitutions
like this (for example):
$Order[3][0] = "(~) ($Operand)";
$Order[3][1] = " t";
The array contains several different types of substitutions I want to perform.
Then I am using a FOR loop to run through my array like this:
for ($i=0;$i<7;$i++)
{
    if ( $_[0] =~ s/$Order[$i][0]/$Order[$i][1]/ )
    {
        .....
    }
} # end FOR-loop
Which seems to work just fine. However, the problem arises when I try to use
those special escape characters (\1 \2 \3 etc...) in the second string, for
example:
$Order[3][1] = "\1 t";
When I do that, it replaces "\1" with a smiley face. It took me a while
to realize it was inserting ASCII character 1 into my string, so I tried
single quotes like this:
$Order[3][1] = '\1 t';
But in the substitution, it puts the literal text "\1" instead of the value
of "\1". So my question is: is there any way I can use those "\1" things
within a string and use that string as my substitution? Or is there any
other way I can do this without hard-coding every single regular expression
I would like to perform?
Thanks, I hope I made sense :)
Tom
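The two behaviours described in the question can be reproduced with a minimal sketch like the one below. The variable names and the sample text "~ x" are made up for illustration and are not taken from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Double quotes: "\1" is an octal escape, i.e. the single character chr(1).
my $double = "\1 t";

# Single quotes: the backslash and the digit are kept as two characters.
my $single = '\1 t';

printf "double: first char code %d, length %d\n", ord($double), length $double;
printf "single: first char '%s', length %d\n", substr($single, 0, 1), length $single;

# The replacement side of s/// interpolates $repl once; the text it yields
# is not scanned again, so the backslash and the digit arrive literally.
my $repl = $single;
(my $text = "~ x") =~ s/(~) (x)/$repl/;
print "result: $text\n";
```

So the double-quoted form loses the backreference before the substitution ever runs, and the single-quoted form preserves it only as plain text.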
Path: nntp.stanford.edu!newsfeed.stanford.edu!sn-xit-01!supernews.com!sn-inject-01!corp.supernews.com!not-for-mail
From: gbacon@HiWAAY.net (Greg Bacon)
Newsgroups: comp.lang.perl.misc
Subject: Re: Regular Expressions & Substitution
Date: Mon, 07 Aug 2000 19:48:28 GMT
Organization: Eric Conspiracy Secret Labs
Lines: 75
Message-ID: <sou4kc6p63a11@corp.supernews.com>
References: <8mn1n2$7ra$1@news.service.uci.edu>
Reply-To: Greg Bacon <gbacon@hiwaay.net>
X-Complaints-To: newsabuse@supernews.com
X-Eric-Conspiracy: There is no conspiracy.
X-Newsreader: trn 4.0-test72 (19 April 1999)
Xref: nntp.stanford.edu comp.lang.perl.misc:332832
In article <8mn1n2$7ra$1@news.service.uci.edu>,
Tom Ford <fordt@uci.edu> wrote:
: I am using a two-dimensional array for my regular-expression substitutions
: like this (for example):
:
: $Order[3][0] = "(~) ($Operand)";
: $Order[3][1] = " t";
:
: The array contains several different types of substitutions I want to perform.
: Then I am using a FOR loop to run through my array like this:
:
: for ($i=0;$i<7;$i++)
: {
:     if ( $_[0] =~ s/$Order[$i][0]/$Order[$i][1]/ )
:     {
:         .....
:     }
: } # end FOR-loop
:
: Which seems to work just fine. However, the problem arises when I try to use
: those special escape characters (\1 \2 \3 etc...) in the second string, for
: example:
:
: $Order[3][1] = "\1 t";
:
: When I do that, it replaces "\1" with a smiley face. It took me a while
: to realize it was inserting ASCII character 1 into my string, so I tried
: single quotes like this:
:
: $Order[3][1] = '\1 t';
:
: But in the substitution, it puts the literal text "\1" instead of the value
: of "\1". So my question is: is there any way I can use those "\1" things
: within a string and use that string as my substitution?
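Build up a matcher at run time. You should see big performance gains.
Each pattern is compiled once, when the generated sub is eval'd, rather
than being recompiled on every pass through your loop. Because the
replacement text is spliced into the generated code, capture variables
such as ${1} work in it. Consider this example: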
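#! /usr/bin/perl -w
use strict;
# Each rule is [ pattern => replacement ]. Single quotes keep \1 and ${1}
# intact until the generated code is compiled.
my @subst = (
    [ 'foo'         => 'bar'   ],
    [ '([awm])\1\1' => '${1}3' ],
);
# Build the source of one anonymous sub that applies every rule, then
# compile it once with eval. A pattern containing "/" would need escaping.
sub replacer {
    my $code = "sub {\n for (\@_) {\n";
    for (@_) {
        $code .= " s/$_->[0]/$_->[1]/gi;\n";
    }
    $code .= " }\n}\n";
    #print $code;
    my $sub = eval $code;
    die $@ if $@;
    $sub;
}
my $subst = replacer @subst;
# The generated sub edits its arguments in place, so $_ is changed here.
while (<DATA>) {
    $subst->($_);
    print;
}
__DATA__
The food is under the bar in the barn.
MMM employees should visit AAA on the WWW.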
Just another Perl hacker,
Greg
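A different route to the same result, not the one taken in the post above: if generating source text with a string eval feels too heavy, each rule can pair a precompiled pattern with a small closure that builds the replacement. This is only a sketch; `apply_rules` and the layout of `@rules` are hypothetical names chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each rule pairs a precompiled pattern (qr//) with a closure that receives
# the first capture and returns the replacement text.
my @rules = (
    [ qr/foo/i,         sub { 'bar' } ],
    [ qr/([awm])\1\1/i, sub { my ($char) = @_; "${char}3" } ],
);

# apply_rules is a hypothetical helper, not part of the original post.
sub apply_rules {
    my ($text) = @_;
    for my $rule (@rules) {
        my ($pattern, $build) = @$rule;
        # /e evaluates the replacement as code, so the closure is called
        # once per match with the current value of $1.
        $text =~ s/$pattern/$build->($1)/ge;
    }
    return $text;
}

print apply_rules("MMM employees should visit AAA on the WWW.\n");
```

The trade-off is that the rules are now Perl code rather than plain strings, so they cannot be read from a configuration file as easily, but no text is ever eval'd.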
Path: nntp.stanford.edu!newsfeed.stanford.edu!arclight.uoregon.edu!newsfeed.ksu.edu!nntp.ksu.edu!onews.collins.rockwell.com!not-for-mail
From: Michael Carman <mjcarman@home.com>
Newsgroups: comp.lang.perl.misc
Subject: Re: Regular Expressions & Substitution
Date: Mon, 07 Aug 2000 15:03:12 -0500
Organization: None (anarchist)
Lines: 41
Message-ID: <398F1600.2293988E@home.com>
References: <8mn1n2$7ra$1@news.service.uci.edu>
NNTP-Posting-Host: gatekeeper.collins.rockwell.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD (WinNT; U)
X-Accept-Language: en
Xref: nntp.stanford.edu comp.lang.perl.misc:332859
Tom Ford wrote:
>
> I am using a two-dimensional array for my regular-expression
> substitutions like this (for example):
>
> $Order[3][0] = "(~) ($Operand)";
> $Order[3][1] = " t";
Note: $Order[3][0] will contain the value of $Operand at the time it was
assigned, not at the time it is used. Is that what you want?
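A short sketch of the difference, with made-up values for $Operand since the original post does not show any: the double-quoted string captures the value once, while a closure (one possible way to defer it) reads the variable when it is called.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The values given to $Operand are hypothetical, chosen for illustration.
my $Operand = '[a-z]';

# Double quotes expand $Operand immediately; the string is now fixed.
my $early = "(~) ($Operand)";

# A closure defers the expansion until it is called.
my $late = sub { "(~) ($Operand)" };

# Changing $Operand afterwards affects only the deferred version.
$Operand = '[0-9]';

my $late_pattern = $late->();
print "early: $early\n";
print "late:  $late_pattern\n";
```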
> The array contains several different types of substitutions I want
> to perform.
> Then I am using a FOR loop to run through my array like this:
>
> for ($i=0;$i<7;$i++)
Ah, another C programmer joins our ranks.
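For comparison, a more Perl-flavoured way to write that loop might look like the sketch below. The rules in @Order and the helper name `apply_order` are invented for the example; iterating over the array itself also removes the hard-coded 7.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rules: [ pattern, replacement ] pairs.
my @Order = (
    [ 'foo',     'bar' ],
    [ 'colou?r', 'hue' ],
);

# apply_order is a hypothetical helper. It walks the rules without an index
# variable and counts how many of them matched.
sub apply_order {
    my ($text) = @_;
    my $hits = 0;
    for my $rule (@Order) {
        my ($pattern, $replacement) = @$rule;
        $hits++ if $text =~ s/$pattern/$replacement/;
    }
    return ($text, $hits);
}

my ($out, $hits) = apply_order("foo colour");
print "$out ($hits substitutions)\n";
```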
> [T]he problem arises when I try to use those special escape
> characters (\1 \2 \3 etc...) in the second string, for example:
>
> $Order[3][1] = "\1 t";
>
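Those aren't really escape chars, they're backreferences. \1 is the right
spelling inside the pattern itself, but on the replacement side that form
is discouraged (perl -w warns "\1 better written as $1"). Use $1, $2, etc.
there instead.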
> When I do that, it replaces "\1" with a smiley face. It took
> me a while to realize it was inserting ASCII character 1 into my
> string, so I tried single quotes like this:
>
> $Order[3][1] = '\1 t';
>
> But in the substitution, it puts the literal text "\1"
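Yes, because the RHS of s/// is interpolated only once: $Order[$i][1] is
expanded, but the text it contains is not scanned again. Change your
's///' to 's///ee' -- the /ee will force (double) evaluation of the RHS
as an expression. The stored string then has to be a valid Perl
expression, so include the quotes in it: '"$1 t"' rather than '$1 t'.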
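A minimal sketch of the /ee approach, assuming the replacement is stored as a quoted Perl expression. The pattern, the sample text, and the @rule array are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# With /ee the stored replacement must itself be Perl source for an
# expression, hence the double quotes inside the single quotes.
my @rule = ( '(~) (\w+)', '"$1 t"' );

# First evaluation: $rule[1] yields the string "$1 t" (quotes included).
# Second evaluation: that string is eval'd, interpolating the current $1.
(my $text = "~ x") =~ s/$rule[0]/$rule[1]/ee;

print "$text\n";
```

Since /ee runs the stored text as code, it should only be used with replacement strings that come from a trusted source.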
-mjc
Path: nntp.stanford.edu!newsfeed.stanford.edu!logbridge.uoregon.edu!ihnp4.ucsd.edu!news.service.uci.edu!not-for-mail
From: "Tom Ford" <fordt@uci.edu>
Newsgroups: comp.lang.perl.misc
Subject: Re: Regular Expressions & Substitution
Date: Mon, 7 Aug 2000 13:32:21 -0700
Organization: University of California, Irvine
Lines: 12
Message-ID: <8mn685$9p3$1@news.service.uci.edu>
References: <8mn1n2$7ra$1@news.service.uci.edu> <sou4kc6p63a11@corp.supernews.com>
Reply-To: "Tom Ford" <fordt@uci.edu>
NNTP-Posting-Host: 207.136.135.118
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Xref: nntp.stanford.edu comp.lang.perl.misc:332856
> Build up a matcher at run time. You should see big performance gains.
Interesting technique! I can tell I still have a lot to learn with Perl :) I
didn't know it was so flexible.
This should solve my problem nicely.
Thanks so much,
Tom