Thread: Re: Another regex




Re: Another regex
user name
2007-03-02 17:27:13
Yes, it is a DNA sequence I need to find.

But I'm still not getting how I should go about it.

Can you advise something?

Thanks


On 3/2/07, Deane.Rothenmaierwalgreens.com <Deane.Rothenmaierwalgreens.com> wrote:
>
> If those letters were different, I'd think you were working on a chunk of
> DNA... P-))
>
>  Deane Rothenmaier
>  Programmer/Analyst
>  Walgreens Corp.
>  847-914-5150
>
>  "On two occasions I have been asked [by members of Parliament], 'Pray, Mr.
> Babbage, if you put into the machine wrong figures, will the right answers
> come out?' I am not able rightly to apprehend the kind of confusion of ideas
> that could provoke such a question." -- Charles Babbage
_______________________________________________
ActivePerl mailing list
ActivePerllistserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs

Re: Another regex
user name
United States
2007-03-02 18:09:17
Well, some of it depends upon how consistent your markers are:

$temp= "XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB"

> I need to write a regex for filtering out the string between.
> AAA
> BBB
> CCC

> so in the above case i should have the output as:
> AAAZZZZZBBB
> BBBSSSSSSCCC
> CCCGGGGBBB
> BBBVVVVVBBB
> meaning all combinations of start and end for AAA BBB CCC.
So you want the markers and what's between them - will there always be a
begin/end set of markers, but just of different content?


> I have the regex for one of them but how do i do it simultaneously for
> all 3 of them.

$temp='XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';

@t = ($temp =~/(AAA)(.*?)(BBB)/g);
foreach (@t)
{
    print $_;
}

So, use the alternative to create marker sets (note, you need to add "\n"
to the end of your print statements or it'll all run together, which makes
it seem like it's working ... sort of):

my $temp='XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
my @t = ($temp =~/(AAA|BBB|CCC)(.*?)(AAA|BBB|CCC)/g);
foreach (@t) {
     print "Got: ", $_, "\n";
}

Sort of works - it gets:
Got: AAA
Got: ZZZZ
Got: BBB
Got: CCC
Got: GGGG
Got: BBB
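
As an aside, that flat list is what a /g match in list context returns when the pattern has several capture groups: every group of every match, one after another. A minimal sketch, assuming the same sample string, that peels the list apart three items at a time and stitches each triple back together (the names @joined, $start, $middle and $end are only for illustration):

```perl
use strict;
use warnings;

my $temp = 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';

# List context + /g: every capture group of every match, flattened.
my @t = ( $temp =~ /(AAA|BBB|CCC)(.*?)(AAA|BBB|CCC)/g );

# Take the flat list three items at a time: start marker, middle, end marker.
my @joined;
while ( my ( $start, $middle, $end ) = splice( @t, 0, 3 ) ) {
    push @joined, "$start$middle$end";
}
print "Got: $_\n" for @joined;
```

This prints AAAZZZZBBB and CCCGGGGBBB, so it only tidies up the output; the stretches that begin at an already-consumed end marker are still missing.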

you want to capture the whole shebang - so we use both the capture parens
and, because we're using the alternative pipe "|", the non-capturing
parens (which are "(?:....)" ) to group our alternatives:
my $temp='XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
my @t = ($temp =~/((?:AAA|BBB|CCC).*?(?:AAA|BBB|CCC))/g);
foreach (@t) {
     print "Got: ", $_, "\n";
}

Got: AAAZZZZBBB
Got: CCCGGGGBBB
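
If the marker set is likely to grow (real DNA motifs rather than AAA/BBB/CCC), it may be easier to build the alternation from a list than to type it twice. A rough sketch, assuming the markers are plain literal strings; @markers and $alt are hypothetical names, and quotemeta() is only there in case a marker ever contains a regex metacharacter:

```perl
use strict;
use warnings;

my $temp    = 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
my @markers = qw(AAA BBB CCC);    # hypothetical marker list

# Escape each marker and join them into one alternation: AAA|BBB|CCC
my $alt = join '|', map { quotemeta($_) } @markers;

# Same pattern as above, with the alternation interpolated in both places.
my @t = ( $temp =~ /((?:$alt).*?(?:$alt))/g );
print "Got: $_\n" for @t;
```

It behaves the same as the hand-written pattern, so it still gets only AAAZZZZBBB and CCCGGGGBBB.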

But this isn't quite right as it's not 'reusing' the end marker of one
match as the beginning of the next.  This gets trickier: you want to
restart the match at the end marker of the previous match, not just after
it. First, let's go to the cool while ( /.../g ) { ... } loop - note the
change to '$1' in the print:
my $temp='XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
while( $temp =~/((?:AAA|BBB|CCC).*?(?:AAA|BBB|CCC))/g) {
     print "Got: ", $1, "\n";
}

Got: AAAZZZZBBB
Got: CCCGGGGBBB

Er, I have to go here but I think the proper bump along/reset code might
be in this article:

http://www.samag.com/documents/s=10118/sam0703i/0703i.htm

nope. Dang. I'll have to find it.  The \G marks the point of the last
match, when you're doing a global "/g" matching process. The "pos()"
function is the location of the current \G and you can reset that.
Something like:
my $temp='XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
while( $temp =~/((?:AAA|BBB|CCC).*?(?:AAA|BBB|CCC))/g) {
     my $pos = pos $temp;
     print "Got ($pos):", $1, "\n";
     # back up over the 3-character end marker so it can start the next match
     pos($temp) -= 3;
}

Got (14):AAAZZZZBBB
Got (21):BBBSSSSCCC
Got (28):CCCGGGGBBB
Got (36):BBBVVVVVBBB
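
Another way to get the same four results without adjusting pos() by hand might be a lookahead: if the end marker is matched inside "(?= ... )" it is captured but not consumed, so the next match can start right on it. This is only a sketch under the assumption that markers never overlap each other; $marker and @found are hypothetical names, and it swaps the pos() rewind for a lookahead rather than being the code above:

```perl
use strict;
use warnings;

my $temp   = 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
my $marker = qr/AAA|BBB|CCC/;    # hypothetical marker pattern

my @found;
# $1 = start marker, $2 = what lies between, $3 = end marker.
# The end marker sits inside a lookahead, so it is captured but not
# consumed and can serve as the start marker of the next match.
while ( $temp =~ /($marker)(.*?)(?=($marker))/g ) {
    push @found, "$1$2$3";
}
print "Got: $_\n" for @found;
```

One advantage is that nothing depends on the markers being exactly 3 characters long, which the "-= 3" rewind does.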

a

Andy Bach
Systems Mangler
Internet: andy_bachwiwb.uscourts.gov
VOICE: (608) 261-5738  FAX 264-5932

"Procrastination is like putting lots and lots of commas in the sentence
of your life."
Ze Frank
http://lifehacker.com/software/procrastination/ze-frank-on-procrastination-235859.php