[Question] A question about Perl regular expressions


#31

Post by banban » 2008-08-15 17:15

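wuchuanren wrote: I just remembered that Perl has an anchor called \G (I would have to check a book for the details) that makes a match start where the previous match ended.
That should be enough for this.
Is there a \G? I only remember /g, which is used for global substitution. Hold on, let me look it up...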

#32

Post by wuchuanren » 2008-08-16 11:24

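So the original poster is still following this. Let me post some code then:

Code:

#!/usr/bin/perl
use strict;
use warnings;

my $data = "atggcagttggtacctaagcattdggtacccgtta";
# \G anchors each attempt at the end of the previous /g match (pos).
# Capture ggtacc together with the 4 characters in front of it.
while ($data =~ /\G.*?(.{4}ggtacc)/g) {
    print $1, "\n";
}
Heh, there is just one small detail that may or may not fit the original poster's requirements.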

#33

Post by wuchuanren » 2008-08-16 11:29

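Doing it this way has one quirk.
Take a base sequence like this:

Code:

atggcagttggtaccttdggtacccgtta
It contains two occurrences of ggtacc. The 10-character string that ends with the second ggtacc begins inside the first ggtacc (it needs that occurrence's final c), so it overlaps the previous match and this code does not report it.

So which method to use depends on what the original poster actually needs.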
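
If overlapping hits do need to be reported, one possible workaround (a sketch, not something proposed in this thread) is to put the capture inside a zero-width lookahead. The match then consumes no characters, so the next attempt starts one position further on instead of after the previous ggtacc. The sub name find_windows is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every 10-character window that ends in
# ggtacc, including windows that overlap an earlier hit.
sub find_windows {
    my ($seq) = @_;
    my @hits;
    # (?=...) is zero-width, so nothing is consumed and the regex engine
    # retries from the next position after each match.
    while ($seq =~ /(?=(.{4}ggtacc))/g) {
        push @hits, $1;
    }
    return @hits;
}

print "$_\n" for find_windows("atggcagttggtaccttdggtacccgtta");
```

On the sequence above this should print agttggtacc and then cttdggtacc, whereas the \G version stops after the first one.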




(I have been brushing up on Perl a bit lately~) :D

#34

Post by banban » 2008-08-16 14:41

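Oh, the output of that is correct too. Thanks for offering another approach.
But I still don't know how this \G is used. My textbook doesn't cover it, and I couldn't find it on Google either. Thanks all the same, though.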

#35

Post by wuchuanren » 2008-08-16 15:46

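\G, like \b, is just an anchor. It marks the position where the previous match ended (if there has been no match yet, or the previous match failed, it falls back to the start of the string).
\G is useful together with /g, since that is when the end position of a match (pos) is remembered.
There is also a /c modifier to go with \G: with /c, a failed match does not send the position back to the start.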
................ :D
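
A small experiment can make these three rules visible. The following is only a sketch: the sample string "aaXbb" and the helper show are made up for illustration, and pos() is the built-in that reports where \G will match next.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a pos() value into something printable.
sub show { my ($p) = @_; return defined $p ? $p : 'undef' }

my $text = "aaXbb";
my $ok;

# Two successful /g matches: \G pins each one to the current pos().
$ok = $text =~ /\Ga/g;
$ok = $text =~ /\Ga/g;
my $after_success = show(pos($text));

# Third attempt fails (the next character is X); plain /g resets pos().
$ok = $text =~ /\Ga/g;
my $after_g_fail = show(pos($text));

# Same sequence with /gc: the failing third attempt leaves pos() alone.
$ok = $text =~ /\Ga/gc;
$ok = $text =~ /\Ga/gc;
$ok = $text =~ /\Ga/gc;
my $after_gc_fail = show(pos($text));

print "after two matches: $after_success\n";
print "after /g failure:  $after_g_fail\n";
print "after /gc failure: $after_gc_fail\n";
```

This should report 2, undef and 2 in that order: the failed /g match forgets the position, while the failed /gc match keeps it.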

#36

Post by banban » 2008-08-16 15:49

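I see, so that's how it works. Thanks! And here I have been studying Perl for several months and still haven't got a grip on a lot of it. Sorry for the silly question...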

#37

Post by heejun » 2008-08-16 21:46

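banban wrote: Oh, the output of that is correct too. Thanks for offering another approach.
But I still don't know how this \G is used. My textbook doesn't cover it, and I couldn't find it on Google either. Thanks all the same, though.
If you are interested, have a look at Mastering Regular Expressions.
It explains this in detail: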
\G was first introduced by Perl to be useful when doing iterative matching with /g (☞51), and ostensibly matches the location where the previous match left off. On the first iteration, \G matches only at the beginning of the string, just like \A.
If a match is not successful, the location at which \G matches is reset back to the beginning of the string. Thus, when a regex is applied repeatedly, as with Perl's s/⋯/⋯/g or other language's "match all" function, the failure that causes the "match all" to fail also resets the location for \G for the next time a match of some sort is applied.
Type of match   Where match starts              pos upon success      pos upon failure
m/⋯/            start of string (pos ignored)   reset to undef        reset to undef
m/⋯/g           starts at target's pos          set to end of match   reset to undef
m/⋯/gc          starts at target's pos          set to end of match   left unchanged
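
The "left unchanged" behaviour in the last row is what allows several patterns to be tried in turn at the same position. As a rough illustration (the sub tokenize, the token labels and the input string are all invented for this sketch), a tiny tokenizer might look like this:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical tokenizer: split a string into runs of bases (acgt) and
# runs of anything else, trying one pattern after another at pos().
sub tokenize {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
    while (pos($input) < length $input) {
        # /c keeps pos() in place when the first pattern fails, so the
        # second pattern is tried at the very same position.
        if    ($input =~ /\G([acgt]+)/gc)  { push @tokens, "BASES:$1" }
        elsif ($input =~ /\G([^acgt]+)/gc) { push @tokens, "OTHER:$1" }
        else  { last }    # not reached here: one of the two always matches
    }
    return @tokens;
}

print "$_\n" for tokenize("atggNNcat-ta");
```

Without /c, the first failed alternative would reset pos() and the loop would lose its place in the string.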

#38

Post by banban » 2008-08-16 21:55

heejun wrote: If you are interested, have a look at Mastering Regular Expressions.
It explains this in detail
Thanks a lot... I learned something new again, heh.