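1 Task: write a script that runs on a daily schedule, converts Adblock rules into Squid rules, and makes Squid filter ads with the latest rules
2 Difficulty: four stars
3 Goal: learn a scripting language and Squid
4 Software involved: any scripting language you are familiar with
5 Estimated time: one week
6 Reference sites: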
http://easylist.adblockplus.org/easylist.txt
http://adblock-chinalist.googlecode.com ... dblock.txt
[Four stars] Write a script that converts Adblock rules into Squid rules
- oneleaf
- Forum administrator
- Posts: 10448
- Registered: 2005-03-27 0:06
- System: Ubuntu 12.04
- hcym
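- Posts: 15634
- Registered: 2007-05-06 2:46
Re: [Four stars] Write a script that converts Adblock rules into Squid rules
If Adblock rules could also be converted into urlfilter blocking rules,
Opera would really be a killer tool.
Always muttering nonsense, never done anything serious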
- oneleaf
Re: [Four stars] Write a script that converts Adblock rules into Squid rules
adblock2squid.py
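#!/usr/bin/env python3
# -*- coding: utf-8; -*-
# (c) UbuntuChina, http://www.ubuntu.org.cn
# (c) free software, GPLv3
# Connect: oneleaf@gmail.com
'''
This script converts Adblock rules into Squid rules.
Usage:
1 Edit /etc/squid/squid.conf, find the line "http_access allow localhost", and add these two lines above it:
acl adblock url_regex "/etc/squid/adblock.rules"
http_access deny adblock
2 Run this script as root:
sudo python3 adblock2squid.py
3 Restart squid so the rules take effect:
sudo /etc/init.d/squid restart
'''
import os
import re
import urllib.request


def addcustomrules(outfile, customfile='adblock.txt'):
    # Optional local rules file; skipped when it does not exist.
    if not os.path.exists(customfile):
        print(customfile, 'not found, ignoring')
        return
    print('loading', customfile)
    with open(customfile, encoding='utf-8', errors='replace') as f:
        addruletext(outfile, f.readlines())


def addrules(outfile, adblockurl):
    # Download a filter list and convert every line of it.
    print('loading', adblockurl)
    data = urllib.request.urlopen(adblockurl).read()
    addruletext(outfile, data.decode('utf-8', 'replace').splitlines())


def addruletext(outfile, lines):
    for line in lines:
        rule = line.strip()
        # Skip blank lines, comments (!) and the [Adblock ...] header.
        if rule == '' or rule[0] == '!' or rule[0] == '[':
            continue
        # Skip element-hiding rules (##) and exception rules (@@).
        if rule.find('##') >= 0:
            continue
        if rule.find('@@') == 0:
            continue
        # Escape regex metacharacters and translate Adblock wildcards.
        rule = rule.replace('.', '\\.')
        rule = rule.replace('?', '\\?')
        rule = rule.replace('*', '.*')
        rule = rule.replace('^', '.')
        # "||" anchors at a domain name; mark it with "!!" for now.
        if rule.find('||') == 0:
            rule = '!!' + rule[2:]
        # A leading or trailing "|" anchors the start or end of the URL.
        if rule[0] == '|':
            rule = '^' + rule[1:]
        if rule[-1] == '|':
            rule = rule[:-1] + '$'
        rule = rule.replace('|', '\\|')
        if rule.find('!!') == 0:
            rule = '^http://(.*?\\.|)' + rule[2:]
        domain = ''
        c = rule.find('$')
        # Treat "$" as the options separator only when text follows it;
        # a final "$" is the end anchor produced above and must be kept.
        if 0 < c < len(rule) - 1:
            part = rule[c:]
            r = ''
            # Approximate the type options by file extension.
            if part.find('image') >= 0:
                r = r + '|jpg|jpeg|svg|png|bmp|gif'
            if part.find('object') >= 0:
                r = r + '|swf|jar'
            if part.find('script') >= 0:
                r = r + '|js|vbs'
            if part.find('domain=') >= 0:
                domain = part[part.find('domain=') + 7:]
                if domain.find(',') >= 0:
                    domain = domain[:domain.find(',')]
                if domain.find('~') >= 0:
                    domain = domain.replace('~', '.*\\.')
                if domain.find('\\|') >= 0:
                    domain = '(' + domain.replace('\\|', '|') + ')'
            if r:
                r = '.*(' + r[1:] + ')'
            rule = rule[:c] + r
        if rule[-1] == '^':
            rule = rule[:-1]
        # An empty pattern would match, and therefore block, every URL.
        if not rule:
            continue
        try:
            if rule.find("http") >= 0:
                # Drop rules that contain "http" and also carry a domain= option.
                if domain != '':
                    continue
                rule = re.compile(rule)
            else:
                rule = re.compile(domain + rule)
        except re.error:
            # Skip anything that is not a valid regular expression.
            continue
        print('add:', rule.pattern, 'from', line.strip())
        outfile.write(rule.pattern + "\n")


if __name__ == '__main__':
    with open("/etc/squid/adblock.rules", "w", encoding='utf-8') as outfile:
        addcustomrules(outfile, "adblock.txt")
        addrules(outfile, "http://easylist.adblockplus.org/easylist.txt")
        addrules(outfile, "http://adblock-chinalist.googlecode.com/svn/trunk/adblock.txt")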
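The script above generates the rules once, while the task also asks for a daily refresh that Squid picks up. Below is a minimal sketch of that last step, not part of adblock2squid.py: the helper names `install_rules` and `reload_squid` are hypothetical, and it assumes the Squid binary is on the PATH as `squid` (some installations name it differently). It writes the patterns to a temporary file, swaps it into place in one step so Squid never reads a half-written list, and asks Squid to re-read its configuration with `squid -k reconfigure` rather than doing a full restart.

```python
import os
import subprocess
import tempfile


def install_rules(patterns, rules_path):
    """Write one pattern per line and atomically replace rules_path."""
    directory = os.path.dirname(os.path.abspath(rules_path))
    # Create the temporary file next to the target so os.replace stays
    # on one filesystem and the swap is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix='.adblock-')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as tmp:
            for pattern in patterns:
                tmp.write(pattern + '\n')
        # mkstemp creates the file with mode 0600; make it readable by
        # the unprivileged user that Squid usually runs as.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, rules_path)
    except BaseException:
        # Leave the old rules untouched if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def reload_squid(run=subprocess.check_call):
    """Ask a running Squid to re-read its configuration.

    `run` is injectable so the function can be exercised without Squid.
    Assumption: the binary is called `squid` and is on the PATH.
    """
    run(['squid', '-k', 'reconfigure'])
```

A daily run could then call `install_rules(patterns, '/etc/squid/adblock.rules')` followed by `reload_squid()`, for example from a root cron job; the scheduling mechanism is left open here.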
- eexpress
- Posts: 58428
- Registered: 2005-08-14 21:55
- From: Changsha
- oneleaf
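Re: [Four stars] Write a script that converts Adblock rules into Squid rules
Convert the Adblock format into standard regular expressions, then use Squid's url_regex to match and filter.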
acl adblock url_regex "/etc/squid/adblock.rules"
http_access deny adblock
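To make the conversion idea concrete, here is a simplified sketch of the translation for single rules. The function name `convert_rule` and the sample rules and URLs are made up for illustration; it ignores every `$option` and is not the full script posted earlier in the thread. Python's `re` module is used only to approximate how a url_regex pattern would match a request URL; Squid uses its own regex engine, so results may differ in edge cases.

```python
import re


def convert_rule(rule):
    """Return a regex string for one Adblock filter, or None if skipped."""
    rule = rule.strip()
    # Skip blanks, comments (!), list headers ([), element-hiding rules
    # (##) and exception rules (@@).
    if not rule or rule[0] in '![' or '##' in rule or rule.startswith('@@'):
        return None
    # Drop any "$options" suffix; this sketch ignores options entirely.
    rule = rule.split('$', 1)[0]
    if not rule:
        return None
    prefix = suffix = ''
    if rule.startswith('||'):
        # "||" anchors at a domain name, with any subdomain allowed.
        prefix, rule = '^http://(.*?\\.|)', rule[2:]
    elif rule.startswith('|'):
        # A leading "|" anchors the start of the URL.
        prefix, rule = '^', rule[1:]
    if rule.endswith('|'):
        # A trailing "|" anchors the end of the URL.
        suffix, rule = '$', rule[:-1]
    # Escape "." and "?", then translate the Adblock wildcard "*" and
    # the separator "^"; any remaining "|" is a literal character.
    body = (rule.replace('.', '\\.').replace('?', '\\?')
                .replace('*', '.*').replace('^', '.').replace('|', '\\|'))
    return prefix + body + suffix


samples = [
    '! a comment',
    'example.com##.ad',
    '@@||good.example.com^',
    '||ads.example.com^$third-party',
    '|http://x.example/ad.js|',
    '/banner/*/img',
]
for sample in samples:
    print(sample, '->', convert_rule(sample))

# Check one generated pattern against request URLs.
pattern = convert_rule('||ads.example.com^')
for url in ('http://www.ads.example.com/banner.gif', 'http://example.com/'):
    print(url, 'blocked' if re.search(pattern, url) else 'allowed')
```

Each non-None result is one line of the kind of file that `acl adblock url_regex "/etc/squid/adblock.rules"` loads.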