All times are UTC + 8 hours



Post 1
Subject: Wikipedia Dump Reader, reads *.xml.bz2 directly
Posted: 2010-08-09 23:45

Joined: 2009-11-19 15:03
Posts: 1229
Location: 娜美星
Thanks given: 1
Thanks received: 1
Download here: https://launchpad.net/wikipediadumpreader

An application to easily read Wikipedia's downloaded dump files.

This simple program displays the text-only Wikipedia compressed dumps, currently available at http://download.wikimedia.org/backup-index.html, generally named like pages-articles.xml.bz2.

It's fairly usable now for Wikipedia reading, although many rendering and layout glitches still occur.
It is focused on usability, and not necessarily trying to mimic the online web interface.

Features include a Qt viewer with basic text mark-up, link following, the ability to read directly from the .bz2 compressed file (although an index creation step is needed on first run), a tab-like list of articles that load in the background by default, a simple but useful keyword search, very light source code, optional LaTeX rendering, and no installation required.
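To give an idea of what reading directly from the compressed file can look like, here is a minimal Python 3 sketch that streams page titles out of a .xml.bz2 dump without unpacking it to disk. It is not taken from dumpReader.py: the helper names `local_name` and `iter_titles` are hypothetical, and the only assumption about the dump is that it uses MediaWiki's `<page>` and `<title>` export elements.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_titles(dump_path):
    """Yield each page title from a bz2-compressed MediaWiki XML dump.

    Hypothetical helper for illustration; the archive is decompressed on the fly.
    """
    with bz2.open(dump_path, "rb") as stream:
        title = None
        for _event, elem in ET.iterparse(stream, events=("end",)):
            name = local_name(elem.tag)
            if name == "title":
                title = elem.text
            elif name == "page":
                yield title
                title = None
                elem.clear()  # drop the finished page's children to limit memory use
```

Because the function is a generator, a caller can stop after the first few titles without decompressing the rest of the archive.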



After reading a blog post on planetkde about an offline Wikipedia reader, I decided to release a Python/Qt program I wrote some months ago, as I realized it might be useful to more people than just myself.

Wikipedia Dump Reader displays the text-only Wikipedia compressed dumps, currently available at http://download.wikimedia.org/backup-index.html, generally named something like pages-articles.xml.bz2.

It's fairly usable now for basic Wikipedia reading, although many features might still not work, since I originally intended this reader for my own use only.

Features include a Qt viewer with basic text markup, link following, the ability to read directly from the .bz2 compressed file (although an index creation step is needed on first run), a tab-like list of articles that load in the background by default, and very light source code.
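The one-time index step can be pictured roughly as follows. This is only a sketch under stated assumptions, not the program's actual index format: the function `build_title_index` and the `.titles.json` suffix are made up for illustration, and the index here is just a list of titles written next to the archive. An index that allows jumping straight to an article would also have to record where each page sits in the compressed file, which this sketch does not attempt.

```python
import bz2
import json
import os
import xml.etree.ElementTree as ET


def build_title_index(dump_path, index_path=None):
    """Write a JSON list of page titles next to the dump and return the index path."""
    if index_path is None:
        index_path = dump_path + ".titles.json"  # hypothetical naming scheme
    if os.path.exists(index_path):
        return index_path  # later runs reuse the existing index
    titles = []
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            name = elem.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
            if name == "title":
                titles.append(elem.text)
            elif name == "page":
                elem.clear()  # drop the finished page's children
    # Writing beside the archive is why write permission is needed there.
    with open(index_path, "w", encoding="utf-8") as out:
        json.dump(titles, out, ensure_ascii=False)
    return index_path
```

The slow pass over the whole archive happens only when no index file exists yet, which matches the "first run" cost described above.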

The current code requires PyQt4, although some old, unmaintained PyQt 3 code is included.

Tested on Fedora Core 4 and Kubuntu with PyQt4.1 (Python 2.4, Qt 4.2).

Usage:

1. On the command line, run:

python dumpReader.py

or just click on it from your favorite file manager.

2. Browse to and select the archive (a file probably named *.xml.bz2).

3. On the first run, an index is created, which can take some time. Currently, the program needs write permission on the same directory.

4. The main window contains the article title area (top), the main text area (left), and the article history (right). You can go to an article by typing its name and clicking the "Go" button, or by clicking a link in the main text area. By default, clicking a link loads the article in the background. The search box lets you search article titles by keyword. You can also go to a random article by clicking "Go" with an empty entry.
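The title search and the "Go" behaviour from step 4 can be sketched in a few lines of Python. This is an illustration of the described behaviour only, assuming the titles are already available as a list; `search_titles` and `go` are hypothetical names, not functions from dumpReader.py, and the case-insensitive substring match is an assumption about how the keyword search works.

```python
import random


def search_titles(titles, keyword):
    """Return the titles that contain the keyword, ignoring case."""
    needle = keyword.strip().lower()
    return [t for t in titles if needle in t.lower()]


def go(titles, entry, rng=random):
    """Pick the article for a "Go" click.

    An empty entry selects a random article; otherwise the entry must match
    a title exactly, and None is returned when it does not.
    """
    entry = entry.strip()
    if not entry:
        return rng.choice(titles)
    return entry if entry in titles else None
```

For example, `search_titles(["Monty Python", "Qt (software)"], "python")` keeps only the first title, while `go(titles, "")` returns some title chosen at random.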


deb package:
Attachment:
wikipediaDumpReader-i386-0.2.10.deb [260.07 KiB]
Downloaded 38 times


Last edited by parseeci on 2010-08-10 22:21; edited 1 time in total

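Post 2
Subject: Re: Wikipedia Dump Reader, reads *.xml.bz2 directly
Posted: 2010-08-10 21:22

Joined: 2010-01-13 23:26
Posts: 3173
Thanks given: 0
Thanks received: 12
It's a Qt program; I don't know whether it's in the repositories. Binary debs shouldn't be installed carelessly.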

http://kde-apps.org/content/show.php/Wi ... tent=65244


_________________
Here I am.
Ubuntu Desktop Training - official documentation entirely in Chinese, with localized screenshots, available as PDF


Post 3
Subject: Re: Wikipedia Dump Reader, reads *.xml.bz2 directly
Posted: 2011-08-11 13:49

Joined: 2008-09-13 19:17
Posts: 7789
System: Arch Linux (x86_64)
Thanks given: 10
Thanks received: 77
There is a Launchpad PPA for it.


_________________
Blog: http://www.lainme.com


Post 4
Subject: Re: Wikipedia Dump Reader, reads *.xml.bz2 directly
Posted: 2011-12-06 1:25

Joined: 2011-02-04 14:08
Posts: 132
Thanks given: 4
Thanks received: 0
Definitely bookmarking this!!! It's great!!


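Post 5
Subject: Re: Wikipedia Dump Reader, reads *.xml.bz2 directly
Posted: 2011-12-06 1:29

Joined: 2011-02-04 14:08
Posts: 132
Thanks given: 4
Thanks received: 0
http://download.wikimedia.org/backup-index.html: I can browse it but can't download anything. Has it been blocked by the firewall?!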

