How to count the words in a document
-
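- Posts: 5
- Registered: 2015-09-06 17:19
- System: Ubuntu
How to count the words in a document
I'm a complete beginner. I'd like to write a shell script that counts how many words are in a document. Any pointers would be appreciated.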
- susbarbatus
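- Posts: 2966
- Registered: 2010-04-10 16:14
- System: Arch Linux
Re: How to count the words in a document
Do you only need the total number of words, or do you need to deduplicate them?
For deduplication, this is about the best I can come up with (only spaces, commas, and periods are treated as separators here, and a token such as 2.0 gets split and overcounted):
Code: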
tr -s ' ,.' '\n' < test.txt | sort -u | wc -l
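A somewhat more robust variant of the pipeline above, as a minimal sketch. It treats every character that is not a letter, digit, or apostrophe as a separator, folds case, and drops empty lines. The function name `count_unique_words` is hypothetical, and this definition of a "word" is an assumption; a token such as 2.0 is still split into two.

```shell
# Hypothetical helper: count the distinct words in the file given as $1.
# Assumption: a word is a run of letters, digits, or apostrophes.
count_unique_words() {
    # -c complements the set, -s squeezes each run of separators into one newline
    tr -cs "[:alnum:]'" '\n' < "$1" |
        tr '[:upper:]' '[:lower:]' |  # fold case so "The" and "the" count once
        grep -v '^$' |                # drop the empty line left by a leading separator
        sort -u |
        wc -l
}
```

Calling `count_unique_words test.txt` prints the number of distinct words. Whether case folding is wanted depends on the document, so that stage can simply be removed.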
Hooked on shogi...
- HuntXu
- Posts: 5776
- Registered: 2007-09-29 3:09
Re: How to count the words in a document
Code:
$ wc --help
Usage: wc [OPTION]... [FILE]...
or: wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified. A word is a non-zero-length sequence of
characters delimited by white space.
With no FILE, or when FILE is -, read standard input.
The options below may be used to select which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
                           If F is - then read names from standard input
  -L, --max-line-length  print the maximum display width
  -w, --words            print the word counts
      --help     display this help and exit
      --version  output version information and exit
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/wc>
or available locally via: info '(coreutils) wc invocation'
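The help text above defines a word as a non-empty run of characters delimited by white space, and `wc -w` prints just that count. A minimal sketch of using it from a script; the helper name `count_words` is hypothetical:

```shell
# Hypothetical helper: print only the word count of the file given as $1.
count_words() {
    # Reading from stdin keeps wc from printing the file name next to the number.
    wc -w < "$1"
}
```

Calling `count_words test.txt` prints the total. Because punctuation attached to a word is not stripped, "end." and "end" both count as one word each, which is usually what is wanted for a plain total.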
HUNT Unfortunately No Talent...
-
- Posts: 4
- Registered: 2015-09-23 9:26