It is often useful to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam.
For this project, you can start with a spam/ham dataset, then predict the class of new documents (either withheld from the training dataset or from another source such as your own spam folder).
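Withholding documents from the training data can be sketched in a few lines of base R. This is only an illustration on made-up data: the data frame docs, its columns, and the 30% test share are assumptions, not part of the project files.

```r
# Hypothetical toy data: 10 labeled documents (1 = spam, 0 = ham).
docs <- data.frame(
  id    = sprintf("doc%02d", 1:10),
  label = c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
)

set.seed(42)  # make the random split reproducible

# Hold out 30% of the rows as the test set; the rest is used for training.
test_idx <- sample(nrow(docs), size = round(0.3 * nrow(docs)))
train <- docs[-test_idx, ]
test  <- docs[test_idx, ]

nrow(train)  # 7
nrow(test)   # 3
```

The test rows are never shown to the model during training, so predictions on them give an honest estimate of how well new e-mails would be classified.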
library(RCurl)
library(XML)
library(stringr)
library(tm)
length(list.files("spam_2"))
## [1] 1397
list.files("spam_2")[1:3]
## [1] "00001.317e78fa8ee2f54cd4890fdc09ba8176"
## [2] "00002.9438920e9a55591b18e60d1ed37d992b"
## [3] "00003.590eff932f8704d8b0fcbe69d023b54d"
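Tried to rename the spam files. This did not work: list.files() without a path searches the working directory rather than spam_2, pattern is a regular expression (so "0*." matches any file name), and 1:1396 supplies one name fewer than the 1397 files.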
file.rename(list.files(pattern="0*."), paste0("", 1:1396))
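A rename that does work could look like the sketch below. It runs in a throwaway temporary directory with three made-up file names, so nothing in spam_2 is touched; the directory, the file names, and the spam_ prefix are all assumptions for illustration.

```r
# Work in a fresh temporary directory so no real data is touched.
demo_dir <- tempfile("rename_demo_")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("00001.aaa", "00002.bbb", "00003.ccc"))))

# full.names = TRUE returns paths that include the directory.
old_paths <- list.files(demo_dir, full.names = TRUE)

# Build exactly one new name per existing file.
new_paths <- file.path(demo_dir, paste0("spam_", seq_along(old_paths)))

# file.rename() returns one TRUE/FALSE per file.
renamed <- file.rename(old_paths, new_paths)
list.files(demo_dir)
```

Using seq_along(old_paths) instead of a hard-coded range keeps the two vectors the same length however many files the folder holds.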
Look at the file format of one spam email
file.info("spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176")
## size isdir mode
## spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176 4721 FALSE 644
## mtime
## spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176 2003-02-28 05:58:07
## ctime
## spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176 2017-11-04 19:43:02
## atime uid gid
## spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176 2017-11-07 08:15:14 501 20
## uname grname
## spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176 emiliembolduc staff
spam1 <- readLines("spam_2/00001.317e78fa8ee2f54cd4890fdc09ba8176")
spam1 <- str_c(spam1, collapse = "")  # note: "" joins lines with no separator, so words at line ends run together
head(spam1)
## [1] "From ilug-admin@linux.ie Tue Aug 6 11:51:02 2002Return-Path: <ilug-admin@linux.ie>Delivered-To: yyyy@localhost.netnoteinc.comReceived: from localhost (localhost [127.0.0.1])\tby phobos.labs.netnoteinc.com (Postfix) with ESMTP id 9E1F5441DD\tfor <jm@localhost>; Tue, 6 Aug 2002 06:48:09 -0400 (EDT)Received: from phobos [127.0.0.1]\tby localhost with IMAP (fetchmail-5.9.0)\tfor jm@localhost (single-drop); Tue, 06 Aug 2002 11:48:09 +0100 (IST)Received: from lugh.tuatha.org (root@lugh.tuatha.org [194.125.145.45]) by dogma.slashnull.org (8.11.6/8.11.6) with ESMTP id g72LqWv13294 for <jm-ilug@jmason.org>; Fri, 2 Aug 2002 22:52:32 +0100Received: from lugh (root@localhost [127.0.0.1]) by lugh.tuatha.org (8.9.3/8.9.3) with ESMTP id WAA31224; Fri, 2 Aug 2002 22:50:17 +0100Received: from bettyjagessar.com (w142.z064000057.nyc-ny.dsl.cnc.net [64.0.57.142]) by lugh.tuatha.org (8.9.3/8.9.3) with ESMTP id WAA31201 for <ilug@linux.ie>; Fri, 2 Aug 2002 22:50:11 +0100X-Authentication-Warning: lugh.tuatha.org: Host w142.z064000057.nyc-ny.dsl.cnc.net [64.0.57.142] claimed to be bettyjagessar.comReceived: from 64.0.57.142 [202.63.165.34] by bettyjagessar.com (SMTPD32-7.06 EVAL) id A42A7FC01F2; Fri, 02 Aug 2002 02:18:18 -0400Message-Id: <1028311679.886@0.57.142>Date: Fri, 02 Aug 2002 23:37:59 0530To: ilug@linux.ieFrom: \"Start Now\" <startnow2002@hotmail.com>MIME-Version: 1.0Content-Type: text/plain; charset=\"US-ASCII\"; format=flowedSubject: [ILUG] STOP THE MLM INSANITYSender: ilug-admin@linux.ieErrors-To: ilug-admin@linux.ieX-Mailman-Version: 1.1Precedence: bulkList-Id: Irish Linux Users' Group <ilug.linux.ie>X-Beenthere: ilug@linux.ieGreetings!You are receiving this letter because you have expressed an interest in receiving information about online business opportunities. If this is erroneous then please accept my most sincere apology. 
This is a one-time mailing, so no removal is necessary.If you've been burned, betrayed, and back-stabbed by multi-level marketing, MLM, then please read this letter. It could be the most important one that has ever landed in your Inbox.MULTI-LEVEL MARKETING IS A HUGE MISTAKE FOR MOST PEOPLEMLM has failed to deliver on its promises for the past 50 years. The pursuit of the \"MLM Dream\" has cost hundreds of thousands of people their friends, their fortunes and their sacred honor. The fact is that MLM is fatally flawed, meaning that it CANNOT work for most people.The companies and the few who earn the big money in MLM are NOT going to tell you the real story. FINALLY, there is someone who has the courage to cut through the hype and lies and tell the TRUTH about MLM.HERE'S GOOD NEWSThere IS an alternative to MLM that WORKS, and works BIG! If you haven't yet abandoned your dreams, then you need to see this. Earning the kind of income you've dreamed about is easier than you think!With your permission, I'd like to send you a brief letter that will tell you WHY MLM doesn't work for most people and will then introduce you to something so new and refreshing that you'll wonder why you haven't heard of this before.I promise that there will be NO unwanted follow up, NO sales pitch, no one will call you, and your email address will only be used to send you the information. Period.To receive this free, life-changing information, simply click Reply, type \"Send Info\" in the Subject box and hit Send. I'll get the information to you within 24 hours. Just look for the words MLM WALL OF SHAME in your Inbox.Cordially,SiddhiP.S. Someone recently sent the letter to me and it has been the most eye-opening, financially beneficial information I have ever received. I honestly believe that you will feel the same way once you've read it. And it's FREE!------------------------------------------------------------This email is NEVER sent unsolicited. THIS IS NOT \"SPAM\". 
You are receiving this email because you EXPLICITLY signed yourself up to our list with our online signup form or through use of our FFA Links Page and E-MailDOM systems, which have EXPLICIT terms of use which state that through its use you agree to receive our emailings. You may also be a member of a Altra Computer Systems list or one of many numerous FREE Marketing Services and as such you agreed when you signed up for such list that you would also be receiving this emailing.Due to the above, this email message cannot be considered unsolicitated, or spam.------------------------------------------------------------- Irish Linux Users' Group: ilug@linux.iehttp://www.linux.ie/mailman/listinfo/ilug for (un)subscription information.List maintainer: listmaster@linux.ie"
spam1_corpus <- Corpus(VectorSource(spam1))
spam1_corpus[[1]]
## <<PlainTextDocument>>
## Metadata: 7
## Content: chars: 4613
meta(spam1_corpus[[1]])
## author : character(0)
## datetimestamp: 2017-11-07 13:37:18
## description : character(0)
## heading : character(0)
## id : 1
## language : en
## origin : character(0)
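The message printed above starts with a long block of mail headers, and in this e-mail format the first empty line separates the headers from the body. A minimal sketch of splitting the two, on a made-up message (the vector msg and its contents are hypothetical, and the code assumes the message contains at least one empty line):

```r
# Hypothetical miniature message in the same layout as the corpus files.
msg <- c(
  "From sender@example.com Tue Aug  6 11:51:02 2002",
  "Subject: Example subject",
  "Content-Type: text/plain",
  "",
  "Greetings!",
  "This is the body of the message."
)

# The first empty line marks the end of the header block.
blank <- which(msg == "")[1]

headers <- msg[seq_len(blank - 1)]
body    <- msg[-seq_len(blank)]

# Join body lines with a space so words at line ends stay separate.
body_text <- paste(body, collapse = " ")
body_text
```

Whether to drop the headers is a modeling choice: they add routing tokens to the term counts, but some header fields may themselves help separate spam from ham.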
file.list <- list.files("spam_2")  # pattern expects a regular expression, not a glob such as "*.*"; no pattern is needed to list every file
head(file.list)
## [1] "00001.317e78fa8ee2f54cd4890fdc09ba8176"
## [2] "00002.9438920e9a55591b18e60d1ed37d992b"
## [3] "00003.590eff932f8704d8b0fcbe69d023b54d"
## [4] "00004.bdcc075fa4beb5157b5dd6cd41d8887b"
## [5] "00005.ed0aba4d386c5e62bc737cf3f0ed9589"
## [6] "00006.3ca1f399ccda5d897fecb8c57669a283"
length(file.list)
## [1] 1397
# Read each file relative to the project directory instead of changing the working directory
spam.list <- sapply(file.list, function(f) readLines(file.path("spam_2", f)))
class(spam.list)
## [1] "list"
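Because spam.list holds one character vector of lines per e-mail, it may be worth collapsing each e-mail into a single string before building a corpus. The sketch below shows in base R what happens when such a list is converted with as.character(). Whether tm does exactly this internally depends on the installed version, so treat that as an assumption; it would, however, explain tokens such as tfor and tbi in the frequency tables further down. The list emails is a made-up stand-in for spam.list.

```r
# Hypothetical stand-in for spam.list: a named list with one character
# vector of lines per e-mail.
emails <- list(
  msg1 = c("Received: from localhost", "\tby phobos (Postfix)", "Hello"),
  msg2 = c("Subject: test", "", "Buy now")
)

# as.character() on a list deparses each element, so the result contains
# a literal c("...") wrapper and escaped control characters such as \t.
deparsed <- as.character(emails)
deparsed[1]

# Collapsing first gives one clean string per e-mail.
collapsed <- vapply(emails, paste, character(1), collapse = "\n")
collapsed[["msg2"]]
```

The collapsed vector can be passed to VectorSource() in place of the list.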
Remove numbers, punctuation characters, stop words, and reduce terms to stem words
# Lower-case the text, strip numbers, stop words, and punctuation, then stem and tidy whitespace
SpamAll_corpus <- Corpus(VectorSource(spam.list)) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeNumbers) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(removePunctuation) %>%
  tm_map(stemDocument) %>%
  tm_map(stripWhitespace)
Spam_tdm <- TermDocumentMatrix(SpamAll_corpus)
Spam_tdm
## <<TermDocumentMatrix (terms: 58946, documents: 1397)>>
## Non-/sparse entries: 273699/82073863
## Sparsity : 100%
## Maximal term length: 868
## Weighting : term frequency (tf)
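The summary above reports a sparsity of 100%, which is a rounded figure. The numbers can be checked with a little arithmetic; the sketch below uses only values copied from the printed summary.

```r
# Figures copied from the printed summary of Spam_tdm.
n_terms    <- 58946
n_docs     <- 1397
non_sparse <- 273699
sparse     <- 82073863

# Every term/document pair is one cell of the matrix.
n_cells <- n_terms * n_docs
n_cells == non_sparse + sparse   # TRUE

# Share of empty cells; the summary rounds this to a whole percent.
sparsity <- sparse / n_cells
round(100 * sparsity, 2)         # 99.67
```

So about 0.33% of the cells hold a count, which is typical for text data: most terms appear in only a handful of documents.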
Take a peek at the data…
Spam_matrix <- as.matrix(Spam_tdm)
Spam_matrix <- sort(rowSums(Spam_matrix), decreasing = TRUE)
Spam_df <- data.frame(word = names(Spam_matrix),freq=Spam_matrix)
head(Spam_df, 50)
## word freq
## receiv receiv 7196
## size size 5960
## jul jul 4382
## font font 3986
## widthd widthd 3556
## email email 3226
## esmtp esmtp 3139
## tabl tabl 2994
## width width 2899
## will will 2626
## tbi tbi 2577
## helvetica helvetica 2531
## may may 2419
## tfor tfor 2406
## mon mon 1961
## localhost localhost 1899
## facedari facedari 1882
## subject subject 1811
## free free 1783
## can can 1774
## sansserif sansserif 1773
## div div 1681
## mail mail 1661
## contenttyp contenttyp 1598
## color color 1566
## date date 1524
## faceari faceari 1518
## tue tue 1508
## height height 1463
## list list 1458
## jun jun 1451
## arial arial 1424
## get get 1399
## messageid messageid 1399
## wed wed 1380
## html html 1375
## thu thu 1336
## busi busi 1313
## aug aug 1288
## smtp smtp 1287
## bodi bodi 1269
## faceverdana faceverdana 1264
## heightd heightd 1251
## borderd borderd 1239
## new new 1216
## remov remov 1215
## pleas pleas 1213
## order order 1212
## dogmaslashnullorg dogmaslashnullorg 1200
## colord colord 1197
The truncated endings, such as “receiv” for “receive”, are not a clean-up error: stemDocument reduces each word to its stem, so “receive”, “received”, and “receiving” are all counted as “receiv”.
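The frequency table above was built by converting the whole term-document matrix to a dense matrix, which allocates all of its roughly 82 million cells. A lighter alternative is to sum the rows on the sparse object with row_sums() from the slam package, which is installed as a dependency of tm. The sketch below uses a made-up three-document corpus (toy_corpus is hypothetical) rather than the spam data.

```r
library(tm)
library(slam)

# Hypothetical toy corpus standing in for SpamAll_corpus.
toy_corpus <- VCorpus(VectorSource(c(
  "free offer free money",
  "meeting agenda attached",
  "free money now"
)))
toy_tdm <- TermDocumentMatrix(toy_corpus)

# row_sums() adds up each term's counts on the sparse representation,
# so no dense terms-by-documents matrix is allocated.
term_freq <- sort(row_sums(toy_tdm), decreasing = TRUE)
head(term_freq, 3)
```

The result is a named numeric vector, so it can be turned into a data frame with names(term_freq) and term_freq in the same way as before.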
spam_tdm1 <- Spam_tdm
spam_tdm1$Spam_Ham <- rep(1, nrow(Spam_tdm))  # 1 = spam; nrow() counts terms, so this stores one label per term
And make sure it works
Spam_matrix <- as.matrix(spam_tdm1)
Spam_matrix <- sort(rowSums(Spam_matrix), decreasing = TRUE)
Spam_df <- data.frame(Word = names(Spam_matrix), Frequency = Spam_matrix, Spam_Ham = spam_tdm1$Spam_Ham)
head(Spam_df, 30)
## Word Frequency Spam_Ham
## receiv receiv 7196 1
## size size 5960 1
## jul jul 4382 1
## font font 3986 1
## widthd widthd 3556 1
## email email 3226 1
## esmtp esmtp 3139 1
## tabl tabl 2994 1
## width width 2899 1
## will will 2626 1
## tbi tbi 2577 1
## helvetica helvetica 2531 1
## may may 2419 1
## tfor tfor 2406 1
## mon mon 1961 1
## localhost localhost 1899 1
## facedari facedari 1882 1
## subject subject 1811 1
## free free 1783 1
## can can 1774 1
## sansserif sansserif 1773 1
## div div 1681 1
## mail mail 1661 1
## contenttyp contenttyp 1598 1
## color color 1566 1
## date date 1524 1
## faceari faceari 1518 1
## tue tue 1508 1
## height height 1463 1
## list list 1458 1
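The label above is attached to terms, one value per row of the term-document matrix. A classifier usually needs one label per document instead. A minimal sketch of document-level labels, with made-up document counts and ids (n_spam, n_ham, and the ids are hypothetical):

```r
# Hypothetical document counts standing in for the two corpora.
n_spam <- 3
n_ham  <- 5

# One label per document: 1 = spam, 0 = ham.
labels <- c(rep(1, n_spam), rep(0, n_ham))

# Hypothetical document ids, to show how labels line up with documents.
doc_labels <- data.frame(
  doc   = c(paste0("spam_", seq_len(n_spam)), paste0("ham_", seq_len(n_ham))),
  label = labels
)
table(doc_labels$label)
```

With the real data the two counts would be the number of files in each folder, and the rows would follow the order in which the documents are combined.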
length(list.files("/Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham"))
## [1] 2501
list.files("/Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham")[1:3]
## [1] "00001.7c53336b37003a9286aba55d2945844c"
## [2] "00002.9c4069e25e1ef370c078db7ee85ff9ac"
## [3] "00003.860e3c3cee1b42ead714c5c874fe25f7"
Look at the file format of one Ham email
file.info("/Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c")
## size
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 5216
## isdir
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c FALSE
## mode
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 644
## mtime
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 2003-02-28 05:53:40
## ctime
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 2017-11-04 19:41:49
## atime
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 2017-11-07 08:24:27
## uid
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 501
## gid
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c 20
## uname
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c emiliembolduc
## grname
## /Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c staff
ham1 <- readLines("/Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham/00001.7c53336b37003a9286aba55d2945844c")
ham1 <- str_c(ham1, collapse = "")  # note: "" joins lines with no separator, so words at line ends run together
head(ham1)
## [1] "From exmh-workers-admin@redhat.com Thu Aug 22 12:36:23 2002Return-Path: <exmh-workers-admin@spamassassin.taint.org>Delivered-To: zzzz@localhost.netnoteinc.comReceived: from localhost (localhost [127.0.0.1])\tby phobos.labs.netnoteinc.com (Postfix) with ESMTP id D03E543C36\tfor <zzzz@localhost>; Thu, 22 Aug 2002 07:36:16 -0400 (EDT)Received: from phobos [127.0.0.1]\tby localhost with IMAP (fetchmail-5.9.0)\tfor zzzz@localhost (single-drop); Thu, 22 Aug 2002 12:36:16 +0100 (IST)Received: from listman.spamassassin.taint.org (listman.spamassassin.taint.org [66.187.233.211]) by dogma.slashnull.org (8.11.6/8.11.6) with ESMTP id g7MBYrZ04811 for <zzzz-exmh@spamassassin.taint.org>; Thu, 22 Aug 2002 12:34:53 +0100Received: from listman.spamassassin.taint.org (localhost.localdomain [127.0.0.1]) by listman.redhat.com (Postfix) with ESMTP id 8386540858; Thu, 22 Aug 2002 07:35:02 -0400 (EDT)Delivered-To: exmh-workers@listman.spamassassin.taint.orgReceived: from int-mx1.corp.spamassassin.taint.org (int-mx1.corp.spamassassin.taint.org [172.16.52.254]) by listman.redhat.com (Postfix) with ESMTP id 10CF8406D7 for <exmh-workers@listman.redhat.com>; Thu, 22 Aug 2002 07:34:10 -0400 (EDT)Received: (from mail@localhost) by int-mx1.corp.spamassassin.taint.org (8.11.6/8.11.6) id g7MBY7g11259 for exmh-workers@listman.redhat.com; Thu, 22 Aug 2002 07:34:07 -0400Received: from mx1.spamassassin.taint.org (mx1.spamassassin.taint.org [172.16.48.31]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with SMTP id g7MBY7Y11255 for <exmh-workers@redhat.com>; Thu, 22 Aug 2002 07:34:07 -0400Received: from ratree.psu.ac.th ([202.28.97.6]) by mx1.spamassassin.taint.org (8.11.6/8.11.6) with SMTP id g7MBIhl25223 for <exmh-workers@redhat.com>; Thu, 22 Aug 2002 07:18:55 -0400Received: from delta.cs.mu.OZ.AU (delta.coe.psu.ac.th [172.30.0.98]) by ratree.psu.ac.th (8.11.6/8.11.6) with ESMTP id g7MBWel29762; Thu, 22 Aug 2002 18:32:40 +0700 (ICT)Received: from munnari.OZ.AU (localhost [127.0.0.1]) by 
delta.cs.mu.OZ.AU (8.11.6/8.11.6) with ESMTP id g7MBQPW13260; Thu, 22 Aug 2002 18:26:25 +0700 (ICT)From: Robert Elz <kre@munnari.OZ.AU>To: Chris Garrigues <cwg-dated-1030377287.06fa6d@DeepEddy.Com>Cc: exmh-workers@spamassassin.taint.orgSubject: Re: New Sequences WindowIn-Reply-To: <1029945287.4797.TMDA@deepeddy.vircio.com>References: <1029945287.4797.TMDA@deepeddy.vircio.com> <1029882468.3116.TMDA@deepeddy.vircio.com> <9627.1029933001@munnari.OZ.AU> <1029943066.26919.TMDA@deepeddy.vircio.com> <1029944441.398.TMDA@deepeddy.vircio.com>MIME-Version: 1.0Content-Type: text/plain; charset=us-asciiMessage-Id: <13258.1030015585@munnari.OZ.AU>X-Loop: exmh-workers@spamassassin.taint.orgSender: exmh-workers-admin@spamassassin.taint.orgErrors-To: exmh-workers-admin@spamassassin.taint.orgX-Beenthere: exmh-workers@spamassassin.taint.orgX-Mailman-Version: 2.0.1Precedence: bulkList-Help: <mailto:exmh-workers-request@spamassassin.taint.org?subject=help>List-Post: <mailto:exmh-workers@spamassassin.taint.org>List-Subscribe: <https://listman.spamassassin.taint.org/mailman/listinfo/exmh-workers>, <mailto:exmh-workers-request@redhat.com?subject=subscribe>List-Id: Discussion list for EXMH developers <exmh-workers.spamassassin.taint.org>List-Unsubscribe: <https://listman.spamassassin.taint.org/mailman/listinfo/exmh-workers>, <mailto:exmh-workers-request@redhat.com?subject=unsubscribe>List-Archive: <https://listman.spamassassin.taint.org/mailman/private/exmh-workers/>Date: Thu, 22 Aug 2002 18:26:25 +0700 Date: Wed, 21 Aug 2002 10:54:46 -0500 From: Chris Garrigues <cwg-dated-1030377287.06fa6d@DeepEddy.Com> Message-ID: <1029945287.4797.TMDA@deepeddy.vircio.com> | I can't reproduce this error.For me it is very repeatable... 
(like every time, without fail).This is the debug log of the pick happening ...18:19:03 Pick_It {exec pick +inbox -list -lbrace -lbrace -subject ftp -rbrace -rbrace} {4852-4852 -sequence mercury}18:19:03 exec pick +inbox -list -lbrace -lbrace -subject ftp -rbrace -rbrace 4852-4852 -sequence mercury18:19:04 Ftoc_PickMsgs {{1 hit}}18:19:04 Marking 1 hits18:19:04 tkerror: syntax error in expression \"int ...Note, if I run the pick command by hand ...delta$ pick +inbox -list -lbrace -lbrace -subject ftp -rbrace -rbrace 4852-4852 -sequence mercury1 hitThat's where the \"1 hit\" comes from (obviously). The version of nmh I'musing is ...delta$ pick -versionpick -- nmh-1.0.4 [compiled on fuchsia.cs.mu.OZ.AU at Sun Mar 17 14:55:56 ICT 2002]And the relevant part of my .mh_profile ...delta$ mhparam pick-seq sel -listSince the pick command works, the sequence (actually, both of them, theone that's explicit on the command line, from the search popup, and theone that comes from .mh_profile) do get created.kreps: this is still using the version of the code form a day ago, I haven'tbeen able to reach the cvs repository today (local routing issue I think)._______________________________________________Exmh-workers mailing listExmh-workers@redhat.comhttps://listman.redhat.com/mailman/listinfo/exmh-workers"
ham1_corpus <- Corpus(VectorSource(ham1))
ham1_corpus[[1]]
## <<PlainTextDocument>>
## Metadata: 7
## Content: chars: 5103
meta(ham1_corpus[[1]])
## author : character(0)
## datetimestamp: 2017-11-07 13:37:35
## description : character(0)
## heading : character(0)
## id : 1
## language : en
## origin : character(0)
hamfile.list <- list.files("/Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham")  # no pattern is needed to list every file
head(hamfile.list)
## [1] "00001.7c53336b37003a9286aba55d2945844c"
## [2] "00002.9c4069e25e1ef370c078db7ee85ff9ac"
## [3] "00003.860e3c3cee1b42ead714c5c874fe25f7"
## [4] "00004.864220c5b6930b209cc287c361c99af1"
## [5] "00005.bf27cdeaf0b8c4647ecd61b1d09da613"
## [6] "00006.253ea2f9a9cc36fa0b1129b04b806608"
length(hamfile.list)
## [1] 2501
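# Read each file by full path instead of changing the working directory
ham_dir <- "/Users/emiliembolduc/Week 10 - Text Mining/Project 4/easy_ham"
ham.list <- sapply(hamfile.list, function(f) readLines(file.path(ham_dir, f)))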
class(ham.list)
## [1] "list"
Remove numbers, punctuation characters, stop words, and reduce terms to stem words
# Same cleaning steps as for the spam corpus
HamAll_corpus <- Corpus(VectorSource(ham.list)) %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removeNumbers) %>%
  tm_map(removeWords, stopwords("english")) %>%
  tm_map(removePunctuation) %>%
  tm_map(stemDocument) %>%
  tm_map(stripWhitespace)
Ham_tdm <- TermDocumentMatrix(HamAll_corpus)
Ham_tdm
## <<TermDocumentMatrix (terms: 37752, documents: 2501)>>
## Non-/sparse entries: 353793/94063959
## Sparsity : 100%
## Maximal term length: 265
## Weighting : term frequency (tf)
Take a peek at the data…
Ham_matrix <- as.matrix(Ham_tdm)
Ham_matrix <- sort(rowSums(Ham_matrix), decreasing = TRUE)
Ham_df <- data.frame(word = names(Ham_matrix), freq = Ham_matrix)
head(Ham_df, 30)
## word freq
## receiv receiv 14230
## sep sep 9788
## esmtp esmtp 8406
## localhost localhost 7347
## oct oct 5251
## tbi tbi 4728
## tfor tfor 4723
## postfix postfix 4661
## aug aug 4476
## ist ist 4224
## jmlocalhost jmlocalhost 4144
## mon mon 4035
## wed wed 3840
## thu thu 3837
## jalapeno jalapeno 3705
## deliv deliv 3536
## date date 3410
## dogmaslashnullorg dogmaslashnullorg 3048
## tue tue 3002
## subject subject 2898
## forkadminxentcom forkadminxentcom 2743
## messageid messageid 2540
## use use 2454
## imap imap 2378
## fetchmail fetchmail 2375
## returnpath returnpath 2369
## singledrop singledrop 2358
## contenttyp contenttyp 2341
## fri fri 2336
## list list 2267
Again, the truncated endings, such as “receiv”, come from the stemDocument step, which reduces words to their stems. To keep whole words, leave stemDocument out of the pipeline.
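The effect of stemming can be seen on a few words in isolation. The sketch below calls wordStem() from the SnowballC package, the stemmer that tm's stemDocument relies on; the word list is made up for illustration.

```r
library(SnowballC)

words <- c("receive", "received", "receiving")

# wordStem() reduces each word to its stem.
stems <- wordStem(words, language = "english")
data.frame(word = words, stem = stems)
```

All three forms map to the same stem, which is why a single row such as receiv collects the counts of several related words.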
Ham_tdm1 <- Ham_tdm
Ham_tdm1$Spam_Ham <- rep(0, nrow(Ham_tdm))  # 0 = ham; nrow() counts terms, so this stores one label per term
And make sure it works
Ham_matrix <- as.matrix(Ham_tdm1)
Ham_matrix <- sort(rowSums(Ham_matrix), decreasing = TRUE)
Ham_df <- data.frame(Word = names(Ham_matrix), Frequency = Ham_matrix, Spam_Ham = Ham_tdm1$Spam_Ham)
head(Ham_df, 30)
## Word Frequency Spam_Ham
## receiv receiv 14230 0
## sep sep 9788 0
## esmtp esmtp 8406 0
## localhost localhost 7347 0
## oct oct 5251 0
## tbi tbi 4728 0
## tfor tfor 4723 0
## postfix postfix 4661 0
## aug aug 4476 0
## ist ist 4224 0
## jmlocalhost jmlocalhost 4144 0
## mon mon 4035 0
## wed wed 3840 0
## thu thu 3837 0
## jalapeno jalapeno 3705 0
## deliv deliv 3536 0
## date date 3410 0
## dogmaslashnullorg dogmaslashnullorg 3048 0
## tue tue 3002 0
## subject subject 2898 0
## forkadminxentcom forkadminxentcom 2743 0
## messageid messageid 2540 0
## use use 2454 0
## imap imap 2378 0
## fetchmail fetchmail 2375 0
## returnpath returnpath 2369 0
## singledrop singledrop 2358 0
## contenttyp contenttyp 2341 0
## fri fri 2336 0
## list list 2267 0
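The two frequency tables cannot be compared directly, because the ham folder holds 2501 documents and the spam folder 1397. One way to put them side by side is to merge on the word and divide by the number of documents in each class. The sketch below uses miniature made-up tables (spam_df and ham_df are hypothetical); most counts are copied from the output above, but the ham count for free is invented for illustration.

```r
# Hypothetical miniature versions of Spam_df and Ham_df.
# The ham count for "free" is made up; the other counts appear above.
spam_df <- data.frame(Word = c("free", "receiv", "font"),
                      Frequency = c(1783, 7196, 3986))
ham_df  <- data.frame(Word = c("free", "receiv", "postfix"),
                      Frequency = c(200, 14230, 4661))

# Document counts reported earlier for spam_2 and easy_ham.
n_spam <- 1397
n_ham  <- 2501

# Keep every word from either table; a word missing from one class
# gets NA, which is then replaced by a count of 0.
both <- merge(spam_df, ham_df, by = "Word", all = TRUE,
              suffixes = c("_spam", "_ham"))
both[is.na(both)] <- 0

# Average occurrences per document, so class size does not distort
# the comparison.
both$rate_spam <- both$Frequency_spam / n_spam
both$rate_ham  <- both$Frequency_ham / n_ham
both[order(-both$rate_spam), ]
```

Words with a much higher rate in one class than in the other are the ones most likely to help a classifier tell spam from ham.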