Analyzing Words in Spam Emails

We recently analyzed our Bayesian spam filter corpus (the SpamAssassin token database) and came up with a list of words with a high spam/ham ratio.

By ranking on the spam/ham ratio rather than the raw spam count, we came up with a better list of words to avoid. Most lists would have you avoid words like "click" and "here", but those words are used so often in legitimate email that they have a low spam/ham ratio.
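To make the difference between the two rankings concrete, here is a rough sketch in Python. It is not the code we ran against the token database. The counts are invented for illustration, the function names are hypothetical, and the add-one smoothing and minimum-count cutoff are assumptions added so that rare tokens and zero ham counts do not distort the ratio.

```python
def spam_ham_ratio(spam_count, ham_count, smoothing=1.0):
    """Ratio of spam to ham occurrences for one token.

    The smoothing term is an assumption of this sketch: it avoids
    division by zero when a token never appears in ham.
    """
    return (spam_count + smoothing) / (ham_count + smoothing)


def rank_by_ratio(counts, min_total=5):
    """Return (token, ratio) pairs sorted from most to least spammy.

    counts maps token -> (spam_count, ham_count). Tokens seen fewer
    than min_total times are skipped as too rare to judge.
    """
    rows = []
    for token, (spam, ham) in counts.items():
        if spam + ham < min_total:
            continue
        rows.append((token, spam_ham_ratio(spam, ham)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def rank_by_spam_count(counts):
    """Return tokens sorted by raw spam count, highest first."""
    return sorted(counts, key=lambda token: counts[token][0], reverse=True)


# Invented example counts: token -> (spam_count, ham_count).
counts = {
    "click": (900, 800),
    "here": (950, 900),
    "refinance": (400, 1),
    "meeting": (10, 600),
}

by_count = rank_by_spam_count(counts)
by_ratio = rank_by_ratio(counts)

print("By spam count:", by_count)
print("By spam/ham ratio:", [token for token, _ in by_ratio])
```

With these made-up numbers, the raw spam count puts "here" and "click" at the top, while the ratio puts "refinance" first and pushes "click" and "here" down to a ratio close to 1, meaning they appear about as often in ham as in spam.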


