Detection of Internet Scam Using Logistic Regression

An Internet scam is fraudulent or intentionally misleading information posted on the web, usually with the intent of tricking people into sending money or disclosing sensitive information. We describe an application of logistic regression to the detection of Internet scams. The system automatically collects 43 characteristic statistics about a website from 11 online sources and computes the probability that the website is malicious. An empirical evaluation shows that its precision and recall are both about 98%.
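
To make the approach concrete, the following is a minimal sketch of how a logistic regression model turns a vector of website statistics into a probability of being malicious, and how precision and recall are computed from labeled predictions. It is an illustration only, not the system's actual implementation: the function names, the weights, the bias, and any feature values passed in are hypothetical, and the 43 statistics themselves are not listed here.

```python
import math


def scam_probability(features, weights, bias):
    """Return P(malicious) = sigmoid(bias + weights . features).

    `features` would hold the numeric website statistics (43 in the
    described system); `weights` and `bias` are hypothetical learned
    parameters, assumed here for illustration.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable logistic function: avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def precision_recall(predicted, actual):
    """Compute (precision, recall) for binary labels, 1 = malicious."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)       # true positives
    fp = sum(1 for p, a in pairs if p and not a)   # false positives
    fn = sum(1 for p, a in pairs if not p and a)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A website would then be flagged as a scam when `scam_probability(...)` exceeds a chosen threshold such as 0.5; the threshold is likewise an assumption, and moving it trades precision against recall.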