| Subject: | Re: Machine Learning for IDS: which dataset? |
|---|---|
| Date: | Mon, 19 Jun 2006 20:15:39 +0200 |
J.A. wrote:
> I am using the KDD-99 dataset in my research work. Though it is the most well-known dataset, it has several drawbacks that limit what you can do with it. As an example, and as you note, the distribution of normal data and attack data does not represent a real network.
You may also add that some of the header fields have regularities and markers introduced by the generation mechanism (the dataset is artificial), and that the attack types were limited even in '99. All in all, that dataset is of very limited use nowadays.
> I think that a better dataset is the original one used to generate the KDD-99 dataset. It can be obtained from www.ll.mit.edu.
What good would this be? Anomalies, artifacts and aged attacks are present in the original dataset as well. The only way forward is to generate new datasets with repeatable, scientific approaches and start from there.

Stefano
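The two problems raised in this thread, a skewed normal/attack ratio and header fields that act as markers of the generation mechanism, can both be checked with a few lines of code before trusting any dataset. What follows is only a minimal sketch in Python. The sample rows are invented for illustration and are not taken from KDD-99; the column layout (a few features, label last, label ending in a dot) is an assumption about the record format, and `class_balance` and `field_purity` are hypothetical helper names.

```python
from collections import Counter, defaultdict
import csv
import io

# Toy rows in an assumed KDD-99-like layout: feature columns, then the label.
# The values are invented for illustration; they are NOT real dataset records.
SAMPLE = """\
tcp,http,SF,215,normal.
tcp,http,SF,162,normal.
icmp,ecr_i,SF,1032,smurf.
icmp,ecr_i,SF,1032,smurf.
icmp,ecr_i,SF,1032,smurf.
tcp,private,S0,0,neptune.
"""


def read_rows(text):
    """Parse comma-separated records; the last column is taken as the label."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if record:
            *features, label = record
            # Assumption: labels carry a trailing dot, which we strip.
            rows.append((features, label.rstrip(".")))
    return rows


def class_balance(rows):
    """Return the fraction of records carrying each label."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def field_purity(rows, col):
    """Share of records whose label matches the majority label among
    records with the same value in feature column `col`.

    A value of 1.0 means this single field predicts the label perfectly
    on this data, which may point to a generation artifact.
    """
    by_value = defaultdict(Counter)
    for features, label in rows:
        by_value[features[col]][label] += 1
    majority = sum(max(c.values()) for c in by_value.values())
    return majority / len(rows)


rows = read_rows(SAMPLE)
balance = class_balance(rows)    # per-label fractions
purity = field_purity(rows, 0)   # how well column 0 alone separates labels
```

A purity close to 1.0 is only a hint, not proof of an artifact: a field with many distinct values (a byte count, for instance) will score high simply because each value occurs once, so the check is most telling on low-cardinality header fields.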