Network Security Focus-IDS

Applying data mining to Intrusion Detection System

Subject: Applying data mining to Intrusion Detection System
Date: 16 Jul 2005 12:33:07 -0000
Hi all,
I am a newbie in network security. I have looked at a website about KDD 99 
(http://www-cse.ucsd.edu/users/elkan/clresults.html) and found it very 
interesting. 
I would like to try the dataset and mine it with some data mining tools. 
However, I am having a few problems.


1. The data I downloaded from 
(http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html):

kddcup.data.gz The full data set (18M; 743M Uncompressed) -> I need the output 
(each record classified as normal or an intrusion) so that supervised learning 
can be done. This file is too big, so I can't even open it to see whether it 
contains the output. 

kddcup.data_10_percent.gz A 10% subset. (2.1M; 75M Uncompressed) -> is this 10% 
extracted from the full data set above?

kddcup.newtestdata_10_percent_unlabeled.gz (1.4M; 45M Uncompressed) -> is it 
true that this test data is not extracted from the training data (743M)?

kddcup.testdata.unlabeled.gz (11.2M; 430M Uncompressed) -> is this test data 
the same as the test data above? If not, how do they differ?

kddcup.testdata.unlabeled_10_percent.gz (1.4M; 45M Uncompressed) 

corrected.gz Test data with corrected labels. 

I see so many test sets and have no clue which one to use.
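
On the point about kddcup.data.gz being too big to open: the file does not 
have to be loaded into an editor to check for labels. The following is a 
minimal sketch in Python that streams the gzip archive line by line and counts 
the values in the last field. It assumes each record is one comma-separated 
line whose last field is the label with a trailing period (for example 
"normal."); that layout is an assumption to verify against the first few 
lines of the real file. The function name peek_labels is made up for this 
example.

```python
import gzip
from collections import Counter
from itertools import islice


def peek_labels(path, limit=None):
    """Count the values of the last comma-separated field in a gzip file.

    The file is streamed, so memory use stays small even for the full set.
    limit caps the number of lines read; None reads the whole file.
    """
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in islice(handle, limit):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumption: the label is the last field and ends with "."
            label = line.rsplit(",", 1)[-1].rstrip(".")
            counts[label] += 1
    return counts


# Example call (file name taken from the list above):
# print(peek_labels("kddcup.data_10_percent.gz", limit=1000))
```

If the counts show only a handful of distinct values such as "normal" and 
attack names, the file carries the output needed for supervised learning. If 
the last field looks numeric instead, the file is most likely unlabeled.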

2. What tool would you recommend for mining this data?

3. How can I run the scoring script at 
http://www-cse.ucsd.edu/users/elkan/awkscript.html ?
I don't know how to evaluate my model after I finish training. Do I have to send 
my model to the committee to have it evaluated, or can I just run 
the script myself? What I really want is to evaluate my model the way 
described in http://www-cse.ucsd.edu/users/elkan/clresults.html 
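
To make the evaluation question concrete, here is a rough sketch in Python of 
scoring predictions locally. It assumes the evaluation on the results page 
amounts to a confusion matrix of actual versus predicted classes, weighted by 
a cost matrix and averaged over the test examples. That reading is an 
assumption, and this is not the awk script itself. The function names and the 
two-class cost values are hypothetical; the real classes and costs would have 
to be copied from the results page.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def average_cost(matrix, costs):
    """Mean cost per example.

    costs maps (actual, predicted) -> cost; pairs missing from costs
    (including correct predictions) are treated as cost 0.
    """
    total = sum(matrix.values())
    if total == 0:
        raise ValueError("no examples to score")
    weighted = sum(n * costs.get(pair, 0) for pair, n in matrix.items())
    return weighted / total


# Toy data with made-up costs, for illustration only.
actual = ["normal", "normal", "attack", "attack"]
predicted = ["normal", "attack", "attack", "normal"]
toy_costs = {("normal", "attack"): 1, ("attack", "normal"): 2}

matrix = confusion_matrix(actual, predicted)
print(average_cost(matrix, toy_costs))  # (1 + 2) / 4 = 0.75
```

Here the actual labels would come from a labeled test file such as 
corrected.gz, and the predicted labels from the trained model run on the same 
records in the same order.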
 

Could anyone please give me some advice about this?
Thanks
Have a nice day
Patrick Tran

