Solution
Task Title: Data Analytics for intrusion detection
Subject code: MN623, Assignment 2
Objective: The objective of this assignment is to perform data analysis for intrusion detection using Weka, a popular data mining tool.
Overview: The assignment comprises three sections covering the implementation and deployment of data analytics along with several analytic strategies. Bro files must be converted into flows using IPFIX tools. Students are required to prepare a full report focusing on the intrusion detection system.
University: Melbourne Institute of Technology
Requirement of tool: Weka
Deliverables of task
Report introduction: An introduction to the report, including the main points covered in it.
Section 1: This includes the tools and techniques of data analysis along with the installation and deployment of a data analytics platform. The different steps of working with the tool must be included for the demonstration.
Section 2: This is the evaluation and penetration testing section, in which students are required to list the selected files along with the attacks covered in the dataset. Visualizations of the different attacks must be provided in this section.
Section 3: This includes data analytics for network intrusion detection and working with csv files.
Conclusion and future related works: Future related works and contributions must be included in this section.
Sample output
Suggested modifications
The overall content of the report needs to be revised so that every section demonstrates proper understanding. It is recommended to include different screenshots for the testing and comparison of tools.
Comments of experts
Most of the time, the main difficulties students face are operating the Weka software and adding screenshots to the assignment. With the premium support, these problems can be resolved easily.
Prepared by: Dr Ammar Alazab and Dr Ghassan Kbar
Moderated by: Dr Farshid Hajati
August, 2019
Assessment Details and Submission Guidelines
Trimester: T2, 2019
Unit Code: MN623
Unit Title: Cyber Security and Analytics
Assessment Type: Group assignment (Assignment 2)
Assessment Title: Data analytics for intrusion detection
Purpose of the assessment (with ULO Mapping): This assignment assesses the following Unit Learning Outcomes; students should be able to demonstrate their achievements in them.
c) Evaluate intelligent security solutions based on data analytics
d) Analyse and interpret results from descriptive and predictive data analysis
Weight: 15%
Total Marks: 100
Word limit: 1200-1500 words
Due Date: 11:55 PM, Week 11 (25th of September 2019)
Submission Guidelines:
• All work must be submitted on Moodle by the due date along with a completed Assignment Cover Page.
• The assignment must be in MS Word format, 1.5 spacing, 11-pt Calibri (Body) font and 2 cm margins on all four sides of your page with appropriate section headings.
• Reference sources must be cited in the text of the report, and listed appropriately at the end in a reference list using IEEE referencing style.
Extension:
• If an extension of time to submit work is required, a Special Consideration Application must be submitted directly to the School's Administration Officer, on academic reception level. You must submit this application within three working days of the assessment due date. Further information is available at: http://www.mit.edu.au/about-mit/institute-publications/policies-procedures-and-guidelines/specialconsiderationdeferment
Academic Misconduct:
• Academic Misconduct is a serious offence. Depending on the seriousness of the case, penalties can vary from a written warning or zero marks to exclusion from the course or rescinding the degree. Students should make themselves familiar with the full policy and procedure available at: http://www.mit.edu.au/about-mit/institute-publications/policies-procedures-and-guidelines/Plagiarism-Academic-Misconduct-Policy-Procedure. For further information, please refer to the Academic Integrity Section in your Unit Description.
Assignment Overview
For this assignment, you will analyse and evaluate one of the publicly available Network Intrusion datasets given in Table 1.
Your task is to complete and write a research report based on the following:
1- Discuss all the attacks on your selected public intrusion dataset.
2- Perform intrusion detection using the available data analytic techniques using WEKA or other platforms.
3- In consultation with your lecturer, choose at least three data analytic techniques for network intrusion detection and prepare a technical report. In the report, evaluate the performance of data analytic techniques in intrusion detection using comparative analysis.
4- Recommend the security solution using the selected data analytic technique.
Follow the marking guide to prepare your report.
Dataset | Attacks | References/download
UNSW-NB15 | analysis, backdoors, DoS, exploits, fuzzers, generic, reconnaissance, shellcode, worms | https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/
NSL-KDD | DoS, remote-to-local, user-to-root, probing | https://www.unb.ca/cic/datasets/nsl.html
KDD CUP 99 | DoS, remote-to-local, user-to-root, probing | http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
CIC DoS | Application layer DoS attacks (executed through ddossim, Goldeneye, hulk, RUDY, Slowhttptest, Slowloris) | https://www.unb.ca/cic/datasets/dos-dataset.html
Table 1 Network Intrusion Dataset
Section 1: Data Analytic Tools and Techniques
In this section, your task is to complete and write a report on the following:
1. Install/deploy the data analytic platform of your choice (on Win8 VM on VirtualBox).
2. Demonstrate the use of at least two data analytic techniques (e.g. decision tree, clustering or other techniques). You are free to use any sample testing data to demonstrate your skills and knowledge.
3. Lab demonstration: You must explain how each tool/technique works in your lab prior to week 11. The data can be anything, including the Iris dataset.
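As an illustration of the clustering technique named in task 2, the following is a minimal sketch of k-means in plain Python. It is not part of the assignment brief: the function name `kmeans`, the sample points, and the choice of the first k points as initial centroids are assumptions made for illustration. In the lab, a platform such as Weka (for example its SimpleKMeans clusterer) would normally do this work.

```python
import math

def kmeans(points, k, max_iter=100):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm.

    Assumption: the first k points serve as initial centroids so the result
    is deterministic; real tools usually seed randomly or with k-means++.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments

# Hypothetical sample data: two well-separated groups of 2-D measurements.
sample = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.2), (5.2, 4.8), (4.8, 5.2)]
centroids, labels = kmeans(sample, k=2)
print(labels)
print(centroids)
```

When explaining the technique in the lab, the two commented steps (assignment and update) are the parts worth walking through, since they repeat until the cluster labels stop changing.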
Section 2: Evaluation of the Penetration Test (PT) of the given Dataset of UNSW in Table 1
1. Select example csv, pcap and Bro files from the UNSW dataset to evaluate the result of the penetration test as explained below.
2. For the csv files, generate statistics to identify the total number of attacks related to DoS, exploits, generic, reconnaissance, shellcode, and worms. Display the result in a graph and show the percentage of attacks compared to normal traffic. (You need to submit the Excel csv file you analysed with your report.)
3. Use Wireshark to open the cap file and generate a report with different statistics related to:
Resolved addresses
DNS, HTTP
Packet length
TCP throughput
4. Use the Bro file to analyse the results and write a report on the type of traffic generated. Then convert the Bro logs to flows: convert the Bro logs into IPFIX (using an IPFIX utility) by defining your own elements and templates, then create a Bro report using filtering and thresholds to watch for specific events or patterns.
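The csv statistics asked for in step 2 could be produced along the following lines. This is only a sketch under stated assumptions: the column name `attack_cat`, the `Normal` label, the treatment of blank categories as normal traffic, and the sample rows are all hypothetical, so check the header of the csv file you actually download. The resulting counts can then be charted in Excel or any plotting tool.

```python
import csv
import io
from collections import Counter

def attack_statistics(csv_file, category_column="attack_cat", normal_label="Normal"):
    """Count records per category and return (counts, percentages).

    Assumption: each row has a category column holding the attack name,
    with normal traffic marked by `normal_label` or left blank.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        category = (row.get(category_column) or "").strip() or normal_label
        counts[category] += 1
    total = sum(counts.values())
    # Share of each category in all records, in percent.
    percentages = {cat: 100.0 * n / total for cat, n in counts.items()} if total else {}
    return counts, percentages

# Hypothetical miniature extract; a real run would open the downloaded csv file.
sample_csv = io.StringIO(
    "srcip,dstip,attack_cat\n"
    "10.0.0.1,10.0.0.9,Normal\n"
    "10.0.0.2,10.0.0.9,DoS\n"
    "10.0.0.3,10.0.0.9,Exploits\n"
    "10.0.0.4,10.0.0.9,\n"
    "10.0.0.5,10.0.0.9,DoS\n"
)
counts, percentages = attack_statistics(sample_csv)
for category, n in counts.most_common():
    print(f"{category}: {n} records ({percentages[category]:.1f}%)")
```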
Section 3: Data Analytic for Network Intrusion Detection (using Weka if possible)
Perform the following tasks and write a full report on your outcomes:
1. Convert the benchmark data suitable for the data analytic tools and platform of your choice.
Explain the differences in the available data format for data analytics.
2. Select the features with rationale (external reference or your own reasoning).
3. Create training and testing data samples.
4. Evaluate and select the data analytic techniques for testing.
5. Classify the network intrusion given the sample data.
6. Evaluate the performance of intrusion detection using the available tools and technologies
(e.g. confusion matrix).
7. Identify the limitation of overfitting.
8. Evaluate and analyse the use of ensemble tools.
9. Recommend the data analytic solution for the network intrusion detection.
10. Discuss future research work given time and resources.
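For task 1, one common target format is Weka's ARFF, which differs from csv mainly in that it declares the relation name and the type of every attribute before the data rows. The sketch below shows a simple conversion under stated assumptions: the function names, the relation name `intrusion`, and the sample rows are hypothetical, and the code assumes no missing values and no spaces or commas inside names or values. Weka can also load csv files directly, so a manual conversion like this is optional.

```python
import csv
import io

def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def csv_to_arff(csv_file, relation="intrusion"):
    """Return ARFF text for a csv file that has a header row.

    Assumption: a column is numeric if every value parses as a float;
    otherwise it is declared nominal with its sorted distinct values.
    """
    rows = list(csv.reader(csv_file))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for i, name in enumerate(header):
        values = [row[i] for row in data]
        if all(is_number(v) for v in values):
            lines.append(f"@attribute {name} numeric")
        else:
            lines.append(f"@attribute {name} {{{','.join(sorted(set(values)))}}}")
    # The data section repeats the csv rows unchanged.
    lines += ["", "@data"] + [",".join(row) for row in data]
    return "\n".join(lines) + "\n"

# Hypothetical three-column extract standing in for the benchmark data.
sample_csv = io.StringIO(
    "dur,proto,label\n"
    "0.12,tcp,normal\n"
    "3.50,udp,attack\n"
)
arff_text = csv_to_arff(sample_csv)
print(arff_text)
```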
Note: Take screenshots of your work on WEKA, showing the answers to the above questions. Include these screenshots in your final report.
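To help interpret the confusion matrix mentioned in task 6, the sketch below derives the usual performance figures from lists of actual and predicted labels. The function names, the label strings `attack` and `normal`, and the sample predictions are assumptions for illustration; Weka reports an equivalent matrix in its classifier output, which is what the screenshots should show.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="attack"):
    """Return (tp, fp, fn, tn), treating `positive` as the intrusion class."""
    tp = fp = fn = tn = 0
    for (a, p), n in Counter(zip(actual, predicted)).items():
        if a == positive and p == positive:
            tp += n  # attack correctly flagged
        elif a != positive and p == positive:
            fp += n  # normal traffic wrongly flagged (false alarm)
        elif a == positive:
            fn += n  # attack missed
        else:
            tn += n  # normal traffic correctly passed
    return tp, fp, fn, tn

def metrics(tp, fp, fn, tn):
    """Compute common detection metrics, returning 0.0 where undefined."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }

# Hypothetical test-set labels and classifier predictions.
actual = ["attack", "attack", "normal", "normal", "attack", "normal", "normal", "attack"]
predicted = ["attack", "normal", "normal", "attack", "attack", "normal", "normal", "attack"]
tp, fp, fn, tn = confusion_counts(actual, predicted)
result = metrics(tp, fp, fn, tn)
print(tp, fp, fn, tn)
print(result)
```

Computing the same figures on both the training and the testing samples is one simple way to approach task 7: a large gap between training and testing accuracy suggests overfitting.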
Please use the following references and others for more information:
http://ftp10.us.freebsd.org/users/azhang/disc/disc01/cd1/out/websites/kdd_explorations_full/levin.pdf
https://pdfs.semanticscholar.org/1d6e/a73b6e08ed9913d3aad924f7d7ced4477589.pdf
ftp://ftp.cse.buffalo.edu/users/azhang/disc/disc01/cd1/out/websites/kdd_explorations/pfahringer.pdf
Marking criteria:
Section to be included in the report and demonstration | Description of the section | Marks
Section 1 - Install and deploy | Introduction to each of your data analytic tools and platforms | 3
Section 1 - Explain and evaluate | Full explanation of each data analytic technique and attack, with support from your own evidence and/or from other online sources. Advantages and disadvantages of each data analytic technique (of your choice). | 5
Section 1 - Lab demonstration | To obtain full marks, students need to implement and demonstrate the use of at least two data analytic techniques in any platform of their choice. You may choose to use any testing data for demonstration. | 10
Report structure and report presentation | Compile a written report of the above along with your evaluations and recommendations. The report must contain several screenshots of evidence and a short description for each snapshot that provides proof that you completed the work. | 10
Reference style | Follow IEEE reference style | 2
Section 2 - Evaluation of the PT of the given Dataset of UNSW in Table 1 | 1. Analyse the CSV file and report as explained in Section 2 | 10
Section 2 (continued) | 2. Analyse the cap file and report as explained in Section 2 | 10
Section 2 (continued) | 3. Analyse the Bro file and report as explained in Section 2 | 10
Section 3 - Data analytics practical report | 1. Convert the benchmark data suitable for the data analytic tools and platform of your choice. Explain the differences in the available data format for data analytics. | 5
Section 3 (continued) | 2. Select the features with rationale (external reference or your own reasoning). | 5
Section 3 (continued) | 3. Create training and testing data samples. | 5
Section 3 (continued) | 4. Evaluate and select the data analytic techniques for testing. | 5
Section 3 (continued) | 5. Classify the network intrusion given the sample data. | 5
Section 3 (continued) | 6. Evaluate the performance of intrusion detection using the available tools and technologies (e.g. confusion matrix). | 5
Section 3 (continued) | 7. Identify the limitation of overfitting, and evaluate and analyse the use of ensemble tools. | 5
Section 3 (continued) | 8. Recommend the data analytic solution for the network intrusion detection, and discuss future research work given time and resources. | 5
Total | | 100
Marking Rubrics
Marking Rubric for Assignment #2: Total Marks 80
Grade | HD | D | CR | P | Fail
Mark | 80%+ | 70%-79% | 60%-69% | 50%-59% | < 50%
Level | Excellent | Very Good | Good | Satisfactory | Unsatisfactory
Introduction | Introduction is clear, easy to follow, well prepared and professional | Introduction is clear and easy to follow. | Introduction is clear and understandable | Makes a basic introduction to each of your data analytic tools and platforms | Does not make an introduction to each of your data analytic tools and platforms
Evaluation | Logic is clear and easy to follow with strong arguments | Consistently logical and convincing | Mostly consistent and convincing | Adequate cohesion and conviction | Argument is confused and disjointed
Demonstration | All elements are present and very well demonstrated. | Components present with good cohesion | Components present and mostly well integrated | Most components present | Proposal lacks structure.
Report structure and report presentation | All elements are present and well integrated. | Components present with good cohesion | Components present and mostly well integrated | Most components present | Lacks structure.
Reference style | Clear styles with excellent source of references. | Clear referencing/style | Generally good referencing/style | Unclear referencing/style | Lacks consistency with many errors
Report structure and report presentation | Proper writing. Professionally presented | Properly written, with some minor deficiencies | Mostly good, but some structure or presentation problems | Acceptable presentation | Poor structure, careless presentation