Q1 ML for Malware Analysis 25 Points
In the week 7 lecture, we showed example code that trains three machine learning models and measures their performance on feature vectors extracted from 200 binaries, of which 50% are malware and 50% are benignware. You can find the code and the data at https://drive.google.com/drive/folders/142NMRSTifttezfPqwTkf6dlg-VWOrdaY?usp=drive_link
Download the Jupyter notebook and the test.csv file from the above link, and run the code either on Google Colab or with an Anaconda installation on your own machine.
(i) Open the test.csv file in Excel or another spreadsheet program. Rows 2 through 101 are labeled as malware (see the last column), and rows 102 through 201 are labeled as benignware. Because the dataset is so small, you can eyeball it quickly and find features (columns) whose values differ sharply between rows marked as malware and rows marked as benign. Those features are useful for distinguishing malware from benignware. You will also find features whose values are similar regardless of the label. Such features are useless for classification. Name 3 features that are useful for classification and 3 that are not.
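The eyeball comparison in (i) can also be cross-checked with a few lines of code. The sketch below is only an illustration: the column names (`entropy`, `num_sections`, `file_size_kb`) and the six rows are made up and are not taken from test.csv, and the score (gap between class means divided by the overall spread) is one simple choice among many.

```python
import csv
import io
from statistics import mean, pstdev

# Tiny synthetic stand-in for test.csv. Column names and values are
# hypothetical; the real file has its own feature columns.
SAMPLE_CSV = """entropy,num_sections,file_size_kb,label
7.6,3,410,malware
7.8,4,395,malware
7.5,3,402,malware
5.1,4,400,benign
4.9,3,398,benign
5.3,4,405,benign
"""


def class_separation(rows, feature, label_col="label"):
    """Gap between the two class means, scaled by the overall spread."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_col], []).append(float(row[feature]))
    class_means = [mean(values) for values in by_class.values()]
    spread = pstdev([float(row[feature]) for row in rows]) or 1.0
    return abs(class_means[0] - class_means[1]) / spread


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features = [name for name in rows[0] if name != "label"]
scores = {name: class_separation(rows, name) for name in features}

# Features with a large score differ clearly between the two classes.
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {score:.2f}")
```

To try this on the real data, replace `io.StringIO(SAMPLE_CSV)` with an open handle to test.csv and adjust `label_col` to the name of the last column.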
(ii) Explain the need for feature selection in 3-4 sentences. In other words, once we have extracted the features, why not use all the extracted features and why do we need to select a subset of features?
(iii) In the code, we computed feature correlations and generated a heatmap of all pairwise correlations. However, we selected only those features whose correlation with the labels has a high absolute value. Explain in your own words, in 2-3 sentences, why this selection criterion makes sense.
(iv) In the code, we selected 7 of the 23 extracted features. State in your own words (no more than 2-3 sentences) why some of the machine learning models might still yield high accuracy, precision, and recall even after that many features are removed.
(v) We kept only those features that have a high correlation with the labels, but there are other ways to reduce the number of features. Explain in 2-3 sentences one possible alternative method for feature selection.
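As a reminder of what the selection step in (iii) computes, here is a minimal sketch of filtering features by the absolute value of their Pearson correlation with the label. It is not the notebook's code: the feature names, the data, and the 0.5 threshold are all assumptions chosen for illustration, and the correlation is written out by hand so the block has no dependencies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)


def select_by_label_correlation(features, labels, threshold=0.5):
    """Keep features whose |correlation with the label| meets the threshold."""
    scores = {name: pearson(col, labels) for name, col in features.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Hypothetical data: 1 = malware, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "entropy": [7.5, 7.8, 7.2, 7.9, 5.0, 5.4, 4.8, 5.1],   # rises with label
    "num_imports": [12, 15, 10, 14, 80, 95, 70, 88],       # falls with label
    "timestamp_low": [3, 7, 5, 1, 4, 6, 2, 8],             # unrelated
}

selected, scores = select_by_label_correlation(features, labels)
print(selected)
```

Note that `num_imports` is kept even though its correlation is negative: the sign only says in which direction the feature moves with the label, while the absolute value says how strongly.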
Q2 ML for Intrusion Detection 25 Points
In the week 8 lecture, we showed how to train ML models on network packet data for intrusion detection. You can find the data and the code at https://drive.google.com/drive/folders/1BrX2QtYvTZiBIKYVrn64phV4dbqsDDpn?usp=sharing
(i) The week 8 example code showed how the Scapy library is used. Describe in your own words what use of Scapy was shown. (Hint: the rest of the code reads its data from pcap files and does not use Scapy, but think about how those pcap files might have been collected.)
(ii) Explain in your own words what the flows constructed from the packets in a pcap file are.
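For orientation, one common way to build flows is to group packets that share the same 5-tuple (protocol, source address and port, destination address and port). The sketch below assumes that definition and treats both directions of a conversation as one flow; the week 8 code may define flows differently (for example, per direction or with a timeout). The `Packet` type and the sample packets are hypothetical and stand in for records parsed from a pcap file.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical minimal packet record, as might be parsed from a pcap."""
    ts: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str
    size: int


def flow_key(p):
    """Direction-independent 5-tuple: both endpoints in sorted order."""
    a = (p.src_ip, p.src_port)
    b = (p.dst_ip, p.dst_port)
    return (p.proto,) + (a + b if a <= b else b + a)


def build_flows(packets):
    """Group packets by flow key and summarize each group."""
    groups = defaultdict(list)
    for p in packets:
        groups[flow_key(p)].append(p)
    flows = {}
    for key, pkts in groups.items():
        times = [p.ts for p in pkts]
        flows[key] = {
            "packets": len(pkts),
            "bytes": sum(p.size for p in pkts),
            "duration": max(times) - min(times),
        }
    return flows


packets = [
    Packet(0.00, "192.0.2.5", 50000, "198.51.100.7", 443, "TCP", 60),
    Packet(0.05, "198.51.100.7", 443, "192.0.2.5", 50000, "TCP", 1500),
    Packet(0.30, "192.0.2.5", 50000, "198.51.100.7", 443, "TCP", 52),
    Packet(0.10, "192.0.2.5", 53000, "198.51.100.9", 53, "UDP", 70),
]

flows = build_flows(packets)
for key, stats in flows.items():
    print(key, stats)
```

Per-flow summaries such as packet count, byte count, and duration are the kind of values that can then serve as a feature vector for a classifier.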
(iii) For the example code shown in week 8, explain in 2-3 sentences how the flows are labeled as benign or malicious.
(iv) In the example code shown in week 8, we use PCA to transform the feature vectors. We then take the first two components of each transformed vector and plot the data in 2-D. Do your own research to find out what PCA does and explain in 2-3 sentences why PCA is useful.
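To make the PCA step in (iv) concrete, here is a small from-scratch sketch of projecting feature vectors onto their first two principal components. It is an illustration only: the week 8 code presumably calls a library implementation, whereas this version finds the components with power iteration and deflation (a technique swapped in here to avoid dependencies), and the six 3-D data points are invented.

```python
from math import sqrt


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=500):
    """Dominant eigenvalue and eigenvector of cov via power iteration."""
    d = len(cov)
    v = [1.0 / sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance along v.
    lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return lam, v


def pca_2d(rows):
    """Project rows onto the two directions of largest variance."""
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    components, variances = [], []
    for _ in range(2):
        lam, v = top_component(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the variance already explained by v.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in x]
    return projected, variances


# Hypothetical 3-D feature vectors that vary mostly along one direction.
rows = [[2.0, 1.9, 0.5], [4.0, 4.1, 0.4], [6.0, 6.2, 0.6],
        [8.0, 7.8, 0.5], [10.0, 10.1, 0.4], [12.0, 11.9, 0.6]]

projected, variances = pca_2d(rows)
for point in projected:
    print([round(c, 3) for c in point])
```

Each row of `projected` is a 2-D point that can be passed directly to a scatter plot, with the first coordinate carrying the largest share of the variance in the original data.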