The code file in Jupyter

Assignment Task

This assignment consists of two deliverables:

• One code implementation (50%). The code file, in Jupyter Notebook format, and the relevant data set files should be placed in a folder named: Task 3-Your Name- Student_Number. The folder is then to be zipped and uploaded to Blackboard.

• A report (50%). The report must be uploaded as a separate file.

Part I – PySpark source code (50%)

Important Note: For code reproduction, your code must be self-contained. That is, it should not require any libraries beyond the PySpark environment we have used in the workshops. The data files must be packaged with your code file.

In this component, you need to use Python 3 and PySpark to complete the following data analysis tasks:

1. Exploratory data analysis

2. Recommendation engine

3. Classification

4. Clustering

Task I.1: Exploratory data analysis

This subtask requires you to explore your dataset by:

• reporting its number of rows and columns,

• cleaning the data (handling missing values or duplicated records) if necessary,

• selecting 3 columns and drawing 1 plot (e.g. bar chart, histogram, boxplot, etc.) for each to summarise it.

Task I.2: Recommendation Engine

This subtask requires you to implement a recommender system based on collaborative filtering with the Alternating Least Squares (ALS) algorithm. You need to include:

• Model training and predictions

• Model evaluation using MSE

Task I.4: Clustering

This subtask requires you to implement a clustering system using K-means. You need to include:

• Model training

• Model evaluation

Part II – Report (50%)

You are required to write a report to explain your design and implementation of the machine learning parts in your code, including the following topics:

• Introduction/summary/explanation of the ML algorithms/concepts

• The learning settings, such as how to prepare training/testing set, what are the key parameters and how you set them up

• Comments/evaluation for the models learned

Task I.3: Classification

This subtask requires you to implement a classification system using logistic regression with the LogisticRegressionWithLBFGS class. You need to include:

• Logistic Regression model training

• Model evaluation
