You will implement the K-means clustering and Fuzzy C-means clustering from scratch using a programming language of your choice. Follow software design principles and document (comment

19 Sep You will implement the K-means clustering and Fuzzy C-means clustering from scratch using a programming language of your choice. Follow software design principles and document (comment

Posted at 12:46h in Computer Science by

Please find the requirement in the attached document.

Need this assignment by Tuesday 9/20, 10:00 pm.

No copy and paste; plagiarism results in course termination.

Fall2022_CS7830-01-Assignment11.pdf

Assignment 1 Due Date/Time: 9/23/2021, 11:59 PM

Total Points: 100

You will implement K-means clustering and Fuzzy C-means clustering from scratch in a programming language of your choice. Follow software design principles and document (comment) your code clearly, explaining what you did and why. In your report, include a README that states how to run your code to obtain the expected results.

You will use a dataset representing ten years of clinical care at 130 US hospitals and integrated delivery networks. It includes over 50 features representing patient and hospital outcomes. The dataset is included in the assignment with the filename diabetic_data.csv.

Use the Euclidean distance to compute the distance between any two patients in the dataset. You will run your clustering algorithms with different combinations of variables as specified in each question.
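A minimal sketch of the distance computation, assuming each patient has already been reduced to a tuple of numeric features. The function name `euclidean_distance` and the two sample patients are illustrative only and do not come from the dataset.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two made-up patients described by (time_in_hospital, num_medications).
patient_a = (3, 12)
patient_b = (6, 16)
distance = euclidean_distance(patient_a, patient_b)  # sqrt(3**2 + 4**2)
```

Because the features are on different scales, the variable with the larger range dominates the distance; whether to rescale the variables first is a design decision the handout leaves open.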

1. K-means clustering with different numbers of clusters (30 points)

a. Run K-means on the entire dataset with the following two variables: ‘time_in_hospital’ and ‘num_medications’, with the number of clusters K = 2. Plot your clusters using a 3D scatter plot and report (print) the centroid locations. Based on this plot, what are your thoughts on the generated clusters?

b. Test with different numbers of clusters K, running from K = 2 to K = 10 using the same variables in 1a. According to the scatter plots, which number of clusters do you think is the most appropriate? Justify your response.

c. Implement Dunn index (DI) cluster validity measure from scratch. Repeat the experiments in problem 1b and compute the corresponding DI indices. Which one do you believe is the best number of clusters according to Dunn indices? Does this agree with your initial observation in problem 1b?
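The two building blocks of problem 1 can be sketched as follows. This is one possible reading, not a reference solution. It assumes Lloyd's algorithm with initial centroids drawn at random from the data, and the classic Dunn index (smallest distance between points of different clusters divided by the largest within-cluster diameter); other variants of the index exist. The names `kmeans`, `dunn_index`, and the toy `points` are illustrative.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k rows sampled from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def dunn_index(points, labels):
    """Min inter-cluster point distance / max intra-cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    if len(groups) < 2:
        raise ValueError("Dunn index needs at least two non-empty clusters")
    max_diameter = max(
        (euclidean(a, b) for g in groups for i, a in enumerate(g) for b in g[i + 1:]),
        default=0.0,
    )
    min_separation = min(
        euclidean(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return math.inf if max_diameter == 0 else min_separation / max_diameter


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
di = dunn_index(points, labels)
```

This pairwise Dunn index costs time quadratic in the number of points, so on the full dataset (more than 50,000 rows) it may be necessary to vectorize it or to justify a cheaper centroid-based variant in the report.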

2. K-means clustering with different variables and sample size (30 points)

a. Based on the best number of clusters you obtained in problem 1c and the two variables, does adding the ‘insulin’ variable (total three variables) improve clustering results for any 30 patients randomly selected? Use scatter plots or any other equivalent method to justify your response.

b. Based on the model in problem 2a, does adding the ‘diabetesMed’ and ‘change’ variables (total five variables) improve the clustering results for the same 30 patients? Plot the results and compute the Dunn index to justify your response.

c. Randomly sample 50,000 observations and 10,000 observations from the entire dataset and re-run 2a and 2b for each sample size. Plot the clustering results and compute the Dunn index for each sample size and compare the results with 50,000 and 10,000 observations vs the entire dataset. Justify what you observe.

d. (Bonus): What happens to the relative positioning of the centroids as you sample fewer observations (50,000, 10,000, 5,000) from the data? Do the centroids go farther apart, or do they get closer after your clustering algorithm has converged? Justify why. Plot your findings (sample size (x-axis) vs Dunn Index (y-axis)). (Bonus: 10 points)
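Problems 2a and 2b rely on reusing the same 30 patients, and 2c and 2d on fixed-size random samples. One way to make such samples reproducible is to draw them with a seeded generator, as in the sketch below. The helper `sample_rows` and the stand-in `rows` list are hypothetical; in practice the rows would be read from diabetic_data.csv.

```python
import random


def sample_rows(rows, n, seed=0):
    """Draw n rows without replacement; the same seed gives the same sample."""
    if n > len(rows):
        raise ValueError("cannot sample more rows than the dataset contains")
    return random.Random(seed).sample(rows, n)


# Stand-in for the loaded dataset: one dict per patient record.
rows = [{"patient": i} for i in range(1000)]

# The same seed returns the same 30 patients, so 2a and 2b can share them.
patients_2a = sample_rows(rows, 30, seed=42)
patients_2b = sample_rows(rows, 30, seed=42)
```

Recording the seed in the README lets a reader regenerate exactly the samples used in the report.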

3. Fuzzy C-means clustering (40 points)

a. Implement Fuzzy C-means and apply it to all observations, using the best number of clusters you selected in problem 1 and the best combination of variables you selected in problem 2. Was there any difference in the clusters as compared to the K-means clusters? (Compare using visualization tools, using centroid values, OR using some labels and observing the differences).

b. Harden the cluster assignment of Fuzzy C-means and use the Dunn index to compare it with the K-means clustering result. Is there any difference in the results? Which clustering algorithm do you think produces better clusters and why?

c. Select one more variable by exploring the data and add this variable into the model in problem 3a. Does adding this new variable improve the clustering results? Why or why not? If you try several variables for 3c, report which variables you experimented with and why you chose that particular additional variable.
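A minimal sketch of Fuzzy C-means and of hardening follows. It assumes the standard alternating updates with fuzzifier m = 2, random initial memberships, and a stopping rule based on the largest change in any membership; none of these settings are prescribed by the handout. The names `fuzzy_c_means`, `harden`, and the toy `points` are illustrative.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Returns (centers, u), where u[j][i] is point j's membership in cluster i."""
    rng = random.Random(seed)
    dims = len(points[0])
    # Random initial memberships, normalized so every row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for i in range(c):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dims)]
            )
        # Membership update: u_ji is proportional to d_ji ** (-2 / (m - 1)).
        new_u = []
        for p in points:
            dists = [euclidean(p, ctr) for ctr in centers]
            if min(dists) == 0.0:
                # Point sits on a center: share membership among those centers.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_u.append([h / sum(hits) for h in hits])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                new_u.append([v / total for v in inv])
        shift = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


def harden(memberships):
    """Assign each point to its highest-membership cluster (ties: lowest index)."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fuzzy_c_means(points, c=2)
hard_labels = harden(memberships)
```

The hardened labels have the same shape as K-means labels, so the same Dunn index routine can score both results for the comparison in 3b.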

Submission Instructions: Submit a zipped file containing your code(s) and report (in pdf) in the Dropbox folder titled “Assignment 1-LastName” on Pilot.

Academic Integrity: Please note that the code and report you submit should be your work and yours alone. If plagiarism is detected, it will be dealt with strictly and in accordance with Wright State guidelines.

