# Programming Assignment Help | The K-means Clustering Algorithm

A “clustering algorithm” attempts to partition the samples in a dataset into a number of subsets, usually disjoint, each of which is called a “cluster”. Through this partitioning, each cluster may correspond to some underlying concept or category.

K-means is also known as the k-means algorithm. The k in K-means indicates that the data are partitioned into k clusters, and “means” indicates that the mean of the data values in each cluster is taken as that cluster’s centre, or centroid; in other words, each cluster is described by its centroid.

## A K-means Clustering Assignment Example

The K-Town Zoological Institute has 100 types of animals on exhibit (including humans – the zookeepers!). The Board of Directors has decided that the animals should be grouped together on the basis of shared characteristics. (Fortunately the zoo is dedicated to the ethical treatment of animals and participates in many international programs of conservation and preservation, so we need have no worries about unintentionally becoming stars of the next Tiger King series.) The zoo has 7 buildings, so you are tasked with separating the species into 7 groups. There are 15 characteristics, each representing an aspect of the animal. Most (such as “airborne” and “milk”) are either 0 or 1, but some (such as “legs”) can take higher values. All the characteristic values are integers.

The method you will use to group the animals is called k-means clustering. It works like this: choose the number of clusters (groups) you want – in this case we are told the number of groups must be 7. Choose 7 animals at random; these are the initial “cluster centres”. Then repeat the following steps:


1. For each animal, determine which of the cluster centres it is closest to (see below under Manhattan and Euclid for a discussion of “closest”). This separates the animals into 7 clusters.

2. For each cluster, compute a new centre by averaging the values of each characteristic for the animals currently in the cluster. This new centre will probably have non-integer values for its characteristics and will almost certainly not correspond exactly to any of the animals. That’s ok!
This gives us 7 new cluster centres, so we can go back to Step
1 and repeat the process of assigning the animals to clusters by choosing the nearest cluster centre.

3. We repeat this process until the clusters stabilize (i.e. they stop changing or almost stop changing). I’ll describe ways of measuring stability later in this assignment.
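Step 2’s averaging can be sketched in a couple of lines of Python. The cluster below is hypothetical (the values are not from the assignment’s data set); the point is that each characteristic is averaged separately across the cluster’s members:

```python
# Hypothetical cluster of three animals, each with three characteristics:
cluster = [(1, 0, 4), (2, 8, 2), (2, 8, 2)]

# New centre: average each characteristic across the cluster's members.
centre = tuple(sum(vals) / len(cluster) for vals in zip(*cluster))
# centre == (5/3, 16/3, 8/3) — non-integer values are fine.
```

As the text notes, the resulting centre will usually have non-integer coordinates and need not coincide with any actual animal.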
Example:
Suppose we have just 5 animals, and they only have 3 characteristics A, B and C. We can represent the data like this (I picked random numbers for the characteristic values – in the actual data set the values are not randomly assigned):

Suppose that we decide to separate these animals into 2 groups based on the 3 characteristics. We randomly choose 2 initial centres (I’ll pick Horse and Moose), and compute which of these is closer for each of the other animals. I’m using a simple distance measure called the Manhattan Metric (see below). Using that metric, Cow is closer to Horse, and Bear and Duck are both closer to Moose. This gives us two clusters: [Horse, Cow] and [Moose, Bear, Duck]. We compute the new centre for each cluster by averaging the characteristic values for the animals in the cluster. So the updated centre for Cluster 1 is the set of characteristic values [9/2, 2, 7/2] and the updated centre for Cluster 2 is [5/3, 16/3, 8/3].
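The Manhattan metric is simply the sum of the absolute differences of the characteristic values, coordinate by coordinate. A quick sketch (the vectors below are illustrative, not taken from the assignment’s data):

```python
def manhattan(a, b):
    """Manhattan (city-block) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Illustrative characteristic vectors (not the assignment's animal data):
print(manhattan([4, 1, 3], [1, 6, 1]))  # |4-1| + |1-6| + |3-1| = 3 + 5 + 2 = 10
```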

Now for all the animals we compute which of the new cluster centres they are closest to. None of them switches to the other cluster, so the clusters do not change. We are done – but if any of the animals had switched to the other cluster we would have iterated again, calculating new centres and seeing if the clusters changed again.

To see this in action, suppose we choose Bear and Duck as the two initial centres. The initial clusters are [Bear] and [Duck, Moose, Horse, Cow]. The updated centre for Cluster 1 is [2, 6, 1] and the updated centre for Cluster 2 is [3, 3.5, 3.5]. When we recompute the distances from the animals to the cluster centres, Duck jumps to Cluster 1, giving [Bear, Duck] and [Moose, Horse, Cow]. Computing the new centres and determining which centre is closest for each animal creates no further changes, so the situation is now stable and we stop.

This example illustrates that the algorithm can settle on different solutions, depending on the starting points. We typically run the algorithm several times with different randomly chosen starting points to see if one solution comes up more frequently than others.
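The whole procedure, including repeated runs from random starting points, can be sketched in Python. This is a minimal illustration under stated assumptions, not the assignment’s required solution: it uses made-up data, stops a run when the assignments no longer change, and tallies which final clustering comes up most often across runs:

```python
import random
from collections import Counter

def manhattan(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def kmeans(points, k, max_iters=100):
    """One run of k-means with the Manhattan metric; returns the final clusters."""
    centres = random.sample(points, k)   # k random animals as initial centres
    assignment = None
    for _ in range(max_iters):
        # Step 1: assign each point to its nearest centre.
        new_assignment = [min(range(k), key=lambda j: manhattan(p, centres[j]))
                          for p in points]
        if new_assignment == assignment:  # clusters stabilized: stop
            break
        assignment = new_assignment
        # Step 2: recompute each centre as the mean of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centre if a cluster goes empty
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return [frozenset(p for p, a in zip(points, assignment) if a == j)
            for j in range(k)]

def most_common_clustering(points, k, runs=20):
    """Repeat k-means from random starts; return the most frequent result and its count."""
    tally = Counter(frozenset(kmeans(points, k)) for _ in range(runs))
    return tally.most_common(1)[0]
```

Representing each final clustering as a frozenset of frozensets makes two runs comparable regardless of cluster order, which is what lets us count how often each solution appears.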
