The classical taxonomy of machine learning methods splits them into two classes: supervised and unsupervised. In principle, the difference lies in the presence of a decision, an expert-given vector of correct answers for the problem the model is expected to solve. In a supervised setting, the decision is given, and the training algorithm uses it to build a model that predicts it; in an unsupervised one, it is absent, and the algorithm is expected to discover and express fundamental patterns in the data.
The general problem here is that training data contains numerous patterns; many of them are pure noise and will never reappear in testing, while others are genuine, but not necessarily relevant to the problem. Hence, I do not consider the notion of a fundamental pattern to be well defined.
For example, we may look at a bitmap and ask whether the lightness of a particular pixel, say (211,315), is important given its surroundings. For most image analysis tasks, like object recognition, the answer is no, since low-frequency features are what forms actual shapes, and pixel-level information is mostly noise. However, when the image is a telescope photograph of the night sky, and our pixel covers a star of interest, its intensity may literally carry all the relevant information in the entire image.
How does this look in practice, though? Clustering methods like k-means are supervised by the distance function, which expresses the focus on particular patterns; in realistic conditions, their output can be made arbitrary by manipulating this function. Auto-encoders and lossy compressors rely on penalty functions quantifying their reconstruction error, which can similarly be used to select what information is retained. Finally, semi-supervised methods are in essence just supervised; the information from the unlabelled part can be used to improve the estimation of internal interactions, leading to better overall accuracy, but the focus is fully controlled by the labelled part.
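The k-means point is easy to make concrete: the clusters the algorithm returns depend entirely on the geometry induced by the feature scaling, so rescaling a single feature redirects which pattern the method "discovers". A minimal sketch with a hand-rolled Lloyd's algorithm (the data, initial centroids, and scaling factors below are purely illustrative):

```python
import numpy as np

def kmeans(X, init_centroids, n_iter=10):
    """Plain Lloyd's algorithm; the assignment is driven entirely by
    Euclidean distance in whatever feature scaling X arrives in."""
    centroids = init_centroids.astype(float)
    for _ in range(n_iter):
        # distances from each point to each centroid, shape (n_points, k)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(len(centroids)):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels

# Four points on a grid: x in {0, 1}, y in {0, 10}.
X = np.array([[0., 0.], [0., 10.], [1., 0.], [1., 10.]])
init = np.array([[0., 0.], [1., 10.]])

# With raw features the y-axis dominates, so the points split by y.
labels_raw = kmeans(X, init)            # [0, 1, 0, 1]

# Stretch x by 100 and the very same algorithm splits by x instead.
scale = np.array([100., 1.])
labels_scaled = kmeans(X * scale, init * scale)  # [0, 0, 1, 1]
```

Same algorithm, same four points, two different answers; the "unsupervised" result is whatever the chosen metric was told to care about.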