That is, the boundaries of each character cell are known to us. The task is to find out which character is written in each specific position. We have a limited number of symbols; for simplicity, let them be only Russian letters and Arabic numerals. That is, we need to take the image in each of the filled cells and assign it to one of the classes, one class per character. This is classification. In a rough approximation, the problem can be formulated as follows. Suppose there is an unknown function f in nature that maps the feature space X (in our case, a vector of numbers representing a picture of a symbol in raster form) onto a discrete set of classes Y.
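To make "a vector of numbers representing a picture in raster form" concrete, here is a minimal sketch. The 5x5 grid, the 0/1 pixel encoding, and the helper name to_feature_vector are assumptions made only for illustration; real cells would be larger and might use grayscale values.

```python
# Hypothetical 5x5 monochrome raster of one filled cell: 1 = ink, 0 = background.
cell = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]


def to_feature_vector(raster):
    """Flatten a 2D raster (a list of rows) into one flat list of numbers."""
    return [float(pixel) for row in raster for pixel in row]


# x is a single point in the feature space X; here X has 25 dimensions.
x = to_feature_vector(cell)
print(len(x))
```

Every cell of the same size then becomes a point in the same space, which is what lets us compare images with each other.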

In our example, the training sample is a large set of images of handwritten letters and numbers, in which each image is labeled with the character depicted on it. Based on this set, we need to somehow construct an approximation of the real function f: a certain function g: X → Y such that g(x) almost always coincides with f(x). There are many different approaches here. One of the simplest is called k nearest neighbors (kNN). Its essence is this. We cannot express the function f analytically, in the form of a formula or algorithm, because we do not know how it works. But we do have a large number of pairs (x, y) for which we know that y = f(x).
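The problem statement can be written compactly as follows. The names D for the labeled set and n for its size are notation introduced here for convenience, and "almost always" is left informal, as in the text.

```latex
\[
  D = \{(x_i, y_i)\}_{i=1}^{n}, \qquad y_i = f(x_i), \quad x_i \in X, \; y_i \in Y,
\]
\[
  \text{find } g : X \to Y \text{ such that } g(x) = f(x) \text{ for almost all } x \in X .
\]
```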

We take our sample and represent it as a set of points in a multidimensional space. Next, we take the point x corresponding to the feature set of the object that we want to classify. We find the k points of the training sample closest to it and see which class is the most common among them (recall that for each object in the training sample we know the correct class in advance). It is this most numerous class that we assign to our object. Unfortunately, kNN does not work well for all problems. For example, handwritten versions of the letter A may differ in style and size, and the pen or pencil strokes may turn out not quite black after the image is converted to monochrome, and so on.
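The kNN procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say which distance to use, so Euclidean distance is assumed here; the function name knn_classify, the default k=3, and the tiny two-dimensional sample are made up for the example; ties in the vote are resolved by whichever class was encountered first among the neighbors.

```python
import math
from collections import Counter


def knn_classify(train, x, k=3):
    """Classify x by majority vote among its k nearest training points.

    train: list of (feature_vector, label) pairs with known correct labels.
    x:     feature vector of the object to classify.
    """
    # Assumption: "closest" means smallest Euclidean distance.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    # Count how often each class occurs among the k neighbors.
    votes = Counter(label for _, label in neighbors)
    # Return the most numerous class.
    return votes.most_common(1)[0][0]


# Hypothetical toy sample: two well-separated clusters of 2D points.
train = [
    ((0.0, 0.0), "A"),
    ((0.1, 0.2), "A"),
    ((0.2, 0.1), "A"),
    ((1.0, 1.0), "B"),
    ((0.9, 1.1), "B"),
    ((1.1, 0.9), "B"),
]

print(knn_classify(train, (0.15, 0.15)))
print(knn_classify(train, (0.95, 1.0)))
```

Sorting the whole sample for every query is the simplest possible approach and is slow on large samples. The sketch also compares raw feature values directly, which is the reason differences in style, size, or stroke darkness hurt kNN.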

