What is Machine Learning?
Machine learning is a set of algorithms that let computers perform specific tasks in a way similar to humans. The discipline is closely related to computational statistics and is often known in industry as Predictive Analytics or Predictive Modelling. Machine learning is a core component of Artificial Intelligence. The most important distinction in Machine Learning is between supervised and unsupervised algorithms.
In Supervised Learning, computers are supplied with examples of inputs along with the desired outputs. The aim is to establish general rules that generate the intended output on new data. To some extent, it is as if a person acted as the computer’s teacher. For CELI, the following applications are central to Natural Language Processing (NLP):
- Automatic Categorisation: automatically assign predefined categories or tags to new documents, i.e. classify a document as a human would.
- Named Entity Recognition: identify and annotate the people, places and organisations mentioned in a potentially ambiguous text.
- Text To Speech Translation.
- Causal and Regression Models, which represent how variables relate to one another, e.g. comments on Twitter in response to hate or viral content.
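As an illustration of the supervised setting described above, the following is a minimal sketch of automatic categorisation: a small Naive Bayes text classifier written in plain Python. It is not CELI's method. The technique, the function names (train, classify), the two categories and the four training sentences are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Learn word frequencies per category from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the most probable category for an unseen text."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled examples: the "teacher" supplies input and desired output.
labelled_examples = [
    ("the team won the match", "sport"),
    ("a late goal decided the game", "sport"),
    ("the bank raised interest rates", "finance"),
    ("shares fell on the stock market", "finance"),
]

model = train(labelled_examples)
predicted = classify("the market reacted to interest rates", model)
print(predicted)  # finance
```

The labelled pairs play the role of the teacher: the general rule learned from them is then applied to a document the classifier has never seen.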
Unsupervised Learning is the machine learning task of inferring a function to describe hidden structure from unlabeled data. In this case, data are processed automatically by computers, without human intervention, i.e. the examples given to the learner are unlabeled. The algorithms extract information solely on the basis of general statistical criteria. The results of this unsupervised procedure can be used directly by the client, as in clustering, or they can serve as a starting point for a more in-depth analysis.
The best-known form of unsupervised learning is cluster analysis, i.e. extracting recurring patterns from a large data set. The aim of cluster analysis is to produce a computed classification of items without any prior knowledge of the classes. Cluster analysis is popular because it provides a first insight into a dataset. CELI has gained experience in the segmentation of structured data, including:
- Transactional and Contractual Data
- Behavioural Data (such as polls, but also website visit logs)
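To make the idea of cluster analysis concrete, here is a minimal sketch of k-means, one common clustering algorithm, in plain Python. The text does not say which algorithm CELI uses, so the choice of k-means is an assumption, as are the function names and the toy customer data (visits per month, monthly spend). The initialisation from the first k points is deliberately naive.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(values) / len(points) for values in zip(*points))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # naive initialisation
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = mean(members)
    return labels, centroids


# Hypothetical behavioural data: (visits per month, monthly spend).
customers = [(1, 20), (2, 25), (1, 30), (10, 200), (12, 220), (11, 210)]

labels, centroids = k_means(customers, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels are supplied at any point: the two segments emerge only from the distances between the records, which is what distinguishes this from the supervised case.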