Machine Learning

This course focuses on a range of techniques inspired by artificial intelligence and classical statistics. In the last decade, these fields have experienced a boom, particularly with regard to problems involving large volumes of data for which mathematical, statistical, or classical operations research methods have been unable to offer effective or efficient solutions. The applications of machine learning cover fields as diverse as bioinformatics, finance, and natural language. Students will study the most common data mining techniques and acquire skills in the use of free software packages that implement them. This will be linked to the study and demonstration of real applications of these techniques.

  •  Theme 1: A short introduction to the world of Data Science: business and big data, the open data concept, big data and humanitarian projects, data visualization, software resources, methodologies for project management, the "big data" concept and applications...

  •  Theme 2: Principal classification scenarios. Formalisms and applications: supervised classification, unsupervised classification (clustering), weakly supervised classification, crowd learning, etc.

  •  Theme 3: Techniques and filters for data preprocessing. Software: WEKA

  •  Theme 4: Feature selection techniques. Software: WEKA

  •  Theme 5: Model validation. Using statistical tests to compare the accuracy of different classifiers. Software: WEKA, R, web pages

  •  Theme 6: "A short introduction to the tm (text mining) package in R: text processing". How to use text mining operators to construct a proper document-term matrix for further machine learning analysis. A tutorial using R software.

  •  Theme 7: "The machine learning approach: clustering words and classifying documents with R". A tutorial using R software and its "caret" package.

  •  Theme 8: "First steps in deep learning for NLP with the h2o package". A tutorial using R software.

4.5 H
1st term


