We will focus on the most time-consuming part of the machine learning process: data exploration. It consists of data visualisation and data wrangling, which together help you prepare and understand your data.
The whole course is full of hands-on data manipulation and visualisation exercises on three popular data science platforms:
1. Python, an open-source and very progressive programming language
2. KNIME, an open-source, highly intuitive and effective analytics platform
3. MS Excel, the most popular tool among people working with data
On each platform, we will load data, transform it and visualise it.
So, what will we cover during this course?
- Getting started with the KNIME analytics platform (installation and environment description)
- Getting started with Python (installation and environment description)
- Loading data into all platforms (data from Excel and CSV files)
- Data manipulation (preparation and transformation) I - Table, Row Transform, Row Filter and Split
- Data manipulation (preparation and transformation) II - Column Binning, Column Convert and Replace, Column Filter, Column Split, Column Transform
- Data manipulation (preparation and transformation) III - Other data types: date and time
- Data manipulation - Feature scaling
- Data visualisation - Histogram, Line plot, Pie chart, Scatter plot, Box plot
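To give a feel for three of the topics above (row filtering, column binning and feature scaling), here is a minimal sketch in plain Python. It is an illustration only, not the course's own material: the sample data, the column names, the bin boundaries and the helper functions are all hypothetical, and the course exercises may well use dedicated libraries instead.

```python
# Hypothetical sample data; the column names are illustrative only.
rows = [
    {"name": "A", "age": 23, "income": 1800.0},
    {"name": "B", "age": 35, "income": 3200.0},
    {"name": "C", "age": 51, "income": 2500.0},
    {"name": "D", "age": 64, "income": 4000.0},
]


def filter_rows(rows, min_age):
    """Row filter: keep only rows whose age is at least min_age."""
    return [r for r in rows if r["age"] >= min_age]


def bin_age(age):
    """Column binning: map a numeric age to a labelled interval.

    The boundaries (30 and 60) are arbitrary choices for this example.
    """
    if age < 30:
        return "young"
    if age < 60:
        return "middle"
    return "senior"


def min_max_scale(values):
    """Feature scaling: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


adults_30_plus = filter_rows(rows, min_age=30)
age_groups = [bin_age(r["age"]) for r in rows]
scaled_income = min_max_scale([r["income"] for r in rows])

print(adults_30_plus)
print(age_groups)
print(scaled_income)
```

Min-max scaling is only one of several common scaling methods; it is used here because it is the simplest to read.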
And what will you need for this course?
- access to a computer or laptop with Windows (32-bit or 64-bit), Linux (64-bit) or Mac (64-bit), and permission to install software (if you do not have permission, ask your administrator to install it for you; this is common on company computers)
- no prior skills required (basic experience with data analysis in other programs is an advantage)