- What are machine learning, artificial intelligence, and data science?
- What kinds of problems can be solved with machine learning?
- Fields related to machine learning
- What do you need to master to become a machine learning practitioner?
- Introduction to regression (including evaluation metrics, e.g., MSE and MAE)
- Simple linear regression
- Polynomial regression
- Regularized regression
- Support vector regression
- Generalized linear model
- Introduction to classification and the confusion matrix
- Logistic regression
- LDA (Linear Discriminant Analysis)
- k-NN (k-Nearest Neighbors)
- Naive Bayes
- Decision tree
- Support vector machine
- Neural networks
- Introduction to clustering
- k-means clustering
- EM (Expectation-Maximization) clustering
- Hierarchical clustering
- Introduction to kernel methods
- Kernel k-means
- Kernel SVM
- Kernel regression
- Feature engineering
- Data transformation
- Data cleaning
- Dimensionality reduction (PCA, LDA)
- Variable selection
- Introduction to deep learning and tools
- CNN (Convolutional Neural Networks): case study on MNIST digit classification
- RNN (Recurrent Neural Networks)
- Generative models: GAN (Generative Adversarial Networks) and autoencoders
- Deep learning object detection: SSD, YOLO, Mask R-CNN
- Deep learning image segmentation: FCN, SegNet, Mask R-CNN
- Overview of text mining and NLP
- Corpus
- Dictionary
- Feature extraction
- Bag of words
- Term Document matrix
- Term frequency and Weight
- TF-IDF
- POS Tagging
- Named Entity Recognition
- Overview Text Classification
- Binary Classification
- Multiclass Classification
- Multilabel Classification
- Information Retrieval
- Text Clustering
- Document Similarity
- Topic Modeling
- Word2Vec
- Skip-gram
- CBOW
- Language Modeling
- Natural Language Understanding
- Natural Language Generation
- Introduction to computer vision and tools
- Image and video representation in a computer
- Binary thresholding
- Otsu thresholding
- Introduction to spatial filtering
- Smoothing (averaging filter)
- Sharpening
- Median filter
- Sobel filter
- Erosion
- Dilation
- Morphological opening & closing
- Connected component analysis
- Image segmentation
- Object detection: face detection case study
- Overview Speech Recognition
- MFCC
- LPC
- Noise Reduction
- Speech Recognition for Low Resource
- Large Vocabulary Continuous Speech Recognition
- Speaker Identification
- Speech Enhancement
- Speech separation
- Overview Data Visualization
- Principles of Data Visualization
- Overview Chart
- Pie Chart
- Line Chart
- Bar Chart
- Stacked Bar Chart
- Heat Map
- Bubble Chart
- Area Charts
- Box Plot
- Whisker plot
- Scatter Plot
- GeoSpatial
- Real Time Data Visualization
- MS Excel with Analysis ToolPak
- Java, Python
- R, RStudio, Rattle
- Weka, KNIME, RapidMiner
- Hadoop distribution of choice
- Spark, Storm
- Flume, Scribe, Chukwa
- Nutch, Talend, Scraperwiki
- Webscraper, Flume, Sqoop
- tm, RWeka, NLTK
- RHIPE
- D3.js, ggplot2, Shiny
- IBM Languageware
- Microsoft Azure, AWS, Google Cloud
- Cassandra, MongoDB
- Microsoft Cognitive API
- TensorFlow
- Git
- Introduction to Databases
- Basic SQL
- Intermediate SQL
- Advanced SQL
- Matrices, Vector & Algebra fundamentals
- Hash function, binary tree, O(n)
- Relational algebra, DB basics (with SQL)
- Inner, Outer, Cross, theta-join
- CAP theorem
- Tabular data
- Entropy
- Data frames & series
- Sharding
- OLAP
- Multidimensional Data model
- ETL
- Reporting vs BI vs Analytics
- JSON and XML
- NoSQL
- Regex
- Pick a dataset
- Descriptive statistics
- Exploratory data analysis
- Histograms
- Percentiles & outliers
- Probability theory
- Bayes theorem
- Random variables
- Cumulative distribution function (CDF)
- Continuous distributions
- Skewness
- ANOVA
- Probability density function (PDF)
- Central Limit theorem
- Monte Carlo method
- Hypothesis Testing
- p-Value
- Chi-squared test
- Estimation
- Confidence interval (CI)
- MLE
- Kernel Density estimate
- Regression
- Covariance
- Correlation
- Pearson correlation coefficient
- Causation
- Least-squares fit
- Euclidean distance
- Measures of central tendency
- Measures of spread
- Python Basics
- Working in Excel
- R setup / RStudio
- R basics
- Expressions
- Variables
- IBM SPSS
- RapidMiner
- Vectors
- Matrices
- Arrays
- Factors
- Lists
- Data frames
- Reading CSV data
- Reading raw data
- Subsetting data
- Manipulate data frames
- Functions
- Factor analysis
- Installing packages
- Code versioning
- Data Table
- Map Reduce fundamentals
- Hadoop Components
- HDFS
- Data replication principles
- Setup Hadoop
- Name & data nodes
- Job & task tracker
- M/R programming
- Sqoop: Loading data into HDFS
- Flume, Scribe
- SQL with Pig
- DWH with Hive
- Scribe, Chukwa for Weblog
- Using Mahout
- ZooKeeper, Avro
- Storm: Hadoop Realtime
- RHadoop, RHIPE
- RMR
- Cassandra
- MongoDB, Neo4j
- Summary of data formats
- Data discovery
- Data sources & Acquisition
- Data integration
- Data fusion
- Transformation & enrichment
- Data survey
- Google OpenRefine
- How much data?
- Using ETL
- Dimensionality and numerosity reduction
- Normalization
- Data scrubbing
- Handling missing Values
- Unbiased estimators
- Binning Sparse Values
- Feature extraction
- Denoising
- Sampling
- Stratified sampling
- PCA
- Intro Python
- Set Up Environment
- Data Structure
- Iteration & Conditional
- Intro Libraries
- Function
- OOP
- Package
- NumPy