
This repository contains my implementations of the algorithms discussed in the aforementioned book by Joel Grus.


ruchikaverma-iitg/Data-Science-from-Scratch-Python

Data-Science-from-Scratch

This repository contains my implementations (in Python 3.7) of the algorithms discussed in the book "Data Science From Scratch" by Joel Grus.

| File name | Python/IPython Notebooks | Description |
| --- | --- | --- |
| 1_Counting_clicker | .py/.ipynb | Count or track how many people have shown up for a class |
| 2_Visualizing_data | .py/.ipynb | Data visualization using the matplotlib library |
| 3_Vector_operations_on_data | .py/.ipynb | Linear algebra operations on data vectors |
| 4_Matrix_operations | .py/.ipynb | Creation and manipulation of matrices |
| 5_Statistics | .py/.ipynb | Statistical operations to understand the distribution of data |
| 6_Probability | .py/.ipynb | Understanding the data distribution |
| 7_Hypothesis_and_Inference | .py/.ipynb | Testing whether a certain hypothesis is likely to be true |
| 8_Gradient_descent | .py/.ipynb | Minimizing the error and estimating unknown parameters using gradient descent on the whole dataset or on mini-batches |
| 9_Working_with_data | .py/.ipynb | Basic operations including data histograms, correlation, dictionaries, NamedTuple, classes and rescaling |
| 10_Principal_component_analysis | .py/.ipynb | Principal component analysis from scratch |
| 11_machine_learning | .py/.ipynb | Train/test data split and functions to evaluate a model's accuracy, precision, recall and F1-score |
| 12_k-Nearest-Neighbors | .py/.ipynb | Implementation of the k-nearest neighbors algorithm from scratch in Python |
| 13_Naive_Bayes | .py/.ipynb | Naive Bayes classifier from scratch to identify words belonging to spam and not-spam (ham) emails |
| 14_Linear_Regression | .py/.ipynb | Linear regression from scratch using the closed-form solution and stochastic gradient descent |
| 15_Multiple_Regression | .py/.ipynb | Multiple regression from scratch using stochastic gradient descent, bootstrap estimates of statistics, and ridge and lasso regularization |
| 16_Logistic_Regression | .py/.ipynb | Logistic regression from scratch, with precision and recall computed on test data |
| 17_Decision_Trees | .py/.ipynb | Decision trees from scratch using the ID3 learning algorithm |
| 18_Neural_networks | .py/.ipynb | Neural network (including feed-forward and backpropagation) from scratch. An interesting "fizzbuzz" example is also shown to train and test the neural network |
| 19_Deep_Learning | .py/.ipynb | Implementation of deep neural networks from scratch with various loss functions, optimization techniques and dropout regularization. Training of deep neural networks on Fizzbuzz and MNIST data |
| 20_Clustering | .py/.ipynb | Implementation of k-means and bottom-up hierarchical clustering from scratch |
| 21_nlp | .py/.ipynb | Implementation of popular natural language processing algorithms from scratch in Python, including bigrams, trigrams, topic modeling, word vectors and recurrent neural networks |
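As a rough illustration of the kind of from-scratch code these files cover, the evaluation metrics listed for `11_machine_learning` (accuracy, precision, recall and F1-score) can be sketched from confusion-matrix counts. This is a minimal sketch and not necessarily the code in this repository: the function names, the argument order (`tp`, `fp`, `fn`, `tn`) and the choice to return `0.0` on an empty denominator are assumptions made for the example.

```python
# Hypothetical helper names; counts are true/false positives/negatives.

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that were actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

For example, with made-up counts of 8 true positives, 2 false positives, 8 false negatives and 82 true negatives, `accuracy(8, 2, 8, 82)` is 0.9, `precision(8, 2)` is 0.8 and `recall(8, 8)` is 0.5. The gap between accuracy and recall shows why accuracy alone can be misleading on imbalanced data.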
