description |
---|
My learning notes. Just in time (JIT) is better than Just in Case |
I am a data scientist. Recently, I have found myself studying databases, data structures, and data pipelines far more than machine learning. In my experience, writing good code that produces quality data often trumps a SOTA model when it comes to building a good model.
Delivering the model is the job of a data scientist, so inevitably every data scientist should be something of a "full-stack" data scientist.
This is a central repository for my blogs and notes.
- Blog: https://noklam.ml (Github Page) - Usually shorter articles or notes with code
- Blog: Medium (https://medium.com/@nokknocknok)
- GitBook (Mainly study notes. I use Joplin to keep notes in markdown and am considering syncing them to GitBook from time to time. I haven't figured out the best way to do so.)
I am generally interested in tools that increase productivity. Please let me know if you have any recommendations. Here is a list of software and topics that I have found useful.
Uncertainty Quantification in Deep Learning
Visualization (University of Washington)
https://raw.githubusercontent.com/noklam/mediumnok/master/_demo/python-viz/presentation.mplstyle
import matplotlib.pyplot as plt

my_style = 'https://raw.githubusercontent.com/noklam/mediumnok/master/_demo/python-viz/presentation.mplstyle'
with plt.style.context(['ggplot', my_style]):
    make_scatter_plot()
    make_line_plot()
- pyinstrument: a profiler for Python processes, useful for optimization.
- torchsnooper: another profiling tool, this one for PyTorch. No more print x.shape.
- knockknock notification: a single line of code that gets you a notification when your 10-hour model training is finally done. No more staring at the progress bar.
- colorama: colored printing in the terminal (cross-platform).
- Hypothesis: property-based testing, with autogenerated input for unit tests.
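To illustrate what "property-based testing with autogenerated input" means, here is a minimal hand-rolled sketch using only the standard library. It does not use the Hypothesis API (Hypothesis provides this through its `given` decorator and strategies, and also shrinks failing inputs). The run-length encoder and the names `run_length_encode`, `run_length_decode`, and `check_round_trip` are hypothetical and exist only for illustration.

```python
import random


def run_length_encode(text):
    """Encode 'aaab' as [('a', 3), ('b', 1)]."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            # Same character as the previous run: extend that run.
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=200, seed=0):
    """Generate random inputs and assert the round-trip property on each one."""
    rng = random.Random(seed)
    for _ in range(trials):
        # A two-letter alphabet makes long runs (the interesting case) likely.
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        assert run_length_decode(run_length_encode(text)) == text, text
    return trials


check_round_trip()
```

The point is that the test states a property (decoding an encoding returns the input) instead of listing hand-picked cases, and the inputs are generated for you.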
Reviewing (any suggestions for code metric report/analysis library are welcome!)
- coala - coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
- radon - Radon is a Python tool that computes various metrics from the source code.
- great_expectations - A data validation library for Python integrated with Pandas/Spark/SQL.
- lunr.js
A catalog of various machine learning topics.
- spectral graph theory - Why Laplacian Matrix need normalization and how come the sqrt of Degree Matrix? - Mathematics Stack Exchange
- What's the intuition behind a Laplacian matrix? I'm not so much interested in mathematical details or technical applications. I'm trying to grasp what a laplacian matrix actually represents, and what aspects of a graph it makes accessible. - Quora
While neural networks have had a lot of success in NLP and computer vision, traditional time series forecasting has changed relatively little. This repository aims to study the latest practical techniques for time series prediction, whether statistical methods, machine learning, or deep neural networks.
Statistical Method
Machine Learning
Deep Neural Network
Gramian Angular Field : Transform time series into an image and use transfer learning with CNN
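As a rough sketch of the transform itself, the summation variant of the Gramian Angular Field can be computed as follows: rescale the series to [-1, 1], map each value to an angle with arccos, and fill a matrix with the cosine of pairwise angle sums. This is a pure-Python illustration, not the implementation of any particular library, and the function name is my own. The resulting matrix is what would be fed to a CNN as an image.

```python
import math


def gramian_angular_summation_field(series):
    """Return the GASF matrix (a list of lists) for a 1-D sequence of numbers."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every point.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clip to guard against floating-point drift just outside [-1, 1].
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Entry (i, j) is cos(phi_i + phi_j), which couples time steps i and j.
    return [[math.cos(a + b) for b in angles] for a in angles]


image = gramian_angular_summation_field([1.0, 2.0, 3.0, 4.0])
```

A series of length n becomes an n x n symmetric matrix, so long series are usually downsampled first to keep the image size manageable.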
While forecasting accuracy is important, the prediction interval matters too, and it is an area that the machine learning world pays less attention to.
- Traditional statistical forecast (ARIMA, ETS etc)
- Bayesian Neural Network
- Random Forest jackknife approximation
- MCDropout (Use Dropout at inference time as approximate variational inference)
- Quantile Regression
- VOGN (Optimizer weight perturbation)
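To make one of the approaches above concrete, here is a minimal sketch of the idea behind quantile regression: the pinball (quantile) loss penalises under- and over-prediction asymmetrically, so minimising it at two quantile levels yields the two ends of a prediction interval. The toy data and the brute-force search over constant forecasts are assumptions for illustration only; a real model would minimise the same loss over its parameters.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Average pinball (quantile) loss for a quantile level in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction costs `quantile` per unit, over-prediction `1 - quantile`.
        total += quantile * diff if diff >= 0 else (quantile - 1) * diff
    return total / len(y_true)


# Toy data (hypothetical): the values 1..100.
observations = [float(v) for v in range(1, 101)]


def best_constant_forecast(quantile):
    """Pick the observed value that minimises the pinball loss as a constant forecast."""
    n = len(observations)
    return min(observations, key=lambda c: pinball_loss(observations, [c] * n, quantile))


# The 10% and 90% quantile forecasts together form a nominal 80% prediction interval.
lower = best_constant_forecast(0.1)
upper = best_constant_forecast(0.9)
coverage = sum(lower <= y <= upper for y in observations) / len(observations)
```

On this toy data the interval covers roughly 80% of the observations, which is the check one would also run on held-out data for a real forecaster.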
Prophet (Facebook): Tool for producing high quality forecasts for time series data that has multiple seasonalities with linear or non-linear growth. It has built-in modeling for holiday effects.
pyts : state-of-the-art algorithms for time-series transformation and classification
Feel free to send a PR or discuss by starting an issue.😁
powered by fastpages
fastpages allows me to blog directly in a notebook, so I no longer have to worry about converting it into markdown. I simply code and write.