MichalLauer/README.md

Hello there 👨‍🔬

I'm a student from Czechia who is passionate about math, statistics, and data science. This repo holds some of my work and showcases how I approach data analysis and coding in general. If you want to learn more about me, feel free to visit my website.

Data Analyst Associate certification 🐕

  • Tool: RMarkdown
  • Packages: readr, dplyr, forcats, skimr, ggplot2, glue, stringr, tidytext
  • Output: Written analysis

Pet Box Subscription analysis is a descriptive analysis of a pet store, done for my Data Analyst Associate certification. It aims to identify pet owners who are likely to buy products every month (food, toys, medical supplies...). The data is read with readr and wrangled with dplyr. As most characteristics are factors, I relied heavily on forcats to simplify my work. Data is summarized with skimr and visualized with ggplot2. When working with text, I used glue for string interpolation and stringr for text manipulation. For more advanced graphs, I used tidytext's functions for ordering values within facets.
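As a rough illustration of how these packages fit together, here is a minimal sketch. The `pets` data, its column names, and the plot are hypothetical and are not taken from the actual analysis; the sketch only shows one plausible way to combine forcats, glue, and tidytext's within-facet ordering helpers.

```r
library(dplyr)
library(forcats)
library(ggplot2)
library(glue)
library(tidytext)

# Hypothetical toy data, NOT the certification data set
pets <- tibble(
  pet    = c("dog", "dog", "cat", "cat", "fish", "bird", "dog", "cat"),
  region = c("north", "south", "north", "south",
             "north", "south", "north", "north")
)

counts <- pets |>
  # forcats: keep the 2 most common pets, lump the rest into "Other"
  mutate(pet = fct_lump_n(pet, n = 2)) |>
  count(region, pet) |>
  # tidytext: order pets by count separately within each region
  mutate(pet_ordered = reorder_within(pet, n, region))

p <- ggplot(counts, aes(n, pet_ordered)) +
  geom_col() +
  scale_y_reordered() +                      # strips the suffix added by reorder_within()
  facet_wrap(~ region, scales = "free_y") +
  # glue: interpolate a computed value into the title
  labs(title = glue("Pet counts across {n_distinct(counts$region)} regions"),
       x = "Number of pets", y = NULL)
```

Printing `p` draws one bar chart per region, with the bars sorted independently inside each panel.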

My final submission consisted of a written report for Data Scientists at DataCamp, who reviewed my proposal and confirmed that the analysis meets current industry standards. You can view it in my DataCamp workspace.

Professional Data Analyst certification 💸

  • Tool: RMarkdown
  • Packages: dplyr, tidyr, ggplot2, patchwork, gtsummary
  • Output: Oral presentation with PowerPoint slides

I earned my second Data Analyst certificate with an analysis of a made-up insurance company. The analysis mainly aims to identify which customers buy insurance and what their characteristics are. Coding and data interpretation are done in R Markdown. The data is wrangled and transformed using dplyr and tidyr. Data visualization is put together using ggplot2 and patchwork. The final tables are formatted with gtsummary.

The analysis was presented orally to Data Scientists from DataCamp, who reviewed my presentation and verbal communication. My video presentation is not available; however, the PowerPoint presentation can be downloaded from my GitHub repo.

Data Scientist Associate certification 🧘🏽‍♀️

  • Tool: DataCamp Notebook
  • Packages: readr, dplyr, glue, ggplot2, tidymodels
  • Output: Written submission

To receive the Data Scientist Associate certification, I created a report that first reads (readr) and wrangles (dplyr, glue) data about a made-up fitness center. After the given domain restrictions are validated and applied, the data is explored using ggplot2. To predict the number of people in a fitness class, I used various packages from the tidymodels family.

The first model uses ridge regression from the glmnet package. Alpha was tuned using 10-fold cross-validation. The second model uses a random forest to predict the number of customers. Its hyperparameters were tuned with tune(), again using 10-fold cross-validation. The final submission can be seen in my DataCamp workspace.
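For readers unfamiliar with this workflow, here is a minimal sketch of tuning a ridge regression with 10-fold cross-validation in tidymodels. It is not the code from the submission: it uses the built-in `mtcars` data as a stand-in for the fitness data, and it assumes the tuned quantity is the regularization strength, which tidymodels calls `penalty` (glmnet's lambda). In tidymodels, glmnet's own `alpha` is the `mixture` argument, and it is fixed at 0 for ridge.

```r
library(tidymodels)  # attaches parsnip, recipes, rsample, tune, workflows, yardstick, dials

set.seed(42)

# Stand-in data (assumption): predict mpg from the other mtcars columns
folds <- vfold_cv(mtcars, v = 10)  # 10-fold cross-validation

# Ridge regression: mixture = 0 means a pure L2 penalty in glmnet
ridge_spec <- linear_reg(penalty = tune(), mixture = 0) |>
  set_engine("glmnet")

# Penalized models are scale-sensitive, so normalize the predictors
ridge_rec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors())

ridge_wf <- workflow() |>
  add_recipe(ridge_rec) |>
  add_model(ridge_spec)

# Evaluate 20 candidate penalties on every fold
ridge_res <- tune_grid(
  ridge_wf,
  resamples = folds,
  grid      = grid_regular(penalty(), levels = 20),
  metrics   = metric_set(rmse)
)

# Pick the penalty with the lowest cross-validated RMSE and refit on all data
best_penalty <- select_best(ridge_res, metric = "rmse")
final_fit <- finalize_workflow(ridge_wf, best_penalty) |>
  fit(data = mtcars)
```

The same pattern would carry over to a random forest by swapping the model specification, for example `rand_forest(mtry = tune(), min_n = tune())` with `set_mode("regression")` and a suitable engine, and tuning over those parameters instead.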

Pinned

  1. TravelAssured (Public)

    Personal project to showcase my work

    HTML

  2. PetBoxSubscription (Public)

    Personal analysis project for my DataCamp Data Analyst Associate certification

    R

  3. laumi-blogdown (Public)

    HTML