YSayaovong/README.md

Machine Learning / Data Engineer

Data engineer with hands-on internship experience, specializing in Python, SQL, and data visualization. Skilled in analyzing complex datasets, automating data workflows, and turning results into insights that support business decisions. Proficient in data warehousing, ETL processes, and building interactive dashboards. Actively seeking a machine learning or data engineering position where I can apply my analytical skills to data-driven solutions.


Career Highlights

Data Engineer Intern | Refonte Infini – Remote
11/2024 – Present

  • Implemented data ingestion with tools such as Sqoop, Flume, and Kafka.
  • Optimized Hive queries using partitioning, bucketing, and indexing.
  • Processed large datasets and real-time data using Apache Spark.
  • Managed AWS services like RDS and EBS for secure and scalable solutions.
  • Applied AWS deployment options tailored to business needs.
  • Processed data on Amazon EMR using Hadoop tools.
  • Built real-time data solutions with Amazon Kinesis and visualized data using QuickSight.
  • Deployed Azure virtual machines, web apps, and databases.
  • Integrated Azure Active Directory for secure identity management.
  • Designed data solutions with Azure Synapse, Data Lake, and SQL Database.
  • Developed batch and streaming data processes for performance optimization.
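To illustrate the partitioning and bucketing idea mentioned in the Hive bullet, here is a minimal sketch in plain Python. It is not the internship's actual setup: the column names (`event_date`, `user_id`), the sample rows, and the use of `crc32` as the bucket hash are all assumptions made for illustration (Hive uses its own hash function).

```python
from collections import defaultdict
from zlib import crc32


def bucket_for(key: str, num_buckets: int) -> int:
    """Deterministically map a key to one of num_buckets buckets.

    crc32 is a stand-in hash chosen for this sketch, not Hive's own.
    """
    return crc32(key.encode("utf-8")) % num_buckets


def layout(rows, partition_col, bucket_col, num_buckets):
    """Group rows by partition value first, then by hash bucket."""
    table = defaultdict(lambda: defaultdict(list))
    for row in rows:
        part = row[partition_col]
        bucket = bucket_for(str(row[bucket_col]), num_buckets)
        table[part][bucket].append(row)
    return table


# Hypothetical sample rows, invented for this example.
rows = [
    {"event_date": "2024-11-01", "user_id": "u1", "amount": 10},
    {"event_date": "2024-11-01", "user_id": "u2", "amount": 25},
    {"event_date": "2024-11-02", "user_id": "u1", "amount": 7},
]

table = layout(rows, "event_date", "user_id", num_buckets=4)

# A query filtered on event_date only has to read one partition.
nov_first = [r for bucket in table["2024-11-01"].values() for r in bucket]
```

Partitioning lets a date-filtered query skip unrelated data entirely, while bucketing keeps rows with the same key together inside each partition, which can help joins and sampling on that key.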

Education

B.S., Information Technology; Minor in Music | Arizona State University, Expected Completion: 5/2025
A.I. & Machine Learning Engineer Career Path | Zero to Mastery, Expected Completion: 5/2025
A.S. in Mechanical Design Technology | Milwaukee Area Technical College, 5/2021


Technical Proficiencies

Languages

  • Python (Advanced): Skilled in OOP, data structures, algorithms, backend development (Flask, Django), and process automation.
  • JavaScript (ES6+): Backend services, asynchronous operations, and event-driven architecture.
  • C++, C#, Java (Basic): Familiar with syntax, programming logic, and foundational concepts.

Machine Learning & Data Science

  • Scikit-learn, Pandas: Skilled in building, training, and fine-tuning machine learning models with scikit-learn, and proficient in Pandas for efficient data manipulation.
  • Feature Engineering & EDA: Experienced in exploratory data analysis to uncover patterns, trends, and insights.
  • NLP: Hands-on experience with natural language processing, including text preprocessing and sentiment analysis.

Data Engineering

  • Apache Kafka, Spark, Hadoop: Proficient in developing scalable, high-performance data pipelines with Kafka and Spark, and leveraging Hadoop for distributed processing.
  • ETL Pipelines & Data Warehousing: Skilled in constructing ETL pipelines that ensure data quality and accessibility.
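As a small illustration of the extract, transform, load pattern with a basic data-quality check, the sketch below uses only the Python standard library. The `orders` table, its columns, and the inline CSV data are hypothetical and exist only for this example; a production pipeline would read from real sources and load into a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw input, invented for this sketch.
RAW_CSV = """order_id,customer,amount
1,alice,19.99
2,bob,
3,,5.00
4,carol,42.50
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with missing fields and cast column types."""
    clean = []
    for row in rows:
        if not row["customer"] or not row["amount"]:
            continue  # data-quality rule: skip incomplete records
        clean.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(row["amount"]))
        )
    return clean


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Keeping the three stages as separate functions makes each one easy to test and swap out, for example replacing the SQLite load step with a warehouse loader.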

Cloud & DevOps

  • AWS Services: Experienced with EC2, S3, and Lambda for scalable infrastructure and serverless computing.
  • CI/CD & Containers: Proficient in Docker, Jenkins, and GitHub Actions for automated builds and deployments.
  • Azure Synapse: Skilled in Azure Synapse for unified analytics and large-scale data solutions.

Database

  • PostgreSQL, SQL, NoSQL (MongoDB, Couchbase): Skilled in writing complex queries and managing relational and NoSQL databases.

Projects

Predictive Analysis and Data Insights on the Titanic Dataset
November 2024
Description: Conducted a comprehensive analysis of the Titanic dataset to uncover factors influencing passenger survival and developed a predictive model to estimate outcomes.

  • Data Preprocessing: Addressed missing values, removed irrelevant columns, and prepared the dataset for analysis.
  • Exploratory Data Analysis (EDA): Visualized survival trends across variables such as class and age using Seaborn and Matplotlib.
  • Feature Engineering: Selected and prepared key features for optimal model training.
  • Machine Learning Model: Developed a RandomForestClassifier, evaluated with confusion matrix and classification report.
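For readers unfamiliar with the evaluation step, the sketch below shows what a confusion matrix and the precision and recall figures in a classification report measure. It is written in plain Python rather than scikit-learn, and the labels are toy values invented for illustration, not results from the project.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))


def precision_recall(counts, label):
    """Compute precision and recall for one class from the pair counts."""
    tp = counts[(label, label)]
    fp = sum(n for (actual, pred), n in counts.items() if pred == label and actual != label)
    fn = sum(n for (actual, pred), n in counts.items() if actual == label and pred != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels invented for this example (1 = survived, 0 = did not survive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(counts, label=1)
```

Precision answers "of the passengers predicted to survive, how many did?", while recall answers "of the passengers who survived, how many were found?". In scikit-learn, `confusion_matrix` and `classification_report` from `sklearn.metrics` report these values directly.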

Key Skills: Python, Data Analysis, Machine Learning, Data Visualization, EDA, scikit-learn, pandas, Seaborn, Matplotlib.
