Data Engineer with hands-on internship experience, specializing in Python, SQL, and data visualization. Skilled in analyzing complex datasets, automating data workflows, and generating actionable insights to support business decision-making. Proficient in data warehousing, ETL processes, and building interactive dashboards. Actively seeking a machine learning or data engineering position to apply analytical skills and contribute to meaningful data-driven solutions.
Data Engineer Intern | Refonte Infini – Remote
11/2024 – Present
- Implemented data ingestion with tools such as Sqoop, Flume, and Kafka.
- Optimized Hive queries using partitioning, bucketing, and indexing.
- Processed large datasets and real-time data using Apache Spark.
- Managed AWS services such as RDS and EBS for secure and scalable solutions.
- Applied AWS deployment options tailored to business needs.
- Processed data on Amazon EMR using Hadoop tools.
- Built real-time data solutions with Amazon Kinesis and visualized data using QuickSight.
- Deployed Azure virtual machines, web apps, and databases.
- Integrated Azure Active Directory for secure identity management.
- Designed data solutions with Azure Synapse, Data Lake, and SQL Database.
- Developed batch and streaming data pipelines optimized for performance.
B.S., Information Technology; Minor in Music | Arizona State University, Expected Completion: 5/2025
A.I. & Machine Learning Engineer Career Path | Zero to Mastery, Expected Completion: 5/2025
A.S. in Mechanical Design Technology | Milwaukee Area Technical College, 5/2021
- Python (Advanced): Skilled in OOP, data structures, algorithms, backend development (Flask, Django), and process automation.
- JavaScript (ES6+): Backend services, asynchronous operations, and event-driven architecture.
- C++, C#, Java (Basic): Syntax, programming logic, and foundational concepts.
- scikit-learn, pandas: Skilled in building, training, and fine-tuning machine learning models with scikit-learn, and proficient in pandas for efficient data manipulation.
- Feature Engineering & EDA: Experienced in exploratory data analysis to uncover patterns, trends, and insights.
- NLP: Hands-on experience with natural language processing, including text preprocessing and sentiment analysis.
- Apache Kafka, Spark, Hadoop: Proficient in developing scalable, high-performance data pipelines with Kafka and Spark, and leveraging Hadoop for distributed processing.
- ETL Pipelines & Data Warehousing: Skilled in constructing ETL pipelines ensuring data quality and accessibility.
- AWS Services: Experienced with EC2, S3, and Lambda for scalable infrastructure and serverless computing.
- CI/CD & Containers: Proficient in Docker, Jenkins, and GitHub Actions for automated builds and deployments.
- Azure Synapse: Skilled in Azure Synapse for unified analytics and large-scale data solutions.
- PostgreSQL, SQL, NoSQL (MongoDB, Couchbase): Skilled in writing complex queries and managing relational and NoSQL databases.
Predictive Analysis and Data Insights on the Titanic Dataset
November 2024
Description: Conducted a comprehensive analysis of the Titanic dataset to uncover factors influencing passenger survival and developed a predictive model to estimate outcomes.
- Data Preprocessing: Addressed missing values, removed irrelevant columns, and prepared the dataset for analysis.
- Exploratory Data Analysis (EDA): Visualized survival trends across variables such as class and age using Seaborn and Matplotlib.
- Feature Engineering: Selected and prepared key features for optimal model training.
- Machine Learning Model: Trained a RandomForestClassifier and evaluated it with a confusion matrix and classification report.
Key Skills: Python, Data Analysis, Machine Learning, Data Visualization, EDA, scikit-learn, pandas, Seaborn, Matplotlib.
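Illustration: a minimal plain-Python sketch of what the evaluation step in this project measures. It is not the project's code. The labels below are made up for illustration, and the helper names `confusion_matrix` and `precision_recall` are hypothetical stand-ins for the library utilities a real run would likely use.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Count outcomes: rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels]
            for actual in labels]


def precision_recall(matrix, index):
    """Precision and recall for the class at position `index` in the matrix."""
    true_pos = matrix[index][index]
    predicted_total = sum(row[index] for row in matrix)  # column sum
    actual_total = sum(matrix[index])                    # row sum
    precision = true_pos / predicted_total if predicted_total else 0.0
    recall = true_pos / actual_total if actual_total else 0.0
    return precision, recall


# Hypothetical labels (not real results): 1 = survived, 0 = did not survive.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 1]

matrix = confusion_matrix(y_true, y_pred)
precision, recall = precision_recall(matrix, index=1)
print(matrix)             # [[3, 1], [1, 3]]
print(precision, recall)  # 0.75 0.75
```

Precision and recall for the "survived" class are two of the per-class figures a classification report summarizes.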