VLSI-Shubh/README.md

👋 Hi, I'm Shubham Kapil Upadhyay

🚀 VLSI Design Engineer | MS in Electrical Engineering @ Purdue
🔍 Specializing in RTL Design, FPGA Development & Digital System Architecture

📧 Email | 🔗 LinkedIn | 📄 Resume | 📍 United States


🧭 Philosophy

"Designing tomorrow's digital systems today, one clock cycle at a time."


🧠 About Me

I'm a passionate digital designer focused on building efficient, scalable, and synthesizable hardware systems. With a strong foundation in Verilog, FPGA development, and memory architecture, I love translating abstract logic into real-world digital hardware.

Whether it's architecting FSMs, optimizing datapaths, or exploring memory systems, I enjoy solving problems that live at the intersection of elegant design and timing precision.


🛠️ What I Do Best

  • ✅ RTL Design & Verification (Verilog, VHDL)
  • ✅ FPGA-based System Implementation (Xilinx, Vivado)
  • ✅ Memory Architecture & FIFO Buffers
  • ✅ FSM Design & Control Logic
  • ✅ SoC Integration & Embedded Digital Systems

🔧 Technical Toolbox

Hardware Description Languages
Verilog • VHDL • SystemVerilog

Programming & Scripting
Python • C • C++ • MATLAB • Tcl/Tk

EDA Tools & Synthesis
Vivado • Cadence Virtuoso • Synopsys Design Compiler • GTKWave • Icarus Verilog • Yosys

Design & Verification
RTL Design • FSM Architecture • Memory Systems • Functional Verification • Logic Synthesis

Protocols & Analysis
UART • SPI • I2C • AXI • STA • CDC • Power Optimization

📊 Languages in My Repositories

Langs

This chart reflects the primary languages used across my GitHub repositories, highlighting my focus on HDLs (Verilog/VHDL) along with supporting languages like C, C++, Python, and Tcl for verification, scripting, and testbench automation.


🌟 Featured Projects

Cross clock-domain FIFO with Gray code pointers & metastability prevention

  • Designed a parameterized dual-clock FIFO enabling reliable data transfer across independent clock domains (100 MHz → 71.4 MHz).
  • Implemented Gray code pointer synchronization with dual flip-flop synchronizers to guard against metastability, with 0 errors in simulation.
  • Added extra MSB flag logic for precise full/empty detection, validated across 100+ corner cases including rapid write/read bursts.
  • Achieved zero synthesis warnings in Vivado and generated post-synthesis schematics confirming correct RTL-to-gate mapping with CDC-safe structures.
  • Verified operation with comprehensive VCD waveform analysis, demonstrating safe flag propagation delays (1–2 cycles) and glitch-free operation.
  • Applications: SoC interconnects, network packet buffers, DDR controllers, and other high-speed CDC use cases.
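
The Gray code pointers and the extra-MSB full/empty check described above can be modeled in a few lines of Python. This is a minimal behavioral sketch, not the repository's Verilog: the pointer width (`PTR_BITS = 5`, i.e. an assumed depth of 16) and all function names are hypothetical, and the synchronizer delay is ignored.

```python
PTR_BITS = 5  # assumption: 4 address bits (depth 16) plus one extra wrap MSB
PTR_MASK = (1 << PTR_BITS) - 1


def bin_to_gray(value: int) -> int:
    """Convert a binary pointer to Gray code (one bit flips per increment)."""
    return value ^ (value >> 1)


def is_empty(rd_gray: int, wr_gray_synced: int) -> bool:
    """Empty: the read pointer has caught up with the synchronized write pointer."""
    return rd_gray == wr_gray_synced


def is_full(wr_gray: int, rd_gray_synced: int) -> bool:
    """Full: the pointers are exactly one FIFO depth apart.

    In Gray code that means the top two bits differ and the rest match.
    """
    top_two_bits = 0b11 << (PTR_BITS - 2)
    return wr_gray == (rd_gray_synced ^ top_two_bits)


# 16 writes and no reads on a depth-16 FIFO: the full flag is raised.
print(is_full(bin_to_gray(16), bin_to_gray(0)))  # True
```

Comparing in the Gray domain means a pointer sampled mid-transition is off by at most one count, which is why only Gray-coded values cross the clock boundary.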

Real-time comparison of Bubble, Quick, Merge, and Radix Sort algorithms with Python + matplotlib animation

  • Implemented four classic sorting algorithms (Bubble, Quick, Merge, Radix) with state instrumentation for tracking comparisons, swaps, pivots, and merges.
  • Developed a visualization tool using matplotlib animations, enabling side-by-side execution of multiple algorithms for educational and performance comparison.
  • Designed an interactive CLI with support for custom dataset size, number ranges, file input, and adjustable animation speed.
  • Produced animated visual outputs and performance tables showing execution steps and timing (e.g., Radix fastest @ 0.07 ms vs Bubble slowest @ 2.72 ms).
  • Applications: Teaching tool for algorithms, complexity analysis, and computer science education.
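
As an illustration of the state instrumentation mentioned above, here is a minimal sketch of a Bubble Sort that counts comparisons and swaps. The function name and the shape of the `stats` dictionary are assumptions for this example, not the project's actual interface.

```python
def bubble_sort_instrumented(data):
    """Return a sorted copy of data plus comparison and swap counts."""
    items = list(data)  # work on a copy so the caller's list is untouched
    stats = {"comparisons": 0, "swaps": 0}
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            stats["comparisons"] += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                stats["swaps"] += 1
                swapped = True
        if not swapped:  # early exit once a full pass makes no swaps
            break
    return items, stats


result, stats = bubble_sort_instrumented([3, 1, 2])
print(result, stats)  # [1, 2, 3] {'comparisons': 3, 'swaps': 2}
```

The same counters could be recorded after every step and fed to an animation as frames.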

Real-time image enhancement with FPGA-accelerated min, max, and median filters

  • Implemented morphological filtering (min, max, median) on a PYNQ-Z2 FPGA, processing 64×64 RGB images via AXI streams and DMA.
  • Designed custom VHDL IP (FilterSelect_v1_0) with independent R/G/B filter modules using shift registers and bubble sort for median filtering.
  • Integrated hardware acceleration + Python control, achieving <10% LUT usage while maintaining real-time performance.
  • Enabled interactive filter selection with 2 physical board switches, supporting min (00), median (01), and max (10) operations in real time.
  • Validated FPGA outputs against MATLAB image processing, confirming pixel-accurate results for noise reduction, edge preservation, and texture segmentation.
  • Overcame design challenges in RGB channel handling and reconstruction, ensuring lossless image recombination post-processing.
  • Excluded mean filter due to excessive latency, but maintained project timeline with full feature delivery.
  • Applications: Biomedical imaging, digital photography, visual inspection systems, and FPGA-accelerated computer vision.
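
A software reference model for one color channel might look like the sketch below. It follows the switch encoding listed above (00 = min, 01 = median, 10 = max) and sorts each window with a bubble sort, but the 3×3 window size, the untouched border pixels, and all function names are assumptions rather than details taken from the VHDL IP.

```python
def bubble_sorted(values):
    """Return a sorted copy of values using a plain bubble sort."""
    v = list(values)
    for end in range(len(v) - 1, 0, -1):
        for i in range(end):
            if v[i] > v[i + 1]:
                v[i], v[i + 1] = v[i + 1], v[i]
    return v


def filter_channel(channel, select):
    """Apply a min (0b00), median (0b01), or max (0b10) filter to one channel.

    channel is a list of rows of pixel values. Border pixels are copied
    through unchanged (assumption: the hardware may treat edges differently).
    """
    if select not in (0b00, 0b01, 0b10):
        raise ValueError("unsupported filter select code")
    height, width = len(channel), len(channel[0])
    out = [row[:] for row in channel]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [channel[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            ordered = bubble_sorted(window)
            # index 0 is the minimum, 4 the median of nine, -1 the maximum
            out[y][x] = ordered[{0b00: 0, 0b01: 4, 0b10: -1}[select]]
    return out


noisy = [[1, 2, 3], [4, 100, 6], [7, 8, 9]]
print(filter_channel(noisy, 0b01)[1][1])  # 6
```

Running the same function on the R, G, and B planes separately mirrors the independent per-channel filter modules described above.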

Sensor-based intelligent traffic management system

  • Designed a 5-state FSM that dynamically allocates Green/Yellow/Red signals based on live sensor inputs from 4 roads.
  • Implemented priority-based state transitions with configurable timers (55 cycles green, 10 cycles yellow).
  • Verified FSM determinism across 100+ simulation cycles, ensuring no deadlocks or undefined states.
  • Synthesized in Vivado with 0 timing violations, producing a clean schematic with state registers and control logic.
  • Demonstrated real-time congestion reduction with scalable timing parameters for smart city deployment.
  • Applications: Intelligent transportation systems, pedestrian-aware traffic controllers, and FPGA-based IoT solutions.
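
One way to picture the controller is the behavioral sketch below. The timer values (55 green cycles, 10 yellow cycles) come from the list above, but the state set (one idle state plus one service state per road), the fixed lowest-index-first priority, and the class and method names are assumptions for illustration only.

```python
GREEN_CYCLES = 55   # from the project description
YELLOW_CYCLES = 10  # from the project description


class TrafficController:
    """Hypothetical 5-state model: idle (None) or serving road 0..3."""

    def __init__(self):
        self.state = None  # None means all roads are red
        self.timer = 0

    def lights(self):
        """Return the current light for each of the 4 roads."""
        out = ["RED"] * 4
        if self.state is not None:
            out[self.state] = "GREEN" if self.timer < GREEN_CYCLES else "YELLOW"
        return out

    def step(self, sensors):
        """Advance one clock cycle; sensors is a list of 4 truthy/falsy inputs."""
        if self.state is None:
            # assumed priority: the lowest-numbered road with traffic wins
            for road, active in enumerate(sensors):
                if active:
                    self.state, self.timer = road, 0
                    break
        else:
            self.timer += 1
            if self.timer >= GREEN_CYCLES + YELLOW_CYCLES:
                self.state, self.timer = None, 0


ctrl = TrafficController()
ctrl.step([0, 0, 1, 0])
print(ctrl.lights())  # ['RED', 'RED', 'GREEN', 'RED']
```

Because every state has a timed exit back to idle, this model cannot deadlock; a fixed priority order can starve low-priority roads, so a real design might rotate priority instead.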

Complete SRAM collection: Single/Dual Port variants

  • Implemented four SRAM designs: single-port (sync/async read), pseudo dual-port, and true dual-port with independent clocks.
  • Verified simultaneous read/write in true dual-port SRAM across independent ports and clock domains.
  • Parameterized depth and width for scalable memory solutions, validated up to 256×32-bit arrays.
  • Conducted sync vs async read analysis, showing predictable latency for synchronous reads and immediate access for async reads.
  • Achieved successful Yosys synthesis with clean schematics confirming correct memory array inference.
  • Applications: CPU caches, FPGA memory blocks, dual-core communication buffers.
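
The latency difference between synchronous and asynchronous reads can be shown with a small behavioral model. This is a sketch under stated assumptions: the class name, the method names, and the read-first behavior on a simultaneous read and write are hypothetical and may differ from the Verilog modules.

```python
class SinglePortSram:
    """Behavioral model of a single-port SRAM with both read styles."""

    def __init__(self, depth=256, width=32):
        self.mem = [0] * depth
        self.mask = (1 << width) - 1
        self.sync_rdata = 0  # registered read output, updated on clock edges

    def async_read(self, addr):
        """Asynchronous read: data follows the address with no clock."""
        return self.mem[addr]

    def clock_edge(self, addr, write_enable=False, wdata=0):
        """One rising clock edge.

        Assumption: read-first, so the output register captures the old
        contents of addr before any write lands.
        """
        self.sync_rdata = self.mem[addr]
        if write_enable:
            self.mem[addr] = wdata & self.mask


ram = SinglePortSram()
ram.clock_edge(3, write_enable=True, wdata=0xDEADBEEF)
print(hex(ram.async_read(3)))  # 0xdeadbeef, visible immediately
print(hex(ram.sync_rdata))     # 0x0, the registered output lags by one edge
ram.clock_edge(3)
print(hex(ram.sync_rdata))     # 0xdeadbeef
```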

🛰️ UART Communication Protocol

Parameterized full-duplex UART with baud rate generator and FSM-based TX/RX logic

  • Designed a full-duplex UART with independent 4-state TX and 5-state RX FSMs, supporting 8N1 framing.
  • Implemented a baud rate generator with 16× oversampling, achieving reliable data reception from 9600 to 115,200 baud.
  • Verified loopback communication with clean VCD traces: TX complete @ 838,710 ns, RX complete @ 942,710 ns with 0xA5 integrity.
  • Developed hierarchical RTL modules for transmitter, receiver, and baud generator, ensuring modularity and reusability.
  • Synthesized and tested on FPGA toolchains with 0 functional mismatches across 50+ test cycles.
  • Applications: FPGA-to-PC communication, embedded serial links, SoC peripherals.
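
The 8N1 framing and 16× oversampling described above can be sketched in Python as follows. The 50 MHz system clock is an assumption (the clock frequency is not stated here), and the function names are hypothetical; the receiver samples each bit at its midpoint, which is the usual reason for oversampling.

```python
def oversample_divisor(clk_hz, baud, oversample=16):
    """Clock cycles per oversample tick for a given baud rate."""
    return round(clk_hz / (baud * oversample))


def frame_8n1(byte):
    """Start bit (0), eight data bits LSB first, one stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def decode_8n1(samples, oversample=16):
    """Decode one frame from line levels taken once per oversample tick.

    samples[0] must be the first tick of the start bit.
    """
    mid = oversample // 2  # sample in the middle of each bit period
    if samples[mid] != 0:
        raise ValueError("false start bit")
    value = 0
    for i in range(8):
        value |= samples[mid + (i + 1) * oversample] << i
    if samples[mid + 9 * oversample] != 1:
        raise ValueError("framing error: stop bit not high")
    return value


frame = frame_8n1(0xA5)
line = [bit for bit in frame for _ in range(16)]  # hold each bit for 16 ticks
print(frame)                  # [0, 1, 0, 1, 0, 0, 1, 0, 1, 1]
print(hex(decode_8n1(line)))  # 0xa5
print(oversample_divisor(50_000_000, 115_200))  # 27, assuming a 50 MHz clock
```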

High-performance multiplier with hybrid logic optimization

  • Designed an 8×8 Dadda multiplier in 45 nm CMOS using hybrid logic: Transmission Gate XOR/AND + CMOS OR gates.
  • Achieved critical path delay of 0.204 ns (~2 GHz max frequency) with power dissipation of 62.47 μW.
  • Optimized partial product reduction using 4:2 compressors, minimizing transistor count while maintaining stability.
  • Compared performance vs Wallace Tree and Booth multipliers, demonstrating superior speed-power efficiency.
  • Validated design through Cadence Virtuoso pre/post-layout simulations, confirming timing closure and scalability.
  • Applications: High-speed arithmetic units in DSPs, RISC processors, and low-power accelerators.
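
To make the 4:2 compressor step concrete, the sketch below models one compressor as two chained full adders and checks its arithmetic identity exhaustively. This is a common textbook construction and only an assumption about the structure; the transistor-level cell in the actual design may be built differently.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_4_2(x1, x2, x3, x4, cin):
    """Reduce four partial-product bits plus a carry-in to (sum, carry, cout).

    cout depends only on x1..x3, so it can feed the next column's cin
    without creating a ripple chain through cin.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


# Exhaustive check: the five input bits always equal sum + 2*(carry + cout).
for bits in product((0, 1), repeat=5):
    s, carry, cout = compressor_4_2(*bits)
    assert sum(bits) == s + 2 * (carry + cout)
```

Each compressor turns four rows of a column into two, which is how the partial-product rows shrink before the final carry-propagate adder.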

💼 Experience

💻 Firmware Engineer @ WinWin Labs (Volunteer)

Remote, US | Aug 2025 – Present

  • Developed embedded firmware for IoT systems using C/C++, implementing real-time communication protocols (UART, SPI, I²C) with interrupt handling and buffer management.
  • Collaborated with hardware teams on board bring-up, interface validation, and system-level debugging of microcontroller-based platforms including PlatformIO development environments and Arduino DevKit boards.
  • Programmed and tested embedded applications for connected IoT devices, ensuring reliable operation in resource-constrained environments with hands-on experience on Xilinx PYNQ-Z2 development boards and ESP32 microcontroller platforms.

🎓 Graduate Teaching Assistant @ Purdue University (EPICS)

Aug 2024 – May 2025 | Indianapolis, IN

  • Mentored 35+ engineering students in AI/ML application development for Vaani Connect speech-to-text translation project.
  • Designed and implemented technical platforms and testing frameworks, improving project development efficiency by 30%.
  • Provided technical guidance in digital design, RTL coding, and verification methodologies across First-Year Engineering programs.

👨‍💻 Engineering Intern @ Thyssenkrupp Crankshaft Company

May 2024 – Aug 2024 | Illinois, US

  • Evaluated Marposs system components for electrical compatibility and upgrade planning across critical machines.
  • Created standardized parts lists and collaborated with OEM support for phased system modernization.

🧑‍🏭 Junior Electrical Manager @ 21 Knots Engineering

Feb 2022 – July 2023 | Mumbai, IN

  • Led procurement and execution for major electrical engineering projects.
  • Maintained 100% execution success and high client satisfaction.

🧑‍🔧 Senior Electrical Design Engineer @ Petrocil Engineering

June 2019 – Jan 2022 | Mumbai, IN

  • Delivered over 10 successful electrical design projects.
  • Improved on-site technical resolution by 10%.

🎓 Education

MS Electrical Engineering
Purdue University Indianapolis
Specialization: VLSI Design & FPGA Systems


🏆 Highlights

  • 💡 Passionate about crafting efficient, synthesizable hardware that just works
  • 🔁 Continuously learning and mastering complex RTL design and verification techniques
  • 🤝 Enjoy collaborating with cross-disciplinary teams to bring projects to life
  • 🛠️ Experienced in bridging theoretical designs with practical FPGA implementations
  • 🚀 Always pushing boundaries by exploring new architectures and optimization methods

Pinned

  1. Morphological-Image-Filtering-on-PYNQ-FPGA Public

    Morphological image filtering system on PYNQ-Z2 FPGA implementing min, max, and median filters. Demonstrates HW/SW co-design with VHDL acceleration and Python orchestration for real-time image enha…

    VHDL 1

  2. Asynchronous-FIFO Public

    Production-ready asynchronous FIFO buffer with independent read/write clock domains for safe CDC operations. Features Gray code pointers, dual flip-flop synchronizers, metastability prevention, and…

    Verilog

  3. GCD-Calculator Public

    Greatest Common Divisor calculator showcasing CPU-like controller + datapath architecture using subtraction-based Euclidean algorithm. Demonstrates synthesizable FSM design vs behavioral modeling t…

    Verilog

  4. Sorting-Algorithm-Visualizer-in-Python Public

    A Python-based sorting algorithm visualizer that demonstrates Bubble, Quick, Merge, and Radix Sort with step-by-step animations using Matplotlib. Includes performance comparison, command-line custo…

    Python

  5. UART Public

    Fully parameterized UART (Universal Asynchronous Receiver Transmitter) module in Verilog with FSM-based transmitter and receiver, configurable baud rate generator, and support for full-duplex commu…

    Verilog

  6. Delay-and-Power-Analysis-of-a-Static-8x8-Dadda-Multiplier-Circuit Public

    High-speed 8×8 Dadda multiplier designed in 45 nm CMOS technology with hybrid transmission gate/CMOS logic. Features 4:2 compressor-based partial product reduction, critical path delay of 0.204 ns, a…