Hi there ! 👋
I like to explore the past to shape a smarter future.
In this project, I developed an NLP classifier for predicting medical domains in texts using a Siamese Neural Network.
The approach involves training a simple classifier with a smart representation for each text. This representation is a vector that captures the distance to text prototypes. The distance measure is learned using a dual network, consisting of a PubMedBERT model.

NLP models require large amounts of annotated data, which are often difficult to obtain, especially in specialized fields like medicine. 💊
To address this challenge, we propose a multi-head BERT-based architecture, trained simultaneously on a 'main task' (which has limited data) and 'supporting tasks' (which have abundant data), enabling cross-task knowledge transfer.
This approach is inspired by how humans generalize knowledge across different domains. For example, learning to play the piano enhances hand-eye coordination, which can then improve performance in other activities, such as basketball.

Large Language Models are pre-trained on vast public data collected from the Internet, which likely contains private or sensitive information. Combined with the fact that large models tend to memorize training data, this scenario poses a potential risk of data leakage. 🔓
In this project, I reimplemented a decoding method that ensures privacy with high probability, based on the paper "Differentially Private Decoding in Large Language Models" (Majmudar et al., 2022), and experimented on GPT-2 and ViLT (Vision-and-Language Transformer) models.

Sepsis is a life-threatening medical condition that can lead to organ failure and death without timely treatment in the Intensive Care Unit (ICU). Managing septic patients involves the administration of several medications, but there is no universal treatment policy due to patient variability and the complexity of the disease.
In this project, I developed a Reinforcement Learning (RL)-based agent that adjusts medication doses based on real-time clinical information from patients.

Fast inference time is crucial for deep learning models that are deployed on resource-constrained devices, for providing real-time responses to user requests, and for cost and sustainability reasons. 🌿
In this blog I review the “early exiting” method, which improves inference time by allowing samples to exit at different depths within the network, potentially making many “easier” samples to exit early and thus avoiding unnecessary computations while still maintaining accuracy.