Hi there 👋 I am Tiansheng Huang

  • I am a third-year PhD candidate at Georgia Tech.
  • I work on safety alignment for large language models, with a particular interest in red-teaming attacks and defenses for LLMs.

Selected Publications

I push myself to publish a high-quality paper roughly every three months.

Here are the papers I wrote in 2025.

  • [2025/3/01] Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable arXiv [paper] [code]
  • [2025/1/30] Virus: Harmful Fine-tuning Attack for Large Language Models bypassing Guardrail Moderation arXiv [paper] [code]

Here are the papers I wrote in 2024.

  • [2024/9/26] Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey arXiv [paper] [repo]
  • [2024/9/3] Booster: Tackling harmful fine-tuning for large language models via attenuating harmful perturbation ICLR2025 [paper] [code] [Openreview]
  • [2024/8/18] Antidote: Post-fine-tuning safety alignment for large language models against harmful fine-tuning arXiv [paper]
  • [2024/5/28] Lazy safety alignment for large language models against harmful fine-tuning NeurIPS2024 [paper] [code]
  • [2024/2/2] Vaccine: Perturbation-aware alignment for large language models against harmful fine-tuning NeurIPS2024 [paper] [code]

Pinned Repositories

  1. git-disl/Safety-Tax (Python): Official code for the paper "Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable".

  2. git-disl/awesome_LLM-harmful-fine-tuning-papers: A survey on harmful fine-tuning attacks for large language models.

  3. git-disl/Virus (Python): Official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation".

  4. git-disl/Booster (Shell): Official code for the paper "Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation" (ICLR2025).

  5. git-disl/Vaccine (Shell): Official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" (NeurIPS2024).

  6. git-disl/Lisa (Python): Official code for the paper "Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning" (NeurIPS2024).