Commit 8719f2a

yamanksingla and Yaman Kumar authored
Changed persuasion webpages (#51)
Co-authored-by: Yaman Kumar <ykumar@Yamans-MacBook-Pro.local>
1 parent 2a2febb commit 8719f2a

5 files changed: +95 -24 lines changed

PersuasionArena.html

Lines changed: 6 additions & 6 deletions
@@ -58,12 +58,12 @@ <h1 class="text-center">Persuasion Arena: ELO</h1>
 <div class="container is-max-desktop content">
 <h2 class="title">BibTeX</h2>
 <pre><code>
-@online{singh2024measuring,
-author = {Singh, Somesh and Singla, Yaman K and SI, Harini and Krishnamurthy, Balaji},
-title = {Measuring and Improving Persuasive Abilities of Generative Models},
-year = {2024},
-url = {https://behavior-in-the-wild.github.io/measure-persuasion}
-}
+@article{singh2024measuring,
+title={Measuring and Improving Persuasiveness of Generative Models},
+author={Somesh Singh and Yaman K Singla and Harini SI and Balaji Krishnamurthy},
+year={2024},
+journal={arXiv preprint arXiv:2410.02653}
+}
 
 </code></pre>
 <p>Get in touch with us at <a href="mailto:behavior-in-the-wild@googlegroups.com">behavior-in-the-wild@googlegroups.com</a> </p>

measure-persuasion.html

Lines changed: 89 additions & 18 deletions
@@ -58,7 +58,7 @@ <h1 class="title is-1 publication-title"> Measuring And Improving Persuasiveness
 <div class="column has-text-centered">
 <div class="publication-links">
 <span class="link-block">
-<a href="./static/pdf/Transsuasion__Measuring_and_Improving_Behavior_Transfer_And_Persuasion_Abilities_Of_Generative_Models.pdf"
+<a href="https://arxiv.org/abs/2410.02653"
 class="external-link button is-normal is-rounded is-dark" target="_blank">
 <span class="icon">
 <i class="ai ai-arxiv"></i>
@@ -118,26 +118,95 @@ <h1 class="title is-1 publication-title"> Measuring And Improving Persuasiveness
 </div>
 </div>
 </section>
-
+
 <section class="hero teaser">
 <div class="container is-max-desktop">
 <div class="hero-body">
 <h4 class="subtitle has-text-centered">
-🔥<span style="color: #ff3860">[NEW!]</span>We introduce the task of transsuasion, the task of transferring content from one behavior to another while holding the other conditions like meaning, speaker, and time constant.
+🔥<span style="color: #ff3860">[NEW!]</span><b>Introducing PersuasionBench and PersuasionArena</b> - First large-scale automated benchmark and arena to measure the persuasive abilities of generative models.
+<br>
+🔥<span style="color: #ff3860">[NEW!]</span>We introduce the task of transsuasion, the task of transferring content from one behavior to another while holding the other conditions like meaning, speaker, and time constant.
 <br>
-🔥<span style="color: #ff3860">[NEW!]</span>We exhibit better or similar 0-shot and few shot abilities than GPT4 on transcreation, seo, and modelling human preference with a 13B model!
+🔥<span style="color: #ff3860">[NEW!]</span> <b>Challenging Scale Assumptions</b> - Smaller models can outperform larger ones in persuasion when trained on targeted datasets.
+<br>
+🔥<span style="color: #ff3860">[NEW!]</span><b>Policy Implications</b> - Current regulations like SB-1047 and EU AI law fail to capture the full impact of AI on society, highlighting the need for more comprehensive measures.
 <br>
 🔥<span style="color: #ff3860">[NEW!]</span>We release the <a href="./PersuasionArena.html" target="_blank">Persuasion Leaderboard</a> and you can also participate in the persuasion <a href="./humaneval.html" target="_blank">Human-Eval</a>
-<br><br>
-We develop an instruction fine-tuning regime to show that smaller LLMs can also surpass the persuasion capabilities of much larger LLMs. We compare the contributions of various types of instructions in developing persuasion capabilities.
-<br><br>
-Further, we show that training on synthetically generated explanations of why a tweet might perform better than another tweet further helps increase the persuasion capability of LLMs beyond just the ground-truth instruction data.
 </h4>
 </div>
 </div>
 </section>
 
+<section class="section" id="Leaderboard">
+<div class="container is-max-desktop">
+<div class="columns is-centered has-text-centered">
+<div class="column is-six-fifths">
+<h2 class="title is-3">Persuasion Leaderboard</h2>
+<p>Here are the results of our models on the Persuasion Leaderboard. The leaderboard is based on the <a href="https://arxiv.org/abs/2410.02653">paper</a> and the <a href="./PersuasionArena.html">PersuasionArena</a> website.</p>
+</div>
+</div>
+</div>
 
+<table border="1", class="LeaderboardTable", style="width: 80%", align="center">
+<thead>
+<tr style="background-color:#f68946;color:white;", align="center">
+<th>Model</th>
+<th>Avg. Elo</th>
+</tr>
+</thead>
+<tbody align="center", style="background-color:#f8f8f8;">
+<tr>
+<td>Topline (T2) 🥇</td>
+<td>1357</td>
+</tr>
+<tr>
+<td>Ours (13B) 🥈</td>
+<td>1293</td>
+</tr>
+<tr>
+<td>Ours-Instruct (13B) 🥉</td>
+<td>1304</td>
+</tr>
+<tr>
+<td>Ours (CS+BS) (13B)</td>
+<td>1299</td>
+</tr>
+<tr>
+<td>Vicuna-1.5-13B</td>
+<td>1195</td>
+</tr>
+<tr>
+<td>LLaMA3-70B</td>
+<td>1099</td>
+</tr>
+<tr>
+<td>GPT-3.5</td>
+<td>877</td>
+</tr>
+<tr>
+<td>GPT-4o</td>
+<td>1187</td>
+</tr>
+<tr>
+<td>GPT-4</td>
+<td>1092</td>
+</tr>
+<tr>
+<td>Baseline (T1)</td>
+<td>1251</td>
+</tr>
+<tr>
+<td>GPT-4</td>
+<td>1213</td>
+</tr>
+<tr>
+<td>Baseline (T1)</td>
+<td>979</td>
+</tr>
+</tbody>
+</table>
+
+</section>
 
 <section class="section" id="Examples">
 
@@ -146,13 +215,17 @@ <h4 class="subtitle has-text-centered">
 <h2 class="title is-3"> Transsuasion Examples</h2>
 
 A few samples showing Transsuasion. While the account, time, and meaning of the samples remain similar, the behavior over the samples varies significantly.
-<img id="transsuasion-ground-truth" width="100%" src="images/transsuasion-headline-image.jpeg", alt="A few samples showing Transsuasion. While the account, time, and meaning of the samples remain similar, the behavior over the samples varies significantly.">
+<img id="transsuasion-ground-truth" width="80%" src="images/transsuasion-samples-ground-truth-1.jpg", alt="A few samples showing Transsuasion. While the account, time, and meaning of the samples remain similar, the behavior over the samples varies significantly.">
+<br><br>
+<img id="transsuasion-ground-truth" width="80%" src="images/transsuasion-headline-image.jpeg", alt="A few samples showing Transsuasion. While the account, time, and meaning of the samples remain similar, the behavior over the samples varies significantly.">
 
 <br><br>
 
 
 A few samples showing Transsuasion using our model. The left part contains original low-liked tweet, and the right contains the transsuaded version of the tweet.
-<img id="transsuasion-generated-examples" width="100%" src="images/transsuasion-generated-examples.jpeg", alt="A few samples showing Transsuasion using our model. The left part contains original low-liked tweet, and the right contains the transsuaded version of the tweet.">
+<img id="transsuasion-generated-examples" width="80%" src="images/transsuasion-samples-generated.jpg", alt="A few samples showing Transsuasion using our model. The left part contains original low-liked tweet, and the right contains the transsuaded version of the tweet.">
+<br><br>
+<img id="transsuasion-generated-examples" width="80%" src="images/transsuasion-generated-examples.jpeg", alt="A few samples showing Transsuasion using our model. The left part contains original low-liked tweet, and the right contains the transsuaded version of the tweet.">
 <br><br>
 
 </div>
@@ -161,7 +234,6 @@ <h2 class="title is-3"> Transsuasion Examples</h2>
 
 
 
-
 
 <section class="section" style="background-color:#efeff081">
 <div class="container is-max-desktop">
@@ -465,13 +537,12 @@ <h2 class="title is-4">Optical character recognition (OCR)</a></h2>
 <div class="container is-max-desktop content">
 <h2 class="title">BibTeX</h2>
 <pre><code>
-@online{singh2024measuring,
-author = {Singh, Somesh and Singla, Yaman K and SI, Harini and Krishnamurthy, Balaji},
-title = {Measuring and Improving Persuasive Abilities of Generative Models},
-year = {2024},
-url = {https://behavior-in-the-wild.github.io/measure-persuasion}
-}
-
+@article{singh2024measuring,
+title={Measuring and Improving Persuasiveness of Generative Models},
+author={Somesh Singh and Yaman K Singla and Harini SI and Balaji Krishnamurthy},
+year={2024},
+journal={arXiv preprint arXiv:2410.02653}
+}
 </code></pre>
 </div>
 
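Note on reading the "Avg. Elo" column added to measure-persuasion.html: the page does not say how the averages are computed, so the sketch below assumes the standard Elo model (logistic curve, 400-point scale) purely as an illustration. The helper name `elo_expected_score` is hypothetical and is not taken from the repository. The ratings are copied from four rows of the new table, leaving out the model names that appear twice. The sketch sorts these rows by rating and converts one rating gap into an expected head-to-head win rate.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model.

    Assumption: the leaderboard uses the usual 400-point logistic scale;
    the commit itself does not confirm this.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Avg. Elo values copied from rows of the table added in this commit.
leaderboard = {
    "Topline (T2)": 1357,
    "Ours-Instruct (13B)": 1304,
    "GPT-4o": 1187,
    "GPT-3.5": 877,
}

# Rank models from highest to lowest rating.
ranked = sorted(leaderboard.items(), key=lambda item: item[1], reverse=True)
for name, elo in ranked:
    print(f"{name:22s} {elo}")

# Expected win rate of Ours-Instruct (13B) over GPT-4o under the assumed model.
p = elo_expected_score(leaderboard["Ours-Instruct (13B)"], leaderboard["GPT-4o"])
print(f"Expected score vs GPT-4o: {p:.2f}")
```

Under this assumed model, equal ratings give an expected score of 0.5, and a lead of roughly 100 points corresponds to winning a bit under two thirds of pairwise comparisons.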