-
Notifications
You must be signed in to change notification settings - Fork 0
/
CSE256-Demo-Report.html
501 lines (467 loc) · 26.1 KB
/
CSE256-Demo-Report.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
<!doctype html>
<html>
<head>
<meta charset='UTF-8'><meta name='viewport' content='width=device-width initial-scale=1'>
<title>CSE256 - Report</title><link href='https://fonts.loli.net/css?family=Open+Sans:400italic,700italic,700,400&subset=latin,latin-ext' rel='stylesheet' type='text/css' /><style type='text/css'>html {overflow-x: initial !important;}:root { --bg-color: #ffffff; --text-color: #333333; --select-text-bg-color: #B5D6FC; --select-text-font-color: auto; --monospace: "Lucida Console",Consolas,"Courier",monospace; }
html { font-size: 14px; background-color: var(--bg-color); color: var(--text-color); font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; -webkit-font-smoothing: antialiased; }
body { margin: 0px; padding: 0px; height: auto; bottom: 0px; top: 0px; left: 0px; right: 0px; font-size: 1rem; line-height: 1.42857143; overflow-x: hidden; background-image: inherit; background-size: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: inherit; tab-size: 4; background-position: inherit inherit; background-repeat: inherit inherit; }
iframe { margin: auto; }
a.url { word-break: break-all; }
a:active, a:hover { outline: 0px; }
.in-text-selection, ::selection { text-shadow: none; background: var(--select-text-bg-color); color: var(--select-text-font-color); }
#write { margin: 0px auto; height: auto; width: inherit; word-break: normal; word-wrap: break-word; position: relative; white-space: normal; overflow-x: visible; padding-top: 40px; }
#write.first-line-indent p { text-indent: 2em; }
#write.first-line-indent li p, #write.first-line-indent p * { text-indent: 0px; }
#write.first-line-indent li { margin-left: 2em; }
.for-image #write { padding-left: 8px; padding-right: 8px; }
body.typora-export { padding-left: 30px; padding-right: 30px; }
.typora-export .footnote-line, .typora-export li, .typora-export p { white-space: pre-wrap; }
@media screen and (max-width: 500px) {
body.typora-export { padding-left: 0px; padding-right: 0px; }
#write { padding-left: 20px; padding-right: 20px; }
.CodeMirror-sizer { margin-left: 0px !important; }
.CodeMirror-gutters { display: none !important; }
}
#write li > figure:last-child { margin-bottom: 0.5rem; }
#write ol, #write ul { position: relative; }
img { max-width: 100%; vertical-align: middle; }
button, input, select, textarea { color: inherit; font-family: inherit; font-size: inherit; font-style: inherit; font-variant-caps: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; }
input[type="checkbox"], input[type="radio"] { line-height: normal; padding: 0px; }
*, ::after, ::before { box-sizing: border-box; }
#write h1, #write h2, #write h3, #write h4, #write h5, #write h6, #write p, #write pre { width: inherit; }
#write h1, #write h2, #write h3, #write h4, #write h5, #write h6, #write p { position: relative; }
h1, h2, h3, h4, h5, h6 { break-after: avoid-page; break-inside: avoid; orphans: 2; }
p { orphans: 4; }
h1 { font-size: 2rem; }
h2 { font-size: 1.8rem; }
h3 { font-size: 1.6rem; }
h4 { font-size: 1.4rem; }
h5 { font-size: 1.2rem; }
h6 { font-size: 1rem; }
.md-math-block, .md-rawblock, h1, h2, h3, h4, h5, h6, p { margin-top: 1rem; margin-bottom: 1rem; }
.hidden { display: none; }
.md-blockmeta { color: rgb(204, 204, 204); font-weight: 700; font-style: italic; }
a { cursor: pointer; }
sup.md-footnote { padding: 2px 4px; background-color: rgba(238, 238, 238, 0.701961); color: rgb(85, 85, 85); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; cursor: pointer; }
sup.md-footnote a, sup.md-footnote a:hover { color: inherit; text-transform: inherit; text-decoration: inherit; }
#write input[type="checkbox"] { cursor: pointer; width: inherit; height: inherit; }
figure { overflow-x: auto; margin: 1.2em 0px; max-width: calc(100% + 16px); padding: 0px; }
figure > table { margin: 0px !important; }
tr { break-inside: avoid; break-after: auto; }
thead { display: table-header-group; }
table { border-collapse: collapse; border-spacing: 0px; width: 100%; overflow: auto; break-inside: auto; text-align: left; }
table.md-table td { min-width: 32px; }
.CodeMirror-gutters { border-right-width: 0px; background-color: inherit; }
.CodeMirror-linenumber { }
.CodeMirror { text-align: left; }
.CodeMirror-placeholder { opacity: 0.3; }
.CodeMirror pre { padding: 0px 4px; }
.CodeMirror-lines { padding: 0px; }
div.hr:focus { cursor: none; }
#write pre { white-space: pre-wrap; }
#write.fences-no-line-wrapping pre { white-space: pre; }
#write pre.ty-contain-cm { white-space: normal; }
.CodeMirror-gutters { margin-right: 4px; }
.md-fences { font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; overflow: visible; white-space: pre; background-image: inherit; background-size: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: inherit; position: relative !important; background-position: inherit inherit; background-repeat: inherit inherit; }
.md-diagram-panel { width: 100%; margin-top: 10px; text-align: center; padding-top: 0px; padding-bottom: 8px; overflow-x: auto; }
#write .md-fences.mock-cm { white-space: pre-wrap; }
.md-fences.md-fences-with-lineno { padding-left: 0px; }
#write.fences-no-line-wrapping .md-fences.mock-cm { white-space: pre; overflow-x: auto; }
.md-fences.mock-cm.md-fences-with-lineno { padding-left: 8px; }
.CodeMirror-line, twitterwidget { break-inside: avoid; }
.footnotes { opacity: 0.8; font-size: 0.9rem; margin-top: 1em; margin-bottom: 1em; }
.footnotes + .footnotes { margin-top: 0px; }
.md-reset { margin: 0px; padding: 0px; border: 0px; outline: 0px; vertical-align: top; text-decoration: none; text-shadow: none; float: none; position: static; width: auto; height: auto; white-space: nowrap; cursor: inherit; line-height: normal; font-weight: 400; text-align: left; box-sizing: content-box; direction: ltr; background-position: 0px 0px; background-repeat: initial initial; }
li div { padding-top: 0px; }
blockquote { margin: 1rem 0px; }
li .mathjax-block, li p { margin: 0.5rem 0px; }
li { margin: 0px; position: relative; }
blockquote > :last-child { margin-bottom: 0px; }
blockquote > :first-child, li > :first-child { margin-top: 0px; }
.footnotes-area { color: rgb(136, 136, 136); margin-top: 0.714rem; padding-bottom: 0.143rem; white-space: normal; }
#write .footnote-line { white-space: pre-wrap; }
@media print {
body, html { border: 1px solid transparent; height: 99%; break-after: avoid-page; break-before: avoid-page; }
#write { margin-top: 0px; padding-top: 0px; border-color: transparent !important; }
.typora-export * { -webkit-print-color-adjust: exact; }
html.blink-to-pdf { font-size: 13px; }
.typora-export #write { padding-left: 32px; padding-right: 32px; padding-bottom: 0px; break-after: avoid-page; }
.typora-export #write::after { height: 0px; }
@page { margin: 20mm 0px; }
}
.footnote-line { margin-top: 0.714em; font-size: 0.7em; }
a img, img a { cursor: pointer; }
pre.md-meta-block { font-size: 0.8rem; min-height: 0.8rem; white-space: pre-wrap; background-color: rgb(204, 204, 204); display: block; overflow-x: hidden; background-position: initial initial; background-repeat: initial initial; }
p > .md-image:only-child:not(.md-img-error) img, p > img:only-child { display: block; margin: auto; }
p > .md-image:only-child { display: inline-block; width: 100%; }
#write .MathJax_Display { margin: 0.8em 0px 0px; }
.md-math-block { width: 100%; }
.md-math-block:not(:empty)::after { display: none; }
[contenteditable="true"]:active, [contenteditable="true"]:focus { outline: 0px; box-shadow: none; }
.md-task-list-item { position: relative; list-style-type: none; }
.task-list-item.md-task-list-item { padding-left: 0px; }
.md-task-list-item > input { position: absolute; top: 0px; left: 0px; margin-left: -1.2em; margin-top: calc(1em - 10px); border: none; }
.math { font-size: 1rem; }
.md-toc { min-height: 3.58rem; position: relative; font-size: 0.9rem; border-top-left-radius: 10px; border-top-right-radius: 10px; border-bottom-right-radius: 10px; border-bottom-left-radius: 10px; }
.md-toc-content { position: relative; margin-left: 0px; }
.md-toc-content::after, .md-toc::after { display: none; }
.md-toc-item { display: block; color: rgb(65, 131, 196); }
.md-toc-item a { text-decoration: none; }
.md-toc-inner:hover { text-decoration: underline; }
.md-toc-inner { display: inline-block; cursor: pointer; }
.md-toc-h1 .md-toc-inner { margin-left: 0px; font-weight: 700; }
.md-toc-h2 .md-toc-inner { margin-left: 2em; }
.md-toc-h3 .md-toc-inner { margin-left: 4em; }
.md-toc-h4 .md-toc-inner { margin-left: 6em; }
.md-toc-h5 .md-toc-inner { margin-left: 8em; }
.md-toc-h6 .md-toc-inner { margin-left: 10em; }
@media screen and (max-width: 48em) {
.md-toc-h3 .md-toc-inner { margin-left: 3.5em; }
.md-toc-h4 .md-toc-inner { margin-left: 5em; }
.md-toc-h5 .md-toc-inner { margin-left: 6.5em; }
.md-toc-h6 .md-toc-inner { margin-left: 8em; }
}
a.md-toc-inner { font-size: inherit; font-style: inherit; font-weight: inherit; line-height: inherit; }
.footnote-line a:not(.reversefootnote) { color: inherit; }
.md-attr { display: none; }
.md-fn-count::after { content: "."; }
code, pre, samp, tt { font-family: var(--monospace); }
kbd { margin: 0px 0.1em; padding: 0.1em 0.6em; font-size: 0.8em; color: rgb(36, 39, 41); background-color: rgb(255, 255, 255); border: 1px solid rgb(173, 179, 185); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; box-shadow: rgba(12, 13, 14, 0.2) 0px 1px 0px, rgb(255, 255, 255) 0px 0px 0px 2px inset; white-space: nowrap; vertical-align: middle; background-position: initial initial; background-repeat: initial initial; }
.md-comment { color: rgb(162, 127, 3); opacity: 0.8; font-family: var(--monospace); }
code { text-align: left; }
a.md-print-anchor { white-space: pre !important; border: none !important; display: inline-block !important; position: absolute !important; width: 1px !important; right: 0px !important; outline: 0px !important; text-shadow: initial !important; background-position: 0px 0px !important; background-repeat: initial initial !important; }
.md-inline-math .MathJax_SVG .noError { display: none !important; }
.html-for-mac .inline-math-svg .MathJax_SVG { vertical-align: 0.2px; }
.md-math-block .MathJax_SVG_Display { text-align: center; margin: 0px; position: relative; text-indent: 0px; max-width: none; max-height: none; min-height: 0px; min-width: 100%; width: auto; overflow-y: hidden; display: block !important; }
.MathJax_SVG_Display, .md-inline-math .MathJax_SVG_Display { width: auto; margin: inherit; display: inline-block !important; }
.MathJax_SVG .MJX-monospace { font-family: var(--monospace); }
.MathJax_SVG .MJX-sans-serif { font-family: sans-serif; }
.MathJax_SVG { display: inline; font-style: normal; font-weight: 400; line-height: normal; zoom: 90%; text-indent: 0px; text-align: left; text-transform: none; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; }
.MathJax_SVG * { transition: none; }
.MathJax_SVG_Display svg { vertical-align: middle !important; margin-bottom: 0px !important; }
.os-windows.monocolor-emoji .md-emoji { font-family: "Segoe UI Symbol", sans-serif; }
.md-diagram-panel > svg { max-width: 100%; }
[lang="mermaid"] svg, [lang="flow"] svg { max-width: 100%; }
[lang="mermaid"] .node text { font-size: 1rem; }
table tr th { border-bottom-width: 0px; }
video { max-width: 100%; display: block; margin: 0px auto; }
iframe { max-width: 100%; width: 100%; border: none; }
.highlight td, .highlight tr { border: 0px; }
:root {
--side-bar-bg-color: #fafafa;
--control-text-color: #777;
}
@include-when-export url(https://fonts.loli.net/css?family=Open+Sans:400italic,700italic,700,400&subset=latin,latin-ext);
html {
font-size: 16px;
}
body {
font-family: "Open Sans","Clear Sans","Helvetica Neue",Helvetica,Arial,sans-serif;
color: rgb(51, 51, 51);
line-height: 1.6;
}
#write {
max-width: 860px;
margin: 0 auto;
padding: 30px;
padding-bottom: 100px;
}
#write > ul:first-child,
#write > ol:first-child{
margin-top: 30px;
}
a {
color: #4183C4;
}
h1,
h2,
h3,
h4,
h5,
h6 {
position: relative;
margin-top: 1rem;
margin-bottom: 1rem;
font-weight: bold;
line-height: 1.4;
cursor: text;
}
h1:hover a.anchor,
h2:hover a.anchor,
h3:hover a.anchor,
h4:hover a.anchor,
h5:hover a.anchor,
h6:hover a.anchor {
text-decoration: none;
}
h1 tt,
h1 code {
font-size: inherit;
}
h2 tt,
h2 code {
font-size: inherit;
}
h3 tt,
h3 code {
font-size: inherit;
}
h4 tt,
h4 code {
font-size: inherit;
}
h5 tt,
h5 code {
font-size: inherit;
}
h6 tt,
h6 code {
font-size: inherit;
}
h1 {
padding-bottom: .3em;
font-size: 2.25em;
line-height: 1.2;
border-bottom: 1px solid #eee;
}
h2 {
padding-bottom: .3em;
font-size: 1.75em;
line-height: 1.225;
border-bottom: 1px solid #eee;
}
h3 {
font-size: 1.5em;
line-height: 1.43;
}
h4 {
font-size: 1.25em;
}
h5 {
font-size: 1em;
}
h6 {
font-size: 1em;
color: #777;
}
p,
blockquote,
ul,
ol,
dl,
table{
margin: 0.8em 0;
}
li>ol,
li>ul {
margin: 0 0;
}
hr {
height: 2px;
padding: 0;
margin: 16px 0;
background-color: #e7e7e7;
border: 0 none;
overflow: hidden;
box-sizing: content-box;
}
li p.first {
display: inline-block;
}
ul,
ol {
padding-left: 30px;
}
ul:first-child,
ol:first-child {
margin-top: 0;
}
ul:last-child,
ol:last-child {
margin-bottom: 0;
}
blockquote {
border-left: 4px solid #dfe2e5;
padding: 0 15px;
color: #777777;
}
blockquote blockquote {
padding-right: 0;
}
table {
padding: 0;
word-break: initial;
}
table tr {
border-top: 1px solid #dfe2e5;
margin: 0;
padding: 0;
}
table tr:nth-child(2n),
thead {
background-color: #f8f8f8;
}
table tr th {
font-weight: bold;
border: 1px solid #dfe2e5;
border-bottom: 0;
text-align: left;
margin: 0;
padding: 6px 13px;
}
table tr td {
border: 1px solid #dfe2e5;
text-align: left;
margin: 0;
padding: 6px 13px;
}
table tr th:first-child,
table tr td:first-child {
margin-top: 0;
}
table tr th:last-child,
table tr td:last-child {
margin-bottom: 0;
}
.CodeMirror-lines {
padding-left: 4px;
}
.code-tooltip {
box-shadow: 0 1px 1px 0 rgba(0,28,36,.3);
border-top: 1px solid #eef2f2;
}
.md-fences,
code,
tt {
border: 1px solid #e7eaed;
background-color: #f8f8f8;
border-radius: 3px;
padding: 0;
padding: 2px 4px 0px 4px;
font-size: 0.9em;
}
code {
background-color: #f3f4f4;
padding: 0 2px 0 2px;
}
.md-fences {
margin-bottom: 15px;
margin-top: 15px;
padding-top: 8px;
padding-bottom: 6px;
}
.md-task-list-item > input {
margin-left: -1.3em;
}
@media print {
html {
font-size: 13px;
}
table,
pre {
page-break-inside: avoid;
}
pre {
word-wrap: break-word;
}
}
.md-fences {
background-color: #f8f8f8;
}
#write pre.md-meta-block {
padding: 1rem;
font-size: 85%;
line-height: 1.45;
background-color: #f7f7f7;
border: 0;
border-radius: 3px;
color: #777777;
margin-top: 0 !important;
}
.mathjax-block>.code-tooltip {
bottom: .375rem;
}
.md-mathjax-midline {
background: #fafafa;
}
#write>h3.md-focus:before{
left: -1.5625rem;
top: .375rem;
}
#write>h4.md-focus:before{
left: -1.5625rem;
top: .285714286rem;
}
#write>h5.md-focus:before{
left: -1.5625rem;
top: .285714286rem;
}
#write>h6.md-focus:before{
left: -1.5625rem;
top: .285714286rem;
}
.md-image>.md-meta {
/*border: 1px solid #ddd;*/
border-radius: 3px;
padding: 2px 0px 0px 4px;
font-size: 0.9em;
color: inherit;
}
.md-tag {
color: #a7a7a7;
opacity: 1;
}
.md-toc {
margin-top:20px;
padding-bottom:20px;
}
.sidebar-tabs {
border-bottom: none;
}
#typora-quick-open {
border: 1px solid #ddd;
background-color: #f8f8f8;
}
#typora-quick-open-item {
background-color: #FAFAFA;
border-color: #FEFEFE #e5e5e5 #e5e5e5 #eee;
border-style: solid;
border-width: 1px;
}
/** focus mode */
.on-focus-mode blockquote {
border-left-color: rgba(85, 85, 85, 0.12);
}
header, .context-menu, .megamenu-content, footer{
font-family: "Segoe UI", "Arial", sans-serif;
}
.file-node-content:hover .file-node-icon,
.file-node-content:hover .file-node-open-state{
visibility: visible;
}
.mac-seamless-mode #typora-sidebar {
background-color: #fafafa;
background-color: var(--side-bar-bg-color);
}
.md-lang {
color: #b4654d;
}
.html-for-mac .context-menu {
--item-hover-bg-color: #E6F0FE;
}
#md-notification .btn {
border: 0;
}
.dropdown-menu .divider {
border-color: #e5e5e5;
}
.typora-export li, .typora-export p, .typora-export, .footnote-line {white-space: normal;}
</style>
</head>
<body class='typora-export' >
<div id='write' class = 'is-mac'><p> </p><p> </p><p> </p><h1><a name='header-n161' class='md-header-anchor '></a><center>CSE 256 Demo Report</center></h1> <center>
<br>
Yiwen Song, Yue Zhao
</center> <p> </p><p> </p><h2><a name='header-n40' class='md-header-anchor '></a>Overview</h2><p>In this project, we've developed a pipeline for text classification. The steps to take are as follows:</p><ul><li><p>Firstly, you may <strong>upload</strong> a <strong>train file</strong>. </p><ul><li>This train file must be of the following format: It consists of 2 columns, the first column is the label and the second column is the related sentence. Note that all the text in the train file must have been cleaned before uploading.</li></ul><ul><li>After uploading the train file, you will see <span class="MathJax_SVG" tabindex="-1" style="font-size: 100%; display: inline-block;"><svg xmlns:xlink="http://www.w3.org/1999/xlink" width="1.394ex" height="1.41ex" viewBox="0 -504.6 600 607.1" role="img" focusable="false" style="vertical-align: -0.238ex;"><defs><path stroke-width="0" id="E4-MJMATHI-6E" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></defs><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="matrix(1 0 0 -1 0 0)"><use xlink:href="#E4-MJMATHI-6E" x="0" y="0"></use></g></svg></span><script type="math/tex">n</script> <strong>WordCloud</strong> images of the high-frequency words for each class, which will be a preview of the training data.</li></ul></li><li><p>Secondly, you may <strong>select a model</strong> from <code>RandomForestClassifier</code>, <code>LogisticRegressionClassifier</code> and <code>SVMClassifier</code>.</p></li><li><p>Secondly, you may <strong>upload</strong> a <strong>test file</strong>.</p><ul><li>This test file must be of the same format as the train file.</li></ul></li><li><p>Next, you may click the <code>run test</code> button AND see the <strong>global explanation</strong> results. The results are divided into the following three parts:</p><ul><li>The metrics table of the <strong>precision</strong>, <strong>recall</strong> and <strong>F1-score</strong> of each class.</li><li>The <strong>top features</strong> of the model.</li><li>The <strong>ROC curve</strong> on this test set.</li></ul></li><li><p>Finally, you could go to the last section to see the <strong>local explanation</strong> of each sentence. You may</p><ul><li>See the explanation of the prediction of a <strong>random sentence</strong> from the test set.</li><li>See the explanation of the prediction of a <strong>self-indexed sentence</strong> from the test set.</li><li>See the explanation of the prediction of a <strong>self-input sentence</strong>.</li></ul></li></ul><p> </p><ul><li class='md-task-list-item task-list-item task-list-not-done' ><input type='checkbox' disabled='disabled' /><p>The figure below shows how this project works: </p></li></ul><p> </p><p><img src='/Users/sallysyw/Desktop/screenshots-256/0-1-0.png' alt='part1-1' referrerPolicy='no-referrer' /></p><p><img src='/Users/sallysyw/Desktop/screenshots-256/0-1-1.png' alt='part1-1-1' referrerPolicy='no-referrer' /></p><p><img src='/Users/sallysyw/Desktop/screenshots-256/0-1-2.png' alt='part-1-1-2' referrerPolicy='no-referrer' /></p><p> </p><p> </p><h2><a name='header-n179' class='md-header-anchor '></a>Part 1</h2><h4><a name='header-n137' class='md-header-anchor '></a>1.1 My favirote review</h4><p><img src='/Users/sallysyw/Desktop/screenshots-256/1-1-0.png' alt='1-1-0' referrerPolicy='no-referrer' /></p><h4><a name='header-n180' class='md-header-anchor '></a>1.2 Over-Confident Review</h4><ul><li class='md-task-list-item task-list-item task-list-not-done' ><input type='checkbox' disabled='disabled' /><p>This sentence should be negative, but is classified as positive with confidence 0.93:</p></li></ul><p><img src='/Users/sallysyw/Desktop/screenshots-256/1-1-1.png' alt='1-1-1' referrerPolicy='no-referrer' /></p><p> </p><h2><a name='header-n199' class='md-header-anchor '></a>Part 2</h2><p>In this part, we use the data set download from <a href='https://www.kaggle.com/crowdflower/twitter-user-gender-classification' target='_blank' class='url'>https://www.kaggle.com/crowdflower/twitter-user-gender-classification</a>, the task is to classify the gender of the twitter users.</p><ul><li>The data we used are the twitter text and the users' description text</li></ul><ul><li>The labels are <code>male</code> and <code>female</code></li><li>When cleaning the dataset, we remove all the samples with a label confidence value <span class="MathJax_SVG" tabindex="-1" style="font-size: 100%; display: inline-block;"><svg xmlns:xlink="http://www.w3.org/1999/xlink" width="5.42ex" height="1.994ex" viewBox="0 -755.9 2333.8 858.4" role="img" focusable="false" style="vertical-align: -0.238ex;"><defs><path stroke-width="0" id="E7-MJMAIN-3C" d="M694 -11T694 -19T688 -33T678 -40Q671 -40 524 29T234 166L90 235Q83 240 83 250Q83 261 91 266Q664 540 678 540Q681 540 687 534T694 519T687 505Q686 504 417 376L151 250L417 124Q686 -4 687 -5Q694 -11 694 -19Z"></path><path stroke-width="0" id="E7-MJMAIN-30" d="M96 585Q152 666 249 666Q297 666 345 640T423 548Q460 465 460 320Q460 165 417 83Q397 41 362 16T301 -15T250 -22Q224 -22 198 -16T137 16T82 83Q39 165 39 320Q39 494 96 585ZM321 597Q291 629 250 629Q208 629 178 597Q153 571 145 525T137 333Q137 175 145 125T181 46Q209 16 250 16Q290 16 318 46Q347 76 354 130T362 333Q362 478 354 524T321 597Z"></path><path stroke-width="0" id="E7-MJMAIN-2E" d="M78 60Q78 84 95 102T138 120Q162 120 180 104T199 61Q199 36 182 18T139 0T96 17T78 60Z"></path><path stroke-width="0" id="E7-MJMAIN-39" d="M352 287Q304 211 232 211Q154 211 104 270T44 396Q42 412 42 436V444Q42 537 111 606Q171 666 243 666Q245 666 249 666T257 665H261Q273 665 286 663T323 651T370 619T413 560Q456 472 456 334Q456 194 396 97Q361 41 312 10T208 -22Q147 -22 108 7T68 93T121 149Q143 149 158 135T173 96Q173 78 164 65T148 49T135 44L131 43Q131 41 138 37T164 27T206 22H212Q272 22 313 86Q352 142 352 280V287ZM244 248Q292 248 321 297T351 430Q351 508 343 542Q341 552 337 562T323 588T293 615T246 625Q208 625 181 598Q160 576 154 546T147 441Q147 358 152 329T172 282Q197 248 244 248Z"></path></defs><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="matrix(1 0 0 -1 0 0)"><use xlink:href="#E7-MJMAIN-3C" x="0" y="0"></use><g transform="translate(1055,0)"><use xlink:href="#E7-MJMAIN-30"></use><use xlink:href="#E7-MJMAIN-2E" x="500" y="0"></use><use xlink:href="#E7-MJMAIN-39" x="778" y="0"></use></g></g></svg></span><script type="math/tex">< 0.9</script></li></ul><p> </p><h4><a name='header-n250' class='md-header-anchor '></a>2.1 Global Explanations</h4><p>Here are the <u>global</u> predict results on our test set:</p><p><img src='/Users/sallysyw/Desktop/screenshots-256/2-0.png' alt='2-0' referrerPolicy='no-referrer' /></p><p><img src='/Users/sallysyw/Desktop/screenshots-256/2-1.png' alt='2-1' referrerPolicy='no-referrer' /></p><p><img src='/Users/sallysyw/Desktop/screenshots-256/2-2.png' alt='2-2' referrerPolicy='no-referrer' /></p><p> </p><h4><a name='header-n262' class='md-header-anchor '></a>2.2 Locla Explanations</h4><p>Here are two samples of our <u>local explanation</u>:</p><ol><li>The first one is a sample that is classified as male:</li></ol><p><img src='/Users/sallysyw/Desktop/screenshots-256/2-2-0.png' alt='2-2-0' referrerPolicy='no-referrer' /></p><p> </p><ol start='2' ><li>The second one is a sample that is classified as female:</li></ol><p><img src='/Users/sallysyw/Desktop/screenshots-256/2-2-1.png' alt='2-2-1' referrerPolicy='no-referrer' /></p><p> </p><h2><a name='header-n280' class='md-header-anchor '></a>Part 3</h2><h4><a name='header-n292' class='md-header-anchor '></a>The creative points are as follows: </h4><ul><li><p>We use a unified pipeline to do <strong>part1</strong> and <strong>part2</strong>, which means, this text classification system can not only be applied to the sentiment review classification and gender classsification, but also can be applied to all the binary classification tasks. </p></li><li><p>Three different models <code>RandomForest</code>, <code>LogisticRegression</code>, and <code>SVM</code> are embedded in our system for users to choose.</p></li><li><p>Detailed global explanation are provided, so users can have an understanding about the overall performance.</p><p> </p></li></ul><h4><a name='header-n338' class='md-header-anchor '></a>Tech stack</h4><ul><li><strong>Flask</strong> is used for backend and <strong>Bootstrap</strong> is used for frontend.</li><li><strong>Jquery</strong> is used for dynamically rendering the html.</li><li><strong>CSS</strong> is used for animation and style design.</li><li><strong>Sklearn, NLTK</strong> are used for feature extraction and model training.</li><li><strong>Lime</strong> is used for local explanation.</li></ul></div>
</body>
</html>