<!DOCTYPE html>
<html lang="en">
<head>
<link href="/images/favicon.png" rel="icon">
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>trvrm.github.io</title>
<link rel="stylesheet" type="text/css" href="/theme/css/flatly.min.css" />
<link rel="stylesheet" type="text/css" href="/theme/css/style.css" />
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" />
<link href="/theme/css/pygments/tango.css" rel="stylesheet">
</head>
<body>
<section class="hero is-primary">
<!-- Hero header: will stick at the top -->
<div class="hero-head">
<nav class="navbar ">
<div class="navbar-menu is-active">
<div class="navbar-end">
<a class="navbar-item" href="https://twitter.com/trvrm">
<span class="icon"> <i class="fa fa-twitter"></i> </span>
twitter
</a>
<a class="navbar-item" href="https://github.com/trvrm">
<span class="icon"> <i class="fa fa-github"></i> </span>
github
</a>
</div>
</div>
</nav>
</div>
<!-- Hero content: will be in the middle -->
<div class="hero-body">
<div class="container has-text-centered">
<p class="title is-1">trvrm.github.io</p>
</div>
</div>
<div class="hero-foot">
<nav class="navbar ">
<div class="navbar-brand is-active">
<a href="/" class="navbar-item" >
trvrm.github.io
</a>
</div>
<div class="navbar-menu is-active">
<div class="navbar-start">
<a class="navbar-item " href="/category/database.html">Database</a>
<a class="navbar-item " href="/category/software.html">Software</a>
<a class="navbar-item " href="/category/systems.html">Systems</a>
</div>
</div>
</nav>
</div>
</section>
<section class="section">
<p class="title is-3">
<a href="/seriously-subtle-bug.html" rel="bookmark" title="Permalink to A Seriously Subtle Bug">
A Seriously Subtle Bug
</a>
</p>
<p class="subtitle is-5">
Thu 01 January 2015
</p>
<hr>
<div class="content ">
<p>I build and maintain a number of web applications using <a class="reference external" href="http://python.org">Python</a>, <a class="reference external" href="http://bottlepy.org/docs/dev/index.html">Bottle</a>, and <a class="reference external" href="http://uwsgi-docs.readthedocs.org/en/latest/index.html">uWSGI</a>.
In general, I've found this a very powerful and robust software stack. However, this week
we encountered a very strange issue that took us many hours to fully diagnose.</p>
<p>Our first indication that something was wrong was when our automated monitoring tools warned
us that one of our sites was offline. We manage our applications through the uWSGI <a class="reference external" href="http://uwsgi-docs.readthedocs.org/en/latest/Emperor.html">Emperor</a>
service, which makes it easy to restart errant applications. Simply touching the config file for
the application in question causes it to be reloaded:</p>
<div class="highlight"><pre><span></span>$ touch /etc/uwsgi-emperor/vassals/myapp.ini
</pre></div>
<p>This brought our systems back up, but it didn't explain the problem, and over the following weeks
it recurred several times, usually several days apart. So my first step was to look at
the log files. Our first indication of trouble was a log line from our database connection layer:</p>
<div class="highlight"><pre><span></span>OperationalError: could not create socket: too many open files
</pre></div>
<p>This actually led us away from the real cause of the bug at first: we thought that
we were simply creating too many database connections. But further examination reassured us that
our database layer was fine and our connections were being opened and closed correctly. Postgres has
<em>excellent</em> introspective tools, if you know how to use them; in this case the following query is very
helpful:</p>
<div class="highlight"><pre><span></span><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">pg_stat_activity</span><span class="p">;</span>
</pre></div>
<p>which revealed that we had no more database connections open than expected. So, our next step was the
Linux systems administration tool <code class="code">lsof</code>. This tool lists information about currently open files:</p>
<div class="highlight"><pre><span></span>$ sudo lsof > lsof.txt
COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME
init <span class="m">1</span> root cwd DIR <span class="m">8</span>,1 <span class="m">4096</span> <span class="m">2</span> /
init <span class="m">1</span> root rtd DIR <span class="m">8</span>,1 <span class="m">4096</span> <span class="m">2</span> /
init <span class="m">1</span> root txt REG <span class="m">8</span>,1 <span class="m">265848</span> <span class="m">14422298</span> /sbin/init
...
</pre></div>
<p>... followed by thousands more lines. Armed with this information, we could figure out how many files
each process was using.</p>
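<p>As a rough illustration of that per-process tally, here is a minimal standard-library sketch. The sample text is hypothetical (it is not data from the incident), and the parser assumes that the PID is always the second whitespace-separated field, which holds for typical <code class="code">lsof</code> rows but not for commands whose names contain spaces.</p>

```python
from collections import Counter

# Hypothetical sample in the shape of `lsof` output; not data from the incident.
SAMPLE = """\
COMMAND     PID   TID       USER   FD      TYPE  DEVICE  SIZE/OFF    NODE NAME
init          1             root  cwd       DIR     8,1      4096       2 /
init          1             root  rtd       DIR     8,1      4096       2 /
uwsgi-cor  2445         www-data   7r       REG     8,1     51234  131077 /srv/app/cat.jpg
uwsgi-cor  2445         www-data   8r       REG     8,1     51234  131077 /srv/app/cat.jpg
uwsgi-cor  2445         www-data   9r       REG     8,1     51234  131077 /srv/app/cat.jpg
"""


def open_files_per_pid(lsof_text):
    """Count lsof rows per PID, assuming PID is the second whitespace-separated field."""
    counts = Counter()
    for line in lsof_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counts[int(fields[1])] += 1
    return counts


counts = open_files_per_pid(SAMPLE)
for pid, n in counts.most_common(2):
    print(pid, n)
```

<p>On real output the same function could be fed the contents of <code class="code">lsof.txt</code>; the busiest processes then appear first in <code class="code">most_common()</code>.</p>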
<div class="section" id="enter-pandas">
<h2>Enter Pandas</h2>
<p>While it would be quite possible to search and filter this data using traditional Unix tools such as <code class="code">awk</code>
and <code class="code">grep</code>, I find that more and more I'm staying inside the Python ecosystem to do systems administration
and analysis tasks. I use the <a class="reference external" href="http://pandas.pydata.org/">Pandas</a> data analysis library heavily, and it was a perfect fit for this particular task.</p>
<div class="highlight"><pre><span></span>$ ipython
</pre></div>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">pandas</span>
<span class="n">widths</span><span class="o">=</span><span class="p">[</span><span class="mi">9</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">11</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">10</span><span class="p">,</span><span class="mi">19</span><span class="p">,</span><span class="mi">10</span><span class="p">,</span><span class="mi">12</span><span class="p">,</span><span class="mi">200</span><span class="p">]</span>
<span class="n">frame</span><span class="o">=</span><span class="n">pandas</span><span class="o">.</span><span class="n">read_fwf</span><span class="p">(</span><span class="s1">'lsof.txt'</span><span class="p">,</span><span class="n">widths</span><span class="o">=</span><span class="n">widths</span><span class="p">)</span>
<span class="n">frame</span><span class="o">.</span><span class="n">columns</span>
</pre></div>
<pre class="literal-block">
Index([u'COMMAND', u'PID', u'TID', u'USER', u'FD', u'TYPE', u'DEVICE', u'SIZE/OFF', u'NODE', u'NAME'], dtype='object')
</pre>
<p>So now we have a DataFrame (a construct very similar to an Excel worksheet) with a list of every open file on the system, along
with the process id and name of the program that is holding it open. Our next step was to ask Pandas to tell us which processes
had the <em>most</em> files open:</p>
<div class="highlight"><pre><span></span><span class="n">frame</span><span class="o">.</span><span class="n">PID</span><span class="o">.</span><span class="n">value_counts</span><span class="p">()</span><span class="o">.</span><span class="n">head</span><span class="p">()</span>
</pre></div>
<pre class="literal-block">
2445 745
2454 745
...
</pre>
<p>So process <strong>2445</strong> has 745 open files. OK, what is that process?</p>
<div class="highlight"><pre><span></span><span class="n">frame</span><span class="p">[</span><span class="n">frame</span><span class="o">.</span><span class="n">PID</span><span class="o">==</span><span class="mi">2445</span><span class="p">][[</span><span class="s1">'USER'</span><span class="p">,</span><span class="s1">'COMMAND'</span><span class="p">]]</span>
</pre></div>
<pre class="literal-block">
USER COMMAND
3083 www-data uwsgi-cor
3084 www-data uwsgi-cor
3085 www-data uwsgi-cor
...
</pre>
<p>So we've learned, then, that a uWSGI process belonging to www-data is holding open more than 700 files. Now, under
Ubuntu, this is going to be a problem very soon, because the maximum number of files that www-data may have open
per-process is 1024.</p>
<div class="highlight"><pre><span></span>$ sudo su www-data
$ <span class="nb">ulimit</span> -n
</pre></div>
<pre class="literal-block">
1024
</pre>
<p>So, clearly one of our web application processes is opening files and not closing them again. This is the kind of
bug that I <em>hate</em> as a programmer, because it wouldn't show up in development, when I'm frequently restarting the
application, or even in testing, but only appears under real-world load. But at least now we had a path towards
temporary remediation. So first we simply raised the open-file limit with <code class="code">ulimit</code>
so that the service would run longer
before this bug reappeared. But we still wanted to understand <em>why</em> this was happening.</p>
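<p>The same limit can also be inspected from inside a Python process. The sketch below uses the standard <code class="code">resource</code> module (Unix only); the helper name <code class="code">raise_soft_limit</code> is made up for illustration, and it only affects the current process, which is not necessarily how the limit was raised for the service described here.</p>

```python
import resource

# Read this process's open-file limits (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)


def raise_soft_limit(target):
    """Raise the soft RLIMIT_NOFILE towards `target`, never above the hard limit.

    Hypothetical helper: returns the soft limit in effect afterwards.
    """
    current_soft, current_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if current_hard != resource.RLIM_INFINITY:
        target = min(target, current_hard)
    if current_soft == resource.RLIM_INFINITY or target <= current_soft:
        return current_soft  # nothing to raise
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, current_hard))
    return target
```

<p>Raising the limit only buys time: a process that leaks one descriptor per request will still reach any finite ceiling eventually.</p>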
</div>
<div class="section" id="next-steps">
<h2>Next Steps</h2>
<p>Again, we used Pandas to interrogate the output of <code class="code">lsof</code>, but this time to find out whether there was a pattern
to the filenames that were being left open:</p>
<div class="highlight"><pre><span></span><span class="n">frame</span><span class="o">.</span><span class="n">NAME</span><span class="o">.</span><span class="n">value_counts</span><span class="p">()</span><span class="o">.</span><span class="n">head</span><span class="p">()</span>
</pre></div>
<p>This revealed that the vast majority of the files being left open were ones that we were delivering through
our Bottle Python application. Specifically, they were being served through the <a class="reference external" href="http://bottlepy.org/docs/dev/tutorial.html#tutorial-static-files">static_file</a> function.</p>
<p>We verified this by hitting the URL that was serving up those static files, and watching the output of lsof. Immediately we
saw that yes, every time we served that file, the open count for that file went up. So, we clearly had a resource leak
on our hands. Now, this surprised me, because usually the memory management and garbage collection
in Python are excellent, and I've left the days of manually tracking resources in C long behind me.</p>
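<p>Watching the open count for one file does not strictly require <code class="code">lsof</code>. As a sketch, assuming a Linux system with <code class="code">/proc</code> mounted, the descriptors of a process can be listed directly; the function name <code class="code">open_count</code> is hypothetical, and the temporary file simply stands in for the static file being served.</p>

```python
import os
import tempfile


def open_count(path, pid="self"):
    """Count descriptors of process `pid` that point at `path` (Linux /proc only)."""
    fd_dir = "/proc/{0}/fd".format(pid)
    target = os.path.realpath(path)
    count = 0
    for fd in os.listdir(fd_dir):
        try:
            if os.readlink(os.path.join(fd_dir, fd)) == target:
                count += 1
        except OSError:
            pass  # descriptor was closed between listdir() and readlink()
    return count


with tempfile.NamedTemporaryFile() as tmp:
    before = open_count(tmp.name)
    leaked = [open(tmp.name, "rb") for _ in range(3)]  # deliberately held open
    during = open_count(tmp.name)
    for handle in leaked:
        handle.close()
    after = open_count(tmp.name)

print(before, during, after)
```

<p>Pointing <code class="code">pid</code> at a worker process (with sufficient permissions) and calling the function between requests would show the same steadily climbing count.</p>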
<p>So, next I constructed some test cases. Firstly, I ran our software on a test virtual machine to verify that I could
recreate the bug. Then, I wrote a very bare-bones Bottle app that simply served a static file:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">bottle</span>
<span class="n">application</span><span class="o">=</span><span class="n">bottle</span><span class="o">.</span><span class="n">Bottle</span><span class="p">()</span>
<span class="nd">@application.get</span><span class="p">(</span><span class="s1">'/diagnose'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">test</span><span class="p">():</span>
    <span class="k">return</span> <span class="n">bottle</span><span class="o">.</span><span class="n">static_file</span><span class="p">(</span><span class="s1">'cat.jpg'</span><span class="p">,</span> <span class="s1">'.'</span><span class="p">)</span>
</pre></div>
<p>And I immediately saw that this <em>didn't</em> trigger any kind of file leak. The main difference between the two was that our
production application uses Bottle's <em>mounting</em> capability to namespace URLs. So I changed my test application as follows:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">bottle</span>
<span class="n">app</span><span class="o">=</span><span class="n">bottle</span><span class="o">.</span><span class="n">Bottle</span><span class="p">()</span>
<span class="nd">@app.get</span><span class="p">(</span><span class="s1">'/'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">test</span><span class="p">():</span>
    <span class="k">return</span> <span class="n">bottle</span><span class="o">.</span><span class="n">static_file</span><span class="p">(</span><span class="s1">'cat.jpg'</span><span class="p">,</span> <span class="s1">'.'</span><span class="p">)</span>
<span class="n">rootapp</span><span class="o">=</span><span class="n">bottle</span><span class="o">.</span><span class="n">Bottle</span><span class="p">()</span>
<span class="n">rootapp</span><span class="o">.</span><span class="n">mount</span><span class="p">(</span><span class="s2">"/diagnose"</span><span class="p">,</span> <span class="n">app</span><span class="p">)</span>
<span class="n">application</span><span class="o">=</span><span class="n">rootapp</span>
</pre></div>
<p>And <code class="code">lsof</code>
indicated that we <em>were</em> leaking files. Every time I hit <cite>/diagnose</cite>, the open file count for <cite>cat.jpg</cite>
increased by one.</p>
<p>So, we could simply rewrite our application to not use <code class="code">Bottle.mount</code>, but that wasn't good enough for me. I wanted
to understand <em>why</em> such a simple change would trigger a resource leak. At this point, it turns out it's good that
I have Asperger's, and with it a tendency to hyper-focus on interesting problems, because it took a long time. In
fact, I ended up taking the Bottle library, and manually stripping it of every line of code that wasn't related to
simply handling that single URL, in an attempt to understand exactly what the different code paths were between the
leaking program and the safe one.</p>
<p>In doing so, I was greatly aided by the <em>amazing</em> introspective powers of Python. We felt sure that we were
dealing with some kind of resource leak: in Python, every file is handled by a <code class="code">file</code>
object, and when that object
gets cleaned up by garbage collection, the underlying file handle is closed. So firstly, I replaced the relevant call to
the <code class="code">file</code>
constructor with my own object that derived from <code class="code">file</code>:
</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">logging</span>
<span class="c1"># Python 2: `file` is the built-in file type</span>
<span class="k">class</span> <span class="nc">MonitoredFile</span><span class="p">(</span><span class="nb">file</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="n">name</span><span class="p">,</span><span class="n">mode</span><span class="p">):</span>
        <span class="n">logging</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Opening {0}"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">name</span><span class="p">))</span>
        <span class="nb">file</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="n">name</span><span class="p">,</span><span class="n">mode</span><span class="p">)</span>
    <span class="k">def</span> <span class="fm">__del__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">logging</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'file.__del__({0})'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">name</span><span class="p">))</span>
</pre></div>
<p>So this object behaves exactly like a regular file, but logs events when it is created and when it is destroyed. And sure enough,
I saw that in the file-leaking version of my code, <code class="code">MonitoredFile.__del__()</code>
was never getting called. Now in
Python an object should get deleted when its reference count drops to zero, and indeed the Python <code class="code">sys</code> module provides
the <code class="code">getrefcount</code>
function (<a class="reference external" href="https://docs.python.org/2/library/sys.html#sys.getrefcount">https://docs.python.org/2/library/sys.html#sys.getrefcount</a>). By adding some logging statements
with calls to <code class="code">sys.getrefcount()</code>, I saw that in the leaking version of my code, the refcount for our file object was
one higher than in the non-leaking code when it was returned from the main application handler function.</p>
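<p>The effect of a single extra reference can be reproduced in isolation. The following is a small sketch, not the instrumentation used above: it assumes CPython's reference-counting behaviour, runs on Python 3, and uses a made-up <code class="code">Monitored</code> class in place of the monitored file object.</p>

```python
import sys

finalized = []


class Monitored(object):
    """Stand-in for the monitored file object; records when it is finalized."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        finalized.append(self.name)


handle = Monitored("cat.jpg")
baseline = sys.getrefcount(handle)

extra = handle                       # a second owner that is never told to let go
with_extra = sys.getrefcount(handle)

del handle                           # the "normal" owner releases its reference
still_alive = (finalized == [])      # __del__ has not run: `extra` keeps it alive

del extra                            # only now does the count reach zero
print(with_extra - baseline, still_alive, finalized)
```

<p>As long as one stray reference survives, <code class="code">__del__</code> never runs, and for a real file object that means the descriptor stays open.</p>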
<p>Why should this be? Eventually, by stripping out all extraneous code from the Bottle library, I discovered that in the version
that was using <code class="code">
Bottle.mount()</code>
, the response object was passed twice through the <code class="code">
_cast()</code>
function. Bottle can
handle all sorts of things as response objects - strings, dictionaries, JSON objects, lists, but if it notices that it is handling
a <em>file</em> then it treats it specially. The smoking gun code is here:
<a class="reference external" href="https://github.com/bottlepy/bottle/blob/854fbd7f88aa2f809f54dd724aea7ecf918a3b6e/bottle.py#L913">https://github.com/bottlepy/bottle/blob/854fbd7f88aa2f809f54dd724aea7ecf918a3b6e/bottle.py#L913</a></p>
<div class="highlight"><pre><span></span><span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">out</span><span class="p">,</span> <span class="s1">'read'</span><span class="p">):</span>
    <span class="k">if</span> <span class="s1">'wsgi.file_wrapper'</span> <span class="ow">in</span> <span class="n">request</span><span class="o">.</span><span class="n">environ</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">request</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">'wsgi.file_wrapper'</span><span class="p">](</span><span class="n">out</span><span class="p">)</span>
    <span class="k">elif</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">out</span><span class="p">,</span> <span class="s1">'close'</span><span class="p">)</span> <span class="ow">or</span> <span class="ow">not</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">out</span><span class="p">,</span> <span class="s1">'__iter__'</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">WSGIFileWrapper</span><span class="p">(</span><span class="n">out</span><span class="p">)</span>
</pre></div>
<p>This <em>looks</em> innocent enough, and indeed it is in the first version of our code. But in the <em>second</em> version, our file object
gets passed through this code block twice, because it's getting handled recursively. And, indeed, if <code class="code">wsgi.file_wrapper</code>
isn't specified, then <code class="code">WSGIFileWrapper</code>
is used, and everything is fine. But in our case, we're serving this application
via uWSGI, which <em>does</em> define <code class="code">wsgi.file_wrapper</code>. Now, I'm still not 100% clear what this wrapping function is
<em>supposed</em> to do, but on inspecting the uWSGI <a class="reference external" href="https://github.com/unbit/uwsgi/blob/ed2ca5d33325dc925f6fc5558d0b817447327049/plugins/python/wsgi_handlers.c#L463">source</a> I see that it is set to call this C function:</p>
<div class="highlight"><pre><span></span><span class="n">PyObject</span> <span class="o">*</span><span class="nf">py_uwsgi_sendfile</span><span class="p">(</span><span class="n">PyObject</span> <span class="o">*</span> <span class="n">self</span><span class="p">,</span> <span class="n">PyObject</span> <span class="o">*</span> <span class="n">args</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">wsgi_request</span> <span class="o">*</span><span class="n">wsgi_req</span> <span class="o">=</span> <span class="n">py_current_wsgi_req</span><span class="p">();</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">PyArg_ParseTuple</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="s">"O|i:uwsgi_sendfile"</span><span class="p">,</span> <span class="o">&</span><span class="n">wsgi_req</span><span class="o">-></span><span class="n">async_sendfile</span><span class="p">,</span> <span class="o">&</span><span class="n">wsgi_req</span><span class="o">-></span><span class="n">sendfile_fd_chunk</span><span class="p">))</span> <span class="p">{</span>
        <span class="k">return</span> <span class="nb">NULL</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">PyFile_Check</span><span class="p">((</span><span class="n">PyObject</span> <span class="o">*</span><span class="p">)</span><span class="n">wsgi_req</span><span class="o">-></span><span class="n">async_sendfile</span><span class="p">))</span> <span class="p">{</span>
        <span class="n">Py_INCREF</span><span class="p">((</span><span class="n">PyObject</span> <span class="o">*</span><span class="p">)</span><span class="n">wsgi_req</span><span class="o">-></span><span class="n">async_sendfile</span><span class="p">);</span>
        <span class="n">wsgi_req</span><span class="o">-></span><span class="n">sendfile_fd</span> <span class="o">=</span> <span class="n">PyObject_AsFileDescriptor</span><span class="p">(</span><span class="n">wsgi_req</span><span class="o">-></span><span class="n">async_sendfile</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="c1">// PEP 333 hack</span>
    <span class="n">wsgi_req</span><span class="o">-></span><span class="n">sendfile_obj</span> <span class="o">=</span> <span class="n">wsgi_req</span><span class="o">-></span><span class="n">async_sendfile</span><span class="p">;</span>
    <span class="c1">//wsgi_req->sendfile_obj = (void *) PyTuple_New(0);</span>
    <span class="n">Py_INCREF</span><span class="p">((</span><span class="n">PyObject</span> <span class="o">*</span><span class="p">)</span> <span class="n">wsgi_req</span><span class="o">-></span><span class="n">sendfile_obj</span><span class="p">);</span>
    <span class="k">return</span> <span class="p">(</span><span class="n">PyObject</span> <span class="o">*</span><span class="p">)</span> <span class="n">wsgi_req</span><span class="o">-></span><span class="n">sendfile_obj</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>
<p>And we can clearly see that <code class="code">
Py_INCREF</code>
is getting called on the file object. So if this function is called twice,
presumably the internal reference count is incremented twice, but only decremented once elsewhere.</p>
<p>And indeed, as soon as I added:</p>
<div class="highlight"><pre><span></span><span class="k">if</span> <span class="s1">'wsgi.file_wrapper'</span> <span class="ow">in</span> <span class="n">environ</span><span class="p">:</span>
    <span class="k">del</span> <span class="n">environ</span><span class="p">[</span><span class="s1">'wsgi.file_wrapper'</span><span class="p">]</span>
</pre></div>
<p>to my application code, the file leaking stopped.</p>
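<p>One way to apply that workaround without touching individual handlers is a thin WSGI middleware. This is only a sketch of one possible placement, not necessarily where the lines above were added; <code class="code">without_file_wrapper</code> and <code class="code">demo_app</code> are hypothetical names, and in a real deployment the Bottle application object (itself a WSGI callable) would take the place of <code class="code">demo_app</code>.</p>

```python
def without_file_wrapper(wsgi_app):
    """Wrap a WSGI callable so it never sees 'wsgi.file_wrapper' in environ."""
    def middleware(environ, start_response):
        environ.pop('wsgi.file_wrapper', None)  # no-op if the server did not set it
        return wsgi_app(environ, start_response)
    return middleware


# Tiny stand-in application used only to exercise the wrapper.
def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    seen = 'wsgi.file_wrapper' in environ
    return [b'wrapper present' if seen else b'wrapper absent']


application = without_file_wrapper(demo_app)
body = application({'wsgi.file_wrapper': object()}, lambda status, headers: None)
print(body)
```

<p>The trade-off is that files are then served through Bottle's own <code class="code">WSGIFileWrapper</code> rather than whatever optimised path the server's wrapper provides.</p>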
</div>
<div class="section" id="concluding-thoughts">
<h2>Concluding Thoughts</h2>
<p>At the moment, I'm not exactly sure whether this is a bug or a misunderstanding. I'm not sure what <code class="code">wsgi.file_wrapper</code>
is
supposed to do; I clearly have more research to do, time permitting. And because this bug only occurred when Bottle and uWSGI
<em>interacted</em> (I couldn't trigger it in either environment on its own), it's hard to say that either project has
a bug. But hopefully this analysis will help prevent others from going through the same headaches I just did.</p>
</div>
</div>
</section>
<section class="section">
<p class="title is-3">
<a href="/sql-magic.html" rel="bookmark" title="Permalink to SQL Magic">
SQL Magic
</a>
</p>
<p class="subtitle is-5">
Thu 01 January 2015
</p>
<hr>
<div class="content ">
<p>I'm finding the <code class="code">%sql</code>
magic function extremely useful. It turns
IPython into a very nice front-end to PostgreSQL.</p>
<p>First, make sure you have the <code class="code">
ipython-sql</code>
extension installed:</p>
<pre>
pip install ipython-sql</pre>
<p><a class="reference external" href="https://pypi.python.org/pypi/ipython-sql">https://pypi.python.org/pypi/ipython-sql</a></p>
<p>Then we load the extension:</p>
<div class="highlight"><pre><span></span><span class="o">%</span><span class="n">load_ext</span> <span class="n">sql</span>
</pre></div>
<p>Then we set up our database connection.</p>
<div class="highlight"><pre><span></span><span class="o">%%</span><span class="n">sql</span>
<span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">testuser</span><span class="p">:</span><span class="n">password</span><span class="nd">@localhost</span><span class="o">/</span><span class="n">test</span>
</pre></div>
<pre class="literal-block">
u'Connected: <a class="reference external" href="mailto:testuser@test">testuser@test</a>'
</pre>
<p>And now we can start interacting directly with the database as if we
were at the <code class="code">
psql</code>
command line.</p>
<div class="highlight"><pre><span></span><span class="o">%%</span><span class="n">sql</span>
<span class="n">CREATE</span> <span class="n">TABLE</span> <span class="n">people</span> <span class="p">(</span><span class="n">first</span> <span class="n">text</span><span class="p">,</span> <span class="n">last</span> <span class="n">text</span><span class="p">,</span> <span class="n">drink</span> <span class="n">text</span><span class="p">);</span>
<span class="n">INSERT</span> <span class="n">INTO</span> <span class="n">people</span> <span class="p">(</span><span class="n">first</span><span class="p">,</span><span class="n">last</span><span class="p">,</span><span class="n">drink</span><span class="p">)</span>
<span class="n">VALUES</span>
<span class="p">(</span><span class="s1">'zaphod'</span><span class="p">,</span><span class="s1">'beeblebrox'</span><span class="p">,</span><span class="s1">'pan galactic gargle blaster'</span><span class="p">),</span>
<span class="p">(</span><span class="s1">'arthur'</span><span class="p">,</span><span class="s1">'dent'</span><span class="p">,</span><span class="s1">'tea'</span><span class="p">),</span>
<span class="p">(</span><span class="s1">'ford'</span><span class="p">,</span><span class="s1">'prefect'</span><span class="p">,</span><span class="s1">'old janx spirit'</span><span class="p">)</span>
<span class="p">;</span>
</pre></div>
<pre class="literal-block">
Done.
3 rows affected.
</pre>
<pre class="literal-block">
[]
</pre>
<div class="highlight"><pre><span></span><span class="o">%</span><span class="n">sql</span> <span class="n">select</span> <span class="o">*</span> <span class="kn">from</span> <span class="nn">people</span>
</pre></div>
<pre class="literal-block">
3 rows affected.
</pre>
<table class="table table-bordered table-striped is-striped is-bordered">
<tr>
<th>first</th>
<th>last</th>
<th>drink</th>
</tr>
<tr>
<td>zaphod</td>
<td>beeblebrox</td>
<td>pan galactic gargle blaster</td>
</tr>
<tr>
<td>arthur</td>
<td>dent</td>
<td>tea</td>
</tr>
<tr>
<td>ford</td>
<td>prefect</td>
<td>old janx spirit</td>
</tr>
</table><p>We can access the results as a Python object:</p>
<div class="highlight"><pre><span></span><span class="n">result</span> <span class="o">=</span> <span class="o">%</span><span class="n">sql</span> <span class="n">select</span> <span class="o">*</span> <span class="kn">from</span> <span class="nn">people</span>
<span class="nb">len</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</pre></div>
<pre class="literal-block">
3
</pre>
<p>And we can even get our recordset as a <strong>pandas</strong> DataFrame:</p>
<div class="highlight"><pre><span></span><span class="o">%</span><span class="n">config</span> <span class="n">SqlMagic</span><span class="o">.</span><span class="n">autopandas</span><span class="o">=</span><span class="bp">True</span>
</pre></div>
<div class="highlight"><pre><span></span><span class="n">frame</span> <span class="o">=</span> <span class="o">%</span><span class="n">sql</span> <span class="n">select</span> <span class="o">*</span> <span class="kn">from</span> <span class="nn">people</span>
<span class="n">frame</span>
</pre></div>
<div style="max-height:1000px;max-width:1500px;overflow:auto;">
<table class="table table-bordered table-striped is-striped is-bordered">
<thead>
<tr style="text-align: right;">
<th></th>
<th>first</th>
<th>last</th>
<th>drink</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td> zaphod</td>
<td> beeblebrox</td>
<td> pan galactic gargle blaster</td>
</tr>
<tr>
<th>1</th>
<td> arthur</td>
<td> dent</td>
<td> tea</td>
</tr>
<tr>
<th>2</th>
<td> ford</td>
<td> prefect</td>
<td> old janx spirit</td>
</tr>
</tbody>
</table>
<p>3 rows × 3 columns</p>
</div><div class="highlight"><pre><span></span><span class="n">frame</span><span class="p">[</span><span class="s1">'first'</span><span class="p">]</span><span class="o">.</span><span class="n">str</span><span class="o">.</span><span class="n">upper</span><span class="p">()</span>
</pre></div>
<pre class="literal-block">
0 ZAPHOD
1 ARTHUR
2 FORD
Name: first, dtype: object
</pre>
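<p>For comparison, here is roughly the same round trip outside IPython. This sketch deliberately swaps in the standard-library <code class="code">sqlite3</code> module with an in-memory database, so it runs without a PostgreSQL server or the <code class="code">ipython-sql</code> extension; it is a stand-in, not what the magic does internally.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute('CREATE TABLE people ("first" text, "last" text, drink text)')
conn.executemany(
    'INSERT INTO people ("first", "last", drink) VALUES (?, ?, ?)',
    [
        ("zaphod", "beeblebrox", "pan galactic gargle blaster"),
        ("arthur", "dent", "tea"),
        ("ford", "prefect", "old janx spirit"),
    ],
)

# ORDER BY rowid makes the insertion order explicit.
rows = conn.execute("SELECT * FROM people ORDER BY rowid").fetchall()
upper_first = [first.upper() for first, last, drink in rows]
print(len(rows), upper_first)
conn.close()
```

<p>The magic saves exactly this boilerplate: the connection, the cursor handling and the conversion of rows into something convenient to work with.</p>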
</div>
</section>
<section class="section">
<p class="title is-3">
<a href="/basic-ipython-plotting.html" rel="bookmark" title="Permalink to Basic IPython Plotting">
Basic IPython Plotting
</a>
</p>
<p class="subtitle is-5">
Tue 20 May 2014
</p>
<hr>
<div class="content ">
<p><strong>Things have changed a bit since IPython 1</strong></p>
<p>Apparently we are now meant to enable inline matplotlib explicitly in each notebook,
rather than enabling it globally in the server.</p>
<p><a class="reference external" href="http://nbviewer.ipython.org/github/ipython/ipython/blob/1.x/examples/notebooks/Part%203%20-%20Plotting%20with%20Matplotlib.ipynb">http://nbviewer.ipython.org/github/ipython/ipython/blob/1.x/examples/notebooks/Part%203%20-%20Plotting%20with%20Matplotlib.ipynb</a></p>
<div class="highlight"><pre><span></span><span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span>
</pre></div>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
</pre></div>
<div class="highlight"><pre><span></span><span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">*</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span>
<span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s1">'A simple chirp'</span><span class="p">);</span>
</pre></div>
<img alt="basic plotting example" src="Basic%20Plotting%20Example_files/Basic%20Plotting%20Example_3_0.png" />
</div>
</section>
<section class="section">
<p class="title is-3">
<a href="/ractive-js.html" rel="bookmark" title="Permalink to Ractive.js">
Ractive.js
</a>
</p>
<p class="subtitle is-5">
Wed 01 January 2014
</p>
<hr>
<div class="content ">
<p><a class="reference external" href="http://www.ractivejs.org/">Ractive.js</a> caught my eye this week. I've been using <a class="reference external" href="http://backbonejs.org/">Backbone</a> for the last couple
of years to develop single page client-side applications, and I've liked how
it doesn't get in your way, but simply allows you to get on with your work.</p>
<p>When I noticed that three of the tools I was using (CoffeeScript, Underscore, and
<a class="reference external" href="http://backbonejs.org/">Backbone</a>) were all written by the same guy, I realised that Jeremy Ashkenas
is a seriously genius-level developer. I've loved working with his tools, and I
love the fact that the code he produces is so readable. I'm always more comfortable
working with a library when I can read through the entire source code if I run
into problems. Check out the nicely annotated source code for Backbone, for example,
here: <a class="reference external" href="http://backbonejs.org/docs/backbone.html">http://backbonejs.org/docs/backbone.html</a></p>
<p>(On that note, I love working with <a class="reference external" href="http://bottlepy.org/docs/dev/_modules/bottle.html">Bottle</a> for exactly the same reason - the entire
framework is contained in a single, very readable, Python file.)</p>
<p>But one problem that comes up time and time again, no matter what library or framework
you're using, is binding data to controls. Currently in my Backbone-based
code I have event handling code that reacts to user input and updates the model, and
I have more code that reacts to changes in the data model and updates the UI. Wouldn't
it be nice if this two-way data binding could happen automatically?</p>
<p>Imagine having a user interface like this:</p>
<div class="highlight"><pre><span></span><span class="p">&lt;</span><span class="nt">label</span><span class="p">&gt;</span>
  <span class="p">&lt;</span><span class="nt">input</span> <span class="na">type</span><span class="o">=</span><span class="s">'checkbox'</span> <span class="na">checked</span><span class="o">=</span><span class="s">'{{visible}}'</span><span class="p">&gt;</span> visible?
<span class="p">&lt;/</span><span class="nt">label</span><span class="p">&gt;</span>
</pre></div>
<p>And underlying data like this:</p>
<div class="highlight"><pre><span></span><span class="nx">data</span><span class="o">:</span> <span class="p">{</span>
<span class="s2">"visible"</span><span class="o">:</span> <span class="kc">false</span>
<span class="p">}</span>
</pre></div>
<p>And imagine if all the data binding were handled for you, so that clicking the checkbox
automatically changed the value of the underlying JavaScript object, and changing
the value of the object via <code class="code">ractive.set('visible', true)</code>
updated the interface.</p>
<p>That is exactly what Ractive does. I haven't tried using it in production yet, but
at the moment it feels like the next logical iteration in JavaScript frameworks.</p>
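<p>For comparison, here is a minimal sketch of the kind of hand-written wiring described
earlier: one handler that pushes user input into the model, and one observer that pushes
model changes back into the UI. It is plain JavaScript and is not how Ractive is
implemented. The names <code class="code">createModel</code>, <code class="code">createCheckbox</code>
and <code class="code">bindChecked</code> are hypothetical, and the checkbox is a stand-in
object rather than a real DOM element, so that the sketch runs outside a browser.</p>

```javascript
// Hypothetical sketch of manual two-way binding. This is NOT Ractive's API
// or implementation; it only illustrates the boilerplate a binding library removes.

// A tiny observable model: get, set, and per-key change observers.
function createModel(initial) {
  const data = Object.assign({}, initial);
  const observers = {};
  return {
    get(key) {
      return data[key];
    },
    set(key, value) {
      if (data[key] === value) return; // skip unchanged values to avoid update loops
      data[key] = value;
      (observers[key] || []).forEach((fn) => fn(value));
    },
    observe(key, fn) {
      (observers[key] = observers[key] || []).push(fn);
    },
  };
}

// Stand-in for a DOM checkbox (assumption: only `checked` and `onchange` matter here).
function createCheckbox() {
  return {
    checked: false,
    onchange: null,
    click() {
      this.checked = !this.checked;
      if (this.onchange) this.onchange();
    },
  };
}

// Wire both directions by hand.
function bindChecked(model, key, checkbox) {
  checkbox.checked = model.get(key); // initial render
  model.observe(key, (value) => {
    checkbox.checked = value; // model -> UI
  });
  checkbox.onchange = () => model.set(key, checkbox.checked); // UI -> model
}

const model = createModel({ visible: false });
const checkbox = createCheckbox();
bindChecked(model, 'visible', checkbox);

checkbox.click(); // user input: the model's 'visible' becomes true
model.set('visible', false); // programmatic change: the checkbox is unchecked again
```

<p>Every bound control needs its own version of <code class="code">bindChecked</code> in
this style, which is the repetition that declarative binding in the template takes away.</p>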
<p>I was busy using Backbone heavily when Angular and Knockout came out, so I don't have
much experience with them, but from asking around the shop, the consensus seems to be that
Ractive looks significantly nicer to use than Knockout.</p>
<p>And the best feature so far is their <em>awesome</em> <a class="reference external" href="http://learn.ractivejs.org/hello-world/1/">tutorial</a>. This is, apparently, entirely
written in Ractive, and guides you step by step through all the basic concepts of
the library in an elegant, interactive fashion. No more jumping through package installations
and dependency hell before you can try out a new framework. <em>This</em> is the way to
introduce people to your work. I'm very impressed.</p>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content has-text-centered">
<p>
Powered by <a href="http://getpelican.com/">Pelican</a>, <a href="http://python.org">Python</a>,
and <a href="http://bulma.io/">Bulma</a>
</p>
<p class="subtitle is-6">Ubi Caritas et Amor, Deus Ibi Est</p>
</div>
</div>
</footer>
</body>
</html>