<article data-sblg-article="1" data-sblg-tags="tutorial" itemscope="itemscope" itemtype="http://schema.org/BlogPosting">
<header>
<h2 itemprop="name">
Getting Started with CGI in C
</h2>
<address itemprop="author"><a href="https://kristaps.bsd.lv">Kristaps Dzonsons</a></address>
<time itemprop="datePublished" datetime="2015-06-21">21 June, 2015</time>
</header>
<p>
<aside itemprop="about">
This tutorial describes a typical CGI example using <span class="nm">kcgi</span>.
In it, I'll process HTML form input with two named fields, <code>string</code> (a non-empty, unbounded string) and
<code>integer</code>, a signed 64-bit integer.
I'll then output the input within a simple HTML page.
The tutorial will be laid out in code snippets, which I'll put together at the end.
I'll then follow with compilation instructions.
</aside>
</p>
<h3>
Source Code
</h3>
<p>
I'll describe this as if reading a source file from top to bottom.
Let's start with the header files.
We'll need <span class="nm">kcgi</span> itself, plus the system headers such as <span class="file">stdint.h</span> that declare
the types used in its header file.
I'll also include the HTML library for <span class="nm">kcgi</span>; I'll explain why later.
</p>
<figure class="sample">
<pre class="prettyprint linenums">#include <sys/types.h> /* size_t, ssize_t */
#include <stdarg.h> /* va_list */
#include <stddef.h> /* NULL */
#include <stdint.h> /* int64_t */
#include <kcgi.h>
#include <kcgihtml.h></pre>
</figure>
<p>
Next, I'll assign the fields we're interested in to numeric identifiers.
This lets us later attach a name and a validator to each field.
</p>
<figure class="sample">
<pre class="prettyprint linenums">enum key {
  KEY_STRING,
  KEY_INTEGER,
  KEY__MAX
};</pre>
</figure>
<p>
The enumeration will allow us to bound an array to <code>KEY__MAX</code> and refer to individual buckets in the array by the
enumeration value.
C numbers enumerators from zero, so <code>KEY_STRING</code> is 0, <code>KEY_INTEGER</code> is 1, and <code>KEY__MAX</code> is 2,
the number of keys.
</p>
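<p>
As a rough illustration of the idiom, independent of <span class="nm">kcgi</span>, the sketch below bounds a table by
<code>KEY__MAX</code> and indexes it by enumerator.
The <code>key_labels</code> table and the <code>key_label</code> helper are hypothetical names made up for this example.
</p>

```c
#include <stddef.h> /* size_t, NULL */

/* Same enumeration as in the tutorial. */
enum key {
	KEY_STRING,  /* 0: the first enumerator defaults to zero */
	KEY_INTEGER, /* 1: each following enumerator adds one */
	KEY__MAX     /* 2: the number of real keys, used as the array bound */
};

/* Hypothetical table for illustration: one label per key. */
static const char *const key_labels[KEY__MAX] = {
	[KEY_STRING] = "string",
	[KEY_INTEGER] = "integer",
};

/* Return the label for a key, or NULL if the index is out of range. */
static const char *
key_label(size_t k)
{
	return k < KEY__MAX ? key_labels[k] : NULL;
}
```

<p>
Because the bound and the indices come from the same enumeration, adding a key before <code>KEY__MAX</code> grows the array
automatically.
</p>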
<p>
Next, connect the indices with validation functions and names.
The validation function is run by <a href="khttp_parse.3.html">khttp_parse(3)</a>; the name is the HTML form name for the given
element.
Built-in validation functions, which we'll use, are described in <a href="kvalid_string.3.html">kvalid_string(3)</a>.
In this example, <code>kvalid_stringne</code> will validate a non-empty (NUL-terminated) C string, while <code>kvalid_int</code>
will validate a signed 64-bit integer.
</p>
<figure class="sample">
<pre class="prettyprint linenums">static const struct kvalid keys[KEY__MAX] = {
  { kvalid_stringne, "string" }, /* KEY_STRING */
  { kvalid_int, "integer" }, /* KEY_INTEGER */
};</pre>
</figure>
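<p>
To give a feel for what such validators check, here is a minimal sketch using only the C standard library.
It is <strong>not</strong> how <span class="nm">kcgi</span> implements its validators: the names
<code>sketch_valid_stringne</code> and <code>sketch_valid_int</code> and their signatures are assumptions made for
illustration.
</p>

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: accept any non-empty string. Returns 1 if valid, else 0. */
static int
sketch_valid_stringne(const char *val)
{
	return val != NULL && val[0] != '\0';
}

/*
 * Hypothetical: parse a base-10 signed 64-bit integer.
 * Returns 1 and sets *out on success, 0 otherwise.
 * Like strtoll(3), this tolerates leading white-space and a sign.
 */
static int
sketch_valid_int(const char *val, int64_t *out)
{
	char *ep;
	long long v;

	if (val == NULL || val[0] == '\0')
		return 0;
	errno = 0;
	v = strtoll(val, &ep, 10);
	if (ep == val || *ep != '\0')
		return 0; /* no digits at all, or trailing junk */
	if (errno == ERANGE || v < INT64_MIN || v > INT64_MAX)
		return 0; /* does not fit in 64 bits */
	*out = (int64_t)v;
	return 1;
}
```

<p>
The point is that a validator both accepts or rejects the raw input and, on success, leaves behind a parsed value, which is
what the tutorial later reads back.
</p>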
<p>
Next, I define a function that acts upon the parsed fields.
According to <a href="khttp_parse.3.html">khttp_parse(3)</a>, if a valid value is found, it is assigned into the
<code>fieldmap</code> array.
If one was found but did not validate, it is assigned into the <code>fieldnmap</code> array.
Both of these are indexed by the array position in <code>keys</code>.
(We could also have run the <code>fields</code> list, but that's for chumps.)
</p>
<p>
In this trivial example, the function emits the string value if found, or indicates that it was not provided
or did not validate.
</p>
<figure class="sample">
<pre class="prettyprint linenums">static void process(struct kreq *r) {
  struct kpair *p;
  khttp_puts(r, "<p>\n");
  khttp_puts(r, "The string value is ");
  if ((p = r->fieldmap[KEY_STRING]))
    khttp_puts(r, p->parsed.s);
  else if (r->fieldnmap[KEY_STRING])
    khttp_puts(r, "<i>failed parse</i>");
  else
    khttp_puts(r, "<i>not provided</i>");
  khttp_puts(r, "</p>\n");
}</pre>
</figure>
<p>
As is, this routine introduces a significant problem: if the <code>KEY_STRING</code> value consists of HTML, it will be inserted
directly into the stream, allowing attackers to use <a href="https://en.wikipedia.org/wiki/Cross-site_scripting">XSS</a>.
Instead, let's use the <a href="kcgihtml.3.html">kcgihtml(3)</a> library to perform the proper encoding and element nesting.
</p>
<figure class="sample">
<pre class="prettyprint linenums">static void process_safe(struct kreq *r) {
  struct kpair *p;
  struct khtmlreq req;
  khtml_open(&req, r, 0);
  khtml_elem(&req, KELEM_P);
  khtml_puts(&req, "The string value is ");
  if ((p = r->fieldmap[KEY_STRING])) {
    khtml_puts(&req, p->parsed.s);
  } else if (r->fieldnmap[KEY_STRING]) {
    khtml_elem(&req, KELEM_I);
    khtml_puts(&req, "failed parse");
  } else {
    khtml_elem(&req, KELEM_I);
    khtml_puts(&req, "not provided");
  }
  khtml_close(&req);
}</pre>
</figure>
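<p>
To see what "proper encoding" amounts to, here is a rough standard-library sketch of HTML escaping.
It is only an illustration of the idea, not the code inside <a href="kcgihtml.3.html">kcgihtml(3)</a>; the function name
<code>sketch_html_escape</code>, its buffer-based interface, and the exact set of replaced characters are assumptions.
</p>

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (capacity dstsz bytes),
 * replacing the characters special to HTML with entities.
 * Returns 1 on success or 0 if dst is too small.
 */
static int
sketch_html_escape(char *dst, size_t dstsz, const char *src)
{
	size_t used = 0;

	if (dstsz == 0)
		return 0;
	for (; *src != '\0'; src++) {
		char one[2] = { *src, '\0' }; /* ordinary character as a string */
		const char *rep;
		size_t len;

		switch (*src) {
		case '<':  rep = "&lt;";   break;
		case '>':  rep = "&gt;";   break;
		case '&':  rep = "&amp;";  break;
		case '"':  rep = "&quot;"; break;
		case '\'': rep = "&#39;";  break;
		default:   rep = one;      break;
		}
		len = strlen(rep);
		if (used + len + 1 > dstsz)
			return 0; /* keep room for the terminating NUL */
		memcpy(dst + used, rep, len);
		used += len;
	}
	dst[used] = '\0';
	return 1;
}
```

<p>
With escaping like this in place, submitted markup reaches the browser as inert text rather than as elements.
</p>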
<p>
Before acting on any parsed fields, I sanitise the HTTP context.
This consists of the page requested, MIME type, HTTP method, and so on.
</p>
<p>
To begin, I provide an array of indexed page identifiers, much as I did for the field validators and names.
This will also be passed to <a href="khttp_parse.3.html">khttp_parse(3)</a>.
These define the page requests accepted by the application, in this case being only <code>index</code>, which I'll also set to
be the default page when invoked without a path (i.e., just <code>http://www.foo.com</code>).
<strong>Note</strong>: this is the first path component, so specifying <code>index</code> will also accept
<code>index/foo</code>.
</p>
<figure class="sample">
<pre class="prettyprint linenums">enum page {
  PAGE_INDEX,
  PAGE__MAX
};
const char *const pages[PAGE__MAX] = {
  "index", /* PAGE_INDEX */
};</pre>
</figure>
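<p>
The note about the first path component can be modelled with a small lookup routine.
This is a simplified sketch under my own assumptions, not the parser in <a href="khttp_parse.3.html">khttp_parse(3)</a>:
<code>sketch_page_lookup</code> is a hypothetical helper, and returning <code>PAGE__MAX</code> for an unknown page is
simply this sketch's convention.
</p>

```c
#include <stddef.h>
#include <string.h>

enum page {
	PAGE_INDEX,
	PAGE__MAX
};

static const char *const pages[PAGE__MAX] = {
	"index", /* PAGE_INDEX */
};

/*
 * Hypothetical helper: map a request path such as "/index/foo" to a
 * page index. Only the first component, up to the next '/' or '.',
 * is compared. An empty component yields defpage; an unknown name
 * yields PAGE__MAX.
 */
static size_t
sketch_page_lookup(const char *path, size_t defpage)
{
	size_t i, len;

	if (*path == '/')
		path++;
	len = strcspn(path, "/.");
	if (len == 0)
		return defpage;
	for (i = 0; i < PAGE__MAX; i++)
		if (strlen(pages[i]) == len &&
		    strncmp(pages[i], path, len) == 0)
			return i;
	return PAGE__MAX;
}
```

<p>
Under this model <code>/index</code>, <code>/index.html</code>, and <code>/index/foo</code> all select
<code>PAGE_INDEX</code>, which is why any subpath has to be rejected separately.
</p>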
<p>
Now, I validate the page request and HTTP context based upon the defined components.
This function checks the page request (it must be <code>index</code> without a subpath), HTML MIME type (expanding to
<code>index.html</code>), and HTTP method (it must be an HTTP <code>GET</code>, such as <code>index.html?string=foo</code>).
To keep things reasonable, I'll have the sanitiser return an HTTP error code (see <a
href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html">RFC 2616</a> for an explanation).
</p>
<figure class="sample">
<pre class="prettyprint linenums">static enum khttp sanitise(const struct kreq *r) {
  if (r->page != PAGE_INDEX)
    return KHTTP_404;
  else if (*r->path != '\0') /* no index/xxxx */
    return KHTTP_404;
  else if (r->mime != KMIME_TEXT_HTML)
    return KHTTP_404;
  else if (r->method != KMETHOD_GET)
    return KHTTP_405;
  return KHTTP_200;
}</pre>
</figure>
<p>
Putting all of these together: parse the HTTP context, validate it, process it, then free the resources.
Headers are output using <a href="khttp_head.3.html">khttp_head(3)</a>, with the document body started
with <a href="khttp_body.3.html">khttp_body(3)</a>.
The HTTP context is closed with <a href="khttp_free.3.html">khttp_free(3)</a>.
</p>
<figure class="sample">
<pre class="prettyprint linenums">int main(void) {
  struct kreq r;
  enum khttp er;
  if (khttp_parse(&r, keys, KEY__MAX,
      pages, PAGE__MAX, PAGE_INDEX) != KCGI_OK)
    return 0;
  if ((er = sanitise(&r)) != KHTTP_200) {
    khttp_head(&r, kresps[KRESP_STATUS],
      "%s", khttps[er]);
    khttp_head(&r, kresps[KRESP_CONTENT_TYPE],
      "%s", kmimetypes[KMIME_TEXT_PLAIN]);
    khttp_body(&r);
    if (KMIME_TEXT_HTML == r.mime)
      khttp_puts(&r, "Could not service request.");
  } else {
    khttp_head(&r, kresps[KRESP_STATUS],
      "%s", khttps[KHTTP_200]);
    khttp_head(&r, kresps[KRESP_CONTENT_TYPE],
      "%s", kmimetypes[r.mime]);
    khttp_body(&r);
    process_safe(&r);
  }
  khttp_free(&r);
  return 0;
}</pre>
</figure>
<p>
That's it!
</p>
<h3>
Compile and Link
</h3>
<p>
Your source is no good until it's compiled and linked into an executable.
In this section I'll mention two strategies: linking the application dynamically and linking it statically.
Dynamic linking is normal for most applications, but CGI applications are often placed in a file-system jail (a chroot(2))
without access to other libraries, and are thus statically linked.
In short, it depends on your environment.
Let's call our application <code>tutorial0.cgi</code> and the source file, <code>tutorial0.c</code>.
To dynamically link:
</p>
<figure class="sample">
<pre class="prettyprint lang-sh linenums">% cc `pkg-config --cflags kcgi-html` -c -o tutorial0.o tutorial0.c
% cc -o tutorial0.cgi tutorial0.o `pkg-config --libs kcgi-html`</pre>
</figure>
<p>
For static linking, which is the norm in more sophisticated systems like OpenBSD:
</p>
<figure class="sample">
<pre class="prettyprint lang-sh linenums">% cc -static -o tutorial0.cgi tutorial0.o `pkg-config --libs kcgi-html`</pre>
</figure>
<h3>
Install
</h3>
<p>
Installation steps depend on your operating system, web server, and a thousand other factors.
I'll stick with the simplest installation using the defaults of <a href="https://www.openbsd.org">OpenBSD</a> with the default
web server <a href="https://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man8/httpd.8">httpd(8)</a>.
To begin with, configure <span class="file">/etc/httpd.conf</span> with your server's root being in <span
class="file">/var/www</span> and CGI programs being in <span class="file">/var/www/cgi-bin</span>.
If you already have a suitable configuration file in place, you can skip this step.
</p>
<figure class="sample">
<pre class="prettyprint lang-sh linenums">server "me.local" {
  listen on * port 80
  root "/htdocs"
  location "/cgi-bin/*" {
    fastcgi
    root "/"
  }
}</pre>
</figure>
<p>
Next, we use the <a href="https://man.openbsd.org/rcctl.8">rcctl(8)</a> tool to enable and
start the <a href="https://man.openbsd.org/httpd.8">httpd(8)</a> webserver and
<a href="https://man.openbsd.org/slowcgi.8">slowcgi(8)</a> wrapper.
(The latter is necessary because <a href="https://man.openbsd.org/httpd.8">httpd(8)</a> only
directly supports FastCGI, so a proxy is necessary.)
Again, you may not need to do this part.
Also make sure to follow the instructions on the main page regarding OpenBSD sandboxing in the file-system jail.
</p>
<figure class="sample">
<pre class="prettyprint lang-sh linenums">% doas rcctl enable httpd
% doas rcctl start httpd
% doas rcctl check httpd
httpd(ok)
% doas rcctl enable slowcgi
% doas rcctl start slowcgi
% doas rcctl check slowcgi
slowcgi(ok)</pre>
</figure>
<p>
Assuming we built the static binary, we can now just install into the CGI directory and be ready to go!
</p>
<figure class="sample">
<pre class="prettyprint lang-sh linenums">% doas install -m 0555 tutorial0.cgi /var/www/cgi-bin</pre>
</figure>
</article>