forked from aaronbloomfield/pdr
-
Notifications
You must be signed in to change notification settings - Fork 228
/
index.html
472 lines (463 loc) · 20.3 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>PDR: C Tutorial for C++ programmers</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<link rel="stylesheet" href="../../markdown.css" />
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="pdr-c-tutorial-for-c-programmers">PDR: C Tutorial for C++
programmers</h1>
<p><a href="../index.html">Go up to the Tutorials table of contents
page</a></p>
<h3 id="originally-by-jeremy-w.-sheaffer">Originally by Jeremy W.
Sheaffer</h3>
<p>This document assumes that you have a strong working knowledge of the
most basic and important aspects of C++. It is designed to introduce you
to important concepts in C - most of them also available in C++, but
rarely used there - with which you likely have little familiarity.</p>
<hr />
<h2 id="hello-world-hello-world-hello-world">Hello World, Hello World,
Hello World!</h2>
<p>A basic C program looks like the following:</p>
<pre><code>// hello world x3
#include <stdio.h>
int main() {
for (int i = 0; i < 3; i++) {
printf("hello world!\n");
}
return 0;
}</code></pre>
<p>A few things to note about this program: - The
<code><stdio.h></code> was <code>#include</code>d (stdio stands
for Standard I/O library), which is where <code>printf()</code> (and,
later, <code>scanf()</code>) live. These are the basic input and output
routines in C, analogous to cout and cin in C++. More on these functions
are below - There are no namespaces in C</p>
<p>To use <code>malloc()</code>, which is the C version of
<code>new</code>, you will need to include the
<code><stdlib.h></code> file (stdlib is the standard library) -
more on <code>malloc()</code> is also below.</p>
<hr />
<h2 id="input-and-output">Input and Output</h2>
<p>C does not have <em>iostreams</em> or <em>stream operators</em>, such
as <code><<</code> (for cout) and <code>>></code> (for cin).
For most I/O, we use the <code>printf()</code> family of functions (for
output) and the <code>scanf()</code> family (for input).</p>
<h3 id="int-printfconst-char-format">int printf(const char *format,
…)</h3>
<p><code>printf()</code> takes a <em>format string</em>, containing
verbatim text that you want to display and <em>conversion
specifiers</em> which describe to <code>printf()</code> how to interpret
and display the remaining arguments. The conversion specifiers may
contain <em>flags</em>, which control such things as field width,
precision, and format. All conversion specifiers begin with a
<code>%</code>. To print an actual <code>%</code>, your format string
must contain <code>%%</code>. Other special characters will have to be
escaped with a backslash (i.e., to print a newline, we enter
<code>\n</code>; for a backslash, we enter <code>\\</code>).</p>
<p>For example, to print an integer, we would enter:</p>
<pre><code>int x = 3;
printf("this is an int: %d\n", x);</code></pre>
<p>The <code>%d</code> part tells printf to format the appropriate
parameter (x, in this case) as an integer, and insert it at that spot in
the string.</p>
<p>Commonly used conversion specifiers are:</p>
<table>
<thead>
<tr class="header">
<th>Specifier</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>d</td>
<td>converts an int</td>
</tr>
<tr class="even">
<td>f</td>
<td>converts a float</td>
</tr>
<tr class="odd">
<td>c</td>
<td>converts a char</td>
</tr>
<tr class="even">
<td>s</td>
<td>converts a string</td>
</tr>
<tr class="odd">
<td>p</td>
<td>converts a pointer</td>
</tr>
</tbody>
</table>
<p>and flags:</p>
<table>
<colgroup>
<col style="width: 36%" />
<col style="width: 63%" />
</colgroup>
<thead>
<tr class="header">
<th>Flag</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><code>l</code> (a lower-case ‘L’)</td>
<td>Instead of an int or float, convert a long int or double</td>
</tr>
<tr class="even">
<td><code>ll</code> (two lower-case ’L’s)</td>
<td>…, convert a long long int or long double</td>
</tr>
<tr class="odd">
<td><code>0</code> (zero)</td>
<td>Zero pad the conversion to fill the field width</td>
</tr>
<tr class="even">
<td><code>-</code></td>
<td>Left justify the field</td>
</tr>
<tr class="odd">
<td>’ ’ (a space)</td>
<td>Leave a blank space where an omitted ‘+’ would go</td>
</tr>
<tr class="even">
<td><code>+</code></td>
<td>Display a ‘+’ for positive numbers</td>
</tr>
<tr class="odd">
<td>Non-zero integer</td>
<td>Minimum field width</td>
</tr>
<tr class="even">
<td><code>.</code> followed by a non-zero integer</td>
<td>Number of digits of precision</td>
</tr>
</tbody>
</table>
<p>Some examples:</p>
<pre><code>printf("pi = %.0f\n", pi); /* output: pi = 3 */
printf("pi = '%+10.4f'\n", pi); /* pi = ' +3.1416' */
printf("Hello from %s, which begins at %p!\n", __PRETTY_FUNCTION__, main); /* Hello from main, which begins at 0x4004f4! */</code></pre>
<p>This last one uses the <code>__PRETTY_FUNCTION__</code> macro, which
is a string representation of the function name. And the ‘main’ in there
is printing out the hexadecimal address of where the <code>main()</code>
function begins in the x86 code section.</p>
<h3 id="variable-argument-lists">Variable argument lists</h3>
<p>We’ve seen variable argument lists in the context of the C calling
convention (that’s why the parameters are pushed onto the stack in
reverse order). The <code>...</code> portion of the
<code>printf()</code> function signature signifies that it take a
variable number of arguments. I/O functions in C commonly combine this
notation with a format string. Any use of variable argument lists must
include clues so that the callee can correctly identify the number and
type of its arguments. An interesting example of a function which uses
variable argument lists in a less obvious way is open():</p>
<pre><code>int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);</code></pre>
<p>but C does not support function overloading! While the
<code>open()</code> manual page displays the two prototypes above, the
actual signature of <code>open()</code> is:</p>
<pre><code>int open(const char *pathname, int flags, ...);</code></pre>
<p>The variable argument list (mode) is only used if flags includes the
bit <code>O_CREAT</code>, giving <code>open()</code> the information it
needs to decide whether or not to access arguments beyond flags.</p>
<h3 id="int-scanfconst-char-format">int scanf(const char *format,
…)</h3>
<p><code>scanf()</code> converts input, rather than output. Its format
strings look very similar to those of <code>printf()</code>, with a few
more complex conversions, which we will not cover in this tutorial as
they are not often used. The key point to remember to differentiate
usage of <code>scanf()</code> and <code>printf()</code> is that while
<code>printf()</code> converts values (as in <em>pass-by-value</em>),
<code>scanf()</code> converts input and thus needs a place to store it
(<em>pass-by-address</em>, which is really passing a pointer by value; C
does not have <em>pass-by-reference</em>, only C++ does).</p>
<p><code>scanf()</code> will precisely match all non-whitespace
characters in the format string. Whitespace is handled specially, in
that each individual instance of whitespace in the format string is
treated like a single space, and a single space in the format string
matches one or more whitespace characters in the input. Conversion
usually stops at the first whitespace character. All matched input is
discarded, save that which is converted.</p>
<p>Be careful of buffer overflow! <code>scanf()</code> doesn’t know how
large your buffer is. If you don’t know either, don’t use
<code>scanf()</code>. We prefer to use <code>fgets()</code> followed by
<code>sscanf()</code> for this reason.</p>
<p>Some examples:</p>
<pre><code>int age;
char grade;
char school[3];
printf("AGE: ");
scanf("%d", &age); /* Converts input to an integer and stores it in age */
printf("GRADE: ");
scanf("%c", &grade); /* Converts input to a letter grade (probably 'A') and stores it in grade */
printf("SCHOOL: ");
scanf("%s", school); /* Converts input to a string and stores it in school */</code></pre>
<p>The third example above almost certainly overflows the buffer.
<code>scanf()</code> will copy input into <code>school</code> until it
sees the next whitespace character. If the input is “The University of
Virginia”, it will save “The”, and overflow the buffer by one byte (due
to the fact that all C-style strings have a zero byte that terminates
the string). If the input is “UVA”, it will save “UVA”, but still
overflow the buffer. Using the field width flag <code>%2s</code> can
solve this buffer overflow problem, but then we will only save “Th” or
“UV”. In order to convert whitespace, you must use the more complex
conversion. It’s more common to use <code>fgets()</code> for this type
of input.</p>
<hr />
<h2 id="dynamic-memory-management">Dynamic memory management</h2>
<p>In C++ you allocate storage from the heap with <code>new</code>, and
you return it to the heap with <code>delete</code>. <code>new</code> and
<code>delete</code> are operators, which can even be overloaded, and
which, by default, automatically call constructors or destructors for
you.</p>
<p>Heap control in C is a bit lower level, and is done primarily through
two functions: <code>malloc()</code> and <code>free()</code>.</p>
<h3 id="malloc">malloc()</h3>
<p>Allocation from the heap is achieved through a call to
<code>malloc()</code>, which returns a pointer to the newly allocated
space on success or <code>NULL</code> on failure (i.e., if you are out
of memory). It is important to check the return value of
<code>malloc()</code> before attempting to access the returned storage.
The storage allocated by <code>malloc()</code> is <strong>not</strong>
initialized in any way! You will need to explicitly run initialization
subroutines on data structures that you allocate in C.</p>
<p><code>malloc()</code> has the prototype:</p>
<pre><code>void* malloc(size_t size)</code></pre>
<p>You must explicitly tell it how much storage you require. The
<code>sizeof</code> operator is useful here.</p>
<h3 id="free">free()</h3>
<p><code>free()</code> is used to deallocate storage originally
allocated with <code>malloc()</code>, <code>calloc()</code>, or
<code>realloc()</code>. <code>free()</code> in C is analgous to
<code>delete</code> in C++. Its prototype is:</p>
<pre><code>void free(void* ptr)</code></pre>
<p>It is an error to call <code>free()</code>:</p>
<ul>
<li>on any address that was not returned by one of the
<code>malloc()</code> family functions above</li>
<li>on a <code>NULL</code> pointer</li>
<li>twice on the same address</li>
</ul>
<p>Any one of these errors will corrupt your heap. This corruption will
manifest in unusual ways which will be very difficult to debug and which
will not have any obvious relationship with the root cause (if you are
lucky, it will cause a segmentation fault).</p>
<p>The easiest way to debug a memory error is not to make it in the
first place. With care, this is easier than it sounds. Firstly, know
when you need dynamic allocation; don’t use it if you don’t have to.
Secondly, as you would do in C++, write constructors and destructors for
all of your data structures, and be consistent about using them. These
functions should handle your heap control and error checking explicitly,
so that they are implicit in the code that uses the storage.</p>
<p>If you do manage to develop memory errors, AddressSanitizer and
LeakSanitizer work just as they did in C++.</p>
<h3 id="examples">Examples</h3>
<pre><code>#include <stdio.h>
#include <stdlib.h>
int main() {
// dynamically allocate an array of ints
int* p = (int*) malloc(sizeof(int) * 5);
if (p == NULL) {
// memory allocation failed; handle the error somehow
return 1;
}
// Initialize p[1] to 10. Everything else is still uninitialized.
// Trying to access any other index without first initializing it is undefined behavior!
p[1] = 10;
printf("%d\n", p[1]);
// free up that array
free(p);
return 0;
}</code></pre>
<hr />
<h2 id="derived-types">Derived types</h2>
<p>Like C++, C allows programmers to derive types from existing types.
There are several kinds of derived types, including <em>structures</em>,
<em>unions</em>, and <em>enumerated types</em>.</p>
<h3 id="struct">struct</h3>
<p>C does not have classes, but it does have structures. Early versions
of C++ were called <em>C with Classes</em>; the C++ “compiler” was
little more than a preprocessor, and <em>class</em> syntactical
constructs were converted to structures so that the output could be
compiled by a C compiler.</p>
<p>A structure definition has the following format:</p>
<pre><code>struct name {
type1 member1;
type2 member2;
/* ... */
typen membern;
};</code></pre>
<p>In the above, <em>name</em> is optional. If omitted, the structure is
said to be <em>anonymous</em>; in general, you probably don’t want that.
Within the curly braces, <em>struct name</em> is an incomplete type or
<em>forward reference</em> and can be used to build recursive
structures. <em>struct name</em> is fully defined after the closing
curly brace, and instances can be declared at that point.</p>
<p>The following might be a good definition for a list item data
structure. We’ll look at more of the list class further on:</p>
<pre><code>struct list_item {
struct list_item* prev
struct list_item* next;
void* datum;
};</code></pre>
<p>Now whenever you want a variable of type <code>list_item</code>, you
declare it using <code>struct list_item</code> – for example,
<code>struct list_item my_item</code>. If the extra <code>struct</code>
doesn’t look right to you, this is where the <code>typedef</code>
keyword can come in handy.</p>
<pre><code>// typedef comes before struct
typedef struct list_item {
struct list_item* prev;
struct list_item* next;
void* datum;
} list_item_t; // what you want to call your struct goes after the closing brace</code></pre>
<p>With the typedef, we can now refer to our struct as simply
<code>list_item_t</code>. Therefore, variable declarations become
<code>list_item_t my_item</code>. Note that the <code>struct list_item*
prev</code> inside of the struct declaration cannot be replaced with
<code>list_item_t* prev</code>, as the typedef hasn’t been finished by
that point! We don’t know the name of the typedef until after the
closing brace.</p>
<h3 id="union">union</h3>
<p>A union is a type for which the compiler allocates space sufficient
only for the largest member, not for all members. At any moment in a
union instance’s lifetime, only one member is valid. It is the
responsibility of the programmer to ensure that the data is accessed
correctly. The syntax of a union declaration is identical to that of a
struct declaration, save the keyword.</p>
<p>An anonymous union provides useful syntactic sugar as a structure
member. Take the following example:</p>
<pre><code>struct example_struct {
int type;
union {
int i;
float f;
double d;
};
};
struct example_struct s;
s.type = 1;
switch (s.type) {
case 0:
printf("%d\n", s.i);
break;
case 1:
printf("%+2.3f\n", s.f);
break;
case 2:
scanf("%lf", &s.d);
break;
}</code></pre>
<p><em>C with Classes</em> used unions and function pointers to
implement <a
href="http://en.wikipedia.org/wiki/Polymorphism_%28computer_science%29">polymorphism</a>.</p>
<h3 id="function-pointers">Function pointers</h3>
<p>Function pointers are useful in many instances. The most common of
those include function tables (arrays of function pointers),
implementation of code in which you may not statically know what
function will be called (like <code>qsort()</code> and
<code>bsearch()</code>; technically <em>all</em> instances in which you
should use function pointers fall under this heading), and
implementation of object oriented code in C.</p>
<p>Syntactically, a function pointer looks like a function prototype,
except that the “function name” (actually the name of the pointer) is
wrapped in parenthesis with a pointer star at the beginning. For
example:</p>
<pre><code>int (*pprintf)(const char* format, ...)</code></pre>
<p>defines a pointer that can point to <code>printf()</code> (not
particularly useful). A function’s name, without parenthesis, evaluates
to its address, thus we can assign the address of <code>printf()</code>
to <code>pprintf</code> with the statement:</p>
<pre><code>pprintf = printf;</code></pre>
<p>and you can call <code>printf()</code> through the
<code>pprintf</code> pointer with:</p>
<pre><code>pprintf("Hello, World!\n");</code></pre>
<h3 id="qsort">qsort()</h3>
<p><code>qsort()</code>, short for quick sort, can sort arrays of data
of arbitrary type. In implementing <code>qsort()</code>, the subroutine
has no knowledge of how to compare the arbitrary elements being sorted,
thus <code>qsort()</code> takes a pointer to a comparison function. Its
full prototype looks like this:</p>
<pre><code>void qsort(void* base, size_t nmemb, size_t size, int (*compar)(const void*, const void*));</code></pre>
<p>The comparison function takes pointers to two elements of the array
starting at base and returns negative, zero, or positive if the first is
smaller than, equal to, or larger than the second, respectively. The
array starting at base has nmemb elements of size size.</p>
<h3 id="object-oriented-c">Object-Oriented C</h3>
<p>With <code>list_item_t</code> defined as above, consider the
following:</p>
<pre><code>struct list {
list_item_t* head;
list_item_t* tail;
unsigned int length;
int (*compare)(const void* key, const void* with);
void (*datum_delete)(void*);
};</code></pre>
<p>This is a type definition for an object-oriented doubly-linked list
class in C. <code>list_item_t</code> contains a pointer to a
<code>void</code> type. In other words, it can point to anything, so if
we want to do sorted insert, we need a comparison function, just as in
<code>qsort()</code>. Furthermore, if we destroy the list, we probably
want the data contained in it to be returned to the heap. If that
requires a only shallow delete, <code>free()</code> can be passed to the
constructor, otherwise, a destructor for the stored type must be passed.
If the list is not to be sorted, or if the data is not to be destroyed
with the list, one or both of compare and delete_datum can be
<code>NULL</code>.</p>
<p>Note that, for the exercise below, you need not implement
object-oriented C code; your code can be pure C.</p>
<hr />
<h2 id="exercise">Exercise</h2>
<p>This exercise is to be developed in C, and compiled using clang (NOT
clang++!). The exercise is to implement a <strong><em>very
simple</em></strong> linked list. Your program should do the
following:</p>
<ol type="1">
<li>Read in an integer, which we’ll call <em>n</em></li>
<li>Read in <em>n</em> more ints, and put those into a linked list
<ul>
<li>The linked list must be dynamically allocated</li>
</ul></li>
<li>Print out that linked list (we don’t care about the order!)</li>
<li>Properly deallocate the linked list</li>
</ol>
<p>That’s it - the point is to have you use many of the features of C
discussed here (<code>printf()</code>, <code>scanf()</code>, structs,
<code>malloc()</code>, and <code>free()</code>). We aren’t looking for
multiple subroutines, a full list class, etc. – a long
<code>main()</code> function is just fine. Don’t make this more
complicated than necessary!</p>
<p>The program should be in a file called <code>linkedlist.c</code>. A
sample execution run might look like the following:</p>
<pre><code>Enter how many values to input: 4
Enter value 1: 2
Enter value 2: 4
Enter value 3: 6
Enter value 4: 8
8
6
4
2</code></pre>
</body>
</html>