labs/lab09-32bit/index.html

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <title>PDR: Laboratory 9: x86 Assembly Language, part 2 (32 bit)</title>
  <style>
    code{white-space: pre-wrap;}
    span.smallcaps{font-variant: small-caps;}
    span.underline{text-decoration: underline;}
    div.column{display: inline-block; vertical-align: top; width: 50%;}
    div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
    ul.task-list{list-style: none;}
    .display.math{display: block; text-align: center; margin: 0.5rem auto;}
  </style>
  <link rel="stylesheet" href="../../markdown.css" />
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
</head>
<body>
<h1 id="pdr-laboratory-9-x86-assembly-language-part-2-32-bit">PDR:
Laboratory 9: x86 Assembly Language, part 2 (32 bit)</h1>
<p><a href="../index.html">Go up to the Labs table of contents
page</a></p>
<h3 id="objective">Objective</h3>
<p>This lab is one of two labs meant to familiarize you with the process
of writing, assembling, and linking assembly language code. The purposes
of the in-lab and post-lab activities are to investigate how various C++
language features are implemented at the assembly level.</p>
<p>There are both <a href="../lab09-32bit/index.html">32 bit</a> (<a
href="../lab09-32bit/index.md">md</a>) and <a
href="../lab09-64bit/index.html">64 bit</a> (<a
href="../lab09-64bit/index.md">md</a>) versions of this lab. This is the
<strong><em>32 bit version</em></strong>.</p>
<h3 id="background">Background</h3>
<p>The Intel x86 assembly language is currently one of the most popular
assembly languages and runs on many architectures from the x86 line
through the Pentium 4. It is a CISC instruction set that has been
extended multiple times (e.g. MMX) into a larger instruction set.</p>
<h3 id="readings">Reading(s)</h3>
<ol type="1">
<li>Read the <a href="../../slides/08-x86.html">slides on x86</a></li>
<li>The two book chapters on x86: <a
href="../../book/x86-32bit-asm-chapter.pdf">x86 Assembly</a> and <a
href="../../book/x86-32bit-ccc-chapter.pdf">The x86 C Calling
Convention</a>.</li>
</ol>
<h2 id="lab-procedure">Lab Procedure</h2>
<h3 id="pre-lab">Pre-lab</h3>
<ol type="1">
<li>If you need to, read through the pre-lab pages from the previous lab
on compiling C++ with assembly, as well as the vecsum program.</li>
<li>In particular, you should be familiar with the differences between
the various platforms, and the required submission format (it’s Linux),
as described in the pre-lab section.</li>
<li>Follow the pre-lab instructions in this document. They require you
to write a program in x86 assembly called threexplusone.s.</li>
<li>List, as comments in that assembly file, the optimizations that you
used</li>
<li>Read the <a href="../../tutorials/09-c/index.html">C tutorial</a>.
You will need to implement the linkedlist.c program for the post-lab,
not for the pre-lab.</li>
<li>Make sure your Makefile compiles your code, and that it compiles it
in Linux format! This means elf as the nasm format, and no underscore
for the subroutine name (both on the ‘global’ line and on the line label
itself).</li>
<li>Also note that your timer.h file you should have an <code>#include
&lt;string.h&gt;</code>.</li>
<li>Files to download: <a
href="../lab06/code/timer.cpp.html">timer.cpp</a> (<a
href="../lab06/code/timer.cpp">src</a>), <a
href="../lab06/code/timer.h.html">timer.h</a> (<a
href="../lab06/code/timer.h">src</a>) (from the lab 6 directory)</li>
<li>Files to submit: threexplusone.s, threexinput.cpp, timer.cpp,
timer.h, Makefile</li>
</ol>
<h3 id="in-lab">In-lab</h3>
<ol type="1">
<li>Address one of the topics in the list of the in-lab section. Be sure
to address all the issues in that topic! You will have to complete two
(total) of these topics for the post-lab report.</li>
<li>We are looking for a brief write-up indicating that you addressed
one of the topics, and the results that you found. You do not need to
make it a full fledged report yet (that’s the post-lab).</li>
<li>Files to download: none (other than the results of your
pre-lab)</li>
<li>Files to submit: inlab9.pdf</li>
</ol>
<h3 id="post-lab">Post-lab</h3>
<ol type="1">
<li>Finish addressing two of the topics listed in the in-lab section
(the first topic is the one you started in the in-lab; the second one is
new for the post-lab). We are looking for a quality write-up here, as
detailed below.</li>
<li>You must complete two total topics; the first one is what you
started (and possibly finished) from the in-lab; the second one is new
for the post-lab</li>
<li>While this seems like a long post-lab, these list items should have
been worked on during the pre-lab and the in-lab, which makes the
post-lab much shorter.</li>
<li>Implement the linkedlist.c program from the <a
href="../../tutorials/09-c/index.html">Tutorial 9: C</a> (<a
href="../../tutorials/09-c/index.md">md</a>)</li>
<li>Make sure your Makefile compiles your code!</li>
<li>Files to download: none (other than the results of your pre-lab and
in-lab)</li>
<li>Files to submit: postlab9.pdf, linkedlist.c, Makefile</li>
</ol>
<hr />
<h2 id="pre-lab-1">Pre-lab</h2>
<p>You may want to reference the “Compiling Assembly With C++” and
“Vecsum” sections from the <a href="../lab08/index.html">previous x86
lab</a>.</p>
<h3 id="pre-lab-program-threexplusone.s">Pre-lab program:
threexplusone.s</h3>
<p>The 3x+1 conjecture (also called the Collatz conjecture) is an open
problem in mathematics, meaning that it has not yet been proven to be
true. The conjecture states that if you take any positive integer, you
can repeatedly apply the following function to it:</p>
<p><img src="formula.png" /></p>
<p>The conjecture is that eventually, the result will reach 1. For
example, consider <em>x</em> = 13:</p>
<ul>
<li><em>f</em>(13) = 3 * 13 + 1 = 40</li>
<li><em>f</em>(40) = 40 / 2 = 20</li>
<li><em>f</em>(20) = 20 / 2 = 10</li>
<li><em>f</em>(10) = 10 / 2 = 5</li>
<li><em>f</em>(5) = 3 * 5 + 1 = 16</li>
<li><em>f</em>(16) = 16 / 2 = 8</li>
<li><em>f</em>(8) = 8 / 2 = 4</li>
<li><em>f</em>(4) = 4 / 2 = 2</li>
<li><em>f</em>(2) = 2 / 2 = 1</li>
</ul>
<p>Note that this took 9 steps to reach the value 1. And it also shows
that this conjecture is true for a number of other values (2, 4, 5, 8,
10, 16, 20, and 40).</p>
<p><a href="Collatz-graph-all-30-no27.png">An image</a> (from Wikipedia)
shows how paths of most integers less than 50 converge to 1.</p>
<p>This conjecture has been proven for all integers up to at least 5.6 *
10^13, but has not yet been proven for all (positive) integers. It is
widely believed to be true, however. If you are interested, more
information on this conjecture can be found <a
href="http://en.wikipedia.org/wiki/Collatz_conjecture">here</a>.</p>
<p>As the conjecture has been proven for numbers up to 5.6 * 10^13, and
our 32-bit machines can only count as high as 4.3 * 10^9 (that’s 2^32),
we can safely assume that it is true for all of the input values that we
will use.</p>
<p>Your task is to write a routine, called <code>threexplusone</code>,
that will return the number of steps required to reach 1. An input of 13
takes 9 steps, as shown above. The Wikipedia page shows a few other
input sizes and the number of steps: an input of 6 takes 8 steps; an
input of 14 takes 17 steps; an input of 27 takes 111 steps. If the input
is 1, the output should be zero. Your program need not consider values
of zero or -1 (we will never test those values, as the conjecture does
not apply in those cases).</p>
<p>This routine <strong><em>MUST</em></strong> call itself recursively
using the proper C-style calling convention. The assembly code should be
in a threexplusone.s file. You will need to write a C++ file that
(called threexinput.cpp) that calls the subroutine and prints out the
result. <strong>If you write your function so that it is an iterative
solution, you will not receive credit for this pre-lab.</strong></p>
<p>The routine MUST take only ONE parameter – the number that you are
inputting into the conjecture (13, in the example above), and must
return the count of the number of steps taken, as per the C calling
convention.</p>
<p>Once the subroutine is done, you will need to optimize it as much as
possible. With the exceptions listed below, any optimization is valid,
as long as it computes the correct result. The grade on this pre-lab
will be based both on the correctness of the subroutine and the
optimizations included. The only exceptions to the optimizations are
that it must still be a recursive subroutine, and must still follow the
proper C style calling conventions.</p>
<p>What optimizations do you use?</p>
<ul>
<li>First, try to figure out how you can write the same routine using
fewer x86 instructions. x86 has lots of complex instructions that can be
used for this purpose – Google is your friend, here.</li>
<li>Some optimizations, such as using the <code>lea</code> instruction
(which can do addition and multiplication in one instruction) to quickly
add or multiply numbers, are specific to the x86 architecture.</li>
<li>In some cases, you can use bit shifts instead of multiplication and
divide. Computing <code>5*x</code> might be more quickly done as
<code>4*x+x</code>, as the latter can use a shift. But you can use the
<code>lea</code> instruction for that one anyway.</li>
<li>Branching is bad, in that it slows down the execution speed of a
program. As much as possible, eliminate branching (if/else statements,
loops, etc.). For loops, consider <a
href="http://en.wikipedia.org/wiki/Loop_unrolling">loop
unrolling</a>.</li>
<li>Consider the <a
href="http://en.wikipedia.org/wiki/Memory_hierarchy">memory
Hierarchy</a> and try to reduce memory accesses (this includes
<code>push</code> and <code>pop</code>).</li>
<li>Reduce the number of instructions used to create (and remove) the
activation record; this was done in a few x86 examples we studied: <a
href="../../slides/08-x86.html#/max">max</a> and <a
href="../../slides/08-x86.html#/fib">fib</a></li>
<li>Reduce the registers that are backed up to the stack in the calling
convention</li>
<li>Can you offset things from <code>esp</code> instead of
<code>ebp</code>? If so, then you don’t have to set up the base
pointer.</li>
<li>Many optimizations are listed <a
href="http://en.wikipedia.org/wiki/Category:Compiler_optimizations">here</a>,
but most would not apply to this one program.</li>
</ul>
<p>You will need to include at least one optimization beyond just
figuring out how to write your subroutine with fewer instructions. You
should put the optimizations used as a comment in the beginning of your
assembly file.</p>
<p>Note that we, too, can write the function in C++ and compile it with
<code>-O2 --S --masm=intel</code>. And we will be looking at that
assembly code when we grade the pre-lab. If you write your program this
way, it constitutes an honor violation, so please hand-code the assembly
yourself.</p>
<p>In an effort to time how fast your assembly routine runs, you should
download the timer code from the hash table lab (lab 6: <a
href="../lab06/code/timer.cpp.html">timer.cpp</a> (<a
href="../lab06/code/timer.cpp">src</a>) and <a
href="../lab06/code/timer.h.html">timer.h</a> (<a
href="../lab06/code/timer.h">src</a>)). In your threexinput.cpp file,
you will need to perform the following steps:</p>
<ol type="1">
<li>Asks the user for an input value, <em>x</em>, which is the parameter
to pass to the subroutine.</li>
<li>Asks the user for an input value, <em>n</em>, which is the number of
times to call the subroutine.</li>
<li>Runs the subroutine <em>n</em> times with the parameter <em>x</em>
as the input.</li>
</ol>
<p>You can assume that both <em>x</em> and <em>n</em> are positive
integers. See the <a href="../lab06/index.html">hash lab (lab 6)</a> for
details as to how to use the timer code. Your program should print out
the number of steps for the given input, and the average time taken per
function call. You should use an appropriate precision number to make
sure you don’t report zero when you divide the total time by the number
of runs. Your timer code should only include the loop of <em>n</em>
times that calls the routine with <em>x</em> as the parameter. Nothing
else (including the print statement) should be inside the timer
code.</p>
<p>You may find the <code>cdq</code> instruction useful – do a Google
search for ‘cdq x86’ or ‘cdq intel’.</p>
<p><strong>You must list, as comments in your assembly file, the
optimizations that you used!</strong> Just a brief description of what
optimizations you used is fine.</p>
<h3 id="different-architecutres">Different Architecutres</h3>
<p>See the <a href="../lab08/index.html">last lab</a> for details, but
all code must be submitted to run under Linux, which is the platform
that does the compilation on the submission system.</p>
<h3 id="read-the-c-tutorial">Read the C tutorial</h3>
<p>Read the <a href="../../tutorials/09-c/index.html">C tutorial</a>.
You will need to implement the linkedlist.c program for the post-lab,
not for the pre-lab.</p>
<h3 id="compiling-with-make">Compiling with make</h3>
<p>Your code will be compiled with make. See the last lab for a sample
Makefile that will compile assembly.</p>
<hr />
<h2 id="in-lab-1">In-lab</h2>
<p>Come to lab with a functioning version of the pre-lab, and be
prepared to demonstrate that you understand how to build and run the
pre-lab programs. If you cannot, work through the tutorial during lab.
If you are unsure about any part of the pre-lab, talk to a TA. The
in-lab will ask you to write C++ code and examine the generated assembly
language for a variety of topics.</p>
<p>You should be able to explain and write recursive functions for the
final exam, so make sure that you understand how to implement the
pre-lab program. Speak to a TA if you have any questions.</p>
<p>The general activity of this in-lab will be to write small snippets
of C++ code, compile them so that you can look at the generated assembly
code, then make modifications and recompile as needed in order to deduce
the representation of a number of C++ constructs, listed below. Remember
that we are compiling using g++ (<code>g++ -S -masm=intel -m32</code>),
and not clang++.</p>
<p>For the in-lab, you will need to work on one of the items in the list
below – note that this is a different list than the previous lab. You
will do a second topic for the post-lab. You should be prepared to
explain the appropriate items from the list to the TA.</p>
<p>The deliverable for the in-lab is a PDF document named inlab9.pdf. It
must be in PDF format! See <a href="../../docs/convert_to_pdf.html">How
to convert a file to PDF</a> page for details about creating a PDF
file.</p>
<p>In your report, you should explain something from one item in the
list in the in-lab report. Note that for the post-lab, you will have to
have two of the items fully explained, but you need only get through one
for the in-lab. Your report would presumably include the code snippets
(both C++ and assembly) that you generated during lab, images, screen
shots, results, etc.</p>
<p>Recall that using the <code>-S</code> flag with g++ will generate the
assembly code. You will also want to use the <code>-masm=intel</code>
and <code>-m32</code> flags.</p>
<h3 id="topic-list">Topic List</h3>
<p>For the in-lab, you have to address one topic; either the required
one or an optional one. For the post-lab, you will have to address two
total topics: the one you addressed from the in-lab, and one additional
one for the post-lab. Note that everybody has to address the dynamic
dispatch one, but it is your call whether you do that for the in-lab or
the post-lab.</p>
<p>####Required#### 1. Dynamic dispatch: Describe how dynamic dispatch
is implemented. Note that dynamic dispatch is NOT the same thing as
dynamic memory! Show this using a simple class hierarchy that includes
virtual functions. Use more than one virtual function per class.</p>
<p>####Optional#### 1. Inheritance (data layout, construction, and
destruction): Create an instance of an object that inherits data members
from another class, and also includes data members of its own. Show in
memory where data members are laid out in that object. Then explain how
construction and destruction happens in this class hierarchy. Explain
what happens when a user-defined object is instantiated and what happens
when it goes out of scope. What if anything is “destroyed” by the
destructor? Show this process happening in the assembly code using a
simple class hierarchy. Point out in the assembly code exactly where the
destructors and constructors are getting called.</p>
<ol start="2" type="1">
<li><p>Optimized code: Compare code generated normally to optimized
code. To create optimized code, you will need to use the
<code>-O2</code> compiler flag. Can you make any guesses as to why the
optimized code looks as it does? What is being optimized? Be sure to
show your original sample code as well as the optimized version. Try
loops and function calls to see what “optimizing” does. Be aware that if
instructions are “not necessary” to the final output of the program then
they may be optimized away completely! This does not lead to very
interesting comparisons. Describe at least four (non-trivial)
differences you see between ‘normal’ code and optimized code.</p></li>
<li><p>Templates: What does the code look like for the instantiation of
a simple templated class you wrote? What if you instantiate the class
for different data types, what code is generated then? Is it the same or
different? If the same, why? If different, why? Compare code for a
user-defined templated class or function to a templated class from the
STL (e.g. classes such as vectors or functions such as sort).</p></li>
</ol>
<hr />
<h3 id="post-lab-1">Post-lab</h3>
<h3 id="complete-the-c-tutorials-exercise-program">Complete the C
tutorial’s exercise program</h3>
<p>Read the <a href="../../tutorials/09-c/index.html">C tutorial</a>.
You will need to implement the linkedlist.c program.</p>
<h3 id="report">Report!</h3>
<p>Explore, investigate, and understand two of the four topics from the
topic list shown in the in-lab section. The topic addressed during the
in-lab is one of these; for the post-lab, you have to address one other.
Be able to answer “how” and possibly “why” for each item. Use test cases
and the debugger as resources. Additionally use resources other than
yourself (e.g. books, Web, etc.). Be sure to credit these sources.
<strong><em>You must use (and cite!) additional resources for
this!</em></strong></p>
<p>Prepare a report that explains your findings. Follow the guidelines
in the Post-lab Report Guidelines section from the previous lab. Address
the following: How the compiler implements the construct at the machine
and assembly levels. What leads you to this conclusion? You must show
evidence of this behavior in the form of assembly code, C++,
screenshots, memory dumps, manual quotations, output, etc. Also include
where you found the information that lead to your conclusion. (i.e. your
sources).</p>
<p>The deliverable for the in-lab is a Word document named postlab9.pdf.
It must be in PDF format! See [How to convert a file to PDF] for
details.</p>
<h3 id="tips-for-getting-started-on-the-post-lab">Tips for Getting
Started on the Post-lab</h3>
<p>See the section in the previous lab for these tips.</p>
<h3 id="post-lab-report-guidelines">Post-lab Report Guidelines</h3>
<p>See the section in the previous lab for these guidelines.</p>
<h3 id="how-much-are-we-looking-for">How much are we looking for?</h3>
<p>We want you to investigate the particular topic area from the given
list, write code to discover the answers, and learn about this topic on
your own. The questions that we pose are just meant to get you thinking
about the possible ramifications of a given question. They aren’t meant
to be specific questions that necessarily need answering one at a
time.</p>
<p>As with the previous lab, I would expect the explanation of each item
(you have to do two items) to be a page or two long, including embedded
code snippets and screenshots (obviously, we want a reasonable amount of
English text here – if you do a lot of screen shots, then your total
length will be a bit longer). Did you investigate the topic? Did you
write code to discover what you didn’t know? Was this written in a
reasonably readable format? This is what we are looking for.</p>
<p>This is somewhat vague, and purposely so – research is often vague.
If we told you exactly what to write, then there wouldn’t be much
discovery of that on your part, which would defeat the whole point of
this lab.</p>
<p><strong>We are not looking for you to spend hours and hours and hours
on this!</strong> A <em>page or two</em> per list item (and you have to
do two of them) - which means your final report needs to be 2-4 pages
long. Keep in mind if you have a lot of screenshots, that doesn’t count
much towards that page limit.</p>
<p>The grading will be based on a set of points that we would expect you
to discover when investigating a given topic. Your grade will be based
mostly on how well you hit those points. A small portion of your grade
will be based on the overall report presentation and written ability
(while we are a computer science class, we expect you to be able to
write in English to some extent!).</p>
</body>
</html>