Merge pull request #1 from zackglazewski/main

ZG - Update PA02 writeup
ucsb-cs24 · Nov 22, 2023 · bd662e7
2 parents 9688f71 + 997af6f
commit bd662e7

Showing 4 changed files with 45 additions and 27 deletions.
diff --git a/_info/staff.md b/_info/staff.md
@@ -32,9 +32,9 @@ TBD
 
 ## Zackary Glazewski, LA
 
-<img src="../staff/CS16-zackGlazewski.jpg" alt="Zack" width="150px" style="float: left; margin: 5px 10px 10px 10px;">
+<img src="staff/CS24-F23-Zackary-G.jpg" alt="Zack" width="150px" style="float: left; margin: 5px 10px 10px 10px;">
 
-Hello everyone! My name is Zack Glazewski and I'm a 3rd year CS Major. I feel very excited to help you all not only learn, but enjoy the process along the way. Some more information about me: I love cats (I have a cat named Link who is bascially my child), playing piano, and developing my own games. I'm always happy to help, so please don't hesitate if you have any questions. I look forward to meeting you all!
+Hello everyone! My name is Zack Glazewski and I'm a 4th year CS Major. I feel very excited to help you all learn and enjoy the process along the way. Some more information about me: I love cats (I have a cat named Link), playing piano, and developing my own games. I'm always happy to help, so please don't hesitate if you have any questions. I look forward to meeting you all!
 
 
 <br>
@@ -50,4 +50,4 @@ Hello everyone! My name is Karanina Zim
 
 <img src="" alt="Torin" width="150px" style="float: left; margin: 5px 10px 10px 10px;">
 
-Hello everyone! My name is Torin Schlunk
+Hello everyone! My name is Torin Schlunk
diff --git a/_info/staff/CS16-zackGlazewski.jpg → _info/staff/CS24-F23-Zackary-G.jpg b/_info/staff/CS16-zackGlazewski.jpg → _info/staff/CS24-F23-Zackary-G.jpg
diff --git a/backup_pa/Part2RuntimeBenchmarks.svg b/backup_pa/Part2RuntimeBenchmarks.svg
diff --git a/backup_pa/pa02.md b/backup_pa/pa02.md
@@ -2,15 +2,17 @@
 num: pa02
 ready: true
 desc: "Application of data structures to a movie dataset"
-assigned: 2023-02-23 9:00:00.00-8
-due: 2023-03-14 23:59:59.00-8
+assigned: 2023-11-20 9:00:00.00-8
+due: 2023-12-04 23:59:59.00-8
 ---
 
 # Collaboration policy
 This assignment must be completed solo.
 
 # Introduction
 
+Please read the entire writeup before beginning the PA. In particular, take a look at part 3a to understand your requirements for a full score before attempting to implement a solution. You are graded on efficiency. 
+
 In this assignment, you will 
 * Use a container data structure from the C++ Standard Template Library (STL) to store and query data.
 * Analyze the time and space complexity of your algorithms.
@@ -43,9 +45,14 @@ Obtain the starter code from this repo:
 * `input_100_random.csv`
 * `input_1000_ordered.csv`
 * `input_1000_random.csv` 
-
-You are given 6 datasets in CSV files. Each CSV file:
-* has either 20, 100, or 1000 entries
+* `input_76920_ordered.csv`
+* `input_76920_random.csv`
+* `prefix_small.txt`
+* `prefix_medium.txt`
+* `prefix_large.txt`
+
+You are given 8 datasets in CSV files. Each CSV file:
+* has either 20, 100, 1000, or 76920 entries
 * is either ordered in alphabetical order of movie name or ordered randomly
 
 **Example of alphabetical order**
@@ -98,7 +105,7 @@ money train,5.4
 
 ## Files to complete
 * `movies.cpp, movies.h`: these files should contain any abstractions that you need to define. 
-	* We strongly discourage implementing any data structures from scratch.
+	* We strongly discourage implementing any data structures from scratch, although you might need to if you want the most efficient solution.
 * `main.cpp`: this file should read in the movies from input files and produce the expected output.
 * `Makefile`: this file generates the executable `runMovies` 
 
@@ -110,21 +117,17 @@ This assignment has 3 parts. You should **separate your algorithm for part 1 fro
 
 ## Command-line arguments
 ```
-./runMovies filename prefix_1 prefix_2 prefix_3 ... prefix_m
+./runMovies movieFilename prefixFilename
 ```
 
-* `filename` represents the input file containing movies and ratings (as described before).
-* `prefix_i` is a prefix for one or more movie names. 
+* `movieFilename` represents the input file containing movies and ratings (as described before).
+* `prefixFilename` is a `.txt` file that contains a list of prefixes; see `prefix_medium.txt` for an example.
 	* There may be up to `m` such prefixes in the command-line arguments.
-	* If a prefix contains white spaces, it must be placed within quotation marks `"`. 
+    * Prefixes may include whitespace; see `prefix_medium.txt` for examples. Each line in the file corresponds to exactly one prefix.
 
-**Example of a prefix with whitespaces**
-```
-./runMovies input_1000_random.csv "the american" ab
-```
 
 # Part 1: Print all movie names and ratings
-Your program should print out all the movies in **alphabetical order of movie name**. You may use **only one data structure** of your choice to store the data from the CSV file. When testing this part, do not give any prefixes as command-line arguments!
+Your program should print out all the movies in **alphabetical order of movie name**. You may use **only one data structure** of your choice to store the data from the CSV file. When testing this part, do not include the prefix file as a command-line argument!
 
 **Example with no prefixes**
 ```
@@ -159,7 +162,7 @@ If one or more prefixes are given as command-line arguments, then for each prefi
 * Find the movies whose names start with that prefix.
 * Find the highest rated movie for that prefix.
 
-You may use additional data structures from the C++ STL to help you solve this part of the assignment.
+You may use additional data structures from the C++ STL to help you solve this part of the assignment, or you may write your own structure.
 
 ### Part 2a: All movies starting with a prefix
 First, for each prefix, your program should print out all the movies whose names start with that prefix in **decreasing order of rating**. If multiple movies have the same rating, then print them in alphabetical order of movie name. For example, print `the american president, 6.5` before `the confessional, 6.5`. If no movie names start with that prefix, then print 
@@ -173,7 +176,7 @@ Second, for each prefix, your program should print the **highest rated** movie w
 ### Examples
 **Example with 3 prefixes and multiple movies with same rating**
 ```
-./runMovies input_100_random.csv to th w
+./runMovies input_100_random.csv prefix_small.txt
 ```
 This should produce the following output:
 ```
@@ -205,8 +208,9 @@ Best movie with prefix w is: wings of courage with rating 6.8
 ```
 
 **Another example multiple movies with same rating**
+*Let prefix.txt be a file that contains the prefix "be"*
 ```
-./runMovies input_1000_random.csv be
+./runMovies input_1000_random.csv prefix.txt
 ```
 should produce the output:
 ```
@@ -227,8 +231,9 @@ Best movie with prefix be is: before sunrise with rating 7.7
 ```
 
 **Example with no movies for a given prefix**
+*Let prefix.txt be a file that contains the prefixes "t" and "xyz"*
 ```
-./runMovies input_100_random.csv t xyz
+./runMovies input_100_random.csv prefix.txt
 ```
 should produce the output
 ```
@@ -267,11 +272,23 @@ Analyze the worst case Big-O time complexity of your algorithm from part 2 of th
 * all `n` movies are already stored in your data structure.
 * all `m` prefixes are already stored in an array.
 
-You must provide the time complexity analysis as a **commented block** right after your `main()` function in `main.cpp`. Note that 
+1. You must provide the time complexity analysis as a **commented block** right after your `main()` function in `main.cpp`. Note that 
+
 * your final answer will be some function of `n`, `m`, and/or `k`.
 * your final answer will depend on your data structure and algorithm choices. 
 
-You will be graded for the efficiency of your algorithms but also the clarity and correctness of your analysis. However, we are not giving you a target Big-O time complexity. A set of solutions that have similar Big-O time complexities will receive the same grade.
+2. Your analysis should also report on **specific running times achieved by your solution** on *each* random input file. For example, write down the number of milliseconds it takes for your solution to run on each input. Only report on the randomized datasets. 
+
+* Be sure to check that your proposed time complexity from (1) is roughly consistent with your runtimes from (2).
+
+You will be graded for the efficiency of your algorithms but also the clarity and correctness of your analysis. 
+
+Here are runtime plots of three different types of solutions. These runtimes were gathered on the csil machines. If you want to get a proper runtime comparison, please run your code on csil. 
+* **Full credit will be given to solutions with an efficiency similar to `Mystery Implementation #2`**
+* The students with the top 5 runtimes will receive **extra credit**
+
+<img src="./Part2RuntimeBenchmarks.svg" alt="Part2" width="80%" style="display:block; margin: 5px 10px 10px 10px;">
+
 
 ### Part 3b: Analyze space complexity
 Analyze the worst case Big-O space complexity of your algorithm from part 2 of the assignment. You may assume that 
@@ -286,7 +303,7 @@ You will be graded for the efficiency of your algorithms but also the clarity an
 
 ### Part 3c: Explore tradeoffs between time/space complexity
 Briefly state how you designed your algorithm from part 2 of the assignment for the task at hand. More specifically, answer this question:
-* Did you design your algorithm for a low time complexity, a low space complexity, or both?
+* Did you design your algorithm for a low time complexity, a low space complexity, or both? What were your target complexities?
 
 Based on your answer to the question above, answer one of the following:
 1. If you designed your algorithm for a low time complexity,
@@ -307,8 +324,8 @@ You will be graded for the clarity and thoughtfulness of your analysis.
 ## Requirements
 For this programming assignment, you will have a lot of flexibility on your implementation (which just means we won't be providing a code framework for you to fill in). However, there are a few requirements that you need to keep in mind as you think about your solution:
 
-* You must make appropriate use of data structures from the STL
+* You must make appropriate use of data structures from the STL, or implement your own.
 * Your code should be readable
 * Your classes should define clear interfaces and hide implementation details as much as possible. 
-* You must include your space and time complexity analyses in `main.cpp`, as a commented block under the `main()` function
+* You must include your space and time complexity analyses (part 3) in `main.cpp`, as a commented block under the `main()` function
 * Your `Makefile` must produce an executable called `runMovies` from the `make` command.
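
The updated writeup switches from command-line prefixes to a prefix file with one prefix per line, and asks for matches in decreasing order of rating with ties broken alphabetically. A minimal sketch of that flow follows, assuming the movies are held in a `std::map` keyed by name. The names `MovieMap`, `readPrefixes`, and `moviesWithPrefix` are hypothetical and do not come from the starter code, and this is only one possible approach, not the reference solution or any of the benchmarked implementations.

```cpp
#include <algorithm>
#include <istream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical aliases (not from the starter code).
using MovieMap = std::map<std::string, double>;   // movie name -> rating, sorted by name
using Movie = std::pair<std::string, double>;

// Read one prefix per line; a line may contain spaces (e.g. "the american").
std::vector<std::string> readPrefixes(std::istream& in) {
    std::vector<std::string> prefixes;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings (an assumption about the input files).
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) prefixes.push_back(line);
    }
    return prefixes;
}

// Collect every movie whose name starts with `prefix`,
// ordered by decreasing rating, then alphabetically by name.
std::vector<Movie> moviesWithPrefix(const MovieMap& movies, const std::string& prefix) {
    std::vector<Movie> matches;
    // Names sharing a prefix are contiguous in a sorted map, starting at lower_bound(prefix).
    for (auto it = movies.lower_bound(prefix);
         it != movies.end() && it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        matches.emplace_back(it->first, it->second);
    }
    std::sort(matches.begin(), matches.end(), [](const Movie& a, const Movie& b) {
        if (a.second != b.second) return a.second > b.second;  // higher rating first
        return a.first < b.first;                              // ties: alphabetical
    });
    return matches;
}
```

In `main`, one could open a `std::ifstream` on the `prefixFilename` argument and pass it to `readPrefixes`. With `k` matches for a prefix, this sketch performs roughly `O(log n + k log k)` comparisons per prefix, ignoring the cost of each string comparison; whether that is fast enough for full credit is not something this sketch claims.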