The main goal of the project is to determine whether a given dataset was drawn fairly or chosen intentionally. Here, a dataset is a 100-line .txt file in which every line contains the size, in bytes, of a randomly (?) selected file from the user's computer.
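To make the input format concrete, here is a minimal sketch of how such a file could be read and sanity-checked. The function name `load_sizes` and the strict "exactly 100 non-negative integers" check are assumptions made for illustration, not necessarily how the project itself parses its input.

```python
from pathlib import Path


def load_sizes(path, expected_lines=100):
    """Read one file size in bytes per line and return them as a list of ints.

    Hypothetical helper: raises ValueError if a line is not a plain
    non-negative integer or if the line count differs from expected_lines.
    """
    sizes = []
    lines = Path(path).read_text().splitlines()
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        # Accept only plain ASCII digits so that int() cannot fail below.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"line {number}: expected a size in bytes, got {text!r}")
        sizes.append(int(text))
    if len(sizes) != expected_lines:
        raise ValueError(f"expected {expected_lines} lines, found {len(sizes)}")
    return sizes
```

Rejecting malformed files up front keeps any later analysis from silently working on fewer than 100 values.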
In the "Tools and security" class, I first had to send my professor my own 100-line .txt file containing the sizes, in bytes, of files selected at random from across my whole PC (this required writing a separate script so that the selection was legitimate and fair). After this task, we received datasets from other students in our year, each listing the sizes of 100 supposedly randomly selected files, and the goal was to evaluate which of them were legit (fairly drawn file sizes) and which were cheated (file sizes picked without writing a script to select them randomly). Moreover, each file had to be assigned a rating on a reliability scale. After all files were analyzed, a report compiling the results was written to a .csv file.
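The sampling script mentioned above is not shown here, so the following is only a sketch of how a fair draw could be done: walk a directory tree, collect the size of every regular file, and pick 100 of them uniformly at random without replacement. The names `collect_file_sizes`, `sample_sizes`, and `write_sizes`, as well as the optional `seed` parameter, are hypothetical.

```python
import os
import random


def collect_file_sizes(root):
    """Return the size in bytes of every regular file found under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip symbolic links so the same file is not counted twice.
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue
            try:
                sizes.append(os.path.getsize(full_path))
            except OSError:
                # The file vanished or is unreadable; ignore it.
                continue
    return sizes


def sample_sizes(root, count=100, seed=None):
    """Draw count file sizes uniformly at random, without replacement."""
    sizes = collect_file_sizes(root)
    if len(sizes) < count:
        raise ValueError(f"only {len(sizes)} files under {root!r}, need {count}")
    return random.Random(seed).sample(sizes, count)


def write_sizes(sizes, out_path):
    """Write one size per line, matching the 100-line .txt format."""
    with open(out_path, "w") as handle:
        handle.writelines(f"{size}\n" for size in sizes)
```

For example, `write_sizes(sample_sizes(os.path.expanduser("~")), "sizes.txt")` would produce a dataset from the home directory. Sampling from the complete list of files, rather than choosing folders by hand, is what makes the draw uniform over files.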