A place to put links and pictures of projects I want potential employers to see.
For now, I suggest using the Google CoLab versions of the Jupyter Notebooks (or the PDF versions), which show the input and output of each code cell. Each has a button and link like the one below (the example below is just an image with no link).
If you have more time, the MyBinder versions of the Jupyter Notebooks might be even better, since they are more interactive and it's easier to keep the versions of programs compatible. They do take longer to load, because a Docker image has to be built and served.
In addition to some fun projects with general programming, I'm going to put in some of my favorite visualizations and Machine Learning work. Any of this stuff could be completely-guided-from-tutorials or completely-original, or anywhere in between.
I'll just use links here, with each link pointing to a markdown file in the porftfolio-resume repo with [ *** stuff -> @todo *** ]. Each of these markdown files will have a short description of the project and any visualization, some images with screencaps, and a link to the main repo; it will be a presentation page. Here, I'll list the projects with links to each of those presentation pages. There will also be a link to each actual repo with the full project. Since I leave the notebooks without output, so that you can run through them yourself, I'm going to add two ways [ *** unfinished -> @todo *** ]
Finally, if I get things all figured out for them to work, I'll put a link to an online, interactive Jupyter Notebook made available thanks to the MyBinder project. [ Get rid of the "if I get it" @todo ]
Suggestion for Now: If you want to check things out quickly, use the CoLab-specific notebook, which you can Open In Colab.
----------
Can we get the data from the CMS experiment (part of the LHC) and (re-)discover the Higgs Boson – also called the God Particle? You can judge for yourself. The figure from my analysis is at top left in the image below. Two different views of the discovery plot shown in the discovery publication are below and to the right of my plot.
Higgs Boson Discovery Presentation Page
higgs_boson_visualisation repo
Suggestion for Now: If you want to check things out quickly, use the CoLab-specific notebook, which you can Open In Colab.
That link above is for part 4, which (I think) is the most interesting. Part 5 gives the results (currently for the 5 most frequently used words from the job description and the most frequently used word in the job application). If you have the time and the inclination, part 3 is the next one I'd suggest. All parts are linked below. Note that here, as with the other Google CoLab Notebooks, the point is to give a notebook with both the input and the output. For some, you might be able to re-run the code. For others, the versions of Python and the libraries won't be compatible. You can also see the MyBinder versions near the end of this section.
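As a rough idea of the kind of word-frequency comparison described above, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks: the function name `top_words`, the stop-word list, and the two sample texts are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only; the notebooks may
# filter words differently (or not at all).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "with", "is"}


def top_words(text, n):
    """Return the n most frequent non-stop-words in text as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Made-up sample texts standing in for a real job description and application.
job_description = "Python developer with Python and SQL experience. SQL and Python testing."
job_application = "I have written Python for years and enjoy Python testing."

description_top = top_words(job_description, 5)   # 5 most frequent words
application_top = top_words(job_application, 1)   # single most frequent word

# Words that appear in both top lists.
overlap = {w for w, _ in description_top} & {w for w, _ in application_top}
print(description_top)
print(application_top)
print(overlap)
```

`Counter.most_common` keeps ties in first-seen order, so results are deterministic for a given text.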
This is a view of my process, with images of the code defining the functions, shown in the order I made them.
You can go sequentially through the parts in CoLab, but coming back here is useful if you want to see things out-of-order.
----------
Check Match between Job Description and Job Application Presentation Page (not yet set up)
Suggestion for Now: If you want to check things out quickly, use the CoLab-specific notebook, which you can Open In Colab.
----------
This is a completely formatted answer to a job interview question. It's a nice combination of NLP and Machine Learning basics.
CNN for NLP in PyTorch Presentation Page (not yet set up)
nlp_w_pytorch_zhongyu-pan repo
[To put on the presentation page] This is my favorite piece of nice, simple code for a simple CNN, created using PyTorch. I really like how I did the first part: using the (Transformer-type/LLM) AI to do each of the steps described by the teacher in the course this project follows. That course is Natural Language Processing with PyTorch, taught by Zhongyu Pan. Note that the previous link is to a LinkedIn course, and thus will require a login, which itself will require an account.
Can we read the numbers in zip codes that people write on their Post-Office envelopes? The MNIST digit database and challenge give us the chance to find out.
Presentation Page (not yet set up)
@TODO: I'll need to include information about how to set up the directories used in the project in the CoLab Runtime machine.
CoLab Version with Single-Hidden-Layer Approach
CoLab Version with CNN Approach
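For readers who want a feel for what a single-hidden-layer approach to MNIST computes, here is a minimal pure-Python sketch of the forward pass only. It is not the code from the CoLab notebooks: the hidden size of 32, the weight range, and the random stand-in image are assumptions, and the weights are untrained, so the predicted digit is meaningless. The only fixed facts are the MNIST shapes: 28x28 = 784 input pixels and 10 digit classes.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine layer: one dot product plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict(pixels, hidden, output):
    """Forward pass: 784 pixels -> hidden layer (ReLU) -> 10 class probabilities."""
    h = relu(dense(pixels, *hidden))
    return softmax(dense(h, *output))


rng = random.Random(0)
hidden = init_layer(784, 32, rng)   # hidden size 32 is an arbitrary assumption
output = init_layer(32, 10, rng)
fake_image = [rng.random() for _ in range(784)]  # stand-in for a flattened 28x28 digit

probs = predict(fake_image, hidden, output)
digit = probs.index(max(probs))     # most probable class (untrained, so arbitrary)
print(digit, round(sum(probs), 6))
```

Training would adjust the weights so that the highest probability lands on the correct digit; the CNN approach replaces the first dense layer with convolution and pooling layers.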
Some of the most exciting finds in Manuscript Studies are previously unknown books in the binding of other books. Important finds - works previously believed to be lost as well as untapped genealogical data - come as researchers look at what book-binders called waste and used to bind and protect other works. We bring Machine Learning and Artificial Intelligence into the search.
Here are some images of the kind of things for which we're searching and which we've found.
These images are details of interesting finds in the dataset I've been building and using.
Things are more interesting if the images are bigger; but I didn't use the right format above for easily seeing the images full-size. The previous links go to GitHub file information pages and have alt-text for the seeing impaired and those wanting more info. The following links will go to images, as long as your browser isn't set up for download. Make sure you use the zoom functions.
Edit: Apparently (archived), zooming in on GitHub images is a bit complicated. I have a new @todo - come back and figure out how to give you zoomable images without you needing to download them. I think you can click on these next images and see full-size and/or zoomable images.
The presentation of the project would be better with some graphs of loss/accuracy curves, etc., but I wanted to get the pretty stuff in first. -> @todo - Get/produce loss/accuracy curves.
I'd personally suggest you start with the following slide deck.
Slide Deck from Conference Presentation on GitHub
Here are things with code, paper, and presentation.
manuscript-waste-reuse-finder repo
Paper on Academia.edu - Free Account Needed for Access
Suggestion for Now: If you want to be able to check things out quickly, use the CoLab specific notebook, which you can