Instagram bot that operates an account where incoming college freshmen introduce themselves to their future peers. Posts pictures and informational snippets received via direct message.
I was recently admitted to UC San Diego as a Computer Science major in Muir College. Various Instagram accounts exist (for every college) that are meant to connect incoming freshmen with one another, with the aim of both finding new friends and discovering potential roommates. For UCSD specifically, the most prominent of many accounts is called @ucsandiego.2027. The account posts over 10 times a day, and countless incoming freshmen share snippets and pictures of themselves through it.
However, UCSD's college system makes it difficult to find potential roommates -- my main objective -- on this account. First, I can't dorm with someone of the opposite gender; second, I can't dorm with someone who is not in my college (Muir, one of eight colleges at UCSD). This renders the significant majority of the posts on @ucsandiego.2027 irrelevant for my purposes. The page is chaotic to say the least...
My solution to this problem is to create a new Instagram account just for UCSD Muir College Class of 2027 admits. There are many advantages to this:
- Numbers: There will be far fewer people posted to this account, which will make the page much less chaotic and much more readable. Even if most of the people posting on the new account are women, at least everyone posted will be in Muir College, making many more of the posts relevant for me.
- Tighter Community: Because the new Instagram account has far fewer people, it is easier to connect with others and make friends going into Fall Quarter.
- Reducing Backlog: The @ucsandiego.2027 account is severely backlogged, and its account owner claims that it will take over a month to post someone's information. In fact, the account owner has resorted to asking submitters to pay a $5 fee in order to expedite the process. This separate account for Muir College will reduce the strain on the main account.
Hopefully this project will make it easier for me to find a roommate, hence the name.
This project must do a variety of things. It must...
- Find other Muir College admittees. This can be done by scraping the captions of every post on @ucsandiego.2027. If the caption includes some form of the word "Muir", take note of that user's Instagram account handle, which will probably be the string that follows the @ character. This step should output a list of Instagram users for the bot to follow. For every user in this list, congratulate them on their acceptance, and then ask if they would like to send their info to be posted to the bot account. Perhaps the bot can fully automate the process and simply screenshot the user's post on @ucsandiego.2027, making the process very easy for those who have already posted; however, this could be difficult and a step towards an invasion of privacy.
- For users found via the "Find other Muir College admittees" method, there is no need to check that they have been admitted to Muir, since @ucsandiego.2027 has vetted them for us already. However, those who have requested to follow the account must verify they were admitted to Muir by sending a screenshot of their acceptance letter. The bot will use OCR to read in their acceptance letter and check for a string indicating that they have been admitted to Muir.
- Read Instagram DMs and post what has been DMed to the account. The bot must DM all Muir College admittees -- whether vetted by us or @ucsandiego.2027 -- and ask them to post. This will be the bulk of the work for this project.
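The caption-scanning step described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in find_accounts.py: the function names and the handle pattern are my own assumptions, and it only covers the text matching, not the scraping itself.

```python
import re

# Assumed handle pattern: letters, digits, periods, and underscores after an "@".
# The lookbehind skips "@" signs inside email addresses such as jane@ucsd.edu.
HANDLE_RE = re.compile(r"(?<![\w.])@([A-Za-z0-9_.]{1,30})")


def mentions_muir(text):
    """Return True if the text contains some form of the word "Muir"."""
    return "muir" in text.lower()


def extract_handle(caption):
    """Return the first handle following an "@" in the caption, or None."""
    match = HANDLE_RE.search(caption)
    if match is None:
        return None
    # Drop a trailing period that merely ends a sentence.
    return match.group(1).rstrip(".")


def find_muir_handles(captions):
    """Collect unique handles from captions that mention Muir, in order."""
    handles = []
    for caption in captions:
        if not mentions_muir(caption):
            continue
        handle = extract_handle(caption)
        if handle and handle not in handles:
            handles.append(handle)
    return handles
```

The same `mentions_muir` check could, in principle, be applied to the text that OCR returns for an acceptance letter, though a real check would probably need a stricter string than the bare word.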
Hopefully this project will be done by May 15th, which is the beginning of the housing selection process.
The various programs that will help me find a roommate. These programs seem to require a strong internet connection, since on weak connections, Selenium tries to perform actions before a page is fully loaded, causing the code to break.
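One common way to make this less fragile is to poll until the page is ready instead of acting immediately. Selenium ships this as explicit waits (`WebDriverWait` together with `expected_conditions`); the sketch below shows the same idea with a generic, standard-library helper. The name `wait_until` and its defaults are hypothetical and do not come from this project's code.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

With a Selenium driver, `condition` could be a small function that looks up an element and returns it once it exists, so later actions only run after the page has actually loaded.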
Contains a class which creates a Selenium window and logs into Instagram. Instances of this class can then be used by other programs in finding_a_roommate. This program has no arguments, though it does require that Selenium Chrome Web Driver is installed (download here). The Chrome Web Driver version must match that of Chrome on your local device.
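A quick way to sanity-check the version requirement is to compare the version strings printed by the two programs. The sketch below assumes output of the form "Google Chrome 114.0.5735.198" and "ChromeDriver 114.0.5735.90 (...)", and assumes that matching major versions is sufficient; both are assumptions on my part, and the helper names are hypothetical.

```python
import re


def major_version(version_output):
    """Extract the major version number from a `--version` style string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """Return True if Chrome and the driver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)
```

For example, `versions_match("Google Chrome 114.0.5735.198", "ChromeDriver 114.0.5735.90")` compares 114 against 114.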
find_accounts.py scrapes the @ucsandiego.2027 account to find incoming Muir students, and makes note of their Instagram handles. Run with:
python ~/finding_a_roommate/find_accounts.py driver_address username password username_to_scrape output_directory
where:
- driver_address is the filepath to a Selenium Chrome Web Driver. Note that the Chrome Driver version must match the version of Chrome installed on the computer, or bot.py will not be able to run correctly.
- username is the username of the account which bot.py will take control of.
- password is the password to the aforementioned account.
- username_to_scrape is the username of the Instagram account that will be scraped for information.
- output_directory is the filepath to the folder that find_accounts.py will output to.
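For illustration, the five positional arguments could be parsed with argparse as shown below. This is only a sketch of the interface described above; the real script may well read sys.argv directly, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for the five positional arguments of find_accounts.py."""
    parser = argparse.ArgumentParser(
        prog="find_accounts.py",
        description="Scrape an Instagram account for incoming Muir students.",
    )
    parser.add_argument("driver_address", help="filepath to the Chrome Web Driver")
    parser.add_argument("username", help="username of the account to control")
    parser.add_argument("password", help="password of that account")
    parser.add_argument("username_to_scrape", help="account to scrape")
    parser.add_argument("output_directory", help="folder to write output to")
    return parser
```

Calling `build_parser().parse_args()` with no arguments reads the command line, so a missing argument produces a usage message instead of an IndexError.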
The program writes a stop_key to the output file accounts_muir.txt once it has parsed all accounts posted after January 1st, 2023. If the program detects the stop_key when importing accounts_muir.txt, it will only parse 7 days back in time as opposed to all the way back to the start of the year; this was implemented as a time-saving measure.
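The stop_key logic amounts to choosing how far back to scrape. A minimal sketch, assuming the key is stored as its own line in accounts_muir.txt; the value "STOP" and the function name are placeholders, since the actual key is not specified here:

```python
from datetime import date, timedelta

STOP_KEY = "STOP"  # placeholder value; the real stop_key may differ
YEAR_START = date(2023, 1, 1)


def scrape_cutoff(existing_lines, today):
    """Return the earliest post date that still needs to be parsed.

    If a previous run already wrote the stop key, only the last 7 days
    are re-parsed; otherwise everything since January 1st, 2023 is.
    """
    if any(line.strip() == STOP_KEY for line in existing_lines):
        return today - timedelta(days=7)
    return YEAR_START
```

Here `existing_lines` would be the lines read from accounts_muir.txt, or an empty list on the first run.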
As the name suggests, initiate_contact.py tries to DM the Instagram accounts found using find_accounts.py, asking these accounts for permission to post on the new Muir account. Run with:
python ~/finding_a_roommate/initiate_contact.py driver_address username password accounts_to_dm
where:
- accounts_to_dm is the absolute filepath to the list of Muir accounts generated by find_accounts.py. The directory that contains this file will also be used to output a list of successfully contacted accounts. If accounts_to_dm is not an absolute filepath, a FileNotFoundError: [Errno 2] No such file or directory will be thrown.
- driver_address, username, and password are the same as for find_accounts.py; see the descriptions of those arguments there.
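The absolute-path requirement for accounts_to_dm could be checked up front, before any file is opened. The sketch below is an assumption about how that might look, not the script's actual behavior: it raises a ValueError early instead of letting the FileNotFoundError surface later, and the output filename accounts_contacted.txt is hypothetical.

```python
import os


def contacted_output_path(accounts_to_dm, output_name="accounts_contacted.txt"):
    """Return the path of the contacted-accounts file.

    The file is placed in the same directory as the input list, so the
    input must be given as an absolute filepath.
    """
    if not os.path.isabs(accounts_to_dm):
        raise ValueError(f"accounts_to_dm must be an absolute filepath: {accounts_to_dm!r}")
    return os.path.join(os.path.dirname(accounts_to_dm), output_name)
```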
bot.py is the main bot that reads Instagram DMs and posts snippets and pictures of admittees to the Instagram account. Run with:
python ~/finding_a_roommate/bot.py driver_address username password output_directory
where:
- driver_address, username, password, and output_directory are the same as for find_accounts.py; see the descriptions of those arguments there.
- urllib: I have previously encountered this error: urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)>, which can be solved with the method mentioned here.
- pytesseract: I have previously encountered this error: pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information., which can be solved with the method mentioned here.
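Separately from the linked fixes, a small pre-flight check can make the pytesseract error easier to diagnose by confirming that the tesseract binary is on the PATH before the bot starts. This is only a sketch using the standard library; the function names are hypothetical and it is not the method referenced above.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the tesseract binary, or None if not on PATH."""
    return shutil.which(binary_name)


def require_tesseract(binary_name="tesseract"):
    """Return the binary's path, or fail early with a readable message."""
    path = find_tesseract(binary_name)
    if path is None:
        raise RuntimeError(
            f"{binary_name} was not found on PATH; install it before running the bot"
        )
    return path
```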
Let's hope I find a roommate!