Commit

Merge pull request #26 from WZBSocialScienceCenter/minor-fixes
Minor fixes
paulcbauer authored Feb 22, 2022
2 parents 1d25c52 + 67629e8 commit e9c9bf4
Showing 14 changed files with 20 additions and 19 deletions.
7 changes: 4 additions & 3 deletions 01-introduction.Rmd
@@ -6,16 +6,17 @@ Below we review different data- and service-APIs that may be useful to social sc
* What data/service is provided by the API? (+ who provides it?)
* What are the prerequisites to access the API (e.g., authentication)?
* What does a simple API call look like?
-* How can we access the API from R (httr + other packages)? * Are there social science research examples using the API?
+* How can we access the API from R (httr + other packages)?
+* Are there social science research examples using the API?


-## Prerequesits: Authentication
+## Prerequisites: Authentication
A lot of the APIs require that you authenticate with the API provider. The underlying script of this review is written in such a way that it contains R chunks for authentication, however they will not be visible in the examples below (we only show placeholders for you to recognize at which step you will need to authenticate). These chunks in most cases make use of so-called keys in JSON format (e.g., service account key for Google APIs). However cloning the corresponding [repository]("https://github.com/paulcbauer/apis_for_social_scientists_a_review") of this review will not result in giving you the keys, hence in order to replicate our API calls, you will have to generate and use your own individual keys.

<!--As a consequence we can not make the corresponding [github repository public](https://github.com/paulcbauer/apis_for_social_scientists_a_review).-->


-## Prerequesits: Software & packages
+## Prerequisites: Software & packages
The code examples rely R and different packages thereof. It's probably easiest if you install all of them in one go using the code below. The `p_load()` function (`pacman` package) checks whether packages are installed. If not they are installed and loaded.
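
The `p_load()` behaviour described above can be sketched as follows. This is a minimal illustration, not the review's actual setup chunk: the package list shown here is an assumption chosen for the example.

```r
# Install pacman itself once if it is not available yet
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}
library(pacman)

# p_load() installs any package that is missing and then attaches all of them.
# The three packages below are illustrative only, not the review's full list.
p_load(httr, jsonlite, dplyr)
```

Calling `p_load()` again in a later session only attaches the packages, because they are already installed.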

<!-- add all used packages -->
4 changes: 2 additions & 2 deletions 03-Ckan_api.Rmd
@@ -43,7 +43,7 @@ For example, the [German](https://www.govdata.de/impressum) and the [US Governme
The specific datasets include a variety of different contents from public administration, such as election results, data on schools, maps and many more. The German data portal [govdata.de](https://www.govdata.de/impressum) for example serves as a collection point for all those data from various institutions. Those specific administrative institutions are the ones that actually provide the data. Therefore, not every institution provides the same data on the same topic.


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

There are no prerequisites to access the CKAN API. Furthermore, there seem to be no prerequisites to access the open data from the various governmental institutions using CKAN.
@@ -143,4 +143,4 @@ View(kindertagespflegeeinrichtungen)

When looking for social science research that used the CKAN API and Open Government data (OGD), it seems that there is more papers and research on the usage of those data, than on the data themselves (@Bedini2014, @Correa2015).
In a recent paper that examines the use of OGD (@Quarati2019-jf), the authors come to the conclusion, that on the one hand many OGD portals lack information about data usage, and on the other hand, where those information can be found, it becomes obvious that the data are only rarely used.
-For example, regarding the German OGD portal “GovData.de”, I did not find any social science papers that specifically used data from GovData.de. However, there are a few papers available that describe the German open data initiative (@Liu2018) and the metadata (@Marienfeld2013) that can be found on GovData.de.
+For example, regarding the German OGD portal “GovData.de”, I did not find any social science papers that specifically used data from GovData.de. However, there are a few papers available that describe the German open data initiative (@Liu2018) and the metadata (@Marienfeld2013) that can be found on GovData.de.
2 changes: 1 addition & 1 deletion 04-CrowdTangle_API.Rmd
@@ -53,7 +53,7 @@ CrowdTangle’s database is updated once every fifteen minutes and comes as time
When connecting to the user interface via the CrowdTangle website, the user can either manually set up a list of pages of interest whose data should be acquired. Alternatively, one can choose from an extensive number of pre-prepared lists covering a variety of topics, regions, or socially and politically relevant events such as inaugurations and elections. Data can be downloaded from the user interface as csv files or as json files via the API.


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *


2 changes: 1 addition & 1 deletion 06-Genderize_api.Rmd
@@ -45,7 +45,7 @@ For gender prediction, there are also alternatives to using this API:
- WikiPedia provides [categories for given names](https://en.wikipedia.org/wiki/Category:Given_names) by gender


-## Prerequesites
+## Prerequisites

At the time of writing, the API can be queried with up to 1000 names per day for free. There's not even an API key required for the free tier. However, if you require more than 1000 API requests per day, you need to obtain an API key from [store.genderize.io](https://store.genderize.io/) – see [this page](https://store.genderize.io/pricing) for pricing.
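
Because the free tier needs no key, a first request can be tried directly from R. The snippet below is a minimal sketch using `httr` and `jsonlite`. The example name `"peter"` is arbitrary, and the shape of the response may change over time.

```r
library(httr)
library(jsonlite)

# Query the public endpoint without an API key (free tier, limits apply)
resp <- GET("https://api.genderize.io", query = list(name = "peter"))

# Stop with an informative error if the request failed, e.g. when rate limited
stop_for_status(resp)

# Parse the JSON body into an R list and inspect its structure
result <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
str(result)
```

Each call counts against the daily free quota, so it is worth caching results when working with longer name lists.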

2 changes: 1 addition & 1 deletion 07-Google_news_api.Rmd
@@ -23,7 +23,7 @@ cat(packages)

With the News API (formerly Google News API), you can get article snippets and news headlines, both up to four years old and real-time, from over 80,000 news sources worldwide.

-## Prerequesites
+## Prerequisites
*What are the prerequisites to access the API (authentication)? *

You need an API key, which can be requested via [https://newsapi.org/register](https://newsapi.org/register).
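
Once a key is available, it can be passed along with each request. The following is a minimal sketch, assuming the key is stored in an environment variable named `NEWSAPI_KEY` (a hypothetical name chosen for this example). The search term and parameters are likewise illustrative.

```r
library(httr)
library(jsonlite)

# Assumption: the key was stored beforehand, e.g. in .Renviron, as NEWSAPI_KEY
api_key <- Sys.getenv("NEWSAPI_KEY")
if (api_key == "") stop("Please set the NEWSAPI_KEY environment variable first.")

# Search articles for an illustrative keyword; the key is sent as a request header
resp <- GET(
  "https://newsapi.org/v2/everything",
  query = list(q = "election", language = "en", pageSize = 5),
  add_headers("X-Api-Key" = api_key)
)
stop_for_status(resp)

# The parsed response holds the matching articles in the `articles` element
articles <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))$articles
head(articles$title)
```

Reading the key from the environment keeps it out of the script, which matters when the code lives in a public repository.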
2 changes: 1 addition & 1 deletion 08-Google_nlp_api.Rmd
@@ -42,7 +42,7 @@ The following requests are available:

A demo of the API that allows you to input text and explore the different classification capabilities can be found [here]("https://cloud.google.com/natural-language#section-2").

-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *


2 changes: 1 addition & 1 deletion 09-Google_places_api.Rmd
@@ -31,7 +31,7 @@ The following five requests are available: Place Search, Place Details, Place Ph
**Note:** You can display Places API results on a Google Map, or without a map but it is prohibited to use Places API data on a map that is not a Google map.


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *


2 changes: 1 addition & 1 deletion 10-Google_speech.Rmd
@@ -35,7 +35,7 @@ A demo of the API that allows you to record text via your microphone (or to uplo
Also consider that there is a [Text-to-Speech API]("https://cloud.google.com/text-to-speech") - simply performing operations the other way around - offered by Google.


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

To access and to use the API the following steps are necessary:
2 changes: 1 addition & 1 deletion 11-Google_translation_api.Rmd
@@ -36,7 +36,7 @@ The API limits in three ways: characters per day, characters per 100 seconds, an
Consider that additionally to the Translation API which we demonstrate in this review, Google provides us with two further APIs for translation: AutoML Translation and the Advanced Translation API (see [here]("https://cloud.google.com/translate/?utm_source=google&utm_medium=cpc&utm_campaign=emea-de-all-de-dr-bkws-all-all-trial-e-gcp-1010042&utm_content=text-ad-none-any-DEV_c-CRE_170514365277-ADGP_Hybrid%20%7C%20BKWS%20-%20EXA%20%7C%20Txt%20~%20AI%20%26%20ML%20~%20Cloud%20Translation%23v3-KWID_43700053282385063-kwd-74703397964-userloc_9042003&utm_term=KW_google%20translator%20api-NET_g-PLAC_&gclid=CjwKCAjw_JuGBhBkEiwA1xmbRZOxg7QzGmhTHWseHFN_V0Al_Xlf8wZVBfX9EURtitWDbe2dLcTWIxoCjj0QAvD_BwE&gclsrc=aw.ds#section-4") for a short comparison).


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

To access and to use the API the following steps are necessary:
4 changes: 2 additions & 2 deletions 12-Googletrends_api.Rmd
@@ -42,7 +42,7 @@ Minus sign (e.g. corona -symptoms) | Excludes word after the operator
<br />


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *


@@ -114,4 +114,4 @@ ggtitle("Frequencies for the query -corona symptoms- in the period: 01/01/2020 -

Google Trends can be used to predict the outcomes of elections. For example a study by (@Prado-Roman2021) uses Google Trends data to predict the past four elections in the United States and the past five in Canada, since Google first published its search statistics in 2004. They analysed which candidate had the most Google searches in the months leading up to election day and show, that with the help of this data, all actual winners in all the elections held since 2004 could be predicted.
Another example is a study by @Mavragani2019 which uses Google Trends data to predict the results of referendums (Scottish referendum 2014, Greek referendum 2015, British referendum 2016, Hungarian referendum 2016, Italian referendum 2016 and the Turkish referendum 2017). It can be shown that the results from Google Trends data are quite similar to the actual referendum results and in some cases are even more accurate than official polls. It is argued that with the help of Google Trends data revealed preferences instead of users' stated preferences can be analyzed and this data source could be a helpful source to analyze and predict human behavior (given areas where the Internet is widely accessible and not restricted).
-Furthermore, Google Trends data can also be utilized in other fields, for example to examine whether COVID-19 and the associated lockdowns initiated in Europe and America led to changes in well-being related topic search-terms. The study by @Brodeur2021 finds an increase in queries addressing boredom, loneliness, worry and sadness, and a decrease for search terms like stress, suicide and divorce. Indicating that the people's mental health could have been strongly affected by the pandemic and the lockdowns.
+Furthermore, Google Trends data can also be utilized in other fields, for example to examine whether COVID-19 and the associated lockdowns initiated in Europe and America led to changes in well-being related topic search-terms. The study by @Brodeur2021 finds an increase in queries addressing boredom, loneliness, worry and sadness, and a decrease for search terms like stress, suicide and divorce. Indicating that the people's mental health could have been strongly affected by the pandemic and the lockdowns.
2 changes: 1 addition & 1 deletion 13-Instagram_basic_display_api.Rmd
@@ -34,7 +34,7 @@ The Instagram Basic Display API gives read-access to basic profile information,

It is a RESTful API, meaning that queries are made for static information at the current moment. Queries are subject to rate limits. Responses are in the form of JSON-formatted objects containing the default and requested fields and edges.

-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

In order to access the Instagram Basic Display API, developers are required to first register as a Facebook developer on [developers.facebook.com]("https://developers.facebook.com/), to further create a Facebook App [here]("https://developers.facebook.com/docs/instagram-basic-display-api/getting-started") and to submit the application for review.
2 changes: 1 addition & 1 deletion 14-Instagram_graph_api.Rmd
@@ -45,7 +45,7 @@ Likewise, there are several metrics about stories that are provided by the API:
* Replies – Total number of replies to the story.
* Taps_forward – Total number of taps to see this story’s next photo or video.

-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

For most endpoints you need an Instagram Business Account, a Facebook Page that is connected to that account, a Facebook Developer Account and a Facebook App with Basic settings configured. Facebook provides a tutorial for setting this up [here]("https://developers.facebook.com/docs/instagram-api/getting-started").
2 changes: 1 addition & 1 deletion 17-Wiki_api.Rmd
@@ -29,7 +29,7 @@ To access *Wikipedia*, *MediaWiki* provides the MediaWiki Action API.
The API can be used for multiple things, such as accessing wiki features, interacting with a wiki and obtaining meta-information about wikis and public users. Additionally, the web service can provide access data and post changes of *Wikipedia*-webpages.


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

No pre-registration is required to access the API. However, for certain actions, such as very large queries, a registration is required. Moreover, while there is no hard and fast limit on read requests, the system administrators heavily recommend limiting the request rate to secure the stability of the side. It is also best practice to set a descriptive User Agent header.
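
Setting a descriptive User Agent header, as recommended above, could look like the following in R. This is a minimal sketch using `httr`. The project name and contact address in the header are placeholders, and the queried page title is only an example.

```r
library(httr)
library(jsonlite)

# Placeholder identification string; replace with your own project and contact
ua <- user_agent("my-research-project/0.1 (mailto:researcher@example.org)")

# Ask the MediaWiki Action API for basic information on one example page
resp <- GET(
  "https://en.wikipedia.org/w/api.php",
  query = list(
    action = "query",
    prop   = "info",
    titles = "R (programming language)",
    format = "json"
  ),
  ua
)
stop_for_status(resp)

# The page information is nested under query -> pages in the parsed response
pages <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))$query$pages
str(pages)
```

When sending many requests in a loop, adding a short pause such as `Sys.sleep(1)` between calls is one simple way to keep the request rate low.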
4 changes: 2 additions & 2 deletions 18-Youtube_api.Rmd
@@ -36,7 +36,7 @@ There are different types of Youtube APIs that serve different purposes:
The [google developer site](https://developers.google.com/youtube/v3/sample_requests) provides sample requests and a summary of the possible metrics that the API can give you data on. You can actually run your API requests there. All the possible calls you can make are provided on the page: Captions, ChannelBanners, Channels, ChannelSection, Comments, CommentThreads, i18nLanguages, i18nRegrions, Members, MembershipLevels, Playlistitems, Playlists, Search, Subscriptions, Thumbnails, VideoAbuseReportReasons, VideoCategories, and Videos.


-## Prerequesites
+## Prerequisites
* *What are the prerequisites to access the API (authentication)? *

An overview and guide is given on the [Youtube Api website](https://developers.google.com/youtube/v3/getting-started).
@@ -226,4 +226,4 @@ In the study “Identifying Toxicity Within YouTube Video Comment” (@Obadimu20

The aim of the study “YouTube channels, uploads and views: A statistical analysis of the past 10 years” (@Baertl2018) was to give an overview on how YouTube developed over the past 10 years in terms of consumption and production of videos. The study utilizes a random sample of channel and video data to answer the question. The data is retrieved with the YouTube API (did not specify which one) combined with a tool that generated random string searches to find a near-random sample of channels created between 01.01.2016 and 31.12.2016. Results are that channels, views and video uploads differ according to video genre. Furthermore, the analysis revealed that the majority of views are obtained by only a few channels. On average, older channels have a larger amount of viewers.

-In the study “From ranking algorithms to ‘ranking cultures’: Investigating the modulation of visibility in YouTube search results” (@Rieder2018), YouTube is conceptualized as an influential source of information that uses a socio-algorithmic process in order to place search recommendations in a hierarchy. This process of ranking is considered to be a construction of relevance and knowledge in a very large pool of information. Therefore, the search function serves as a curator of recommended content. The information that is being transmitted in this content can also impose certain perspectives on users which is why how the algorithm works is especially important when it comes to controversial issues. In order to better understand how the algorithms that determine search rankings on YouTube work, the authors use a scraping approach and the YouTube API v3 to study the ranking of certain sociocultural issues over time. Examples of the keywords that they use are ‘gamergate,’ ‘trump,’ ‘refugees’ and ‘syria.’ They find three general types of morphologies of rank change.
+In the study “From ranking algorithms to ‘ranking cultures’: Investigating the modulation of visibility in YouTube search results” (@Rieder2018), YouTube is conceptualized as an influential source of information that uses a socio-algorithmic process in order to place search recommendations in a hierarchy. This process of ranking is considered to be a construction of relevance and knowledge in a very large pool of information. Therefore, the search function serves as a curator of recommended content. The information that is being transmitted in this content can also impose certain perspectives on users which is why how the algorithm works is especially important when it comes to controversial issues. In order to better understand how the algorithms that determine search rankings on YouTube work, the authors use a scraping approach and the YouTube API v3 to study the ranking of certain sociocultural issues over time. Examples of the keywords that they use are ‘gamergate,’ ‘trump,’ ‘refugees’ and ‘syria.’ They find three general types of morphologies of rank change.
