An API (Application Programming Interface) is an intermediary between a large dataset and the applications at the user end. It provides an accessible way to request data from a dataset by URL.
There are several methods for communicating with an API server, including GET, POST, PUT, PATCH, DELETE, HEAD and OPTIONS [1]. Here we will focus on GET requests, which are the most common and widely used method in APIs.
In R, the {httr} package is used to access an API by URL.
GET data
The steps to convert retrieved API data into a standard R object are:
Determine the request URL. This is usually database specific, so it requires reading the database's API page.
Construct the GET URL using string concatenation functions in R, such as paste or glue.
Send the request with httr::GET, then extract the raw content and convert it from raw to character with rawToChar(raw_data$content).
Convert the character data to an R object. This depends on the character format; the usual formats are a table (that is, delimited text, with separators such as \t and \n) or JSON. If it is JSON, use jsonlite::fromJSON to convert it to a list. If it is a table, use read.table(text = char_data) to convert it to a data.frame.
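The conversion steps can be sketched offline as follows. This is a minimal illustration under stated assumptions, not code from a real API: the response is simulated with charToRaw so that nothing is fetched, and the column names and values are made up.

```r
# Simulated response: httr::GET() returns an object whose $content field
# holds the body as a raw vector. The body used here is made up.
raw_data <- list(
  content = charToRaw("gene_a\tgene_b\tscore\nA1\tB1\t0.9\nA2\tB2\t0.4\n")
)

# Raw bytes -> a single character string
char_data <- rawToChar(raw_data$content)

# Table case: tab-separated text -> data.frame
tab_df <- read.table(text = char_data, sep = "\t", header = TRUE,
                     stringsAsFactors = FALSE)
print(tab_df)

# JSON case: JSON text -> R object. An array of records is simplified
# to a data.frame by default. Runs only if jsonlite is installed.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_char <- '[{"gene_a":"A1","score":0.9},{"gene_a":"A2","score":0.4}]'
  json_df <- jsonlite::fromJSON(json_char)
  print(json_df)
}
```

With a real request, raw_data would be the value returned by httr::GET(url) and the rest of the code would stay the same.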
Here I use the protein interaction database STRING as an example of accessing an API. The methods of the STRING API can be found on the STRING help page.
For a JSON example, refer to Joachim Schork’s blog post on time series COVID data [2].
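A minimal sketch of a STRING request might look like the following. The URL layout, the network method, and the identifiers and species parameters are assumptions based on the STRING API documentation and should be checked against the help page; string_url and fetch_string are hypothetical helper names, and the two gene symbols are only illustrative.

```r
# Hypothetical helper: build a STRING GET URL with paste0().
# Assumed URL layout: <base>/<output format>/<method>?<parameters>
string_url <- function(identifiers, species = 9606,
                       method = "network", format = "tsv") {
  paste0(
    "https://string-db.org/api/", format, "/", method,
    # assumption: multiple identifiers are joined with "%0d"
    "?identifiers=", paste(identifiers, collapse = "%0d"),
    "&species=", species   # 9606 is the NCBI taxon ID for human
  )
}

# Hypothetical helper: send the GET request and return a data.frame.
# Requires the httr package and network access.
fetch_string <- function(url) {
  raw_data <- httr::GET(url)
  httr::stop_for_status(raw_data)            # fail early on HTTP errors
  char_data <- rawToChar(raw_data$content)   # raw -> character
  read.table(text = char_data, sep = "\t", header = TRUE,
             quote = "", stringsAsFactors = FALSE)
}

url <- string_url(c("TP53", "MDM2"))
print(url)
# ppi <- fetch_string(url)   # uncomment to query the live API
```

The request itself is left commented out so the snippet runs without network access; only the URL construction is executed.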
Common issues
unable to get local issuer certificate
Error in curl::curl_fetch_memory(url, handle = handle) : SSL peer certificate or SSH remote key was not OK: [string-db.org] SSL certificate problem: unable to get local issuer certificate.
This happens when libcurl is missing from LD_LIBRARY_PATH, or when the wrong version of libcurl is found there. By default, the linker should search LD_LIBRARY_PATH first and then /usr/lib:/usr/lib64. Run ldconfig -v | grep libcurl or ls /usr/lib64/libcurl* in a terminal to check whether libcurl is available on your OS. If it is not found, install it with sudo yum install libcurl-devel on RedHat 7.
In my case, LD_LIBRARY_PATH pointed to the conda lib directory /home/csu03/miniconda3/lib, which is based on Python 3.9, while the OS default is Python 2.7. I solved the issue by running export LD_LIBRARY_PATH=/usr/lib:/usr/lib64:$LD_LIBRARY_PATH before entering R [3].
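To see from within R which directories on LD_LIBRARY_PATH carry a libcurl, something like the following can help. It is a diagnostic sketch only: find_libcurl is a hypothetical helper, and the fix itself still has to be applied in the shell before R starts.

```r
# Hypothetical helper: list libcurl files found in each directory of a
# colon-separated search path, keeping the search order.
find_libcurl <- function(path = Sys.getenv("LD_LIBRARY_PATH")) {
  dirs <- strsplit(path, ":", fixed = TRUE)[[1]]
  dirs <- dirs[nzchar(dirs)]                 # drop empty entries
  hits <- lapply(dirs, function(d) {
    list.files(d, pattern = "^libcurl", full.names = TRUE)
  })
  names(hits) <- dirs
  hits
}

# Directories are searched in order, so an early conda entry can
# shadow the system library.
print(find_libcurl())

# The system locations mentioned above.
print(find_libcurl("/usr/lib:/usr/lib64"))

# If the curl package loads, report the libcurl version R is using.
if (requireNamespace("curl", quietly = TRUE)) {
  print(curl::curl_version()$version)
}
```

If a conda directory shows a libcurl ahead of /usr/lib64, that matches the situation described above.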
The same reference also solves a yum update error like the one below:
There was a problem importing one of the Python modules required to run yum. The error leading to this problem was:
Please install a package which provides this module, or verify that the module is installed correctly.
It’s possible that the above module doesn’t match the current version of Python, which is: 2.7.5 (default, Aug 13 2020, 02:51:10) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
If you cannot solve this problem yourself, please go to the yum faq at: http://yum.baseurl.org/wiki/Faq
Peer’s Certificate issuer is not recognized.
Error in curl::curl_fetch_memory(url, handle = handle) : Peer certificate cannot be authenticated with given CA certificates: [string-db.org] Peer’s Certificate issuer is not recognized.
It could be a firewall and proxy issue. Based on this post [4], add the following to the R code: