sharepointr

Lifecycle: experimental License: MIT

The goal of sharepointr is to make it easier to read, write, and work with files stored in SharePoint by extending the {Microsoft365R} package.

Installation

You can install the development version of sharepointr like so:

# pak::pkg_install("elipousson/sharepointr")

Requirements

To use this package (or {Microsoft365R}) you must:

  • Have a work or school account with access to Microsoft SharePoint
  • Have enabled permissions for the default {Microsoft365R} app ID

See the {Microsoft365R} documentation on app registration for more information.

Usage

{Microsoft365R} uses R6 methods for nearly everything, which may be difficult for users (me included) who are less familiar with this interface.

The {sharepointr} package seeks to ease this challenge by wrapping the {Microsoft365R} methods and adding support for specifying SharePoint sites, drives, and items with SharePoint URLs instead of item or drive ID values.

library(sharepointr)

Reading and writing objects to and from SharePoint

The most useful functions for most users may be read_sharepoint() and write_sharepoint().

read_sharepoint() downloads a SharePoint file to a temporary folder and then, based on the file extension, reads it into your environment using an appropriate function and package (e.g. using officer::read_docx() for Microsoft Word files or sf::read_sf() for common spatial data files):

docx_shared_url <- "https://bmore.sharepoint.com/:w:/r/sites/MayorsOffice-DataGovernance/Shared%20Documents/General/Baltimore%20Data%20Academy/Baltimore%20Data%20Academy%20Announcement%20Content.docx?d=w0a50d3cd74ce4a8da6d82596037f0148&csf=1&web=1&e=cBURo2"

read_sharepoint(docx_shared_url)
#> Loading Microsoft Graph login for default tenant
#> ℹ Downloading SharePoint item to '/var/folders/3f/50m42dx1333_dfqb5772j6_40000g…
#> ✔ Downloading SharePoint item to '/var/folders/3f/50m42dx1333_dfqb5772j6_40000g…
#> 
#> ℹ Reading item with `officer::read_docx()`
#> ✔ Reading item with `officer::read_docx()` [41ms]
#> 
#> rdocx document with 19 element(s)
#> 
#> * styles:
#>                 Normal Default Paragraph Font           Normal Table 
#>            "paragraph"            "character"                "table" 
#>                No List         List Paragraph   annotation reference 
#>            "numbering"            "paragraph"            "character" 
#>        annotation text      Comment Text Char     annotation subject 
#>            "paragraph"            "character"            "paragraph" 
#>   Comment Subject Char              Hyperlink     Unresolved Mention 
#>            "character"            "character"            "character" 
#> 
#> * Content at cursor location:
#>   level num_id
#> 1    NA     NA
#>                                                                                                                                                  text
#> 1 If you have questions or feedback about the program, contact Chief Data Officer Justin Elszasz at justin.elszasz@baltimorecity.gov. Happy learning!
#>   style_name content_type
#> 1         NA    paragraph
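
The extension-based dispatch described above can be illustrated with a small base R sketch. This is not the package's actual implementation: read_by_extension() is a hypothetical helper, and the choice of readers (read.csv(), readRDS(), readLines()) is an assumption made only to keep the example self-contained.

```r
# Hypothetical helper (not part of sharepointr): pick a reader
# function based on the file extension of a local file.
read_by_extension <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(
    ext,
    csv = utils::read.csv(path),
    rds = readRDS(path),
    txt = readLines(path),
    stop("No reader registered for extension: ", ext)
  )
}

# A temporary CSV stands in for a file downloaded from SharePoint
tmp <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(id = 1:3, name = c("a", "b", "c")),
  tmp,
  row.names = FALSE
)

result <- read_by_extension(tmp)
```

In this sketch an unsupported extension raises an error instead of guessing at a reader.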

write_sharepoint() saves an R object to a local file and then uses upload_sp_item() to upload the file to SharePoint.

I plan on adding features to allow the custom mapping of file extensions and object classes to specific functions or packages and adding more thorough documentation. Feel free to submit an issue or make a pull request if you have ideas of how to do this.
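
One possible shape for such a custom mapping, offered only as a starting point for discussion, is a named list of writer functions that callers can extend. Everything in this sketch is hypothetical (default_writers, write_local(), and the writers argument are not part of the package), and the upload step is left as a comment.

```r
# Hypothetical default mapping of file extensions to writer functions
default_writers <- list(
  csv = function(x, path) utils::write.csv(x, path, row.names = FALSE),
  rds = function(x, path) saveRDS(x, path)
)

# Hypothetical helper: save `x` to a local file using the writer
# registered for the file extension. User-supplied writers take
# precedence over the defaults.
write_local <- function(x, filename, writers = list(), dir = tempdir()) {
  ext <- tolower(tools::file_ext(filename))
  writer <- utils::modifyList(default_writers, writers)[[ext]]
  if (is.null(writer)) {
    stop("No writer registered for extension: ", ext)
  }
  path <- file.path(dir, filename)
  writer(x, path)
  # An upload step (upload_sp_item() in this package) would follow here.
  path
}

# Usage: register a tab-separated writer without touching the defaults
tsv_writer <- function(x, path) {
  utils::write.table(x, path, sep = "\t", row.names = FALSE, quote = FALSE)
}

path <- write_local(
  data.frame(id = 1:2, name = c("a", "b")),
  "example.tsv",
  writers = list(tsv = tsv_writer)
)
first_line <- readLines(path, n = 1)
```

Keeping the mapping as plain data means a new format needs only one extra list entry, with no change to the dispatch logic.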

Working with SharePoint items, lists, and plans

This package currently supports three main categories of SharePoint objects:

  • Items (including directories and files) and item properties

  • Lists and list items

  • Plans and tasks

Typically these functions return a ms_object object, a list of ms_object objects, or a data frame. In some cases, the data frame is based on the object properties and includes a list column where the ms_object is stored.

For example, get_sp_item() returns a ms_drive_item object and supports parsing for shared file and folder URLs:

get_sp_item(docx_shared_url)
#> Loading Microsoft Graph login for default tenant
#> <Drive item 'Baltimore Data Academy Announcement Content.docx'>
#>   directory id: 017P4HV6ON2NIAVTTURVFKNWBFSYBX6AKI 
#>   web link: https://bmore.sharepoint.com/sites/MayorsOffice-DataGovernance/_layouts/15/Doc.aspx?sourcedoc=%7B0A50D3CD-74CE-4A8D-A6D8-2596037F0148%7D&file=Baltimore%20Data%20Academy%20Announcement%20Content.docx&action=default&mobileredirect=true 
#>   type: file 
#> ---
#>   Methods:
#>     copy, create_folder, create_share_link, delete,
#>     do_operation, download, get_item, get_list_pager,
#>     get_parent_folder, get_path, is_folder, list_files,
#>     list_items, load_dataframe, load_rdata, load_rds, move,
#>     open, save_dataframe, save_rdata, save_rds, sync_fields,
#>     update, upload

Set as_data_frame = TRUE to return a data frame instead of a ms_drive_item object:

get_sp_item(docx_shared_url, as_data_frame = TRUE)
#> Loading Microsoft Graph login for default tenant
#>                                                                                                                           @odata.context
#> 1 https://graph.microsoft.com/beta/$metadata#drives('b%21txygHcd2h0SzmOxg3_j1LZpAvnrrKrhOjcOP6RBpB6-8Kta613N3QJlbvrVKyTwO')/root/$entity
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       @microsoft.graph.downloadUrl
#> 1 https://bmore.sharepoint.com/sites/MayorsOffice-DataGovernance/_layouts/15/download.aspx?UniqueId=0a50d3cd-74ce-4a8d-a6d8-2596037f0148&Translate=false&tempauth=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJhdWQiOiIwMDAwMDAwMy0wMDAwLTBmZjEtY2UwMC0wMDAwMDAwMDAwMDAvYm1vcmUuc2hhcmVwb2ludC5jb21AMzEyY2IxMjYtYzZhZS00ZmMyLTgwMGQtMzE4ZTY3OWNlNmM3IiwiaXNzIjoiMDAwMDAwMDMtMDAwMC0wZmYxLWNlMDAtMDAwMDAwMDAwMDAwIiwibmJmIjoiMTcwNjc1NDU1NyIsImV4cCI6IjE3MDY3NTgxNTciLCJlbmRwb2ludHVybCI6ImVWS1o5aHllbHVsNU9PdzR6OWQ1U01sOVBkZXpXTjNOU3NscWxGM3V2ME09IiwiZW5kcG9pbnR1cmxMZW5ndGgiOiIxNTAiLCJpc2xvb3BiYWNrIjoiVHJ1ZSIsImNpZCI6ImlUQVg3bGRUOWtDZy83UE5venJGV2c9PSIsInZlciI6Imhhc2hlZHByb29mdG9rZW4iLCJzaXRlaWQiOiJNV1JoTURGallqY3ROelpqTnkwME5EZzNMV0l6T1RndFpXTTJNR1JtWmpobU5USmsiLCJhcHBfZGlzcGxheW5hbWUiOiJBenVyZVIvTWljcm9zb2Z0MzY1UiIsImdpdmVuX25hbWUiOiJFbGkiLCJmYW1pbHlfbmFtZSI6IlBvdXNzb24iLCJhcHBpZCI6ImQ0NGEwNWQ1LWM2YTUtNGJiYi04MmQyLTQ0MzEyMzcyMjM4MCIsInRpZCI6IjMxMmNiMTI2LWM2YWUtNGZjMi04MDBkLTMxOGU2NzljZTZjNyIsInVwbiI6ImVsaS5wb3Vzc29uQGJhbHRpbW9yZWNpdHkuZ292IiwicHVpZCI6IjEwMDMyMDAxRkQ3Q0UxMjQiLCJjYWNoZWtleSI6IjBoLmZ8bWVtYmVyc2hpcHwxMDAzMjAwMWZkN2NlMTI0QGxpdmUuY29tIiwic2NwIjoiZ3JvdXAud3JpdGUgYWxsc2l0ZXMubWFuYWdlIGFsbHNpdGVzLndyaXRlIiwidHQiOiIyIiwiaXBhZGRyIjoiMjAuMTkwLjE1MS4zNyJ9.EP15oZ01FLz_6tPq0kgN3JLliBIPtAg3aYhscvPfpic&ApiVersion=2.0
#>        createdDateTime                                        eTag
#> 1 2023-01-13T16:42:07Z "{0A50D3CD-74CE-4A8D-A6D8-2596037F0148},57"
#>                                   id lastModifiedDateTime
#> 1 017P4HV6ON2NIAVTTURVFKNWBFSYBX6AKI 2023-01-26T18:17:57Z
#>                                               name
#> 1 Baltimore Data Academy Announcement Content.docx
#>                                                                                                                                                                                                                                      webUrl
#> 1 https://bmore.sharepoint.com/sites/MayorsOffice-DataGovernance/_layouts/15/Doc.aspx?sourcedoc=%7B0A50D3CD-74CE-4A8D-A6D8-2596037F0148%7D&file=Baltimore%20Data%20Academy%20Announcement%20Content.docx&action=default&mobileredirect=true
#>                                             cTag  size
#> 1 "c:{0A50D3CD-74CE-4A8D-A6D8-2596037F0148},141" 79504
#>                                                                                 createdBy
#> 1 Justin.Elszasz@baltimorecity.gov, a43c19f9-9a13-403d-9234-0c13f4e84b1c, Elszasz, Justin
#>                                                                                       lastModifiedBy
#> 1 Michael.Gottlieb@baltimorecity.gov, a2dedf36-3fce-4998-92f5-9350e749769f, Gottlieb, Michael (BCIT)
#>   shared
#> 1  users
#>                                                                                                                                                                                                                                                                                          parentReference
#> 1 documentLibrary, b!txygHcd2h0SzmOxg3_j1LZpAvnrrKrhOjcOP6RBpB6-8Kta613N3QJlbvrVKyTwO, 017P4HV6NGOMGBBVI5GBBZU5BFDTPDKJ7W, Baltimore Data Academy, /drives/b!txygHcd2h0SzmOxg3_j1LZpAvnrrKrhOjcOP6RBpB6-8Kta613N3QJlbvrVKyTwO/root:/General/Baltimore Data Academy, 1da01cb7-76c7-4487-b398-ec60dff8f52d
#>                                                                                                    file
#> 1 application/vnd.openxmlformats-officedocument.wordprocessingml.document, HXogYD+5fQrvr/nsAemsu10d62M=
#>                               fileSystemInfo                    ms_item
#> 1 2023-01-13T16:42:07Z, 2023-01-26T18:17:57Z <environment: 0x11064c1b0>
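
The idea of a data frame built from object properties, with the original object kept in a list column, can be sketched in base R. Plain lists stand in for ms_drive_item objects here, and items_to_df() is a hypothetical helper, so this shows the general pattern and is not the package's code.

```r
# Plain lists standing in for ms_drive_item objects; each has a
# `properties` element holding the item metadata.
items <- list(
  list(properties = list(name = "report.docx", size = 79504)),
  list(properties = list(name = "data.xlsx", size = 18954))
)

# Hypothetical helper: one row per item, built from selected properties
items_to_df <- function(items) {
  df <- data.frame(
    name = vapply(items, function(x) x$properties$name, character(1)),
    size = vapply(items, function(x) x$properties$size, numeric(1))
  )
  # Keep each original object next to its properties in a list column
  df$ms_item <- items
  df
}

item_df <- items_to_df(items)

# The stored object is still available for further work
first_item <- item_df$ms_item[[1]]
```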

These basic functions to “get” objects are extended by functions like download_sp_item(), which adds support for downloading copies of your SharePoint files (this is the function that powers read_sharepoint()):

withr::with_tempdir({
  docx_dest <- download_sp_item(docx_shared_url)

  file.exists(docx_dest)
})
#> Loading Microsoft Graph login for default tenant
#> ℹ Downloading SharePoint item to 'Baltimore Data Academy Announcement Content.d…
#> ✔ Downloading SharePoint item to 'Baltimore Data Academy Announcement Content.d…
#> 
#> [1] TRUE

Some of the original {Microsoft365R} R6 object methods already return a data frame by default, and wrappers such as sp_dir_info() do the same:

sp_dir_info("https://bmore.sharepoint.com/:w:/r/sites/MayorsOffice-DataGovernance/Shared%20Documents/General/Baltimore%20Data%20Academy")
#> Loading Microsoft Graph login for default tenant
#>                                                                                                                                name
#> 1                                                                                              General/Baltimore Data Academy/Logos
#> 2                                                 General/Baltimore Data Academy/BALT 02 Interpreting Data - Student Coursebook.pdf
#> 3                                            General/Baltimore Data Academy/BALT 03_Leading with Data_Workday Lesson 1_03292023.pdf
#> 4                                             General/Baltimore Data Academy/BALT 04_Data Stewardship_Workday Lesson 1_03292023.pdf
#> 5  General/Baltimore Data Academy/BALT 05_Measuring Government Performance_Course Access Instructions_Workday Lesson 1_02282023.pdf
#> 6                                                   General/Baltimore Data Academy/Baltimore Data Academy Announcement Content.docx
#> 7                                        General/Baltimore Data Academy/Baltimore Data Academy New Course Announcement 7.11.23.docx
#> 8                                                                           General/Baltimore Data Academy/Curriculum One Pager.pdf
#> 9                                                                                      General/Baltimore Data Academy/Document.docx
#> 10                                                        General/Baltimore Data Academy/Foundations of Data Literacy_2023 [43].pdf
#> 11                                       General/Baltimore Data Academy/Interpreting Data with Greater Accuracy and Insight[97].pdf
#> 12                                                              General/Baltimore Data Academy/Last SR by Agency as of 7.11.23.xlsx
#>       size isdir                                 id      type
#> 1    3.26M  TRUE 017P4HV6I3IGKTFCO2FZD3FXPCGUPJSO6G directory
#> 2  703.38K FALSE 017P4HV6KITC7MJJJESFD35BIDU5DR4BSL      file
#> 3  388.91K FALSE 017P4HV6N2LBZHX6ERCBCIQRJIXGTZMYCK      file
#> 4  389.67K FALSE 017P4HV6ICFKFITU4WC5CYAFJUTZAMHP7K      file
#> 5  391.34K FALSE 017P4HV6MQCBPQQD6VUVCJWMWAKTIBZNYQ      file
#> 6   77.64K FALSE 017P4HV6ON2NIAVTTURVFKNWBFSYBX6AKI      file
#> 7   53.98K FALSE 017P4HV6PKOSFTLSMOAJDK3OGKFUVCIICU      file
#> 8   88.45K FALSE 017P4HV6OKNLH7UWTQJBEZXWNFXBGBC6OC      file
#> 9   48.51K FALSE 017P4HV6L5XTQ7IOMVYBGKSYMSOYDXMMPZ      file
#> 10 536.85K FALSE 017P4HV6KALORFIRC2IBB3RN52WE4V7QGU      file
#> 11  618.2K FALSE 017P4HV6NWPAN6PTWC3FEJB3KRR2WDWXFM      file
#> 12  18.51K FALSE 017P4HV6K2WP2LBTN6PFGLVY4PIWOY7IAR      file

Others use the as_data_frame parameter to convert a list into a data frame as a convenient alternative.

Helpers for SharePoint sites and drives

Most functions in the package rely on three helper functions, get_sp_site(), get_sp_drive(), and get_sp_item(), which can parse a URL to determine the SharePoint site URL, drive name, and file path.
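
To give a rough idea of what this URL parsing involves, here is a base R sketch that splits a shared file URL into its parts. parse_sp_url_sketch() is a hypothetical function and the regular expression is an assumption about the URL layout seen in the examples in this README; the package's own parsing may differ and handles more cases.

```r
# Hypothetical parser (not the package's implementation) for URLs
# shaped like https://<tenant>/:w:/r/sites/<site>/<drive>/<path>?<query>
parse_sp_url_sketch <- function(url) {
  # Drop the query string, then decode percent-encoded characters
  url <- utils::URLdecode(sub("\\?.*$", "", url))
  # Groups: base URL, site name, drive segment, remaining path.
  # The share marker (such as ":w:/r/") is optional and not captured.
  pattern <- "^(https://[^/]+)/(?::[a-z]:/r/)?sites/([^/]+)/([^/]+)/?(.*)$"
  parts <- regmatches(url, regexec(pattern, url, perl = TRUE))[[1]]
  if (length(parts) == 0) {
    stop("Not a recognised SharePoint file or folder URL: ", url)
  }
  list(
    site_url = paste0(parts[2], "/sites/", parts[3]),
    drive_segment = parts[4],
    file_path = parts[5]
  )
}

parsed <- parse_sp_url_sketch(
  "https://bmore.sharepoint.com/:w:/r/sites/MayorsOffice-DataGovernance/Shared%20Documents/General/Baltimore%20Data%20Academy/Baltimore%20Data%20Academy%20Announcement%20Content.docx?d=w0a50d3cd74ce4a8da6d82596037f0148&csf=1&web=1&e=cBURo2"
)
```

Note that the drive segment in the URL ("Shared Documents") is not necessarily the drive's display name, so a real implementation still needs to match it against the drives listed for the site.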

get_sp_site() is a minimal wrapper for Microsoft365R::get_sharepoint_site():

site_url <- "https://bmore.sharepoint.com/sites/MayorsOffice-DataGovernance"

get_sp_site(site_url = site_url)
#> Loading Microsoft Graph login for default tenant
#> <Sharepoint site 'Citywide Data Network'>
#>   directory id: bmore.sharepoint.com,1da01cb7-76c7-4487-b398-ec60dff8f52d,7abe409a-2aeb-4eb8-8dc3-8fe9106907af 
#>   web link: https://bmore.sharepoint.com/sites/MayorsOffice-DataGovernance 
#>   description: If you work with or are interested in data, feel free to join! 
#> ---
#>   Methods:
#>     delete, do_operation, get_drive, get_group, get_list,
#>     get_list_pager, get_lists, list_drives, list_subsites,
#>     sync_fields, update

get_sp_drive() returns a ms_drive object but requires a SharePoint URL that includes the name of the drive (otherwise it returns the drive named in the “sharepointr.default_drive_name” option):

get_sp_drive(docx_shared_url)
#> Loading Microsoft Graph login for default tenant
#> <Document library 'Documents'>
#>   directory id: b!txygHcd2h0SzmOxg3_j1LZpAvnrrKrhOjcOP6RBpB6-8Kta613N3QJlbvrVKyTwO 
#>   web link: https://bmore.sharepoint.com/sites/MayorsOffice-DataGovernance/Shared%20Documents 
#>   description:  
#> ---
#>   Methods:
#>     copy_item, create_folder, create_share_link, delete,
#>     delete_item, do_operation, download_file, download_folder,
#>     get_item, get_item_properties, get_list_pager, list_files,
#>     list_items, list_shared_files, list_shared_items,
#>     load_dataframe, load_rdata, load_rds, move_item, open_item,
#>     save_dataframe, save_rdata, save_rds, set_item_properties,
#>     sync_fields, update, upload_file, upload_folder
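
The fallback to the “sharepointr.default_drive_name” option can be pictured with a short base R sketch. resolve_drive_name() is a hypothetical helper, and the "Documents" value used when the option is unset is an assumption made for illustration only.

```r
# Hypothetical helper: prefer an explicit drive name, otherwise fall
# back to the session option. The "Documents" default is an assumption.
resolve_drive_name <- function(drive_name = NULL) {
  if (!is.null(drive_name)) {
    return(drive_name)
  }
  getOption("sharepointr.default_drive_name", default = "Documents")
}

# Set the option for this session, keeping the previous value
old <- options(sharepointr.default_drive_name = "Team Files")

from_option <- resolve_drive_name()
from_argument <- resolve_drive_name("Archive")

# Restore the previous option value
options(old)
```

Setting the option once, for example in a project-level .Rprofile, avoids repeating the drive name in every call.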

Related packages

There are a few similar packages available:

  • sharrpoint: An R package to interact with Sharepoint API (files & lists).
  • sharepointr: An R package for reading from and writing to SharePoint lists.
  • msgraphr: A minimal R wrapper of the SharePoint Online (Office 365) APIs (last updated 3 years ago).

I know this package name conflicts with the existing sharepointr package. Unfortunately, I didn’t notice it until after I set up the repository. However, given that the existing sharepointr package appears to be no longer under development, I plan to stick with the current name for the time being.