regional_data tutorial #177
Comments
Yes, certainly, I will change it this afternoon. |
I wanted to make a suggestion about my last major release addition, the regional statistics. I have been working on this issue ever since, and I have become convinced that this functionality should be moved to a separate CRAN package, while maintaining cross-functionality and documentation links in the vignettes. In 2019, both Eurostat and the OECD took major steps to formalize sub-national statistics, including not only regional ones but also metropolitan area / city statistics. As opposed to national boundaries, which are relatively stable, such sub-national divisions are extremely numerous and change every year.

My original solution was triggered by the way Eurostat converted its regional statistics from the NUTS2013 definition to NUTS2016, and I wrote a converter for this change, keeping some NUTS2010 codes as exceptions. But NUTS2021 is already agreed on, and occasionally I find data that is coded by NUTS2008, etc. All in all, I went back to 1999 (the last non-standardized year of EU regions) and created new functions that can convert from any year to any year with precision (e.g. NUTS2008 to NUTS2021, NUTS2016 to NUTS2013). The OECD uses the EU NUTS in its statistics, too, but it incorporates advanced statistical systems from the EU, Canada and Australia. My new package will take care of these changes, too.

However, recoding the regions is only the first step. When working with time-series or panel data, you would like to impute missing or erroneously coded data in tables. Given that my coding is getting very solid and general, this is possible, but there are so many possibilities that it will probably take a year or two for my solutions to mature. So it would be very impractical to keep them in the eurostat package. The way I am planning the change is the following:
I think that even my early releases will ensure that regional Eurostat statistical products can be greatly enhanced. |
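The year-to-year recoding described in this comment can be pictured as a lookup in a correspondence table. The following is only a minimal base R sketch of that idea, not the API of the new package: the `crosswalk` table, its column names and the `recode_nuts()` helper are hypothetical, and the handful of codes are included purely for illustration. A split region has no unique successor, so the sketch returns `NA` for it rather than guessing.

```r
# Hypothetical crosswalk: one row per (old code, new code) pair.
# Illustrative rows only; a real table would come from official
# correspondence tables.
crosswalk <- data.frame(
  code_2013 = c("FR24", "FR10", "HU10", "HU10"),
  code_2016 = c("FRB0", "FR10", "HU11", "HU12"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode a vector of region codes between two
# definitions, given the names of the source and target columns.
recode_nuts <- function(codes, crosswalk, from, to) {
  vapply(codes, function(code) {
    hits <- unique(crosswalk[[to]][crosswalk[[from]] == code])
    # Return a code only when exactly one target exists; unknown codes
    # and split regions both give NA.
    if (length(hits) == 1) hits else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

recoded <- recode_nuts(c("FR24", "FR10", "HU10", "XX99"),
                       crosswalk, from = "code_2013", to = "code_2016")
print(recoded)
```

Because the direction is just a choice of columns, the same helper covers backward conversion (for example `from = "code_2016", to = "code_2013"`). Converting between non-adjacent years would need either a chain of such tables or a combined one.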
This sounds like a very good plan, and I agree that splitting packages like this can be well justified. We would be very happy to maintain this as part of rOpenGov. There has also been some planning with @muuankarski to better synchronize the data part of the eurostat pkg and the geospatial part (now in the eurostat_geodata pkg). All three packages are complementary, and it is useful to develop them jointly. I can grant you access to create the necessary repository under rOpenGov if you do not have those permissions already, and then we can see how best to contribute. |
Oh - one more comment - would it be useful to have eurostat somehow in the pkg name? |
Hi, I am more than happy to bring it into rOpenGov. I had a lot of little things to sort out with it, and eventually sent it to CRAN as 0.1.0 to see if it goes through. If it passes CRAN, I am planning a very quick follow-up that brings the two packages in sync. Originally I was thinking about referring to eurostat in the name, but eventually this package went a lot further, and it will handle the US, Australia, Japan, and, besides the NUTS typologies, OECD and some other countries' individual ISO 3166-2 codes. So that would not make a lot of sense. In fact, I hope that I will be able to make a programmatic connection between the OECD and the eurostat package. Here is the new package: https://github.com/antaldaniel/regions |
Ah, great. OK! Perhaps there are other ways, but I have now just forked this under rOpenGov and given you full admin permissions to this repo. In addition, I gave maintainer-level permissions to the eurostat devel team. You might be able to create new repositories under rOpenGov now, as you have full admin rights to this pkg. But I am not sure; we can try next time there is a need. Shall we use the rOpenGov repo as the main branch from now on? If you like, we could do a small pkg review and possibly propose some improvements (if any - it looks very good already). As I mentioned, we will need to see if we can sync with the eurostat_geodata pkg with @muuankarski; and later on it might make a lot of sense to write a short report for the R Journal, for instance (or another suitable one). |
Thank you, @antagomir, what you propose is very good. I submitted to CRAN before I went to bed, actually fearing some hiccups, and I woke to the news that regions is on CRAN.
I am not extremely good with git, but I see that there are a lot of new project / team features available for free to rOpenGov, too. For new packages it may be a good idea to figure them out. |
Perfect with CRAN. 1-2. Good, let us know if you need any help.
|
Issue 186 is connected to this; as we start working on the regions package, these issues should be resolved. |
@antagomir Please review my latest pull request to the devel. It has a new vignette, deprecates the old (and not correctly working) functions, and gives a warning to the user to use regions instead. |
The regional data tutorial seems great. Some of the tables are very extensive; would it make sense to show only the first few lines with head(), and/or use knitr::kable to make the tables more html/markdown friendly in this online version?
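A minimal sketch of what this suggestion could look like in a vignette chunk. The built-in `mtcars` dataset and the names `long_table` and `preview` are stand-ins chosen for illustration, not objects from the tutorial. The sketch falls back to plain printing when knitr is not installed.

```r
# Stand-in for one of the long regional tables in the tutorial.
long_table <- mtcars

# Keep only the first few rows for display.
preview <- head(long_table, n = 5)

# Render as a markdown table when knitr is available; otherwise
# fall back to the default data frame print method.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(preview, format = "markdown"))
} else {
  print(preview)
}
```

Inside an R Markdown chunk the explicit `print()` is not needed; calling `knitr::kable(head(long_table))` as the last expression of the chunk is enough.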