Higher Geography for Marine/ offshore collections #2374
Comments
We definitely need input from Andres Lopez at UAM and @anna-chinn and @dperriguey at UNM. But it may impact other collections that have a few seals or shore or sea birds etc., even though that's a small part of their collection, so the whole AWG probably needs to weigh in. Can you tell me more about WKTs? I can't find anything in the Handbook. How are they created? Where does the media ID come from? Do we create them or are they standard for all users? Are they available for all countries? Why do the US WKTs seem to include a few miles of ocean within the WKT boundary but the ones from South Africa don't? As a result, my US specimens that are just outside the land boundaries for Florida, California etc. don't create an annotation but the South African ones do. Having just those few miles in the ocean makes a huge difference in the number of annotations created and doesn't require a new higher geography system until the specimen locality is significantly further offshore. And why does the one that Dusty referenced for San Diego county in your email not look like the one with my specimen records? I'm quite interested in the EEZ concept. Is there any way that the coastal WKTs could be modified to include the EEZ? Also of interest is MarineRegions.org (http://www.marineregions.org/about.php). Using a vetted external source would help with consistency within Arctos and beyond. Does anyone have more knowledge about these two sources? If we could avoid having to create all new higher geography (by starting with the Ocean vs. the Continent) it would be much less work, but it needs to be a good permanent solution. Another idea is to create polygons (WKTs?) for specific marine localities. But we have never been successful in creating polygons with the geolocation tools (despite a comment in the Handbook that it can be done).
Is there any way that a locality WKT could override a "higher geography" WKT (if it's just an extension into the water) or does that just cause more confusion? I have reached out to other collections managers who are well positioned in the marine invertebrate community to see if there is any action being taken that we should know about. |
From #1107
We should definitely look at these. How do we do that? |
WKT are just polygons, shapes with which we can spatially define things that have previously been ambiguous strings. I think these came from some random download. They're sort of low-res, which occasionally causes funky things. They're not really authoritative, but they're the closest thing we have. Not a clue if the shapefile or my idea of "San Diego County" is wrong - maybe a bit of both!
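To make "WKT are just polygons" concrete, here is a minimal sketch of how a WKT shape can answer "is this point inside?". It is not how Arctos does it; the rectangle below is a made-up stand-in, not a real county boundary, and the function names are hypothetical. It reads only the outer ring of a simple `POLYGON` and uses a standard ray-casting test.

```python
import re

def parse_wkt_polygon(wkt):
    """Parse 'POLYGON ((x y, x y, ...))' into a list of (x, y) tuples.

    Only the outer ring is read; holes and MULTIPOLYGON are not handled.
    """
    match = re.match(r"\s*POLYGON\s*\(\s*\(([^()]*)\)", wkt, re.IGNORECASE)
    if match is None:
        raise ValueError("not a simple WKT POLYGON")
    ring = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        ring.append((float(x), float(y)))
    return ring

def point_in_ring(x, y, ring):
    """Ray casting: count how many edges a ray toward +x crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the ray's latitude
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangle in (longitude latitude) order, for illustration only.
example_wkt = "POLYGON ((-82.3 26.3, -81.5 26.3, -81.5 26.8, -82.3 26.8, -82.3 26.3))"
ring = parse_wkt_polygon(example_wkt)
onshore = point_in_ring(-81.9, 26.5, ring)
offshore = point_in_ring(-83.0, 26.5, ring)
```

A point that falls outside the ring is exactly the "just offshore" case discussed above: whether it matches depends only on where the shape's edge was drawn.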
Clarify please.
From my viewpoint, that's just always going to be wrong (and I'd see self-conflicting spatial data as a strong indication that nothing else about the dataset should be trusted). I don't see any functional difference between "San Diego County, 10 miles west of" and "Mono County, California, 10 miles east of Reno" - they're both just wrong, despite one being 'traditional.' The practical difference is that we have a name for the THING east of Mono County; I see no reason we can't have a name for the THING west of San Diego as well. |
A little more specificity about WKTs and Arctos: WKTs are just another format of a shapefile, as Dusty says. So if you see a shapefile (or for that matter any other GIS vector format) I can convert it to WKT. There are lovely open-source global libraries for that purpose nowadays! But I am archiving copies of WKTs we're making here: https://github.com/BNHM/spatial-layers/tree/master/wkt (Dusty's made a bunch too but he archives on Arctos!) This repo has a bunch of different formats that we are trying out in a web service for other applications, so if Arctos could use geojson... that would be good too (may be in the not too distant future)! |
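As a rough illustration of the format conversion mentioned above, the sketch below turns a GeoJSON geometry into WKT text using only the Python standard library. It is an assumption-laden toy: it covers only `Point`, `Polygon`, and `MultiPolygon`, the function names are hypothetical, and it does not read binary shapefiles (a GIS library such as GDAL/OGR or Shapely would normally be used for that).

```python
import json

def ring_to_text(ring):
    """Format one ring as '(x y, x y, ...)', dropping any z values."""
    return "(" + ", ".join(f"{x} {y}" for x, y, *_ in ring) + ")"

def geojson_to_wkt(geometry):
    """Convert a GeoJSON geometry dict to a WKT string (subset of types)."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return f"POINT ({coords[0]} {coords[1]})"
    if kind == "Polygon":
        return "POLYGON (" + ", ".join(ring_to_text(r) for r in coords) + ")"
    if kind == "MultiPolygon":
        parts = (
            "(" + ", ".join(ring_to_text(r) for r in poly) + ")"
            for poly in coords
        )
        return "MULTIPOLYGON (" + ", ".join(parts) + ")"
    raise ValueError(f"unsupported geometry type: {kind}")

# Made-up triangle, for illustration only.
geometry = json.loads(
    '{"type": "Polygon", "coordinates": [[[0, 0], [4, 0], [4, 3], [0, 0]]]}'
)
wkt = geojson_to_wkt(geometry)
```

Both formats carry the same coordinate rings, which is why either could in principle back the same spatial check.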
I definitely don't need this ASAP. We've been working around this for at least five years so getting something that works for everyone and is easy to administer is way more important than speed. I'll reread your comments above. Still trying to make sure I fully understand how this all works. |
Yay Media - Arctos doesn't care where this stuff lives.
I probably haven't been that consistent, but %.wkt should get them all.
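The `%.wkt` above reads like a SQL `LIKE` pattern matching anything that ends in `.wkt`. A minimal sketch of that kind of search, using an in-memory SQLite database; the `media` table, its columns, and the example URIs are all assumptions for illustration, not the actual Arctos schema or data:

```python
import sqlite3

# Hypothetical table and rows; not the real Arctos schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (media_id INTEGER PRIMARY KEY, media_uri TEXT)")
conn.executemany(
    "INSERT INTO media VALUES (?, ?)",
    [
        (1, "https://example.org/wkt/lee_county.wkt"),
        (2, "https://example.org/images/specimen.jpg"),
        (3, "https://example.org/wkt/san_diego_county.wkt"),
    ],
)

# '%' matches any run of characters, so this keeps URIs ending in '.wkt'.
wkt_rows = conn.execute(
    "SELECT media_id, media_uri FROM media "
    "WHERE media_uri LIKE '%.wkt' ORDER BY media_id"
).fetchall()
wkt_ids = [row[0] for row in wkt_rows]
conn.close()
```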
If gmaps can understand it (I think so) then it should be possible (might still involve quite a bit of code).
I'm just trying to guess what users want (which usually means being predictable at the largest scale possible). If there's some defensible reason to do whatever, then fine. If stretching San Diego County halfway to Hawaii is going to make it harder to find specimens, conflict with every other spatial representation of the namestring on the planet, make County checklists and County-based agencies' jobs more difficult, permit reporting impossible, and is just generally weird then maybe it's not such a great idea. This feels a bit 'duct tape and bubblegum' to me but I'm not any of those users, and the inclusion of spatial data clearly communicates whatever weird thing we might decide. "Is[not] where it claims to be" is really all my tools support at the moment anyway. |
To better understand how WKTs are developed, could we prepare one for Sanibel Island (Florida, Lee County), as it doesn't yet have one and we have a large number of specimens from there? Of course, I would want to make the boundary offshore the same amount as for Lee County so I don't create several hundred annotations. Where would that process begin? I wrote to José Leal, Science Director at the National Shell Museum on Sanibel. He is part of a grant request that includes developing standards for marine locations. This only applies to the East Coast of the US and it has not yet been approved or funded. Does his description below help in any way? I'm not sure if he means that they would use the Collaborative GeoReferencing feature http://www.geo-locate.org/community/default.html or what, so I'll try to clarify that. Also, how would they work with GeoLocate in creating these standards? Does GeoLocate already have definitions of "standard" locations? I think Arctos is way ahead of the databases they use (mostly Specify), so by the time they come up with something, we may already be a long way down our road, but it would be helpful to be headed in the same direction.
It sounds like when there is a geopolitical unit that the data point can be associated with, perhaps as part of an EEZ, that is used, but when it's in the middle of the ocean, it begins with the ocean, not the nearest land mass. Would it make sense to check with GeoLocate people to see where they will recommend the project draw the line between assigning marine localities to a geopolitical unit like Lee County, Florida vs. the ocean/sea such as the Gulf of Mexico? As for my "override HG" with locality polygon comment, let's ignore that for now. By light of day, it's obvious that would only complicate matters and not resolve them. |
WKTs are lines in the sand - stuff on this side is Bla, stuff on that side isn't. Predictable data are useful, unpredictable aren't. If we do our own thing, and it's different than anything anyone else has done, then we can't communicate and these data are not nearly as useful as they could be. I think that leads two places:
Their interests are about exactly opposite ours. Give them some place-strings, they'll give you coordinates. IDK what they're actually up to, but if I was building that I wouldn't care a tiny bit about standardizing, just what people had scribbled on labels in the past. If GL (or whomever) does have a source of "authoritative" geographyname-at-point, we could just pull it from them (for georeferenced specimens). Same with "habitat" data - if we have a 3D shape and a timestamp, we can (potentially) use it to find out if there were any sensors in the area and incorporate those data into Arctos. Why do you want this island as geography? It looks like it's fully within the county (according to our WKT) - is there some specific reason that's insufficient? |
With North America, United States, Florida, Lee County, (1001775) there is a WKT that does include Sanibel. It also includes an area (hard to measure) beyond the shore which gives me a nice area to plop a pin for marine species. For North America, United States, Florida, Lee County, Sanibel Island,(10004349) there isn't. Just wondering if there should be? |
Yes! (But it might not be trivial to dig up a relevant and defensible shapefile.) There are about 4 georeference/authority intersection possibilities in Arctos. Were I doing any sort of spatial analysis, I'd discard three of them outright (unless I REALLY needed the data and was willing to work for them).
I will eventually figure out how to include the intersection of georeference+geography as more than a map border color change, I hope. AFAIK nothing except Arctos can offer spatial data of a known quality, and we should brag about that more than we do. |
@sharpphyl would https://en.wikipedia.org/wiki/List_of_seas as geography do what you need or is that still not fine-grained enough? @mkoo do you think you'd be able to find WKT for those if they turned out to be useful? |
@dustymc I may be missing the point, but I think there's more to discuss than just adding a group of seas no matter how helpful that would be. Should we first look at our higher geography structure and consider separating Continent and Waterbody before we decide what water bodies to list in higher geography? See #2876 and #128 for starters. I know our specimens are from the Gulf of Mexico when they are at 500' depth but how do I associate them with being off the Florida coast and not the Mexican coast or Texas coast when our higher geography combines continents and oceans? As I mention in #128, ⅓ of our specimens can't be discovered in GBIF if someone searches with their Continents. While alignment with aggregators' standards may not be our priority, is it to be considered? And where do the continents end and the oceans begin? If a specimen is 5 feet offshore? 50 meters offshore? 2 miles offshore? Within the economic zone? If I could define the water body and the continent, this might not be as much of an issue. Or is there a standard we should use? In looking at data from other museums with large marine collections, I don't see much consistency that could guide us. Specimens from several miles offshore are often associated with the nearest land mass/country. It would seem that discussing some of these general issues could influence how we deal with marine/offshore collections data. @mkoo You indicated (in an email) you have issues with these localities as well. Are yours similar or a whole different set of issues? |
Depends on what you mean by waterbody, and perhaps eventually what happens in other Issues. As of right now I'd like to keep geography "formal" (whatever that means). So importing a bunch of seas from some "authority" (wikipedia or better!?) is easy (but not easy enough to do it if it's not helpful to you), while finding a place for the puddle under your birdbath doesn't seem like something we can immediately deal with. #1278 I have no problem with you adding locality attributes for your puddle. I do have a problem with trying to wrap locality attributes in the spatial authority data - the model just isn't built to support that. Wrapping them in the spatial assertion data via locality-WKT does work, but that's a very different kind of thing. If you meant continent and ocean then that's more approachable, but....
... could lead to twice the number of ways of saying "there," which would make the data that much less accessible. I'd think that would also break anything that expects continents to be mostly dirt, but maybe that's not our problem.
"Wherever the WKT stops" is the actionable answer.
Given WKT I'm certainly willing to try. Without we'll end up with North America sort of fading into North America/Atlantic Ocean which fades into Atlantic Ocean, which fades into .... I think that'd add up to more complexity and same or perhaps reduced functionality.
I also don't see any spatial tools; it's not clear to me that there are any repercussions to being inconsistent in other systems. There are in Arctos; we have spatial authorities. In any case, yes I'm totally up for radical rethinks of how we handle geography, and sooner is better - I need to rebuild some locality screens, I'd prefer to do that only once. I'm not sure that we can realistically change how terrestrial stuff gets cataloged, but we can certainly add "fields," talk about how we define things, etc., etc. |
I believe that is exactly what she meant. IMO, we should definitely separate continents from oceans in our higher geography and I also think we should have a separate locality field "waterbody" controlled by a code table with values like you are discussing in #2374 (comment) (perhaps this is really just a locality attribute?) for rivers, lakes, seas, etc. |
That's one benefit of the unseparated model - you can have only one. Having both sets us up for all sorts of impossible situations - something on the order of a doubling of anything vaguely coastal, I'd guess, which probably about halves the stuff any one of them can find. @sharpphyl may see that as a benefit, I see it as purposefully preventing users from finding what they're looking for. "waterbody" might make some sense, but the "etc." scares me.
Not if it's part of the polygon. |
Yeah, I see that, but it also means when people are searching aggregators they aren't finding all of DMNS marine stuff. Not sure I have a great answer there.
We need to have rules - but I think our use of a required Wikipedia article might suffice?
OK, yeah - but how many polygons do we actually have now? Are we putting aspirational ideas ahead of workable ones? If there was a polygon for the Rio Grande River in New Mexico, wouldn't that be different in 2020 from even 2018? Isn't that also true for all bodies of water? Maps are awesome except when they're out of date.... |
I'm absolutely positive I don't, other than good georeferences and hoping some service eventually saves the day. I know some things that don't work, "14374 miles east of CA" (which may or may not be another way of saying "separate continent and ocean") first among them...
I still don't know what they should be doing, but "referencing" nearby (or far away!) things is clearly wrong.
Better than nuthin.
3394, and I'd guess they've found a few tens of thousands of data problems. It's not perfect, but it's still the most quantitative geography-thing I know of.
At some level I don't care - I can still use that to find "says NM, maps to China." OK fine I do care - #3018 is the same situation. |
Elevating priority - we probably can't prevent introducing more garbage data until we have some sort of useful names for the places from which nonterrestrial things come; this is critical to improving general data quality. |
See also #2025 - whatever we do for "waterbody" should also work for rivers. |
I have the below shapes and the capacity to import most any shape from most any source. From a 'shapes define, the rest describes' perspective I think we have an adequate solution, all we need to do is avoid all of the unnecessary problems caused by #4836. I don't think any further development action is necessary, these can be used via the normal pathway, tentatively closing.
|
As we move to WKT-defined Higher Geography (e.g., North America, United States, California, Monterey County) we are finding that records falling offshore are not included (for good reason).
How we create appropriate higher geography for marine, aquatic, offshore localities needs to be decided.
Discussion points from #1107 :
-What about including EEZs for each coastal county or state? There could be WKTs for that which could replace or complement the existing terrestrial counties. EEZ = Exclusive Economic Zone; these dictate where you are fishing and which jurisdiction you fall under. We have used this to georeference fish and marine collections in the past.
we would create new HG that are appropriate for the localities-- if county is included then the HG would be specific to the county. yeah we are going to have to carve up the entire planet into wkts and HG. Overlap is inevitable and maybe not that terrible.
-Attaching shapes to geography removes the ambiguities. "Pacific Ocean" can start where the sand gets wet, or {pick a number} miles offshore, or WHATEVER, and I can immediately tell if a specimen is within that shape or not.
Overlap is inevitable and maybe not that terrible.
Agreed, especially WRT precision - "Yellowstone" laps three states and a bunch of counties, and you might know any of that (or not) for any specimen. It's not ideal, but we're stuck with it. At the same time we should respect "real" boundaries when they exist. As far as I can tell there's a hard border around San Diego County, and it looks like this:
[Image: Screen Shot 2019-11-15 at 1 52 05 PM]
Something a mile offshore could arguably be in California, Pacific Ocean, various finer-scaled marine-things we might add, etc., but it cannot justifiably be in San Diego County.
-what do marine collections think makes sense? Need input!