Split "Observation" event type with new collecting sources #2075
Comments
My first impression is that this is going to lead to endless debates and inconsistent data. I would 100% call a person with a camera a 'human observation' - they saw the critter-or-whatever (so have more information than what's captured), the photo (recording, whatever) is just supporting evidence. That's obviously not a universal viewpoint, and I suspect it's just one of many examples where some users would choose one of these values and some would choose the other (which means they're not useful for users). I suggest our two-part approach:
|
I'm good with that. Do we have "establishment means" yet? |
No, but I can create it in a few hours - I think we're all good with that in #1942. Given the possibility of this use case, are we still happy with the name? |
I for one, have no idea what that means... but as long as you write good documentation, I'll figure it out. |
@AJLinn for us biological collection people, it means "The process by which the biological individual(s) represented in the Occurrence became established at the location." Is there something similar in your world? We can use another term if it encompasses something you need too! |
IF my idea to split 'observation' into machine and human using the attribute intended to refine collecting source (but perhaps not quite in this way) isn't too far out, that definition is too restrictive and the name may be as well. Take a picture of a critter:
I hate to be the one to suggest even more Attributes (#1623) but maybe my idea is too much overload for this and we need yet another way of refining collecting source?? Maybe this isn't related to collecting source at all? Maybe it really is a fundamental split in collecting source and we just need better human/machine definitions? Does this change if we lose the camera trap image - eg, is this just parts?? Wheeeee!! |
Who are you kidding - you LOVE to make messes. :-)
Given what John W. said in the TDWG issue, my first instinct was to split the "observation" event type.
But maybe we should start with defining Collecting_Source. I don't see a definition at http://handbook.arctosdb.org/documentation/specimen-event.html and this is probably why we are having so much trouble with it. Creating a separate issue.
Having a "media" part would possibly make an observation "machine", except when the media is only a scan of field notes...so I say this has nothing to do with parts. |
BTW - we should also consider where the DwC term "basis of record" comes from in Arctos. I am not sure that I can answer that question. Most stuff shows up in GBIF as Preserved specimen. What magic happens to get that information out of an Arctos record, which as far as I know doesn't specify that information anywhere? It can possibly be inferred from event type + parts. I can't quickly find an example of an observation from Arctos in GBIF. |
I don't, I swear!
A recording of a bird singing is a machine observation - and the bits where the person holding the recorder is talking are basically field notes....
Many of those bird recordings end with BLAM! and a specimen..... I suppose that should really be two Events a second or so apart, but I don't think anyone works at that precision. This just all seems too arbitrary. How about from the other side: who cares about this stuff? Why do we want to separate human and machine "observations"? |
http://arctos.database.museum/info/ctDocumentation.cfm?table=CTCATALOGED_ITEM_TYPE That can be set on the Attributes tab. I wouldn't want to defend the definitions... |
Why do we want to separate "captive" from "wild"? It's a similar answer - to give people an idea of the reliability of the data for whatever research they happen to be doing. Data supported with a photograph, video or sound recording may be deemed more reliable than 100 year old cursive writing about the yellow bird someone saw while eating dinner at field camp (or perhaps the other way around depending upon the person doing the cursive writing). Anyway, that's why I demoted the division from event type to collecting source and totally accepted your demotion of it to an establishment means attribute. |
HMMMM - so when/where does this get recorded? It isn't part of bulkloaded data or a single entry record and I think it possibly should be? |
That split does seem useful to me, but it also seems to depend on the material we have in hand.
I can't say I'm particularly looking forward to writing the code, but it looks like that could all be derived from event type + parts + disposition. |
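For the sake of discussion, here is one hypothetical reading of "derived from parts + disposition", sketched in Python. Everything in it is an assumption rather than how Arctos actually works: the function name, the part and disposition values, and the rules themselves. It also leaves out event type, since the thread doesn't settle how that would factor in.

```python
# Hypothetical disposition values meaning "nothing in hand" (assumed, not the Arctos vocabulary).
UNAVAILABLE = {"missing", "destroyed", "used up"}


def infer_basis_of_record(parts):
    """Guess a Darwin Core basisOfRecord from (part_name, disposition) pairs.

    Assumed rules, for illustration only:
      - any physical (non-media) part still in hand -> PreservedSpecimen
      - otherwise, any media still in hand          -> MachineObservation
      - otherwise                                   -> HumanObservation
    """
    in_hand = [name for name, disposition in parts if disposition not in UNAVAILABLE]
    if any(name != "media" for name in in_hand):
        return "PreservedSpecimen"
    if "media" in in_hand:
        return "MachineObservation"
    return "HumanObservation"


print(infer_basis_of_record([("skin", "in collection")]))
print(infer_basis_of_record([("skin", "missing")]))
print(infer_basis_of_record([("media", "in collection")]))
```

Note that this sketch can't tell a camera trap image from a scan of field notes, which is exactly the kind of case that makes the derivation hard to defend.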
Looks like I'm making a (probably not very defensible) decision from the collection type. This seems very much like the above to me. http://arctos.database.museum/guid/MVZObs:Mamm:10 is an "observation" from which you could get DNA. We probably have 'observations' recorded in 'real' collections. We certainly have specimens in 'real' collections for which we can't find parts (i.e., they are functionally observations). Etc. It's not much problem (for me - not sure about those of you who have to use it) to add this to the bulkloader, but I'm also not sure it does what I think it's intended to do. |
I didn't read the full thread (apologies!) but recently in the TDWG Machine Observations group we have settled on a definition for MachineObservation vs HumanObservation
|
@albenson-usgs so, if it is on a schedule it is human and if automated (motion sensor) it is machine? |
Please note that none of this is from the normative Darwin Core definitions.
…On Thu, Apr 22, 2021 at 7:01 PM Teresa Mayfield-Meyer < ***@***.***> wrote:
@albenson-usgs <https://github.com/albenson-usgs> so, if it is on a
schedule it is human and if automated (motion sensor) it is machine?
|
Well what we came up with is if it's on a schedule then it's a machine (a drone flying over set to take pictures every 30 seconds versus a drone that a person is flying and choosing when to take the picture). @tucotuco that's good to know. As I said on the Material Sample thread I think |
So a detection from a satellite photo is one thing, and a detection using the same method in an identical-resolution aerial photo is something else?
I completely agree; I think we're spending a lot of time and effort (#2432, #3421) creating entirely arbitrary data that doesn't DO anything for anyone. |
I think these are both machine because it's not like the person on the plane says "Wait, there is a really cool tree right there, let's get an aerial image of it." I would think you have a designated flight path and you're recording anything in the path. But maybe this isn't worth hashing out since this isn't the way |
That definitely happens.
Even those are more suggestions because of weather etc. But now I'm thinking perhaps you're suggesting intent rather than the camera platform is the factor? I (obviously!) don't know what we should be doing, but what we are doing is manually (==arbitrarily) setting https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcataloged_item_type, and then sending it to DWC via...
My preference would still be to banish the concept from wherever it exists. |
This is exactly what is recommended for the Darwin Core terms. Basically, if the evidence comes out of a machine, regardless of any intention or automation, it is a MachineObservation. Part of the reason for that is to have an anchor for being able to capture metadata about the machine, something that isn't expected for a HumanObservation. To respond to @dustymc , this is specifically to avoid endless debate, though we both know that inconsistent data has to be solved by a lot of other medicine.
This is not at all what dwc:establishmentMeans is about, if that was the intention.
This is already the case with MachineObservation and HumanObservation. There is no "Observation" class in Darwin Core.
|
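To make the difference between the two rules discussed so far concrete, here is a minimal Python sketch. The class, field, and function names are hypothetical, and both functions are only my reading of the comments in this thread (the "on a schedule" rule and the "came out of a machine" recommendation), not normative Darwin Core definitions.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    from_machine: bool         # evidence came out of a device (camera, recorder, ...)
    person_chose_moment: bool  # a person decided when the capture happened


def machine_output_rule(e: Evidence) -> str:
    """'If the evidence comes out of a machine, regardless of intention or automation.'"""
    return "MachineObservation" if e.from_machine else "HumanObservation"


def schedule_rule(e: Evidence) -> str:
    """'On a schedule' reading: machine only when no person picked the moment."""
    if e.from_machine and not e.person_chose_moment:
        return "MachineObservation"
    return "HumanObservation"


# Hypothetical cases taken from the examples in the thread.
cases = {
    "drone photo every 30 seconds": Evidence(True, False),
    "drone photo, pilot picks the shot": Evidence(True, True),
    "field notes": Evidence(False, True),
}

for label, evidence in cases.items():
    print(label, machine_output_rule(evidence), schedule_rule(evidence))
```

Under these assumptions the two rules only disagree on the piloted drone photo, which is the case people have been arguing about.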
Alright so it seems to me that this has evolved over time and relatively recently given this comment and then this one. If the latter one is the consensus of the community then I don't think it's clear to everyone that that is the decision (note that the TDWG Machine Observations Task Group doesn't know this). Moreover, I still don't understand what a user would do with this information. How would they use it? If I go to GBIF and download data from multiple datasets and it comes back with 500 occurrences "HumanObservation" and 500 occurrences "MachineObservation" what is it that I can do with that? I don't think any of this would necessarily be an issue except (Apologies to Arctos for babbling in your thread) |
@albenson-usgs we appreciate the interaction! See also tdwg/dwc#314 (comment) |
@tucotuco at some point we talked about a general statement for things like 'invasive' and a way to refine (what exactly do we mean by "invasive"?) as assertions. I was attempting to suggest that we could take a similar approach here:
For example, "wander around with a yagi antenna, scribble stuff on paper, hope it more or less triangulates" and "get coordinates from GPS" describe a very similar situation in different years. Those might end up in the same DWC-slot, but they're very different kinds of data. And - SHOULD they end up in the same slot? Nobody generally sees any critters, one "came out of a machine" as beeps, then went onto paper (with a little help from a compass - is that a machine? Would the answer be different for the compass in my watch?), then back into a machine, and the other is machines all the way down. I'd probably put them in different slots, I suppose, but it still seems overly arbitrary to me.
I think it depends on the details. I can verify an identification from PreservedSpecimen, I can't from HumanObservation, for example. (Until I learn that a fair number of those PreservedSpecimen aren't actually available to borrow for various reasons, and are therefore functionally identical to HumanObservation...) And yes, we appreciate the input @albenson-usgs ! |
If the intent for |
I think a slightly more accurate description of the intention of |
But |
Why not? An image can be missing, destroyed, "in collection" (either a physical copy or digital one), on loan, etc. |
The definition is "The current state of a specimen with respect to the collection identified in collectionCode or collectionID." I would not apply that to an image but maybe that's just me. |
That definition arose from a Specimen-centric proto-Darwin Core. It can be
changed to admit disposition of evidence if the community is down with
that. That would avoid a proliferation of terms. I can't see it being
terribly controversial, but then, I am repeatedly surprised.
…On Fri, May 7, 2021 at 10:32 AM Abby Benson ***@***.***> wrote:
The definition is "The current state of a specimen with respect to the
collection identified in collectionCode or collectionID." I would not apply
that to an image but maybe that's just me.
|
It seems to me that an update to the definition for |
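Jumping into this cold with a few thoughts. First to the original suggestion of splitting observation event type now into sources, I think this would introduce a whole mess of issues and debates that in the end, most collections will not touch. There's also potential to have seemingly conflicting inferences with basisOfRecord. Also how does that comport with existing samplingProtocol in field work? collectingsource=contraption if using a mistnet or pitfall trap? I'm more in favor of better definitions and use cases for basisOfRecord since this is also the first order of filtering of records for many researchers. This field has potential for expanded types, is well used and appears to be able to distinguish collecting sources especially when relevant to the record (e.g., camera trap vs human). However that said, there's a lot of ambiguity as well in some of the example controlled vocab on TDGE (e.g., Event vs Occurrence?). Can we shore up basisOfRecord? Perhaps I'm missing a chunk of conversation but what is missing that we need more splitting of hairs here? |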
This topic is deep and of broad interest, but I would like to engage in an
attempted consensus here if that sounds reasonable.
…On Fri, May 7, 2021 at 12:27 PM Michelle Koo ***@***.***> wrote:
Jumping into this cold with a few thoughts. First to the original
suggestion of splitting observation event type now into sources, I think
this would introduce a whole mess of issues and debates that in the end,
most collections will not touch. There's also potential to have seemingly
conflicting inferences with basisOfRecord Also how does that comport with
existing samplingProtocol in field work? collectingsource=contraption if
using a mistnet or pitfall trap?
I'm more in favor of better definitions and use cases for basisOfRecord
since this is also the first order of filtering of records for many
researchers. This field has potential for expanded types, is well used and
appears to be able to distinguish collecting sources especially when
relevant to the record (e.g, camera trap vs human). However that said,
there's a lot of ambiguity as well in some of the example controlled vocab
on TDGE (eg. Event vs Occurrence?) Can we shore up basisOfRecord ?
Perhaps I'm missing a chunk of conversation but what is missing that we
need more splitting of hairs here?
|
I feel like I need time to re-read this whole thing and digest - it may be a while.... |
closing as treated with cataloged_item_type |
See tdwg/dwc-qa#134 (comment)
Suggest we have collecting sources =
machine - evidence of observation captured by machine (camera, sound recording, etc.)
human - evidence of observation captured by a human (field notes)