Extract requirements evaluation metrics from Open Data Act #4089

Closed
2 of 5 tasks
nickumia-reisys opened this issue Nov 29, 2022 · 10 comments


@nickumia-reisys
Contributor

nickumia-reisys commented Nov 29, 2022

User Story

In order to perform self-evaluations, the Data.gov System Engineering team wants to review our governing document and create a list of metrics from which we can assess whether we are doing our jobs effectively.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN [a contextual precondition]
    [AND optionally another precondition]
    WHEN [a triggering event] happens
    THEN [a verifiable outcome]
    [AND optionally another verifiable outcome]

Background

Related to

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

[Notes or a checklist reflecting our understanding of the selected approach]

  • Read law
  • Make list of articles/provisions that apply to us.
  • Connect articles/provisions to features we support.
  • Create a performance metric or series of performance metrics for each feature.
@nickumia-reisys nickumia-reisys self-assigned this Nov 30, 2022
@nickumia-reisys
Contributor Author

Original Memo from President (January 21, 2009):

https://www.govinfo.gov/app/details/DCPD-200900010

Open Government Directive (December 8, 2009):

https://obamawhitehouse.archives.gov/open/documents/open-government-directive

OPEN Government Data Act -- being proposed (03/29/2017):

https://www.congress.gov/bill/115th-congress/house-bill/1770/text

@jbrown-xentity
Contributor

@nickumia-reisys the main one I saw referenced was M-13-13 (apparently in 2013?): https://obamawhitehouse.archives.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf

@nickumia-reisys
Contributor Author

nickumia-reisys commented Dec 2, 2022

Buzz (read: Key) words

  • Machine-readable
  • Open formats
  • Transparency
  • Participatory
  • Collaborative
  • Interoperability
  • Discoverable
  • Accessible/Readily Available
  • Usable
  • Information Lifecycle
  • Data standards

Motivations

  • Openness, in the form of transparency, participation and collaboration, will strengthen our democracy and promote efficiency and effectiveness in Government. (Presidential Memo, 2009)
  • "Making information resources accessible, discoverable, and usable by the public can help fuel entrepreneurship, innovation, and scientific discovery- all of which improve Americans' lives and contribute significantly to job creation." [Example: Weather Data and GPS] (Open Data Policy, 2013)
  • The Federal Government has the responsibility to be transparent and accountable to its citizens. (OPEN Government Data Act, 2017)
  • Communication, commerce, and data are global issues. "Global access to Government information is often essential to promoting innovation, scientific discovery, entrepreneurship, education, and the general welfare." (OPEN Government Data Act, 2017)
  • Open data means readily available, discoverable, and usable data. (OPEN Government Data Act, 2017)

(General) System Requirements

Current

  • Data.gov should be scalable, flexible and able to facilitate data extraction. (Open Data Policy, Attachment-III-2-a, 2013)
    • Scalable:
      • Able to handle an increase in the amount of data to track.
      • Able to coordinate with an increased number of agencies.
      • Able to be accessible, discoverable, and usable by a large and varied number of people.
    • Flexible:
      • Able to adapt to new data formats and new use cases not originally accounted for in its design.
    • Facilitate data extraction:
      • "Harvesting"
  • Open Data should be public, accessible, described, reusable, complete, timely and managed post-release. (Open Data Policy, Attachment-I-Open Data, 2013) Note: this is the definition from an agency perspective. See below for Data.gov-specific definitions.
    • Public: In most cases this is a no-op. However, I'd consider Data.gov a second line of defense. We are defenders of open data. We want as much data as possible to be public. However, if agencies publish something that should not be public, it is in good faith that we report data that might compromise privacy, confidentiality, security, or other valid restrictions.
    • Accessible: Our site should aim to achieve the highest level of uptime and integrity. It is our job to ensure data has not been modified from its source.
    • Described: All descriptions from data sources and their metadata are clear and accurately presented to users of Data.gov.
    • Reusable: Datasets use the most open licenses possible. We do not control the license assigned to datasets. However, we can be transparent about the types of data available by making licensing a filterable field, as Google does.
    • Complete: Ensures all metadata is indexed and searchable.
    • Timely: Updates to datasets are made as quickly as possible from the time agencies update their data.
    • Managed Post-Release: Ensures all datasets have a clear Agency POC to whom users of Data.gov can reach out.
  • Data.gov can support publicly available datasets that have not been released yet. (Open Data Policy, Attachment-III-3-b, 2013)
    • I found this to be an interesting one. I don't know of any datasets that fall into this category, but right now we don't make a distinction for data that will be released, but is not yet available.
  • While it is ultimately the responsibility of agencies, Data.gov may be called upon to help perform complex analysis related to safeguarding against the "mosaic" effect. (Open Data Policy, Attachment-III-4, 2013)
  • "[...] tools, best practices, and schema to help agencies implement the requirements of this Memorandum can be found through the Digital Services Innovation Center and in Project Open Data." (Open Data Policy, Attachment-IV-3, 2013)
    • I don't know if "Project Open Data" is still a thing. I understand it to be the name of "dashboard.data.gov", but supposedly this is a called-out support feature.
  • Data.gov performs automatic aggregation of all participating agencies' enterprise data inventories. (Open Data Policy, Attachment-III-3-b, 2013)
    • Data.gov is a many-to-one information system. It gathers and consolidates all agencies' public data and then presents this to the general public.
    • Data.gov is in charge of the collection, processing, maintenance, use, sharing and dissemination of all data in its system.
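As a rough illustration of the "Timely" requirement above, harvest lag could be measured as the time between an agency's last update and the moment Data.gov picks that update up. This is only a sketch under assumptions: `HarvestRecord` and its fields (`agency_modified`, `harvested_at`) are hypothetical names for illustration, not actual catalog fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class HarvestRecord:
    # Hypothetical fields for illustration; not the real catalog schema.
    dataset_id: str
    agency_modified: datetime  # when the agency last updated the dataset
    harvested_at: datetime     # when Data.gov picked up that update


def harvest_lags(records):
    """Return the lag (harvested_at - agency_modified) for each record."""
    return [r.harvested_at - r.agency_modified for r in records]


def median_lag(records):
    """Median harvest lag as a timedelta, or None when there are no records."""
    seconds = [lag.total_seconds() for lag in harvest_lags(records)]
    if not seconds:
        return None
    return timedelta(seconds=median(seconds))


def share_within(records, limit):
    """Fraction of records harvested within `limit` of the agency update."""
    lags = harvest_lags(records)
    if not lags:
        return 0.0
    return sum(lag <= limit for lag in lags) / len(lags)
```

Tracking the median alongside the share of datasets harvested within a chosen limit (say, one day) would show both typical behavior and how often updates are slow to appear.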

Proposed

  • Data.gov "shall maintain a single public interface online as a point of entry dedicated to sharing open Government data assets with the public." (OPEN Government Data Act, SEC.4-§3566.a, 2017)
    • We technically have a single public interface, but I guess augmenting catalog.data.gov with the static site could cause confusion. We have a path forward on upgrading the static site; I'm just suggesting that we consider how to connect the two sites for the easiest and clearest operation.
  • Data.gov has clear communication channels to discuss interoperability and connection between data.gov and agency websites. (OPEN Government Data Act, SEC.4-§3566.b, 2017)
    • Right now, we don't do much to validate that external links to agencies are valid and operational. We have a broken links checker, but as a long-term goal, integrations between data.gov and agency websites will require more effort.
  • "The Director of the Office of Management and Budget shall collaborate with the Office of Government Information Services and the Administrator of General Services to develop and maintain an online repository of tools, best practices, and schema standards to facilitate the adoption of open data practices" (OPEN Government Data Act, SEC.6-a, 2017)
    • This one is a bit intense. I don't have much to say about it yet.
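On the link-validation point above, a periodic check of outbound agency links could look roughly like the following. This is a sketch only and does not describe how the existing broken links checker works: `LinkResult`, `http_status`, `check_links`, and `broken_share` are hypothetical names, and the default fetcher relies on Python's standard `urllib`. The status lookup is passed in as a parameter so the logic can be exercised without network access.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional
import urllib.error
import urllib.request


@dataclass
class LinkResult:
    url: str
    status: Optional[int]  # HTTP status code, or None if the request failed outright
    ok: bool


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Default fetcher: issue a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (OSError, ValueError):
        return None      # DNS failure, refused connection, timeout, or bad URL


def check_links(
    urls: Iterable[str],
    fetch_status: Callable[[str], Optional[int]] = http_status,
) -> List[LinkResult]:
    """Classify each URL as ok when its status is in the 2xx-3xx range."""
    results = []
    for url in urls:
        status = fetch_status(url)
        ok = status is not None and 200 <= status < 400
        results.append(LinkResult(url, status, ok))
    return results


def broken_share(results: List[LinkResult]) -> float:
    """Fraction of checked links that are not ok."""
    if not results:
        return 0.0
    return sum(not r.ok for r in results) / len(results)
```

Some servers do not answer HEAD requests, so a real checker would likely need a GET fallback and retry handling before reporting a link as broken.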

Environment

TODO. This has to do with the requirements for open data from the agency's perspective. It is just meant to highlight the key points about open data that affect how we handle the data.

Auditing Metrics

TODO. This will take some time too. For each of the requirements above, ideally we would break them down a little further and then create a metric to track our compliance with it.

  • For example, for scalability, we could track:
    • How many outages we've had as a function of how much data we've handled.
    • How many agencies we've integrated with over time. (Compared to how many agencies should be integrated with us)
    • How many users we've had over time (Google Analytics would be fine).
    • How responsive our site is as a function of how many active users we have.
  • For flexibility, we could track:
    • How many special user requests we've had over time.
      • How many of them are filled and active.
      • How many of them were filled and became obsolete.
      • How many of them were not filled.
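The example metrics above could be computed from simple counts. The sketch below is illustrative only: the function names, the status labels (`active`, `obsolete`, `unfilled`), and any input figures are hypothetical and do not come from Data.gov's actual reporting.

```python
from collections import Counter


def outages_per_million_datasets(outages: int, datasets: int) -> float:
    """Scalability: outages normalized by how much data the system handles."""
    if datasets <= 0:
        raise ValueError("datasets must be positive")
    return outages * 1_000_000 / datasets


def agency_coverage(integrated: int, expected: int) -> float:
    """Scalability: share of agencies integrated out of those that should be."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return integrated / expected


def request_breakdown(statuses):
    """Flexibility: share of special user requests in each status.

    `statuses` is an iterable of labels such as "active" (filled and in use),
    "obsolete" (filled, then retired), or "unfilled".
    """
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: count / total for status, count in counts.items()}
```

Normalizing outages by dataset count, rather than reporting a raw outage total, keeps the number comparable as the catalog grows.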

Getting the complete list, or a list that is meaningful for the metrics we'd like to improve and representative of data.gov, will take additional effort.

Final thoughts

The scope of this ticket was to examine our governing documents and see if we are meeting the expectations that exist to support open data and open government. I hope we continue with this work and build the processes to gather the required data and perform self-audits with this information. I think data.gov has a lot of untapped potential.

@nickumia-reisys
Contributor Author

@jbrown-xentity I think the document you posted was created from this executive order (May 09, 2013)?

@jbrown-xentity
Contributor

@jbrown-xentity I think the document you posted was created from this executive order (May 09, 2013)?

Looks like my link is the direction from OMB, yours is the actual executive order. Both are certainly relevant in this case, but good catch!

@nickumia-reisys
Contributor Author

Yeah, having read the documents that describe the "collaboration" between OMB, Agencies and the "Administrator of General Services" ... I don't like it. The laws make finger-pointing easy, but they don't clearly explain how collaboration should work between everyone. I just tried to capture the technical points that we can use to build a better system. 😢

@nickumia-reisys
Contributor Author

Another source of truth for the Law:

@nickumia-reisys
Contributor Author

@GSA/data-gov-team The above comments are available in a Google Doc: https://docs.google.com/document/d/1dgPlNfjP5aMwIFyG0IY5hJMwwleTpocJArAPZ7muoJY/edit#heading=h.ghjzcjjwg5lx

@hkdctol
Contributor

hkdctol commented Dec 6, 2022

Now that we've moved the content to a Google Doc, we can mark this one done. Other tickets may follow.

@nickumia-reisys
Contributor Author

Related to


3 participants