Plan for renaming index patterns #94244

Closed
8 of 9 tasks
mattkime opened this issue Mar 10, 2021 · 31 comments
Labels
discuss Feature:Data Views Data Views code and UI - index patterns before 8.0 impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:medium Medium Level of Effort Meta

Comments

@mattkime
Contributor

mattkime commented Mar 10, 2021

Addresses #44955

This change should be rolled out with the 8.0 release, at the very least from a user-facing perspective.

  • Decide on a new name - "Data View" will replace "Index Pattern." (4/4/21)
  • Announce plan well in advance of planning - 7.14
    • Start announcing during 7.14 so teams have two releases to plan for 8.0, renaming UI copy
    • Reach out to all teams that use index patterns
    • Send email
    • Petr will do outreach as well.
  • Rename API in code - data.indexPatterns => data.views - 7.16 - Index patterns => data views - rename code #103978
    • deprecate data.indexPatterns and offer data.views
    • don't rename saved object name for now
    • check for index pattern references in API methods, not just top level service name
  • Account for REST API - [data views] data_views REST API #112916
  • Update docs - complete before 8.0 release - [DOCS] Change index pattern to data view #109284
    • Use individual branches (rather than a feature branch), merge to main and 8.0
    • Have each team address their docs
    • Review all changes, together
    • Need to ensure copy still reads well, not an easy copy and paste
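
The "deprecate data.indexPatterns and offer data.views" step could look roughly like the sketch below. This is a minimal illustration, not Kibana's actual plugin contract: `DataViewsService`, `DataPluginStart`, `createDataStart`, and `getTitles` are all hypothetical names. It shows one common approach, where the old name stays as an alias to the same service and logs a one-time warning.

```typescript
// Hypothetical service shape for illustration; not Kibana's real contract.
interface DataViewsService {
  getTitles(): string[];
}

interface DataPluginStart {
  /** New, preferred name. */
  views: DataViewsService;
  /** @deprecated Use `views` instead. */
  indexPatterns: DataViewsService;
}

function createDataStart(
  service: DataViewsService,
  warn: (message: string) => void = (message) => console.warn(message)
): DataPluginStart {
  let warned = false;
  return {
    views: service,
    // The old name remains usable but logs a deprecation warning on first access.
    get indexPatterns(): DataViewsService {
      if (!warned) {
        warned = true;
        warn('data.indexPatterns is deprecated, use data.views instead');
      }
      return service;
    },
  };
}
```

Because both names return the same service object, existing callers keep working during the deprecation window and can migrate one call site at a time.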
@monfera
Contributor

monfera commented Mar 10, 2021

++ for data view (or if data is seen as redundant, view)

@jasonrhodes
Member

I think the goal of "eliminating confusion around the overlapping ES concept of 'index pattern'" is achieved here, so for that, I'm on board. My main concern is just that we will eventually run into a similar confusion if we aren't as specific as possible with what this item is.

Is there a single paragraph/list definition of what this item is, so we can compare the name to that description and make sure it feels as descriptive as possible? I think that'd be a useful exercise before we lock something in.

(I imagine "Index View" is inaccurate because data streams aren't technically indices, etc.)

@monfera
Contributor

monfera commented Mar 10, 2021

This comment doesn't try to define a list for Elastic/Kibana, just looking at View as an industry standard term (several things have views eg. MongoDB, not just SQL/RDBMS).

Getting started initiative: One benefit is that the user can immediately relate to a likely familiar concept (view; and as @mattkime mentions, makes clear it's not a visual view eg. chart). Meanwhile, index pattern has become a strong misnomer over feature accretion, doing more to muddle than to help concept discoverability.

A view is pretty much something

  • that is queryable (not limiting specifically to KQL, Query DSL etc.; eg. Dashboard charts query index patterns now); you can take information from there
  • that typically multiple things query, and that serves over a longer period, but can be transient too
  • which is therefore part of object dependencies (other queries, reports etc. depend on it)
  • that isn't responsible however for storing or representing primary data, it's more of a conduit
  • it provides the conduit function, a portal toward original data via a query definition of its own
  • whose definition (except transient ones) you can persist, maintain and if nothing depends on it any more, delete
  • that permits composability - one view may depend on another (or others)

Very often, the conduit function goes along with one or more of the following:

  • provides data access toward data which might not be directly/physically accessible by users, eg. due to role based authorization
  • renames, simplifies or regularizes field names, for a more stable reporting foundation
  • restricts data access horizontally (prohibits fields or positively lists fields)
  • restricts data access vertically (doesn't necessarily expose the same extent of the data as the places from which it queries)
  • aggregates, restricts granularity
  • enforces authorizations, so users can only access what they're supposed to
  • provides a stable foundation for reports, so that if the physical data (schema or not) changes, only the view query needs to change, rather than the gazillion reports
  • augments data, eg. derive a column from other columns, or compute some lagging indicator or window function
  • enriches with additional data, eg. join user/report visible text translations in various languages for i18n from a smaller dimension table
  • conveys and/or overlays metadata that's necessary for downstream processing such as reports, eg. more specific data types, data domain information, classification eg. ordinal, categorical, quantitative; units of measure / magnitude; some of this metadata can be used by reports for properly formatting values
  • being defined as a logical, declarative entity, it's subject to optimization possibilities down the road, eg. optional materialization/caching of (some) data or metadata; usage statistics collection for performance; query planning and optimization (if exists)
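
Two items from the list above, renaming fields and restricting access horizontally through a positive field list, can be sketched as a small projection function. This is only an illustration of the concept, not anything that exists in Kibana or Elasticsearch: `ViewDefinition`, `project`, and the field names are made up.

```typescript
// Hypothetical view definition; field names are invented for illustration.
interface ViewDefinition {
  allowedFields: string[];                          // positive list of source fields
  renames: { [source: string]: string | undefined }; // source field -> exposed name
}

type Doc = Record<string, unknown>;

// Returns a copy of `doc` that exposes only the allowed fields,
// under their renamed keys where a rename is defined.
function project(view: ViewDefinition, doc: Doc): Doc {
  const out: Doc = {};
  for (const field of view.allowedFields) {
    if (field in doc) {
      out[view.renames[field] ?? field] = doc[field];
    }
  }
  return out;
}
```

With `allowedFields: ['host.name', 'temp']` and `renames: { 'host.name': 'host' }`, a document that also carries a `secret` field comes out with only `host` and `temp`, so reports built on the view never see the underlying field names or the hidden field.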

The concept of view (data view) is non-committal w.r.t. which parts are best implemented in Kibana, vs. deeper in the stack, mostly Elasticsearch, so it retains the flexibility for moving things now in Kibana into ES if/when it becomes useful. Such "late binding" is useful for future implementational leeway.

P.S. link to an older convo here

@elasticmachine
Contributor

Pinging @elastic/kibana-app-services (Team:AppServices)

@mattkime mattkime self-assigned this Mar 16, 2021
@mattkime mattkime added the Feature:Data Views Data Views code and UI - index patterns before 8.0 label Mar 16, 2021
@mattkime
Contributor Author

Quoting @VijayDoshi

"Data Views" is a very loaded term; in RDBMS it means something very specific. Shay came up with something I think might be more representative & less likely to be confused - "Data Workspace". I also thought of "Data Context" - "Workspace" is better IMO. @elastic-jb @mattkime

@ppisljar
Member

i personally wouldn't know what data workspace means and would need to learn it (just as index patterns) where with data views i would have an idea of what it is but might be confused then by additional things we put on there. It does have the benefit of not overlapping with elasticsearch index patterns.

would this cause any confusion with kibana spaces ?

@monfera
Contributor

monfera commented Mar 17, 2021

Yes, Spaces and Workpads sprung to mind too.

@ppisljar I agree with your "wouldn't know" take; also, think there's no need for overloading the meaning of View, relative to the above list. It's tempting to squeeze too much into a generally named object. A sharp name would help strive for a clear entity map of our concepts.

Names are part of all UX and can be major assistance to users and developers for Getting Started.

I ❤️ it when tools make me feel "I can use this right away!" familiar and conversant.

Mongo added views as well (just View, not Data View, which I haven't seen much). We also don't need to let RDBMS appropriate this word 😄

Views are also composable. Eg. one view providing narrower or multi-source access onto another view(s) (@ruflin made this point too). Otherwise there'll be error prone repeated maintenance, and all views (index patterns) need to change whenever the ES index layer changes (DRY principle).
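
To make the composability point concrete, here is a minimal sketch of resolving a view that is defined in terms of other views down to the index patterns it ultimately covers. Everything here is hypothetical (`ComposableView`, `resolveIndices`, the registry shape); it is not how index patterns or data views are implemented, only an illustration of why composition avoids repeating index names in every view.

```typescript
// Hypothetical model: a view lists index patterns and/or other views by name.
interface ComposableView {
  indices: string[];
  views: string[];
}

function resolveIndices(
  name: string,
  registry: Map<string, ComposableView>,
  visiting: Set<string> = new Set<string>()
): string[] {
  const view = registry.get(name);
  if (!view) throw new Error(`Unknown view: ${name}`);
  if (visiting.has(name)) throw new Error(`Circular view dependency at: ${name}`);

  visiting.add(name);
  const result = new Set<string>(view.indices);
  for (const child of view.views) {
    // Child views contribute their own (recursively resolved) indices.
    for (const index of resolveIndices(child, registry, visiting)) {
      result.add(index);
    }
  }
  visiting.delete(name);
  return Array.from(result);
}
```

If the index layer changes, only the leaf view that names the affected indices needs editing; every view composed on top of it picks up the change through resolution.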

@elastic-jb

Workspaces feels a little too generic to me also. That might be the final outcome we want to get to, but it also feels too close to Space to me. Workspace feels like my own private space.

@elastic-jb

elastic-jb commented Mar 17, 2021

Quickly searching around, it's definitely overloaded, but seems to generally represent an intermediary view of some underlying data:

Data View - Salesforce - Data Views are a powerful feature of Salesforce Marketing Cloud. They store subscriber information and the last six months of tracking data for your account.

Data View - Microsoft - Represents a databindable, customized view of a DataTable for sorting, filtering, searching, editing, and navigation.

DataView - Mozilla - The DataView view provides a low-level interface for reading and writing multiple number types in a binary ArrayBuffer, without having to care about the platform's endianness.

Data View - BEA: Oracle - Data views are derived from stored queries. Only one data view can be created from a stored query.

Data View - Google - A read-only view of an underlying DataTable. A DataView allows selection of only a subset of the columns and/or rows. It also allows reordering columns/rows, and duplicating columns/rows.

View - SAP - An SAP HANA view is used to select data, analyze data, or perform calculations with data from the SAP HANA database.

I would think it's safer to side with the general understanding of that concept, even though it's a bit vague. I also felt Data View was too generic, but now I think it might be the best compromise to keep it easy to relate to.

@LeeDr
Contributor

LeeDr commented Mar 17, 2021

I like "Index Group" (which could reference a single or multiple indices). But as @jasonrhodes points out, it implies indices and excludes data streams.

@jasonrhodes also suggested we describe it in a paragraph to help focus on an obvious term.

I think the thing we've been calling "index pattern" is mostly "an index field cache, with custom field formatting". But maybe I'm missing some important functions of it.

I guess "index metadata cache" doesn't roll off the tongue too easily.

@elastic-jb

elastic-jb commented Mar 17, 2021

Found this definition of "Data View" on Simplicable:

A data view is a data structure or visual representation of data that differs from physical data.
Views are often created to make information more relevant, readable and interesting for human consumption.
For example, the comments on an article may be sorted by factors such as the reputation of the contributor or rankings of the comments by users. This adds value to a website as unhelpful or spam comments may be hidden while thoughtful commentary rises to the top.

A view is virtual in nature and differs from the structure and content of data repositories. For example, low quality comments may be stored in a data repository but may never show up in a view for users.

Definition (1) | A structure or visualization of data that differs from a data repository.
Definition (2) | The result set from a stored query in a database that resembles a virtual table.

https://simplicable.com/new/data-view

If that's a generic definition, it seems to fit our usage.

@mattkime
Contributor Author

I'm coming back to Data View - it seems to have an appropriate level of precision and our usage of the term is not out of line with similar usage in other products.

@monfera
Contributor

monfera commented Mar 25, 2021

Hi folks, what would be our next step here? Would be glad to hop on an efficient call to help resolve, if it's pending

@mattkime
Contributor Author

@monfera Next steps are at the bottom of the issue description. We need product agreement on the name; past that it's just estimating and scheduling work.

@rayafratkina
Contributor

rayafratkina commented Mar 26, 2021

@VijayDoshi are you ok going with DataView given @elastic-jb research? I think we need to close out this decision.
Let's aim to make a decision by 3/31

@VijayDoshi

One question before I commit, @elastic-jb - what do other products call the "semantic layer"? I imagine "Data Views" could contain quite a bit of additional metadata eventually: formatting, descriptions, tags, author (for runtime fields), created/modified dates, versions etc. For example, a field "temperature" should be a continuous color from blue to red and should be formatted as ###.## F. Then, every time temperature is used, we always use the specified default formatting.

If it included all of that would you still call it a "Data View"? Probably ok, just trying to think of the future of this thing.
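
The temperature example can be sketched as field-level metadata that any consumer reuses. This is a rough illustration under stated assumptions: `FieldFormat`, `formatValue`, `colorFor`, and the 0 to 100 domain are hypothetical, and the linear blue-to-red interpolation is just one simple way to realize "a continuous color from blue to red".

```typescript
// Hypothetical field-level metadata; none of these names come from Kibana.
interface FieldFormat {
  decimals: number;          // digits after the decimal point
  unit: string;              // suffix appended to the number
  domain: [number, number];  // values mapped to the two ends of the color scale
}

// Assumed defaults for the "temperature" field: ###.## F, scale from 0 to 100.
const temperature: FieldFormat = { decimals: 2, unit: 'F', domain: [0, 100] };

function formatValue(format: FieldFormat, value: number): string {
  return `${value.toFixed(format.decimals)} ${format.unit}`;
}

// Linear interpolation from blue (low end) to red (high end), clamped to the domain.
function colorFor(format: FieldFormat, value: number): string {
  const [min, max] = format.domain;
  const t = Math.min(1, Math.max(0, (value - min) / (max - min)));
  const red = Math.round(255 * t);
  const blue = Math.round(255 * (1 - t));
  return `rgb(${red}, 0, ${blue})`;
}
```

Here `formatValue(temperature, 98.6)` yields `98.60 F`, and `colorFor` returns pure blue at the bottom of the domain and pure red at the top. Storing the defaults once on the field is what lets every chart that uses temperature render it the same way.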

@tsullivan
Member

tsullivan commented Mar 30, 2021

I am also ++ for "Data View." As far as iconography, I think shutter shades would be an excellent choice. They carry the idea of "visual representation of data that differs from physical data," and they look hip. :D

@monfera
Contributor

monfera commented Mar 31, 2021

@VijayDoshi Thanks for these examples! The metadata model is structured in other systems too, whether such constituents have user facing names or not. Most items you listed are for the field level granularity and fit in the Data View concept. Eg. for the field granularity, SAP distinguishes between domains and data elements (not suggesting these names, just examples):

"The domain is used for the technical definition of a table field such as field type and length, and the data element is used for the semantic definition (short description). A data element describes the meaning..."

The discrete vs continuous and other semantic categorizations are important too, so charts and reports can be generated easily, the user getting good defaults specified by their data owners or automatically. It'll be possible to link shared scales (eg. as in your temperature color example) to fields; such sharing is found in Beyond palettes: shared visual attributes.

It's useful to eventually name our field level concepts going forward in our stack too. Current Elasticsearch metadata and physical properties are not enough for an ideal recommender.

One of the hopeful outcomes with the Data View concept is that we together get to refine

  • field level aspects of the model (whether we name them separately or not) e.g. the domain of the data, tags etc. in line with a common vocabulary approach you've proposed; these also help correlate data from disparate indices, which may share not just a physical representation but also, meaning/sense
  • higher level aspects, eg. across fields, like one field is a hierarchical breakdown of another, vs independent; often, the denormalized storage format obscures functional dependency or implicit schema elements needed for analytics

These need a long term, sustained effort, I trust that the current reinterpretation of index patterns is conducive to it.

@VijayDoshi

Can we call this done? Decision : "Data View"?

Separately, @monfera - Across-field attributes are an interesting and different concept from most RDBMS attributes. Can you provide some examples of when this would be useful (like my temperature example)? Perhaps a BI example would be to use default color palette A for customers and color palette B for orders? Tableau allows you to create arbitrary hierarchies; would we want to express something like that in the Data View, or would we always rely on the physical hierarchy since we have the structure already in the document?

@elastic-jb

+1 from me. "Data View"

@jasonrhodes
Member

Sounds great!

@elastic-jb

I have not heard any objection, so we will close the naming portion of this issue with the term "Data View" to replace "Index Pattern."

@monfera
Contributor

monfera commented Apr 15, 2021

@VijayDoshi thanks for the question, the examples you listed are what i had in mind too, here's a list of other kinds of metadata across fields which would be great for visualization: #97278 And Product Managers, Lens and KibanaApp implementor folks may have a bunch of related notions on metadata, it's currently pretty hard to generate good defaults and chart recommendations

@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort and removed loe:small Small Level of Effort labels Apr 26, 2021
@exalate-issue-sync exalate-issue-sync bot added the loe:medium Medium Level of Effort label May 5, 2021
@petrklapka petrklapka added 1 and removed 1 labels May 6, 2021
@pgayvallet
Contributor

"don't rename saved object name for now"

Wondering, are we planning / expecting to be able to do that? Just want to know if #91143 is going to be a hard requirement on that one.

@mattkime
Contributor Author

mattkime commented Jun 7, 2021

Quoting @pgayvallet:

"don't rename saved object name for now"

"Wondering, are we planning / expecting to be able to do that? Just want to know if #91143 is going to be a hard requirement on that one."

While it would be nice to address, this is a low priority. We thought we'd do this 'someday'

@timroes
Contributor

timroes commented Aug 19, 2021

@mattkime A couple of questions that are not clear to me from this issue:

  1. Does the docs team do the update of all our documentation regarding index pattern wording or is this something expected to be done by the individual teams (since it just mentions: "Have each team address UI")?
  2. Will the keys of the capabilities regarding indexPatterns also change or will we keep using indexPattern and IndexPatternManagement, etc. as capability keys?

@gchaps
Contributor

gchaps commented Aug 19, 2021

#109284 lists the docs that require updating. The docs team would appreciate help from the individual teams.

@mattkime
Contributor Author

As @gchaps states, teams should address docs too.

"Will the keys of the capabilities regarding indexPatterns also change or will we keep using indexPattern and IndexPatternManagement, etc. as capability keys?"

We'll create new keys and deprecate the old ones but at this point there's no action required.
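
While both sets of keys coexist, a consumer could read the new key first and fall back to the deprecated one. The sketch below is only an assumption about how that might look: the thread does not name the new keys, so `dataViews` is a hypothetical key, `indexPatterns` stands in for the old one, and `Capabilities` and `canSave` are illustrative names.

```typescript
// Hypothetical capability map: section name -> action -> allowed?
type Capabilities = Record<string, Record<string, boolean> | undefined>;

function canSave(capabilities: Capabilities): boolean {
  const current = capabilities['dataViews'];     // assumed new key (not stated in the thread)
  const legacy = capabilities['indexPatterns'];  // deprecated key, kept as a fallback
  // Prefer the new section when present; otherwise fall back to the old one.
  return (current ?? legacy)?.save ?? false;
}
```

The new key wins whenever it is present, even if it denies the action, so the deprecated key only matters for setups that have not been migrated yet.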

@mattkime
Contributor Author

@gchaps How should we coordinate doc updates? The release process for docs is different than our kibana binaries.

@gchaps
Contributor

gchaps commented Aug 19, 2021

We can use #109284 to coordinate updates. That issue lists all the docs that need updating, arranged by group. Each group should have an owner--I assigned Kaarina and myself the intro docs. Ping kibana docs for review when the PR for your area is ready. The update should be complete by feature freeze.

@mattkime
Contributor Author

completed
