Skip to content
This repository has been archived by the owner on Aug 9, 2022. It is now read-only.

Refactoring saved search reporting APIs #73

Conversation

dai-chen
Copy link
Member

@dai-chen dai-chen commented Sep 15, 2020

Issue #, if available:

Description of changes:

Functional

  1. Merge the 2 saved search export APIs to a single one: This avoids the read/write of intermediate meta data to an index. After merge, there is only a single API which reads Kibana and user index to generate report.
  2. Remove previous nbRows and scrollSize argument in generating report API: Because no such option is provided for customer, fetching data by search or scroll is determined internally by data size and max result window setting.
  3. Delete old data report files and tests.

Example: Here is a sample request in which only report_name, report_format, time_from, time_to and saved_search_id are interested:

  1. The first two are used to generate file name, ex. the file name for the following request is test report table order_2020-09-16T18:53:26.909Z_e78bcad0-f84d-11ea-9b2e-8572650d15ab.csv
  2. The other 3 arguments are used to query saved search info
curl -XPOST "http://localhost:5601/api/reporting/generateReport" -H "kbn-xsrf: reporting" -H 'Content-Type: application/json' -d'
{
  "query_url": "http://localhost:5601/app/kibana#/discover/ddd8f430-f2ef-11ea-8c86-81a0b21b4b67",
  "time_from": 1343576635300,
  "time_to": 1596037435301,
  "report_definition": {
    "report_params": {
      "report_name": "test report table order",
      "report_source": "Saved search",
      "description": "Hi this is your saved search on demand",
      "core_params": {
        "base_url": "http://localhost:5601/app/kibana#/discover/ddd8f430-f2ef-11ea-8c86-81a0b21b4b67",
        "saved_search_id": "ddd8f430-f2ef-11ea-8c86-81a0b21b4b67",
        "report_format": "csv",
        "time_duration": "PT5M"
      }
    },
    "delivery": {
      "recipients": ["kibanaUser:dfajfopdasf"],
      "title": "fake title",
      "description": {
        "text": "dasdasdasd"
      }
    },
    "trigger": {
      "trigger_type": "On demand"
    }
  }
}'

Non-functional

  1. An optional excel boolean option is added to avoid CSV injection for Excel.
  2. An extra limit param is added to core_params section. It's optional with default value 10k and will truncate result set to the limit given to avoid OOM.
curl -XPOST "http://localhost:5601/api/reporting/generateReport" -H "kbn-xsrf: reporting" -H 'Content-Type: application/json' -d'
{
  "query_url": "http://localhost:5601/app/kibana#/search/ddd8f430-f2ef-11ea-8c86-81a0b21b4b67",
  "time_from": 1343576635300,
  "time_to": 1596037435301,
  "report_definition": {
    "report_params": {
      ...
      "core_params": {
        ...
        "limit": 20,
        "excel": true,
      }
    },
    ...
  }
}'

{"data":"0.category,0.customer_gender,1.category,1.customer_gender,2.category,2.customer_gender,3.category,3.customer_gender,4.category,4.customer_gender,5.category,5.customer_gender,6.category,6.customer_gender,7.category,7.customer_gender,8.category,8.customer_gender,9.category,9.customer_gender,10.category,10.customer_gender,11.category,11.customer_gender,12.category,12.customer_gender,13.category,13.customer_gender,14.category,14.customer_gender,15.category,15.customer_gender,16.category,16.customer_gender,17.category,17.customer_gender,18.category,18.customer_gender,19.category,19.customer_gender\nMen's Clothing,MALE,Women's Clothing,FEMALE,Women's Clothing,FEMALE,Women's Clothing,FEMALE,Men's Accessories,MALE,Women's Clothing,FEMALE,Men's Clothing,MALE,Men's Clothing,MALE,Women's Clothing,FEMALE,Women's Shoes,FEMALE,Women's Shoes,FEMALE,Men's Clothing,MALE,Men's Clothing,MALE,Men's Clothing,MALE,Women's Accessories,MALE,Men's Accessories,MALE,Men's Shoes,MALE,Men's Clothing,MALE,Men's Clothing,MALE,Women's Accessories,FEMALE","filename":"test report table order_2020-09-16T23:20:35.700Z_39733340-f873-11ea-ab8c-134850f49b36.csv"}

Testing: A new UT is added with mock ES client. Basic export by search/scroll code path is covered.

.../plugins/kibana-reports$ yarn test savedSearchReportHelper.test.ts
 PASS  server/routes/utils/__tests__/savedSearchReportHelper.test.ts
  test create saved search report
    ✓ create report with valid input (9 ms)
    ✓ create report with expected file name (9 ms)
    ✓ create report with expected file name extension (4 ms)
    ✓ create report for empty data set (2 ms)
    ✓ create report for small data set by single search (8 ms)
    ✓ create report for large data set by scroll (3 ms)
    ✓ create report with limit smaller than max result size (3 ms)
    ✓ create report with limit greater than max result size (2 ms)
    ✓ create report with limit greater than total result size (3 ms)
    ✓ create report for data set with comma (4 ms)
    ✓ create report by sanitizing data set for Excel (3 ms)
    ✓ create report by not sanitizing data set for Excel (2 ms)

Not implemented yet:

  1. Automate integration test: For now I test this manually by creating saved search

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dai-chen dai-chen changed the title Single saved search export api Refactor saved search export APIs Sep 15, 2020
@dai-chen dai-chen marked this pull request as ready for review September 15, 2020 23:59
@dai-chen dai-chen self-assigned this Sep 16, 2020
@dai-chen dai-chen added the maintenance Improves code quality, but not the product label Sep 16, 2020
@dai-chen dai-chen changed the title Refactor saved search export APIs Refactoring saved search reporting APIs Sep 16, 2020
Copy link
Member

@zhongnansu zhongnansu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we delete dataReport.ts and dateReportMetadata.ts, it seems like they are no longer being used

@dai-chen
Copy link
Member Author

Should we delete dataReport.ts and dateReportMetadata.ts, it seems like they are no longer being used

Thanks for reminding! Removed these two and their test file.

@dai-chen
Copy link
Member Author

@akbhatta @davidcui-amzn @zhongnansu Addressed all your comments. I found that the current json-2-csv lib doesn't provide option for Excel sanitize. So I made a little more changes to avoid CSV injection by following same approach in SQL: opendistro-for-elasticsearch/sql#447.

Copy link
Member

@zhongnansu zhongnansu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

Copy link
Contributor

@davidcui1225 davidcui1225 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just one minor question left in comments

@dai-chen dai-chen merged commit 99392bd into opendistro-for-elasticsearch:dev Sep 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
maintenance Improves code quality, but not the product
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants