Modify End Use by Subcategory tables in SQL so they can be queried #7584

Merged
merged 9 commits into develop from PR_opened/7481_EndUseBySubcategory_SQL
Jan 8, 2020

jmarrec
Contributor

@jmarrec jmarrec commented Oct 28, 2019

Pull request overview

This Pull Request properly fills the rowHead (basically a forward fill of the End Use values) before outputting to SQL, but after the HTML file has already been written, so the HTML output is not affected.
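As a rough illustration of the forward fill described above (a minimal Python sketch, not the actual EnergyPlus C++ change; the function name `forward_fill_row_head` and the sample row names are made up for this example), each blank row header is replaced with the last non-blank one:

```python
def forward_fill_row_head(row_head):
    """Return a copy of row_head where each blank entry repeats the last non-blank one."""
    filled = []
    last_seen = ""
    for name in row_head:
        if name.strip():
            last_seen = name
        filled.append(last_seen)
    return filled


# Row headers as shown in the HTML table: a blank means "same end use as the row above".
# These names are illustrative only.
html_row_head = ["Heating", "Interior Lighting", "", "Exterior Lighting"]

# A filled copy is what would go to SQL; the HTML list itself is left untouched.
sql_row_head = forward_fill_row_head(html_row_head)
```

Working on a copy mirrors the intent stated above: the blanks stay in the HTML output, and only the SQL output gets the filled names.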


Before:

If you look at the HTML result, you see that rowHead is empty for display purposes:

image

This means the SQL one looks very similar, with empty "RowName" cells:

image


After: All RowNames are filled

image

It's still not terribly straightforward to get a specific entry for a subcategory end use, but it's doable for sure (and quite fast). I personally like this better than combining both End Use and End Use Subcat in the same RowName, or worse adding a new column (which would create way too many "NULLs").

image

SELECT * FROM TabularDataWithStrings
  WHERE TableName = "End Uses By Subcategory"
  AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
  AND RowName = "Interior Lighting"
  AND (TabularDataIndex - (SELECT TabularDataIndex FROM TabularDataWithStrings
                              WHERE TableName = "End Uses By Subcategory"
                              AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
                              AND ColumnName = "Subcategory"
                              AND RowName = "Interior Lighting"
                              AND Value = "GeneralLights"))
      % (SELECT COUNT(Value) FROM TabularDataWithStrings
                              WHERE TableName = "End Uses By Subcategory"
                              AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
                              AND ColumnName = "Subcategory") = 0
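
To see why the modulo works, here is a small Python sketch against an in-memory SQLite database. It rests on several assumptions: a plain table stands in for the real TabularDataWithStrings view, only the columns used by the query are modelled, the values and the "Heating", "General" and "AnotherEndUseSubCat" names are made up, and TabularDataIndex is assumed to increase column by column (which is what the query relies on, since the number of "Subcategory" entries is used as the stride between columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a simplified plain table stands in for the real view.
conn.execute(
    "CREATE TABLE TabularDataWithStrings ("
    " TabularDataIndex INTEGER PRIMARY KEY, ReportName TEXT, TableName TEXT,"
    " RowName TEXT, ColumnName TEXT, Value TEXT)"
)

report = "AnnualBuildingUtilityPerformanceSummary"
table = "End Uses By Subcategory"
row_names = ["Heating", "Interior Lighting", "Interior Lighting"]  # forward-filled
columns = {  # made-up values, one list entry per table row
    "Subcategory": ["General", "GeneralLights", "AnotherEndUseSubCat"],
    "Electricity": ["0.00", "10.00", "5.00"],
    "Natural Gas": ["20.00", "0.00", "0.00"],
}

# Assumption: indices are assigned column by column, so the same table row
# reappears every len(row_names) indices.
index = 1
for column_name, values in columns.items():
    for row_name, value in zip(row_names, values):
        conn.execute(
            "INSERT INTO TabularDataWithStrings VALUES (?, ?, ?, ?, ?, ?)",
            (index, report, table, row_name, column_name, value),
        )
        index += 1

query = """
SELECT ColumnName, Value FROM TabularDataWithStrings
  WHERE TableName = 'End Uses By Subcategory'
  AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'
  AND RowName = 'Interior Lighting'
  AND (TabularDataIndex - (SELECT TabularDataIndex FROM TabularDataWithStrings
                             WHERE TableName = 'End Uses By Subcategory'
                             AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'
                             AND ColumnName = 'Subcategory'
                             AND RowName = 'Interior Lighting'
                             AND Value = 'GeneralLights'))
      % (SELECT COUNT(Value) FROM TabularDataWithStrings
           WHERE TableName = 'End Uses By Subcategory'
           AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'
           AND ColumnName = 'Subcategory') = 0
  ORDER BY TabularDataIndex
"""
rows = conn.execute(query).fetchall()
for column_name, value in rows:
    print(column_name, value)
```

With this toy layout the "GeneralLights" entry sits at index 2 and there are 3 table rows, so the query keeps indices 2, 5 and 8: the Subcategory cell plus one cell per fuel for that subcategory.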

NOTE: ENHANCEMENTS MUST FOLLOW A SUBMISSION PROCESS INCLUDING A FEATURE PROPOSAL AND DESIGN DOCUMENT PRIOR TO SUBMITTING CODE

Pull Request Author

Add to this list or remove from it as applicable. This is a simple templated set of guidelines.

  • Title of PR should be user-synopsis style (clearly understandable in a standalone changelog context)
  • Label the PR with at least one of: Defect, Refactoring, NewFeature, Performance, and/or DoNotPublish
  • Pull requests that impact EnergyPlus code must also include unit tests to cover enhancement or defect repair
  • Author should provide a "walkthrough" of relevant code changes using a GitHub code review comment process
  • If any diffs are expected, author must demonstrate they are justified using plots and descriptions
  • If changes fix a defect, the fix should be demonstrated in plots and descriptions
  • If any defect files are updated to a more recent version, upload new versions here or on DevSupport
  • If IDD requires transition, transition source, rules, ExpandObjects, and IDFs must be updated, and add IDDChange label
  • If structural output changes, add to output rules file and add OutputChange label
  • If adding/removing any LaTeX docs or figures, update that document's CMakeLists file dependencies

Reviewer

This will not be exhaustively relevant to every PR.

  • Perform a Code Review on GitHub
  • If branch is behind develop, merge develop and build locally to check for side effects of the merge
  • If defect, verify by running develop branch and reproducing defect, then running PR and reproducing fix
  • If feature, test running new feature, try creative ways to break it
  • CI status: all green or justified
  • Check that performance is not impacted (CI Linux results include performance check)
  • Run Unit Test(s) locally
  • Check any new function arguments for performance impacts
  • Verify IDF naming conventions and styles, memos and notes and defaults
  • If new idf included, locally check the err file and other outputs

@jmarrec jmarrec changed the title from "Pr opened/7481 end use by subcategory sql" to "Modify End Use by Subcategory tables in SQL so they can be queried" Oct 28, 2019
@jmarrec jmarrec added the Defect (Includes code to repair a defect in EnergyPlus) label Oct 28, 2019

// Add informative message if failed
EXPECT_EQ(endUseName, result[0][0]) << "Failed for reportName=" << reportName;
}
Contributor Author

Check that the "RowName" isn't empty anymore

" AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'"
" AND ColumnName = 'Subcategory') = 0",
"TabularDataWithStrings");
ASSERT_EQ(7u, result.size());
Contributor Author

This query works in one step to query all entries for a specific end use subcategory.

" WHERE TableName = 'End Uses By Subcategory'"
" AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'"
" AND RowName = '" + endUseName + "'"
" AND ColumnName = 'Electricity'"
Contributor Author

This is the addition here: query for ONE fuel, and return only "Value" rather than "*".
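
For readers who want to try the single-fuel query outside the unit test, here is a hedged Python sketch. It assumes a simplified plain table in place of the real TabularDataWithStrings view, made-up values, and a hypothetical helper name `values_for_fuel`; it also uses bound parameters instead of the string concatenation seen in the C++ test fragment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a simplified plain table stands in for the real view.
conn.execute(
    "CREATE TABLE TabularDataWithStrings ("
    " TabularDataIndex INTEGER PRIMARY KEY, ReportName TEXT, TableName TEXT,"
    " RowName TEXT, ColumnName TEXT, Value TEXT)"
)

REPORT = "AnnualBuildingUtilityPerformanceSummary"
TABLE = "End Uses By Subcategory"

# Made-up data with RowName already forward-filled.
conn.executemany(
    "INSERT INTO TabularDataWithStrings VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, REPORT, TABLE, "Heating", "Electricity", "0.00"),
        (2, REPORT, TABLE, "Interior Lighting", "Electricity", "10.00"),
        (3, REPORT, TABLE, "Interior Lighting", "Electricity", "5.00"),
        (4, REPORT, TABLE, "Heating", "Natural Gas", "20.00"),
        (5, REPORT, TABLE, "Interior Lighting", "Natural Gas", "0.00"),
        (6, REPORT, TABLE, "Interior Lighting", "Natural Gas", "0.00"),
    ],
)


def values_for_fuel(connection, end_use_name, fuel):
    """Return the Value of every subcategory row of one end use, for one fuel column."""
    cursor = connection.execute(
        "SELECT Value FROM TabularDataWithStrings"
        " WHERE TableName = ? AND ReportName = ?"
        " AND RowName = ? AND ColumnName = ?"
        " ORDER BY TabularDataIndex",
        (TABLE, REPORT, end_use_name, fuel),
    )
    return [value for (value,) in cursor]


electricity = values_for_fuel(conn, "Interior Lighting", "Electricity")
print(electricity)
```

Because RowName is no longer blank for the second and later subcategories, both Interior Lighting rows come back for the requested fuel.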

@shorowit
Contributor

@jmarrec Thank you! I'll be able to get rid of my super complicated, hacky code...

@nrel-bot-3

@jmarrec @lgentile it has been 28 days since this pull request was last updated.

@Myoldmopar
Member

@shorowit have you tested this to confirm it fixes your issue? @kbenne I'd normally contact DMac with things like this to give the OS team a heads up. Can you advise who should be notified now?

@shorowit
Contributor

shorowit commented Dec 11, 2019

I have not tested it, only looked at @jmarrec's screenshots.

That said, now that I get fresh eyes on this... the PR is an incremental improvement that makes it slightly easier to query, but the query is still pretty ugly. It's fine to have this drop in, but I don't feel that the issue has been substantially addressed. The query to obtain this kind of information should not have to be so complicated.

Compare this PR's query:

SELECT * FROM TabularDataWithStrings
  WHERE TableName = "End Uses By Subcategory"
  AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
  AND RowName = "Interior Lighting"
  AND (TabularDataIndex - (SELECT TabularDataIndex FROM TabularDataWithStrings
                              WHERE TableName = "End Uses By Subcategory"
                              AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
                              AND ColumnName = "Subcategory"
                              AND RowName = "Interior Lighting"
                              AND Value = "GeneralLights"))
      % (SELECT COUNT(Value) FROM TabularDataWithStrings
                              WHERE TableName = "End Uses By Subcategory"
                              AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
                              AND ColumnName = "Subcategory") = 0

with the two queries I was envisioning as possibilities:

SELECT * FROM TabularDataWithStrings
  WHERE TableName = "End Uses By Subcategory"
  AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
  AND RowName = "Interior Lighting"
  AND SubRowName = "GeneralLights"

or

SELECT * FROM TabularDataWithStrings
  WHERE TableName = "End Uses By Subcategory"
  AND ReportName = "AnnualBuildingUtilityPerformanceSummary"
  AND RowName = "Interior Lighting:GeneralLights"

@jmarrec
Contributor Author

jmarrec commented Dec 12, 2019

Extracting the relevant comment from my original PR message:

It's still not terribly straightforward to get a specific entry for a subcategory end use, but it's doable for sure (and quite fast). I personally like this better than combining both End Use and End Use Subcat in the same RowName, or worse adding a new column (which would create way too many "NULLs").

The TabularDataWithStrings is a SQL heresy (note that I do understand why it is like that): it's a View that's aiming to be generic enough to store all kinds of data, and bends the rules of database design. This kind of problem wouldn't occur if this data was in its own SQL TABLE.

@mjwitte
Contributor

mjwitte commented Dec 12, 2019

@jmarrec Would it be useful to duplicate the end-use information in a second well-designed table? Not necessarily as part of this PR.

@shorowit
Contributor

Personally, I don’t see why adding a new End Use SubCat column that is often null is so bad. It seems like the most generic and least obtrusive change. And it would be non-breaking.

Adding a separate table would be fine, but seems silly to query end use data one place and end use by subcategory data somewhere else.

My two cents.

@jmarrec
Contributor Author

jmarrec commented Dec 12, 2019

I don't like adding 10,000 NULLS for what's really a corner case.

Ok, would this make you happier?

image

I deleted the "ColumnName == Subcategory" entries (should I leave them?), and each RowName is now "EndUse:Subcategory".

That'd make your query work:

image
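
A minimal Python sketch of what the combined RowName enables, assuming the "EndUse:Subcategory" layout proposed in this comment. The plain table standing in for the real view, the values, and the "Heating:General" and "AnotherEndUseSubCat" names are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a simplified plain table stands in for the real view.
conn.execute(
    "CREATE TABLE TabularDataWithStrings ("
    " TabularDataIndex INTEGER PRIMARY KEY, ReportName TEXT, TableName TEXT,"
    " RowName TEXT, ColumnName TEXT, Value TEXT)"
)

report = "AnnualBuildingUtilityPerformanceSummary"
table = "End Uses By Subcategory"

# Made-up data: every RowName carries "EndUse:Subcategory", no Subcategory column.
conn.executemany(
    "INSERT INTO TabularDataWithStrings VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, report, table, "Heating:General", "Electricity", "0.00"),
        (2, report, table, "Interior Lighting:GeneralLights", "Electricity", "10.00"),
        (3, report, table, "Interior Lighting:AnotherEndUseSubCat", "Electricity", "5.00"),
        (4, report, table, "Heating:General", "Natural Gas", "20.00"),
        (5, report, table, "Interior Lighting:GeneralLights", "Natural Gas", "0.00"),
        (6, report, table, "Interior Lighting:AnotherEndUseSubCat", "Natural Gas", "0.00"),
    ],
)

# One subcategory, all fuels: a plain equality test on RowName is enough.
general_lights = conn.execute(
    "SELECT ColumnName, Value FROM TabularDataWithStrings"
    " WHERE TableName = 'End Uses By Subcategory'"
    " AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'"
    " AND RowName = 'Interior Lighting:GeneralLights'"
    " ORDER BY TabularDataIndex"
).fetchall()

# All subcategories of one end use, one fuel: a prefix match on RowName.
lighting_electricity = [
    value
    for (value,) in conn.execute(
        "SELECT Value FROM TabularDataWithStrings"
        " WHERE TableName = 'End Uses By Subcategory'"
        " AND ReportName = 'AnnualBuildingUtilityPerformanceSummary'"
        " AND RowName LIKE 'Interior Lighting:%'"
        " AND ColumnName = 'Electricity'"
        " ORDER BY TabularDataIndex"
    )
]
```

No subqueries or index arithmetic are needed in this layout; the trade-off is that consumers who want the end use and the subcategory separately have to split RowName on the colon.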

@shorowit
Contributor

Not sure how others would feel, but I would be happy with this solution, yes.

@jmarrec
Contributor Author

jmarrec commented Dec 12, 2019

@shorowit See the updated solution. Check the markdown; I included example queries and table outputs too.

@shorowit
Contributor

I built this locally and confirmed that the results look correct and the data is now easy to query. @jmarrec, thanks so much for making this change.

@Myoldmopar
Member

This is really great. Verified functional improvement from @shorowit, unit test coverage, and documented output changes in the markdown. Thanks @jmarrec; merging this in.

@Myoldmopar Myoldmopar merged commit 6bb670b into develop Jan 8, 2020
@Myoldmopar Myoldmopar deleted the PR_opened/7481_EndUseBySubcategory_SQL branch January 8, 2020 18:04
Labels
Defect Includes code to repair a defect in EnergyPlus
Development

Successfully merging this pull request may close these issues.

Difficult to query End Uses By Subcategory tables in SQLite
9 participants