API endpoint to get metadata of a brew #2638

Closed
DSPaul opened this issue Jan 24, 2023 · 19 comments · Fixed by #3481
Labels
solution found A solution exists; just needs to be applied

Comments

@DSPaul

DSPaul commented Jan 24, 2023

Your idea:

I am working on a library app that would allow users to import official RPG rulebooks and homebrew content from all over the web into one place where they can sort, filter, etc. based on metadata like author, release date, etc. To get metadata for Homebrewery documents, I currently scrape the https://homebrewery.naturalcrit.com/share/:id endpoint and get the metadata in JSON format with this little bit of C# code:

// Select the script tag that holds all the metadata in JSON format
string script = src.SelectSingleNode("/html/body/script[2]").InnerText;
// The JSON is wrapped in a start_app(...) call, so strip the
// 10-character "start_app(" prefix and the trailing ")"
string rawData = script[10..^1];
JObject metadata = JObject.Parse(rawData);

However, this would break easily if the client/template.js file were changed, so a new endpoint like https://homebrewery.naturalcrit.com/info/:id that returns all the metadata as JSON would eliminate this rather janky approach. I'm sure other third-party projects could also benefit from this. If there is already a way to do this with the current API, please tell me; it isn't really documented anywhere, so I could have missed it.
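For anyone following the same scraping route, a slightly less brittle variant of the snippet above can be sketched in Python: instead of slicing at fixed offsets, it looks for the start_app( marker and lets the JSON parser stop at the end of the first value. The function name and the sample payload are made up for illustration; the only thing taken from this thread is that the page wraps its JSON in a start_app(...) call.

```python
import json
from typing import Any


def extract_start_app_payload(script_text: str) -> Any:
    """Return the JSON value passed to start_app(...) in a script's text."""
    marker = "start_app("
    start = script_text.find(marker)
    if start == -1:
        raise ValueError("no start_app( call found in script text")

    # Skip any whitespace between "(" and the JSON value, because
    # raw_decode expects the value to begin exactly at the given index.
    pos = start + len(marker)
    while pos < len(script_text) and script_text[pos].isspace():
        pos += 1

    # raw_decode parses one JSON value and ignores what follows it,
    # so a trailing ")" or ";" does not need to be trimmed by hand.
    payload, _end = json.JSONDecoder().raw_decode(script_text, pos)
    return payload


# Hypothetical payload shape, for illustration only.
sample = 'start_app({"brew": {"title": "Example Brew", "pageCount": 3}});'
metadata = extract_start_app_payload(sample)
print(metadata["brew"]["title"])
```

This still depends on the page embedding its data this way, so it only softens the fragility described above rather than removing it.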

@ericscheid
Collaborator

You would be better off with https://homebrewery.naturalcrit.com/download/:id, which returns the raw source with no UI or fiddling (unlike /share/:id or even /source/:id).

@G-Ambatte
Collaborator

G-Ambatte commented Jan 24, 2023

I had a bit of a poke at the library app; it appears that it's just scraping the brew's metadata.

image

Homebrewery doesn't force updates to brews in storage, so older brews may not have every metadata option in the code-fenced metadata block in the brew text. For example, pageCount is a relatively new metadata feature and may not exist on every brew. Similarly, the code-fenced metadata block itself (which is visible via the /download/:id endpoint) will not exist on older brews, that is, brews that have not been edited or updated since the change went live.
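A client reading the /download/:id output therefore has to cope with both cases. Below is a rough Python sketch of that handling. It assumes the block is a leading fence tagged "metadata" containing plain key: value lines; the real block may use richer syntax (lists, nested values), in which case a proper parser would be needed. The function name is hypothetical.

```python
from typing import Dict, Tuple

# Three backticks, built at runtime so this example can itself sit in a fence.
FENCE = "`" * 3


def split_metadata(brew_text: str) -> Tuple[Dict[str, str], str]:
    """Split a leading metadata fence from the rest of the brew text.

    Returns ({}, brew_text) unchanged when no metadata block is present,
    which is the expected case for older brews.
    """
    opener = FENCE + "metadata\n"
    if not brew_text.startswith(opener):
        return {}, brew_text

    # Search from the opener's own newline so an empty block is found too.
    end = brew_text.find("\n" + FENCE, len(opener) - 1)
    if end == -1:
        return {}, brew_text  # unterminated fence: treat as no metadata

    block = brew_text[len(opener):end]
    rest = brew_text[end + len("\n" + FENCE):].lstrip("\n")

    meta: Dict[str, str] = {}
    for line in block.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that are not "key: value"
            meta[key.strip()] = value.strip()  # values stay as strings
    return meta, rest


newer = FENCE + "metadata\ntitle: My Brew\npageCount: 4\n" + FENCE + "\n\n# Heading\n"
older = "# Old brew with no metadata block\n"
print(split_metadata(newer)[0])
print(split_metadata(older)[0])
```

Missing keys such as pageCount simply do not appear in the returned dictionary, so callers should use lookups with a default rather than indexing directly.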

A better API endpoint would be for Homebrewery to implement a /metadata/:id endpoint which returns a JSON object with only the brew metadata - title, authors, version, pageCount, description, thumbnail image URL, publication status, and so on.

However, something to consider for the library app itself is that Homebrewery is completely open source and free, and anyone can run it locally on just about any OS (Windows, macOS, Ubuntu, Debian, FreeBSD, RaspBian, or anything that will run a Docker image), so there is no guarantee that a user's Homebrewery document will always exist at https://homebrewery.naturalcrit.com/share/:id.

@DSPaul
Author

DSPaul commented Jan 25, 2023

Thanks a lot for helping me with this and even checking out my code. I will check out the /download/:id and /source/:id endpoints; I did not know they were a thing. It might be worth writing a quick wiki entry with a list of the endpoints and a one-sentence explanation of what each does, because they seem to be quite scattered in the code, so it's easy to miss one.

It's also good that you told me that older brews might lack some data, I only tested it with a handful of documents so I never ran into problems but others might. I'll change my code to account for that. A /metadata/:id endpoint that always has all the metadata fields, even if some of them are empty, would be perfect.
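Until such an endpoint exists, the "always has every field" behaviour can be approximated on the client. The Python sketch below fills in defaults for whatever is missing. The key names are guesses based on the fields listed earlier in this thread (title, authors, version, pageCount, description, thumbnail, publication status), not a confirmed schema.

```python
import copy
from typing import Any, Dict

# Assumed field names and defaults; adjust to the real schema.
DEFAULTS: Dict[str, Any] = {
    "title": "",
    "authors": [],
    "version": None,
    "pageCount": None,
    "description": "",
    "thumbnail": "",
    "published": False,
}


def normalize_metadata(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a dict with exactly the keys in DEFAULTS.

    Missing or null values are replaced by a copy of the default;
    keys not listed in DEFAULTS are dropped.
    """
    result: Dict[str, Any] = {}
    for key, default in DEFAULTS.items():
        value = raw.get(key)
        # deepcopy so callers never share the mutable default list
        result[key] = copy.deepcopy(default) if value is None else value
    return result


old_brew = {"title": "Ancient Brew", "authors": ["someone"]}
print(normalize_metadata(old_brew))
```

With this in place the rest of the app can rely on every key being present and only has to decide how to display an empty value.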

As for Homebrewery deployments that are not on the https://homebrewery.naturalcrit.com domain, would there be an easy way to verify that a given URL is a valid Homebrewery deployment? I guess not, because the source code of a self-deployed version could have been changed, so any check I add might no longer hold. I think I will keep the domain check by default for now, so that the vast majority of users, who use the official deployment, get a warning if they make a typo or formatting mistake, and add a checkbox that disables the domain check for self-hosted users who know what they are doing.
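That default-on domain check with an opt-out might look roughly like this in Python. The function and parameter names are hypothetical, and the only URL shape assumed is the /share/:id form mentioned in this thread.

```python
from typing import Optional
from urllib.parse import urlparse

OFFICIAL_HOST = "homebrewery.naturalcrit.com"


def extract_share_id(url: str, allow_self_hosted: bool = False) -> Optional[str]:
    """Return the brew id from a /share/:id URL, or None if the URL is rejected.

    By default only the official host is accepted; allow_self_hosted=True
    corresponds to the checkbox that disables the domain check.
    """
    try:
        parsed = urlparse(url.strip())
    except ValueError:  # e.g. malformed IPv6 literal
        return None

    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return None
    # urlparse already lowercases hostname, so a plain comparison is enough
    if not allow_self_hosted and parsed.hostname != OFFICIAL_HOST:
        return None

    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) == 2 and parts[0] == "share":
        return parts[1]
    return None


print(extract_share_id("https://homebrewery.naturalcrit.com/share/abc123"))
print(extract_share_id("http://localhost:8000/share/abc123"))
print(extract_share_id("http://localhost:8000/share/abc123", allow_self_hosted=True))
```

A None result is where the app would show the typo or formatting warning.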

@5e-Cleric
Member

@DSPaul Is that project still being worked on? Can we get an update? Should we close this issue if it's not going to be worked on?

@DSPaul
Author

DSPaul commented May 17, 2024

Yes, I am still actively working on the project and still using the scraping method I described above. It has proven reliable enough, as it hasn't broken in the past year. I would still like to see this implemented, so I wouldn't close the issue, but given that there is a working workaround, you can treat this as very low priority.

@5e-Cleric
Member

5e-Cleric commented May 17, 2024

> As for homebrewery deployments that are not on the https://homebrewery.naturalcrit.com domain, would there be an easy way to verify if any given URL is a valid homebrewery deployment? I guess not because the source code of the self deployed version could have been changed so any check I would do might not hold true any more so I think I will keep the domain check by default for now so that the vast majority of users that use the official deployment will have the convenience of getting a warning if they make a typo or formatting mistake and add a checkbox that disables the domain check for self-hosted users that know what they are doing.

I am confused as to what you want in relation to PR deployments. Those are temporary domains for testing; why would you or any user of your app want to access them?

From what I gather, you want a /metadata/:id (or other name) endpoint which should return a JSON object with only the brew metadata - title, authors, version, pageCount, description, thumbnail image URL, publication status, and so on. Is that correct?

Also, thanks for the very fast reply.

@5e-Cleric
Member

5e-Cleric commented May 17, 2024

Working example

image

Astonished at how simple that turned out to be; I'm more the CSS guy.

5e-Cleric added the "solution found" label May 17, 2024
@DSPaul
Author

DSPaul commented May 17, 2024

Awesome, that is exactly what I wanted! Thanks for implementing this. You can go ahead and merge the PR and close this issue as far as I'm concerned.

P.S.
You can forget what I said about self-hosted instances of Homebrewery; it was pretty far-fetched and irrelevant. At the time I was thinking that some people might be hosting their own fork, Mastodon-style, but that is clearly not the case; everyone just uses the official instance.

@5e-Cleric
Member

Sorry this took so long. Could you share a link to the app? I'm interested.

@Gazook89
Collaborator

Stealing their thunder: https://www.compassapp.info/

@5e-Cleric
Member

@DSPaul New API endpoint ready and live.

@DSPaul
Author

DSPaul commented Oct 13, 2024

Awesome, just updated my code to use it, works great

@5e-Cleric
Member

5e-Cleric commented Dec 8, 2024

@DSPaul I'm currently working on implementing a strict CORS policy for the project, so I will need your requests' origin so I can whitelist it.

@DSPaul
Author

DSPaul commented Dec 9, 2024

Thanks for considering me, but because the requests are made directly by the desktop application, and not a server/website, I'm pretty sure the requests don't have an origin right now. I'll look into it this evening. Maybe I can manually add an origin to the request, and if not I could try to use my own api as a proxy. I'll keep you posted.

@ericscheid
Collaborator

CORS (Cross-Origin Resource Sharing) is a browser-enforced security mechanism, and it specifically applies to requests made by browsers running scripts (e.g., JavaScript) within web pages. Here's how it applies across different contexts:

  1. Browser Access (Applies): CORS primarily governs browser-based requests to protect users from malicious cross-origin interactions.

  2. Standalone Applications (Does Not Apply): Applications like mobile or desktop apps (e.g., a React Native app, Postman, or a backend service) are not subject to CORS restrictions because they don’t rely on browsers.
    These apps can freely send HTTP requests to your API regardless of CORS settings.
    However, you can enforce additional security measures such as API keys, OAuth tokens, or IP whitelisting for these types of clients.

  3. Command-Line Tools and Scripts (Does Not Apply): Tools like curl, wget, or custom scripts do not enforce CORS rules. They can directly interact with your API regardless of the CORS headers your server sends.

Why CORS is Browser-Specific

CORS is designed to mitigate risks in browser environments, such as:

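  • Cross-Site Request Forgery (CSRF):
    Limits what a malicious web page can do with a user's credentials (e.g., cookies): non-simple requests are stopped at the preflight check, and responses cannot be read by the page. Simple requests are still sent, so CORS alone is not a complete CSRF defense.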
  • Data Leakage:
    Ensures that responses are only visible to approved origins.

Standalone applications and scripts are not vulnerable to these browser-based security issues, which is why CORS is not enforced for them.

(per chatGPT, not me)
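To make the browser-versus-desktop distinction concrete, here is a small Python sketch of the server-side half of an origin allowlist. It is not Homebrewery's implementation; the function name and the example origin are placeholders. The point is that the server only decides which CORS headers to attach, while enforcement happens in the browser, so a request with no Origin header (as a desktop app would typically send) is simply served as usual.

```python
from typing import Dict, Optional, Set


def cors_headers(origin: Optional[str], allowed_origins: Set[str]) -> Dict[str, str]:
    """Return the CORS response headers to attach for a request's Origin header."""
    if origin is None:
        # No Origin header: typically a non-browser client such as a
        # desktop app or curl. Nothing to add; the response is sent as normal.
        return {}
    if origin in allowed_origins:
        # Echo the approved origin back; Vary tells caches that the
        # response differs depending on the Origin request header.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    # Unknown origin: send no CORS headers, so a browser will refuse to
    # expose the response to the calling script.
    return {}


allowlist = {"https://example.com"}  # placeholder origin
print(cors_headers("https://example.com", allowlist))
print(cors_headers("https://evil.example", allowlist))
print(cors_headers(None, allowlist))
```

Because the last two cases return the same headers, an allowlist like this does nothing to restrict non-browser clients; that would need something like the API keys mentioned above.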

@5e-Cleric
Member

Alright, so yeah CORS does not apply to your app, that's one less problem.

@DSPaul
Author

DSPaul commented Dec 9, 2024

Great, saves me work as well.

@5e-Cleric
Member

Would you mind providing some contact info, email or whatever, so we can contact you in case we need to? Should we contact you through Discord or Reddit?

@DSPaul
Author

DSPaul commented Dec 9, 2024

You can reach me on Discord @paulds, but sometimes Discord doesn't notify me of message requests from users I haven't talked to before, so if I don't respond, you can just hop into the Discord server for my app and talk there; I'll see it. You can access the server by clicking the Discord badge in the README of the repo that is pinned on my GitHub account.

5 participants