-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Post parsing/sync microservice #1828
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably need to rebase. docker-compose-api-production.yml
doesn't exist anymore. Now it's production.yml
.
Also, make sure you use the env
files for port number or anything the microservice needs.
We need to consider the role of this service within the larger system we're building. We have:
This service should essentially manage the scheduling and processing of Feeds from to Posts. A bunch of the code is already written, but it needs to be evolved a bit. I think we need
We used to use the Wiki as a source of truth, and that's being replaced by the User service. @c3ho I see that you're using your favourite "Draft" style PR for this, but I think I'm going to need this to be done more collaboratively with the authors of the other services so that we can figure out our boundaries. Probably this one is easiest to write last (after Users and Posts are merged). That should happen for the 1.8 release in a few weeks. Until then, maybe you can help review those other services with us as they come in, so we are all on the same page. |
Based on our triage today regarding the user service and post-service it would make sense that we could land this for 1.9 |
Posts service has landed, so you can continue working on this service, and move out any Feed related stuff to your service, @c3ho |
@humphd @HyperTHD I don't think I should be managing any Post info here right? How I think the service should work:
In this case, I shouldn't care about how Post is implemented aside from sending parsed blog posts to the Post ms to store/index right? So in this case I am guessing I make a change to |
@c3ho, I'm still not 100% sure of the proper design, but here are my thoughts: The major thing this service does is maintain a job queue, and co-ordinate the creation of jobs. How we do that is a bit of an open question. We could poll, like we do now. We could use repeatable jobs (but we'd still have to maintain sync with the users/feeds in Firebase). Somehow we have to get the feeds into this queue and process them, and repeat that. So the easiest first method would be to get all the feeds from Firebase, yes. It's probably not smart to "download all" like we do with the wiki, so maybe we iterate through the pages and add the jobs. There might be a more "Firebase" way to do this, not sure. Next we process the feeds like we do now, essentially, and when we're done, I think we put them directly into Redis/Elastic from this service, vs. involving the Posts service (it's read-only). We need to keep Redis (cache) and Elastic (search) in sync with what's in Firebase. Our feeds won't live in Redis anymore, we'll just cache posts there (with feed details that a post needs). One way to do this would be to add one more key to Redis to track the current ordered set we're using to index posts: we can be showing the old set in the timeline while we build the new one, then delete the old one and continue the process. We can iterate on this once the service exists. If it mostly works like we have now, with User service vs. wiki feed list, it will be good. |
@humphd Thanks for the clarification. You're right, let's get the service working first then we'll reiterate / refactor. |
@c3ho Are you still interested in working on this PR? We're gonna need this in for 2.0 so someone else can take this issue if you're busy with something else |
I’ll have a reviewable PR for Friday |
config/env.development
Outdated
################################################################################ | ||
|
||
# Parser Service Port (default is 8888) | ||
PARSER_PORT=8888 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a lot of things running on 8***, let's pick something else, 10000
?
config/env.production
Outdated
PARSER_PORT=8888 | ||
|
||
# Parser Service URL | ||
PARSER_URL=https://localhost/v1/parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://api.telescope.cdot.systems/v1/parser
config/env.staging
Outdated
PARSER_PORT=8888 | ||
|
||
# Parser Service URL | ||
PARSER_URL=https://localhost/v1/parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://dev.api.telescope.cdot.systems/v1/parser
docker/docker-compose.yml
Outdated
- traefik | ||
- redis | ||
- users | ||
- posts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you'll need posts.
src/api/parser/jest.setup.js
Outdated
@@ -0,0 +1,10 @@ | |||
const path = require('path'); | |||
const result = require('dotenv').config({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has been done in the base config (cc @HyperTHD).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this file is not needed anymore and can be removed
src/api/parser/package.json
Outdated
}, | ||
"homepage": "https://github.com/Seneca-CDOT/telescope#readme", | ||
"dependencies": { | ||
"@senecacdot/satellite": "^1.8.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"^1.x"
src/api/parser/src/index.js
Outdated
@@ -0,0 +1,111 @@ | |||
const { Satellite } = require('@senecacdot/satellite'); | |||
const { router } = require('bull-board'); | |||
// May need in next revision |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was supposed to have line 4 commented out as I wasn't sure if we'll need other routes aside from the bull-board
src/api/parser/src/index.js
Outdated
* @param {Object} feedData - feed data parsed from the wiki feed list. | ||
* Returns Promise<Feed>, with the most appropriate Feed Object to use. | ||
*/ | ||
async function updateFeed(feedData) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're doing too much in this file. You should split things up, so that this file just sets up the microservice with Satellite, and all this extra functionality can come from other modules.
src/api/parser/src/routes/parser.js
Outdated
const router = Router(); | ||
|
||
router.get('/', (req, res) => { | ||
res.status(200).send('hi'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems not worth doing. Should we move what you have at /queues
to this instead? That is, going to http://localhost/v1/parser
would show me the Bull Queues.
This also implies that we need some authentication/authorization (see Satellite, which has middleware you can use for this). We either need to require admin access, or maybe just authenticated users can see it? I'm not sure how sensitive this is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can disregard the whole parser.js
file and just do service.router.use('/', bullRouter)
in index.js
.
docker/development.yml
Outdated
|
||
parser: | ||
environment: | ||
# For development and testing, the Auth service needs to contact the users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the Parser service..."
src/api/parser/readme.md
Outdated
|
||
# dev mode with automatic restarts | ||
|
||
npm run dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You only have a start
script above, better change this to start
.
* Flags the feed, preventing posts from the feed to be displayed | ||
* Returns a Promise<Boolean> | ||
*/ | ||
flag() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stuff will probably need to be changed later to use what's in the Users data vs. what's in Redis. For now, this is fine, but FYI so we don't forget.
src/api/parser/package.json
Outdated
"name": "parser", | ||
"version": "1.0.0", | ||
"description": "A service for parsing posts", | ||
"main": "index.js", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed.
src/api/parser/src/index.js
Outdated
|
||
const service = new Satellite(); | ||
|
||
startParserQueue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's comment this out so that we can land on prod/staging today, and not have it compete with the existing backend that is also running a Bull queue in Redis.
Issue This PR Addresses
Fixes #1736
Type of Change
Description
Still a WIP. Creates a post parsing microservice from our current code. There's a ton of files and I just want to get this up here for now. I opted to use the JS version of bull-queue, whether that is the correct choice or not I have no idea, but it shouldn't be too hard to make the switch to TS version in a future PR.
I started a small change in
processAllFeeds()
within theindex.js
file which will make a GET all user feeds request (that are not flagged) to the User ms.. I'll slowly improve this to do it in a pagination way.I checked the Post api, there's no PUT routes yet to save all the Post data as per the most recent discussion.
As satellite is waiting for Redis PR to be put in, this container will crash since I can't call
Redis()
yet. I'll also remove the test route I set up atparserURL/
Lists of tasks for this PR. I'll mark them off (maybe with a commit #) to track progress.
Checklist