Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix/52 app details v2 support and metrics fix #58

Conversation

vzDevelopment
Copy link
Contributor

This fixes the App Details page and the graphs/charts for #52 and fnproject/fn#1258. See #52 (comment) and #52 (comment) for details.

It also contributes to fnproject/fn#1347.

To test the app details page:

  • Launch the interface
  • Navigate to an app with a function
  • Run the function using the interface
  • Try creating a new function and editing it

To test the graphs:

  • Create a function that sleeps for 10 seconds or so
  • Invoke it multiple times using the fn client
  • Watch the graphs change on the home page and the app details page

Something to note is I'm not convinced the graphs on the app details page are going to scale e.g. when there are lots of apps/functions it has to plot. I thought we can try it and if we find they don't we can just remove them. This is also the reason I'm not using these more fine-grained stats on the home page too - although it'd be nice to show a break down per app, I'm worried it's going to be a resource hog.

We can probably close fnproject/fn#1258 after merging this PR and I'll continue the rest of the work in #52.

The App Details page was not working because it was trying to use
functionality from the old V1 API (for example fn routes). This commit
changes the App Details page to use V2's fns API endpoint and removes
the routes logic. Users can now add/edit/remove/run functions for a
particular app.

This contributes to: fnproject#52, fnproject/fn#1347, and fnproject/fn#1258.
The fn server's metrics API has changed meaning that UI's charts were no
longer working. Update the UI to support the new format.

This commit:
* Fixes the charts on the app listing page.
* Removes the deprecated fn route logic from the graphs.
* Removes the broken charts from the app details page - this is because
    the stats that were being used are no longer broken down per function.
* Removes the Failed functions graph - this metric is no longer being
    recorded by the server.

This contributes to: fnproject#52 and fnproject/fn#1258.
The App Details page used to display charts that showed metrics on the
functions that are part of the App. However, the fn metrics API changed
and no longer included this information so it was removed. In
fnproject/fn#1365 some very similar metrics were added to the metrics
API so add the chart showing the currently running functions back and
update it to use this metric. Furthermore, refactor controllers/stats.js
to make it easier to parse additional stats in the future.

This contributes to: fnproject#52 and fnproject/fn#1258.
Reduce the amount of duplication by merging the graph components into
one component.
I'm not convinced the App Details graphs are going to scale when there
are lots of functions. By using the function name rather than breaking
it down into its own individual images it should improve performance a
little bit as there is less to plot (although maybe not enough).

Furthermore, it seems unlikely old images will be used so it's wasted
information.

Lastly, it might be possible for functions to share images which would
make it difficult to differentiate between these.
The code that adds the legends to the charts was removed in b83b3eb
because the charts were no longer broken down by app. Now we have charts
that have per function series data then add the legend code from this commit
back.
The Fn Server is records information about which functions are starting,
waiting, idling and paused. Chart this information on the App Details
page along with the busy chart.
@vzDevelopment vzDevelopment changed the title Bugfix/52 app details v2 and metrics fix Bugfix/52 app details v2 support and metrics fix Apr 20, 2019
@denismakogon
Copy link
Member

Hi. As you may know, metrics data is not based on the whole Fn cluster data. With distributed Fn, metrics would be different on each node of a cluster.
I’m not sure that I really like the idea to keep metrics view available at the UI.

We still sorting out metrics user story for Fn users, until that, metrics view is a lot confusing for those of us who run Fn beyond fn start.

@carimura
Copy link
Member

@denismakogon right.. lets restore to where they were then figure out the metrics story afterwards.

@vzDevelopment thanks for this, great work. I'm seeing stats on the home screen but nothing on the apps page. I see the outgoing request for /stats returning a 200, but no graphs or numbers changing on the screen.

@vzDevelopment
Copy link
Contributor Author

vzDevelopment commented Apr 22, 2019

@carimura, humm that's interesting. I'm currently on my phone but will take a look when I am back in the office tomorrow and see if I can replicate it. Out of curiosity, what is the content of the fnserver's Prometheus data I.e. http://$fnserver/metrics?

If you have stopped the fn server please run the function again before providing the output above. IIRC the Prometheus data doesn't contain metrics for functions until they have been run at least once (from the server start time).

Also, how long do the functions you're trying it with take to run? If you add a sleep(10) to them does that make any difference?

@carimura
Copy link
Member

here's what /metrics returns: https://gist.github.com/carimura/6ff0c69c06b975b80d4dc00b10f6c4b3

ya I have a sleep in there of 10 seconds but I'm not getting anything on the app specific page. The stats work great on the home page.

@vzDevelopment
Copy link
Contributor Author

vzDevelopment commented Apr 23, 2019

@carimura ah are you using a custom docker image for the fn server with changes to the views e.g. these lines?

If you run make docker-build on a clean version of the latest master (alternatively I think you could also use make clear-images and run the fn server to pull in a clean version too) does that fix the issue?

I didn't realise that the tags were configurable per build so I'll need to add support for this. The reason the graphs aren't working for you is because the app and fn IDs aren't being returned with /metrics e.g. I get:

fn_container_busy_total{app_id="01D8JQSKDENG8G00GZJ000000B",fn_id="01D8JQSQ2VNG8G00GZJ000000C",image_name="fndemouser/sleepy:0.0.10"} 1.0

Whereas you get:

fn_container_busy_total 0.0

@vzDevelopment
Copy link
Contributor Author

vzDevelopment commented Apr 23, 2019

For reference I'm using:

vzarola-Mac:fntest vzarola$ docker image ls fnproject/fnserver
REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
fnproject/fnserver   latest              02ce7a57adb0        4 days ago          160MB

vzarola-Mac:fntest vzarola$ date
Tue 23 Apr 2019 14:03:19 BST

@carimura
Copy link
Member

Good call. Just grabbed latest server and I'm getting lines for "Busy" and "Paused" but not the others (when running 4 functions concurrently).

Screen Shot 2019-04-23 at 7 54 22 AM

@vzDevelopment
Copy link
Contributor Author

vzDevelopment commented Apr 23, 2019

Now this is where my unfamiliarity with fn is going to let me down and am expecting you'll know more than me.

In order to get the waiting graph to show anything I normally need to run 5 or more functions consecutively (lots of fn invoke sleepy sleeplonger&) - I'm not sure what controls this but I'm guessing it's either a config somewhere (or maybe it's smart and bases it on the machine's specs).

The small functions I use normally start up in fractions of a second so it's very hard to ever see anything in the starting graph as the auto-refresh almost always misses it. I have seen the values change on the stats API though. I've been putting this down to the fact I'm using very simple hello world functions and have been assuming that there are larger functions out there which might take a few seconds to start up.

I've never been able to get anything in Idling and am not sure when anything ever goes into that state.

In order to test these graphs I mocked data from the fn server's metrics API with non-zero values to check the parsing and displaying is correct. Therefore any issues with these graphs should be due to the fn server rather than the UI. I could do with adding unit tests to my parsers to ensure this but haven't looked at how to write unit tests in Node yet.

If there are any graphs you don't think are worth having, let me know and I can take them out. I've just included them because the data was there in the server's API.

In fnproject/fn#1438 Reed made the tags shown in Fn's metrics API
configurable. This means in some custom builds there may not be an
appID, fnID, or imageName attribute associated with the fn_container
stats. This issue was brought to light in fnproject#58.

Update the code so that when the Fn server doesn't provide the required
information the interface handles it gracefully, showing a message with
possible reasons as to why the stats may not be showing.
Sometimes the Fn Server doesn't return data which a graph uses for
example 'fn_container_paused' isn't shown until a function enters the
state. This means the Paused graph shows "NaN" and doesn't plot a line.
Update the graphs so that when there is no metric, it defaults to 0
rather than undefined.
@vzDevelopment
Copy link
Contributor Author

I have pushed an update to address the configurable tags (#58 (comment)). The changes are as follows:

Add a message when the metric doesn't exist

If the AppID doesn't exist in the returned API data then it'll show a message saying:

No statistics found for App. Possible reasons for this are:

  • None of the App's functions have been run since the Fn server has started.
  • The Fn server isn't configured to produce metrics for individual Apps.

Screen Shot 2019-04-23 at 16 31 10

To test this, restart the fn server and browse to the app page. When you run a function in the app the message will go away and the graphs will be displayed.

Group unknown functions

If the fn server isn't configured to return the fn_id then the counts will be aggregated under the "Unknown Function" tag. N.b. I don't think it's possible to have an actual function called "Unknown Function" so this shouldn't cause any confusion.

Screen Shot 2019-04-23 at 17 21 23

Funnily enough when I went to grab a screenshot I managed to capture a function entering the Starting state.

To test this, run an fn server with agent.FnIDMetricKey.Name() removed from this line.

I have also tested all permutations of when a tag does/doesn't exist e.g. when no tags exist, when only a few exist, etc.

Fix a bug where it shows NaN

The Paused graph was showing NaN until a function entered the paused state. It will now show 0 instead.

@carimura
Copy link
Member

This is awesome. Think we should just get it merged. I've looked through and it looks good and it's certainly a lot better than the state it was in. :) Once merged, there are a few things we can address like a little continuity in which stats we show from "global" view to "app" view. After that, there are a bunch of improvements we can make together. it's been too long since we've had a working UI, thanks again @vzDevelopment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FN UI is broken
4 participants