Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grafana Token API Access #1917

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
132 changes: 132 additions & 0 deletions database/director.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
package database

import (
"crypto/rand"
"crypto/sha256"
"encoding/hex"
"fmt"
"strings"
"time"

"github.com/jellydator/ttlcache/v3"
"github.com/pkg/errors"
"golang.org/x/crypto/bcrypt"
"gorm.io/gorm"
)

var DirectorDB *gorm.DB

type GrafanaApiKey struct {
ID string `gorm:"primaryKey;column:id;type:text;not null"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not being familiar with GORM, does this also add a uniqueness requirement to the column?

Name string `gorm:"column:name;type:text"`
HashedValue string `gorm:"column:hashed_value;type:text;not null"`
Scopes string `gorm:"column:scopes;type:text"`
ExpiresAt time.Time
CreatedAt time.Time
CreatedBy string `gorm:"column:created_by;type:text"`
}

var verifiedKeysCache *ttlcache.Cache[string, GrafanaApiKey] = ttlcache.New[string, GrafanaApiKey]()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two "gotchas" about the TTL cache:

  1. You need to set an explicit TTL, otherwise I believe it defaults to infinite.
  2. If you don't invoke Start on the cache, no eviction is ever performed. You need to hook this into some server startup routine (probably associated with the web UI?).

Further, the verified key work doesn't seem to have much to do with a generic DB layer. I think it makes more sense in web_ui.


func generateSecret(length int) (string, error) {
bytes := make([]byte, length)
_, err := rand.Read(bytes)
if err != nil {
return "", err
}
return hex.EncodeToString(bytes), nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do a hex encoding here? I'd prefer to keep things as []byte and raw bytes so it's clear what is being dealt with here.

}

func generateTokenID(secret string) string {
hash := sha256.Sum256([]byte(secret))
return hex.EncodeToString(hash[:])[:5]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the birthday paradox, what's the likelihood of collision if we use 5 characters and have, say, 1,000 tokens?

}
func VerifyGrafanaApiKey(apiKey string) (bool, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here -- and throughout the PR -- drop Grafana from the method names.

This is simply an API key. Just so happens that the first use case will be Grafana.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're verifying here, we should also return any scopes that the key has.

parts := strings.Split(apiKey, ".")
if len(parts) != 2 {
return false, errors.New("invalid API key format")
}
id := parts[0]
secret := parts[1]

item := verifiedKeysCache.Get(id)
if item != nil {
cachedToken := item.Value()
beforeExpiration := time.Now().Before(cachedToken.ExpiresAt)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, if item is not nil, then the item isn't expired in the cache. This check is redundant.

matches := bcrypt.CompareHashAndPassword([]byte(cachedToken.HashedValue), []byte(secret)) == nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm -- I think this logic is missing the point of doing a cache of verified tokens.

The verifiedKeysCache should map token -> capabilities. That is, if the token's encoded string is abc.123, then abc.123 -> {"capabilities": "foo bar"}. By doing this, you can skip any bcrypt calls (which are likely more expensive than anything else here!) and database lookups.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further thought, since you want to do a lookup for deletion by the ID, maybe the correct cache setup is something like:

type ApiKeyCached struct {
  Token string // "$ID.$SECRET_IN_HEX" string form
  Capabilities []string
}
var verifiedKeyCache *ttlcache.Cache[string, ApiKeyCached]

if beforeExpiration && matches {
return true, nil
}
}

var token GrafanaApiKey
result := DirectorDB.First(&token, "id = ?", id)
if result.Error != nil {
if errors.Is(result.Error, gorm.ErrRecordNotFound) {
return false, nil // token not found
}
return false, errors.Wrap(result.Error, "failed to retrieve the Grafana API key")
}

if time.Now().After(token.ExpiresAt) {
return false, nil
}

err := bcrypt.CompareHashAndPassword([]byte(token.HashedValue), []byte(secret))
if err != nil {
return false, nil
}

verifiedKeysCache.Set(id, token, ttlcache.DefaultTTL)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the remaining time before token expiration is less than the DefaultTTL, shorten the overall lifetime.

That is, don't cache it for longer than it will be valid!

return true, nil
}

func CreateGrafanaApiKey(name, createdBy, scopes string) (string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expiration time should be caller-controlled. (This will likely be a drop-down in the web UI with "no expiration" as one option).

expiresAt := time.Now().Add(time.Hour * 24 * 30) // 30 days
for {
secret, err := generateSecret(32)
if err != nil {
return "", errors.Wrap(err, "failed to generate a secret")
}

id := generateTokenID(secret)

hashedValue, err := bcrypt.GenerateFromPassword([]byte(secret), bcrypt.DefaultCost)
if err != nil {
return "", errors.Wrap(err, "failed to hash the secret")
}

apiKey := GrafanaApiKey{
ID: id,
Name: name,
HashedValue: string(hashedValue),
Scopes: scopes,
ExpiresAt: expiresAt,
CreatedAt: time.Now(),
CreatedBy: createdBy,
}
result := DirectorDB.Create(apiKey)
if result.Error != nil {
isConstraintError := result.Error.Error() == "UNIQUE constraint failed: tokens.id"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do string comparison, check to see if the error is of type gorm.ErrDuplicatedKey.

if !isConstraintError {
return "", errors.Wrap(result.Error, "failed to create a new Grafana API key")
}
// If the ID is already taken, try again
continue
}
return fmt.Sprintf("%s.%s", id, secret), nil
}
}

func DeleteGrafanaApiKey(id string) error {
result := DirectorDB.Delete(&GrafanaApiKey{}, "id = ?", id)
if result.Error != nil {
return errors.Wrap(result.Error, "failed to delete the Grafana API key")
}
if result.RowsAffected == 0 {
return errors.New("API key not found")
}
// delete from cache so that we don't accidentally allow the deleted key to be used
verifiedKeysCache.Delete(id)
return nil
}
68 changes: 68 additions & 0 deletions director/director.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ import (
"golang.org/x/sync/errgroup"

"github.com/pelicanplatform/pelican/config"
"github.com/pelicanplatform/pelican/database"
"github.com/pelicanplatform/pelican/metrics"
"github.com/pelicanplatform/pelican/param"
"github.com/pelicanplatform/pelican/pelican_url"
Expand Down Expand Up @@ -77,6 +78,11 @@ type (

// Context key for the project name
ProjectContextKey struct{}

CreateGrafanaTokenReq struct {
Name string `json:"name"`
CreatedBy string `json:"created_by"`
}
)

const (
Expand Down Expand Up @@ -1360,6 +1366,65 @@ func listNamespacesV2(ctx *gin.Context) {
ctx.JSON(http.StatusOK, namespacesAdsV2)
}

func createGrafanaToken(ctx *gin.Context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above -- there's nothing here Grafana or Director specific. Move this to the web_ui module.

authOption := token.AuthOption{
Sources: []token.TokenSource{token.Cookie},
Issuers: []token.TokenIssuer{token.LocalIssuer},
Scopes: []token_scopes.TokenScope{token_scopes.WebUi_Access},
}

status, ok, err := token.Verify(ctx, authOption)
if !ok {
log.Warningf("Cannot verify token: %v", err)
ctx.JSON(status, server_structs.SimpleApiResp{
Status: server_structs.RespFailed,
Msg: err.Error(),
})
return
}
// marshall body into struct
var req CreateGrafanaTokenReq
err = ctx.ShouldBindJSON(&req)
if err != nil {
ctx.JSON(http.StatusBadRequest, server_structs.SimpleApiResp{
Status: server_structs.RespFailed,
Msg: fmt.Sprintf("Invalid request body: %v", err),
})
return
}

scopes := fmt.Sprintf("%s,%s", token_scopes.Monitoring_Query.String(), token_scopes.Monitoring_Scrape.String())
token, err := database.CreateGrafanaApiKey(req.Name, req.CreatedBy, scopes)
if err != nil {
log.Warning("Failed to create Grafana API key: ", err)
ctx.JSON(status, server_structs.SimpleApiResp{
Status: server_structs.RespFailed,
Msg: err.Error(),
})
return
}

ctx.JSON(http.StatusOK, gin.H{"token": token})
}

func deleteGrafanaToken(ctx *gin.Context) {
id := ctx.Param("id")
err := database.DeleteGrafanaApiKey(id)
if err != nil {
log.Warning("Failed to delete Grafana API key: ", err)
ctx.JSON(http.StatusInternalServerError, server_structs.SimpleApiResp{
Status: server_structs.RespFailed,
Msg: err.Error(),
})
return
}

ctx.JSON(http.StatusOK, server_structs.SimpleApiResp{
Status: server_structs.RespOK,
Msg: "Grafana API key deleted",
})
}

func getPrefixByPath(ctx *gin.Context) {
pathParam := ctx.Param("path")
if pathParam == "" || pathParam == "/" {
Expand Down Expand Up @@ -1510,6 +1575,9 @@ func RegisterDirectorAPI(ctx context.Context, router *gin.RouterGroup) {
// so that director can be our point of contact for collecting system-level metrics.
// Rename the endpoint to reflect such plan.
directorAPIV1.GET("/discoverServers", discoverOriginCache)

directorAPIV1.POST("/createGrafanaToken", createGrafanaToken)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have an OpenAPI / Swagger schema document, right? These should be documented there for completness.

(Also renamed to remove "grafana" as elsewhere)

directorAPIV1.DELETE("/deleteGrafanaToken/:id", deleteGrafanaToken)
}

directorAPIV2 := router.Group("/api/v2.0/director")
Expand Down
21 changes: 10 additions & 11 deletions director/director_db.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ import (
"gorm.io/gorm"
"gorm.io/gorm/logger"

"github.com/pelicanplatform/pelican/database"
"github.com/pelicanplatform/pelican/param"
"github.com/pelicanplatform/pelican/server_utils"
)
Expand All @@ -39,8 +40,6 @@ type ServerDowntime struct {
UpdatedAt time.Time
}

var db *gorm.DB

//go:embed migrations/*.sql
var embedMigrations embed.FS

Expand All @@ -51,8 +50,8 @@ func InitializeDB() error {
if err != nil {
return errors.Wrap(err, "failed to initialize the Director's sqlite database")
}
db = tdb
sqldb, err := db.DB()
database.DirectorDB = tdb
sqldb, err := database.DirectorDB.DB()
if err != nil {
return errors.Wrapf(err, "failed to get sql.DB from gorm DB: %s", dbPath)
}
Expand All @@ -65,7 +64,7 @@ func InitializeDB() error {

// Shut down the Director's sqlite database
func shutdownDirectorDB() error {
return server_utils.ShutdownDB(db)
return server_utils.ShutdownDB(database.DirectorDB)
}

// Create a new db entry representing the downtime info of a server
Expand All @@ -82,7 +81,7 @@ func createServerDowntime(serverName string, filterType filterType) error {
UpdatedAt: time.Now(),
}

if err := db.Create(serverDowntime).Error; err != nil {
if err := database.DirectorDB.Create(serverDowntime).Error; err != nil {
return errors.Wrap(err, "unable to create server downtime table")
}
return nil
Expand All @@ -91,7 +90,7 @@ func createServerDowntime(serverName string, filterType filterType) error {
// Retrieve the downtime info of a given server (filter applied to the server)
func getServerDowntime(serverName string) (filterType, error) {
var serverDowntime ServerDowntime
err := db.First(&serverDowntime, "name = ?", serverName).Error
err := database.DirectorDB.First(&serverDowntime, "name = ?", serverName).Error
if err != nil {
if errors.Is(err, gorm.ErrRecordNotFound) {
return "", errors.Wrapf(err, "%s is not found in the Director db", serverName)
Expand All @@ -104,7 +103,7 @@ func getServerDowntime(serverName string) (filterType, error) {
// Retrieve the downtime info of all servers saved in the Director's sqlite database
func getAllServerDowntimes() ([]ServerDowntime, error) {
var statuses []ServerDowntime
result := db.Find(&statuses)
result := database.DirectorDB.Find(&statuses)

if result.Error != nil {
return nil, errors.Wrap(result.Error, "unable to get the downtime of all servers")
Expand All @@ -116,7 +115,7 @@ func getAllServerDowntimes() ([]ServerDowntime, error) {
func setServerDowntime(serverName string, filterType filterType) error {
var serverDowntime ServerDowntime
// silence the logger for this query because there's definitely an ErrRecordNotFound when a new downtime info entry inserted
err := db.Session(&gorm.Session{Logger: db.Logger.LogMode(logger.Silent)}).First(&serverDowntime, "name = ?", serverName).Error
err := database.DirectorDB.Session(&gorm.Session{Logger: database.DirectorDB.Logger.LogMode(logger.Silent)}).First(&serverDowntime, "name = ?", serverName).Error

// If the server doesn't exist in director db, create a new entry for it
if err != nil {
Expand All @@ -130,7 +129,7 @@ func setServerDowntime(serverName string, filterType filterType) error {
serverDowntime.FilterType = filterType
serverDowntime.UpdatedAt = time.Now()

if err := db.Save(&serverDowntime).Error; err != nil {
if err := database.DirectorDB.Save(&serverDowntime).Error; err != nil {
return errors.Wrap(err, "unable to update")
}
return nil
Expand All @@ -144,7 +143,7 @@ var setServerDowntimeFn setServerDowntimeFunc = setServerDowntime

// Delete the downtime info of a given server from the Director's sqlite database
func deleteServerDowntime(serverName string) error {
if err := db.Where("name = ?", serverName).Delete(&ServerDowntime{}).Error; err != nil {
if err := database.DirectorDB.Where("name = ?", serverName).Delete(&ServerDowntime{}).Error; err != nil {
return errors.Wrap(err, "failed to delete an entry in Server Status table")
}
return nil
Expand Down
7 changes: 4 additions & 3 deletions director/director_db_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import (
"github.com/stretchr/testify/require"
"gorm.io/gorm"

"github.com/pelicanplatform/pelican/database"
"github.com/pelicanplatform/pelican/server_utils"
)

Expand All @@ -22,9 +23,9 @@ var (

func SetupMockDirectorDB(t *testing.T) {
mockDB, err := gorm.Open(sqlite.Open(":memory:"), &gorm.Config{})
db = mockDB
database.DirectorDB = mockDB
require.NoError(t, err, "Error setting up mock origin DB")
err = db.AutoMigrate(&ServerDowntime{})
err = database.DirectorDB.AutoMigrate(&ServerDowntime{})
require.NoError(t, err, "Failed to migrate DB for Globus table")
}

Expand All @@ -34,7 +35,7 @@ func TeardownMockDirectorDB(t *testing.T) {
}

func insertMockDBData(ss []ServerDowntime) error {
return db.Create(&ss).Error
return database.DirectorDB.Create(&ss).Error
}

func TestDirectorDBBasics(t *testing.T) {
Expand Down
27 changes: 27 additions & 0 deletions director/migrations/20250103161929_create_db_tables.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
-- +goose Up
-- +goose StatementBegin
SELECT 'up SQL query';
-- +goose StatementEnd

CREATE TABLE IF NOT EXISTS server_downtimes (
uuid TEXT PRIMARY KEY,
name TEXT NOT NULL UNIQUE,
filter_type TEXT NOT NULL,
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL
);

CREATE TABLE grafana_api_keys (
id TEXT PRIMARY KEY,
name TEXT,
hashed_value TEXT NOT NULL,
scopes TEXT,
expires_at DATETIME,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
created_by TEXT
);

-- +goose Down
-- +goose StatementBegin
SELECT 'down SQL query';
-- +goose StatementEnd
Loading
Loading