Commit: Mehrad docs update (#639)

mkaramlou authored Jul 15, 2024
1 parent 4777ea2 commit 95f0d4d
Showing 13 changed files with 27 additions and 20 deletions.
2 changes: 1 addition & 1 deletion docs/automations/extract-text-metadata.md
@@ -22,7 +22,7 @@ to get started with Automatic Metadata Extraction for Text.
Identify and select the text fields from your dataset that you want to analyze.
Also select the properties of the fields you wish to extract.

-In the examble below we extract properties from the `best_answer` and `question` fields. For the `best_answer` field,
+In the example below we extract properties from the `best_answer` and `question` fields. For the `best_answer` field,
we display `word_count` and `topic_tag`, whereas for the `question` field we display `word_count`, `readability` and
`question_type`.

2 changes: 1 addition & 1 deletion docs/automations/index.md
@@ -6,7 +6,7 @@ hide:

# :kolena-rocket-20: Advanced Usage

-This section contains tutorial documentation for Kolena automations.
+This section contains tutorial documentation for Kolena automation.

<div class="grid cards" markdown>
- [:kolena-properties-16: Automatically Extract Text Properties](./extract-text-metadata.md)
4 changes: 2 additions & 2 deletions docs/automations/set-up-natural-language-search.md
@@ -12,13 +12,13 @@ or manually extracting and uploading corresponding search embeddings using a Kol
## Setting up Automated Embedding extraction

??? "Requirements"
-- This feature is currenlty supported for Amazon S3 integrations.
+- This feature is currently supported for Amazon S3 integrations.
- Kolena requires access to the content of your images.
Read [Connecting Cloud Storage: Amazon S3](../connecting-cloud-storage/) for more details.
- Only account administrators are able to change this setting.

Embedding extractions allow you to find datapoints using natural language or similarity between desired datapoints.
-To enable automated embedding, navigate to "Organization Settings" available on your profile menue, top right of the screen.
+To enable automated embedding, navigate to "Organization Settings" available on your profile menu, top right of the screen.
Under the "Automations" tab, Enable the Automated Embeddings Extraction by Kolena option.

<figure markdown>
@@ -122,12 +122,14 @@ bboxes = [
Model results contain your model inferences as well as any custom metrics that you wish to monitor on Kolena.
The data structure of model results is very similar to the structure of a dataset with minor differences.

-* Ensure your results are using the same unique ID feild (the `locator` for instance) you have selected for your dataset.
+* Ensure your results are using the same unique ID field (the `locator` for instance) you have selected for your dataset.

* Use [`ScoredBoundingBox`](../../../reference/annotation.md#kolena.annotation.ScoredBoundingBox) or
[`ScoredLabeledBoundingBox`](../../../reference/annotation.md#kolena.annotation.ScoredLabeledBoundingBox)
to pass on your model inferences' confidence score for each bounding box.
* Use [`compute_object_detection_results`](../../../reference/experimental/index.md#kolena._experimental.object_detection.compute_object_detection_results)
-to compute your metrics that are supported by Kolena's [Object Detection Task Metrcis](../../advanced-usage/task-metrics.md#object-detection).
+to compute your metrics that are supported by Kolena's [Object Detection Task Metrics](../../advanced-usage/task-metrics.md#object-detection).

* OR include the following columns in your results. The value for each of the columns is a [`List[ScoredLabeledBoundingBox]`](../../../reference/annotation.md#kolena.annotation.ScoredLabeledBoundingBox)

| Column Name | Description |
@@ -182,7 +184,7 @@ The data structure of model results is very similar to the structure of a datase
[`ScoredLabeledBoundingBox3D`](../../../reference/annotation.md#kolena.annotation.ScoredLabeledBoundingBox3D)
to pass on your model inferences' confidence score for each bounding box.
* Use [`compute_object_detection_results`](../../../reference/experimental/index.md#kolena._experimental.object_detection.compute_object_detection_results)
-to compute your metrics that are supported by Kolena's [Object Detection Task Metrcis](../../advanced-usage/task-metrics.md#object-detection).
+to compute your metrics that are supported by Kolena's [Object Detection Task Metrics](../../advanced-usage/task-metrics.md#object-detection).

!!! note
Once you have constructed your `DataFrame` use the [`upload_object_detection_results`](../../../reference/experimental/index.md#kolena._experimental.object_detection.upload_object_detection_results)
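For illustration, here is a minimal sketch of assembling and uploading 2D results in the shape described above. The locator values, column names, and the exact upload signature are assumptions drawn from the linked references, not verbatim API from this page:

```python
import pandas as pd

from kolena.annotation import ScoredLabeledBoundingBox
from kolena._experimental.object_detection import upload_object_detection_results

# One row per datapoint, keyed on the same `locator` ID field as the dataset.
df_results = pd.DataFrame(
    [
        {
            "locator": "s3://my-bucket/images/0001.jpg",  # hypothetical path
            "raw_inferences": [
                ScoredLabeledBoundingBox(top_left=(10, 20), bottom_right=(110, 220), label="car", score=0.93),
                ScoredLabeledBoundingBox(top_left=(50, 60), bottom_right=(90, 120), label="person", score=0.41),
            ],
        },
    ]
)

# Dataset and model names are placeholders; see the reference docs for the full signature.
upload_object_detection_results("my-dataset", "my-model", df_results)
```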
5 changes: 5 additions & 0 deletions docs/dataset/advanced-usage/index.md
@@ -33,4 +33,9 @@ This section contains tutorial documentation for advanced features.
Automatically or manually extract embeddings from images to
enable natural language and similar image search.

+- [:kolena-take-action-16: Programmatically Compare results](./quality-standard-results.md)
+
+---
+Run model comparisons programmatically and add model improvements as requirements into your CI pipelines.

</div>
4 changes: 2 additions & 2 deletions docs/dataset/advanced-usage/quality-standard-results.md
@@ -3,7 +3,7 @@ icon: kolena/take-action-16

---

-# :kolena-take-action-20: Programatically Compare Models
+# :kolena-take-action-20: Programmatically Compare Models

!!! example "Experimental Feature"

@@ -19,7 +19,7 @@ function, [`download_quality_standard_result`](../../reference/experimental/inde
to download a dataset's quality standard result. This enables users to automate processes surrounding a Quality
Standard's result.

-The return value is a multi-index DataFrame with indices `(stratificaiton, test_case)` and columns `(model, eval_config,
+The return value is a multi-index DataFrame with indices `(stratification, test_case)` and columns `(model, eval_config,
metric_group, metric)`.

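A short sketch of consuming that DataFrame. The dataset name and index values are placeholders; the import path follows the experimental reference linked above:

```python
from kolena._experimental.quality_standard import download_quality_standard_result

result = download_quality_standard_result("my-dataset")  # placeholder dataset name

# Rows are (stratification, test_case); columns are (model, eval_config,
# metric_group, metric). Slice one test case across all models and metrics:
print(result.loc[("complete", "All Datapoints")])  # hypothetical index values
```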
2 changes: 1 addition & 1 deletion docs/dataset/advanced-usage/task-metrics.md
@@ -70,7 +70,7 @@ supports [`average precision`](../../metrics/average-precision.md), [`precision`
Kolena provides out-of-the-box aggregation options for your datapoint level evaluations that
correspond with your desired metrics. For numeric evaluations you are able to
select from `count`, `mean`, `median`, `min`, `max`, `stddev` and `sum` aggregation options.
-For categorical evaluations (class lable, boolean, etc) `rate` and `count` aggregation options are available.
+For categorical evaluations (class label, boolean, etc) `rate` and `count` aggregation options are available.

The Kolena web application currently supports [`precision`](../../metrics/precision.md),
[`recall`](../../metrics/recall.md), [`f1_score`](../../metrics/f1-score.md),
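For intuition, these aggregations behave like their pandas counterparts applied over a datapoint-level column; a toy sketch with made-up data, not the Kolena implementation:

```python
import pandas as pd

evals = pd.DataFrame({
    "abs_error": [0.2, 0.5, 0.1, 0.9],        # numeric datapoint-level evaluation
    "is_correct": [True, False, True, True],  # categorical (boolean) evaluation
})

numeric = evals["abs_error"].agg(["count", "mean", "median", "min", "max", "std", "sum"])
rate = evals["is_correct"].mean()   # "rate": fraction of positives, here 0.75
count = evals["is_correct"].sum()   # "count": number of positives, here 3
```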
4 changes: 2 additions & 2 deletions docs/dataset/core-concepts/index.md
@@ -55,8 +55,8 @@ You are able to select one or more fields as your ID field during the import pro
Web App [:kolena-dataset-16: Datasets](https://app.kolena.com/redirect/datasets) or the
SDK by using the [`upload_dataset`](../../reference/dataset/index.md#kolena.dataset.dataset.upload_dataset) function.

-**Meta data**: you can add additional informaiton about your
-datapoint simply by adding columns to the dataset with the metadaname and values in each row.
+**Meta data**: you can add additional information about your
+datapoint simply by adding columns to the dataset with the meta data name and values in each row.

**Referenced Files**: each datapoint can contain a primary reference to a file stored on your cloud storage.
Kolena automatically renders referenced files with column name `locator`. Other column names result in references to appear
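Putting the ID field and metadata columns together, a minimal upload through the SDK might look like the sketch below; the dataset name, bucket path, and metadata columns are made up, and `upload_dataset` is the function linked above:

```python
import pandas as pd
from kolena.dataset import upload_dataset

df = pd.DataFrame({
    "locator": ["s3://my-bucket/images/0001.jpg"],  # rendered automatically by Kolena
    "city": ["Toronto"],                            # plain metadata column
    "brightness": [0.72],                           # plain metadata column
})

upload_dataset("my-dataset", df, id_fields=["locator"])  # placeholder dataset name
```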
2 changes: 1 addition & 1 deletion docs/metrics/bertscore.md
@@ -95,7 +95,7 @@ BERT-precision and BERT-recall.

In a more advanced implementation of BERTScore, extra steps are taken to finetune the metric. These include:

-1. Applying an "importance factor" to rare words so that the score weighs keywords moreso than words like "it", "as",
+1. Applying an "importance factor" to rare words so that the score weighs keywords more so than words like "it", "as",
and "the".
2. Rescaling the score such that it lies between 0 and 1 in practical use cases. Although the score already lies between
0 and 1 in theory, it has been observed to lie within a more limited range in practice.
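Both refinements are exposed as flags in the reference `bert-score` package; a quick sketch of using them, with arbitrary sentences:

```python
from bert_score import score  # pip install bert-score

candidates = ["The weather is cold today."]
references = ["It is freezing today."]

# idf=True enables the rare-word importance weighting;
# rescale_with_baseline=True maps scores onto a more interpretable 0-1 range.
P, R, F1 = score(candidates, references, lang="en", idf=True, rescale_with_baseline=True)
print(F1.item())
```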
8 changes: 4 additions & 4 deletions docs/metrics/diarization-error-rate.md
@@ -17,7 +17,7 @@ The simplest way to quantify the error in a candidate diarization is to measure t
detections, and speaker confusions. These three elementary errors form the building blocks of diarization error rate.

??? example "False Alarm"
-False alarm is duration of non-speech classified as speech — analagous to false positive diarizations. Using the
+False alarm is duration of non-speech classified as speech — analogous to false positive diarization. Using the
following hypothetical ground truth and inference diarization segments, let's calculate our false alarm duration.

@@ -43,8 +43,8 @@ detections, and speaker confusions. These three elementary errors form the build
doesn't exist in the ground truth diarization. Thus, our false alarm is equal to $3 + 1 = 4$ seconds.

??? example "Missed Detection"
-Missed detection is the duration of speech classified as non-speech — analagous to a false negative in our
-diarizations. Using the previous example, let's calculate our missed detection duration.
+Missed detection is the duration of speech classified as non-speech — analogous to a false negative in our
+diarization. Using the previous example, let's calculate our missed detection duration.

![Visualization](../assets/images/metrics-der-example1.png)

@@ -123,7 +123,7 @@ inference[Segment(23, 25)] = 'B'

Our diarization error rate is 0.4.
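The `Segment` syntax in these examples mirrors the `pyannote` libraries, which can compute DER directly; a sketch under that assumption, with placeholder segment boundaries rather than the exact example above:

```python
from pyannote.core import Annotation, Segment
from pyannote.metrics.diarization import DiarizationErrorRate

reference, hypothesis = Annotation(), Annotation()
reference[Segment(0, 10)] = "A"    # placeholder ground truth segments
reference[Segment(13, 21)] = "B"
hypothesis[Segment(0, 11)] = "A"   # placeholder inference segments
hypothesis[Segment(13, 20)] = "B"
hypothesis[Segment(23, 25)] = "B"

metric = DiarizationErrorRate()
print(metric(reference, hypothesis))  # (false alarm + missed detection + confusion) / total speech
```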

-## Limitiations and Biases
+## Limitations and Biases

Though DER provides a strong insight into the accuracy of speaker labels and predicted segments, it fails to
pinpoint the specific components of a speaker diarization system that may cause it to perform poorly. As such,
2 changes: 1 addition & 1 deletion docs/metrics/mean-squared-error.md
@@ -1,5 +1,5 @@
---
-description: How to calculate and interperet MSE (mean squared error) for regression ML tasks
+description: How to calculate and interpret MSE for regression ML tasks
---

# Mean Squared Error (MSE)
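As a quick refresher on the metric this page covers, with made-up numbers:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # made-up ground truths
y_pred = np.array([2.5, 5.0, 4.0, 8.0])  # made-up predictions

mse = np.mean((y_true - y_pred) ** 2)  # MSE = (1/n) * sum((y_i - y_hat_i)^2)
print(mse)  # 0.875
```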
2 changes: 1 addition & 1 deletion docs/metrics/meteor.md
@@ -134,7 +134,7 @@ limitations.
1. METEOR does not consider synonyms. Unlike embeddings-based metrics like [BERTScore](bertscore.md), it does not have
a mechanism to quantify the similarity of words within the candidate and reference sentences. Thus, having two sentences
like "She looked extremely happy at the surprise party." and "She appeared exceptionally joyful during the unexpected
celebration." would yield a subobtimal score despite being very similar in meaning. That being said, METEOR has shown to
celebration." would yield a suboptimal score despite being very similar in meaning. That being said, METEOR has shown to
have a higher correlation with human judgement than both BLEU and ROUGE, making it *generally* better than the two.

2. METEOR can fail on context. If we have two sentences "I am a big fan of Taylor Swift" (Reference) and "Fan of Taylor
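NLTK ships a METEOR implementation if you want to try the pair of sentences from the first limitation yourself; a sketch, with naive whitespace tokenization:

```python
import nltk
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet")  # lexical resource the METEOR matcher relies on

reference = "She looked extremely happy at the surprise party .".split()
candidate = "She appeared exceptionally joyful during the unexpected celebration .".split()

# Scores poorly despite the near-identical meaning discussed above.
print(meteor_score([reference], candidate))
```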
2 changes: 1 addition & 1 deletion docs/metrics/pr-curve.md
@@ -52,7 +52,7 @@ evaluating threshold; otherwise, it's positive.

As the threshold increases, there are fewer false positives and more false negatives, most likely yielding high
precision and low recall. Conversely, decreasing the threshold may improve recall at the cost of precision. Let's
-compute the precision and recall values at each threhold.
+compute the precision and recall values at each threshold.

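A sketch of sweeping thresholds with scikit-learn's `precision_recall_curve`; the labels and scores are made up:

```python
from sklearn.metrics import precision_recall_curve

y_true = [0, 0, 1, 1, 1]              # made-up binary labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.9]  # made-up confidence scores

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```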