This repository contains developer documentation for the VoiceBase V3 API.
The documentation is intended to be rendered with Sphinx using the ReadTheDocs theme, for display on readthedocs.io.
Starting with a working Python 2.7 installation, install required packages:
pip install sphinx
pip install sphinx_rtd_theme
pip install recommonmark
IMPORTANT: Work on the develop branch. Submit pull-requests to merge develop into the v3 branch.
This project has the Default Branch in GitHub set to v3 (every other project uses develop, but in this case we release to the public from v3).
If more than one person works on this project, the recommendation is to create feature branches from develop:
git checkout develop
git checkout -b feature/V2-9999-adding-new-api
Make your changes in the branch; when done, commit and create a pull-request from your branch into develop.
When multiple changes are ready to be published to the public documentation, submit a pull-request from develop into v3.
To build and preview the documentation, navigate to the docs/ sub-directory and run:
make html
open _build/html/index.html
You can also run:
watch.sh
to rebuild the documentation continuously on changes.
When deployed, this documentation appears at http://voicebase.readthedocs.io/en/v3/.
VoiceBase transcribes and analyzes recordings you upload to the API. The transcript is available in JSON, plain text, and SRT formats, and the analytics include keywords, topics, and predictions.
To get started, upload a recording by making a POST request to the /media resource. The API returns a unique mediaId for the item. You can poll for completion, or subscribe to callbacks by providing an HTTPS endpoint in your configuration.
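The poll-for-completion step above can be sketched in a few lines. This is a minimal illustration only: the "status" field and its "finished"/"failed" values are assumptions made for the example, not the confirmed V3 response schema, and the mediaId value is made up.

```python
import json

# Sketch of checking a polled media item for completion. The "status"
# field and its values are illustrative assumptions, not the confirmed
# V3 schema; consult the API reference for the real response shape.

def is_done(media_item):
    """Return True once asynchronous processing has ended."""
    return media_item.get("status") in ("finished", "failed")

# A polling response as the API might return it (values are made up):
polled = json.loads('{"mediaId": "7eb7964b-d324", "status": "finished"}')
print(is_done(polled))
```

In a real client this check would run inside a loop with a delay between requests, or be skipped entirely in favor of the callback (subscribe) approach.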
You can also customize VoiceBase by defining common keywords or phrases to spot, selecting models to predict business outcomes or detect key data, and defining custom vocabularies for use in transcription.
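As a rough illustration of what such a configuration document might look like when attached to a request (every key name below is an assumption made for the example, not the confirmed V3 configuration schema):

```python
import json

# Hypothetical configuration document combining spotting, prediction,
# a custom vocabulary, and a callback. All key names here are
# illustrative assumptions, not the confirmed V3 schema.
configuration = {
    "spotting": {"groups": [{"groupName": "competitors"}]},
    "prediction": {"detectors": [{"detectorName": "PCI"}]},
    "vocabularies": [{"vocabularyName": "product-names"}],
    "publish": {"callbacks": [{"url": "https://example.com/callback"}]},
}

# The configuration travels with the POST /media request as JSON:
payload = json.dumps(configuration)
print(payload)
```

The point of the sketch is only that the configuration is one JSON document covering all of the customization areas listed above.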
- recording - the media file (audio or video) to transcribe or analyze
- configuration - JSON document with processing instructions
- metadata - JSON document with metadata
- transcript - output of transcription (can be preceded by type, e.g. plain text transcript, SRT transcript, or JSON transcript)
- analytics - terms for higher-level results (e.g. keywords, predictions)
- model - a predictive model
- vocabulary - a custom vocabulary for transcription
- keyword group - a group of keywords or phrases for spotting
- upload - POST a recording to VoiceBase for transcription and analysis
- transcribe - generate a transcript
- analyze - generate some or all non-transcript analytics (keywords, predictions)
- spot - use (keyword, topic, or entity) spotting to flag items of interest
- extract - generate keywords, topics, or entities using semantic indexing
- predict - generate predictions (e.g. by running classifiers)
- detect - generate positional predictions (e.g. PCI detection)
- poll - query a resource repeatedly while awaiting an asynchronous operation
- subscribe - provide a callback destination
- define - set up a reference entity (keyword group, custom vocabulary)
- customize - broad term for changing VoiceBase's defaults
- endpoint - a top-level API (e.g. v3, or a callback sink; not common)
- resource - a specific REST resource (e.g. /media or /definitions/transcript)
- section - a subset of a JSON document, usually identified by its path (e.g. the predictions.latest.detections section of the analytics)
- collection - a resource that represents many similar things (e.g. /media)
- item - one member of a collection (e.g. /media/{mediaId})
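The collection/item distinction can be made concrete with a small path-building sketch. The base URL below is an assumption for illustration, and the mediaId value is made up.

```python
# Sketch of how collection and item resources map to URLs.
# API_BASE is an illustrative assumption, not taken from this document.
API_BASE = "https://apis.voicebase.com/v3"

def collection_url(name):
    """URL of a collection resource, e.g. /media."""
    return "{0}/{1}".format(API_BASE, name)

def item_url(name, item_id):
    """URL of one item in a collection, e.g. /media/{mediaId}."""
    return "{0}/{1}".format(collection_url(name), item_id)

print(collection_url("media"))
print(item_url("media", "7eb7964b-d324"))
```

A GET on the collection URL would list items; a GET on the item URL would address exactly one of them.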
- HTTP verbs: Do not use HTTP verbs as English verbs (conjugating them is awkward)
- Instead: Make a VERB request to /noun resource (or: accomplish X by making a Y request to /Z resource).
- Use common language to describe the what, and technical language to describe the how. (e.g.
Upload a recording to VoiceBase by making a POST request to the /media resource.
)