I guess GitHub made this a "user home page". Well hi, this is my dog. This is my blog. This is my slack invite. I'm the author of these packages:
- HumpDay - Derivative-free optimizers in canonical form, with Elo ratings
- TimeMachines - Time-series algorithms in simple functional form, also with Elo ratings
- FirstDown - The repo that changed NFL football forever.
- MUID - Memorable Unique Identifiers (stable).
- Embarrassingly - A speculative approach to robust optimization that sends impure objective functions to optimizers.
- Winning - A recently published fast algorithm for inferring relative ability from win probability (stable).
- Pandemic - Ornstein-Uhlenbeck epidemic simulation (related paper)
- Microprediction - Free, short-horizon, real-time, distributional community prediction via API. (A hosted, high velocity clearing mechanism for probabilistic forecasts of time-series).
- Microfilter - A simple noise-resistant Kalman-inspired filter.
- m6 - Some utilities for the M6 Forecasting competition (under construction)
and a few others.
TLDR - Here's the slack invite
The client hits the microprediction API, enabling turnkey, repeated short-term predictions of anything, for any purpose, for anyone, at any time. This project is new, but simple in principle. You create a stream. Algorithms watch it and submit predictions. It is supported by Intech Investments, a top-five U.S. investment firm by various metrics. There is a site, microprediction.com, that introduces the concepts behind microprediction.org, where the action takes place. You can also stop by our twice-weekly virtual chats. See the knowledge center for Google Meet details. Tue 8pm and Fri noon EST.
See CITE.md
- Data: stream list | stream explanations | csv
- Client: client | reader | writer | crawler | crawler examples | notebook examples
- Resources: popular timeseries packages | knowledge center | faq | linked-in | microprediction.org (dashboard) | microprediction.com (resources) | what | blog | contact | competitions | make-predictions | get-predictions | applications | collective epidemiology
- Video tutorials: 1: non-registration | 2: first crawler | 3: retrieving historical data | 4: creating a data stream | 5: modifying your crawler's algorithm | 6: modifying crawler navigation
- Colab notebooks: creating a new key | listing current prizes | submitting a prediction | choosing streams | retrieving historical data
- Related: humpday | timemachines | timemachines-testing | microconventions | muid | causality graphs | embarrassingly | key maker | real data | chess ratings prediction
- Eye candy: copula plots | causality plots | electricity case study
Probably best to start in the knowledge center and remember Dorothy, You're Not in Kaggle Anymore.
Here's how it operates.
- You publish live data repeatedly (like this, say), and it creates a stream like this one.
- As soon as you do, algorithm "crawlers" like this guy compete to make distributional predictions of your data feed 1 min ahead, 5 min ahead, 15 min ahead and 1 hr ahead.
In this way you can:
- Get live prediction of public data for free (yes it really is an api that predicts anything!)
- See which R, Julia and Python time series approaches seem to work best, saving you from trying out hundreds of packages from PyPI and github of uncertain quality.
Here's a first glimpse for the uninitiated, some categories of business application, some remarks on why microprediction is synonymous with AI due to the possibility of value function prediction, and a straightforward plausibility argument for why an open source, openly networked collection of algorithms that are perfectly capable of managing each other will sooner or later eclipse all other modes of production of prediction. To help get this idea off the ground, there are some ongoing developer incentives.
Nobody can block progress. We're not building a library to rule them all. Increasing accuracy over time is not predicated on a superior methodology, nor is progress blocked while pull requests wait to be approved. Instead, predictions collide in a "micro-market", every minute of the day. One writes, modifies and launches algorithms that bring existing repositories to life, training them on real-world operational problems and providing live streaming distributional prediction like this.
The best way to get the joke is by participating. Here are two possibilities, both very easy.
- Fork microactors and enable GitHub actions, or
- Run the bash script below
The second option will use a virtual environment, and thus not interfere with your other work.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/microprediction/microprediction/master/shell_examples/run_default_crawler_from_new_venv.sh)"
You should run that script "forever". Plug your write key into the dashboard to view your progress.
If you maintain an open source time series package with a permissive license, we suggest you enable sponsorships on your repo and let us know if you are not on the list of Python Time Series Packages with notebook examples. We've brought some attention to packages like deep echo, neural prophet, copulas, auto_ts and many more on LinkedIn. Microprediction offers prizes, and we think nano-markets can organize the production of prediction more efficiently than human managers. But we're also conscious of market failure modes and free-riding, and we sponsor some open source projects directly.
Popular blog posts:
- How to Enter a Cryptocurrency Copula Contest motivates copula prediction competitions and explains how to train a model and submit predictions.
- Comparing Python Global Optimization Packages puts hyperopt, optuna, pysot, shgo and other derivative free optimizers through their paces. You might also check the timemachines repo.
- Modeling the Term Structure of a Pandemic with Negative Interest Rates might give you some ideas for this stream.
- Popular Python Time Series Packages should give you lots of ideas, as noted.
Presentations at Rutgers, MIT and elsewhere can be found in the presentations repo. There are also links to video presentations in some of the blog articles.
To be published by MIT Press late 2021. Reach out if you're volunteering to proof-read :)
Noon Fridays EST. Contact us for details. We'll help you get started on the spot.
- hello world feed creation and submission.
- notebooks are available too, but these are harder to run indefinitely
- crawler examples
As noted, see the knowledge center for a structured set of Python tutorials which will show you how to create an identity, enter a live contest and use the dashboard to track your algorithms' progress. It will also show you how to retrieve historical data for time series research, if that is the only way you wish to use the site. You don't have to use Python because the api can be accessed in any language. We have contributors using Julia (example) and you can even enter using R from within Kaggle (tutorial). Here are some Python examples. Pro tip: Look at the leaderboards and click on CODE badges. Fork an algorithm that is doing well.
Reach us on LinkedIn where we are most active. You can discuss on GitHub or contact us directly. By all means raise issues, or leave messages if you wish.
- Moved to FAQ
- See also the Knowledge Center
Use MicroReader if you just need to get data and don't care to use a key. Create streams like this using the MicroWriter, or its sub-classes. You can also use MicroWriter to submit predictions, though MicroCrawler adds some conveniences.
MicroReader
     |
MicroWriter ----------------------------
     |                        |
 MicroPoll               MicroCrawler
(feed creator)    (self-navigating algorithm)
A more complete picture would include SimpleCrawler, RegularCrawler, OnlineHorizonCrawler, OnlineStreamCrawler and ReportingCrawler, as well as additional conveniences for creating streams such as ChangePoll, MultiPoll, and MultiChangePoll.
If you have a function that returns a live number, you can do this
from microprediction import MicroPoll

def my_feed_func():
    # Stand-in: return the latest value of whatever you wish to predict, as a float
    return 3.14159

feed = MicroPoll(difficulty=12,          # This takes a long time ... see section on mining write_keys below
                 name='my_stream.json',  # Name your data stream
                 func=my_feed_func,      # Provide a callback function that returns a float
                 interval=20)            # Poll every twenty minutes
feed.run()                               # Start the scheduler
Once a stream is created and some crawlers have found it, you can view activity and predictions at www.microprediction.org:

Stream | Roughly 1 min ahead | Roughly 5 min ahead | Roughly 15 min ahead | Roughly 1 hr ahead |
---|---|---|---|---|
my_stream | stream=my_stream&horizon=70 | stream=my_stream&horizon=310 | stream=my_stream&horizon=910 | stream=my_stream&horizon=3555 |
Full URL example: https://www.microprediction.org/stream_dashboard.html?stream=c5_iota&horizon=70 for a 1 minute ahead CDF. If you wish to use the Python client:
cdf = feed.get_cdf('cop.json',delay=70,values=[0,0.5])
where the delay parameter, in seconds, is the prediction horizon (it is called a delay because the predictions used to compute this CDF have all been quarantined for 70 seconds or more).
The community of algorithms provides predictions roughly 1 min, 5 min, 15 minutes and 1 hr ahead of time. The get_cdf()
above reveals the probability that your future value is less than 0.0, and the probability that it is
less than 0.5. You can view CDFs and activity at MicroPrediction.Org by entering your write key in the dashboard.
Now we're getting into the fancy stuff.
Based on algorithm predictions, every data point you publish creates another two streams, representing community z-scores for your data point based on predictions made at different times prior (those quarantined the shortest, and longest intervals).
Stream | Dashboard URL |
---|---|
Base stream | https://www.microprediction.org/stream_dashboard.html?stream=c5_iota |
Z-score relative to 70s ahead predictions | https://www.microprediction.org/stream_dashboard.html?stream=z1~c5_iota~70 |
Z-score relative to 3555s ahead predictions | https://www.microprediction.org/stream_dashboard.html?stream=z1~c5_iota~3555 |
In turn, each of these streams is predicted at four different horizons, as with the base stream. For example:
Stream | Roughly 1 min ahead | Roughly 5 min ahead | Roughly 15 min ahead | Roughly 1 hr ahead |
---|---|---|---|---|
c5_iota | stream=c5_iota&horizon=70 | stream=c5_iota&horizon=310 | stream=c5_iota&horizon=910 | stream=c5_iota&horizon=3555 |
z1~c5_iota~3555 | stream=z1~c5_iota~3555&horizon=70 | stream=z1~c5_iota~3555&horizon=310 | stream=z1~c5_iota~3555&horizon=910 | stream=z1~c5_iota~3555&horizon=3555 |
Poke around the stream listing near the bottom and you'll see them.
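If you prefer pulling z-stream values programmatically rather than browsing, here is a minimal sketch using the MicroReader client introduced further below. It assumes the usual .json suffix applies to the z1~c5_iota~70 stream shown above.

from microprediction import MicroReader

mr = MicroReader()
z_values = mr.get_lagged_values('z1~c5_iota~70.json')   # Community z-scores for c5_iota at the 70s horizon
print(z_values[:5])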
See also the public api guide. If you have a function that takes a vector of lagged values of a time series and supplies a distributional prediction, a fast way to get going is deriving from MicroCrawler as follows:
from microprediction import MicroCrawler, create_key
from microprediction.samplers import differenced_bootstrap

class MyCrawler(MicroCrawler):

    def sample(self, lagged_values, lagged_times=None, name=None, delay=None):
        my_point_estimate = 0.75*lagged_values[0] + 0.25*lagged_values[1]                              # You can do better
        scenarios = differenced_bootstrap(lagged=lagged_values, decay=0.01, num=self.num_predictions)  # You can do better
        samples = [my_point_estimate + s for s in scenarios]
        return samples

my_write_key = create_key(difficulty=11)   # Be patient. Maybe visit www.MUID.org to learn about Memorable Unique Identifiers
print(my_write_key)
crawler = MyCrawler(write_key=my_write_key)
crawler.run()
Enter your write_key into https://www.microprediction.org/dashboard.html to find out which time series your crawler is good at predicting. Check back in a day, a week or a month.
The crawler is also a reader and a writer, so a little about those next.
It is possible to retrieve most quantities at api.microprediction.org with direct web calls such as https://api.microprediction.org/live/c5_iota.json. Use your preferred means such as requests or aiohttp. For example using the former:
import requests
lagged_values = requests.get('https://api.microprediction.org/live/lagged_values::c5_iota.json').json()
lagged = requests.get('https://api.microprediction.org/lagged/c5_iota.json').json()
However the reader client adds a little convenience.
from microprediction import MicroReader
mr = MicroReader()
current_value = mr.get('c5_iota.json')
lagged_values = mr.get_lagged_values('c5_iota.json')
lagged_times = mr.get_lagged_times('c5_iota.json')
Your best reference for the API is the client code https://github.com/microprediction/microprediction/blob/master/microprediction/reader.py
As noted above you may prefer to use MicroPoll or MicroCrawler rather than MicroWriter directly. But here are a few more details on the API wrapper for those wanting more control. You can create predictions or feeds using only the writer. Your best reference is the client code https://github.com/microprediction/microprediction/blob/master/microprediction/writer.py
In principle:
from microprediction import MicroWriter
mw = MicroWriter(difficulty=12) # Creates new key on the fly, slowly! MUIDs explained at https://vimeo.com/397352413
But better to do
from microprediction import new_key
write_key = new_key(difficulty=12)
separately, then pass in with
mw = MicroWriter(write_key=write_key)
The catch is that new_key() will take many hours at this difficulty, which keeps the system from being flooded with spurious streams. See https://config.microprediction.org/config.json for the current value of min_len, which is the official minimum difficulty to create a stream. If you don't need to create streams but only wish to predict, you can use a lower difficulty like 10 or even 9. But the easier your key, the more likely you are to go bankrupt (read on).
If MicroCrawler does not float your boat, you can design your own way to monitor streams and make predictions using MicroWriter.
scenarios = [ i*0.001 for i in range(mw.num_interp) ] # You can do better !
mw.submit(name='c5_iota.json',values=scenarios, delay=70) # Specify stream name and also prediction horizon
See https://config.microprediction.org/config.json for a list of values that delay can take.
If MicroPoll does not serve your needs you can create your stream one data point at a time:
mw = MicroWriter(write_key=write_key)
res = mw.set(name='mystream.json',value=3.14157)
However if you don't do this regularly, your stream's history will die and you will lose rights to the name 'mystream.json' established when you made the first call. If you have a long break between data points, such as overnight or over the weekend, consider touching the data stream:
res = mw.touch(name='mystream.json')
to let the system know you still care.
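A minimal keep-alive sketch, assuming the mw writer from above and a hypothetical get_latest_value() function of your own:

import time

while True:
    value = get_latest_value()                     # Hypothetical: your own data source
    res = mw.set(name='mystream.json', value=value)
    time.sleep(20 * 60)                            # Publish regularly; use mw.touch(name='mystream.json') during long gaps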
- Upgrade the library, which is pretty fluid: pip install --upgrade microprediction
- Check stream_conventions to see if you are violating a stream naming convention:
    - Must end in .json
    - Must contain only alphanumeric, hyphens, underscores, colons (discouraged) and at most one period.
    - Must not contain a double colon.
- Log into the dashboard with your write_key: https://www.microprediction.org/dashboard.html
    - Check for errors/warnings. You can also use mw.get_errors(), mw.get_warnings() and mw.get_confirmations() (see the sketch after this list).
    - Was the name already taken?
    - Is your write_key bankrupt?
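A minimal sketch of the programmatic checks mentioned above, assuming you already have a MicroWriter instance mw constructed with your write_key as shown elsewhere in this README:

from microprediction import MicroWriter

mw = MicroWriter(write_key=write_key)   # Your existing key
print(mw.get_errors())                  # Recent errors attributed to your write_key
print(mw.get_warnings())                # Recent warnings
print(mw.get_confirmations())           # Confirmations of accepted data points and predictions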
Want more write keys? Cut and paste this bash command into a bash shell:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/microprediction/muid/master/examples/mine_from_venv.sh)"
or use the MUID library (www.muid.org) ...
$ pip install muid
$ python3
>>> import muid
>>> muid.mine(skip_intro=True)
See www.muid.org or https://vimeo.com/397352413 for more on MUIDs. Use a URL like http://www.muid.org/validate/fb74baf628d43892020d803614f91f29 to reveal the hidden "spirit animal" in a MUID. The difficulty is the length of the animal, not including the space.
Every participating write_key
has an associated balance. When you create a stream you automatically participate in the prediction of the stream. A benchmark empirical sampling algorithm with some recency adjustment is used for this
purpose. If nobody can do a better job than this, your write_key balance will neither rise nor fall, on average.
However once smart people and algorithms enter the fray, you can expect this default model to be beaten and the balance on your write_key
to trend downwards.
On an ongoing basis you also need the write_key balance not to fall below a threshold bankruptcy level. The minimum balance for a key of difficulty 9 can also be found at https://api.microprediction.org/config.json, and the formula there supersedes whatever is written here. However, at the time of writing the bankruptcy levels are:
write_key difficulty | bankruptcy | write_key difficulty | bankruptcy |
---|---|---|---|
8 | -0.01 | 11 | -256 |
9 | -1.0 | 12 | -4,096 |
10 | -16.0 | 13 | -65,536 |
You can see why your crawler may live a longer life if the key is more difficult.
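The configured formula is authoritative, but for intuition, the table above follows a factor-of-16 pattern per unit of difficulty for difficulties 9 and up. A minimal illustrative sketch, not the official formula:

def approximate_bankruptcy_level(difficulty):
    # Illustrative only: matches the table above for difficulties 9 through 13.
    # The authoritative values live at https://config.microprediction.org/config.json
    return -1.0 * 16 ** (difficulty - 9)

for d in range(9, 14):
    print(d, approximate_bankruptcy_level(d))   # 9 -> -1.0, 10 -> -16.0, ..., 13 -> -65536.0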
Balance may be transferred from one write_key to another if the recipient write_key has a negative balance. You can use the transfer function to keep alive a write_key that you need for sponsoring a stream. You can also ask others to mine [muids](https://github.com/microprediction/muid) for you and contribute in this fashion, say if you have an important civic nowcast and expect that others might help maintain it. You cannot use a transfer to raise the balance associated with a write_key above zero; that is only possible by means of accurate prediction.
Multivariate prediction solicitation is available to those with write_keys of difficulty 1 more than the stream minimum (i.e. 12+1). If you want to use this we suggest you start mining now. By making regular calls to mw.cset() you can get all these goodies automatically (see the sketch after the table below):
Functionality | Example dashboard URL |
---|---|
Base stream #1 | https://www.microprediction.org/stream_dashboard.html?stream=c5_iota |
Base stream #2 | https://www.microprediction.org/stream_dashboard.html?stream=c5_bitcoin |
Z-scores | https://www.microprediction.org/stream_dashboard.html?stream=z1~c5_iota~310 |
Bivariate copula | https://www.microprediction.org/stream_dashboard.html?stream=z2~c5_iota~pe~910 |
Trivariate copula | https://www.microprediction.org/stream_dashboard.html?stream=z3~c5_iota~c5_bitcoin~pe~910 |
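A minimal sketch of such a call, with hypothetical stream names, assuming a sufficiently difficult write_key as described above and that cset() accepts parallel lists of stream names and values:

from microprediction import MicroWriter

mw = MicroWriter(write_key=write_key)                  # Difficulty 13 key, as noted above
names = ['my_stream_one.json', 'my_stream_two.json']   # Hypothetical stream names
values = [3.14, 2.71]                                  # Latest observed value for each stream
res = mw.cset(names=names, values=values)              # Updates the base streams and the derived z-score and copula streams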
Copula time series are univariate. An embedding from R^3 or R^2 to R is used (Morton space filling Z-curve). The most up to date reference for these embeddings is the code (see zcurve_conventions ). There is a little video of the embedding in the FAQ.
As noted, this project is socialized mostly via linked-in and the knowledge center is a good place to start. There are also some articles that pre-date the knowledge center. Introduction to Z-Streams | Dorothy, You're Not in Kaggle Anymore | Online Distributional Estimation | Win With One Line of Code | Copulas and Crypto | Badminton | Helicopulas. Here's the full article list.